    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the three reviewers for their thoughtful and constructive comments, which have helped us to improve the manuscript. Please find our responses below.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary This study investigates how altered expression of cleavage and polyadenylation (CPA) factors affects alternative polyadenylation (APA), transcription termination and cellular phenotypes in colorectal cancer (CRC) cell lines. The authors combine genetic perturbations of CPA factors with chemical inhibition of CPSF73 and assess effects on clonogenic potential, transcription-replication conflicts, APA profiles, and transcription termination-associated RNAPII phosphorylation patterns. The main comparisons are performed between healthy (1CT), primary tumour (SW480, HCT116) and metastatic (SW620) cell lines, which are reported to contain altered expression levels of CPA factors. The data suggest differential dependence on CPA factors between primary tumour-derived and metastatic CRC cell lines, as well as changes in transcription termination patterns. The data are overall well-presented with clear figures. However, in several cases the strength of the conclusions appears to exceed the support provided by the data, and alternative interpretations should be considered.

      Major comments 1. Clonogenic sensitivity to CPA factor perturbation and comparability of clonogenic assays between cell lines: -The data indicate that clonogenic potential in SW480 is strongly dependent on CPSF73 and PCF11, whereas SW620 appear less sensitive. However, the interpretation is complicated by differences in depletion efficiencies. In SW620 cells, PCF11 depletion appears inefficient, and protein levels remain higher than in siLUC-treated SW480 cells (Fig. 1D and S1C; also in comparison to 1CT by inference of Fig. 1C). Thus, the apparent resistance of SW620 cells could reflect insufficient depletion rather than true biological tolerance. The effectiveness of siCPSF73 treatments is difficult to assess from the presented data. Quantification of protein knockdown levels should be provided and incorporated into the interpretation.

      -In Fig. 1D, 1E, and S1D, colony formation of DMSO- or siLUC-treated SW620 and SW480 cells differs markedly in absolute terms. However, the graphs are normalized separately for each cell line, which obscures this difference. This raises two concerns: First, the baseline clonogenic capacity differs between the lines and should be discussed. Second, it is unclear whether direct comparisons between cell lines are valid when normalization is performed independently. For example, in absolute terms, 1 µM JTE-607 appears to have a similar effect in SW620 cells as 5 µM in SW480 cells, which would contradict the conclusion that metastatic cells are more tolerant to CPA perturbations. This issue should be explicitly addressed.

      We thank the reviewer for those thoughtful comments.

      a) Assessing the biological meaning of differences in PCF11 depletion efficiency between the SW480 and SW620 cell lines is inherently tricky, because the two cell lines differ 3-fold in their baseline PCF11 level (Fig. 1C). Even with equal knock-down efficiencies, the number of PCF11 molecules left per cell after the treatment will differ. We did not mention this in our original manuscript but will highlight it in the revised version, as we agree it is an important consideration for the interpretation of the results.

      b) As requested, we will add quantification of western blots from 3 biological replicates to the revised manuscript to demonstrate the depletion efficiencies. We agree that the single western blot presented was not sufficient by itself; the efficiency of the knock-down in SW620 is not lower than in SW480.

      c) The baseline clonogenic capacity of SW480 and SW620 has been previously calculated and compared in two publications (PMID: 31961892 and 29796953). In both cases, the SW620 cells showed higher clonogenic potential than SW480, which was calculated based on the number of clones containing more than 50 cells.

      d) The reason behind normalization of our data to a control sample is the difference in cell size between the cell lines, which prohibits their direct comparison.

      For the colony formation assays, we seeded the same number of cells and cultured them for the same amount of time. However, the difference in cell size leads to a large difference in colony sizes (Figure 1D), so it was not possible to set the same parameters for counting colonies of SW480 and SW620 cells. We therefore decided to use an approach frequently used in high-profile cancer studies (e.g. Li et al., 2023, PMID: 37620362; Waterhouse et al., 2025, PMID: 40328966; Yang et al., 2026, PMID: 41484364) and normalized each biological replicate to the control sample to analyze the response to the treatment only.

      e) During revision, we might additionally perform the CellTiter 96® Non-Radioactive Cell Proliferation Assay (MTT) to test how another cancer-related characteristic of SW480 and SW620 cells is affected by JTE-607.

      f) We will also perform colony formation and/or MTT assays for 3 additional cell lines: HCT116 (primary tumor-derived) and T87 and COLO-205 (metastasis-derived, which we are currently in the process of obtaining) to assess their sensitivity to JTE-607.

      g) The higher sensitivity of SW480 cells compared to SW620 cells was observed not only for PCF11 knock-down, where inter-cell-line differences in baseline protein level make interpretation more difficult, but also for CPSF73 knock-down (Fig. 1D), for which the baseline level was similar and the knock-down equally efficient in both cell lines, and for CPSF73 inhibition (Fig. 1E), using normalization procedures frequently used in the literature (see point d).

      Therefore, we argue that our conclusion that SW480 cells are more sensitive than SW620 to the abrogation of 3' pre-mRNA cleavage and transcription termination is valid. However, we are willing to weaken our conclusion if the reviewer does not agree with our point of view.

      For the additional cancer-specific experiments proposed above, we suggest using JTE-607, as drug treatment is more robust, reproducible, and medically relevant than knock-down experiments.
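      The per-replicate normalization described in point d) can be illustrated with a minimal sketch. It assumes each biological replicate is a mapping from condition name to colony count; the function names, the "DMSO" control label, and the counts are hypothetical and are not taken from the manuscript.

```python
def normalize_to_control(replicates, control="DMSO"):
    """Express every condition as a fraction of the control of the same replicate."""
    normalized = []
    for counts in replicates:
        baseline = counts[control]
        if baseline <= 0:
            raise ValueError("control colony count must be positive")
        normalized.append({cond: n / baseline for cond, n in counts.items()})
    return normalized


def mean_by_condition(normalized):
    """Average the normalized values of each condition across replicates."""
    conditions = normalized[0].keys()
    return {c: sum(rep[c] for rep in normalized) / len(normalized) for c in conditions}


# Hypothetical colony counts for two biological replicates of one cell line.
example_replicates = [
    {"DMSO": 200, "JTE-607": 50},
    {"DMSO": 100, "JTE-607": 30},
]
summary = mean_by_condition(normalize_to_control(example_replicates))
```

      Because every replicate is divided by its own control, the summary reports only the relative response to treatment; differences in absolute colony number between cell lines are deliberately removed and would have to be reported separately.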

      2. Interpretation of transcription termination markers: -The study uses RNAPII T4ph as a marker of transcription termination, which is well justified based on ref. [30], but the mechanistic basis of this modification is still not fully understood. Changes in T4ph localization are interpreted as consequences of CPA activity, but possible differences in kinase or phosphatase activities between cell lines that could affect T4ph levels or localization are not considered. Therefore, conclusions based solely on T4ph redistribution should be presented with greater caution, and alternative explanations should be acknowledged.

      While in our experience RNAPII T4ph is the most sensitive and useful termination marker, we agree with the referee that its metabolism and function are insufficiently understood - this is an important and interesting direction for future investigation.

      To increase the robustness of our study, during revision we will additionally perform nascent transcriptomics on SW480 and SW620 using a different method, POINT-seq. In contrast to T4ph mNET-seq, POINT-seq neither relies on RNAPII modification status nor is affected by pausing. We will also probe global T4ph-RNAPII levels in our cellular model by western blot. We will then adjust our manuscript accordingly.

      -Line 240 states that premature termination is increased in primary tumour cells. However, the data show increased T4ph signal (Fig. 4B) but no change in total RNAPII occupancy in gene bodies (Fig. 4A). This does not directly demonstrate increased termination. Additional evidence or a more cautious interpretation would be appropriate.

      The reviewer is right in pointing out the difference between the Total-RNAPII and T4ph-RNAPII signals across the gene body. We will provide a clearer description and explanation in the revised manuscript.

      T4ph-RNAPII is present at low levels in human cells. S2ph and S5ph are the dominant modifications, accounting for ~75% of phospho-counts, whereas T4ph has a relative abundance of ~15% (PMID: 26799765). In addition, T4ph is concentrated at gene ends and typically very low in the gene body (PMID: 28017589, 30819644, doi: 10.1101/2025.07.14.664659). Consequently, it is very easy to spot its gene-body increase in metagene analysis (Figure 4B), even when it happens only on a subset of genes in cancer samples (e.g. Fig. 4D).

      Total-RNAPII signal in the gene body largely reflects S2ph-modified RNAPII levels so its metagene analysis is not sufficiently sensitive to detect differences in gene-body T4ph-RNAPII.

      Consequently, RNAPII-T4ph and RNAPII-total mNET-seq show distinct metagene patterns and different responses to termination changes. RNAPII-T4ph mNET-seq is a sensitive method to detect changes in termination patterns, while total-RNAPII is much less specific and sensitive with respect to transcription termination.

      3. Cleavage-termination distance as a predictor of transcript levels: -Figure 5A presents median distances across all genes. It would be informative to perform a gene-wise comparison between cell lines (difference in cleavage-T4ph distance for the same gene, e.g. in 1CT vs. HCT116, individual differences plotted across all genes). This analysis could help clarify how frequently individual genes experience the effect (shortening of the cleavage-T4ph distance between 1CT and tumour cells) that is observed globally.

      Thank you for this valuable suggestion. We have performed the gene-wise comparison which is indeed very informative. Firstly, we observed the same trend as for all active protein-coding genes - shorter distance in all CRC cell lines compared to 1CT cells with the lowest values of the cleavage-termination distance in the primary tumor cells. Secondly, and even more importantly, this analysis additionally shows that the shortening effect is global - only a small percentage of genes do not undergo shortening of the cleavage-T4ph distance between 1CT and tumor cells.

      We will incorporate the results of this analysis into the figures of the revised manuscript.
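      A gene-wise comparison of this kind could be sketched as follows. This is only an illustration under stated assumptions: each cell line is represented as a mapping from gene name to cleavage-termination distance in base pairs, and the function name, gene names, and distances are hypothetical.

```python
from statistics import median


def genewise_shift(dist_reference, dist_query):
    """Compare per-gene distances between two cell lines.

    Only genes measured in both cell lines are compared.
    A negative delta means the distance is shorter in the query cell line.
    """
    shared = sorted(set(dist_reference) & set(dist_query))
    if not shared:
        raise ValueError("no genes shared between the two cell lines")
    deltas = {g: dist_query[g] - dist_reference[g] for g in shared}
    shortened = sum(1 for d in deltas.values() if d < 0)
    return {
        "n_genes": len(shared),
        "median_delta": median(deltas.values()),
        "fraction_shortened": shortened / len(shared),
    }


# Hypothetical distances (bp) for a reference line and a tumour line.
reference = {"geneA": 5000, "geneB": 3000, "geneC": 4000, "geneD": 1000}
tumour = {"geneA": 3000, "geneB": 2500, "geneC": 4200, "geneE": 900}
result = genewise_shift(reference, tumour)
```

      Reporting the fraction of genes with a negative delta alongside the median makes it visible whether a global shift is carried by most genes or by a small subset.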

      -The manuscript claims that proximity between pre-mRNA 3′-end cleavage and transcription termination predicts increased nuclear transcript levels. However, the correlation coefficients are small (Spearman r ~ -0.2 at most), indicating weak predictive power. Therefore, the use of the term "predicts," especially in the manuscript title, appears to overstate the strength of the relationship. The authors should either moderate this claim or provide additional analysis to support stronger predictive value.

      We agree with the reviewer that the term "predicts" is not ideal in this context and are happy to substitute "is associated with". The title would then read: "Proximity of pre-mRNA 3′ end processing and transcription termination is associated with enhanced gene expression".
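      To give a sense of scale for a coefficient of about -0.2, the sketch below computes Spearman's correlation as the Pearson correlation of average ranks, using only the standard library. The helper names and input values are hypothetical; squaring the coefficient gives the share of rank variance accounted for, which for -0.2 is about 4%.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences with non-zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical, perfectly monotonic toy data (distance up, expression down).
toy_r = spearman([1, 2, 3, 4], [40, 30, 20, 10])
# Share of rank variance accounted for by a coefficient of -0.2.
share_explained = (-0.2) ** 2
```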

      Minor comments -Figures 1B and S1A: The discontinuous y-axis makes it difficult to assess relative protein level differences between normal and cancer samples. Statistical testing should be included to evaluate significance.

      We had decided against statistical testing because of the problems with the biological interpretation of such analyses and their limitations for proteins that are present in the cell at low levels and/or are highly variable between samples. PCF11 is such a protein. It is an order of magnitude less abundant than other RNA 3' processing factors, and its levels are variable, as shown in our Fig. 1B (re-analyzed proteomics data from Wiśniewski et al., 2015). Consequently, the increase in PCF11 levels in this dataset is not statistically significant in a Mann-Whitney test, while it is significant for CPSF73.

      The variability of PCF11 levels can also be observed in the Clinical Proteomic Tumor Analysis Consortium (CPTAC) data in the Human Protein Atlas (although no absolute quantification was performed there).

      In two independently obtained proteomics patient datasets (Wiśniewski et al., 2015; CPTAC), as well as in Western blot assays from our cell culture model, an increase in PCF11 protein abundance is observed in cancer cells. This consistency across different datasets and our model holds greater biological relevance than the statistical analysis of highly varied samples. Nevertheless, if the reviewer requires statistics, we will include them in the revised manuscript.

      The discontinuous y-axis was applied because of the broad range of protein abundances. A continuous linear scale did not allow us to show the difference between normal and cancer samples for all the proteins on the same graph.

      Alternatively, if the reviewer and editor prefer, we are happy to present the data on a log10-transformed scale. The disadvantage of a log scale is that the differences between normal and cancer samples are less obvious to the eye; the advantage is a continuous y-axis.

      -Lines 217-218: The text should emphasize that nuclear RNA abundance may not reflect cytoplasmic mRNA levels, particularly when APA alters 3′UTRs and may affect mRNA stability.

      We agree and will incorporate the reviewer's suggestion in our revised manuscript.

      -Lines 261-264: The cleavage-termination distance metric should be more clearly defined as the distance between the polyadenylation site and the T4ph signal peak.

      We plan to incorporate a drawing into the figure, to better explain our cleavage-termination definition.

      We also performed the cleavage site to T4ph signal peak (highest signal in the termination window) distance calculations, and they show the same trend as our original method (Figure 5A), with no changes to the conclusions we made. We will incorporate this additional analysis into a supplementary figure.
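      The peak-based variant of the metric can be sketched as below. This is an assumption-laden illustration and not the pipeline used for the manuscript: the termination window is assumed to be given as a list of per-position T4ph signal values starting at a known genomic coordinate, and the function name and numbers are hypothetical.

```python
def cleavage_to_peak_distance(pas, window_start, signal, strand="+"):
    """Distance from the polyadenylation site to the highest T4ph signal.

    signal[i] is the signal at genomic position window_start + i.
    Ties resolve to the lowest genomic coordinate.
    The result is positive when the peak lies downstream of the PAS
    with respect to the direction of transcription.
    """
    if not signal:
        raise ValueError("termination window is empty")
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    peak_index = max(range(len(signal)), key=signal.__getitem__)
    peak_position = window_start + peak_index
    return peak_position - pas if strand == "+" else pas - peak_position


# Hypothetical examples: a plus-strand gene and a minus-strand gene.
plus_distance = cleavage_to_peak_distance(1000, 1000, [0, 1, 5, 2], strand="+")
minus_distance = cleavage_to_peak_distance(2000, 1990, [1, 7, 3], strand="-")
```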

      **Referees cross-commenting**

      Reviewer #3:

      Contrary to what is implied in the reviewer report, this manuscript does not report the effects of the CPSF73 inhibitor JTE-607 on APA. On this note, as the authors discuss uncoupling of cleavage and transcription termination, they could consider (this is not a request) testing how the cleavage inhibitor JTE-607 impacts the distribution of the transcription termination marker T4ph, and whether the effects would differ between cell lines in which the coupling appears to be different. This could give mechanistic insights into the sources of the differences between cell lines.

      To understand mechanistically why a shorter cleavage-termination distance is associated with higher gene expression, we plan to test the cleavage efficiency on genes that differ in cleavage-termination distance and expression level between the SW480 and SW620 cell lines. To this end, we will perform POINT-seq, checking differences between those cell lines in control conditions and with JTE-607. We believe that this new experimental approach will provide deeper mechanistic insight than further correlation analyses based on the same types of experiment.

      Reviewer #1 (Significance (Required)):

      This study addresses an important question in RNA biology and cancer research: how altered expression or pharmacological targeting of CPA factors affects alternative polyadenylation, transcription termination, and cellular phenotypes in CRC models. This topic is timely, as CPSF73 has been proposed as a therapeutic target, making it important to understand the molecular and cellular consequences of modulating CPA factor activities. A key strength and robust finding of the study is the identification of unexpected relationships between pre-mRNA 3′-end processing and transcription termination during CRC progression. Notably, the authors report that changes in alternative polyadenylation and transcription termination appear to be uncoupled and may even occur in opposite directions. This challenges simplified models in which these processes are tightly coordinated and suggests that their (mis)regulation in cancer cells may be more complex than previously appreciated. Secondly, the study provides an interesting observation that gene-specific changes in cleavage-T4ph distance correlate negatively with changes in nuclear levels of processed transcripts. This suggests a potential relationship between the spatial coupling of 3′-end processing and transcription termination and transcript abundance. If validated mechanistically, this could represent a conceptual advance in understanding how transcription termination dynamics influence gene expression outputs. However, the observed correlations are relatively weak, and the mechanistic basis of this relationship remains unclear. As such, this advance is primarily descriptive at this stage.

      As indicated in our response to the cross-commenting point above, one possible mechanistic explanation for why a shorter cleavage-termination distance could be associated with higher gene expression is increased cleavage efficiency when the cleavage-termination distance is short. To test this hypothesis, we will perform POINT-seq on SW480 and SW620 cell lines, in control and CPSF73 inhibition conditions. We have previously demonstrated that the POINT-seq technique allows calculation of cleavage efficiencies and their alterations (doi: 10.1101/2025.07.14.664659).

      So far, our data (Fig. 5F, G) indicate that PCF11 is involved in this process, since PCF11 downregulation lengthened the distance between 3′-end cleavage and RNAPII terminal pausing. This lengthening correlated with a decrease in nuclear RNA levels. However, PCF11 participates in multiple steps of gene expression - pre-mRNA cleavage, alternative polyadenylation, RNAPII pausing, and mRNA export - making the underlying mechanism difficult to pinpoint without additional experiments.

      Importantly, our work provides the first clear evidence that changes in cleavage site usage and termination region usage can become uncoupled. We hope that continued tool development, together with studies like ours, will ultimately enable a full mechanistic understanding.

      Several interpretations of experimental data would benefit from more cautious framing or additional analysis. In particular, the relationship between changes in CPA factor expression levels and sensitivity to the CPSF73 inhibitor JTE-607 across CRC cell lines remains unclear from the presented data.

      During the revision we will explain more clearly the rationale for our interpretation of the data. In cases where more cautious framing would still be needed, we will include alternative interpretations.

      This work will be of interest primarily to basic researchers in RNA processing and transcription regulation, gene expression control, cancer cell biology and pharmacological targeting of RNA-processing machineries.

      Reviewer field of expertise: My expertise is in RNA processing and gene regulation. I do not have specific expertise in clinical oncology or cancer biology.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Factors involved in pre-mRNA cleavage and polyadenylation (CPA) are upregulated in many cancers and have been found to be associated with poor prognosis. In their manuscript "Proximity of pre-mRNA 3′ end processing and transcription termination predicts enhanced gene expression", Stepien et al. use colorectal cancer (CRC)-derived cell lines as a model of CPA overexpression to study its biological consequences. To this end, the authors initially confirm increased expression of CPA factors in these cell lines and demonstrate that their knock-down strongly decreases the colony-forming ability of primary tumour-derived CRC cells. They further assess various phenotypes that are expected to depend on CPA activity based on the current knowledge in the field, including poly-A site selection, occurrence of transcription-replication conflicts, and the site of transcription termination. Contrary to expectations, they find a proximal shift in transcription termination to be the most prominent change in CRC cell lines with high CPA levels, despite no clear preference for proximal poly-A site usage in these cells, suggesting an uncoupling of both processes. The authors combine their 3'-end mapping data and T4P-mNET-seq data mapping terminating RNAPII to score cleavage-termination distance at individual genes and find shorter distances to correlate with increased gene expression in the different cell lines. Overall, this is a carefully conducted study, and the claims and conclusions are well supported by the data.

      I have some minor comments: 1. PLA assay to quantify transcription-replication conflicts (Figure 2). The quantified data looks very convincing and is also in good agreement with the proximal shift in transcription termination that is demonstrated later in the paper. However, the PLA channel signal in the microscopy image examples shown in panel A looks very blurry, and it is hard to imagine that one would be able to count # foci based on this. This may just be an issue with the resolution of the image provided. Apparently, there are much less foci in the treated samples shown in panel B - maybe microscopy images for these could be provided as well? Also, since none of the treatments impact the # of TRCs, it would have been nice to include a positive control known to induce TRCs to demonstrate that the assay works (if such a control is known) - this is optional, and I would not ask to repeat the entire experiment just for this additional control (but maybe the authors have done it and the data is already available?).

      We apologize for the low resolution of the image presented in Figure 2. For technical reasons, we were unable to upload a high-resolution image file during the first submission. We will improve it in the revised manuscript.

      The difference in baseline PLA foci between Fig. 2A and 2B reflects a known sensitivity of the PLA assay to cell confluency. As these two experiments were performed at different confluences, direct cross-panel comparison is not appropriate. For this reason, all quantitative comparisons in the manuscript are made strictly within the same plate, the same PLA reaction, and between wells with comparable confluency, which avoids introducing bias from these technical variables. For clarity, we plan to incorporate the above information into the Methods section. To validate assay specificity within each experiment, we confirmed that EdU-positive cells consistently showed higher PLA foci counts than EdU-negative cells from the same wells, demonstrating that quantification reflects genuine PCNA-associated signal above background. With this internal validation in place, each panel's comparisons remain valid and interpretable on their own terms.

      No classical positive control exists for a PolII-pThr4/PCNA PLA interaction, as this is a relatively unexplored proximity event with no established positive control condition. We used single-antibody negative controls to establish assay specificity, although we did not quantify or show them. We also used EdU-negative cells within the same wells as an internal background baseline, ensuring that measured foci reflect genuine signal above background. As a proxy for positive controls, we relied on the detection of changes in PLA foci number between the tested conditions, such as the effect of 4h XRN2 degradation. In addition, the consistency of biological replicates and the differences between cell lines made us confident that we were detecting reproducible and biologically relevant differences.

      2. Figure 2A-C: please include information on number of cells quantified

      We will incorporate this information into the revised manuscript.

      3. Figure 2C: In the label, please include degron, e.g. HCT116 CPSF73-AID rather than just HCT116

      We will modify the label according to the reviewer's suggestion.

      4. Figure 5C: When quantifying nascent txn based on mNET-seq, to which extent would one expect terminally paused RNAPII along the gene body (premature termination events) to contribute to the increased signal? That is, could an increase in stalling be mistaken for an increase in transcription? Based on the metagene plot in Fig 2A it doesn't look like it, but the authors may be able to estimate the effect (if any) from their data.

      We thank the reviewer for pointing this out.

      As reviewer #1 observed, and we comment above (Rev.1 point 2b), the increase of premature termination events in cancer cells, which can be readily detected by RNAPII T4ph mNET-seq increase in the gene body, does not globally perturb total RNAPII mNET-seq profiles (see metagenes in figure 4A and 4B).

      Nevertheless, the mNET-seq method does indeed detect both nascent transcription levels and RNAPII pausing, which is particularly relevant when drawing conclusions at the single-gene level. To increase the robustness of our study and draw stronger conclusions about nascent transcription rates, independent of stalling, we will perform POINT-seq experiments in SW480 and SW620 cells during revision. That method, in contrast to mNET-seq, is not sensitive to pausing.

      Reviewer #2 (Significance (Required)):

      The observed uncoupling of poly-A site selection and size of termination window is unexpected and raises important questions on how these coupled processes can be regulated independently.

      Strengths of the study: i) Parallel assessment of different CRC-based cell lines provides evidence of phenotype stability across patients. ii) Brings together strong technical expertise combining different state-of-the-art methodologies to map and correlate poly-A site usage, site of transcription termination, and levels of nascent transcription within the same cell lines under the same conditions, providing a comprehensive dataset.

      Limitations: i) For the time being, observation limited to CRC cell lines.

      While this is the first time that we have been able to show the uncoupling of pre-mRNA 3' cleavage and transcription termination so clearly, we have previously reported findings in other cell types that pointed in this direction. We found in HeLa cells (PMID: 30819644) that genes preferentially using distal polyadenylation sites exhibit more proximal RNAPII terminal pausing compared to genes that predominantly use proximal polyadenylation sites. Recently, we also found in U2OS cells after SETD2 KO, and in renal cell carcinoma cell lines with SETD2 mutation, that readthrough transcription occurs independently of APA (doi: 10.1101/2025.07.14.664659). This phenomenon could be frequent, but it has not been investigated until now, as cleavage and termination were usually studied separately.

      As for the correlation between cleavage-termination distance and expression levels, we have so far found it in CRC (HCT116, SW480, SW620) and cervical cancer (HeLa) cell lines. During revision we plan to test it additionally in pancreatic cancer cell lines with high (BxPC3), medium (Panc1), and low (MiaPaCa2) sensitivity to JTE-607 treatment.

      ii) Mechanism behind proximal shift of termination to be determined.

      We agree with the reviewer that the mechanism underlying the proximal shift of transcription termination remains to be determined. Our unpublished data show a correlation between RNAPII pausing and the occupancy of transcription termination factors on chromatin. However, since more factors are involved, such as elongation speed and chromatin architecture, resolving the mechanism requires further extensive studies.

      I expect this work to be of interest to an audience interested in transcription and regulation of gene expression more broadly, with potential translational relevance for cancer therapy.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The manuscript by Stępień et al. aims to investigate the roles of pre-mRNA 3′ end processing and transcription termination using colorectal cancer (CRC) cell lines (normal colon epithelial cells: 1CT; primary tumours: HCT116 and SW480; metastatic tumour: SW620). By using publicly available proteomic datasets and their cellular models, the authors first demonstrate elevated expression of several cleavage and polyadenylation (CPA) and termination factors, including CPSF73 and PCF11, in CRC cells. They further assess the functional relevance of CPSF73 and PCF11, showing that siRNA-mediated knockdown of these factors reduces colony formation, particularly in primary cancer cell lines. However, they do not observe a clear association between CPA/termination factors and transcription-replication collisions (TRCs), suggesting that TRCs may not underlie the altered colony formation phenotype.

      The authors also examine alternative polyadenylation (APA) using the CPSF73 inhibitor JTE-607 and report complex APA patterns: primary tumour cell lines display a bias toward distal APA site usage, whereas the metastatic SW620 line preferentially uses proximal sites. They further evaluate transcription termination and observe a proximal shift (early termination) in primary CRC cells, and to a lesser extent in SW620 cells. Noting the apparent discrepancy between APA site shifting and proximal termination, the authors introduce a new metric termed cleavage-termination distance, defined as the distance between the coordinates of the major PAS and RNAPII termination site. They report an association between shortening of cleavage-termination distance and increased gene expression, which may contribute to the upregulation of cancer-related genes.

      Overall, this is a well-written manuscript that highlights potential roles of pre-mRNA processing and transcription termination in gene expression control, with implications for cancer biology. Nevertheless, several issues should be addressed to strengthen the study:

      Overall, this is a well-written manuscript that reveals potential roles of pre-mRNA processing and transcription termination in gene expression control, with implications for cancer biology. Nevertheless, I have a few comments that may help strengthen the study.

      Major comments: 1. The study includes two primary tumour cell lines but only one metastatic cell line (SW620), which is derived from the same patient as SW480. It remains unclear whether the observed effects represent general characteristics of metastatic tumour cells or are specific to this particular cell line.

      Our primary workhorses in this study are the SW480 and SW620 cell lines, which are derived from the same patient; using this pair avoids the confounding variable of genetic diversity between cell lines. Unfortunately, these are the only paired CRC cell lines currently available in cell banks.

      We would not want to perform further (expensive and time-consuming) genomic assays on additional CRC metastatic cell lines since the cell lines available were isolated from other types of metastasis (liver or lung, while SW620 comes from lymph node) and other patients - which would make interpreting any results obtained with them difficult. However, we plan to check the sensitivity of one more primary (HCT116) and two more metastatic (T87 and COLO-205) cancer cell lines to JTE-607 treatment in colony formation or MTT assay to find out whether the differences in CRC cell sensitivity are more cancer-stage or patient specific.

      Furthermore, we plan to check whether our finding of altered cleavage-termination distance might have clinically relevant prognostic value, even outside the context of CRC. To this end, we will test the hypothesis that a short cleavage-termination distance could be a prognostic marker for the sensitivity of cells to JTE-607 treatment. It has been previously demonstrated that pancreatic cancer (PC) cell lines differ in sensitivity to JTE-607 (PMID: 38191171). We will perform T4ph-mNET-seq and nuclear 3'mRNA-seq experiments on PC cell lines to measure the cleavage-termination distance in highly JTE-607-sensitive (BxPC3), moderately sensitive (Panc1) and least sensitive (MiaPaCa2) cells, and to test whether this distance correlates with cell sensitivity to JTE-607.

      2. The rationale for focusing on colorectal cancer in this study requires further clarification. Although the Introduction provides a comprehensive review of CPSF73 and PCF11 in other cancer types, evidence specific to colorectal cancer is limited. Are these factors known to be mutated or dysregulated in CRC? Is their expression associated with patient survival? The authors could strengthen their rationale by performing a basic analysis using publicly available datasets (e.g., TCGA), such as evaluating expression levels in tumour versus normal tissue and generating Kaplan-Meier survival curves.

      We will respond to these questions in the revision.

      3. In Figure 5 and Supplementary Figure 5, the authors analyse cleavage-termination distance across oncogenes and tumour suppressor genes and observe a negative correlation between cleavage-termination distance and gene expression level. This is an interesting finding and suggests a possible mechanism for enhancing expression of cancer-related genes. It would be valuable to extend this analysis more systematically, for example by stratifying genes based on cleavage-termination distance and performing gene ontology enrichment analysis / GSEA to identify functional categories enriched among genes with shorter or longer distances. The authors could further relate these gene sets to, for example, distinct phenotypes between primary vs metastatic tumours.

      This is an excellent suggestion. We will perform the above analyses carefully during the revision. Our initial analysis, performed upon receiving the reviews, suggests that genes whose cleavage-termination distance decreases during tumorigenesis while their expression increases are enriched for RNA processing, DNA damage response, chromatin organization and ribosome biogenesis factors. Conversely, increased cleavage-termination distance and decreased gene expression are mostly associated with organelle assembly and protein localization. We will deepen this analysis and discuss its implications for cancer biology in our revised manuscript.

      Minor comments: 4. In Figure 2A, the number of RNAPII-PCNA PLA foci appear comparable between SW480 and SW620, whereas in Figure 2B this seems to be much lower in SW620 compared to SW480. Could the authors clarify this discrepancy?

      The difference in baseline PLA foci between Fig. 2A and 2B reflects a known sensitivity of the PLA assay to cell confluency. As these two experiments were performed at different confluencies, direct cross-panel comparison is not appropriate. For this reason, all quantitative comparisons in the manuscript are made strictly within the same plate, the same PLA reaction, and between wells with comparable confluency, which avoids introducing bias from these technical variables. For clarity, we plan to incorporate the above information into the Methods section. To validate assay specificity within each experiment, we confirmed that EdU-positive cells consistently showed higher PLA foci counts than EdU-negative cells from the same wells, demonstrating that quantification reflects genuine PCNA-associated signal above background. With this internal validation in place, each panel's comparisons remain valid and interpretable on their own terms.

      5. Is the cleavage-termination distance metric influenced by gene length? If so, should this parameter be normalised to gene length to avoid potential bias?

      No, gene length is not a bias in the cleavage-termination distance.

      • We performed a correlation analysis and found no significant correlation between the cleavage-termination distance and gene length in any of the cell line pairs in our model: HCT116 vs 1CT (Spearman r=0.001, p=0.945); SW480 vs 1CT (Spearman r=0.036, p=0.0654); SW620 vs 1CT (Spearman r=-0.018, p=0.325).
      • Additionally, we quantified the decrease in cleavage-termination distance on the very same gene, just in different cell lines. We will incorporate this result into the manuscript.
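
      A gene-length check of this kind can be sketched as follows. This is a minimal, dependency-free illustration, not the analysis script used in the study: the function names and the toy values are hypothetical, and the sketch returns only the Spearman coefficient, not a p-value.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy data: one entry per gene (not values from the study).
gene_lengths = [1200, 5400, 23000, 800, 15000]
distance_changes = [-300, 150, -50, 20, -400]
print(f"Spearman r = {spearman_r(gene_lengths, distance_changes):.3f}")
```

      In practice a statistics library such as SciPy (`scipy.stats.spearmanr`) would also report the p-value; the hand-written version above only makes the rank-based nature of the test explicit.
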

      6. The data and analysis scripts generated in this study have not yet been made publicly available and therefore cannot be fully evaluated.

      We apologize for this omission. The revised manuscript will contain a link to our publicly available scripts on GitHub and the GEO accession number.

      **Referees cross-commenting**

      I agree with the reports from both Reviewer #1 and Reviewer #2.

      I would like to thank Reviewer #1 for pointing out my mistake. The authors did not use JTE-607 to study APA; rather, they studied the differences in APA between cell lines. I apologise for the confusion.

      Reviewer #3 (Significance (Required)):

      General assessment: This study investigates the contribution of pre-mRNA 3′ end processing and transcription termination to colorectal cancer (CRC) biology using a combination of cell line comparisons (primary versus metastatic tumours), chemical and RNAi perturbations, and bioinformatic analyses.

      The major strengths of the work include:

      • The use of CRC cell lines representing normal, primary, and metastatic states, including matched primary and metastatic lines derived from the same patient.
      • A systematic analysis of alternative polyadenylation (APA) and transcription termination, revealing a potential uncoupling between these two closely related processes.
      • The introduction of a novel quantitative metric, cleavage-termination distance, to examine the relationship between PAS usage and RNAPII termination.
      • The identification of a negative association between cleavage-termination distance and gene expression, suggesting an additional regulatory layer influencing gene expression.

      However, certain limitations should be considered: • The generalisability of conclusions regarding metastatic CRC is limited by reliance on a single metastatic cell line.

      We believe that the experiments outlined above in response to Reviewer #3 point 1 will allow us to extend the generalizability of our conclusions.

      • The translational relevance of the findings could be further strengthened through patient-level or clinical data analysis.

      We agree with the reviewer. Due to technical limitations, it is not possible to perform nascent transcriptomic experiments on patient material at this time. However, we will attempt to strengthen the translational relevance by additional experiments and analysis as indicated in response to Reviewer #3 points 1-3.

      Advance: The study proposes potentially novel roles for 3′ end cleavage and transcription termination in regulating gene expression in colorectal cancer. In particular, the conceptual distinction between APA site shifting and transcription termination, together with the introduction of the cleavage-termination distance metric, represents a conceptual advance.

      Audience: The work is primarily positioned within basic research. With additional translational context, it may also attract interest from a broader audience.

      Field of expertise: transcriptional regulation and bioinformatics

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The manuscript by Stępień et al. aims to investigate the roles of pre-mRNA 3′ end processing and transcription termination using colorectal cancer (CRC) cell lines (normal colon epithelial cells: 1CT; primary tumours: HCT116 and SW480; metastatic tumour: SW620). By using publicly available proteomic datasets and their cellular models, the authors first demonstrate elevated expression of several cleavage and polyadenylation (CPA) and termination factors, including CPSF73 and PCF11, in CRC cells. They further assess the functional relevance of CPSF73 and PCF11, showing that siRNA-mediated knockdown of these factors reduces colony formation, particularly in primary cancer cell lines. However, they do not observe a clear association between CPA/termination factors and transcription-replication collisions (TRCs), suggesting that TRCs may not underlie the altered colony formation phenotype.

      The authors also examine alternative polyadenylation (APA) using the CPSF73 inhibitor JTE-607 and report complex APA patterns: primary tumour cell lines display a bias toward distal APA site usage, whereas the metastatic SW620 line preferentially uses proximal sites. They further evaluate transcription termination and observe a proximal shift (early termination) in primary CRC cells, and to a lesser extent in SW620 cells. Noting the apparent discrepancy between APA site shifting and proximal termination, the authors introduce a new metric termed cleavage-termination distance, defined as the distance between the coordinates of the major PAS and RNAPII termination site. They report an association between shortening of cleavage-termination distance and increased gene expression, which may contribute to the upregulation of cancer-related genes. Overall, this is a well-written manuscript that highlights potential roles of pre-mRNA processing and transcription termination in gene expression control, with implications for cancer biology. Nevertheless, several issues should be addressed to strengthen the study:

      Major comments:

      1. The study includes two primary tumour cell lines but only one metastatic cell line (SW620), which is derived from the same patient as SW480. It remains unclear whether the observed effects represent general characteristics of metastatic tumour cells or are specific to this particular cell line.
      2. The rationale for focusing on colorectal cancer in this study requires further clarification. Although the Introduction provides a comprehensive review of CPSF73 and PCF11 in other cancer types, evidence specific to colorectal cancer is limited. Are these factors known to be mutated or dysregulated in CRC? Is their expression associated with patient survival? The authors could strengthen their rationale by performing a basic analysis using publicly available datasets (e.g., TCGA), such as evaluating expression levels in tumour versus normal tissue and generating Kaplan-Meier survival curves.
      3. In Figure 5 and Supplementary Figure 5, the authors analyse cleavage-termination distance across oncogenes and tumour suppressor genes and observe a negative correlation between cleavage-termination distance and gene expression level. This is an interesting finding and suggests a possible mechanism for enhancing expression of cancer-related genes. It would be valuable to extend this analysis more systematically, for example by stratifying genes based on cleavage-termination distance and performing gene ontology enrichment analysis / GSEA to identify functional categories enriched among genes with shorter or longer distances. The authors could further relate these gene sets to, for example, distinct phenotypes between primary vs metastatic tumours.

      Minor comments:

      4. In Figure 2A, the number of RNAPII-PCNA PLA foci appear comparable between SW480 and SW620, whereas in Figure 2B this seems to be much lower in SW620 compared to SW480. Could the authors clarify this discrepancy?
      5. Is the cleavage-termination distance metric influenced by gene length? If so, should this parameter be normalised to gene length to avoid potential bias?
      6. The data and analysis scripts generated in this study have not yet been made publicly available and therefore cannot be fully evaluated.

      Referees cross-commenting

      I agree with the reports from both Reviewer #1 and Reviewer #2. I would like to thank Reviewer #1 for pointing out my mistake. The authors did not use JTE-607 to study APA; rather, they studied the differences in APA between cell lines. I apologise for the confusion.

      Significance

      General assessment:

      This study investigates the contribution of pre-mRNA 3′ end processing and transcription termination to colorectal cancer (CRC) biology using a combination of cell line comparisons (primary versus metastatic tumours), chemical and RNAi perturbations, and bioinformatic analyses.

      The major strengths of the work include:

      • The use of CRC cell lines representing normal, primary, and metastatic states, including matched primary and metastatic lines derived from the same patient.
      • A systematic analysis of alternative polyadenylation (APA) and transcription termination, revealing a potential uncoupling between these two closely related processes.
      • The introduction of a novel quantitative metric, cleavage-termination distance, to examine the relationship between PAS usage and RNAPII termination.
      • The identification of a negative association between cleavage-termination distance and gene expression, suggesting an additional regulatory layer influencing gene expression.

      However, certain limitations should be considered:

      • The generalisability of conclusions regarding metastatic CRC is limited by reliance on a single metastatic cell line.
      • The translational relevance of the findings could be further strengthened through patient-level or clinical data analysis.

      Advance:

      The study proposes potentially novel roles for 3′ end cleavage and transcription termination in regulating gene expression in colorectal cancer. In particular, the conceptual distinction between APA site shifting and transcription termination, together with the introduction of the cleavage-termination distance metric, represents a conceptual advance.

      Audience: The work is primarily positioned within basic research. With additional translational context, it may also attract interest from a broader audience.

      Field of expertise: transcriptional regulation and bioinformatics

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Factors involved in pre-mRNA cleavage and polyadenylation (CPA) are upregulated in many cancers and have been found to be associated with poor prognosis. In their manuscript "Proximity of pre-mRNA 3′ end processing and transcription termination predicts enhanced gene expression", Stępień et al. use colorectal cancer (CRC)-derived cell lines as a model of CPA overexpression to study its biological consequences. To this end, the authors initially confirm increased expression of CPA factors in these cell lines and demonstrate that their knock-down strongly decreases the colony-forming ability of primary tumour-derived CRC cells. They further assess various phenotypes that are expected to depend on CPA activity based on the current knowledge in the field, including poly-A site selection, occurrence of transcription-replication conflicts, and the site of transcription termination. Contrary to expectations, they find a proximal shift in transcription termination to be the most prominent change in CRC cell lines with high CPA levels, despite no clear preference for proximal poly-A site usage in these cells, suggesting an uncoupling of both processes. The authors combine their 3'-end mapping data and T4ph-mNET-seq data mapping terminating RNAPII to score cleavage-termination distance at individual genes and find shorter distances to correlate with increased gene expression in the different cell lines. Overall, this is a carefully conducted study, and the claims and conclusions are well supported by the data.

      I have some minor comments:

      1. PLA assay to quantify transcription-replication conflicts (Figure 2). The quantified data look very convincing and are also in good agreement with the proximal shift in transcription termination that is demonstrated later in the paper. However, the PLA channel signal in the microscopy image examples shown in panel A looks very blurry, and it is hard to imagine that one would be able to count the number of foci based on this. This may just be an issue with the resolution of the image provided. Apparently, there are far fewer foci in the treated samples shown in panel B; maybe microscopy images for these could be provided as well? Also, since none of the treatments impact the number of TRCs, it would have been nice to include a positive control known to induce TRCs to demonstrate that the assay works (if such a control is known). This is optional, and I would not ask to repeat the entire experiment just for this additional control (but maybe the authors have done it and the data is already available?).
      2. Figure 2A-C: please include information on the number of cells quantified.
      3. Figure 2C: In the label, please include degron, e.g. HCT116 CPSF73-AID rather than just HCT116
      4. Figure 5C: When quantifying nascent transcription based on mNET-seq, to what extent would one expect terminally paused RNAPII along the gene body (premature termination events) to contribute to the increased signal? That is, could an increase in stalling be mistaken for an increase in transcription? Based on the metagene plot in Fig 2A it doesn't look like it, but the authors may be able to estimate the effect (if any) from their data.

      Significance

      The observed uncoupling of poly-A site selection and size of termination window is unexpected and raises important questions on how these coupled processes can be regulated independently.

      Strengths of the study:

      i) Parallel assessment of different CRC-based cell lines provides evidence of phenotype stability across patients.

      ii) Brings together strong technical expertise combining different state-of-the-art methodologies to map and correlate poly-A site usage, site of transcription termination, and levels of nascent transcription within the same cell lines under the same conditions, providing a comprehensive dataset.

      Limitations:

      i) For the time being, observation limited to CRC cell lines.

      ii) Mechanism behind proximal shift of termination to be determined.

      I expect this work to be of interest to an audience interested in transcription and regulation of gene expression more broadly, with potential translational relevance for cancer therapy.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      This study investigates how altered expression of cleavage and polyadenylation (CPA) factors affects alternative polyadenylation (APA), transcription termination and cellular phenotypes in colorectal cancer (CRC) cell lines. The authors combine genetic perturbations of CPA factors with chemical inhibition of CPSF73 and assess effects on clonogenic potential, transcription-replication conflicts, APA profiles, and transcription termination-associated RNAPII phosphorylation patterns. The main comparisons are performed between healthy (1CT), primary tumour (SW480, HCT116) and metastatic (SW620) cell lines, which are reported to contain altered expression levels of CPA factors.

      The data suggest differential dependence on CPA factors between primary tumour-derived and metastatic CRC cell lines, as well as changes in transcription termination patterns. The data are overall well-presented with clear figures. However, in several cases the strength of the conclusions appears to exceed the support provided by the data, and alternative interpretations should be considered.

      Major comments

      1. Clonogenic sensitivity to CPA factor perturbation and comparability of clonogenic assays between cell lines:
        • The data indicate that clonogenic potential in SW480 is strongly dependent on CPSF73 and PCF11, whereas SW620 appear less sensitive. However, the interpretation is complicated by differences in depletion efficiencies. In SW620 cells, PCF11 depletion appears inefficient, and protein levels remain higher than in siLUC-treated SW480 cells (Fig. 1D and S1C; also in comparison to 1CT by inference of Fig. 1C). Thus, the apparent resistance of SW620 cells could reflect insufficient depletion rather than true biological tolerance. The effectiveness of siCPSF73 treatments is difficult to assess from the presented data. Quantification of protein knockdown levels should be provided and incorporated into the interpretation.
        • In Fig. 1D, 1E, and S1D, colony formation of DMSO- or siLUC-treated SW620 and SW480 cells differs markedly in absolute terms. However, the graphs are normalized separately for each cell line, which obscures this difference. This raises two concerns: First, the baseline clonogenic capacity differs between the lines and should be discussed. Second, it is unclear whether direct comparisons between cell lines are valid when normalization is performed independently. For example, in absolute terms, 1 µM JTE-607 appears to have a similar effect in SW620 cells as 5 µM in SW480 cells, which would contradict the conclusion that metastatic cells are more tolerant to CPA perturbations. This issue should be explicitly addressed.
      2. Interpretation of transcription termination markers:
        • The study uses RNAPII T4ph as a marker of transcription termination, which is well justified based on ref. [30], but the mechanistic basis of this modification is still not fully understood. Changes in T4ph localization are interpreted as consequences of CPA activity, but possible differences in kinase or phosphatase activities between cell lines, which could affect T4ph levels or localization, are not considered. Therefore, conclusions based solely on T4ph redistribution should be presented with greater caution, and alternative explanations should be acknowledged.
        • Line 240 states that premature termination is increased in primary tumour cells. However, the data show increased T4ph signal (Fig. 4B) but no change in total RNAPII occupancy in gene bodies (Fig. 4A). This does not directly demonstrate increased termination. Additional evidence or a more cautious interpretation would be appropriate.
      3. Cleavage-termination distance as a predictor of transcript levels:
        • Figure 5A presents median distances across all genes. It would be informative to perform a gene-wise comparison between cell lines (difference in cleavage-T4ph distance for the same gene, e.g. in 1CT vs. HCT116, individual differences plotted across all genes). This analysis could help clarify how frequently individual genes experience the effect (shortening of the cleavage-T4ph distance between 1CT and tumour cells) that is observed globally.
        • The manuscript claims that proximity between pre-mRNA 3′-end cleavage and transcription termination predicts increased nuclear transcript levels. However, the correlation coefficients are small (Spearman r ~ -0.2 at most), indicating weak predictive power. Therefore, the use of the term "predicts," especially in the manuscript title, appears to overstate the strength of the relationship. The authors should either moderate this claim or provide additional analysis to support stronger predictive value.

      Minor comments

      • Figures 1B and S1A: The discontinuous y-axis makes it difficult to assess relative protein level differences between normal and cancer samples. Statistical testing should be included to evaluate significance.
      • Lines 217-218: The text should emphasize that nuclear RNA abundance may not reflect cytoplasmic mRNA levels, particularly when APA alters 3′UTRs and may affect mRNA stability.
      • Lines 261-264: The cleavage-termination distance metric should be more clearly defined as the distance between the polyadenylation site and the T4ph signal peak.
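
      For illustration, one way to state such a definition unambiguously is as a strand-aware distance, combined with the gene-wise comparison suggested in major comment 3. The sketch below is an assumption about how the metric could be written down, not the authors' implementation: the function names are hypothetical, the T4ph peak is treated as a single summit coordinate, and both coordinates are assumed to lie on the same chromosome.

```python
def cleavage_termination_distance(pas_position, t4ph_peak_position, strand):
    """Distance in nucleotides from the major PAS to the T4ph peak,
    measured in the direction of transcription (positive = downstream)."""
    if strand == "+":
        return t4ph_peak_position - pas_position
    if strand == "-":
        return pas_position - t4ph_peak_position
    raise ValueError("strand must be '+' or '-'")


def distance_change(reference, tumour):
    """Per-gene change in distance (tumour minus reference) for genes
    measured in both cell lines; negative values indicate shortening."""
    shared_genes = reference.keys() & tumour.keys()
    return {gene: tumour[gene] - reference[gene] for gene in shared_genes}


# Hypothetical coordinates for a single gene on the minus strand.
print(cleavage_termination_distance(52_000, 50_500, "-"))
```

      Plotting the values returned by `distance_change` across all genes would show how often individual genes follow the global trend.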

      Referees cross-commenting

      Reviewer #3: Contrary to what is implied in the reviewer's report, this manuscript does not report the effects of the CPSF73 inhibitor JTE-607 on APA. On this note, as the authors discuss uncoupling of cleavage and transcription termination, they could consider (this is not a request) testing how the cleavage inhibitor JTE-607 impacts the distribution of the transcription termination marker T4ph, and whether the effects differ between cell lines in which the coupling appears to be different. This could give mechanistic insights into the sources of the differences between cell lines.

      Significance

      This study addresses an important question in RNA biology and cancer research: how altered expression or pharmacological targeting of CPA factors affects alternative polyadenylation, transcription termination, and cellular phenotypes in CRC models. This topic is timely, as CPSF73 has been proposed as a therapeutic target, making it important to understand the molecular and cellular consequences of modulating CPA factor activities. A key strength and robust finding of the study is the identification of unexpected relationships between pre-mRNA 3′-end processing and transcription termination during CRC progression. Notably, the authors report that changes in alternative polyadenylation and transcription termination appear to be uncoupled and may even occur in opposite directions. This challenges simplified models in which these processes are tightly coordinated and suggests that their (mis)regulation in cancer cells may be more complex than previously appreciated. Secondly, the study provides an interesting observation that gene-specific changes in cleavage-T4ph distance correlate negatively with changes in nuclear levels of processed transcripts. This suggests a potential relationship between the spatial coupling of 3′-end processing and transcription termination and transcript abundance. If validated mechanistically, this could represent a conceptual advance in understanding how transcription termination dynamics influence gene expression outputs. However, the observed correlations are relatively weak, and the mechanistic basis of this relationship remains unclear. As such, this advance is primarily descriptive at this stage. Several interpretations of experimental data would benefit from more cautious framing or additional analysis. In particular, the relationship between changes in CPA factor expression levels and sensitivity to the CPSF73 inhibitor JTE-607 across CRC cell lines remains unclear from the presented data. 

      This work will be of interest primarily to basic researchers in RNA processing and transcription regulation, gene expression control, cancer cell biology and pharmacological targeting of RNA-processing machineries.

      Reviewer field of expertise: My expertise is in RNA processing and gene regulation. I do not have specific expertise in clinical oncology or cancer biology.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      See Response to reviewers.docx

      Revision added:

      TOGGLE Cloud.docx

      Response to reviewers.docx

      High-resolution image.zip

      If the files are not displayed correctly, please have the editor contact the authors. As the submission system may compress images, a file named "High-resolution image.zip" has been uploaded.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The authors develop a method that extracts functional properties and clusters cells from single-cell RNA sequencing data using machine learning techniques.

      As it stands, there are several major shortcomings in the presentation of the work:

      • The motivation for the method is not well explained. Certainly, analyses of single cell transcriptomics data do not capture the full state or trajectory; however, there isn't any concrete example of the problem that this method intends to solve, nor of how or why existing methods fail to capture this.
      • No motivation is given for any of the approaches that feature in the method, and therefore there is no consistent logical thread that a reader can follow to be convinced that the results are plausible.
      • Methods are not sufficiently explained.

      For example: There is no indication of how the original graph is obtained, nor what is diffusing in the 'graph diffusion' method. Statements like "The term 'sparse' indicates the process of sparsifying the matrix." do not explain anything about the actual process of making the matrix sparse. This could perhaps be understood in terms of a 'sparse' function in a particular linear algebra package, which would implicitly make this procedure concrete, but this is not mentioned.

      Section 2.2.2 mentions a reinforcement learning scheme; however, none of the following explanation describes anything related to the commonly accepted reinforcement learning literature, and several quantities (such as the loss) remain undefined. Similarly, section 2.2.3 mentions the BERT pre-trained transformer without indicating how specifically it was modified or trained for this particular purpose, except perhaps in Figure 4, which itself is intensely confusing. Again in this section the authors mention a 'genetic algorithm' with no reference to any commonly accepted approaches used in the development of genetic algorithms for optimisation, and with no explanation of what exactly is optimised or how convergence is monitored.

      No code implementation is provided, so it is impossible to use one to understand any of the methodology, and the work is impossible to reproduce.

      • Where mathematical notation is used it is incredibly confusing to read, using multiple symbols for different concepts, and not appearing to conform to any commonly accepted convention. In some cases, these are missing completely, for example on page 6, rendering it entirely impossible to follow.
      • The results do not support the assertions made about the method.

      No explanation of the alternative methods is given in section 3.1, nor why they are expected to perform the tasks chosen, or what the configurations of these models are and whether these have been optimised. In Table 2 many alternative methods are listed; however, there is no explanation of why only a small subset were chosen for comparison, nor of what information the authors base their conclusions on (whether these were actually executed for this purpose, or whether they were interpreted from the paper).

      Metrics such as 'accuracy' are not defined, and are the only numerical evaluation of the method, whereas one would expect considerably more detailed evaluation of the claims, such as in the CoSpar paper.

      Section 2.4 mentions 'Details of scRNA-seq data processing and experimental methods are shown in the Supplementary Appendix 6 - Animal Processing.', however I have no supplementary material titled 'Appendix 6', and nothing at all that documents the scRNA-Seq pipeline.

      Section 3.2 seems to describe a very manual procedure for identifying these clusters, and seems to bear no relation to the TOGGLE procedure defined previously, so it's not clear how good an indication this is of the performance of the algorithm. Furthermore, the subsequent results seem to rarely refer to the TOGGLE method at all, and lack any meaningful comparison to alternative methods or why TOGGLE is essential for obtaining these results.

      In many cases the plots in this section are so distorted by compression that making out the text and the points is essentially impossible, and so I cannot comment on any of these.

      I would strongly recommend that the methodology of the paper is greatly expanded to cover exactly what is done, such that it is possible to reproduce in its entirety. Asking a third party who is an expert in machine learning to read through the descriptions and the mathematics would also be highly beneficial to ensure their correctness. Furthermore, it is essential that all of the claims made in the introduction and throughout the paper be systematically and explicitly validated in the results. If this cannot be done on real data, where ground-truth labels and trajectories are hard to come by, some evidence for these claims could be acquired by evaluation on simulated data.

      Significance

      The inference of cell state and trajectories from single cell RNA sequencing is a timely and important task in computational biology, with many important downstream applications. The method described in this paper aims to distinguish functionally distinct cell populations that exhibit small differences in transcript counts. However, it is not precisely articulated why the complicated approach proposed here is advantageous over simpler more conventional approaches, such as graph clustering, random-walk based methods such as CoSpar, or trajectory inference based on ODE kinetics such as in scVelo.

      Furthermore, the methods described are exceptionally vague and hard to follow, with mathematical descriptions and naming schemes that are inconsistent with the commonly accepted literature of the techniques referenced. Therefore it is also difficult for even a well-prepared reader to come to their own conclusions as to the performance and applicability of the proposed approach. This issue is compounded by the fact that there is very little in the way of validation of the specific claims made of the method, let alone in relation to alternative methods.

      The study would be greatly improved by expanded methods sections, documenting in detail what is done at each stage. Where existing work is referenced without an exact description, how the implementation differs from the reference must be addressed. Most of the description is currently in text form only, which is wholly insufficient for the kind of complex mathematical operations described. Furthermore, many of the claims throughout the paper go unaddressed in the results, where there are only a few accuracy metrics and comparisons of results, and there is no attempt to rigorously demonstrate an advantage of any of the novel components presented (for example, in an ablation study). Expanding these numerical metrics and comparisons across all applications of the method is essential for demonstrating the assertions of the paper - for example, constructing a metric for the comparison of the TOGGLE and ground-truth UMAP comparisons. In cases where there is no real ground truth in the data, simulated datasets could be used to demonstrate the plausible performance of TOGGLE in ideal scenarios.
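      As a rough illustration of the kind of numerical metric meant here, the sketch below computes the adjusted Rand index between a predicted labelling and a ground-truth labelling, for instance on a simulated dataset. The choice of metric and the function name `adjusted_rand_index` are assumptions made for illustration only; the manuscript does not use or define this metric.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labellings of the same cells.

    1.0 means identical partitions (up to renaming of the labels);
    values near 0 mean agreement no better than chance.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labellings must cover the same cells")
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pairs to disagree on

    # Pairs of cells that share a cluster in both labellings.
    both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    # Pairs of cells that share a cluster in each labelling separately.
    same_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    same_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = same_true * same_pred / comb(n, 2)  # chance-level agreement
    maximum = (same_true + same_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labellings
    return (both - expected) / (maximum - expected)


# Hypothetical simulated ground truth and one predicted labelling.
truth = [0, 0, 1, 1]
prediction = [1, 1, 0, 0]  # same partition, labels renamed
score = adjusted_rand_index(truth, prediction)
```

      Reporting such a score for every application, and for every method compared, would make the comparisons quantitative rather than visual.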

      My expertise is in computational biology and machine learning, with a background in physics.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Overall Assessment

      The manuscript addresses an important problem: inferring subtle functional states from single-time-point scRNA-seq. However, essential methodological details are missing, several claims lack rigorous support, and key computational steps cannot be reproduced from the current description. The biological experiments validate ferroptosis itself, but do not validate the correctness of the inferred trajectories or cluster boundaries.

      Major Comments

      1. In Step 1 of TOGGLE (Figure 1), the method employs graph diffusion and emphasizes that the resulting asymmetric distance matrix encodes additional information. In fact, the entire downstream TOGGLE framework is built upon this graph and its diffusion-based similarity. However, the manuscript currently lacks essential information regarding this core component: 1) it does not explain how the asymmetric distance matrix is generated, nor does it provide explicit formulas or computational steps; 2) it does not specify which type of diffusion operator is used; 3) it is unclear how the underlying graph is constructed from the expression matrix, e.g., whether a standard kNN graph is used and whether edge weights are normalized.
      2. Although the manuscript demonstrates the biological usefulness of TOGGLE across several datasets and provides experimental validation in multiple systems, the method still lacks essential ablation analyses and performance benchmarking. These components are critical for establishing the stability, robustness, and necessity of each part of the proposed framework. Without systematic evaluation, such as comparisons to existing trajectory or state-inference methods, or ablations of key modules (diffusion construction, boundary detection, masking, GA-based merging), it remains difficult to assess whether TOGGLE's improvements are due to the core algorithmic design or to dataset-specific behaviors. Incorporating these analyses would substantially strengthen the evidence supporting the method's reliability and generalizability.
      3. I find the methodological description insufficiently clear. The overall algorithmic framework appears to be assembled from multiple existing computational components without a unified or coherent theoretical formulation. As a result, the rationale behind each module and the mathematical connections between them are not rigorously established. In addition, the mathematical expressions throughout the manuscript lack standardized notation and clarity. For example, vectors and matrices should be consistently denoted using bold symbols, and expressions such as "cov_D" are not appropriate for describing covariance matrices in a formal mathematical context. The absence of precise notation and properly structured equations makes it difficult for readers to understand, evaluate, or reproduce the proposed method. A clearer and more rigorous mathematical exposition is necessary to support the validity of the algorithmic design.
      4. Key components (δ computation, binary assignment, recursion criteria, optimization objective) lack formal definitions or pseudocode. The "reinforcement-like" description is conceptual rather than methodological.
      5. Both masking and GA introduce stochasticity and complexity. Their necessity is not demonstrated, and no ablation study tests whether they contribute to performance or stability.
      6. Masking ratio, GA population size, mutation rate, stopping criteria, and the details of pseudotime usage are all unspecified, making the computational procedure difficult to reproduce. Moreover, given the presence of multiple stochastic components, including masking, genetic-algorithm iterations, and graph-related randomness, the manuscript should evaluate the stability of the method under different random seeds or bootstrap resampling. Without clearly defined parameters and robustness analyses, it is challenging to assess the reliability and reproducibility of the proposed framework.
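
      To make concrete the level of specification requested in comment 1, the sketch below shows one conventional construction: a directed kNN graph with Gaussian edge weights, row-normalized into a random-walk operator and raised to a power. This is an assumption for illustration and not the authors' procedure; the function names, the local bandwidth rule, and the toy points are all hypothetical. It also shows one common source of asymmetry: kNN relations are not mutual, and row normalization does not preserve symmetry.

```python
import math


def knn_weights(points, k):
    """Directed kNN graph: row i holds Gaussian weights to the k nearest neighbours of i."""
    n = len(points)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        neighbours = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )[:k]
        # Local bandwidth: distance to the k-th neighbour (1.0 if that distance is 0).
        sigma = neighbours[-1][0] or 1.0
        for d, j in neighbours:
            weights[i][j] = math.exp(-((d / sigma) ** 2))
    return weights


def transition_matrix(weights):
    """Row-normalize the weights so that every row sums to 1 (P = D^-1 W)."""
    matrix = []
    for row in weights:
        total = sum(row)
        matrix.append([w / total for w in row])
    return matrix


def diffuse(P, steps):
    """Return P raised to the given number of steps by repeated multiplication."""
    n = len(P)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(steps):
        result = [
            [sum(result[i][m] * P[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)
        ]
    return result


# Toy one-dimensional "expression" values for three cells.
points = [(0.0,), (1.0,), (3.0,)]
P = transition_matrix(knn_weights(points, k=1))
P2 = diffuse(P, steps=2)
# P[2][1] is nonzero while P[1][2] is zero, so the operator is asymmetric.
```

      Stating the equivalent choices (distance, k, kernel, normalization, number of diffusion steps) explicitly in the manuscript would allow the asymmetric distance matrix to be reproduced.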

      Minor Comments

      1. "cellular neighborhoods" requires a precise definition.
      2. Some figures (e.g., Fig. 2) are schematic and would benefit from quantitative clarification.

      Significance

      Advancement: If clarified and rigorously validated, TOGGLE could become a useful tool for trajectory-free state inference. Currently, the novelty lies more in application breadth than in methodological rigor.

      Audience: Likely audiences include computational biologists, neuroscientists studying ferroptosis, and researchers working on NSC epigenetics. Usage beyond these areas depends on methodological clarification.

      Expertise: Keywords: single-cell transcriptomics, graph diffusion, clustering algorithms, trajectory inference, statistical modeling. I am comfortable evaluating the computational components; biological assays are evaluated from standard computational-biology perspective.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Guo et al., Wnt/β-catenin in the muscle spindle

      Guo and colleagues investigated a role of canonical wnt signaling in the muscle spindle. Muscle spindles are formed by myofibers upon innervation by proprioceptive sensory neurons. The signal for induction of the spindle, Nrg1, is provided by the proprioceptive neuron and induces the expression of immediate early genes like Egr3 in the emerging spindle. Subsequently, bag1/2 and chain fibers are formed, and a capsule is generated around the spindle. Besides the inducing signal, little is known about spindle development, and to my knowledge all other work remained descriptive.

      To work with muscle spindles is demanding as the spindles are very rare in muscles. The authors have developed new approaches like collecting spindles for RNAseq analyses that allow them to molecularly analyze the spindle.

      In the manuscript, the authors characterize wnt signaling in muscle spindle development. They show data from the Axin2-GFP mouse strain (Axin2 is a well-known target of canonical wnt signaling). Their data indicate that both extra- and intrafusal fibers express Axin2 at P0 and P5; at P25 and P40 Axin2 is maintained in bag2 fibers and capsule cells of spindles and downregulated in other myofibers. This indicates that canonical wnt signaling is initially active in all fibers, and subsequently restricted to bag1 fibers and capsule cells.

      Two mouse strains are used to conditionally mutate β-catenin, the transducer of canonical wnt signals, in the spindle: the first strain relies on the use of Egr3-Cre (mutates β-catenin in intrafusal fibers and capsule cells; recombination sets in presumably shortly after E15.5). The second strain uses HSACreERT2 to mutate β-catenin (recombination is induced in all myofibers after tamoxifen treatment). In the data provided, tamoxifen was injected at P5 and the animals were analyzed at P25 or later. Molecular phenotyping is done on the Egr3-Cre strain using RNAseq showing around 750 down- and 300 upregulated genes in isolated muscle spindles from β-catenin mutants.

      Other phenotyping data: Histology (Egr3-Cre): 1) changes in the distribution of GLUT1 in the spindle, 2) VCAN downregulation in capsule cells of the mutant, 3) abnormal aggregation of nuclei in bag2 fibers, 4) abnormal annulospiral morphology, i.e. proprioceptive neuronal endings are abnormal.
      Histology (HSACreERT2): 1,2) GLUT1 and VCAN unaffected; 3) bag2 nuclear aggregation is normal; 4) abnormal annulospiral morphology; 5) abnormal gait. The authors assign the differences in phenotypes to the differences in cell type specificity of recombination (Egr3-Cre: intrafusal fibers and capsule cells; HSACreERT2: intra- and extrafusal fibers). This indicates that 1) Wnt/β-catenin affects annulospiral endings indirectly via a primary deficit in bag2 fibers, 2) the nuclear aggregation phenotype in bag2 fibers is caused indirectly via a primary deficit in capsule cells, and 3) a cell-autonomous function of Wnt/β-catenin exists in capsule cells. Overall, the work is carefully done, and the data are presented clearly. The phenotypes are relatively mild, in particular the behavioral consequences of the mutation.

      I have some major points that should be discussed and taken into account in the writing of the paper.

      1. Developmental phenotypes. The authors claim the phenotypes observed are caused by developmental deficits, but the animals are only analyzed at P25 (histology and RNAseq) or later time points. From the data shown it cannot be excluded that the spindle is formed correctly but that spindle maintenance is impaired. Additional time points would be needed to convincingly argue a developmental phenotype. Specifically, analysis of a time point when control and mutant spindles have similar histology is needed, in order to argue that subsequent developmental steps are impaired.
      2. Differences in phenotypes in the two strains. Can the authors be sure that differences in phenotypes observed in Egr3-Cre and HSACreERT2 lines are exclusively due to cell type specificity of recombination, and not due to differences in recombination efficacies? This is particularly important for the syncytial fibers. Incomplete recombination in a fiber might allow nuclei that have not recombined to provide sufficient β-catenin for signaling in the entire fiber.
      3. Please provide data that show that β-catenin is expressed in capsule cells.

      Minor

      The following sentence refers to the wrong figure (should refer to Fig. 4): While mutant loops had similar widths, loop number was reduced and the distance between loops increased (Figure 3G-G').

      Significance

      Guo and colleagues investigated a role of canonical wnt signaling in the muscle spindle. Muscle spindles are formed by myofibers upon innervation by proprioceptive sensory neurons. The signal for induction of the spindle, Nrg1, is provided by the proprioceptive neuron and induces the expression of immediate early genes like Egr3 in the emerging spindle. Subsequently, bag1/2 and chain fibers are formed, and a capsule is generated around the spindle. Besides the inducing signal, little is known about spindle development, and to my knowledge all other work remained descriptive.

      To work with muscle spindles is demanding as the spindles are very rare in muscles. The authors have developed new approaches like collecting spindles for RNAseq analyses that allow them to molecularly analyze the spindle.

      In the manuscript, the authors characterize wnt signaling in muscle spindle development. They show data from the Axin2-GFP mouse strain (Axin2 is a well-known target of canonical wnt signaling). Their data indicate that both extra- and intrafusal fibers express Axin2 at P0 and P5; at P25 and P40 Axin2 is maintained in bag2 fibers and capsule cells of spindles and downregulated in other myofibers. This indicates that canonical wnt signaling is initially active in all fibers, and subsequently restricted to bag1 fibers and capsule cells.

      Two mouse strains are used to conditionally mutate β-catenin, the transducer of canonical wnt signals, in the spindle: the first strain relies on the use of Egr3-Cre (mutates β-catenin in intrafusal fibers and capsule cells; recombination sets in presumably shortly after E15.5). The second strain uses HSACreERT2 to mutate β-catenin (recombination is induced in all myofibers after tamoxifen treatment). In the data provided, tamoxifen was injected at P5 and the animals were analyzed at P25 or later. Molecular phenotyping is done on the Egr3-Cre strain using RNAseq showing around 750 down- and 300 upregulated genes in isolated muscle spindles from β-catenin mutants.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      In their study the authors address an important aspect in developmental neurobiology. In particular, they investigate the molecular underpinnings of muscle spindle development in the mouse. Muscle spindles are essential components to transmit muscle stretch and proprioceptive feedback to the spinal cord. They first analyze preexisting muscle spindle specific gene expression patterns that have been established before. They find an intriguing enrichment of expression of components of the Wnt/beta-catenin pathway. Inspired by these findings the authors next genetically deleted specific components from capsule and intrafusal spindle fibers during embryogenesis. They found profound gene expression changes and morphological alterations in the spindle fibers but also at the sensory proprioceptive nerve endings. Finally, the authors deleted beta-catenin at postnatal stages and detected significant defects in proprioception function. Altogether, they conclude that beta-catenin signaling exerts important function in muscle spindle development through cell-autonomous (spindle intrinsic) and non-cell-autonomous (affecting nerve terminals and proprioceptive functions) mechanisms.

      Overall the study is excellently conceived, the experiments performed at very high standards and the results were interpreted with great care. The manuscript is very well written and the data is presented neatly. In my opinion there are just a few very minor items that the authors could address to improve the reading experience.

      1. In Figure 1B, the font color in the blue boxes is not clearly readable; I recommend using a darker color tone or even black.
      2. Figure 1C-H, it would be useful to outline the capsule and fiber compartments in the fluorescent panels to improve the orientation and better appreciation of the expressed genes.
      3. Figure 3C'-3C'', the authors should define the meaning of the red and black labelled gene names.
      4. Figure 3C', the yellow writing is hard to read; I suggest using a darker color tone.
      5. Figure 4, the authors should write the proper genotype in the boxes and in italic font.
      6. Figure 5, the authors should write the proper genotype in the boxes and in italic font.
      7. In the introduction, the authors could cite a few more (perhaps major reviews) about the Wnt/beta-catenin biochemical functions. Ideally after the first sentence in the respective paragraph.

      Significance

      Proprioception is an essential process for controlling postures and movement. The anatomical development of the muscle spindles, that are responsible for detecting muscle stretch and transmitting proprioceptive feedback to the spinal cord, has been quite well described. However, the molecular mechanisms that regulate the development of the muscle spindles with the attached proprioceptive nerve endings are not well understood. To address this gap in our knowledge the authors evaluated muscle spindle specific gene expression and probed the function of the Wnt/beta-catenin pathway (highly specifically expressed in spindle components) in muscle spindle development and function. They found striking and significant deficits in muscle spindle development and proprioception upon muscle spindle specific ablation of beta-catenin. Altogether, they conclude that beta-catenin signaling exerts important function in muscle spindle development through cell-autonomous (spindle intrinsic) and non-cell-autonomous (affecting nerve terminals and proprioceptive functions) mechanisms. Conclusively, the data and findings in the present study reflect a true advance and provide new insights into the molecular mechanisms that drive muscle spindle development and therefore proprioception.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The authors demonstrate the importance and roles of Wnt/β-catenin signalling in mammalian muscle spindle development and maintenance.

      Major comments:

      The paper is very well and clearly written with full details of the data and methods. The results, statistical analyses, and conclusions are convincing without any need for further experiments.

      Minor comments:

      • In the Introduction (paragraph 3) the authors state "Extensive morphological studies have shown that muscle spindle development begins around embryonic day (E) 14, when slow myofibers first contact proprioceptive axons and differentiate into intrafusal fibers in a sequential process." I would suggest "Extensive morphological studies have shown that muscle spindle development in the mouse begins around embryonic day (E) 14, when proprioceptive axons first contact primary myotubes initiating the differentiation of primary and secondary myotubes into intrafusal fibers in a sequential process." In the same paragraph it is stated that "Recent work has shown that Lrp4 expression in intrafusal fibers is necessary to maintain the sensory synapses of annulospiral endings..." Sensory endings, including those of muscle spindles are not usually, nor conventionally, regarded as synapses.
      • Legend to Fig 1 (F,G; inset in G shows enlargement of Fzd2) Fzd2 to read Fzd10.
      • Legend to Fig 2 (A) "Dotted lines demarcate equatorial region of spindles." I suggest "Dotted lines demarcate areas enlarged in B'-C', including equatorial regions of spindles."
      • Paragraph beginning "Next, to associate these changes..." "Surprisingly, for intrafusal genes, the most enriched GO term was "neuron projection morphogenesis,..." Why is this surprising?
      • Legend to Fig 4 "a shorter spindle height in mutants" This is unclear; I suggest "a smaller spindle diameter" would be clearer. Similarly "and shorter nucleus height" is unclear; I suggest "and smaller nuclear accumulation diameter".
      • Legend to Fig 5 Again, I think "spindle height" would be clearer as "spindle diameter". Specific experimental issues that are easily addressable.
      • The figures are all clear, in some cases when sufficiently enlarged, but careful attention needs to be paid to their final enlargements to ensure that the essential details remain clearly visible.

      Referees cross-commenting

      It is satisfying to see that all three reviewers agree on the importance of this paper, and that two reviewers clearly agree that no further experimental work is necessary to support the conclusions reached by the authors.

      Significance

      This is an important work of major significance in the area of muscle spindle studies, and in the wider field of the genetic basis of the integrated development of complex sense organs.

      My expertise is in the structure, ultrastructure, immunohistochemistry, and physiology of muscle spindles.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Point-by-point response to Reviewer comments:

      We copied the Reviewer comments below in italics. Revisions we propose in response to Reviewer comments are underlined.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The manuscript by Yin et al investigates how epidermal cells shape somatosensory neuron (SSN) morphology and function through selective ensheathment in Drosophila. This study builds on earlier work by another group showing that the phagocytic receptor Draper (Drpr) is a crucial epidermal factor for dendrite pruning and clearance. In the present study, the authors show that Drpr also functions in the epidermis to establish the characteristic stretches of epidermal ensheathment of dendrite arborization neurons in the fruit fly Drosophila melanogaster. This is particularly true for highly branched types of dendrites but not for dendrites that show simpler branching patterns. Overexpression of Drpr increases ensheathment and nociceptor sensitivity, linking molecular recognition to sensory modulation. Further, Drpr is known to recognize phosphatidylserine (PS) on neurites to promote ensheathment and the authors show localization of a reporter for PS with epidermal membranes. Genetic manipulations that reduce PS result in a reduction in epidermal sheaths, and the chemokine-like protein Orion, which promotes Drpr/PS interactions, is required for these processes. Overall, the manuscript is well written, although at times maybe primarily for a fly audience. Reach could be improved by making it more accessible to a non-fly audience. The observation that Drpr is not only required for removing damaged or degenerating dendrites but also for the correct ensheathment of highly branched dendrites presents an important finding that could be of interest for a wider audience provided the following points are adequately addressed:

        • The Introduction could be further elaborated to help readers understand the significance of epidermal dendrite ensheathment. Addressing the following points may achieve this: (i) The Introduction would benefit from including details on developmental disorders and neurological diseases associated with defects or abnormalities in dendrite ensheathment.

      We appreciate this suggestion. We allude to possible connections between ensheathment defects and human disease in the discussion but agree that it would be appropriate for the introduction; we will underscore this possible connection more clearly in our revised manuscript. We note studies of epidermal ensheathment are limited in mammalian systems, so links between dysregulation of epidermal ensheathment and disease have not been firmly established.

      (ii) In lines 74-79, it is unclear whether the described findings are conserved across evolution or were demonstrated in a specific model organism.

      The Reviewer refers to our statement about similarities in the cellular mechanism of epidermal ensheathment and phagocytosis. Indeed, these features are evolutionarily conserved in vertebrates, and we agree that it is worthwhile to emphasize this point. We added a statement underscoring the evolutionary conservation of the morphogenetic mechanism along with the relevant citation.

      (iii) Including a description of the known literature on phagocytosis in this process would help readers better understand the novelty and significance of this study.

      We agree with the Reviewer. In our revised introduction we will include a more detailed description of key features of phagocytic engulfment and highlight the salient differences between ensheathment and phagocytosis including the failure to complete the endocytic event in ensheathment and the persistence of PIP2 at the membrane.

      (iv) Details of published Draper function in Han et al 2014 should be elaborated along with unanswered question that is addressed in this study.

      The Han et al 2014 study established that epidermal cells, not Drosophila hemocytes (professional phagocytes), are primarily responsible for phagocytic clearance of damaged dendrites in the periphery. Similarly, the Rasmussen et al 2015 study we cite established that skin cells in vertebrates (zebrafish) act as primary phagocytes in removal of damaged peripheral neurites. These studies demonstrate the phagocytic capacity of epidermal cells, particularly in recognition of somatosensory neurites, and the Han study demonstrates that Draper is required for this epidermal phagocytosis. Neither of these studies addresses mechanisms of epidermal ensheathment; we will clarify this point in our revised introduction.

      • It is unclear why the authors focus exclusively on Drpr and Crq, without addressing emp and CG4006, both of which show higher expression levels than the former. Moreover, the conclusion that 14 out of 16 engulfment receptor genes have no role based solely on RNAi knockdown experiments is a very strong statement that may require additional validation. The authors should provide evidence that the RNAi knockdowns achieved complete loss of gene function to support their claim about 16 engulfment receptors. In addition, at most the authors can conclude that the tested genes are individually not required.

      The Reviewer makes several points that warrant discussion. First, the Reviewer asks “why the authors focus exclusively on Drpr and Crq, without addressing emp and CG40066.” The rationale for focusing on Drpr and Crq in our discussion of the expression data is that both Drpr and Crq function in phagocytic engulfment of damaged neurites. Our focus on Drpr for the remainder of the study is guided by the knockdown phenotypes; if either emp, CG40066, or any other receptor showed robust and reproducible effects on ensheathment we would have discussed them at length. Indeed, we identified a potentially novel ensheathment phenotype for NimB4 and devote a small portion of our discussion to its possible function. However, our primary focus in this study was to identify phagocytic receptors required for epidermal ensheathment of somatosensory neurites and drpr was the top hit from our RNAi screen.

      Second, we acknowledge that RNAi knockdown is often incomplete and without additional validation a negative result using RNAi is difficult to interpret. In our original text we state: “epidermal RNAi of 14/16 engulfment receptor genes had no significant effect on the extent of dendrite ensheathment in third instar larvae (Figure 1, F and G), consistent with the notion that most epidermal engulfment receptors are dispensable for dendrite ensheathment.” We do not claim that other receptors have “no role”, simply that our results are consistent with the interpretation that most receptors are dispensable. Furthermore, we acknowledge that multiple other receptors likely contribute to other aspects of ensheathment (lines 131-145; NimB4 knockdown causes an “empty sheath” phenotype). However, the Reviewer’s comments convince us that we should word our interpretation of the negative RNAi results more clearly to reflect the limitations of the approach; we will incorporate this into our revision.

      Third, the Reviewer brings up the very important point that receptor redundancy could mask phenotypes. Indeed, our studies suggest that additional pathways likely function in parallel with Drpr. We agree that potential redundancy is an important consideration and absolutely warrants discussion in this section of the results; we will add this to our revised text and we have already updated the statement in the results to read “most epidermal phagocytic receptors are individually dispensable for dendrite ensheathment.”

      The final point the Reviewer makes is that analysis of the knockdown efficacy is warranted if we want to make strong claims about gene function for other receptors. We agree that this would be an important first step, but in many cases protein perdurance masks RNAi phenotypes as well. So, efficient knockdown alone is not enough to make concrete conclusions about gene function in this developmental context.

      • What kind of genes are crq and ea?

      Crq is a Scavenger receptor and Eater is a Nimrod-family receptor (indicated in Figure 1A).

      • Comparing Figures 1C and 1E, it appears that drpr knockdown has a differential effect on epidermal dendrite ensheathment between main and secondary branches. If this observation is correct, separate quantification for each branch type would be more appropriate, along with an explanation for the observed differences.

      We agree with the Reviewer’s assessment that ensheathment appears to be largely absent on terminal dendrites following drpr knockdown but some ensheathment persists on major dendrites. In prior published studies we demonstrated that terminal branches are less extensively ensheathed than primary dendrites in wild-type larvae (Jiang et al 2019 eLife). We will provide this important context in our revised submission. We hypothesize that Drpr uniformly affects ensheathment across the arbor but agree with the Reviewer that quantification is warranted to evaluate this hypothesis. We will add this analysis to our revised submission.

      • For Figure 1K, it would be informative to examine how drpr knockdown affects dendrite length in these neurons.

      We agree with the Reviewer. We demonstrate that drpr null mutants have exuberant terminal branching, but we have not yet analyzed effects of epidermal drpr RNAi. We will add this analysis to our revised manuscript.

      • For Drpr expression (Figure 3), it would be valuable to highlight any differences in expression between primary and secondary dendritic branches.

      The Reviewer’s question about Drpr distribution at sites of ensheathment will be particularly relevant if we observe differential impacts of Drpr knockdown on ensheathment at primary and higher order dendrites. In our initial submission we showed that >70% of PIP2+ (Fig. 3B) and cora+ (Fig. 3D) epidermal sheaths also exhibited Drpr accumulation; we likewise showed that Drpr accumulation adjacent to dendrites only occurred at sites labeled by the sheath marker cora (Fig. 3G). In our revised submission, we will examine whether Drpr accumulation is more prevalent at sites of PIP2 accumulation on main branches compared to terminal branches.

      • Removing drpr leads to excessive branching of SSN dendrites. Does overexpression of drpr affect dendrite morphology in the opposite manner?

      The Reviewer asks an intriguing question about effects of drpr overexpression. We have not examined effects of epidermal drpr overexpression on dendrite morphogenesis, but we will add these experiments to our revised manuscript.

      • Although the role of drpr in dendrite ensheathment is well explored, the interactions between drpr and PS seem underexplored. For example, do the changes in ensheathment as a result of manipulating PS levels require drpr? Does changing PS levels affect Drpr localization or levels?

      The Reviewer raises two questions about the relationship between PS exposure and Drpr.

      First, they inquire whether changes in ensheathment resulting from manipulating PS levels require Drpr. We show that overexpressing the ATP8a flippase in C4da neurons, which limits PS exposure, limits the extent of ensheathment. Similarly, we show that sheath formation requires Drpr. In principle, we could assay effects of simultaneously overexpressing ATP8a in neurons and inactivating Drpr (using the Drpr null mutation), but such an experiment will likely be difficult to interpret because the individual treatments cause an almost complete loss of sheaths. We did not investigate whether increasing PS exposure increases ensheathment because prior studies demonstrated that ectopic PS exposure induces membrane shedding in C4da dendrites.

      Second, they inquire whether PS levels affect Drpr localization or levels. We demonstrate that inactivation of the PS bridging molecule Orion prevents Drpr localization at sheaths, hence we predict that neuronal overexpression of the ATP8a flippase should have a similar effect. In the revised manuscript, we will examine this possibility (monitoring Drpr distribution at epidermal contact sites with neurons overexpressing ATP8a).

      Minor Points:

        • Why is there no gene in the bold category for hemocytes in Figure 1A?

      The bold type was used to indicate the receptors that were selected for screening, using a relaxed criterion for identifying receptors that were “expressed”: any receptor detected at a level of 0.1 TPM. To this point, the figure legend states: “Epidermal candidate genes in bold exhibited a TPM value > 0.1 in at least one biological sample and were selected for inclusion in RNAi screen for epidermal phagocytic receptors required for ensheathment.”

      We acknowledge that this is a relaxed criterion for “expression” and likely includes receptors that are not appreciably expressed in epidermal cells. Within the text we compare the repertoire of hemocyte and epidermal phagocytic receptors using a more standard (albeit still relatively relaxed) threshold of 0.5 TPM. We added shading to the histograms in Fig. 1A to facilitate comparison of phagocytic receptor gene expression in hemocytes and epidermal cells.

      • Line 67: "neurons BEING the most extensively..."

      • Line 126: should read "epidermal engulfment receptors are INDIVIDUALLY dispensable"

      • Line 216: "THE DrprD 5 mutation had no significant..."

      • Line 230: "overexpression" instead of "overexpressed"

      • Line 385: similar "TO"

      These grammatical errors have been corrected. We thank the Reviewer for their careful reading of the manuscript.

      Reviewer #1 (Significance (Required)):

      This is an interesting study that adds to our understanding of the role of phagocytic receptors in shaping dendrites. Specifically, the role of Drpr (Draper) is studied, a gene previously known to be important for removal of degenerating dendrites. A limitation of the manuscript as it stands is that it seems to be written primarily for a fly audience. Contextualizing the results and the significance of this likely conserved pathway could increase the significance.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      Innervation of the skin by somatosensory neurons is a conserved process that enables perception and discrimination of mechanical stimuli. How do molecules exposed by neurons and skin cells collaborate to promote neurite-induced epidermal sheath formation? Here, the authors combine fruit fly molecular and genetic tools with high resolution imaging to address this fundamental question. Based on morphological similarity between phagocytosis and SSN ensheathment, the authors hypothesized that one or more phagocyte receptors might promote ensheathment through ligand-driven interactions with neurites. To test this hypothesis, the authors systematically screened phagocytic receptors expressed in the epidermis for functional roles in ensheathment. Through this screening approach, the authors found that the Draper (Drpr) receptor functions in epidermal cells as a significant factor required to promote ensheathment. They support this conclusion using a suite of cell- and tissue-specific RNAi tools and mutant fly lines in conjunction with elegant mechanistic work that establishes a role for the conserved "eat-me" signal phosphatidylserine (PS) in driving ensheathment.

      Major comments:

      Are the claims and the conclusions supported by the data or do they require additional experiments or analyses to support them?

      The seven key claims presented in the abstract are strongly supported by experimental data and analyses presented in the manuscript. At least one experimental result displayed in a main figure in support of the indicated key claim is summarized below. This summary does not present a comprehensive list of all data in support of a particular claim. Rather, it is an effort to confirm that each key result presented to the readership in the abstract is supported by at least one rigorously analyzed experimental result.

      We concur with the Reviewer’s interpretations of our work and appreciate the clarity of their summaries below.

        • Drpr functions in epidermal cells to promote ensheathment: Expressing a Draper RNAi under control of a larval epidermal driver (A58) led to significant reduction in total sheath length (Fig 1H), average sheath length (Fig 1I), and fraction ensheathed (Fig 1J). Similar results were obtained using two different Draper RNAi constructs.

      The argument presented through RNAi results in Fig 1 is bolstered by data using an existing validated Draper mutant line in Fig 2A-E. A question of interest to this reviewer upon receiving the paper was whether Draper functions at initial stages of sheath formation, maintenance of existing sheaths, or both. The timelapse data in Fig 2F suggests that Draper activity is dispensable for maintaining existing sheaths.

      • ...that Draper accumulates at sites of epidermal ensheathment but not contact sites of unsheathed neurons:

      Immunostaining experiments demonstrate that Drpr immunoreactivity is enriched at PIP2-positive membrane domains in epidermal cells (Fig 3A-B). Is this accumulation selective for epidermal sheaths? Yes. In Fig. 3E-G, the authors show that Drpr enrichment overlaps with the sheath marker cora but not with dendrites of C1da neurons or from unsheathed portions of C4da dendrite arbors. The authors confirm specificity of Drpr immunoreactivity through control experiments using a Drpr mutant (Supplementary Fig 2).

      • ...that Drpr overexpression increased ensheathment:

      Enforced overexpression of Draper in epidermal cells via Epidermal GAL4 driving UAS-Drpr (Fig 5A) shows significantly higher levels of ensheathment of C4da neurons as compared to controls. The authors demonstrate specificity by showing that epidermal Drpr overexpression did not induce ectopic sheath formation in C1da neurons (Fig 5E-G).

      • ...that extracellular PS accumulates at sites of ensheathment:

      Using a previously developed secreted AnnV-mScarlet reporter (Ji et al. 2023 https://doi.org/10.1073/pnas.2303392120), the authors demonstrate that PLC-PH-GFP labeled stretches were also labeled by AnnV-mScarlet (Fig 6A-B), consistent with their model that ensheathment by Drpr is mediated by PS exposure on dendrites.

      • ...that overexpression of the PS Flippase ATP8a blocks ensheathment:

      This claim is supported by demonstrating that overexpression of ATP8A, a protein that drives unidirectional PS translocation from the outer to the inner leaflet of the plasma membrane, impacts C4da neurite ensheathment. Selective overexpression of ATP8A in C4da neurons using a ppk-GAL4 induced a significant reduction in epidermal sheaths (Fig 6C).

      • ...that Orion is required for sheath formation:

      Inactivation of the chemokine-like PS bridging molecule Orion significantly reduces fraction of ensheathment (Fig 6I-L).

      • Overexpression of Draper enhanced nociceptor sensitivity to mechanical stimulus

      Consistent with a functional role for epidermal ensheathment in responses to mechanical stimuli, the authors report a significant reduction in nocifensive responses in a behavioral assay presented in Fig 6H.

      In conclusion, the authors' claims are supported by the data as presented in this version of the manuscript.

      • Please request additional experiments only if they are essential for the conclusions. Alternatively, ask the authors to qualify their claims as preliminary or speculative, or to remove them altogether.

      n/a

      • If you have constructive further reaching suggestions that could significantly improve the study but would open new lines of investigations, please label them as "OPTIONAL".

      n/a

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated time investment for substantial experiments.

      n/a

      • Are the data and the methods presented in such a way that they can be reproduced?

      Yes. The quality of the cell imaging data presented in the figures is high. The figure legends are sufficient to follow the investigators' conceptual approach and technical progress as they build their model. Transparent presentation of the screening data in Fig. 1 F-G was particularly appreciated by these reviewers.

      Are the experiments adequately replicated and statistical analysis adequate?

      Yes. We specifically commend the table outlining all statistical tests presented in the supplementary methods and linked to each figure.

      Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      Minor comments:

      1. Could the authors further clarify Drpr's anticipated window of activity during sheath formation and/or speculate further on this point in the discussion? Live imaging in Fig. 2 suggests that Drpr is dispensable for maintenance of existing sheaths. Given that Drpr is proposed to be activated through transient phosphorylation that recruits the binding partner Shark (PMCID PMC2493287), it might be useful to clarify Drpr's window of activation (i.e., transient or constitutive) for an audience more familiar with Drpr's canonical functions in engulfment. The section prior to speculation about a possible role for negative regulators of phagocytosis (Line 360) might be a possible location for this addition.

      We appreciate the insightful suggestion. As the Reviewer notes, our results are consistent with a model in which Drpr is required for formation but not maintenance of sheaths. Our original hypothesis was that Drpr would transiently localize to sheaths and be largely absent from mature sheaths. However, our antibody staining suggests that Drpr persists at mature sheaths (signal from endogenously labeled Drpr protein was too dim for live imaging in our hands). We therefore favor a model in which Drpr is transiently activated to promote sheath assembly.

      In the context of engulfment, Src42A-dependent tyrosine phosphorylation of Draper promotes association of Shark and Draper pathway activation. Src42A activation is regulated by integrins and RTKs, providing a potential point of crosstalk with other pathway(s) likely involved in ensheathment. Intriguingly, membrane recruitment and activation of Talin depends in part on PIP2, and Talin promotes both Integrin activation and recruitment of PIP2-producing PIP5K kinases, providing a potential feed-forward mechanism for increasing PIP2 accumulation, Talin recruitment, and Integrin activation, which can promote Src42A activation. In our revised discussion we will provide a more thorough treatment of mechanism(s) of Drpr activation.

      • The authors might consider developing their conclusion a bit further for a broad audience. For example, the gesture to Piezo dependence in the current final sentence might provide an opening to discuss an exciting future avenue focused on integrating molecular mechanosensors into a comprehensive model of selective SSN ensheathment important for the perception and discrimination of touch and pain.

      We appreciate the suggestion and agree that it is worthwhile to expand on the potential links between ensheathment and sensory neuron function in our revised discussion. Our studies thus far have largely explored mechanosensation, but it’s worth noting that the nociceptive neurons under study here are polymodal, and other functional classes of somatosensory neurons are ensheathed to differing degrees, so an intriguing open question is whether ensheathment selectively potentiates the function of mechanosensors or more generally enhances functional coupling of somatosensory neurons to the epidermis. Our finding that ensheathment levels can be bidirectionally regulated by drpr levels provides an entry point to more broadly characterizing functions for ensheathment.

      • Word missing or extra "in" in Line 69 after ECM?

      Corrected.

      • In Fig 1 and Fig 3, the PLC(delta)-PH-GFP reporter contains the delta symbol; elsewhere in the paper it does not. In addition, Fig 5 is denoted "PIP2 (PLC-PH-GFP)". For consistency the authors might consider using PLC(delta)-PH-GFP across all figures.

      As suggested, we updated the figures and text to include the delta symbol in the reporter PLC(delta)-PH-GFP.

      • Fig 6P - do the authors suggest Orion is distributed at high concentration throughout the entire upper portion of the figure? Perhaps the coloration could be changed if Orion binding is suggested to occur between Drpr and PS.

      We have not examined Orion distribution in the periphery, though prior studies demonstrate that it is secreted into the hemolymph from multiple sources. Our schematic focuses on sites of contact between epidermal cells and dendrites but omits the hemolymph, muscle, and other cell types in the periphery. In our initial schematic epidermal cells and Orion were shaded similarly; in our revision we chose a different color for epidermal cells to prevent confusion.

      Optional suggestions for consideration to provide further context for a broad audience:

      Optional 6. The authors might consider placing their work in the context of an emerging literature focused on developmental roles for immune cell signaling molecules/other phagocyte receptors at steady state. While the present study focused on epidermal ensheathment of SSNs stands on its own as a notable contribution and does not require these citations to support its conclusions, context from an emerging literature bridging immunity and development might be of interest to a broad readership. Should the authors wish to strengthen the link between their work and findings from other systems indicating a shared role in immunity and development for key immunoreceptors and their binding partners, they might consider adding citations/phrasing indicating that Draper's molecular collaborator Shark kinase (PMCID PMC2493287) was initially discovered as a developmental gene required for dorsal closure (PMCID PMC316420). They might also consider highlighting the role of Draper's mammalian orthologs Megf10/Megf11 in regulating mosaic spacing of retinal neurons (PMCID PMC3310952).

      We appreciate the Reviewer’s suggestions, in particular the value of further highlighting relevant links between immunity and development. Not including Megf10/Megf11 (Drpr vertebrate orthologue) in our discussion was an oversight as we predict that Megf10/Megf11 serves a similar role in ensheathment of vertebrate somatosensory neurons. In our revised manuscript we will incorporate a more thorough discussion of the emerging literature bridging immunity and development.

      Optional 7. The authors might consider tying their extended discussion of integrins (~Line 320-Line 335) into their overall argument in a more cohesive manner. For example, how (if at all) do the authors see Drpr collaborating with other receptors to regulate initiation versus maintenance of sheaths? Is a model in which Drpr initiates ensheathment maintained by other molecules possible? Speculation on this point in the discussion might integrate other molecules into the authors' model in a cohesive manner and/or bolster the authors' discussion of Drpr's window of activation/deactivation during ensheathment.

      Indeed, we envision a model in which Drpr cooperates with other receptors; we discussed one possible connection to integrins above and will incorporate a fuller treatment of the possible crosstalk between these pathways in our discussion. Regarding a model in which Drpr initiates ensheathment maintained by other molecules: yes, we agree that this is possible, but our results suggest that additional receptors likely participate in sheath initiation as well. Drpr inactivation substantially reduces but does not totally eliminate ensheathment; however, the sheaths that form in drpr mutants are structurally distinct from mature sheaths (shorter, narrower, appear to recruit less Cora). Hence, we favor a model in which drpr signaling cooperates with a parallel, partially redundant pathway for initiating sheath formation in response to sheath-promoting signals. Integrin signaling is a plausible candidate for this parallel pathway for reasons we discuss in our original submission (and above); in our revised discussion we will more extensively discuss the potential cross-talk between Drpr signaling and Integrin signaling in initiation and maintenance of epidermal sheaths.

      Reviewer #2 (Significance (Required)):

      This study provides a new link between a conserved phagocyte receptor (Drpr) and epidermal ensheathment of somatosensory neurons, an important process at the heart of the regulated development and function of the nervous system. As such, the Yin et al. submission is a significant contribution to a rapidly moving research area of broad interest to an intellectually diverse readership interested in the molecular and cellular basis of neurodevelopment and interactions between the nervous system and the immune system.

      An important strength of this study is the striking degree of the epidermal ensheathment phenotypes observed when normal Drpr expression is disrupted either through depletion, mutation, or targeted overexpression. For example, depletion of Drpr via RNAi induces a ~three fold reduction in total sheath length (Fig 1F - ~1.45 mm in controls as compared to ~0.5 mm with Drpr RNAi). Notably, epidermal enforced overexpression of Drpr induces a notable increase in the fraction of ensheathed neurons (Fig 5A-D). This strength of phenotype enables the investigators to deploy an elegant sequence of molecular and genetic tools to further probe mechanism and implicate extracellular PS in this process.

      Reviewer area keywords as requested: phagocytes, immune cell signaling, signal transduction

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The study by Yin and colleagues investigates how epidermal cells recognize and ensheath somatosensory neuron (SSN) dendrites in Drosophila larvae. The authors identify the phagocytic receptor Draper (Drpr) as a key mediator of selective epidermal ensheathment and demonstrate that this process relies on phosphatidylserine (PS) exposure on dendrites and the bridging molecule Orion. The work significantly advances our understanding of neuron/epidermis interactions and reveals a novel role for phagocytic recognition pathways in non-glial ensheathment.

      The manuscript is clearly written, methodologically solid and supported by compelling data. The authors combine genetic, imaging and functional approaches to uncover a mechanism of structural and functional modulation of nociceptive neurons. The results will interest researchers studying neuronal morphogenesis, epithelial biology and non-glial phagocytic pathways.

      Specific Critiques:

      While the study is strong and timely, several issues should be addressed prior to acceptance:

      Figure 1: The authors refer to the receptors as "engulfment receptors." I recommend calling them "phagocytic receptors" since not all are required for the engulfment step (e.g., Crq).

      The Reviewer makes an important distinction. We have updated our manuscript to reflect this point, replacing “engulfment receptor” with “phagocytic receptor” in the text and in our title.

      Figure 2: The title states "Drpr is required in epidermal cells..." yet the authors analyze a drpr null mutant, which lacks Drpr in all expressing cells (glia, macrophages and epidermal cells). The rationale for using the null mutant instead of epidermal-specific RNAi should be explained.

      The increased dendrite number in drpr RNAi larvae should also be noted here.

      We agree – the title is not appropriate for this version of the figure; we changed the title to better reflect the experiments being portrayed.

      Our RNAi experiments in Figure 1 and 2 demonstrate that drpr is cell autonomously required in epidermal cells for dendrite ensheathment. Here, we include analysis of an amorphic drpr allele to (1) provide further genetic support underscoring the requirement for drpr in dendrite ensheathment and (2) to underscore the observation that a small number of immature sheaths form in the complete absence of drpr, arguing for the presence of an additional pathway that contributes to sheath formation.

      Effects of epidermal drpr RNAi on dendrite number are not something we evaluated with our time-lapse studies in Figure 2. Instead, we monitored the effects of drpr knockdown on growth behavior of epidermal sheaths and found that epidermal drpr RNAi triggered an increase in the frequency of sheath retraction events and a decrease in sheath growth events.

      Figure 3: Explain the numbers on the X-axis in panels B and D. Add a panel without blue dashed outlines to better visualize Drpr expression. Adjust the red boxes to precisely match the enlarged regions.

      Each bar represents a single neuron; the numbers denote the number of sheaths sampled from each neuron. We added this to the figure and figure legend in our manuscript. We thank the Reviewer for identifying this oversight.

      We appreciate the Reviewer’s perspective on the blue hatched lines; we removed the hatched lines from the ROI and adjusted the position of the red hatched box.

      Figure 4: Why is the drpr mutant used here rather than RNAi? Please clarify the reasoning for choosing mutants in some experiments and knockdown in others.

      In Figure 2, we show analysis of the amorphic allele to further corroborate our RNAi studies, as described above. We chose to use the drpr amorphic mutant for these studies because we have no GAL4-independent reporter to label C1da neurons for analysis of dendrite arborization patterns. Although we could use HRP staining in combination with epidermal drpr RNAi, live imaging of dendrite arbors labeled by a C1da neuron GAL4 driver provides a more sensitive and reliable readout for morphogenesis studies.

      In our revised manuscript we will add analysis of C4da dendrite patterns in larvae expressing drpr RNAi in epidermal cells to evaluate whether the dendrite defects reflect epidermal requirements for drpr function.

      Figure 5: Correct the placement of white boxes in panels E-F′.

      We thank the Reviewer for identifying the mismatch. We corrected the placement to match the size of the ROIs.

      Figure 6: AnnV staining in B is difficult to detect. Please add a version of the panel showing AnnV alone.

      In our initial submission we include the overlay of PLC-PH-GFP and AnnV-mScarlet (B), an image showing the PLC-PH-GFP alone (B’) and an image showing the AnnV-mScarlet alone (B”).

      AnnV labeling appears weak on sheaths. Since epidermal membranes are strongly labeled, confirm PS exposure on dendrites with a commercial fluorescent Annexin V reagent.

      We appreciate the suggestion to use a commercial fluorescent Annexin V reagent and agree that it would strengthen our findings if such a reagent labeled sheaths. However, we intentionally prioritized analysis using the in vivo reporter because numerous studies indicate that epidermal sheaths are inaccessible to large molecules in solution (in the absence of detergent). One of the first assays used to monitor the in vivo distribution of sheaths was based on the inaccessibility of antibodies to ensheathed neurites (Kim et al, Neuron, 2012; also Tenenbaum et al, Current Biology, 2017; Jiang et al, eLife, 2019). More recently, we demonstrated that 10kDa dextran dyes are excluded from epidermal sheaths (Luedke et al, PLoS Genetics, 2024). Nevertheless, as part of our revision we will examine whether commercially available Annexin V reagents label sheaths.

      In F and F" sheaths are labeled in areas without visible dendrites. Please clarify.

      We note that although C4da dendrites are the most extensively ensheathed among da neurons, other neurons (most prominently C3da neurons) also exhibit significant ensheathment (Jiang et al, eLife, 2019). We use established markers of epidermal sheaths (Cora immunoreactivity in this panel; PIP2 reporters and/or Cora-GFP localization in other panels), hence Drpr accumulates at Cora+ sheaths on C4da neurons and Cora+ sheaths that form on other da neurons. We will clarify this point in the text of our revised manuscript.

      In O and P, show Drpr staining without blue dashed sheath outlines.

      We have removed the blue dashed outlines from the figure panels.

      The legend contains numerous labeling errors: there is no B′ or B"; C-G should be E-G; G-I should be H-J; I-L should be K-N; M-O should be O-R. Please revise carefully.

      The labeling errors have been corrected.

      Sup Fig 1: Add a panel with only c4da labeling to visualize dendrites.

      We have added a panel displaying only C4da dendrites to this figure.

      Sup Fig 2: The anti-Drpr signal is unexpected in the null mutant. Validate with an additional antibody (e.g., mouse monoclonal anti-Drpr from the DSHB).

      We appreciate the suggestion and have already tested the mouse monoclonal anti-Drpr antibody from DSHB and found that it is unsuitable for use in our preparations (ie, no Drpr-dependent immunoreactivity, even in specimens overexpressing Drpr).

      With respect to the comment about the unexpected signal in the null mutant, we note that the antibody is a rabbit polyclonal and is not purified. In our experience it is not uncommon for rabbit serum (even pre-immune serum) to recognize multiple antigens in the larval skin. Nevertheless, our control studies demonstrate that Drpr immunoreactivity is eliminated at epidermal sheaths in Drpr null mutants.

      Sup Fig 3: No panels A or B are shown; no PIP2 marker is present despite the legend. Please revise. Drpr overexpression appears to increase Cora levels in some cells. Could Drpr affect Cora expression or distribution? This should be addressed. Also dendrite number appears higher in Drpr-overexpressing larvae. Please state whether this is significant.

      The labeling errors in the legend have been corrected; the corresponding studies with the PIP2 marker are presented in Figure 5.

      All epidermal drivers we have characterized exhibit a low level of variegation in expression within a hemisegment that we have previously documented (Jiang et al 2014 Development; Jiang et al 2019 eLife), and we suspect that it may be related to epidermal endoreplication (epidermal cells do not synchronously endoreplicate). However, we have not observed any systematic difference in epidermal GAL4 driver or Cora-GFP expression in larvae overexpressing Drpr. We note that a single cell in the field of view in Supplemental Figure 3 exhibits a higher level of GFP fluorescence. We occasionally observe this, independent of background genotype.

      All gene names must be italicized and lowercase (e.g., drpr), including in figure labels and legends.

      All protein names must be capitalized and non-italic (e.g., Drpr, Cora).

      We appreciate the Reviewer’s feedback. We used Drpr in keeping with many recent reports, but the Reviewer is correct in outlining the standard naming conventions. We have changed the gene names to reflect convention (lowercase, italics for genes that were initially identified according to phenotypic characterization; uppercase, italics for genes named according to homology to orthologues in other species such as NimB4 and ATP8A).

      Define ROI on first use.

      Done. We defined ROI in the methods section.

      Ensure consistent phrasing: use "anti-Cora or anti-Drpr immunoreactivity" uniformly.

      We have done so.

      There are a few typos which must be corrected:

        • Line 196: "containing" → "contain"
        • Line 205: "antibodies Drpr" → "antibodies to Drpr" or "anti-Drpr antibodies"
        • Line 331: "predominan" → "predominant"
        • Line 353: "phagocyting" → "phagocytic"
        • Line 385: "similar the effect" → "similar to the effect"
        • Line 432: Title should be underlined
        • Line 544: "drpr∆5" is missing the 5
        • Line 569: "immunoreactivity a" → "immunoreactivity of"

      The typographical errors have been corrected. We thank the Reviewer for the close reading of the manuscript.

      Reviewer #3 (Significance (Required)):

      The manuscript makes a meaningful contribution to the field of neuron/epidermal cell interactions by demonstrating that recognized phagocytic machinery components can be co-opted for ensheathment of sensory neurites. This not only expands our understanding of skin innervation and mechanosensation but also raises intriguing implications for how similar mechanisms might operate in vertebrates (e.g., epidermal/nerve interactions, peripheral neuropathy). Given the functional link to nociceptive sensitivity, the work may have broader relevance for pain biology and sensory disorders.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      The study by Yin and colleagues investigates how epidermal cells recognize and ensheath somatosensory neuron (SSN) dendrites in Drosophila larvae. The authors identify the phagocytic receptor Draper (Drpr) as a key mediator of selective epidermal ensheathment and demonstrate that this process relies on phosphatidylserine (PS) exposure on dendrites and the bridging molecule Orion. The work significantly advances our understanding of neuron/epidermis interactions and reveals a novel role for phagocytic recognition pathways in non-glial ensheathment. The manuscript is clearly written, methodologically solid and supported by compelling data. The authors combine genetic, imaging and functional approaches to uncover a mechanism of structural and functional modulation of nociceptive neurons. The results will interest researchers studying neuronal morphogenesis, epithelial biology and non-glial phagocytic pathways.

      While the study is strong and timely, several issues should be addressed prior to acceptance:

      Figure 1: The authors refer to the receptors as "engulfment receptors." I recommend calling them "phagocytic receptors" since not all are required for the engulfment step (e.g., Crq).

      Figure 2: The title states "Drpr is required in epidermal cells..." yet the authors analyze a drpr null mutant, which lacks Drpr in all expressing cells (glia, macrophages and epidermal cells). The rationale for using the null mutant instead of epidermal-specific RNAi should be explained. The increased dendrite number in drpr RNAi larvae should also be noted here.

      Figure 3: Explain the numbers on the X-axis in panels B and D. Add a panel without blue dashed outlines to better visualize Drpr expression. Adjust the red boxes to precisely match the enlarged regions.

      Figure 4: Why is the drpr mutant used here rather than RNAi? Please clarify the reasoning for choosing mutants in some experiments and knockdown in others.

      Figure 5: Correct the placement of white boxes in panels E-F′.

      Figure 6: AnnV staining in B is difficult to detect. Please add a version of the panel showing AnnV alone. AnnV labeling appears weak on sheaths. Since epidermal membranes are strongly labeled, confirm PS exposure on dendrites with a commercial fluorescent Annexin V reagent. In F and F" sheaths are labeled in areas without visible dendrites. Please clarify. In O and P, show Drpr staining without blue dashed sheath outlines. The legend contains numerous labeling errors: there is no B′ or B"; C-G should be E-G; G-I should be H-J; I-L should be K-N; M-O should be O-R. Please revise carefully.

      Sup Fig 1: Add a panel with only c4da labeling to visualize dendrites.

      Sup Fig 2: The anti-Drpr signal is unexpected in the null mutant. Validate with an additional antibody (e.g., mouse monoclonal anti-Drpr from the DSHB).

      Sup Fig 3: No panels A or B are shown; no PIP2 marker is present despite the legend. Please revise. Drpr overexpression appears to increase Cora levels in some cells. Could Drpr affect Cora expression or distribution? This should be addressed. Also, dendrite number appears higher in Drpr-overexpressing larvae. Please state whether this is significant.

      All gene names must be italicized and lowercase (e.g., drpr), including in figure labels and legends.

      All protein names must be capitalized and non-italic (e.g., Drpr, Cora).

      Define ROI on first use.

      Ensure consistent phrasing: use "anti-Cora or anti-Drpr immunoreactivity" uniformly.

      There are a few typos which must be corrected:

      • Line 196: "containing" → "contain"
      • Line 205: "antibodies Drpr" → "antibodies to Drpr" or "anti-Drpr antibodies"
      • Line 331: "predominan" → "predominant"
      • Line 353: "phagocyting" → "phagocytic"
      • Line 385: "similar the effect" → "similar to the effect"
      • Line 432: Title should be underlined
      • Line 544: "drpr∆5" is missing the 5
      • Line 569: "immunoreactivity a" → "immunoreactivity of"

      Significance

      The manuscript makes a meaningful contribution to the field of neuron/epidermal cell interactions by demonstrating that recognized phagocytic machinery components can be co-opted for ensheathment of sensory neurites. This not only expands our understanding of skin innervation and mechanosensation but also raises intriguing implications for how similar mechanisms might operate in vertebrates (e.g., epidermal/nerve interactions, peripheral neuropathy). Given the functional link to nociceptive sensitivity, the work may have broader relevance for pain biology and sensory disorders.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Innervation of the skin by somatosensory neurons is a conserved process that enables perception and discrimination of mechanical stimuli. How do molecules exposed by neurons and skin cells collaborate to promote neurite-induced epidermal sheath formation? Here, the authors combine fruit fly molecular and genetic tools with high resolution imaging to address this fundamental question. Based on morphological similarity between phagocytosis and SSN ensheathment, the authors hypothesized that one or more phagocyte receptors might promote ensheathment through ligand-driven interactions with neurites. To test this hypothesis, the authors systematically screened phagocytic receptors expressed in the epidermis for functional roles in ensheathment. Through this screening approach, the authors found that the Draper (Drpr) receptor functions in epidermal cells as a significant factor required to promote ensheathment. They support this conclusion using a suite of cell- and tissue-specific RNAi tools and mutant fly lines in conjunction with elegant mechanistic work that establishes a role for the conserved "eat-me" signal phosphatidylserine (PS) in driving ensheathment.

      Major comments:

      Are the claims and the conclusions supported by the data or do they require additional experiments or analyses to support them?

      The seven key claims presented in the abstract are strongly supported by experimental data and analyses presented in the manuscript. At least one experimental result displayed in a main figure in support of the indicated key claim is summarized below. This summary does not present a comprehensive list of all data in support of a particular claim. Rather, it is an effort to confirm that each key result presented to the readership in the abstract is supported by at least one rigorously analyzed experimental result.

      1. Drpr functions in epidermal cells to promote ensheathment: Expressing a Draper RNAi under control of a larval epidermal driver (A58) led to significant reduction in total sheath length (Fig 1H), average sheath length (Fig 1I), and fraction ensheathed (Fig 1J). Similar results were obtained using two different Draper RNAi constructs. The argument presented through RNAi results in Fig 1 is bolstered by data using an existing validated Draper mutant line in Fig 2A-E. A question of interest to this reviewer upon receiving the paper was whether Draper functions at initial stages of sheath formation, maintenance of existing sheaths, or both. The timelapse data in Fig 2F suggest that Draper activity is dispensable for maintaining existing sheaths.
      2. ...that Draper accumulates at sites of epidermal ensheathment but not contact sites of unsheathed neurons: Immunostaining experiments demonstrate that Drpr immunoreactivity is enriched at PIP2-positive membrane domains in epidermal cells (Fig 3A-B). Is this accumulation selective for epidermal sheaths? Yes. In Fig. 3E-G, the authors show that Drpr enrichment overlaps with the sheath marker Cora but not with dendrites of C1da neurons or with unsheathed portions of C4da dendrite arbors. The authors confirm specificity of Drpr immunoreactivity through control experiments using a Drpr mutant (Supplementary Fig 2).
      3. ...that Drpr overexpression increased ensheathment: Enforced overexpression of Draper in epidermal cells via Epidermal GAL4 driving UAS-Drpr (Fig 5A) shows significantly higher levels of ensheathment of C4da neurons as compared to controls. The authors demonstrate specificity by showing that epidermal Drpr overexpression did not induce ectopic sheath formation in C1da neurons (Fig 5E-G).
      4. ...that extracellular PS accumulates at sites of ensheathment: Using a previously developed secreted AnnV-mScarlet reporter (Ji et al. 2023 https://doi.org/10.1073/pnas.2303392120), the authors demonstrate that PLC-PH-GFP labeled stretches were also labeled by AnnV-mScarlet (Fig 6A-B), consistent with their model that ensheathment by Drpr is mediated by PS exposure on dendrites.
      5. ...that overexpression of the PS flippase ATP8A blocks ensheathment: This claim is supported by demonstrating that overexpression of ATP8A, a protein that drives unidirectional PS translocation from the outer to the inner leaflet of the plasma membrane, impacts C4da neurite ensheathment. Selective overexpression of ATP8A in C4da neurons using a ppk-GAL4 induced a significant reduction in epidermal sheaths (Fig 6C).
      6. ...that Orion is required for sheath formation: Inactivation of the chemokine-like PS bridging molecule Orion significantly reduces fraction of ensheathment (Fig 6I-L).
      7. Overexpression of Draper enhanced nociceptor sensitivity to mechanical stimulus: Consistent with a functional role for epidermal ensheathment in responses to mechanical stimuli, the authors report a significant reduction in nocifensive responses in a behavioral assay presented in Fig 6H.

      In conclusion, the authors' claims are supported by the data as presented in this version of the manuscript.

      Please request additional experiments only if they are essential for the conclusions. Alternatively, ask the authors to qualify their claims as preliminary or speculative, or to remove them altogether.

      n/a

      If you have constructive further reaching suggestions that could significantly improve the study but would open new lines of investigations, please label them as "OPTIONAL".

      n/a

      Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated time investment for substantial experiments.

      n/a

      Are the data and the methods presented in such a way that they can be reproduced?

      Yes. The quality of the cell imaging data presented in the figures is high. The figure legends are sufficient to follow the investigators' conceptual approach and technical progress as they build their model. Transparent presentation of the screening data in Fig. 1 F-G was particularly appreciated by these reviewers.

      Are the experiments adequately replicated and statistical analysis adequate?

      Yes. We specifically commend the table outlining all statistical tests presented in the supplementary methods and linked to each figure.

      Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      Minor comments:

      1. Could the authors further clarify Drpr's anticipated window of activity during sheath formation and/or speculate further on this point in the discussion? Live imaging in Fig. 2 suggests that Drpr is dispensable for maintenance of existing sheaths. Given that Drpr is proposed to be activated through transient phosphorylation that recruits the binding partner Shark (PMCID PMC2493287), it might be useful to clarify Drpr's window of activation (i.e., transient or constitutive) for an audience more familiar with Drpr's canonical functions in engulfment. The section prior to speculation about a possible role for negative regulators of phagocytosis (Line 360) might be a possible location for this addition.
      2. The authors might consider developing their conclusion a bit further for a broad audience. For example, the gesture to Piezo dependence in the current final sentence might provide an opening to discuss an exciting future avenue focused on integrating molecular mechanosensors into a comprehensive model of selective SSN ensheathment important for the perception and discrimination of touch and pain.
      3. Word missing or extra "in" in Line 69 after ECM?
      4. In Fig 1 and Fig 3, the PLC(delta)-PH-GFP reporter contains the delta symbol; elsewhere in the paper it does not. In addition, Fig 5 is denoted "PIP2 (PLC-PH-GFP)". For consistency the authors might consider using PLC(delta)-PH-GFP across all figures.
      5. Fig 6P - do the authors suggest Orion is distributed at high concentration throughout the entire upper portion of the figure? Perhaps the coloration could be changed if Orion binding is suggested to occur between Drpr and PS.

      Optional suggestions for consideration to provide further context for a broad audience:

      Optional 6. The authors might consider placing their work in the context of an emerging literature focused on developmental roles for immune cell signaling molecules/other phagocyte receptors at steady state. While the present study focused on epidermal ensheathment of SSNs stands on its own as a notable contribution and does not require these citations to support its conclusions, context from an emerging literature bridging immunity and development might be of interest to a broad readership. Should the authors wish to strengthen the link between their work and findings from other systems indicating a shared role in immunity and development for key immunoreceptors and their binding partners, they might consider adding citations/phrasing indicating that Draper's molecular collaborator Shark kinase (PMCID PMC2493287) was initially discovered as a developmental gene required for dorsal closure (PMCID PMC316420). They might also consider highlighting the role of Draper's mammalian orthologs Megf10/Megf11 in regulating mosaic spacing of retinal neurons (PMCID PMC3310952).

      Optional 7. The authors might consider tying their extended discussion of integrins (~Line 320-Line 335) into their overall argument in a more cohesive manner. For example, how (if at all) do the authors see Drpr collaborating with other receptors to regulate initiation versus maintenance of sheaths? Is a model in which Drpr initiates ensheathment maintained by other molecules possible? Speculation on this point in the discussion might integrate other molecules into the authors' model in a cohesive manner and/or bolster the authors' discussion of Drpr's window of activation/deactivation during ensheathment.

      Significance

      This study provides a new link between a conserved phagocyte receptor (Drpr) and epidermal ensheathment of somatosensory neurons, an important process at the heart of the regulated development and function of the nervous system. As such, the Yin et al. submission is a significant contribution to a rapidly moving research area of broad interest to an intellectually diverse readership interested in the molecular and cellular basis of neurodevelopment and interactions between the nervous system and the immune system.

      An important strength of this study is the striking degree of the epidermal ensheathment phenotypes observed when normal Drpr expression is disrupted either through depletion, mutation, or targeted overexpression. For example, depletion of Drpr via RNAi induces a ~three-fold reduction in total sheath length (Fig 1F - ~1.45 mm in controls as compared to ~0.5 mm with Drpr RNAi). Conversely, enforced epidermal overexpression of Drpr induces a notable increase in the fraction of ensheathed neurons (Fig 5A-D). This strength of phenotype enables the investigators to deploy an elegant sequence of molecular and genetic tools to further probe mechanism and implicate extracellular PS in this process.

      Reviewer area keywords as requested: phagocytes, immune cell signaling, signal transduction

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      The manuscript by Yin et al investigates how epidermal cells shape somatosensory neuron (SSN) morphology and function through selective ensheathment in Drosophila. This study builds on earlier work by another group showing that the phagocytic receptor Draper (Drpr) is a crucial epidermal factor that is important for dendrite pruning and clearance. In the present study, the authors show that Drpr also functions in the epidermis to establish the characteristic stretches of epidermal ensheathment of dendrite arborization neurons in the fruit fly Drosophila melanogaster. This is particularly true for highly branched types of dendrites but not for dendrites that show simpler branching patterns. Overexpression of Drpr increases ensheathment and nociceptor sensitivity, linking molecular recognition to sensory modulation. Further, Drpr is known to recognize phosphatidylserine (PS) on neurites to promote ensheathment, and the authors show localization of a reporter for PS with epidermal membranes. Genetic manipulations that reduce PS result in a reduction in epidermal sheaths, and the chemokine-like protein Orion, which promotes Drpr/PS interactions, is required for these processes. Overall, the manuscript is well written, although at times maybe primarily for a fly audience. Reach could be improved by making it more accessible to a non-fly audience. The observation that Drpr is not only required for removing damaged or degenerating dendrites but also for the correct ensheathment of highly branched dendrites presents an important finding that could be of interest for a wider audience provided the following points are adequately addressed:

      1. The Introduction could be further elaborated to help readers understand the significance of epidermal dendrite ensheathment. Addressing the following points may achieve this:

      (i) The Introduction would benefit from including details on developmental disorders and neurological diseases associated with defects or abnormalities in dendrite ensheathment.

      (ii) In lines 74-79, it is unclear whether the described findings are conserved across evolution or were demonstrated in a specific model organism.

      (iii) Including a description of the known literature on phagocytosis in this process would help readers better understand the novelty and significance of this study.

      (iv) Details of the published Draper function in Han et al 2014 should be elaborated, along with the unanswered question that is addressed in this study.

      2. It is unclear why the authors focus exclusively on Drpr and Crq, without addressing emp and CG4006, both of which show higher expression levels than the former. Moreover, the conclusion that 14 out of 16 engulfment receptor genes have no role based solely on RNAi knockdown experiments is a very strong statement that may require additional validation. The authors should provide evidence that the RNAi knockdowns achieved complete loss of gene function to support their claim about 16 engulfment receptors. In addition, at most the authors can conclude that the tested genes are individually not required.

      3. What kind of genes are crq and ea?

      4. Comparing Figures 1C and 1E, it appears that drpr knockdown has a differential effect on epidermal dendrite ensheathment between main and secondary branches. If this observation is correct, separate quantification for each branch type would be more appropriate, along with an explanation for the observed differences.

      5. For Figure 1K, it would be informative to examine how drpr knockdown affects dendrite length in these neurons.

      6. For Drpr expression (Figure 3), it would be valuable to highlight any differences in expression between primary and secondary dendritic branches.

      7. Removing drpr leads to excessive branching of SSN dendrites. Does overexpression of drpr affect dendrite morphology in the opposite manner?

      8. Although the role of drpr in dendrite ensheathment is well explored, the interactions between drpr and PS seem underexplored. For example, do the changes in ensheathment as a result of manipulating PS levels require drpr? Does changing PS levels affect Drpr localization or levels?

      Minor Points:

      1. Why is there no gene in bold for the hemocyte category in Figure 1A?
      2. Line 67: "neurons BEING the most extensively..."
      3. Line 126: should read "epidermal engulfment receptors are INDIVIDUALLY dispensable"
      4. Line 216: "THE Drpr∆5 mutation had no significant..."
      5. Line 230: "overexpression" instead of "overexpressed"
      6. Line 385: similar "TO"

      Significance

      This is an interesting study that adds to our understanding of the role of phagocytic receptors in shaping dendrites. Specifically, the role of Drpr (Draper) is studied, a gene previously known to be important for the removal of degenerating dendrites. A limitation of the manuscript as it stands is that it seems to be written primarily for a fly audience. Contextualizing the results and the significance of this likely conserved pathway could increase the impact.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      I thank the Referees for their...

      Referee #1

      1. The authors should provide more information when...

      Responses

      • The typical domed appearance of a hydrocephalus-harboring skull is apparent as early as P4, as shown in a new side-by-side comparison of pups at that age (Fig. 1A).
      • Though this is not stated in the MS

      2. Figure 6: Why has only...

      Response: We expanded the comparison

      Minor comments:

      1. The text contains several...

      Response: We added...

      Referee #2

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      In this study, Wang et al. use diphtheria toxin (DT) to cause hair cell (HC) death in transgenic mice expressing the DT receptor in the HC of the inner ear. This model is assumed to cause HC loss in a selective way. The lesioned mice are assessed for translational vestibulo-ocular reflex (tVOR), vestibular sensory evoked potential (VsEP), rotational vestibulo-ocular reflex (rVOR), and single-unit recordings of vestibular afferents from cristae and maculae. Numbers of surviving HC, including total HC, type I HC (HCI) and type II HC (HCII) were also obtained at short and long times after DT exposure. By comparing the functional and histological results, the authors conclude that DT causes dose-dependent HC loss and vestibular function loss, that limited but significant HC regeneration occurs and that vestibular organs display variable but ample redundancy because robust physiological responses were obtained despite loss of high percentages of HCs.

      However, there are several limitations in the experimental design, methodological choices and analysis of the results that weaken the conclusions stated by the authors. Also, some important aspects of the work are not clear enough for in-depth scrutiny.

      The following list of weaknesses is not arranged in order of importance.

      1. Choice and use of the Pou4f3DTR/+ transgenic on FVB and C57/Bl6 backgrounds.

      1.a. Literature descriptions of the Pou4f3-DTR model used C57/Bl6 and CBA/J backgrounds and low mortality rates were found after DT administration. The present study generated Pou4f3-DTR mice on a FVB background and found that DT causes high mortality rates in this background. Comparisons of the C57/Bl6 and FVB backgrounds are included in Figures 1 and 2 and the conclusion was that the C57/Bl6 background is more suitable for studying vestibular HC degeneration/regeneration. However, data are presented in Figures 3 to 7 without informing the reader whether these are from C57/Bl6 or FVB animals. Because of the information given in Table 1, at least part of the data in these Figures is from the less suitable FVB mice. It is also possible that some data sets contain unbalanced numbers of animals from each strain in the different experimental conditions, with a potential impact on the robustness of the results. The strain identity of animals should be clarified across all data sets.

      1.b. Why was the Pou4f3-DTR transgene introduced into the FVB strain? The FVB strain is frequently used in transgenesis because of the prominent pronuclei in their fertilized eggs and large litter size. While generation of transgenic lines in FVB mice is common, why would you want to bring an already established transgenic modification to a FVB background? It is known that FVB mice become blind by weaning age due to a mutation in the Pde6b gene. Were the authors trying to have the Pou4f3-DTR model in a strain of blind animals? It is anomalous that the rationale for the FVB derivation is not provided and that the blindness of this strain is not even mentioned in an article containing VOR data.

      2. Toxicity of the DT.

      2.1. The non-HC toxicity of DT is not evaluated. One of the stated reasons for the choice of the Pou4f3-DTR model to ablate HC is that other alternative models (aminoglycosides, cisplatin, IDPN) cause other toxic effects besides HC toxicity. However, the lack of evidence of other toxicities in Pou4f3-DTR mice after DT administration may simply be due to lack of assessment. Besides the inner ear, Pou4f3 is expressed in several structures including the genitourinary system, the retina, Merkel cells and subsets of somatosensory and brain neurons (https://www.ncbi.nlm.nih.gov/datasets/gene/18998/; PMID: 20826176; PMID: 22262898; PMID: 34266958; PMID: 33135183), so one would expect DT toxicity in these Pou4f3-expressing cells. Also, DT may cause other toxicities not explored in the model. The fact that the DT treatment is toxic beyond the intended HC toxicity is proven by the high (strain-dependent) mortality rate recorded in this study. A more detailed analysis of the effects of DT in the Pou4f3-DTR mice is needed before stating that the treatment is selective for the inner ear HCs. By the way, hyperactivity is not an additional toxic effect of ototoxic chemicals; it is a consequence of the vestibular function loss.

      2.2. The dose-response relationship of the DT treatment is unclear. The authors state that DT caused a dose-dependent loss of HC. However, the effects across different DT doses were not compared directly. Instead, each DT dose was compared with a different set of controls, and then the percentage of HC loss was qualitatively compared without statistical comparison. Looking at the numbers, the percent loss after the 35x2 dose is greater than that recorded after the 50x2 dose, contradicting the conclusion of a dose-dependent effect. One possible explanation is that the DT treatment has an inverted-U dose-response relationship, and the 25x2, 35x2 and 50x2 doses draw the bottom of the U. Alternatively, you have a dose-dependent effect with a dose causing a moderate effect (15) and 3 doses (25x2, 35x2, 50x2) causing near-maximal effects with differences among these groups more related to experimental variability than to dose-dependency.

      3. Experimental design, use of animals, role of batch-to-batch variability in apparent results.

      3.a. The number of animals used in each experimental condition, their assignment to each assessment and participation in each dataset must be clarified. The reader is not informed whether the animals used for physiological and histological analyses were the same or separate sets of animals were used. Also, the distribution of animals in different batches is not clarified and this may have produced apparent results through experimenter-generated bias. For instance, the HC count data are presented as two different, independent experiments, one evaluating different doses in the two strains at 14 days after exposure (Figures 1 and 2) and a second one comparing the HC counts at 2 weeks and 6 months after exposure (Figures 3 and 4). However, these were not separate experiments because at least some animals were shared in the two "experiments". This is demonstrated by the duplicate images between figures 1 and 2 and figures 3 and 4 (for instance, images D to D' in Figure 1 are the same as images C to C' in Figure 3). Therefore, at least part of the data for 2-week animals in Figure 3 have already been used as data of day-14 animals in Figure 1. This makes this reviewer suspect that 6-month animals in Figure 3 were treated with DT at different dates than 2-week animals in the same figure. Therefore, the small but significant "regeneration" could be simply due to differences in experimental outcome due to batch-to-batch experimental variability. In this kind of model, batch-to-batch experimental variability may be large and generate apparent group differences. For instance, in Figure 1, HC loss seems to be deeper after 35x2 than after 50x2. Although no statistical comparison is made between these groups, there seems to be an inversion of the dose-effect relationship that may simply depend on experimental (batch-to-batch) variability.

      3.b. The aim of revealing the relationship between HC loss and function retention should ideally be addressed using an experimental design providing subject-based data for comparison. That is, you cause the lesion, next you evaluate the function, and then you obtain the tissues for histological assessment, so the individual functional values can be matched to the individual HC numbers for a robust assessment of the relationship. In this work, group data from functional analyses are compared to group data from histological analyses, but no information is given on whether the same or different animals were used. If the same animals were used, the lack of direct comparison of the individual data is surprising and suggests that perhaps the comparison was made and conflicting results were observed. Alternatively, if different sets of animals were used, the conclusions on the "redundancy" of the vestibular organs are severely weakened because batch-to-batch variability in the extent of the lesion may be large and the lesions in the animals used for physiological assessment were in fact not assessed. As noted above, the possibility of a large batch-to-batch variability in the extent of the lesion is supported by the observation that lesions in 35x2 mice were deeper than lesions in 50x2 mice.

      4. The conclusions on HC regeneration need a deeper scrutiny and the conclusion on its dose-dependency is not supported by the data.

      4.1. The animals used for the experiments are too young to sustain claims on adult HC regeneration. DT was administered in "4-6 weeks old" animals. In rats and mice, many HCs are generated during the early postnatal days and they mature over the first month. At 4 weeks after birth (postnatal day 28), the number of immature HCs in the rat utricle is small but significantly higher than at day 60 (PMID: 38895157). Therefore, 4-week-old animals may contain a larger reserve of immature cells that can show up as "new HC" after damage than 6-week or 8-week-old animals. One possible origin of the differences between 2-week and 6-month DT animals would be that the 6-month group included more animals treated at 4 weeks while the 2-week group included more animals treated at 6 weeks.

      4.2. The conclusions on regeneration are based on percentages of HC densities. In the first 2-week experiment the area of the epithelium is assessed, but areas are not taken into consideration when comparing HC densities at 2 weeks and 6 months after DT. Is it possible that the increase in HC density is caused by epithelial shrinkage rather than by emergence of new HC?

      4.3. The spontaneous HC regeneration is stated to be "dose-dependent", meaning that more extensive lesions caused more vigorous regeneration. However, this is only an apparent effect caused by the use of percent data. Thus, the increase in HC counts in the utricle is said to represent a 52% increase after 25X2 and a 118% increase after 50X2. However, if you look at the numbers instead of percentages, the mean number of HCs is 130 vs 86 (an increase of 44) after 25X2 and 78 vs 36 (an increase of 42) after 50X2. So, the cell counts indicate that a similar number of "new" HCs appear after either dose.

      5. The use of antibodies and the exact methodology for HC counts is unclear and perhaps defective.

      5.1. The immunohistochemical protocol did not include a specific marker for HCI, so HCI were defined as MYO7A+/Sox2- cells, HCII were MYO7A+/Sox2+ cells and supporting cells were MYO7A-/Sox2+ cells. The use of additional markers for the HCI (Spp1) or the calyx (Caspr1, tenascin-C) would have provided a more robust dataset. Also, striola/central versus peripheral regions were simply defined by approximate anatomical comparison, when positive markers of the central region are available (oncomodulin, calretinin+ calyces).

      5.2. The primary and secondary antibodies listed do not match. Two Myosin7a antibodies were used (mouse monoclonal from DSHB and rabbit polyclonal from Proteus) and a goat anti-Sox2. However, the secondaries listed are one anti-goat and two anti-rabbits. No anti-mouse is listed.

      5.3. In the figures, the reader is not informed whether the data are from the mouse anti-MYO7A or the rabbit anti-MYO7A, or whether the figure includes mixed data from both antibodies. This is highly relevant because MYO7A was used as the only positive marker for HCI, MYO7A expression may be reduced in stressed HCs (PMID: 37195449), and the two anti-MYO7A antibodies have different affinity for the target. Thus, if the 2-week samples were labelled with the mouse anti-MYO7A and the 6-month samples were labelled with the rabbit antibody, and MYO7A expression was also reduced at 2 weeks, then the observed increase may be only apparent, not real regeneration.

      5.4. The images were similarly obtained with the 63X objective in both the utricle and the crista. Why were two different measures (per 10,000 square micrometres in utricle and per 2500 in crista) computed if the original area used for counts was the same? The counts are said to be derived from these 63X square images or from merged images spanning the whole utricle. However, the results section does not include the information on the particular kind of image used for any of the counts, and all are presented similarly. The method used to obtain each count should be indicated and valid comparisons should only include counts obtained with the same method.

      6. The presentation of the results and its interpretation is biased. Unbiased interpretation of the results does not support conclusions such as "we found that utricle function is largely preserved until hair cell loss exceeded 90%".

      6.1. "...a trend of increase....1.2+/-0.4 to 2.7+/-0.6...". These are similar very low numbers, close to zero, not a trend of increase.

      6.2. The reader is informed that VsEP "is particularly dependent...striolar type I hair cells". However, the next sentence stresses that measures "remained unchanged at low dose (15 ng/g), with 54% HC survival in striola" when the percentage survival of HCI was 62.7 %. The 54% survival was for total HCs.

      6.3. Lack of statistical significance is interpreted as lack of significant biological effect, when this may simply result from lack of power of the experimental design. For instance, it is concluded that the 15 ng/g dose has no effect on VsEP amplitudes, because control and DT animals did not show statistically significant differences in this parameter. However, the comparison was made using only 4 control animals, with one of them showing a value much lower than the other 3. Also, 7 of the 8 DT animals had amplitude values below these 3 control values, and the mean value in the DT group was about 30-40% lower than the control mean. Clearly, larger groups were necessary to conclude that the 15 ng/g DT dose had no effect. Alternatively, as suggested above, individual animal-based comparisons could be used to relate HC loss to loss of function. Lack of statistical significance in experiments with an insufficient number of controls can't be used to conclude that responses "were intact".
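      To illustrate the power concern raised in 6.3, the following is a minimal sketch using a normal approximation. The group sizes (4 controls, 8 DT animals) and the roughly 35% reduction come from the point above; the standard deviation of 0.3 (relative to a control mean of 1.0), the function name `approx_power` and the 20-versus-20 comparison are assumptions made purely for illustration and are not values from the manuscript.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect, sd, n_control, n_treated, alpha=0.05):
    """Approximate two-sided power of a two-group comparison of means.

    Uses a normal (z) approximation with a common SD, so it is optimistic
    for very small groups, where a t distribution would apply.
    """
    z = NormalDist()
    # Standard error of the difference between the two group means.
    se = sd * sqrt(1.0 / n_control + 1.0 / n_treated)
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(effect) / se
    # Probability of landing in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Hypothetical inputs: amplitudes normalised to a control mean of 1.0,
# a 35% reduction in the DT group, and an assumed SD of 0.3 in both groups.
power_as_run = approx_power(effect=0.35, sd=0.3, n_control=4, n_treated=8)
power_larger = approx_power(effect=0.35, sd=0.3, n_control=20, n_treated=20)
print(f"4 vs 8 animals:   power ~ {power_as_run:.2f}")
print(f"20 vs 20 animals: power ~ {power_larger:.2f}")
```

      Under these assumed values, the 4-versus-8 design falls well below the conventional 80% power, so a non-significant result would say little about whether a reduction of this size exists. A different assumed variability would change the numbers but not the logic.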

      6.4. "At 25 ng/g x2.....Notably, only 3 out of 13 exhibited elevated VsEP thresholds at this dose". However, looking at Fig 5C it seems more accurate to say that 8 out of 13 exhibited elevated thresholds. "At the highest dose (35 ng/g x2), 53.8% (7 out of 13) of the animals showed elevated VsEP thresholds", but in fact all 13 DT animals showed thresholds above the mean threshold value in the control group.

      7. A total of 198 vestibular afferents were measured in 5 DT mice and 195 afferents in 4 control mice. No explanation is given of how representative these populations are, that is, whether they are a biased or unbiased sample of the total population of afferents.

      8. Information on the vehicle and volume of injection of DT is lacking.

      9. Vestibular organs were "harvested". How? In PBS or in fixative?

      10. Why was the anterior crista used for HC counts? The VOR test used examines the reflexes generated in the lateral crista, and the lateral crista is easier to image.

      11. There are several reference errors, including formal errors (duplicate or missing references) and content errors (references that do not include the information that you would expect from the text where they are cited).

      Referees cross-commenting

      While Referee #1 states that the experiments were carefully executed, in my opinion there are many details of the experimental design and execution that need to be better explained before this statement can be made.

      Significance

      The question addressed is of great interest for several reasons. To explain one, the degree of redundancy in the system greatly influences the possibilities of significant functional recovery that can be achieved by therapeutic interventions aimed at triggering HC regeneration after HC loss from any cause. The DT/transgenic mouse model is certainly an interesting model to address the question.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The manuscript by Wang et al. presents a detailed analysis of dose‑dependent vestibular hair cell damage induced by diphtheria toxin (DT) in Pou4f3‑DTR knock‑in mice. The authors examine type I and type II hair cell survival, vestibular functional outcomes, single‑unit recordings from vestibular ganglion neurons, and dose‑dependent regenerative responses. Two mouse strain backgrounds were compared, revealing similar vestibular phenotypes but markedly different survival rates. The authors conclude that ancient vestibular functions are redundant with respect to surviving hair cells across vertebrate systems.

      The experiments are carefully executed, and the data are consistent with their previous work using the same model. I recommend publication after the authors address the following minor points:

      1. Synaptic Damage Not Addressed: No data are presented regarding synaptic integrity, despite the well‑established vulnerability of hair‑cell synapses across ototoxic and genetic models. Because single‑unit recordings cannot resolve synaptic morphology, additional discussion is needed, beyond the brief mention on Page 16, line 4, regarding potential synaptic loss, its expected relationship to hair‑cell degeneration, and how it might influence the interpretation of afferent responses.
      2. Higher DT Dose (50 ng/g ×2) Producing Less Damage: In several datasets, the highest DT dose appears to induce less damage than the 35 ng/g ×2 dose. The authors should comment on possible explanations, such as DT solubility limits, receptor saturation, nonlinear pharmacodynamics, or strain‑specific physiological responses.
      3. Clarification of Redundancy Concept (Page 13, lines 13-15): The manuscript states that the increase in DT‑induced unresponsive afferents supports the redundancy concept. The logic behind this connection is not fully explained. Please elaborate on how the presence of unresponsive afferents aligns with or strengthens the argument for functional redundancy in vestibular systems.
      4. Therapeutic Potential of Reactivating Silent/Reserve Hair Cells: The idea of reactivating silent or reserve hair‑cell populations is intriguing but underdeveloped. Expanding this section (perhaps by discussing potential molecular pathways, precedents in other sensory systems, or feasibility in mammalian vestibular organs) would strengthen the translational relevance of the work.
      5. Different DT Doses Used Between Strains (e.g., Fig. 2E-G): Although the two strains are described as having similar vestibular phenotypes, some figures use 25 ng/g ×2 for one strain and 50 ng/g ×2 for the other. Please clarify the rationale for using different doses, whether due to survival differences, pilot data, or strain‑specific sensitivity.
      6. Typographical Error (Page 8, line 8): A closing parenthesis appears to be missing.
      7. Define IDPN at First Mention: Please spell out IDPN (3,3′‑iminodipropionitrile) at its first appearance in the text.

      Significance

      The manuscript by Wang et al. presents a detailed analysis of dose‑dependent vestibular hair cell damage induced by diphtheria toxin (DT) in Pou4f3‑DTR knock‑in mice. The authors examine type I and type II hair cell survival, vestibular functional outcomes, single‑unit recordings from vestibular ganglion neurons, and dose‑dependent regenerative responses. Two mouse strain backgrounds were compared, revealing similar vestibular phenotypes but markedly different survival rates. The authors conclude that ancient vestibular functions are redundant with respect to surviving hair cells across vertebrate systems.

      The experiments are carefully executed, and the data are consistent with their previous work using the same model. I recommend publication after the authors address the suggested minor points.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2022-01578R

      Corresponding author(s): Sabine Costagliola

      1. General Statements

      We are pleased to submit the revised version of our manuscript entitled “Foxe1 deficiency impairs thyroid fate while supporting a lung differentiation program” (Review Commons Refereed Preprint #RC-2022-01578R).

      We are grateful for the careful and constructive evaluation provided by the reviewers. Their insightful comments have significantly strengthened the manuscript, both conceptually and experimentally.

      We sincerely apologize for the delay in submitting this revision. Addressing the reviewers’ comments required additional experimental work, and during this period, the postdoctoral researcher who initiated and led the project completed her training and left the laboratory, requiring a reorganization of responsibilities within the team to ensure rigorous completion of the requested studies. We appreciate your patience and believe that the manuscript has been considerably strengthened as a result.

      Collectively, these modifications move the manuscript beyond a descriptive study and provide new mechanistic insight into the role of Foxe1 in thyroid specification, the late chromatin regulation of Pax8 expression, and the permissive state that arises in the absence of Foxe1 and allows Nkx2.1+ cells to differentiate toward lung.

      In addition, we would like to inform you that the author order has been modified in this revised version to accurately reflect contributions made during the revision process. As a result, Mírian Romitti has been moved to co–last author. All authors have reviewed and approved this change as well as the final version of the manuscript.

      We are excited to resubmit this substantially improved version and believe it now provides a clearer and more mechanistically grounded contribution to the field.

      2. Point-by-point description of the revisions



      We would like to thank all reviewers for their constructive comments and valuable suggestions, which have helped us improve the quality and clarity of our manuscript. Below, we provide a point-by-point response to all comments. The corresponding revisions have been incorporated into the transferred manuscript.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):


      Summary

      The authors investigate the effect of Foxe1KO primarily on thyroid differentiation of mouse ES cells following a previously established protocol based on sequential endoderm induction, Nkx2-1/Pax8 overexpression and stimulation of the TSHR/cyclicAMP pathway. Silencing of Foxe1 expression significantly suppresses the generation of functional thyroid follicles. By single cell profiling a great number of Foxe1 targeted genes are identified, some confirmed from previous studies and some are new candidates. Embryonic bodies lacking Foxe1 instead accumulate various lung lineage cells characterized by known cell type markers, which appear to organize in lung tissue-like structures. Based on these findings, it is suggested that Foxe1 might be involved in endoderm cell fate decisions.


      Major comments

      The title and abstract hold promise that Foxe1 is also a regulator of lung development, and that Foxe1 transcriptional activity might be decisive for thyroid versus lung fate decisions. However, there is no experimental support suggesting that one and the same ES cell at a certain critical time point may switch fate from thyroid to lung (or vice versa). Since lung markers are induced in Nkx2-1/Pax8/cAMP+ ESC it is likely that "control" organoids with maintained Foxe1 expression already contain lung lineage cells, which might expand simply by clonal selection as the thyroid lineage is suppressed by subsequent Foxe1 deletion. Although the authors discuss this possibility to some extent, readers will not find it obvious without very careful reading that this explanation is feasible and should be considered and problematized.

      We thank the reviewer for this important and thoughtful comment. We agree that our data do not demonstrate a direct fate switch of individual ES cells from thyroid to lung identity at a defined developmental time point. We have revised the title, abstract, and discussion to clarify that our findings support a model of lineage stabilization and transcriptional competition rather than active binary fate conversion.

      Our chromatin accessibility data argue against induction of a de novo lung program upon Foxe1 loss. In Foxe1 KO cells, we observe:

      • A marked reduction in chromatin accessibility at the Pax8 locus (see Figure 6B)
      • No significant gain in accessibility at canonical lung program loci (see Figure 6F)

      Thus, lung gene activation does not require establishment of new accessible chromatin regions. Instead, lung-associated loci appear to be in a permissive chromatin configuration in Nkx2-1+ foregut progenitors.

      Importantly, quantitative lineage analysis further supports destabilization of thyroid commitment rather than emergence of a new lineage. In wild-type organoids, approximately 80% of Nkx2-1+ cells co-express Pax8, indicating strong thyroid commitment. In contrast, in Foxe1 KO organoids, only ~20% of Nkx2-1+ cells retain Pax8 expression (see data below). This substantial reduction in Nkx2-1⁺/Pax8⁺ double-positive cells indicates collapse of thyroid lineage reinforcement, leaving a larger fraction of Nkx2-1-positive cells transcriptionally permissive and capable of engaging alternative Nkx2-1-dependent programs such as lung.

      Mechanistically, our data support the following model:

      1. Early during differentiation, Pax8 induces Foxe1 expression.
      2. Foxe1 subsequently becomes required to sustain chromatin accessibility at the Pax8 locus (supported by Figure 6B and the predicted Foxe1 binding motif, Table S1).
      3. In Foxe1 KO cells, accessibility at the Pax8 locus collapses, reducing Pax8 expression and weakening thyroid super-enhancer activity.
      4. As the thyroid transcriptional network destabilizes, Nkx2-1, still expressed, can cooperate with lung-associated cofactors at already accessible lung loci.
      5. Lung transcription increases without requiring de novo chromatin opening, consistent with redistribution of limiting transcriptional machinery.

      Supporting a more direct regulatory role, motif analysis revealed a predicted Foxe1 binding site within regulatory regions of the Pax8 locus (Table S1). This is consistent with the possibility that Foxe1 directly binds Pax8-associated enhancers, potentially recruiting chromatin remodelers and/or stabilizing enhancer-promoter interactions required to maintain high Pax8 expression. While functional validation of this binding will require future studies, this observation further supports a model in which Foxe1 actively maintains Pax8 chromatin accessibility rather than indirectly affecting thyroid identity.

      Interestingly, our newly added data (Figure S8A-C) show that complete absence of Pax8 (Pax8KO mESCs) does not result in the same phenotype, displaying a complete absence of thyroid or lung organoids. This finding reinforces the hypothesis that Foxe1 is not regulating Pax8 expression at early stages of thyroid specification.

      Furthermore, our previous single-cell RNA-seq analysis of mouse thyroid organoids (Romitti et al., Frontiers in Endocrinology, 2021) did not reveal a substantial lung cell population under wild-type conditions; only a small Nkx2.1+/Krt5+ cluster, termed non-thyroid epithelial cells, was identified. This suggests that high Pax8 levels in the presence of Foxe1 effectively commit most Nkx2.1+ progenitors toward thyroid fate.

      Despite this, we agree that expansion of rare lung-competent cells, even if unlikely, cannot be formally excluded. Definitive resolution of whether a bipotent Nkx2.1+ progenitor with dual thyroid and lung potential exists would require dedicated lineage tracing at single-cell resolution. Such experiments would be necessary to distinguish between fate conversion and expansion of lineage-competent progenitors and lie beyond the scope of the current study.

      Ultimately, we have extensively revised the manuscript to clarify these points and to avoid implying direct fate switching. Our data instead support a model in which Foxe1 stabilizes thyroid commitment by maintaining Pax8 enhancer accessibility, thereby functionally restricting Nkx2.1 from engaging alternative foregut programs.

      All the above-mentioned information and discussion have been incorporated into the new version of the manuscript.

      Observations that Foxe1KO did not at all influence gene expression in expanding lung-like cells are consistent with the idea that lung and thyroid specification in the model are independent phenomena, and argue against the existence of a common bipotent progenitor. If authors disagree, this issue and question should be more thoroughly discussed and argued for with more supporting experimental data than found in the current manuscript version

      We thank the reviewer for this important comment. As stated above, we agree that our current data do not formally demonstrate the existence of a common bipotent progenitor, and we have revised the manuscript to avoid overinterpretation in this regard.

      Regarding lung gene expression, we observe significant differences between WT and Foxe1 KO organoids at day 22, as assessed by qPCR (see Figure S3). In addition, single-cell RNA sequencing reveals the presence of distinct lung cell populations in the Foxe1 KO condition, characterized by high expression of specific lung lineage markers (see Figure 4). Importantly, these lung populations were not detected in our previous single-cell RNA-seq analysis of WT thyroid organoids (Romitti et al., Frontiers in Endocrinology, 2021), except for a small Nkx2-1+Krt5+ cluster, indicating that their emergence is specifically associated with Foxe1 loss.

      Despite the appearance of these lung-like cell types in Foxe1 KO organoids, ATAC-seq analysis does not reveal increased chromatin accessibility at canonical lung regulatory loci compared to WT (see Figure 6). This suggests that Foxe1 does not act as a direct negative regulator of the lung program. Rather, our data support a model in which Foxe1 primarily maintains thyroid lineage stability by sustaining chromatin accessibility at the Pax8 locus. In its absence, Pax8 expression is reduced, thyroid enhancer activity collapses, and thyroid differentiation is compromised.

      Consequently, Nkx2-1+ cells remain in a transcriptionally permissive state in which lung-associated loci, already epigenetically accessible in foregut-derived progenitors, can be engaged. Thus, lung differentiation appears to arise not through active induction by Foxe1 loss, but through destabilization of the thyroid program, allowing Nkx2-1 to cooperate with alternative cofactors within an already permissive chromatin landscape.

      To prevent misunderstanding, we have modified the title and substantially clarified the results and discussion sections to better reflect this model and to avoid implying direct lineage instruction or proven bipotency.

      Minor comments

      What is the fraction of Nkx2-1+ cells that organize into follicles vs lung structures? Based on provided overview images (e.g. Figs. S1, S4) the general impression is that most cells do not form 3D-structures (i.e. do not differentiate). Please explain this and provide information in paper.

      We thank the reviewer for this helpful comment and for the opportunity to clarify this point.

      First, the images shown in Figs. S1B–C correspond to day 7 and Fig. S4E to day 10 of the differentiation protocol. As indicated in the figure legends, these represent early stages of the culture during which cells are still a pool of progenitor-like cells. At these time points, organized 3D thyroid follicles or lung-like epithelial structures are not yet formed. We have revised the figure legends to ensure this is clearly stated and to avoid the impression that full differentiation has already occurred at these stages.

      Regarding the fraction of Nkx2-1⁺ cells that organize into follicles vs. lung structures at later stages, we acknowledge that we are not able to provide an exact quantitative proportion. Due to the 3D nature of the culture system and the size heterogeneity of the structures, precise counting of all Nkx2-1⁺ cells within organoids is technically challenging. However, based on representative images (e.g., Fig. 1C) and repeated observations across independent experiments, a subset of Nkx2-1⁺ cells clearly organize into epithelial 3D structures, while others remain unorganized or in less structured aggregates.

      In the Foxe1 KO condition, the larger size and morphology of the epithelial structures suggest that a substantial proportion of Nkx2-1⁺ cells contribute to lung-like structures. Morphologically, these structures are typically larger (approximately 70–600 µm) compared to thyroid follicles (approximately 30–50 µm), supporting the impression that lung-like structures represent a significant fraction of organized epithelia in the KO condition.

      Importantly, our single-cell RNA-seq data provide additional support for epithelial organization within defined clusters. The Nkx2-1/lung clusters express high levels of epithelial markers such as Epcam and Cdh1 (E-cadherin), consistent with structured epithelial identity. In contrast, only the Thyroid 1 cluster expresses these epithelial markers robustly, whereas the Thyroid 2 and Nkx2-1⁺/Pax8⁻ clusters show low or absent expression, suggesting that not all Nkx2-1⁺ cells acquire a fully organized epithelial state.

      Fig. 1C: Supposed follicles are not shown in this graph.

      We thank the reviewer for pointing this out. We agree that, due to the low magnification, individual follicular structures are not clearly discernible in Fig. 1C. The purpose of these images was not to illustrate fully formed thyroid follicles, but rather to highlight the relative proportion of Nkx2-1⁺/Pax8⁺ double-positive cells in control versus Foxe1 KO conditions.

      To avoid confusion, we have revised the figure legend and the text and replaced the term “thyroid follicles” with “thyrocytes,” which more accurately reflects what is shown at this magnification. We believe this clarification better aligns the description with the intent of the figure.

      Why does thyroglobulin not accumulate in the lumen (which if present would be a good means for quantification by counting follicles)?

      We thank the reviewer for this valuable suggestion and agree that luminal thyroglobulin (Tg) accumulation would, in principle, represent an informative readout for follicle quantification.

      However, our organoids display a fetal-like developmental state and exhibit heterogeneity in maturation and functional competence (as expected in vivo at early development). As we have previously demonstrated (Carvalho et al., Advanced Healthcare Materials, 2023), even in highly mature thyroid organoid systems, not all morphologically defined follicles are functionally active. Thus, the absence or variability of luminal Tg or iodinated Tg (Tg-I) accumulation does not necessarily indicate absence of follicle formation at this developmental stage. In other words, Tg accumulation is not a fully reliable surrogate marker for follicle presence in this context. Here we include an example of Tg staining in mouse thyroid organoids, in which some regions show Tg accumulated in the lumen, while most cells also (or exclusively) show cytoplasmic staining. This image further confirms the variability in Tg accumulation among derived organoids.

      To more accurately identify follicular structures, we relied on epithelial polarity and architectural markers. Specifically, we used E-cadherin and ZO-1 staining in combination with Pax8 to define organized epithelial thyroid structures. In addition, we employed an iodinated-thyroglobulin antibody (mouse anti–Tg-I, gift from C. Ris-Stalpers) and improved the quality of the Tg-I staining in Fig. 1E. This was further complemented by the Tg-EGFP reporter signal to better visualize thyroid follicular organization.

      Nevertheless, due to the intrinsic 3D nature of the culture system and structural heterogeneity of the organoids, precise quantitative assessment remains technically challenging.


      Indeed, follicles should be quantified to estimate induction success. Please also explain rounded structures in Foxe1KO image (are they distal lung buds?). Or are Control and Foxe1KO images confused in this panel?!?

      We thank the reviewer for this important comment and for raising the need for quantitative assessment.

      To estimate induction efficiency and directly compare control and Foxe1 KO conditions, we quantified Nkx2-1⁺ and Nkx2-1⁺/Pax8⁺ populations by flow cytometry (Fig. S6A-B), using the Nkx2-1_mKO2 reporter in combination with Pax8 antibody staining. We observed a marked reduction in the total number of Nkx2-1⁺ cells in Foxe1 KO organoids compared to controls, beginning at day 11 and becoming progressively more pronounced over time. By day 21, approximately 40-50% of cells in the control condition are Nkx2-1⁺, whereas only ~10-15% are Nkx2-1⁺ in the Foxe1 KO.

      Importantly, co-staining with Pax8 further revealed that in control organoids, the majority of Nkx2-1⁺ cells are also Pax8⁺ (41.9% of total cells), consistent with efficient thyroid commitment. In contrast, in Foxe1 KO organoids, only 3.1% of total cells are double positive, indicating a profound reduction in thyroid lineage. These quantitative data provide a robust measure of induction success and lineage specification efficiency.
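      As a rough cross-check, the percentages of total cells quoted above can be re-expressed as a share of the Nkx2-1⁺ gate. The sketch below uses only the values stated in this response (41.9% and 3.1% double-positive cells; approximate Nkx2-1⁺ totals of 40-50% and 10-15% at day 21); the helper name `fraction_of_parent` is hypothetical, and because the totals are approximate ranges, the results are bounds rather than measurements.

```python
def fraction_of_parent(double_pos_pct, parent_pct):
    """Share of the parent (Nkx2-1+) population that is also Pax8+.

    Both inputs are percentages of total cells.
    """
    if parent_pct <= 0:
        raise ValueError("parent percentage must be positive")
    return double_pos_pct / parent_pct


# Percent-of-total values quoted in the response; the Nkx2-1+ totals are
# approximate ranges, so each condition yields a pair of bounds.
control_bounds = [fraction_of_parent(41.9, p) for p in (40.0, 50.0)]
ko_bounds = [fraction_of_parent(3.1, p) for p in (10.0, 15.0)]

print("control:", [round(x, 2) for x in control_bounds])
print("Foxe1 KO:", [round(x, 2) for x in ko_bounds])
```

      A ratio above 1 at the lower end of a range would simply indicate that the true Nkx2-1⁺ total must lie higher within the approximate range quoted.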

      Regarding the rounded structures shown in Fig. 1D in the Foxe1 KO condition, the images are correctly assigned and not confused. These rounded epithelial structures represent the few thyroid follicles that form in the absence of Foxe1, as confirmed by Pax8 and Tg positivity. Although markedly reduced in frequency, follicle formation is not completely abolished in the KO condition. However, as highlighted in Fig. 1D, these self-organized follicles are not functionally mature, as evidenced by the absence of Nis/Slc5a5 expression. An additional example of a follicle derived in the Foxe1 KO condition is shown in Fig. S5B.

      Fig. 1E: text on Fig. legend is erroneously given under (F), whereas a dedicated and relevant text for (F) is missing.

      We thank the reviewer for this careful observation. The figure legend has been corrected to properly assign the text to panel (E), and a dedicated legend describing panel (F) has now been added. In addition, we have ensured that the corresponding figure panels are appropriately referenced in the main text.

      Fig. 1F. Immunostaining of iodinated thyroglobulin (Tg-I) is very poor. Is it due to a bad antibody (does it work well in in vivo thyroid stainings?) or is organification simply inefficient? Again, the poor Tg content in the lumen (as also suggested by Fig. S5A) is puzzling. Or are in vitro-generated follicles leaky (i.e. do not behave as natural thyroid follicles)?

      We thank the reviewer for this helpful comment. Following this suggestion, we have improved the quality of the iodinated thyroglobulin (Tg-I) immunostaining and included new images at higher quality and different magnifications in Fig. 1E. These revised images more clearly show the accumulation of Tg-I within the luminal compartment, particularly in the WT control condition.

      Regarding the apparent variability in Tg accumulation, we believe this reflects the fetal-like developmental state of the organoids and the heterogeneity in their maturation and functional competence. As discussed above, not all follicles generated in vitro reach the same level of functional maturity, which may influence the degree of Tg accumulation within the lumen.

      Importantly, we do not believe that the in vitro–derived follicles are structurally leaky. First, the luminal localization of iodinated Tg is clearly detectable in Fig. 1E, indicating that Tg can accumulate within the follicular lumen. Second, functional assays presented in Fig. 1F demonstrate robust iodide uptake and organification, supporting the presence of an active thyroid hormone biosynthetic machinery in these organoids.

      Figs. 2A-E: Comments on lung cell markers. A: E-cad is unspecific, Sox9 would better label branching morphogenesis

      We thank the reviewer for this helpful comment. The purpose of the first panel in Fig. 2 (A) was to highlight the presence of Nkx2-1⁺ cells organizing into epithelial structures, as indicated by E-cadherin staining. In this context, E-cadherin was used to visualize epithelial organization rather than to specifically identify lung lineage cells. This also allowed us to emphasize the clear morphological differences between thyroid follicles, which are typically smaller, and the larger epithelial structures observed in the Foxe1 KO condition that are consistent with lung-like structures.

      The presence and identity of specific lung cell populations are further addressed in the subsequent panels of Fig. 2 (B-H) and more comprehensively in the single-cell RNA-seq dataset presented in Fig. 4.

      While we agree that Sox9 staining would provide an additional marker for bud tip progenitors and branching morphogenesis, our single-cell RNA-seq analysis shows Sox9 expression within the Nkx2-1⁺/Epcam⁺/Pax8⁻/Tg⁻ population in Foxe1 KO organoids (Fig. 4B), supporting the presence of this lung progenitor population in our system.

      Finally, it is important to note that our culture medium is not designed to promote lung development in vitro, which probably limits the lung tissue formation and differentiation that are observed in optimized systems and in vivo. In addition, comparison with scRNA-seq data from E17.5 lung suggests that the lung organoids we obtain in vitro are fetal-like. These aspects are now discussed in the revised manuscript.

      C: co-staining for E-cad would help differentiate cell types. D: Goblet cells seem Nkx2-1 negative, please explain.

      We thank the reviewer for these helpful comments.

      Regarding the suggestion to include E-cadherin co-staining to better distinguish cell types, we agree that this would provide additional spatial information. However, due to technical limitations related to the species of the primary antibodies used for several lung lineage markers, we were unable to include E-cadherin co-staining in many of the panels. To address epithelial identity at the transcriptomic level, in our single-cell RNA-seq analysis we specifically filtered for Nkx2-1⁺ cells that were also Epcam⁺, thereby focusing the analysis on epithelial populations present in the organoids (Fig. 4A). Consistent with this approach, the lung-related clusters identified in the dataset (Fig. 4B) show clear expression of epithelial markers, including Epcam and Cdh1 (E-cadherin) (Fig. 3E), supporting their epithelial nature.

      Regarding the observation that goblet cells appear Nkx2-1 negative, we note that the Muc5ac staining shown in Fig. 2D primarily reflects secreted mucin that accumulates within the lumen of the lung-like epithelial structures rather than intracellular staining confined to individual goblet cells. As a result, the signal is predominantly detected in the luminal space, which may give the impression that it is not associated with Nkx2.1-expressing cells. To clarify this point, we provide images highlighting Muc5ac accumulation within epithelial structures that express Nkx2.1 (Fig. 2D) and Sox2 (Fig. 2F). In addition, Fig. S5C shows a large Nkx2-1_mKO2⁺/Sox2⁺ epithelial structure with clear Muc5ac accumulation in the lumen, supporting the presence of goblet-like secretory activity within these Nkx2.1–derived lung structures.

      E: Diffuse pattern. Are assumed club cells really Nkx2-1 pos? CC10 immunostaining might help.

      We thank the reviewer for this helpful comment. The diffuse pattern observed in Fig. 2E is largely due to the 3D reconstruction of the image, which can reduce the apparent sharpness of individual cellular boundaries. Nevertheless, the image indicates that Scgb3a2+ cells are located within epithelial structures containing Nkx2.1–expressing cells.

      Following the reviewer’s suggestion, we have now included additional immunostaining for Cc10/Scgb1a1 in the revised manuscript (Fig. 2G), which further supports the presence of club-like cells in the organoids. Although we were unable to show direct co-staining with Nkx2-1, our single-cell RNA-seq analysis confirms that all Scgb3a2⁺ and Scgb1a1/Cc10⁺ cells identified in the organoids belong to a Nkx2-1⁺/Epcam⁺ epithelial population (Fig. 4A–B and Fig. S7A). This is further illustrated in the corresponding UMAP plots shown below.

      Together, these data support the interpretation that the Scgb3a2⁺ and Cc10⁺ cells detected in the organoids correspond to Nkx2.1-derived epithelial club-like cells.

      F: I doubt that SEM is conclusive for identification of specific (lung) cell types unless tissue architecture (e.g. proximal-distal positions) is considered for comparison to the natural branching process of the developing lung.

      We agree with the reviewer that SEM alone is not sufficient for the definitive identification of specific lung cell types. In this study, SEM was used to visualize ultrastructural features and morphological characteristics suggestive of differentiated epithelial cell types, based on comparisons with SEM images from human and mouse lungs. Importantly, our organoids do not represent adult lung tissue but most likely fetal stages of lung development. This matters because cells at these stages might not display the full features of adult lung cells; for example, ciliated cells show rather short cilia, compatible with early development. The same applies to the alveolar structures, which are most likely developing alveolar sacs. This developmental context is now described in the figure legend (Fig. 2H).

      To improve the clarity of our SEM images, we modified the figure and replaced images whose features were not sufficiently clear with new ones. We included a new image showing mucus accumulation in the luminal compartment and a larger view of developing alveolar sacs and alveolar cells, with a zoomed image of an AT2 cell. In addition, an image of epithelium containing secretory cells and mucus blobs was included.

      Importantly, cell identity in our study was not inferred from SEM alone. We used several complementary approaches, including immunostaining, qPCR analysis, and single-cell RNA sequencing, to support the identification of the different lung epithelial populations present in the organoids.

      Nevertheless, we have decided to retain the SEM images in Fig. 2H in improved form, as they provide valuable ultrastructural characterization of the organoids and illustrate morphological features consistent with differentiated lung epithelial cells.

      Line 161: Is it really "spontaneous" generation? Please rephrase.

      We thank the reviewer for this suggestion. The term “spontaneous” has been replaced with “unexpected” to more accurately describe the generation of these structures.

      Fig. S3A. According to Major Comment above, please explain in more detail why and how lung marker expression is evident in induced "Controls" (i.e. organoids without Foxe1KO). Is it due to parallel/independent lung and thyroid differentiation? Is phenotype of rather Foxa1KO a matter of clonal selection?

      As noted in our previous response, the low lung marker expression observed in control organoids likely reflects the presence of Nkx2-1⁺ foregut progenitors that remain transcriptionally permissive to alternative Nkx2.1–dependent programs. In wild-type conditions, the majority of Nkx2.1⁺ cells co-express Pax8 (~80%), indicating robust thyroid commitment; the remaining ~20% of cells do not commit to the thyroid lineage, which could explain an “inefficient” parallel lung differentiation in the presence of Foxe1. In contrast, in Foxe1 KO organoids this proportion drops to ~20%, reflecting destabilization of the thyroid transcriptional network rather than induction of a new lineage. Consistent with this, chromatin accessibility analyses show reduced accessibility at the Pax8 locus in Foxe1 KO cells without significant gain at canonical lung loci. Together, these changes could allow the expansion of non-thyroid-committed progenitors and their acquisition of a lung cell fate owing to the permissive state of the chromatin. While expansion of rare lung-competent progenitors cannot be formally excluded, distinguishing between lineage plasticity and clonal expansion would require dedicated lineage-tracing experiments beyond the scope of this study.

      Figs. S3B-M. Scanning electron micrographs. Are these from one single (lung-like) structure imaged at different angles and magnitude or selected from multiple/different structures? If the latter, there a bias of selection that raises concern about cell identity. See similar SEM comment above.

      We thank the reviewer for this important point. The SEM images in the old Figures S3B–M did represent distinct lung-like structures rather than multiple angles of a single organoid, as we could not obtain representative images of all cell types from the same structure. However, the SEM data presented in Figure 2 already sufficiently highlight the distinct cell types and structures. To avoid redundancy, we have therefore removed panels S3B-M in the revised version of the figure.

      Line 181: Text states that cells additionally were visualized by microscopy, but this is not shown in Fig. 4.

      We thank the reviewer for pointing this out. The sentence has been revised to clarify that the reporter fluorescence can be used to track differentiation by microscopy, while the efficiency of Nkx2.1⁺ cell generation is quantified by flow cytometry, as shown in Figure S4D–E rather than Figure 4. The updated sentence reads:

      “The reporter fluorescence allowed tracking the Nkx2-1+ cells appearance by microscopy and quantification of the differentiation efficiency by flow cytometry (Figure S4D-E).”

      Fig. 4. Data based/biased on computationally Pax8-negative selected Foxe1KO cells. Are Pax8 negative cells present in "Control" (Foxe1+) organoids and a potential source of enrichment independent of the thyroid lineage?

      We thank the reviewer for raising this important point, which prompted us to further examine the Nkx2.1⁺/Pax8⁻ cell populations in both control and Foxe1 KO samples. Flow cytometry analysis (shown below) indicates that the proportion of Pax8+ and Pax8- cells among mKO2⁺ (Nkx2-1⁺) cells was comparable between control and Foxe1 KO organoids at day 9, two days after completion of doxycycline induction. This suggests that both thyroid and lung lineages were initially induced at similar levels in the two cell lines.

      This trend persists until day 12, when a clearer divergence between thyroid and lung fates begins to emerge in control versus Foxe1 KO organoids. Overall, these results indicate that Foxe1 expression reinforces thyroid lineage specification, whereas Foxe1 knockout results in an expansion of Nkx2.1+/Pax8- cells. Importantly, the PCA analysis of ATAC-seq data presented in Fig. 5G supports this conclusion.

      The paper by Fagman et al. (Am J. Pathol, 2004), which shows aberrant/ectopic thyroid differentiation in airway respiratory epithelium in ShhKO mouse embryos, may by cited and discussed with reference to the possible existence of bipotent lung/thyroid progenitors/stem-like cells in vivo.

      We thank the reviewer for this valuable suggestion and apologize for not citing this highly relevant study in the previous version of the manuscript. We have now incorporated a discussion of this work in the final paragraph of the revised manuscript.

      Added text in the manuscript: "In conclusion, the present work advances our understanding of the critical role of Foxe1 in initiating and sustaining proper thyroid tissue formation and function, while also highlighting novel molecular players for future investigation in thyroid biology. Beyond the thyroid, our findings underscore the intricate relationships among endodermal lineages during differentiation, particularly between thyroid and lung. Supporting this concept, in vivo studies by Fagman and collaborators (2004) showed that loss of Shh signaling during early organogenesis leads to thyroid dysgenesis and the appearance of aberrant thyrocytes expressing Nkx2-1, Foxe1, and Tg in the presumptive trachea, emphasizing the need to repress inappropriate thyroid programs in non-thyroid anterior foregut endoderm (Fagman et al., 2004). Building on this, it is intriguing to speculate that transient thyroid/lung bipotent progenitors may exist in vivo, analogous to the transient bipotent progenitors described during liver and pancreas development (Deutsch et al., 2001; Xu et al., 2011). Future studies using lineage tracing approaches could directly test the existence and fate of such progenitors, providing a deeper understanding of early endodermal plasticity and the mechanisms that safeguard lineage fidelity."

      Reviewer #1 (Significance (Required)):

      The results are indeed of great value mainly for developmental biologist interested in regenerative medicine and specifically concerning in vitro systems for lung and thyroid differentiation. The provided single cell data sets of thyroid progenitors undergoing differentiation and the impact of Foxe1KO are a major achievement and resource.

      This reviewer´s expertise is mainly in vivo thyroid development.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary: This study by Fonseca et al investigated how the specification of mouse ESCs towards thyroid lineage was regulated by the presence or absence of Foxe1, a thyroid specific transcriptional factor. Compromised thyroid induction was observed when Foxe1 was knocked out. Interestingly, the author found increased induction of lung cells in the absence of Foxe1, suggesting its role in regulating the balance of thyroid-versus-lung specification. While interesting, the main issue with this study is the lack of quantitative analysis of cellular specification, and the lack of comprehensiveness regarding the markers used to characterize each cell lineage, especially for the lung lineages.

      Major points:

      For analyzing the outcome of lineage specification in the comparison of with and without Dox or in the comparison of control versus Foxe1 KO, the only quantitative readout is qPCR. The author should perform additional characterization using flow cytometry for NKX2.1, Pax8, Tg, Tg-I, Ecad, and ZO-1 to reveal more clear mechanism: reduced number/percentage of cellular specification into thyroid lineage, or immature phenotypes in specified thyroid cells.

      We thank the reviewer for raising this important point. We agree that incorporating quantitative analyses is essential to confirm the phenotype driven by the loss of Foxe1 expression. To address this comment, we have added additional flow cytometry analyses at different time points throughout the culture in the revised manuscript (Fig. S6A–B). Specifically, we now include quantification of Nkx2.1/mKO2⁺ cells and Tg/GFP reporter⁺ cells in both control and Foxe1KO organoids from day 7 to day 21 of the differentiation protocol.

      These data show that up to day 10 there is no significant difference in the proportion of Nkx2.1⁺ cells generated under the two conditions. However, from day 11 onwards the trajectories diverge clearly: in control organoids, Nkx2.1⁺ cells reach approximately 50% of the population, whereas only 10–15% of cells become Nkx2.1⁺ in the Foxe1KO condition (Fig. 6A and Fig. S6B). These findings are consistent with the reduced proportion of Nkx2.1⁺Pax8⁺ cells observed in Foxe1KO organoids (Fig. 6B and Fig. S6B), confirming the impairment in thyroid cell generation caused by the loss of Foxe1 expression. In addition, although not the most precise measure, we also observed a similar reduction in the proportion of Tg-GFP⁺ cells in the Foxe1KO condition compared with controls (Fig. 6A).

      While these new results provide additional quantitative insight, accurately assessing the maturation state of the generated thyroid cells by flow cytometry remains challenging due to several technical limitations:

      1. Tg quantification: Despite testing several anti-thyroglobulin antibodies for flow cytometry, we were unable to obtain reliable staining. For this reason, we included quantification of the Tg/GFP reporter described above. Although Tg+ cells are clearly reduced in Foxe1 KO organoids, we previously demonstrated (Romitti and Eski et al., 2021; Fig. 2E) that the GFP reporter captures only a fraction of the Tg⁺ cell population present in the culture, so it is not the most accurate method for quantification.
      2. Tg-I and ZO-1 quantification: Because of their intraluminal and apical localization within thyroid follicles, quantification of Tg-I by flow cytometry is not possible, and ZO-1 staining proved technically difficult and did not yield reliable results.
      3. Assessment of immature vs. mature thyrocytes: We believe that the combined datasets presented in Fig. 1 and the scRNA-seq analysis (Fig. 3) provide sufficient evidence to interpret the Foxe1KO phenotype. Together, these results indicate that: (i) Foxe1KO organoids show a reduced efficiency in generating thyrocytes and Nkx2.1+ cells compared with the control line; and (ii) the few thyrocytes that form in the absence of Foxe1 display impaired maturation.

      The authors claimed that in the absence of Foxe1, lung organoid can be observed. Quantitative analysis, such as organoid count or flow cytometry, should be provided to assess this comparing organoid identities in the presence and absence of Foxe1.

      We thank the reviewer for this important comment and we agree that a precise quantification would reinforce our findings on organoid identities. As described above, we performed flow cytometry analyses to track Nkx2.1/Pax8 cell populations over time in both WT and Foxe1KO conditions. In the WT condition, approximately 80% of Nkx2.1⁺ cells are also Pax8⁺, consistent with thyroid lineage specification. In contrast, in the Foxe1KO condition, only ~20% of Nkx2.1⁺ cells co-express Pax8, indicating a strong reduction in thyroid lineage commitment.

      Although this approach does not directly quantify lung organoids, our scRNA-seq data show that the majority of Nkx2.1⁺Pax8⁻ cells in the Foxe1KO condition display an epithelial transcriptional profile, with a substantial proportion exhibiting a lung-like signature.

      Regarding a direct quantification of the proportions of each organoid type, we encountered several technical limitations inherent to organoid systems. In particular, variability between wells and differentiations, combined with the three-dimensional complexity of the cultures, makes reliable counting of distinct organoid identities challenging.

      With respect to flow cytometry-based quantification of lung identity, the diversity of lung epithelial cell types represents an additional challenge. Available markers often label only specific subpopulations and can overlap with thyroid markers. For example, Sox2 labels airway epithelial cells but not alveolar cells, whereas Sox9, which can mark distal lung progenitors, is also highly expressed in thyrocytes. Similarly, assays with secretory cell markers (Scgb3a2, Scgb1a1, and Muc5ac) did not yield reliable staining in our system. Hopx, an alveolar marker, is also detected in the thyroid population. Although thyroid cells can be specifically identified by Pax8 staining, this overlap further complicates the combination of markers required for reliable flow cytometry quantification of lung lineages.

      Taken together, and considering that our previous scRNA-seq work showed no clear lung differentiation in the control line, with only a small Nkx2-1+Krt5+ cluster detected (Romitti and Eski et al., 2021), our quantitative analyses rely primarily on Nkx2.1/Pax8 flow cytometry together with the transcriptional evidence provided by scRNA-seq.

      In Figure 2, the claim of lung cell identities is not well supported. (1) SEM data on alveolar and goblet cells is not conclusive;

      We agree with the reviewer that SEM alone is not sufficient for the definitive identification of specific lung cell types. In this study, SEM was primarily used to visualize ultrastructural features and morphological characteristics suggestive of differentiated epithelial cell types, based on comparisons with previously reported SEM images of human and mouse lung tissue.

      Importantly, our organoids do not represent adult lung tissue but rather likely correspond to early developmental stages of lung formation. This is an important consideration, as cells at these stages may not display all the morphological hallmarks observed in mature lungs. For example, the ciliated cells observed in our organoids present relatively short cilia, which is consistent with early stages of airway epithelial development. Similarly, the structures resembling alveoli are more consistent with developing alveolar sacs rather than fully mature alveoli. This developmental context is now clarified in the figure legend (Fig. 2H).

      To improve the clarity and interpretability of the SEM data, we revised the figure and replaced images in which the features were not sufficiently clear. The updated panel now includes images showing mucus accumulation within the luminal compartment, a broader view of developing alveolar sac–like structures, and a higher-magnification image highlighting cells with morphology consistent with alveolar type II–like cells. In addition, we included images of epithelial regions containing secretory cells and mucus deposits.

      Importantly, cell identity in our study was not inferred from SEM alone. Instead, we used several complementary approaches, including immunostaining, qPCR analyses, and single-cell RNA sequencing, to support the identification of the different lung epithelial populations present in the organoids.

      For these reasons, we chose to retain the SEM data in Fig. 2H while improving the image selection and annotations, as these images provide valuable ultrastructural information and illustrate morphological features consistent with differentiated lung epithelial structures.

      In addition, it is important to note that the culture medium composition in our system is not designed for optimal generation of lung organoids. Despite the indications that fetal-like lung organoids are generated, they might not follow the physiological path observed in optimized in vitro models and in vivo, which could affect the maturity and the proportions of the derived cells. This point is now discussed in the updated version of the manuscript.

      (2) Alveolar type 1 cells should be characterized by AGER and AQP5 besides HOPX

      We thank the reviewer for this valuable suggestion and agree that additional markers such as AGER and AQP5 would further support the identification of alveolar type I (AT1) cells. Following the reviewer’s recommendation, we performed additional immunostainings using AQP5 and AGER antibodies. However, we were unfortunately unable to obtain reliable staining that would clearly demonstrate AT1 cells in our organoid system.

      Nevertheless, both AQP5 and AGER transcripts are detected in the lung-like populations in our scRNA-seq dataset (Fig. 4 and examples shown below). Interestingly, their expression is not restricted to a single well-defined cluster, which may reflect the transitioning/immature state of the lung-like cells present in the organoids. An additional comparison with an in vivo dataset suggests an enrichment of the AT1 signature in cluster 0, which contains Foxe1KO-derived cells; however, this might not reflect full maturation of this cell type.

      Taken together, these observations further reinforce that while lung epithelial populations are present, the organoids likely represent an early developmental stage of lung differentiation rather than fully mature lung tissue, and therefore may not yet exhibit the clear marker segregation characteristic of adult alveolar cell types.

      (3) Alveolar types 2 cells should be characterized by NKX2.1 and SFTPC co-staining;

      As mentioned in our response to the previous comment, we encountered similar technical difficulties when attempting SFTPC immunostaining and were unfortunately unable to obtain reliable staining in our organoid system.

      In contrast to Aqp5 and Ager, Sftpc transcripts were not detected in our scRNA-seq dataset. However, several other markers commonly associated with AT2 cells, such as Napsa, Abca3, and Lpcat1, are expressed in the lung-like populations (examples shown below). In addition, comparative analyses with an in vivo mouse lung dataset indicate transcriptomic similarities between E17.5 AT2 cells in vivo and a subset of cells present in the Foxe1KO organoids (Fig. 4C). This analysis also highlights the possible presence of AT2 precursors, reinforcing the immaturity of the system.

      Taken together, these observations suggest the presence of AT2-like cells at an early developmental stage, rather than fully differentiated or functional AT2 cells. This interpretation is consistent with the overall developmental immaturity of the lung-like structures observed in our organoid system.

      (4) For showing proximal lung identities, it would be helpful if the authors can co-stain more than one lineage, such as basal cell together with goblet cell/ciliated cells to reveal potential pseudostratified epithelium.

      We thank the reviewer for this insightful suggestion. Addressing the spatial organization of proximal lung epithelial cell types within the organoids is indeed an interesting aspect. Based on our observations, multiple epithelial cell types do not appear to consistently coexist within the same organoid structure.

      Our analyses indicate that many organoids co-express basal cell markers (p63 and Krt5) together with Sox2, but not together with Muc5ac, a marker of goblet cells. This observation may suggest that the in vitro system does not fully recapitulate the progressive epithelial maturation and spatial organization seen in vivo, such as the formation of a pseudostratified airway epithelium.

      Ideally, this question would be addressed through three-dimensional immunostaining within individual organoid structures to visualize the spatial arrangement of the different epithelial lineages. However, despite several attempts, we were unable to obtain images that would allow reliable interpretation of such co-localization.

      Regarding ciliated cells, analysis of the scRNA-seq dataset indicates that they represent a relatively rare population in our cultures, which likely further limits the ability to visualize their spatial organization within organoids.

      Minor points:

      All characterization of in vitro induced thyroid cells should be accompanied by parallel analysis of native thyroid cells (from in vivo mice) that serve as a benchmark for the maturity of the induced cells. Some staining, such as Fig 1F on Tg-I remains quite different from what is reported from in vivo findings.

      We thank the reviewer for this important comment. In our previous work (Antonica et al., Nature, 2012), the characterization of thyroid organoids was extensively performed in direct comparison with native mouse thyroid tissue, and all antibodies used in the study were benchmarked using mouse thyroid as a positive control. Regarding the maturity of the thyroid organoids generated in vitro, we previously demonstrated both in vitro and in vivo thyroid hormone (TH) production, confirming the functional capacity of the derived thyroid cells. Although a certain degree of heterogeneity in maturation is observed within WT thyroid organoids, likely reflecting their fetal-like developmental state, these findings support the presence of functionally mature thyrocytes.

      To further address the reviewer’s concern, we have now included new Tg-I staining images in Fig. 1E, which more clearly illustrate the accumulation of the thyroid hormone precursor within the luminal compartment of follicles derived from WT mESCs.

      In addition, we would like to note that the specificity and suitability of the antibodies used to stain native mouse thyroid cells have been validated in several previous studies, including Dathan et al., Dev Dyn, 2002; Gérard et al., Am J Pathol, 2008; Hartog et al., Endocrinology, 1990.

      The labeling of panel E and F in Figure.1 should be switched.

      We thank the reviewer for bringing this to our attention. The labeling of panels E and F in Fig. 1 has been corrected accordingly in the revised manuscript.

      Reviewer #2 (Significance (Required)):

      This study provided direct in vitro evidence regarding the critical role of Foxe1 for thyroid lineage induction, and suggested its role in balancing thyroid versus lung fate determination. It is thus important to the field of both thyroid and lung developmental and stem cell biology. However, the significance of this study in hindered by the lack of comprehensiveness in the analysis.

      We thank the reviewer for the positive evaluation of our study and for recognizing its relevance to both thyroid and lung developmental biology. To address the concern regarding the comprehensiveness of the analysis, we have carefully revised the manuscript to improve clarity and to better present and discuss the results of our analyses. We believe that these revisions have strengthened the manuscript and improved the overall quality of the study.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Costagliola et al. have demonstrated that Foxe1, a transcription factor, plays a key role in the proper differentiation of Nkx2-1 (+) cells into thyroid follicles. They have also revealed that some Foxe1-null/Nkx2-1 (+) cells differentiate into the lung, including airway and alveolar epithelia, in their ES cell-derived organoid system. Although it has already been appreciated that Foxe1 contributes to the thyroid development in mice and humans, this excellent study has clarified that its absence, as a result, enhances the differentiation of Nkx2-1 (+) cells into the lung. I have no serious criticisms regarding methodology, results, and interpretation of results. I' d like you to elucidate whether similar findings are obtained even from human ES cell lines in the future.

      We would like to express our sincere gratitude to Reviewer #3 for the positive feedback on our work. We fully agree that it will be important to determine whether similar findings can also be observed using human embryonic stem cell (ESC) systems.

      While the mouse model used in this study was first reported in 2012 (Antonica et al., Nature, 2012), our group has more recently developed a corresponding system to generate functional thyroid follicular cells from human pluripotent stem cells (Romitti et al., Nature Communications, 2022). Using this human platform, we are currently investigating the role of FOXE1, as well as other genes associated with congenital hypothyroidism, in human thyroid development. We anticipate that these studies will provide further insight into the mechanisms controlling thyroid lineage specification and will be the focus of future work.

      Minor comment:

        • Fig 3C-E, Fig 6B, D, and F: These figures are so small that the words are almost illegible.

      We thank the reviewer for bringing this to our attention. The figures have been revised to improve readability, and the font sizes have been increased in Fig. 3C–E and Fig. 6B, D, and F in the updated version of the manuscript.

      Reviewer #3 (Significance (Required)):

      I'm a pathologist who specializes in lung cancer and the stem cells in the distal airway. This paper will probably attract those who are interested in the development of the thyroid or the lung, because the authors have revealed that 1) Foxe1 contributes to the proper thyroid development, and 2) its absence consequently enhances the differentiation of Nkx2-1 (+) cells into the lung.

      We thank the reviewer for this thoughtful comment and for highlighting the potential interest of our study for researchers working in thyroid and lung development. We agree that our findings provide new insight into the role of Foxe1 in thyroid lineage specification and suggest that its absence can shift the differentiation potential of Nkx2.1⁺ progenitors toward a lung epithelial fate. We hope that these results will contribute to a better understanding of the mechanisms regulating cell fate decisions within the anterior foregut endoderm and will be of interest to both the thyroid and lung developmental biology communities.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Costagliola et al. have demonstrated that Foxe1, a transcription factor, plays a key role in the proper differentiation of Nkx2-1 (+) cells into thyroid follicles. They have also revealed that some Foxe1-null/Nkx2-1 (+) cells differentiate into the lung, including airway and alveolar epithelia, in their ES cell-derived organoid system. Although it has already been appreciated that Foxe1 contributes to the thyroid development in mice and humans, this excellent study has clarified that its absence, as a result, enhances the differentiation of Nkx2-1 (+) cells into the lung. I have no serious criticisms regarding methodology, results, and interpretation of results. I'd like you to elucidate whether similar findings are obtained even from human ES cell lines in the future.

      Minor comment:

      1. Fig 3C-E, Fig 6B, D, and F: These figures are so small that the words are almost illegible.

      Significance

      I'm a pathologist who specializes in lung cancer and the stem cells in the distal airway. This paper will probably attract those who are interested in the development of the thyroid or the lung, because the authors have revealed that 1) Foxe1 contributes to the proper thyroid development, and 2) its absence consequently enhances the differentiation of Nkx2-1 (+) cells into the lung.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      This study by Fonseca et al. investigated how the specification of mouse ESCs towards the thyroid lineage is regulated by the presence or absence of Foxe1, a thyroid-specific transcription factor. Compromised thyroid induction was observed when Foxe1 was knocked out. Interestingly, the authors found increased induction of lung cells in the absence of Foxe1, suggesting its role in regulating the balance of thyroid-versus-lung specification. While interesting, the main issue with this study is the lack of quantitative analysis of cellular specification, and the lack of comprehensiveness regarding the markers used to characterize each cell lineage, especially for the lung lineages.

      Major points:

      For analyzing the outcome of lineage specification, whether comparing with and without Dox or control versus Foxe1 KO, the only quantitative readout is qPCR. The authors should perform additional characterization using flow cytometry for NKX2.1, Pax8, Tg, Tg-I, Ecad, and ZO-1 to distinguish between two possible mechanisms: a reduced number/percentage of cells specified into the thyroid lineage, or immature phenotypes in the specified thyroid cells.

      The authors claimed that in the absence of Foxe1, lung organoids can be observed. Quantitative analysis, such as organoid counts or flow cytometry, should be provided to assess this by comparing organoid identities in the presence and absence of Foxe1.

      In Figure 2, the claim of lung cell identities is not well supported. (1) SEM data on alveolar and goblet cells is not conclusive; (2) Alveolar type 1 cells should be characterized by AGER and AQP5 besides HOPX; (3) Alveolar type 2 cells should be characterized by NKX2.1 and SFTPC co-staining; (4) For showing proximal lung identities, it would be helpful if the authors could co-stain more than one lineage, such as basal cells together with goblet cells/ciliated cells, to reveal a potential pseudostratified epithelium.

      Minor points:

      All characterization of in vitro induced thyroid cells should be accompanied by parallel analysis of native thyroid cells (from in vivo mice) that serve as a benchmark for the maturity of the induced cells. Some staining, such as Tg-I in Fig. 1F, remains quite different from what is reported from in vivo findings.

      The labeling of panels E and F in Figure 1 should be switched.

      Significance

      This study provided direct in vitro evidence regarding the critical role of Foxe1 for thyroid lineage induction, and suggested its role in balancing thyroid versus lung fate determination. It is thus important to the field of both thyroid and lung developmental and stem cell biology. However, the significance of this study is hindered by the lack of comprehensiveness in the analysis.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      The authors investigate the effect of Foxe1KO primarily on thyroid differentiation of mouse ES cells following a previously established protocol based on sequential endoderm induction, Nkx2-1/Pax8 overexpression and stimulation of the TSHR/cyclicAMP pathway. Silencing of Foxe1 expression significantly suppresses the generation of functional thyroid follicles. By single cell profiling a great number of Foxe1 targeted genes are identified, some confirmed from previous studies and some are new candidates. Embryonic bodies lacking Foxe1 instead accumulate various lung lineage cells characterized by known cell type markers, which appear to organize in lung tissue-like structures. Based on these findings, it is suggested that Foxe1 might be involved in endoderm cell fate decisions.

      Major comments

      The title and abstract hold promise that Foxe1 is also a regulator of lung development, and that Foxe1 transcriptional activity might be decisive for thyroid versus lung fate decisions. However, there is no experimental support suggesting that one and the same ES cell at a certain critical time point may switch fate from thyroid to lung (or vice versa). Since lung markers are induced in Nkx2-1/Pax8/cAMP+ ESC it is likely that "control" organoids with maintained Foxe1 expression already contain lung lineage cells, which might expand simply by clonal selection as the thyroid lineage is suppressed by subsequent Foxe1 deletion. Although the authors discuss this to some extent, it is not obvious to readers without very careful reading that this alternative explanation is feasible and should be considered and problematized. Observations that Foxe1KO did not at all influence gene expression in expanding lung-like cells are consistent with the idea that lung and thyroid specification in the model are independent phenomena, and argue against the existence of a common bipotent progenitor. If the authors disagree, this issue should be more thoroughly discussed and argued for with more supporting experimental data than found in the current manuscript version.

      Minor comments

      What is the fraction of Nkx2-1+ cells that organize into follicles vs lung structures? Based on the provided overview images (e.g. Figs. S1, S4) the general impression is that most cells do not form 3D structures (i.e. do not differentiate). Please explain this and provide this information in the paper.

      Fig. 1C: Supposed follicles are not shown in this graph. Why does thyroglobulin not accumulate in the lumen (which, if present, would be a good means for quantification by counting follicles)? Indeed, follicles should be quantified to estimate induction success. Please also explain the rounded structures in the Foxe1KO image (are they distal lung buds?). Or are the Control and Foxe1KO images confused in this panel?

      Fig. 1E: text on Fig. legend is erroneously given under (F), whereas a dedicated and relevant text for (F) is missing.

      Fig. 1F. Immunostaining of iodinated thyroglobulin (Tg-I) is very poor. Is it due to a bad antibody (does it work well in in vivo thyroid stainings?) or is organification simply inefficient? Again, the poor Tg content in the lumen (as also suggested by Fig. S5A) is puzzling. Or are in vitro-generated follicles leaky (i.e. do not behave as natural thyroid follicles)?

      Figs. 2A-E: Comments on lung cell markers. A: E-cad is unspecific, Sox9 would better label branching morphogenesis C: co-staining for E-cad would help differentiate cell types. D: Goblet cells seem Nkx2-1 negative, please explain. E: Diffuse pattern. Are assumed club cells really Nkx2-1 pos? CC10 immunostaining might help. F: I doubt that SEM is conclusive for identification of specific (lung) cell types unless tissue architecture (e.g. proximal-distal positions) is considered for comparison to the natural branching process of the developing lung.

      Line 161: Is it really "spontaneous" generation? Please rephrase.

      Fig. S3A. According to the Major Comment above, please explain in more detail why and how lung marker expression is evident in induced "Controls" (i.e. organoids without Foxe1KO). Is it due to parallel/independent lung and thyroid differentiation? Is the phenotype of Foxe1KO rather a matter of clonal selection?

      Figs. S3B-M. Scanning electron micrographs. Are these from one single (lung-like) structure imaged at different angles and magnifications, or selected from multiple different structures? If the latter, there is a selection bias that raises concern about cell identity. See similar SEM comment above.

      Line 181: Text states that cells additionally were visualized by microscopy, but this is not shown in Fig. 4.

      Fig. 4. Data based/biased on computationally Pax8-negative selected Foxe1KO cells. Are Pax8 negative cells present in "Control" (Foxe1+) organoids and a potential source of enrichment independent of the thyroid lineage?

      The paper by Fagman et al. (Am J. Pathol, 2004), which shows aberrant/ectopic thyroid differentiation in airway respiratory epithelium in ShhKO mouse embryos, may be cited and discussed with reference to the possible existence of bipotent lung/thyroid progenitors/stem-like cells in vivo.

      Significance

      The results are indeed of great value mainly for developmental biologists interested in regenerative medicine and specifically concerning in vitro systems for lung and thyroid differentiation. The provided single cell data sets of thyroid progenitors undergoing differentiation and the impact of Foxe1KO are a major achievement and resource.

      This reviewer's expertise is mainly in vivo thyroid development.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Ruby Ponnudurai

      Scientific Editor

      Review Commons

      February 16th, 2026

      Dear Dr. Ponnudurai,

      Please see below for a detailed response to reviewers for manuscript #RC-2025-03108: "Short chain fatty acids regulate the chromatin landscape and distinct gene expression changes in human colorectal cancer cells".

      Authors' Summary: We thank all the reviewers for the constructive and immensely helpful reviews of our manuscript. We have revised the manuscript addressing the reviewers' comments, which we feel has substantially strengthened our paper. Please see below for our point-by-point responses to the comments, which are all indicated in blue text. All changes in the manuscript are also indicated in blue text.


      Reviewer #1 Evidence, reproducibility and clarity


      In this manuscript, Kabir et al. explore the impact of microbiota-derived short-chain fatty acids (SCFAs) on chromatin structure and gene expression in human cells. They show that SCFAs, particularly butyrate, contribute to specific histone modifications such as butyrylation at H3K27, detectable in human colon tissue. Additional modifications like acetylation, butyrylation, and propionylation at H3K9 and H3K27 respond to SCFA levels and are enriched at active regulatory regions in colorectal cancer cells. Treatment with individual or combined SCFAs mimicking gut conditions alters gene expression patterns, with butyrate playing a dominant regulatory role. Butyrate's effects on gene expression are claimed to be independent of HDAC inhibition and instead rely on the p300/CBP complex through histone butyrylation. These findings underscore SCFAs as crucial modulators of epigenetic regulation in the human colon and highlight butyrate's dominant role in shaping chromatin and gene regulation beyond its known metabolic functions.

      The authors used two human cell lines and an in vivo murine model paired with RNA and ChIP sequencing approaches to identify target genes and chromatin modifications in response to SCFAs.

      While the findings are interesting and could provide important insights into the epigenetic influence of SCFAs in human cells, the study would benefit from additional experiments to strengthen the conclusions. Comments and suggestions are listed below:

      Response: We sincerely thank the reviewer for their thoughtful and constructive comments. In addition, we appreciate the recognition of the potential impact of our findings. We have addressed all comments below.

      1. Figure 1: The H3K27bu expression in human biopsies highlights the clinical significance of the current study. However, the authors need to provide more information on the human colon samples, e.g., how many total patients were analyzed, and what were the age and/or sex. Only the methods mention the use of benign TMA; this should also be clarified in the figure legends. It would also be helpful to show histone butyrylation levels in normal vs. cancer human tissues.

      Response: We completely agree that analysis of additional patient samples is important. In light of this comment, we have expanded our analysis of human colon samples. In the original manuscript, we showed IF images from patient intestinal sections. Patient demographic information (age and sex) is now included in the figure legend. While we analyzed two patients by IF, we realized that images from only one patient are shown. We also felt it was important to add additional rigor to our patient analysis. Therefore, we have incorporated additional patient samples and performed H3K27bu staining using IHC across normal and colon cancer sections obtained from 40 different patients. This is now included as Supplemental Figure 1. In addition, we have included information about age, sex, staging, and grading in Supplemental Figure 1C. Interestingly, we observed that adenocarcinoma patients have significantly decreased levels of H3K27bu compared to normal colon or normal adjacent tissues (Supplemental Figure 1B). We speculate that this may be due to alterations with the microbiota composition and dysbiosis associated with colorectal cancer (PMIDs: 26515465, 25758642, 25699023). Very interestingly, this is in contrast to reports of elevated H3K27ac in colon cancer samples (PMID: 24994966). We are excited to explore this further, and this is something we plan to follow-up on in future studies.

      1. Figure 1: In addition, given that the butyrate level descends towards the base of the colonic crypt (with the highest at the top of the crypt where mature intestinal epithelial cells reside) (Kaiko et al., 2016), it is important to show how the H3K27bu signature is distributed along the crypt. This data would further emphasize the clinical relevance of this study, given that most colorectal cancers (CRCs) arise from stem and progenitor cells.

      Response: We agree that this is an important question and recognize the elegant study by Kaiko et al. However, our human samples are obtained from commercially available tissue microarrays and the sectioning is not consistent across samples, leaving very few samples in which we could analyze staining patterns from crypt to villi (please see Supplemental Figure 1A for example sections). This unfortunately prevents us from completing rigorous image analysis. In future studies, we plan to perform this analysis after we obtain an IRB protocol that will allow us to answer this question in the most rigorous way possible.

      Throughout the manuscript: The rationale for selecting the two CRC cell lines (HCT 116 and Caco2) should be explained. While commonly used, providing background on their genetic differences (e.g., driver mutations) is important, as this could greatly influence the PTM landscape.

      Response: We chose to use both HCT-116 and Caco-2 cancer cell lines throughout our studies, since as noted these cells are the most commonly used lines in the literature. In addition, having consistent results across distinct genetic backgrounds strengthens our results: using both cell lines tells us whether observed PTM patterns are conserved across genetically diverse CRC contexts, as HCT-116 is characterized by mutations in KRAS and PIK3CA, while Caco-2 has mutations in APC and TP53 (PMIDs: 17088437, 24755471, and 16418264). We have added this information into the text in lines 106-107.

      The study lacks additional controls, such as a normal colon epithelial cell line and a non-colonic cell type. Including these would help determine whether the observed butyrate effects are tissue- or disease-specific. This data would also help assess whether SCFA effects, and specifically butyrate's effects, on histone acylation and gene expression are systemic or local.

      Response: Thank you for this insightful comment. We have now included additional data using normal colon cells in the form of mouse colon organoids and a distinct non-intestinal cell line, the embryonic kidney cell line HEK 293T. Importantly, we observe similar changes to chromatin after treatments with different SCFAs in both colon organoids and HEK 293T cells as shown in the cancer cell lines (Figure 1E, 1F). Interestingly, we also observe that the colon cancer cell lines have visible signal of histone butyrylation without treatment, while we only observe these modifications in HEK 293T cells following treatment.

      As for understanding systemic vs. local effects of butyrate on chromatin, we additionally treated cells with different concentrations reflecting the intestinal lumen or serum concentrations of SCFAs: 5 mM and 5 µM, respectively. While the concentrations of SCFAs can vary across individuals, we felt that these numbers reflected differences in intestinal vs. serum levels based on the literature (summarized in PMID: 27259147). Importantly, we observe that only the 5 mM SCFA treatment reflecting levels in the intestinal lumen results in induction of histone acetylation and butyrylation, while the 5 µM treatment reflecting serum SCFA levels failed to induce increased levels of these histone modifications (Figure 1F).

      Together, this data suggests that the response on chromatin to SCFAs is more universal at high concentrations. However, based on local vs. systemic concentrations throughout the body, we expect that responses on chromatin will largely be restricted to the intestine or in other areas or conditions where high concentrations of metabolites are localized.

      Figure 2: The authors show ChIP-seq results in the HCT 116 cell line. To exclude the possibility that the demonstrated chromatin signatures are cell line-specific, results from Caco2 should also be shown. In addition, the 2D environment and multiple passaging alter gene expression in cell lines; using human colonic organoids would provide a more clinically and physiologically relevant model.

      Response: We have now added Cut&Run analysis for the histone acyl marks of interest in Caco-2 cells, which is a technique analogous to ChIP to map genomic localization. Please see now Figure 2C-D. Importantly, we observe very similar localization of these histone modifications across the different cell lines. We also agree that the question of how 2D vs. 3D environment may impact localization of these modifications is important. In organoids, ChIP-seq and Cut&Run are technically difficult. In addition, we feel that using human organoids is currently beyond the scope of our manuscript. However, we previously characterized H3K27bu and H3K27ac occupancy from primary epithelial cells isolated from the mouse intestine (PMID: 38413806). Importantly, in this study we observed similar genomic enrichment of H3K27bu and H3K27ac. This suggests that the general patterns of localization of these modifications are similar across species and across cells from both 2D and 3D systems.

      Figure 4 is very confusing. Entinostat itself, as an HDAC inhibitor (iHDAC), increases butyrylation. The data shown are insufficient to draw conclusions. First, the authors should use additional iHDACs, and second, they should illustrate the overlap in gene expression changes between all treatments using a Venn diagram to clarify which genes/signatures are specific to each treatment.

      Response: We agree that testing additional iHDACs is important. We have now included an additional iHDAC (Tucidinostat) in our studies to draw broader conclusions beyond the activity of Entinostat. We have performed additional treatments, demonstrating that all iHDACs tested increase both histone butyrylation and acetylation (Supplemental Figure 8A-B). We also have performed qPCR for candidate differential genes and demonstrated that expression changes following our treatments with Tucidinostat phenocopy changes observed with Entinostat (Figure 5F). These dynamic gene changes show examples of genes that are responsive to butyrate treatment and p300/CBP inhibition, yet differ from other iHDAC treatment. As requested, we have additionally added an Euler plot to Figure 4 depicting the overlap between treatments in this figure (Figure 5C).

      Figure 4: The authors use an HDAC inhibitor to rule out butyrate's effect on gene expression via HDAC inhibition. However, butyrate can also modulate gene expression through activation of GPR109a. Using GPR109a antagonists is necessary to address this possibility. These data are essential to validate the specific role of histone butyrylation in gene regulation.

      Response: We thank you for this comment and completely agree that butyrate can act through multiple mechanisms, including activation of GPR109a. However, it has previously been demonstrated that this receptor is silenced via DNA methylation in human colon cancer samples and colon cancer cell lines, including HCT-116 (PMID: 19276343). Supporting this notion, we observed very low expression levels of this receptor in our HCT-116 cells (please note the very low TPM values), with minimal differences in response to butyrate treatment (Supplemental Figure 6E, included below). We have additionally included gene expression data for two other potential GPCRs activated by butyrate or other SCFAs (FFAR2 and FFAR3), and also observe very low expression of these genes. Therefore, we concluded that the butyrate effects on gene expression independent of HDAC inhibition in our data are not likely to be dependent on GPR109A or FFAR2/3 signaling.


      New Supplemental Figure 6E: mRNA expression of GPCR genes that are known SCFA targets. Levels of mRNA expression (transcript per million, TPM) as assayed by RNA-seq of GPR109A (official gene name HCAR2), FFAR2, and FFAR3 in HCT-116 cells. Expression levels related to Figure 3. Statistical significance was determined using ANOVA adjusting for multiple comparisons with p

      Supplementary Figure 4 and manuscript: There is no in vivo methods section describing the tributyrin-gavaged mice. The authors should clarify how the experiment was performed, how cells were isolated, whether sorting was performed, and which markers were used.

      Response: We apologize for this confusion. The in vivo data is from previously published work that is publicly available (PMID: 38413806). We analyzed data from mice that were gavaged with tributyrin, where non-sorted IECs were analyzed for RNA-seq. We have clarified this and have added this information in the figure legend (now Supplemental Figure 6).

      Supplementary Figure 4: The GO analysis results show that lipid catabolism is among the top differentially enriched pathways. Butyrate is a known PPARγ agonist (Litvak et al., 2018), and activation of PPARγ is known to drive expression of genes involved in lipid metabolism. The authors need to rule out this function of butyrate before attributing this signature solely to histone butyrylation.

      Response: We appreciate this point and have performed additional analysis to identify whether canonical PPARγ target genes are enriched or not in our data. Additionally, we recognize that our data may reflect the combined effects of both PPARγ activation and histone butyrylation. In Supplemental Figure 6 (Supplemental Figure 4 in the previous version), we especially acknowledge that the differential genes changing may be due to varied mechanisms of butyrate action. Therefore, to address this comment, we performed additional analysis on data related to Figure 5 (previously Figure 4), where we have additional treatments including using a p300/CBP inhibitor to identify potentially more chromatin related mechanisms of action.

      We have now extended our analysis of RNA-seq data related to Figure 5 to include gene ontology enrichment that is not dependent on clustering (Supplemental Figure 9A). While we do not observe PPARγ target genes as top enriched categories, we have also specifically tested the enrichment of PPARγ-related MSigDB groups using publicly available datasets (Supplemental Figure 9B). Here, we observe some enrichment of different gene sets related to PPARγ activity across different tissue systems. Together, this new data suggests that some PPARγ targets are enriched with our different cell treatments, including butyrate, but they are not the predominant gene categories that we observe changing.

      Most PPARγ target genes have been identified in tissue systems beyond the gut, such as adipose tissue and immune cells. To specifically analyze genes in the intestine that are PPARγ-dependent, we identified select genes in the literature (PMIDs: 29182565, 28798125, and 28798125). In PMID: 29182565, these genes include lipid transport (Cd36), lipolysis (Hsl, and Atgl), and various lipid metabolism pathways (Cact, Fasn, Mlycd, Dgat2, and Agpat9). In PMID: 28798125, these genes include HMOX1, PDK4, ANGPTL4, UCP2, AQP8, and PLIN2 as butyrate/PPARγ targets. PMID: 28798125 identified Nos2 as a butyrate and PPARγ target. Their expression levels following butyrate and other treatments in Figure 5 (formerly Figure 4) are now included as Supplemental Figure 9C (also included below). Interestingly, these genes respond differently compared to the other iHDAC tested (Entinostat) and are only mildly impacted by p300/CBP inhibition (please see A485_Butyrate column vs. Butyrate alone). This suggests that the major impacts on this pathway are not through p300/CBP activity or histone butyrylation, but may be due to other mechanisms of butyrate action. We have also included additional discussion of butyrate and potential roles of PPARγ signaling in lines 243-256.

      New Supplemental Figure 9C.

      It would be helpful to include a table of differentially abundant genes as a supplement to the heatmaps and GO analysis.

      Response: We are happy to include tables of differentially expressed genes from all our analysis as supplemental files. This is now included as Supplemental Table 1.

      Significance

      This study explores how microbiota-derived SCFAs, particularly butyrate, influence histone acylation and gene regulation. While the topic is relevant, the work lacks important controls (e.g., normal epithelial and non-colonic cells) and omits mechanistic validation (e.g., GPR109a signaling, PPARγ involvement). The rationale for cell line selection is unclear, and in vivo methods are insufficiently described.

      Audience: The study will mainly interest specialists in microbiota-chromatin interaction. Broader impact is limited by the narrow model scope and underdeveloped mechanistic insight.

      My Expertise:

      Cancer biology, in vivo models, microbiota-host interactions.

      Response: We sincerely thank the reviewer for their very helpful comments. We hope that the above point-by-point responses adequately address concerns regarding controls, mechanistic validation, and methods description. We really appreciate their note that the topic is relevant, yet we also feel that our work will have broader impacts due to the interdisciplinary nature of the research and the inclusion of additional model systems (intestinal organoids and additional cell lines) and mechanistic experiments.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      This study presents a novel finding that short-chain fatty acids (SCFAs) produced by microbial metabolism regulate gene transcription in human colon cancer cells by modulating histone H3K9 and H3K27 butyrylation and propionylation, both of which are associated with an open chromatin state. The authors further reveal that the major effect of the SCFA mixture is driven by butyrate and identify p300/CBP-dependent, rather than HDAC inhibition-dependent, gene regulation by butyrate. Overall, this is a well-organized study that provides valuable insight into the role of metabolites in human cells.

      Response: Thank you for your positive review of our manuscript. We really appreciate the reviewer pointing out the novelty and organization of our study. Please see below for point-by-point responses to your comments.

      Major comments:

      1. In Figures 1C and 1D, why did the SCFA mixture not increase histone butyrylation or propionylation to the same level as single butyrate treatment?

      Response: Thank you for this question. We believe that this effect is observed due to differences in butyrate concentrations, as we aimed to keep the total concentration of SCFAs equal across all treatments at 5 mM. In the single treatment, butyrate is at 5 mM while in the mixtures, butyrate is at 1.67 mM (1:1:1) or 1 mM (3:1:1). In addition, in Figure 3A we included a 15 mM mixture for RNA-seq analysis, where butyrate and the other SCFAs are all at 5 mM concentrations. Since we observed highly similar patterns of gene expression with 15 mM or 5 mM final SCFA mixture concentrations, we did not include the 15 mM treatment in our other experiments.
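      The butyrate concentrations quoted in this response follow from splitting the total SCFA concentration by the mixing ratio. The short sketch below reproduces that arithmetic. The function name `split_by_ratio` is hypothetical, and assigning the three parts of the 3:1:1 mixture to acetate is an assumption made only for illustration; the response states only that butyrate is at 1 mM in that mixture.

```python
def split_by_ratio(total_mm, ratio):
    """Return per-component concentrations (mM) for a mixture with the given total."""
    parts = sum(ratio.values())
    return {name: total_mm * share / parts for name, share in ratio.items()}


# 5 mM total, equal parts of each SCFA
equal_5 = split_by_ratio(5, {"acetate": 1, "propionate": 1, "butyrate": 1})
# 5 mM total, 3:1:1 (which SCFA takes the 3 parts is assumed here, not stated)
skewed_5 = split_by_ratio(5, {"acetate": 3, "propionate": 1, "butyrate": 1})
# 15 mM total, equal parts, as used for the RNA-seq comparison
equal_15 = split_by_ratio(15, {"acetate": 1, "propionate": 1, "butyrate": 1})

for label, mix in [
    ("5 mM, 1:1:1", equal_5),
    ("5 mM, 3:1:1", skewed_5),
    ("15 mM, 1:1:1", equal_15),
]:
    print(f"{label}: butyrate = {mix['butyrate']:.2f} mM")
```

      Under these assumptions butyrate works out to 1.67 mM, 1.00 mM, and 5.00 mM respectively, matching the values given in the response.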

      In Figure 3B, how does butyrate block the effects of acetate and propionate on transcription?

      Response: This is a great question, but we are not necessarily claiming that butyrate is blocking effects of acetate and propionate on transcription. For example, it is also possible that butyrate induces more gene expression changes compared to acetate or propionate, as the number of differentially expressed genes is greater in butyrate-treated cells (Response Table 1).

      | Comparison vs. vehicle | Upregulated (log2FC > 0) | Downregulated (log2FC < 0) | Upregulated (log2FC > 1) | Downregulated (log2FC < -1) |
      | --- | --- | --- | --- | --- |
      | Acetate | 3160 | 3518 | 433 | 352 |
      | Propionate | 3402 | 3854 | 1304 | 735 |
      | Butyrate | 4600 | 4539 | 2082 | 1727 |

      __Response Table 1. Number of differentially expressed genes for each SCFA treatment group, related to Figure 3.__ RNA-seq was performed on HCT-116 cells grown in DMEM and treated with 5 mM of single SCFAs for 6 hours. Differential genes were identified using the DESeq2 Wald test, and statistically significant genes were defined using a padj cutoff.

      To fully understand mechanistic differences of butyrate vs. acetate or propionate, we would need to perform additional experiments that we feel are beyond the scope of this current manuscript. However, we speculate that several mechanisms could account for these differences: for example, different histone acylations could have differential impacts on chromatin structure, reader binding, or transcription factor recruitment. As for blocking effects, select longer acylations (butyrylation and crotonylation) have been demonstrated to have repressive effects on transcription or reader protein binding in specific cell contexts (example PMIDs: 27105113, 31676231, 37311463). These are important future studies for our group and will likely provide additional mechanistic insight into the functions of different histone acylations. We have highlighted some of these concepts in the discussion (lines 301-310):

      "We also observe that butyrate and propionate treatment have both overlapping and distinct effects on gene regulation (Figure 3, Supplemental Figure 4, Supplemental Figure 8D). Propionate appears to have more modest effects compared to butyrate, as it induces a smaller number of differential gene changes and these genes do not display enrichment in ATP and nucleotide metabolism categories. These differences in gene regulatory responses to the different SCFA treatments could be due to multiple mechanisms. For example, we speculate that there could be chromatin-independent functions through distinct alterations in metabolic or signaling pathways or chromatin-dependent mechanisms through potential distinct structural effects on chromatin or differences in reader protein binding."

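As an illustration of how counts such as those in Response Table 1 can be tallied from a differential expression table, the sketch below thresholds on adjusted p-value and log2 fold change. The `GeneResult` record, the `count_degs` helper, the toy values, and the 0.05 cutoff are all assumptions made for the example; the cutoff actually used is not legible in this record, and the negative threshold is assumed to mirror the positive one.

```python
from typing import Iterable, NamedTuple, Tuple


class GeneResult(NamedTuple):
    gene: str
    log2fc: float
    padj: float


def count_degs(
    results: Iterable[GeneResult], padj_cutoff: float, lfc_cutoff: float = 0.0
) -> Tuple[int, int]:
    """Return (upregulated, downregulated) counts among significant genes."""
    significant = [r for r in results if r.padj < padj_cutoff]
    up = sum(1 for r in significant if r.log2fc > lfc_cutoff)
    down = sum(1 for r in significant if r.log2fc < -lfc_cutoff)
    return up, down


# Toy values for illustration only; these are not results from the study.
toy_results = [
    GeneResult("gene_a", 2.3, 0.001),
    GeneResult("gene_b", 0.4, 0.01),
    GeneResult("gene_c", -1.5, 0.02),
    GeneResult("gene_d", -0.2, 0.04),
    GeneResult("gene_e", 3.0, 0.5),  # large change, but not significant
]

any_change = count_degs(toy_results, padj_cutoff=0.05)
large_change = count_degs(toy_results, padj_cutoff=0.05, lfc_cutoff=1.0)
print(any_change, large_change)
```

Applying the same function with two fold-change cutoffs gives the two pairs of columns in a table of this shape.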
      Which pathways are associated with acetate- and propionate-specific DEGs?

      Response: Thank you for this insightful question. We have performed gene ontology analysis for acetate and propionate DEGs. Interestingly, the different SCFA treatments largely overlap (Supplemental Figure 4A). However, propionate treatment fails to enrich for select gene ontology categories that we observe in acetate treatment (Supplemental Figure 4B, included below). For example, by gene set enrichment analysis, acetate enriches for gene categories related to nucleotide and ATP synthesis, while propionate does not. However, both acetate and propionate (and all SCFA treatments) are enriched in categories related to the ribosome and rRNA (Supplemental Figure 4B-C). We have added this analysis to the manuscript as Supplemental Figure 4 and included additional discussion of this analysis in the text in lines 163-171 (included below), as well as additional speculation about differences between propionylation and butyrylation in lines 301-310 (included above).

      *"We further analyzed gene programs changing with different SCFA treatments. All SCFA treatments regulated largely overlapping gene programs including those related to RNA metabolism, ATP synthesis, and ribosome function (Supplemental Figure 4a). Since butyrate overlapped greatly with the combination SCFA treatment, we specifically analyzed acetate and propionate gene programs (Supplemental Figure 4b-c). Interestingly, propionate treatment failed to enrich for select gene ontology categories that we observe in other SCFA treatments. Specifically, propionate-dependent gene programs did not include those related to ATP and nucleotide metabolism, highlighting some differences in gene expression changes following different SCFA treatments."*

      New Supplemental Figure 4B.

      Which genes are related to growth inhibition in butyrate-treated cells? Does the 1:1:1 SCFA mixture have a similar impact on cell growth as single butyrate treatment?

      Response: Butyrate has previously been shown to inhibit cell growth in colon cancer cells (PMIDs 9125124, 33017771, 38398853). These effects include differential regulation of key cell cycle regulators, such as p21 and Cyclin D1. We have included both GO term enrichment for the 1:1:1 SCFA mix and gene expression data for select cell cycle regulators in Supplemental Figure 7C-D (7D also included below). This demonstrates that both butyrate and the SCFA mixtures, and to a lesser extent propionate, differentially regulate key cell cycle genes including CDKN1C, CDK2, CDK4, WEE1, and RB1. We have additionally performed a GLO assay for the 1:1:1 SCFA treatment to investigate its impact on growth inhibition, which is now included as Supplemental Figure 7B. Here, we observe that the 1:1:1 and 3:1:1 mixtures of SCFAs significantly decrease cell viability, although not to the same extent as butyrate treatment alone. Together, these data suggest that butyrate reduces cell viability at least in part by altering key cell cycle genes. The SCFA mixture treatments mimic this effect, but to a lesser extent than butyrate alone.

      New Supplemental Figure 7D.


      Reviewer #2 (Significance (Required)):

      General assessment: This study clearly demonstrates the role of butyrate in gene regulation and elucidates its underlying regulatory mechanisms. However, it does not provide insight into how butyrate counteracts the effects of acetate and propionate, despite these metabolites often being detected together. In addition, it remains unclear which specific histone PTMs are associated with the distinct gene expression changes induced by different short-chain fatty acids. Lastly, the observation that histone butyrylation and propionylation correlate with active transcription is not novel.

      Advance: This study advances understanding of short-chain fatty acids in chromatin and gene regulation, highlighting butyrate's dominant role and its p300/CBP-dependent rather than HDAC inhibition-dependent mechanism.

      Audience: This work may attract significant interest in both the epigenetics and metabolism fields.

      My expertise: histone acetylation, HATs, transcriptional regulation, cancer.

      Response: We very much appreciate all of these thoughtful comments. We are thankful for the recognition that this story advances our understanding of SCFA function through chromatin and may be of significant interest to the epigenetics and metabolism fields. We hope that we have now provided additional insight into roles of propionate and acetate (Supplemental Figure 4). We also recognize that similar to other studies, we observe colocalization of the different histone marks and it is difficult to tease apart specific functions. We plan to further address this important question in future studies.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: The authors explore the effects of short-chain fatty acids (SCFAs) acetate, propionate, and butyrate on chromatin and gene expression in human colon cancer cells. The authors first characterize the presence of histone propionylation and histone butyrylation in different colon cancer cell lines as a function of SCFA treatments. Then, they perform ChIP-seq to determine the genomic localization of these marks and observe that these marks are deposited on euchromatic regions similar to H3K4Me3 and to one another, consistent with previous reports. The authors then performed gene expression analysis to determine the contribution of the SCFAs. Interestingly, they observe that butyrate treatment alone mimicked the gene expression profile of an equimolar mixture of short-chain fatty acids treatment, at least in the tested cell lines. Finally, the authors designed an experiment to try to separate the functions of butyrate on gene expression that are dependent on p300/CBP and are independent of the HDAC inhibition property. The following aspects of the paper need addressing:

      Response: We sincerely thank the reviewer for their very helpful and constructive comments. We appreciate the notes on interesting aspects of our study. We hope that we have addressed all concerns as described below.

      Major comments

      1. There is no confirmation of the validity of the results seen from ChIP-seq (Figure 2) and RNA-seq (Figure 3). The majority of the findings of the paper are derived from ChIP-seq and RNA-seq data, and hence, experiments validating such results need to be established. ChIP-qPCR for representative gene(s) with adequate controls needs to be performed for different acyl marks (H3K27bu, H3K27pr, H3K4Me3, H3K9pr, H3K9bu) to support the ChIP-seq results, and RT-qPCR for representative gene(s) for different treatment conditions (vehicle, acetate, propionate, butyrate, and 5 mM 1:1:1 mixture) for validating RNA-seq results.

      Response: We are happy to include qPCR validation of our ChIP-seq and RNA-seq results. The qPCR validation for Figure 3 is now included as Figure 3F and qPCR validation for ChIP-seq is included as Figure 4C. We have selected genes that are differentially expressed and also display occupancy of different histone acyl marks. In addition, we performed additional qPCR validation for our RNA-seq data related to Figure 5 (previously Figure 4), which is now included as Figure 5F-G. Lastly, we performed orthogonal analysis of ChIP using Cut&Run in Caco-2 cells, which is now included as Figure 2C-D. This further supports our findings with HCT-116 cells.

      The authors describe an interesting strategy to differentiate the different functions of butyrate (Figure 4). The authors propose that differential genes that change with p300/CBP inhibitor treatment, that are separate from HDAC inhibitor treatment, are potential genes that are a function of histone butyrylation. An important control that is missing in this experiment is treatment of cells with propionate. In their previous findings (Figure 1C-D), they note that both propionate and butyrate treatments elevate the levels of histone acetylation, propionylation, and butyrylation. But the HDAC inhibitory activity of propionate is not very well established, and performing experiments to prove it is beyond the scope of this paper. Importantly, p300/CBP has been shown to catalyze histone propionylation with higher efficiency compared to histone butyrylation (PMID: 27820805, PMID: 29070843). Therefore, it would be ideal to include differentially expressed genes from propionate-treated cells in the analysis to rule out any discrepancy.

      Response: Thank you for this insightful comment. We agree that propionate also elevates histone butyrylation and may have important effects. We have therefore included our differentially expressed genes with propionate treatment from Figure 3 in our analysis related to HDAC inhibition: we have plotted these differentially expressed genes in a matched, ordered column to our clustering analysis in Figure 4 (now Figure 5) as Supplemental Figure 8D (also included below). This demonstrates that, overall, propionate induces gene expression changes similar to those induced by butyrate, but less pronounced. In addition, our qPCR validation analysis in Figure 3F demonstrates that propionate similarly regulates some differentially expressed genes affected by butyrate (such as PHOSPHO1 and HOXB9) but fails to differentially regulate other targets (such as CYSRT1). This suggests that propionate and butyrate have both overlapping and distinct targets, which is consistent with our global analyses in Figure 3A-D. Lastly, we now have included specific analysis of gene program changes related to propionate treatment (Supplemental Figure 4). Interestingly, the different SCFA treatments largely overlap (Supplemental Figure 4A). However, propionate treatment fails to enrich for select gene ontology categories that we observe in other SCFA treatments (Supplemental Figure 4A-B). For example, by gene set enrichment analysis, other SCFA treatments enrich for gene categories related to nucleotide and ATP synthesis, while propionate does not. However, all SCFA treatments are enriched in categories related to the ribosome and rRNA (Supplemental Figure 4B-C). Together, these data suggest that propionate has largely similar effects to butyrate treatment in regulating gene expression programs, with some distinct differences.

      New Supplemental Figure 8D.
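The comparisons discussed in this exchange amount to set operations on lists of differentially expressed genes. A minimal sketch, assuming each treatment is summarized as a set of gene identifiers; the gene names are placeholders and both helper functions are hypothetical, not the pipeline used in the manuscript:

```python
def overlap_summary(first, second):
    """Partition two DEG sets into shared and treatment-specific genes."""
    return {
        "shared": first & second,
        "only_first": first - second,
        "only_second": second - first,
    }


def p300_dependent_hdac_independent(butyrate_degs, p300_inhibitor_sensitive, hdac_inhibitor_degs):
    """Butyrate DEGs that respond to p300/CBP inhibition but not to the HDAC inhibitor."""
    return (butyrate_degs & p300_inhibitor_sensitive) - hdac_inhibitor_degs


# Placeholder identifiers for illustration only; these are not real results.
butyrate = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
propionate = {"GENE_A", "GENE_B", "GENE_E"}
p300_sensitive = {"GENE_A", "GENE_C", "GENE_F"}
hdac_inhibitor = {"GENE_C", "GENE_D"}

summary = overlap_summary(butyrate, propionate)
candidates = p300_dependent_hdac_independent(butyrate, p300_sensitive, hdac_inhibitor)
print(sorted(summary["shared"]), sorted(candidates))
```

The same partition can be reported as a Venn diagram or as a matched column alongside a clustered heatmap.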

      Along the same lines as comment #2, other possible "functions" of propionate and/or butyrate that could explain why treatment with them increases histone acetylation, propionylation, and butyrylation are not discussed. This work was not cited/discussed: PMID 34677127 despite being very closely related and relevant. Indeed, there seems to be some redundancy of efforts between that paper (2021) and this one even in terms of the specific experiments performed.

      Response: Thank you for this comment, and we sincerely apologize for our oversight in not citing this important work. We are very familiar with this paper, and its omission was accidental. We have now cited it throughout the text in lines 51, 123, and 330. In addition, we expanded our discussion about how our single treatments of butyrate or propionate increase levels of multiple histone acyl marks including acetylation, butyrylation, and propionylation. We now include activation of p300 as a potential mechanism for this observation in lines 327-330: "This is consistent with the role of butyryl-CoA and propionyl-CoA functioning as activators of p300 acetyltransferase activity, where these molecules can directly stimulate p300 auto-acylation and acetylation activity on histones and other substrates [12]." Lastly, while we agree that many of our treatments are similar to this paper, we also feel that our downstream analysis is distinct, as we are focusing on genomic localization and gene expression changes, in addition to changes in levels of the histone marks themselves. We believe that this distinction lessens the redundancy between our papers and may be of interest to the chromatin field.

      An analysis for correlations between the ChIP-seq data for H3K27bu (Fig 2) and RNA-seq data following butyrate treatment (Fig 3) would provide further insights into whether the genes/pathways that are enriched/downregulated in H3K27bu ChIP-seq data correlate with genes/pathways that are upregulated/downregulated in RNA-seq data.

      Response: We really appreciate this suggestion and agree that this analysis would add important additional insights. We have therefore performed this analysis by binning genes by expression level and analyzing H3K27bu occupancy across gene expression quartiles, which is now included as Figure 4B. Additionally, we included the other histone butyrylation and propionylation marks that are the focus of our manuscript. We have found that H3K27bu occupancy is highest in the high gene expression quartiles. Importantly, this is also consistent with our earlier work in primary mouse intestinal cells (PMID: 38413806).
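The quartile analysis described in this response can be sketched as follows. This is a minimal illustration under stated assumptions: each gene has one expression value and one occupancy value, `occupancy_by_expression_quartile` is a hypothetical helper, and the toy numbers are invented for the example rather than taken from the study.

```python
from statistics import mean, quantiles


def occupancy_by_expression_quartile(expression, occupancy):
    """Mean occupancy per expression quartile (1 = lowest, 4 = highest).

    Both arguments map gene identifiers to a single numeric value.
    """
    genes = sorted(expression.keys() & occupancy.keys())
    # Three cut points that split the expression values into four bins.
    cuts = quantiles([expression[g] for g in genes], n=4)
    bins = {1: [], 2: [], 3: [], 4: []}
    for gene in genes:
        quartile = 1 + sum(expression[gene] > cut for cut in cuts)
        bins[quartile].append(occupancy[gene])
    return {q: mean(values) for q, values in bins.items() if values}


# Toy data: occupancy rises with expression by construction.
toy_expression = {f"g{i}": float(i) for i in range(1, 9)}
toy_occupancy = {f"g{i}": 0.5 * i for i in range(1, 9)}

quartile_means = occupancy_by_expression_quartile(toy_expression, toy_occupancy)
print(quartile_means)
```

A monotonic increase of the per-quartile mean, as in this toy case, is the pattern one would expect if a mark tracks with expression level.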

      Minor comments

      1. All the images appear to be very low resolution. This could be due to the online submission system.

      Response: We apologize for this issue and believe it is due to the submission system.

      For Fig 2, the caption says "...treated with different SCFAs for 24 hours," but it is unclear precisely what the treatment was. Were the cells treated with the SCFA mix, and then ChIP-seq was performed for the 5 different marks tested? Or were there different SCFA treatments performed for each mark that was ChIPed?

      Response: We have revised the text of the figure legend to make it clear that we treated cells with individual SCFAs (propionate for propionylation marks and butyrate for butyrylation marks).

      Line 99-100: "Treatment with butyrate, propionate, or a mixture of all three SCFAs resulted in a global increase in histone butyrylation or propionylation" is misleading. The authors test only specific sites on Histone H3 using site-specific antibodies and do not test whether these treatments increase global levels of acylation on other histones and sites using pan-acyl antibodies. So, this sentence needs to be rephrased to clearly indicate that the treatments only increased at the tested sites.

      Response: Thank you for this comment. We understand this was misleading and that was not our intention at all. By writing "global levels," we simply meant levels of immunoblotting signal at these specific lysine residues. We have therefore revised the text to make it clearer (now in lines 102-104): "Treatment with butyrate, propionate, or a mixture of all three SCFAs resulted in significant increases of histone butyrylation and propionylation at select residues of histone H3, as assayed by immunoblotting".

      Reviewer #3 (Significance (Required)):

      Strengths and limitations: The experiments in the study were performed with a high degree of rigor, including appropriate controls. The discussion of the -seq data in Figs 2-4 avoided focusing on or following up on specific genes, which limited the conclusions from these data to being very broad. A key paper (that was not recent) was missing from the context presented in the paper, weakening the discussion of the data presented.

      Advance: The advance is pretty conceptually incremental. Similar experiments as in Fig 1-3 in similar models have been performed in other papers already (e.g., PMID 39789354 in 2025 and PMID 34677127 in 2021), although Fig 4 was an interesting experiment that helps differentiate the work from existing literature.

      Audience: This work would be interesting to a chromatin audience as well as a microbiome audience, but the scope of the conclusions from this paper, and its redundancy with other literature, will limit its profile.

      My expertise is in histone PTM biochemistry and biology, including non-canonical histone acyl PTMs.

      Response: We really appreciate the thoughtful and constructive comments and the recognition that this story may be of interest to the chromatin and microbiome audiences. In addition, we acknowledge other similar recent work that is also very interesting, but we also feel that our manuscript is distinct in several important ways from these studies. In particular, it includes an analysis of gene expression changes that we propose to depend on histone butyrylation rather than HDAC inhibition (Figure 5, previously Figure 4) and the finding that butyrate drives the gene expression changes of the SCFA combination (Figure 3). We are very grateful for the recognition of these interesting findings by this reviewer. Furthermore, we also want to highlight that we have expanded our analysis of human tissues (Supplemental Figure 1), which adds additional novelty to this work.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #3

      Evidence, reproducibility and clarity

      Summary: The authors explore the effects of short-chain fatty acids (SCFAs) acetate, propionate, and butyrate on chromatin and gene expression in human colon cancer cells. The authors first characterize the presence of histone propionylation and histone butyrylation in different colon cancer cell lines as a function of SCFA treatments. Then, they perform ChIP-seq to determine the genomic localization of these marks and observe that these marks are deposited on euchromatic regions similar to H3K4Me3 and to one another, consistent with previous reports. The authors then performed gene expression analysis to determine the contribution of the SCFAs. Interestingly, they observe that butyrate treatment alone mimicked the gene expression profile of an equimolar mixture of short-chain fatty acids treatment, at least in the tested cell lines. Finally, the authors designed an experiment to try to separate the functions of butyrate on gene expression that are dependent on p300/CBP and are independent of the HDAC inhibition property. The following aspects of the paper need addressing:

      Major comments

      1. There is no confirmation of the validity of the results seen from ChIP-seq (Figure 2) and RNA-seq (Figure 3). The majority of the findings of the paper are derived from ChIP-seq and RNA-seq data, and hence, experiments validating such results need to be established. ChIP-qPCR for representative gene(s) with adequate controls needs to be performed for different acyl marks (H3K27bu, H3K27pr, H3K4Me3, H3K9pr, H3K9bu) to support the ChIP-seq results, and RT-qPCR for representative gene(s) for different treatment conditions (vehicle, acetate, propionate, butyrate, and 5 mM 1:1:1 mixture) for validating RNA-seq results.
      2. The authors describe an interesting strategy to differentiate the different functions of butyrate (Figure 4). The authors propose that differential genes that change with p300/CBP inhibitor treatment, that are separate from HDAC inhibitor treatment, are potential genes that are a function of histone butyrylation. An important control that is missing in this experiment is treatment of cells with propionate. In their previous findings (Figure 1C-D), they note that both propionate and butyrate treatments elevate the levels of histone acetylation, propionylation, and butyrylation. But the HDAC inhibitory activity of propionate is not very well established, and performing experiments to prove it is beyond the scope of this paper. Importantly, p300/CBP has been shown to catalyze histone propionylation with higher efficiency compared to histone butyrylation (PMID: 27820805, PMID: 29070843). Therefore, it would be ideal to include differentially expressed genes from propionate-treated cells in the analysis to rule out any discrepancy.
      3. Along the same lines as comment #2, other possible "functions" of propionate and/or butyrate that could explain why treatment with them increases histone acetylation, propionylation, and butyrylation are not discussed. This work was not cited/discussed: PMID 34677127 despite being very closely related and relevant. Indeed, there seems to be some redundancy of efforts between that paper (2021) and this one even in terms of the specific experiments performed.
      4. An analysis for correlations between the ChIP-seq data for H3K27bu (Fig 2) and RNA-seq data following butyrate treatment (Fig 3) would provide further insights into whether the genes/pathways that are enriched/downregulated in H3K27bu ChIP-seq data correlate with genes/pathways that are upregulated/downregulated in RNA-seq data.

      Minor comments

      1. All the images appear to be very low resolution. This could be due to the online submission system.
      2. For Fig 2, the caption says "...treated with different SCFAs for 24 hours," but it is unclear precisely what the treatment was. Were the cells treated with the SCFA mix, and then ChIP-seq was performed for the 5 different marks tested? Or were there different SCFA treatments performed for each mark that was ChIPed?
      3. Line 99-100: "Treatment with butyrate, propionate, or a mixture of all three SCFAs resulted in a global increase in histone butyrylation or propionylation" is misleading. The authors test only specific sites on Histone H3 using site-specific antibodies and do not test whether these treatments increase global levels of acylation on other histones and sites using pan-acyl antibodies. So, this sentence needs to be rephrased to clearly indicate that the treatments only increased at the tested sites.

      Significance

      Strengths and limitations: The experiments in the study were performed with a high degree of rigor, including appropriate controls. The discussion of the -seq data in Figs 2-4 avoided focusing on or following up on specific genes, which limited the conclusions from these data to being very broad. A key paper (that was not recent) was missing from the context presented in the paper, weakening the discussion of the data presented.

      Advance: The advance is pretty conceptually incremental. Similar experiments as in Fig 1-3 in similar models have been performed in other papers already (e.g., PMID 39789354 in 2025 and PMID 34677127 in 2021), although Fig 4 was an interesting experiment that helps differentiate the work from existing literature.

      Audience: This work would be interesting to a chromatin audience as well as a microbiome audience, but the scope of the conclusions from this paper, and its redundancy with other literature, will limit its profile.

      My expertise is in histone PTM biochemistry and biology, including non-canonical histone acyl PTMs.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #2

      Evidence, reproducibility and clarity

      This study presents a novel finding that short-chain fatty acids (SCFAs) produced by microbial metabolism regulate gene transcription in human colon cancer cells by modulating histone H3K9 and H3K27 butyrylation and propionylation, both of which are associated with an open chromatin state. The authors further reveal that the major effect of the SCFA mixture is driven by butyrate and identify p300/CBP-dependent, rather than HDAC inhibition-dependent, gene regulation by butyrate. Overall, this is a well-organized study that provides valuable insight into the role of metabolites in human cells.

      Major comments:

      1. In Figures 1C and 1D, why did the SCFA mixture not increase histone butyrylation or propionylation to the same level as single butyrate treatment?
      2. In Figure 3B, how does butyrate block the effects of acetate and propionate on transcription?
      3. Which pathways are associated with acetate- and propionate-specific DEGs?
      4. Which genes are related to growth inhibition in butyrate-treated cells? Does the 1:1:1 SCFA mixture have a similar impact on cell growth as single butyrate treatment?

      Significance

      General assessment: This study clearly demonstrates the role of butyrate in gene regulation and elucidates its underlying regulatory mechanisms. However, it does not provide insight into how butyrate counteracts the effects of acetate and propionate, despite these metabolites often being detected together. In addition, it remains unclear which specific histone PTMs are associated with the distinct gene expression changes induced by different short-chain fatty acids. Lastly, the observation that histone butyrylation and propionylation correlate with active transcription is not novel.

      Advance: This study advances understanding of short-chain fatty acids in chromatin and gene regulation, highlighting butyrate's dominant role and its p300/CBP-dependent rather than HDAC inhibition-dependent mechanism.

      Audience: This work may attract significant interest in both the epigenetics and metabolism fields.

      My expertise: histone acetylation, HATs, transcriptional regulation, cancer

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, Kabir et al. explore the impact of microbiota-derived short-chain fatty acids (SCFAs) on chromatin structure and gene expression in human cells. They show that SCFAs, particularly butyrate, contribute to specific histone modifications such as butyrylation at H3K27, detectable in human colon tissue. Additional modifications like acetylation, butyrylation, and propionylation at H3K9 and H3K27 respond to SCFA levels and are enriched at active regulatory regions in colorectal cancer cells. Treatment with individual or combined SCFAs mimicking gut conditions alters gene expression patterns, with butyrate playing a dominant regulatory role. Butyrate's effects on gene expression are claimed to be independent of HDAC inhibition and instead rely on the p300/CBP complex through histone butyrylation. These findings underscore SCFAs as crucial modulators of epigenetic regulation in the human colon and highlight butyrate's dominant role in shaping chromatin and gene regulation beyond its known metabolic functions.

      The authors used two human cell lines and an in vivo murine model paired with RNA and ChIP sequencing approaches to identify target genes and chromatin modifications in response to SCFAs. While the findings are interesting and could provide important insights into the epigenetic influence of SCFAs in human cells, the study would benefit from additional experiments to strengthen the conclusions. Comments and suggestions are listed below:

      1. Figure 1: The H3K27bu expression in human biopsies highlights the clinical significance of the current study. However, the authors need to provide more information on the human colon samples, e.g., how many patients were analyzed in total, and their age and/or sex. Only the methods mention the use of benign TMA; this should also be clarified in the figure legends. It would also be helpful to show histone butyrylation levels in normal vs. cancer human tissues.
      2. Figure 1: In addition, given that the butyrate level descends towards the base of the colonic crypt (with the highest at the top of the crypt where mature intestinal epithelial cells reside) (Kaiko et al., 2016), it is important to show how the H3K27bu signature is distributed along the crypt. This data would further emphasize the clinical relevance of this study, given that most colorectal cancers (CRCs) arise from stem and progenitor cells.
      3. Throughout the manuscript: The rationale for selecting the two CRC cell lines (HCT 116 and Caco2) should be explained. While commonly used, providing background on their genetic differences (e.g., driver mutations) is important, as this could greatly influence the PTM landscape.
      4. The study lacks additional controls, such as a normal colon epithelial cell line and a non-colonic cell type. Including these would help determine whether the observed butyrate effects are tissue- or disease-specific. This data would also help assess whether SCFA effects, and specifically butyrate's effects, on histone acylation and gene expression are systemic or local.
      5. Figure 2: The authors show ChIP-seq results in the HCT 116 cell line. To exclude the possibility that the demonstrated chromatin signatures are cell line-specific, results from Caco2 should also be shown. In addition, the 2D environment and multiple passaging alter gene expression in cell lines; using human colonic organoids would provide a more clinically and physiologically relevant model.
      6. Figure 4 is very confusing. Entinostat itself, as an HDAC inhibitor (iHDAC), increases butyrylation. The data shown are insufficient to draw conclusions. First, the authors should use additional iHDACs, and second, they should illustrate the overlap in gene expression changes between all treatments using a Venn diagram to clarify which genes/signatures are specific to each treatment.
      7. Figure 4: The authors use an HDAC inhibitor to rule out butyrate's effect on gene expression via HDAC inhibition. However, butyrate can also modulate gene expression through activation of GPR109a. Using GPR109a antagonists is necessary to address this possibility. These data are essential to validate the specific role of histone butyrylation in gene regulation.
      8. Supplementary Figure 4 and manuscript: There is no in vivo methods section describing the tributyrin-gavaged mice. The authors should clarify how the experiment was performed, how cells were isolated, whether sorting was performed, and which markers were used.
      9. Supplementary Figure 4: The GO analysis results show that lipid catabolism is among the top differentially enriched pathways. Butyrate is a known PPARγ agonist (Litvak et al., 2018), and activation of PPARγ is known to drive expression of genes involved in lipid metabolism. The authors need to rule out this function of butyrate before attributing this signature solely to histone butyrylation.
      10. It would be helpful to include a table of differentially abundant genes as a supplement to the heatmaps and GO analysis.

      Significance

      This study explores how microbiota-derived SCFAs, particularly butyrate, influence histone acylation and gene regulation. While the topic is relevant, the work lacks important controls (e.g., normal epithelial and non-colonic cells) and omits mechanistic validation (e.g., GPR109a signaling, PPARγ involvement). The rationale for cell line selection is unclear, and in vivo methods are insufficiently described.

      Audience:

      The study will mainly interest specialists in microbiota-chromatin interaction. Broader impact is limited by the narrow model scope and underdeveloped mechanistic insight.

      My Expertise:

      Cancer biology, in vivo models, microbiota-host interactions.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In this report, Segal and coworkers describe the identification of the CD99 receptor as a marker whose expression discriminates between two populations of an EWS tumor cell model, with distinct morphologies, behavior and xenograft tumor progression. As a top differentially expressed gene, they identify caveolin-1, a pervasive but highly contextual modulator of tumor cell behavior, as a potential driver of PI3K/AKT-mediated thriving and survival.

      Major conceptual comments:

      The study uses state-of-the-art technology and the observations are potentially relevant to the field. However, the description of the mechanistic link between CAV1 and CD99 is not clear. The system they developed to stably manipulate CAV1 levels would be ideal to test the relevance of their claims in vivo. Certain additional experiments might clarify the dependency on caveolae or CAV1 (cholesterol/metabolic intervention). I am also intrigued by the procedure they used to "evolve" CD99hi and CD99lo populations: mechanical stimuli can modulate caveolae assembly, and this should be addressed in at least one independent cell model, if not by assessing whether primary tumors exhibit these diverging populations.

      Major technical comments:

      • The completeness of the "mechanical passaging" should be somehow demonstrated. Does this procedure progressively affect CD99 and CAV1 expression?

      • To my understanding, knockdown assays are performed using only one shRNA sequence. This should be validated with at least one independent shRNA intervention, or a similar approach such as CRISPR gene ablation.

      • The "aspect ratio" morphology score should be better explained and biologically contextualized (and if possible, correlated with Akt reporter signal) in the main figures

      • Drug resistance assays should be better detailed, first in vitro and then in vivo

      Significance

      The study aims to identify contextual drivers of EWS tumor cells, and identifies two potential molecules as associated with a proliferative/survival phenotype. As such, the aim of the study is important. However, the differential phenotypes investigated arise from a non-standard form of cell passaging in a single established cell line model, and the link between the findings and actual human EWS disease is not clearly explored. The proposed model of interdependence among the potential drivers found is also unclear at the mechanistic level, limiting the potential of these findings to identify specific biomarkers and/or interventions. For publication, at the very least:

      (1) The mechanistic ties among the different drivers explored should be better characterized (also at technical level, see above) and

      (2) The relevance of CAV1 KD in vivo should be shown. I strongly encourage the exploration of public data on EWS for the relevance of their findings, even if preliminary (e.g., expression levels associated with increased or decreased patient survival).

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      The submitted article identifies a distinct subpopulation of Ewing Sarcoma (EwS) cells characterized by high CD99 and elevated Caveolin-1 expression; shows that Caveolin-1 in these cells orchestrates PI3K/AKT signaling by specifically modulating the spatial organization of PI3K activity on the plasma membrane; demonstrates that the CD99-High and CD99-Low states are reversible, providing a flexible mechanism for survival-oriented plasticity in response to chemotherapy; and proposes that, unlike CD99-Low cells, CD99-High cells use a Caveolin-1-driven signaling architecture to survive.

      Major comments:

      • The conclusions are well substantiated by the data and no additional experiments are needed to support the claims of the paper. Data and methods are presented in such a way that they can be reproduced. Experimental replicates and statistical analysis appear adequate.

      Minor comments:

      While the bibliography is appropriate, the manuscript should clarify its novelty from the outset. Specifically, the Introduction must acknowledge that the role of increased Akt activity in Cav-1-mediated cell survival is already established (Li et al., 2003). This study, currently listed as reference 42 in the Discussion, should be moved to the Introduction and discussed alongside references 31 and 32 to properly frame the study's context.

      Regarding the role of caveolin-1 facilitating mechanosensitive Akt signaling, Sedding et al. (doi: 10.1161/01.RES.0000160610.61306.0f.) should be cited.

      Likewise, Results: "Knockdown of Cav-1 in TC71 cells drastically reduced phosphorylated Akt levels in CD99Hi cells (Figure 5A), suggesting a dependence on Caveolin-1 for their Akt signaling". Rather than suggesting, these results are consistent with published data (Yang, H. et al., 2016; reference 32).

      The nature of the CD99high state remains ill-defined. While the authors identify CD99 as a standard biomarker for Ewing Sarcoma (EwS) (Ref 38), they simultaneously suggest that this specific CD99high subpopulation may have been previously overlooked due to its sensitivity to conventional enzymatic dissociation. To avoid confusion, the authors should explicitly clarify the distinction between baseline CD99 expression and this highly sensitive, high-expression state.

      Figure 4C. In addition to caveolae, CD99Hi, CD99Lo, and CD99Hi+Cav-1KD cells also present conspicuous differences in ER and ribosome content. These should be taken into account and the possibility be considered that the effect of Cav-1 may actually be caveolin-independent.

      Significance

      This manuscript adds further support to the role of caveolin-1 facilitating mechanosensitive Akt signaling, and establishes a novel link to CD99 expression in Ewing sarcoma.

      These findings will be of interest for researchers across the fields of mechanobiology and preclinical oncology, particularly those investigating caveolar scaffolds and PI3K/Akt-driven malignancies.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This manuscript identifies a subpopulation of Ewing sarcoma TC71 cells that, following mechanical passaging, expresses high levels of the CD99 marker. This CD99-hi TC71 population was shown by multiple in vitro and in vivo assays to be more aggressive than the CD99-lo population (Figures 1-3). The authors then show that Caveolin-1 is upregulated at the transcriptional level in CD99-hi cells and that these cells have caveolae. They further show that Cav1 knockdown in the CD99-hi cells reduced pAkt signaling, and they propose that this affects survival of this cell population.

      The strength of the paper is the very exhaustive in vitro and in vivo phenotypical analysis of the CD99-hi population (Figures 1-3).

      There are however multiple important weaknesses and omissions in this paper:

      1. The study is based on a single cell line, TC71, and conclusions are extended to Ewing sarcoma in general. Attributing conclusions generally to Ewing sarcoma must necessarily be based on analysis of multiple Ewing sarcoma cell lines and preferably supported by patient tumor data.

      2. The statement that caveolin-1 is a molecular signature of the CD99-hi state is not supported by the Western blot in Supp fig 4E, that shows that CD99-hi cells have lower Cav1 levels than CD99-lo cells, even though the CD99-hi have caveolae. This must be explained mechanistically and functionally and it cannot be argued that caveolin-1 is a "molecular signature of the CD99-hi state" if caveolin-1 expression levels are reduced in the CD99-hi population.

      3. The mechanistic role of caveolin-1 selectively in the CD99-hi population needs to be better established. Data supporting a role for Cav1 in survival are weak (Fig 4D) and not supported by the data presented in Supp fig 4G, where Cav1 KD shows no effect on survival. A selective role for caveolin-1 in the CD99-hi cells must be demonstrated by parallel analysis of CD99-lo cells. Similarly, the effects of caveolin-1 knockdown on AKT signaling are shown only for CD99-hi cells (Fig 5AB) and must also be shown for CD99-lo cells.

      4. There are no data linking AKT signaling to the CD99-hi phenotype elaborately detailed in Figs 1-3. If, as the authors claim, "the CD99High state establishes a Caveolin-1-driven signaling architecture that supports tumor cell survival through mechanisms distinct from those used by CD99Low cells", then: 1. caveolin-1-dependent Akt signaling must be shown to be specific to CD99-hi and not CD99-lo cells; and 2. Akt signaling must be shown to selectively regulate survival of CD99-hi cells in a caveolin-1-dependent manner, based on the in vivo assays developed in figures 2 and 3.

      Significance

      The strength of the paper is the very exhaustive in vitro and in vivo phenotypical analysis of the CD99-hi population. There are, however, multiple important weaknesses and omissions in this paper which, when addressed, could considerably improve the significance of the manuscript's findings for the community.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the Reviewers for their comments on our manuscript “Structural insights into mitotic-centrosome assembly”. As described below, we have substantially revised the manuscript in response to their comments and hope you will consider the revised manuscript “Phosphorylation relieves autoinhibition to drive Cnn centrosome scaffold assembly” at The EMBO Journal. Our specific responses (black text) to the Reviewers’ comments (blue text) are detailed below.

      Reviewer #1

      Main Points:

      1) From previous studies, it seems to me that for the residues potentially relevant for the hairpin regulation there is direct evidence of phosphorylation only for S567 (mass spec, phospho-antibody). Have the authors tested single site mutants (S567A and E)? Also, have they tested D mutations? If so, this should be commented on and shown. If not, it should be tested, in particular since the 2E phospho-mimetic is not functioning properly in vivo. If S571 is indeed crucial, it should be demonstrated that it is also phosphorylated. Otherwise it is possible that the mutation of this residue simply impairs important interactions (e.g. PReM-CM2, others), independent of phosphorylation.

      As requested, we have now tested individual S567A and S571A mutations and found that they both perturb Cnn scaffold assembly, but to a lesser extent than the 2A double mutant (New Fig.S3A). We also now confirm by MS that recombinant Polo can phosphorylate both S567 and S571 in vitro, and we have examined the behaviour of a 2D mutant and find that it behaves very similarly to the 2E mutant (New Fig.S3B).

      2) It is unclear why in vitro only A mutations have been tested and not phospho-mimetics. This should be tested for the interaction between PReM and CM2. This would allow to probe the model that phosphorylation opens the hairpin to allow interaction. Currently, such proof is missing in the study. Alternatively, the authors could phosphorylate the recombinant protein in vitro. The in vivo data is harder to interpret due to the complexity of the model and the authors should take advantage of the in vitro system.

      As requested, we now show in New Fig.S5 that whereas in vitro WT Cnn490-608 and Cnn-2A490-608 behave as dimers, Cnn-2E490-608 elutes in two major fractions—a tetramer species and a much larger species that elutes in the void volume (meaning that 2E can form very large species even in the absence of CM2) (Figure S5A). In the presence of CM2, Cnn-2E490-608 forms a tetramer (that eluted slightly later than the Cnn-2E490-608 tetramer) and larger complexes that contained CM2 and eluted in the void volume with a profile similar to Cnn-2E490-608 on its own (Figure S5B). These results are consistent with the possibility that the 2E substitutions open the helical hairpin to allow self-interactions that drive homo-tetramer and larger complex assembly in vitro.

      3) Regarding the worm PReM and CM2 domains, the authors mention that they have tested in vitro phosphorylation by PLK-1, but I could not find any data showing this. They should demonstrate successful phosphorylation or test candidate site by phospho-mimetic mutation. It is possible that the worm proteins depend more strongly on phosphorylation to relieve autoinhibition compared to the fly proteins.

      This is a good point, and we apologise for this omission. We now state that we confirmed by MS analysis that the recombinant worm PLK-1 we used in these in vitro experiments phosphorylates the putative SPD-5 PReM domain on the three sites (S627, S653 and S658) known to be important for promoting SPD-5 scaffold assembly in vivo (Figure Legend, Figure 6). Thus, the lack of detectable binding between these proteins is not due to the lack of phosphorylation.

      Minor Point:

      4) Fig. 6C, D: the labeling of the chimeric constructs using "+" symbols is confusing, since it suggests that separate proteins were expressed. If I understand this correctly, with the current labeling, deltaCM2+DmCM2 means WT? The authors should write the full name of the wildtype or chimeric construct in each case and use a more standard/less confusing nomenclature. Also, I suggest starting the panels and graphs with the WT sample.

      We thank the Reviewer for this suggestion and have re-labelled this Figure to clarify this point. We understand the point about putting the WT panels first in Figure 6C,D (now Figure 5C,D) but think that this is not the correct comparison to emphasise. We are testing the ability of the various CM2 domains to “rescue” the lack of a CM2 domain, so we feel Drosophila Cnn lacking CM2 is the correct baseline for this comparison.

      Reviewer #2

      Main Comments:

      1. The title is too vague. Any number of existing papers could be said to provide "structural insights into mitotic centrosome assembly". The authors need to narrow down to a defined conclusion and state this as the title.
      2. I think the strongest and most novel aspects of this study relate to the mechanism of Cnn assembly via relief of the auto-inhibited PReM. The effort to elucidate assembly mechanisms of SPD-5 and CDK5RAP2 are comparatively light and there are no accompanying experiments in worms or human cells. Without the in vivo experiments, it's hard to know if the in vitro experiments are valid. It's speculative for the authors to say they found the true PReM for CDK5RAP2; they do not demonstrate that PLK-1 phosphorylation potentiates assembly in Figure 8. Thus, I suggest re-writing the paper to focus on Cnn. Experiments in Figure 6 are still valid if reframed. For example, substituting Cnn's CM2 with the CM2 from CDK5RAP2 vs. the C-term of SPD-5 illustrates that a simple coiled-coil with open ends (H.s.CM2) is sufficient to interact with PReM whereas a coiled-coil with a closed end (SPD-5 C-term, predicted by Figure 6A) cannot.

      We thank the Reviewer for these helpful comments and have re-written and re-organised the manuscript in accord with these suggestions—most importantly providing a more specific title and re-ordering the data to better focus the paper on the relief of Cnn autoinhibition.

      The purpose of Figure 1 is unclear. None of the other figures examine SPD-5 and CNN in the condensate form, which required using 4% PEG in this paper. The other assays look at the network form, which could behave differently and have different dependence on specific domains. I think they should perform the condensate assay for all other figures, otherwise leave it out. Furthermore, CDK5RAP2 is mentioned, yet not examined in Figure 1. It must be noted that CDK5RAP2 will also condense into droplets under crowding conditions or with a synthetic nucleator (Rios et al., 2025 J Cell Sci). Thus, it seems that condensation potential is a universal feature of known PCM scaffold proteins.

      The original Figure 1 has been moved to end of the paper (now Figure 8) and we now more thoroughly explain the logic of these experiments. Briefly, given that the PReM and CM2 domains in flies and worms seem to function in different ways in vivo, we sought here to test whether this was also the case in vitro—where the behaviour of full-length SPD-5 and of these domains of Cnn have been extensively studied, but never directly compared. We believe such a direct comparison will be of some interest to the field (the Woodruff et al., 2017 paper describing these in vitro SPD-5 condensates has been cited >700 times). We now also cite the Rios et al., 2025 paper but note that, despite extensive efforts, we were unable to purify enough well-behaved CDK5RAP2 for our experiments and so could not include it in this analysis. We think Rios et al., used an MBP-fusion of CDK5RAP2 in their experiments, which may explain this difference.

      The study uses different species without doing the same types of experiments on each. Sometimes human CDK5RAP2 is thrown in, sometimes not. They solve crystal structures of PReM from Cnn but not from the other proteins. This gets confusing, especially since the authors state that they seek to test if fly Cnn and worm SPD-5 assemble through different mechanisms (see last sentence of the intro). Also, if the focus is on worm vs. fly PCM assembly mechanisms, why include the human protein, especially Figure 8?

      On re-reading our original manuscript we appreciate this confusion. We hope that in re-writing the manuscript along the lines suggested by the Reviewer the logical flow of our experiments will be clearer.

      The conclusion that SPD-5's narrow PReM and "CM2" domains don't interact is consistent with the cross-linking mass spectrometry data from Rios et al. 2024. They showed only one X-link with low occurrence (1 out of 6 samples) between these two regions, even in the phosphorylated state (Fig. 1G). However, Nakajo et al (2022) claimed the opposite, showing that a larger PReM-containing construct (a.a. 272-732) interacts with a C-terminal construct (a.a. 1061-1198) after PLK-1 phosphorylation. Can the authors comment on this? Perhaps there is another site in SPD-5, outside of a.a. 541-677, that acts like the Cnn PReM?

      These are good points and we now mention this last possibility in the Discussion. We also now mention the supporting cross-linking Mass Spec data from Rios et al., 2024.

      I have serious doubts that the C-terminus of SPD-5 has a CM2 domain. To me, there is no real sequence homology with the traditional CM2's from humans and flies, and the AF3 predictions support this. Ohta et al. (2021) called this region "CM2-like" based on very poor homology, which is a questionable practice. Any coiled-coil region will appear somewhat homologous due to the heptad repeat pattern that defines them (e.g., leucines line up quite nicely). Thus, is it fair to say that SPD-5 doesn't assemble through a PReM-CM2 interaction? There may be a different region in SPD-5 that looks more like the canonical CM2. I think the authors have compelling evidence to give the C-terminal coiled-coil region in SPD-5 its own name rather than calling it CM2.

      This is a fair point, although the literature is already quite confusing on the nomenclature for the C-terminal region of SPD-5 (e.g., Ohta et al., JCB, 2021; Nakajo et al., JCS, 2022), so we are reluctant to add another name to the mix. Given that we draw comparisons with the fly and human CM2 domains (that are clearly related by sequence), we think it is easiest for readers if we use the “CM2” nomenclature throughout, although making clear our conclusion that SPD-5 “CM2” does not appear to function in the same way as fly/human CM2.

      Figure 3E. Would measuring scaffold mass be more appropriate? The PReM(deltaH1,NTH2) leads to more compact scaffolds, but maybe they assemble just as well as the deltaH1 mutant. As it stands, there is a discrepancy between panel E and F in terms of what is measured (area vs. intensity) and the outcome.

      In several previous papers we use fluorescence intensity to measure the “amount” of protein at centrosomes in vivo but, in our original paper (Feng et al., Cell, 2017), we quantified PReM::CM2 scaffold assembly in vitro by measuring the area of scaffold assembly. Thus, we prefer to present the current data in this way for consistency across publications, and we believe either measure is valid. We could measure the area and intensity of the PReM∆H1 and PReM∆H1∆NTH2 scaffolds to compare scaffold density, but we think this would unnecessarily complicate this data. The main point is not how much or how dense each scaffold is, but rather that the PReM∆H1∆NTH2 protein doesn’t really make a scaffold at all—but rather makes smaller “blobs” that tend to bunch together (further characterised in Fig.S2).

      Minor Comments:

      1. In one version of the PDF there are images missing in Fig 1F, 4C, 4D. I opened another version (source version) and the images were there. Just FYI.
      2. Figure 4A. The blue coloration makes it difficult to read the black letters.
      3. Figure 4A. Why is part of the protein colored in green? This coloration isn't defined, nor does it show up again in panel B.
      4. The layout of Figure 4 is confusing. It took me a few minutes to realize that the big red box inset belonged to panel B and not panel A.
      5. Figure 4C,D. The sample size is not mentioned in the legend.
      6. The title for Figure 4 seems too speculative. How can the authors say that phosphorylation relieves the autoinhibition without structural data?
      7. Figure 5B. The sample size is not mentioned in the legend.
      8. Figure 6B,D. The sample size is not mentioned in the legend.
      9. The text in Figure 7B is hard to read because it is too small. Please make this bigger.
      10. Figure 8C. What is colored in magenta? Is there an additional labeled protein besides mNG-CM2?
      11. Figure 8C. What is the sample size? How many images were taken? Also, why are there data points off to the right of the last column?
      12. The wording of these sections needs improving. I found them complicated and difficult to understand.

      We thank the Reviewer for taking the time to make these helpful comments. We have addressed all these points in the revised manuscript. On point 10, the magenta objects were fiducial beads that were inadvertently included on this panel (and are no longer shown).

      Reviewer #3

      Major Comments: 1. The title, "Structural Insights into Mitotic-Centrosome Assembly," is overly broad. The study primarily focuses on CM2-PReM intramolecular interactions in D. melanogaster Cnn and does not comprehensively address mitotic centrosome assembly across species. A more specific title reflecting the fly-centric and structural focus would better align with the manuscript's scope and conclusions.

      As described at the start of our response to Reviewer #2, the title and focus of the manuscript have been extensively revised along these lines.

      The authors analyze condensate formation by Cnn and SPD-5 but overlook condensate formation by CDK5RAP2, which was recently reported by Rios et al. (2025, PMID: 40454523). Including CDK5RAP2 would enable a more balanced and informative comparison across fly, worm, and human homologs.

      As described in point 3 of our response to Reviewer #2, we now cite Rios et al., 2025 but note that, despite extensive efforts, we were unable to purify enough well-behaved CDK5RAP2 for our experiments and so could not include it in this analysis. We believe Rios et al., used a full-length MBP-fusion of CDK5RAP2 in their experiments, which may explain this difference as MBP is very good at keeping proteins soluble (but would not be appropriate in our experiments where we compare full-length untagged proteins).

      In Figure 3, reconstitution of Cnn scaffolds using purified CM2 and PReM fragments yields "macromolecular scaffolds," but their physical properties are not defined. It remains unclear whether these assemblies are ordered or amorphous, and whether they exhibit solid- or gel-like behavior. Moreover, the heterogeneous, scattering particles observed by negative-stain EM (Figure S3B), likely corresponding to the Cnn490-608-CM2 complex, raise the possibility of nonspecific aggregation rather than organized scaffold formation. Appropriate controls lacking CM2 are needed to exclude spontaneous aggregation of PReM fragments. In addition, testing shorter truncations of the PReM H2 helix could help define the minimal requirements for scaffold assembly. Finally, the rationale for including the CnnΔExPReM construct only in vivo (Figure 3F), but not in the in vitro assays (Figure 3A-E), should be clarified.

      We apologise, as our presentation of this data has clearly led to some confusion on these points.

      First, as we now clarify, the amorphous solid-like physical properties of the PReM::CM2 scaffolds were described in our previous paper where we also showed that these scaffolds are not simply non-specific aggregates—as several single point mutations that disrupt the LZ::CM2 tetramer also prevent PReM::CM2 scaffold assembly in vitro as well as Cnn scaffold assembly in vivo (see Fig.5, Feng et al., Cell, 2017). Also, in all in vitro scaffolding experiments we always perform a negative control (-CM2) to confirm that none of the scaffolds are aggregates of the PReM domain being tested. We don’t usually show this control now as there would be lots of empty black boxes on the Figures. We do, however, show this control for the human putative PReM domain (Figure 7C), as we are testing this here for the first time.

      Second, the request to test shorter truncations of the PReM H2 helix to define the minimal requirements for scaffold assembly is unnecessary as PReM∆H1∆NTH2 already cuts H2 at the start of the LZ, and we previously showed the LZ is required for PReM::CM2 scaffold assembly in vitro (Feng et al., Cell, 2017). Thus, any further truncation of H2 will start to remove the LZ, which we already know is essential. We have now made this point more clearly.

      Finally, the Cnn∆ExPReM construct the Reviewer mentions was tested in both the in vitro (now Figure 2B) and in vivo (now Figure 2F) assays, but the labelling was confusing so this was not clear. We have now clarified this point.

      The coarse-grained (CG) simulation methodology is insufficiently described. Given that CG approaches sacrifice atomic detail and may oversimplify interactions, readers require more information to evaluate the model's reliability and limitations. A comparison with the framework used by Ramirez et al. (2024, PMID: 38356260) would be informative. It is also unclear why available crystal structures of WT and 2A Cnn (Figure 2C; Figure S4) were not used as simulation inputs, or why the structure of Cnn490-579 2E was not determined to complete the structural comparison. Furthermore, mutation of Ser567 and Ser571 to alanine markedly stabilizes the PReM domain (Figure 5C, D), implying that these residues maintain domain flexibility. Back-mapping CG models to atomic resolution could reveal the interactions altered by these mutations. The exclusive focus on double mutants (2A and 2E) is also limiting; analysis of single-point mutants at S567 or S571 would clarify whether both residues contribute equally or play distinct roles.

      We performed coarse-grained simulations because, although they simplify atomic interactions, they capture overall conformational dynamics, which is what we are trying to assess here (Fig.4C,D). We now clarify this point and provide more detail of our simulation methodology in the main text and Materials and Methods. We used the full helical hairpin (i.e., H2+H3+H4) prediction in these simulations—rather than the crystal structure of the partial helical hairpin (i.e., H2+most of H3)—as we reasoned that the presence of the full H3 and H4 might influence breathing, and the full helical hairpin (see Video S1) seems likely to be the relevant biological fold. As we now show (new Figure S5), and as discussed above, the 2E mutants do not behave well in vitro so we were unable to solve their structure. We agree that we could perform atomic resolution simulations to better understand how the 2A/E and single A/E mutations might suppress/enhance breathing, but we believe such an analysis is beyond the scope of the current manuscript and would distract from our main conclusions.

      The discussion lacks sufficient integration with prior studies and often presents conclusions without adequate citation. For example, the claim that flies and humans rely on related PReM-CM2 interactions whereas worms use distinct phosphorylation-regulated mechanisms is not supported by appropriate references. In addition, limited cross-referencing to the manuscript's own data weakens the connection between results and conclusions. Expanding and better grounding the discussion in existing literature would significantly enhance its depth and clarity.

      We thank the Reviewer for this general point and have tried to better integrate our results with prior studies—particularly in the Discussion section.

      Minor Comments: 1. In Figure 1B, the molecular weight units for the protein marker are missing and should be included. Fixed.

      In Figures 1E and 1F, readability would be improved by including x-axis labels on all graphs, rather than only on the bottom panels. Fixed. The protein structures shown in Figures 2C and 2D should be explicitly labeled as dimers to avoid confusion. Fixed. In Figures 3A-D, using fluorescently labeled CM2 would help validate both the interaction with the PReM domain and its localization within the scaffold. We have previously tried fluorescently tagging the CM2 domain, but scaffold formation is much less robust. We do not think this invalidates this assay, as the evidence supporting the PReM::CM2 interaction is very strong, including assessing the physiological influence of multiple point mutations in both domains in residues at the heart of the interaction interface identified by crystallography (e.g., see Fig.4, Feng et al., Cell, 2017).

      In Figure 3E, no statistical comparisons are presented between the original PReM construct and other samples. In addition, information regarding sample size and the number of experimental replicates is missing from the figure legend. Fixed. In Figure 3F, the absence of a pixel intensity scale bar makes the data difficult to interpret, as color values corresponding to high and low signal intensities are unclear. Moreover, no additional centrosome marker is included, nor is there evidence that PReM fragment expression levels are comparable across samples. These concerns also apply to Figures 4C and 4D. We now include pixel intensity scales in all relevant Figures. We think we do not need to show additional centrosome markers in our images as centrosomes exhibit a very reproducible behaviour in these embryos so we can be very confident that the objects we show here are genuine centrosomes. Considering expression levels, the images in Fig.4C,D (now 3C,D) are derived from stable transgenic lines so we can measure protein expression levels and show that the 2A and 2E mutants are expressed at similar levels to WT (new Figure S6). The images in 2F are from mRNA injections, so cannot be quantified in this way. However, we have vast experience with this assay (used in >15 publications since 2014) and can tell when, very occasionally, an injected mRNA is not expressed well (as this leads to a lack of general fluorescence in the cytoplasm). In addition, we know that deletions in Cnn do not generally destabilise the protein as we have analysed many such transgenic lines (see, for example, Reviewer Figure 1). Thus, the differences in centrosomal levels observed and quantified in 2F are almost certainly not caused by differences in the stability of the proteins being generated from the injected mRNAs.

      In Figure 4A, the interacting residues of PReM and CM2 shown in the red inset would be clearer if residue annotations for each domain were displayed in distinct colors. Additionally, the legends for Figures 4C and 4D do not specify the scale bar length. Fixed. The authors state that interactions between CM2 and PReM-2A462-608 could not be detected in vitro based on SEC chromatograms (Figure 5A), yet the figure does not clearly show this result. The accompanying SDS-PAGE images are too small and lack lane labels, making interpretation difficult (a similar issue applies to Figure 7B). Furthermore, the SEC chromatogram x-axis lacks volume annotations, hindering correlation between chromatographic peaks and SDS-PAGE results (in contrast to Figure 7B, which provides an appropriate example). We thank the reviewer for these points, all of which have now been fixed/adjusted.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #3

      Evidence, reproducibility and clarity

      This study by Mohamad et al. builds on prior work by Conduit et al. (2014, PMID: 24656740) and Feng et al. (2017, PMID: 28575671), which established the essential role of intramolecular interactions between the phospho-regulated multimerization (PReM) domain and centrosomin motif 2 (CM2) of Drosophila Cnn in pericentriolar matrix (PCM) expansion during mitosis. Extending these studies, the authors investigate the structural properties of Cnn's PReM and CM2 domains and compare them with homologous proteins in C. elegans (SPD-5) and humans (CDK5RAP2). Their analyses suggest a phosphorylation-dependent mechanism that relieves Cnn autoinhibition, with particular emphasis on Ser567 and Ser571 within the PReM domain. The authors further propose that, whereas Cnn and CDK5RAP2 share conserved CM2-PReM interactions, SPD-5 has diverged to employ distinct mechanisms for PCM scaffold assembly.

      Although these conclusions rely heavily on AlphaFold3-predicted models (Abramson et al., 2024, PMID: 38718835), they are supported by a combination of in vitro and in vivo experiments, including live-cell imaging and molecular dynamics simulations. However, inconsistencies between in vitro and in vivo observations weaken some interpretations and warrant more careful discussion. Addressing the concerns below would substantially strengthen the manuscript.

      Major Comments

      1. The title, "Structural Insights into Mitotic-Centrosome Assembly," is overly broad. The study primarily focuses on CM2-PReM intramolecular interactions in D. melanogaster Cnn and does not comprehensively address mitotic centrosome assembly across species. A more specific title reflecting the fly-centric and structural focus would better align with the manuscript's scope and conclusions.
      2. The authors analyze condensate formation by Cnn and SPD-5 but overlook condensate formation by CDK5RAP2, which was recently reported by Rios et al. (2025, PMID: 40454523). Including CDK5RAP2 would enable a more balanced and informative comparison across fly, worm, and human homologs.
      3. In Figure 3, reconstitution of Cnn scaffolds using purified CM2 and PReM fragments yields "macromolecular scaffolds," but their physical properties are not defined. It remains unclear whether these assemblies are ordered or amorphous, and whether they exhibit solid- or gel-like behavior. Moreover, the heterogeneous, scattering particles observed by negative-stain EM (Figure S3B), likely corresponding to the Cnn490-608-CM2 complex, raise the possibility of nonspecific aggregation rather than organized scaffold formation. Appropriate controls lacking CM2 are needed to exclude spontaneous aggregation of PReM fragments. In addition, testing shorter truncations of the PReM H2 helix could help define the minimal requirements for scaffold assembly. Finally, the rationale for including the CnnΔExPReM construct only in vivo (Figure 3F), but not in the in vitro assays (Figure 3A-E), should be clarified.
      4. The coarse-grained (CG) simulation methodology is insufficiently described. Given that CG approaches sacrifice atomic detail and may oversimplify interactions, readers require more information to evaluate the model's reliability and limitations. A comparison with the framework used by Ramirez et al. (2024, PMID: 38356260) would be informative. It is also unclear why available crystal structures of WT and 2A Cnn (Figure 2C; Figure S4) were not used as simulation inputs, or why the structure of Cnn490-579 2E was not determined to complete the structural comparison.

      Furthermore, mutation of Ser567 and Ser571 to alanine markedly stabilizes the PReM domain (Figure 5C, D), implying that these residues maintain domain flexibility. Back-mapping CG models to atomic resolution could reveal the interactions altered by these mutations. The exclusive focus on double mutants (2A and 2E) is also limiting; analysis of single-point mutants at S567 or S571 would clarify whether both residues contribute equally or play distinct roles.
      5. The discussion lacks sufficient integration with prior studies and often presents conclusions without adequate citation. For example, the claim that flies and humans rely on related PReM-CM2 interactions whereas worms use distinct phosphorylation-regulated mechanisms is not supported by appropriate references. In addition, limited cross-referencing to the manuscript's own data weakens the connection between results and conclusions. Expanding and better grounding the discussion in existing literature would significantly enhance its depth and clarity.

      Minor Comments

      1. In Figure 1B, the molecular weight units for the protein marker are missing and should be included.
      2. In Figures 1E and 1F, readability would be improved by including x-axis labels on all graphs, rather than only on the bottom panels.
      3. The protein structures shown in Figures 2C and 2D should be explicitly labeled as dimers to avoid confusion.
      4. In Figures 3A-D, using fluorescently labeled CM2 would help validate both the interaction with the PReM domain and its localization within the scaffold.
      5. In Figure 3E, no statistical comparisons are presented between the original PReM construct and other samples. In addition, information regarding sample size and the number of experimental replicates is missing from the figure legend.
      6. In Figure 3F, the absence of a pixel intensity scale bar makes the data difficult to interpret, as color values corresponding to high and low signal intensities are unclear. Moreover, no additional centrosome marker is included, nor is there evidence that PReM fragment expression levels are comparable across samples. These concerns also apply to Figures 4C and 4D.
      7. In Figure 4A, the interacting residues of PReM and CM2 shown in the red inset would be clearer if residue annotations for each domain were displayed in distinct colors. Additionally, the legends for Figures 4C and 4D do not specify the scale bar length.
      8. The authors state that interactions between CM2 and PReM-2A462-608 could not be detected in vitro based on SEC chromatograms (Figure 5A), yet the figure does not clearly show this result. The accompanying SDS-PAGE images are too small and lack lane labels, making interpretation difficult (a similar issue applies to Figure 7B). Furthermore, the SEC chromatogram x-axis lacks volume annotations, hindering correlation between chromatographic peaks and SDS-PAGE results (in contrast to Figure 7B, which provides an appropriate example).

      Significance

      This work will be of interest not only to cell biologists studying centrosomes, but also to molecular biologists investigating how protein modifications regulate protein behavior.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Mohamad et al. set out to compare the assembly mechanisms of pericentriolar material (PCM) in flies and nematodes. They reveal that the main PCM scaffold protein in each species (Cnn in flies, SPD-5 in nematodes) is sufficient to form supramolecular droplets (with a crowding agent) or networks (without a crowding agent). However, they diverge in one key aspect: Cnn scaffold assembly relies on the interaction between a C-terminal CM2 domain and a central phospho-regulated domain (PReM), whereas SPD-5 does not. The authors solve the crystal structure of a region within Cnn's PReM. With the help of modeling, they speculate that this region is auto-inhibited through backfolding of alpha helices, thus preventing its interaction with the CM2 domain. This auto-inhibition would be relieved by phosphorylation, which modeling suggests would increase "breathing" of the backfolded structure. The authors end by presenting evidence to suggest that the human PCM scaffold protein CDK5RAP2 may assemble through a PReM-CM2 interaction.

      Major Comments:

      1. The title is too vague. Any number of existing papers could be said to provide "structural insights into mitotic centrosome assembly". The authors need to narrow down to a defined conclusion and state this as the title.
      2. I think the strongest and most novel aspects of this study relate to the mechanism of Cnn assembly via relief of the auto-inhibited PReM. The efforts to elucidate assembly mechanisms of SPD-5 and CDK5RAP2 are comparatively light and there are no accompanying experiments in worms or human cells. Without the in vivo experiments, it's hard to know if the in vitro experiments are valid. It's speculative for the authors to say they found the true PReM for CDK5RAP2; they do not demonstrate that PLK-1 phosphorylation potentiates assembly in Figure 8. Thus, I suggest re-writing the paper to focus on Cnn. Experiments in Figure 6 are still valid if reframed. For example, substituting Cnn's CM2 with the CM2 from CDK5RAP2 vs. the C-term of SPD-5 illustrates that a simple coiled-coil with open ends (H.s.CM2) is sufficient to interact with PReM whereas a coiled-coil with a closed end (SPD-5 C-term, predicted by Figure 6A) cannot.
      3. The purpose of Figure 1 is unclear. None of the other figures examine SPD-5 and Cnn in the condensate form, which required using 4% PEG in this paper. The other assays look at the network form, which could behave differently and have different dependence on specific domains. I think they should perform the condensate assay for all other figures, otherwise leave it out. Furthermore, CDK5RAP2 is mentioned, yet not examined in Figure 1. It must be noted that CDK5RAP2 will also condense into droplets under crowding conditions or with a synthetic nucleator (Rios et al., 2025 J Cell Sci). Thus, it seems that condensation potential is a universal feature of known PCM scaffold proteins.
      4. The study uses different species without doing the same types of experiments on each. Sometimes human CDK5RAP2 is thrown in, sometimes not. They solve crystal structures of PReM from Cnn but not from the other proteins. This gets confusing, especially since the authors state that they seek to test if fly Cnn and worm SPD-5 assemble through different mechanisms (see last sentence of the intro). Also, if the focus is on worm vs. fly PCM assembly mechanisms, why include the human protein, especially Figure 8?
      5. The conclusion that SPD-5's narrow PReM and "CM2" domains don't interact is consistent with the cross-linking mass spectrometry data from Rios et al. 2024. They showed only one X-link with low occurrence (1 out of 6 samples) between these two regions, even in the phosphorylated state (Fig. 1G). However, Nakajo et al (2022) claimed the opposite, showing that a larger PReM-containing construct (a.a. 272-732) interacts with a C-terminal construct (a.a. 1061-1198) after PLK-1 phosphorylation. Can the authors comment on this? Perhaps there is another site in SPD-5, outside of a.a. 541-677, that acts like the Cnn PReM?
      6. I have serious doubts that the C-terminus of SPD-5 has a CM2 domain. To me, there is no real sequence homology with the traditional CM2's from humans and flies, and the AF3 predictions support this. Ohta et al. (2021) called this region "CM2-like" based on very poor homology, which is a questionable practice. Any coiled-coil region will appear somewhat homologous due to the heptad repeat pattern that defines them (e.g., leucines line up quite nicely). Thus, is it fair to say that SPD-5 doesn't assemble through a PReM-CM2 interaction? There may be a different region in SPD-5 that looks more like the canonical CM2. I think the authors have compelling evidence to give the C-terminal coiled-coil region in SPD-5 its own name rather than calling it CM2.
      7. Figure 3E. Would measuring scaffold mass be more appropriate? The PReM(deltaH1,NTH2) leads to more compact scaffolds, but maybe they assemble just as well as the deltaH1 mutant. As it stands, there is a discrepancy between panel E and F in terms of what is measured (area vs. intensity) and the outcome.

      Minor Comments

      1. In one version of the PDF there are images missing in Fig 1F, 4C, 4D. I opened another version (source version) and the images were there. Just FYI.
      2. Figure 4A. The blue coloration makes it difficult to read the black letters.
      3. Figure 4A. Why is part of the protein colored in green? This coloration isn't defined, nor does it show up again in panel B.
      4. The layout of Figure 4 is confusing. It took me a few minutes to realize that the big red box inset belonged to panel B and not panel A.
      5. Figure 4C,D. The sample size is not mentioned in the legend.
      6. The title for Figure 4 seems too speculative. How can the authors say that phosphorylation relieves the autoinhibition without structural data?
      7. Figure 5B. The sample size is not mentioned in the legend.
      8. Figure 6B,D. The sample size is not mentioned in the legend.
      9. The text in Figure 7B is hard to read because it is too small. Please make this bigger.
      10. Figure 8C. What is colored in magenta? Is there an additional labeled protein besides mNG-CM2?
      11. Figure 8C. What is the sample size? How many images were taken? Also, why are there data points off to the right of the last column?
      12. The wording of these sections needs improving. I found them complicated and difficult to understand.

      "Fly and worm Spd-2/SPD-2 and Polo/PLK-1 are clear homologues, but Cnn and SPD-5 share little sequence homology-although they are both predicted to be large coiled-coil-rich proteins. Thus, it remains unclear whether these two, largely unrelated, molecules form mitotic-PCM scaffolds that assemble and function in a similar manner"

      "We first focused on Drosophila Cnn as, although the full structure of the original PReM domain (Cnn403-608) is unknown, this domain contains an internal leucine-zipper (LZ) dimer (Cnn490-544) whose crystal structure, in a tetrameric complex with a CM2 dimer, had been solved (Figure 2A) (Feng et al., 2017)."

      "When the full PReM and CM2 domains are mixed in vitro, they form large micron-scale assemblies and point mutations that perturb the LZ::CM2 tetramer perturb PReM::CM2 scaffold assembly in vitro and Cnn scaffold assembly in vivo."

      Significance

      Overall Assessment:

      While I find the premise of this study to be interesting, its execution and presentation are not fully convincing. The study is a collection of experiments connected by a thread that can be difficult to follow. One concern is the lack of focus and a clearly stated conclusion, which is ultimately embodied by the vague title. For example, the research question at the beginning doesn't match with the outcome in the end. At the end of the introduction, the authors state they wish to compare assembly mechanisms of Cnn and SPD-5. However, at the end of the results, they present data on CDK5RAP2 and speculate on its assembly. Why introduce the human protein here? Another concern is the lack of symmetry in the experiments. There is much more in vitro characterization of Cnn than SPD-5 or CDK5RAP2, and all in vivo work is performed in flies. Finally, this study does not address if the best-established model for SPD-5 assembly (multimerization via specific, multivalent coiled-coil interactions) applies to fly Cnn. Thus, to me, this study is a deeper dive into the mechanism of Cnn assembly, not necessarily a fair cross-species comparison. I do not have major issues with the results, but I recommend that this paper undergo significant re-writing before being re-reviewed. There are also issues with data display and reporting of experimental details (e.g., sample sizes) that should be easily fixed.

      Advance: this study provides new insight into how two specific domains interact within PCM scaffold proteins to promote scaffold assembly. It provides some new structural insight into the mechanism of Cnn auto-inhibition. However, there is limited conceptual advance, as the bigger ideas (e.g., auto-inhibition as a regulatory control, PCM scaffold assembly through condensation of coiled-coil proteins) were already established.

      Audience: this study will be of interest to cell biologists studying centrosome assembly, mitosis, and evolution.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      The study by Mohamad et al. investigates the structural basis and regulatory role of phosphorylation in the assembly of the mitotic pericentriolar material (PCM) scaffold, which nucleates microtubules and organizes the poles of the mitotic spindle. They use structure determination, biochemical reconstitution and in vivo experiments in flies to address how fly, worm, and human homologs of a key scaffold protein (Cnn, SPD-5, and CDK5RAP2, respectively) are relieved from auto-inhibition in a phosphorylation-dependent manner to form extended scaffolds through interactions between PReM and CM2 domains. An important discovery is a helical hairpin structure in the PReM domain that is the basis of autoinhibition and is regulated by phosphorylation. The work addresses the fundamental question of how the centrosome matures in preparation for mitosis, by increasing the size and activity of the PCM scaffold that surrounds the centrioles. It also addresses how conserved the underlying molecular mechanisms are among flies, worms, and humans. The study is overall of high quality, building on previous works by the authors and other groups, and adding new structural and biochemical insight. Most of the conclusions are supported by the data. I have a few concerns though that should be addressed. An important issue is the analysis of phosphorylation sites, which appears incomplete. For example, it lacks demonstration that both of the two studied phosphorylation sites are indeed phosphorylated. Kinase motif identification and mutation is not sufficient, considering that phosphorylation is integral to the proposed model of how autoinhibitory intra-molecule interactions are relieved, and considering that phospho-mimetics have not been tested in vitro and function poorly in vivo.

      Main:

      1) From previous studies, it seems to me that for the residues potentially relevant for the hairpin regulation there is direct evidence of phosphorylation only for S567 (mass spec, phospho-antibody). Have the authors tested single site mutants (S567A and E)? Also, have they tested D mutations? If so, this should be commented on and shown. If not, it should be tested, in particular since the 2E phospho-mimetic is not functioning properly in vivo. If S571 is indeed crucial, it should be demonstrated that it is also phosphorylated. Otherwise it is possible that the mutation of this residue simply impairs important interactions (e.g. PReM-CM2, others), independent of phosphorylation.

      2) It is unclear why in vitro only A mutations have been tested and not phospho-mimetics. This should be tested for the interaction between PReM and CM2. This would make it possible to probe the model that phosphorylation opens the hairpin to allow interaction. Currently, such proof is missing in the study. Alternatively, the authors could phosphorylate the recombinant protein in vitro. The in vivo data is harder to interpret due to the complexity of the model and the authors should take advantage of the in vitro system.

      3) Regarding the worm PReM and CM2 domains, the authors mention that they have tested in vitro phosphorylation by PLK-1, but I could not find any data showing this. They should demonstrate successful phosphorylation or test candidate sites by phospho-mimetic mutation. It is possible that the worm proteins depend more strongly on phosphorylation to relieve autoinhibition compared to the fly proteins.

      Minor:

      4) Fig. 6C, D: the labeling of the chimeric constructs using "+" symbols is confusing, since it suggests that separate proteins were expressed. If I understand this correctly, with the current labeling, deltaCM2+DmCM2 means WT? The authors should write the full name of the wildtype or chimeric construct in each case and use a more standard/less confusing nomenclature. Also, I suggest starting the panels and graphs with the WT sample.

      Significance

      The study's strength is the use of a combination of structural and biochemical approaches with in vivo model testing. Its main limitation is that the analysis of the role of phosphorylation lacks depth and is not fully conclusive, despite its importance for centrosomal scaffold assembly. The study advances our understanding of centrosomal scaffold assembly and maturation at a molecular level, and how specific molecular aspects of these processes are conserved or differ among different organisms. The findings are of interest to cell biologists. My expertise is in centrosome and microtubule biology.

  2. Mar 2026
    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Overall, we were encouraged by the comments of the reviewers, who mostly agreed that the study advances our understanding of two-component system signaling mechanisms. The most substantive critique raised was the lack of mechanistic insight into the specific binding sites of Cu and NO on the PdtaS protein and the lack of examination of additional ligands such as cyclic di-GMP and zinc. We agree with this critique and cannot, and did not, make specific statements about the location of ligand binding. However, we draw a clear distinction in the manuscript between the functional effects of a chemical entity (ligand) on kinase activity and knowledge of the precise binding site of that ligand on the protein. As acknowledged, we did not determine the binding site. However, we do demonstrate the functional effect of the ligands, and these effects cannot occur without physical interaction between the ligand and the protein, so we believe the statement that the ligands are having this effect through binding is accurate, without knowledge of the precise location of that binding.

      Reviewer 1:

      • The primary concern pertains to ligand recognition by PdtaS. While PdtaS constitutive autophosphorylation is shown to be dependent on dimerization, there is no direct evidence of ligand binding. How Cu and NO inhibit PdtaS activity remains uncharacterized. It is unclear if there are specific binding pockets inducing a PdtaS conformational switch, if both substrates compete for a single binding pocket, or if Cu and NO inhibit dimerization by binding to the dimer interface. Similarly, it is unclear whether NO covalently modifies the key cysteine residues by S-nitrosylation, or whether Cu induces a distinct and reversible thiol-switch by site-specific oxidation that regulates PdtaS dimerization and activity.
      • Response: Our discussion already contained this sentence: "Although we identify mutations with both positive and negative effects on dimer affinity which have effects of ligand inhibition of the kinase, this data does not identify specific molecular details on how dimerization is inhibited and whether Cu and NO both interact with the same regions of the dimer interface." We have added a sentence acknowledging that we do not determine whether NO is covalently modifying a thiol (line 343).

      • Given the focus on ligand effects on PdtaS dimerization and activity, zinc and c-di-GMP should also be considered, as prior studies have suggested they may be sensed by PdtaS. Similarly, given the claim of multiligand sensing, it would be valuable to examine the combined effects of NO and Cu. Do they act additively, synergistically, or interfere with each other?

      • Response: We have added data to the manuscript examining the effect of c-di-GMP on kinase activity in combination with Cu. We do not observe a substantial synergistic effect. This new data is now Figure S3.

      • PdtaS variants and mutants are neither introduced nor adequately described. For example, in lines 144-150, PdtaS-H303Q and G443 are mentioned without citation, and their construction is not described in the Materials and Methods section. As a result, it is difficult to determine which experiments and constructs are specific to this manuscript. Please provide a detailed Materials and Methods section, and include as supplementary material a complete list of all strains, primers, and constructs used in this study, along with their origins.

      • Response: We have added a section to the methods detailing the construction of the PdtaS mutant protein expression plasmids.

      • References: Xing J et al 2023 is duplicated. Please correct in the text and in the references list.

      • Response: We have deleted the duplicate reference.

      • Please provide molecular weights on gels (fig. 1C, D, E, 2A, 5C, D, 7A). Please provide incubation time for kinase reactions in figure legends (e.g. Fig 1C, D, E, ...).

      • Response: All of these incubation times are included in the materials and methods. We will add to the figure legends depending on journal style. We have added selected MW numbers to the MW markers in 1D, E, 5D, 7A.

      • Please indicate whether representative experiments are shown, and specify the number of replicates performed for each assay (e.g. Fig 1C, D, E, ...). This information is essential for assessing the reproducibility and robustness of the findings.
      • Response: We are somewhat confused by this comment. For each claim made about quantitative effects, we include a quantitation panel that contains experimental replicates (1F, 2D, 4D, 5C, 7B). For MST graphs, we state the number of replicates for each time point.

      • Please clarify the discrepancy in Figure 2A regarding the calcium concentration used. The results section (line 163) refers to 10 µM, whereas the figure legend (line 393) states 1 mM.

      • Response: We have corrected the figure legend to 10 µM.

      • Figure 2A should include zinc, as previous work by the authors has shown that zinc directly inhibits the kinase activity of PdtaS. It would also be informative to test c-di-GMP in Fig. 2, given that c-di-GMP has been described to bind PdtaS (PMID: 33772870), and that c-di-GMP binding at dimer interfaces has been demonstrated in transcription factors (e.g., PMID: 25171413).

      • Response: We did not test zinc because in our prior studies the effects of Zn and Cu were identical. We have tested c-di-GMP as noted above (see new Fig S3).

      • The interpretation in lines 206-207 is not convincing. PdtaS homologs may differ in ligand specificity, precluding the presence of a conserved ligand-binding cavity but not of a specific ligand binding cavity in the GAF/PAS domains. Functional divergence of the binding site can occur, and this possibility should be acknowledged.

      • Response: We are somewhat confused by this comment. This is the sentence in question: "This analysis suggests that the PdtaS kinase family has evolved to conserve the dimerization interface, shown above to be important for autokinase activity, but that the putative ligand binding domains do not have a conserved ligand cavity, arguing against a specific ligand that binds in the GAF or PAS pocket in this family of histidine kinases." The sentence does not argue that there is no ligand binding in the GAF PAS cavity, only that the cavity is not conserved, and this argues against a single ligand. To clarify this point, we will insert the word "single" before "specific ligand".

      Reviewer 2:

      • The dimer model is consistent with trans phosphorylation, but I did not see model quality described, especially in the H303-ATP binding interface. Can the authors provide AlphaFold PAE and pLDDT scores?
      • __Response: We have added a SI figure (Figure S4) with this data. __

      • Although the effect of Cu and NO on the two mutant PdtaS is clear, why the WT activity in Fig. 2A is not also inhibited is not obvious to me, especially since WT dimerization is affected by Cu and NO (Fig. 2B, C). Is there also cis-autophosphorylation that masks reductions in trans phosphorylation? Is the WT signal saturated on this autorad?

      • __Response: This assay, as noted in the figure legend, was done with 10 µM Cu. This dose is only mildly inhibitory to the wild type kinase, as demonstrated in Figure 5D-E, which clearly demonstrates Cu inhibition. We don't have an explanation for why the trans phosphorylation mutant pair is inhibited by lower doses of Cu. It is possible this reflects some cis-autophosphorylation, but the strong inhibition of trans autophosphorylation is consistent with our model. __

      • The two Cys residues in PdtaS were previously found to affect kinase activity. Here, the authors show they also modestly affect dimerization. Since ~1/3 of mycobacteriales have both Cys, a double mutant would have been interesting for the in vitro characterization (it is used in live bacteria in Fig. 7A) and might show a more pronounced effect (not critical).

      • __Response: We agree. We attempted to purify the double cysteine mutant from E. coli but were unable to do so because of insolubility, so we could not test the protein. __

      • Although competition data and structural model clearly indicate trans phosphorylation, some cis-phosphorylation can probably not be ruled out, especially since the dimer mutant H67A shows some activity. Although that mutation does not seem to fully disrupt the dimer, the H67A activity could be indicative of some cis-phosphorylation.

      • __Response: The H67A mutant is a dimerization mutant that weakens, but does not completely disrupt, the dimer. This mutant cannot be used to distinguish cis from trans phosphorylation, and therefore we cannot rule out a mixture of cis and trans autophosphorylation. The data in Figure 1 argue for trans phosphorylation being the dominant mechanism. __

      • The Cys residues destabilize the dimer, and mutating the Cys stabilizes it, even canceling out the effect of the chemical destabilizers Cu and NO. In Fig. 4A, it looks like all Cys are too far apart to form disulfides, but Cu2+ can cause formation of disulfides. Can the authors comment on the distance of the Cys and the likelihood that disulfides have a role in this mechanism? If this were plausible, thiol-to-disulfide ratios with and without Cu could be directly measured. Although a bit more of a stretch, NO could also contribute to disulfide formation through ROS, and disulfides could be a way by which these two disparate ligands have the shared effect on activation shown here.

      • __Response: As stated above, we are not able to comment about whether direct modification of these cysteines is occurring. We do not believe the proximity of the cysteines would allow disulfide formation. __

      • The interdomain mutation Arg261Ala is quite nice and shows a specific effect on activity, but not dimerization, indicating that this interdomain ion bond somehow transfers the dimerization signal from the GAF to the PAS domain. Were there any other interdomain bonds? For completeness, was the basal autophosphorylating or phosphotransfer activity to PdtaR affected by the 261 mutation?

      • __Response: We did not detect any other bonds in our modeling. The basal level of autophosphorylation of the R261A protein compared to WT is visible at the lower end of the Cu inhibition curve in 6E and is comparable. We did not observe a difference in autophosphorylation at 0 Cu in the gels supporting this curve. __

      • First sentence in the Discussion, wording: The study investigates kinase activation, not signal sensing in a strict sense.

      • __Response: We have edited this sentence. __

      • Although this is primarily a biochemical, mechanistic study, one or two sentences on the biological significance of PdtaS/R in M. tuberculosis in the Introduction would be nice

      • __Response: We believe these sentences in the introduction already satisfy this request: "PdtaS and PdtaR were implicated in the Rip1 pathway of M. tuberculosis signal transduction by a genetic suppressor screen in which inactivation of either PdtaS or PdtaR reverted the copper and nitric oxide sensitivity of M. tuberculosis lacking rip1. Copper and NO directly inhibit the kinase activity of PdtaS, an inhibition that requires the N terminal GAF and PAS domains [39], indicating that the GAF-PAS are necessary to transmit the inhibitory signal to the kinase domain." We would also note that although much of the data is biochemical, we test the in vivo relevance of our model using M. tuberculosis strains carrying PdtaS mutations. __

      Reviewer 3


      • Limited characterization and validation of dimerization measurements: (a) while MST is an established technique, the central thesis relies heavily on dimerization measurements using this single method. Given the importance of this finding, at least one additional orthogonal approach would strengthen the conclusions significantly. Analytic size exclusion chromatography (SEC) could be a very simple, accessible and reliable approach to address this core mechanistic question. By choosing the right size resolution separation matrix, the authors should be able to separate complete monomers, from partial complexes (e.g. dimers only held through the kinase domain) and full dimers (the species the authors expect for the constitutively active wt protein). Ready advantage of having the wt protein can be taken, as well as several dimerization mutants (C53A, C57A, H67A), and presence/absence of cognate ligands (NO, Cu). For necessary reference standards, a dilution series should be able to reveal the elution position for wt monomers (and if this approach reveals to be difficult, mild chaotropic conditions can always be attempted, often times also pH shifts can do the job). Other techniques can point in the same direction as SEC, such as SAXS (best coupled to a SEC, or SEC-SAXS), native polyacrylamide gel electrophoresis, and/or dynamic light scattering.
      • __Response: We appreciate the reviewer's careful suggestions for additional experimental approaches. Although they are logical, we are unable to undertake them at this time, and we hope that further exploration will be stimulated by our study. __

      (b) More importantly, additional techniques should be chosen such that a clear distinction can be made between two different scenarios, namely: that only the sensory domains (PAS/GAF) undergo ligand-triggered dissociation; or, instead, that the entire protein dissociates into separate monomers (i.e. including the kinase domains). This seems like an extremely important distinction, so that the proposed kinase-regulation mechanism is well understood/described. The first scenario would be less "disruptive" wrt previous paradigms (sensory domain dissociation could well be linked to a conformational rearrangement that allosterically inhibits kinase auto-phosphorylation).

      • __Response: We agree that this is an important distinction. We would note here that our prior data (Buglino et al eLife 2021) demonstrated that the isolated kinase domain of PdtaS is not inhibited by copper or NO, indicating that the effect of these ligands both requires the GAF-PAS and that the kinase dimer itself is not sensitive to ligand induced inhibition. This result does not directly address the reviewer's question, which is whether there is localized inhibition of dimerization in the GAF-PAS dimer, which, via an allosteric mechanism, inhibits phosphorylation by the kinase domain, which we have shown is in trans, without actual dissociation. We are not aware of a technique that could distinguish what would presumably be a type of allosteric localized dimer disruption from full dissociation. Our data clearly indicates that the kinase inhibition effect is mediated by the dimer dissociation effect on the GAF/PAS and full characterization of the effects of that on the kinase domain will await further studies outside of this paper. __
      • R261A mechanistic inconsistency: The manuscript shows that the R261A mutant has attenuated copper inhibition in vitro, albeit remaining functional in vivo (Figure 7B). While the authors acknowledge this suggests their "interdomain coupling model is incomplete or compensated by other mechanism in vivo," this significant discrepancy undermines confidence in the proposed mechanism and deserves more thorough investigation and/or discussion.
      • __Response: We thank the reviewer for this comment, which relates to the R261A mutant. We disagree that this result "undermines confidence in the proposed mechanism". We rigorously interrogated our in vitro findings by genetic complementation in M. tuberculosis cells using epitope tagged proteins and these results largely confirmed the model in that C53A, C53A/C57A, and H67A all inactivated signaling, as predicted from our model. R261 is the exception and, as we discuss, it indicates that we do not completely understand the in vivo determinants of coupling between the GAF-PAS dimer and the kinase domain, which is dependent on R261 in vitro. __

      Insufficient evidence about signal integration: While the authors argue this mechanism enables "integration of multiple inputs into the kinase without the constraints of specific ligand recognition" (lines 342-344), this appears conceptually flawed to me. The ligands (Cu and NO) must still be specifically sensed and bound somewhere on the protein to trigger dimerization disruption - the mechanism simply uses dimerization modulation as the output rather than the more typical allosteric conformational changes. The conservation pattern (interface > binding sites) may reflect selective pressure to maintain dimerization capability across the family, while individual species evolved different ligand specificities. The authors should clarify that their mechanism represents a novel output mode for ligand sensing rather than an alternative to specific ligand recognition, and discuss how this distinction affects their evolutionary interpretation.

      • Response: We thank the reviewer for this comment. The issue raised is the difference between ligand "recognition" and "sensing" with the former implying a specific binding site (which we acknowledge above and in the paper that we do not identify) and the functional output modified by ligands. Our data supports that dimerization is an important mechanism of sensing, but we do not claim that the dimer interface is the binding site for the ligands. We would note that the following sentences were in the reviewed version of the paper and we believe clearly make the exact distinction that the reviewer requests:

      Abstract: "These results indicate that a single bacterial kinase can sense chemically diverse inputs through inhibition of dimerization dependent phosphorylation"

      Line 110: "Ligand binding pockets of GAF and PAS domains can bind a wide variety of ligands[38], but it remains to be determined whether multi-ligand sensing by PdtaS represents a manifestation of specific chemical recognition by the GAF-PAS domains or some other mechanism."

      Line 119: "Mutations in the GAF dimer interface that alter dimerization also impair multi-ligand sensing of Cu and NO in vitro and in M. tuberculosis cells. Our findings establish a mechanism of multi ligand sensing through alteration of sensor oligomeric state."

      Line 330: "Taken together, these data are consistent with a model in which modulation of dimer affinity is the sensing mechanism of the mycobacterial clade of PdtaS kinases, rather than specific recognition of Cu or NO by the ligand binding pockets of the GAF or PAS domains."

      __We have edited one instance (line 337) in the discussion where the use of "recognition" might have been misconstrued. __

      Minor Comments 1. The manuscript could better explain why PdtaS is described as "constitutively active" - the distinction between showing autophosphorylation activity in vitro versus true constitutive activity could be clearer. Can the authors show or refer to evidence of live constitutive PdtaR phosphorylation by PdtaS? (e.g. PhosTag electrophoresis gels of whole protein extracts and Western blotting revealed by anti-PdtaR; the use of NO and Cu can easily be used as inhibitors in such experimental setup).

      • Response: We thank the reviewer for this question. Our basis for claiming that the kinase is constitutively active, both for autophosphorylation and phosphotransfer to PdtaR, is the following:
      • __In the work of others and our prior work, PdtaS autophosphorylates without added ligand, in contrast to most histidine kinases, which are ligand activated. __
      • __PdtaS phosphorylates PdtaR without added ligand in vitro (see Figure 5C of this paper). __
      • __In terms of in vivo demonstration of PdtaR phosphorylation, this is very challenging in all response regulators given the unstable nature of the aspartate phosphorylation. We have been unsuccessful in visualizing PdtaR phosphorylation in vivo using phostag or western blotting. However, we note that our prior work demonstrated that mutation of the phosphoacceptor residue in PdtaR (D65A) phenocopied loss of both PdtaS and PdtaR (Buglino et al eLife 2021). __

      • Figure 5D shows some gel quality issues and also limited detail in the legend to know what exactly each panel represents and labels' definitions (e.g. "Ca" on the first lane, etc). The difference between wt and mutant is not clearcut to me, difficult from these data alone to derive a reliable Ki. Furthermore, the control gel on the bottom for the wt (I believe this is a cold control gel to see loaded quantities of protein on each lane?), seems to have less protein in the higher Cu concentrations.

      • __Response: The calcium lane is the divalent ion control as in the other figures. The legend of this figure refers to Figure 1, which is more detailed and describes the identical assay. As noted in the methods, the phosphorylation signal is normalized to the total protein in the lower gel, so the lower amount of protein in these lanes is incorporated into the quantitation, which itself is derived from triplicate experiments, as noted in the legend. __

      Enough experimental detail should be included on figure legends so that the experiments are self-explanatory.

      • __Response: This is a journal style question that will be addressed depending on the identity of the eventual journal. __

      Lines 184-185 : they only refer to the fact that c-di-GMP binds to the GAF domain of PdtaS, yet the paper by Hariharan et al 2021 also shows that it activates PdtaS's autokinase activity. This should be double-checked and taken into account for the discussion of cognate ligands' effects.

      • Response: As noted above, we have added a supplemental figure (S3) that examines the effect of c-di-GMP on autokinase activity. Hariharan reported activation of PdtaS by c-di-GMP (Figure 6A of that publication). We do not see similar activation and do not observe an effect of c-di-GMP with Cu.

      Line 233: "Dimerization separation of function mutations" title is unclear

      • Response: Edited.

      The structural model source (AlphaFold3) should be mentioned in the main text, not just figure legends. AF3-predicted models should be illustrated according to per-residue pLDDT reliability indices (typically with a color ramp).

      • __Response: The paper contains this sentence: "We performed bioinformatic analyses of the conservation of PdtaS across the Actinomycetota phyla and mapped this conservation onto the predicted full length PdtaS dimer structure predicted using AlphaFold". We have added the specific AlphaFold model (3) to this sentence. __

      • __We have also added an SI figure containing the pLDDT data (Figure S4). __
    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      The manuscript by Sankhe and collaborators investigates the mechanism by which the Mycobacterium tuberculosis two-component system PdtaS/PdtaR senses copper and nitric oxide. The authors demonstrate that PdtaS is a constitutively active histidine kinase that autophosphorylates in trans, and that ligand-triggered inhibition occurs through disruption of dimerization rather than typical allosteric conformational changes of the dimeric species. Through phylogenetic analysis, mutagenesis, and biochemical assays, they show that conservation occurs primarily at the dimer interface rather than putative ligand binding sites, supporting a novel mechanism of multi-ligand sensing through modulation of oligomeric state. The experimental design is generally sound, with appropriate controls and multiple lines of evidence supporting the main conclusions. The trans-autophosphorylation experiments are particularly elegant and convincing. While there are some mechanistic concerns that should be addressed (particularly around R261A, and the actual dimerization extent/effect), the core findings are significant, and the work represents an important contribution to understanding bacterial signal transduction.

      Major Comments:

      1. Limited characterization and validation of dimerization measurements: (a) while MST is an established technique, the central thesis relies heavily on dimerization measurements using this single method. Given the importance of this finding, at least one additional orthogonal approach would strengthen the conclusions significantly. (b) More importantly, additional techniques should be chosen such that a clear distinction can be made between two different scenarios, namely: that only the sensory domains (PAS/GAF) undergo ligand-triggered dissociation; or, instead, that the entire protein dissociates into separate monomers (i.e. including the kinase domains). This seems like an extremely important distinction, so that the proposed kinase-regulation mechanism is well understood/described. The first scenario would be less "disruptive" wrt previous paradigms (sensory domain dissociation could well be linked to a conformational rearrangement that allosterically inhibits kinase auto-phosphorylation). Analytic size exclusion chromatography (SEC) could be a very simple, accessible and reliable approach to address this core mechanistic question. By choosing the right size resolution separation matrix, the authors should be able to separate complete monomers, from partial complexes (e.g. dimers only held through the kinase domain) and full dimers (the species the authors expect for the constitutively active wt protein). Ready advantage of having the wt protein can be taken, as well as several dimerization mutants (C53A, C57A, H67A), and presence/absence of cognate ligands (NO, Cu). For necessary reference standards, a dilution series should be able to reveal the elution position for wt monomers (and if this approach reveals to be difficult, mild chaotropic conditions can always be attempted, often times also pH shifts can do the job). 
      Other techniques can point in the same direction as SEC, such as SAXS (best coupled to a SEC, or SEC-SAXS), native polyacrylamide gel electrophoresis, and/or dynamic light scattering.
      2. R261A mechanistic inconsistency: The manuscript shows that the R261A mutant has attenuated copper inhibition in vitro, albeit remaining functional in vivo (Figure 7B). While the authors acknowledge this suggests their "interdomain coupling model is incomplete or compensated by other mechanism in vivo," this significant discrepancy undermines confidence in the proposed mechanism and deserves more thorough investigation and/or discussion.
      3. Insufficient evidence about signal integration: While the authors argue this mechanism enables "integration of multiple inputs into the kinase without the constraints of specific ligand recognition" (lines 342-344), this appears conceptually flawed to me. The ligands (Cu and NO) must still be specifically sensed and bound somewhere on the protein to trigger dimerization disruption - the mechanism simply uses dimerization modulation as the output rather than the more typical allosteric conformational changes. The conservation pattern (interface > binding sites) may reflect selective pressure to maintain dimerization capability across the family, while individual species evolved different ligand specificities. The authors should clarify that their mechanism represents a novel output mode for ligand sensing rather than an alternative to specific ligand recognition, and discuss how this distinction affects their evolutionary interpretation.

      Minor Comments

      1. The manuscript could better explain why PdtaS is described as "constitutively active" - the distinction between showing autophosphorylation activity in vitro versus true constitutive activity could be clearer. Can the authors show or refer to evidence of live constitutive PdtaR phosphorylation by PdtaS? (e.g. PhosTag electrophoresis gels of whole protein extracts and Western blotting revealed by anti-PdtaR; the use of NO and Cu can easily be used as inhibitors in such experimental setup).
      2. Figure 5D shows some gel quality issues and also limited detail in the legend to know what exactly each panel represents and labels' definitions (e.g. "Ca" on the first lane, etc). The difference between wt and mutant is not clearcut to me, difficult from these data alone to derive a reliable Ki. Furthermore, the control gel on the bottom for the wt (I believe this is a cold control gel to see loaded quantities of protein on each lane?), seems to have less protein in the higher Cu concentrations.
      3. Enough experimental detail should be included on figure legends so that the experiments are self-explanatory.
      4. Lines 184-185: they only refer to the fact that c-di-GMP binds to the GAF domain of PdtaS, yet the paper by Hariharan et al 2021 also shows that it activates PdtaS's autokinase activity. This should be double-checked and taken into account for the discussion of cognate ligands' effects.
      5. Line 233: "Dimerization separation of function mutations" title is unclear
      6. The structural model source (AlphaFold3) should be mentioned in the main text, not just figure legends. AF3-predicted models should be illustrated according to per-residue pLDDT reliability indices (typically with a color ramp).
      7. Ensure consistent reporting of replicate numbers across all experiments.

      Significance

      General assessment:

      The study provides elegant trans-autophosphorylation experiments and strong phylogenetic support for dimerization interface conservation. However, it relies heavily on MST as the sole method for measuring dimerization (the main finding in terms of novelty) and shows mechanistic inconsistencies (R261A functional in vivo despite attenuated inhibition in vitro).

      Scientific Advance:

      While the authors overstate novelty by claiming to bypass "specific ligand recognition" (ligands must still bind specifically to trigger dimerization disruption), the identification of dimerization modulation as the inhibitory output mechanism represents a meaningful advance. The work establishes an important framework for understanding how M. tuberculosis senses multiple host-derived stresses and may inform studies of other inhibitory sensor kinases.

      Target Audience:

      I believe this work will be of great interest to bacterial signaling researchers and M. tuberculosis pathogenesis specialists, with broader appeal to the microbiology community studying two-component systems and host-pathogen interactions. The dimerization-based mechanism may also attract structural biologists studying multi-domain sensor architectures.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      This study explores the activation mechanism of a two-component system in M. tuberculosis, PdtaS/R. PdtaS/R is known to sense Cu and NO. Here, the authors show that PdtaS autophosphorylates in trans upon dimerization. PdtaS is constitutively active, and Cu and NO binding inactivate the kinase by preventing dimerization. The dimerization interface, but not the ligand binding domain, is conserved in PdtaS orthologs, and disruption of the dimer interface also disrupts ligand sensing in vitro and in live M. tuberculosis.

      Major comments

      This is an interesting and thorough analysis of the (in)activation mechanism of a TCS. Although much work has been done on such systems, this TCS is quite interesting as it is a rarer cytosolic, soluble system, because it has been shown to sense two chemically very different ligands, Cu and NO (and apparently also c-di-GMP), and because it is constitutively active and inactivated by ligands, which is more unusual. A main strength of the paper is the identification of a range of mutants with specific effects on dimerization, activity, auto and substrate phosphorylation and GAF-PAS interactions to probe and parse the contribution of different aspects of the mechanism. The flow of the experiments is logical, and the data are generally clear, even though TCS phosphorylation is short-lived and can be tricky to capture. The heterodimer mixing experiments using WT, phosphoreceptor His-, and ATP binding mutants are clear and conclusively show trans phosphorylation. The control ruling out dimerization defects of the mutants is useful, and bioinformatic analysis of the GAF and PAS domains shows surprisingly clearly the conservation of the dimer interface, not the ligand binding site. The experiments showing the effects of dimerization on the activity of PdtaS are conclusive, with mutations showing stronger (Cys) or weaker (His67, already shown in a previous paper) dimerization and the expected effects on PdtaS activity. Testing some key mutants in live bacteria is another nice feature of the study that shows that in vitro findings (mostly) carry over to live bacteria, which is not always the case, and often just not tested. In sum, this is a solid, straightforward study on the activation mechanism of a more unusual M. tuberculosis TCS.

      Minor comments

      The dimer model is consistent with trans phosphorylation, but I did not see model quality described, especially in the H303-ATP binding interface. Can the authors provide AlphaFold PAE and pLDDT scores?

      Although the effect of Cu and NO on the two mutant PdtaS is clear, why the WT activity in Fig. 2A is not also inhibited is not obvious to me, especially since WT dimerization is affected by Cu and NO (Fig. 2B, C). Is there also cis-autophosphorylation that masks reductions in trans phosphorylation? Is the WT signal saturated on this autorad?

      Related: Although competition data and structural model clearly indicate trans phosphorylation, some cis-phosphorylation can probably not be ruled out, especially since the dimer mutant H67A shows some activity. Although that mutation does not seem to fully disrupt the dimer, the H67A activity could be indicative of some cis-phosphorylation.

      Some kinetic experiments would have been useful to gauge the timescale of these mechanisms (but not critical).

      The two Cys residues in PdtaS were previously found to affect kinase activity. Here, the authors show they also modestly affect dimerization. Since ~1/3 of mycobacteriales have both Cys, a double mutant would have been interesting for the in vitro characterization (it is used in live bacteria in Fig. 7A) and might show a more pronounced effect (not critical).

      The Cys residues destabilize the dimer, and mutating the Cys stabilizes it, even canceling out the effect of the chemical destabilizers Cu and NO. In Fig. 4A, it looks like all Cys are too far apart to form disulfides, but Cu2+ can cause formation of disulfides. Can the authors comment on the distance of the Cys and the likelihood that disulfides have a role in this mechanism? If this were plausible, thiol-to-disulfide ratios with and without Cu could be directly measured. Although a bit more of a stretch, NO could also contribute to disulfide formation through ROS, and disulfides could be a way by which these two disparate ligands have the shared effect on activation shown here.

      The interdomain mutation Arg261Ala is quite nice and shows a specific effect on activity, but not dimerization, indicating that this interdomain ion bond somehow transfers the dimerization signal from the GAF to the PAS domain. Were there any other interdomain bonds? For completeness, was the basal autophosphorylating or phosphotransfer activity to PdtaR affected by the 261 mutation?

      First sentence in the Discussion, wording: The study investigates kinase activation, not signal sensing in a strict sense.

      Although this is primarily a biochemical, mechanistic study, one or two sentences on the biological significance of PdtaS/R in M. tuberculosis in the Introduction would be nice

      Christoph Grundner

      Significance

      This is a mechanistic study on the (in)activation mechanism of an M. tuberculosis TCS with some unusual features: The kinase, PdtaS, is constitutively active and ligand binding inactivates it. It is a soluble system that binds multiple ligands, fewer of which have been described to date. While trans-phosphorylation and regulation by dimerization are not conceptually new, they were also not a given in this more unusual TCS. The authors identified relevant mutants at several steps of the activation mechanism to specifically probe the effect of dimerization, interdomain communication etc., and test their relevance in live bacteria, which goes beyond what comparable biochemical studies typically do. The authors have previously published some aspects of the PdtaS/R activation mechanism (inhibition by Cu and NO, Cys mutants). The conserved dimer interface suggests that many of the PdtaS orthologs are similarly regulated, and that different ligands can converge on this dimerization-dependent activation. Thus, the study could be relevant for the whole family and the range of ligands they likely sense. The study summarizes the findings in the idea that PdtaS activity relies on dimerization to react to divergent ligands rather than specific ligand binding to the GAF and PAS domains. This statement is perhaps too strong; rather than either/or, it is likely both. Overall, this is a thorough biochemical study that reveals aspects of a more non-typical TCS activation mechanism that are of high interest to the Mtb and bacterial signaling field.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      In this manuscript the authors aim to characterize the mechanism regulating PdtaS activity, a histidine kinase of Mycobacterium tuberculosis responding to nitric oxide (NO) and copper ions (Cu). The PdtaS/PdtaR two-component system is atypical, with PdtaS being cytoplasmic and phosphorylating PdtaR in the absence of signal(s). Here, the authors characterize key residues involved in PdtaS dimerization necessary for PdtaS activity on PdtaR, in vitro and in M. tuberculosis. They show that both NO and Cu inhibit dimerization and thus PdtaS autokinase activity. They propose a model where changes in dimer affinity serve as the sensing mechanism, allowing integration of multiple signals without relying on specific ligand binding.

      Major comments:

      • The primary concern pertains to ligand recognition by PdtaS. While PdtaS constitutive autophosphorylation is shown to be dependent on dimerization, there is no direct evidence of ligand binding. How Cu and NO inhibit PdtaS activity remains uncharacterized. It is unclear whether there are specific binding pockets inducing a PdtaS conformational switch, whether both ligands compete for a single binding pocket, or whether Cu and NO inhibit dimerization by binding to the dimer interface. Similarly, it is unclear whether NO covalently modifies the key cysteine residues by S-nitrosylation, or whether Cu induces a distinct and reversible thiol-switch by site-specific oxidation that regulates PdtaS dimerization and activity.
      • Since there is currently no direct evidence of ligand binding or residue modification, the conclusions drawn, particularly in the title and discussion, should be presented more cautiously, unless further structural, biochemical, or genetic data substantiate ligand binding.
      • Given the focus on ligand effects on PdtaS dimerization and activity, zinc and c-di-GMP should also be considered, as prior studies have suggested they may be sensed by PdtaS. Similarly, given the claim of multiligand sensing, it would be valuable to examine the combined effects of NO and Cu. Do they act additively, synergistically, or interfere with each other?
      • OPTIONAL. A significant limitation is the exclusive consideration of a single PdtaS conformation, the autophosphorylation-competent state. Histidine kinases typically cycle through at least three distinct enzymatic activities: phosphatase, autokinase, and phosphotransfer. Each of these functions relies on specific conformational states, which are often modulated by ligand binding. It is therefore important to investigate whether PdtaS also possesses phosphatase activity. Do ligands such as NO and Cu influence this activity by increasing phosphatase function, or do they simply inhibit autophosphorylation and/or phosphotransfer? Moreover, does the monomeric or dimeric form of PdtaS exhibit phosphatase activity? In addition, the stability of phosphorylated PdtaR should be addressed, as it is crucial for understanding the overall dynamics and output of the signaling cascade.

      Minor comments:

      • PdtaS variants and mutants are neither introduced nor adequately described. For example, in lines 144-150, PdtaS-H303Q and G443 are mentioned without citation, and their construction is not described in the Materials and Methods section. As a result, it is difficult to determine which experiments and constructs are specific to this manuscript. Please provide a detailed Materials and Methods section, and include as supplementary material a complete list of all strains, primers, and constructs used in this study, along with their origins.
      • References: Xing J et al 2023 is duplicated. Please correct in the text and in the references list.
      • Please provide molecular weights on gels (fig. 1C, D, E, 2A, 5C, D, 7A).
      • Please provide incubation time for kinase reactions in figure legends (e.g. Fig 1C, D, E, ...).
      • Please indicate whether representative experiments are shown, and specify the number of replicates performed for each assay (e.g. Fig 1C, D, E, ...). This information is essential for assessing the reproducibility and robustness of the findings.
      • Please clarify the discrepancy in Figure 2A regarding the calcium concentration used. The results section (line 163) refers to 10 µM, whereas the figure legend (line 393) states 1 mM.
      • Figure 2A should include zinc, as previous work by the authors has shown that zinc directly inhibits the kinase activity of PdtaS. It would also be informative to test c-di-GMP in Fig. 2, given that c-di-GMP has been described to bind PdtaS (PMID: 33772870), and that c-di-GMP binding at dimer interfaces has been demonstrated in transcription factors (e.g., PMID: 25171413).
      • I am not convinced by the interpretation in lines 206-207. PdtaS homologs can have different ligand specificity, implying the conservation of a ligand cavity in the GAF/PAS domains.
      • The interpretation in lines 206-207 is not convincing. PdtaS homologs may differ in ligand specificity, precluding the presence of a conserved ligand-binding cavity but not of a specific ligand binding cavity in the GAF/PAS domains. Functional divergence of the binding site can occur, and this possibility should be acknowledged.

      Significance

      The manuscript demonstrates that PdtaS autokinase activity occurs in trans and that homodimerization is critical for its constitutive activity. While these findings extend previous work by the authors (PMID: 34003742), which had already identified several key residues in PdtaS, including cysteines essential for NO and Cu sensing, they represent incremental advances. The mechanistic model proposed remains speculative without substantial additional experimental validation. The conclusions rely heavily on structural predictions from AlphaFold, representing only a single conformation corresponding to the autokinase-competent state. Crucially, the manuscript does not provide direct evidence for the mechanism of NO and Cu sensing. It also excludes the possibility of direct ligand binding to a specific pocket (including by untested candidates such as zinc and c-di-GMP) without experimentally addressing the hypothesis. These gaps significantly limit the mechanistic insight and overall impact of the study.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      I thank the Referees for their...

      Referee #1

      1. The authors should provide more information when...

      Responses

      + The typical domed appearance of a hydrocephalus-harboring skull is apparent as early as P4, as shown in a new side-by-side comparison of pups at that age (Fig. 1A).
      + Though this is not stated in the MS

      2. Figure 6: Why has only...

      Response: We expanded the comparison

      Minor comments:

      1. The text contains several...

      Response: We added...

      Referee #2

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Velázquez et al. investigate the transcriptomic and epigenetic consequences of exogenous expression of the Ewing sarcoma fusion oncogene EWSR1∷FLI1 (EF1) in yeast. The study provides compelling evidence that EF1 can bind ETS transcription factor motifs, as well as a single 4xGGAA repeat within the S. cerevisiae genome, despite the absence of several canonical cofactors often implicated in EF1 biology, including CBP/p300 and Polycomb group (PcG) proteins. The authors further show that EF1 expression redistributes RNA Polymerase II toward EF1-bound ETS sites. Strikingly, this relocalization is accompanied by only modest global transcriptional effects relative to those reported in human or insect systems. In addition, EF1 expression reverses repeat-associated epigenetic silencing of synthetic GGAA microsatellites in engineered reporter strains. Taken together, the data support the conclusion that GGAA microsatellite-mediated transcriptional rewiring is a comparatively conserved EF1 property, whereas broader transcriptomic changes arising from individual EF1-bound ETS sites appear to be more context-dependent. Overall, the manuscript is clearly written and logically organized, and the methodological descriptions and data-analysis details appear sufficient to enable reproducibility.

      Major comments

      None.

      Minor comments

      1. Additional citations in the Introduction. A small number of additional references would further support specific statements.
        • End of paragraph 3: consider adding Boulay et al. (2017).
        • Paragraph 4, after "proximal and distal genes": consider adding Tomazou et al. (2015) and Orth et al. (2022).
      2. Potential antibody cross-reactivity in the CoIP experiment. Is there an EWSR1 homolog in S. cerevisiae? To rule out unintended interactions, it would be helpful to exclude binding of endogenous yeast proteins by the anti-EWS antibody used for co-immunoprecipitation, for example via a homology search and/or appropriate specificity controls.
      3. Clarification of EF1-associated toxicity. The authors suggest that part of the observed transcriptional signal may reflect EF1 toxicity. It would strengthen the interpretation to characterize this phenotype more explicitly (e.g., growth rate over time, viability/cell death, or longer-term fitness effects). In addition, it would be informative to test whether toxicity depends on EF1 DNA-binding activity and whether truncation mitigates toxicity, as reported in Drosophila (Mahnoor et al., 2024).
      4. RNA Polymerase II relocalization versus limited transcriptional output. The apparent recruitment/repositioning of RNA Pol II in the absence of substantial transcriptional change is particularly interesting. This point could be strengthened by assessing RNA Pol II "states," for example using phosphorylation-state specific antibodies to distinguish stalled/paused from actively elongating polymerase.
      5. Promoter choice in the GGAA microsatellite reporter. The use of a modified constitutive promoter to monitor GGAA microsatellite-dependent activation is somewhat unconventional. Many studies (e.g., Gangwal et al., Hölting et al.) use minimal promoters to demonstrate EF1-dependent upregulation rather than reversal of heterochromatin-associated silencing. A brief rationale for the chosen design or discussion of how it relates to prior reporter paradigms would help readers contextualize the approach.
      6. Reconstituting minimal cofactor requirements for ETS-site activity. The finding that EF1 binding at single ETS sites does not translate into strong transcriptomic remodeling in yeast is an intriguing aspect of the work. The manuscript could be further enriched by systematic attempts to reconstitute candidate cofactors in this minimal system to define the minimal requirements for ETS-site-dependent gene activation.

      Significance

      This study will be of broad interest because it convincingly separates EF1-driven GGAA microsatellite-dependent activation from the canonical regulatory functions of FLI1, reinforcing the concept that fusion transcription factors can acquire genuinely neomorphic activities, which may ultimately be therapeutically actionable. Beyond the biological insight, the successful establishment of exogenous EF1 expression in yeast is itself a notable technical achievement, given the longstanding challenges associated with EF1 expression in heterologous settings. As presented here, yeast offers a valuable platform to interrogate EF1 function in a simplified and more controlled context.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this article, Velázquez et al. use the S. cerevisiae model to study the interactome of the fusion protein EWSR1::FLI1, characteristic of Ewing sarcoma. They employ this system to determine to what extent the neomorphic features of EWSR1::FLI1 depend on specific cofactors. The use of yeast as a simplified system allows the exploration of EWSR1::FLI1's dependence on key cofactors (ETS-TFs, Polycomb, CBP/p300, BAF) that are absent in this organism. Since these proteins are poorly conserved in yeast (e.g., BAF) or do not exist at all (ETS-TFs, Polycomb, CBP/p300), the aim is to determine whether their presence is necessary for EWSR1::FLI1 function. According to the authors, the interactome of EWSR1::FLI1 is much more limited in yeast than in humans, sharing only the core transcriptional machinery (RNA Pol II, FACT), failing to recruit BAF complexes, and showing enrichment in the SAGA chromatin remodeling complex, which is not part of the human interactome. In yeast, EWSR1::FLI1 binds to ETS consensus sites and CA repeats. Moreover, EWSR1::FLI1 induces only minor changes in the yeast expression profile, in contrast to what is observed in humans or Drosophila. Finally, EWSR1::FLI1 can bind to artificially introduced GGAA repeat regions in yeast, which naturally lacks such repeats, the sequences that EWSR1::FLI1 targets in humans to activate transcription. The study is methodologically sound and could potentially shed light on the requirements of EWSR1::FLI1.

      Velázquez and colleagues provide a conceptually interesting model. However, the interpretation of the results requires caution. As a final conclusion, although the work is methodologically correct, the high toxicity of the model prevents more definitive insights. Based on the results presented, it cannot be stated that yeast is a suitable model to study EWSR1::FLI1 mechanisms. Put differently, the presence of specific cofactors is required to recapitulate, at least partially, the intrinsic effects of EWSR1::FLI1.

      Major comments

      As shown in Supplementary Figure 1, the expression of EWSR1::FLI1 in yeast is highly toxic. Indeed, as the authors acknowledge in the discussion (page 11, end of the last paragraph), many of the observed expression changes may be a consequence of this toxicity, reflected in the high proportion of deregulated genes related to stress. This makes it difficult to draw clear conclusions. Since the model is inducible, it would be worth considering whether expression levels of EWSR1::FLI1 could be controlled. Achieving a balance with tolerable expression levels might help clarify the situation.

      Only the interactome of EWSR1::FLI1 has been studied. Given that the interactome largely depends on DNA-binding capacity, it would be informative to analyze the interactome of the RRLL mutant. It would also be valuable to determine the interactomes of wild-type EWSR1 and FLI1, or better, of the domains involved in the fusion, to identify which proteins interact with each domain.

      A fundamental feature of EWSR1::FLI1 is its ability to form condensates on GGAA repeats. These condensates are essential for cofactor recruitment and transcriptional activation at GGAA microsatellites. Since yeast lacks GGAA microsatellites, condensate formation is unlikely, but it would be important to determine whether it occurs. For example, the methodology used by Selig EE et al. (Phase separation of the oncogenic fusion protein EWS::FLI1 is modulated by its DNA-binding domain. Proc Natl Acad Sci U S A. 2025;122(20):e2221823122) could be applied to assess condensate formation, which may explain many of the results described.

      One striking aspect is the discrepancy between binding regions and regulated genes. The authors report 834 binding regions but only 86 regulated genes (73 up, 13 down). This suggests no clear association between DNA binding and transcriptional activation in yeast, possibly due to the absence of cofactors or the inability to form aggregates on GGAA microsatellites. The lack of correlation points to indirect mechanisms, such as RNA Pol II redistribution, as the authors suggest.

      In Ewing sarcoma cells, EWSR1::FLI1 binding to ETS consensus sites is associated with repression through displacement of ETS transcription factors. In yeast, most binding sites appear to be putative ETS consensus sites, but no repressive effect is observed, likely because there are no ETS factors to displace. This should be discussed.

      The claim that EWSR1::FLI1 "deregulates the yeast transcriptome" seems exaggerated, given the small number of affected genes compared to humans. Moreover, many correspond to stress responses, suggesting nonspecific effects of overexpression. This conclusion should be tempered.

      In Figure 5A, EWSR1::FLI1 restores GFP expression but to levels still lower than those observed in control cells. In humans, combining the GGAA promoter with EWSR1::FLI1 leads to higher expression than baseline. The absence of this effect in yeast indicates that EWSR1::FLI1 is not sufficiently active, even when a GGAA repeat locus is introduced, likely due to missing cofactors. This possible explanation should be included in the discussion.

      Minor comments

      • Raw data from proteomics, RNA-seq, and ChIP-seq studies should be deposited in public repositories.
      • Supplementary material should include a table with interactome results, listing proteins, relative expression levels (e.g., log2), p-values, adjusted p-values, and whether the protein has been described in the human interactome (with references).
      • Supplementary material should also include a table with RNA-seq results, showing means, fold changes, adjusted p-values, and whether the genes are EWSR1::FLI1 targets in humans (see Orth MF et al., Cell Reports, 2022;41(10):111761).
      • The number of experiments performed in proteomics, RNA-seq, and ChIP-seq should be specified both in the methods section and in figure legends.
      • In Figures 4D and 5A, the meaning of the letters used to indicate statistical significance should be clarified, or preferably replaced with asterisks, which are the standard convention.
      • A GSEA analysis should be performed to test whether the expression pattern regulated in yeast corresponds to the transcriptional program of EWSR1::FLI1 in Ewing sarcoma cells.

      Significance

      Strengths

      1. Originality of the experimental model The use of yeast as a simplified system eliminates the influence of endogenous cofactors (ETS-TFs, Polycomb, CBP/p300, BAF), allowing the study of intrinsic functions of EWSR1::FLI1 in a "minimal" context.
      2. Relevant findings on the interactome The study shows that the interactome in yeast is reduced and centered on the core transcriptional machinery (RNA Pol II, FACT, SAGA), providing information on essential and specific interactions, although already known.
      3. Validation of binding to ETS motifs and GGAA repeats The study confirms that EWSR1::FLI1 retains its ability to bind ETS motifs and GGAA repeats in yeast (when artificially introduced), reproducing a key feature of its biology in human cells.

      Weaknesses

      1. Limited scope of transcriptome deregulation Only 86 genes (73 up and 13 down) were regulated, a very small percentage compared to humans. Moreover, many appear to correspond to stress responses, which challenges the claim that EWSR1::FLI1 "deregulates the yeast transcriptome."
      2. Methodological and data presentation shortcomings The number of replicates in the interactome studies is not specified, and the raw proteomic and transcriptomic data are not deposited in public repositories.
      3. Limitations of the model compared to humans The yeast model does not recapitulate the most relevant and characteristic aspects of EWSR1::FLI1 in human cells, but it is useful to confirm the need for cofactors to carry out its function.

      Advance

      The study does not provide major conceptual advances in the biology of EWSR1::FLI1, although it indirectly confirms what is already known about the requirement for cofactors and DNA domains necessary for its function. In other words, the inability to reproduce what is observed in the more complex human system precisely confirms that those elements present in humans and absent in yeast are essential for its activity.

      Audience

      The article is undoubtedly of interest to a specialized audience in the field of Ewing sarcoma, although in my opinion of limited interest to other fields. It is also relevant from a basic research perspective, as it highlights why not every model is suitable for studying the biology of a given protein, and that exploring this is important to understand when a particular model may not be the most appropriate, an insight that is itself valuable.

      Expertise

      My field of expertise is the study of Ewing sarcoma, which I have pursued for 25 years. I cannot assess in detail the methodological description of the proteomic studies, but I do consider myself qualified to interpret the results derived from them.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary: This study addressed the effect of EWS::FLI1 expression in heterologous budding yeast, using three sets of experiments: 1) EWS::FLI1 interactome in yeast, 2) EWS::FLI1 genomic binding sites and transcriptome changes induced by EWS::FLI1 in yeast, and 3) whether EWS::FLI1 can activate transcription through GGAA tandem repeats in yeast. The yeast EWS::FLI1 interactome showed enrichment for the SAGA chromatin remodeling complex and lacked the BAF and spliceosome complexes while sharing RNA polymerase II and FACT with the human counterpart. EWS::FLI1 bound to hundreds of ETS sites in the yeast genome, but caused minimal transcriptional changes. EWS::FLI1 activated a GGAA tandem repeat-driven reporter in yeast.

      Major comments:

      While the manuscript title states EWS::FLI1 functions as a pioneer factor in Saccharomyces cerevisiae, the authors merely demonstrated that EWS::FLI1 can activate a GGAA repeat-driven reporter in yeast, without any evidence that GGAA repeats have a compact, closed chromatin configuration that is opened by EWS::FLI1, recruiting chromatin modifiers and allowing access by other transcription factors (these features define a 'pioneer factor'). The claim that EWS::FLI1 functions as a pioneer factor in yeast is not supported by the small amount of data from a heterologous reporter experiment.

      OPTIONAL: The toxicity of wild-type EWS::FLI1 in Saccharomyces cerevisiae, as seen in most human cell types, is intriguing and clarifying the nature of this toxicity could provide a useful insight.

      The data and the methods are presented in such a way that they can be reproduced.

      The replicates and statistical analyses are generally adequate.

      Minor comments:

      Prior studies are referenced appropriately.

      The figures are generally clear and accurate. Some of the text is wordy and can be simplified to improve the readability.

      Significance

      The study addressed the consequence of EWS::FLI1 expression in a heterologous budding yeast system. Overall, the study is largely descriptive with little mechanistic insight, and the findings from the heterologous system do not seem to have much impact on the Ewing sarcoma field, though they might satisfy the curiosity of those in the field. The audience would be 'specialized,' limited to those working on the basic science of Ewing sarcoma.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Pérez-Rico et al. use dCas9 to activate and repress L1 expression at the subfamily level for younger mouse L1s to address their cis-regulatory potential.

      They choose to transcriptionally perturb a full-length L1 subfamily.

      This is an important topic, and I liked the approach of using CRISPR to perturb an entire subfamily of transposable elements. However, this study has yet to add any new findings or conclusions, and it can be improved by further analysis or experimentation. The cis-regulatory role of L1s has been previously demonstrated in mouse and human stem cell systems using multiple approaches, including CRISPR-mediated perturbations, CRISPR-based deletions, and 3D chromatin structure analysis. My main concern is that the analysis is very superficial: it lacks integration of other data types, such as Hi-C, to further support the role of L1s in enhancer activity, and it lacks further validation at candidate loci. One of the major limitations of CRISPR-based perturbations is that it can be difficult to establish the cis-regulatory role of the targeted elements (L1s), because synthetic tethering of a VPR or KRAB domain can indirectly spread its effect in cis from elements that may otherwise have no specific role in regulating genes in cis. The major comments that need to be addressed to strengthen the manuscript's conclusions are:

      1. One important control would be to target different or older L1s to show how this affects gene expression. Can the authors perturb older families?
      2. Although meta-analysis of L1s shows up/downregulation of L1MdTf, ChIP-seq for dCas9 is needed to demonstrate which L1s are directly targeted in the experiment and to analyse the data separately for dCas9-bound L1s vs. the rest of the L1s in the subfamily.
      3. L1s' transcriptional activation could be independent of their cis-regulatory activity. So, it is important to analyse the enhancer chromatin features of TEs: do they make 3D contacts with the nearby genes?
      4. How many replicates of RNA-seq were done?
      5. Are genes that respond to CRISPR activation in the same TAD? Do they loop with LINE1?
      6. Figures 4A and B show clear perturbation of host genes upon CRISPRi. However, the authors need to show L1s outside genes (not in introns), as dCas9 perturbation of intronic L1s is expected to affect the expression of the host genes.
      7. One key experiment that is missing is the genetic deletion of candidate intergenic L1s that are predicted to have a cis-regulatory function based on the RNA-seq data.
      8. The authors should present the RNA-seq data for genes 10 to 100 kb away from dCas9-KRAB- and dCas9-VPR-targeted L1s. It would be interesting to compare the KRAB vs. VPR effect with distance.
      9. The suggested role of L1s in neuronal gene expression is not strongly supported by the experiments or analysis. The authors end the results section with "Most of these genes are not expressed or have low expression levels in ESCs compared to other upregulated genes that are not part of these processes and pathways (Fig.S9E)". This result suggests that dCas9-VPR targeting to multiple L1s (how many are targeted is not stated) leads to the activation of many genes that are not normally expressed in mESCs, due either to the spreading of transcriptional activators or to mESCs spontaneously differentiating toward the neuronal lineage. The authors need to address these points.

      Minor comments

      Fig. 2E: Which housekeeping gene was used?

      Figures 4A and B: Show whether these L1s make looping interactions with the target genes or are in the same TAD.

      Referee Cross-commenting

      I agree with the other two reviewers' points.

      Significance

      This study can be improved by further analysis or experimentation. In its current version, the conceptual advance offered by the results and approach of the manuscript seems limited.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this study Pérez-Rico et al. set out to study the cis-regulatory effects of evolutionarily young L1 elements (specifically L1MdTf) in mouse embryonic stem cells. Using state-of-the-art systems, they manage to silence/activate the expression of these elements and study the resulting gene expression cascade.

      Major comments:

      1. A general assessment or profiling of the L1MdTf expression in WT mESC is missing. It is very challenging to set their results in perspective (especially those of the repression system) without a general overview of the baseline L1-expression in mESC. How many elements does the WT mESC express? What is the genomic distribution of these elements? And how does it relate to nearby gene expression?
      2. Off-target effects are very poorly documented and limited to in silico analysis. Documentation of changes in H3K4me3 and H3K9me3 upon activation/silencing would be beneficial to interpret the results. As it is now, it is very difficult to understand why the activation system is so much stronger than the silencing system.
      3. It would be beneficial to explain why different experimental strategies were used for silencing/activation.
      4. The study would benefit from a more detailed characterization of their systems' effect. E.g., it is not clear from Figure 3D-E how many of these repeats are part of the L1MdTf subfamily or other closely related subfamilies. This is relevant to assess the specificity of the systems.
      5. The RNA-seq analysis relies on reads that are too short to uniquely map most L1s in the mouse genome. This should be mentioned and discussed.
      6. A key issue that remains unresolved is the functional consequence of L1-silencing/activation in mESC. Does this change cell morphology, pluripotency, or differentiation potential? This is essential information.
      7. Fig 3B: Is this n=2? Is that why there are no statistical analyses? Same for Fig 5A-B: here a p-value is presented but it would be good to highlight the statistically significant results somehow.
      8. Are only L1MdTf elements plotted in Fig 3D and 3E? As the authors claim that the system is quite specific, it would be good to have a clear view of the repeats that are differentially expressed and are evolutionarily close to the L1MdTf family. The figure might also benefit from some further characterization on a family level (multi-mapping) rather than single elements (unique mapping or multi-mapping with reads that map to a single location, like here).
      9. The authors provide some evidence for the existence of a bidirectional promoter on L1MdTf elements. In what elements was this observed? Does it result in chimeric transcripts? And if so, are these genes in line with the enrichment found?

      Minor comments:

      1. The introduction would benefit from a more detailed description of the L1-content in the mouse genome including the presence of polymorphic elements.
      2. There is no characterization of the edited mESC. Stainings? Markers from bulk RNA seq analysis? Morphology assessment?

      Significance

      The manuscript is timely and in line with some similar and very recent studies suggesting a link between evolutionarily young L1s and neuronal-related genes in mice (e.g. Toda et al.). However, the manuscript is quite preliminary in the current form. The investigation of potential off-target effects of their system requires more in-depth investigations. In addition, the manuscript could be improved with further exploration into the functional consequences of L1-silencing/activation.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      The authors utilize engineered transcriptional effectors (CRISPRa or KRAB-Zinc Finger proteins) to activate or inhibit the expression of retrotransposons in mouse embryonic stem cells. They focus on a young L1 family in mouse, L1MdTf, which is particularly active. This study reveals extensive alterations in gene expression for surrounding genes, and it suggests that the dysregulated genes are enriched in repressive chromatin and preferentially associated with neuronal functions.

      Major comments

      The experiments are well-executed and controlled overall. However, to fully support the conclusions, a few aspects of the work could be strengthened:

      1. Engineered Effector Binding: The binding sites for the engineered effectors (CRISPRa or KRAB-Zinc Finger proteins) were predicted computationally, but not validated experimentally using techniques like ChIP-seq. Although mapping reads within L1MdTf might be challenging, ChIP-seq could still be informative. It would help define the ratio of on-target vs. off-target binding and elucidate potential off-target effects (binding outside L1MdTf repeats).
      2. Mechanism of action: The study somehow assumes that the effectors influence L1MdTf transcription, which then impacts the host transcriptome. However, it is also possible that the engineered effectors directly modulate nearby cis-regulatory elements (promoters or enhancers) due to their proximity to L1MdTf sequences. Further investigation is needed to differentiate between these possibilities.
      3. Statistics: The study lacks robust statistical testing in several figures (e.g., Fig. 1B,D; 2C,E; S2; S5C,D). Additionally, the null hypotheses and statistical models used are not explicitly defined. To strengthen the analysis, the authors could quantify the percentage of differentially expressed (DE) or allele-specifically expressed genes associated with targeted L1s compared to non-targeted L1s or other relevant control groups. Furthermore, neuronal genes are known to be longer than average, which can bias enrichment analyses like those performed using DAVID. This increased length makes them more likely to harbor L1 insertions and potentially be affected by the engineered effectors targeting these elements. It would be informative to see how the conclusions change if the null models account for gene size or only consider genes with full-length L1MdTf elements.
      4. Allele-specific expression: The analysis of allele-specific expression could be further strengthened by incorporating L1 heterozygosity data. Since the whole genome sequence of these cells is available, calling genotypes for L1 insertions using established pipelines like MELT or others would be a straightforward approach. This would allow the authors to ensure that allele-specific expression patterns are indeed associated with L1 insertional polymorphisms.
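
One way to make the gene-length concern in point 3 concrete is a length-matched permutation test, sketched below. This is only an illustration of the kind of null model the comment asks for, not a method from the manuscript: the function name `length_matched_pvalue`, the rank-based length bins, and all parameters are hypothetical, and the targets are assumed to be a subset of the background gene list.

```python
import random


def length_matched_pvalue(targets, background, gene_length, annotated,
                          n_bins=10, n_perm=1000, seed=0):
    """Empirical p-value for enrichment of `annotated` genes among `targets`,
    using control genes drawn from the same gene-length bin (hypothetical sketch)."""
    rng = random.Random(seed)

    # Assign every background gene to a rank-based length bin.
    ranked = sorted(background, key=lambda g: gene_length[g])
    bin_of = {g: i * n_bins // len(ranked) for i, g in enumerate(ranked)}
    members = {}
    for gene, b in bin_of.items():
        members.setdefault(b, []).append(gene)

    observed = sum(1 for g in targets if g in annotated)

    # For each permutation, replace every target by a random gene of similar length.
    hits = 0
    for _ in range(n_perm):
        drawn = [rng.choice(members[bin_of[g]]) for g in targets]
        if sum(1 for g in drawn if g in annotated) >= observed:
            hits += 1

    # Add-one correction so the empirical p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)
```

If the annotated category consists simply of the longest genes, length-matched controls reproduce the observed overlap and the p-value approaches 1, which is the situation the comment warns about.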

      Minor comments

      1. Western blot analysis could benefit from a more quantitative approach. Analyzing the intensity profiles of the ORF1p bands might help differentiate between potential ORF1p isoforms and pinpoint the specific changes (reduction or increase) in the top band. Additionally, an earlier explanation for the observed doublet band as a consequence of coexisting L1 families would be helpful for readers unfamiliar with mouse L1s.
      2. The chromHMM enrichment analysis reports fold enrichments (ratios of observed vs. expected associations with chromatin states). However, the magnitude of these enrichments seems to be modest in most cases. It would be informative to see the actual number of sequences falling into each category for a better understanding of the enrichment patterns.
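
The request in minor comment 2 amounts to reporting observed and expected counts next to each fold enrichment. A minimal sketch follows, assuming the expected count is the total number of sequences multiplied by the fraction of the background covered by each chromatin state. The function name and the state labels are hypothetical and are not taken from the manuscript.

```python
def fold_enrichment(observed_counts, background_fraction):
    """Return {state: (observed, expected, fold)} for each chromatin state.

    observed_counts: number of sequences assigned to each state.
    background_fraction: assumed fraction of the background in each state.
    """
    total = sum(observed_counts.values())
    result = {}
    for state, observed in observed_counts.items():
        expected = total * background_fraction[state]
        result[state] = (observed, expected, observed / expected)
    return result


# Hypothetical counts: a 1.5-fold enrichment here rests on only 30 sequences.
table = fold_enrichment({"state_A": 30, "state_B": 10},
                        {"state_A": 0.5, "state_B": 0.5})
```

Showing the raw counts alongside the ratio lets readers judge whether a modest fold change is supported by many sequences or by a handful.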

      Significance

      Strengths

      • This study contributes to a limited body of research exploring the consequences of the global activation or repression of entire transposable element families.
      • Notably, the authors achieve this manipulation in a relevant cell type (embryonic stem cells) where retrotransposition is known to occur, utilizing a retrotransposition-competent L1 family (L1MdTf).

      Limitations

      • While the study offers a relatively novel approach, the biological question addressed could be further refined. This would allow for more specific conclusions regarding the role/consequences of L1MdTf dysregulation.
      • Artificially targeting L1 sequences is conceptually similar to randomly targeting the genome for transcriptional effects, since many, possibly most, L1 elements may not be naturally expressed under any biological conditions.

      While the manuscript confirms the known potential of L1 sequences to act as cis-regulatory elements, it does not provide significant new insights into the mechanisms underlying their regulatory potential. Notably, a recent study by Tong et al. IJMS (2024) demonstrated the activation of young human L1 elements using CRISPRa. This work similarly confirmed the ability of young human L1 elements to form chimeric transcripts and influence gene expression, likely through promoter or enhancer activity. However, key limitations of both studies are that the strength of the engineered effector might influence the observed transcriptional effects, and that the targeted L1 elements might not be naturally expressed, potentially leading to an overestimation of the regulatory impact of L1 activation.

      The study focuses on the impact of L1MdTf on gene expression, but exploring the phenotypic consequences of manipulating its expression could significantly broaden its significance. This would provide valuable insights into the functional impact of L1 activity in a broader context, potentially attracting a wider audience. In this context, L1MdTf silencing in mESC cells which naturally express these elements could be more insightful than artificially over-activating them. Phenotypic analyses could encompass cell cycle progression, commitment to the primed pluripotent state, totipotency, differentiation potential, DNA damage response, or X chromosome inactivation. Identifying whether observed phenotypes arise from cis-effects (on neighboring genes) or trans-effects (mediated by L1 RNA or proteins) would be crucial for understanding the underlying mechanisms.

      My field of expertise lies in molecular and cellular biology and genomics, with a particular focus on transposable elements.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We sincerely thank all the reviewers for their thoughtful and constructive comments.

      In our revision, we have addressed the reviewers' specific criticisms with additional experiments and text edits as described below. We believe the constructive feedback from peer review helped us to significantly extend our mechanistic findings and strengthen the manuscript through revision.

      Point-by-point description of the revisions

      Reviewer #1:

      Summary:

      The study by Zatulovskiy et al. examined how cell size influences cell susceptibility to ferroptosis. The authors found a size dependence specifically for ferroptosis-inducing drug Era2, but not for other drugs. Using various human cell lines (HMEC, HT 1080, RPE 1), the authors generated populations of small and large G1 cells by FACS, CDK4/6 inhibition (palbociclib), or inducible cyclin D1 knockdown, and measured cell susceptibility to ferroptosis. Larger cells were more resistant than smaller cells. Mechanistically, larger cells showed reduced plasma membrane lipid peroxidation, higher glutathione concentrations, and changes in relevant cellular proteins levels, as analyzed using previously published data. Deleting ACSL4, which is involved in ferroptosis, partly eliminated the size dependence of ferroptosis. The work concludes that cell size is a key determinant of ferroptosis susceptibility.

      My major concerns about this work focus on whether many of the results reflect cell size or cell cycle effects, and whether the FACS-based size-scaling analyses have some misleading features to their design & presentation. If these concerns can be addressed with new experiments, then the conclusions of this paper are justified. If these concerns cannot be addressed, then the authors should more directly acknowledge the alternative hypothesis that cell cycle effects may explain many of their results.

      The experiments seem to be replicated sufficiently, and most conclusions rely on data from multiple cell lines. My minor comments focus on needs to provide statistics and method details, and on suggestions on how to improve text clarity, but these edits are easily done and don't require new experiments. Overall, this is an interesting study, and it should be published once the concerns below are addressed.

      Major comments:

      • In experiments reported in Fig 1 and 2A, the authors sort small and large cells in G1, plate them, and later start the drug treatments & cell monitoring. Are these cells actively cycling (progressing in the cell cycle), and how fast? The large cells are likely to enter S phase earlier than the small cells, so by the time that the authors start their drug treatments, they may be comparing cells in different cell cycle stages, which could influence drug sensitivity more than cell size (as the authors also suggest later in Fig 2). This needs to be controlled for. Furthermore, even if the cells remain in G1 after sorting until the drug treatments are started, the authors should address the fact that the drugs are present for a long time, thus targeting the cells in various cell cycle stages.

      We agree with the reviewer that the cell cycle stage could affect ferroptosis susceptibility and could be a confounding effect in asynchronous cells. One of us (Dixon) reported the cell cycle effects on ferroptosis previously, and we observe them in this manuscript too (Fig. 2B,C,E). We now state this more clearly both in the Results and in the Discussion sections, where we write:

      Line 159: "We note that non-arrested cells had a lower susceptibility to Era2-induced ferroptosis compared to cells that were arrested in G1 for 2-3 days, despite being smaller in size. This is likely due to the difference in the fraction of cells in different cell cycle phases between arrested and non-arrested conditions since cells in S/G2/M phases are known to be more resistant to ferroptosis than cells in G0/G1 phases (Rodencal et al, 2024; Kuganesan et al, 2023)"

      Line 533: "Cells in G1 phase of the cell cycle were reported to be more susceptible to ferroptosis (Rodencal et al, 2024; Kuganesan et al, 2023), which suggested that ferroptosis inducers could be used in combination with cancer drugs, like the CDK4/6 inhibitor palbociclib, that arrest cells in G1 phase of the cell cycle (Herrera-Abreu et al, 2024). However, while CDK4/6 inhibitors arrest cells in G1, they do not inhibit cell growth, such that the longer they are arrested, the larger the cells grow (Lanz et al, 2022; Crozier et al, 2023; Manohar et al, 2023). This results in a complex, non-monotonic ferroptotic response dynamics in cells treated with CDK4/6 inhibitors (Fig. 2B,E). Just following CDK4/6 inhibitor treatment, as more and more cells are arrested in G1 phase, cells become more sensitive to both RSL3- and erastin-induced ferroptosis (Kuganesan et al, 2023; Rodencal et al, 2024). However, the longer the cells are arrested, the larger they become, which further promotes their susceptibility to RSL3 (Fig. S1B) but reduces their susceptibility to Era2-induced ferroptosis (Fig. 2B). The fact that the cell cycle arrest and cell size increase have opposing effects on Era2-induced ferroptosis susceptibility could explain why different studies reported seemingly contradictory results, where sometimes an increased and sometimes a decreased or unchanged sensitivity to system xc- inhibitors was observed depending on the cell type, duration and type of cell cycle arrest (Lee et al, 2024; Kuganesan et al, 2023; Rodencal et al, 2024). Such complex interplay between the cell cycle and cell size effects on ferroptosis suggests that combination therapies utilizing CDK4/6 inhibitors and ferroptosis inducers would have to carefully choose a dosage schedule."

      Given the potentially confounding effects of the cell cycle in cycling cells sorted by size, we performed an additional experiment, in which RPE-1 cells were pre-treated with the CDK4/6 inhibitor palbociclib to synchronize them in G1 phase prior to treatment. These cells were then continuously exposed to palbociclib during the Era2 treatment (Fig. 2C-E). RPE-1 cells pre-treated with palbociclib for 2 and 4 days had the same cell cycle distribution, with 94% of cells arrested in G1, but differed in size. Cells treated with palbociclib for 4 days were significantly larger and more resistant to Era2, as can be seen in Figure 2C-E.

      Additionally, in the experiment shown in Fig. 5E,F, where we FACS-sorted WT and ACSL4 KO HMEC cells by cell size, and then measured Era2 susceptibility, we pre-treated the cells with palbociclib for 24 h to synchronize them in G1 prior to the sorting. We then cultured the cells in the presence of palbociclib during the Era2 treatment to avoid the cell cycle effects observed in Fig. 2. In this case, we still observe that larger cells are more resistant to Era2, consistent with our conclusion that cell size protects against Era2-induced ferroptosis.

      Reviewer #1: "Can the G1 arrest-driven changes in drug susceptibility (Fig 2 C-D) be attributed to cell size? Can the authors rescue the palbociclib treatment with rapamycin or other growth inhibitors that allow size to remain small during G1 arrest?"

      We attempted these experiments, but when we co-treated the cells with palbociclib and mTORC inhibitors, we observed variable results. This is likely because prolonged mTORC inhibition itself rewires cellular metabolism and reduces cell susceptibility to ferroptosis, as one of us (Dixon) found previously (Armenta et al. (2022), Ferroptosis inhibition by lysosome-dependent catabolism of extracellular protein. Cell Chemical Biology 29: 1588-1600.e7). Our results were consistent with this previous report and are now included in a new supporting figure panel (Fig. S3C).

      Thus, upon palbociclib+rapamycin co-treatment there seems to be a competition between cell-size-mediated and metabolism-mediated effects of mTORC inhibition on ferroptosis, which leads to variable outcomes.

      Reviewer #1: "In Fig 2E-F, is the cell cycle distribution of the samples influenced by CCND1 shRNA induction? Are the drug sensitivity effects due to cell size or cell cycle changes?"

      The CCND1 manipulation model is extensively characterized in our recent work cited in this manuscript (You et al. (2025), Cell size-dependent mRNA transcription drives proteome remodeling. 2025.10.30.685141 doi:10.1101/2025.10.30.685141). Indeed, CCND1 shRNA cells have a slightly elongated G1 phase due to a ~30% reduction in Cyclin D1 concentration: the G1 fraction changes from ~70% in wild-type to ~80% in CCND1 shRNA cells, which could potentially affect the ferroptosis susceptibility, but the additional results obtained on synchronized RPE-1 cells, described above (Fig. 2C-E), support the conclusion that the primary effect on Era2 sensitivity is due to cell size.

      Reviewer #1: "Can the authors address the meaningfulness of the FACS-based size-scaling results in cases where cell-to-cell variability is very large? For example, in Fig 4D&G, the results are so variable even in identically sized cells that the importance of the size-scaling pattern seems questionable."

      We do observe variability in fluorescent probe-based measurements of GSH and lipid oxidation, which could be due to biological (natural cell heterogeneity) and/or technical (low sensitivity of the probes) reasons. However, when we look at binned data and compare the mean values ± s.e.m. for each bin, we observe a robust and reproducible trend (black line with dark-grey shaded area), even though the SD is quite broad (lighter shaded area). We believe such trends are meaningful when describing cell death in probabilistic terms as we do. That is, the GSH measurement might not be precise enough to predict cell death for a given individual cell, but the statistical trend is clear and these measurements help predict cell death probabilities for cells of different sizes.

      Reviewer #1: "In Figs 4B-D, the cell size axis seems to have over 4-fold size variability, but when the authors show the analysis of this data (Figs 4E-G) the variability is only 2-fold. What was excluded and on what basis?"

      To address this point, we have now clarified in the Methods section how the data were processed and what data points we excluded from this analysis:

      Line 671: "For all binned flow cytometry data plots, the cells below the 2nd and above the 98th cell size percentiles were excluded to remove the extreme outliers. Then, the remaining data were binned by size and plotted as background-corrected average fluorescence intensity for each bin against the bin's average cell size. Bins with fewer than 200 cells were excluded from the analysis to reduce noise."

      Typically, such pre-processing reduces the size range, mostly from the large-cell end, because of the long right tail of the size distribution containing a few very large cells.
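
The pre-processing quoted from the Methods can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function name `bin_by_size`, the equal-width bins, the nearest-rank percentile convention, and the number of bins are all assumptions, since the response specifies only the 2nd/98th percentile trimming, background correction, per-bin averaging, and the 200-cell minimum.

```python
import math
import statistics


def bin_by_size(sizes, signals, background=0.0, n_bins=10,
                lo_pct=2.0, hi_pct=98.0, min_cells=200):
    """Return (mean_size, mean_signal, sem_signal, n_cells) for each retained bin."""
    # Percentile cutoffs by rank (assumed convention).
    ordered = sorted(sizes)
    n = len(ordered)
    lo = ordered[math.floor(lo_pct / 100.0 * (n - 1))]
    hi = ordered[math.ceil(hi_pct / 100.0 * (n - 1))]

    # Drop extreme sizes and subtract the background from the fluorescence.
    kept = [(s, f - background) for s, f in zip(sizes, signals) if lo <= s <= hi]

    # Equal-width size bins over the trimmed range (assumed binning scheme).
    width = (hi - lo) / n_bins
    bins = [[] for _ in range(n_bins)]
    for s, f in kept:
        idx = min(int((s - lo) / width), n_bins - 1) if width > 0 else 0
        bins[idx].append((s, f))

    rows = []
    for cells in bins:
        # Skip sparse bins; at least two cells are needed for a standard deviation.
        if len(cells) < max(min_cells, 2):
            continue
        bin_sizes = [s for s, _ in cells]
        bin_signal = [f for _, f in cells]
        sem = statistics.stdev(bin_signal) / math.sqrt(len(cells))
        rows.append((statistics.mean(bin_sizes), statistics.mean(bin_signal),
                     sem, len(cells)))
    return rows
```

Because the size distribution has a long right tail, the upper percentile cutoff and the minimum-cell filter both remove mainly the largest cells, which is consistent with the narrower size range in the binned plots.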

      Reviewer #1: "Based on the methods section & figure legends of Fig 4B-I, the RPE cells were not pre-sorted to include only G1 cells, nor did the assay account for cell cycle differences. How can these data be used to explain results from earlier figures, where analyses were exclusively focused on size differences in G1?"

      This is a valid point: Cells in the GSH measurement experiment were not gated by Hoechst signal for G1 phase because the channel normally used for Hoechst staining was in this case occupied by the MCB probe. However, given the data in Fig. 4A,B showing that the GSH production machinery is superscaling when measured specifically in G1-phase cells, we believe the flow cytometry data in Fig. 4C-J showing GSH concentration increasing with cell size across the whole cell cycle is very likely true for G1 cells as well.

      Reviewer #1: "Minor comments:

      I recommend clarifying in the early introduction that all size changes discussed are in the absence of DNA content increase."

      We have now clarified this in the introduction (Line 41 and Line 81).

      Reviewer #1: "The introduction seems to cite primary research and review paper in the same sentences, which is a bit misleading as the reviews don't seem to add new evidence."

      We have removed review citations where they did not provide additional context.

      Reviewer #1: "*OPTIONAL* In the second introduction paragraph, consider the classification/description of the three different mechanisms. Currently, it seems that these mechanisms are not independent of each other, and the details provided about each mechanism are inconsistent."

      We have now modified this paragraph to make the description more consistent.

      Reviewer #1: "Please provide statistics for the IC50 values reported based on Fig 1C. Were small and large cells statistically different? Are the IC50 values reported as +/- standard deviation or some other metric?"

      This has now been clarified in the text as follows:

      "For example, at the 72 h time point, the Era2 IC50 was 28 ± 11 µM (mean ± SD) for large cells versus 2.0 ± 1.4 µM for small cells (Student's t-test: p = 0.039) (Fig. 1C)."

      Reviewer #1: "*OPTIONAL* Providing more insight into why Era2 and RSL3 treatments yield more opposite responses would be of great interest to the field."

      We agree this is an important point that should be discussed in more detail. In the field of ferroptosis, context-dependent (i.e., cell type-specific) effects are common and multiple groups including our own (Dixon) have published extensively on genes and mechanisms that can lead to differences between erastin2 and RSL3 sensitivity. For example, there are studies showing that the mTOR pathway or the p53 pathway can either prevent or promote ferroptosis, depending on the cell type and/or other currently unknown variables. To address more specifically the differences between Era2 and RSL3 in the context of our observed cell-size-dependent response, we have now added more data and discussion. In the Results section we added panel 4B and the following text:

      Line 359: "While the upregulation of GSH biosynthesis may promote the resistance of larger cells to ferroptosis, such an upregulation alone cannot explain why larger cells become more resistant to ferroptosis induced by the cystine import inhibitor Era2, but not, for example, by the GPX4 inhibitor RSL3 (Chan et al, 2025) (Figs. 2B, S1B). We found previously that upon mTORC1 inhibition cells can evade cystine deprivation-induced ferroptosis by uptake and catabolism of cysteine-rich extracellular proteins, mostly albumin (Armenta et al, 2022) (Fig. S3C). This process involves albumin degradation in lysosomes, predominantly by cathepsin B (CatB), and subsequent export of cystine from lysosomes to fuel the synthesis of glutathione. Large cells undergo proteome rearrangements similar to those occurring upon mTORC1 inhibition (Zatulovskiy et al, 2022). This suggests that large cells may upregulate CatB expression to bypass the Era2-induced cystine import inhibition via system xc-. To test this hypothesis, we used flow cytometry to measure how the expression of cathepsin B and the system xc- cystine/glutamate transporter SLC7A11 (xCT) scales with cell size (Fig. 4B). We found that SLC7A11 concentration modestly decreases, while CatB concentration significantly increases with cell size (Fig. 4B). This shift in the ratio between SLC7A11 and CatB supports the hypothesis that larger cells may rely less on cystine import via system xc- and thus become more resistant to system xc- inhibition by Era2."

      Additionally, in the Discussion we added the following:

      Line 578: "We show that large cells may become resistant specifically to Era2 but not RSL3 through the upregulation of lysosomal function, particularly cathepsin B expression, which enables the uptake and catabolism of cysteine-rich extracellular proteins. A size-dependent shift in the ratio between SLC7A11 and cathepsin B makes large cells less dependent on cystine import via system xc-, and thus, more resistant to Era2. In addition to this, it was reported that RSL3 can induce ferroptosis independently of GPX4 and may target other selenoproteins (DeAngelo et al, 2025; Cheff et al, 2023), which could also contribute to the difference in size-dependent responses to RSL3 and Era2."

      Reviewer #1: "Is the BODIPY-C11 labeling specific to plasma membrane, as suggested by the writing of the authors, or do the results shown integrate signals over all cell membranes?"

      We thank the reviewer for pointing this out. BODIPY-C11 581/591 stains many membranes in the cell, not just the plasma membrane. We have changed the wording in the manuscript to reflect this.

      Reviewer #1: "How exactly is gating done for the flow cytometry samples? Especially when analyzing size-scaling, the results are likely to be sensitive to outliers, such as those seen in Fig 4C (a subpopulation of very low CFSE stained cells). Can the authors clarify their methods and/or display supplementary figures with gating examples?"

      We have now specified our gating strategy in the Methods section (Line 663) and added a corresponding Supplementary Figure S5:

      "Single cells were gated based on FSC-A vs SSC-A, then FSC-A vs FSC-H, then SSC-A vs SSC-W plots. From this population of single cells, G1 cells were selected using Hoechst-A vs FSC-A plot for subsequent scaling analysis"

      Reviewer #1: "In Fig 4, total protein staining was used as a control, whereas Fig 5B b-actin was used as a control. Why did the authors rely on different controls approaches for essentially the same measurements? Are these controls comparable?"

      In our flow cytometry experiments, we consistently use live-cell total protein stain (CFSE) for live cells, and anti-Tubulin immunofluorescent staining for fixed cells, both of which scale in proportion to cell volume and act as a read-out for total cellular protein content (Lanz and Zatulovskiy et al., Mol Cell 2022; Berenson et al. MBoC 2019), which we use to calculate concentrations of other cellular components (analogous to loading controls). In Fig. 5B, beta-Actin is used as a reference - a protein whose concentration does not change with cell size, as opposed to ACSL4 whose concentration decreases with cell size. In this plot, both ACSL4 and beta-Actin amounts were normalized to alpha-Tubulin, which is analogous to a concentration calculation using loading control. This is now explained in more detail in the Figure legend.
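
The normalization described above can be illustrated with a short sketch: divide each protein's signal by the total-protein reference (here alpha-Tubulin) to obtain a concentration proxy, then summarize how that proxy changes with size as a slope in log-log space. The log-log slope is an assumption chosen for illustration, since the response does not state how the scaling was quantified, and the function name is hypothetical.

```python
import math


def concentration_slope(reference, amount):
    """Least-squares slope of log2(amount / reference) against log2(reference).

    reference: per-bin signal of the total-protein proxy (e.g. alpha-Tubulin).
    amount: per-bin signal of the protein of interest.
    A slope near 0 means constant concentration; a negative slope means the
    concentration falls as cells get larger.
    """
    x = [math.log2(r) for r in reference]
    y = [math.log2(a / r) for a, r in zip(amount, reference)]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    variance = sum((xi - mean_x) ** 2 for xi in x)
    return covariance / variance
```

Under this convention, a reference-like protein such as beta-Actin would give a slope close to 0, while a protein whose concentration decreases with size, as reported for ACSL4, would give a negative slope.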

      Reviewer #2:

      "Zatulovskiy et al. demonstrate that cell size modulates susceptibility to ferroptosis, a form of iron-dependent cell death driven by lipid peroxidation. Using human cell lines (HMEC, HT-1080, RPE-1), the authors examined cell size through FACS sorting, CDK4/6 inhibition and inducible cyclin D1 knockdown. They found that larger cells are more resistant to ferroptosis induced by system xc⁻ inhibition (erastin2), but more sensitive to GPX4 inhibition (RSL3), highlighting pathway-specific size dependencies.

      Mechanistically, larger cells exhibited:

      • Higher glutathione levels, supporting lipid peroxide detoxification
      • Increased ferritin expression, promoting iron sequestration
      • Lower ACSL4 levels, reducing incorporation of peroxidation-prone lipids

      These findings were supported by high-throughput microscopy, flow cytometry (BODIPY-C11 lipid peroxidation assays), and proteomic analyses. The study concludes that cell size influences proteome composition and metabolic capacity, thereby shaping cell death decisions, an insight with implications for aging, cancer, and ferroptosis-based therapies.

      Major Comments

      1. Direct evaluation of SLC7A11 abundance and function is needed

      The opposite size-dependent effects of erastin2 and RSL3 strongly suggest a role for SLC7A11/system xc⁻ activity in size-dependent ferroptosis resistance. However, SLC7A11 levels were not quantified due to insufficient peptide detection in the proteomic data.

      o Direct measurement of SLC7A11 protein levels (immunoblotting or flow cytometry) in small vs large cells would test whether its expression scales with size.

      o Functional perturbation (siRNA/CRISPR knockdown) followed by erastin2 treatment would provide mechanistic validation.

      o Use of additional SLC7A11 inhibitors (e.g., sulfasalazine, sorafenib) could further test whether the size resistance phenotype is xc⁻-specific."

      We agree that the difference in size-dependent responses to RSL3 and Era2 is an important point that needs further investigation and discussion, as other reviewers also pointed out. To address more specifically the differences between Era2 and RSL3 in the context of cell-size-dependent response, we have now added more data and discussion. In the Results section we added panel 4B measuring SLC7A11 and Cathepsin B scaling with cell size and the following text:

      Line 359: "While the upregulation of GSH biosynthesis may promote the resistance of larger cells to ferroptosis, such an upregulation alone cannot explain why larger cells become more resistant to ferroptosis induced by the cystine import inhibitor Era2, but not, for example, by the GPX4 inhibitor RSL3 (Chan et al, 2025) (Figs. 2B, S1B). We found previously that upon mTORC1 inhibition cells can evade cystine deprivation-induced ferroptosis by uptake and catabolism of cysteine-rich extracellular proteins, mostly albumin (Armenta et al, 2022) (Fig. S3C). This process involves albumin degradation in lysosomes, predominantly by cathepsin B (CatB), and subsequent export of cystine from lysosomes to fuel the synthesis of glutathione. Large cells undergo proteome rearrangements similar to those occurring upon mTORC1 inhibition (Zatulovskiy et al, 2022). This suggests that large cells may upregulate CatB expression to bypass the Era2-induced cystine import inhibition via system xc-. To test this hypothesis, we used flow cytometry to measure how the expression of cathepsin B and the system xc- cystine/glutamate transporter SLC7A11 (xCT) scales with cell size (Fig. 4B). We found that SLC7A11 concentration modestly decreases, while CatB concentration significantly increases with cell size (Fig. 4B). This shift in the ratio between SLC7A11 and CatB supports the hypothesis that larger cells may rely less on cystine import via system xc- and thus become more resistant to system xc- inhibition by Era2."

      Additionally, in the Discussion we added the following:

      Line 578: "We show that large cells may become resistant specifically to Era2 but not RSL3 through the upregulation of lysosomal function, particularly cathepsin B expression, which enables the uptake and catabolism of cysteine-rich extracellular proteins. A size-dependent shift in the ratio between SLC7A11 and cathepsin B makes large cells less dependent on cystine import via system xc-, and thus, more resistant to Era2. In addition to this, it was reported that RSL3 can induce ferroptosis independently of GPX4 and may target other selenoproteins (DeAngelo et al, 2025; Cheff et al, 2023), which could also contribute to the difference in size-dependent responses to RSL3 and Era2."

      Reviewer #2: "2. Functional tests of ferritin contribution to resistance are needed

      Although elevated ferritin (FTH1/FTL) levels in larger cells represent a strong correlational signal, definitive experimental evidence establishing causality is currently lacking.

      o Measuring the labile iron pool directly in size-stratified populations would strengthen the link.

      o Knockdown of FTH1 or FTL could reveal whether ferritin upregulation is necessary for the resistance of large cells to ferroptosis."

      We thank the reviewer for raising this point. We have now completed additional experiments, as suggested by the reviewer, and found that iron chelation is unlikely to mediate the size-dependent response to Era2. We have modified the manuscript accordingly and added the following data and discussion to address this point:

      Line 296: "The observed increase in ferritin concentration with cell size could therefore lead to additional Fe2+ ion chelation, which in turn would protect large cells from iron-dependent lipid peroxidation and ferroptosis. However, when we measured the concentration of labile intracellular Fe2+ using a fluorescent probe FerroOrange (Hirayama et al, 2020), we did not observe any size-dependent decrease in labile iron concentration (Fig. S2A). Previous work suggests a link between increased sequestration of ferrous iron in lysosomes and resistance to ferroptosis. It was reported that senescent cells, which are also large (Fig. S3A,B), gain resistance to ferroptosis through lysosomal alkalinization and sequestration of ferrous iron in lysosomes (Loo et al, 2025). We therefore tested whether the superscaling of lysosomes observed in large cells (Lanz et al, 2022; You et al, 2025) promotes Era2 resistance through lysosomal iron sequestration. To do this, we stained the cells with the lysosomal iron detection probe Lyso-FerroRed (Saimoto et al, 2025) and measured its scaling using flow cytometry (Fig. S2B). We observed that the amount of Lyso-FerroRed, and therefore, the amount of lysosomal iron, scaled in direct proportion to cell size, just like the total cellular protein content (Fig. S2B). These results indicate that iron chelation by ferritin and its sequestration in lysosomes are unlikely to play a crucial role in size-dependent decrease in Era2 sensitivity."

      Reviewer #2: "3. Relevance to senescence should be addressed experimentally or explicitly discussed

      Given that senescent cells are enlarged and accumulate in aged and tumour tissues, testing senescent models for erastin2 resistance would greatly strengthen the physiological significance."

      We agree that an increase in cell size contributing to the resistance of senescent cells to ferroptosis is intriguing. We have now added a Supplementary Figure S3 and discussion of this point in the manuscript as follows:

      Discussion line 552: "our data suggest that previously reported resistance of senescent cells to ferroptosis can at least partially be due to the increased cell size, a well-established hallmark of senescence."

      Reviewer #2: "Minor Comments

      1. Mechanistic nuance regarding RSL3 should be included. RSL3 has been reported to induce ferroptosis independently of GPX4 (PMID: 37087975, PMID: 40392234) and may target other selenoproteins such as TXNRD1. This nuance would help explain the observed divergence between RSL3 and erastin2 sensitivity across sizes."

      We have now added this in the Discussion as suggested by the reviewer (line 583):

      "In addition to this, it was reported that RSL3 can induce ferroptosis independently of GPX4 and may target other selenoproteins (DeAngelo et al, 2025; Cheff et al, 2023), which could also contribute to the difference in size-dependent responses to RSL3 and Era2."

      Reviewer #2: "2. Dynamic range of BODIPY-C11 assays needs commentary

      Despite high erastin2 doses, the oxidized BODIPY signal remains close to DMSO levels. The authors should comment on whether this reflects high GSH buffering capacity, probe limitations, or other factors."

      We believe there are both technical (narrow dynamic range of the probe) and biological reasons for the relatively small (2-3 fold) difference in Oxidized-to-Non-oxidized BODIPY-C11 ratios between DMSO and Era2-treated cells. The biological reason is that the cells continue producing GSH until they fully deplete the cystine pool, which happens ~20-24 h after Era2 addition. Once the cystine pool is depleted, the cells very rapidly deplete GSH and initiate cell death. Therefore, there is only a short time window where cells are strongly depleted of GSH before dying. We see this small fraction of cells with a high Oxidized BODIPY-C11 signal in our flow cytometry experiments and in previous microscopy analysis of BODIPY-C11 (Murray et al., Protocol for detection of ferroptosis in cultured cells. STAR Protoc. 2023), but at our chosen time point (20h Era2) most cells are not as bright because we aimed to analyze the population before the onset of widespread cell death.

      Reviewer #2: "3. Western blot for shCycD1 depletion should be included

      CycD1 depletion usually causes cells to stop proliferating, which is not the case here. Therefore, depletion must be partial. The level of depletion should be shown by immunoblotting."

      The CCND1 manipulation model is extensively characterized in our recent work cited in this manuscript (You et al. (2025), Cell size-dependent mRNA transcription drives proteome remodeling. 2025.10.30.685141 doi:10.1101/2025.10.30.685141). CCND1 shRNA cells do not fully arrest in G0/G1 because the concentration of Cyclin D1 protein in this system is only partially decreased, as the reviewer noted. As a result, the cells have a slightly elongated G1 phase due to a ~30% reduction in Cyclin D1 concentration, but continue to proliferate. The G1 fraction changes from ~70% in wild-type to ~80% in CCND1 shRNA cells.

      Reviewer #3:

      "In this manuscript, Zatulovskiy and colleagues elaborate on their previous work describing cell size-dependent changes in the proteome by investigating whether these changes can be correlated with differences in cell physiology. Using a cleverly-designed high throughput screen, they searched for compounds that differently-sized cells display differential sensitivity towards. Their primary hit, Era2, is involved in the ferroptosis pathway and serves as the starting point for a detailed study of how excess cell size protects cells from ferroptosis-induced cell death via: 1) lower concentrations of ACSL4 (which produces peroxidation-prone PUFAs), 2) increased ferritin concentrations, and 3) increased GSH concentrations.

      Overall, the experiments in this manuscript are well-designed and interpreted. It is an extremely well-written manuscript with a clear trajectory of logic. I have only a few major concerns that should be addressed before publication:"

      We thank Reviewer #3 for their careful reading of the manuscript and for the clear summary of our study and its central findings. We appreciate their positive assessment of the experimental design, interpretation, and overall clarity of the writing and logical flow. We are also grateful for their constructive feedback and take their major concerns seriously; we have addressed each point in detail below.

      Reviewer #3: "Major concerns:

      1) In Figure 3E, the authors gate their flow cytometry data using SYTOX so that they are only analyzing live cells. Based on their gating scheme, it seems like there are really a lot of dead cells. Presumably the cells that died were the most sensitive to Era2, so it seems an oversight to discard these cells. Of course, it is not appropriate to analyze dead cells, but this could potentially be solved by using a shorter treatment duration than 24 hours wherein fewer cells die."

      This is a good point. To address it, we have now replaced this panel with a time point where most cells are still alive (20 h, 0.2 µM Era2), as suggested by the reviewer (Fig. 3E,F). This did not change the conclusion that BODIPY-C11 oxidation decreases with cell size.

      Reviewer #3: "2) In Figure 5, are the small, medium, and large bins for ACSL4 KO cells the same as for WT cells? If the ACSL4 KO cells are just bigger to begin with, this could explain why the "small" bin has greater cell survival than the WT small bin. Moreover, is the overlap between the three bins the same in the WT and KO cells?"

      This is an important point. We have now added Supplementary Figure S4B to show the relative sizes of small, medium, and large WT and ACSL4 KO HMEC cells. As seen from this graph, the ACSL4 KO cells are not bigger than WT cells. Importantly, the fold-range between the small and large FACS-sorted cells is similar in both (~1.9 to 2-fold).

      Reviewer #3: "3) Loo, et al. Nat Comms 2025 similarly found that senescent cells (which are enlarged) are resistant to ferroptosis using the same inhibitor as the authors. In contrast to the authors, they show that this is due to lysosomal alkalinization and sequestration of ferrous iron in lysosomes. Given that Lanz et al. 2022 found that lysosomal components super-scale with cell size, it seems like this would be an important hypothesis to address. Free lysosomal iron can be easily measured with the LysoRhoNox stain. Loo et al. was able to restore ferroptosis sensitivity in senescent cells using the V-ATPase activator EN6, so it would be important for the authors to address whether this (or similar) treatment would have the same effect in enlarged cells."

      This is an excellent point. We have now performed this experiment and added it to the manuscript, as suggested by the reviewer. Based on the Lyso-FerroRed staining (another brand name for the LysoRhoNox probe), we do not see an increase in lysosomal iron sequestration in large cells (Fig. S2B) - see the graph and the corresponding discussion below:

      Line 301: "Previous work suggests a link between increased sequestration of ferrous iron in lysosomes and resistance to ferroptosis. It was reported that senescent cells, which are also large (Fig. S3A,B), gain resistance to ferroptosis through lysosomal alkalinization and sequestration of ferrous iron in lysosomes (Loo et al, 2025). We therefore tested whether the superscaling of lysosomes observed in large cells (Lanz et al, 2022; You et al, 2025) promotes Era2 resistance through lysosomal iron sequestration. To do this, we stained the cells with the lysosomal iron detection probe Lyso-FerroRed (Saimoto et al, 2025) and measured its scaling using flow cytometry (Fig. S2B). We observed that the amount of Lyso-FerroRed, and therefore, the amount of lysosomal iron, scaled in direct proportion to cell size, just like the total cellular protein content (Fig. S2B). These results indicate that iron chelation by ferritin and its sequestration in lysosomes are unlikely to play a crucial role in size-dependent decrease in Era2 sensitivity."

      Reviewer #3: "Minor concerns:

      1) It would be helpful if this manuscript were re-submitted with line numbers to more easily reference the text."

      We have added line numbers for convenience.

      Reviewer #3: "2) In Figure 5A and other figures that reproduce data from Lanz et al. 2022, it would be helpful to have a summary curve for the overall abundance of each protein rather than only the individual peptide curves. These plots (particularly Figure 5A) are difficult to interpret since some peptides were presumably more abundant / measured with higher confidence than others."

      We have added the average ACSL4 protein slope line to Fig. 5A.

      Reviewer #3: "3) In Figure 5, the authors show the validation of the ACSL4 KO HT-1080 cell line but not HMEC, even though both are used in this figure. It would be useful to show both. Additionally, the authors switch back and forth between the two cell lines for this figure, and it is not clear why."

      We have added the HMEC ACSL4 KO validation Western blot in Fig. S4A (see below).

      For the BODIPY oxidation experiment (Fig. 5D), we used HT-1080 instead of HMEC because HT-1080 cells are sensitive to lower concentrations of Era2. This allowed us to better optimize the Era2 concentrations and treatment durations, so that BODIPY oxidation could be measured at a time point when most cells are still alive but show a pronounced oxidized BODIPY signal.

      Reviewer #3: "4) In Figure 5B, the authors use antibody-based staining of ACSL4 and flow cytometry to correlate a loss of ACSL4 expression with increased cell size, validating the proteomics data in Figure 5A. This does not seem like a good way to do this. Firstly, fixing cells with formaldehyde alters their size (is this proportional across differently sized cells? It's impossible to know), which makes it inappropriate to use SSC as a proxy for size in this particular situation. Secondly, the normalization scheme here doesn't make sense. If actin was used as a reference protein, why was tubulin used to normalize ACSL4 abundance? Overall, this seems like a very round-about experiment that could have just been addressed by doing a simple western blot with the four size bins sorted from live cells (as it was in the proteomics). If the issue is that ACSL4 is not detectable by western in the HMEC cells, another solution would be plating the live, sorted bins on coverslips and measuring by IF (or using the HT-1080 cells)."

      We prefer IF flow cytometry to Western blotting for protein scaling analysis because it is more quantitative and provides cell size and protein content information for each individual cell. While in principle, different-sized cells might change their size differently during fixation, the cells that were larger or smaller prior to the fixation remain larger or smaller after fixation as well. Therefore, the SSC measurement after fixation still provides reliable information on size ranking, even if SSC does not perfectly linearly scale with cell volume. We do not use the SSC information to calculate protein concentrations here. Instead, we divide the amount of our protein of interest in the cell by the amount of constitutively-expressed Tubulin, which acts as an analogue of a loading control in this experiment. In Fig. 5B, both ACSL4 and Actin were normalized to Tubulin to estimate their concentrations. Actin is used just as a reference protein to show how the concentration of a perfectly scaling protein remains constant across cell size, as opposed to the sub-scaling ACSL4. Tubulin in this case was used as a proxy for total cellular protein content, which scales linearly in proportion to cell volume. This approach for determining the scaling behaviors of different proteins was previously validated in Lanz et al., Mol Cell 2022.

      Reviewer #3: "5) In Figure 5E/5F, the authors pre-arrest the cells in G1 with palbociclib before size-sorting them. The pre-arrest is not done in other experiments using this cell line for size-sorting, so it would be important for the authors to comment on why this was done for this experiment but not others."

      As we found in Fig. 2B-E, the cell cycle has confounding effects on size-dependent ferroptosis susceptibility measurements (as discussed in detail in our response to the first major point of Reviewer #1 above). Briefly, to avoid these confounding effects and isolate the effects of cell size from the effects of the cell cycle, we pre-synchronized the cells with 24 h treatment with palbociclib in Fig. 5E,F. This is now better clarified in the text, as follows:

      Line 456: "In this experiment, we synchronized cells in G1 phase using palbociclib prior to cell sorting and also incubated the sorted cells in the presence of palbociclib during Era2 treatment to isolate cell size effects from the previously observed confounding effects of the cell cycle on ferroptosis (Fig. 2B,E)."

      Reviewer #3: "6) Conceptually, it is difficult for me to understand why large cell size sensitizes cells to GPX4 inhibition but confers resistance to Era2 treatment. Particularly given the pathway described in Figure 3A, I am having trouble understanding why these would convey such opposing phenotypes. Shouldn't the extra ferritin in the bigger cells also help them cope with GPX4 inhibition if, as the authors state in the discussion, the increased sensitivity to the GPX4 inhibitor is reported to be mediated by (among other things) iron accumulation? A deeper discussion of this seeming-incongruity would be helpful for contextualizing the broader role of cell size in determining ferroptosis sensitivity."

      We agree this is an important point, which was also raised by the other reviewers. Context-dependent (i.e., cell type-specific) effects are common in the ferroptosis field, and multiple groups, including our own (Dixon), have published extensively on genes and mechanisms that can lead to differences between erastin2 and RSL3 responses. For example, studies show that the mTOR pathway and the p53 pathway can each either prevent or promote ferroptosis, depending on the cell type or some other hidden variable.

      To better address the differences between Era2 and RSL3 in the context of the cell-size-dependent response, we have now added more data and discussion. In the Results section we added panel 4B and the following text:

      Line 359: "While the upregulation of GSH biosynthesis may promote the resistance of larger cells to ferroptosis, such an upregulation alone cannot explain why larger cells become more resistant to ferroptosis induced by the cystine import inhibitor Era2, but not, for example, by the GPX4 inhibitor RSL3 (Chan et al, 2025) (Figs. 2B, S1B). We found previously that upon mTORC1 inhibition cells can evade cystine deprivation-induced ferroptosis by uptake and catabolism of cysteine-rich extracellular proteins, mostly albumin (Armenta et al, 2022) (Fig. S3C). This process involves albumin degradation in lysosomes, predominantly by cathepsin B (CatB), and subsequent export of cystine from lysosomes to fuel the synthesis of glutathione. Large cells undergo proteome rearrangements similar to those occurring upon mTORC1 inhibition (Zatulovskiy et al, 2022). This suggests that large cells may upregulate CatB expression to bypass the Era2-induced cystine import inhibition via system xc-. To test this hypothesis, we used flow cytometry to measure how the expression of cathepsin B and the system xc- cystine/glutamate transporter SLC7A11 (xCT) scales with cell size (Fig. 4B). We found that SLC7A11 concentration modestly decreases, while CatB concentration significantly increases with cell size (Fig. 4B). This shift in the ratio between SLC7A11 and CatB supports the hypothesis that larger cells may rely less on cystine import via system xc- and thus become more resistant to system xc- inhibition by Era2."

      Figure 4. (B) Flow-cytometry-based measurement of cystine/glutamate transporter SLC7A11 (xCT) and cathepsin B (CatB) concentrations in G1-phase RPE-1 cells demonstrates a modest decrease in SLC7A11 and a significant increase in Cathepsin B concentrations with cell size. To calculate the concentrations of SLC7A11 and CatB, their amounts were measured with flow cytometry using immunofluorescence and normalized to the amounts of α-Tubulin. The data were binned by cell size, and mean values for each bin were plotted against normalized cell size (solid blue line for SLC7A11 and red line for CatB). Shaded areas denote the s.e.m. for each bin.
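      The normalization and binning described in this legend can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function name `binned_concentration`, the use of equal-count size bins, the median-based size normalization, and the synthetic cells are all hypothetical choices made for the example. It shows why a protein that scales with cell size gives a flat Tubulin-normalized curve, while a sub-scaling protein gives a declining one.

```python
import math
import random
import statistics


def binned_concentration(size, protein, tubulin, n_bins=5):
    """Bin cells by size; return (mean normalized size, mean ratio, s.e.m.) per bin.

    Concentration is approximated per cell as protein / tubulin.
    Bins hold equal numbers of cells (an assumption of this sketch).
    """
    median_size = statistics.median(size)
    # Sort cells by normalized size and pair each with its Tubulin-normalized signal.
    cells = sorted((s / median_size, p / t) for s, p, t in zip(size, protein, tubulin))
    per_bin = len(cells) // n_bins
    out = []
    for i in range(n_bins):
        start = i * per_bin
        # The last bin absorbs any remainder cells.
        chunk = cells[start:] if i == n_bins - 1 else cells[start:start + per_bin]
        sizes = [c[0] for c in chunk]
        ratios = [c[1] for c in chunk]
        sem = statistics.stdev(ratios) / math.sqrt(len(ratios))
        out.append((statistics.mean(sizes), statistics.mean(ratios), sem))
    return out


# Synthetic cells for illustration only (not data from the study).
rng = random.Random(0)
size = [rng.uniform(1.0, 3.0) for _ in range(2000)]
tubulin = [s * rng.gauss(1.0, 0.05) for s in size]            # scales with size
scaling = [s * rng.gauss(1.0, 0.05) for s in size]            # scales with size
subscaling = [s ** 0.5 * rng.gauss(1.0, 0.05) for s in size]  # grows slower than size

flat = binned_concentration(size, scaling, tubulin)
falling = binned_concentration(size, subscaling, tubulin)
for (x, m1, e1), (_, m2, e2) in zip(flat, falling):
    print(f"size {x:.2f}: scaling {m1:.2f} +/- {e1:.3f}, sub-scaling {m2:.2f} +/- {e2:.3f}")
```

      In this sketch the size measurement is used only to order and group cells, while the concentration estimate comes entirely from the ratio to Tubulin, which mirrors the reasoning given in the response to minor concern 4.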

      Additionally, in the Discussion we added the following:

      Line 578: "We show that large cells may become resistant specifically to Era2 but not RSL3 through the upregulation of lysosomal function, particularly cathepsin B expression, which enables the uptake and catabolism of cysteine-rich extracellular proteins. A size-dependent shift in the ratio between SLC7A11 and cathepsin B makes large cells less dependent on cystine import via system xc-, and thus, more resistant to Era2. In addition to this, it was reported that RSL3 can induce ferroptosis independently of GPX4 and may target other selenoproteins (DeAngelo et al, 2025; Cheff et al, 2023), which could also contribute to the difference in size-dependent responses to RSL3 and Era2."

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In this manuscript, Zatulovskiy and colleagues elaborate on their previous work describing cell size-dependent changes in the proteome by investigating whether these changes can be correlated with differences in cell physiology. Using a cleverly-designed high throughput screen, they searched for compounds that differently-sized cells display differential sensitivity towards. Their primary hit, Era2, is involved in the ferroptosis pathway and serves as the starting point for a detailed study of how excess cell size protects cells from ferroptosis-induced cell death via: 1) lower concentrations of ACSL4 (which produces peroxidation-prone PUFAs), 2) increased ferritin concentrations, and 3) increased GSH concentrations.

      Overall, the experiments in this manuscript are well-designed and interpreted. It is an extremely well-written manuscript with a clear trajectory of logic. I have only a few major concerns that should be addressed before publication:

      Major concerns:

      1) In Figure 3E, the authors gate their flow cytometry data using SYTOX so that they are only analyzing live cells. Based on their gating scheme, it seems like there are really a lot of dead cells. Presumably the cells that died were the most sensitive to Era2, so it seems an oversight to discard these cells. Of course, it is not appropriate to analyze dead cells, but this could potentially be solved by using a shorter treatment duration than 24 hours wherein fewer cells die.

      2) In Figure 5, are the small, medium, and large bins for ACSL4 KO cells the same as for WT cells? If the ACSL4 KO cells are just bigger to begin with, this could explain why the "small" bin has greater cell survival than the WT small bin. Moreover, is the overlap between the three bins the same in the WT and KO cells?

      3) Loo, et al. Nat Comms 2025 similarly found that senescent cells (which are enlarged) are resistant to ferroptosis using the same inhibitor as the authors. In contrast to the authors, they show that this is due to lysosomal alkalinization and sequestration of ferrous iron in lysosomes. Given that Lanz et al. 2022 found that lysosomal components super-scale with cell size, it seems like this would be an important hypothesis to address. Free lysosomal iron can be easily measured with the LysoRhoNox stain. Loo et al. was able to restore ferroptosis sensitivity in senescent cells using the V-ATPase activator EN6, so it would be important for the authors to address whether this (or similar) treatment would have the same effect in enlarged cells.

      Minor concerns:

      1) It would be helpful if this manuscript were re-submitted with line numbers to more easily reference the text.

      2) In Figure 5A and other figures that reproduce data from Lanz et al. 2022, it would be helpful to have a summary curve for the overall abundance of each protein rather than only the individual peptide curves. These plots (particularly Figure 5A) are difficult to interpret since some peptides were presumably more abundant / measured with higher confidence than others.

      3) In Figure 5, the authors show the validation of the ACSL4 KO HT-1080 cell line but not HMEC, even though both are used in this figure. It would be useful to show both. Additionally, the authors switch back and forth between the two cell lines for this figure, and it is not clear why.

      4) In Figure 5B, the authors use antibody-based staining of ACSL4 and flow cytometry to correlate a loss of ACSL4 expression with increased cell size, validating the proteomics data in Figure 5A. This does not seem like a good way to do this. Firstly, fixing cells with formaldehyde alters their size (is this proportional across differently sized cells? It's impossible to know), which makes it inappropriate to use SSC as a proxy for size in this particular situation. Secondly, the normalization scheme here doesn't make sense. If actin was used as a reference protein, why was tubulin used to normalize ACSL4 abundance? Overall, this seems like a very round-about experiment that could have just been addressed by doing a simple western blot with the four size bins sorted from live cells (as it was in the proteomics). If the issue is that ACSL4 is not detectable by western in the HMEC cells, another solution would be plating the live, sorted bins on coverslips and measuring by IF (or using the HT-1080 cells).

      5) In Figure 5E/5F, the authors pre-arrest the cells in G1 with palbociclib before size-sorting them. The pre-arrest is not done in other experiments using this cell line for size-sorting, so it would be important for the authors to comment on why this was done for this experiment but not others.

      6) Conceptually, it is difficult for me to understand why large cell size sensitizes cells to GPX4 inhibition but confers resistance to Era2 treatment. Particularly given the pathway described in Figure 3A, I am having trouble understanding why these would convey such opposing phenotypes. Shouldn't the extra ferritin in the bigger cells also help them cope with GPX4 inhibition if, as the authors state in the discussion, the increased sensitivity to the GPX4 inhibitor is reported to be mediated by (among other things) iron accumulation? A deeper discussion of this seeming-incongruity would be helpful for contextualizing the broader role of cell size in determining ferroptosis sensitivity.

      Significance

      Strengths: well-designed study, clear results that are broadly in line with other recent studies on the same topic.

      Limitations: opposing results between Era2 and GPX4i (which should in theory cause similar phenotypes) call into question how broadly cell size impacts ferroptosis susceptibility / indicate that this may occur in a limited context

      Advances: first study to show that enlarged cells might have higher antioxidant capacity

      Audiences: cell size researchers, ferroptosis/cell death researchers

      My expertise: cell cycle checkpoints, cell size control, proteomics

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Zatulovskiy et al. demonstrate that cell size modulates susceptibility to ferroptosis, a form of iron-dependent cell death driven by lipid peroxidation. Using human cell lines (HMEC, HT-1080, RPE-1), the authors examined cell size through FACS sorting, CDK4/6 inhibition and inducible cyclin D1 knockdown. They found that larger cells are more resistant to ferroptosis induced by system xc⁻ inhibition (erastin2), but more sensitive to GPX4 inhibition (RSL3), highlighting pathway-specific size dependencies. Mechanistically, larger cells exhibited:

      • Higher glutathione levels, supporting lipid peroxide detoxification
      • Increased ferritin expression, promoting iron sequestration
      • Lower ACSL4 levels, reducing incorporation of peroxidation-prone lipids

      These findings were supported by high-throughput microscopy, flow cytometry (BODIPY-C11 lipid peroxidation assays), and proteomic analyses. The study concludes that cell size influences proteome composition and metabolic capacity, thereby shaping cell death decisions, an insight with implications for aging, cancer, and ferroptosis-based therapies.

      Major Comments

      1. Direct evaluation of SLC7A11 abundance and function is needed. The opposite size-dependent effects of erastin2 and RSL3 strongly suggest a role for SLC7A11/system xc⁻ activity in size-dependent ferroptosis resistance. However, SLC7A11 levels were not quantified due to insufficient peptide detection in the proteomic data.
        • Direct measurement of SLC7A11 protein levels (immunoblotting or flow cytometry) in small vs large cells would test whether its expression scales with size.
        • Functional perturbation (siRNA/CRISPR knockdown) followed by erastin2 treatment would provide mechanistic validation.
        • Use of additional SLC7A11 inhibitors (e.g., sulfasalazine, sorafenib) could further test whether the size resistance phenotype is xc⁻-specific.
      2. Functional tests of ferritin contribution to resistance are needed. Although elevated ferritin (FTH1/FTL) levels in larger cells represent a strong correlational signal, definitive experimental evidence establishing causality is currently lacking.
        • Measuring the labile iron pool directly in size-stratified populations would strengthen the link.
        • Knockdown of FTH1 or FTL could reveal whether ferritin upregulation is necessary for the resistance of large cells to ferroptosis.
      3. Relevance to senescence should be addressed experimentally or explicitly discussed. Given that senescent cells are enlarged and accumulate in aged and tumour tissues, testing senescent models for erastin2 resistance would greatly strengthen the physiological significance.

      Minor Comments

      1. Mechanistic nuance regarding RSL3 should be included. RSL3 has been reported to induce ferroptosis independently of GPX4 (PMID: 37087975, PMID: 40392234) and may target other selenoproteins such as TXNRD1. This nuance would help explain the observed divergence between RSL3 and erastin2 sensitivity across sizes.
      2. Dynamic range of BODIPY-C11 assays needs commentary. Despite high erastin2 doses, the oxidized BODIPY signal remains close to DMSO levels. The authors should comment on whether this reflects high GSH buffering capacity, probe limitations, or other factors.
      3. Western blot for shCycD1 depletion should be included. CycD1 depletion usually causes cells to stop proliferating, which is not the case here. Therefore, depletion must be partial. The level of depletion should be shown by immunoblotting.

      Are the suggested experiments realistic in terms of time and resources? Yes, measuring SLC7A11 expression (£400, 2 days) and knocking down SLC7A11 and FTL/FTH1 (£1000, 1-2 weeks).

      Are the data and the methods presented in such a way that they can be reproduced? Yes

      Are the experiments adequately replicated and statistical analysis adequate? The number of replicates is adequate. The different statistical analyses performed should be mentioned.

      Referee cross-commenting

      It seems that all reviewers are largely in agreement. Key things to focus on in a revised version are:

      • Clarifying the pathway to better explain why different ferroptosis inducers show opposite phenotypes.
      • Dependency of effects on cell cycle phase.
      • Checking the lysosomal iron in cells of different size.
      • Describing and showing all statistical tests.

      Significance

      General Assessment: This study presents a mechanistic link between cell size and ferroptosis susceptibility. Using high-throughput microscopy, proteomics, and genetic perturbations across multiple human cell lines, the authors demonstrate that larger cells are more resistant to ferroptosis induced by system xc⁻ inhibition (erastin2). This resistance is attributed to elevated glutathione production, increased ferritin-mediated iron sequestration, and reduced ACSL4-dependent lipid peroxidation. The experimental design is rigorous and multifaceted, with consistent results across cell types and size manipulation methods. While the study is limited to in vitro systems, its conceptual and mechanistic insights lay the groundwork for future in vivo and translational investigations.

      Advance: This work is the first to systematically show that cell size directly influences ferroptosis susceptibility via proteome scaling. It reconciles previous findings that large cells are sensitized to GPX4 inhibition (RSL3) by demonstrating that the ferroptosis pathway targeted (system xc⁻ vs GPX4) determines the direction of size-dependent vulnerability. The study provides a conceptual advance by positioning cell size as a regulatory axis in cell death decisions, and a mechanistic advance by identifying size-dependent changes in glutathione metabolism, ferritin levels, and ACSL4 expression.

      Audience: This research will be of interest to specialists in cell death, ferroptosis, redox biology, and cancer biology. It also holds relevance for aging researchers and translational scientists exploring ferroptosis-based therapies. The findings may influence how cell size heterogeneity is considered in therapeutic design, particularly in oncology and senescence-targeting strategies.

      Field of Expertise: Translational cancer biology, cell cycle regulation, proteomics, therapy resistance, molecular mechanisms of cell death.

      Limitations of Evaluation: I do not have sufficient expertise to evaluate the glutathione kinetics, though the biological interpretation is clear and well-supported.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The study by Zatulovskiy et al. examined how cell size influences cell susceptibility to ferroptosis. The authors found a size dependence specifically for the ferroptosis-inducing drug Era2, but not for other drugs. Using various human cell lines (HMEC, HT-1080, RPE-1), the authors generated populations of small and large G1 cells by FACS, CDK4/6 inhibition (palbociclib), or inducible cyclin D1 knockdown, and measured cell susceptibility to ferroptosis. Larger cells were more resistant than smaller cells. Mechanistically, larger cells showed reduced plasma membrane lipid peroxidation, higher glutathione concentrations, and changes in the levels of relevant cellular proteins, as analyzed using previously published data. Deleting ACSL4, which is involved in ferroptosis, partly eliminated the size dependence of ferroptosis. The work concludes that cell size is a key determinant of ferroptosis susceptibility.

      My major concerns about this work focus on whether many of the results reflect cell size or cell cycle effects, and whether the FACS-based size-scaling analyses have some misleading features to their design & presentation. If these concerns can be addressed with new experiments, then the conclusions of this paper are justified. If these concerns cannot be addressed, then the authors should more directly acknowledge the alternative hypothesis that cell cycle effects may explain many of their results.

      The experiments seem to be replicated sufficiently, and most conclusions rely on data from multiple cell lines. My minor comments focus on needs to provide statistics and method details, and on suggestions on how to improve text clarity, but these edits are easily done and don't require new experiments. Overall, this is an interesting study, and it should be published once the concerns below are addressed.

      Major comments:

      • In experiments reported in Fig 1 and 2A, the authors sort small and large cells in G1, plate them, and later start the drug treatments & cell monitoring. Are these cells actively cycling (progressing in the cell cycle), and how fast? The large cells are likely to enter S phase earlier than the small cells, so by the time that the authors start their drug treatments, they may be comparing cells in different cell cycle stages, which could influence drug sensitivity more than cell size (as the authors also suggest later in Fig 2). This needs to be controlled for. Furthermore, even if the cells remain in G1 after sorting until the drug treatments are started, the authors should address the fact that the drugs are present for a long time, thus targeting the cells in various cell cycle stages.
      • Can the G1 arrest-driven changes in drug susceptibility (Fig 2 C-D) be attributed to cell size? Can the authors rescue the palbociclib treatment with rapamycin or other growth inhibitors that allow size to remain small during G1 arrest?
      • In Fig 2E-F, is the cell cycle distribution of the samples influenced by CCND1 shRNA induction? Are the drug sensitivity effects due to cell size or cell cycle changes?
      • Can the authors address the meaningfulness of the FACS-based size-scaling results in cases where cell-to-cell variability is very large? For example, in Fig 4D&G, the results are so variable even in identically sized cells that the importance of the size-scaling pattern seems questionable.
      • In Figs 4B-D, the cell size axis seems to have over 4-fold size variability, but when the authors show the analysis of this data (Figs 4E-G) the variability is only 2-fold. What was excluded and on what basis?
      • Based on the methods section & figure legends of Fig 4B-I, the RPE cells were not pre-sorted to include only G1 cells, nor did the assay account for cell cycle differences. How can these data be used to explain results from earlier figures, where analyses were exclusively focused on size differences in G1?

      Minor comments:

      • I recommend clarifying in the early introduction that all size changes discussed are in the absence of DNA content increase.
      • The introduction seems to cite primary research and review papers in the same sentences, which is a bit misleading as the reviews don't seem to add new evidence.
      • OPTIONAL In the second introduction paragraph, consider the classification/description of the three different mechanisms. Currently, it seems that these mechanisms are not independent of each other, and the details provided about each mechanism are inconsistent.
      • Please provide statistics for the IC50 values reported based on Fig 1C. Were small and large cells statistically different? Are the IC50 values reported as +/- standard deviation or some other metric?
      • OPTIONAL Providing more insight into why Era2 and RSL3 treatments yield opposite responses would be of great interest to the field.
      • Is the BODIPY-C11 labeling specific to plasma membrane, as suggested by the writing of the authors, or do the results shown integrate signals over all cell membranes?
      • How exactly is gating done for the flow cytometry samples? Especially when analyzing size-scaling, the results are likely to be sensitive to outliers, such as those seen in Fig 4C (a subpopulation of very low CFSE stained cells). Can the authors clarify their methods and/or display supplementary figures with gating examples?
      • In Fig 4, total protein staining was used as a control, whereas in Fig 5B β-actin was used as a control. Why did the authors rely on different control approaches for essentially the same measurements? Are these controls comparable?

      Significance

      I work in the cell size research field, and I am familiar with other related works in this field. My evaluation reflects a specialist's view of this study. Overall, this study will be of great interest to a small group of specialists, and specific aspects of the work will also gain some interest from broader basic research audiences studying mechanisms of drug responses and ferroptosis in general. However, I do not see this work gaining very broad interest across larger audiences, simply because the field of cell size research is not of broad interest, and this is not a landmark study for the field.

      The field of cell size research has long searched for size-dependent functions, as these could help explain why cell size matters. This study is a nice addition to our field, helping establish ferroptosis as a size-dependent function. However, the significance of this work relies on how clearly the authors can establish that their results are cell size rather than cell cycle effects (see major comments above). Should the authors address these concerns, then this study will provide some conceptual and mechanistic insight.

      Regarding mechanistic insights, this work is in stark contrast to a recent study about size-dependency of ferroptosis (https://doi.org/10.1016/j.isci.2025.112363), where increased cell size heightened sensitivity to the GPX4 inhibitor RSL3, thus suggesting an opposite conclusion than what the authors observed with the drug Era2. The authors examined this contradiction, and while their results with the drug RSL3 agreed with the recent study, they did not explain why different drug mechanisms yield opposite results. Providing more insights into this discrepancy would increase the impact of this work.

      Regardless of the impact of this work, I want to emphasize that I am fully supportive of seeing this work published once the technical concerns have been addressed. Our field will benefit from this work, and this work could catalyze important future research. The general topic studied here has the potential to become very important.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer 1

      #1.1 In Fig 1 it appears that both PI3P and PI3,5P2 are present on lysosomes (Lamp1-FRB) and that recruitment of MTM1 (via FKBP) depletes both of them. Additionally, in the images and quantifications, it appears that after recruitment of MTM1, there are no vesicular pools of PI3P or PI3,5P2 left at all. This raises several concerns:

      #1.1a What is happening to the pool of PI3P on Rab5 endosomes? The main pool of PI3P is on Rab5 endosomes and separate from LAMP1 lysosomes. Why does recruitment of MTM1 to lysosomes then lead to almost complete depletion of all PI3P (Fig1 B bottom panel), when the lysosomal pool of this lipid should be minor compared to that on early endosomes?

      We plan to add data showing recruitment of FKBP-MTM1 to FRB-Rab5 positive endosomes. This also causes complete depletion of both PI3P and PI(3,5)P₂ probes. We believe this is the result of the over-expression of Rab5 or LAMP1 reporters, the robust activity of FKBP-MTM1 and the somewhat indistinct nature of these compartments (e.g. 10.1016/j.cell.2024.11.028) – in short, when the reporters are over-expressed, there is enough residual LAMP1 present in Rab5-enriched compartments, or Rab5 in LAMP1-enriched membranes, that robust depletion of lipids occurs in both.

      #1.1b The main problem comes from disagreement with figure 3. There it is shown that ".PI(3,5)P2 does not significantly overlap with PI3P". If this is true, then why does recruitment of MTM1 to LAMP1 lead to the depletion of both PI3P and PI3,5P2. This would indicate that they are on the same compartment (LAMP1 positive lysosomes)?

      We plan to add a complete, quantitative co-localization analysis of PI(3,5)P₂ and PI3P probes, as well as an analysis of both lipid probes with Rab5 and LAMP1. We already showed in such an analysis that PI3P is more closely co-localized with Rab5 than Rab7 (10.1083/jcb.201312072), but this is not absolute: not all late endosomal compartments are alike, and some contain PI3P whereas some do not (e.g. 10.1016/j.cell.2023.09.027). Likewise, we expect to show near exclusion of PI3P and PI(3,5)P₂, with greater co-localization between PI3P and Rab5 vs LAMP1, and the opposite for PI(3,5)P₂ (as shown in 10.1083/jcb.202408158). But nonetheless, there will be some PI3P-positive LAMP1 membranes.

      #1.1c A merge and colocalization analysis is at least needed to address how much of SnxA and EEA1-FYVE is localized to LAMP1-FRB. But just by looking at the one cell depicted, the localisation of LAMP1-FRB seems to be rather distinct from EEA1-FYVE and SnxA

      We believe our response to #1.1b will also address this point.

      #1.2 Minor: Fig2 is all in all way too small and it is difficult to actually see the signal in panel A and C. It should also be noted that in these panels (zoomed in to my best ability) that there is clear localisation of SnxA to the plasma membrane. Is this pool lost after Apilimod treatment, or is this unspecific binding? This PM localisation also appears in Fig1 B and 5C and should be discussed.

      We will enlarge the panels in figure 2 to ease visualization of the detail. We will also describe the PM localization of the probe, and how this is insensitive to apilimod treatment (Fig. 2A). We will add this to the discussion, also citing how a non-PI(3,5)P₂ dependent PM localization of SnxA was also noted in the original study (Vines et al), and note this as a potential drawback of the full-length SnxA.

      #1.3 It should also be pointed out that dextran will be taken up through macropinocytosis and might follow different endocytic trafficking to conventional receptor.

      We will add a description to the results for figure 2 describing how dextran is taken up non-selectively through fluid-phase endocytosis (i.e. both conventional receptor-mediated endocytosis and macropinocytosis).

      #1.4 It also seems that the dextran signal decreases after Ap treatment. Is this the case? Has this been quantified? Would that point towards the whole endocytic structure being degraded and not just PI3,5P2?

      We believe this reduction is due to simple photobleaching of the dye under time-lapse imaging. We will quantify this as suggested – but crucially, we included the DMSO vehicle control which will perfectly control for photobleaching.
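      The vehicle-control logic above can be sketched as a simple normalization: divide the normalized intensity trace of the treated cells by the normalized trace of the DMSO control at each time point. This is only an illustrative sketch, not the authors' analysis pipeline. The function name `bleach_corrected` and the example numbers are hypothetical, and the sketch assumes both conditions were imaged with identical settings so that they bleach at the same rate.

```python
def bleach_corrected(treated, vehicle):
    """Return the treated trace corrected by the vehicle (DMSO) trace.

    Both traces are first normalized to their own first time point;
    the treated trace is then divided by the vehicle trace, so any
    decay shared by both (e.g. photobleaching) cancels out.
    """
    if len(treated) != len(vehicle):
        raise ValueError("traces must have the same number of time points")
    t0, v0 = treated[0], vehicle[0]
    return [(t / t0) / (v / v0) for t, v in zip(treated, vehicle)]


# Hypothetical traces in arbitrary units, for illustration only.
vehicle = [100.0, 90.0, 81.0]
treated = [200.0, 180.0, 162.0]
corrected = bleach_corrected(treated, vehicle)
```

      A corrected trace that stays near 1 would be consistent with photobleaching alone; a corrected trace that falls below 1 would indicate a loss of dextran signal beyond what the vehicle control shows.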

      #1.5 OPTIONAL: The timeline of PI3,5P2 degradation and vacuolation is a great experiment and very important. However, PI3,5P2 production is critically dependent on prior generation of PI3P. It could be of interest to readers what happens to SnxA localisation when cells are treated with VPS34-IN-1 or Wortmannin. How does loss of PI3,5P2 compare between the timing of PIKfyve inhibition and inhibition of PI3K that generate its substrate. The timescales of this might tell us something interesting about the timings of phosphoinositide conversions. Also, treatment with PI3K inhibitors is known to not lead to vacuolisation even though it does also lead to a loss of PI3,5P2, further indicating that there might be more to the vacuolation phenotype than just the loss of PI3,5P2. These experiments would add novelty and detail to the study but do not change the overall findings or conclusions.

      This would indeed be a feasible experiment, but we do not believe the novelty would be significant: rapid depletion of PI3P with PI3K inhibitors has already been documented; notably, we demonstrated depletion of PI3P by wortmannin with a similar ~10 min time constant (10.1083/jcb.201312072) to the PI(3,5)P₂ depletion by apilimod reported herein. Since substrate depletion ultimately acts like enzyme inhibition, we would anticipate very similar kinetics for PI(3,5)P₂ depletion by apilimod vs wortmannin. However, if the reviewer and editor believe this experiment is essential, we could add it. We just do not believe it significantly strengthens the paper.

      #1.6a: Major: Fig3: The distinct localisation of PI3P and PI3,5P2 looks convincing but does not seem to be in line with data in Fig1.

      We believe our response to #1.1 addresses this point also.

      #1.6b It should be clearly noted and discussed that the structure in Fig3 B is a nascent macropinosome (the size and proximity to the PM make this clear) and this transition on macropinosomes has been well characterised in Vines et al 2023

      We agree with the reviewer that this structure could easily be a macropinosome, though without a marker, we cannot definitively make this claim. Of course, the same logic applies to our erroneous declaration that this is an endosome (though HEK293A cells do possess some similarly very large, Rab5+ve sorting endosomes)! We will therefore amend the text to classify this as “an initially PI3P positive compartment”.

      #1.7 Major minor: Figure 5 is a very powerful approach to produce a completely independent pool of PI3,5P2. As control, the PIKfyve-CCR-Kina construct should be recruited to mitochondria in the absence of the MavQ construct. This should lead to the generation of PI5P (the other known product of PIKfyve) without generating PI3,5P2. This would be a useful control to test that SnxA does not bind to PI5P. (As a side note, this method would be great to use to try and find a bona fide PI5P effector. Maybe by isolating mitos and sending them to Mass Spec after altering their PIP composition? But this is obviously a completely different story)

      This is an excellent suggestion. We plan to add these experiments as a control for Fig. 5, though there will be nuanced interpretation – because we have no way of verifying PI5P generation by PIKfyve in the absence of a verified PI5P biosensor. Furthermore, direct synthesis of PI5P by PIKfyve remains controversial (e.g. 10.1073/pnas.1203106109). Incidentally, we have attempted to co-recruit a PIP4K to convert any PI5P to PI(4,5)P₂, but this did not work.

      #1.8 Major: Figure 6 is great work, but it must be discussed that SnxA is a constitutive dimer while the 2xPx is a monomer. Therefore, the SnxA dimer will also contain two fluorescent proteins and will therefore also appear brighter than the 2xPx. Therefore, the increased baseline of SnxA might just be due to its nature as a dimer and not because it has a higher affinity to PI3,5P2 compared to the 2xPx. OPTIONAL: One way to clear this up would be to make a 2xPx construct that also contains 2 fluorescent proteins. Also, the increase in fluorescence should also be shown as normalized to the initial intensity. From the graphs in 6a it seems that both have an ~1.3 and ~1.5 fold increase?

      We will add a discussion of the constitutive dimer nature of full-length SnxA. However, we disagree with the assertion that the dual GFP of the GFP-SnxA dimers accounts for the increased baseline labeling – since our quantification is based on the ratio of fluorescence at the membrane to elsewhere. Since there is a heavily diffuse signal of unbound probe, this diffuse pool is also dually labelled by the dimeric GFP-SnxA – in effect, the increased signal on membranes by dimeric GFPs is cancelled by the cytosolic dimers. We will clarify this point in the discussion.
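      The cancellation argument above can be written out as a short worked ratio. This is a sketch under stated assumptions: the symbols k, b, n_mem and n_cyt are introduced here purely for illustration and do not appear in the manuscript, and background signal is taken as already subtracted.

```latex
% k      : fluorophores per probe unit (k = 2 for a dimer, k = 1 for a monomer)
% b      : brightness of a single fluorophore
% n_mem  : probe units per pixel at the membrane
% n_cyt  : probe units per pixel in the diffuse (unbound) pool
\[
R \;=\; \frac{I_{\mathrm{mem}}}{I_{\mathrm{cyt}}}
  \;=\; \frac{k\,b\,n_{\mathrm{mem}}}{k\,b\,n_{\mathrm{cyt}}}
  \;=\; \frac{n_{\mathrm{mem}}}{n_{\mathrm{cyt}}}
\]
```

      Under these assumptions R does not depend on k, so the number of fluorophores per probe unit alone would not shift the ratio; a difference in R between probes would instead reflect how the probe partitions between membrane and diffuse pools.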

      #1.9 OPTIONAL (but important): It would be great to compare the recently described G13V mutant of SnxA that is reported to have a higher affinity to PI3,5P2 (Preprint, Nishimura et al 2025). These experiments would add novelty and detail to the study but do not change the overall findings or conclusions. But as the study sets out to find the best PI3,5P2 probe, this might be very important

      This is indeed an important study. That said, the existence of this mutant does not change the overall importance of our demonstration here that PI(3,5)P₂ is both necessary and sufficient to recruit SnxA to membranes. We also feel that it is beyond the scope of our manuscript to perform detailed controls on the findings of another manuscript that itself has not passed peer review. However, we will add a point to the discussion about the likely impact of the G13V mutant.

      #1.10 Minor: In the discussion it is stated that "Vines et al. showed SnxA to have higher affinity for PI(3,5)P2 in vitro compared to 2xPx-SnxA". The Kd (to model membranes with 5% PI3,5P2 in POPC background!) for SnxA there is 187 ± 13 nM and for 2xPx 217 ± 7 nM. Taking the top and bottom of the respective uncertainty intervals gives 200 nM and 210 nM. While this is still a slightly lower Kd (i.e. higher affinity) for SnxA, I am unsure how physiologically relevant this difference is.

      This is an excellent point, and we will amend the discussion to point out that the subtle differences in affinity likely don’t explain the enhanced performance of SnxA, as indeed the dual GFPs don’t as covered in #1.8.
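      The arithmetic behind the Kd comparison in #1.10 can be checked in a few lines. This is a minimal sketch that treats each reported "±" value as a simple symmetric interval; the comment does not state whether these are standard deviations, standard errors or fit uncertainties, so that treatment is an assumption, and the helper name `interval` is hypothetical.

```python
def interval(mean, spread):
    """Return (low, high) for a value reported as mean ± spread."""
    return (mean - spread, mean + spread)


# Kd values quoted in comment #1.10, in nM.
snxa_low, snxa_high = interval(187, 13)   # full-length SnxA
px_low, px_high = interval(217, 7)        # 2xPx

# Gap between the top of the SnxA interval and the bottom of the 2xPx interval.
gap_nm = px_low - snxa_high

# Fold difference between the two central Kd values.
fold = 217 / 187
```

      With these inputs the intervals are 174 to 200 nM and 210 to 224 nM, leaving a gap of 10 nM, and the central values differ by roughly 1.16-fold.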

      #1.11 Minor: In the discussion it is stated that "But, we could observe a short temporal overlap between the two lipids within a single endosome, lasting only a few minutes". If this statement refers to Figure 3B, then it must be clearly stated that this is a macropinosome and not be conflated with canonical endosomes

      We will define these membranes less precisely, as we did in response to #1.6b.

      #2.1 While the conclusions are by and large backed up by the presented data, there are two instances in which controls do not show the desired properties in some figures (but do in others). Specifically, the kinase dead control construct experiments in Figures 4D and 5D suffer from a lack of localization of the FKBP-PIKfyve-CCR-Kinase-Dead construct to the Rab5 endosomes or mitochondria respectively. Without robust localization of the control kinase, the dependency on PI(3,5)P2 synthesis (as opposed to physical kinase mediated recruitment) cannot be excluded. The construct shows robust relocalization in Figure 6 when the 2xPx-SnxA is being evaluated. With the combined evidence shown in the paper, this does not create significant doubt about the overall conclusions, but this issue needs to be addressed.

      We believe the poor apparent recruitment in figures 4D and 5D stems from two reasons; firstly, the signal to noise of the tagBFP-kinase construct is low (especially since the photodamaging 405 nm excitation laser must be used at low intensity for these time-lapse experiments). Therefore, one solution is to simply select example images to demonstrate recruitment where expression of TagBFP2 had better signal-to-noise. As can be seen from the quantification of data in figures 5 and 6, recruitment of FKBP-PIKfyve-CCR-Kinase and FKBP-PIKfyve-CCR-Kinase-Dead were very similar, but this was not so for figure 4; in this case, we will re-analyze the data to analyze only FKBP-PIKfyve-CCR-Kinase cells with similar recruitment to the dead for a direct comparison; if there is insufficient data, we will repeat the experiments to obtain balanced recruitment of both.

      #3.1 In Figure 1B, the authors show a rapid depletion of PI3P upon recruitment of MTM1 to LAMP1-positive membranes. The apparent minimal overlap between EEA1-FYVE and LAMP1 signals in the images is intriguing. It would be helpful if the authors could comment on whether this reflects rapid membrane trafficking dynamics or nonspecific membrane targeting of MTM1. In addition, the lysosomal SnxA signal appears to decrease with somewhat slower kinetics compared to PI3P. A short clarification would be helpful as to whether MTM1 is expected to directly dephosphorylate PI(3,5)P₂, or whether the observed reduction in PI(3,5)P₂ is secondary to depletion of its precursor PI3P, resulting in gradual turnover of the PI(3,5)P₂ pool. Clarification of this point would further refine the mechanistic interpretation, although this point does not detract from the overall conclusions.

      The first part of this comment will be addressed by our response to #1.1. We will also add a discussion detailing how MTM1 itself has a somewhat controversial substrate selectivity, with the first report stating it was PI3P selective and did not hydrolyze PI(3,5)P₂ (10.1073/pnas.160255697), whereas a subsequent study found that it has robust activity against PI(3,5)P₂ (10.1016/s0960-9822(03)00132-5). Therefore, direct activity on PI(3,5)P₂ cannot be ruled out. Secondly, we will describe how our data and Pemberton et al showed that PIKfyve recruitment depletes PI3P, demonstrating that PIKfyve activity can outstrip VPS34 that makes PI3P. Therefore, in PI(3,5)P₂-positive, PI3P-negative membranes, it is possible that VPS34 is still active, but PI3P never accumulates because PIKfyve immediately converts it to PI(3,5)P₂.

      #3.2. In Figure 3, the authors conclude that SnxA and FYVE do not overlap, but rather localize in close proximity. While this is an interesting observation, some degree of apparent overlap can be perceived in the images. Although the qualitative impression is largely convincing, inclusion of a quantitative colocalization analysis (e.g., Pearson's or Manders' coefficients) could further strengthen this conclusion.

      We are planning to add precisely this analysis in response to #1.1b.
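      For readers unfamiliar with the two coefficients named in #3.2, the following minimal sketch shows how they are computed from paired pixel intensities. It is illustrative only and not the authors' analysis: the function names are hypothetical, the inputs are flat lists of background-subtracted intensities from the same pixels in two channels, and the thresholds are left to the user (the sketch does not implement any automatic thresholding).

```python
from math import sqrt


def pearson_coefficient(ch1, ch2):
    """Pearson correlation of paired pixel intensities (flat sequences)."""
    n = len(ch1)
    mean1 = sum(ch1) / n
    mean2 = sum(ch2) / n
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(ch1, ch2))
    var1 = sum((a - mean1) ** 2 for a in ch1)
    var2 = sum((b - mean2) ** 2 for b in ch2)
    return cov / sqrt(var1 * var2)


def manders_coefficients(ch1, ch2, thr1=0.0, thr2=0.0):
    """Return (M1, M2).

    M1: fraction of total channel-1 intensity found in pixels where
        channel 2 is above thr2.
    M2: fraction of total channel-2 intensity found in pixels where
        channel 1 is above thr1.
    """
    m1 = sum(a for a, b in zip(ch1, ch2) if b > thr2) / sum(ch1)
    m2 = sum(b for a, b in zip(ch1, ch2) if a > thr1) / sum(ch2)
    return m1, m2


# Hypothetical four-pixel example with mutually exclusive signals.
probe_a = [0.0, 10.0, 0.0, 5.0]
probe_b = [8.0, 0.0, 4.0, 0.0]
r = pearson_coefficient(probe_a, probe_b)
m1, m2 = manders_coefficients(probe_a, probe_b)
```

      In this toy case the two signals never share a pixel, so both Manders' coefficients are zero and the Pearson coefficient is negative; adjacent but non-overlapping probes in real images would be expected to trend the same way, subject to resolution and thresholding choices.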

      #3.3. In Figure 4D, SnxA localization is still detectable in cells expressing the kinase-dead PIKfyve construct. If endogenous wild-type PIKfyve is present, residual PI(3,5)P₂ production could explain this signal and influence the interpretation of the experiment. The authors should clearly state which cell lines were used and whether endogenous PIKfyve activity was present in each condition. This information should be provided consistently in the figure legends to avoid ambiguity.

      We will clarify the results and legends to make clear that endogenous PIKfyve is indeed expressed in these cells, with either FKBP-PIKfyve-CCR-Kina or FKBP-PIKfyve-CCR-Kina-Dead. We will also clarify that over-expressed catalytic PIKfyve fragments will not perturb endogenous PIKfyve activity, explaining basal SnxA localization in figures 4C and 4D.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      • The manuscript by Swayhoover et al. systematically validates the functionality and specificity of the PI(3,5)P₂ biosensor SnxA. The authors demonstrate that SnxA loses its lysosomal localization upon targeted recruitment of MTM1 to LAMP1-positive membranes, as well as following treatment with the selective PIKfyve inhibitor Apilimod. Together, these findings convincingly show that SnxA membrane association critically depends on the presence of PI(3,5)P₂.

      • To further assess whether PI(3,5)P₂ is sufficient for SnxA recruitment, the authors established a reconstitution approach in which a PIKfyve construct was co-recruited with the PI 3-kinase MavQ to the mitochondrial membrane, an organelle that normally lacks detectable PI(3,5)P₂. Induced recruitment of the active enzymes led to robust mitochondrial localization of SnxA, whereas no such recruitment was observed when a kinase-dead PIKfyve construct was used. These results demonstrate that ectopic production of PI(3,5)P₂ is sufficient to drive SnxA membrane association.

      • Lastly, the authors compare SnxA to previously described PI(3,5)P₂ reporters, ML1Nx2 and 2xPX-SnxA. To this end, SnxA was co-expressed with either reporter, and PI(3,5)P₂ production was induced on Rab5-positive endosomal structures via recruitment of the PIKfyve construct. While 2xPX-SnxA showed partial recruitment to Rab5-positive membranes, this effect was less pronounced than that observed for full-length SnxA. In contrast, ML1Nx2 exhibited apparent membrane association even in cells expressing the kinase-dead PIKfyve construct, and no additional relocalization was observed upon recruitment of the active kinase. These findings suggest that, under the experimental conditions tested, ML1Nx2 lacks substrate specificity and does not reliably report inducible PI(3,5)P₂ production. Overall, this study represents a compelling and well-executed cell biological analysis that supports the functionality and specificity of the recently established SnxA reporter. By providing a rigorously validated tool to monitor PI(3,5)P₂ dynamics in mammalian cells, the work will facilitate future investigations into the still poorly understood mechanisms regulating PI(3,5)P₂ metabolism. The experiments are carefully conducted, and the authors draw appropriate conclusions from their data. I do not have major concerns; however, I list a few minor comments below.

      Minor points

      1. In Figure 1B, the authors show a rapid depletion of PI3P upon recruitment of MTM1 to LAMP1-positive membranes. The apparent minimal overlap between EEA1-FYVE and LAMP1 signals in the images is intriguing. It would be helpful if the authors could comment on whether this reflects rapid membrane trafficking dynamics or nonspecific membrane targeting of MTM1. In addition, the lysosomal SnxA signal appears to decrease with somewhat slower kinetics compared to PI3P. A short clarification would be helpful as to whether MTM1 is expected to directly dephosphorylate PI(3,5)P₂, or whether the observed reduction in PI(3,5)P₂ is secondary to depletion of its precursor PI3P, resulting in gradual turnover of the PI(3,5)P₂ pool. Clarification of this point would further refine the mechanistic interpretation, although this point does not detract from the overall conclusions.

      2. In Figure 3, the authors conclude that SnxA and FYVE do not overlap, but rather localize in close proximity. While this is an interesting observation, some degree of apparent overlap can be perceived in the images. Although the qualitative impression is largely convincing, inclusion of a quantitative colocalization analysis (e.g., Pearson's or Manders' coefficients) could further strengthen this conclusion.

      3. In Figure 4D, SnxA localization is still detectable in cells expressing the kinase-dead PIKfyve construct. If endogenous wild-type PIKfyve is present, residual PI(3,5)P₂ production could explain this signal and influence the interpretation of the experiment. The authors should clearly state which cell lines were used and whether endogenous PIKfyve activity was present in each condition. This information should be provided consistently in the figure legends to avoid ambiguity.

      Significance

      The manuscript by Swayhoover et al. systematically validates the functionality and specificity of the PI(3,5)P₂ biosensor SnxA. The authors demonstrate that SnxA loses its lysosomal localization upon targeted recruitment of MTM1 to LAMP1-positive membranes, as well as following treatment with the selective PIKfyve inhibitor Apilimod. Together, these findings convincingly show that SnxA membrane association critically depends on the presence of PI(3,5)P₂.

      • To further assess whether PI(3,5)P₂ is sufficient for SnxA recruitment, the authors established a reconstitution approach in which a PIKfyve construct was co-recruited with the PI 3-kinase MavQ to the mitochondrial membrane, an organelle that normally lacks detectable PI(3,5)P₂. Induced recruitment of the active enzymes led to robust mitochondrial localization of SnxA, whereas no such recruitment was observed when a kinase-dead PIKfyve construct was used. These results demonstrate that ectopic production of PI(3,5)P₂ is sufficient to drive SnxA membrane association.

      • Lastly, the authors compare SnxA to previously described PI(3,5)P₂ reporters, ML1Nx2 and 2xPX-SnxA. To this end, SnxA was co-expressed with either reporter, and PI(3,5)P₂ production was induced on Rab5-positive endosomal structures via recruitment of the PIKfyve construct. While 2xPX-SnxA showed partial recruitment to Rab5-positive membranes, this effect was less pronounced than that observed for full-length SnxA. In contrast, ML1Nx2 exhibited apparent membrane association even in cells expressing the kinase-dead PIKfyve construct, and no additional relocalization was observed upon recruitment of the active kinase. These findings suggest that, under the experimental conditions tested, ML1Nx2 lacks substrate specificity and does not reliably report inducible PI(3,5)P₂ production.

      -Overall, this study represents a compelling and well-executed cell biological analysis that supports the functionality and specificity of the recently established SnxA reporter. By providing a rigorously validated tool to monitor PI(3,5)P₂ dynamics in mammalian cells, the work will facilitate future investigations into the still poorly understood mechanisms regulating PI(3,5)P₂ metabolism. The experiments are carefully conducted, and the authors draw appropriate conclusions from their data.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      • Overall, the work is done very thoroughly, and considering the increasing interest in the less studied phosphoinositol species, it is also of significant interest. As outlined by the authors, the work is an extension of relatively recent work on the use of SnxA. However, the very low cellular levels, as well as the (nicely demonstrated) rapid dynamics, put a high demand on specificity. The authors also evaluate a tandem sensor (2xPx-SnxA).

      • While the conclusions are by and large backed up by the presented data, there are two instances in which controls do not show the desired properties in some figures (but do in others). Specifically, the kinase dead control construct experiments in Figures 4D and 5D suffer from a lack of localization of the FKBP-PIKfyve-CCR-Kinase-Dead construct to the Rab5 endosomes or mitochondria respectively. Without robust localization of the control kinase, the dependency on PI(3,5)P2 synthesis (as opposed to physical kinase mediated recruitment) cannot be excluded. The construct shows robust relocalization in Figure 6 when the 2xPx-SnxA is being evaluated. With the combined evidence shown in the paper, this does not create significant doubt about the overall conclusions, but this issue needs to be addressed.

      Significance

      The authors address a compelling need for the improvement and characterization of sensors that are capable of detecting changes in levels and localization of PI(3,5)P2. Some previous characterization had confirmed the ability of a novel sensor based on the PX domain containing Dictyostelium sorting nexin-like protein SnxA, to detect PI(3,5)P2 throughout the endosomal pathway, specifically on Rab7 enriched endosomes. The low cellular levels of PI(3,5)P2 put an emphasis on the specificity of detection and the authors expanded the characterization of a SnxA based biosensor, including ectopic expression to mitochondria.

      Considering the involvement of PI(3,5)P2 in a range of cellular processes, availability of a trusted sensor will be of interest to a broader audience.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      This highly important study by Swayhoover et al. uses state-of-the-art cell biology and microscopy techniques to rigorously and unbiasedly assess biosensors against the phosphoinositide PI(3,5)P2. The independent validation of these biosensors is of crucial importance to the wider cell biology community. Using a range of orthogonal approaches, the authors clearly demonstrate the specificity and high affinity of the recently discovered biosensor SnxA. The ectopic generation of PI(3,5)P2 on mitochondria is a particularly outstanding and novel approach and clearly demonstrates the specificity of SnxA towards PI(3,5)P2 and rules out other factors in its recruitment, such as coincidence detection mechanisms. The quantitative comparison against other versions of this biosensor is well done and adds important data to the field. While I do see some flaws in the manuscript, as well as several opportunities to improve both novelty and impact of the work, I do strongly support its publication.

      Major:

      • In Fig 1 it appears that both PI3P and PI3,5P2 are present on lysosomes (Lamp1-FRB) and that recruitment of MTM1 (via FKBP) depletes both of them. Additionally, in the images and quantifications, it appears that after recruitment of MTM1, there are no vesicular pools of PI3P or PI3,5P2 left at all. This raises several concerns:

      1) What is happening to the pool of PI3P on Rab5 endosomes? The main pool of PI3P is on Rab5 endosomes and separate from LAMP1 lysosomes. Why does recruitment of MTM1 to lysosomes then lead to almost complete depletion of all PI3P (Fig1 B bottom panel), when the lysosomal pool of this lipid should be minor compared to that on early endosomes?

      2) The main problem comes from disagreement with Figure 3. There it is shown that ".PI(3,5)P2 does not significantly overlap with PI3P". If this is true, then why does recruitment of MTM1 to LAMP1 lead to the depletion of both PI3P and PI3,5P2? This would indicate that they are on the same compartment (LAMP1-positive lysosomes).

      3) A merge and colocalization analysis is at least needed to address how much of SnxA and EEA1-FYVE is localised to LAMP1-FRB. But just by looking at the one cell depicted, the localisation of LAMP1-FRB seems to be rather distinct from EEA1-FYVE and SnxA

      • Minor: Fig2 is all in all way too small and it is difficult to actually see the signal in panels A and C. It should also be noted that in these panels (zoomed in to my best ability) there is clear localisation of SnxA to the plasma membrane. Is this pool lost after Apilimod treatment, or is this unspecific binding? This PM localisation also appears in Fig1 B and 5C and should be discussed.

      a) It should also be pointed out that dextran will be taken up through macropinocytosis and might follow a different endocytic trafficking route than conventional receptor-mediated cargo.

      b) It also seems that the dextran signal decreases after Ap treatment. Is this the case? Has this been quantified? Would that point towards the whole endocytic structure being degraded and not just PI3,5P2?

      c) OPTIONAL: The timeline of PI3,5P2 degradation and vacuolation is a great experiment and very important. However, PI3,5P2 production is critically dependent on prior generation of PI3P. It could be of interest to readers what happens to SnxA localisation when cells are treated with VPS34-IN-1 or Wortmannin. How does the loss of PI3,5P2 compare between PIKfyve inhibition and inhibition of the PI3K that generates its substrate? The timescales of this might tell us something interesting about the timings of phosphoinositide conversions. Also, treatment with PI3K inhibitors is known to not lead to vacuolisation even though it does also lead to a loss of PI3,5P2, further indicating that there might be more to the vacuolation phenotype than just the loss of PI3,5P2. These experiments would add novelty and detail to the study but do not change the overall findings or conclusions.

      Major: Fig3: The distinct localisation of PI3P and PI3,5P2 looks convincing but does not seem to be in line with data in Fig1

      • It should be clearly noted and discussed that the structure in Fig3 B is a nascent macropinosome (the size and proximity to the PM make this clear) and that this transition on macropinosomes has been well characterised in Vines et al 2023

      • Major minor: Figure 5 is a very powerful approach to produce a completely independent pool of PI3,5P2. As a control, the PIKfyve-CCR-Kina construct should be recruited to mitochondria in the absence of the MavQ construct. This should lead to the generation of PI5P (the other known product of PIKfyve) without generating PI3,5P2. This would be a useful control to test that SnxA does not bind to PI5P. (As a side note, this method would be great to use to try and find a bona fide PI5P effector. Maybe by isolating mitos and sending them to Mass Spec after altering their PIP composition? But this is obviously a completely different story)

      • Major: Figure 6 is great work, but it must be discussed that SnxA is a constitutive dimer while the 2xPx is a monomer. The SnxA dimer will therefore contain two fluorescent proteins and appear brighter than the 2xPx. Hence, the increased baseline of SnxA might just be due to its nature as a dimer and not because it has a higher affinity to PI3,5P2 compared to the 2xPx. OPTIONAL: One way to clear this up would be to make a 2xPx construct that also contains 2 fluorescent proteins. Also, the increase in fluorescence should be shown normalised to the initial intensity. From the graphs in 6a it seems that both have an ~1.3 and ~1.5 fold increase?

      • OPTIONAL (but important): It would be great to compare the recently described G13V mutant of SnxA that is reported to have a higher affinity to PI3,5P2 (Preprint, Nishimura et al 2025). These experiments would add novelty and detail to the study but do not change the overall findings or conclusions. But as the study sets out to find the best PI3,5P2 probe, this might be very important

      • Minor: In the discussion it is stated that "Vines et al. showed SnxA to have higher affinity for PI(3,5)P2 in vitro compared to 2xPx-SnxA". The Kd (to model membranes with 5% PI3,5P2 in POPC background!) for SnxA there is 187±13 nM and for 2xPx 217±7 nM. Taking the top and bottom of the respective uncertainty intervals gives 200 nM and 210 nM. While this still corresponds to a slightly lower Kd (higher affinity) for SnxA, I am unsure how physiologically relevant this difference is.

      • Minor: In the discussion it is stated that "But, we could observe a short temporal overlap between the two lipids within a single endosome, lasting only a few minutes". If this statement refers to Figure 3B, then it must be clearly stated that this is a macropinosome and it must not be conflated with canonical endosomes.

      Significance

      This is a strong study with rigorous and well-controlled experiments. It is of broad interest to the cell biology community. With the recent discovery of biosensors against PI(3,5)P2, it is crucial that novel tools undergo the scrutiny of the scientific community. For this, the publication of confirmatory results is of key importance. The authors do an excellent job of discussing their findings in the context of recently published work and clearly highlight where other studies have shown similar results previously. All of the conclusions are underpinned by rigorously controlled experiments. I think the study would benefit from some additional experiments to increase both its novelty and quality.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Manuscript number: RC-2025-03242R

      Corresponding author(s): Shinya Kuroda

      1. General Statements

      We thank the reviewers for their critical review of the manuscript and their valuable comments. We have carefully considered the reviewers' comments and have revised our manuscript accordingly.

      The reviewers' comments in this letter are in Bold and Italics.

      2. Point-by-point description of the revisions

      Response to Reviewer #1's Comments

      Evidence, reproducibility and clarity:

      Major comments

      1. This study leaves out lipid metabolism as a major energy metabolism pathway relevant to AD. The authors themselves cite the significance of acylcarnitines and CPT1A in AD (pg. 3, lines 32-33, pg. 4, lines 1-2). Lipid metabolism and homeostasis are known to be disrupted in AD1. Fatty acid oxidation is a known energy source in the prefrontal cortex2 and will also generate acetyl-CoA, which this study reveals is a significantly decreased metabolite in AD. Furthermore, sphingomyelin emerges as one of the major decreased DEMs as well. Thus, lipid metabolism should be highlighted in Figure 3 and discussed throughout the manuscript; otherwise its omission should be clearly stated and justified.

      We appreciate the reviewer's insightful comment regarding the critical role of lipid metabolism in AD. We recognize that lipid metabolism is a metabolic pathway deeply involved in AD pathology (Baloni et al., 2022, 2020; Varma et al., 2021). Accordingly, we have revised the Limitations section to more strongly emphasize its role as a vital energy source (pg. 13, lines 15-17). Regarding the visualization of lipid metabolism, we extracted lipid-related pathways from the trans-omic network but found that the regulatory relationships among DEPs and DEMs were excessively complex and interconnected. Thus, interpreting this regulatory network seemed to be more challenging than for the other energy production pathways presented in our manuscript. Therefore, we have concluded that the pathway analysis in our trans-omic network may not be suitable for deeply elucidating the lipid dysregulation in AD. We have added a statement acknowledging this as a limitation of our current methodology in the revised manuscript (pg. 13, lines 13-22).

      2. The covariates used for differential analysis should be discussed and justified. Notably, age is used as a covariate for transcriptomic analysis but not for proteomic and metabolomic analysis, with no justification. Additionally, given the known importance of lipid metabolism in AD and the putative role of APOE in lipid homeostasis3, APOE genetic status should be considered as a covariate, or its omission should be justified.

      We appreciate the reviewer's comment regarding the included covariates in differential analyses of our study. The reason we did not include other variables, such as age at death and RIN, is that these data were not available for each sample. Thus, we referred to the original research articles from which proteomic or metabolomic datasets used in our study were derived. Regarding the metabolomic dataset, in the original article (Batra et al., 2023), only two metabolites, 1-methyl-5-imidazoleacetate and N6-carboxymethyllysine, were significantly associated with age. In addition, no metabolites were significantly associated with sex, BMI, and years of education. Regarding the proteomic dataset, in the original article (Johnson et al., 2020), age at death, PMI, and sex were included as covariates in the analyses, though these variables were not found to strongly influence the data (Extended Data Fig.2 in (Johnson et al., 2020)).

      3. The authors make a concluding statement that suggests intervention: "Collectively, our data suggests that preserving or improving the ability to produce ATP and early intervention in the process of nitrogen metabolism are candidates for the prevention and treatment of dementia" (pg. 12, lines 12-14). This claim is not well-supported by the evidence provided in the study. There are a few limitations: (a) This was an observational, not interventional study; (b) The study did not establish whether the metabolic disruptions are causes or effects in AD; and (c) ATP or other bioenergetic indicators were not directly measured. Therefore, any statements about potential interventions should be removed or qualified as highly speculative.

      We agree with the reviewer that the statement regarding potential interventions was not sufficiently supported by our analyses. Accordingly, we have removed the sentence regarding prevention and treatment from the revised manuscript (e.g., we have deleted the final paragraph of the previous manuscript).

      4. In conjunction with the last point, the main conclusion of the study is that energy production is down in AD. The data presented in Figure 3 are consistent with this conclusion, but it is far from definitive due to limitations stated above in comments 3a and 3b. The authors should offer additional support for this conclusion: experimental follow-up, flux modeling, analysis of alternative datasets with ATP measurement, causal inference.

      We sincerely thank the reviewer for this valuable and constructive suggestion. Regarding flux modeling, we agree that metabolic flux analysis could provide important mechanistic insight. Indeed, previous studies have applied flux modeling in the context of lipid metabolism in Alzheimer's disease (Baloni et al., 2022). We also attempted to perform flux modeling focusing on energy metabolism. However, we found it difficult to obtain biologically meaningful and robust results and therefore decided not to include these analyses in the current manuscript.

      With respect to ATP measurements, we fully agree that direct evidence of altered ATP levels would further strengthen our conclusion. However, to the best of our knowledge, there are currently no publicly available large-scale datasets that directly measure ATP levels in human postmortem brain tissues. This limitation makes it challenging to incorporate validation in the present study.

      Regarding experimental follow-up, we agree that functional validation is essential to confirm the mechanistic implications of our findings. We are actively considering follow-up experimental studies. However, we consider the present work to be a multi-omic integrative analysis aimed at identifying key molecular alterations and generating biologically important hypotheses. We have revised the Limitations section to more clearly position this manuscript as an observational systems-level analysis (pg. 13, lines 20-22).

      5. The validation analysis did not sufficiently show the generalizability of this study's results. The authors demonstrated a correlation of 0.53 to the MSBB transcriptomics data and 0.60 to the AMP-AD DiverseCohorts proteomics data. Beyond these correlation coefficients, no meaningful comparison between the datasets is offered. How concordant are the differentially expressed features (or pathways) between the datasets? How robust would the trans-omic network be if incorporating the alternate datasets? Is the main conclusion (energy metabolism is down in AD) supported by the validation datasets? We think this analysis should be expanded and described in the main text. Although the results for external metabolomics datasets are reported in Fig S2C, correlation coefficients with the external data are not reported. The authors state, "Note that each study used different definitions for AD and CT groups, had variations in measurement methods and brain regions analyzed." We appreciate these limitations. However, the external data should be re-analyzed using the same definitions of AD and CT, if possible. The limitations and results (which DEMs are shared between datasets) should be discussed in the main text.

      We thank the reviewer for this important comment regarding the generalizability of our findings. In the revised manuscript, we have expanded the validation analyses and summarized the results in Figure S2. First, at the transcriptomic level, Figure S2B and S2C show the overlap between up- and downregulated genes in AD identified in our ROSMAP-derived analyses and those reported in a previously published large-scale meta-analysis of 2,114 postmortem samples across seven brain regions (Wan et al., 2020). A substantial proportion of DEGs were shared, supporting cross-cohort and cross-region robustness to some extent. At the proteomic level, Figure S2E shows a comparison between the ROSMAP and the AMP-AD DiverseCohorts datasets. We highlighted the subset of enzymes involved in the energy metabolism analysis shown in Fig. 3 and calculated a separate correlation coefficient for this subset (Pearson coefficient = 0.86, p-value = 1.5e-7), further supporting our main conclusion. In addition, to assess the concordance between the two datasets in a threshold-independent manner, we performed Rank-Rank Hypergeometric Overlap (RRHO) analysis (Figure S2E). RRHO analysis (Cahill et al., 2018; Plaisier et al., 2010) enables the comparison of ranked protein lists without relying on arbitrary differential expression cutoffs and has been used for cross-dataset comparison in several previous studies (Fröhlich et al., 2024; Maitra et al., 2023). The RRHO heatmaps demonstrated significant enrichment in the concordant quadrants, confirming systematic agreement between datasets beyond simple correlation coefficients. For metabolomics, Figure S2G shows RRHO analyses comparing the ROSMAP metabolomic data with other datasets measured by the same UPLC-MS/MS platform (Batra et al., 2024; Novotny et al., 2023), demonstrating significant concordance in ranked metabolite changes in AD.
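
      The idea behind an RRHO comparison can be sketched as follows. This is a minimal illustration using only the Python standard library, not the pipeline used in the manuscript: the function names, the step size, and the toy rankings are assumptions, and the published RRHO packages add multiple-testing correction and separate handling of the discordant quadrants.

```python
from math import comb, log10


def hypergeom_sf(k, population, successes, draws):
    """Upper tail P(X >= k) of a hypergeometric distribution."""
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, x) * comb(population - successes, draws - x)
        for x in range(k, upper + 1)
    )
    return tail / total


def rrho_matrix(ranked_a, ranked_b, step=1):
    """-log10 enrichment p-values for the overlap of top-i(A) and top-j(B).

    Both inputs are lists of the same feature identifiers, each ordered
    from most up-regulated to most down-regulated in its own dataset.
    """
    if set(ranked_a) != set(ranked_b):
        raise ValueError("both rankings must contain the same features")
    n = len(ranked_a)
    cuts = range(step, n, step)
    matrix = []
    for i in cuts:
        top_a = set(ranked_a[:i])
        row = []
        for j in cuts:
            overlap = len(top_a & set(ranked_b[:j]))
            row.append(-log10(hypergeom_sf(overlap, n, i, j)))
        matrix.append(row)
    return matrix


# Hypothetical rankings of ten features (illustrative only).
features = list("abcdefghij")
concordant = rrho_matrix(features, features)
discordant = rrho_matrix(features, features[::-1])
print(concordant[4][4], discordant[4][4])
```

      In this sketch, high values along the diagonal of the matrix indicate that the two rankings agree at matching rank thresholds, which is the pattern a concordant-quadrant enrichment in an RRHO heatmap reflects.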

      6. The glycolysis analysis and discussion need more development. Glycolysis and gluconeogenesis share many of the same enzymes, but they are not the same pathway and should not be discussed as such. To make a claim about the overall influence of enzyme and metabolite levels on glycolysis, the authors should focus on the energetically committing steps of glycolysis (hexokinase, phosphofructokinase, pyruvate kinase) in Figure 3A, and include the full/current version of the figure in the supplement. Gluconeogenesis-specific enzymes (pyruvate carboxylase, PEPCK) are not mentioned at all - are they among the DEPs/DEGs?

      We appreciate the reviewer's comment regarding the distinction between the glycolysis and gluconeogenesis pathways. Among the gluconeogenesis-specific enzyme proteins, G6PC1, FBP1, PC, and PCK2 were measured in our dataset, but none of them were identified as DEPs. In addition, gluconeogenesis is a process that occurs primarily in the liver and kidney rather than the brain. Given this biological context and the lack of significant changes in relevant enzymes, we have revised the terminology throughout the manuscript, replacing "glycolysis/gluconeogenesis pathway" with "glycolysis pathway" in the revised version.

      7. Given that there wasn't good concordance between the DEGs and DEPs, did including the mRNA and transcription factor layers in the network really add anything useful? It seems like the main conclusions of the manuscript were driven by the protein and metabolite layers only. How many of the DE metabolic enzymes were coregulated at the transcript and protein level? It would be useful to include the 5-layer trans-omic network in the supplement to display these results. Given your network, at what level does it appear that energy metabolism is regulated?

      It is true that our primary conclusion regarding the regulation of energy metabolism is driven by the changes in protein and metabolite abundance. However, we consider the low concordance between mRNA and protein expression itself to be an important feature of AD pathology, as also reported in previous studies (Johnson et al., 2022; Tasaki et al., 2022). Although we did not perform a further analysis of this discordance, we believe that including the TF and mRNA layers in the metabolic trans-omic network strengthens a system-wide view of metabolic dysregulation in AD.

      Regarding the mRNA changes corresponding to the DEP enzymes, please refer to Figure S7A.

      8. Comment further on the results from Figure 2D. What can be learned from identifying metabolites with the greatest degree centrality? What pathways other than energy metabolism are highlighted by the trans-omic network?

      We assume that some energetic indicators, including AMP and acetyl-CoA, and nitrogen metabolism-related metabolites, Glu, 2-oxoglutarate, and urea, can be potential key regulators of dysregulated metabolism in AD.

      9. (Suggestion) We suggest the authors leverage their trans-omic network in additional ways beyond giving a snapshot of a few energy metabolism pathways. The analysis of top DEMs could go further. What pathways are impacted beyond energy metabolism? Among the metabolic reactions allosterically regulated by top DEMs, what metabolic pathways are enriched?

      We identified the enriched metabolic pathways that were allosterically regulated by DEMs in AD using Fisher's exact test. Alanine, aspartate, and glutamate metabolism pathways were significantly enriched in 2-oxoglutarate, glutarate, alanine, and glutamate-regulating metabolic reactions. Arginine and proline metabolism pathway was enriched in N-methyl-L-arginine and putrescine-regulating metabolic reactions. Arginine biosynthesis pathway was enriched in arginine-regulating metabolic reactions. Glycerophospholipid metabolism pathway was enriched in CDP-ethanolamine-regulating metabolic reactions. Glycine, serine, and threonine metabolism pathway was enriched in serine-regulating metabolic reactions. Purine metabolism pathway was enriched in AMP-regulating metabolic reactions. Pyrimidine metabolism pathway was enriched in deoxyuridine and thymidine-regulating metabolic reactions. Sphingolipid metabolism pathway was enriched in sphingosine-regulating metabolic reactions. However, this analysis did not yield sufficiently valuable insights into the regulatory relationships among biomolecules in AD. Thus, we did not include these results in the revised manuscript.
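
      For illustration, a pathway enrichment of this kind can be expressed as a one-sided Fisher's exact test on a 2x2 table, which is equivalent to a hypergeometric upper tail. The sketch below is an assumption-laden example and not the analysis code used for the manuscript: the function name and all counts are hypothetical.

```python
from math import comb


def fisher_enrichment_p(overlap, n_regulated, n_pathway, n_background):
    """One-sided Fisher's exact p-value for over-representation.

    overlap      : reactions regulated by the metabolite AND in the pathway
    n_regulated  : all reactions regulated by the metabolite
    n_pathway    : all reactions annotated to the pathway
    n_background : all reactions considered in the network
    """
    upper = min(n_regulated, n_pathway)
    total = comb(n_background, n_regulated)
    tail = sum(
        comb(n_pathway, x) * comb(n_background - n_pathway, n_regulated - x)
        for x in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 4 of the 6 reactions regulated by one metabolite
# fall in a 10-reaction pathway, out of 200 background reactions.
p = fisher_enrichment_p(overlap=4, n_regulated=6, n_pathway=10, n_background=200)
print(f"enrichment p-value: {p:.3g}")
```

      When many metabolite and pathway combinations are tested in this way, the resulting p-values would normally be adjusted for multiple comparisons before being interpreted.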

      10. (Suggestion) Figure 3 shows that most differential signal in AD points to lower energy production due to the combination of differentially expressed metabolites and enzymes, but we are not given much context about the strength of these among all the differential signals. We would suggest including volcano plots where the features of interest, i.e. DE enzymes and metabolites, are colored differently (or a similar figure).

      We thank the reviewer for this constructive suggestion. To provide better context regarding the importance of the differential signals, we have added volcano plots for mRNAs, proteins, and metabolites in Figure S4A, B, and C.

      11. (Suggestion) The PPI network could be better leveraged to understand metabolic changes in AD. If nodes are grouped into subnetworks (e.g. by Louvain / Leiden clustering) and tested for pathway enrichment, could you find functional subnetworks of coordinately up- and down-regulated metabolic enzymes? This could yield some pathways of interest beyond the energy metabolism pathways already highlighted.

      We appreciate the reviewer's suggestion to utilize the PPI network for subnetwork analysis. However, it is important to note that the proteomic dataset analyzed in this study is derived from the original work of (Johnson et al., 2020). In that paper, the authors already performed a Weighted Gene Co-expression Network Analysis (WGCNA) across several datasets to identify co-expressed modules and functional pathways.

      Given this, we assumed that applying additional clustering methods to the same dataset would be unlikely to yield significant biological insights beyond the established findings.

      Minor comments

      12. "All genes" and "all metabolites" should not be the background for the proteomic and metabolic pathway enrichment analysis by Metascape and MetaboAnalyst. The background should be limited to the proteins and metabolites that were measured.

      We fully agree with the reviewer that using "all genes" or "all metabolites" as a background is not suitable for enrichment analyses. As suggested, we have revised the enrichment analyses using the measured proteins and metabolites as a background in both Metascape and MetaboAnalyst (Fig. S4D).

      13. Highlight the metabolic enzymes in Fig S2B. Calculate a separate correlation coefficient for the enzymes extracted in the energy metabolism analysis from Fig 3.

      We appreciate the reviewer's suggestion to refine the correlation analysis. As requested, we have revised Fig. S2D to explicitly highlight the subset of enzymes involved in the energy metabolism analysis shown in Fig. 3. We calculated a separate correlation coefficient for the subset (Pearson coefficient = 0.86, p-value = 1.5e-7).
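
      A subset-specific correlation of this kind can be sketched as below. The example is illustrative only: the dictionaries of log2 fold changes, the protein names chosen for the subset, and the function names are hypothetical and do not reproduce the values reported above.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def subset_correlation(log2fc_a, log2fc_b, subset):
    """Correlate log2FC values for the features in `subset` present in both datasets."""
    shared = [f for f in subset if f in log2fc_a and f in log2fc_b]
    return pearson_r([log2fc_a[f] for f in shared],
                     [log2fc_b[f] for f in shared])


# Hypothetical log2 fold changes (AD vs. CT) in two cohorts.
discovery = {"CS": -0.20, "IDH3A": -0.15, "SDHA": -0.10, "GFAP": 0.40, "OXCT1": -0.25}
validation = {"CS": -0.18, "IDH3A": -0.12, "SDHA": -0.05, "GFAP": 0.10, "OXCT1": -0.30}
energy_enzymes = ["CS", "IDH3A", "SDHA", "OXCT1"]  # assumed subset for the example

print(subset_correlation(discovery, validation, discovery.keys()))
print(subset_correlation(discovery, validation, energy_enzymes))
```

      Reporting the coefficient for the full set alongside the subset makes it clear whether agreement between cohorts is driven by the pathway of interest or by the dataset as a whole.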

      14. Use a multiple hypothesis adjusted p-value or q-value in Figure S3.

      We agree with the reviewer regarding the necessity of correcting for multiple comparisons. Accordingly, we have revised Fig. S4D using q-values.
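
      One common way to obtain adjusted values of this kind is the Benjamini-Hochberg procedure, sketched below. The response does not state which correction was applied, so the choice of method here is an assumption, and the function name and p-values are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical pathway enrichment p-values.
raw = [0.01, 0.04, 0.03, 0.005]
print(bh_adjust(raw))
```

      Thresholding the adjusted values at a chosen level (for example 0.05) controls the expected false discovery rate across all tested pathways rather than the error rate of each test individually.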

      15. Describe the methods used to calculate the logFC values from the validation dataset.

      We have revised the Methods to include a detailed description of the procedure used to calculate the log2FC values for the validation datasets (pg. 21, lines 13-15).

      16. It is difficult to read Figure 3. We would recommend really emphasizing to the reader to refer to Fig S7B as a "key" to this figure. The description of the red/blue arrows and nodes in the methods section (pg. 24, lines 21-36, pg 25, lines 1-4) was also helpful, but very lengthy. We recommend putting an abridged version of this description into the Fig S7 figure legend.

      We appreciate the feedback regarding the readability of Fig. 3. As recommended, we have revised the manuscript to explicitly direct readers to Fig. S8B as an essential "key" for interpreting the network visualization (pg. 8, line 28). Furthermore, we have added an abridged description of the network elements to the legend of Fig. S8B.

      17. The S7 figure legend should refer to panels A and B, not E and F.

      We apologize for this oversight. We have corrected the legend of Fig. S8.

      18. (Suggestion) Are any of the differentially expressed metabolites allosteric regulators of the DE transcription factors? This could be interesting to discuss.

      We appreciate the reviewer's insightful suggestion about the potential allosteric regulation of the DETFs by DEMs. We conducted an extensive literature search to identify any reports related to this perspective. However, to the best of our knowledge, no such direct interactions have been reported to date.

      Significance:

      The study's strength lies in leveraging three omics modalities across large patient cohorts (n ~ 150-240) to identify coherent signals between transcriptomics, proteomics, and metabolomics in postmortem DLPFC tissue. It was encouraging to see that the main result, showing downregulation for TCA, oxidative phosphorylation, and ketone body metabolism, emerged from consistent signals across both proteomics and metabolomics. This result was consistent with previous findings in other models cited by the author4,5 and other studies 6,7 demonstrating deficiency in energy-producing pathways in AD. Another strength of the study is the application of thoughtful methodology to connect differentially expressed proteins and metabolites via an intermediate data layer of metabolic reactions. The authors leverage the KEGG and BRENDA databases and apply sound logic to estimate the effects of enzyme level and metabolite level on pathway activity, with metabolites serving as substrate, product, or allosteric regulator for reactions. This trans-omic network methodology was developed in previous studies cited by the author8,9. However, as written, this study is limited in its contribution of new knowledge to the AD research field. The main conclusion (energy production is down in AD, due to regulatory disruption of energy metabolism) is not strongly supported (see comments 1, 3, and 4 for elaboration). The evidence could be improved by orthogonal approaches: further experimentation, further integration of external datasets, causal modeling, or flux modeling. Alternatively, even in the absence of new experimental and computational approaches, the story could be made more complete by further leveraging the trans-omic network to provide insights into (a) the regulation of energy metabolism; and (b) the impacts of key disrupted metabolites (see comments 7-9). The study is also limited in demonstrating the power of these methodologies to provide integrative insights. As mentioned above, the integration of enzyme levels and metabolite levels is clearly useful (Figure 3). In contrast, the utility of the mRNA and transcription factor layers was not evident. The study did not appear to improve or expand upon trans-omic network methodology described in the previous works. Finally, the various analyses (analyzing the trans-omic network for nodes with the highest degree centrality, the PPI analysis, and viewing the energy metabolism pathways in the network) provided disparate results that were only tenuously connected in the discussion section.


      Response to Reviewer #2's Comments

      Evidence, reproducibility and clarity: Summary

      This manuscript integrates public transcriptomic, proteomic, and metabolomic datasets from ROSMAP DLPFC samples to construct a multi-layer metabolic trans-omic network in Alzheimer's disease. By linking transcription factors, enzyme mRNAs, proteins, metabolic reactions, and metabolites, the authors report coordinated downregulation of the TCA cycle, oxidative phosphorylation, and ketone body metabolism, along with mixed regulatory signals in glycolysis/gluconeogenesis. They interpret these patterns as indicative of broad energetic dysfunction and alterations in amino-acid/nitrogen metabolism in AD. While the framework is conceptually appealing, much of the analysis remains descriptive, and several biological interpretations extend beyond what the data can robustly support. The reliance on bulk tissue without accounting for cell-type composition, limited covariate adjustment, and the absence of validation or sensitivity analyses reduce confidence in the mechanistic conclusions. Overall, the study provides a preliminary systems-level overview, but additional rigor is needed before the proposed trans-omic regulatory insights can be considered convincing.

      Major Comments

      1. Interpretation requires more cautious phrasing, and validation is essential. The manuscript frequently asserts that specific pathways are "inhibited" or that energetic deficits are "compensated," but these conclusions extend beyond what the descriptive, bulk-level data can support. Because no metabolic flux, causality, or direct functional measurements are included, the results should be framed as putative regulatory shifts, not confirmed impairments. Critically, key claims about pathway inhibition would require flux modeling, perturbation analyses, or experimental validation to be convincing. Without such validation, the mechanistic interpretations remain speculative.

      We thank the reviewer for this crucial comment. We fully agree that, given the descriptive and bulk-level nature of our analysis, mechanistic interpretations must be made with caution. In the absence of direct metabolic flux measurements or experimental validation, our findings should be interpreted as putative regulatory shifts rather than confirmed functional impairments. Accordingly, we have revised the manuscript to temper mechanistic claims. We have replaced definitive statements with more speculative phrasing (e.g., "Our analysis revealed a putative coordinated downregulation ..." instead of "Our analysis revealed a coordinated downregulation ..." in Abstract section; "we demonstrate the systems-level view of the potential dysregulated energy production ..." instead of "we demonstrate the systems-level view of the dysregulated energy production ..." in pg. 10, lines 25-26).

      Although the authors acknowledge this in the limitations, bulk-level differences may primarily reflect altered proportions of neurons, astrocytes, microglia, and oligodendrocytes rather than true within-cell-type regulation. Incorporating a cell-type deconvolution or performing a sensitivity analysis would substantially improve interpretability. This issue also impacts the trans-omic network: if the molecules included originate from different cell types, the inferred regulatory relationships may not reflect true intracellular processes.

      We appreciate the reviewer's point that bulk-level differences can reflect altered proportions of different brain cell types, which would in turn affect the inferred trans-omic network. To assess changes in the cell type proportions of the samples used in our study, we additionally used public single-cell transcriptomic datasets obtained from DLPFC tissue of 465 subjects in the ROSMAP cohort (Green et al., 2024). For each omic dataset used in our analyses, we matched the same subjects and calculated the proportions of the following cell types: astrocytes, excitatory neurons, inhibitory neurons, microglia, oligodendrocytes, and OPCs. We then statistically compared the cell type proportions between control subjects and patients with AD (Fig. S3). In the transcriptomic data, we confirmed that the proportion of inhibitory neurons in the AD group was smaller than in the CT group, and that the proportion of oligodendrocytes in the AD group was larger than in the CT group. In the proteomic data, we did not observe any statistically significant changes in cell type proportions between the two groups. In the metabolomic data, we found that the proportion of inhibitory neurons in the AD group was smaller than in the CT group (pg. 6, lines 8-11).

      Differential analysis covariates. For the differential expression analyses, only gender and PMI were included as covariates. Additional variables, such as age at death, RIN, neuropathological measures, and comorbidities, can strongly influence molecular profiles and should be considered to ensure that the observed differences reflect AD-related biology rather than confounding pathological or technical factors.

      We appreciate the reviewer's comment regarding the covariates included in the differential analyses of our study. We did not include other variables, such as age at death and RIN, because these data were not available for each sample. We therefore referred to the original research articles from which the proteomic and metabolomic datasets used in our study were derived. Regarding the metabolomic dataset, in the original article (Batra et al., 2023), only two metabolites, 1-methyl-5-imidazoleacetate and N6-carboxymethyllysine, were significantly associated with age. In addition, no metabolites were significantly associated with sex, BMI, or education. Regarding the proteomic dataset, in the original article, age at death, PMI, and sex were included as covariates in the analyses, though these variables were not found to strongly influence the data (Extended Data Fig. 2 in Johnson et al., 2020).

      Network stability and sample non-overlap. Proteomic, transcriptomic, and metabolomic data come from partially overlapping individuals. The authors should test whether the reconstructed network is robust to: different significance thresholds, restricting analyses to overlapping samples and alternative definitions of AD vs control.

      We appreciate the reviewer's comment on the stability of the trans-omic network. In our study, the number of individuals for whom all omic modalities were measured was relatively small (n=25 in CT and n=35 in AD). This limited overlap reduces statistical power and can affect the downstream network construction. We have acknowledged this limitation in the revised manuscript and clarified that the reconstructed networks should be interpreted with caution regarding reproducibility and generalizability (pg. 13, lines 13-23).

      Minor Comments

      1. Some TF enrichment and regulatory inferences lack explicit mention of multiple-testing correction.

      We apologize for the lack of clarity in our original description. Multiple-testing correction was applied in the TF inference. We have revised the Methods section to explicitly describe the correction method used and the threshold applied (pg. 23, lines 23-24).
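
      For illustration only: the correction method actually used is the one described in the revised Methods and is not restated here. A minimal sketch of one widely used procedure, Benjamini-Hochberg false discovery rate control, is given below. The p-values and the 0.05 threshold are hypothetical and are not taken from the manuscript.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical enrichment p-values for five transcription factors.
p_values = [0.001, 0.008, 0.039, 0.041, 0.20]
q_values = benjamini_hochberg(p_values)
# Hypothetical threshold: keep TFs with adjusted p-value <= 0.05.
significant = [q <= 0.05 for q in q_values]
```

      In this toy input, two of the four raw p-values below 0.05 no longer pass after adjustment, which is why reporting the correction and threshold explicitly matters.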

      The limitations section is strong but should explicitly discuss the influence of postmortem interval on metabolite levels.

      We appreciate the reviewer's comment about the effect of postmortem interval on changes in metabolite levels. Accordingly, we have added the description of this perspective in our revised manuscript (pg. 13, lines 1-5).

      Reviewer #2 (Significance (Required)):

      Significance

      The study extends a trans-omic integration framework, originally applied to metabolic disease, into the context of Alzheimer's pathology. Although the biological findings largely confirm known alterations in mitochondrial and energy metabolism, the network-based approach offers a structured way to view cross-layer regulatory changes. Its main advance is conceptual rather than biological, providing a unified framework rather than uncovering fundamentally new mechanisms. This work will primarily interest researchers in neurodegeneration and systems biology, as well as computational groups developing multi-omics integration methods.

      Response to Reviewer #3's Comments


      Evidence, reproducibility and clarity

      This study leverages existing transcriptomic, metabolomic and proteomic datasets from prefrontal cortex (PFC) to assess metabolic dysregulation in Alzheimer's disease (AD). They found a downregulation of multiple metabolic pathways, including TCA cycle, oxidative phosphorylation, and ketone metabolism, that may explain bioenergetic alterations in AD. The study used matching ROSMAP omics datasets from the DLPFC that have allowed more robust data integration. However, the datasets are all generated using bulk tissue, which makes data interpretation difficult. For example, the AD changes they observed may be due to shifts in cell type proportion with disease (e.g. cell death, neuron inflammation). Did the authors account for any potential shifts in cell type proportion in their analysis?

      If the assumption is that the changes in AD are cell intrinsic, which cell types are likely to be impacted? Can the authors integrate any existing single-cell analysis to infer which cell types may be driving the signals they detect, and whether this accounts for some of the antagonistic regulatory effects that were detected?

      We thank the reviewer for their insightful comments. We agree that the use of bulk tissue datasets cannot account for cell-type heterogeneity. As noted in our Limitations section (pg. 12, lines 24-27), we recognize that previous studies have found that the Braak stage is correlated positively with microglia and astrocyte proportions and negatively with oligodendrocyte proportion (Hannon et al., 2024; Shireby et al., 2022). Regarding the integration of single-cell analysis, we have referenced recent snRNA-seq findings (Mathys et al., 2024) in our Limitations section (pg. 12, lines 28-32) to deconvolve our bulk signatures.

      Furthermore, in our revised manuscript, we additionally used public single-cell transcriptomic datasets obtained from DLPFC tissue of 465 subjects in the ROSMAP cohort (Green et al., 2024). For each omic dataset used in our analyses, we matched the same subjects and calculated the proportions of the following cell types: astrocytes, excitatory neurons, inhibitory neurons, microglia, oligodendrocytes, and OPCs. We then statistically compared the cell type proportions between control subjects and patients with AD (Fig. S3). In the transcriptomic data, we confirmed that the proportion of inhibitory neurons in the AD group was smaller than in the CT group, and that the proportion of oligodendrocytes in the AD group was larger than in the CT group. In the proteomic data, we did not observe any statistically significant changes in cell type proportions between the two groups. In the metabolomic data, we found that the proportion of inhibitory neurons in the AD group was smaller than in the CT group (pg. 6, lines 8-11).

      Significance

      The manuscript provides multimodal insight into metabolic dysregulation in AD in the PFC. Given that metabolic dysfunction is likely to play a major role in disease pathogenesis, this is a study of importance. However, the findings lack granularity at the cell type level, which limits the impact of the study.

      References

      1. Baloni, P., Arnold, M., Buitrago, L., Nho, K., Moreno, H., Huynh, K., Brauner, B., Louie, G., Kueider-Paisley, A., Suhre, K., Saykin, A. J., Ekroos, K., Meikle, P. J., Hood, L., Price, N. D., Alzheimer's Disease Metabolomics Consortium, Doraiswamy, P. M., Funk, C. C., Hernández, A. I., ... Kaddurah-Daouk, R. (2022). Multi-Omic analyses characterize the ceramide/sphingomyelin pathway as a therapeutic target in Alzheimer's disease. Communications Biology, 5(1), 1074.
      2. Baloni, P., Funk, C. C., Yan, J., Yurkovich, J. T., Kueider-Paisley, A., Nho, K., Heinken, A., Jia, W., Mahmoudiandehkordi, S., Louie, G., Saykin, A. J., Arnold, M., Kastenmüller, G., Griffiths, W. J., Thiele, I., Alzheimer's Disease Metabolomics Consortium, Kaddurah-Daouk, R., & Price, N. D. (2020). Metabolic Network Analysis Reveals Altered Bile Acid Synthesis and Metabolism in Alzheimer's Disease. Cell Reports. Medicine, 1(8), 100138.
      3. Batra, R., Arnold, M., Wörheide, M. A., Allen, M., Wang, X., Blach, C., Levey, A. I., Seyfried, N. T., Ertekin-Taner, N., Bennett, D. A., Kastenmüller, G., Kaddurah-Daouk, R. F., Krumsiek, J., & Alzheimer's Disease Metabolomics Consortium (ADMC). (2023). The landscape of metabolic brain alterations in Alzheimer's disease. Alzheimer's & Dementia: The Journal of the Alzheimer's Association, 19(3), 980-998.
      4. Batra, R., Krumsiek, J., Wang, X., Allen, M., Blach, C., Kastenmüller, G., Arnold, M., Ertekin-Taner, N., Kaddurah-Daouk, R., & Alzheimer's Disease Metabolomics Consortium (ADMC). (2024). Comparative brain metabolomics reveals shared and distinct metabolic alterations in Alzheimer's disease and progressive supranuclear palsy. Alzheimer's & Dementia: The Journal of the Alzheimer's Association, 20(12), 8294-8307.
      5. Cahill, K. M., Huo, Z., Tseng, G. C., Logan, R. W., & Seney, M. L. (2018). Improved identification of concordant and discordant gene expression signatures using an updated rank-rank hypergeometric overlap approach. Scientific Reports, 8(1), 9588.
      6. Fröhlich, A. S., Gerstner, N., Gagliardi, M., Ködel, M., Yusupov, N., Matosin, N., Czamara, D., Sauer, S., Roeh, S., Murek, V., Chatzinakos, C., Daskalakis, N. P., Knauer-Arloth, J., Ziller, M. J., & Binder, E. B. (2024). Single-nucleus transcriptomic profiling of human orbitofrontal cortex reveals convergent effects of aging and psychiatric disease. Nature Neuroscience, 27(10), 2021-2032.
      7. Green, G. S., Fujita, M., Yang, H.-S., Taga, M., Cain, A., McCabe, C., Comandante-Lou, N., White, C. C., Schmidtner, A. K., Zeng, L., Sigalov, A., Wang, Y., Regev, A., Klein, H.-U., Menon, V., Bennett, D. A., Habib, N., & De Jager, P. L. (2024). Cellular communities reveal trajectories of brain ageing and Alzheimer's disease. Nature, 633(8030), 634-645.
      8. Hannon, E., Dempster, E. L., Davies, J. P., Chioza, B., Blake, G. E. T., Burrage, J., Policicchio, S., Franklin, A., Walker, E. M., Bamford, R. A., Schalkwyk, L. C., & Mill, J. (2024). Quantifying the proportion of different cell types in the human cortex using DNA methylation profiles. BMC Biology, 22(1), 17.
      9. Johnson, E. C. B., Carter, E. K., Dammer, E. B., Duong, D. M., Gerasimov, E. S., Liu, Y., Liu, J., Betarbet, R., Ping, L., Yin, L., Serrano, G. E., Beach, T. G., Peng, J., De Jager, P. L., Haroutunian, V., Zhang, B., Gaiteri, C., Bennett, D. A., Gearing, M., ... Seyfried, N. T. (2022). Large-scale deep multi-layer analysis of Alzheimer's disease brain reveals strong proteomic disease-related changes not observed at the RNA level. Nature Neuroscience, 25(2), 213-225.
      10. Johnson, E. C. B., Dammer, E. B., Duong, D. M., Ping, L., Zhou, M., Yin, L., Higginbotham, L. A., Guajardo, A., White, B., Troncoso, J. C., Thambisetty, M., Montine, T. J., Lee, E. B., Trojanowski, J. Q., Beach, T. G., Reiman, E. M., Haroutunian, V., Wang, M., Schadt, E., ... Seyfried, N. T. (2020). Large-scale proteomic analysis of Alzheimer's disease brain and cerebrospinal fluid reveals early changes in energy metabolism associated with microglia and astrocyte activation. Nature Medicine, 26(5), 769-780.
      11. Maitra, M., Mitsuhashi, H., Rahimian, R., Chawla, A., Yang, J., Fiori, L. M., Davoli, M. A., Perlman, K., Aouabed, Z., Mash, D. C., Suderman, M., Mechawar, N., Turecki, G., & Nagy, C. (2023). Cell type specific transcriptomic differences in depression show similar patterns between males and females but implicate distinct cell types and genes. Nature Communications, 14(1), 2912.
      12. Mathys, H., Boix, C. A., Akay, L. A., Xia, Z., Davila-Velderrain, J., Ng, A. P., Jiang, X., Abdelhady, G., Galani, K., Mantero, J., Band, N., James, B. T., Babu, S., Galiana-Melendez, F., Louderback, K., Prokopenko, D., Tanzi, R. E., Bennett, D. A., Tsai, L.-H., & Kellis, M. (2024). Single-cell multiregion dissection of Alzheimer's disease. Nature, 632(8026), 858-868.
      13. Novotny, B. C., Fernandez, M. V., Wang, C., Budde, J. P., Bergmann, K., Eteleeb, A. M., Bradley, J., Webster, C., Ebl, C., Norton, J., Gentsch, J., Dube, U., Wang, F., Morris, J. C., Bateman, R. J., Perrin, R. J., McDade, E., Xiong, C., Chhatwal, J., ... Harari, O. (2023). Metabolomic and lipidomic signatures in autosomal dominant and late-onset Alzheimer's disease brains. Alzheimer's & Dementia: The Journal of the Alzheimer's Association, 19(5), 1785-1799.
      14. Plaisier, S. B., Taschereau, R., Wong, J. A., & Graeber, T. G. (2010). Rank-rank hypergeometric overlap: identification of statistically significant overlap between gene-expression signatures. Nucleic Acids Research, 38(17), e169.
      15. Shireby, G., Dempster, E. L., Policicchio, S., Smith, R. G., Pishva, E., Chioza, B., Davies, J. P., Burrage, J., Lunnon, K., Seiler Vellame, D., Love, S., Thomas, A., Brookes, K., Morgan, K., Francis, P., Hannon, E., & Mill, J. (2022). DNA methylation signatures of Alzheimer's disease neuropathology in the cortex are primarily driven by variation in non-neuronal cell-types. Nature Communications, 13(1), 5620.
      16. Tasaki, S., Xu, J., Avey, D. R., Johnson, L., Petyuk, V. A., Dawe, R. J., Bennett, D. A., Wang, Y., & Gaiteri, C. (2022). Inferring protein expression changes from mRNA in Alzheimer's dementia using deep neural networks. Nature Communications, 13(1), 655.
      17. Varma, V. R., Wang, Y., An, Y., Varma, S., Bilgel, M., Doshi, J., Legido-Quigley, C., Delgado, J. C., Oommen, A. M., Roberts, J. A., Wong, D. F., Davatzikos, C., Resnick, S. M., Troncoso, J. C., Pletnikova, O., O'Brien, R., Hak, E., Baak, B. N., Pfeiffer, R., ... Thambisetty, M. (2021). Bile acid synthesis, modulation, and dementia: A metabolomic, transcriptomic, and pharmacoepidemiologic study. PLoS Medicine, 18(5), e1003615.
      18. Wan, Y.-W., Al-Ouran, R., Mangleburg, C. G., Perumal, T. M., Lee, T. V., Allison, K., Swarup, V., Funk, C. C., Gaiteri, C., Allen, M., Wang, M., Neuner, S. M., Kaczorowski, C. C., Philip, V. M., Howell, G. R., Martini-Stoica, H., Zheng, H., Mei, H., Zhong, X., ... Logsdon, B. A. (2020). Meta-Analysis of the Alzheimer's Disease Human Brain Transcriptome and Functional Dissection in Mouse Models. Cell Reports, 32(2), 107908.
    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      This study leverages existing transcriptomic, metabolomic and proteomic datasets from prefrontal cortex (PFC) to assess metabolic dysregulation in Alzheimer's disease (AD). They found a downregulation of multiple metabolic pathways, including TCA cycle, oxidative phosphorylation, and ketone metabolism, that may explain bioenergetic alterations in AD.

      The study used matching ROSMAP omics datasets from the DLPFC that have allowed more robust data integration. However, the datasets are all generated using bulk tissue, which makes data interpretation difficult. For example, the AD changes they observed may be due to shifts in cell type proportion with disease (e.g. cell death, neuron inflammation). Did the authors account for any potential shifts in cell type proportion in their analysis?

      If the assumption is that the changes in AD are cell intrinsic, which cell types are likely to be impacted? Can the authors integrate any existing single-cell analysis to infer which cell types may be driving the signals they detect, and whether this accounts for some of the antagonistic regulatory effects that were detected?

      Significance

      The manuscript provides multimodal insight into metabolic dysregulation in AD in the PFC. Given that metabolic dysfunction is likely to play a major role in disease pathogenesis, this is a study of importance. However, the findings lack granularity at the cell type level, which limits the impact of the study.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      This manuscript integrates public transcriptomic, proteomic, and metabolomic datasets from ROSMAP DLPFC samples to construct a multi-layer metabolic trans-omic network in Alzheimer's disease. By linking transcription factors, enzyme mRNAs, proteins, metabolic reactions, and metabolites, the authors report coordinated downregulation of the TCA cycle, oxidative phosphorylation, and ketone body metabolism, along with mixed regulatory signals in glycolysis/gluconeogenesis. They interpret these patterns as indicative of broad energetic dysfunction and alterations in amino-acid/nitrogen metabolism in AD. While the framework is conceptually appealing, much of the analysis remains descriptive, and several biological interpretations extend beyond what the data can robustly support. The reliance on bulk tissue without accounting for cell-type composition, limited covariate adjustment, and the absence of validation or sensitivity analyses reduce confidence in the mechanistic conclusions. Overall, the study provides a preliminary systems-level overview, but additional rigor is needed before the proposed trans-omic regulatory insights can be considered convincing.

      Major Comments

      1. Interpretation requires more cautious phrasing, and validation is essential. The manuscript frequently asserts that specific pathways are "inhibited" or that energetic deficits are "compensated," but these conclusions extend beyond what the descriptive, bulk-level data can support. Because no metabolic flux, causality, or direct functional measurements are included, the results should be framed as putative regulatory shifts, not confirmed impairments. Critically, key claims about pathway inhibition would require flux modeling, perturbation analyses, or experimental validation to be convincing. Without such validation, the mechanistic interpretations remain speculative.
      2. Although the authors acknowledge this in the limitations, bulk-level differences may primarily reflect altered proportions of neurons, astrocytes, microglia, and oligodendrocytes rather than true within-cell-type regulation. Incorporating a cell-type deconvolution or performing a sensitivity analysis would substantially improve interpretability. This issue also impacts the trans-omic network: if the molecules included originate from different cell types, the inferred regulatory relationships may not reflect true intracellular processes.
      3. Differential analysis covariates. For the differential expression analyses, only gender and PMI were included as covariates. Additional variables, such as age at death, RIN, neuropathological measures, and comorbidities, can strongly influence molecular profiles and should be considered to ensure that the observed differences reflect AD-related biology rather than confounding pathological or technical factors.
      4. Network stability and sample non-overlap. Proteomic, transcriptomic, and metabolomic data come from partially overlapping individuals. The authors should test whether the reconstructed network is robust to: different significance thresholds, restricting analyses to overlapping samples and alternative definitions of AD vs control.

      Minor Comments

      1. Some TF enrichment and regulatory inferences lack explicit mention of multiple-testing correction.
      2. The limitations section is strong but should explicitly discuss the influence of postmortem interval on metabolite levels.

      Significance

      The study extends a trans-omic integration framework, originally applied to metabolic disease, into the context of Alzheimer's pathology. Although the biological findings largely confirm known alterations in mitochondrial and energy metabolism, the network-based approach offers a structured way to view cross-layer regulatory changes. Its main advance is conceptual rather than biological, providing a unified framework rather than uncovering fundamentally new mechanisms. This work will primarily interest researchers in neurodegeneration and systems biology, as well as computational groups developing multi-omics integration methods.

      Reviewer expertise

      My background is in Alzheimer's disease, multi-omics integration, and computational systems biology. I am not a specialist in enzymology or allosteric regulation and therefore cannot fully evaluate the biochemical specificity of those annotations.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Major comments

      1. This study leaves out lipid metabolism as a major energy metabolism pathway relevant to AD. The authors themselves cite the significance of acylcarnitines and CPT1A in AD (pg. 3, lines 32-33, pg. 4, lines 1-2). Lipid metabolism and homeostasis are known to be disrupted in AD [1]. Fatty acid oxidation is a known energy source in the prefrontal cortex [2] and will also generate acetyl-CoA, which this study reveals is a significantly decreased metabolite in AD. Furthermore, sphingomyelin emerges as one of the major decreased DEMs as well. Thus, lipid metabolism should be highlighted in Figure 3 and discussed throughout the manuscript; otherwise its omission should be clearly stated and justified.
      2. The covariates used for differential analysis should be discussed and justified. Notably, age is used as a covariate for transcriptomic analysis but not proteomic and metabolomic analysis, with no justification. Additionally, given the known importance of lipid metabolism in AD and the putative role of APOE in lipid homeostasis [3], APOE genetic status should be considered as a covariate, or its omission should be justified.
      3. The authors make a conclusion statement that suggests intervention: "Collectively, our data suggests that preserving or improving the ability to produce ATP and early intervention in the process of nitrogen metabolism are candidates for the prevention and treatment of dementia" (pg. 12, lines 12-14). This claim is not well-supported by the evidence provided in the study. There are a few limitations: (a) This was an observational, not interventional study; (b) The study did not establish whether the metabolic disruptions are causes or effects in AD; and (c) ATP or other bioenergetic indicators were not directly measured. Therefore, any statements about potential interventions should be removed or qualified as highly speculative.
      4. In conjunction with the last point, the main conclusion of the study is that energy production is down in AD. The data presented in Figure 3 are consistent with this conclusion, but it is far from definitive due to the limitations stated above in comments 3a and 3b. The authors should offer additional support for this conclusion: experimental follow-up, flux modeling, analysis of alternative datasets with ATP measurement, or causal inference.
      5. The validation analysis did not sufficiently show the generalizability of this study's results. The authors demonstrated a correlation of 0.53 to the MSBB transcriptomics data and 0.60 to the AMP-AD DiverseCohorts proteomics data. Beyond these correlation coefficients, no meaningful comparison between the datasets is offered. How concordant are the differentially expressed features (or pathways) between the datasets? How robust would the trans-omic network be if incorporating the alternate datasets? Is the main conclusion (energy metabolism is down in AD) supported by the validation datasets? We think this analysis should be expanded and described in the main text.

      Although the results for external metabolomics datasets are reported in Fig S2C, correlation coefficients with the external data are not reported. The authors state, "Note that each study used different definitions for AD and CT groups, had variations in measurement methods and brain regions analyzed." We appreciate these limitations. However, the external data should be re-analyzed using the same definitions of AD and CT, if possible. The limitations and results (which DEMs are shared between datasets) should be discussed in the main text.

      6. The glycolysis analysis and discussion needs more development. Glycolysis and gluconeogenesis share many of the same enzymes, but they are not the same pathway and should not be discussed as such. To make a claim about the overall influence of enzyme and metabolite levels on glycolysis, the authors should focus on the energetically committing steps of glycolysis (hexokinase, phosphofructokinase, pyruvate kinase) in Figure 3A, and include the full/current version of the figure in the supplement. Gluconeogenesis-specific enzymes (pyruvate carboxylase, PEPCK) are not mentioned at all - are they among the DEPs/DEGs?
      7. Given that there wasn't good concordance between the DEGs and DEPs, did including the mRNA and transcription factor layers in the network really add anything useful? It seems like the main conclusions of the manuscript were driven by the protein and metabolite layers only. How many of the DE metabolic enzymes were coregulated at the transcript and protein level? It would be useful to include the 5-layer trans-omic network in the supplement to display these results. Given your network, at what level does it appear that energy metabolism is regulated?
      8. Comment further on the results from Figure 2D. What can be learned from identifying metabolites with the greatest degree centrality? What pathways other than energy metabolism are highlighted by the trans-omic network?
      9. (Suggestion) We suggest the authors leverage their trans-omic network in additional ways beyond giving a snapshot of a few energy metabolism pathways. The analysis of top DEMs could go further. What pathways are impacted beyond energy metabolism? Among the metabolic reactions allosterically regulated by top DEMs, what metabolic pathways are enriched?
      10. (Suggestion) Figure 3 shows that most differential signal in AD points to lower energy production due to the combination of differentially expressed metabolites and enzymes, but we are not given much context about the strength of these among all the differential signals. We would suggest including volcano plots where the features of interest, i.e. DE enzymes and metabolites, are colored differently (or a similar figure).
      11. (Suggestion) The PPI network could be better leveraged to understand metabolic changes in AD. If nodes are grouped into subnetworks (e.g. by Louvain / Leiden clustering) and tested for pathway enrichment, could you find functional subnetworks of coordinately up- and down- regulated metabolic enzymes? This could yield some pathways of interest beyond the energy metabolism pathways already highlighted.
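
      To make the quantity referred to in comment 8 concrete, the following is a minimal sketch of normalized degree centrality on a small undirected network. The node names and edges are a hypothetical toy example, not the manuscript's trans-omic network, and the authors' own implementation may differ.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Normalized degree centrality for an undirected graph given as edge pairs."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbors[u].add(v)
            neighbors[v].add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    # Degree divided by the maximum possible degree (n - 1).
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbors.items()}


# Hypothetical metabolite-reaction edges (R1..R3 are placeholder reactions).
edges = [
    ("acetyl-CoA", "R1"),
    ("acetyl-CoA", "R2"),
    ("acetyl-CoA", "R3"),
    ("citrate", "R1"),
    ("ATP", "R2"),
    ("ATP", "R3"),
]
centrality = degree_centrality(edges)
top_node = max(centrality, key=centrality.get)
```

      A high value only says that a metabolite participates in, or regulates, many reactions in the network; on its own it does not indicate direction or magnitude of an effect, which is why the comment asks what can be learned from it.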

      Minor comments

      1. "All genes" and "all metabolites" should not be the background for the proteomic and metabolic pathway enrichment analysis by Metascape and MetaboAnalyst. The background should be limited to the proteins and metabolites that were measured.
      2. Highlight the metabolic enzymes in Fig S2B. Calculate a separate correlation coefficient for the enzymes extracted in the energy metabolism analysis from Fig 3.
      3. Use a multiple hypothesis adjusted p-value or q-value in Figure S3.
      4. Describe the methods used to calculate the logFC values from the validation dataset.
      5. It is difficult to read Figure 3. We would recommend really emphasizing to the reader to refer to Fig S7B as a "key" to this figure. The description of the red/blue arrows and nodes in the methods section (pg. 24, lines 21-36, pg 25, lines 1-4) were also helpful, but very lengthy. We recommend putting an abridged version of this description into the Fig S7 figure legend.
      6. The S7 figure legend should refer to panels A and B, not E and F.
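
      As an illustration of minor comment 1, the sketch below computes a one-sided hypergeometric enrichment p-value and shows how the choice of background changes it. All counts are hypothetical, and Metascape and MetaboAnalyst may use different statistics internally; this is only meant to show the direction of the effect.

```python
from math import comb


def hypergeom_enrichment_p(background_size, pathway_in_background, hits, hits_in_pathway):
    """P(X >= hits_in_pathway) when `hits` items are drawn without replacement."""
    upper = min(hits, pathway_in_background)
    total = comb(background_size, hits)
    tail = sum(
        comb(pathway_in_background, k)
        * comb(background_size - pathway_in_background, hits - k)
        for k in range(hits_in_pathway, upper + 1)
    )
    return tail / total


# Hypothetical counts: a 30-member pathway, 100 DE proteins, 10 of them in the pathway.
pathway_size, n_de, n_de_in_pathway = 30, 100, 10

# Background = all genes (hypothetical 20,000) versus only measured proteins (hypothetical 3,000).
p_all_genes = hypergeom_enrichment_p(20000, pathway_size, n_de, n_de_in_pathway)
p_measured = hypergeom_enrichment_p(3000, pathway_size, n_de, n_de_in_pathway)
```

      With these toy numbers the p-value is smaller against the all-genes background than against the measured-only background, so an inflated background can overstate enrichment.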

      7. (Suggestion) Are any of the differentially expressed metabolites allosteric regulators of the DE transcription factors? This could be interesting to discuss.

      Significance

      The study's strength lies in leveraging three omics modalities across large patient cohorts (n ~ 150-240) to identify coherent signals between transcriptomics, proteomics, and metabolomics in postmortem DLPFC tissue. It was encouraging to see that the main result, showing downregulation for TCA, oxidative phosphorylation, and ketone body metabolism, emerged from consistent signals across both proteomics and metabolomics. This result was consistent with previous findings in other models cited by the authors [4,5] and other studies [6,7] demonstrating deficiency in energy-producing pathways in AD. Another strength of the study is the application of thoughtful methodology to connect differentially expressed proteins and metabolites via an intermediate data layer of metabolic reactions. The authors leverage the KEGG and BRENDA databases and apply sound logic to estimate the effects of enzyme level and metabolite level on pathway activity, with metabolites serving as substrate, product, or allosteric regulator for reactions. This trans-omic network methodology was developed in previous studies cited by the authors [8,9]. However, as written, this study is limited in its contribution of new knowledge to the AD research field. The main conclusion (energy production is down in AD, due to regulatory disruption of energy metabolism) is not strongly supported (see comments 1, 3, and 4 for elaboration). The evidence could be improved by orthogonal approaches: further experimentation, further integration of external datasets, causal modeling, or flux modeling. Alternatively, even in the absence of new experimental and computational approaches, the story could be made more complete by further leveraging the trans-omic network to provide insights into (a) the regulation of energy metabolism; and (b) the impacts of key disrupted metabolites (see comments 7-9). The study is also limited in demonstrating the power of these methodologies to provide integrative insights. As mentioned above, the integration of enzyme levels and metabolite levels is clearly useful (Figure 3). In contrast, the utility of the mRNA and transcription factor layers was not evident. The study did not appear to improve or expand upon the trans-omic network methodology described in the previous works. Finally, the various analyses (analyzing the trans-omic network for nodes with the highest degree centrality, the PPI analysis, and viewing the energy metabolism pathways in the network) provided disparate results that were only tenuously connected in the discussion section.

      References

      1. Yin F. Lipid metabolism and Alzheimer's disease: clinical evidence, mechanistic link and therapeutic promise. FEBS J. 2023;290(6):1420-1453. doi:10.1111/febs.16344
      2. Ju Y, Li S, Kong X, Zhao Q. Exploring fatty acid metabolism in Alzheimer's disease: the key role of CPT1A. Sci Rep. 2024;14(1):31483. doi:10.1038/s41598-024-82999-z
      3. Sienski G, Narayan P, Bonner JM, et al. APOE4 disrupts intracellular lipid homeostasis in human iPSC-derived glia. Sci Transl Med. 2021;13(583):eaaz4564. doi:10.1126/scitranslmed.aaz4564
      4. Zhang C, Rissman RA, Feng J. Characterization of ATP Alternations in an Alzheimer's Transgenic Mouse Model. J Alzheimers Dis JAD. 2015;44(2):375-378. doi:10.3233/JAD-141890
      5. Jia D, Wang F, Yu H. Systemic alterations of tricarboxylic acid cycle enzymes in Alzheimer's disease. Front Neurosci. 2023;17:1206688. doi:10.3389/fnins.2023.1206688
      6. Sang C, Philbert SA, Hartland D, et al. Coenzyme A-Dependent Tricarboxylic Acid Cycle Enzymes Are Decreased in Alzheimer's Disease Consistent With Cerebral Pantothenate Deficiency. Front Aging Neurosci. 2022;14. doi:10.3389/fnagi.2022.893159
      7. Patel V, Mill J, Okonkwo OC, Salamat S, Li L, Raife T. Global Energy Metabolism Deficit in Alzheimer Disease Brain. J Prev Alzheimers Dis. 2024;11(1):171-178. doi:10.14283/jpad.2023.91
      8. Egami R, Kokaji T, Hatano A, et al. Trans-omic analysis reveals obesity-associated dysregulation of inter-organ metabolic cycles between the liver and skeletal muscle. iScience. 2021;24(3):102217. doi:10.1016/j.isci.2021.102217
      9. Bai Y, Morita K, Kokaji T, et al. Trans-omic analysis reveals opposite metabolic dysregulation between feeding and fasting in liver associated with obesity. iScience. 2024;27(3):109121. doi:10.1016/j.isci.2024.109121
    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC- 2025-03341

      Corresponding author(s): Thomas, Leonard

      1. General Statements [optional]

      The reviews are positive, constructive, and balanced. The reviewers highlighted the novelty, scope, technical rigor, and strength of evidence of the study. The reviewers also noted the technological advance in modeling of multi-domain proteins that we report. In summary, there are two major advances reported in this study, both of which have important implications, both within the field of lipid signaling and in the broader field of in silico structural modeling.

      Lipid signaling. We have elucidated the mechanism by which a protein kinase is allosterically activated by a specific lipid second messenger (PIP3) at atomic resolution. To the best of our knowledge, this has not been achieved for any kinase to date. Our findings have implications for (a) the spatial and temporal confinement of Tec signaling in cells by PIP3, (b) the rationalization of disease-causing mutations in XLA, and (c) the development of novel therapeutics that could be of clinical value in the treatment of B-cell malignancies. As such, we believe that this study will be of interest to a wide spectrum of basic scientists in the cell signaling community, as well as translational, and clinical scientists.

      In silico structural modeling. Whilst developed primarily to answer the biological question of PIP3-mediated activation of the Tec kinases (see above), the improvement in AlphaFold modeling that we report has significant implications for all scientists concerned with structural modeling in silico, specifically with respect to the modeling of both multi-domain proteins and protein complexes. Given the widespread adoption of AlphaFold as a hypothesis generator, the audience for which these developments are relevant is actually very large, transcending all fields of the biological sciences.

      2. Description of the planned revisions

      • The major suggestion made by reviewers #2 and #3 was the inclusion of a negative control in the lipid nanodisc assays (Figure 5) to confirm that it is PIP3 that specifically activates MbTEC. This is a constructive and valuable addition to our study, particularly in light of the fact that PI(4,5)P2 is present in cells at 2-4 orders of magnitude greater concentration than PIP3. This experiment will be combined with reviewer #2's suggestion to perform a PIP3 titration in the lipid nanodiscs.

      Reviewer #2

      Although the nanodisc experiments clearly show PIP3-dependent activation, titrating the PIP3 content in nanodiscs (e.g., 0.1%, 0.5%, 1%, 3%, 5% of PIP3) to determine whether MbTEC activation shows a graded response to lipid abundance would strengthen the conclusions. This would support the suggested allosteric mechanism and aid in differentiating between digital and analogue activation behaviour.

      • We thank the reviewer for the nice suggestion, which we will combine with the negative control suggested by the reviewer in the next comment.
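
      To make the distinction between graded ("analogue") and switch-like ("digital") activation concrete, the following is a minimal sketch that evaluates a Hill-type dose-response curve at the PIP3 percentages proposed by the reviewer. The Hill function, the half-maximal PIP3 content (`half_max`), and the Hill coefficients are illustrative assumptions chosen for this example; none of them comes from the manuscript or from measured data.

```python
# Hypothetical illustration: graded vs switch-like response to PIP3 content.
# The Hill model and all parameter values are assumptions, not fitted data.

def hill_response(pip3_percent, half_max, hill_n):
    """Fractional activation for a given PIP3 content (mol %) under a Hill model."""
    x = pip3_percent ** hill_n
    return x / (half_max ** hill_n + x)

# PIP3 contents suggested by the reviewer for the nanodisc titration.
PIP3_PERCENT = [0.1, 0.5, 1.0, 3.0, 5.0]

# hill_n = 1: activation rises gradually with lipid abundance (analogue).
graded = [hill_response(p, half_max=1.0, hill_n=1) for p in PIP3_PERCENT]

# hill_n = 8: activation is close to off or on around half_max (digital).
switch_like = [hill_response(p, half_max=1.0, hill_n=8) for p in PIP3_PERCENT]

for p, g, s in zip(PIP3_PERCENT, graded, switch_like):
    print(f"{p:>4}% PIP3  graded={g:.3f}  switch-like={s:.3f}")
```

      In a real titration, the measured autophosphorylation at each PIP3 content would be fitted to obtain the apparent Hill coefficient, rather than assuming it as is done here.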

      A good negative control for Figure 5C, would be a nanodisc containing another phosphoinositide. Given prior evidence that TEC-family PH domains display selectivity for PIP3, it would nevertheless be informative to test nanodiscs containing other phosphoinositides (e.g., PI(4,5)P2, PI(3,4)P2, and PI3P).

      • See response above.

      Reviewer #3

      Fig 5B/C: The nanodisc experiment lacks some controls. In order to conclude that PIP3 is indeed critical for the observed enhanced autophosphorylation of MbTEC, nanodiscs with e.g. PI3P, PI4P or PI5P should be used that are not expected to bind the MbTEC PH domain with high affinity. Likewise, or alternatively, a mutant PH domain with largely reduced PIP3 binding affinity would support trust in this central result of the paper. (estimated time investment: 1-2 months).

      • We appreciate the reviewer's suggestion, which was also proposed by reviewer #2. These experiments are planned as the number one priority (see response above).

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Reviewer #1

      Major comment: We think the proposal is overall coherent and reasonable and found it interesting. It is not, however, conclusive. Modeling played a key role in supporting this proposal, but the modeling itself was dependent on choices of parameters made by the authors. The reported AlphaFold 3 model depended on a customized MSA strategy: the authors report divergent placement of the PH domain with respect to the kinase domain in their AlphaFold 3 runs. In light of this observation they used a manually curated TEC family MSA with taxonomic reweighting. This helped the model converge, but it introduced arbitrariness in the modeling step.

      • We believe that it is necessary to clarify what exactly our custom AF pipeline does so as to avoid confusion, but also to render our work more impactful for future studies that employ AlphaFold. The divergent placement of the PH domain in AF's standard configuration arises from the inclusion of sequences in the MSA that do not belong to the Tec gene family (Supplementary Figure 4C) but are structurally related at the individual domain level and therefore identified by the profile Hidden Markov Models used by AF to generate deep MSAs. These sequences are unrelated to Tec phylogenetically and therefore have evolved under different selection pressures. What our custom pipeline does is exclude these sequences from the MSA, such that the evolutionary covariance signature exploited by AF to guide inter-residue distance restraints comes only from bona fide Tec sequences. In a second step, we sample the sequences to ensure taxonomic balance (sequence databases are heavily biased in terms of taxonomic representation). This step increases sequence diversity and, with it, the strength of the co-variance signal. Therefore, rather than introducing "arbitrariness" in the modeling, we actually reduce it.
      • Since the advance that we report in modeling multi-domain proteins with AlphaFold is applicable to all multi-domain proteins and protein complexes, we believe that it is valuable to convey the significance of the input MSA in as clear a fashion as possible. To illustrate why AlphaFold fails in its standard configuration, we have therefore performed an in silico analysis of the MSA automatically generated by AF when it is prompted to predict the structure of MbTec. We now include this analysis as a new Supplementary Figure (Supplementary Figure 4C). As can be seen, of the 50,000 sequences in the AF3-generated MSA, only 1,898 contain the complete set of regulatory PH, SH3, SH2 and kinase domains that characterize the Tec kinases. The remaining 48,102 sequences, while containing one or more of the individual domains found in Tec, are phylogenetically unrelated. This means that the co-variance signature that AF relies upon for accurate prediction of inter-domain interactions is contained in
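
      The two steps described above (excluding sequences that are not bona fide Tec sequences, then sampling for taxonomic balance) can be sketched roughly as follows. This is a hedged illustration and not the authors' pipeline: the record layout, the function names, the use of complete domain architecture as a proxy for Tec family membership, and the fixed per-taxon cap (used here in place of the reweighting scheme mentioned by the reviewer) are all assumptions made for the example.

```python
# Hypothetical sketch of MSA curation; not the authors' actual implementation.
import random
from collections import defaultdict

# Domains that characterize the Tec kinases, as listed in the response.
REQUIRED_DOMAINS = {"PH", "SH3", "SH2", "kinase"}

def filter_full_architecture(records):
    """Keep only sequences annotated with all four Tec domains (assumed proxy)."""
    return [r for r in records if REQUIRED_DOMAINS <= set(r["domains"])]

def balance_by_taxon(records, per_taxon, seed=0):
    """Cap the number of sequences per taxon so no clade dominates the MSA."""
    rng = random.Random(seed)  # fixed seed keeps the sampling reproducible
    by_taxon = defaultdict(list)
    for r in records:
        by_taxon[r["taxon"]].append(r)
    balanced = []
    for taxon in sorted(by_taxon):
        group = by_taxon[taxon]
        balanced.extend(rng.sample(group, min(per_taxon, len(group))))
    return balanced

# Toy records with made-up identifiers and taxon labels.
toy_msa = [
    {"id": "seq1", "taxon": "Mammalia", "domains": ["PH", "SH3", "SH2", "kinase"]},
    {"id": "seq2", "taxon": "Mammalia", "domains": ["PH", "SH3", "SH2", "kinase"]},
    {"id": "seq3", "taxon": "Mammalia", "domains": ["PH", "SH3", "SH2", "kinase"]},
    {"id": "seq4", "taxon": "Choanoflagellata", "domains": ["PH", "SH3", "SH2", "kinase"]},
    {"id": "seq5", "taxon": "Mammalia", "domains": ["SH3", "SH2", "kinase"]},  # no PH domain
    {"id": "seq6", "taxon": "Fungi", "domains": ["PH"]},  # isolated PH domain
]

tec_like = filter_full_architecture(toy_msa)
balanced = balance_by_taxon(tec_like, per_taxon=1)

# Fraction of the AF3-generated MSA with the complete domain set,
# using the counts quoted in the response (1,898 of 50,000).
retained_fraction = 1898 / 50000
print(len(tec_like), len(balanced), f"{retained_fraction:.2%}")
```

      With the counts quoted in the response, fewer than 4% of the automatically collected sequences carry the complete domain set, which is the point the new supplementary panel is described as illustrating.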

        Minor comment: In two places the authors wrote "PIP3 is necessary and sufficient for both MbTEC activation and inactivation." This seems logically impossible. Revision is required.

      • We appreciate the reviewer's confusion here. This conclusion stemmed from the observation that PIP3 engagement is sufficient to promote full activation of MbTec on lipid nanodiscs in vitro (the synergistic effect of the hydrophobic stack mutation is lost in this context due to the presence of the polyproline motif in the PH-SH3 linker). However, in vivo, the SH2 domain is essential for BTK activation (by mediating its recruitment to activated receptors) and therefore it is incorrect to state that PIP3 is necessary and sufficient. It is necessary, but not sufficient - this is, again, analogous to an AND gate in an electronic circuit. We have revised the manuscript accordingly.

      Significance

      It attempted to clarify the role of the PH domain in TEC activation from a mechanistic perspective. If confirmed, it can potentially lead to novel approaches of drug discovery targeting TEC kinases.

      • Whilst we shied away from a discussion of therapeutic potential in our discussion to avoid unnecessary hype, the reviewer raises an important point, especially in light of the recent clinical success of BTK inhibitors in treating B-cell malignancies. As such, we have used the request made by Reviewer #2 to compare MbTec with Akt to highlight the potential for a new therapeutic modality in Tec kinase inhibition. The recent FDA approval of Capivasertib (November, 2023), an allosteric inhibitor of Akt, for the treatment of hormone-receptor (HR) positive, HER2-negative advanced or metastatic breast cancer provides a nice proof-of-concept. This discussion can be found in the response to Reviewer #2. Reviewer #3 also alluded to the "blockbuster drugs" used to treat B-cell malignancies, so we felt it appropriate to at least comment on the potential implications of our findings for the development of novel therapeutics.

      Reviewer #2

      • The inference for Figure 3 that the PH domain exerts a strong autoinhibitory influence on kinase activity that cannot be overcome by disruption of the SH3-kinase interaction would benefit from further clarification. It is not immediately clear from the data that PH-domain-mediated inhibition should be seen as dominant rather than synergistic with SH3-kinase linker interactions. Although the autophosphorylation stoichiometry was measured for MbTEC32K L396A and MbTECFL L396A, a more thorough quantitative evaluation of the relative contributions of PH-domain removal versus SH3-linker disruption would be possible if this analysis were extended to MbTEC32K. The authors are urged to discuss whether these inhibitory components might instead work cooperatively to limit kinase activity or whether one is dominant over the other, and to thoroughly explain the reasoning behind the conclusion provided.

      • The reviewer raises an interesting question regarding the relative contributions of the various regulatory domains to autoinhibition. Ultimately, what our data show, both for MbTec autophosphorylation and substrate phosphorylation, is that disruption of the SH3-kinase interface results in kinase activation. The amplitude of the activation, however, is dependent on whether the PH domain is present or not. In the presence of the PH domain, the activation is very modest, whereas when it is removed, the amplitude is an order of magnitude greater. This reflects the fact that SH3 domain displacement without PH domain displacement does not permit acquisition of a conformation compatible with activation loop autophosphorylation. This implies that PIP3-dependent allosteric activation is a prerequisite for complete activation of Tec. PH domain deletion is also not permissive for complete activation, which requires SH3 domain displacement on top to drive autophosphorylation, an observation consistent with previous experimental data on Src. As the reviewer indicates, these are synergistic with one another - Tec is a coincidence detector of multiple signals, all of which are required for full activation. Our conclusion that the inhibitory influence of the PH domain cannot be overcome by displacement of the SH3 and SH2 domain, however, is important, since it strongly implies that PIP3 is necessary for Tec activation (i.e. that Tec is an AND gate and not an OR gate). We have revised our description of these results to better reflect the relative contributions of the various regulator domains:

      "These observations indicate that the PH and SH3 domains exert synergistic inhibitory effects on the kinase domain and that disengagement of both domains by ligand binding is required for complete activation of MbTec. This is the equivalent of an AND gate in an electronic circuit, as opposed to an OR gate."
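
      The AND-gate analogy in the revised text can be written out as a small truth table. The sketch below is purely illustrative: the two boolean inputs and the function names are hypothetical simplifications of the described mechanism (PIP3-driven release of the PH domain and ligand-driven displacement of the SH3 domain), and it says nothing about partial activation levels.

```python
# Hypothetical truth table contrasting AND-gate and OR-gate activation logic.
from itertools import product

def fully_active_and(ph_released, sh3_displaced):
    """AND gate: full activation requires both inhibitory domains disengaged."""
    return ph_released and sh3_displaced

def fully_active_or(ph_released, sh3_displaced):
    """OR gate: either signal alone would be enough (the rejected alternative)."""
    return ph_released or sh3_displaced

# One row per input combination: (PH released, SH3 displaced, AND, OR).
truth_table = [
    (ph, sh3, fully_active_and(ph, sh3), fully_active_or(ph, sh3))
    for ph, sh3 in product([False, True], repeat=2)
]

for ph, sh3, and_out, or_out in truth_table:
    print(f"PH released={ph!s:<5} SH3 displaced={sh3!s:<5} AND={and_out!s:<5} OR={or_out}")
```

      Only the row with both inputs true gives full activation under the AND model, whereas the OR model would also predict full activation from either input alone.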

      It would also be valuable if the authors in the discussion section can draw a contrast with PIP3-dependent activation mechanism of AKT. This would be helpful in highlighting the uniqueness of PIP3 dependent TEC activation.

      • We thank the reviewer for highlighting the value of comparing MbTec to Akt, for which the activation mechanism has been intensively studied, both in our lab and in many others. There are, indeed, some interesting similarities, which we now comment on in the following paragraph, which has been incorporated into our discussion section: "It is worth noting that the regulation of MbTec by PIP3 is analogous, although not entirely homologous, to the regulation of the Ser/Thr kinases Akt and PDK1. Like Tec, Akt and PDK1 contain PIP3-sensing PH domains which mediate autoinhibition of their respective kinase domains (PMIDs: 28157504 and 35387990). Although the autoinhibitory interfaces of Tec and Akt are structurally different, both interfaces impair activation loop phosphorylation and substrate binding, as well as PIP3 binding (PMIDs: 28157504, 29632185, 3438531). The specific autoinhibitory conformation of Akt has been exploited in the development of allosteric inhibitors, which exhibit significantly improved on-target specificity and have recently been approved for the treatment of cancer (PMID: 38592948). As such, our findings open a new potential therapeutic modality for the development of selective Tec kinase inhibitors. Given the recent success of ATP-competitive BTK inhibitors in treating B-cell malignancies (PMIDs: 26639149, 36511784), there is enormous therapeutic potential."

      Minor Comments

      Y579 and R581 are introduced without much context. Can the authors elaborate on these residues a bit?

      • We have tried to better introduce the rationale behind mutation of these residues by rephrasing this part of the results. The changes from the previous version are underlined:

      "Consistent with the loss of an energetically favorable interface, deletion of the PH domain resulted in a 6 °C reduction in thermal stability (Figure 2F, Supplementary Figure 6C). We next tested the specificity of the predicted PH-kinase interaction by mutating Y579 and R581, which are conserved residues in the interface (Figure 2G). Mutation of Y579 and R581 to alanine reduced thermal stability by 3 °C, while their mutation to aspartate and glutamate respectively resulted in the same thermal stability as MbTEC32K lacking its PH domain (Figure 2F, Supplementary Figure 6D). These observations indicate that substitution of Y579 and R581 with alanine weakens the autoinhibitory conformation by reducing van der Waals contacts, but substitution with charged residues that introduce unfavorable interactions is sufficient to completely disrupt the interface. Consistently, MbTEC32K bound to the PH domain with an affinity of 4.0 mM, but binding of MbTEC32K Y579D R581E was barely detected (Figure 2H)."

      Figure 2H - In the legend make wt as WT so that it matches the figure panel

      • Fixed.
      • Supplementary Figure 1J - Adjust the orientation of intensity on y axis

      • Fixed (now Supplementary Figure 2J).

      • Supplementary Figure 1H - In the figure it should be Y579 and R581

      • Fixed (now Supplementary Figure 2H).

      • Can the authors add that 5C is the representative autoradiographs for each construct from panel 5B. Make it clear.

      • Fixed.

      • Write the units for intensity on the y axis for the entire supplementary figure 1

      • Supplementary Figure 2J and 2K - Make the 6 subscript in the legend for Gly 6.

      • Fixed (now Supplementary Figure 3J-K).

      • Can the authors include RRID wherever applicable in the methods section.

      • We have added in the RRID reference for the cell line employed in this study.

      • Include a space between "i" and "was" in the sentence "Each sequence iwas assigned a raw weight".

      • Fixed.

      • I think MSA is coming twice in the line above structure inference in the methods section. MSAs is repeating after balanced MSA. Kindly look into it.

      • Fixed.

        The work has been done using the TEC kinase from the choanoflagellate M.brevicollis, presumably for practical reasons of expression and purification. PIP3 signalling, to my knowledge, has not formally been demonstrated in choanoflagellates. This remains a concern in respect of the relevance of these findings to true metazoans which is the setting in which Class I PI3kinase generated PIP3 signalling is seen.

      • We appreciate the reviewer's concerns regarding the relevance of our findings to PIP3 signaling in metazoans. Whilst the production and sensing of PIP3 has not formally been demonstrated in a choanoflagellate, we believe that sufficient circumstantial evidence exists that should allay these concerns. Specifically:

      • Evolutionary evidence exists for the presence of the PI3K machinery in the last eukaryotic ancestor (LECA) (PMID: 26482564), approximately 1.2-1.8 billion years ago. Choanoflagellates are, by comparison, quite young (600-650 My).
      • Choanoflagellates have an extensive tyrosine kinase signaling network, including RTKs (PMID: 18621719)
      • PI3K/PIP3/PTEN signaling has been robustly demonstrated in organisms that predate choanoflagellates by hundreds of millions of years, including Amoebozoa e.g. D. discoideum and E. histolytica (PMIDs: 9778249, 11352940, 12062103, 12062104, 12802064).
      • Monosiga brevicollis encodes:
      • class I PI3K p110 and p85 homologs (Manning et al, PNAS 2008)
      • a PTEN homolog
      • note that class I PI3Kδ is responsible for the plasma membrane PIP3 signal in metazoan immune cells, meaning that a homolog of this enzyme is present in choanoflagellates
      • Choanoflagellates encode homologs of metazoan proteins that are known to respond specifically to PIP3, including:
      • MbTec
      • PDK1 (NCBI Reference Sequence: XP_004995400.1)
      • Akt (NCBI Reference Sequence: XP_001743446.1)
      • A recent kinase inhibitor screen in the choanoflagellate S. rosetta revealed the activity of known PI3K inhibitors (regulation of growth, phosphotyrosine signaling etc) (PMID: 40226336)
      • Conclusion: choanoflagellates inherited an ancient lipid-signaling toolkit.
      • Nevertheless, we believe that the reviewer makes an important point that should be clarified for the uninitiated reader. We therefore propose adding the following paragraph to our discussion section, which deals explicitly with these concerns:

      "Although PIP3 signaling has not been explicitly demonstrated in a choanoflagellate, the machineries for its production predate choanoflagellates by at least 500 My (PMID: 26482564). PI3K-mediated production, PH domain-mediated sensing, and PTEN-mediated degradation of PIP3 have all been robustly demonstrated to control chemotaxis in the slime mold Dictyostelium discoideum (PMIDs: 9778249, 11352940, 11389841, 12062103, 12062104, 12802064). While the Tec kinases emerged more recently (PMID: 30183386), PI3K, PTEN, PDK1, and Akt are all found in choanoflagellates, suggesting that choanoflagellates inherited an ancient lipid signaling toolkit and that the Tec kinases were a novel evolutionary addition to the toolbox."

      Reviewer #3

      Points to be addressed:

      Fig 1B: For the sequence alignment, a few more residues before/after the four critical selected residues should be shown. This allows the reader to evaluate how conserved these residues really are. (estimated time investment: ~1 day max.)

      • Figure 1B is not actually a conventional sequence alignment, since it shows four residues that are structurally related, but not found in a contiguous sequence. However, we have added a new Supplementary Figure panel (Supplementary Figure 1A) to show the sequence motifs for each residue.

        Fig. 2 I/J/K: It is more customary to show HDX-MS results mapped on a structural cartoon representation (and not surface representation). The current representation makes it impossible to see which functional areas of the different domains show increased/decreased HDX. In addition, mapping HDX changes on a linear sequence/sec structure plot (as also commonly used to represent HDX-MS data) should be shown in SI. (estimated time investment: <1 week)

      Reviewer #1

      This is important because the whole thesis of this manuscript rest on the model's suggestion that the kinase domain sequesters the PIP3 binding site of the PH domain. The authors found that in cells full-length MbTEC transiently associated with the membrane but the isolated PH domain enjoyed more prolonged membrane association. The authors interpreted this difference in membrane association in terms of different sequestration of the PIP3-binding PH domain by the kinase domain, but the PH-kinase interaction is based on a model and it needs further validation.

      • Model validation, particularly in the era of AlphaFold, is critical, as the reviewer correctly notes. However, we dispute the reviewer's assertion that the PH-kinase interface derived from our model needs further validation. The following is a summary of all the orthogonal ways in which we validated the model. In terms of publishing standards, we believe we have exceeded what is widely accepted as robust evidence for a specific interface.
      • The predicted aligned error (PAE) plot (Figure 1H) exhibits prediction errors in the PH-kinase interface which are (a) extremely low and (b) comparable with those in the SH3-kinase, SH2-kinase, and SH3-SH2 interfaces, all of which are superimposable with experimental structures.
      • Comparison of the model with experimental small-angle X-ray scattering (SAXS) in solution revealed a near-perfect fit (Figure 2A). This demonstrates that the global conformation of the model is an accurate reflection of the conformation of MbTEC in solution.
      • Mutation of the interface on the kinase side leads to a loss of thermal stability equivalent to deletion of the PH domain (Figure 2F-G) and a failure to bind the PH domain in trans (Figure 2H).
      • Changes in HDX-MS of the interface-mutated protein (Figure 2I-L) are comparable to those in the PH domain-deleted construct (Supplementary Figure 6E-J).
      • Reciprocal mutation of the interface on the PH domain leads to a reduction in binding affinity for the SH3-SH2-kinase (32K) protein (Figure 4C).

      While autophosphorylation is dramatically enhanced by PIP3 containing nanodiscs, the interpretation can be complicated, as the manuscript itself acknowledged that membrane based experiments cannot readily deconvolute local concentration effects from allosteric effects, because concentrating proteins on a membrane can promote dimerization dependent autophosphorylation.

      • It is precisely for these reasons that we conducted the experiments detailed in Figure 3, since they do not convolute allosteric activation with local concentration on a membrane. These experiments underpin our conclusions that MbTec is specifically activated by dissociation of its PH domain from the kinase domain and not just by local concentration on a PIP3-containing membrane. Whilst the experiments in Figure 3 do not say anything about the specificity of the PH-kinase interface (which we addressed with other experiments), they unambiguously confirm the inhibitory effect of the PH domain that other studies have reported previously.

      Reviewer #2

      To elaborate on the point of sufficiency, can the authors utilise FRB-FKBP system to synthesize PIP3 ectopically and see if it leads to the recruitment of FL and PH in addition to PDGF stimulation. It will also be valuable if the authors can use PI3K inhibitors post PDGF stimulation to validate this point further. A colocalization with PIP3 biosensor post PDGF stimulation will also be a great control.

      • The reviewer's suggestion to use the FKBP-FRB system to synthesize PIP3 ectopically is elegant but, in our opinion, not necessary. The specific recruitment of Tec kinases to the plasma membrane in response to growth factor-stimulated production of PIP3 is well established (e.g. Varnai et al, JBC 1999). As such, a PIP3 biosensor is not necessary, since the Tec kinases are well established PIP3 sensors in cells.
    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The 5 human TEC kinases are cytoplasmic tyrosine kinases containing a prototypic (Src-like) SH3-SH2-kinase domain module with critical roles in various signaling pathways, in particular T- and B-cell signaling. One Tec kinase family member, Btk, is the target of major blockbuster drugs that revolutionized outcomes for patients with a variety of B-cell malignancies. While major insights into the structure and regulation of Tec kinases by phosphorylation, as well as intra- and inter-molecular protein-protein interactions, have been obtained over the past 3 decades, the role and precise mechanisms of TEC kinase regulation by binding to phosphoinositides at membranes, via its PH-TH unit, are much less clear.

      In this manuscript, the authors study the structure and regulation of an ancestral TEC kinase from the choanoflagellate M. brevicollis (MbTEC), which has a largely reduced set of tyrosine kinases (as compared to mammals), and therefore might offer a focused look on conserved essential kinase regulation that diversified and acquired cell-type specific fine-tuning during evolution. The manuscript first provides a nice workflow to obtain an accurate model of MbTEC using AlphaFold 3 modeling. SAXS supports the predicted compact conformation of MbTEC in solution. Removal of the PH domain resulted in lower thermal stability, indicating an energetically favorable intramolecular interface, which was subsequently supported by HDX-MS measurements using full-length and 3 domain core (SH3-SH2-kinase domain: 32K) and possible PH-KD interface mutations of MbTEC. Kinase activity (autophosphorylation and substrate phosphorylation) assays support an autoinhibitory effect of the PH domain on MbTEC activation. Experiments making use of a gain-of-function mutant, initially identified in the human Btk PH domain, and in vitro experiments with nanodiscs containing PIP3 show strong activation of MbTEC autophosphorylation. Overall, the manuscript supports a model of PIP3-stimulated relief from PH domain-mediated autoinhibition of MbTEC resulting in full activation also involving disruption of the SH2-kinase linker interaction with the SH3 domain and displacement of the SH2 domain. The authors have used several different structural biology and biochemical assays, all of which allow for relatively precise (semi-)quantitative answers to the underlying research questions. Hence, the claims and conclusions are very well supported and leave (very) little to be desired (see points below). This is a nice and clean structural biochemistry paper with generally well controlled experiments and appropriate choice of research methods. The manuscript text is well written, previous work is appropriately mentioned/discussed, and results are carefully interpreted and gauged towards a final model.

      Points to be addressed:

      Fig 1B: For the sequence alignment, a few more residues before/after the four critical selected residues should be shown. This allows the reader to evaluate how conserved these residues really are. (estimated time investment: ~1 day max.)

      Fig. 2 I/J/K: It is more customary to show HDX-MS results mapped on a structural cartoon representation (and not surface representation). The current representation makes it impossible to see which functional areas of the different domains show increased/decreased HDX. In addition, mapping HDX changes on a linear sequence/sec structure plot (as also commonly used to represent HDX-MS data) should be shown in SI. (estimated time investment: <1 week)

      Fig 5B/C: The nanodisc experiment lacks some controls. In order to conclude that PIP3 is indeed critical for the observed enhanced autophosphorylation of MbTEC, nanodiscs with e.g. PI3P, PI4P or PI5P should be used that are not expected to bind the MbTEC PH domain with high affinity. Likewise, or alternatively, a mutant PH domain with largely reduced PIP3 binding affinity would support trust in this central result of the paper. (estimated time investment: 1-2 months).

      Significance

      MbTEC kinase structure and regulation have not previously been studied. Hence, novelty is very good. Given the overall conservation of the structure and regulatory mechanisms of TEC kinases (and related SRC and ABL kinases), as well as the large number of prior studies on these kinases (by Prof. Leonard and several others in the field), many aspects of this study are not overly surprising and more confirmatory than groundbreaking. On the other hand, it is a well-controlled and experimentally "clean" study. It uses a state-of-the-art combination of modern structural/biochemical methods and provides an important technological advance for reliably modeling multi-domain signaling proteins using AlphaFold 3. The detailed dissection of MbTEC regulation provides some novel aspects and offers a convincing model for TEC kinase activation with implications for the human/mouse protein. I am convinced that the manuscript will be of interest for the broader signaling community, as well as basic scientists in the fields of structural biology and membrane cell biology.

      I do have sufficient expertise to evaluate this manuscript, as I have used essentially all of the described methods in my lab before and have been working on proteins related to MbTEC for my entire scientific career.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Overall Summary

      The authors have put forward a comprehensive structural and biochemical analysis of an ancestral TEC kinase from Monosiga brevicollis (MbTEC). The authors have used a wide array of state-of-the-art approaches such as protein biochemistry, mutagenesis, thermal stability assays, SAXS, AlphaFold 3 modeling with curated MSAs, HDX-MS, kinase assays, lipid nanodiscs, mass spectrometry, and cell-based imaging to propose a detailed mechanism for autoinhibition of MbTEC and PIP3-dependent allosteric activation.

      Overall Comment

      The study presented is scientifically challenging and conceptually ambitious. The authors are to be commended for the impressive technical rigor, experimental thoroughness, and care with which the manuscript is written and presented.

      Major Comments

      1. The inference for Figure 3 that the PH domain exerts a strong autoinhibitory influence on kinase activity that cannot be overcome by disruption of the SH3-kinase interaction would benefit from further clarification. It is not immediately clear from the data that PH-domain-mediated inhibition should be seen as dominant rather than synergistic with SH3-kinase linker interactions. Although the autophosphorylation stoichiometry was measured for MbTEC32K L396A and MbTECFL L396A, a more thorough quantitative evaluation of the relative contributions of PH-domain removal versus SH3-linker disruption would be possible if this analysis were extended to MbTEC32K. The authors are urged to discuss whether these inhibitory components might instead work cooperatively to limit kinase activity, or whether one is dominant over the other, and to explain thoroughly the reasoning behind their conclusion.
      2. To elaborate on the point of sufficiency, can the authors utilise the FRB-FKBP system to synthesize PIP3 ectopically and see if it leads to the recruitment of FL and PH, in addition to PDGF stimulation? It would also be valuable if the authors could use PI3K inhibitors post PDGF stimulation to validate this point further. Colocalization with a PIP3 biosensor post PDGF stimulation would also be a great control.
      3. Although the nanodisc experiments clearly show PIP3-dependent activation, titrating the PIP3 content in nanodiscs (e.g., 0.1%, 0.5%, 1%, 3%, 5% of PIP3) to determine whether MbTEC activation shows a graded response to lipid abundance would strengthen the conclusions. This would support the suggested allosteric mechanism and aid in differentiating between digital and analogue activation behaviour.
      4. A good negative control for Figure 5C would be a nanodisc containing another phosphoinositide. Given prior evidence that TEC-family PH domains display selectivity for PIP3, it would nevertheless be informative to test nanodiscs containing other phosphoinositides (e.g., PI(4,5)P2, PI(3,4)P2, and PI3P).
      5. It would also be valuable if the authors could, in the discussion section, draw a contrast with the PIP3-dependent activation mechanism of AKT. This would help to highlight the uniqueness of PIP3-dependent TEC activation.

      Minor Comments

      • Y579 and R581 come without significant context. Can the authors elaborate on these residues a bit?
      • Figure 2H - In the legend, change wt to WT so that it matches the figure panel
      • Supplementary Figure 1J - Adjust the orientation of intensity on y axis
      • Supplementary Figure 1H - In the figure it should be Y579 and R581
      • Can the authors state that 5C shows the representative autoradiographs for each construct from panel 5B? Make it clear.
      • Write the units for intensity on the y axis for the entire supplementary figure 1
      • Supplementary Figure 2J and 2K - Make the 6 subscript in the legend for Gly 6.
      • Can the authors include RRIDs wherever applicable in the methods section?
      • Include a space between i and was in the sentence "Each sequence iwas assigned a raw weight."
      • I think MSA is coming twice in the line above structure inference in the methods section. MSAs is repeating after balanced MSA. Kindly look into it.

      Significance

      General assessment

      This is a study on the TEC family of kinases, which have an important role in immune cells. Alterations in their function are linked to both primary immunodeficiency and hematological malignancies. Thus, understanding their mechanism of activation is of fundamental importance for understanding protein kinase regulation as well as for developing potential therapies for blood cell disorders.

      The work has been done using the TEC kinase from the choanoflagellate M. brevicollis, presumably for practical reasons of expression and purification. PIP3 signalling, to my knowledge, has not formally been demonstrated in choanoflagellates. This remains a concern with respect to the relevance of these findings to true metazoans, which is the setting in which Class I PI3-kinase-generated PIP3 signalling is seen.

      Advance

      This study advances details of the molecular mechanism by which PIP3 interacts with and regulates TEC kinase function. This is a study in basic structural biology.

      Audience

      This study will be of interest to structural biologists and those with an interest in understanding phosphoinositide regulated protein function.

      My expertise

      Biochemistry and cell biology, phosphoinositide signalling

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary: This manuscript focuses on PIP3-dependent activation in TEC family kinases. It used the choanoflagellate Monosiga brevicollis TEC kinase MbTEC as the model system. The authors proposed that MbTEC adopts a compact solution conformation in its autoinhibited state, with the PH domain bound to the kinase domain. Based on an AlphaFold 3 model, the PH-kinase interaction was proposed to sequester the PIP3 binding site of the PH domain, hindering the membrane interaction of the PH domain, and in turn disrupting TEC activation. The study organized data collected from many different experiments, including structure modeling, solution measurements, mutagenesis, activity assays, in vitro membrane reconstitution, and cellular localization assays.

      Major comment: We think the proposal is overall coherent and reasonable and found it interesting. It is not, however, conclusive. Modeling played a key role in supporting this proposal, but the modelling itself was dependent on choices of parameters made by the authors. The reported AlphaFold 3 model depended on a customized MSA strategy: the authors report divergent placement of the PH domain with respect to the kinase domain in their AlphaFold 3 runs. In light of this observation, they used a manually curated TEC family MSA with taxonomic reweighting. This helped the model converge, but it introduced arbitrariness into the modeling step. This is important because the whole thesis of this manuscript rests on the model's suggestion that the kinase domain sequesters the PIP3 binding site of the PH domain. The authors found that in cells full-length MbTEC transiently associated with the membrane, whereas the isolated PH domain showed more prolonged membrane association. The authors interpreted this difference in membrane association in terms of sequestration of the PIP3-binding PH domain by the kinase domain, but the PH-kinase interaction is based on a model and needs further validation. While autophosphorylation is dramatically enhanced by PIP3-containing nanodiscs, the interpretation can be complicated: as the manuscript itself acknowledges, membrane-based experiments cannot readily deconvolute local concentration effects from allosteric effects, because concentrating proteins on a membrane can promote dimerization-dependent autophosphorylation.

      Minor comment: In two places the authors wrote "PIP3 is necessary and sufficient for both MbTEC activation and inactivation." This seems logically impossible. Revision is required.

      Significance

      It attempted to clarify the role of the PH domain in TEC activation from a mechanistic perspective. If confirmed, it can potentially lead to novel approaches of drug discovery targeting TEC kinases.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      In this paper, the authors combine advanced quantitative mass spectrometry-based proteomics with the PRIMMUS cell sorting technique to profile cell cycle proteins across the G1, S, and G2/M phases in CD8 T cells, embryonic stem cells (mESCs), and NIH3T3 fibroblasts. Their well-designed experimental approach, which avoids artifacts from synchronization, yields an impressive and detailed dataset on cell cycle dynamics. The experimental approach to interrogate three different cell lines for cell cycle protein dynamics, without using synchronization, is technically excellent. The data extracted from that initial experiment, which are shown in figures 2, 3, 4 and 5, are clearly described and well presented.

      My concern is with the conceptual advance of the study. The authors set out to determine how cell cycle control mechanisms could give rise to rapid proliferation in CD8 T cells, but it is not clear they have addressed that question. The data do not clearly support a model in which mESCs or CD8 T cells exhibit dampened cell cycle protein periodicity compared to 3T3 fibroblasts, nor one in which a change in APC/C is a key driver. Notably, the depletion of Emi1 appears to most significantly impact later cell cycle stages, as has been reported previously by others, and thus it is unclear that in this system it is promoting S-phase entry. And, while the impact on re-replication is clear, this is a very well reported phenotype. Overall, I am very impressed by the technical aspects of the study, but thought the authors could put more effort toward developing more meaningful interpretations of the data, which felt under-developed.

      Comments and recommendations:

      • There does not appear to be any statistical comparative analysis accompanying the proteomics data. While the differences highlighted in the text are generally clear from the figures, including statistical comparisons would strengthen the analysis and provide greater confidence in the reported findings.
      • It is unclear what the box and whiskers represent in these plots. By convention, box plots typically display the interquartile range, with the central line indicating the median and the whiskers representing the minimum and maximum values or a defined range. If the authors are using a different convention, it should be clearly stated in the figure legends. Alternatively, I recommend using a different data representation to avoid potential confusion.
      • Line 133 and figure 2B and D. It's unclear to me what 'common cell cycle regulated proteins' are. The methods statement "Fold changes were calculated and ANOVAs performed to identify significantly changing proteins in each cell line using the following filtering criteria: p < 0.05 and fold change > 2." does not fully clarify the analysis. For example, it's not clear what the fold changes are being compared to: is it a specific cell cycle phase, a baseline, or another reference point? Additionally, is the reported p-value corrected for multiple hypothesis testing? Overall, the methodology could benefit from further clarification.
      • In Figure 2B, it is unclear how the data are normalized. Is it appropriate to compare values across rows, columns, or both? Additionally, the ordering of the CD8 T cell G2/M samples appears to be ascending from top to bottom; what is the rationale behind this ordering? Clarifying the normalization method and how to interpret comparisons within the figure would help improve its readability. As it is now, I am not clear on how this data supports the statement where it is cited in the text, "There was a core set of 121 proteins that were consistently cell cycle regulated across all three cell types" (lines 130-132). It would be interesting to know how these data compare to the authors' prior proteomics data on cell cycle proteins and to those of other authors.
      • In Figure 2C, are these the most frequent GO terms among the "cell cycle-regulated proteins"? It is unclear whether this represents a curated list of commonly known cell cycle-related terms or if these are the top GO terms identified through enrichment analysis. Clarifying this distinction would help readers better understand how these terms were selected and what the figure is intended to convey.
      • The interpretation of differences in some APC/C components is not clear to me, particularly given the relationship to APC/C substrate expression. For example, ANAPC10 is down in CD8 cells, but the APC/C appears highly active given the high differential between substrate abundance between G1 and S/G2/M cells.
      • I have some concerns regarding the Emi1 knockout experiments. First, the knockout does not appear to be particularly efficient. In addition, the results do not necessarily support the proposed conclusion. If Emi1 were a key driver of G1/S progression, I would think you would observe more 2N cells, not fewer. The relationship to DNA damage, made in Fig. 6f, is unclear.
      • In the flow cytometry data in Figure 6C, the Emi1 knockout cells are EdU-positive and have >4N DNA content. This is consistent with a re-replication phenotype and with a vital role of Emi1 in preventing re-activation of the APC/C in late S or G2 phase. It is difficult to square this with the assertion that it is promoting G1/S progression. Notably, this is a well reported Emi1 phenotype.
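
      To make the multiple-testing question above concrete, the following is a minimal sketch of a Benjamini-Hochberg adjustment applied before the "p < 0.05 and fold change > 2" filter. It is not the authors' pipeline: the function names, the tuple layout, and the four example proteins with their p-values and fold changes are all hypothetical and chosen only for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_hits(proteins, alpha=0.05, min_fold=2.0):
    """Filter (name, raw_p, fold_change) tuples on adjusted p and fold change.

    The tuple layout is a hypothetical stand-in for a proteomics table.
    """
    q_values = benjamini_hochberg([p for _, p, _ in proteins])
    return [
        name
        for (name, _, fold), q in zip(proteins, q_values)
        if q < alpha and fold > min_fold
    ]


# Made-up example values, not data from the manuscript.
example = [("A", 0.01, 3.0), ("B", 0.04, 2.5), ("C", 0.03, 1.5), ("D", 0.20, 4.0)]
hits = select_hits(example)
```

      In this made-up example, protein "B" passes the raw criteria (p = 0.04, fold change 2.5) but not the adjusted ones, which is the kind of difference the legend or methods would need to make explicit.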

      Minor comments:

      • Some figures lack error bars (e.g., Figures 2C, 2D, and 5A). I also recommend including individual data points on all graphs, as this is now required by many journals. Since most of the graphs likely represent experimental triplicates, adding these elements should be straightforward and will allow readers to see actual spread of the data.
      • The terminology used to refer to CD8 T cells is inconsistent across the manuscript; for example, "CD8T" in Figure 2 and "CD8" in Figure 3. Standardizing this terminology throughout the figures and text would improve clarity.
      • The authors frequently refer to "increases in pRB-," which is somewhat confusing. I recommend rephrasing this as "decrease in pRB" to make the interpretation clearer and easier to follow.

      Significance

      Significance and Advance

      Strength: Excellent deep, quantitative proteomic dataset of cell cycle proteome dynamics across three independent cell lines. Done without chemical or other synchronization. This incrementally improves on previous proteomics datasets.

      Weaknesses: Limited conceptual advance in our understanding of cell cycle proteomes, or insight into how different cell types proliferate at different rates.

      Audience: interested audiences include those studying cell cycle control and proteomics.

      My expertise: cell cycle control and protein dynamics.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      This study uses proteomics as an approach to understand regulatory mechanisms contributing to a short cell division cycle in CD8 T cells. While the cell cycle machinery has been studied in mouse embryonic stem cells (mESCs) and these discoveries have provided a framework for understanding atypically short cell cycles, whether those mechanisms are conserved in other cell types that undergo a short cycle has not been clear. Analysis of the cellular proteome was performed at three different cell-cycle stages (G1, S, G2/M) and across three cell lines (CD8 T cells and mESCs and fibroblast 3T3 cells as comparators for short and normal cycles, respectively). An important innovation here is the application of a PRIMMUS protocol, which uses markers of cell cycle stage and sorting rather than treatments, such as inhibitors, that can have unwanted perturbations. The authors find both similarities and differences between CD8 T cells and mESCs. Of particular interest, CycE levels are constitutively high in both cell types with a short cycle compared to fibroblast cells, but APC/C substrates are uniquely lower during G1 in CD8 T cells. Using CRISPR knock-out, the APC/C inhibitor Emi1 is then implicated as a key regulator of G1/S control in CD8 T cells as well as phenotypic fate.

      Major Comments:

      Overall, the conclusions are fairly made from the data. In particular, the proteomics data and analysis are of high technical quality. The data and methods are clearly presented in a manner that could be reproduced, and experiments are, with one exception noted here, performed with appropriate replicates and statistical analysis. I have only two major comments.

      1. From the description of the CRISPR methods and results, the origin of the Emi1 KO cells used in all the experiments is unclear. Only one Emi1 KO sample or experiment is shown along with control in each panel. A number of questions should be answered in order to clarify. Did this sample originate from the total cells grown out of the gRNA treatment (and on which the TIDE analysis was performed)? If so, which exon was the gRNA targeting in the cells used in subsequent experiments? Were single clones ever grown? Ideally, experiments should be performed with multiple single clonal lines, and at least one from each exon used, if possible. Alternatively, a population generated from each of the two guides should be used in all the experiments as a proper biological replicate.

      It may be that homozygous deletion of Emi1 is not possible because of the effects on proliferation. If so, the authors should comment on this along with why the protein level in Sup. Fig. 2b is still so high in the KO line. Considering the inefficiency of the overall KO, it is especially important to understand Emi1 expression throughout the polyclonal population to interpret the results. Are there some cells expressing normal levels and some cells expressing no Emi1? If monoclonal, are the observed effects from haploinsufficiency? More description of the method and caveats to the interpretation of results are needed here, and results would be made more robust if more than one clone or gRNA treatment were used. It would be expected this may take 2-3 months with reasonable resources to perform this additional analysis, if not done already.

      2. OPTIONAL. One shortcoming of the manuscript is the lack of insights into why knockout of Emi1 leads to G1 arrest. Follow-up experiments on the APC/C substrates in Fig. S1b or analysis of proteins that are upregulated or downregulated by Emi1 KO would increase the significance of the manuscript. These experiments may be time consuming and could be considered beyond the scope of this study, but they would increase the impact of the study.

      Minor comments:

      1. Some brief aspects of the methods could be added to the description of results to help improve clarity in a few places. For example, the threshold for including the 121 proteins in Fig. 2b and what antibodies were used for staining in the PRIMMUS approach would be important enough that it should be stated in results with no need to look up detailed method.
      2. It should be clarified that in most cases, use of the term "abundance" in comparing protein levels is actually a relative abundance for a specific cell type. For example, in the following sentence, relative and not absolute abundances are being compared: "In mESC, there was higher abundance in G1 phase compared to CD8 T cells and NIH3T3 (Fig. 2d)." Clarifying upon first use of the term abundance would be sufficient.
      3. In some of the protein abundance panels there does not appear to be any mark in the graph, which is consistent with there being no observation of protein. This absence of any observation should be made explicit in the figure captions.
      4. It should be clarified with its first use (line 264) that "pRb" means phosphorylated Rb.

      Significance

      The key conclusions are significant in that they advance our understanding of how cell cycle regulatory mechanisms can be adapted in cell types with short cell cycles. Results are also relevant in particular for the biology of T cells, and the finding that Emi1 controls the cell cycle and cell fate in CD8 T cells is noteworthy. These represent advances in our mechanistic understanding of cell division. In addition to the significance of the conclusions, the greatest strength of the manuscript is the technical quality of the proteomics data and the fair conclusions drawn from them. The results will be of interest to scientists studying basic research in cell, immune, and cancer biology.

      The greatest limitation of the study is the lack of insights into how Emi1 is driving S phase and how its loss leads to accumulation of cells in G1. This result is quite interesting, because it is opposite to what may be expected considering the known function of Emi1 as an APC/C substrate inhibitor and the observation in Fig. 5 that APC/C substrates are similarly low in CD8 T cells and 3T3 cells. This discrepancy and the logic of testing Emi1 KO should at least be more thoroughly discussed, and additional experiments, as described above, would increase the impact of the study. A second limitation is the lack of conclusions from many comparisons of protein levels among the cell types and cell-cycle phases. For example, the significance of similarities and differences in levels of the Cdks (Fig. 4) and Cdh1 (Fig. 5) is left undiscussed.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In this study by Lewis and colleagues, the authors use highly quantitative proteomics to compare cell cycle patterns in T lymphocytes to mouse embryonic stem cells and differentiated fibroblasts. The work is significant because the cell cycle field has been dominated by only a few paradigms based on model systems such as yeasts, cancer-derived cell lines, and fibroblasts. The different cell cycles in other cell types have not been thoroughly investigated because they are less experimentally tractable. Prior studies have drawn conclusions about embryonic stem cells that turned out later to stem primarily from attempts to induce synchronization prior to analysis. Here, the authors apply their previously-described method for detecting proteins according to cell cycle phase by flow-sorting and then mass spectrometry. They find expected cell cycle fluctuations in the three cell lines, but also a few unexpected differences. Primarily, cyclin E protein appears to be expressed at higher levels in the fast-proliferating mESCs and CD8 cells compared to NIH3T3 and to be less cell cycle-regulated. The mechanism and functional relevance for this difference are not considered.

      The authors chose to focus several experiments on the APC/C ubiquitin ligase inhibitor, Emi1, which they find appropriately cell cycle-regulated in CD8 and mESCs but expressed at higher levels in CD8 cells. They show that CD8 differentiation is associated with cell cycle changes and that perturbing the cell cycle in CD8 cells affects their differentiation (which is not surprising). Altogether this study provides an excellent resource for the field from the detailed high-quality proteomics data sets (though see point #8). On the other hand, the experiments with EMI1-KO and differentiating cells seem largely unrelated to the cell cycle analysis itself; these results may be mostly indirect consequences of cell cycle perturbation rather than specific to EMI1 function.

      This is a very thorough, rigorous, and valuable proteomics characterization of cell cycle differences, but the choice of follow-up functional experiments isn't logical in my opinion.


      Major points:

      1. It is not clear how the quantification accounts for differences in cell size or nuclear/cytoplasmic fractions among the different cell types. For example, ESCs are typically small with relatively less cytoplasm. Since a great many cell cycle proteins are nuclear, normalizing to total protein analyzed and not cell number could make these proteins appear more abundant. In other words, does 1 ug of protein from mESCs represent the same number of cells or nuclei as 1 ug of protein from NIH3T3?
      2. The motivation for the project was given as wanting to understand if fast-cycling CD8 T cells resemble fast-cycling mESCs, which the authors show they partially do. The molecular phenotype in common is cyclin E abundance, but we are left without an understanding for how that occurs. Are there differences in expression of the E3 ligase for cyclin E?
      3. We also don't learn if the constitutive cyclin E is why cells cycle quickly because no experiments to change cyclin E are provided.
      4. In an unexpected turn for the project, the authors generated inducibly Emi1-null CD8 T cells and analyzed them for replication, re-replication, and phosphorylated Rb. This section is fine, but the rationale isn't obvious, particularly if constitutive cyclin E (not an APC/C substrate) is the proposed driver of fast proliferation. Emi1 isn't the general reason for fast proliferation if it is only high during G1 for CD8 and not mESCs. Emi1 loss has also been known to promote re-replication for quite a few years, a finding the authors cite. The replication-dependent DNA damage and cell cycle arrest are both well-known outcomes from re-replication. So, although the data are of good quality, I'm not sure that showing Emi1 has the same role in CD8 cells as what was reported in other cells adds much to the field. The authors could choose to save Figure 6 for a more APC/C-focused study and use EMI1 as a general cell proliferation marker in Figure 7.
      5. What is the cell cycle distribution of the IL-15 and IL-2 differentiated cells? If they are different, then some of the protein abundance changes to regulated proteins such as p21 might be a symptom of cell cycle differences rather than differentiation. It seems IL-15 is more cell cycle promoting from Figure S3, but that isn't clear from the assays presented.
      6. The change in the CD62L marker is argued to be unrelated to cell cycle and somehow more directly downstream of Emi1. The connection here is tenuous, in part because there's no comparison to a similar cell cycle perturbation independent of Emi1. How would Emi1 modulate CD62L if CD62L does not show evidence of being an APC/C substrate?
      7. As presented, Emi1 has a unique effect on the immune phenotype. It is also possible that many cell cycle perturbations would similarly affect differentiation and not only Emi1. Can the authors distinguish Emi1-specific effects from more common cell cycle perturbation effects? Alternatively, interpreting the EMI1-KO data more generally could avoid confusion.
      8. The authors should provide legends to the supplementary tables in the text and/or the tables themselves. Several column headers in the tables are not defined.

      Minor points:

      • a. Figure 4g is titled Cdkn2a (which would be p16) but the legend says it's Cdkn1a (p21). The p21 protein was undetected in NIH3T3? One would expect it to be in the G1 cells at least.
      • b. Line 56: cyclin E is not an APC/C substrate as implied by the text. If Skp2 is a bona fide substrate (not sure that's true), then APC/C activity would have opposite effects on Cyclin E vs Cyclin A.
      • c. What is the red arrow in Figure 6C?
      • d. Figure 7b is protein copy number whereas all other graphs are for ppm. The legend on line 686 has "cite immPRES" but this citation didn't make it into the submitted manuscript.

      Significance

      The work is significant because the cell cycle field has been dominated by only a few paradigms based on model systems such as yeasts, cancer-derived cell lines, and fibroblasts. The different cell cycles in other cell types have not been thoroughly investigated because they are less experimentally tractable. Prior studies have drawn conclusions about embryonic stem cells that turned out later to stem primarily from attempts to induce synchronization prior to analysis. This is a very thorough, rigorous, and valuable proteomics characterization of cell cycle differences, but the choice of follow-up functional experiments isn't logical in my opinion.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1

      Evidence, reproducibility and clarity

      Summary of findings and key conclusions

      This manuscript asks how pharmacologic targeting of the outer mitochondrial membrane protein MIRO1 (RHOT1) with a MIRO1-binding compound (MR3) reshapes immunosuppressive programs in the glioma tumor microenvironment (TME). The core of the paper is a cross-species transcriptomic comparison that combines an in vivo mouse dataset with an ex vivo human perturbation dataset.

      Model systems and approach (as described):

      • Mouse in vivo: GL261-Luc intracranial glioma in C57BL/6J mice; MR3 is administered intracranially at the implantation site (10 µM in 5 µL DMSO) on days 11 and 18, and tumors are harvested on day 22 for single-nucleus RNA-seq (snRNA-seq).
      • Mouse snRNA-seq: NeuN-based nuclei sorting, 10x Genomics v3.1; alignment to mm10; Seurat-based integration and annotation. Tumor-cell calling is supported by CNV inference (SCEVAN/CopyKAT). One MR3-treated sample is excluded after QC, leaving 3 control vs 2 MR3-treated samples (11,940 NeuN− nuclei).
      • Human ex vivo: freshly resected glioma cores from 3 patients are cultured with 10 µM MR3 or DMSO for 24 h, followed by bulk RNA-seq (STAR alignment to hg19; DESeq2 for differential expression).
      • Cross-species integration: the analysis is restricted to 1:1 orthologs and protein-coding genes shared across datasets; inferred cell-cell signaling is explored with CellChat.

      Main findings (as presented):

      • MR3 shifts expression of a subset of glioma-associated genes toward a non-tumor-like direction ("rescued genes") and is associated with large changes in inferred cell-type composition in the mouse snRNA-seq dataset (including a marked drop in the fraction of nuclei annotated as tumor: 44.5% to 4.3%; Fig. 1E).
      • Across TCGA-vs-GTEx (glioma-upregulated genes) and three MR3 response analyses (mouse snRNA-seq, mouse pseudo-bulk, and human bulk RNA-seq), PARP11/Parp11 is reported as the only gene that is consistently upregulated in glioma and consistently downregulated by MR3 (Fig. 2B).
      • Within the mouse myeloid compartment, Parp11 is most enriched in MAC4 and MAC1, while MAC1 shows high Cd274 (Pdl1/PD-L1). MR3 reduces Parp11 in MAC4/MAC1 and reduces Cd274 in MAC1 (Fig. 2H).
      • CellChat analysis suggests that in controls MAC1 is the dominant sender of PD-L1/PD-1 signaling to CD8+ T cells (Fig. 3C), and that this PD-L1/PD-1 interaction is strongly diminished after MR3 (Fig. 3E).
      • The authors propose a paracrine model in which MAC4-derived PGE2 (via Ptges3) sustains Parp11 expression in MAC1 through cAMP/PKA/CREB, promoting PD-L1-mediated T-cell suppression; MR3 disrupts this circuitry (Fig. 4).

      Major comments

      1. Strength of the conclusions

      Two parts of the story felt well supported by the data as shown. First, the cross-species convergence on PARP11/Parp11 is a clear and potentially useful result (Fig. 2B). Second, the myeloid subclustering plus CellChat analysis makes a coherent case that PD-L1/PD-1 signaling in this model is dominated by a specific macrophage subset (MAC1) and changes after MR3 (Fig. 2H, Fig. 3). Where I was less convinced is when the manuscript moves from "transcriptomic and modeling evidence" to causal statements such as "MIRO1-mediated axis driving immunosuppression" and "MR3 reduces tumor burden by reactivating immunity." At the moment, several central inferences remain indirect:

      • Causality is inferred primarily from transcriptomic shifts and ligand-receptor inference rather than functional immune readouts.

      -We thank the Reviewer for the constructive evaluation. We have toned down the claims throughout the manuscript, with changes tracked.

      • __ On-target attribution to MIRO1 hinges on MR3 being a MIRO1 binder; the study does not include a genetic MIRO1 perturbation or a target-engagement/epistasis test in the relevant immune compartments (and the authors acknowledge this limitation in the Discussion).__

      -We have examined the on-target activity of MR3 in our other papers. For example, depleting Miro1 with CRISPRi in glioma cells (Miro1 KD cells) phenocopied the effect of MR3. We also expressed Miro1-7A, a drug-resistant Miro1 mutant predicted to be unable to bind MR3 (1), in Miro1 KD glioma cells, which rendered the cells insensitive to MR3 treatment. These data demonstrate that, in cellular glioma models, Miro1 is the target of MR3 and MR3 exerts its functions by directly binding to Miro1.

      We have also excluded off-target effects of MR3 by examining other mitochondrial GTPases, including Miro2 (1, 2).

      We agree that these experiments were not performed specifically in immune compartments. We have acknowledged this in the Discussion and added more explanation to the Introduction, citing our published papers.

      • __ The very large reduction in "tumor cell proportion" (Fig. 1E) is striking but is still a composition measure of recovered nuclei; it is not, on its own, a direct measurement of tumor size/burden and could be sensitive to differential nuclei recovery or cell loss during processing.__ -We agree that the "tumor cell proportion" in Fig. 1E represents the composition of recovered nuclei and is not, by itself, a direct measurement of tumor size or burden. We have removed "tumor burden" throughout the manuscript to avoid confusion.

      To determine whether the observed reduction might reflect technical bias, we examined the quality control metrics across all samples. Of the six initial samples (three control and three treated), one treated sample (TN1) showed clear quality concerns and was therefore excluded from downstream analysis.

      For the remaining samples, the distributions of detected genes per nucleus and total RNA counts per nucleus were similar between groups. The percentage of mitochondrial reads was consistently low, and only a small fraction of nuclei was removed during filtering, indicating overall comparable nuclei quality. Notably, the treated samples yielded similar or even higher total numbers of recovered nuclei, despite showing a lower tumor cell proportion. Please refer to new Fig. S1A for these results.

      Together, these observations suggest that the decrease in tumor cell proportion is unlikely to be explained simply by differential nuclei recovery, sequencing depth, or filtering effects. That said, we recognize that compositional differences in single-nucleus RNA sequencing data do not provide a direct measurement of tumor burden. We have revised the manuscript to clarify this point and to indicate that independent future approaches would be required for definitive assessment.

      I think the paper can go forward in its current scope, but the strength of the claims should match the level of evidence. If the authors want to keep strong, causal language in the title/abstract ("driving immunosuppression," "reduces tumor burden"), then I consider one or two targeted validation experiments essential (see below). Alternatively, the authors can temper the language and position the mechanistic model more explicitly as a hypothesis generated from the transcriptomic analysis.

      -We thank the Reviewer! We have toned down the claims throughout the manuscript so that the conclusions are consistent with the data.

      __ Statements that should be labeled as preliminary/speculative (unless additional validation is added) • MAC4-derived PGE2 as the upstream driver of MAC1 Parp11/PD-L1: plausible and nicely consistent with Ptges3 being MAC4-high in controls and reduced with MR3 (Fig. 4A), but not demonstrated.__

      -We have changed the conclusion of this part to:

      Together, these bioinformatic findings suggest that MAC4 may produce PGE₂, which could act on nearby MAC1 cells in a paracrine manner to increase Parp11 expression, although this model needs to be functionally validated.

      • __ MIRO1 → mtDNA → cGAS/STING → Ptges3 as a mechanistic chain: interesting, but currently framed largely by pathway knowledge plus modest expression changes (Supplementary Fig. S5).__

      -We have added: "which requires future functional investigation."

      • __ "MR3 reactivates anti-tumor immunity to reduce tumor burden": the gene set enrichment and CellChat shifts are consistent with immune activation, but immune-mediated tumor control is not directly tested.__

      -We have toned down these claims about tumor burden and now conclude only that MR3 may enhance anti-tumor immune responses.

      __ Replication and statistics Mouse snRNA-seq replication is limited after QC (3 control vs 2 MR3-treated animals). With n=2 treated, it is hard to know whether some of the biggest composition and cluster-level changes are robust to animal-to-animal variability.__

      -As also explained to Rev 2, we originally planned 3 mice per group. Although one treated sample was lost after QC, sample-level pseudobulk PCA (treating each mouse as one replicate) shows that the two MR3-treated samples cluster together and separate clearly from controls, indicating that the transcriptional effect of MR3 exceeds inter-animal variability and supporting technical reproducibility despite the small n (new Fig. S2C). The reduction in tumor cell proportion was also observed in both treated animals (new Fig. S2F). We have added this description to the Results (Page 5, lines 116-118) and included a new figure showing the tumor cell proportion for each animal (new Fig. S2F).

      We acknowledge this limitation. However, as the Reviewer also pointed out, the significance of our paper lies in transcriptomically linking Miro1 to well-known immune suppression factors in the glioma TME and in integrating 3 glioma databases, which will help researchers in the field advance their own research. Thus, our methods and resource should still be valid and useful to the community.

      Relatedly, the snRNA-seq differential expression is performed with Seurat FindMarkers (Wilcoxon rank-sum). Per-cell testing can inflate significance if biological replicate structure is not accounted for (pseudoreplication). I suggest the authors clarify exactly how they handled sample-level replication for the key DE results and, where possible, re-run the main DE comparisons using a sample-aware approach (e.g., pseudo-bulk within cell types/subclusters).

      -We thank the reviewer for raising this important point. In the original analysis, differential expression was performed using Seurat's FindMarkers function which performs per-cell testing. We acknowledge that this approach can overestimate significance if biological replicate structure is not explicitly accounted for.

      To address this, we re-ran the key differential expression analyses using a pseudo-bulk approach: counts were aggregated per cell type/subcluster per sample, and DE testing was performed across samples rather than across individual cells. The main results and conclusions remain consistent with the original analysis, and this approach ensures that statistical significance properly reflects biological replication (new Fig. S3D-F).
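The aggregation step described above can be sketched as follows. This is a minimal illustration only, not the authors' pipeline: the function name `pseudobulk`, the sample labels, and the toy counts are hypothetical, and the downstream DE test across samples is not shown.

```python
from collections import defaultdict

def pseudobulk(cells):
    """Sum per-nucleus counts into one profile per (sample, cell type).

    `cells` is an iterable of (sample_id, cell_type, {gene: count}) tuples.
    Returns {(sample_id, cell_type): {gene: summed_count}}.
    """
    profiles = defaultdict(lambda: defaultdict(int))
    for sample_id, cell_type, counts in cells:
        for gene, n in counts.items():
            profiles[(sample_id, cell_type)][gene] += n
    return {key: dict(genes) for key, genes in profiles.items()}

# Hypothetical toy data: two control mice and one treated mouse.
toy_cells = [
    ("ctrl_1", "MAC1", {"Parp11": 3, "Cd274": 5}),
    ("ctrl_1", "MAC1", {"Parp11": 2, "Cd274": 4}),
    ("ctrl_2", "MAC1", {"Parp11": 4}),
    ("mr3_1", "MAC1", {"Parp11": 1, "Cd274": 1}),
    ("mr3_1", "MAC4", {"Parp11": 2}),
]
bulk = pseudobulk(toy_cells)
```

Each (sample, cell type) pair then contributes a single profile to the comparison, so the effective n is the number of animals rather than the number of nuclei.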

      For the human bulk RNA-seq, the methods indicate 3 patient tissues split across MR3 vs DMSO for 24 h. In DESeq2, a paired design (including patient as a blocking factor) would be important to avoid patient-to-patient variability dominating the treatment signal; the manuscript should confirm whether the design formula accounted for this.

      -In the revised manuscript, we re-ran the DESeq2 analysis using a paired design with patient as a blocking factor and compared DMSO and MR3 within each patient (P1-P3). The results are consistent with our previous analysis. PARP11 remains significantly downregulated (raw p-value

      Finally, several places in the Methods define significance using p-value cutoffs (e.g., GEPIA3 TCGA/GTEx analysis uses p 1; human DE uses p = 1). Because multiple testing is substantial in all of these analyses, I recommend reporting FDR-adjusted values consistently (and being explicit about whether figures/tables show raw or adjusted p-values).

      -We have now used FDR-adjusted values for the TCGA/GTEx analysis and have updated Fig. 1C (top left), Results, and Methods accordingly. PARP11 remains significant after FDR correction.
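For readers less familiar with FDR adjustment, the sketch below implements the Benjamini-Hochberg procedure, one common way to obtain FDR-adjusted values. It is an assumption that this is the correction used here, since the response does not name the method; the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling each by m / rank and
    # taking a running minimum so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for four genes.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
```

An adjusted value is never smaller than its raw p-value, which is why a gene can pass a raw cutoff yet fail the same cutoff after correction.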

      For the human bulk RNA-seq, very few genes pass an adjusted p 2FC| > 1 across all four differential expression analyses and updated the corresponding description in Methods.

      __ Do the data support the macrophage-to-CD8 suppression claim? The CellChat PD-L1/PD-1 network figures are suggestive (Fig. 3C/E), but ligand-receptor inference is not the same as demonstrating functional T-cell inhibition. At minimum, I would like to see one orthogonal readout (flow or immunostaining) showing that PD-L1__ protein on myeloid cells and PD-1 on CD8 T cells change in the expected directions after MR3, and that CD8 T cells show an activation/effector signature at the protein level.

      -We agree that this would clearly be the next step in functional studies. However, the current manuscript focuses on transcriptomic analysis and method building, so we have toned down all claims at the functional level.

      In addition, we have observed that T cells after MR3 treatment show upregulation of cytotoxicity- and IFN-response-related genes, consistent with enhanced effector function at the transcriptional level. We have added new Fig. S6A and an explanation in the Results.

      __ PARP11: mediator vs marker The cross-species PARP11 result is the most convincing and potentially generalizable finding in the manuscript (Fig. 2B). However, in the specific context of this study, PARP11 is still best supported as a conserved MR3-responsive candidate rather than a demonstrated causal driver of PD-L1-mediated suppression. If the authors want to argue PARP11 is an effector of the pathway (rather than a marker), they should either soften the language or add a minimal functional linkage experiment within the existing scope (see "Optional" experiments below).__

      -We have softened the language throughout the manuscript to emphasize the correlative nature of the findings, to present PARP11 as a marker, and to reflect the bioinformatic nature of the study. Because the main goal of this paper is method development and resource building, and it already contains 11 figures, we think functional experiments are better addressed in a separate paper.

      __ Reproducibility and clarity of methods I appreciate that the authors provide a code/data portal (MiroScape) and a GitHub link. To make the study as reproducible as possible, I recommend: • Deposit raw sequencing reads for both mouse and human datasets (GEO/SRA) and include accession numbers in the manuscript.__

      -We have just deposited all raw data. Accession numbers will be provided once the data are public.

      • __ Provide a short, consolidated "computational reproducibility" note with software versions and key parameters (Seurat, CellChat, STAR, DESeq2, etc.).__ -Added

      • __ Clarify pseudo-bulk construction (what is aggregated, at what level, and how many biological replicates contribute to each pseudo-bulk comparison).__ -Added

      • __ Add a brief summary of MR3 provenance/validation and what "MIRO1-binding" means operationally in the context of these experiments (especially for readers outside the MIRO1 field).__ -We have added this in Introduction.

      Experiments requested (kept within the existing claims) I am intentionally not suggesting new lines of experimentation. The experiments below are aimed only at supporting the paper's current central claims. I separate them into items I consider essential vs optional, depending on how strongly the authors want to phrase mechanistic conclusions.

      -We thank the Reviewer. We have toned down the claims to reflect the bioinformatic nature of the paper. We will perform the experiments suggested below in a separate paper.

      Essential if the title/abstract continue to use strong causal language • Protein-level validation of the PD-L1/PD-1 axis and CD8 activation in the GL261 model. A focused flow cytometry panel (myeloid PD-L1; CD8 PD-1 plus one or two effector markers such as GZMB/IFNG/Ki67) or multiplex IF/IHC on tumor sections would substantially strengthen the central MAC1 → CD8 claim. • An orthogonal measure of tumor burden in the same treatment paradigm. The manuscript currently treats the drop in the fraction of nuclei annotated as tumor (Fig. 1E) as a reduction in tumor burden; I recommend including IVIS longitudinal data and/or histologic tumor area/volume at harvest to support this statement. • If feasible, modestly increase in vivo biological replication (the snRNA-seq analysis currently has n=2 treated after QC). Even adding one additional treated animal that passes QC would help. Feasibility (rough guidance only; core pricing varies widely by institution): a repeat GL261 cohort to harvest tumors for flow and/or histology typically takes ~3-6 weeks end-to-end. A small flow panel plus core time is often on the order of a few thousand USD (antibodies and cytometry), while basic histology/IF quantification might be in the hundreds to low-thousands. If the authors already have stored tissue from the existing cohort, some of this could be faster/cheaper. Optional (only if the authors want the MAC4 → PGE2 → Parp11 mechanism to be more than a model) • Measure PGE2 (ELISA or targeted lipidomics) in tumor lysates/conditioned media from control vs MR3-treated samples, or provide a closer proxy for PGE2 pathway engagement in the relevant clusters. Optional (only if the authors want to argue PARP11 is an effector) • A minimal functional linkage experiment (in vitro) testing whether PARP11 perturbation phenocopies the relevant aspect of MR3 in macrophages (e.g., PD-L1 levels and/or the ability to suppress CD8 activation in a co-culture). 
This could be done with a PARP11 inhibitor or knockdown. I do not think in vivo genetics are required for this manuscript, but some functional tie would prevent overinterpretation.

      __ Minor comments A. Analysis/experimental clarifications that seem straightforward • Human DESeq2: please clarify whether the DESeq2 design was paired by patient (i.e., patient as a blocking factor).__

      -See above. We re-ran the human differential expression analysis using a paired design with patient as a blocking factor and have explained this in the Methods.
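The reason pairing matters can be illustrated with a small sketch. It is not DESeq2 and not the authors' code: it computes within-patient log2 fold changes and a paired t statistic for a single gene, using hypothetical expression values. In DESeq2 itself, a blocking factor is typically expressed with a design of the form `~ patient + treatment`; the exact formula used in the manuscript is an assumption here.

```python
import math
from statistics import mean, stdev

def paired_log2fc(control, treated, pseudocount=1.0):
    """Per-patient log2 fold changes (treated vs control) for one gene."""
    return [
        math.log2((t + pseudocount) / (c + pseudocount))
        for c, t in zip(control, treated)
    ]

def paired_t(diffs):
    """Paired t statistic: mean within-patient change over its standard error."""
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

# Hypothetical normalized expression of one gene in patients P1-P3.
# Baselines differ 16-fold between patients, but each patient shows
# roughly a halving after treatment.
dmso = [99.0, 399.0, 1599.0]
mr3 = [49.0, 199.0, 849.0]

diffs = paired_log2fc(dmso, mr3)
t_stat = paired_t(diffs)
```

Because each patient serves as their own control, the large baseline differences between patients cancel out and only the within-patient change contributes to the statistic.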

      • __ snRNA-seq DE: please clarify whether any sample-aware method was used for the key DE conclusions (especially Parp11/Cd274 changes) rather than per-cell statistics alone.__ -See above. The key DE results are based on sample-level pseudobulk (each mouse as one replicate). The two MR3-treated samples cluster together in pseudobulk PCA (new Fig. S2C), and the tumor reduction is seen in both animals (new Fig. S2F), supporting robustness to animal variability.

      • __ CellChat: because min.cells filtering is used (min.cells = 20), please note this explicitly in figure legends where subclusters appear only in one condition, so readers understand why certain labels are missing.__ -We have edited the Fig 3 legend accordingly.

      __ Figure and text consistency issues I noticed several figure/legend/citation issues that look like simple fixes: • Fig. 3 legend panel labeling: the legend text refers to the PD-L1/PD-1 chord plot as (C) MR3− and (D) MR3+, but (D) is the heatmap panel; the chord plots are (C) and (E). This should likely read (C) MR3− and (E) MR3+.__

      -Yes, this has been corrected.

      • __ Fig. 5 panel reference: the Results text refers to the Cross Species module as Fig. 5F, but the Fig. 5 legend defines panels (A-E) and labels (E) as "Cross Species module." Please reconcile (either change the text to Fig. 5E or add a panel F).__ -Changed to "E".

      • __ Discussion figure citation: the Discussion cites Ptges3/PGE2 evidence as "(Figure 3)," but Ptges3 is shown in Fig. 4A and the model is in Fig. 4B.__ -Added "Figure 4A-B" there.

      • __ Fig. 1D numbers: the Results text states 509/1,602 (mouse) and 15/106 (human) "rescued" genes (Fig. 1D), but the Fig. 1D pie charts are labeled with different totals (mouse total 3490; human total 104). Please reconcile the denominators and ensure the figure matches the text and analysis choice (bulk vs snRNA vs filtered gene sets).__ -For the cross-species analysis, we only counted genes with human-mouse orthologs so that the two datasets were compared in the same gene space. This avoids inflation from species-specific genes. We have added a clarification in the figure legend.
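The effect of restricting to a shared gene space on the denominators can be sketched as below. This is an illustrative assumption of how a 1:1 ortholog filter might work, not the authors' implementation; the function names and the placeholder genes `GENEA`, `Genea1`, `Genea2`, and `GENEB` are hypothetical.

```python
from collections import Counter

def one_to_one_orthologs(pairs):
    """Keep only human-mouse pairs where each gene appears exactly once."""
    human_n = Counter(h for h, _ in pairs)
    mouse_n = Counter(m for _, m in pairs)
    return {h: m for h, m in pairs if human_n[h] == 1 and mouse_n[m] == 1}

def shared_gene_space(human_genes, mouse_genes, ortholog_map):
    """Human gene symbols whose 1:1 ortholog is present in both datasets."""
    return sorted(
        h for h, m in ortholog_map.items()
        if h in human_genes and m in mouse_genes
    )

# Hypothetical ortholog table; GENEA maps to two mouse genes and is dropped.
pairs = [
    ("PARP11", "Parp11"),
    ("CD274", "Cd274"),
    ("GENEA", "Genea1"),
    ("GENEA", "Genea2"),
]
mapping = one_to_one_orthologs(pairs)
shared = shared_gene_space({"PARP11", "CD274", "GENEB"}, {"Parp11", "Cd274"}, mapping)
```

Genes without a 1:1 ortholog drop out of both numerator and denominator, which is one way totals in a figure can differ from totals quoted for a single-species analysis.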

      • __ Fig. 2 legend: there is a stray quote in "lymphoid subclusters" (appears as subclusters").__ -removed.

      __ Presentation and framing • Tone down or carefully qualify statements equating snRNA-seq composition shifts with reduced tumor burden (or add an orthogonal tumor-burden measurement as suggested above).__

      -We have removed "tumor burden" throughout the manuscript.

      • __ Where possible, tie mechanistic language explicitly to the level of evidence ("consistent with," "suggests," "model proposes") so readers do not over-interpret the transcriptomic inference.__ -done.

      • __ Consider adding a small schematic in the Results or a short "interpretation" sentence in the figure legends explaining what the CellChat plots do and do not show, since non-specialists can misread these as direct interaction measurements.__ -We have added explanations in Fig 3 legends for CellChat and emphasized the transcriptomic nature of the data.

      __ Prior literature The PARP11 immunotherapy literature is cited appropriately. For the PGE2 angle, it may help readers if the authors add one or two glioma-focused references on PGE2-mediated myeloid/T-cell suppression (if not already in the full reference list).__

      -We have added two more papers showing that PGE2 may induce MDSCs and immunosuppression in glioma (3, 4).

      Significance

      Nature and significance of the advance The advance here is primarily conceptual and resource-oriented. Conceptually, the work connects a mitochondrial regulator (MIRO1) to a specific, testable immunosuppressive circuit in the glioma TME. Technically, the cross-species perturbation framework and the accompanying MiroScape portal should be useful to groups looking for conserved, drug-responsive immune programs.

      Context within the existing literature Immunosuppression in glioma and the importance of tumor-associated myeloid populations are well established, as is the limited success of checkpoint blockade in GBM. The manuscript's proposed MAC4/MAC1 paracrine model and its emphasis on PD-L1/PD-1 signaling adds a focused, hypothesis-generating view of how particular macrophage states might sustain CD8 dysfunction. The identification of PARP11 as a conserved MR3-responsive gene also fits with emerging work implicating PARP11 in immunoregulatory programs and response to immunotherapy.

      Audience • Neuro-oncology and glioma TME researchers (myeloid heterogeneity, immune suppression). • Tumor immunology groups interested in myeloid-driven checkpoint resistance. • Researchers working on mitochondrial stress signaling and immunometabolism. • Computational biologists building cross-species or multi-modal integration frameworks. Reviewer expertise and limitations Keywords: glioma microenvironment; macrophage/microglia biology; tumor immunology; single-cell/nucleus transcriptomics; computational ligand-receptor inference. Limitations: I am not a medicinal chemist, so I cannot deeply evaluate MR3 chemistry, PK/PD, or specificity beyond what is presented. I also did not evaluate the full web-portal implementation beyond the manuscript description.

      Reviewer #2

      Evidence, reproducibility and clarity

      The authors study responses to MIRO1 inhibition in a mouse model of GL261 GBM and in human tissue pieces treated ex vivo. They provide an interesting link between mitochondrial function and potential therapeutic outcomes in a tumor type that is typically challenging to treat. The manuscript is written clearly, in correct English language and figures are well structured and easy to interpret.

      -We thank the Reviewer for the positive comments. We want to clarify that the compound binds to Miro1 but does not inhibit Miro1's GTPase activity (1). We have now added an explanation to the Introduction.

      __ Major critique: 1. However, I need to stress that study is based of few experiments with low robustness. The predominant experiment is single-nuclei RNAseq analysis of GL261 tumors implanted into mice, constituting 3 CTRL and 2 treated mice, due to removal of 3rd animal following sequencing (low recovery of high quality nuclei). Therefore, the sample group is small. This is understandable for snRNA-seq experiment (although 3 animals in treated group is somewhat necessary), but the efficiency of treatment with MR3 should be better documented in a larger cohort of animals. Crucial changes in distribution of cell types or polarisation of myeloid cells should be confirmed with flow cytometry, which is more feasible on a larger cohort.__

      -We agree. As explained to Rev 3, the current paper focuses on conceptual and methodological advances and on providing a resource to the community, and it is already extensive with 11 figures. As Rev 1 mentioned, the significance of our paper lies in transcriptomically linking Miro1 to well-known immune suppression factors in the glioma TME and in integrating 3 glioma databases, which will help researchers in the field advance their own research. Importantly, animal-level PCA showed clear separation of treated from untreated groups, and the reduction in tumor cell proportion was observed in both treated animals (new Fig. S2C, F), supporting technical reproducibility despite the small n. Thus, our methods and resource should still be valid and useful to the community. Exploring the tumor-reducing efficacy of MR3 or combined treatments (e.g., with anti-PD-L1 or a PARP11 inhibitor) in larger cohorts is an exciting next step.

      __ Human model does not seem robust (also, only 3 patients). Very few genes are affected by treatment (incomparably less than in mice), which poses a question if the model is sufficient to study the effect of the treatment. This should be at least discussed and arguments should be stated why such model is suitable.__

      -We agree. The observed variability in treatment response is expected and is consistent with the well-established molecular and phenotypic heterogeneity of human glioma. Importantly, despite this diversity, we identified one gene (PARP11) that is consistently altered across all patients' samples and the mouse model. This cross-species reproducibility supports the biological and translational relevance of the PARP11 finding. We have now added this to the Discussion.

      In addition, we reanalyzed the human bulk RNA-seq using a paired design with patient as a blocking factor as suggested by another reviewer, which increased the number of DE genes (new Fig. 1C).

      __ Fig. S1E shows that actually few genes are commonly affected between human and mouse experiments. So conclusion about "conserved" modulation by MR3 seem an overstatement.__

      -We meant that "Parp11" is conserved. To avoid confusion, we have deleted "conserved" throughout the manuscript wherever it did not refer specifically to Parp11.

      __ Mechanistic conclusions about PARP11, PGE, PD-L1 etc are not documented by any wet lab experiments, just by bioinformatic modelling.__ -We have scrutinized the Main Text to emphasize this.

      Minor: 1. Authors should discuss choice of GL261 model. It is immunogenic and does not resemble human GBM ideally, so the choice should be explained.

      -Although the GL261 model is more immunogenic than human GBM, this feature enables evaluation of immune-modulating therapies and mechanisms in an immune-competent setting. The model still preserves critical aspects of glioma biology, including an immunosuppressive TME, invasive behavior, and intracranial growth (5). Thus, it provides a suitable platform for investigating immune cell mechanisms in the TME. We have now added this to the Methods.

      __ In clustering of mouse snRNAseq data, T cells seem underclustered, e.g. Treg cluster clearly constitutes half of Il2ra-positive and negative cells, the latter probably being conventional CD4+ T cells (usually CD4+ T cells in GL261 are 50:50 Treg and conventional). This can affect further conclusions on cell:cell interactions.__

      -We thank the reviewer for this important observation. We agree that in the former annotation, it was improper to annotate all the CD4+ T cells as Treg cells, given the limited expression of Foxp3, Il2ra and other Treg marker genes. Consequently, the previously annotated "Treg cluster" likely includes both regulatory-like and conventional CD4+ T cells.

      We further clustered the CD4+ T cell population and found that dividing it into conventional CD4+ T cells and Treg cells yielded too few Treg cells (~50) for downstream analysis. This would compromise the robustness and reliability of our subsequent analyses (CellChat, DEA, etc.).

      To address this, we have revised our annotation and now refer to this population more conservatively as "regulatory-like CD4+ T cells" rather than bona fide Tregs. Importantly, this subset still exhibits elevated expression of immunoregulatory molecules and is associated with CD8+ T cell dysfunction, preserving the main conclusions regarding immune suppression within the tumor microenvironment. We have updated the Results, Figures, and Discussion accordingly to clarify this revised annotation and its implications for cell-cell interactions.

      Please refer to the following new figures for the updated annotation and associated results:

      Fig. 2G-H, Fig. 3A-G, Fig. S4C-D, G, Fig. S5B-G, Fig. S6A.

      Significance

      The study provides an interesting conclusion and potentially relevant discovery. However, in opinion of this reviewer, the performed experiments do not strengthen this sufficiently, especially in terms of mechanical insights and weak data on human samples. In the line of general literature on new treatments of GBM and testing thereof in mouse model, this study lacks mechanistic insights and solid data on therapeutic efficiency.

      -As mentioned above, the goal of this paper is to provide novel methods for integrating datasets, to build a resource, and to identify markers in the glioma TME. It will serve as a useful resource for the community and form the foundation for future therapeutic validation in larger cohorts. We have acknowledged the limitations in the revised manuscript.

      __Reviewer #3 (Evidence, reproducibility and clarity (Required)): ____

      The authors of "Cross-Species Transcriptomic Integration Reveals a Conserved, MIRO1-Mediated Macrophage-to-T-Cell Signaling Axis Driving Immunosuppression in Glioma" present transcriptomic, both bulk RNA Seq and single nucleus RNA Seq, from GL261 murine gliomas treated with the Miro1 targeting compound MR3. RNA Seq data from human tumor explants treated with MR3 is also presented. The authors compared DEGs from their treated tissues with publicly available RNA Seq data sets comparing DEGs from normal tissue and Glioma tumors. The goal being to identify genes modulated by MR3 that may be underlying glioma growth, TME changes, and immunosuppression. There is a significant amount of data presented, with in-depth analysis conducted on the sequencing data sets. The manuscript is lacking in mechanistic depth and this reviewer feels that the results are over-interpreted, especially without any additional conformational assays run to confirm the interpretation of the sequencing data. There were many bold statements made (lines 109-110, 117, 130-131, 142-144, 163-165) that I felt did not have enough evidence to back up their claims. __

      -We have toned down the passages mentioned above:

      Line 109-110: Deleted now

      Line 117: Deleted now

      Line 130-131: Deleted now

      Line 142-144: Deleted: "highly differentially expressed", the rest of the sentence is supported by our data.

      Line 163-165: Deleted now

      As explained later, our paper is focused on bioinformatic analysis and resource and method building. In-depth functional studies will be performed in another paper.

      __A significant concern is the lack of conformation that MR3 is targeting Miro1 in these models. __

      -We have done this in another manuscript where we show that in cellular glioma models, Miro1 is the target of MR3 and MR3 exerts its functions via directly binding to Miro1.

      __Previous publications from the authors have shown evidence that MR3 reduces Miro1 expression in cell and fly models. Sometimes this requires the co application of FCCP or antimycin A. Thus, the results attributed within cannot be attributed to Miro1 changes but rather any on or off-target effect of MR3. __

      -We originally discovered MR3 by ligand-based in silico modeling and a thermal shift direct binding assay (1, 2). Thus, MR3 is a Miro1 binder (as stated in the Abstract and Introduction; we have now added more background to the Introduction). Indeed, under certain conditions MR3 reduced Miro1 protein levels, for example in vivo in flies after days of feeding (1, 2), or in PD cells upon Antimycin A or CCCP treatment (1, 2, 6, 7). MR3 most likely exerts its function by altering Miro1 protein-protein interactions (8), and Miro1 protein is subsequently degraded in proteasomes following complex dissociation or after posttranslational modifications (1, 2, 8). We have stated this hypothesis in the Results section (page 10, possible model).

      In our other papers, we have excluded off-target effects of MR3 by examining other mitochondrial GTPases, including Miro2 (1, 2), and by showing that Miro1 KD glioma cells phenocopied the effects of MR3 and that a drug-resistant Miro1 mutant rendered glioma cells insensitive to MR3. These data show that Miro1 is the main target of MR3.

      We have added more explanations to the Introduction.

      __Understanding that mouse studies are expensive and time-consuming, and the acquisition of human tissue is not trivial, the sample sets are still small. Further confirmation of findings in cell models, organoids etc. would strengthen the findings and justify the smaller sample size of mice and human tissue. __

      -We agree, and we have another in-depth study. However, the current paper focuses on conceptual and methodological advances and on providing a resource to the community, and it is already extensive with 11 figures. As Rev 1 mentioned, the significance of our paper lies in transcriptomically linking Miro1 to well-known immune suppression factors in the glioma TME and in integrating 3 glioma databases, which will help researchers in the field advance their own research. Thoroughly understanding Miro1's role in the glioma TME is our next goal, as stated in the Discussion, and is beyond the scope of the current study.

      __The website MiroScape will be a very useful tool in the proper hands. ____

      1. Confirm activity of MR3 on Miro1 in relevant samples. Direct downregulation? Modulation of other targets known to be altered by MR3? __

      -As mentioned above, we have shown that in tumor cells MR3 disrupts pathogenic Miro1-protein interactions without the need to reduce Miro1 protein. No other target is currently known to be altered by MR3, not even Miro2, as demonstrated previously (1, 2). We have added more explanation to the Main Text.

      2. Conduct further mechanistic work to validate claims inferred by differentially expressed genes.

      -As mentioned above, our current paper is focused on bioinformatic methods and resource building. Further mechanistic work will be performed in another paper.

      3. Significantly temper claims related to cell targeting, direct communication between cells and overarching responses inferred from Sequencing data.

      -Done. See above and Main Text.

      Reviewer #3 (Significance (Required)):

      My laboratory's expertise lies in signaling related to mitochondrial structure and function. We have investigated the Miro1 protein and effects on cellular responses related to Miro1 expression. We have tested the MR3 compound in our own systems with limited success. Therefore my major concerns lie in validating the on-target activity of the compound in their models.

      -As explained above, in our other papers we have thoroughly examined the on-target activity of MR3 by counter-screening other Miro1-related/similar proteins (1, 2, 6, 7) and by using Miro1 KD cells. We have now added more explanations to the Main Text.

      With additional mechanistic validation this could be a very significant study. Using advanced model systems as the authors do allows for a comprehensive understanding of tissue responses. This is far advanced from simple single cell line culture studies but also adds significant complexity to the interpretation of the data. I am a strong believer that Sequencing data must be validated with functional assays.

      -We agree and are actively conducting those studies. However, bioinformatic analysis and method and resource building are sometimes too comprehensive to combine with functional data which may take years to obtain. We think our paper's method, markers identified in TME, and resources will be very useful to the community.

      References

      1. Hsieh CH, Li L, Vanhauwaert R, Nguyen KT, Davis MD, Bu G, Wszolek ZK, Wang X. Miro1 Marks Parkinson's Disease Subset and Miro1 Reducer Rescues Neuron Loss in Parkinson's Models. Cell metabolism. 2019;30(6):1131-40 e7. Epub 2019/10/01. doi: 10.1016/j.cmet.2019.08.023. PubMed PMID: 31564441; PMCID: PMC6893131.
      2. Li L, Conradson DM, Bharat V, Kim MJ, Hsieh CH, Minhas PS, Papakyrikos AM, Durairaj AS, Ludlam A, Andreasson KI, Partridge L, Cianfrocco MA, Wang X. A mitochondrial membrane-bridging machinery mediates signal transduction of intramitochondrial oxidation. Nat Metab. 2021. Epub 2021/09/11. doi: 10.1038/s42255-021-00443-2. PubMed PMID: 34504353.
      3. Mi Y, Guo N, Luan J, Cheng J, Hu Z, Jiang P, Jin W, Gao X. The Emerging Role of Myeloid-Derived Suppressor Cells in the Glioma Immune Suppressive Microenvironment. Front Immunol. 2020;11:737. Epub 2020/05/12. doi: 10.3389/fimmu.2020.00737. PubMed PMID: 32391020; PMCID: PMC7193311.
      4. Dean PT, Hooks SB. Pleiotropic effects of the COX-2/PGE2 axis in the glioblastoma tumor microenvironment. Front Oncol. 2022;12:1116014. Epub 2023/01/26. doi: 10.3389/fonc.2022.1116014. PubMed PMID: 36776369; PMCID: PMC9909545.
      5. Mathios D, Kim JE, Mangraviti A, Phallen J, Park CK, Jackson CM, Garzon-Muvdi T, Kim E, Theodros D, Polanczyk M, Martin AM, Suk I, Ye X, Tyler B, Bettegowda C, Brem H, Pardoll DM, Lim M. Anti-PD-1 antitumor immunity is enhanced by local and abrogated by systemic chemotherapy in GBM. Science translational medicine. 2016;8(370):370ra180. Epub 2016/12/23. doi: 10.1126/scitranslmed.aag2942. PubMed PMID: 28003545; PMCID: PMC5724383.
      6. Bharat V, Durairaj AS, Vanhauwaert R, Li L, Muir CM, Chandra S, Kwak CS, Le Guen Y, Nandakishore P, Hsieh CH, Rensi SE, Altman RB, Greicius MD, Feng L, Wang X. A mitochondrial inside-out iron-calcium signal reveals drug targets for Parkinson's disease. Cell Rep. 2023;42(12):113544. Epub 2023/12/07. doi: 10.1016/j.celrep.2023.113544. PubMed PMID: 38060381.
      7. Bharat V, Hsieh CH, Wang X. Mitochondrial Defects in Fibroblasts of Pathogenic MAPT Patients. Front Cell Dev Biol. 2021;9:765408. Epub 2021/11/23. doi: 10.3389/fcell.2021.765408. PubMed PMID: 34805172; PMCID: PMC8595217.
      8. Kwak CS, Du Z, Creery JS, Wilkerson EM, Major MB, Elias JE, Wang X. Optogenetic Proximity Labeling Maps Spatially Resolved Mitochondrial Surface Proteomes and a Locally Regulated Ribosome Pool. bioRxiv. 2025. Epub 2026/01/07. doi: 10.64898/2025.12.21.693523. PubMed PMID: 41497653; PMCID: PMC12767525.
    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      The authors of "Cross-Species Transcriptomic Integration Reveals a Conserved, MIRO1-Mediated Macrophage-to-T-Cell Signaling Axis Driving Immunosuppression in Glioma" present transcriptomic data, both bulk RNA Seq and single nucleus RNA Seq, from GL261 murine gliomas treated with the Miro1-targeting compound MR3. RNA Seq data from human tumor explants treated with MR3 are also presented. The authors compared DEGs from their treated tissues with publicly available RNA Seq data sets comparing normal tissue and glioma tumors, with the goal of identifying genes modulated by MR3 that may underlie glioma growth, TME changes, and immunosuppression. There is a significant amount of data presented, with in-depth analysis conducted on the sequencing data sets. The manuscript is lacking in mechanistic depth and this reviewer feels that the results are over-interpreted, especially without any additional confirmatory assays run to confirm the interpretation of the sequencing data. There were many bold statements made (lines 109-110, 117, 130-131, 142-144, 163-165) that I felt did not have enough evidence to back up their claims. A significant concern is the lack of confirmation that MR3 is targeting Miro1 in these models. Previous publications from the authors have shown evidence that MR3 reduces Miro1 expression in cell and fly models. Sometimes this requires the co-application of FCCP or antimycin A. Thus, the results reported here cannot be attributed to Miro1 changes, but rather to any on- or off-target effect of MR3. Understanding that mouse studies are expensive and time-consuming, and the acquisition of human tissue is not trivial, the sample sets are still small. Further confirmation of findings in cell models, organoids etc. would strengthen the findings and justify the smaller sample size of mice and human tissue. The website MiroScape will be a very useful tool in the proper hands.

      1. Confirm activity of MR3 on Miro1 in relevant samples. Direct downregulation? Modulation of other targets known to be altered by MR3?
      2. Conduct further mechanistic work to validate claims inferred by differentially expressed genes.
      3. Significantly temper claims related to cell targeting, direct communication between cells and overarching responses inferred from Sequencing data.

      Significance

      My laboratory's expertise lies in signaling related to mitochondrial structure and function. We have investigated the Miro1 protein and effects on cellular responses related to Miro1 expression. We have tested the MR3 compound in our own systems with limited success. Therefore my major concerns lie in validating the on-target activity of the compound in their models.

      With additional mechanistic validation this could be a very significant study. Using advanced model systems as the authors do allows for a comprehensive understanding of tissue responses. This is far advanced from simple single cell line culture studies but also adds significant complexity to the interpretation of the data. I am a strong believer that Sequencing data must be validated with functional assays.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      The authors study responses to MIRO1 inhibition in a mouse model of GL261 GBM and in human tissue pieces treated ex vivo. They provide an interesting link between mitochondrial function and potential therapeutic outcomes in a tumor type that is typically challenging to treat. The manuscript is written clearly, in correct English language and figures are well structured and easy to interpret.

      Major critique:

      1. However, I need to stress that the study is based on few experiments with low robustness. The predominant experiment is a single-nuclei RNAseq analysis of GL261 tumors implanted into mice, comprising 3 CTRL and 2 treated mice, due to removal of the 3rd treated animal following sequencing (low recovery of high quality nuclei). Therefore, the sample group is small. This is understandable for an snRNA-seq experiment (although 3 animals in the treated group is somewhat necessary), but the efficiency of treatment with MR3 should be better documented in a larger cohort of animals. Crucial changes in the distribution of cell types or polarisation of myeloid cells should be confirmed with flow cytometry, which is more feasible on a larger cohort.
      2. The human model does not seem robust (also, only 3 patients). Very few genes are affected by treatment (far fewer than in mice), which raises the question of whether the model is sufficient to study the effect of the treatment. This should at least be discussed, and arguments should be given for why such a model is suitable.
      3. Fig. S1E shows that few genes are actually commonly affected between the human and mouse experiments. So the conclusion about "conserved" modulation by MR3 seems an overstatement.
      4. Mechanistic conclusions about PARP11, PGE, PD-L1 etc are not documented by any wet lab experiments, just by bioinformatic modelling.

      Minor:

      1. Authors should discuss choice of GL261 model. It is immunogenic and does not resemble human GBM ideally, so the choice should be explained.
      2. In the clustering of mouse snRNAseq data, T cells seem underclustered, e.g. the Treg cluster clearly consists of roughly half Il2ra-positive and half Il2ra-negative cells, the latter probably being conventional CD4+ T cells (usually CD4+ T cells in GL261 are 50:50 Treg and conventional). This can affect further conclusions on cell:cell interactions.

      Significance

      The study provides an interesting conclusion and a potentially relevant discovery. However, in the opinion of this reviewer, the performed experiments do not support this sufficiently, especially in terms of mechanistic insights and given the weak data on human samples. In the context of the general literature on new treatments for GBM and their testing in mouse models, this study lacks mechanistic insights and solid data on therapeutic efficiency.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Summary of findings and key conclusions

      This manuscript asks how pharmacologic targeting of the outer mitochondrial membrane protein MIRO1 (RHOT1) with a MIRO1-binding compound (MR3) reshapes immunosuppressive programs in the glioma tumor microenvironment (TME). The core of the paper is a cross-species transcriptomic comparison that combines an in vivo mouse dataset with an ex vivo human perturbation dataset.

      Model systems and approach (as described):

      • Mouse in vivo: GL261-Luc intracranial glioma in C57BL/6J mice; MR3 is administered intracranially at the implantation site (10 µM in 5 µL DMSO) on days 11 and 18, and tumors are harvested on day 22 for single-nucleus RNA-seq (snRNA-seq).
      • Mouse snRNA-seq: NeuN-based nuclei sorting, 10x Genomics v3.1; alignment to mm10; Seurat-based integration and annotation. Tumor-cell calling is supported by CNV inference (SCEVAN/CopyKAT). One MR3-treated sample is excluded after QC, leaving 3 control vs 2 MR3-treated samples (11,940 NeuN− nuclei).
      • Human ex vivo: freshly resected glioma cores from 3 patients are cultured with 10 µM MR3 or DMSO for 24 h, followed by bulk RNA-seq (STAR alignment to hg19; DESeq2 for differential expression).
      • Cross-species integration: the analysis is restricted to 1:1 orthologs and protein-coding genes shared across datasets; inferred cell-cell signaling is explored with CellChat.

      Main findings (as presented):

      • MR3 shifts expression of a subset of glioma-associated genes toward a non-tumor-like direction ("rescued genes") and is associated with large changes in inferred cell-type composition in the mouse snRNA-seq dataset (including a marked drop in the fraction of nuclei annotated as tumor: 44.5% to 4.3%; Fig. 1E).
      • Across TCGA-vs-GTEx (glioma-upregulated genes) and three MR3 response analyses (mouse snRNA-seq, mouse pseudo-bulk, and human bulk RNA-seq), PARP11/Parp11 is reported as the only gene that is consistently upregulated in glioma and consistently downregulated by MR3 (Fig. 2B).
      • Within the mouse myeloid compartment, Parp11 is most enriched in MAC4 and MAC1, while MAC1 shows high Cd274 (Pdl1/PD-L1). MR3 reduces Parp11 in MAC4/MAC1 and reduces Cd274 in MAC1 (Fig. 2H).
      • CellChat analysis suggests that in controls MAC1 is the dominant sender of PD-L1/PD-1 signaling to CD8+ T cells (Fig. 3C), and that this PD-L1/PD-1 interaction is strongly diminished after MR3 (Fig. 3E).
      • The authors propose a paracrine model in which MAC4-derived PGE2 (via Ptges3) sustains Parp11 expression in MAC1 through cAMP/PKA/CREB, promoting PD-L1-mediated T-cell suppression; MR3 disrupts this circuitry (Fig. 4).

      Major comments

      1. Strength of the conclusions. Two parts of the story felt well supported by the data as shown. First, the cross-species convergence on PARP11/Parp11 is a clear and potentially useful result (Fig. 2B). Second, the myeloid subclustering plus CellChat analysis makes a coherent case that PD-L1/PD-1 signaling in this model is dominated by a specific macrophage subset (MAC1) and changes after MR3 (Fig. 2H, Fig. 3). Where I was less convinced is when the manuscript moves from "transcriptomic and modeling evidence" to causal statements such as "MIRO1-mediated axis driving immunosuppression" and "MR3 reduces tumor burden by reactivating immunity." At the moment, several central inferences remain indirect:
        • Causality is inferred primarily from transcriptomic shifts and ligand-receptor inference rather than functional immune readouts.
        • On-target attribution to MIRO1 hinges on MR3 being a MIRO1 binder; the study does not include a genetic MIRO1 perturbation or a target-engagement/epistasis test in the relevant immune compartments (and the authors acknowledge this limitation in the Discussion).
        • The very large reduction in "tumor cell proportion" (Fig. 1E) is striking but is still a composition measure of recovered nuclei; it is not, on its own, a direct measurement of tumor size/burden and could be sensitive to differential nuclei recovery or cell loss during processing.
        I think the paper can go forward in its current scope, but the strength of the claims should match the level of evidence. If the authors want to keep strong, causal language in the title/abstract ("driving immunosuppression," "reduces tumor burden"), then I consider one or two targeted validation experiments essential (see below). Alternatively, the authors can temper the language and position the mechanistic model more explicitly as a hypothesis generated from the transcriptomic analysis.
      2. Statements that should be labeled as preliminary/speculative (unless additional validation is added)
        • MAC4-derived PGE2 as the upstream driver of MAC1 Parp11/PD-L1: plausible and nicely consistent with Ptges3 being MAC4-high in controls and reduced with MR3 (Fig. 4A), but not demonstrated.
        • MIRO1 → mtDNA → cGAS/STING → Ptges3 as a mechanistic chain: interesting, but currently framed largely by pathway knowledge plus modest expression changes (Supplementary Fig. S5).
        • "MR3 reactivates anti-tumor immunity to reduce tumor burden": the gene set enrichment and CellChat shifts are consistent with immune activation, but immune-mediated tumor control is not directly tested.
      3. Replication and statistics. Mouse snRNA-seq replication is limited after QC (3 control vs 2 MR3-treated animals). With n=2 treated, it is hard to know whether some of the biggest composition and cluster-level changes are robust to animal-to-animal variability. Relatedly, the snRNA-seq differential expression is performed with Seurat FindMarkers (Wilcoxon rank-sum). Per-cell testing can inflate significance if biological replicate structure is not accounted for (pseudoreplication). I suggest the authors clarify exactly how they handled sample-level replication for the key DE results and, where possible, re-run the main DE comparisons using a sample-aware approach (e.g., pseudo-bulk within cell types/subclusters). For the human bulk RNA-seq, the methods indicate 3 patient tissues split across MR3 vs DMSO for 24 h. In DESeq2, a paired design (including patient as a blocking factor) would be important to avoid patient-to-patient variability dominating the treatment signal; the manuscript should confirm whether the design formula accounted for this. Finally, several places in the Methods define significance using p-value cutoffs (e.g., GEPIA3 TCGA/GTEx analysis uses p < 0.05 and |log2FC| > 1; human DE uses p < 0.05 and log2FC >= 1). Because multiple testing is substantial in all of these analyses, I recommend reporting FDR-adjusted values consistently (and being explicit about whether figures/tables show raw or adjusted p-values).
      4. Do the data support the macrophage-to-CD8 suppression claim? The CellChat PD-L1/PD-1 network figures are suggestive (Fig. 3C/E), but ligand-receptor inference is not the same as demonstrating functional T-cell inhibition. At minimum, I would like to see one orthogonal readout (flow or immunostaining) showing that PD-L1 protein on myeloid cells and PD-1 on CD8 T cells change in the expected directions after MR3, and that CD8 T cells show an activation/effector signature at the protein level.
      5. PARP11: mediator vs marker. The cross-species PARP11 result is the most convincing and potentially generalizable finding in the manuscript (Fig. 2B). However, in the specific context of this study, PARP11 is still best supported as a conserved MR3-responsive candidate rather than a demonstrated causal driver of PD-L1-mediated suppression. If the authors want to argue PARP11 is an effector of the pathway (rather than a marker), they should either soften the language or add a minimal functional linkage experiment within the existing scope (see "Optional" experiments below).
      6. Reproducibility and clarity of methods. I appreciate that the authors provide a code/data portal (MiroScape) and a GitHub link. To make the study as reproducible as possible, I recommend:
        • Deposit raw sequencing reads for both mouse and human datasets (GEO/SRA) and include accession numbers in the manuscript.
        • Provide a short, consolidated "computational reproducibility" note with software versions and key parameters (Seurat, CellChat, STAR, DESeq2, etc.).
        • Clarify pseudo-bulk construction (what is aggregated, at what level, and how many biological replicates contribute to each pseudo-bulk comparison).
        • Add a brief summary of MR3 provenance/validation and what "MIRO1-binding" means operationally in the context of these experiments (especially for readers outside the MIRO1 field).
      Experiments requested (kept within the existing claims). I am intentionally not suggesting new lines of experimentation. The experiments below are aimed only at supporting the paper's current central claims. I separate them into items I consider essential vs optional, depending on how strongly the authors want to phrase mechanistic conclusions.
      Essential if the title/abstract continue to use strong causal language:
        • Protein-level validation of the PD-L1/PD-1 axis and CD8 activation in the GL261 model. A focused flow cytometry panel (myeloid PD-L1; CD8 PD-1 plus one or two effector markers such as GZMB/IFNG/Ki67) or multiplex IF/IHC on tumor sections would substantially strengthen the central MAC1 → CD8 claim.
        • An orthogonal measure of tumor burden in the same treatment paradigm. The manuscript currently treats the drop in the fraction of nuclei annotated as tumor (Fig. 1E) as a reduction in tumor burden; I recommend including IVIS longitudinal data and/or histologic tumor area/volume at harvest to support this statement.
        • If feasible, modestly increase in vivo biological replication (the snRNA-seq analysis currently has n=2 treated after QC). Even adding one additional treated animal that passes QC would help.
      Feasibility (rough guidance only; core pricing varies widely by institution): a repeat GL261 cohort to harvest tumors for flow and/or histology typically takes ~3-6 weeks end-to-end. A small flow panel plus core time is often on the order of a few thousand USD (antibodies and cytometry), while basic histology/IF quantification might be in the hundreds to low-thousands. If the authors already have stored tissue from the existing cohort, some of this could be faster/cheaper.
      Optional (only if the authors want the MAC4 → PGE2 → Parp11 mechanism to be more than a model):
        • Measure PGE2 (ELISA or targeted lipidomics) in tumor lysates/conditioned media from control vs MR3-treated samples, or provide a closer proxy for PGE2 pathway engagement in the relevant clusters.
      Optional (only if the authors want to argue PARP11 is an effector):
        • A minimal functional linkage experiment (in vitro) testing whether PARP11 perturbation phenocopies the relevant aspect of MR3 in macrophages (e.g., PD-L1 levels and/or the ability to suppress CD8 activation in a co-culture). This could be done with a PARP11 inhibitor or knockdown. I do not think in vivo genetics are required for this manuscript, but some functional tie would prevent overinterpretation.
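
To make the FDR recommendation in major comment 3 concrete, the following is a minimal sketch of Benjamini-Hochberg adjustment in plain Python. The function name and the five p-values are hypothetical illustrations, not values or code from the manuscript; in practice the adjusted values reported by the authors' own tools (e.g., the `padj` column of DESeq2 results) would be used.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for five genes (illustration only)
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(raw)

n_raw = sum(p < 0.05 for p in raw)   # genes passing a raw p < 0.05 cutoff
n_fdr = sum(q < 0.05 for q in adj)   # genes passing an FDR < 0.05 cutoff
print(n_raw, n_fdr)
```

In this toy example four genes pass the raw cutoff but only two pass after adjustment, which is why it matters whether figures and tables show raw or adjusted p-values.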

      Minor comments

      A. Analysis/experimental clarifications that seem straightforward

      • Human DESeq2: please clarify whether the DESeq2 design was paired by patient (i.e., patient as a blocking factor).
      • snRNA-seq DE: please clarify whether any sample-aware method was used for the key DE conclusions (especially Parp11/Cd274 changes) rather than per-cell statistics alone.
      • CellChat: because min.cells filtering is used (min.cells = 20), please note this explicitly in figure legends where subclusters appear only in one condition, so readers understand why certain labels are missing.
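
As an illustration of what a sample-aware (pseudo-bulk) analysis means for the snRNA-seq clarification above, here is a minimal sketch in plain Python. The sample labels, counts, and the `pseudobulk` helper are hypothetical and are not taken from the authors' pipeline; the gene and cluster names are borrowed from the manuscript only as labels.

```python
from collections import defaultdict


def pseudobulk(cells):
    """Sum per-nucleus counts into one profile per (sample, cell_type)."""
    totals = defaultdict(lambda: defaultdict(int))
    for cell in cells:
        key = (cell["sample"], cell["cell_type"])
        for gene, count in cell["counts"].items():
            totals[key][gene] += count
    # Convert nested defaultdicts to plain dicts for easier inspection
    return {key: dict(genes) for key, genes in totals.items()}


# Hypothetical nuclei: two control animals and one MR3-treated animal
cells = [
    {"sample": "ctrl_1", "cell_type": "MAC1", "counts": {"Parp11": 3, "Cd274": 5}},
    {"sample": "ctrl_1", "cell_type": "MAC1", "counts": {"Parp11": 2, "Cd274": 4}},
    {"sample": "ctrl_2", "cell_type": "MAC1", "counts": {"Parp11": 4, "Cd274": 6}},
    {"sample": "mr3_1", "cell_type": "MAC1", "counts": {"Parp11": 1, "Cd274": 1}},
    {"sample": "mr3_1", "cell_type": "MAC1", "counts": {"Parp11": 0, "Cd274": 2}},
]

profiles = pseudobulk(cells)
for key in sorted(profiles):
    print(key, profiles[key])
```

After aggregation there is one profile per animal and cell type, so any downstream test treats the animal, not the individual nucleus, as the unit of replication.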

      B. Figure and text consistency issues

      I noticed several figure/legend/citation issues that look like simple fixes:

      • Fig. 3 legend panel labeling: the legend text refers to the PD-L1/PD-1 chord plot as (C) MR3− and (D) MR3+, but (D) is the heatmap panel; the chord plots are (C) and (E). This should likely read (C) MR3− and (E) MR3+.
      • Fig. 5 panel reference: the Results text refers to the Cross Species module as Fig. 5F, but the Fig. 5 legend defines panels (A-E) and labels (E) as "Cross Species module." Please reconcile (either change the text to Fig. 5E or add a panel F).
      • Discussion figure citation: the Discussion cites Ptges3/PGE2 evidence as "(Figure 3)," but Ptges3 is shown in Fig. 4A and the model is in Fig. 4B.
      • Fig. 1D numbers: the Results text states 509/1,602 (mouse) and 15/106 (human) "rescued" genes (Fig. 1D), but the Fig. 1D pie charts are labeled with different totals (mouse total 3490; human total 104). Please reconcile the denominators and ensure the figure matches the text and analysis choice (bulk vs snRNA vs filtered gene sets).
      • Fig. 2 legend: there is a stray quote in "lymphoid subclusters" (appears as subclusters").

      C. Presentation and framing

      • Tone down or carefully qualify statements equating snRNA-seq composition shifts with reduced tumor burden (or add an orthogonal tumor-burden measurement as suggested above).
      • Where possible, tie mechanistic language explicitly to the level of evidence ("consistent with," "suggests," "model proposes") so readers do not over-interpret the transcriptomic inference.
      • Consider adding a small schematic in the Results or a short "interpretation" sentence in the figure legends explaining what the CellChat plots do and do not show, since non-specialists can misread these as direct interaction measurements.

      D. Prior literature

      The PARP11 immunotherapy literature is cited appropriately. For the PGE2 angle, it may help readers if the authors add one or two glioma-focused references on PGE2-mediated myeloid/T-cell suppression (if not already in the full reference list).

      Significance

      Nature and significance of the advance

      The advance here is primarily conceptual and resource-oriented. Conceptually, the work connects a mitochondrial regulator (MIRO1) to a specific, testable immunosuppressive circuit in the glioma TME. Technically, the cross-species perturbation framework and the accompanying MiroScape portal should be useful to groups looking for conserved, drug-responsive immune programs.

      Context within the existing literature

      Immunosuppression in glioma and the importance of tumor-associated myeloid populations are well established, as is the limited success of checkpoint blockade in GBM. The manuscript's proposed MAC4/MAC1 paracrine model and its emphasis on PD-L1/PD-1 signaling add a focused, hypothesis-generating view of how particular macrophage states might sustain CD8 dysfunction. The identification of PARP11 as a conserved MR3-responsive gene also fits with emerging work implicating PARP11 in immunoregulatory programs and response to immunotherapy.

      Audience

      • Neuro-oncology and glioma TME researchers (myeloid heterogeneity, immune suppression).
      • Tumor immunology groups interested in myeloid-driven checkpoint resistance.
      • Researchers working on mitochondrial stress signaling and immunometabolism.
      • Computational biologists building cross-species or multi-modal integration frameworks.

      Reviewer expertise and limitations

      Keywords: glioma microenvironment; macrophage/microglia biology; tumor immunology; single-cell/nucleus transcriptomics; computational ligand-receptor inference. Limitations: I am not a medicinal chemist, so I cannot deeply evaluate MR3 chemistry, PK/PD, or specificity beyond what is presented. I also did not evaluate the full web-portal implementation beyond the manuscript description.

  3. Feb 2026
    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1

      Evidence, reproducibility and clarity (Required)

      Ali et al investigate the composition of putative kinetochore subcomplexes in the unicellular eukaryote Tetrahymena thermophila. Up to the point of this study, only a CENP-A ortholog and two subunits of the microtubule-binding Ndc80 complex had been clearly identified. This left open the question of whether Tetrahymena kinetochores follow the conventional organization found in common model systems such as yeast or human cells, or contain many unconventional proteins. The authors combine proximity biotinylation coupled to mass spectrometry with deep homology searches and structure predictions.

      Extensive bioinformatic analysis of the T. thermophila genome allows the authors to annotate 16 genes as kinetochore genes (KiTT). Using sequence comparisons with known kinetochore proteins, they were able to relate their novel KiTT proteins to the conserved kinetochore components Cenp A, Cenp C, the KMN network, as well as auxiliary proteins. In particular, the authors were able to complete the organization of the Ndc80 complex and identify subunits of the Mtw1/Mis12 complex and a Knl1 ortholog. This characterizes a KMN network as the centerpiece of the Tetrahymena kinetochore architecture.

      The CCAN seems to be represented solely by CENP-C, with key binding interfaces to KMN and CENP-A being preserved. An interesting aspect is that neither a Dam1-, nor a Ska homolog seems to support the Ndc80 complex. Instead, the authors identify a Kinesin-6 homolog that may potentially compensate for the absence of these factors.

      The study is well-designed, the results are thoughtfully discussed and the expertly conducted experiments highlight the power of combining experimental identification (BioID) with bioinformatic analyses.

      We appreciate the favorable assessment of our manuscript and would like to extend our thanks for the reviewers’ constructive criticisms and insightful comments. Where possible we aim to incorporate them (see below).

      Major comments

      The functional validation of the newly identified subunits using RNAi feels somewhat limited in this study. I understand there are technical limitations in this system, but whenever possible, I would at least expect the authors to explore differential effects on different parts of the kinetochore using the reagents they have at hand. In particular, the authors show the effects of depleting KiTT12 (the kinesin-6 homolog) on Ndc80 kinetochore localization. It would be important to check effects also on CENP-A (using the anti-CNA1 antibody), or on other subunits. Given the available reagents, this should be readily possible.

      We agree that examining the effect of KiTT12 depletion on inner kinetochore components will strengthen the functional interpretation. While we do not expect, based on KiTT12’s relative location, a direct impact of KiTT12 RNAi on CNA1 (CENP-A) or CENP-C, we will perform immunofluorescence analyses using anti-CNA1 and anti-CENP-C antibodies in KiTT12 RNAi cells (and KiTT2 (NUF2) RNAi as control). These experiments will allow us to determine whether KiTT12 depletion specifically affects outer kinetochore integrity (as suggested by Ndc80 mislocalization) or more broadly perturbs kinetochore architecture (CNA1/CENP-C). We will include quantitative analyses of signal intensity and kinetochore organization to clarify potential hierarchical dependencies.

      The organization of the Knl1 ortholog and the question of whether a mitotic checkpoint is present deserve some additional discussion. Interestingly, the positional organization of a PP1 binding motif at the N-terminus of a long disordered domain seems conserved. On the other hand, MELT motifs appear to be absent. The authors should discuss the implications of this some more. Is there an Mps1 homolog? What about the error correction machinery including Aurora B and the CPC? The putative MadBub homolog does not seem to localize to kinetochores, but maybe this is not detectable unless the respective conditions (unattached kinetochores) are generated. Is it known how the system reacts to spindle depolymerization?

      Tetrahymena does not appear to have a spindle checkpoint, given prior reports that chromosome segregation is not halted by microtubule depolymerization [Kaczanowski et al. 1985]. In line with this, the SAC protein orthologs that are present lack the motifs to mount a sufficient response and halt the cell cycle. We thus agree that the architecture of the Tetrahymena KNL1 ortholog and possible other SAC-related proteins raises important evolutionary questions. We will expand the discussion to address:

      • The absence of canonical MELT motifs in the Tetrahymena KNL1 ortholog.
      • The absence of a detectable Mps1 ortholog in our homology searches.
      • The divergence of the Tetrahymena MadBub protein and its lack of conserved KEN–ABBA–KEN motifs typically required for APC/C inhibition.
      • The absence of Mad2 and Mad2-binding motif in Cdc20.

      Relevant REFs:

      • Kaczanowski et al. 1985, Experimental Cell Research

      • Loidl et al. 2009, Molecular Biology of the Cell

      • Kops et al. 2020, Current Biology

      Minor comments - Introduction: When introducing the Tetrahymena kinetochore, please add some sentences on microtubule/spindle organization in the MIC. What is known about the kinetochore-microtubule attachment site in Tetrahymena?

      We will expand the introduction to include a concise description of spindle organization in the micronucleus (MIC), including known features of centromere clustering, spindle assembly, and microtubule attachment sites during MIC mitosis.

      Relevant REFs:

      • Davidson et al. 1975, Biosystems

      • Lafountain Jr et al. 1979, Chromosoma

      • Lafountain Jr et al. 1980, Cell Motility

      • Line 128: putative homology to Spc24 (E=13), comment on why this was considered, what cutoffs were applied etc.

      We will clarify the homology detection criteria, including E-value thresholds, domain architecture considerations, reciprocal searches, and structure-based validation. We will explain why this candidate was retained despite weak sequence similarity and how structural prediction strengthened confidence. In short, we used the ‘top hit’ principle: Spc24 (E=13) was simply the first hit, and inspection of the AlphaFold-predicted structure showed that the protein was clearly similar to Spc24.

      • Line 135: briefly mention and discuss conservation of the RWD folds in the Spc24-25 orthologs.

      We will expand this section to explicitly describe conservation of the RWD fold and how structural modeling supports ortholog assignment despite sequence divergence. For instance, the E-values mentioned in line 128 are for the RWD domain only, not the full-length protein; we will indicate this in the text.

      • Line 194: Maybe replace "show" with "suggest", given there is no experimental data behind the CENP-C identification

      We agree and will revise wording to “suggest” to avoid overstatement. However, we do want to point out that CENP-C/KiTT8 was identified experimentally as well through the BioID pipeline, and also an antibody was raised against KiTT8 that places this protein at the inner kinetochore.

      • Figure 7B: please add the information for the RNAi target directly to the Figure

      We will add the requested information directly to the figure.

      • Figures in the combined pdf: please add the respective Figure number or Supplementary Figure number directly on the Figure.

      We will add the figure numbers to the supplementary figure files.

      Significance (Required)

      While functional studies are often conducted in very few model organisms, exploring the evolutionary variations of kinetochore architecture can help to understand the design principles of kinetochores. It also helps to assign functions to specific subcomplexes and can reveal how adaptations of a core machinery occur. Tetrahymena is historically an important experimental system that has had a great impact on the understanding of multiple aspects of nuclear biology. Deciphering the organization of the chromosome segregation machinery in this organism is therefore of great interest to researchers interested in mitosis and genome stability.


      Reviewer #2

      Evidence, reproducibility and clarity (Required)

      Summary Ali, Raas et al. provide a comprehensive molecular characterization of the kinetochore in the ciliate Tetrahymena thermophila. By integrating proximity proteomics (TurboID) with structure-based "deep" homology detection, they identify 16 kinetochore proteins (KiTT1-16), including nine highly diverged "cryptic" orthologs of conserved LECA components and four lineage-specific proteins. Their results demonstrate that while the Tetrahymena kinetochore lacks a conventional CCAN complex, it maintains a recognizable outer kinetochore structure supplemented by novel proteins essential for faithful chromosome segregation.

      Major comments 1. Representation of known kinetochore diversity - Since this manuscript wants to highlight that it is important to characterize kinetochore components in different eukaryotic clades, it would be good to highlight the known diversity from the literature in Figure 1, e.g. indicating species/clades for which components have been experimentally validated vs. only computationally inferred. - It would be good to specifically highlight this on the figure for the clade closest to Tetrahymena in which KT components have been experimentally validated (Apicomplexa?). - L58-64: the sentences 'we have a limited understanding about kinetochore composition and function from other branches of the eukaryotic tree of life' and 'these surveys also uncovered a surprisingly extensive diversity of kinetochore composition across eukaryotes' seem to contradict each other. Instead of/in addition to the literature described in the introduction, as suggested above, having known diversity indicated on a figure would therefore be helpful. This could be done quite roughly, just mentioning the number of verified KT components and the number of species for which this was done.

      We will add a more elaborate version of Figure 1a (or include an extended version to the supplement), summarizing the requested information in the above three points. Indeed, our mention of diversity in lineages is an inferred one, not a directly tested one. We will amend the text to clarify this.

      • L46-L56: when explaining the structure of the KT, it would be good to already refer to a figure, like the diagram of a human KT in 1B. As it is now, the introduction first explains the general structure, and then goes into diversity. This is fine, but it would be easier to understand if the figure panels followed this order.

      We will include additional references to figure 1b at the appropriate places in the introduction.

      The data can sometimes be represented in a more straightforward manner: - L120-...: After reading through the whole text, I understand why the authors choose to talk about Spc24 and Spc25 first (since Spc25 is also used in the TurboID experiment). However, the presented pipeline for these two proteins is much less convincing than for the other proteins. Spc24/25: 'Some homology > slight structure similarity > right localization in immunostaining' vs. the pipeline for the other proteins: 'TurboID > confirmation using homology + immunostaining' (what is depicted in Fig. 2C). The latter is very convincing, but by starting off with the less convincing pipeline, the reader starts off on the wrong track. Since Spc24 is not used in the end for the first TurboID results, is Spc25 necessary at this point or can this come later?

      We used this ‘story line’ because it reflects the order in which the work was done. It felt wrong to us to pretend we hadn’t already found Spc24 and Spc25 by bioinformatic means before doing the TurboID, which might also have caused concerns with some readers as to our ability to detect orthologs for these and other proteins. Of note: a re-analysis of the Spc24-BioID experiment revealed that it had previously been wrongly considered unusable, hence we now include it in our NDC80-C based TurboID discovery pipeline in Figure 2. Where possible, we will revise the narrative structure to more clearly explain the logic of the discovery pipeline, while maintaining transparency about the historical order in which candidates were identified. We will streamline the Spc24/25 section and more prominently introduce the TurboID-driven identification pipeline (Figure 2C) to guide the reader.

      • It is very good and thorough that the authors noticed that some of the KT proteins were simply missed because they were not part of the original predicted proteome. However, why weren't the TurboID analyses simply redone with the new proteome? The authors could still note that it was important to use the most recent version, but it would be much more straightforward for readers to immediately have the most up to date analysis.

      We thank the reviewer for pointing this out. We agree that remapping to the most recent proteome annotation will improve clarity. We will remap the TurboID datasets to the updated Tetrahymena proteome, which includes Nnf1 and Csm1, and report whether additional components are identified. Of note: in a preliminary analysis with the newest version of the proteome we do not find any new proteins in the NDC80-C-TurboID experiments. We will also clarify in the manuscript what “not in original proteome” refers to and revise Figure 2C accordingly.

      Figure 4 and accompanying paragraph: this is an interesting analysis, but impossible to interpret without comparing with the branch length of other Tetrahymena proteins or Tetrahymena as a species (if I interpreted the analysis correctly). L251: 'this underscores the high rates of evolution of kinetochore proteins'. This could be true, but this isn't proven here because there is no comparison with the evolutionary rate of other proteins in Tetrahymena.

      The reviewer is correct in arguing that without comparisons to other proteins, the statement that kinetochore proteins in Tetrahymena evolved at high rates is incorrect, or at least not supported by the present data. What we meant to say was that they evolved at high/increased rates compared to kinetochore proteins of other species. This in our view explains why we have missed them in past searches, regardless of whether this is specific to the kinetochore in Tetrahymena or to Tetrahymena proteins in general. We will amend the text to reflect this more clearly. We will explicitly acknowledge analytical limitations and remove claims regarding lineage-specific acceleration.

      Figure 5: For further validation and to better show the layered structure of the Tetrahymena kinetochore it would be nice to have a couple of images here with increased resolution by using expansion microscopy.

      We agree that improved spatial resolution would strengthen the layered organization model. We will attempt to perform expansion microscopy (ExM) on selected tagged kinetochore components and incorporate representative images into the revised manuscript (main or supplementary figures).

      Minor comments - Abstract: if you are going to call out individual components, maybe also point out the few that were already known (KiTT1-2 and 14). Otherwise the reader might be confused about the missing numbers.

      We will revise this in the abstract.

      • L37: is 'cryptic ortholog' an official term? Doesn't this just depend on the starting point of the homology search and the number of experimentally verified hits you have in certain parts of the tree? Just wondering.

      This is a valid question. Indeed, ‘cryptic’ refers to the starting point of our study (based on our previous analyses) and the process towards identifying them as being canonical. We chose this term because we feel it signifies to the reader that identifying these orthologs required approaches beyond conventional orthology searches.

      • For future submissions, it would be useful to have the figure numbers indicated on the figures, because now it was sometimes difficult to keep track.

      As mentioned above, we will add the figure numbers to the revision.

      • L51: mentioning the SAC might make it a bit too complicated for people not 100% familiar with all the complexes. Either leave it out until later, or have a short sentence explaining what the SAC is.

      We will leave out the spindle assembly checkpoint (SAC) in the beginning and will bring it up at a later point, also explaining its explicit function.

      • Figure 1A: the identity of the black 'nuclei' is not explained for the Ciliophora and Apicomplexa in the figure or figure legend.

      We apologise for the confusing black organelles in apicomplexans; these are actually the micronemes and apical complex, characteristic features of these parasites. We will change the color to that of the clade so that it is clear that only ciliates have two types of nuclei (nuclear dimorphism).

      • In Figure 1B, instead of saying 'absent', wouldn't it be more correct to say something like 'not found/detected/identified'?

      We agree and will replace ‘absent’ by ‘not detected’.

      • Figure 1C. During interphase, sometimes homologous chromosomes seem to cluster at the centromeres (5 foci - example on the left), but sometimes they don't (10 foci - example on the right). Is this something you observe a lot? Is it strain-dependent?

      We thank the reviewer for making a very good point. In principle we take the cells showing 5 foci to be interphase cells. We interpret the cells with 10 foci to be cells just prior to mitosis. These would be G2 cells in which the homologous chromosomes have been replicated and the sister pairs are still seen together. However, if this were the case one would expect to see 20 centromeres/kinetochores in metaphase, and this is not always observed. To prevent confusion on this point, we will replace the right panel in 1C with one that contains 5 foci and will make it clearer that these foci indeed represent homologous chromosomes. In addition, we will make panels to clearly show the behaviour of chromosomes over the different stages of mitosis.

      • Figure 1C (and later in Fig. 5): centromeres don't seem to align during metaphase. Is this true, or are these examples showing late metaphase/early anaphase?

      Indeed, a true metaphase similar to classic textbook images does not seem to be present. In 3D reconstructions we do see that kinetochores sit close to the nuclear envelope forming a sphere on the outside of the spindle, but almost never exactly in the same plane. Whether this means we simply have not caught true metaphase state, or there is none (like for instance in apicomplexans, which also do not appear to have a spindle checkpoint), is unclear at this point. We will further review our images and will use consistent stages for these images, and will revise terminology on metaphase state if warranted.

      • Why was STU2 included in the kinetochore? Wouldn't it be better classified as a MAP as in Fig. 3A? I saw this is actually discussed in the discussion, but maybe this explanation should come earlier.

      We thank the reviewer for pointing this out and will add a short sentence about the MAP function of STU2, and kinetochore localization in other lineages in the introduction.

      • Figure 2A: 'strong similarity'. For a TM score of 0.4 and 0.54, I am not sure I would say 'strong similarity'. Visually, they also look different. TM is also not explained in the legend.

      What we meant to say with ‘strong similarity’ is that a domain is predicted with a matching set of secondary structure elements to the RWD domains in yeast Spc24/Spc25. As for the TM score, a score of ≥ 0.5 has been shown to be a robust metric for fold similarity significance, which is the case for the comparison of the putative T. thermophila Spc25 ortholog and the yeast ortholog. However, we acknowledge that the T. thermophila Spc24 ortholog shows additional beta sheets compared to its yeast counterpart and has a TM score below 0.5, and so we will tone down this statement and remove ‘strong similarity’. We nonetheless maintain that this protein is an Spc24 ortholog with derived properties in its RWD domain.

      Relevant reference on TM score interpretation:

      Xu & Zhang 2010 Bioinformatics (https://pmc.ncbi.nlm.nih.gov/articles/PMC2913670/)
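
      As a minimal illustration of the rule of thumb described above, the sketch below applies the 0.5 cutoff to the two TM scores quoted by the reviewer for Figure 2A. The function name and the assignment of 0.40 to Spc24 and 0.54 to Spc25 are assumptions inferred from the response text, not values taken from the manuscript's analysis files.

```python
# Rule of thumb from the reference cited above: TM-score >= 0.5 suggests the same fold.
FOLD_THRESHOLD = 0.5


def same_fold_likely(tm_score: float, threshold: float = FOLD_THRESHOLD) -> bool:
    """Return True when a TM-score meets the fold-similarity threshold."""
    if not 0.0 <= tm_score <= 1.0:
        raise ValueError("TM-score must lie in [0, 1]")
    return tm_score >= threshold


# Scores quoted for Figure 2A; which score belongs to which protein is an
# assumption inferred from the response (Spc24 below 0.5, Spc25 above).
scores = {"Spc24": 0.40, "Spc25": 0.54}
calls = {name: same_fold_likely(tm) for name, tm in scores.items()}
```

      A score below the cutoff does not rule out homology; it only means that the fold-level similarity is not supported by this metric alone, which is why the other lines of evidence matter for Spc24.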

      • Fig. 2D: why not PC2? Please explain this somewhere.

      We thank the reviewer for this question. We shall add an elaborate explanation of the PC selection in the method section. In short, PC2 (together with PC1 or PC3) did not reveal any separate cluster/cloud of points surrounding the NDC80-C components (KiTT1-4). Since PC3 did reveal such a cluster, we opted to select PC3.
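
      One way to make such a choice between components explicit is to score, for each PC, how far a reference group sits from all other proteins. The sketch below is only an illustration of that idea under stated assumptions: the PC scores are synthetic placeholders, the `separation` helper is hypothetical, and this is not the procedure used in the manuscript.

```python
from statistics import mean, pstdev


def separation(ref, rest):
    """Gap between group means, in units of the spread of all scores on this PC."""
    spread = pstdev(ref + rest)
    return abs(mean(ref) - mean(rest)) / spread if spread else 0.0


# Synthetic PC scores (placeholders, not measured data): "ref" stands for the
# reference components (e.g. KiTT1-4), "rest" for all other detected proteins.
pc_scores = {
    "PC2": {"ref": [0.1, -0.2, 0.0, 0.2], "rest": [0.3, -0.1, 0.1, -0.3, 0.0]},
    "PC3": {"ref": [2.1, 1.9, 2.3, 2.0], "rest": [0.1, -0.2, 0.0, 0.3, -0.1]},
}

# Rank components by how well they set the reference group apart.
ranked = sorted(pc_scores, key=lambda pc: separation(**pc_scores[pc]), reverse=True)
```

      With these placeholder values the component on which the reference group forms a distinct cloud ranks first, mirroring the reasoning given for preferring PC3 over PC2.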

      • Fig. 3C-D: 'striking similarity', again, it is hard to evaluate whether this is true from the figures and TM values alone (all are >0.5). Either change the phrasing, or explain how much similarity one would expect between homologs.

      Please see our response to the previous question regarding the significance of a TM score of ≥ 0.5.

      • How certain are you that these are all diverged homologs? For example, for KNL1, could another RWD domain-containing protein have evolved to become a kinetochore protein?

      In most cases we consider multiple lines of evidence: AF2/3, HHsearch and overall protein topology (for RWD kinetochore proteins, a coiled coil followed by a single or double RWD). In the case of SPC24, SPC25 and CSM1 we have clear best hits for both structure and sequence searches. For KNL1 (double RWD), we have a newer version of our eukaryote-wide ortholog alignment now usable for HHsearches, which also reveals KiTT7 (KNL1) to be the best hit. As such, the RWD domain proteins that we uncover are not merely some RWD, but are specifically those of the kinetochore that are found in other lineages. In addition, there are only very few double RWD proteins present amongst eukaryotes, which makes the proposed scenario of homolog replacement for KNL1 unlikely.

      • Fig. 5: why wasn't CNA1 used as a marker of the inner kinetochore or tested?

      The CNA1 antibody gave quite some background (see figure 1C), we therefore favored the use of the CENPC/KiTT8 antibody.

      • Fig. 8: There is a time axis below, but I'm not sure what is indicated on this axis. Are the events above mapped on this axis?

      We agree this axis may be confusing. The idea was to show a number of ancestral nodes relevant for the evolutionary events noted in this figure. We will add clear references in the figure to each of these ancestors.

      • L347-349: 'convergent evolution'. Is the loss of the CCAN convergent evolution, or was it already lost in the SAR common ancestor?

      This was indeed convergent evolution. Amongst Stramenopila most CCAN subunits can be detected (see for instance van Hooff et al. 2017). In addition, the alveolate ancestor already had the CCAN as we can clearly detect orthologs in Colponemida. We will add this piece of information to the presence/absence plot in either Figure 1 or in the supplemental (see comment above to Reviewer 1).

      Significance (Required)

      General Assessment: The study is robust, thorough, and well-written. The analyses are technically sound, and the authors avoid overstating their conclusions. Key strengths include the successful identification of diverged components using a "deep homology" pipeline and the functional validation of novel subunits. To improve the study, the data representation could be made more straightforward, and the manuscript structure could be condensed to better highlight the most convincing results. Finally, the claims on the speed of evolution of the kinetochore components need to be better supported.

      Advance: The study provides the first molecular map of a ciliate kinetochore. By uncovering "cryptic" orthologs that escaped previous detection, the work demonstrates that many "missing" complexes in diverse eukaryotes are likely present but highly diverged.

      Audience: This work will interest evolutionary cell biologists studying mitosis and kinetochores (especially those interested in eukaryotic diversity), as well as the ciliate research community. It also serves as a methodological roadmap for researchers using structural homology to identify divergent proteins in other non-model organisms.

      Expertise: My field of expertise includes evolutionary cell biology, kinetochores, centromeres, microbiology, microscopy and phylogenetics.


      Reviewer #3

      Evidence, reproducibility and clarity (Required)

      Kinetochores are protein complexes essential for chromosome segregation in all eukaryotes. Unexpectedly, despite their crucial function, many kinetochore components evolve rapidly, which can hinder their identification based solely on sequence comparisons. In this study, the authors combine experimental and computational analyses to provide insights into the composition of the kinetochore protein complex in the ciliate Tetrahymena thermophila. This study makes an important contribution because kinetochore components in Tetrahymena have not previously been investigated experimentally, and the composition of the Tetrahymena complex was largely unknown.

      Starting with previously identified orthologs of the outer kinetochore proteins Ndc80 and Nuf2, the authors computationally identified the two additional members of the Ndc80 complex, Spc24 and Spc25. All four components were subjected to BioID analyses, leading to the identification of 23 additional candidates, some of which are factors known to be associated with centromeric chromatin in other eukaryotes (condensin, etc.). Focusing on a subset of unknown components, the authors provide experimental support for their kinetochore participation using microscopy and confirm distant homology with several known kinetochore components in other eukaryotes. Four components referred to as KiTT10-13, however, lack detectable orthology to known kinetochore components.

      Relative localization analyses using super-resolution microscopy revealed that KiTT10, 11, and 13 are more proximal to the inner kinetochore component CENP-C, while KiTT12 localizes closer to outer kinetochore components. Remote homology and phylogenetic analyses identify divergent WD40 or SANT domains in KiTT10 and 11, as well as a kinesin motor domain for the outer-kinetochore proximal KiTT12. Finally, RNAi-mediated depletion of KiTT12 demonstrated its requirement for accurate chromosome segregation and Ndc80 localization.

      Overall, I think this manuscript is interesting and makes an important contribution to the field of kinetochore biology. The results of this study, particularly regarding the novel kinetochore components identified, will likely also spark follow-up studies. My major comment concerns the discussion and presentation of the data:

      Major Comments At times, the explanation of homology search appears very technical and would not be accessible to non-experts.

      We thank the reviewer for raising this point. Given that the homology detection approach is an important part of the message of our manuscript, we do think that it is warranted to keep some technicalities in the results section. However, we agree that much of the detail could easily be transferred to a dedicated supplementary section about our homology detection approach. We will rewrite the results section to better suit non-experts.

      Moreover, the authors could include more details about their analysis of TurboID data to improve clarity.

      I was initially confused what does "not in original proteome" mean in the figure before understanding that two different proteome versions were used. I think it would be less confusing for the reader if the authors simply map their bioID data to the most recent version of the Tetrahymena proteome, which includes both Nnf1 and Csm1. Is it possible that this might also reveal the presence of other components in addition to the two that were specifically targeted?

      We agree that mapping to the most recent proteome annotation will eliminate confusion. We will remap all TurboID datasets to the updated proteome and report whether additional candidates are detected. We will revise the figure legends to clearly explain enrichment categories and annotation differences between proteome versions (or in a supplementary section). So far we have not detected any new proteins in a re-analysis of the MS data for components of the NDC80-C.

      The data presentation in Figure 2 is confusing and requires clarification of the analyses performed. The Figure legend for panel 2C is incomplete. For example, there is no mapping for the character "*" in the legend. The legend can be revised for better clarity. Also, more than 23 proteins are shown in the 2D inset; were those not enriched in the other BioID experiments? It would be helpful to include a legend for these hits as well.

      We will revise Figure 2 and its legend to:

      • Clearly define all symbols (including “*”).
      • Provide a complete legend for enriched hits.
      • Clarify PCA interpretation.
      • Explicitly state how many proteins are included and how they were categorized.

      I would be cautious about using the word comprehensive, as the identification depends on many aspects, including the completeness of the annotated proteome used to map the MassSpec spectra against. Even if their bioID experiments always converge on the same set of proteins, factors can still be missing due to annotation issues. In addition, certain components might be refractory for detection by MassSpec due to their amino acid composition. Other digestion methods, other than trypsin, could, however, identify those.

      We agree that this term overstates completeness. We will revise wording to reflect that our identification is extensive but dependent on proteome annotation and mass spectrometry detectability.

      Figure 4: I guess the result is somewhat expected given the previous inability to identify these components computationally. I guess the distribution of the non-tetrahymena components might be skewed towards lower sequence divergence, since they do not include orthologs that require experimental approaches for identification. If the authors agree, this could be added as a discussion.

      We agree with the reviewer that Figure 4 merely showcases why we could not detect these kinetochore orthologs in the first place. In our present analysis we did not include orthologs of species with previously shown ‘difficult-to-detect’ orthologs. We will add discussion acknowledging that detectable homologs in other species may be biased toward less divergent sequences and that experimental identification may reveal additional highly diverged components elsewhere.

      The telophase-specific localization of TTHERM_00932010 is interesting. Although the paper focuses on the structural composition of kinetochores, it would be useful if the authors included more details about this protein.

      We will expand the description of TTHERM_00932010 to provide additional contextual information regarding domain architecture, expression timing, and potential functional implications. Of note, for this protein we cannot detect any orthologs outside Tetrahymena spp.

      What is the function of kinesin-6, known roles with respect to chromosome segregation in other species?

      We already discuss the role of kinesin-6 in chromosome segregation in other species in the discussion section at L355-356 (bioRxiv v1). We will expand this section and add two more sentences on diverse functions of this family in eukaryotes.

      Perhaps MadBub localization is more apparent in the presence of unattached kinetochores? In that scenario, it would be useful if the authors knock down KiTT12 and test whether they can localize MadBub.

      We agree this is an interesting possibility. However, systematic spindle perturbation experiments fall outside the primary scope of this structural study. We will clarify this limitation and discuss it as a direction for future work.

      Minor comments It would be useful if the authors added either the expression of all genes or known constitutive genes as a background profile to Figure 2E, in order for the reader to be able to evaluate the G2/M specific increase in expression of bioID hits.

      The data were taken from [Bertagna et al. 2025, Bioinformatics], and the expression profiles of all the other proteins are provided for inspection by the reviewer/reader (see Table S3). The data representation asked for by the reviewer can thus be found in Bertagna et al. 2025. To provide further overview, we will add a supplementary figure including expression profiles for proteins with peaks in each of the cell cycle phases, including an overview of those peaking in G2/M.

      What is TTHERM_0046753? One of the identified unknown hits? It is also not part of Figure 2E unless this is a typo and the correct identifier should be 00467535?

      The reviewer is correct that this is a typo on our end, for which we apologise. The correct identifier should be 00467535.

      Why are 29 expressions shown in 2E but only 27 mentioned in the text (23 bioID hits as well as the four Ndc80 complex components)? Or did the authors instead identify 25 specific bioID hits that were further classified into the different categories? A rewrite on this section would likely help the reader to better understand the analyses of the PCA data.

      We agree this section can do with some optimization. We will clarify the number of proteins included in PCA and expression analyses and revise the relevant section for clarity.

      Significance (Required)

      This study highlights the importance of non-model organisms, such as ciliates, in understanding the evolution of the chromosome segregation machinery. Studies on such organisms would shed light on the evolutionary aspects of kinetochore biology.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Kinetochores are protein complexes essential for chromosome segregation in all eukaryotes. Unexpectedly, despite their crucial function, many kinetochore components evolve rapidly, which can hinder their identification based solely on sequence comparisons. In this study, the authors combine experimental and computational analyses to provide insights into the composition of the kinetochore protein complex in the ciliate Tetrahymena thermophila. This study makes an important contribution because kinetochore components in Tetrahymena have not previously been investigated experimentally, and the composition of the Tetrahymena complex was largely unknown. Starting with previously identified orthologs of the outer kinetochore proteins Ndc80 and Nuf2, the authors computationally identified the two additional members of the Ndc80 complex, Spc24 and Spc25. All four components were subjected to BioID analyses, leading to the identification of 23 additional candidates, some of which are factors known to be associated with centromeric chromatin in other eukaryotes (condensin, etc.). Focusing on a subset of unknown components, the authors provide experimental support for their kinetochore participation using microscopy and confirm distant homology with several known kinetochore components in other eukaryotes. Four components referred to as KiTT10-13, however, lack detectable orthology to known kinetochore components. Relative localization analyses using super-resolution microscopy revealed that KiTT10, 11, and 13 are more proximal to the inner kinetochore component CENP-C, while KiTT12 localizes closer to outer kinetochore components. Remote homology and phylogenetic analyses identify divergent WD40 or SANT domains in KiTT10 and 11, as well as a kinesin motor domain for the outer-kinetochore proximal KiTT12. Finally, RNAi-mediated depletion of KiTT12 demonstrated its requirement for accurate chromosome segregation and Ndc80 localization. 
Overall, I think this manuscript is interesting and makes an important contribution to the field of kinetochore biology. The results of this study, particularly regarding the novel kinetochore components identified, will likely also spark follow-up studies. My major comment concerns the discussion and presentation of the data:

      Major Comments

      At times, the explanation of homology search appears very technical and would not be accessible to non-experts. Moreover, the authors could include more details about their analysis of TurboID data to improve clarity. I was initially confused about what "not in original proteome" means in the figure before understanding that two different proteome versions were used. I think it would be less confusing for the reader if the authors simply map their bioID data to the most recent version of the Tetrahymena proteome, which includes both Nnf1 and Csm1. Is it possible that this might also reveal the presence of other components in addition to the two that were specifically targeted?

      The data presentation in Figure 2 is confusing and requires clarification of the analyses performed. The Figure legend for panel 2C is incomplete. For example, there is no mapping for the character "*" in the legend. The legend can be revised for better clarity. Also, more than 23 proteins are shown in the 2D inset; were those not enriched in the other BioID experiments? It would be helpful to include a legend for these hits as well. I would be cautious about using the word comprehensive, as the identification depends on many aspects, including the completeness of the annotated proteome used to map the MassSpec spectra against. Even if their bioID experiments always converge on the same set of proteins, factors can still be missing due to annotation issues. In addition, certain components might be refractory to detection by MassSpec due to their amino acid composition. Digestion methods other than trypsin could, however, identify those.

      Figure 4: I guess the result is somewhat expected given the previous inability to identify these components computationally. I guess the distribution of the non-Tetrahymena components might be skewed towards lower sequence divergence, since they do not include orthologs that require experimental approaches for identification. If the authors agree, this could be added as a discussion.

      The telophase-specific localization of TTHERM_00932010 is interesting. Although the paper focuses on the structural composition of kinetochores, it would be useful if the authors included more details about this protein. What is the function of kinesin-6, and what are its known roles in chromosome segregation in other species? Perhaps MadBub localization is more apparent in the presence of unattached kinetochores? In that scenario, it would be useful if the authors knock down KiTT12 and test whether they can localize MadBub.

      Minor comments

      It would be useful if the authors added either the expression of all genes or known constitutive genes as a background profile to Figure 2E, so that the reader can evaluate the G2/M-specific increase in expression of bioID hits. What is TTHERM_0046753? One of the identified unknown hits? It is also not part of Figure 2E unless this is a typo and the correct identifier should be 00467535? Why are 29 expression profiles shown in 2E but only 27 mentioned in the text (23 bioID hits as well as the four Ndc80 complex components)? Or did the authors instead identify 25 specific bioID hits that were further classified into the different categories? A rewrite of this section would likely help the reader to better understand the analyses of the PCA data.

      Significance

      This study highlights the importance of non-model organisms, such as ciliates, in understanding the evolution of the chromosome segregation machinery. Studies on such organisms would shed light on the evolutionary aspects of kinetochore biology.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      Summary

      Ali, Raas et al. provide a comprehensive molecular characterization of the kinetochore in the ciliate Tetrahymena thermophila. By integrating proximity proteomics (TurboID) with structure-based "deep" homology detection, they identify 16 kinetochore proteins (KiTT1-16), including nine highly diverged "cryptic" orthologs of conserved LECA components and four lineage-specific proteins. Their results demonstrate that while the Tetrahymena kinetochore lacks a conventional CCAN complex, it maintains a recognizable outer kinetochore structure supplemented by novel proteins essential for faithful chromosome segregation.

      Major comments

      1. Representation of known kinetochore diversity
        • Since this manuscript wants to highlight that it is important to characterize kinetochore components in different eukaryotic clades, it would be good to highlight the known diversity from the literature in Figure 1, e.g. indicating species/clades for which components have been experimentally validated vs. only computationally inferred.
        • It would be good to specifically highlight this on the figure for the clade closest to Tetrahymena in which KT components have been experimentally validated (Apicomplexa?).
        • L58-64: the sentences 'we have a limited understanding about kinetochore composition and function from other branches of the eukaryotic tree of life' and 'these surveys also uncovered a surprisingly extensive diversity of kinetochore composition across eukaryotes' seem to contradict each other. Instead of/in addition to the literature described in the introduction, as suggested above, having known diversity indicated on a figure would therefore be helpful. This could be done quite roughly, just mentioning the number of verified KT components and the number of species for which this was done.
        • L46-L56: when explaining the structure of the KT, it would be good to already refer to a figure, like the diagram of a human KT in 1B. As it is now, the introduction first explains the general structure, and then goes into diversity. This is fine, but it would be easier to understand if the figure panels followed this order.
      2. The data can sometimes be represented in a more straightforward manner:
        • L120-...: After reading through the whole text, I understand why the authors choose to talk about Spc24 and Spc25 first (since Spc25 is also used in the TurboID experiment). However, the presented pipeline for these two proteins is much less convincing than for the other proteins. Spc24/25: 'Some homology > slight structure similarity > right localization in immunostaining' vs. the pipeline for the other proteins: 'TurboID > confirmation using homology + immunostaining' (what is depicted in Fig. 2C). The latter is very convincing, but by starting off with the less convincing pipeline, the reader starts off on the wrong track. Since Spc24 is not used in the end for the first TurboID results, is Spc25 necessary at this point or can this come later?
        • It is very good and thorough that the authors noticed that some of the KT proteins were simply missed because they were not part of the original predicted proteome. However, why weren't the TurboID analyses simply redone with the new proteome? The authors could still note that it was important to use the most recent version, but it would be much more straightforward for readers to immediately have the most up to date analysis.
      3. Figure 4 and accompanying paragraph: this is an interesting analysis, but impossible to interpret without comparing with the branch length of other Tetrahymena proteins or Tetrahymena as a species (if I interpreted the analysis correctly). L251: 'this underscores the high rates of evolution of kinetochore proteins'. This could be true, but this isn't proven here because there is no comparison with the evolutionary rate of other proteins in Tetrahymena.
      4. Figure 5: For further validation and to better show the layered structure of the Tetrahymena kinetochore it would be nice to have a couple of images here with increased resolution by using expansion microscopy.

      Minor comments

      • Abstract: if you are going to call out individual components, maybe also point out the few that were already known (KiTT1-2 and 14). Otherwise the reader might be confused about the missing numbers.
      • L37: is 'cryptic ortholog' an official term? Doesn't this just depend on the starting point of the homology search and the number of experimentally verified hits you have in certain parts of the tree? Just wondering.
      • For future submissions, it would be useful to have the figure numbers indicated on the figures, because now it was sometimes difficult to keep track.
      • L51: mentioning the SAC might make it a bit too complicated for people not 100% familiar with all the complexes. Either leave it out until later, or have a short sentence explaining what the SAC is.
      • Figure 1A: the identity of the black 'nuclei' is not explained for the Ciliophora and Apicomplexa in the figure or figure legend.
      • In Figure 1B, instead of saying 'absent', wouldn't it be more correct to say something like 'not found/detected/identified'? Figure 1C. During interphase, sometimes homologous chromosomes seem to cluster at the centromeres (5 foci - example on the left), but sometimes they don't (10 foci - example on the right). Is this something you observe a lot? Is it strain-dependent?
      • Figure 1C (and later in Fig. 5): centromeres don't seem to align during metaphase. Is this true, or are these examples showing late metaphase/early anaphase?
      • Why was STU2 included in the kinetochore? Wouldn't it be better classified as a MAP as in Fig. 3A? I saw this is actually discussed in the discussion, but maybe this explanation should come earlier.
      • Figure 2A: 'strong similarity'. For a TM score of 0.4 and 0.54, I am not sure I would say 'strong similarity'. Visually, they also look different. TM is also not explained in the legend.
      • Fig. 2D: why not PC2? Please explain this somewhere.
      • Fig. 3C-D: 'striking similarity', again, it is hard to evaluate whether this is true from the figures and TM values alone (all are >0.5). Either change the phrasing, or explain how much similarity one would expect between homologs.
      • How certain are you that these are all diverged homologs? For example, for KNL1, could another RWD domain-containing protein have evolved to become a kinetochore protein?
      • Fig. 5: why wasn't CNA1 used as a marker of the inner kinetochore or tested?
      • Fig. 8: There is a time axis below, but I'm not sure what is indicated on this axis. Are the events above mapped on this axis?
      • L347-349: 'convergent evolution'. Is the loss of the CCAN convergent evolution, or was it already lost in the SAR common ancestor?

      Significance

      General Assessment: The study is robust, thorough, and well-written. The analyses are technically sound, and the authors avoid overstating their conclusions. Key strengths include the successful identification of diverged components using a "deep homology" pipeline and the functional validation of novel subunits. To improve the study, the data representation could be made more straightforward, and the manuscript structure could be condensed to better highlight the most convincing results. Finally, the claims on the speed of evolution of the kinetochore components need to be better supported.

      Advance: The study provides the first molecular map of a ciliate kinetochore. By uncovering "cryptic" orthologs that escaped previous detection, the work demonstrates that many "missing" complexes in diverse eukaryotes are likely present but highly diverged.

      Audience: This work will interest evolutionary cell biologists studying mitosis and kinetochores (especially those interested in eukaryotic diversity), as well as the ciliate research community. It also serves as a methodological roadmap for researchers using structural homology to identify divergent proteins in other non-model organisms.

      Expertise: My field of expertise includes evolutionary cell biology, kinetochores, centromeres, microbiology, microscopy and phylogenetics.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      Ali et al. investigate the composition of putative kinetochore subcomplexes in the unicellular eukaryote Tetrahymena thermophila. Up to the point of this study, only a CENP-A ortholog and two subunits of the microtubule-binding Ndc80 complex had been clearly identified. This left open the question of whether Tetrahymena kinetochores follow the conventional organization found in common model systems such as yeast or human cells, or contain many unconventional proteins. The authors combine proximity biotinylation coupled to mass spectrometry with deep homology searches and structure predictions. Extensive bioinformatic analysis of the T. thermophila genome allows the authors to annotate 16 genes as kinetochore genes (KiTT). Using sequence comparisons with known kinetochore proteins, they were able to relate their novel KiTT proteins to the conserved kinetochore components Cenp A, Cenp C, the KMN network, as well as auxiliary proteins. In particular, the authors were able to complete the organization of the Ndc80 complex and identify subunits of the Mtw1/Mis12 complex and a Knl1 ortholog. This characterizes a KMN network as the centerpiece of the Tetrahymena kinetochore architecture. The CCAN seems to be represented solely by CENP-C, with key binding interfaces to KMN and CENP-A being preserved. An interesting aspect is that neither a Dam1 nor a Ska homolog seems to support the Ndc80 complex. Instead, the authors identify a Kinesin-6 homolog that may potentially compensate for the absence of these factors. The study is well-designed, the results are thoughtfully discussed and the expertly conducted experiments highlight the power of combining experimental identification (BioID) with bioinformatic analyses.

      Major comments

      The functional validation of the newly identified subunits using RNAi feels somewhat limited in this study. I understand there are technical limitations in this system, but whenever possible, I would at least expect the authors to explore differential effects on different parts of the kinetochore using the reagents they have at hand. In particular, the authors show the effects of depleting KiTT12 (the kinesin-6 homolog) on Ndc80 kinetochore localization. It would be important to check effects also on CENP-A (using the anti-CNA1 antibody), or on other subunits. Given the available reagents, this should be readily possible. The organization of the Knl1 ortholog, and the question of whether a mitotic checkpoint is present, deserve some additional discussion. Interestingly, the positional organization of a PP1 binding motif at the N-terminus of a long disordered domain seems conserved. On the other hand, MELT motifs appear to be absent. The authors should discuss the implications of this some more. Is there an Mps1 homolog? What about the error correction machinery including Aurora B and the CPC? The putative MadBub homolog does not seem to localize to kinetochores, but maybe this is not detectable unless the respective conditions (unattached kinetochores) are generated. Is it known how the system reacts to spindle depolymerization?

      Minor comments

      Introduction: When introducing the Tetrahymena kinetochore, please add some sentences on microtubule/spindle organization in the MIC. What is known about the kinetochore-microtubule attachment site in Tetrahymena?

      Line 128: putative homology to Spc24 (E=13): please comment on why this was considered, what cutoffs were applied, etc.

      Line 135: briefly mention and discuss conservation of the RWD folds in the Spc24-25 orthologs

      Line 194: Maybe replace "show" with "suggest", given there is no experimental data behind the CENP-C identification

      Figure 7B: please add the information for the RNAi target directly to the Figure

      Figures in the combined pdf: please add the respective Figure number or Supplementary Figure number directly on the Figure.

      Significance

      While functional studies are often conducted in very few model organisms, exploring the evolutionary variations of kinetochore architecture can help to understand the design principles of kinetochores. It also helps to assign functions to specific subcomplexes and can reveal how adaptations of a core machinery occur. Tetrahymena is historically an important experimental system that has had a great impact on the understanding of multiple aspects of nuclear biology. Deciphering the organization of the chromosome segregation machinery in this organism is therefore of great interest to researchers interested in mitosis and genome stability.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The study by Hemberg et al investigated whether the critical adaptor protein MAVS is required for IFN induction and viral resistance in primary human respiratory epithelial cells infected with influenza A virus (IAV). Specifically, air-liquid interface pulmonary bronchial epithelial cells (ALI-PBECs) are engineered with CRISPR/Cas9 to knockout MAVS (or AAVS1 as a control), followed by IAV infection and subsequent monitoring of the 1) IFN/ISG response and 2) viral replication. The study is thoroughly conducted, the methodology and the results are clearly described, and the study is written in a clear and transparent manner. No overstatements are made and limitations of the study are clearly indicated. I therefore only have a few minor comments:

      We would like to thank Reviewer 1 for their assessment of our manuscript. We are pleased with the overall positive comments, especially that the study was considered thoroughly conducted. However, this reviewer also raised some concerns that we will address in the point-by-point response below.

      • In the Discussion the following sentence should be nuanced: "However, most studies report no observable effect of MAVS KO on antiviral protection18,20". Please indicate that this refers to the context of IAV infection. I would also clarify that in this sentence you are referring to antiviral protection rather than IFN/ISG induction. (For example: While MAVS is important for IFN/ISG induction upon IAV infection, some studies in MAVS-deficient mice showed that this did not necessarily translate to increased antiviral protection). I would also advise staying away from 'most studies report no observable effect' when you only reference two publications.

      This is a very valid point. We have addressed the comment by adjusting the sentence accordingly:

      “The importance of MAVS in IAV infections has only been investigated in few studies using MAVS-deficient mice. While these studies show that MAVS is essential for the induction of IFNs and ISGs upon IAV infection, supporting the findings of the current study, these findings did not translate to increased antiviral protection in the mice18,20.” in the revised version of our manuscript (page 4, line 34-37).

      • Fig. 2A and 2B (and 2C and 2D): please indicate the time points in the graphs to better reflect (visually) the difference between the panels.

      We agree with this comment by the reviewer. Figure 2A-D have been adjusted to include the infectious time points.

      Reviewer #1 (Significance (Required)):

      While the results of the study are not surprising (reduced IFN/ISG responses and increased IAV replication in MAVS KO cells), the scientific advance in this study is largely of a methodological nature: the role of MAVS in IAV infection has not been previously studied in this advanced culture model system. The study is of relevance to researchers that are in the immediate research field (virologists, RNA sensing). The study provides a useful example of CRISPR/Cas9 engineering in a complex model system to investigate the role of a particular gene in antiviral innate responses, which could easily be extrapolated to other genes.

      We are thankful that Reviewer 1 appreciates the significance and potential of the complex model used in this study.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Review of manuscript RC-2025-03340:

      The authors of the manuscript "MAVS is Important for Antiviral Defense Against Influenza A Virus in a Human Respiratory Epithelium Model" present their recent work evaluating innate immune responses in a human cell line. Specifically, they infect primary human respiratory epithelial cells with influenza A virus and use CRISPR-Cas9 to knockout MAVS and demonstrate the importance of this adaptor molecule on type I and type III IFN induction. Their work shows that MAVS is important for restricting viral replication and they conclude that it is a non-redundant protector of human respiratory epithelial cells. Below are comments that the authors should consider during revision of this manuscript.

      We thank Reviewer 2 for their careful review of our manuscript. Reviewer 2 provided several valuable suggestions for improvements, which we will address below.

      1. The authors use the IAV A/PR/8/34, which is a good, laboratory strain for initial characterization. However, since this model strives to evaluate human cell interactions the authors are encouraged to use a relevant primary human influenza virus isolate in their model. This is important as the host:pathogen interactions may differ significantly when using a human isolate.

      This is an important point. We agree that strain-specific differences might affect host-pathogen interactions. Although IAV A/PR/8/34 is well-characterized and easily reproducible in mechanistic studies, it is a laboratory-adapted strain. Therefore, we have now added a statement in the Discussion acknowledging this limitation:

      “An important consideration of this study is the use of the laboratory-adapted influenza A/PR/8/34 (H1N1) strain. While this strain was selected due to its widespread use and ability to provide a robust and reproducible infection model, it has undergone extensive laboratory adaptation. Consequently, the host-pathogen interactions observed in this study might differ from those elicited by primary clinical isolates. Validation of these results using contemporary human influenza strains will be important to determine their broader applicability.” (page 5, line 22-27).

      2. It was unclear why the authors used SeV for a control.

      We thank the reviewer for this important comment. Sendai virus (SeV) was used as a positive control due to its well-established ability to effectively activate RIG-I-dependent antiviral signaling. SeV is widely used as a potent inducer of RIG-I-mediated innate immune responses and provides a strong and reproducible stimulus independent of influenza-specific evasion mechanisms. To clarify this rationale, we have expanded the motivation for using SeV in the manuscript as indicated below:

      “To validate disruption of MAVS-dependent signaling, MAVS KO HAE-ALI cultures were infected with Sendai virus (SeV). SeV is a well characterized activator of the RIG-I pathway and is widely used as a positive control in studies of antiviral innate immunity. Infection with SeV induces a strong and reproducible activation of downstream signaling, thereby allowing validation of pathway integrity.” (page 2, line 44-46 and page 3, line 1-2).

      Reviewer #2 (Significance (Required)):

      Strengths of this study are the model developed and the evaluation of the MAVS component within primary human epithelial cells. Limitations include the use of a lab strain of influenza that is mouse-adapted (minimizes potential interaction with human host intracellular machinery), lack of a uniform knockout, and the inclusion of SeV without indicating the benefit of using this virus as a control. The advance in this publication is modest as it uses a non-uniform knockout model to demonstrate that an already-appreciated mechanism is involved in host:pathogen interactions. Also, without the use of a human virus isolate it is unclear if this is applicable beyond the lab setting.

      We would like to thank Reviewer 2 for their time and helpful comments. We have addressed their concerns to the best of our ability. We are pleased that the reviewer recognized the strength of our model in assessing the role of MAVS in primary respiratory epithelium.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The manuscript "MAVS is Important for Antiviral Defense Against Influenza A Virus in a Human Respiratory Epithelium Model" by Hemberg et al. demonstrates that reduction of MAVS expression in human respiratory epithelium leads to diminished induction of the type I IFN gene IFNB1, the type III IFN gene IFNL1, and multiple interferon-stimulated genes in response to IAV infection. They further show increased viral replication at 48 hours post-infection in MAVS-reduced epithelial cultures, as evidenced by elevated viral gene (NP, M2, NS1) transcripts and viral protein (NS1) levels.

      We would like to thank Reviewer 3 for their assessment of our manuscript. The reviewer raised some major and minor concerns that we will address in the point-by-point response below.

      Major comments: The authors convincingly show reduced MAVS expression in their CRISPR-Cas9 KO epithelial cultures compared to AAVS1 KO controls. However, residual MAVS expression remains detectable by western blot (Fig. 1C). No data is provided regarding cell-type specificity of the gene knockdown, although immunostaining and transcript analyses confirm the presence of both ciliated and secretory cells in MAVS KO and control cultures (Fig. 1H-I). Given that IAV exhibits strain-dependent cellular tropism and replication kinetics (e.g., Roach et al., PNAS 2024; https://doi.org/10.1073/pnas.2320303121), demonstrating unbiased gene editing across cell populations would strengthen the conclusions. While technically challenging (requiring scRNA-seq or RNA-scope), acknowledging this limitation in the discussion would be sufficient.

      We appreciate the reviewer’s comment regarding residual MAVS expression and the lack of cell-type–specific gene editing. We agree that while western blot analysis demonstrates substantial reduction of MAVS, low levels of MAVS expression remain detectable. As noted by the reviewer, we did not directly assess editing efficiency across individual epithelial cell subtypes. Although we confirm the presence of ciliated and secretory populations in both KO and control cultures, unbiased quantification of gene disruption at single-cell resolution was beyond the scope of the present study. We agree that such analyses would further strengthen conclusions regarding cell-type–specific effects and have now acknowledged this as a limitation in the Discussion section.

      “While MAVS protein levels were markedly reduced in the MAVS KO cultures, low residual expression remained detectable by western blot. HAE-ALI cultures are heterogeneous, and electroporation may not affect all cells, making complete MAVS elimination challenging. Nonetheless, MAVS protein levels were consistently reduced in KO cultures from both donors compared to controls, allowing investigation of the role of MAVS in this respiratory epithelium model. An important limitation of this study is that, although both ciliated and secretory epithelial populations were confirmed in KO and control cultures, gene editing efficiency was not assessed at single-cell resolution. Given that influenza A virus can exhibit strain-dependent cellular tropism, editing efficiency may vary across epithelial subtypes. Future studies employing single-cell transcriptomic approaches will be valuable in determining the extent of MAVS disruption within distinct epithelial populations.” (page 5, line 6-15).

      The claim that MAVS is critical for restricting viral replication is somewhat less compelling. While increased viral gene transcripts and reduced NS1 protein by western blot support this, the immunofluorescence images in Fig. 3E-F are difficult to interpret at the current magnification and resolution. Co-staining for secretory and ciliated cells would help assess tropism and epithelial integrity qualitatively. Additionally, although the data indicate increased viral replication, the most rigorous confirmatory assay would be a plaque-forming assay. Inclusion of discussion acknowledging this limitation would address concerns regarding replication kinetics without PFU measurement.

      We thank the reviewer for their comment regarding the assessment of viral replication. We agree that plaque-forming assays provide the most direct quantification of infectious viral particles. In the present study, increased viral gene transcripts together with altered NS1 protein levels were used as indicators of enhanced viral burden in MAVS KO cultures. While this supports increased viral replication, we acknowledge that infectious viral titers were not directly measured. We have now added a statement to the Discussion recognizing this limitation.

      Regarding the immunofluorescence images, we agree that higher magnification and co-staining for epithelial subtypes would provide additional insight into cellular tropism and epithelial integrity. This has been noted as a limitation and an area for future investigation.

      “Although increased viral transcripts and altered viral protein levels in MAVS KO cultures support enhanced viral burden, infectious viral titers were not directly quantified by plaque-forming assay, which remains the gold standard for measuring replication kinetics. Therefore, while our data indicate increased viral gene expression in the absence of MAVS, the extent to which this corresponds to increased production of infectious virions warrants further investigation. In addition, higher-resolution imaging with co-staining of epithelial subtypes would provide greater insight into potential strain-dependent tropism and epithelial integrity following infection.” (page 4, line 44-46 and page 5, line 1-4).

      Minor comments:

      Increase legibility of the scale bar in Fig. 3E.

      The size of the scale bars in all images of Fig. 3E has been increased to increase readability.

      Typographical corrections:

      "some HAE-ALI cell that wasn't affected by the KO" → "some HAE-ALI cells weren't affected by the KO"

      "ProLong Glas Antifade" → "ProLong Glass Antifade"

      All minor comments were adjusted in our manuscript.

      Reviewer #3 (Significance (Required)):

      The core strength and novelty of this study lies in the use of CRISPR-Cas9-mediated knockdown of a gene of interest in a human epithelial system. The authors demonstrate effective gene editing across multiple donors to interrogate a specific pathway. Although the results are limited to the innate responses of the respiratory epithelium-unlike prior studies using murine models, which capture a broader range of cell types and immune responses-the paper provides valuable mechanistic insight into the role of MAVS in the respiratory epithelium, the first line of defense against viral pathogens.

      We sincerely appreciate the thoughtful evaluation of our work by Reviewer 3 and their recognition of its strengths. We are particularly pleased that the reviewer highlighted the novelty of the CRISPR-Cas9–mediated gene editing approach in primary human epithelium and acknowledged the value of validating the model across multiple donors. We also appreciate their recognition that, despite the focused scope on epithelial innate responses, the study provides meaningful mechanistic insight into the role of MAVS at the respiratory barrier.

      In addition to revising the manuscript in response to the reviewers’ comments, we have also made minor grammatical changes, which are listed below:

      1. On the → and their (page 1, line 12)
      2. Molecule → protein (page 1, line 13)
      3. Test → assess (page 1, line 13)
      4. Has → is various experiments have (page 2, line 16)
      5. However, there are discrepancies regarding the extent to which MAVS contributes to antiviral defense → However, discrepancies regarding the extent to which MAVS contributes to antiviral defense are still present. (page 2, line 18)
      6. Moved this sentence from line 23 to line 21 on page 2: “Only a few studies have examined the importance of MAVS in the antiviral defense against IAV, and these studies report no consistent phenotype 9,16,18,20.”.
      7. RNA virus infections → IAV infection (page 2, line 23)
      8. Deleted “then” (page 2, line 40).
      9. Infection with SeV induced C-X-C Motif Chemokine Ligand 10 (CXCL-10) and Interferon Stimulated Gene 15 (ISG-15), measured by ELISA and qPCR, however this induction was decreased in the MAVS KO cultures compared to the control (Fig. 1D, E) → Infection with SeV increased production of C-X-C Motif Chemokine Ligand 10 (CXCL-10) and Interferon Stimulated Gene 15 (ISG-15), measured by ELISA and qPCR. However, this induction was decreased in the MAVS KO cultures compared to the control (Fig. 1D, E), (page 3, line 2-5).
      10. Deleted “generated” (page 3, line 12).
      11. However → but (page 3, line 16)
      12. Was → were (page 3, line 31)
      13. Deleted “by qPCR” (page 3, line 32).
      14. To examine this, viral RNA corresponding to the segments NP, M2, and NS1 of IAV from infected AAVS1 or MAVS KO HAE-ALI was quantified by qPCR. → To examine this, viral RNA corresponding to the Nucleoprotein (NP), Matrix protein 2 (M2), and Non-structural protein 1 (NS1) of IAV from infected AAVS1 or MAVS KO HAE-ALI was quantified by qPCR. (page 4, line 3-5)
      15. Added “the” (page 4, line 25)
      16. The current study only includes two same sex donors → A limitation of the current study is the inclusion of only two same-sex donors (page 5, line 16-17)
      17. The HAE-ALI model only includes respiratory epithelial cells, and therefore it does not... → Another limitation is, that the HAE-ALI model only includes respiratory epithelial cells. Therefore, it does not … (page 5, line 18-19)
      18. Role → importance (page 5, line 20)
      19. Importance → role (page 5, line 21)
      20. Deleted “the” (page 5, line 21)
      21. An important strength of this study is, that it was conducted on primary human respiratory epithelium cells, making the results transferable to humans. → Despite these limitations, an important strength of this study is, that it was conducted on primary human respiratory epithelium cells, making the results applicable to humans. (page 5, line 27-28).
      22. Added “Although further research, is needed, this study demonstrates that MAVS KO significantly impairs IFN and ISG induction in a human respiratory epithelial model infected with IAV. Given the physiological relevance of this system, these results support an important role for MAVS in antiviral responses in the human respiratory epithelium.” (page 5, line 34-37).
      23. Figure texts: IAV (diluted 1:20 in DMEM) → IAV (MOI 0.5).
    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      The manuscript "MAVS is Important for Antiviral Defense Against Influenza A Virus in a Human Respiratory Epithelium Model" by Hemberg et al. demonstrates that reduction of MAVS expression in human respiratory epithelium leads to diminished induction of the type I IFN gene IFNB1, the type III IFN gene IFNL1, and multiple interferon-stimulated genes in response to IAV infection. They further show increased viral replication at 48 hours post-infection in MAVS-reduced epithelial cultures, as evidenced by elevated viral gene (NP, M2, NS1) transcripts and viral protein (NS1) levels.

      Major comments:

      The authors convincingly show reduced MAVS expression in their CRISPR-Cas9 KO epithelial cultures compared to AAVS1 KO controls. However, residual MAVS expression remains detectable by western blot (Fig. 1C). No data is provided regarding cell-type specificity of the gene knockdown, although immunostaining and transcript analyses confirm the presence of both ciliated and secretory cells in MAVS KO and control cultures (Fig. 1H-I). Given that IAV exhibits strain-dependent cellular tropism and replication kinetics (e.g., Roach et al., PNAS 2024; https://doi.org/10.1073/pnas.2320303121 ), demonstrating unbiased gene editing across cell populations would strengthen the conclusions. While technically challenging (requiring scRNA-seq or RNA-scope), acknowledging this limitation in the discussion would be sufficient.

      The claim that MAVS is critical for restricting viral replication is somewhat less compelling. While increased viral gene transcripts and reduced NS1 protein by western blot support this, the immunofluorescence images in Fig. 3E-F are difficult to interpret at the current magnification and resolution. Co-staining for secretory and ciliated cells would help assess tropism and epithelial integrity qualitatively. Additionally, although the data indicate increased viral replication, the most rigorous confirmatory assay would be a plaque-forming assay. Inclusion of discussion acknowledging this limitation would address concerns regarding replication kinetics without PFU measurement.

      Minor comments:

      Increase legibility of the scale bar in Fig. 3E.

      Typographical corrections:

      "some HAE-ALI cell that wasn't affected by the KO" → "some HAE-ALI cells weren't affected by the KO"

      "ProLong Glas Antifade" → "ProLong Glass Antifade"

      Significance

      The core strength and novelty of this study lies in the use of CRISPR-Cas9-mediated knockdown of a gene of interest in a human epithelial system. The authors demonstrate effective gene editing across multiple donors to interrogate a specific pathway. Although the results are limited to the innate responses of the respiratory epithelium (unlike prior studies using murine models, which capture a broader range of cell types and immune responses), the paper provides valuable mechanistic insight into the role of MAVS in the respiratory epithelium, the first line of defense against viral pathogens.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      The authors of the manuscript "MAVS is Important for Antiviral Defense Against Influenza A Virus in a Human Respiratory Epithelium Model" present their recent work evaluating innate immune responses in a human cell line. Specifically, they infect primary human respiratory epithelial cells with influenza A virus, use CRISPR-Cas9 to knock out MAVS, and demonstrate the importance of this adaptor molecule for type I and type III IFN induction. Their work shows that MAVS is important for restricting viral replication, and they conclude that it is a non-redundant protector of human respiratory epithelial cells. Below are comments that the authors should consider during revision of this manuscript.

      1. The authors use the IAV A/PR/8/34, which is a good, laboratory strain for initial characterization. However, since this model strives to evaluate human cell interactions the authors are encouraged to use a relevant primary human influenza virus isolate in their model. This is important as the host:pathogen interactions may differ significantly when using a human isolate.
      2. It was unclear why the authors used SeV for a control.

      Significance

      A. Strengths of this study are the model developed and the evaluation of the MAVS component within primary human epithelial cells. Limitations include the use of a lab strain of influenza that is mouse-adapted (minimizes potential interaction with human host intracellular machinery), lack of a uniform knockout, and the inclusion of SeV without indicating the benefit of using this virus as a control.

      B. The advance in this publication is modest as it uses a non-uniform knockout model to demonstrate that an already-appreciated mechanism is involved in host:pathogen interactions. Also, without the use of a human virus isolate it is unclear if this is applicable beyond the lab setting.

      C. The audience for this is basic research.

      D. I am a basic science researcher who studies host:pathogen interactions using influenza models of vaccination, virus pathogenesis, and secondary bacterial infections.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      The study by Hemberg et al. investigated whether the critical adaptor protein MAVS is required for IFN induction and viral resistance in primary human respiratory epithelial cells infected with influenza A virus (IAV). Specifically, air-liquid interface pulmonary bronchial epithelial cells (ALI-PBECs) are engineered with CRISPR/Cas9 to knock out MAVS (or AAVS1 as a control), followed by IAV infection and subsequent monitoring of 1) the IFN/ISG response and 2) viral replication. The study is thoroughly conducted, the methodology and the results are clearly described, and the study is written in a clear and transparent manner. No overstatements are made and limitations of the study are clearly indicated. I therefore only have a few minor comments:

      • In the Discussion the following sentence should be nuanced: "However, most studies report no observable effect of MAVS KO on antiviral protection18,20". Please indicate that this refers to the context of IAV infection. I would also clarify that in this sentence you are referring to antiviral protection rather than IFN/ISG induction. (For example: While MAVS is important for IFN/ISG induction upon IAV infection, some studies in MAVS-deficient mice showed that this did not necessarily translate to increased antiviral protection). I would also advise staying away from 'most studies report no observable effect' when you only reference two publications.
      • Fig. 2A and 2B (and 2C and 2D): please indicate the time points in the graphs to better reflect (visually) the difference between the panels.

      Significance

      While the results of the study are not surprising (reduced IFN/ISG responses and increased IAV replication in MAVS KO cells), the scientific advance in this study is largely of a methodological nature: the role of MAVS in IAV infection has not been previously studied in this advanced culture model system. The study is of relevance to researchers in the immediate research field (virologists, RNA sensing). The study provides a useful example of CRISPR/Cas9 engineering in a complex model system to investigate the role of a particular gene in antiviral innate responses, which could easily be extrapolated to other genes.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      We appreciate the time and effort the reviewers have invested in providing constructive feedback on our manuscript. Below, we’ve detailed additional work, corrections, and improvements that we will complete during the revision process.


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary

      Folding is a major morphogenetic process that shapes tissues and organs in three dimensions. The mechanisms underlying tissue folding have been extensively explored and are often driven by actomyosin-based apical constriction. Here, the authors describe changes in cell geometry and mechanics during mouse neural tube formation. They build on quantitative fixed imaging and live junction ablation to extract cell geometry and junctional tension. These analyses are performed at different developmental stages and in both male and female embryos to propose a mechanical mechanism for neural tube elevation in the brain.

      Major comments

      The authors report quantitative data on cell geometry and junctional tension inferred from laser ablation. Overall, there are numerous statements that require stronger support from the experimental data. To substantiate several of their claims, the authors need to provide a larger number of data points (or at least comparable numbers across experimental conditions) for the tension measurements. Additional statistical analyses are required throughout to support the conclusions.

      Figure 1

      1. Does the projection algorithm account for tissue curvature when computing cell geometrical parameters such as area and anisotropy?

      At present, our projection algorithm does not correct for tissue curvature. Curvature in the tissue can make larger cells appear smaller in projections, skew the angle of cell orientations, and change aspect ratios. The largest curvature in the midbrain neural tube samples that we analyze is found in the transition region between the midline and lateral regions (~10-30% of tissue width) of 5 ss and 8 ss embryos. The regions at the midline and more laterally are relatively flat. Therefore, distortion from curvature will not dramatically alter our key conclusions. We will apply a curvature correction using existing tools (Herbert S., et al. (2021) BMC Biology) to sample images and determine if there are substantial differences in curvature-sensitive cell shape metrics. These will be included in a supplement to Figure 1. If there is a significant difference, we will expand the correction to all images that we analyze and update our analysis.
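
As a rough illustration of why curvature distorts projected cell areas, the following minimal Python sketch estimates the local ratio of true to projected area from a height map of the apical surface. It is not the Herbert et al. tool and not the projection pipeline used in the study; the function name `area_correction_factor`, the nested-list height map, and the example surfaces are assumptions made purely for illustration.

```python
import math

def area_correction_factor(height, row, col, pixel_size=1.0):
    """Return true_area / projected_area at an interior pixel of a height map.

    For a surface z = h(x, y), a patch with projected area dA has true area
    dA * sqrt(1 + (dh/dx)^2 + (dh/dy)^2).
    """
    # Central differences approximate the local slope of the surface.
    dh_dx = (height[row][col + 1] - height[row][col - 1]) / (2 * pixel_size)
    dh_dy = (height[row + 1][col] - height[row - 1][col]) / (2 * pixel_size)
    return math.sqrt(1.0 + dh_dx ** 2 + dh_dy ** 2)

# Hypothetical height maps (heights in the same units as pixel_size):
# a flat surface, and a surface tilted by 45 degrees along the column axis.
flat = [[0.0] * 3 for _ in range(3)]
tilted = [[float(c) for c in range(3)] for _ in range(3)]

flat_factor = area_correction_factor(flat, 1, 1)
tilted_factor = area_correction_factor(tilted, 1, 1)
```

In this toy case the flat surface needs no correction, whereas a cell on the tilted surface would appear smaller in projection than it truly is, which is the effect a curvature correction compensates for.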

      The authors should provide information on the accuracy and reliability of the cell segmentation.

      We can provide a supplement to Figure 1 to demonstrate the accuracy of the segmentation. We have used F-Actin to segment cells in our images, which is enriched along the cell junctions but can also form medial cables that cross the cell surface. Junctional actomyosin is notably brighter than medial cables, and segmentation with our trained CellPose model is consistently able to distinguish the junctions. We also checked segmentation and performed manual corrections to ensure accuracy. To demonstrate this for our readers, we will prepare samples stained with both F-actin and ZO-1, a tight junction component that is localized to cell junctions. We will then segment the image twice in CellPose, once using the F-actin signal and once using the ZO-1 signal. The resulting cell outlines will then be digitally superimposed to show how much the signals overlap, and we will plot cell frequency as a function of area to determine whether F-actin-based segmentation achieves the same fidelity as ZO-1-based segmentation. Recent work by a co-author has shown excellent corroboration of neuroepithelial apical cell areas segmented using F-Actin and ZO-1 (Ampartzidis I., et al. eLife 2026). We are confident that our data will show a similar result.
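
One common way to quantify how well two segmentations of the same cell overlap is the Jaccard index (intersection over union). The sketch below is a hedged illustration only: it does not use the CellPose API, and the representation of each cell mask as a set of pixel coordinates, the function name `jaccard_index`, and the example masks are all hypothetical.

```python
def jaccard_index(mask_a, mask_b):
    """Intersection over union of two cell masks given as sets of (row, col) pixels."""
    union = mask_a | mask_b
    if not union:
        return 1.0  # two empty masks are treated as identical
    return len(mask_a & mask_b) / len(union)

# Hypothetical masks for one cell segmented from the F-actin and ZO-1 channels.
# Both cover 4 x 4 pixels; the ZO-1 mask is shifted by one column.
factin_cell = {(r, c) for r in range(0, 4) for c in range(0, 4)}
zo1_cell = {(r, c) for r in range(0, 4) for c in range(1, 5)}

overlap = jaccard_index(factin_cell, zo1_cell)
```

A value near 1 indicates that the two channels outline essentially the same cell, while lower values flag cells whose outlines disagree and may warrant manual inspection.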

      The authors indicate that the rate of apical constriction differs between male and female embryos. However, apical sizes differ only at specific positions along the ML axis (Fig. 2H, I).

      In Figure 2H, we show that at 5 ss males have larger apical areas than females at the midline, adjacent lateral cells, and at the surface ectoderm-neural epithelium border. By 8 ss (Figure 2I), cells at the midline are smaller in males than females, while cells in more lateral regions are now equivalent between sexes. This change in apical area over time suggests that males have faster rates of constriction than females at the midline and adjacent lateral region, where male cells become smaller than or equivalent in size to female cells, respectively. We will perform statistical analysis (see comment #4) to determine if there are regions with significant differences in rate and amend our language to clarify that these differences are region specific as appropriate.

      The authors should provide statistical analyses for the rates shown in Fig. 2J. Are these rates significantly different between males and females, and between medial and lateral regions?

      Currently we calculate our rates using the difference in population averages of apical area at each stage shown in Figure 2H and 2I for each sex, and dividing by the number of somite stages, 3. As a result, there is only one rate value at each midline-lateral bin for each sex which is not amenable to statistical analysis. To correct this, we will calculate rates by subtracting the average apical area of each embryo at 8 ss from the population average of embryos at 5 ss. This will create 5 rates for both females and males at each 10% midline-lateral bin. We plan to perform a two-way ANOVA to determine if there are statistical differences in rates between males and females at each bin position and between medial and lateral regions. We will also add a section describing these calculations to the “Statistical Analysis” portion of the methods.
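
The per-embryo rate calculation described above can be sketched as follows. This is a minimal illustration rather than the study's analysis code: the function name `constriction_rates`, the example areas, and the choice to keep dividing by the three elapsed somite stages are assumptions.

```python
from statistics import mean

SOMITE_STAGES_ELAPSED = 3  # from 5 ss to 8 ss

def constriction_rates(areas_5ss, areas_8ss):
    """Per-embryo constriction rates for one sex and one midline-lateral bin.

    areas_5ss: mean apical area of each 5 ss embryo (pooled into a population average)
    areas_8ss: mean apical area of each 8 ss embryo (yields one rate per embryo)
    Positive values mean that apical area decreased between 5 ss and 8 ss.
    """
    baseline = mean(areas_5ss)
    return [(baseline - area) / SOMITE_STAGES_ELAPSED for area in areas_8ss]

# Hypothetical mean apical areas for five embryos per stage in a single bin.
areas_5ss = [30.0, 32.0, 28.0, 31.0, 29.0]
areas_8ss = [24.0, 21.0, 27.0, 18.0, 30.0]

rates = constriction_rates(areas_5ss, areas_8ss)
```

Because each 8 ss embryo now contributes its own rate, every sex and bin combination has replicate values, which is what makes a two-way ANOVA across sex and position possible.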

      Please clearly state the main novelty of this study relative to the work published by Brooks et al.

      Our study builds on the work of Brooks ER, et al. (2020) eLife. Brooks demonstrates that cells in a region of the lateral neural folds undergo apical constriction (Figure 1) and that cells at the midline do not (Figure 2). We expand and improve upon this work in the following ways:

      1. A) As required by our funding sources at the NIH (NOT-OD-15-102) we have collected, analyzed, and reported on sex as a biological variable of interest. In doing so, we have shown that there are clear sex differences in apical area in the neural tube that were not previously shown. We also show that there is apical constriction within the neural tube midline in a sex-dependent manner. Brooks et al. do not address sex in their work.
      2. B) We have provided more complete and spatially precise information on midline-lateral patterns of apical area and apical constriction. To show changes in apical area of lateral cells, Brooks selects a 100 x 100 µm region of interest in the midbrain (Figure 1E-F, Figure 2A) but does not specify the midline-lateral or rostral-caudal location of this region of interest or standardize it between embryos of different ages and dimensions. In our study, we’ve standardized our measurements to a 100 µm wide band across the midbrain adjacent to the midbrain/hindbrain boundary (Figure 2A-C). We also standardize positions as a percent distance from midline to account for differences in width between embryos and ages. This allows us to consistently compare similar populations of cells along the midline-lateral axis and determine changes in apical area over time.
      3. C) We connect patterns of apical area and constriction to F-actin and Myosin-IIB density. Though Brooks et al report some analysis of F-actin in lateral cells (Figure 6), they do not analyze the midline cells or explore the relationship between cell shape and actomyosin.
      4. D) Finally, we tested the mechanical properties of the tissue through laser ablation in living mouse embryos. From these ablations we’ve found that tension at the midline is less than in more lateral regions. Work in the neural tubes of frog (Haigo S., et al. (2003) Current Biology, Baldwin AT., et al. (2022) eLife, Matsuda M., et al. (2023) Nature Communications) and chicken (Kinoshita N., (2008) Cell, Nishimura T., et al. (2012) Cell) embryos has conclusively shown that enriched midline actomyosin promotes apical contractility and drives hinge formation. It was therefore largely believed that a similar contractile hinge was employed in mammals (Copp AJ. and Greene NDE. (2010) J. Pathol, Nikolopoulou E., et al. (2017) Development). Collectively, our work is the first to demonstrate that such a contractile hinge is not present in the mammalian brain neural tube.

      Figure 3

      The authors need to provide statistical support for the claim that large midline cells exhibit reduced F-actin and Myosin IIB levels.

      We will conduct a two-way ANOVA to determine if there are statistical differences in F-actin and Myosin IIB density at the midline and more lateral regions in both males and females. We will update our language in the text and plots as appropriate from these results.

      F-actin and Myosin IIB intensities should be plotted as a function of cell area to support the proposed anticorrelation between apical area and actomyosin levels.

      We will make plots of cell areas vs. F-actin or Myosin IIB density for cells in each embryo. We will then fit a line to determine the R value for each embryo to determine if there is a negative correlation between cell area and actomyosin intensity. We will also adjust our language in the text as appropriate based on the results of these tests.
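
For readers unfamiliar with the R value mentioned above, the sketch below computes a Pearson correlation coefficient between apical area and actomyosin intensity for the cells of one embryo. The function name `pearson_r` and the example values are hypothetical, and the code is an illustration rather than the analysis used in the study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)

# Hypothetical per-cell measurements from a single embryo:
# larger apical areas paired with lower F-actin intensity.
apical_areas = [10.0, 20.0, 30.0, 40.0]
factin_intensity = [8.0, 6.0, 5.0, 1.0]

r_value = pearson_r(apical_areas, factin_intensity)
```

A value of R close to -1 would support an anticorrelation between apical area and actomyosin density, whereas a value near 0 would indicate no linear relationship.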

      Statistical analyses are missing to substantiate the increase in F-actin levels between stages ss5 and ss8.

      We will perform an F-test to determine homogeneity of variance between F-actin at 5 ss and 8 ss followed by the appropriate t-test to determine if there is a statistical increase in F-actin over time. We will also amend our language in the text to reflect the results of this test.

      Figure S3 should be supported by plots showing Myosin II and F-actin intensity as a function of position along the ML axis, together with appropriate statistics.

      In Figure 3A-D, we show representative images of F-Actin and Myosin IIB density in female embryos. These are plotted as the purple lines in Figure 3 E-H. Figure 3 Supplement 1 shows representative images of F-actin and Myosin IIB density in male embryos. These are plotted as the green lines in Figure 3 E-H. We will add a line in the caption of Figure 3 Supplement 1 indicating that these samples are represented and plotted in Figure 3. We also noted a typo in the respective captions, which incorrectly indicated whether males or females were shown in the figure. We will correct these typos as well. Additionally, we will perform the statistical tests indicated under comment #6.

      Figure 4

      The authors state that lateral tension in male embryos is not different from midline tension, yet the number of data points is much lower than in females. To support this claim, the number of ablations should be comparable across sexes.

      As part of this study we performed 270 ablations in the neural tubes of 83 mouse embryos: an exceptional scale of ablations that is the first of its kind in early embryos. We conducted our initial recoil velocity analysis blinded to information on sex. Male embryos were statistically underrepresented in our data set because male embryos develop faster than their female littermates (Seller MJ. and Perkins-Cole KJ. (1987) J. Reprod. Fert.). As such, the neural folds of male embryos were too elevated to ablate. At present we do not have the resources or justification to perform laser ablations on additional animals to obtain the number of male embryos needed to supplement the already exceptionally large data set. We will instead perform a power analysis to determine if: 1) we have a sample size large enough to detect a biologically-meaningful difference with suitable power, 2) the sample size required to detect the observed difference is so large that the difference would not be biologically meaningful, or 3) we do not have a sample size large enough to detect a difference confidently. With the results of this analysis, we will amend our language in the text to reflect the most accurate claims that can be made.
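
To make the logic of such a power analysis concrete, the following sketch estimates the number of samples per group needed to detect a given standardized effect size in a two-sided, two-sample comparison of means. It uses the standard normal approximation rather than an exact t-based calculation, and the function name `samples_per_group` and the default significance level and power are assumptions for illustration, not values taken from the study.

```python
import math
from statistics import NormalDist

def samples_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate sample size per group for a two-sided, two-sample test.

    Normal approximation: n = 2 * ((z_(1 - alpha/2) + z_power) / d) ** 2,
    where d is the standardized effect size (Cohen's d).
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

# Hypothetical effect sizes: a moderate and a large difference in recoil velocity.
n_moderate = samples_per_group(0.5)
n_large = samples_per_group(1.0)
```

Comparing the required sample size with the number of ablations actually available in each group indicates which of the three scenarios listed above applies.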

      Is lateral tension different between males and females?

      In Figure 4G we show that females have statistically different tension between the lateral and midline regions, while males do not. However, we do not test if the lateral or midline tension is different between females and males. We will perform an F-test and t-test to determine if there are statistical differences between males and females in this region.

      Similarly, the data in Fig. S4 used to claim no change in tension over time are not supported by sufficient data points.

      As discussed in comment #10, the scale of ablations is already substantial, and the initial recoil velocities were analyzed blinded to information on embryo age. We will calculate a best fit line for these plots to demonstrate if there is a trend in recoil velocity over time. We will then adjust our language in the text as appropriate with this added information.

      Would the medial and lateral tensions reported in Fig. 4G remain unchanged if the authors perform statistical analyses on 10-15 ablations per condition?

      We do not have a justification for removal or exclusion of any of the laser ablations analyzed in this study. We will instead perform a power analysis, as indicated in comment #10, and adjust the language in the text as appropriate given the results of that analysis.

      Figure 5

      The number of data points in Fig. 5J and L is insufficient to support claims of no difference. The only detectable difference arises in the comparison with much higher sample size (Fig. 5L, ML vs RC).

      In Figure 5J we disaggregate ablations performed at the midline by directionality (midline-lateral or rostral-caudal). We were unable to detect a statistically significant difference based on the direction of initial recoil velocity in either sex, though N’s for all categories are comparable. As discussed in comments #10 and #12, the scale of ablations conducted in this study is uniquely substantial. We will perform a power analysis for our anisotropy measurements in the lateral region of the tissue to determine if we have a sample size large enough to detect a biologically relevant difference with high confidence or if the sample size required to detect the observed difference is so large that the difference would not be biologically meaningful. Given the results of this analysis, we will amend our language in the text to reflect the most accurate claims that can be made.

      The authors conclude that males have higher ML tension than RC tension, but given the limited data this conclusion should be amended to "no detectable difference."

      In Figure 5L, we disaggregate ablations performed in the lateral regions, by directionality (midline-lateral or rostral-caudal). We find a statistical difference in the directionality of initial recoil velocity in females. In males, though we can observe a difference in the initial recoil velocity means, we are unable to detect a statistical difference, likely due to the smaller male sample size. As discussed in comments #10 and #12, the scale of ablations conducted in this study is uniquely substantial and was conducted blinded to embryo sex. Given that males develop faster than their female littermates (Seller MJ. and Perkins-Cole KJ. (1987) J. Reprod. Fert.) we were unable to obtain more males in our data set. We will perform a power analysis for our anisotropy measurements in the lateral region of the tissue to determine if: 1) we have a sample size large enough to detect a biologically-meaningful difference with suitable power, 2) the sample size required to detect the observed difference is so large that the difference would not be biologically meaningful, or 3) we do not have a sample size large enough to detect a difference confidently. With the results of this analysis, we will amend our language in the text to reflect the most accurate claims that can be made.

      Code availability

      The authors should provide access to the code used to generate the projections.

      We are committed to ensuring open access to all code used as part of this study, including components of the projection workflow, data analysis, and figure creation. We are in the process of assembling a GitHub repository containing these files as well as documentation to allow for use by other members of the research community and public. We will make this repository public upon its completion or at the time of publication, whichever comes first.

      Reviewer #1 (Significance (Required)):

      The authors propose a mechanical model for neural tube elevation based on analyses of cell geometry and tension at two developmental stages. The reported differences in cell geometry or actomyosin levels do not appear to explain the differences in geometry or tension suggested between male and female embryos. This raises questions about the relationship between these measurements and their relevance for understanding the mechanisms of neural tube elevation.

      If the major concerns outlined above are rigorously addressed, the manuscript will offer a valuable descriptive characterization of neural tube cell geometry and mechanical stress during morphogenesis. Such datasets could form a foundation for future studies investigating the mechanisms driving neural tube elevation.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The manuscript investigates the role of apical constriction and actomyosin organization in shaping the mouse brain neural epithelium during neural tube elevation, with particular emphasis on sex-specific differences. The authors develop an imaging and analysis pipeline to reconstruct the apical surface of the neural plate in three dimensions and perform quantitative measurements of apical cell area, actin, and myosin IIB distributions. Targeted laser ablation experiments are used to infer regional tissue tension.

      The main findings can be summarized as follows. First, the authors identify a mediolateral gradient in apical cell area, with larger cells at the midline and smaller cells on the lateral neural folds, which inversely correlates with actomyosin density. Laser ablation experiments suggest that apical tension is lower and isotropic at the midline, whereas it is higher and anisotropic on the lateral folds, particularly in females. Second, sex-dependent differences in apical cell area, constriction rates, and actomyosin levels are reported at early somite stages, preceding previously described sex biases in neural tube defects.

      The experimental work is technically solid, and the imaging and quantification pipeline represents a useful advance for analyzing large, curved epithelial surfaces. However, the study feels incomplete in its current form. Despite addressing neural tube elevation, the manuscript does not provide a comprehensive analysis of the folding process itself. Key aspects such as three-dimensional tissue morphology, curvature evolution, or global shape changes of the neural folds are not quantified. In addition, other potentially relevant cellular behaviors, such as proliferation, cell rearrangements, or contributions from neighboring tissues, are not examined, nor are they compared systematically between sexes.

      Conceptually, the study focuses narrowly on correlations between apical cell area, actomyosin density, and inferred tension. While these measurements are carefully performed, the relationship between differential actomyosin contractility and three-dimensional tissue folding remains largely descriptive. No mechanical model or simulation framework is provided to link changes in actomyosin organization and cell shape to the emergence of neural folds and hinge formation. As a result, it is difficult to assess whether the measured differences in tension (on the order of ~40%) are sufficient to account for the proposed mechanical behavior of the tissue.

      The central hypothesis advanced by the authors is that a relatively "soft" midline, flanked by stiffer, tension-bearing lateral folds, facilitates hinge formation during brain neurulation. However, this hypothesis is not directly tested by perturbation. For example, experimentally increasing contractility or stiffness at the midline (e.g., via optogenetic activation of apical constriction machinery) would provide a more direct test of causality. As it stands, the data demonstrate correlation rather than necessity or sufficiency.

      Relatedly, alternative interpretations are not fully addressed. Large apical cell areas and low actomyosin levels at the midline could arise as a consequence of tissue geometry, contact with underlying structures such as the notochord, or extrinsic mechanical constraints, rather than being the primary cause of hinge formation. Similarly, anisotropic stresses generated at the tissue or embryo scale could align cells and actomyosin cables, producing the observed patterns without requiring locally specified apical tension differences as the initiating mechanism. The manuscript does not clearly distinguish whether apical tension asymmetries are a driver of folding or an emergent outcome of folding dynamics.

      Finally, while the identification of sex differences is intriguing, it remains unclear what mechanistic insight is gained beyond establishing that such differences exist. The functional consequences of these differences for neural tube closure, robustness, or failure are not explored, nor is it clear how they integrate into the proposed lateral tension model.

      In summary, this study provides high-quality measurements of apical cell geometry, actomyosin organization, and inferred tension in the mouse neural epithelium. However, the lack of direct perturbations, mechanical modeling, and quantitative analysis of three-dimensional tissue deformation limits the strength of the mechanistic conclusions. Addressing these gaps would substantially strengthen the manuscript and clarify the causal role of apical tension patterns in neural fold formation.

      The reviewer makes an excellent point: direct perturbation of the system would enable us to test our hypothesis and determine whether the reduced contractility at the midline is essential for neural tube elevation. However, at present the technology needed to conduct an optogenetic experiment like that described by the reviewer does not exist. As with the laser ablations, an optogenetic experiment requires access to live and healthy embryos. Currently, mouse embryos can be cultured for several days in roller culture, where they are continuously rotated, or for several hours in static culture (Aguilera-Castrejon A. and Hanna JH. (2021) J. Vis. Exp.). Both techniques require that the yolk and amniotic sacs remain intact around the embryo. To access the apical surface of the brain neural tube for imaging, both sacs must be breached, after which the embryo has about 30 minutes before it begins to exhibit altered cellular morphology and tissue integrity, followed ultimately by embryo death.

      The neural tube elevates over several hours and closes fully after more than a day (Jacobson AG. and Tam PPL. (1982) The Anatomical Record). Even if we did acquire mice expressing photoactivatable constructs, the support membranes of the embryos would need to be breached to activate protein interactions, and the embryos would die before any meaningful progress in neural tube elevation could be evaluated. Conducting an experiment like this would greatly advance our understanding of the system, and we hope that the needed technologies are developed to enable future work of this nature. The Galea lab previously purchased a photo-activatable Cre line, but was unable to induce deletion of a protein of interest using this allele before closure of the neural tube was completed (and the blue light needed to activate the Cre was phototoxic).

      At present, there is some experimental evidence to suggest that the lack of apical constriction at the midline is important for proper neural tube closure. Brooks ER, et al. (2020) eLife show that a truncated Ift122 mutant leads to abnormal constriction of the midline cells but does not disrupt lateral cell apical constriction, leading to a failure in brain neural tube closure in these embryos. Ift122 regulates trafficking and signaling proteins in cilia, which in turn regulate Sonic hedgehog signaling, which Brooks ER, et al. also demonstrate regulates apical constriction. While this disruption is clearly multifaceted and nuanced, it provides some genetic support for the lack of apical constriction at the midline being important for neural tube closure.



      Major Comments

      Figure quality. Figure 1 contains very low-resolution images, which makes it difficult to evaluate the segmentation quality and tissue morphology. Higher-resolution versions should be provided.

      In Figure 1, we outline the conceptual strategy and approach used to create and analyze shell projections of the curved neural tube. As much of our analysis builds from segmentation of cells in the projections, being able to assess segmentation quality from high resolution images is critical to evaluating the quality of the data shown. As discussed in comment #2, we will create a supplement to Figure 1 to demonstrate the accuracy of the segmentation. This will include high resolution images of both the label used to segment and the resulting segmentation, with corresponding overlays.

      Cell segmentation strategy and validation. The authors segment cell areas using Myosin II and F-actin signals. This approach may introduce inaccuracies, as actomyosin cables can traverse the apical surface of individual cells and do not always coincide with cell boundaries. Segmentation based on junctional markers such as ZO-1 may be more appropriate. At minimum, the authors should provide a quantitative validation of segmentation accuracy, for example by overlaying segmentation results on raw images together with a nuclear marker (e.g., DAPI or H2B-GFP), to demonstrate that the number of segmented cells corresponds to the number of nuclei.

      We have used F-actin to segment cells in our images. F-actin is enriched along junctions, but cells can also have medial pools and F-actin cables, which might lead to errors. Though we understand the reviewer’s logic in asking to align segmentations with marked nuclei, the morphology of the neural epithelium makes this approach infeasible. The neural epithelium is pseudostratified, and nuclear position varies along the apical-basal axis depending on the cell cycle phase of each cell. As a result, an apical shell projection of nuclei would not capture all nuclei, and a maximum intensity projection in Z of all nuclei would be uninterpretable because there would be substantial XY overlap between nuclei. Instead, we will create a supplement to Figure 1 to demonstrate the accuracy of the segmentation, as discussed in comment #2. We will segment samples stained for both F-actin and junctional markers like ZO-1. We will then create overlays of the resulting cell outlines and a cell area frequency plot for both segmentations to evaluate whether F-actin-based segmentation deviates from tight junction-based segmentation.

      Lack of cross-sectional views of neural tube morphology. The manuscript would benefit from the inclusion of cross-sectional images of the neural tissue at different developmental stages. This would serve two purposes: (i) to demonstrate that the authors have a comprehensive understanding of the full three-dimensional folding process during neural tube closure, including medial and lateral hinge formation, and (ii) to allow readers to visualize the tissue geometry corresponding to the analyzed projection datasets (e.g., at 5 ss and 8 ss).

      A key component of our model states that the changes in cell-level morphology and features correspond to changes in tissue-level morphology (Figure 6). Specifically, lateral apical constriction coincides with the flattening and elevation of the dorsal bulges on the lateral neural folds. We agree that it is beneficial to include additional visuals of tissue morphology. We plan to add an additional figure at the start of the manuscript that details both the dorsal and relevant cross-sectional views of the somite stages analyzed. These visuals will take the form of graphical illustrations along with 3D confocal microscopy images and optical reconstructions of samples.

      Sex-specific differences in overall neural plate morphology. The authors report that at 5 ss, males consistently have larger apical cell areas than females. It is unclear whether this difference reflects a global difference in neural plate morphology. Showing representative images of female and male neural plates would help readers directly assess whether there are overt morphological differences beyond those revealed by quantitative analysis.

      If one sex has larger cells than the other, it would be reasonable to expect that the neural folds may be wider as well. In Figure 2B-C, we show representative images of male embryos at 5 and 8 ss. As part of the additions we indicated in comment #19, we will also include dorsal and cross-sectional views of both male and female embryos at the stages analyzed. If there is a difference in tissue morphology between sexes, we will also quantify these differences in tissue size, curvature, etc.

      Cell number analysis. The authors state, based on prior literature, that cell numbers do not change between 5 and 8 ss. Given that the tissue is already segmented in the current study, this claim should be directly verified using the authors' own data. This analysis should be straightforward and would strengthen the conclusions.

      We agree and will determine the number of cells analyzed for each embryo, then use appropriate statistical tests to assess whether cell numbers change between stages or differ between sexes.

      Relation between tissue curvature and cellular properties. It would be highly informative to extract the three-dimensional morphology of the neural plate, in particular its curvature, and examine how curvature correlates with two-dimensional cell anisotropy, apical area, and F-actin/Myosin intensity. For example, at 8 ss the authors report a U-shaped dependence of cell area along the mediolateral axis. How does this pattern relate to local tissue curvature?

      We agree with this assessment and will create optical reslices in the midbrain adjacent to, but excluding, the midbrain-hindbrain boundary. We will then divide the apical surface into 10% bins and fit a circle to the apical surface of the neural epithelium in each bin to calculate the local radius of curvature, which is the reciprocal of the curvature of the surface. We can then correlate these values with two-dimensional cell shape and actomyosin density metrics.
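
      To make the circle-fit step concrete, the following is a minimal sketch of one way a local radius of curvature could be obtained from apical surface coordinates in a single bin. It is illustrative only and is not the analysis pipeline: it assumes the apical outline has already been extracted as (x, y) points from an optical reslice, it uses an algebraic (Kåsa) least-squares fit as a stand-in for whatever fitting routine is ultimately chosen, and the function name `fit_circle` and the example coordinates are hypothetical.

```python
import math


def _det3(m):
    """Determinant of a 3x3 matrix given as three row tuples."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def fit_circle(points):
    """Algebraic (Kasa) least-squares circle fit.

    points: list of (x, y) apical-surface coordinates (hypothetical input).
    Returns (cx, cy, radius). Curvature is 1 / radius.
    """
    n = len(points)
    if n < 3:
        raise ValueError("need at least three points to fit a circle")

    # Centre the data on its centroid to keep the linear system well conditioned.
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    pts = [(x - mx, y - my) for x, y in points]

    # Model: x^2 + y^2 = a*x + b*y + c, which is linear in (a, b, c).
    sxx = sum(x * x for x, _ in pts)
    syy = sum(y * y for _, y in pts)
    sxy = sum(x * y for x, y in pts)
    sx = sum(x for x, _ in pts)
    sy = sum(y for _, y in pts)
    sxz = sum(x * (x * x + y * y) for x, y in pts)
    syz = sum(y * (x * x + y * y) for x, y in pts)
    sz = sum(x * x + y * y for x, y in pts)

    m = ((sxx, sxy, sx), (sxy, syy, sy), (sx, sy, float(n)))
    det = _det3(m)
    if abs(det) < 1e-12:
        raise ValueError("points are (nearly) collinear; curvature is ~0")

    # Solve the normal equations with Cramer's rule.
    a = _det3(((sxz, sxy, sx), (syz, syy, sy), (sz, sy, float(n)))) / det
    b = _det3(((sxx, sxz, sx), (sxy, syz, sy), (sx, sz, float(n)))) / det
    c = _det3(((sxx, sxy, sxz), (sxy, syy, syz), (sx, sy, sz))) / det

    cx, cy = a / 2.0, b / 2.0
    radius = math.sqrt(c + cx * cx + cy * cy)
    return cx + mx, cy + my, radius


# Hypothetical example: points sampled from an arc of radius 5 (arbitrary units).
arc = [(1 + 5 * math.cos(t / 10), -2 + 5 * math.sin(t / 10)) for t in range(2, 13)]
cx, cy, r = fit_circle(arc)
print(f"radius = {r:.3f}, curvature = {1 / r:.3f}")
```

      Applied per 10% bin, the resulting radius (or its reciprocal) would give one curvature value per bin per embryo, which could then be compared with the cell-shape and actomyosin density measurements from the same bin.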

      Visualization of sex differences in medial actin levels (Figure 3). In Figure 3, the reported female-male difference in medial actin levels would benefit from visualization of the raw data. A zoomed-in inset of the midline region, shown separately for females and males, would help substantiate this claim.

      In Figure 3, we demonstrate patterns of the whole-cell apical F-actin (Fig. 3A, B) and Myosin IIB (Fig. 3C, D) density. We find that there is no difference in F-actin density between males and females (Fig. 3E, F), but a significant difference in midline Myosin IIB density at 5 ss that is mostly absent by 8 ss (Fig. 3G, H). We currently provide representative images for female and male Myosin IIB expression across the midline-lateral axis in Figure 3C, D, and Figure 3-Supplement 1C and D. We can provide a close-up image of Myosin IIB in the midline region for both sexes as part of Figure 3, with additional annotations on the existing representative images to indicate their origin.

      Typographical error. Line 143: please correct "cell are" to "cell area".

      We thank the reviewer for pointing out this error and will correct this typo and perform additional editing to correct any other typos present in the manuscript.

      Quantitative correlation analysis between cell area and actomyosin. The authors qualitatively discuss the relationship between cell area dynamics and actomyosin levels. It would strengthen the analysis to directly compute and report correlations between these variables, and to explicitly test whether actin and myosin levels are anti-correlated with apical cell area.

      As discussed in comment #6, we will plot cell area vs. F-actin or Myosin IIB density for each embryo and fit a line to calculate their correlation coefficient. From there, we will determine if there is a negative correlation between cell area and actomyosin intensity.
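
      As an illustration of the correlation analysis described above, the sketch below computes a Pearson correlation coefficient and a least-squares slope for one embryo. It is a minimal example under stated assumptions, not the analysis code: the function names and the numbers are hypothetical, and the choice of Pearson correlation is an assumption about which coefficient would be reported.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical per-bin values for one embryo (arbitrary units).
cell_area = [10.0, 20.0, 30.0, 40.0]
myosin_density = [8.0, 6.0, 4.0, 2.0]

r = pearson_r(cell_area, myosin_density)
slope = least_squares_slope(cell_area, myosin_density)
print(f"r = {r:.2f}, slope = {slope:.2f}")  # negative r indicates anti-correlation
```

      Computing one coefficient per embryo, rather than pooling cells across embryos, keeps the embryo as the unit of analysis.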

      Interpretation of anti-correlation and contractile hinge mechanism. In lines 143-157, the authors state that the observed anti-correlation between actomyosin and cell area argues against a contractile hinge mechanism. However, this anti-correlation could also suggest that apical cell area is determined by local mechanical or geometric constraints rather than by local actomyosin contractility. The authors should clarify and discuss this alternative interpretation.

      Within the neural epithelium of mice and other vertebrates, F-actin and Myosin IIB are enriched on the apical surface relative to other regions of the cell (Sadler TW, et al. (1982) Science, Matsuda M., et al. (2023) Nat. Commun., Röper, K. (2013) BioArchitecture). This positions the actomyosin network to selectively constrict the apical surface relative to the basal side of the cells. Apical constriction is observed to actively facilitate the formation of hinges in folding tissues (Chanet S, et al. (2017) Nature Communications, Nishimura T., et al. (2012) Cell, Christodoulou N., et al. (2015) Cell Reports) in what we term the contractile hinge model of tissue folding. Tissues that employ this model of folding are expected to have small apical areas and apical enrichment of contractile actomyosin at the hinge point during folding. We observe large apical areas, low apical actomyosin density, and low apical tension at the midline hinge of the mouse midbrain neural tube, which are all inconsistent with a contractile hinge mechanism being employed in this tissue folding process. We agree with the reviewer that “cell shape does not always match [acto]myosin contractility levels, because cell shape depends on extrinsic, as well as intrinsic forces” (Line 147-149). We also agree that anticorrelation of actomyosin density and apical cell area does not per se argue against the contractile hinge model and will amend our language to be clearer. We will also further elaborate in the discussion on potential extrinsic factors that may lead to the observed cell behaviors at the midline.

      Statistical robustness of laser ablation results (Figure 4). The differences in recoil velocity between regions appear small, with substantial overlap between the distributions. In addition, the sample sizes for lateral versus midline ablations appear unequal (with visibly more data points in the lateral condition). These factors raise concerns about the robustness and statistical significance of the reported differences, which should be addressed more carefully.

      In Figure 4E, we show initial recoil velocities binned only by region (lateral vs. midline) and report 3.03 μm/s vs. 2.40 μm/s, a 26% difference between the two regions. We then show in Figure 4G that by considering another relevant variable, sex, we find initial recoil to be 3.15 μm/s vs. 2.30 μm/s, a 37% difference, in females and 2.68 μm/s vs. 2.57 μm/s, a 4% difference, in males. We go on to show in Figure 5L that, within the lateral region, recoils also vary by direction, with a 38% difference. Ultimately, the final conclusions regarding tissue tension that we present in our model are derived from the most finely disaggregated data in Figure 5. Our goal in presenting a stepwise disaggregation of the data was to demonstrate which variables had the greatest impact on the variance within our data set. We agree with the reviewer that this data set warrants a more precise statistical analysis that accounts for the complexity and multitude of variables that can influence our conclusions. In addition to the power analysis described in comment #10, we plan to conduct a mixed-effect model analysis of our data that considers factors including sex, age, cut direction, cut region, cut number, and embryo to determine which factors explain the most variance in the population. We will add this analysis as a supplement to Figure 4 alongside a description of the tests performed in the Statistical Analysis section of the methods. We will also adjust our language in the text to clearly state the limitations of the data as presented and qualify conclusions as appropriate.
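
      For readers who wish to check the percentages quoted above, the short sketch below reproduces them from the stated mean recoil velocities. It assumes the difference is expressed relative to the midline value; this denominator is an inference from the numbers rather than something stated in the text, and the function name is hypothetical.

```python
def percent_difference(lateral, midline):
    """Lateral-vs-midline difference as a percentage of the midline value.

    Using the midline as the denominator is an assumption; it is the
    convention that reproduces the percentages quoted in the response.
    """
    return 100.0 * (lateral - midline) / midline


# Mean initial recoil velocities (um/s) quoted in the response: (lateral, midline).
comparisons = {
    "all embryos (Fig. 4E)": (3.03, 2.40),
    "females (Fig. 4G)": (3.15, 2.30),
    "males (Fig. 4G)": (2.68, 2.57),
}

for label, (lateral, midline) in comparisons.items():
    print(f"{label}: {percent_difference(lateral, midline):.0f}% difference")
```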

      Speculative statement regarding anisotropic tension in males. Line 278: "We believe that both sexes demonstrate anisotropic tension, given that males have cell aspect ratios and orientations in the lateral neural folds similar to females." This statement is speculative. Either anisotropic tension in males should be directly measured and reported, or this statement should be removed.

      As discussed in comment #15, although we observe a difference in the mean initial recoil velocities in Figure 5L, we are unable to detect a statistically significant difference. Ablations were conducted blinded to embryo sex, but fewer male embryos were suitable for ablation because males develop faster than their female littermates (Seller MJ. and Perkins-Cole KJ. (1987) J. Reprod. Fert.). We were therefore unable to obtain more males in our data set. At present we do not have the resources to perform additional laser ablations to supplement the existing data set. We will instead perform a power analysis for our anisotropy measurements in the lateral region of the tissue to determine whether: 1) we have a sample size large enough to detect a biologically meaningful difference with suitable power, 2) the sample size required to detect the observed difference is so large that the difference would not be biologically meaningful, or 3) we do not have a sample size large enough to detect a difference confidently. With the results of this analysis, we will amend our language in the text to reflect the most accurate claims that can be made.
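
      To illustrate the kind of calculation a power analysis involves, the sketch below estimates the sample size per group needed for a two-sided, two-sample comparison of means. It is a simplified example under stated assumptions, not the planned analysis: it uses the normal approximation (which slightly underestimates the sample size relative to a t-based calculation), it assumes equal group sizes and equal variances, and the effect sizes shown are hypothetical rather than taken from the data.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size_d, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided two-sample test.

    effect_size_d: standardized difference in means (Cohen's d), i.e.
                   (mean_1 - mean_2) / pooled standard deviation.
    Uses the normal approximation n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2.
    """
    if effect_size_d <= 0:
        raise ValueError("effect size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)


# Hypothetical standardized effect sizes, for illustration only.
for d in (0.5, 0.8, 1.0):
    print(f"d = {d}: about {n_per_group(d)} ablations per group")
```

      Comparing the required sample size for the observed effect with the number of ablations actually available is what would distinguish between the three outcomes listed above.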

      Reviewer #2 (Significance (Required)):

      This study provides high-quality measurements of apical cell geometry, actomyosin organization, and inferred tension in the mouse neural epithelium. However, the lack of direct perturbations, mechanical modeling, and quantitative analysis of three-dimensional tissue deformation limits the strength of the mechanistic conclusions. Addressing these gaps would substantially strengthen the manuscript and clarify the causal role of apical tension patterns in neural fold formation. At the end of the day, the authors suggest a hypothesis that is not well supported by their data, which is of high quality.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary

      This manuscript by De La O et al addresses a long-standing question of how actomyosin contributes mechanically to cranial neural tube elevation in the mouse, a system in which classical midline contractile hinge models appear insufficient. The authors develop an image-processing and analysis pipeline that enables reconstruction and quantitative analysis of the apical actomyosin network across the large, curved dorsal surface of the mouse brain neuroepithelium. Using this approach, combined with laser ablation-based tension measurements in live embryos, they report a medio-lateral gradient of apical cell area and an inverse gradient of actomyosin density. Contrary to contractile hinge models described in frog, chick and invertebrate systems, they find that the midline exhibits low, isotropic tension, while the lateral neural folds show higher anisotropic apical tension, consistent with their proposal of a "lateral tension" mechanism for neural tube elevation.

      The work provides an important reframing of actomyosin function in mammalian cranial neurulation, supported by extensive quantitative imaging and mechanical measurements. The finding that lateral, rather than midline, actomyosin networks dominate tissue tension is compelling and helps reconcile previous observations that midline hinge formation in mouse can proceed despite actomyosin perturbation. The study is technically sophisticated and addresses a biologically important process with clear relevance to neural tube defect etiology. However, several aspects of the statistical treatment, interpretation of laser ablation data, and mechanistic framing require clarification or tempering to fully support the authors' conclusions.

      Major comments

      Statistical unit and pseudo-replication in cell-based analyses (Figures 2-3)

      In Figures 2 and 3, it is unclear whether statistical comparisons were performed at the level of individual cells or embryos. Because cells are nested within embryos, treating cells as independent observations raises concerns about pseudo-replication and inflated statistical significance, particularly for sex-dependent effects. While the color-coded maps are visually compelling, they may overstate confidence in differences between conditions if embryo-to-embryo variability is not explicitly accounted for.

      Clarification is needed as to whether statistical testing was performed on embryo-level summary values (e.g., one value per embryo per positional bin), or whether hierarchical or mixed-effects models were used with embryo treated as a random effect. Providing embryo-level summary plots would also help readers assess inter-embryo variability. Addressing this point is important for confidence in both the reported medio-lateral gradients and the sex differences.

      We agree with the reviewer that it is inappropriate to calculate statistics based on measurements of individual cells. As indicated under the ‘Statistical Analysis and Figure Assembly’ section of our methods, “For fixed images, cell shape and protein intensity analysis (Figure 2H-J, Figure 3E-H, Figure 5E-H), N = 5 embryos for all conditions and n, or the number of cells in each 10% bin, is ≥ 150 cells for each embryo” (Line 556 – 558). The average and SD between embryos are shown in these plots and are calculated at the embryo level, not the cell level. We chose to consolidate this information in the methods section because the same data set is used across the three figures. We will add a line to the figure captions stating that N values for all experiments can be found in this section of the methods. We will also provide supplementary plots showing the bin averages for each individual embryo, color-coded by embryo, to show the distribution of the data set.
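
      The distinction between embryo-level and cell-level statistics can be sketched as follows. This is a minimal illustration with made-up numbers and a hypothetical data layout (one tuple per cell), not the analysis code: each embryo contributes a single mean per positional bin, and the plotted average and SD are then computed across those embryo means.

```python
from collections import defaultdict
from statistics import mean, stdev


def embryo_level_summary(cells):
    """Summarize per-cell measurements with the embryo as the statistical unit.

    cells: iterable of (embryo_id, bin_index, value) tuples (hypothetical layout).
    Returns {bin_index: (mean_of_embryo_means, sd_of_embryo_means, n_embryos)}.
    """
    # Step 1: collect cell values for each (embryo, bin) pair.
    per_embryo_bin = defaultdict(list)
    for embryo_id, bin_index, value in cells:
        per_embryo_bin[(embryo_id, bin_index)].append(value)

    # Step 2: reduce each embryo to one mean per bin.
    per_bin = defaultdict(list)
    for (_embryo_id, bin_index), values in per_embryo_bin.items():
        per_bin[bin_index].append(mean(values))

    # Step 3: average and SD across embryos, not across cells.
    return {
        bin_index: (
            mean(embryo_means),
            stdev(embryo_means) if len(embryo_means) > 1 else 0.0,
            len(embryo_means),
        )
        for bin_index, embryo_means in per_bin.items()
    }


# Made-up example: embryo "e2" contributes more cells than "e1" in bin 0.
example = [
    ("e1", 0, 10.0), ("e1", 0, 20.0),
    ("e2", 0, 30.0), ("e2", 0, 30.0), ("e2", 0, 30.0), ("e2", 0, 30.0),
]
print(embryo_level_summary(example))
```

      In this example the pooled cell-level mean would be 25.0, whereas the embryo-level mean is 22.5, because the embryo with more segmented cells no longer dominates the estimate.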

      Interpretation of actomyosin density as a proxy for contractility (Figure 3)

      The descriptive correlation between apical cell area and actomyosin density is clear and consistent. However, actomyosin abundance alone does not necessarily equate to force generation, particularly in the absence of measurements of myosin activation state (e.g., pMLC), actomyosin dynamics, or direct perturbations linking actomyosin levels to mechanical output. Although the authors appropriately note that cell shape does not always reflect intrinsic contractility, actomyosin density is nevertheless used to argue against a contractile hinge mechanism.

      While the subsequent laser ablation experiments address tissue tension more directly, the mechanistic conclusions drawn from actomyosin density measurements alone would benefit from more careful qualification. Tempering language that equates actomyosin enrichment with contractile output, or explicitly acknowledging these limitations, would strengthen the interpretation.

      It is largely believed that apical pools of actomyosin are active and that apical localization of actomyosin is dependent on activation. Shroom3, an actin-binding protein, is localized to the apical adherens junctions in the neural tube (Haigo SL., et al. (2003) Curr. Biol., Hildebrand JD. and Soriano P. (1999) Cell), where it can recruit Rho kinases (ROCKs) that in turn phosphorylate and activate Myosin IIB (Nishimura T. and Takeichi M. (2008) Dev.). Mutations in Shroom3 lead to neural tube closure defects, and its overexpression in the neural tube can induce apical constriction and increased apical accumulation of Myosin II (Haigo SL., et al. (2003) Curr. Biol., Hildebrand JD. (2005) J. Cell Sci.). In the mouse neural tube, Myosin IIB intensity is greater in cells that can apically constrict than in those that cannot constrict (Galea GL., et al. (2021) Nat Commun). Additionally, inhibition of ROCK reduces apical tension, presumably by reduction of activated Myosin II (Butler MB., et al. (2019) J. Cell Sci.). We agree with the reviewer’s assessment that to definitively state that the apical pools of Myosin IIB and F-actin are promoting apical contractility, a demonstration of the phosphorylation state of the Myosin II regulatory light chain (pMLC) or observations/perturbations in live embryos is necessary. We will adjust our language to reflect this limitation. We will also provide information on the relationship between apically localized actomyosin and contractility.

      Statistical and biological independence of laser ablation measurements (Figures 4-5)

      The Methods indicate that 155 laser ablations were analyzed across 71 embryos, implying that multiple ablations were performed per embryo. It would be helpful to clarify how this hierarchical data structure was handled statistically. Specifically, were recoil velocities averaged per embryo, paired with embryos for ML vs. RC comparisons, or analyzed using hierarchical/mixed-effects models?

      Our laser ablation data set captures variables including embryo sex, age, cut location, cut direction, and cut number. Therefore, we did not feel it appropriate to average recoils within the same embryo, as these cuts were intentionally made in different regions (lateral vs. midline) or in different orientations (i.e., a rostral-caudal cut and a midline-lateral cut on opposite lateral folds), and our analysis has shown that averaging them would obscure potential differences. Ablations were far apart from each other, and we checked that ablation order did not predict changes in recoil. However, we agree with the reviewer that this data set warrants a more precise statistical analysis that accounts for the complexity of variables potentially influencing initial recoil velocities. As discussed in comment #27, we plan to conduct a mixed-effect model analysis of our data that considers the above and add this analysis as a supplement to Figure 4. We will include a description of this in the methods and adjust our language in the text to clearly state the limitations of the data as presented and qualify conclusions as appropriate.

      In addition, embryos were subjected to up to 5 ablations within a short time window. Because laser ablation disrupts tissue integrity and can induce rapid cytoskeletal remodeling, it is unclear whether later ablations represent independent measurements of the native tension state. Clarification is needed regarding whether the authors tested for effects of ablation order (e.g., first vs. later cuts), ensured sufficient spatial separation between ablation sites, or verified that repeated ablations did not systematically alter recoil measurements. Demonstrating that initial recoil velocity is independent of cut number would substantially strengthen confidence in the mechanical conclusions.

      We agree with the reviewer that laser ablations cause disruptions to tissue, and that these disruptions can impact the results of additional ablations performed near the site of prior ablations. The average embryo in our data set has three ablations: one on each neural fold and one at the midline, separated from each other by hundreds of µm. In embryos that had more than three ablations, the additional cuts were also made far away from each other (they were performed in the hindbrain rhombomeres, at rhombomere boundaries, at the neuroepithelium and surface ectoderm boundary, or at the zipper point, but n numbers of these are insufficient for analysis). We will supplement the methods text describing the laser ablations to clarify this for readers. Additionally, after an ablation, displacement is not detectable further than 3-5 cell lengths away from the cut, even several seconds post ablation. We will provide visual examples of these cuts after ablation to demonstrate this phenomenon. As discussed in comments #27 and #32, we will also perform mixed-effect modeling to determine whether cut number impacts observed initial recoil velocities. We will also provide plots demonstrating relevant examples of these comparisons (e.g., sequential lateral cuts made in the same direction).

      Interpretation of sex-dependent tension differences (Figures 4-5)

      Figure 4 shows a clear lateral-greater-than-midline tension difference in females, whereas this pattern is not detected in males under initial analysis. Later, Figure 5 reveals directional anisotropy in the lateral neural folds of both sexes. As currently framed, this creates some ambiguity regarding whether the proposed lateral tension mechanism is sex-specific, sex-biased in magnitude, or sex-general but masked by directional averaging in males.

      Clarifying this distinction, both in figure presentation and in the text, would strengthen the mechanistic interpretation and prevent confusion. In particular, it would be helpful to more clearly explain how directional anisotropy reconciles the apparent absence of regional tension differences in males in Figure 4.

      We appreciate the reviewer taking the time to indicate this point of confusion. We ultimately conclude that the lateral tension model of neural tube elevation is agnostic of sex. Though there are nuanced differences in some of the details regarding Myosin IIB density, midline apical constriction, and tension anisotropy, we do not believe these differences would fundamentally change the mechanical model used between sexes. With specific regard to masking of the lateral neural fold tension in males, we briefly address this in the discussion: “The averaging of [Rostral-Caudal] and [Midline-Lateral] [Initial Recoil Velocities] likely masked tension differences between the midline hinge and lateral neural folds, creating the false impression that males did not have high tension on the lateral neural folds” (Line 280-282). We will adjust the text in the results and discussion sections to clearly indicate that our lateral tension model applies to both sexes, though some differences in specific details exist, and that averaging may have led to the result in Figure 4G.

      Causal overreach in mechanistic interpretation of anisotropic tension

      While the laser ablation data convincingly demonstrates spatial and directional differences in recoil consistent with patterned mechanical anisotropy, the manuscript frequently treats anisotropic apical tension as a mechanistic explanation for neural tube elevation. The presented experiments do not directly test whether anisotropic apical tension is necessary or sufficient for tissue bending, nor whether isotropic tension at the midline plays a causal role. Initial recoil velocity reflects not only pre-existing tension but also tissue geometry and viscoelastic properties, which may differ between midline and lateral regions.

      As such, statements suggesting that anisotropic lateral tension "explains" neural fold elevation should be tempered or reframed. The data strongly support spatial patterning of mechanical properties but do not yet establish causal primacy. Recasting the model as a mechanically consistent framework rather than a definitive mechanism would better align conclusions with the data.

      Our lateral tension model proposes that a regionalized difference in tension, with high tension in the lateral neural folds and low tension at the midline, is needed to enable neural tube elevation and ultimate closure. We agree with the reviewer: our work demonstrates that the results of our laser ablation experiments, along with measurements of cell shapes and protein density, are consistent with the lateral tension model that we propose. Our model is also supported by past work showing that perturbations that disrupt actomyosin contractility lead to defects in brain neural tube elevation and closure but not in midline hinge formation. For example, chemical perturbation of actin polymerization with Cytochalasin D (Ybot-Gonzalez P. and Copp AJ. (1999) Dev. Dyn.), and genetic perturbations of Shroom3, which apically localizes actomyosin (Hildebrand JD. and Soriano P. (1999) Cell), or Fhod3, which promotes actin polymerization (Sulistomo HW, et al. (2019) J. Biol. Chem.), all cause brain neural tube closure defects while still permitting formation of a midline hinge. However, since we do not directly perturb tension, we have only demonstrated consistency rather than causality or sufficiency. We will adjust and temper our language accordingly in the relevant sections of the results and discussion.

      Minor comments

      Manuscript length and clarity

      The manuscript is longer and more complex than necessary for its central message. Several sections of the Results, particularly methodological validation and somite-stage stratification, could be streamlined.

      We agree with the reviewer and will continue editing the manuscript, prioritizing clarity, brevity, and precision of language so that readers are able to quickly understand the key points of the manuscript.

      Sex differences section

      The section on sex differences is interesting but somewhat tangential. Clarifying whether these findings are intended as mechanistic insight or observational motivation for future work could improve focus.

        We intended this section to offer perspectives that inform and motivate future work to continue to track, analyze, and report on sex differences during development. We will edit this section of the discussion to improve clarity and brevity so that the reader can easily acquire this takeaway. Sex differences in the penetrance of exencephaly are an active area of research, and our manuscript provides the first cell-level measurements, which will guide the field in disaggregating future analyses by embryo sex.
      
    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      This manuscript by De La O et al addresses a long-standing question of how actomyosin contributes mechanically to cranial neural tube elevation in the mouse, a system in which classical midline contractile hinge models appear insufficient. The authors develop an image-processing and analysis pipeline that enables reconstruction and quantitative analysis of the apical actomyosin network across the large, curved dorsal surface of the mouse brain neuroepithelium. Using this approach, combined with laser ablation-based tension measurements in live embryos, they report a medio-lateral gradient of apical cell area and an inverse gradient of actomyosin density. Contrary to contractile hinge models described in frog, chick and invertebrate systems, they find that the midline exhibits low, isotropic tension, while the lateral neural folds show higher anisotropic apical tension, consistent with their proposal of a "lateral tension" mechanism for neural tube elevation.

      The work provides an important reframing of actomyosin function in mammalian cranial neurulation, supported by extensive quantitative imaging and mechanical measurements. The finding that lateral, rather than midline, actomyosin networks dominate tissue tension is compelling and helps reconcile previous observations that midline hinge formation in mouse can proceed despite actomyosin perturbation. The study is technically sophisticated and addresses a biologically important process with clear relevance to neural tube defect etiology. However, several aspects of the statistical treatment, interpretation of laser ablation data, and mechanistic framing require clarification or tempering to fully support the authors' conclusions.

      Major comments

      1. Statistical unit and pseudo-replication in cell-based analyses (Figures 2-3) In Figures 2 and 3, it is unclear whether statistical comparisons were performed at the level of individual cells or embryos. Because cells are nested within embryos, treating cells as independent observations raises concerns about pseudo-replication and inflated statistical significance, particularly for sex-dependent effects. While the color-coded maps are visually compelling, they may overstate confidence in differences between conditions if embryo-to-embryo variability is not explicitly accounted for.

      Clarification is needed as to whether statistical testing was performed on embryo-level summary values (e.g., one value per embryo per positional bin), or whether hierarchical or mixed-effects models were used with embryo treated as a random effect. Providing embryo-level summary plots would also help readers assess inter-embryo variability. Addressing this point is important for confidence in both the reported medio-lateral gradients and the sex differences.

      2. Interpretation of actomyosin density as a proxy for contractility (Figure 3) The descriptive correlation between apical cell area and actomyosin density is clear and consistent. However, actomyosin abundance alone does not necessarily equate to force generation, particularly in the absence of measurements of myosin activation state (e.g., pMLC), actomyosin dynamics, or direct perturbations linking actomyosin levels to mechanical output. Although the authors appropriately note that cell shape does not always reflect intrinsic contractility, actomyosin density is nevertheless used to argue against a contractile hinge mechanism.

      While the subsequent laser ablation experiments address tissue tension more directly, the mechanistic conclusions drawn from actomyosin density measurements alone would benefit from more careful qualification. Tempering language that equates actomyosin enrichment with contractile output, or explicitly acknowledging these limitations, would strengthen the interpretation.

      3. Statistical and biological independence of laser ablation measurements (Figures 4-5) The Methods indicate that 155 laser ablations were analyzed across 71 embryos, implying that multiple ablations were performed per embryo. It would be helpful to clarify how this hierarchical data structure was handled statistically. Specifically, were recoil velocities averaged per embryo, paired within embryos for ML vs. RC comparisons, or analyzed using hierarchical/mixed-effects models?

      In addition, embryos were subjected to up to 5 ablations within a short time window. Because laser ablation disrupts tissue integrity and can induce rapid cytoskeletal remodeling, it is unclear whether later ablations represent independent measurements of the native tension state. Clarification is needed regarding whether the authors tested for effects of ablation order (e.g., first vs. later cuts), ensured sufficient spatial separation between ablation sites, or verified that repeated ablations did not systematically alter recoil measurements. Demonstrating that initial recoil velocity is independent of cut number would substantially strengthen confidence in the mechanical conclusions.

      4. Interpretation of sex-dependent tension differences (Figures 4-5) Figure 4 shows a clear lateral-greater-than-midline tension difference in females, whereas this pattern is not detected in males under initial analysis. Later, Figure 5 reveals directional anisotropy in the lateral neural folds of both sexes. As currently framed, this creates some ambiguity regarding whether the proposed lateral tension mechanism is sex-specific, sex-biased in magnitude, or sex-general but masked by directional averaging in males.

      Clarifying this distinction, both in figure presentation and in the text, would strengthen the mechanistic interpretation and prevent confusion. In particular, it would be helpful to more clearly explain how directional anisotropy reconciles the apparent absence of regional tension differences in males in Figure 4.

      5. Causal overreach in mechanistic interpretation of anisotropic tension While the laser ablation data convincingly demonstrate spatial and directional differences in recoil consistent with patterned mechanical anisotropy, the manuscript frequently treats anisotropic apical tension as a mechanistic explanation for neural tube elevation. The presented experiments do not directly test whether anisotropic apical tension is necessary or sufficient for tissue bending, nor whether isotropic tension at the midline plays a causal role. Initial recoil velocity reflects not only pre-existing tension but also tissue geometry and viscoelastic properties, which may differ between midline and lateral regions.

      As such, statements suggesting that anisotropic lateral tension "explains" neural fold elevation should be tempered or reframed. The data strongly support spatial patterning of mechanical properties but do not yet establish causal primacy. Recasting the model as a mechanically consistent framework rather than a definitive mechanism would better align conclusions with the data.
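
      The embryo-level aggregation raised in major comments 1 and 3 can be illustrated with a minimal sketch. It assumes measurements are available as (embryo ID, positional bin, value) records; the function name, bin labels, and numbers below are hypothetical and are not taken from the manuscript. Each embryo contributes one mean per bin, so the sample size for any downstream test is the number of embryos rather than the number of cells or ablations.

```python
from collections import defaultdict
from statistics import mean


def embryo_level_means(records):
    """Collapse cell-level values to one mean per (embryo, positional bin).

    `records` is an iterable of (embryo_id, positional_bin, value) tuples.
    Returns {positional_bin: [per-embryo mean, ...]}, ordered by embryo ID.
    """
    grouped = defaultdict(list)
    for embryo_id, positional_bin, value in records:
        grouped[(embryo_id, positional_bin)].append(value)

    per_bin = defaultdict(list)
    for embryo_id, positional_bin in sorted(grouped):
        per_bin[positional_bin].append(mean(grouped[(embryo_id, positional_bin)]))
    return dict(per_bin)


# Hypothetical apical areas: seven cells drawn from two embryos.
records = [
    ("E1", "midline", 40.0), ("E1", "midline", 44.0),
    ("E1", "lateral", 20.0), ("E1", "lateral", 22.0),
    ("E2", "midline", 50.0),
    ("E2", "lateral", 30.0), ("E2", "lateral", 26.0),
]
summary = embryo_level_means(records)
```

      Statistical comparisons would then be run on the per-embryo lists in `summary`. A mixed-effects model with embryo as a random effect is the alternative the comment mentions; it retains the cell-level data but is not shown here.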

      Minor comments

      1. Manuscript length and clarity The manuscript is longer and more complex than necessary for its central message. Several sections of the Results, particularly methodological validation and somite-stage stratification, could be streamlined.
      2. Sex differences section The section on sex differences is interesting but somewhat tangential. Clarifying whether these findings are intended as mechanistic insight or observational motivation for future work could improve focus.

      Significance

      General assessment.

      This study provides a technically sophisticated and carefully executed analysis of the mechanical organization of the mouse cranial neural epithelium during neural tube elevation. Its principal strengths lie in the development of a large-scale apical imaging and reconstruction pipeline, the quantitative mapping of medio-lateral gradients in cell shape and actomyosin organization, and the use of laser ablation to directly probe regional and directional tissue tension in live embryos. Together, these approaches allow the authors to address a long-standing discrepancy between classical contractile hinge models and prior observations in mouse neurulation. The main limitations of the study relate not to data quality, but to interpretation: several conclusions rely on correlational relationships between actomyosin enrichment, cell shape anisotropy, and tissue tension, and the mechanistic language at times exceeds what is directly tested. Clarification of statistical structure and tempering of causal claims would substantially strengthen the work.

      Advance.

      Relative to prior studies of neural tube closure in frog, chick, and invertebrate systems, this work advances the field by providing direct, spatially resolved measurements of tissue tension in the mouse cranial neural tube. The identification of low, largely isotropic tension at the midline and higher, anisotropic tension in the lateral neural folds represents a conceptual advance that reframes how actomyosin contributes to mammalian neurulation. While the study does not establish causality between anisotropic apical tension and tissue bending, it offers a mechanically consistent alternative to contractile hinge models and provides a valuable framework for interpreting species-specific differences in neural tube morphogenesis. The advance is therefore primarily conceptual and technical, rather than mechanistic in the strict causal sense.

      Audience.

      This work will be of strong interest to a specialized but broad audience spanning developmental biology, epithelial mechanics, morphogenesis, and neural tube defect research. The imaging and analytical approaches are likely to be useful beyond neurulation, particularly for investigators studying force patterning in large, curved epithelial tissues. With appropriate framing, the study should also be of interest to researchers investigating the biomechanical basis of congenital defects, even if its immediate implications are primarily basic rather than translational.

      Field of expertise.

      My expertise lies in epithelial morphogenesis, tissue mechanics, actomyosin-based force generation, and quantitative imaging in developmental systems. I do not have specific expertise in clinical aspects of neural tube defect diagnosis or treatment.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The manuscript investigates the role of apical constriction and actomyosin organization in shaping the mouse brain neural epithelium during neural tube elevation, with particular emphasis on sex-specific differences. The authors develop an imaging and analysis pipeline to reconstruct the apical surface of the neural plate in three dimensions and perform quantitative measurements of apical cell area, actin, and myosin IIB distributions. Targeted laser ablation experiments are used to infer regional tissue tension.

      The main findings can be summarized as follows. First, the authors identify a mediolateral gradient in apical cell area, with larger cells at the midline and smaller cells on the lateral neural folds, which inversely correlates with actomyosin density. Laser ablation experiments suggest that apical tension is lower and isotropic at the midline, whereas it is higher and anisotropic on the lateral folds, particularly in females. Second, sex-dependent differences in apical cell area, constriction rates, and actomyosin levels are reported at early somite stages, preceding previously described sex biases in neural tube defects.

      The experimental work is technically solid, and the imaging and quantification pipeline represents a useful advance for analyzing large, curved epithelial surfaces. However, the study feels incomplete in its current form. Despite addressing neural tube elevation, the manuscript does not provide a comprehensive analysis of the folding process itself. Key aspects such as three-dimensional tissue morphology, curvature evolution, or global shape changes of the neural folds are not quantified. In addition, other potentially relevant cellular behaviors, such as proliferation, cell rearrangements, or contributions from neighboring tissues, are not examined, nor are they compared systematically between sexes.

      Conceptually, the study focuses narrowly on correlations between apical cell area, actomyosin density, and inferred tension. While these measurements are carefully performed, the relationship between differential actomyosin contractility and three-dimensional tissue folding remains largely descriptive. No mechanical model or simulation framework is provided to link changes in actomyosin organization and cell shape to the emergence of neural folds and hinge formation. As a result, it is difficult to assess whether the measured differences in tension (on the order of ~40%) are sufficient to account for the proposed mechanical behavior of the tissue.

      The central hypothesis advanced by the authors is that a relatively "soft" midline, flanked by stiffer, tension-bearing lateral folds, facilitates hinge formation during brain neurulation. However, this hypothesis is not directly tested by perturbation. For example, experimentally increasing contractility or stiffness at the midline (e.g., via optogenetic activation of apical constriction machinery) would provide a more direct test of causality. As it stands, the data demonstrate correlation rather than necessity or sufficiency.

      Relatedly, alternative interpretations are not fully addressed. Large apical cell areas and low actomyosin levels at the midline could arise as a consequence of tissue geometry, contact with underlying structures such as the notochord, or extrinsic mechanical constraints, rather than being the primary cause of hinge formation. Similarly, anisotropic stresses generated at the tissue or embryo scale could align cells and actomyosin cables, producing the observed patterns without requiring locally specified apical tension differences as the initiating mechanism. The manuscript does not clearly distinguish whether apical tension asymmetries are a driver of folding or an emergent outcome of folding dynamics.

      Finally, while the identification of sex differences is intriguing, it remains unclear what mechanistic insight is gained beyond establishing that such differences exist. The functional consequences of these differences for neural tube closure, robustness, or failure are not explored, nor is it clear how they integrate into the proposed lateral tension model.

      In summary, this study provides high-quality measurements of apical cell geometry, actomyosin organization, and inferred tension in the mouse neural epithelium. However, the lack of direct perturbations, mechanical modeling, and quantitative analysis of three-dimensional tissue deformation limits the strength of the mechanistic conclusions. Addressing these gaps would substantially strengthen the manuscript and clarify the causal role of apical tension patterns in neural fold formation.

      Major Comments

      1. Figure quality. Figure 1 contains very low-resolution images, which makes it difficult to evaluate the segmentation quality and tissue morphology. Higher-resolution versions should be provided.
      2. Cell segmentation strategy and validation. The authors segment cell areas using Myosin II and F-actin signals. This approach may introduce inaccuracies, as actomyosin cables can traverse the apical surface of individual cells and do not always coincide with cell boundaries. Segmentation based on junctional markers such as ZO-1 may be more appropriate. At minimum, the authors should provide a quantitative validation of segmentation accuracy, for example by overlaying segmentation results on raw images together with a nuclear marker (e.g., DAPI or H2B-GFP), to demonstrate that the number of segmented cells corresponds to the number of nuclei.
      3. Lack of cross-sectional views of neural tube morphology. The manuscript would benefit from the inclusion of cross-sectional images of the neural tissue at different developmental stages. This would serve two purposes: (i) to demonstrate that the authors have a comprehensive understanding of the full three-dimensional folding process during neural tube closure, including medial and lateral hinge formation, and (ii) to allow readers to visualize the tissue geometry corresponding to the analyzed projection datasets (e.g., at 5 ss and 8 ss).
      4. Sex-specific differences in overall neural plate morphology. The authors report that at 5 ss, males consistently have larger apical cell areas than females. It is unclear whether this difference reflects a global difference in neural plate morphology. Showing representative images of female and male neural plates would help readers directly assess whether there are overt morphological differences beyond those revealed by quantitative analysis.
      5. Cell number analysis. The authors state, based on prior literature, that cell numbers do not change between 5 and 8 ss. Given that the tissue is already segmented in the current study, this claim should be directly verified using the authors' own data. This analysis should be straightforward and would strengthen the conclusions.
      6. Relation between tissue curvature and cellular properties. It would be highly informative to extract the three-dimensional morphology of the neural plate, in particular its curvature, and examine how curvature correlates with two-dimensional cell anisotropy, apical area, and F-actin/Myosin intensity. For example, at 8 ss the authors report a U-shaped dependence of cell area along the mediolateral axis. How does this pattern relate to local tissue curvature?
      7. Visualization of sex differences in medial actin levels (Figure 3). In Figure 3, the reported female-male difference in medial actin levels would benefit from visualization of the raw data. A zoomed-in inset of the midline region, shown separately for females and males, would help substantiate this claim.
      8. Typographical error. Line 143: please correct "cell are" to "cell area".
      9. Quantitative correlation analysis between cell area and actomyosin. The authors qualitatively discuss the relationship between cell area dynamics and actomyosin levels. It would strengthen the analysis to directly compute and report correlations between these variables, and to explicitly test whether actin and myosin levels are anti-correlated with apical cell area.
      10. Interpretation of anti-correlation and contractile hinge mechanism. In lines 143-157, the authors state that the observed anti-correlation between actomyosin and cell area argues against a contractile hinge mechanism. However, this anti-correlation could also suggest that apical cell area is determined by local mechanical or geometric constraints rather than by local actomyosin contractility. The authors should clarify and discuss this alternative interpretation.
      11. Statistical robustness of laser ablation results (Figure 4). The differences in recoil velocity between regions appear small, with substantial overlap between the distributions. In addition, the sample sizes for lateral versus midline ablations appear unequal (with visibly more data points in the lateral condition). These factors raise concerns about the robustness and statistical significance of the reported differences, which should be addressed more carefully.
      12. Speculative statement regarding anisotropic tension in males. Line 278: "We believe that both sexes demonstrate anisotropic tension, given that males have cell aspect ratios and orientations in the lateral neural folds similar to females." This statement is speculative. Either anisotropic tension in males should be directly measured and reported, or this statement should be removed.
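
      The correlation requested in major comment 9 could be computed along the following lines. This is a minimal sketch, assuming one apical area and one actomyosin intensity value per positional bin or per embryo; the function name and the numbers are hypothetical placeholders, not data from the manuscript.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Hypothetical values: apical area rises while myosin intensity falls.
apical_area = [10.0, 20.0, 30.0, 40.0]
myosin_intensity = [8.0, 6.0, 4.0, 2.0]
r = pearson_r(apical_area, myosin_intensity)
```

      A value of `r` near -1 would indicate the anti-correlation described in the text. Computing it on embryo-level summaries rather than pooled cells would avoid inflating the apparent strength of the relationship.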

      Significance

      This study provides high-quality measurements of apical cell geometry, actomyosin organization, and inferred tension in the mouse neural epithelium. However, the lack of direct perturbations, mechanical modeling, and quantitative analysis of three-dimensional tissue deformation limits the strength of the mechanistic conclusions. Addressing these gaps would substantially strengthen the manuscript and clarify the causal role of apical tension patterns in neural fold formation. Ultimately, the authors propose a hypothesis that is not well supported by their data, although the data themselves are of high quality.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      Folding is a major morphogenetic process that shapes tissues and organs in three dimensions. The mechanisms underlying tissue folding have been extensively explored and are often driven by actomyosin-based apical constriction. Here, the authors describe changes in cell geometry and mechanics during mouse neural tube formation. They build on quantitative fixed imaging and live junction ablation to extract cell geometry and junctional tension. These analyses are performed at different developmental stages and in both male and female embryos to propose a mechanical mechanism for neural tube elevation in the brain.

      Major comments

      The authors report quantitative data on cell geometry and junctional tension inferred from laser ablation. Overall, there are numerous statements that require stronger support from the experimental data. To substantiate several of their claims, the authors need to provide a larger number of data points (or at least comparable numbers across experimental conditions) for the tension measurements. Additional statistical analyses are required throughout to support the conclusions.

      Figure 1

      • Does the projection algorithm account for tissue curvature when computing cell geometrical parameters such as area and anisotropy?
      • The authors should provide information on the accuracy and reliability of the cell segmentation.

      Figure 2

      • The authors indicate that the rate of apical constriction differs between male and female embryos. However, apical sizes differ only at specific positions along the ML axis (Fig. 2H, I). The authors should provide statistical analyses for the rates shown in Fig. 2J. Are these rates significantly different between males and females, and between medial and lateral regions?
      • Please clearly state the main novelty of this study relative to the work published by Brooks et al.

      Figure 3

      • The authors need to provide statistical support for the claim that large midline cells exhibit reduced F-actin and Myosin IIB levels.
      • F-actin and Myosin IIB intensities should be plotted as a function of cell area to support the proposed anticorrelation between apical area and actomyosin levels.
      • Statistical analyses are missing to substantiate the increase in F-actin levels between stages ss5 and ss8.
      • Figure S3 should be supported by plots showing Myosin II and F-actin intensity as a function of position along the ML axis, together with appropriate statistics.

      Figure 4

      • The authors state that lateral tension in male embryos is not different from midline tension, yet the number of data points is much lower than in females. To support this claim, the number of ablations should be comparable across sexes. Is lateral tension different between males and females?
      • Similarly, the data in Fig. S4 used to claim no change in tension over time are not supported by sufficient data points. Would the medial and lateral tensions reported in Fig. 4G remain unchanged if the authors perform statistical analyses on 10-15 ablations per condition?

      Figure 5

      • The number of data points in Fig. 5J and L is insufficient to support claims of no difference. The only detectable difference arises in the comparison with a much higher sample size (Fig. 5L, ML vs RC). The authors conclude that males have higher ML tension than RC tension, but given the limited data this conclusion should be amended to "no detectable difference."

      Code availability

      The authors should provide access to the code used to generate the projections.

      Significance

      The authors propose a mechanical model for neural tube elevation based on analyses of cell geometry and tension at two developmental stages. The reported differences in cell geometry or actomyosin levels do not appear to explain the differences in geometry or tension suggested between male and female embryos. This raises questions about the relationship between these measurements and their relevance for understanding the mechanisms of neural tube elevation.

      If the major concerns outlined above are rigorously addressed, the manuscript will offer a valuable descriptive characterization of neural tube cell geometry and mechanical stress during morphogenesis. Such datasets could form a foundation for future studies investigating the mechanisms driving neural tube elevation.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Review Commons Refereed Preprint #RC-2022-01348

      Response to Reviewers

      Dear Editor,

      Thank you for allowing us to submit a revised draft of the manuscript "Nrf2 promotes thyroid development and hormone synthesis" to Review Commons. We appreciate and are grateful for the time and effort you and the reviewers dedicated to providing feedback on our manuscript. The insightful comments were very valuable in improving the quality of our paper.

      We apologize for the delay in submitting this revised version. Addressing the reviewers' insightful comments required substantial additional experimental work, including new in vivo analyses, chromatin accessibility profiling, and functional pathway interrogation. During the revision process, the PhD student who led this project completed her training and left the laboratory, requiring a reorganization of responsibilities and experimental efforts within the team to ensure rigorous completion of the requested studies. We appreciate your patience during this period and believe that the additional data have significantly strengthened the manuscript. In addition, a list of new data and experiments performed is shown below. Please see, in italics, the point-by-point response to the reviewers' comments and concerns. All figures, lines, and page numbers refer to the revised manuscript file.

      List of experiments performed following the reviewers' comments:

      • Tg staining in zebrafish wt and nrf2aΔ5 embryos - 6dpf (Supplementary Figure 4).
      • Immunostaining for Phalloidin in mESC-derived organoids.
      • ATAC sequencing for mESC-derived organoids (WT and Nrf2 KO; day 22).
      • AP-1 pathway inhibition and gene/protein expression assessments

      Reviewer #1

      (Evidence, reproducibility and clarity (Required)):

      This paper deals with the role of the transcription factor Nrf2 in the thyroid gland of zebrafish and in a thyroid organoid model. The subject is of relevance since Nrf2 is known to control the cellular response to oxidative stress and the thyroid is an organ where protection from oxidative stress is of major relevance, given the production of reactive oxygen species during thyroid hormone biosynthesis.

      The main result is that in the zebrafish (ZF) thyroid, Nrf2 appears to be important for thyroid hormone formation, since in late-stage ZF embryos deprived of Nrf2 the levels of the thyroid hormone T4 and of its precursor, iodinated thyroglobulin (Tg), are very much decreased. However, there is no significant decrease of thyroglobulin mRNA, although an impairment in the up-regulation of Tg by TSH could be observed. No effect is seen on the structure of the thyroid follicles and no developmental defect is observed, in contrast with the title of the paper.

      Conversely, in the mouse thyroid organoid model the absence of Nrf2 results in an impressive decrease of Tg mRNA and in impaired formation of thyroid follicles.

      The study is for the most part elegant and technically impeccable. The data are well presented and organised as far as the figures are concerned. However, much remains to be done on the interpretation and presentation of results. In addition, the text has been put together sloppily, with many typing and punctuation mistakes and sentences that are difficult to interpret. A revision of typing and syntax is absolutely needed.

        • Dear reviewer, we appreciate your positive and constructive comments on our manuscript (addressed below), and we apologize for the lack of clarity and the typos in the text. The grammar and text structure were improved following the comments, and additional experiments were performed to answer the open questions.

      Main concerns:
      1. The title of the paper needs to be changed. There is no evidence that there is a problem with thyroid development in ZF. The thyroid appears to be enlarged at the end of development, most likely as a consequence of increased TSH stimulation, but there is no developmental defect!

        • Dear reviewer, thank you for pointing it out. Indeed, the initial title did not represent the phenotype observed in both the zebrafish and murine thyroid organoid models. In this revised version the title has therefore been updated to reflect the effect of Nrf2/nrf2a on maturation rather than development. The new title is "The role of Nrf2/nrf2a in thyroid maturation and hormone synthesis in mammalian and non-mammalian models".
      2. There is an evident contradiction, in ZF, between the marginal, if any, decrease of Tg mRNA and the impressive decrease in T4 and iodinated Tg. This discrepancy, very interesting in my opinion, is never discussed. Perhaps the authors should look at the level of Tg protein. Is it possible that there is an increased degradation or some negative translational control in the absence of Nrf2? Alternatively, is it possible that there is a defect, yet unidentified, in the organification process? Certainly, to conclude, as the authors do on page 11, lines 236-237, that the defect in hormonogenesis depends on thyroglobulin production is, with the data presented, an unproven statement.
        • Dear reviewer, thank you for raising this question and suggesting experimental ways to tackle it. Following your comment, we performed Tg staining in zebrafish embryos. We observed similar levels of Tg protein in nrf2a KO, nrf2a heterozygous and WT embryos. The data have been added as Supplementary Figure 4 and the text has been updated (L. 175-179). These data ruled out our previous hypothesis that TH impairment would happen in response to lower Tg levels. Since the phenotype observed in nrf2a KO is similar to the one previously demonstrated by our team with the duox KO (Giusti et al., 2020), we hypothesize that duox could be implicated in dyshormonogenesis. Since duox enzymes are known for tightly controlling H2O2 production, an essential factor for T.H. synthesis (Carvalho and Dupuy, 2017), duox dysregulation could further induce oxidative stress and lead to hypothyroidism (L. 357-365). To assess duox expression in zebrafish we performed qPCR on pools of embryos and did not observe any clear change in duox levels in nrf2a KO compared to WT (results included in Fig. 2S). Since duox is not exclusively expressed in the thyroid, we also performed in situ hybridization; however, we never managed to obtain convincing results using this technique. Finally, although the tpo and duox genes are expressed in nrf2a KO embryos, we cannot confirm that the activity of those enzymes is preserved and that the T.H. machinery is functional (L. 192-195). Due to the limitations of performing functional assays in zebrafish, the mechanism behind the dyshormonogenesis phenotype remains an open question.
      3. The data on the transcriptional effect of Nrf2 in the mouse ES cell system do not really add much. Their major effect is to contribute to a lengthy discussion that would really benefit from a substantial reduction.

        • Dear reviewer, for this revised version of the manuscript we included new ATACseq data and combined them with the previously shown transcriptomic data to explore the molecular mechanisms by which Nrf2 loss drives the maturation phenotype in mESC-derived thyroid organoids (L. 298-340; Figure 6A-H). Using this approach we demonstrate that loss of Nrf2 causes significant changes in chromatin accessibility, which are strongly correlated with changes in the gene expression profile. We could also demonstrate that Tg expression is indeed impaired by a reduction in chromatin accessibility in the absence of Nrf2, and we identified key pathways/TFs regulated by Nrf2 that could play a role in driving the phenotype. More specifically, we identified increased mRNA expression and chromatin accessibility of genes associated with AP-1 pathway activation, such as Fos, Jun-b, and Stats (L. 322-334; Figure 6 and Supplementary Figure 6). Interestingly, studies have shown that Nrf2 and AP-1 proteins significantly overlap, regulating each other at several levels, including transcriptionally. Also, although Nrf2 is known to bind ARE sites and AP-1 to bind TRE sites, these sites often overlap, with the AP-1 site embedded in the ARE. These tight relations suggest shared feed-forward and feed-back circuits between NRF2 and AP-1 factors that contribute to their functioning. To further investigate whether AP-1 overexpression in Nrf2 KO-derived thyroid organoids has a compensatory effect or contributes to the phenotype observed, we performed AP-1 inhibition during the thyroid differentiation protocol. In Nrf2 KO cells differentiated in the presence of the SR11302 inhibitor (from day 7 to 22), Tg mRNA was partially reduced at the higher dose (10 µM), while Tg protein and Tg-I production were not clearly distinct from the control (L. 334-340; Supplementary Figure 7A-B). This suggests that AP-1 upregulation upon loss of Nrf2 might work as a compensatory mechanism; however, due to the lack of Tg expression, which is under the direct control of Nrf2, functionality is not recovered.
      4. More time should be devoted to explaining the substantial differences between the three systems studied (two in this paper, one in a previous paper published by partly the same authors), keeping in mind that studies in mice could be largely influenced by the genetic background.

        • Following the suggestion of the reviewer, we added a table (Table 7) summarizing the differences and similarities between the two models used in this study and the in vivo model from Ziros et al., 2018. We also discussed the three models more thoroughly in the discussion and added subtitles to make it clearer.

      Reviewer #1 (Significance (Required)):

      This paper is largely confirmatory of previous results obtained in a mouse Nrf2 KO model, whose main authors are also part of this study (Ziros et al., 2018). A clarification of the molecular defect in hormone production in ZF could add the novelty that this study might need.

        • Dear reviewer, indeed our study reinforces the effect of Nrf2 in regulating Tg expression in mice (not conserved in zebrafish). In addition, we demonstrate the transcriptional and chromatin accessibility changes promoted by the loss of Nrf2 in mouse thyroid cells. Also, the fact that Nrf2 KO ES cells do not efficiently form follicles in vitro is a very interesting and unexpected observation that reinforces the hypothesis that Tg secretion plays a role during early folliculogenesis. In zebrafish, despite the normal expression of the main thyroid markers, the defect in function could be explained by disruptions in duox and tpo activity, which would impair Tg iodination. Unfortunately, the limitations of zebrafish for functional studies keep this question open.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary

      In this manuscript Gillotay et al investigate and further establish a role for Nrf2 in thyroid physiology. Importantly, a role of Nrf2 in thyroid development is investigated in the Tg(tg:nlsEGFP) zebrafish model system developed by this group. This permits detailed and also live tracking of zebrafish thyroid development. By a CRISPR/Cas9 approach the authors establish a functional role for nrf2a in zebrafish thyroid development. Nrf2a loss-of-function in zebrafish leads to a hypothyroid phenotype that seems to recapitulate aspects of dyshormonogenesis with a slightly enlarged gland, increased tshb, strongly reduced T4 and iodinated TG (TG-I). Moreover, Tg synthesis is perturbed, only slightly during control conditions but pronouncedly during PTU challenge. By restoring nrf2a specifically in the zebrafish thyroid, the authors prove the phenotype to be cell autonomous to the thyroid.

      The authors then turn to the model system of functional thyroid follicular maturation from mESC that they have previously described. In this system Nrf2a KO leads to loss of TG expression and what the authors describe as an inability to form follicular units. Even though Nis is expressed, iodine organification is impaired, likely due to defective Tg production.

      Evidence, reproducibility and clarity

      Major comments

      This is a nicely written manuscript with mostly convincing results. The authors demonstrate laudable scientific rigor by verifying that the genetic modifications indeed have the expected effect (that nrf2 in the model systems that they create is indeed non-functional) and by using a rescue strain for the zebrafish model. In my opinion the data and the methods are presented in such a way that they can be reproduced. Generally, the conclusions are well supported by the results presented.

      The authors make a rather strong point of Nrf2 thyroid function as an evolutionary conserved mechanism and this might need some further underpinning.

        • Dear reviewer, we appreciate your positive comments on our manuscript. Concerning the evolutionarily conserved mechanisms driven by Nrf2, we discuss this aspect more thoroughly in the new version. Even if the lack of Nrf2/nrf2a drives the same phenotypical outcome in both models, we discuss the possibly distinct mechanisms by which it drives this phenotype. While in mESC-derived thyroid organoids we confirm the repressive effect of Nrf2 loss on Tg expression and consequently on folliculogenesis and TH production, in zebrafish our new data suggest another mechanism. As suggested by reviewer 1, Tg staining was performed under "physiological" conditions and we did not observe any change in Tg protein in nrf2a KO compared to WT embryos (Supplementary Figure 4). This ruled out the hypothesis of a conserved mechanism involving tg expression. However, as discussed in the new version, this phenotype could be related to dysfunctions in the activity of enzymes associated with Tg iodination, such as tpo and duox. Due to the challenges of performing functional and activity studies in zebrafish, this hypothesis could not be validated.

      Line 368. The authors state that their findings "...reinforce a recently published report on the role of Nrf2 in adult mouse thyroid physiology (Ziros et al., 2018)" and continue: "Although we did not analysed (sic!) the thyroid gland of adult nrf2a KO zebrafish...".

      Why were not adult zebrafish investigated? As some key results of Ziros et al on adult Nrf2 KO mice differ from those of the manuscript (e.g. TG iodination under non stressed conditions seems to be increased in mice but decreased in zebrafish) it would be highly desirable to know if the zebrafish phenotype is more similar to that of mice in later adulthood.

        • Dear reviewer, indeed assessing TH status in adult zebrafish would tell us whether a later compensatory effect occurs, which might be the case in mice. However, despite our wish to test it in adults, we faced several challenges: a. A large proportion of the nrf2a homozygous embryos die around 10dpf, which could be a consequence of the absence of THs. b. The few adults that managed to survive showed a much lower ability to produce eggs (20 eggs instead of hundreds), i.e. reduced fertility. Considering that the complete lack of TH would be detrimental to survival, we could expect that the minority of nrf2a KO embryos able to reach adulthood are the selected ones producing a sufficient amount of T4 to keep them alive. c. Thyroid tissue analysis in adult zebrafish is technically similar to that in mice, where tissue dissection and processing are quite challenging since the thyroid is very small and difficult to dissect. Taking these limiting aspects into account, we were unfortunately not able to provide the TH production assessment for this manuscript.

      As I will elaborate on later, to me a key finding seems to be that nrf2a might have unexpected "non-canonical functions" that do not immediately seem to pertain to its presumed major function as a key regulator of defense against oxidative stress. One of these seems to be an impact on Tg production. The other one seems to be a possible role in folliculogenesis, even though this might in some way be related to the impact on Tg production. I do understand that the authors want to leverage the elegant in vitro model of mESC-derived follicles that they have developed. However, the rationale for using the model system in this context is not entirely clear to me. In Ziros et al the effect of Nrf2 deficiency was studied in a murine global knockout model, in a thyroid-specific knockout (even though it might be argued that by using a Pax8 Cre-driver Nrf2 was likely knocked out also in the kidneys and in some regions of the CNS) as well as in cultured rat thyroid cells. Is it the fact that the Pax8 Cre-driver is not entirely thyroid specific that prompted the authors to turn to the mESC model, the possibility of studying thyroid cell-autonomous mechanisms of folliculogenesis (without the possible impact of other tissues and the HPT axis) by turning to this reductionistic model, or other considerations? Even though the results from the mESC model certainly are of interest, the rationale needs to be better explained and the real potential of this model is perhaps not fully exploited. Specifically, the effect on folliculogenesis in the mESC system needs to be more carefully presented:

        • Dear reviewer, indeed, the rationale behind the choice of using mESC-derived thyroid organoids to study Nrf2 loss was not clearly presented in the first version of our manuscript. The text has been modified in the updated version to better explain this choice. In fact, the first goal of using this system was to be able to study the effect of Nrf2 KO during thyroid development in a mammalian system. In in vivo mouse models, assessing and tracking changes (mRNA expression and live imaging) during early development is challenging because many animals are needed to study each stage of development. In addition, we aimed to use the derived cells to perform omics analyses and understand the mechanisms behind the phenotype. These experiments would be challenging to perform in zebrafish due to the small number of thyroid cells at 6dpf and the number of cells necessary for RNA and ATAC sequencing. Such assessments were also not performed in the previous study by Ziros (2018). For us, the fact that Pax8 is also expressed in the kidney was not a main drawback of the Ziros (2018) study. In our system, the folliculogenesis phenotype was also not expected, and it is highlighted as an unexpected and novel finding during development rather than as the rationale for this work. Our new Fig. 5B shows that smaller follicles are formed, even if less frequently. Taking this into account, we cannot exclude that this phenotype could be transitory during early stages of development and that, upon Tg accumulation over time, follicles could be formed and Tg iodinated, since the iodine machinery is highly preserved. This hypothesis would fit with the findings from Ziros (2018) that show thyroid follicular organization and function in adult mice lacking Nrf2 expression. Another possible explanation is that a higher level of stress is expected during in vitro thyroid differentiation than in an in vivo system, and that the lack of Nrf2 would exacerbate its effect, as previously demonstrated in vivo by iodine overload in adult mice (Ziros, 2018).

      Line 294. Even though they might be discernible, it is difficult to really appreciate the occurrence of follicular lumina in Fig 4E WT structures. The authors consider this as a main finding (see discussion: "The most striking difference we observed between the two models was the absence of follicular organisation in Nrf2 KO thyroid cells..."). I think this would be clearer if staining for an apical marker such as ezrin or MUC1 were shown, and I would also like to see some kind of quantification of follicular organization (e.g. number per area, size) between the WT and KO conditions. I think that would strengthen the notion of abolished follicular organization in Nrf2 KO cells.

        • Dear reviewer, following your comment we performed phalloidin staining to get a better view of the follicular organization in Nrf2 WT and KO-derived thyroid organoids (Fig. 5B). Using this staining we could observe that visibly fewer and smaller follicle-like structures are formed in the absence of Nrf2. This could indicate that folliculogenesis is not completely abrogated but does not occur fully and properly in this case, most likely due to the low accumulation of Tg in the lumen. This reinforces the hypothesis that in mice the accumulation of Tg over time, even if less is produced, could compensate for the folliculogenesis phenotype and finally produce functional follicles (as in Ziros's paper).

      Fig 5A. Also in this image it is difficult to appreciate any genuine "follicular organization" of the WT cells. Again, staining for an apical marker would be desirable. It rather seems like microlumina between two or three cells. Also a close-up might be illustrative.

        • Dear reviewer, to show the follicular organization more clearly, phalloidin staining has been added to Fig. 5 (B).

      In my opinion, these suggestions seem realistic in terms of time and resources, as this group has established, has access to and is proficient in using both the zebrafish lines and the mESC differentiation protocol towards follicular thyroid units.

      It is difficult to tell if and how mechanistic insights into the role of Nrf2 in folliculogenesis in the mESC system might be obtained. That might require opening up new lines of experimentation (that I certainly do not require), but I leave it to the authors to judge if some realistic and feasible additional experiments would possibly contribute to more mechanistically oriented understanding. It would be nice to see the authors leverage even more from their beautiful mESC model of thyroid folliculogenesis and I believe that this model might indeed provide important mechanistic insight into this issue.

        • Dear reviewer, we fully agree that the unexpected role of Nrf2 in folliculogenesis is something to be better explored in our organoid system. However, we are not sure whether this TF has a direct effect on this process or whether the phenotype is just a consequence of Tg downregulation. As a future perspective, we aim to keep the Nrf2 KO organoids in culture for longer periods (to be established) and assess whether the accumulation of Tg over time is enough to form bigger and functional follicles.

      It is known from previous literature that Tg production is diminished in nrf2 KO mice and that AREs are present in murine Tg enhancer regions (Ziros et al 2018). In the current manuscript the authors do not identify such elements in the vicinity of the zebrafish tg gene. To me, this comparison of AREs in enhancer elements is an important observation that might explain some of the differences to results in Ziros et al and warrants inclusion in the results section with a figure, and not only a mention in the discussion section by referring to a supplementary figure.

        • Dear reviewer, this aspect was further discussed in the text and the comparison figure was added as part of a main figure (Fig. 6C). In addition, in this updated version we also included new ATAC sequencing data (L. 298-340; Fig. 6) that reinforce the results previously shown by Ziros (2018), in which Nrf2 regulates Tg by binding to the ARE sites in its promoter.

      It would be desirable if the authors elaborated a bit more, and more clearly, on how they envision nrf2a impacts on Tg production in the zebrafish.

        • Dear reviewer, as suggested by reviewer 1, we performed immunostaining to evaluate the levels of Tg protein and we did not observe any difference in protein between nrf2a KO and WT embryos (Supplementary Fig. 4). This, together with the lack of ARE sites in the zebrafish tg promoter, rules out an effect of nrf2a on tg expression in zebrafish under physiological conditions.

      In Fig 2L it seems like T4 expression is completely lacking in KO embryos, whereas Fig 2R suggests that a signal that can be quantified is indeed present. Moreover, in Fig 3J a T4 signal, albeit reduced, is seen. Is Fig 2L really representative?

        • Dear reviewer, the displayed pictures are representative of the main phenotype obtained (the majority of the embryos). Fluorescence quantification was done using the Leica software (described below). The software quantifies fluorescence from grayscale images, which means that fluorescence slightly higher than the background, and thus barely visible, will still be quantified. This can lead to a value different from zero despite no "visible" staining, and explains the apparent discordance between Figures 2L and 2R. Concerning Fig 3J, this is an example showing that a few nrf2a KO embryos can produce T4, though at lower levels than WT. This range of phenotypes is now highlighted within the text and is reflected in the Fig. 2R and 3Q quantification data.

      In Fig 3Q the decrease of T4 signal seems much less pronounced than in Fig. 2R, even though it seems like the comparisons are between the same genotypes. Can the authors comment on this?

        • Dear reviewer, Fig. 2R represents embryos from F2 while Fig. 3Q represents embryos from F3, and the two experiments were performed independently. Besides possible differences between the generations, other technical factors could be involved, such as sample fixation, duration of antibody staining, post-processing of the samples, etc. Although we tried to perform both experiments as similarly as possible, we cannot rule out small differences between them.

      Fig 6B. What do the two columns in WT and KO represent? Has the experiment been conducted on only two biological replicates?

        • Dear reviewer, indeed, the bulk RNA and ATAC sequencing experiments were performed using two biological replicates. For each replicate, we pooled together at least 4 organoid wells that were not previously selected, aiming to represent the possible variability in differentiation.

      Minor comments

      Fig 2S - Do the bars of this graph show the ratio of expression in KO vs WT? What is the black bar furthest to the left (labelled "WT") that seems to be some kind of normalizer? Which transcript does it represent? The same question goes also for 2T-V and 4B and C.

        • Dear reviewer, we apologize for the lack of clarity. All data display the fold change compared to their respective control (WT). The black bar shows the control (WT=1) for each gene in the figure panel. A dashed line has been added to better visualize the differences in gene expression levels relative to the control. Figure legends have been updated for clarity.

      Fig 2U - In the results section it is stated that "Upon PTU treatment, tsh-β and slc5a5a expression were increased in both WT and nrf2aΔ5 homozygous mutants...". In the figure it seems like there is no significant change of slc5a5 in PTU-treated homozygous mutants.

        • Dear reviewer, we apologize for the mistake; the sentence was corrected accordingly.

      Line 52. "...the thyroid enables the production of growth hormones...". This is not clear to me. Do the authors mean GH or do they more loosely refer to T3/T4 as "growth hormones"?

        • Dear reviewer, we agree with the comment and modified the sentence accordingly to specify that the action of the thyroid on growth hormone production is mediated by the thyroid hormones.

      Line 60. "... If left untreated, C.H. will cause severe mental and growth retardation in patient among other physiological consequences...". I would consider these consequences as "pathological" rather than "physiological".

        • Dear reviewer, indeed, "pathological" is more appropriate; the sentence has been updated.

      Line 151 A.U - if the units are arbitrary, why use such a cumbersome order of magnitude where the numbers are in the order of 10e5 and 10e6?

        • Dear reviewer, we perform the fluorescence quantification using the quantification module of the "Leica, LAS X" software. Briefly, we delimit a region of interest, for which the software gives us the fluorescence value of each pixel within this region. Although we correct the value of each of these pixels by the average fluorescence value of the pixels in the background area, the number of pixels in each region brings the total to this order of magnitude. We decided to keep the raw fluorescence values to better express the differences in magnitude among the groups.

      Line 246. "Based on these results, we hypothesized that adult Nrf2 KO mice might develop body-wide resistance to the effects of Nrf2 defficiency (sic!) which in turn, might reduce the visible effects on thyroid development and physiology". The concept of a putative "body-wide resistance" is a bit nebulous to me. It would be great if the authors could be a bit more precise about, or at least speculate on, the putative mechanisms of such a "body-wide resistance".

        • Dear reviewer, indeed this is speculation and we have removed this statement to improve the clarity of the manuscript. In addition, we included in the updated text the hypothesis that the accumulation of Tg over time, from early development (as seen in our in vitro system) to adulthood, could lead to the proper formation of thyroid follicles and consequently to T.H. synthesis. Even if not included in this manuscript, we plan to improve our in vitro model for long-term culture to assess this hypothesis.

      Line 297. It is very difficult to appreciate from these images that "the percentage of Nkx2-1 cells was higher compared with the control cell line". In WT it seems that all nuclei are positive for Nkx2-1 but rather that the expression level is lower than in the KO cell line. I would like the authors to elaborate on this. Is the percentage of "Nkx2-1 cells" (I think the authors mean Nkx2-1 positive cells) really lower in WT than in the KO, or is it rather a matter of staining intensities?

        • Dear reviewer, we apologize for the lack of clarity. Rather than referring to the IF images in Fig. 4E, we refer to the higher percentage of Nkx2.1+ cells in Nrf2 KO organoids measured by flow cytometry (Fig. 4D). The flow cytometry graphs show the gating for Nkx2.1-stained cells and highlight that in the absence of Nrf2 65.3% of the cells are Nkx2.1+ compared to 27.1% in the WT controls. We updated the text to avoid misunderstandings; immunostainings are mainly used to show cell organization and protein expression visually rather than for quantitative purposes.
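
      As an illustration only, the percentage of marker-positive cells obtained from a gate can be sketched as below. This is a minimal sketch and not the analysis pipeline used in the study: the function name, the intensity values and the gate threshold are all hypothetical, and a real gate would be set in the flow cytometry software from appropriate controls.

```python
def percent_positive(intensities, gate):
    """Return the percentage of events whose intensity exceeds the gate."""
    if not intensities:
        raise ValueError("no events to gate")
    positive = sum(1 for value in intensities if value > gate)
    return 100.0 * positive / len(intensities)


# Hypothetical per-cell marker intensities (arbitrary units); NOT data from the study.
wt_events = [12, 40, 8, 95, 15, 22, 110, 9, 30, 18]
ko_events = [85, 120, 14, 99, 140, 25, 105, 90, 11, 130]
GATE = 50  # hypothetical threshold, e.g. chosen from an unstained control

wt_pct = percent_positive(wt_events, GATE)
ko_pct = percent_positive(ko_events, GATE)
print(f"WT: {wt_pct:.1f}% positive, KO: {ko_pct:.1f}% positive")
```

      Counting events above a gate gives a fraction of positive cells that does not depend on how bright each positive cell is, which is why it answers a different question than the staining intensity seen in immunofluorescence images.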

      Line 309. Is it really a "lower portion of cells" that are "able to promote ioidide organification"? Do the authors consider some KO cells to be organification competent whereas other cells not? Is it not rather a globally diminished ability to organify iodine?

        • Dear reviewer, we consider that the lower ability of Nrf2 KO cells to produce Tg and consequently to self-organize into follicles is the primary cause of the global reduction in iodine organification. Even though iodine uptake is not impaired, a smaller number of cells can produce Tg-I and thus display the ability to organify iodine. Very likely this is limited by the number of Tg-expressing cells and/or the amount of Tg in each cell derived from Nrf2 KO mESCs.

      Line 316. "...KO derived thyroid follicles". This seems contradictory to the previous notion that KO cells do not form follicles. I suggest that "follicles" is replaced by "thyrocytes" if follicular structures are indeed completely lacking. However, the phrasing "KO derived thyroid follicles" suggests that such are indeed present and might be possible to quantify as suggested above.

        • Dear reviewer, thank you for pointing it out. Indeed, the term "Nrf2 KO-derived thyroid follicles" is not appropriate and has been changed in the text. Also, since our phalloidin staining shows that small follicles are formed, we updated our data description and discussion to reflect that follicles do seem to form but that a clear reduction in size is observed among Nrf2 KO organoids. This is very likely linked to a lower expression of Tg in those organoids.

      Line 327 "...among NRF2 WT cells, we detected upregulation...". What do the authors mean by "upregulation" in the context of WT cells? As compared to what? If "upregulated" means as compared to KO cells, that does not seem completely appropriate. Even though this might seem like semantics, it is not intuitive to me to describe something as "upregulated" in WT cells, which would rather constitute a baseline condition. Would it not rather be considered as a "downregulation" in KO cells?

        • Dear reviewer, the statement was indeed not appropriate and we modified the text accordingly.

      Line 337. "... important downregulation" seems a little unorthodox to write in a results section. The downregulation might be significant or not. Whether it is important or not is a different matter (of subjective biological interpretation, i.e. how biological meaning is appreciated) and more suited to be put into context in the discussion section.

        • Dear reviewer, following your comments, the transcriptomic results and discussion have been updated for clarity. A more factual description has been kept in the results section while the interpretation was moved down to the discussion.

      There are scattered typos and grammatical errors that make reading less pleasant and need to be corrected, preferably by a native English speaker.

        • Dear reviewer, we apologize for the lack of clarity and the typos in the text. The grammar and text structure were improved following the comments.

      Reviewer #2 (Significance (Required)):

      Significance

      This is to the best of my knowledge the first study of a putative role for Nrf2 in thyroid development. However, a role of Nrf2 in thyroid physiology and pathology has previously been rather firmly established.

      Even though the manuscript is a very nice piece of work, it is perhaps difficult to claim that it in its present form signifies a major conceptual advance of the field, as it provides only limited mechanistic insight, especially with respect to possible "non-canonical" functions of nrf2 (mechanisms of impact on Tg production and folliculogenesis). If such insights could be obtained it would clearly increase the significance of this contribution.

        • Dear reviewer, for this revised version of the manuscript we included a new set of ATACseq data and combined it with the previously shown transcriptomic data to further explore the molecular mechanisms by which Nrf2 loss drives the maturation phenotype in developing mESC-derived thyroid organoids (L. 304-340; Figure 6A-H). Using this approach we demonstrate that loss of Nrf2 causes significant changes in chromatin accessibility, which are strongly correlated with changes in the gene expression profile. We could also demonstrate that Tg expression is indeed impaired by a reduction in chromatin accessibility in the absence of Nrf2, and we identified key pathways/TFs regulated by Nrf2 that could play a role in driving the phenotype or act as compensatory mechanisms. More specifically, we identified increased mRNA expression and chromatin accessibility of genes associated with AP-1 pathway activation, such as Fos, Jun-b, and Stats (L. 298-333; Figure 6 and Supplementary Figure 6). Interestingly, studies have shown that Nrf2 and AP-1 proteins significantly overlap, regulating each other at several levels, including transcriptionally. Also, although Nrf2 is known to bind ARE sites and AP-1 to bind TRE sites, these sites often overlap, with the AP-1 site embedded in the ARE. These tight relations suggest shared feed-forward and feed-back circuits between NRF2 and AP-1 factors that contribute to their functioning. To further investigate whether AP-1 overexpression in Nrf2 KO-derived thyroid organoids has a compensatory effect or contributes to the phenotype observed, we performed AP-1 inhibition during the thyroid differentiation protocol. In Nrf2 KO cells differentiated in the presence of the SR11302 inhibitor (from day 7 to 22), Tg mRNA was partially reduced at the higher dose (10 µM), while Tg protein and Tg-I production were not visually different from the control (L. 334-340; Supplementary Figure 7A-B). This suggests that AP-1 upregulation upon loss of Nrf2 might work as a compensatory mechanism; however, due to the lack of Tg expression, which is under the direct control of Nrf2, functionality is not recovered.

      The main strength of the study is that it investigates Nrf2 function in the context of embryonic development.

      Even though the manuscript makes a point of evolutionary conserved mechanisms, I do not really see that the discrepancies to results obtained from in vivo studies in higher vertebrates are sufficiently discussed and elaborated on.

        • Dear reviewer, following your and reviewer 1's suggestions, we have updated our discussion to better highlight the similarities and differences between both models while comparing our mouse in vitro model to the previously published in vivo model.

      Line 240. It feels a bit presumptuous to comment on the contents of an excellent paper that one of the authors of the current manuscript is the senior author of (Ziros et al 2018) and of course knows much better than I do.

      However, the present manuscript states (about Ziros et al 2018) that "In this study, the authors described how loss of Nrf2 function causes a thyroid phenotype only under stress conditions such as iodine overload. However, in the same study, the authors described that the loss of Nrf2 function causes a reduction of the thyroglobulin expression in PCCL3 rat thyroid follicular cell culture". In my recollection of Ziros et al there is a thyroid phenotype in KO mice not only under stress conditions, but also under normal conditions with reduced TG and increased TG-I? I would like the authors to comment on this.

        • Dear reviewer, indeed, Ziros' (2018) paper shows that lack of Nrf2 does not cause hypothyroidism under physiological conditions; however, Tg expression is reduced (regulation mechanisms were also shown) while the Tg-I/Tg iodination rate is increased. Only under iodine overload could they see an inhibitory effect on thyroid hormone production. Our statement in the previous version was not clear, and it has been improved in the revised version. In addition, we also better discuss our findings in organoids and raise the hypothesis that long-term Tg accumulation could "restore" folliculogenesis and thyroid hormone synthesis.

      The authors seem to emphasize the aspect of evolutionary conservation. However, even though I consider the possible effect on folliculogenesis in the mESC model as a very interesting finding, it is difficult to understand if it is a phenomenon that is specific to the mESC model system or of more general importance. As the authors demonstrate, follicles indeed seem to develop in nrf2a deficient zebrafish. In Ziros et al 2018 no images of thyroid morphology are provided, but as KO mice are euthyroid it seems likely that follicular organization is not grossly perturbed. The authors need to elaborate on this. Even if the effect might be more or less specific to the mESC system, that does not necessarily make it less relevant. It might provide fundamental insights into the process of folliculogenesis, but for greater significance more mechanistic insight would be desirable.

        • Dear reviewer, as mentioned above, we used new tools to better analyze the "folliculogenesis impairment" previously suggested to be occurring in our organoid system. A careful assessment of the morphology of our Nrf2 KO-derived organoids using Phalloidin staining (Fig. 5B) showed that folliculogenesis may in fact be under way in our organoids; however, the follicular-like structures are observed less frequently (difficult to quantify due to the 3D aspect of the follicular organization) and appear smaller than in WT organoids. We believe that the smaller size is due to the lower levels of Tg expressed and secreted into the lumen. This hypothesis fits with previous studies suggesting that Tg has a role in folliculogenesis. In addition, we cannot rule out that the same phenotype occurs in vivo during early development and that accumulation of Tg over time could lead to proper follicular formation and consequently to normal thyroid function. Interestingly, Ziros' paper shows that even under Tg downregulation conditions, T.H. production is not impaired, with a higher ratio of iodinated Tg compared to WT mice, suggesting a compensatory mechanism to overcome the lower levels of Tg. This new aspect is now further discussed in the manuscript.

      **Referees cross-commenting**

      I largely agree with the comments of referee #1, particularly the comment that the title (and some of the discussion) of the paper needs to be changed as pointed out by referee #1 ("...there is no evidence that there is a problem with thyroid development in ZF. The thyroid appears to be enlarged at the end of development, most likely as a consequence of increased TSH stimulation, but there is no developmental defect!").

        • Dear reviewer, following both reviewers' suggestions, we modified the title of the paper to better reflect the results presented. The new title is "The role of Nrf2/nrf2a in thyroid maturation and hormone synthesis in mammalian and non-mammalian models". In humans, developmental defects such as congenital hypothyroidism can be divided into two main categories: 1. Dyshormonogenesis, when the tissue is properly developed but thyroid function is impaired, and 2. Dysgenesis, when organogenesis is impaired and the tissue is not properly formed or not formed at all, resulting in hypothyroidism. The phenotype observed in nrf2a KO zebrafish corresponds to dyshormonogenesis in humans: folliculogenesis seems to be preserved, and the larger size of the thyroid is not considered a defect in organogenesis but a consequence of higher TSH stimulation. Still, it would be classified as a developmental defect. However, to avoid misinterpretation, we updated the text and highlighted that in zebrafish the lack of nrf2a results in hyperplastic non-functional thyroid tissue.
    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      In this manuscript Gillotay et al investigate and further establish a role for Nrf2 in thyroid physiology. Importantly, a role of Nrf2 in thyroid development is investigated in the Tg(tg:nlsEGFP) zebrafish model system developed by this group. This permits detailed and also live tracking of zebrafish thyroid development. By a CRISPR/Cas9 approach the authors establish a functional role for nrf2a in zebrafish thyroid development. Nrf2a loss-of-function in zebrafish leads to a hypothyroid phenotype that seems to recapitulate aspects of dyshormonogenesis with a slightly enlarged gland, increased tshb, strongly reduced T4 and iodinated TG (TG-I). Moreover, Tg synthesis is perturbed, only slightly during control conditions but pronouncedly during PTU challenge. By restoring nrf2a specifically in the zebrafish thyroid, the authors prove the phenotype to be cell autonomous to the thyroid. The authors then turn to the model system of functional thyroid follicular maturation from mESC that they have previously described. In this system Nrf2a KO leads to loss of TG expression and what the authors describe as an inability to form follicular units. Even though Nis is expressed, iodine organification is impaired, likely due to defective Tg production.

      Major comments

      This is a nicely written manuscript with mostly convincing results. The authors demonstrate laudable scientific rigor by verifying that the genetic modifications indeed have the expected effect (that nrf2 in the model systems that they create is indeed non-functional) and by using a rescue strain for the zebrafish model. In my opinion the data and the methods are presented in such a way that they can be reproduced. Generally, the conclusions are well supported by the results presented.

      The authors make a rather strong point of Nrf2 thyroid function as an evolutionary conserved mechanism and this might need some further underpinning.

      Line 368. The authors state that their findings "...reinforce a recently published report on the role of Nrf2 in adult mouse thyroid physiology (Ziros et al., 2018)" and continue: "Although we did not analysed (sic!) the thyroid gland of adult nrf2a KO zebrafish...". Why were adult zebrafish not investigated? As some key results of Ziros et al on adult Nrf2 KO mice differ from those of the manuscript (e.g. TG iodination under non stressed conditions seems to be increased in mice but decreased in zebrafish) it would be highly desirable to know if the zebrafish phenotype is more similar to that of mice in later adulthood.

      As I will elaborate on later, to me a key finding seems to be that nrf2a might have unexpected "non-canonical functions" that do not immediately seem to pertain to its presumed major function as a key regulator of defense to oxidative stress. One of these seems to be an impact on Tg production. The other one seems to be a possible role in folliculogenesis, even though this might in some way be related to the impact on Tg production. I do understand that the authors want to leverage from the elegant in vitro model of mESC-derived follicles that they have developed. However, the rationale of using the model system in this context is not entirely clear to me. In Ziros et al the effect of Nrf2 deficiency was studied in a murine global knockout model, in a thyroid specific knockout (even though it might be argued that by using a Pax8 Cre-driver Nrf2 was likely knocked out also in the kidneys and in some regions of the CNS) as well as in cultured rat thyroid cells. Is it the fact that the Pax8 Cre-driver is not entirely thyroid specific that prompted the authors to turn to the mESC-model, the possibility of studying thyroid cell autonomous mechanisms of folliculogenesis (without the possible impact of other tissues and the HPT-axis) by turning to this reductionistic model or other considerations? Even though the results from the mESC model certainly are of interest, the rationale needs to be better explained and the real potential of this model is perhaps not fully exploited. Specifically, the effect on folliculogenesis in the mESC system needs to be more carefully presented:

      Line 294. Even though they might be discernible, it is difficult to really appreciate the occurrence of follicular lumina in Fig 4E WT structures. The authors consider this as a main finding (see discussion: "The most striking difference we observed between the two models was the absence of follicular organisation in Nrf2 KO thyroid cells..."). I think this would be clearer if staining for an apical marker such as ezrin or MUC1 were shown and I would also like to see some kind of quantification of follicular organization (e.g. number per area, size) between the WT and KO conditions. I think that would strengthen the notion of abolished follicular organization in Nrf2 KO cells.

      Fig 5A. Also in this image it is difficult to appreciate any genuine "follicular organization" of the WT cells. Again, staining for an apical marker would be desirable. It rather seems like microlumina between two or three cells. Also a close up might be illustrative.

      In my opinion, these suggestions seem realistic in terms of time and resources, as this group has established, has access to and is proficient in using both the zebrafish lines as well as the mESC differentiation protocol towards follicular thyroid units. It is difficult to tell if and how mechanistic insights into the role of Nrf2 in folliculogenesis in the mESC system might be obtained. That might require opening up new lines of experimentation (that I certainly do not require), but I leave it to the authors to judge if some realistic and feasible additional experiments would possibly contribute to more mechanistically oriented understanding. It would be nice to see the authors leverage even more from their beautiful mESC model of thyroid folliculogenesis and I believe that this model might indeed provide important mechanistic insight into this issue.

      It is known from previous literature that Tg production is diminished in nrf2 KO mice and that AREs are present in murine Tg enhancer regions (Ziros et al 2018). In the current manuscript the authors do not identify such elements in the vicinity of the zebrafish tg gene. To me, this comparison of AREs in enhancer elements is an important observation that might explain some of the differences to results in Ziros et al and warrants to be included in the results section with a figure, and not only mentioned in the discussion section by referring to a supplementary figure. It would be desirable if the authors elaborate a bit more and clearly on how they envision nrf2a impacts on Tg production in the zebrafish.

      In Fig 2L it seems like T4 expression is completely lacking in KO embryos, whereas Fig 2R suggests that a signal that can be quantified is indeed present. Moreover, in Fig 3J a T4 signal, albeit reduced, is seen. Is Fig 2L really representative?

      In Fig 3Q the decrease of T4 signal seems much less pronounced than in Fig. 2R, even though it seems like the comparisons are between the same genotypes. Can the authors comment on this?

      Fig 6B. What do the two columns in WT and KO represent? Has the experiment been conducted on only two biological replicates?

      Minor comments

      Fig 2S - Do the bars of this graph show the ratio of expression in KO vs WT? What is the black bar furthest to the left (labelled "WT") that seems to be some kind of normalizer? Which transcript does it represent? The same question goes also for 2T-V and 4B and C.

      Fig 2U - In the results section it is stated that "Upon PTU treatment, tsh-β and slc5a5a expression were increased in both WT and nrf2aΔ5 homozygous mutants...". In the figure it seems like there is no significant change of slc5a5 in PTU treated homozygous mutants.

      Line 52. "...the thyroid enables the production of growth hormones...". This is not clear to me. Do the authors mean GH or do they more loosely refer to T3/T4 as "growth hormones"?

      Line 60. "... If left untreated, C.H. will cause severe mental and growth retardation in patient among other physiological consequences...". I would consider these consequences as "pathological" rather than "physiological".

      Line 151 A.U - if the units are arbitrary, why use such a cumbersome order of magnitude where the numbers are in the order of 10e5 and 10e6?

      Line 246. "Based on these results, we hypothesized that adult Nrf2 KO mice might develop body-wide resistance to the effects of Nrf2 defficiency (sic!) which in turn, might reduce the visible effects on thyroid development and physiology". The concept of a putative "body-wide resistance" is a bit nebulous to me. It would be great if the authors could be a bit more precise, or at least speculate on, the putative mechanisms of such a "body-wide resistance".

      Line 297. It is very difficult to appreciate from these images that "the percentage of Nkx2-1 cells was higher compared with the control cell line". In WT it seems that all nuclei are positive for Nkx2-1 but rather that the expression level is lower than in the KO cell line. I would like the authors to elaborate on this. Is really the percentage of "Nkx2-1 cells" (I think the authors mean Nkx2-1 positive cells) lower in WT than in the KO or it is rather a matter of staining intensities?

      Line 309. Is it really a "lower portion of cells" that are "able to promote iodide organification"? Do the authors consider some KO cells to be organification competent whereas other cells not? Is it not rather a globally diminished ability to organify iodide?

      Line 316. "...KO derived thyroid follicles". This seems contradictory to the previous notion that KO cells do not form follicles. I suggest that "follicles" is replaced by "thyrocytes" if follicular structures are indeed completely lacking. However, the phrasing "KO derived thyroid follicles" suggest that such are indeed present and might be possible to quantify as suggested above.

      Line 327 "...among NRF2 WT cells, we detected upregulation...". What do the authors mean by "upregulation" in the context of WT cells? As compared to what? If "upregulated" means as compared to KO cells that does not seem completely appropriate. Even though this might seem like semantics, it is not intuitive to me to describe something as "upregulated" in WT cells, that would rather constitute a baseline condition. Would it rather not be considered as a "downregulation" in KO cells?

      Line 337. "... important downregulation" seems a little unorthodox to write in a results section. The downregulation might be significant or not. If it is important or not is a different matter (of subjective biological interpretation, i.e. how biological meaning is appreciated) and more suited to be put into context in the discussion section.

      There are scattered typos and grammatical errors that make reading less pleasant and need to be corrected, preferably by a native English speaker.

      Significance

      This is to the best of my knowledge the first study of a putative role for Nrf2 in thyroid development. However, a role of Nrf2 in thyroid physiology and pathology has previously been rather firmly established. Even though the manuscript is a very nice piece of work, it is perhaps difficult to claim that it in its present form signifies a major conceptual advance of the field, as it provides only limited mechanistic insight, especially with respect to possible "non-canonical" functions of nrf2 (mechanisms of impact on Tg production and folliculogenesis). If such insights could be obtained it would clearly increase the significance of this contribution.

      The main strength of the study is that it investigates Nrf2 function in the context of embryonic development. Even though the manuscript makes a point of evolutionary conserved mechanisms, I do not really see that the discrepancies to results obtained from in vivo studies in higher vertebrates are sufficiently discussed and elaborated on.

      Line 240. It feels a bit presumptuous to comment on the contents of an excellent paper that one of the authors of the current manuscript is the senior author of (Ziros et al 2018) and of course knows much better than I do. However, the present manuscript states (about Ziros et al 2018) that "In this study, the authors described how loss of Nrf2 function causes a thyroid phenotype only under stress conditions such as iodine overload. However, in the same study, the authors described that the loss of Nrf2 function causes a reduction of the thyroglobulin expression in PCCL3 rat thyroid follicular cell culture". In my recollection of Ziros et al there is a thyroid phenotype in KO mice not only under stress conditions, but also under normal conditions with reduced TG and increased TG-I? I would like the authors to comment on this.

      The authors seem to emphasize the aspect of evolutionary conservation. However, even though I consider the possible effect on folliculogenesis in the mESC model as a very interesting finding, it is difficult to understand if it is a phenomenon that is specific to the mESC model system or of more general importance. As the authors demonstrate, follicles indeed seem to develop in nrf2a deficient zebrafish. In Ziros et al 2018 no images of thyroid morphology are provided, but as KO mice are euthyroid it seems likely that follicular organization is not grossly perturbed. The authors need to elaborate on this. Even if the effect might be more or less specific to the mESC system, that does not necessarily make it less relevant. It might provide fundamental insights into the process of folliculogenesis, but for greater significance more mechanistic insight would be desirable.

      Referees cross-commenting

      I largely agree with the comments of referee #1, particularly the comment that the title (and some of the discussion) of the paper needs to be changed as pointed out by referee #1 ("...there is no evidence that there is a problem with thyroid development in ZF. The thyroid appears to be enlarged at the end of development, most likely as a consequence of increased TSH stimulation, but there is no developmental defect!").

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This paper deals with the role of the transcription factor Nrf2 in the thyroid gland of zebrafish and in a thyroid organoid model. The subject is of relevance since Nrf2 is known to control the cellular response to oxidative stress and the thyroid is an organ where protection from oxidative stress is of major relevance, given the production of reactive oxygen species during thyroid hormone biosynthesis. The main result is that in zebrafish (ZF) thyroid Nrf2 appears to be important for thyroid hormone formation, since in late-stage ZF embryos deprived of Nrf2 the levels of the thyroid hormone T4 and of its precursor, iodinated thyroglobulin (Tg), are very much decreased. However, there is no significant decrease of thyroglobulin mRNA, albeit an impairment in the up-regulation of Tg by TSH could be observed. No effect is seen on the structure of the thyroid follicles and no developmental defect is observed, in contrast with the title of the paper. Conversely, in the mouse thyroid organoid model the absence of Nrf2 results in an impressive decrease of Tg mRNA and in impaired formation of thyroid follicles. The study is for the most part elegant and technically impeccable. The data are well presented and organised as far as the figures are concerned. However, much remains to be done on the interpretation and presentation of results. In addition, the text has been put together sloppily, with many typing and punctuation mistakes and difficult to interpret sentences. A revision of typing and syntax is absolutely needed.

      Main concerns:

      1. The title of the paper needs to be changed. There is no evidence that there is a problem with thyroid development in ZF. The thyroid appears to be enlarged at the end of development, most likely as a consequence of increased TSH stimulation, but there is no developmental defect!
      2. There is an evident contradiction, in ZF, between the marginal, if any, decrease of Tg mRNA and the impressive decrease in T4 and iodinated Tg. This, in my opinion, very interesting discrepancy is never discussed. Perhaps the authors should look at the level of Tg protein. Is it possible that there is an increased degradation or some negative translational control in absence of Nrf2? Alternatively, is it possible that there is a defect, yet unidentified, in the organification process? Certainly, to conclude, as the authors do in page 11, lines 236-237, that the defect in hormonogenesis depends on thyroglobulin production is, with the data presented, an unproven statement.
      3. The data on the transcriptional effect of Nrf2 in the mouse ES cell system do not really add much. Their major effect is to contribute to a lengthy discussion that would really benefit from a substantial reduction.
      4. More time should be devoted to explaining the substantial differences between the three systems studied (two in this paper, one in a previous paper published by partly the same authors), keeping in mind that studies in mice could be largely influenced by the genetic background.

      Significance

      This paper is largely confirmatory of previous results obtained in a mouse Nrf2 KO model, whose main authors are also part of this study (Ziros et al., 2018). A clarification of the molecular defect in hormone production in ZF could add the novelty that this study might need.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Ruby Ponnudurai

      Scientific Editor

      Review Commons

      February 16th, 2026

      Dear Dr. Ponnudurai,

      Please see below for a detailed response to reviewers for manuscript #RC-2025-03108: "Short chain fatty acids regulate the chromatin landscape and distinct gene expression changes in human colorectal cancer cells".

      Authors' Summary: We thank all the reviewers for the constructive and immensely helpful reviews of our manuscript. We have revised the manuscript addressing the reviewers' comments, which we feel has substantially strengthened our paper. Please see below for our point-by-point responses to the comments, which are all indicated in blue text. All changes in the manuscript are also indicated in blue text.


      Reviewer #1 Evidence, reproducibility and clarity


      In this manuscript, Kabir et al. explore the impact of microbiota-derived short-chain fatty acids (SCFAs) on chromatin structure and gene expression in human cells. They show that SCFAs, particularly butyrate, contribute to specific histone modifications such as butyrylation at H3K27, detectable in human colon tissue. Additional modifications like acetylation, butyrylation, and propionylation at H3K9 and H3K27 respond to SCFA levels and are enriched at active regulatory regions in colorectal cancer cells. Treatment with individual or combined SCFAs mimicking gut conditions alters gene expression patterns, with butyrate playing a dominant regulatory role. Butyrate's effects on gene expression are claimed to be independent of HDAC inhibition and instead rely on the p300/CBP complex through histone butyrylation. These findings underscore SCFAs as crucial modulators of epigenetic regulation in the human colon and highlight butyrate's dominant role in shaping chromatin and gene regulation beyond its known metabolic functions.

      The authors used two human cell lines and an in vivo murine model paired with RNA and ChIP sequencing approaches to identify target genes and chromatin modifications in response to SCFAs.

      While the findings are interesting and could provide important insights into the epigenetic influence of SCFAs in human cells, the study would benefit from additional experiments to strengthen the conclusions. Comments and suggestions are listed below:

      Response: We sincerely thank the reviewer for their thoughtful and constructive comments. In addition, we appreciate the recognition of the potential impact of our findings. We have addressed all comments below.

      1. Figure 1: The H3K27bu expression in human biopsies highlights the clinical significance of the current study. However, the authors need to provide more information on the human colon samples, e.g., how many total patients were analyzed, and what were the age and/or sex. Only the methods mention the use of benign TMA; this should also be clarified in the figure legends. It would also be helpful to show histone butyrylation levels in normal vs. cancer human tissues.

      Response: We completely agree that analysis of additional patient samples is important. In light of this comment, we have expanded our analysis of human colon samples. In the original manuscript, we showed IF images from patient intestinal sections. Patient demographic information (age and sex) is now included in the figure legend. While we analyzed two patients by IF, we realized that images from only one patient are shown. We also felt it was important to add additional rigor to our patient analysis. Therefore, we have incorporated additional patient samples and performed H3K27bu staining using IHC across normal and colon cancer sections obtained from 40 different patients. This is now included as Supplemental Figure 1. In addition, we have included information about age, sex, staging, and grading in Supplemental Figure 1C. Interestingly, we observed that adenocarcinoma patients have significantly decreased levels of H3K27bu compared to normal colon or normal adjacent tissues (Supplemental Figure 1B). We speculate that this may be due to alterations with the microbiota composition and dysbiosis associated with colorectal cancer (PMIDs: 26515465, 25758642, 25699023). Very interestingly, this is in contrast to reports of elevated H3K27ac in colon cancer samples (PMID: 24994966). We are excited to explore this further, and this is something we plan to follow-up on in future studies.

      2. Figure 1: In addition, given that the butyrate level descends towards the base of the colonic crypt (with the highest at the top of the crypt where mature intestinal epithelial cells reside) (Kaiko et al., 2016), it is important to show how the H3K27bu signature is distributed along the crypt. This data would further emphasize the clinical relevance of this study, given that most colorectal cancers (CRCs) arise from stem and progenitor cells.

      Response: We agree that this is an important question and recognize the elegant study by Kaiko et al. However, our human samples are obtained from commercially available tissue microarrays and the sectioning is not consistent across samples, leaving very few samples that we could analyze for staining patterns from crypt to villi (please see Supplemental Figure 1A for example sections). This unfortunately prevents us from completing rigorous image analysis. In future studies, we plan to perform this analysis after we obtain an IRB protocol that will allow us to answer this question in the most rigorous way possible.

      Throughout the manuscript: The rationale for selecting the two CRC cell lines (HCT 116 and Caco2) should be explained. While commonly used, providing background on their genetic differences (e.g., driver mutations) is important, as this could greatly influence the PTM landscape.

      Response: We chose to use both HCT-116 and Caco-2 cancer cell lines throughout our studies, since as noted these cells are the most commonly used lines in the literature. In addition, having consistent results across distinct genetic backgrounds strengthens our results: using both cell lines tells us whether observed PTM patterns are conserved across genetically diverse CRC contexts, as HCT-116 is characterized by mutations in KRAS and PIK3CA, while Caco-2 has mutations in APC and TP53 (PMIDs: 17088437, 24755471, and 16418264). We have added this information into the text in lines 106-107.

      The study lacks additional controls, such as a normal colon epithelial cell line and a non-colonic cell type. Including these would help determine whether the observed butyrate effects are tissue- or disease-specific. This data would also help assess whether SCFA effects, and specifically butyrate's effects, on histone acylation and gene expression are systemic or local.

      Response: Thank you for this insightful comment. We have now included additional data using normal colon cells in the form of mouse colon organoids and a distinct non-intestinal cell line, the embryonic kidney cell line HEK 293T. Importantly, we observe similar changes to chromatin after treatments with different SCFAs in both colon organoids and HEK 293T cells as shown in the cancer cell lines (Figure 1E, 1F). Interestingly, we also observe that the colon cancer cell lines have visible signal of histone butyrylation without treatment, while we only observe these modifications in HEK 293T cells following treatment.

      As for understanding systemic vs. local effects of butyrate on chromatin, we additionally treated cells with different concentrations reflecting the intestinal lumen or serum concentrations of SCFAs: 5 mM and 5 µM, respectively. While the concentrations of SCFAs can vary across individuals, we felt that these numbers reflected differences in intestinal vs. serum levels based on the literature (summarized in PMID: 27259147). Importantly, we observe that only the 5 mM SCFA treatment reflecting levels in the intestinal lumen results in induction of histone acetylation and butyrylation, while the 5 µM treatment reflecting serum SCFA levels failed to induce increased levels of these histone modifications (Figure 1F).

      Together, this data suggests that the response on chromatin to SCFAs is more universal at high concentrations. However, based on local vs. systemic concentrations throughout the body, we expect that responses on chromatin will largely be restricted to the intestine or in other areas or conditions where high concentrations of metabolites are localized.

      Figure 2: The authors show ChIP-seq results in the HCT 116 cell line. To exclude the possibility that the demonstrated chromatin signatures are cell line-specific, results from Caco2 should also be shown. In addition, the 2D environment and multiple passaging alter gene expression in cell lines; using human colonic organoids would provide a more clinically and physiologically relevant model.

      Response: We have now added Cut&Run analysis for the histone acyl marks of interest in Caco-2 cells, which is a technique analogous to ChIP to map genomic localization. Please see now Figure 2C-D. Importantly, we observe very similar localization of these histone modifications across the different cell lines. We also agree that the question of how 2D vs. 3D environment may impact localization of these modifications is important. In organoids, ChIP-seq and Cut&Run are technically difficult. In addition, we feel that using human organoids is currently beyond the scope of our manuscript. However, we previously characterized H3K27bu and H3K27ac occupancy from primary epithelial cells isolated from the mouse intestine (PMID: 38413806). Importantly, in this study we observed similar genomic enrichment of H3K27bu and H3K27ac. This suggests that the general patterns of localization of these modifications across species and across cells isolated from both 2D vs. 3D systems are similar.

      Figure 4 is very confusing. Entinostat itself, as an HDAC inhibitor (iHDAC), increases butyrylation. The data shown are insufficient to draw conclusions. First, the authors should use additional iHDACs, and second, they should illustrate the overlap in gene expression changes between all treatments using a Venn diagram to clarify which genes/signatures are specific to each treatment.

      Response: We agree that testing additional iHDACs is important. We have now included an additional iHDAC (Tucidinostat) in our studies to draw broader conclusions beyond the activity of Entinostat. We have performed additional treatments, demonstrating that all iHDACs tested increase both histone butyrylation and acetylation (Supplemental Figure 8A-B). We have also performed qPCR for candidate differential genes and demonstrated that expression changes following our treatments with Tucidinostat phenocopy the changes observed with Entinostat (Figure 5F). These dynamic gene changes show examples of genes that are responsive to butyrate treatment and p300/CBP inhibition, yet differ from other iHDAC treatment. As requested, we have additionally added an Euler plot to Figure 5 (previously Figure 4) depicting the overlap between treatments in this figure (Figure 5C).
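      As an illustration of what an overlap (Euler or Venn) plot summarizes, the sketch below partitions differentially expressed gene sets into genes shared by all treatments and genes exclusive to each one. It is a minimal sketch, not the authors' analysis code: the function name `overlap_regions`, the treatment labels, and the `GENE_*` identifiers are hypothetical placeholders.

```python
def overlap_regions(gene_sets):
    """Return (shared, exclusive) for a dict of treatment -> set of gene IDs.

    shared: genes present in every treatment's set.
    exclusive: for each treatment, genes found in no other treatment's set.
    """
    shared = set.intersection(*gene_sets.values())
    exclusive = {}
    for name, genes in gene_sets.items():
        # Union of every other treatment's genes
        others = set().union(*(g for n, g in gene_sets.items() if n != name))
        exclusive[name] = genes - others
    return shared, exclusive


# Hypothetical placeholder data, not results from the manuscript
deg_sets = {
    "butyrate": {"GENE_A", "GENE_B", "GENE_C", "GENE_D"},
    "entinostat": {"GENE_A", "GENE_B", "GENE_E"},
    "a485_butyrate": {"GENE_A", "GENE_C", "GENE_F"},
}

shared, exclusive = overlap_regions(deg_sets)
print(sorted(shared))
for treatment, genes in exclusive.items():
    print(treatment, sorted(genes))
```

      Genes in the exclusive regions are the candidates for treatment-specific signatures, which is the distinction the reviewer asked to see.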

      Figure 4: The authors use an HDAC inhibitor to rule out butyrate's effect on gene expression via HDAC inhibition. However, butyrate can also modulate gene expression through activation of GPR109a. Using GPR109a antagonists is necessary to address this possibility. These data are essential to validate the specific role of histone butyrylation in gene regulation.

      Response: We thank you for this comment and completely agree that butyrate can act through multiple mechanisms, including activation of GPR109a. However, it has previously been demonstrated that this receptor is silenced via DNA methylation in human colon cancer samples and colon cancer cell lines, including HCT-116 (PMID: 19276343). Supporting this notion, we observed very low expression levels of this receptor in our HCT-116 cells (please note the very low TPM values), with minimal differences in response to butyrate treatment (Supplemental Figure 6E, included below). We have additionally included gene expression data for two other potential GPCRs activated by butyrate or other SCFAs (FFAR2 and FFAR3), and also observe very low expression of these genes. Therefore, we conclude that the HDAC inhibition-independent effects of butyrate on gene expression in our data are not likely to be dependent on GPR109A or FFAR2/3 signaling.


      New Supplemental Figure 6E: mRNA expression of GPCR genes that are known SCFA targets. Levels of mRNA expression (transcripts per million, TPM) as assayed by RNA-seq of GPR109A (official gene name HCAR2), FFAR2, and FFAR3 in HCT-116 cells. Expression levels related to Figure 3. Statistical significance was determined using ANOVA adjusting for multiple comparisons with p

      Supplementary Figure 4 and manuscript: There is no in vivo methods section describing the tributyrin-gavaged mice. The authors should clarify how the experiment was performed, how cells were isolated, whether sorting was performed, and which markers were used.

      Response: We apologize for this confusion. The in vivo data is from previously published work that is publicly available (PMID: 38413806). We analyzed data from mice that were gavaged with tributyrin, where non-sorted IECs were analyzed for RNA-seq. We have clarified this and have added this information in the figure legend (now Supplemental Figure 6).

      Supplementary Figure 4: The GO analysis results show that lipid catabolism is among the top differentially enriched pathways. Butyrate is a known PPARγ agonist (Litvak et al., 2018), and activation of PPARγ is known to drive expression of genes involved in lipid metabolism. The authors need to rule out this function of butyrate before attributing this signature solely to histone butyrylation.

      Response: We appreciate this point and have performed additional analysis to identify whether canonical PPARγ target genes are enriched or not in our data. Additionally, we recognize that our data may reflect the combined effects of both PPARγ activation and histone butyrylation. In Supplemental Figure 6 (Supplemental Figure 4 in the previous version), we especially acknowledge that the differential genes changing may be due to varied mechanisms of butyrate action. Therefore, to address this comment, we performed additional analysis on data related to Figure 5 (previously Figure 4), where we have additional treatments including using a p300/CBP inhibitor to identify potentially more chromatin related mechanisms of action.

      We have now extended our analysis of RNA-seq data related to Figure 5 to include gene ontology enrichment that is not dependent on clustering (Supplemental Figure 9A). While we do not observe PPARγ target genes as top enriched categories, we have also specifically tested the enrichment of PPARγ-related MSigDB groups using publicly available datasets (Supplemental Figure 9B). Here, we observe some enrichment of different gene sets related to PPARγ activity across different tissue systems. Together, this new data suggests that some PPARγ targets are enriched with our different cell treatments, including butyrate, but they are not the predominant gene categories that we observe changing.

      Most PPARγ target genes have been identified in tissue systems beyond the gut, such as adipose tissue and immune cells. To specifically analyze genes in the intestine that are PPARγ-dependent, we identified select genes in the literature (PMIDs: 29182565, 28798125, and 28798125). In PMID: 29182565, these genes are involved in lipid transport (Cd36), lipolysis (Hsl and Atgl), and various lipid metabolism pathways (Cact, Fasn, Mlycd, Dgat2, and Agpat9). In PMID: 28798125, these genes include HMOX1, PDK4, ANGPTL4, UCP2, AQP8, and PLIN2 as butyrate/PPARγ targets. PMID: 28798125 identified Nos2 as a butyrate and PPARγ target. Their expression levels following butyrate and other treatments in Figure 5 (formerly Figure 4) are now included as Supplemental Figure 9C (also included below). Interestingly, these genes respond differently compared to the other iHDAC tested (Entinostat) and are only mildly impacted by p300/CBP inhibition (please see A485_Butyrate column vs. Butyrate alone). This suggests that the major impacts on this pathway are not through p300/CBP activity or histone butyrylation, but may be due to other mechanisms of butyrate action. We have also included additional discussion of butyrate and potential roles of PPARγ signaling in lines 243-256.

      New Supplemental Figure 9C.

      It would be helpful to include a table of differentially abundant genes as a supplement to the heatmaps and GO analysis.

      Response: We are happy to include tables of differentially expressed genes from all our analysis as supplemental files. This is now included as Supplemental Table 1.

      Significance

      This study explores how microbiota-derived SCFAs, particularly butyrate, influence histone acylation and gene regulation. While the topic is relevant, the work lacks important controls (e.g., normal epithelial and non-colonic cells) and omits mechanistic validation (e.g., GPR109a signaling, PPARγ involvement). The rationale for cell line selection is unclear, and in vivo methods are insufficiently described.

      Audience: The study will mainly interest specialists in microbiota-chromatin interaction. Broader impact is limited by the narrow model scope and underdeveloped mechanistic insight.

      My Expertise:

      Cancer biology, in vivo models, microbiota-host interactions.

      Response: We sincerely thank the reviewer for their very helpful comments. We hope that the above point-by-point responses adequately address concerns regarding controls, mechanistic validation, and methods description. We really appreciate their note that the topic is relevant, yet we also feel that our work will have broader impacts due to the interdisciplinary nature of the research and the inclusion of additional model systems (intestinal organoids and additional cell lines) and mechanistic experiments.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      This study presents a novel finding that short-chain fatty acids (SCFAs) produced by microbial metabolism regulate gene transcription in human colon cancer cells by modulating histone H3K9 and H3K27 butyrylation and propionylation, both of which are associated with an open chromatin state. The authors further reveal that the major effect of the SCFA mixture is driven by butyrate and identify p300/CBP-dependent, rather than HDAC inhibition-dependent, gene regulation by butyrate. Overall, this is a well-organized study that provides valuable insight into the role of metabolites in human cells.

      Response: Thank you for your positive review of our manuscript. We really appreciate the reviewer pointing out the novelty and organization of our study. Please see below for point-by-point responses to your comments.

      Major comments:

      1. In Figures 1C and 1D, why did the SCFA mixture not increase histone butyrylation or propionylation to the same level as single butyrate treatment?

      Response: Thank you for this question. We believe that this effect is observed due to differences in butyrate concentrations, as we aimed to keep the total concentration of SCFAs equal across all treatments at 5 mM. In the single treatment, butyrate is at 5 mM while in the mixtures, butyrate is at 1.67 mM (1:1:1) or 1 mM (3:1:1). In addition, in Figure 3A we included a 15 mM mixture for RNA-seq analysis, where butyrate and the other SCFAs are all at 5 mM concentrations. Since we observed highly similar patterns of gene expression with 15 mM or 5 mM final SCFA mixture concentrations, we did not include the 15 mM treatment in our other experiments.
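      The butyrate concentrations quoted above follow directly from splitting a fixed total concentration by molar ratio. The short sketch below reproduces that arithmetic. The helper name `component_concentrations` is hypothetical, and the assignment of the "3" in the 3:1:1 mixture to acetate is an assumption; the response only states that butyrate is at 1 mM in that mixture.

```python
def component_concentrations(total_mm, ratio):
    """Split a total SCFA concentration (in mM) across components by molar ratio."""
    parts = sum(ratio.values())
    return {name: total_mm * share / parts for name, share in ratio.items()}


# 5 mM total, equimolar mixture (1:1:1)
equal_mix = component_concentrations(
    5.0, {"acetate": 1, "propionate": 1, "butyrate": 1}
)

# 5 mM total, 3:1:1 mixture (assumed order: acetate:propionate:butyrate)
skewed_mix = component_concentrations(
    5.0, {"acetate": 3, "propionate": 1, "butyrate": 1}
)

# 15 mM total, equimolar mixture used for RNA-seq in Figure 3A
high_mix = component_concentrations(
    15.0, {"acetate": 1, "propionate": 1, "butyrate": 1}
)

print(round(equal_mix["butyrate"], 2))   # 1.67
print(round(skewed_mix["butyrate"], 2))  # 1.0
print(round(high_mix["butyrate"], 2))    # 5.0
```

      Only the 15 mM mixture matches the 5 mM butyrate dose of the single treatment, which is why the 5 mM mixtures are expected to give weaker butyrylation signal.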

      In Figure 3B, how does butyrate block the effects of acetate and propionate on transcription?

      Response: This is a great question, but we are not necessarily claiming that butyrate is blocking effects of acetate and propionate on transcription. For example, it is also possible that butyrate induces more gene expression changes compared to acetate or propionate, as the number of differentially expressed genes is greater in butyrate-treated cells (Response Table 1).
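      The tallies in Response Table 1 are counts of genes passing an adjusted p-value cutoff, split by the sign and size of the log2 fold change. The sketch below shows that counting step only, on hypothetical toy records. It does not run DESeq2, the function name `count_degs` is hypothetical, and the 0.05 cutoff is an assumption used for illustration because the threshold is not legible in this response.

```python
def count_degs(results, padj_cutoff, lfc_cutoff=0.0):
    """Count significantly up- and downregulated genes.

    results: iterable of dicts with 'log2fc' and 'padj' keys
             ('padj' may be None, e.g. genes filtered out upstream).
    Returns (n_up, n_down) for |log2FC| strictly above lfc_cutoff.
    """
    significant = [
        r for r in results
        if r["padj"] is not None and r["padj"] < padj_cutoff
    ]
    n_up = sum(1 for r in significant if r["log2fc"] > lfc_cutoff)
    n_down = sum(1 for r in significant if r["log2fc"] < -lfc_cutoff)
    return n_up, n_down


# Hypothetical toy records, not data from the manuscript
toy_results = [
    {"gene": "G1", "log2fc": 2.3, "padj": 0.001},
    {"gene": "G2", "log2fc": 0.4, "padj": 0.01},
    {"gene": "G3", "log2fc": -1.5, "padj": 0.02},
    {"gene": "G4", "log2fc": -0.2, "padj": 0.30},
    {"gene": "G5", "log2fc": 1.8, "padj": None},
]

ASSUMED_PADJ_CUTOFF = 0.05  # assumption for illustration only

any_change = count_degs(toy_results, ASSUMED_PADJ_CUTOFF)
strong_change = count_degs(toy_results, ASSUMED_PADJ_CUTOFF, lfc_cutoff=1.0)
print(any_change, strong_change)
```

      Reporting both the permissive and the stricter fold-change cutoff, as the table does, shows whether a treatment changes many genes slightly or fewer genes strongly.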

      | Comparison vs. vehicle | Upregulated (log2FC > 0) | Downregulated (log2FC < 0) | Upregulated (log2FC > 1) | Downregulated (log2FC < -1) |
      |---|---|---|---|---|
      | Acetate | 3160 | 3518 | 433 | 352 |
      | Propionate | 3402 | 3854 | 1304 | 735 |
      | Butyrate | 4600 | 4539 | 2082 | 1727 |

      __Response Table 1. Number of differentially expressed genes for each SCFA treatment group, related to Figure 3.__ RNA-seq was performed on HCT-116 cells grown in DMEM and treated with 5 mM of single SCFAs for 6 hours. Differential genes were identified using DESeq2 Wald test and statistically significant genes were defined using a padj

      To fully understand mechanistic differences of butyrate vs. acetate or propionate, we would need to perform additional experiments that we feel are beyond the scope of this current manuscript. However, we speculate that several mechanisms could account for these differences: for example, different histone acylations could have differential impacts on chromatin structure, reader binding, or transcription factor recruitment. As for blocking effects, select longer acylations (butyrylation and crotonylation) have been demonstrated to have repressive effects in transcription or reader protein binding in specific cell contexts (example PMIDs: 27105113, 31676231, 37311463). These are important future studies for our group and will likely shed light on additional mechanistic insights of different histone acylation functions. We have highlighted some of these concepts in the discussion (lines 301-310):

      "We also observe that butyrate and propionate treatment have both overlapping and distinct effects on gene regulation (Figure 3, Supplemental Figure 4, Supplemental Figure 8D). Propionate appears to have more modest effects compared to butyrate, as it induces a smaller number of differential gene changes and these genes do not display enrichment in ATP and nucleotide metabolism categories. These differences in gene regulatory responses to the different SCFA treatments could be due to multiple mechanisms. For example, we speculate that there could be chromatin-independent functions through distinct alterations in metabolic or signaling pathways or chromatin-dependent mechanisms through potential distinct structural effects on chromatin or differences in reader protein binding."

      Which pathways are associated with acetate- and propionate-specific DEGs?

      Response: Thank you for this insightful question. We have performed gene ontology analysis for acetate and propionate DEGs. Interestingly, the different SCFA treatments largely overlap (Supplemental Figure 4A). However, propionate treatment fails to enrich for select gene ontology categories that we observe in acetate treatment (Supplemental Figure 4B, included below). For example, by gene set enrichment analysis, acetate enriches for gene categories related to nucleotide and ATP synthesis, while propionate does not. However, both acetate and propionate (and all SCFA treatments) are enriched in categories related to the ribosome and rRNA (Supplemental Figure 4B-C). We have added this analysis to the manuscript as Supplemental Figure 4 and included additional discussion of this analysis in the text in lines 163-171 (included below), as well as additional speculation about differences between propionylation and butyrylation in lines 301-310 (included above).

      "We further analyzed gene programs changing with different SCFA treatments. All SCFA treatments regulated largely overlapping gene programs including those related to RNA metabolism, ATP synthesis, and ribosome function (Supplemental Figure 4a). Since butyrate overlapped greatly with the combination SCFA treatment, we specifically analyzed acetate and propionate gene programs (Supplemental Figure 4b-c). Interestingly, propionate treatment failed to enrich for select gene ontology categories that we observe in other SCFA treatments. Specifically, propionate-dependent gene programs did not include those related to ATP and nucleotide metabolism, highlighting some differences in gene expression changes following different SCFA treatments."

      New Supplemental Figure 4B.

      Which genes are related to growth inhibition in butyrate-treated cells? Does the 1:1:1 SCFA mixture have a similar impact on cell growth as single butyrate treatment?

      Response: Butyrate has previously been shown to inhibit cell growth in colon cancer cells (PMIDs: 9125124, 33017771, 38398853). The reported effects include differential regulation of key cell cycle regulators, such as p21 and Cyclin D1. We have included both GO term enrichment for the 1:1:1 SCFA mix and gene expression data for select cell cycle regulators in Supplemental Figure 7C-D (7D also included below). This demonstrates that both butyrate and the SCFA mixtures, and to a lesser extent propionate, differentially regulate key cell cycle genes including CDKN1C, CDK2, CDK4, WEE1, and RB1. We have additionally performed a GLO assay for the 1:1:1 SCFA treatment to investigate its impact on growth inhibition, which is now included as Supplemental Figure 7B. Here, we observe that the 1:1:1 and 3:1:1 mixtures of SCFAs significantly decrease cell viability, although not to the same extent as butyrate treatment alone. Together, this data suggests that butyrate reduces cell viability at least in part through altering key cell cycle genes. This effect is mimicked with the SCFA mixture treatments, but to a lesser extent compared to butyrate alone.

      New Supplemental Figure 7D.


      Reviewer #2 (Significance (Required)):

      General assessment: This study clearly demonstrates the role of butyrate in gene regulation and elucidates its underlying regulatory mechanisms. However, it does not provide insight into how butyrate counteracts the effects of acetate and propionate, despite these metabolites often being detected together. In addition, it remains unclear which specific histone PTMs are associated with the distinct gene expression changes induced by different short-chain fatty acids. Lastly, the observation that histone butyrylation and propionylation correlate with active transcription is not novel.

      Advance: This study advances understanding of short-chain fatty acids in chromatin and gene regulation, highlighting butyrate's dominant role and its p300/CBP-dependent rather than HDAC inhibition-dependent mechanism.

      Audience: This work may attract significant interest in both the epigenetics and metabolism fields.

      My expertise: histone acetylation, HATs, transcriptional regulation, cancer.

      Response: We very much appreciate all of these thoughtful comments. We are thankful for the recognition that this story advances our understanding of SCFA function through chromatin and may be of significant interest to the epigenetics and metabolism fields. We hope that we have now provided additional insight into roles of propionate and acetate (Supplemental Figure 4). We also recognize that similar to other studies, we observe colocalization of the different histone marks and it is difficult to tease apart specific functions. We plan to further address this important question in future studies.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: The authors explore the effects of short-chain fatty acids (SCFAs) acetate, propionate, and butyrate on chromatin and gene expression in human colon cancer cells. The authors first characterize the presence of histone propionylation and histone butyrylation in different colon cancer cell lines as a function of SCFA treatments. Then, they perform ChIP-seq to determine the genomic localization of these marks and observe that these marks are deposited on euchromatic regions similar to H3K4Me3 and to one another, consistent with previous reports. The authors then performed gene expression analysis to determine the contribution of the SCFAs. Interestingly, they observe that butyrate treatment alone mimicked the gene expression profile of an equimolar mixture of short-chain fatty acids treatment, at least in the tested cell lines. Finally, the authors designed an experiment to try to separate the functions of butyrate on gene expressions that are dependent on p300/CBP and are independent of the HDAC inhibition property. The following aspects of the paper need addressing-

      Response: We sincerely thank the reviewer for their very helpful and constructive comments. We appreciate the notes on interesting aspects of our study. We hope that we have addressed all concerns as described below.

      Major comments

      1. There is no confirmation of the validity of the results seen from ChIP-seq (Figure 2) and RNA-seq (Figure 3). The majority of the findings of the paper are derived from ChIP-seq and RNA-seq data, and hence, experiments validating such results need to be established. ChIP-qPCR for representative gene(s) with adequate controls needs to be performed for different acyl marks (H3K27bu, H3K27pr, H3K4Me3, H3K9pr, H3K9bu) to support the ChIP-seq results, and RT-qPCR for representative gene(s) for different treatment conditions (vehicle, acetate, propionate, butyrate, and 5 mM 1:1:1 mixture) for validating RNA-seq results.

      Response: We are happy to include validation by qPCR of our ChIP and RNA-seq results. The qPCR validation for Figure 3 is now included as Figure 3F and qPCR validation for ChIP-seq is included as Figure 4C. We have selected genes that are differentially expressed and also display occupancy of different histone acyl marks. In addition, we performed additional qPCR validation for our RNA-seq data related to Figure 5 (previously Figure 4), which is now included as Figure 5F-G. Lastly, we performed orthogonal analysis of ChIP using Cut&Run in Caco-2 cells, which is now included as Figure 2C-D. This further supports our findings with HCT-116 cells.

      The authors describe an interesting strategy to differentiate the different functions of butyrate (Figure 4). The authors propose that differential genes that change with p300/CBP inhibitor treatment, that are separate from HDAC inhibitor treatment, are potential genes that are a function of histone butyrylation. An important control that is missing in this experiment is cell treatment with propionate. In their previous findings (Figure 1C-D), they note that both propionate and butyrate treatments elevate the levels of histone acetylation, propionylation, and butyrylation. But the HDAC inhibitory activity of propionate is not very well established, and performing experiments to prove it is beyond the scope of this paper. Importantly, p300/CBP has been shown to catalyze histone propionylation with higher efficiency compared to histone butyrylation (PMID: 27820805, PMID: 29070843). Therefore, it would be ideal to include differentially expressed genes from propionate-treated cells in the analysis to rule out any discrepancy.

      Response: Thank you for this insightful comment. We agree that propionate also elevates histone butyrylation and may have important effects. We have therefore included our differentially expressed genes with propionate treatment from Figure 3 in our analysis related to HDAC inhibition: we have plotted these differentially expressed genes in a matched, ordered column to our clustering analysis in Figure 4 (now Figure 5) as Supplemental Figure 8D (also included below). This demonstrates that overall propionate has similar gene expression changes to butyrate, but the extent of these changes is less pronounced. In addition, our qPCR validation analysis in Figure 3F demonstrates that propionate similarly regulates some differentially expressed genes affected by butyrate (such as PHOSPHO1 and HOXB9) but fails to differentially regulate other targets (such as CYSRT1). This suggests that propionate and butyrate have both overlapping and distinct targets, which is consistent with our global analyses in Figure 3A-D. Lastly, we now have included specific analysis of gene program changes related to propionate treatment (Supplemental Figure 4). Interestingly, the different SCFA treatments largely overlap (Supplemental Figure 4A). However, propionate treatment fails to enrich for select gene ontology categories that we observe in other SCFA treatments (Supplemental Figure 4A-B). For example, by gene set enrichment analysis, other SCFA treatments enrich for gene categories related to nucleotide and ATP synthesis, while propionate does not. However, all SCFA treatments are enriched in categories related to the ribosome and rRNA (Supplemental Figure 4B-C). Together, this data suggests that propionate has largely similar effects to butyrate treatment in regulating gene expression programs with some distinct differences.

      New Supplemental Figure 8D.

      Along the same lines as comment #2, other possible "functions" of propionate and/or butyrate that could explain why treatment with them increases histone acetylation, propionylation, and butyrylation are not discussed. This work was not cited/discussed: PMID 34677127 despite being very closely related and relevant. Indeed, there seems to be some redundancy of efforts between that paper (2021) and this one even in terms of the specific experiments performed.

      Response: Thank you for this comment, and we sincerely apologize for our oversight in not citing this important work. We are very familiar with this paper, and this was an unfortunate accidental oversight. We have now cited it throughout the text in lines 51, 123, and 330. In addition, we expanded our discussion about how our single treatments of butyrate or propionate increase levels of multiple histone acyl marks including acetylation, butyrylation, and propionylation. We now include activation of p300 as a potential mechanism for this observation in lines 327-330: "This is consistent with the role of butyryl-CoA and propionyl-CoA functioning as activators of p300 acetyltransferase activity, where these molecules can directly stimulate p300 auto-acylation and acetylation activity on histones and other substrates12" Lastly, while we agree that many of our treatments are similar to this paper, we also feel that our downstream analysis is distinct, as we are focusing on genomic localization and gene expression changes, in addition to changes in levels of the histone marks themselves. We believe that this distinction lessens the redundancy between our papers and may be of interest to the chromatin field.

      An analysis for correlations between the ChIP-seq data for H3K27bu (Fig 2) and RNA-seq data following butyrate treatment (Fig 3) would provide further insights into whether the genes/pathways that are enriched/downregulated in H3K27bu ChIP-seq data correlate with genes/pathways that are upregulated/downregulated in RNA-seq data.

      Response: We really appreciate this suggestion and agree that this analysis would add important additional insights. We have therefore performed this analysis through binning genes by expression level and analyzed occupancy of H3K27bu according to gene expression quartiles, which is now included as Figure 4B. Additionally, we included the other histone butyrylation and propionylation marks that are the focus of our manuscript. We have found that levels of H3K27bu occupancy are correlated with high gene expression quartiles. Importantly, this is also consistent with our earlier work in primary mouse intestinal cells (PMID: 38413806).
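      The binning described above can be sketched as follows: genes are ranked by expression, split into quartiles, and the mean mark occupancy is summarized per quartile. This is a minimal illustration on hypothetical toy values, not the authors' pipeline. The function name `occupancy_by_expression_quartile` and the input layout (one expression value and one occupancy value per gene) are assumptions.

```python
from statistics import mean, quantiles


def occupancy_by_expression_quartile(genes):
    """Mean occupancy per expression quartile.

    genes: list of (expression, occupancy) pairs, one per gene.
    Returns a dict mapping quartile number (1 = lowest expression,
    4 = highest) to the mean occupancy of genes in that quartile.
    """
    expression = [expr for expr, _ in genes]
    q1, q2, q3 = quantiles(expression, n=4)  # three quartile cut points

    bins = {1: [], 2: [], 3: [], 4: []}
    for expr, occupancy in genes:
        # Count how many cut points the gene exceeds to get its quartile
        quartile = 1 + (expr > q1) + (expr > q2) + (expr > q3)
        bins[quartile].append(occupancy)

    return {q: mean(values) for q, values in bins.items() if values}


# Hypothetical toy values: (expression, H3K27bu occupancy signal)
toy_genes = [
    (1, 0.5), (2, 0.7), (3, 1.0), (4, 1.2),
    (5, 2.0), (6, 2.2), (7, 3.5), (8, 3.9),
]

quartile_means = occupancy_by_expression_quartile(toy_genes)
for quartile, value in sorted(quartile_means.items()):
    print(f"Q{quartile}: mean occupancy {value:.2f}")
```

      In this toy input the mean occupancy rises from the lowest to the highest quartile, which is the pattern a positive association between a mark and expression level would produce.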

      Minor comments

      1. All the images appear to be very low resolution. This could be due to the online submission system.

      Response: We apologize for this issue and believe it is due to the submission system.

      For Fig 2, the caption says "...treated with different SCFAs for 24 hours," but it is unclear precisely what the treatment was. Were the cells treated with the SCFA mix, and then ChIP-seq was performed for the 5 different marks tested? Or were there different SCFA treatments performed for each mark that was ChIPed?

      Response: We have revised the text of the figure legend to make it clear that we treated cells with individual SCFAs (propionate for propionylation marks and butyrate for butyrylation marks).

      Line 99-100: "Treatment with butyrate, propionate, or a mixture of all three SCFAs resulted in a global increase in histone butyrylation or propionylation" is misleading. The authors test only specific sites on Histone H3 using site-specific antibodies and do not test whether these treatments increase global levels of acylation on other histones and sites using pan-acyl antibodies. So, this sentence needs to be rephrased to clearly indicate that the treatments only increased at the tested sites.

      Response: Thank you for this comment. We understand this was misleading and that was not our intention at all. By writing "global levels," we simply meant levels of immunoblotting signal at these specific lysine residues. We have therefore revised the text to make it clearer (now in lines 102-104): "Treatment with butyrate, propionate, or a mixture of all three SCFAs resulted in significant increases of histone butyrylation and propionylation at select residues of histone H3, as assayed by immunoblotting".

      Reviewer #3 (Significance (Required)):

      Strengths and limitations: The experiments in the study were performed with a high degree of rigor, including appropriate controls. The discussion of the -seq data in Figs 2-4 avoided focusing on or following up on specific genes, which limited the conclusions from these data to being very broad. A key paper (that was not recent) was missing from the context presented in the paper, weakening the discussion of the data presented.

      Advance: The advance is pretty conceptually incremental. Similar experiments as in Fig 1-3 in similar models have been performed in other papers already (e.g., PMID 39789354 in 2025 and PMID 34677127 in 2021), although Fig 4 was an interesting experiment that helps differentiate the work from existing literature.

      Audience: This work would be interesting to a chromatin audience as well as a microbiome audience, but the scope of the conclusions from this paper, and its redundancy with other literature, will limit its profile.

      My expertise is in histone PTM biochemistry and biology, including non-canonical histone acyl PTMs.

      Response: We really appreciate the thoughtful and constructive comments and the recognition that this story may be of interest to the chromatin and microbiome audiences. In addition, we acknowledge other similar recent work that is also very interesting, but we also feel that our manuscript is distinct in several important ways from these studies. In particular, these include the analysis of gene expression changes that we propose to be histone butyrylation dependent vs. through HDAC inhibition (Figure 5, previously Figure 4) and the finding that butyrate drives SCFA combination gene expression changes (Figure 3). We are very grateful for the recognition of these interesting findings by this reviewer. Furthermore, we also want to highlight that we have expanded our analysis of human tissues (Supplemental Figure 1), which adds additional novelty to this work.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary: The authors explore the effects of short-chain fatty acids (SCFAs) acetate, propionate, and butyrate on chromatin and gene expression in human colon cancer cells. The authors first characterize the presence of histone propionylation and histone butyrylation in different colon cancer cell lines as a function of SCFA treatments. Then, they perform ChIP-seq to determine the genomic localization of these marks and observe that these marks are deposited on euchromatic regions similar to H3K4Me3 and to one another, consistent with previous reports. The authors then performed gene expression analysis to determine the contribution of the SCFAs. Interestingly, they observe that butyrate treatment alone mimicked the gene expression profile of an equimolar mixture of short-chain fatty acids treatment, at least in the tested cell lines. Finally, the authors designed an experiment to try to separate the functions of butyrate on gene expressions that are dependent on p300/CBP and are independent of the HDAC inhibition property. The following aspects of the paper need addressing-

      Major comments

      1. There is no confirmation of the validity of the results seen from ChIP-seq (Figure 2) and RNA-seq (Figure 3). The majority of the findings of the paper are derived from ChIP-seq and RNA-seq data, and hence, experiments validating such results need to be established. ChIP-qPCR for representative gene(s) with adequate controls needs to be performed for different acyl marks (H3K27bu, H3K27pr, H3K4Me3, H3K9pr, H3K9bu) to support the ChIP-seq results, and RT-qPCR for representative gene(s) for different treatment conditions (vehicle, acetate, propionate, butyrate, and 5 mM 1:1:1 mixture) for validating RNA-seq results.
      2. The authors describe an interesting strategy to differentiate the different functions of butyrate (Figure 4). The authors propose that differential genes that change with p300/CBP inhibitor treatment, and that are separate from HDAC inhibitor treatment, are potential genes that are a function of histone butyrylation. An important control that is missing in this experiment is treatment of cells with propionate. In their previous findings (Figure 1C-D), they note that both propionate and butyrate treatments elevate the levels of histone acetylation, propionylation, and butyrylation. But the HDAC inhibitory activity of propionate is not very well established, and performing experiments to prove it is beyond the scope of this paper. Importantly, p300/CBP has been shown to catalyze histone propionylation with higher efficiency compared to histone butyrylation (PMID: 27820805, PMID: 29070843). Therefore, it would be ideal to include differentially expressed genes from propionate-treated cells in the analysis to rule out any discrepancy.
      3. Along the same lines as comment #2, other possible "functions" of propionate and/or butyrate that could explain why treatment with them increases histone acetylation, propionylation, and butyrylation are not discussed. This work was not cited/discussed: PMID 34677127 despite being very closely related and relevant. Indeed, there seems to be some redundancy of efforts between that paper (2021) and this one even in terms of the specific experiments performed.
      4. An analysis for correlations between the ChIP-seq data for H3K27bu (Fig 2) and RNA-seq data following butyrate treatment (Fig 3) would provide further insights into whether the genes/pathways that are enriched/downregulated in H3K27bu ChIP-seq data correlate with genes/pathways that are upregulated/downregulated in RNA-seq data.

      Minor comments

      1. All the images appear to be very low resolution. This could be due to the online submission system.
      2. For Fig 2, the caption says "...treated with different SCFAs for 24 hours," but it is unclear precisely what the treatment was. Were the cells treated with the SCFA mix, and then ChIP-seq was performed for the 5 different marks tested? Or were there different SCFA treatments performed for each mark that was ChIPed?
      3. Line 99-100: "Treatment with butyrate, propionate, or a mixture of all three SCFAs resulted in a global increase in histone butyrylation or propionylation" is misleading. The authors test only specific sites on Histone H3 using site-specific antibodies and do not test whether these treatments increase global levels of acylation on other histones and sites using pan-acyl antibodies. So, this sentence needs to be rephrased to clearly indicate that the treatments only increased at the tested sites.

      Significance

      Strengths and limitations: The experiments in the study were performed with a high degree of rigor, including appropriate controls. The discussion of the -seq data in Figs 2-4 avoided focusing on or following up on specific genes, which limited the conclusions from these data to being very broad. A key paper (that was not recent) was missing from the context presented in the paper, weakening the discussion of the data presented.

      Advance: The advance is pretty conceptually incremental. Similar experiments as in Fig 1-3 in similar models have been performed in other papers already (e.g., PMID 39789354 in 2025 and PMID 34677127 in 2021), although Fig 4 was an interesting experiment that helps differentiate the work from existing literature.

      Audience: This work would be interesting to a chromatin audience as well as a microbiome audience, but the scope of the conclusions from this paper, and its redundancy with other literature, will limit its profile.

      My expertise is in histone PTM biochemistry and biology, including non-canonical histone acyl PTMs.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This study presents a novel finding that short-chain fatty acids (SCFAs) produced by microbial metabolism regulate gene transcription in human colon cancer cells by modulating histone H3K9 and H3K27 butyrylation and propionylation, both of which are associated with an open chromatin state. The authors further reveal that the major effect of the SCFA mixture is driven by butyrate and identify p300/CBP-dependent, rather than HDAC inhibition-dependent, gene regulation by butyrate. Overall, this is a well-organized study that provides valuable insight into the role of metabolites in human cells.

      Major comments:

      1. In Figures 1C and 1D, why did the SCFA mixture not increase histone butyrylation or propionylation to the same level as single butyrate treatment?
      2. In Figure 3B, how does butyrate block the effects of acetate and propionate on transcription?
      3. Which pathways are associated with acetate- and propionate-specific DEGs?
      4. Which genes are related to growth inhibition in butyrate-treated cells? Does the 1:1:1 SCFA mixture have a similar impact on cell growth as single butyrate treatment?

      Significance

      General assessment: This study clearly demonstrates the role of butyrate in gene regulation and elucidates its underlying regulatory mechanisms. However, it does not provide insight into how butyrate counteracts the effects of acetate and propionate, despite these metabolites often being detected together. In addition, it remains unclear which specific histone PTMs are associated with the distinct gene expression changes induced by different short-chain fatty acids. Lastly, the observation that histone butyrylation and propionylation correlate with active transcription is not novel.

      Advance: This study advances understanding of short-chain fatty acids in chromatin and gene regulation, highlighting butyrate's dominant role and its p300/CBP-dependent rather than HDAC inhibition-dependent mechanism.

      Audience: This work may attract significant interest in both the epigenetics and metabolism fields.

      My expertise: histone acetylation, HATs, transcriptional regulation, cancer

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, Kabir et al. explore the impact of microbiota-derived short-chain fatty acids (SCFAs) on chromatin structure and gene expression in human cells. They show that SCFAs, particularly butyrate, contribute to specific histone modifications such as butyrylation at H3K27, detectable in human colon tissue. Additional modifications like acetylation, butyrylation, and propionylation at H3K9 and H3K27 respond to SCFA levels and are enriched at active regulatory regions in colorectal cancer cells. Treatment with individual or combined SCFAs mimicking gut conditions alters gene expression patterns, with butyrate playing a dominant regulatory role. Butyrate's effects on gene expression are claimed to be independent of HDAC inhibition and instead rely on the p300/CBP complex through histone butyrylation. These findings underscore SCFAs as crucial modulators of epigenetic regulation in the human colon and highlight butyrate's dominant role in shaping chromatin and gene regulation beyond its known metabolic functions.

      The authors used two human cell lines and an in vivo murine model paired with RNA and ChIP sequencing approaches to identify target genes and chromatin modifications in response to SCFAs. While the findings are interesting and could provide important insights into the epigenetic influence of SCFAs in human cells, the study would benefit from additional experiments to strengthen the conclusions. Comments and suggestions are listed below:

      1. Figure 1: The H3K27bu expression in human biopsies highlights the clinical significance of the current study. However, the authors need to provide more information on the human colon samples, e.g., how many total patients were analyzed, and what were the age and/or sex. Only the methods mention the use of benign TMA; this should also be clarified in the figure legends. It would also be helpful to show histone butyrylation levels in normal vs. cancer human tissues.
      2. Figure 1: In addition, given that the butyrate level descends towards the base of the colonic crypt (with the highest at the top of the crypt where mature intestinal epithelial cells reside) (Kaiko et al., 2016), it is important to show how the H3K27bu signature is distributed along the crypt. This data would further emphasize the clinical relevance of this study, given that most colorectal cancers (CRCs) arise from stem and progenitor cells.
      3. Throughout the manuscript: The rationale for selecting the two CRC cell lines (HCT 116 and Caco2) should be explained. While commonly used, providing background on their genetic differences (e.g., driver mutations) is important, as this could greatly influence the PTM landscape.
      4. The study lacks additional controls, such as a normal colon epithelial cell line and a non-colonic cell type. Including these would help determine whether the observed butyrate effects are tissue- or disease-specific. This data would also help assess whether SCFA effects, and specifically butyrate's effects, on histone acylation and gene expression are systemic or local.
      5. Figure 2: The authors show ChIP-seq results in the HCT 116 cell line. To exclude the possibility that the demonstrated chromatin signatures are cell line-specific, results from Caco2 should also be shown. In addition, the 2D environment and multiple passaging alter gene expression in cell lines; using human colonic organoids would provide a more clinically and physiologically relevant model.
      6. Figure 4 is very confusing. Entinostat itself, as an HDAC inhibitor (iHDAC), increases butyrylation. The data shown are insufficient to draw conclusions. First, the authors should use additional iHDACs, and second, they should illustrate the overlap in gene expression changes between all treatments using a Venn diagram to clarify which genes/signatures are specific to each treatment.
      7. Figure 4: The authors use an HDAC inhibitor to rule out butyrate's effect on gene expression via HDAC inhibition. However, butyrate can also modulate gene expression through activation of GPR109a. Using GPR109a antagonists is necessary to address this possibility. These data are essential to validate the specific role of histone butyrylation in gene regulation.
      8. Supplementary Figure 4 and manuscript: There is no in vivo methods section describing the tributyrin-gavaged mice. The authors should clarify how the experiment was performed, how cells were isolated, whether sorting was performed, and which markers were used.
      9. Supplementary Figure 4: The GO analysis results show that lipid catabolism is among the top differentially enriched pathways. Butyrate is a known PPARγ agonist (Litvak et al., 2018), and activation of PPARγ is known to drive expression of genes involved in lipid metabolism. The authors need to rule out this function of butyrate before attributing this signature solely to histone butyrylation.
      10. It would be helpful to include a table of differentially abundant genes as a supplement to the heatmaps and GO analysis.

      Significance

      This study explores how microbiota-derived SCFAs, particularly butyrate, influence histone acylation and gene regulation. While the topic is relevant, the work lacks important controls (e.g., normal epithelial and non-colonic cells) and omits mechanistic validation (e.g., GPR109a signaling, PPARγ involvement). The rationale for cell line selection is unclear, and in vivo methods are insufficiently described.

      Audience:

      The study will mainly interest specialists in microbiota-chromatin interaction. Broader impact is limited by the narrow model scope and underdeveloped mechanistic insight.

      My Expertise:

      Cancer biology, in vivo models, microbiota-host interactions.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements [optional]

      We thank the reviewers for their insightful and constructive comments, which have substantially strengthened the manuscript. We have addressed all concerns and replaced the previous non-quantitative RNA-seq analysis with a new analysis that allowed for quantitative assessment. We were encouraged to find that the revised analysis not only confirmed our original observations but also reinforced and extended our conclusions.

      2. Point-by-point description of the revisions

      Reviewer #1


      Significance

      Comment 1: At its current stage, this work represents a robust resource for molecular parasitology research programs, paving the way for mechanistic studies on multilayered gene expression control and it would benefit from experimental evidence for some of the claims concerning the in silico regulatory networks. Terms like "regulons", "recursive feedback loop" are employed without solid confirmation or extensive literature support. In my view, the most relevant contribution of this study is centered in the direct association between proteasome-dependent degradation and Leishmania differentiation.

      Response: We thank the reviewer for acknowledging the impact of our work as a robust resource for further mechanistic studies. We agree that the new concepts emerging from our multilayered analysis should be experimentally assessed. However, given the scope of our analysis (i.e. a complete systems-level analysis of bona fide, hamster-isolated L. donovani amastigotes and derived promastigotes) and the amount of data presented in the current manuscript, such functional genetic analysis will merit an independent, in-depth investigation. The current version has been considerably toned down and modified to emphasize the impact of our work as a powerful new resource for downstream functional analyses.


      Evidence, reproducibility and clarity

      Comment 1: The narrative becomes somewhat diffuse with the shift to putative multilevel regulatory networks, which would benefit from further experimental validation.

      Response: We agree with the reviewer and toned down the general discussion while suggesting putative multilevel regulatory networks for follow-up, mechanistic analyses. We now emphasize those networks for which evidence in trypanosomatids and other organisms has been published. Experimental validation of some of these regulatory networks is outside the scope of our manuscript and will be pursued as part of independent investigations.

      Major issues

      Comment 1: Fig.1D suggests a significant portion of the SNPs are exclusive, with a frequency of zero in one of the two stages. Were only the heterozygous and minor alleles plotted in Fig.1D, since frequencies close to 1 are barely observed? Is the same true in Sup Fig. S2B? Why do chrs 4 and 33 show unusual patterns in S2B?

      Response: We thank the reviewer for this observation. The SNPs exclusive to either one or the other stage are likely the result of the 10% cutoff we use for this kind of analysis (eliminating SNPs that lack sufficient support, i.e. less than 10 reads). Due to bottleneck events (such as in vitro culture or stage differentiation), many low-frequency SNPs are either 'lost' (filtered out) or 'gained' (passing the 10% cutoff) between the ama and pro samples. All SNPs above 10% were plotted. The absence of SNPs at 100% is one of the hallmarks of the Ld1S L. donovani strain we are using. Instead, these parasites show a majority of SNPs at a frequency of around 50%, which is likely a sign of a previous hybridization event. Chr 4 and chr 33 show a very low SNP density, most likely because they went through a transient monosomy at some point in their evolutionary history, causing loss of heterozygosity. We now explain these facts in the figure legend.


      Comment 2: Chr26 revealed a striking contrasting gene coverage between H-1 and the other two samples. While a peak is observed for H-1 in the middle of this chr, the other two show a decrease in coverage. Is there any correlation with the transcriptomic/proteomic findings?

      Response: This analysis is based on normalized median read depth, taking somy variations into account. This is now more clearly specified in the figure legend. We do not see any significant expression changes that would correlate with the observed (minor) read depth changes. As indicated in the legend, we do not consider such small fluctuations (less than +/- 1.5-fold) as significant. The reversal of the signal for chr 26 sample H1 eludes us (but again, these fluctuations are minor and not observed at the mRNA level).

      Comment 3: The term "regulon" is used somewhat loosely in many parts of the text. Evidence of co-transcriptomic patterns alone does not necessarily demonstrate control by a common regulator (e.g., RNA-binding protein), and therefore does not fulfill the strict definition of a regulon. It should be clear whether the authors are highlighting potential multiple inferred regulons within a list of genes or not. Maybe functional/ gene module/cluster would be more appropriate terms.

      Response: We thank the reviewer for this important comment. We replaced 'regulon' throughout the manuscript by 'co-regulated, functional gene clusters' (or similar).

      Comment 4: It is unclear whether the findings in Fig.3E are based on previous analysis of stage-specific rRNA modifications or inferred from the pre-snoRNA transcriptomic data in the current work or something else. I struggle to find the significance of presenting this here.

      Response: We thank the reviewer for this comment. Yes, these data show stage-specific rRNA modifications based on previous analyses that mapped stage-specific differences of pseudouridine (Ψ) (Rajan et al., Cell Reports 2023, DOI: 10.1016/j.celrep.2024.114203) and 2'-O-modifications (Rajan et al., Nature Com, in revision) by various RNA-seq analyses and cryoEM. This figure has been modified in the revised version to consider the identification of stage-regulated snoRNAs in our new and statistically robust RNA-seq analysis. These data are shown to further support the existence of stage-regulated ribosomes that may control mRNA translatability, as suggested by the enriched GO terms 'ribosome biogenesis', 'rRNA processing' and 'RNA methylation' shown in Figure 2. We better integrated these analyses by moving the panels from Figure 3 to Figure 2.

      Comment 5: The protein turnover analysis is missing the critical confirmation of the expected lactacystin activity on the proteasome in both ama and pro. A straightforward experiment would be an anti-polyUb western blotting using a low concentration SDS-PAGE or a proteasome activity assay on total extracts.

      Response: We thank the reviewer for this comment and have now included an anti-polyUb Western blot analysis (see Fig S7).

      Comment 6: The viability tests upon lactacystin treatment need a positive control for the PI and the YoPro staining (i.e., permeabilized or heat-killed promastigotes).

      Response: This control is now included in Fig S7 and we have added the corresponding description to the text.

      Comment 7: I found that the section on regulatory networks was somewhat speculative and less focused. Several of the associated conclusions are, in some parts, overstated, such as in "uncovered a similar recursive feedback loop" (line 566) or "unprecedented insight into the regulatory landscape" (line 643). It would be important to provide some form of direct evidence supporting a functional connection between phosphorylation/ubiquitination, ribosome biogenesis/proteins and gene expression regulation.

      Response: We agree with the reviewer and have considerably toned down our statements. Functional analyses to investigate and validate some of the shown network interactions are planned for the near future and will be published separately.

      Minor issues

      1) The ordinal transition words "First,"/"Second," are used too frequently in explanatory sections. I noted six instances. I suggest replacing or rephrasing some to improve flow.

      Response: Rectified, thanks for pointing this out.

      2) Ln 168: Unformatted citations were given for the Python packages used in the study.

      Response: Rectified, thanks for pointing this out.

      3) Fig.1D: "SNP frequency" is the preferred term in English.

      Response: Corrected.

      4) Fig.2A: not sure what "counts}1" mean.

      Response: This figure has been replaced.

      5) Ln 685: "Transcripts with FC 0.01 are represented by black dots" -> This sentence is inaccurate. The intended wording might be: "Transcripts with FC 0.01 are represented by black dots"

      Response: We thank the reviewer and corrected accordingly.

      6) Ln 698: Same as ln 685 mentioned above.

      Response: We thank the reviewer and corrected accordingly.

      7) Fig.2B and elsewhere: The legend key for the GO term enrichment is a bit confusing. It seems like the color scales represent the adj. p-values, but the legend keys read "Cluster efficiency" and "Enrichment score", while those values are actually represented by each bar length. Does light blue correspond to a max value of 0.05 in one scale, and dark blue to a max value of 10-7 in the other scale?

      Response: This was corrected in the figure and the legends were updated accordingly.

      8) Sup Figure S3A and S4A: The hierarchical clustering dendrograms are barely visible in the heatmaps.

      Response: Thanks for the comment. Figure S3 was removed and replaced by a hierarchical clustering and a PCA plot.

      9) S3A Legend: The following sentence sounds a bit awkward: "Rows and columns have been re-ordered thanks to a hierarchical clustering". I suggest switching "thanks to a hierarchical clustering" to "based on hierarchical clustering".

      Response: This figure was removed and the legend modified.

      10) Fig.5D: The font size everywhere except the legend key is too small. In addition, on the left panel, gene product names are given as a column, while on the right, the names are shown below the GeneIDs. Consistency would make it clearer.

      Response: Thank you, this is now rectified. To ensure readability, we reduced the number of shown protein kinase examples.



      Reviewer #2

      Evidence, reproducibility and clarity

      Comment 1: In the absence of riboprofiling the authors return to the RNA-seq to assess the levels of pre-Sno RNA (the role of the could be more explicitly stated).

      Response: We thank the reviewer for this comment. We moved the snoRNA analysis from Fig 3 to Fig 2 (see also the similar comment of reviewer 1), which better integrates and justifies this analysis. Based on the new and statistically robust RNA-seq analysis, the volcano plot showing differential snoRNA expression and possible ribosome modification has been adjusted (Figures 2C and D).

      Comment 2: The authors provide a clear and comprehensive description of the data at each stage of the results and this is woven together in the discussion, allowing hypotheses to be formed on the potential regulatory and signalling pathways that control the differentiation of amastigotes to promastigotes. Given the amount and breadth of data presented the authors are able to present a high-level assessment of the processes that form feedback loops and/or intersectional signalling, but specific examples are not picked out for deeper validation or exploration.

      Response: We thank the reviewer for acknowledging the amount and breadth of data presented. As indicated above (see responses to reviewer 1), mechanistic studies will be conducted in the near future to validate some of the regulatory interactions. These will be the subject of separate publications. As noted above (response to reviewer 1), we toned down the general discussion, suggest follow-up mechanistic analyses and emphasize those networks for which evidence in trypanosomatids and other organisms has been published.

      Major comments:

      Are the claims and the conclusions supported by the data or do they require additional experiments or analyses to support them?

      Comment 1: As I have understood it from the description in the text, and in Data Table 4, the RNA-seq element of the work has only been conducted using two replicates. If this is the case, it would substantially undermine the RNA-seq and the inferences drawn from it. Minimum replicates required for inferential analysis is 3 bio-replicates and potentially up to 6 or 12. It may be necessary for the authors to repeat this for the RNA-seq to carry enough weight to support their arguments. (PMID: 27022035)

      Response: We agree with the reviewer and conducted a new RNA-seq analysis with 4 independent biological replicates of spleen-purified amastigotes and derived promastigotes. Given the robustness of the stage-specific transcriptome, and the legal constraints associated with the use of animals, we chose to limit the number of replicates to the minimum necessary. We thank the reviewer for this important comment. The new data not only confirm the previous ones (providing a high level of robustness to our data) but also allowed us to increase the number of identified stage-regulated snoRNAs, thus further supporting a possible role of ribosome modification in Leishmania stage development.

      Comment 2: There are several examples that are given as reciprocal or recursive signalling pathways, but these are not followed up with independent, orthogonal techniques. I think the paper currently forms a great resource to pursue these interesting signalling interactions and is certainly more than just a catalogue of modifications, but to take it to the next level ideally a novel signalling interaction would be demonstrated using an orthogonal approach. Perhaps the regulation of the ribosomes could have been explored further (same teams recently published related work on this). Or perhaps more interestingly, a novel target(s) from the ubiquitinated protein kinases could have been explored further; for example making precision mutants that lack the ubiquitination or phosphorylation sites - does this abrogate differentiation?

      Response: We agree with the reviewer that the paper currently forms a great resource. In-depth molecular analysis investigating key signaling pathways and regulatory interactions are outside the scope of the current multilevel systems analysis but will be pursued in independent investigations.

      Comment 3: I found the use of lactacystin a bit curious as there are more potent and specific inhibitors of Leishmania proteasomes e.g. LXE-408. This could be clarified in the write-up (See below).

      Response: We thank the reviewer for this comment. We opted for the highly specific and irreversible proteasome inhibitor lactacystin that has been previously applied to study the Leishmania proteasome (PMID: 15234661) rather than the trypanosomatid-specific drug candidate LXE408, as the strong cytotoxic effect of the latter makes it difficult to distinguish between direct effects on protein turnover and secondary effects resulting from cell death, limiting its utility for dissecting proteasome function in living parasites. We have added this information in the Results section.

      Comment 4: If it is the case that only 2 replicates of the RNA-Seq have been performed it really is not the accepted level of replication for the field. Most studies use a minimum of 3 bioreplicates and even a minimum of 6 is recommended by independent assessment of DESeq2.

      Response: See response to comment 1 above.


      Comment 5: As far as I could see, the cell viability assay does not include a positive control that shows it is capable of detecting cytotoxic effects of inhibitors. Add treatment showing that it can differentiate cytostatic vs cytotoxic compound.

      Response: This control has now been added to Fig S7.

      If you have constructive further reaching suggestions that could significantly improve the study but would open new lines of investigations, please label them as "OPTIONAL". Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated time investment for substantial experiments.

      Comment 6: It is realistic for the authors to validate the cell viability assay. If the RNA-seq needs to be repeated then this would be a substantial involvement.

      Response: Redoing the RNA-seq analysis was entirely feasible and very much improved the robustness of our results.

      Are the data and the methods presented in such a way that they can be reproduced?

      Comment 7: All the methods are written to a good level of detail. The sample prep, acquisition and data analysis of the protein mass spectrometry contained a high level of detail in a supplemental section. The authors should be more explicit about the amount of replication at each stage, as in parts of the manuscript this was quite unclear.

      Response: We thank the reviewer for this comment and explicitly state the number of replicates in Methods, Results and Figure legends for all analyses. The number of replicates for each analysis is further shown in the overview Figure S1.

      Are the experiments adequately replicated and statistical analysis adequate?

      Comment 8: Unless I have misunderstood the manuscript, I believe the RNA-seq dataset is underpowered according to the number of replicates the authors report in the text.

      Response: See response to comment 1 above.

      Comment 9: Looking at Figure 1 and S1 and Data Table 4 to show the sample workflow I was surprised to see that the RNA-seq only used 2 replicates. The authors do show concordance between the individual biological replicates, but I would consider that only having 2 is problematic here, especially given the importance placed on the mRNA levels and linkage in this study. This would constitute a major weakness of the study, given that it is the basis for a crucial comparison between the RNA and protein levels.

      Response: We agree and have repeated the RNAseq analysis using four independent biological replicates - see response to comment 1.

      Comment 10: It also wasn't clear to me how many replicates were performed at each condition for the lactacystin treatment experiment - can the authors please state this clearly in the text, it looks like 4 replicates from Figure S1 and Data Table 8.

      Response: Indeed, we did 4 replicates. This is now clarified in Methods, Results and Figure legends and shown in Figure S1.

      Comment 11: Four replicates are used for the phosphoproteomics data set, which is probably ok, but other researchers have used a minimum of 5 in phosphoproteomics experiments to deal with the high level of variability that can often be observed with low abundance proteins & modifications. The method for the phosphoproteomics analysis suggests that a detection of a phosphosite in 1 sample (also with a localisation probability of >0.75) was required for then using missing value imputation of other samples. This seems like a low threshold for inclusion of that phosphosite for further relative quantitative analysis. For example, Geoghegan et al (2022) (PMID: 36437406) used a much more stringent threshold of greater than or equal to 2 missing values from 5 replicates as an exclusion criterion for detected phosphopeptides. Please correct me if I misunderstood the data processing, but as it stands the imputation of so many missing values (potentially 3 of 4 per sample category) could be reducing the quality of this analysis.

      Response: We thank the reviewer for this remark and for highlighting best practices in phosphoproteomics data analysis. Unlike other studies that use cultured parasites and thus have access to unlimited amounts, our study employs bona fide amastigotes isolated from infected hamster spleens. In France, the use of animals is tightly controlled and only the minimal number of animals to obtain statistically significant results is tolerated (and necessary to obtain permission to conduct animal experiments).

      Regarding the number of biological replicates, we would like to emphasize that the use of four biological replicates is fully acceptable and used in quantitative proteomics and phosphoproteomics, particularly when combined with high-quality LC-MS/MS data and stringent peptide-level filtering. While some studies indeed employ five or more replicates, this is not a strict requirement, and many high-impact phosphoproteomics studies have successfully relied on four replicates when experimental quality and depth are high. In the present study, we adopted a discovery-oriented approach, aimed at detecting as many confidently identified phosphopeptides as possible. The consistency between replicates, combined with the depth of coverage and signal quality, indicates that four replicates are adequate for both the global proteome and the phosphoproteome in this context. Importantly, the quality of the MS data in this study is supported by (i) a high number of confidently identified peptides and phosphopeptides (identification FDR0.75), and (iii) reproducible quantitative profiles across replicates. Notably, most of the identified phosphopeptides are quantified in at least two replicates within a given condition (between 73.2% and 83.4% of all the identified phosphopeptides among replicates of the same condition).

      Regarding missing value imputation, we appreciate that our initial description may have been unclear and we have revised the Methods to avoid misunderstanding. Phosphosites were only considered if detected with high confidence (identification FDR0.75) in at least one replicate. This criterion was chosen to retain biologically relevant, low-abundance phosphosites, which are more difficult to identify and are often stochastically sampled in phosphoproteomics datasets. For statistical analyses, missing values within a given condition were imputed with a well-established algorithm (MLE) only when at least one observed value was present in that condition. Notably, they were replaced by values in the neighborhood of the observed intensities, rather than by globally low, noise-like values.

      We agree that more stringent exclusion rules, such as those used by Geoghegan et al. (2022), are appropriate in some contexts. However, there is no universally accepted standard for missingness thresholds in phosphoproteomics, and different strategies reflect trade-offs between sensitivity and stringency. In our discovery-oriented approach, we deliberately prioritized biological coverage while maintaining data quality. Our main conclusions are supported by coherent biological patterns, rather than by isolated phosphosite measurements.
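The trade-off between a lenient and a stricter inclusion rule can be illustrated with a minimal sketch. It is not the pipeline used in the study: the function names, the example intensities and the stricter cutoff of 3 observed values are hypothetical, chosen only to show how the same phosphosite passes one rule and fails the other.

```python
import math


def n_observed(values):
    """Count non-missing intensities (None or NaN marks a missing value)."""
    return sum(1 for v in values if v is not None and not math.isnan(v))


def keep_site(values, min_observed):
    """Keep a phosphosite in one condition if enough replicates detected it."""
    return n_observed(values) >= min_observed


# Hypothetical log2 intensities for one phosphosite across 4 replicates.
site = [21.3, None, None, None]

lenient = keep_site(site, min_observed=1)   # at least one observed value
stricter = keep_site(site, min_observed=3)  # hypothetical stricter cutoff
print(lenient, stricter)
```

Under the lenient rule the site is retained and three of its four values would be imputed; under the stricter rule it would be dropped before imputation.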


      Comment 12: For the metabolomics analysis it looks like 2 amastigote samples were compared against 4 promastigote samples. Why not triplicates of each?

      Response: We thank the reviewer for noticing this point. It is an error in the figure file (Sup figure S1). Four biological replicates of splenic amastigotes were prepared (H130-1, H130-2, H133-1 and H133-2). Amastigotes from 2 biological replicates (H131-1 and H131-2) were seeded for differentiation into promastigotes in 4 flasks (2 per biological replicate) that were collected at passage 2. We have updated the figure file accordingly.

      Minor comments:

      Specific experimental issues that are easily addressable. Are prior studies referenced appropriately?

      Comment 1: Yes

      Are the text and figures clear and accurate?

      Comment 2: The write up is clear, with the data presented coherently for each method. The analyses that link everything together are well discussed. The figures are mostly clear (see below) and are well described in the legends. There is good use of graphics to explain the experimental designs and sample names - although it is unclear if technical replicates are defined in these figures.

      Response: We thank the reviewer for these positive comments. We have now included the information on replicates in the overview figure (Figure S1).

      Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      Comment 3: As I have understood it, the authors have calculated the "phosphostoichiometry" using the ratio of change in the phosphopeptide to the ratio of the change in total protein level changes. This is detailed in the supplemental method (see below). Whilst this has normalised the data, it has not resulted in an occupancy or stoichiometry measurement, which are measured between 0-1 (0% to 100%). The normalisation has probably been sufficient and useful for this analysis, but this section needs to be re-worded to be more precise about what the authors are doing and presenting. These concepts are nicely reviewed by Muneer, Chen & Chen 2025 (PMID: 39696887) who reference seminal papers on determination of phosphopeptide occupancy - and may be a good place to start. An alternative phrase should be used to describe the ratio of ratios calculated here, not phosphostoichiometry.

      Response: We thank the reviewer for this insightful comment and fully agree with the conceptual distinction raised. The reviewer is correct that the approach used in this study does not measure absolute phosphosite occupancy or stoichiometry, which would indeed require dedicated experimental strategies and would yield values bounded between 0 and 1 (0-100%). Instead, we calculated a normalized phosphorylation change, defined as the ratio of the change in phosphopeptide abundance relative to the change in the corresponding total protein abundance (a ratio-of-ratios approach - see doi: 10.1007/978-1-0716-1967-4_12), and we tested whether this normalized phosphorylation change differed significantly from zero. This normalization approach is comparable to the one previously published in the "Experimental Design and Statistical Analysis of the Proteome and the Phosphoproteome" section of the following paper (DOI: 10.1016/j.mcpro.2022.100428).

      Our intention was to account for protein-level regulation and thereby better isolate changes in phosphorylation dynamics. While this normalization is informative and appropriate for the biological questions addressed here, we agree that the term "phosphostoichiometry" is imprecise and not correct in this context.

      In response, we (i) replaced the term "phosphostoichiometry" throughout the manuscript with a more accurate description, such as "normalized phosphorylation level", or "relative phosphorylation change normalized to protein abundance", and (ii) revised the corresponding Methods and Results text to clearly state that absolute occupancy was not measured.

      This rewording will improve conceptual accuracy without altering the validity or interpretation of the results.
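As an illustration of the ratio-of-ratios idea described above, here is a minimal sketch on the log2 scale. It is not the statistical model used in the study (which relies on limma); the function names and intensities are hypothetical, and simple means of raw intensities stand in for the fitted abundance estimates.

```python
import math


def log2_fold_change(cond_a, cond_b):
    """log2 of mean(cond_b) / mean(cond_a), computed on raw intensities."""
    mean_a = sum(cond_a) / len(cond_a)
    mean_b = sum(cond_b) / len(cond_b)
    return math.log2(mean_b / mean_a)


def normalized_phospho_change(pep_a, pep_b, prot_a, prot_b):
    """Phosphopeptide log2 fold change minus parent-protein log2 fold change."""
    return log2_fold_change(pep_a, pep_b) - log2_fold_change(prot_a, prot_b)


# Hypothetical intensities: the phosphopeptide rises 8-fold, the protein 2-fold.
change = normalized_phospho_change(
    pep_a=[100.0, 100.0], pep_b=[800.0, 800.0],
    prot_a=[100.0, 100.0], prot_b=[200.0, 200.0],
)
print(change)  # log2(8) - log2(2) = 2.0
```

The result is an unbounded log-ratio centred on zero when phosphorylation simply tracks protein abundance, which is why it is not an occupancy (a value between 0 and 1).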

      Comment 4: From the authors methods describing the ratio comparison approach: "Another statistical test was performed in a second step: a contrasted t-test was performed to compare the variation in abundance of each modified peptide to the one of its parent unmodified protein using the limma R package {Ritchie, 2015; Smyth, 2005}. This second test allows determining whether the fold-change of a phosphorylated peptide between two conditions is significantly different from the one of its parent and unmodified protein (paragraph 3.9 in Giai Gianetto et al 2023). An adaptive Benjamini-Hochberg procedure was applied on the resulting p-values thanks to the adjust.p function of R package cp4p {Giai Gianetto, 2016} using the Pounds et al {Pounds, 2006} method to control the False Discovery Rate level."

      Response: The references have been formatted.
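For readers unfamiliar with the multiple-testing step quoted above, the following minimal sketch shows the standard (non-adaptive) Benjamini-Hochberg adjustment. It is an illustration only: the adaptive variant quoted in the Methods additionally uses an estimate of the proportion of true null hypotheses, which is not reproduced here, and the function name and p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four phosphopeptides.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(adjusted)
```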

      Comment 5: Several aspects of the figures that contain STRING networks are quite useful, particularly the use of colour around the circle of each node to denote different molecular functions/biological processes. However, some have descended into "hairball" plots that convey little useful information that would be equally conveyed in a table, for example. Added to this, the points on the figure are identified by gene IDs which, while clear and incontrovertible, are lacking human readability. I suggest that protein name could be included here too.

      Response: We thank the reviewer for this comment but for readability we opted to keep the figure as is. We now refer to Tables 8, 9, and 12 that allow the reader to link gene IDs to protein name and annotation (if available).

      Comment 6: It is also not clear what STRING data is being plotted here, what are the edges indicating - physical interactions proven in Leishmania, or inferred interactions mapped on from other organisms? Perhaps as supplemental data provide the Cytoscape network files so readers can explore the networks themselves?

      Response: We thank the reviewer for this comment. While the STRING plugin in Cytoscape enables integrated network-based analyses, it represents protein-protein associations as a single edge per protein pair derived from the combined confidence score. Consequently, the specific contribution of individual evidence channels (e.g. experimental evidence, curated databases, co-expression, or text mining) cannot be disentangled within this framework. However, this representation was considered appropriate for the present study, which focused on global network topology and functional enrichment rather than on the interpretation of individual interaction types. The information on stringency has been added to the Methods section and the Figure legends (adding the information on confidence score cutoff).

      We decided not to submit the Cytoscape files as they were generated with previous versions of Cytoscape and the STRING plugin. Based on the differential abundance data shown in the tables it will be very easy to recreate these networks with the new versions for any follow up study.

      Comment 7: The title of columns in table S10 panel A are written in French, which will be ok for many people particularly those familiar with proteomics software outputs, but everything else is in English so perhaps those titles could be made consistent.

      Response: We apologize and have translated the text into English.

      Comment 8: I would suggest that the authors provide a table that has all the gene IDs of the Ld1S2D strain and the orthologs for at least one other species that is in TriTrypDB. This would make it easy to interrogate the data and make it a more useful resource for the community who work on different strains and species of Leishmania. Although this data is available it is a supplemental material file in a previous paper (Bussotti et al PNAS 2021) and not easy to find.

      Response: We thank the reviewer for this very useful suggestion and have added this table (Table S13).

      Comment 9: Figure 5b - from the legend it is not clear where the confidence values were derived in this analysis, although this is explained in the supplemental method. Perhaps the legend can be a bit clearer.

      Response: We have added the following statement to the legend: 'Confidence values were derived as described in Supplementary Methods'.

      Comment 10: Can the authors discuss why lactacystin was used? While this is a commonly used proteasome inhibitor in mammalian cells there is concern that it can inhibit other proteases. At the concentrations (10 µM) the authors used there are off-target effects in Leishmania, certainly the inhibition of a carboxypeptidase (PMID: 35910377) and potentially cathepsins as is observed in other systems (PMID: 9175783). There is a specific inhibitor of the Leishmania proteasome LXE-408 (PMID: 32667203), which comes closer to fulfilling the SGC criteria (PMID: 26196764) for a chemical probe - why not use this. Does lactacystin inhibit a different aspect of proteasome activity compared to LXE-408?

      Response: We have added the following justification to the results section (see also response above to comment 3 for reviewer 2): We chose the highly specific and irreversible proteasome inhibitor lactacystin over the trypanosomatid-specific, reversible drug candidate LXE408 as the latter's potent cytotoxicity can confound direct effects on protein turnover with secondary consequences of cell death, limiting its utility for dissecting proteasome function in living parasites.

      Comment 11: The application of lactacystin is changing the abundance of a multitude of proteins but no precision follow up is done to identify if those proteins are necessary and/or sufficient for driving/blocking differentiation. This could be tested using precision edited lines that are unable to be ubiquitinated. There is a lack of direct evidence that the proteins protected from degradation by lactacystin are ubiquitinated. Perhaps some of these could be tagged and IP'd then probed for ubiquitin signal. Di-Gly proteomics to reveal ubiquitinated proteins? These suggestions should be considered as OPTIONAL experiments in the relevant section above.

      Response: We very much appreciate these very interesting suggestions, which will be considered for ongoing follow-up studies.

      Comment 12: In the data availability RNA-seq section the text for the GEO link is : (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc= GSE227637) but the embedded link takes me to (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE165615) which is data for another, different study. Also, the link to the GEO site for the DNA seq isn't working and manual searches with the archive number (BioProject PRJNA1231373 ) does not appear to find anything. The IDs for the mass spec data PRIDE/ProteomeXchange don't seem to bring up available datasets: PXD035697 and PXD035698

      Response: The links have now been rectified and validated. For those data that are still under quarantine, here is the login information: To access the data:

      DNAseq data: https://dataview.ncbi.nlm.nih.gov/object/PRJNA1231373?reviewer=6qt24dd7f475838rbqfn228d0

      RNAseq data:

      https://www.ebi.ac.uk/biostudies/ArrayExpress/studies/E-MTAB-16528?key=65367b55-d77f-4c06-b4bd-bc10f2dc0b14

      Proteomic data: http://www.ebi.ac.uk/pride

      Username: reviewer_pxd035698@ebi.ac.uk

      Password: gOIcRx0g

      Phosphoproteomic data: http://www.ebi.ac.uk/pride

      Username: reviewer_pxd035697@ebi.ac.uk

      Password: 7GWtBmvx

      Significance Provide contextual information to readers (editors and researchers) about the novelty of the study, its value for the field and the communities that might be interested. The following aspects are important:

      * General assessment: provide a summary of the strengths and limitations of the study. What are the strongest and most important aspects? What aspects of the study should be improved or could be developed?*

      Strengths: Comment 1: The molecular pathways that regulate Leishmania life-stage transitions are still poorly understood, with many approaches exploring single proteins/RNAs etc in a reductionist manner. This paper takes a systems-scale approach and does a good job of integrating the disparate -omics datasets to generate hypotheses of the intersections of regulatory proteins that are associated with life-cycle progression.

      Response: We thank the reviewer for this positive assessment of our work.

      Comment 2: The differentiation step studied is from amastigote to promastigote. I am not aware that this has been studied before using phosphoproteomics. The use of the hamster derived amastigotes is a major strength. While a difficult/less common model, the use of hamsters permits the extraction of parasites that are host adapted and represent "normal", host-adapted Leishmania ploidy, and the promastigote experiments are performed at a low passage number. This is a strength of the work as it reduces the interference of the biological plasticity of Leishmania when it is cultured outside the host.

      Response: We thank the reviewer for the acknowledgment of our relevant hamster system, for which we face many challenges (financial, ethical, administrative as protocols need to be approved by the French government).

      Limitations: Comment 1: Potential lack of appropriate replication (see above).

      Response: See response to comment 1.

      Comment 2: Lack of follow up/validation of a novel signalling interaction identified from the systems-wide approach. There is a lack of assessment of whether a single signalling cascade is driving the differentiation or these are all parallel, requisite pathways. The authors state the differentiation is not driven by a single master regulator, but I am not sure there is adequate evidence to rule this in or out.

      Response: See response to comment 2 above.

      Advance: compare the study to the closest related results in the literature or highlight results reported for the first time to your knowledge; does the study extend the knowledge in the field and in which way? Describe the nature of the advance and the resulting insights (for example: conceptual, technical, clinical, mechanistic, functional,...).

      Comment 3: The study applies well established techniques without any particular technical step-change. The application of large-scale multi-omics techniques and integrated comparisons of the different experimental workflows allow a synthesis of data that is a step forward from that existing in the previous Leishmania literature. It allows the generation of new hypotheses about specific regulatory pathways and crosstalk that potentially drive, or are at least active, during amastigote>promastigote differentiation.

      Response: We thank the reviewer for these positive comments.

      *Audience: describe the type of audience ("specialized", "broad", "basic research", "translational/clinical", etc...) that will be interested or influenced by this research; how will this research be used by others; will it be of interest beyond the specific field? * This manuscript will have primary interest to those researchers studying the molecular and cell biology of Leishmania and other kinetoplastid parasites. The approaches used are quite standard (so not so interesting in terms of methods development etc.) and given the specific quirks of Leishmania biology it may not be that relevant to those working more broadly in parasites from different clades/phyla, or those working on opisthokont systems- yeast, humans etc. Other Leishmania focused groups will surely cherry-pick interesting hits from this dataset to advance their studies, so this dataset will form a valuable reference point for hypothesis generation.

      Response: We thank the reviewer for this assessment and agree that our data sets will be very valuable for us and other teams to generate hypotheses for follow-up studies.

      Please define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Relevant expertise: Trypanosoma & Leishmania molecular & cell biology, RNA-seq, proteomics, transcriptional/epigenetic regulation, protein kinases - some experience of UPS system.

      I have not provided comment on the metabolomics as it is outside my core expertise. However, I can see it was performed at one of the leading parasitology metabolomics labs.

      Response: We thank the reviewer for sharing expertise, investing time and intelligence in the assessment of our manuscript, and the highly constructive criticisms provided.


      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: The study presents a comprehensive multi-omics investigation of Leishmania differentiation, combining genomic, transcriptomic, proteomic, phospho-proteomic and metabolomic data. The authors aim to uncover mechanisms of post-transcriptional and post-translational regulation that drive the stage-specific biology of L. donovani. The authors provide a detailed characterization of transcriptomic, proteomic, and phospho-proteomic changes between life stages, and dissect the relative contributions of mRNA abundance and protein degradation to stage-specific protein expression. Notably, the study is accompanied by comprehensive supplementary materials for each molecular layer and provides public access to both raw and processed data, enhancing transparency and reproducibility. While the data are rich and compelling, several mechanistic interpretations (e.g., "feedback loops," "recursive networks," "signaling cascades") are overstated. Similarly, the classification of gene sets as "regulons" is not adequately supported, as no common regulatory factor has been identified and only a single condition change (amastigote to promastigote) was assessed.

      Response: We thank the reviewer for these comments and have corrected the manuscript to eliminate all unjustified mechanistic interpretations.

      Major Comments:


      Comment 1: Across several sections (incl. abstract, L559-565, L589-599, L600-L603, L610-612, L613-614, L625, L643-645, L650-652), the manuscript describes "recursive or self-controlling networks", "signaling cascades", "self-regulating", and "recursive feedback loops" - involving protein kinases, phosphatases, and translational regulators. While the data convincingly demonstrate stage-specific changes in phosphorylation and abundance changes in key molecules, the language used implies causal, direct and directional regulatory relationships that have not been experimentally validated.

      Response: We agree with the reviewer and have corrected the text, replacing all expressions that may allude to causal or directional relationships by more neutral expressions such as 'co-expression'.

      Comment 2: Co-expression and shared function alone do not define a regulon (L363, and several other places in the manuscript). A regulon also requires the gene set to be regulated by the same factor, for which there is no evidence here. Regulons can be derived from transcriptomic experiments, but then they need to show the same transcriptional behavior across many biological conditions, while here just 1 condition change is evaluated. Therefore, this analysis is conventional GO enrichment analysis and should not be overinterpreted into regulons.

      Response: We agree with the reviewer and have replaced 'regulon' with 'co-regulated gene clusters' (or similar).

      Comment 3: LFQ intensity of 0 (e.g., L389): An LFQ intensity of 0 does not necessarily indicate that a protein is absent, but rather that it was not detected. This can occur for several reasons: (1) true biological absence in one condition, (2) low abundance below the detection threshold, or (3) stochastic missingness due to random dropout in mass spectrometry. While the authors state that adjusted p-values for the 1534 proteins exclusively detected in either amastigotes or promastigotes are below 0.01, I could not find corresponding p-values for these proteins in Table 8 ('Global_Proteomic'). An appropriate statistical method designed to handle this type of missingness should be used. In this context, I also find the following statement unclear: "identified over 4000 proteins at each stage in at least 3 out of 4 biological replicates, representing 3521 differentially expressed proteins (adjusted p-value

      Response: We fully agree with the reviewer that an LFQ intensity of 0 may result from various causes. We realize that our wording may have been ambiguous. For clarity, we have modified the original text to: 'Label-free quantitative proteomic analysis of 4 replicates of amastigotes and derived promastigotes identified over 4000 proteins, including 1987 differentially expressed proteins (adjusted p-value

      Comment 4: L412 - Figure 3B: The figure shows proteins with infinite fold changes, which result from division by zero due to LFQ intensity values of zero in one of the compared conditions. As previously noted, interpreting LFQ zero values as true absence of expression is problematic, since these zeros can arise from several technical reasons - such as proteins being just below the detection threshold or due to stochastic dropout during MS analysis. Therefore, the calculated fold changes for these proteins are likely highly overestimated. This concern is visually supported by the large gap on the y-axis (even in log scale) between these "infinite" fold changes and the rest of the data. Moreover, given Leishmania's model of constitutive gene expression, it seems biologically implausible that all these proteins would be completely absent in one stage. This issue applies not only to Figure 3B, but also to the analyses presented in Figures 4D and 4E.

      Response: We thank the reviewer for this comment. To clarify this section, we modified the text as follows: 'Only expression changes were considered that either showed statistically significant differential abundance at both RNA and protein levels (p

      Minor Comments:

      Methods L132: Typo: "A according" should be "according."

      Response: The 'A' refers to RNase A. We added a comma for clarification (...RNase A, according to...)

      L158: How exactly were somy levels calculated? Please specify the method used, as I could not find a clear description in the referenced manuscript.

      Response: We thank the reviewer for this comment. In addition to the already detailed description in Methods and the reference there to the paper describing the pipeline, we have now added a link to the description of the karyotype module of the giptools package (https://gip.readthedocs.io/en/latest/giptools/karyotype.html). There the following explanation can be found: "The karyotype module aims at comparing the chromosome sequencing coverage distributions of multiple samples. This module is useful when trying to detect chromosome ploidy differences in different isolates. For each sample the module loads the GIP files with the bin sequencing coverage (.covPerBin.gz files) and normalizes the meancoverage values by the median coverage of all bins. The bin scores are then converted to somy scores which are then used for producing plots and statistics." The description then goes into further detail.
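The normalization described in the quoted documentation can be sketched as follows. This is not the giptools implementation: the function name and the coverage values are hypothetical, and both the multiplication by a disomic baseline of 2 and the per-chromosome summary by the median are assumptions made for illustration.

```python
from statistics import median


def chromosome_somy(bins_by_chrom, baseline_ploidy=2):
    """Estimate one somy score per chromosome from per-bin mean coverage.

    Each bin is divided by the median coverage of all bins, scaled by the
    assumed baseline ploidy, and summarized per chromosome by the median.
    """
    all_bins = [cov for bins in bins_by_chrom.values() for cov in bins]
    genome_median = median(all_bins)
    return {
        chrom: median(baseline_ploidy * cov / genome_median for cov in bins)
        for chrom, bins in bins_by_chrom.items()
    }


# Hypothetical bin coverages: chrC has 1.5x the coverage of the others.
coverage = {
    "chrA": [50, 50, 50, 50],
    "chrB": [50, 50],
    "chrC": [75, 75],
}
somy = chromosome_somy(coverage)
print(somy)
```

In this toy example most bins share the same coverage, so the genome-wide median corresponds to the disomic state and chrC is scored as trisomic.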

      L158: Chromosome 36 is not consistently disomic, as stated. It has been observed in other somy states (e.g., Negreira et al. 2023, EMBO Reports, Figure 1), even if such occurrences are rare in the studied context. Normalizing by chr36 remains a reasonable choice, but it would be helpful to confirm that the majority of chromosomes appear disomic post-normalization to support the assumption that chr36 is disomic in this dataset as well.

      Response: We thank the reviewer for this comment. Unlike the paper cited above (using long-term cultured promastigotes), our analysis uses promastigote parasites from early culture adaptation (p2) that were freshly derived from splenic amastigotes known to be disomic (and confirmed here), which represents an internal control validating our analysis.

      L163: Suggestion: Cite the GIP pipeline here rather than delaying the reference until L173.

      Response: corrected

      L188: "Controlled" may be a miswording. Consider replacing with "confirmed" or "validated."

      Response: corrected to 'validated'

      L214: Please specify which statistical test was used to assess differential expression at the protein level. L227: Similarly, clarify which statistical test was applied for determining differential expression in the phospho-proteomics data.

      Response: As noted in the Methods section, a limma t-test was applied to determine proteins/phosphoproteins with a significant difference in abundance while imposing a minimal fold change of 2 between the conditions to conclude that they are differentially abundant {Ritchie, 2015; Smyth, 2005}.

      Results L337-339: The interpretation here is too speculative. Phrases like "suggesting" and "likely" are too strong given the evidence presented. Alternative explanations, such as mosaic variation combined with early-stage selective pressure in the culture environment, should be considered.

      Response: We thank the reviewers for these suggestions and have reformulated into: 'In the absence of convergent selection, it is impossible to distinguish if these gene CNVs provide some strain-specific advantage or are merely the result of random genetic drift.'

      L340: The "undulating pattern" mentioned is somewhat subjective. To support this interpretation, consider adding a moving average (or similar) line to Figure 3A, which would more clearly highlight this trend across the data points.

      Response: These lines have been added to Figure 1C (not 3A).
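A trend line of the kind the reviewer suggests can be computed with a simple moving average. The sketch below is illustrative only: the function name, window size and input values are hypothetical and do not reflect the settings used for the figure.

```python
def moving_average(values, window):
    """Return the simple moving average over a sliding window of fixed size."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    averages = [running / window]
    # Slide the window one position at a time, updating the running sum.
    for i in range(window, len(values)):
        running += values[i] - values[i - window]
        averages.append(running / window)
    return averages


# Hypothetical per-gene values ordered by chromosomal position.
smoothed = moving_average([1, 2, 3, 4, 5], window=3)
print(smoothed)
```

The output has `len(values) - window + 1` points, so the smoothed line is slightly shorter than the raw series at both ends.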

      L356: It may be more accurate to say "control of individual gene expression," since Leishmania does have promoters - the key distinction is that initiation does not occur on a gene-by-gene basis.

      Response: corrected

      L403-405: The statement "this is because these metabolites comprise a glycosomal succinate shunt..." should be rephrased as a hypothesis rather than a definitive explanation, as this causal link has not been experimentally validated.

      Response: Thank you for the comment - we followed your advice.

      L407: Replace "confirming" with "matching" to avoid overstating the agreement with previous observations.

      Response: corrected

      L408: Replace "correlated" with "matched" for more accurate interpretation of results.

      Response: corrected

      L433: It is unclear how differential RNA modifications were detected. Please specify which biological material was used, the number of replicates per life stage, and how statistical evaluation of differential modifications was performed.

      Response: This figure has now been updated using our statistically robust RNA-seq analysis conducted for the revision. See comments above.

      L436: This conclusion appears incomplete. While the manuscript mentions transcript-regulated proteins, it should also note that other proteins showed discordant mRNA/protein patterns. A more balanced conclusion would mention both the matching and non-matching subsets.

      Response: We thank the reviewer for this comment and have made the necessary adjustments to better balance this conclusion.

      L441: The phrase "poor correlation" overgeneralizes and lacks nuance. Earlier sections of the manuscript describe hundreds of genes where mRNA and protein levels correlate well, suggesting that mRNA turnover plays a key regulatory role. Please rephrase this sentence to clarify that poor correlation applies only to a subset of the data.

      Response: This has been corrected to 'The discrepancies we observed in a sub-set of genes between....'.

      L454: The claim that "epitranscriptomic regulation and stage-adapted ribosomes are key processes" should be supported with references. If this builds on previously published work, please cite it accordingly.

      Response: corrected

      L457: Proteasomal degradation is a well-established mechanism in Leishmania. These findings are interesting but should be presented in the context of existing literature (e.g. Silva-Jardim et al. 2014, [PMID: 15234661]) rather than as entirely novel.

      Response: corrected

      L459: The authors should add a microscopy image of promastigotes treated with lactacystin. This would provide insight into whether treatment affects morphology, as is known in T. cruzi (see Dias et al., 2008). It would be particularly informative if Leishmania behaves differently.

      Response: We added this information to Figure S7.

      L472 + L481: Table 9 shows several significant GO terms not discussed in the manuscript. Please clarify how the subset presented in the text was selected.

      Response: We added this information to the text ('some of the most significantly enrichment terms included ...').

      L482: The argument that a single master regulator can be excluded is unclear. Could the authors please elaborate on the reasoning or data supporting this conclusion?

      Response: This statement was too speculative and has been removed. Instead, we added 'Thus, Leishmania differentiation correlates with the expression of complex signaling networks that are established in a stage-specific manner'.

      L494: The term "unexpected" may not be appropriate here, as protein degradation is a well-established regulatory mechanism in trypanosomatids. Consider omitting this term to better reflect the field's current understanding.

      Response: We deleted the term as suggested and reformulated to '....our results confirm the important role of protein degradation....'.

      L543: The term "feedback loop" should be used more cautiously. The current data are correlative, and no interventional experiments are provided to support a causal regulatory loop between proteasomal activity and protein kinases. As such, this remains a hypothesis rather than a confirmed mechanism.

      Response: We fully agree and have toned down the entire manuscript, referring to feedback loops only as a hypothesis and not as a fact emerging from our datasets, which set the stage for future functional analyses.

      **Discussion**

      L555: As noted in L494, reconsider using the word "unexpected."

      Response: removed

      L589: The data do not fully support the presence of stage-specific ribosomes. Rather, they suggest differential ribosomal function through changes in abundance and regulation. Please consider rephrasing.

      Response: We thank the reviewer for this comment and have followed the advice, reformulating the sentence according to the suggestion.

      L657-658: The discussion of post-transcriptional and post-translational regulation of gene dosage effects would benefit from citing additional literature beyond the authors' own work. E.g. the study by Cuypers et al. (PMID: 36149920) offers a relevant and comprehensive analysis covering 4 'omic layers.

      Response: We apologize for this omission and now describe and cite this publication in the Results section when concluding the results shown in Figure 1.

      L659-664: The reference to deep learning for biomarker discovery appears speculative and loosely connected to the current findings. As no such methods were applied in the study, and the manuscript does not clarify what types of biomarkers are intended, this statement could be seen as aspirational rather than evidence-based. Consider either omitting or elaborating with clear justification.

      Response: We agree and have deleted this section.

      L690 + L705 (Figure 2): The phrase "main GO terms" is vague. Please clarify the criteria for selecting the GO terms shown - were they chosen based on adjusted p-value, enrichment score, or another metric? Additionally, define "cluster efficiency," explaining how it was calculated and what it represents.

      Response: Corrected to 'some of the most significantly enriched GO terms'.

      Signed: Bart Cuypers, PhD

      **Referee cross-commenting**

      Overall, I think the other reviewers' comments are fair. They seem to align particularly on the following points:

      1) Reviewers agree that this is a comprehensive body of work with original contributions to the field of Leishmania/trypanosomatid molecular biology, and that it will serve as a valuable reference for hypothesis generation.

      2) Several reviewers raise concerns about overinterpretation of the data, particularly regarding regulatory networks, regulons, and master regulators. The interpretation and large parts of the discussion are considered too speculative without additional functional validation.

      3) There are comments about the incorrect statistical treatment of missing values in the proteomics experiments, which affects confidence in some of the conclusions.

      4) While the correlation between the two RNA-Seq replicates is high, the decision to include only two biological replicates is seen as unfortunate and not ideal for statistical robustness.

      5) The use of lactacystin should be more clearly motivated, and its limitations discussed in the context of the experiments.

      Even though I did not remark on the last two points (4 and 5) in my own review, I agree with them.

      Response: We thank the reviewer for this cross-comparison, which served as a guide for revising our manuscript. We believe that we have responded to all these concerns.

      Reviewer #3 (Significance (Required)):


      This study provides a rich, integrative multi-omics dataset that advances our understanding of stage-specific adaptation in the transcriptionally unique parasite Leishmania. By dissecting the relative contributions of mRNA abundance and protein turnover to final protein levels across life stages, the authors offer valuable insights into post-transcriptional and post-translational regulation. The work represents a resource-driven yet conceptually informative contribution to the field, with comprehensive supplementary materials and transparent data sharing standing out as additional strengths.

      However, the mechanistic insights proposed are speculative in several places and require more cautious language. The study is most impactful as a resource and descriptive atlas, initiating hypotheses for future validation. The broad scientific community working on Leishmania, trypanosomatids, and post-transcriptional regulation in eukaryotes would benefit from this work.

      Response: We thank the reviewer for this positive assessment and have modified the manuscript to further emphasize its strength as an important resource to incite mechanistic follow-up studies.

      Field of reviewer expertise: multi-omics integration, bioinformatics, molecular parasitology, transcriptomics, proteomics, metabolomics, Leishmania, Trypanosoma.

      Reviewer #4 (Evidence, reproducibility and clarity (Required)):

      Summary:

      This study investigates the regulatory mechanisms underlying stage differentiation in Leishmania donovani, a parasitic protist. Pesher et al., aim to address the central question of how these parasites establish and maintain distinct life cycle stages in mostly the absence of transcriptional control. The authors employed a five-layered systems-level analysis comparing hamster-derived amastigotes and their in vitro-derived promastigotes. From those parasites, they performed a genomic, transcriptomic, proteomic, metabolomic and phosphoproteomic analysis to reveal the changes the parasites undertook between the two life stages.

      The main conclusions stated by the authors are:

      • The stage differentiation in vitro is largely independent of major changes in gene dosage or karyotype.
      • RNA-seq analysis identified substantial stage-specific differences in transcript abundance, forming distinct regulons with shared functional annotations. Amastigotes showed enrichment in transcripts related to amastins and ribosome biogenesis, while promastigotes exhibited enrichment in transcripts associated with ciliary cell motility, oxidative phosphorylation, and post-transcriptional regulation itself.

      • Quantitative phosphoproteome analysis revealed a significant increase in global protein phosphorylation in promastigotes. Normalizing phosphorylation changes against protein abundance identified numerous stage-specific phosphoproteins and phosphosites, indicating that differential phosphorylation also plays a crucial role in establishing stage-specific biological networks. The study identified recursive feedback loops (where components of a pathway regulate themselves) in post-transcriptional regulation, protein translation (potentially involving stage-specific ribosomes), and protein kinase activity. Reciprocal feedback loops (where components of different pathways cross-regulate each other) were observed between kinases and phosphatases, kinases and the translation machinery, and crucially, between kinases and the proteasomal system, with proteasomal inhibition disrupting promastigote differentiation.

      Response: We thank the reviewer for the time and effort dedicated to our manuscript.

      Comments:

      Further details are organised in order of appearance in the text:

      Comment 1: Material and Methods: while the authors are indicating some key parameters, providing the codes and scripts they used throughout the manuscript would improve reproducibility.

      Response: We thank the reviewer for this comment and have added the URL for the code to the data availability section.

      Comment 2: Why only 2 biological replicates for RNA while the other layers have 3 or 4?

      Response: We agree with the other reviewers and have repeated this analysis to obtain statistically more robust results.

      Comment 3: Is the slight but reproducible increase in median coverage observed for chr 1, 2, 3, 4, 6 and 20 stable on longer culture derived promastigotes and sandfly derived promastigotes ?

      Response: No, as published in Barja et al., Nature Ecol Evol 2017 (PMID: 29109466) and Bussotti et al., PNAS 2023 (PMID: 36848551), these minor fluctuations predict subsequent aneuploidies neither in long-term culture nor in sand fly-derived promastigotes. This information has been added to the text.

      Comment 4: Is this change of ploidy a culture adaptation representation rather than a life cycle event as the authors discuss later on? (This is probably an optional request that would be nice to include, if the authors have performed the sequencing of such parasites. Otherwise, it should be mentioned in the discussion).

      Response: Yes, this is a well-known culture adaptation phenomenon, on which we have published extensively. We added this conclusion and the references to the text.

      Comment 5: L333 "Likewise, stage differentiation was not associated with any major gene copy number variation (Figure 1C, Table 2)". The authors are looking here at steady differentiated stages rather than differentiation itself. "Likewise, stage differentiation was.." would be more appropriate.

      Response: We corrected this sentence to 'Likewise, differentiation of promastigotes was not associated with any major gene copy number variation at early passage 2'.

      Comment 6: L349-355: have the mRNA presenting change in abundance between stages been normalised by their relative DNA abundance ? Said otherwise, can the wave patterns observed at the genome level explain the respective mRNA level ? Can the authors plot in a similar way the enrichment scores in regards to the position on the genome and can the authors indicate if there is a positional enrichment in addition to the functional one they observe ? This may affect the conclusion in L356-358.

      Response: As noted above, we did not see any significant read depth changes at the DNA level when comparing amastigotes and promastigotes. Thus, there is no need to normalize the RNA-seq results to DNA read depth. Furthermore, in our comparative transcriptomics analysis, we only consider 2-fold or higher changes in mRNA abundance (which is far beyond the non-significant read depth change we have observed at the DNA level). Manual inspection of the enrichment scores with respect to position did not reveal any significant signal (other than revealing some over-represented tandem gene arrays where all gene copies share the same location and GO term).
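
The normalisation raised in Comment 6 can be sketched as follows: on a log2 scale, the RNA fold change is corrected by subtracting the DNA read-depth fold change for the same gene. This is a hypothetical illustration; the function name, argument names, and example numbers are assumptions and are not values from the study.

```python
import math


def dosage_corrected_log2fc(rna_a, rna_b, dna_a, dna_b):
    """log2(RNA b/a) minus log2(DNA b/a): the RNA change not explained by gene dosage."""
    for x in (rna_a, rna_b, dna_a, dna_b):
        if x <= 0:
            raise ValueError("abundances and read depths must be positive")
    return math.log2(rna_b / rna_a) - math.log2(dna_b / dna_a)


# Toy example (assumed numbers): a 4-fold RNA change with a 5% read-depth change.
corrected = dosage_corrected_log2fc(rna_a=100, rna_b=400, dna_a=1.00, dna_b=1.05)
```

When the DNA ratio is close to 1, the correction is small compared with a 2-fold (log2 = 1) threshold, which is the argument made in the response above.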

      Comment 8: L415 "stage-specific expression changes correlate between protein and RNA levels, suggesting that the abundance of these proteins is mainly regulated by mRNA turn-over". Overstatement. Correlation does not suggest causation. "suggesting that the abundance of these proteins could be regulated by mRNA turn-over" would be more appropriate.

      Response: We thank the reviewer for this comment and have corrected the statement accordingly.

      Comment 9: Figure 3B, could the authors clarify what are the "unique genes" that are on the infinite quadrants? It seems these proteins are identified in one stage and not the other. This implies that the corresponding missing values are missing non-at random (MNAR). Rather than removing those proteins containing NMAR from the differential expression analysis, the authors should probably impute those missing values. Methods of imputation of NMAR and MAR can be found in the literature. Indeed, the level of expression in one stage of those proteins is now missing, while it could strongly affect the conclusions the authors are drawing in figure 4E regarding the proteins targeted for degradation and rescued in presence of the proteasome inhibitor.

      Response: We thank the reviewer for this important comment. However, we would like to clarify several key points regarding the treatment of proteins identified in only one condition.

      First, the reviewer assumes that proteins identified in one stage but not the other are necessarily missing not-at-random (MNAR). However, this cannot be definitively established, as these missing values could equally be missing completely at random (MCAR). Without additional information, categorizing them specifically as MNAR may be an oversimplification.

      More importantly, we have concerns about the reliability of imputation methods in this specific context. Algorithms designed to impute MNAR values (such as QRILC) replace absent data using random sampling from arbitrary probability distributions, typically assuming low intensity values. However, when no intensity value has been detected or quantified for a protein in a given condition, imputing an arbitrary low value raises significant concerns about data interpretation. Such imputed values would not reflect actual measurements but rather statistical assumptions that could introduce bias into downstream analyses. For instance, imputed values could lead to the conclusion that a protein is not differentially abundant, when in reality it is detected in one condition but completely absent in the other.

      In our view, there are two biologically plausible scenarios: either these proteins are expressed at levels below our detection threshold, or they are genuinely absent (or present at negligible levels) in the corresponding stage. Rather than introducing potentially misleading imputed values, we chose to treat these as genuine stage-specific differences (presence/absence), which results in infinite fold-changes in Figure 3B.

      Critically, our approach is strongly supported by independent validation through RNA-seq data, which corroborates the differential presence/absence patterns observed at the protein level. Furthermore, our enrichment analyses reveal significant over-representation of specific biological terms among these stage-specific proteins, providing biological coherence to these findings. These converging lines of evidence (proteomics, transcriptomics, and functional enrichment) strengthen our confidence that these represent biologically meaningful differences rather than technical artifacts.

      Therefore, we believe our conservative approach of treating these as genuine presence/absence differences, validated by orthogonal data, is more appropriate than introducing imputed values based on arbitrary statistical assumptions.

      To clarify this section, we modified the text as follows: 'Only expression changes were considered that either showed statistically significant differential abundance at both RNA and protein levels (p

      Comment 10: L430-435 "These data fit with the GO [...] the ribosome translational activity (34)." This discussion feels out of place and context. It is too speculative and with little support by the data presented at this stage of the manuscript. It should be removed as Figure 3E or could be placed in the discussion and supplementary information.

      Response: We agree with the reviewer. In response to a comment from reviewer 1, we have moved both panels to Figure 2, which much better integrates these data.

      Comment 10: The authors present an elegant way to show stage specific degradation through the comparison of stage specific proteasome blockages that show rescue in ama of proteins present in pro and vice versa. L494 "reveal an unexpected but substantial" the term unexpected is inappropriate, as several studies have shown in kinetoplastids the essential role of protein turnover through degradation / autophagy during differentiation. Furthermore the conclusions may be strongly affected by the level of expression of the proteins in the infinite quadrants as we discussed above, and should be revised accordingly.

      Response: We rephrased the conclusion to 'In conclusion, our results confirm the important role of protein degradation in regulating the L. donovani amastigote and promastigote proteomes and identify protein kinases as key targets of stage-specific proteasomal activities.' Please see the response to comment 9 regarding the unique proteins.
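
The two treatments of undetected proteins debated in Comment 9 can be contrasted in a short sketch. It assumes a simple left-censored imputation (draws from a normal distribution shifted below the observed log2 intensities); it is not QRILC and not the authors' pipeline, and all function names, parameters, and intensities are hypothetical.

```python
import math
import random
from statistics import mean, stdev


def log2fc_presence_absence(cond_a, cond_b):
    """log2 fold change b/a from log2 intensities; +/-inf if undetected in one condition."""
    a = [x for x in cond_a if x is not None]
    b = [x for x in cond_b if x is not None]
    if not a and not b:
        return math.nan
    if not a:
        return math.inf
    if not b:
        return -math.inf
    return mean(b) - mean(a)


def impute_left_censored(values, observed_pool, shift=1.8, scale=0.3, rng=None):
    """Replace None with draws from N(mu - shift*sd, (scale*sd)^2) of the observed pool."""
    rng = rng or random.Random(0)  # fixed seed so the sketch is reproducible
    mu, sd = mean(observed_pool), stdev(observed_pool)
    return [x if x is not None else rng.gauss(mu - shift * sd, scale * sd)
            for x in values]


# Toy log2 LFQ intensities (assumed): undetected in amastigotes, detected in promastigotes.
ama = [None, None, None, None]
pro = [24.1, 23.8, 24.3, 24.0]
pool = [20.0, 22.5, 24.1, 23.8, 24.3, 24.0, 26.2, 28.0]  # all observed values in the run

fc_presence_absence = log2fc_presence_absence(ama, pro)
fc_imputed = log2fc_presence_absence(impute_left_censored(ama, pool), pro)
```

The first treatment yields an infinite fold change, as in the infinite quadrants of Figure 3B. The second yields a finite value that depends entirely on the assumed imputation distribution, which is the trade-off discussed by the reviewer and the authors.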

      Comment 11: L518 "These data reveal a surprising level of stage-specific phosphorylation in promastigotes, which may reflect their increased biosynthetic and proliferative activities compared to amastigotes." Overstatement. Could also be due to culture adaptation - What is the overlap of stage-specific phosphorylations with previous published datasets in other species of Leishmania? Looking at such comparisons could help to decipher the role of culture adaptation response, species specificity and true differentiation conserved mechanisms.

      Response: We agree with the reviewer and have toned this statement down by adding the statement '....or simply be a consequence of culture adaptation'.

      Comment 12: The discussion is extremely speculative. While some speculation at this stage is acceptable, claiming direct link and feedback without further validation is probably far too stretched. For example, the changes of phosphorylation observed on particular sets of proteins, such as phosphatase and DUBs, need to be validated for their respective change of protein activity in the direction that fits the model of the authors. Those discussions should be toned down.

      Response: We agree with the reviewer and have strongly toned down the entire discussion, emphasizing the hypothesis-building character of our results, which provide a novel framework for future experimental analyses.

      Comment 13: A couple of typos:

      • In the phosphoproteome analysis section, "...0,2 % DCA..." should be "...0.2 % DCA..." (use a decimal point).

      • L225 "...peptide match was disable." should be "...peptide match was disabled."

      Response: both corrected

      Reviewer #4 (Significance (Required)):

      While there is not too much novelty around the emphasis of gene expression at post-translational level in kinetoplastid organisms, the scale of the work presented here, looking at 5 layers of potential regulations, is. Therefore, this study represents a substantial amount of work and provides interesting and comprehensive datasets useful for the parasitology community.

      Response: We thank the reviewer for this positive statement.

      Several potential concerns regarding the biological meaning of the findings were identified. These include the limitations of in vitro systems promastigote differentiation potentially limiting the conclusions, the challenge of inferring causality from correlative "omics" data, and the complexities of functional interpretation of changes in phosphorylation and metabolite levels. The proposed feedback loops and functional roles of specific molecules would require further experimental validation to confirm their biological relevance in the natural life cycle of Leishmania, but that would probably fall out of the scope of this manuscript.

      Response: We agree with the reviewer and have modified our manuscript throughout to remove any causal relationships. Indeed, this work is setting the stage for future investigations on dissecting some of the suggested regulatory mechanisms.

      Area of expertise of the reviewers: Kinetoplastid, Differentiation, Signalling, Omics

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #4

      Evidence, reproducibility and clarity

      Summary:

      This study investigates the regulatory mechanisms underlying stage differentiation in Leishmania donovani, a parasitic protist. Pesher et al., aim to address the central question of how these parasites establish and maintain distinct life cycle stages in mostly the absence of transcriptional control. The authors employed a five-layered systems-level analysis comparing hamster-derived amastigotes and their in vitro-derived promastigotes. From those parasites, they performed a genomic, transcriptomic, proteomic, metabolomic and phosphoproteomic analysis to reveal the changes the parasites undertook between the two life stages. The main conclusions stated by the authors are:

      • The stage differentiation in vitro is largely independent of major changes in gene dosage or karyotype.
      • RNA-seq analysis identified substantial stage-specific differences in transcript abundance, forming distinct regulons with shared functional annotations. Amastigotes showed enrichment in transcripts related to amastins and ribosome biogenesis, while promastigotes exhibited enrichment in transcripts associated with ciliary cell motility, oxidative phosphorylation, and post-transcriptional regulation itself.
      • Quantitative phosphoproteome analysis revealed a significant increase in global protein phosphorylation in promastigotes. Normalizing phosphorylation changes against protein abundance identified numerous stage-specific phosphoproteins and phosphosites, indicating that differential phosphorylation also plays a crucial role in establishing stage-specific biological networks. The study identified recursive feedback loops (where components of a pathway regulate themselves) in post-transcriptional regulation, protein translation (potentially involving stage-specific ribosomes), and protein kinase activity. Reciprocal feedback loops (where components of different pathways cross-regulate each other) were observed between kinases and phosphatases, kinases and the translation machinery, and crucially, between kinases and the proteasomal system, with proteasomal inhibition disrupting promastigote differentiation.

      Comments:

      Further details are organised in order of appearance in the text:

      • Material and Methods: while the authors are indicating some key parameters, providing the codes and scripts they used throughout the manuscript would improve reproducibility.
      • Why only 2 biological replicates for RNA while the other layers have 3 or 4?
      • Is the slight but reproducible increase in median coverage observed for chr 1, 2, 3, 4, 6 and 20 stable on longer culture derived promastigotes and sandfly derived promastigotes ? Is this change of ploidy a culture adaptation representation rather than a life cycle event as the authors discuss later on? (This is probably an optional request that would be nice to include, if the authors have performed the sequencing of such parasites. Otherwise, it should be mentioned in the discussion).
      • L333 "Likewise, stage differentiation was not associated with any major gene copy number variation (Figure 1C, Table 2)". The authors are looking here at steady differentiated stages rather than differentiation itself. "Likewise, stage differentiation was.." would be more appropriate. L349-355: have the mRNA presenting change in abundance between stages been normalised by their relative DNA abundance ? Said otherwise, can the wave patterns observed at the genome level explain the respective mRNA level ? Can the authors plot in a similar way the enrichment scores in regards to the position on the genome and can the authors indicate if there is a positional enrichment in addition to the functional one they observe ? This may affect the conclusion in L356-358.
      • L415 "stage-specific expression changes correlate between protein and RNA levels, suggesting that the abundance of these proteins is mainly regulated by mRNA turn-over". Overstatement. Correlation does not suggest causation. "suggesting that the abundance of these proteins could be regulated by mRNA turn-over" would be more appropriate.
      • Figure 3B, could the authors clarify what are the "unique genes" that are on the infinite quadrants? It seems these proteins are identified in one stage and not the other. This implies that the corresponding missing values are missing non-at random (MNAR). Rather than removing those proteins containing NMAR from the differential expression analysis, the authors should probably impute those missing values. Methods of imputation of NMAR and MAR can be found in the literature. Indeed, the level of expression in one stage of those proteins is now missing, while it could strongly affect the conclusions the authors are drawing in figure 4E regarding the proteins targeted for degradation and rescued in presence of the proteasome inhibitor.
      • L430-435 "These data fit with the GO [...] the ribosome translational activity (34)." This discussion feels out of place and context. It is too speculative and with little support by the data presented at this stage of the manuscript. It should be removed as Figure 3E or could be placed in the discussion and supplementary information.
      • The authors present an elegant way to show stage specific degradation through the comparison of stage specific proteasome blockages that show rescue in ama of proteins present in pro and vice versa. L494 "reveal an unexpected but substantial" the term unexpected is inappropriate, as several studies have shown in kinetoplastids the essential role of protein turnover through degradation / autophagy during differentiation. Furthermore the conclusions may be strongly affected by the level of expression of the proteins in the infinite quadrants as we discussed above, and should be revised accordingly.
      • L518 "These data reveal a surprising level of stage-specific phosphorylation in promastigotes, which may reflect their increased biosynthetic and proliferative activities compared to amastigotes." Overstatement. Could also be due to culture adaptation - What is the overlap of stage-specific phosphorylations with previous published datasets in other species of Leishmania ? Looking at such comparisons could help to decipher the role of culture adaptation response, species specificity and true differentiation conserved mechanisms.
      • The discussion is extremely speculative. While some speculation at this stage is acceptable, claiming direct link and feedback without further validation is probably far too stretched. For example, the changes of phosphorylation observed on particular sets of proteins, such as phosphatase and DUBs, need to be validated for their respective change of protein activity in the direction that fits the model of the authors. Those discussions should be toned down.

      A couple of typos:

      • In the phosphoproteome analysis section, "...0,2 % DCA..." should be "...0.2 % DCA..." (use a decimal point).
      • L225 "...peptide match was disable." should be "...peptide match was disabled."

      Significance

      While there is not too much novelty around the emphasis of gene expression at post-translational level in kinetoplastid organisms, the scale of the work presented here, looking at 5 layers of potential regulations, is. Therefore, this study represents a substantial amount of work and provides interesting and comprehensive datasets useful for the parasitology community.

      Several potential concerns regarding the biological meaning of the findings were identified. These include the limitations of in vitro systems promastigote differentiation potentially limiting the conclusions, the challenge of inferring causality from correlative "omics" data, and the complexities of functional interpretation of changes in phosphorylation and metabolite levels. The proposed feedback loops and functional roles of specific molecules would require further experimental validation to confirm their biological relevance in the natural life cycle of Leishmania, but that would probably fall out of the scope of this manuscript.

      Area of expertise of the reviewers: Kinetoplastid, Differentiation, Signalling, Omics

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      The study presents a comprehensive multi-omics investigation of Leishmania differentiation, combining genomic, transcriptomic, proteomic, phospho-proteomic and metabolomic data. The authors aim to uncover mechanisms of post-transcriptional and post-translational regulation that drive the stage-specific biology of L. donovani. The authors provide a detailed characterization of transcriptomic, proteomic, and phospho-proteomic changes between life stages, and dissect the relative contributions of mRNA abundance and protein degradation to stage-specific protein expression. Notably, the study is accompanied by comprehensive supplementary materials for each molecular layer and provides public access to both raw and processed data, enhancing transparency and reproducibility. While the data are rich and compelling, several mechanistic interpretations (e.g., "feedback loops," "recursive networks," "signaling cascades") are overstated. Similarly, the classification of gene sets as "regulons" is not adequately supported, as no common regulatory factor has been identified and only a single condition change (amastigote to promastigote) was assessed.

      Major Comments:

      1. Across several sections (incl abstract, L559-565, L589-599, L600-L603, L610-612, L613-614, L625, L643-645, L650-652), the manuscript describes "recursive or self-controlling networks", "signaling cascades", "self-regulating", and "recursive feedback loops" - involving protein kinases, phosphatases, and translational regulators. While the data convincingly demonstrate stage-specific changes in phosphorylation and abundance changes in key molecules, the language used implies causal, direct and directional regulatory relationships that have not been experimentally validated.
      2. Co-expression and shared function alone do not define a regulon (L363, and several other places in the manuscript). A regulon also requires the gene set to be regulated by the same factor, for which there is no evidence here. Regulons can be derived from transcriptomic experiments, but then they need to show the same transcriptional behavior across many biological conditions, while here just 1 condition change is evaluated. Therefore, this analysis is conventional GO enrichment analysis and should not be overinterpreted into regulons.
      3. LFQ intensity of 0 (e.g., L389): An LFQ intensity of 0 does not necessarily indicate that a protein is absent, but rather that it was not detected. This can occur for several reasons: (1) true biological absence in one condition, (2) low abundance below the detection threshold, or (3) stochastic missingness due to random dropout in mass spectrometry. While the authors state that adjusted p-values for the 1534 proteins exclusively detected in either amastigotes or promastigotes are below 0.01, I could not find corresponding p-values for these proteins in Table 8 ('Global_Proteomic'). An appropriate statistical method designed to handle this type of missingness should be used. In this context, I also find the following statement unclear: "identified over 4000 proteins at each stage in at least 3 out of 4 biological replicates, representing 3521 differentially expressed proteins (adjusted p-value < 0.01), 1534 of which were exclusively detected in either ama or pro." If a protein is exclusively detected in one stage, then by definition it should not be detected in that number of replicates at both stages. This apparent contradiction should be clarified.
      4. L412 - Figure 3B: The figure shows proteins with infinite fold changes, which result from division by zero due to LFQ intensity values of zero in one of the compared conditions. As previously noted, interpreting LFQ zero values as true absence of expression is problematic, since these zeros can arise from several technical reasons - such as proteins being just below the detection threshold or due to stochastic dropout during MS analysis. Therefore, the calculated fold changes for these proteins are likely highly overestimated. This concern is visually supported by the large gap on the y-axis (even in log scale) between these "infinite" fold changes and the rest of the data. Moreover, given Leishmania's model of constitutive gene expression, it seems biologically implausible that all these proteins would be completely absent in one stage. This issue applies not only to Figure 3B, but also to the analyses presented in Figures 4D and 4E.
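      The concern raised in comments 3 and 4 can be illustrated with a minimal Python sketch. All intensity values, protein names, function names and the detection floor below are hypothetical and are not taken from the manuscript. Substituting an assumed detection floor is only one simple way to obtain a conservative bound; it is not the authors' method and does not replace a statistical model designed for missing values.

```python
import math


def log2_fold_change(mean_a, mean_b):
    """Naive log2(mean_b / mean_a); a zero in one condition gives +/- infinity."""
    if mean_a == 0 and mean_b == 0:
        return float("nan")
    if mean_a == 0:
        return math.inf
    if mean_b == 0:
        return -math.inf
    return math.log2(mean_b / mean_a)


def bounded_log2_fold_change(mean_a, mean_b, floor):
    """Replace non-detected (zero) means by an assumed detection floor.

    The result is a conservative bound on the fold change rather than
    an infinite value that treats non-detection as true absence.
    """
    a = max(mean_a, floor)
    b = max(mean_b, floor)
    return math.log2(b / a)


# Hypothetical mean LFQ intensities: (amastigote, promastigote)
proteins = {
    "protein_A": (1.0e6, 4.0e6),  # detected in both stages
    "protein_B": (0.0, 2.0e5),    # not detected in amastigotes
}
floor = 1.0e5  # assumed detection limit, for illustration only

for name, (ama, pro) in proteins.items():
    naive = log2_fold_change(ama, pro)
    bounded = bounded_log2_fold_change(ama, pro, floor)
    print(name, naive, bounded)
```

      In this toy example the protein that is not detected in one stage receives an infinite naive fold change, while the bounded estimate stays finite and depends entirely on the assumed floor, which is why such values need explicit statistical handling.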

      Minor Comments:

      Methods

      L132: Typo: "A according" should be "according." L158: How exactly were somy levels calculated? Please specify the method used, as I could not find a clear description in the referenced manuscript. L158: Chromosome 36 is not consistently disomic, as stated. It has been observed in other somy states (e.g., Negreira et al. 2023, EMBO Reports, Figure 1), even if such occurrences are rare in the studied context. Normalizing by chr36 remains a reasonable choice, but it would be helpful to confirm that the majority of chromosomes appear disomic post-normalization to support the assumption that chr36 is disomic in this dataset as well. L163: Suggestion: Cite the GIP pipeline here rather than delaying the reference until L173. L188: "Controlled" may be a miswording. Consider replacing with "confirmed" or "validated." L214: Please specify which statistical test was used to assess differential expression at the protein level. L227: Similarly, clarify which statistical test was applied for determining differential expression in the phospho-proteomics data.

      Results

      L337-339: The interpretation here is too speculative. Phrases like "suggesting" and "likely" are too strong given the evidence presented. Alternative explanations, such as mosaic variation combined with early-stage selective pressure in the culture environment, should be considered. L340: The "undulating pattern" mentioned is somewhat subjective. To support this interpretation, consider adding a moving average (or similar) line to Figure 3A, which would more clearly highlight this trend across the data points. L356: It may be more accurate to say "control of individual gene expression," since Leishmania does have promoters - the key distinction is that initiation does not occur on a gene-by-gene basis. L403-405: The statement "this is because these metabolites comprise a glycosomal succinate shunt..." should be rephrased as a hypothesis rather than a definitive explanation, as this causal link has not been experimentally validated. L407: Replace "confirming" with "matching" to avoid overstating the agreement with previous observations. L408: Replace "correlated" with "matched" for more accurate interpretation of results. L433: It is unclear how differential RNA modifications were detected. Please specify which biological material was used, the number of replicates per life stage, and how statistical evaluation of differential modifications was performed. L436: This conclusion appears incomplete. While the manuscript mentions transcript-regulated proteins, it should also note that other proteins showed discordant mRNA/protein patterns. A more balanced conclusion would mention both the matching and non-matching subsets. L441: The phrase "poor correlation" overgeneralizes and lacks nuance. Earlier sections of the manuscript describe hundreds of genes where mRNA and protein levels correlate well, suggesting that mRNA turnover plays a key regulatory role. Please rephrase this sentence to clarify that poor correlation applies only to a subset of the data. 
      L454: The claim that "epitranscriptomic regulation and stage-adapted ribosomes are key processes" should be supported with references. If this builds on previously published work, please cite it accordingly. L457: Proteasomal degradation is a well-established mechanism in Leishmania. These findings are interesting but should be presented in the context of existing literature (e.g. Silva-Jardim et al. 2014, [PMID: 15234661]) rather than as entirely novel. L459: The authors should add a microscopy image of promastigotes treated with lactacystin. This would provide insight into whether treatment affects morphology, as is known in T. cruzi (see Dias et al., 2008). It would be particularly informative if Leishmania behaves differently. L472 + L481: Table 9 shows several significant GO terms not discussed in the manuscript. Please clarify how the subset presented in the text was selected. L482: The argument that a single master regulator can be excluded is unclear. Could the authors please elaborate on the reasoning or data supporting this conclusion? L494: The term "unexpected" may not be appropriate here, as protein degradation is a well-established regulatory mechanism in trypanosomatids. Consider omitting this term to better reflect the field's current understanding. L543: The term "feedback loop" should be used more cautiously. The current data are correlative, and no interventional experiments are provided to support a causal regulatory loop between proteasomal activity and protein kinases. As such, this remains a hypothesis rather than a confirmed mechanism.

      Discussion

      L555: As noted in L494, reconsider using the word "unexpected." L589: The data do not fully support the presence of stage-specific ribosomes. Rather, they suggest differential ribosomal function through changes in abundance and regulation. Please consider rephrasing. L657-658: The discussion of post-transcriptional and post-translational regulation of gene dosage effects would benefit from citing additional literature beyond the authors' own work. E.g. the study by Cuypers et al. (PMID: 36149920) offers a relevant and comprehensive analysis covering 4 'omic layers. L659-664: The reference to deep learning for biomarker discovery appears speculative and loosely connected to the current findings. As no such methods were applied in the study, and the manuscript does not clarify what types of biomarkers are intended, this statement could be seen as aspirational rather than evidence-based. Consider either omitting or elaborating with clear justification. L690 + L705 (Figure 2): The phrase "main GO terms" is vague. Please clarify the criteria for selecting the GO terms shown - were they chosen based on adjusted p-value, enrichment score, or another metric? Additionally, define "cluster efficiency," explaining how it was calculated and what it represents.

      Signed: Bart Cuypers, PhD

      Referee cross-commenting

      Overall, I think the other reviewers' comments are fair. They seem to align particularly on the following points:

      1. Reviewers agree that this is a comprehensive body of work with original contributions to the field of Leishmania/trypanosomatid molecular biology, and that it will serve as a valuable reference for hypothesis generation.
      2. Several reviewers raise concerns about overinterpretation of the data, particularly regarding regulatory networks, regulons, and master regulators. The interpretation and large parts of the discussion are considered too speculative without additional functional validation.
      3. There are comments about the incorrect statistical treatment of missing values in the proteomics experiments, which affects confidence in some of the conclusions.
      4. While the correlation between the two RNA-Seq replicates is high, the decision to include only two biological replicates is seen as unfortunate and not ideal for statistical robustness.
      5. The use of lactacystin should be more clearly motivated, and its limitations discussed in the context of the experiments.

      Even though I did not remark on the last two points (4 and 5) in my own review, I agree with them.

      Significance

      This study provides a rich, integrative multi-omics dataset that advances our understanding of stage-specific adaptation in the transcriptionally unique parasite Leishmania. By dissecting the relative contributions of mRNA abundance and protein turnover to final protein levels across life stages, the authors offer valuable insights into post-transcriptional and post-translational regulation. The work represents a resource-driven yet conceptually informative contribution to the field, with comprehensive supplementary materials and transparent data sharing standing out as additional strengths.

      However, the mechanistic insights proposed are speculative in several places and require more cautious language. The study is most impactful as a resource and descriptive atlas, initiating hypotheses for future validation. The broad scientific community working on Leishmania, trypanosomatids, and post-transcriptional regulation in eukaryotes would benefit from this work.

      Field of reviewer expertise: multi-omics integration, bioinformatics, molecular parasitology, transcriptomics, proteomics, metabolomics, Leishmania, Trypanosoma

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Pescher and colleagues present a manuscript detailing the multi-omic characterisation of Leishmania donovani amastigote to promastigote differentiation and the integration of these data. Their aim was to reveal the cellular mechanisms and pathways by which Leishmania differentially regulate protein and mRNA levels during life-stage progression, in the absence of differential transcriptional control. Integration of RNA-seq, proteomics, metabolomics, and phosphoproteomics identifies several broad regulatory processes by which the parasite differentially modulates ribosome biogenesis and function, protein abundance by proteasomal degradation, and protein kinase signalling networks. The intersection of these processes is explicitly considered in the data analysis to identify potential recursive and reciprocal signalling networks. The authors use Leishmania donovani amastigotes derived from hamster infections, and use these to initiate an amastigote to promastigote differentiation time course. Initially a DNA-seq quality control is presented to rule out dramatic ploidy changes in the differentiating lines, which are known to occur in Leishmania undergoing adaptation to tissue culture. Small changes are observed but, because only 2 passages are performed, they are low and the lines remain generally diploid. There is good correlation between replicates. The implication is that neither subtle nor dramatic changes in ploidy drive life stage differentiation.

      Next, mRNA-seq is presented on duplicate samples from amastigote and promastigote forms. Correlation between the two replicates is reported to be high. Differentially abundant transcripts are identified using DESeq2 and the genes in the up/down-regulated sets are analysed by GO term enrichment (and cluster efficiency). Genes are clustered by similar function and the implication is that these comprise post-transcriptionally regulated regulons, sharing common regulatory processes. Whole cell label-free, quantitative proteomics is conducted to profile the amastigote and promastigote forms. The differentially abundant proteins are identified and again analysed by GO term, revealing a general discordance with the RNA-seq results. The implication is that there are additional layers of regulation linking protein abundance to mRNA abundance. A metabolomics analysis is performed in which the authors look for changes in the metabolome of the promastigotes relative to the amastigotes and suggest that the stages they have exhibit the expected biochemical behaviour. The RNA-seq and proteomics fold change values are then correlated to identify proteins with differential abundances linked to mRNA levels, and those that deviate from this relationship, for example proteins that increase in promastigote forms while the corresponding mRNA decreases. The authors focus in on a group of ribosomal proteins less abundant than the mRNA would suggest (as ribosome biogenesis was identified as a regulon of interest). In the absence of riboprofiling the authors return to the RNA-seq to assess the levels of pre-snoRNA (the role of these could be more explicitly stated). They relate this to a previous study also showing differential stage-specific ribosomal RNA pseudouridination (and differential levels of pseudouridine transferases, this study), which affects ribosome translational efficiency. This is their first example of a potential recursive level of regulation.

      Because protein levels can also be impacted by proteolytic turnover the authors then search for proteins that are normally degraded by the ubiquitin-proteasome system in each stage and at an intermediate 18 h timepoint, potentially allowing identification of amastigote degraded proteins, promastigote degraded proteins and, intriguingly, constitutively degraded proteins. The authors use the proteasome inhibitor lactacystin at 10 µM as a chemical method to block proteasome activity. This treatment is implied to block differentiation by the morphology of treated cells under microscopy. The authors suggest that proteins protected from degradation act as repressors of differentiation that override any positive signal coming from signalling pathways or newly synthesised proteins. Proteins are identified as differentially stabilised in amastigote and promastigote forms and during differentiation. The GO analysis identifies protein kinases as significantly enriched in this context and suggests layering of regulation by the stability of signalling proteins.

      In the context of this differential regulation the authors perform a round of phosphoproteomics, and use the whole-cell proteome to normalise phosphosite abundance to protein abundance in the various stages. While not at the level of site-specific occupancy, this allows them to subset the phospho modifications into various categories including those that change independently of protein levels, i.e. increasing in their "stoichiometry". GO term analysis again is used and identifies ubiquitination systems and protein kinases & phosphatases as enriched, and specific examples are described. This suggests that there are likely recursive and reciprocal regulatory processes at play between these classes of proteins.

      The authors provide a clear and comprehensive description of the data at each stage of the results and this is woven together in the discussion, allowing hypotheses to be formed on the potential regulatory and signalling pathways that control the differentiation of amastigotes to promastigotes. Given the amount and breadth of data presented the authors are able to present a high-level assessment of the processes that form feedback loops and/or intersectional signalling, but specific examples are not picked out for deeper validation or exploration.

      Major comments:

      Are the claims and the conclusions supported by the data or do they require additional experiments or analyses to support them?

      As I have understood it from the description in the text, and in Data Table 4, the RNA-seq element of the work has only been conducted using two replicates. If this is the case, it would substantially undermine the RNA-seq and the inferences drawn from it. The minimum number of replicates required for inferential analysis is 3 bio-replicates and potentially up to 6 or 12. It may be necessary for the authors to repeat this for the RNA-seq to carry enough weight to support their arguments (PMID: 27022035).

      There are several examples that are given as reciprocal or recursive signalling pathways, but these are not followed up with independent, orthogonal techniques. I think the paper currently forms a great resource to pursue these interesting signalling interactions and is certainly more than just a catalogue of modifications, but to take it to the next level ideally a novel signalling interaction would be demonstrated using an orthogonal approach. Perhaps the regulation of the ribosomes could have been explored further (same teams recently published related work on this). Or perhaps more interestingly, a novel target(s) from the ubiquitinated protein kinases could have been explored further; for example making precision mutants that lack the ubiquitination or phosphorylation sites - does this abrogate differentiation?

      I found the use of lactacystin a bit curious as there are more potent and specific inhibitors of Leishmania proteasomes e.g. LXE-408. This could be clarified in the write-up (See below).

      Please request additional experiments only if they are essential for the conclusions. Alternatively, ask the authors to qualify their claims as preliminary or speculative, or to remove them altogether.

      If it is the case that only 2 replicates of the RNA-Seq have been performed it really is not the accepted level of replication for the field. Most studies use a minimum of 3 bioreplicates and even a minimum of 6 is recommended by independent assessment of DESeq2.

      As far as I could see, the cell viability assay does not include a positive control that shows it is capable of detecting cytotoxic effects of inhibitors. Add a treatment showing that it can differentiate cytostatic from cytotoxic compounds.

      If you have constructive further reaching suggestions that could significantly improve the study but would open new lines of investigations, please label them as "OPTIONAL". Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated time investment for substantial experiments.

      It is realistic for the authors to validate the cell viability assay. If the RNA-seq needs to be repeated then this would be a substantial involvement.

      Are the data and the methods presented in such a way that they can be reproduced?

      All the methods are written to a good level of detail. The sample prep, acquisition and data analysis of the protein mass spectrometry contained a high level of detail in a supplemental section. The authors should be more explicit about the amount of replication at each stage, as in parts of the manuscript this was quite unclear.

      Are the experiments adequately replicated and statistical analysis adequate?

      Unless I have misunderstood the manuscript, I believe the RNA-seq dataset is underpowered according to the number of replicates the authors report in the text.

      Looking at Figure 1 and S1 and Data Table 4 to show the sample workflow I was surprised to see that the RNA-seq only used 2 replicates. The authors do show concordance between the individual biological replicates, but I would consider that only having 2 is problematic here, especially given the importance placed on the mRNA levels and linkage in this study. This would constitute a major weakness of the study, given that it is the basis for a crucial comparison between the RNA and protein levels.

      It also wasn't clear to me how many replicates were performed at each condition for the lactacystin treatment experiment - can the authors please state this clearly in the text, it looks like 4 replicates from Figure S1 and Data Table 8.

      Four replicates are used for the phosphoproteomics data set, which is probably ok, but other researchers have used a minimum of 5 in phosphoproteomics experiments to deal with the high level of variability that can often be observed with low abundance proteins & modifications. The method for the phosphoproteomics analysis suggests that detection of a phosphosite in 1 sample (also with a localisation probability of >0.75) was required before using missing value imputation for the other samples. This seems like a low threshold for inclusion of that phosphosite for further relative quantitative analysis. For example, Geoghegan et al (2022) (PMID: 36437406) used a much more stringent threshold of greater than or equal to 2 missing values from 5 replicates as an exclusion criterion for detected phosphopeptides. Please correct me if I misunderstood the data processing, but as it stands the imputation of so many missing values (potentially 3 of 4 per sample category) could be reducing the quality of this analysis.
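      To make the difference between the two inclusion thresholds concrete, here is a minimal Python sketch of a valid-value filter. The intensities, condition labels and function names are hypothetical, and applying the rule per condition is an assumption made for illustration; it is not a description of the authors' pipeline or of the cited study's exact implementation.

```python
def count_valid(values):
    """Number of replicates in which the site was quantified (None = missing)."""
    return sum(v is not None for v in values)


def keep_site(intensities_by_condition, max_missing):
    """Keep a phosphosite only if every condition has at most
    `max_missing` missing replicates."""
    return all(
        len(values) - count_valid(values) <= max_missing
        for values in intensities_by_condition.values()
    )


# Hypothetical phosphosite: quantified once in 4 amastigote replicates
# and three times in 4 promastigote replicates.
site = {
    "ama": [None, 1.2e5, None, None],
    "pro": [3.1e5, 2.9e5, 3.4e5, None],
}

# Lenient rule: a single detection per condition is enough,
# so up to 3 of 4 values would be imputed.
lenient = keep_site(site, max_missing=3)

# Stricter rule: at most one missing value per condition.
strict = keep_site(site, max_missing=1)

print(lenient, strict)
```

      Under the lenient rule this site is retained and three of its four amastigote values would come from imputation; under the stricter rule it is excluded before any imputation takes place.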

      For the metabolomics analysis it looks like 2 amastigote samples were compared against 4 promastigote samples. Why not triplicates of each?

      Minor comments:

      Specific experimental issues that are easily addressable.

      Are prior studies referenced appropriately?

      Yes

      Are the text and figures clear and accurate?

      The write up is clear, with the data presented coherently for each method. The analyses that link everything together are well discussed. The figures are mostly clear (see below) and are well described in the legends. There is good use of graphics to explain the experimental designs and sample names - although it is unclear if technical replicates are defined in these figures.

      Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      As I have understood it, the authors have calculated the "phosphostoichiometry" as the ratio of the change in phosphopeptide abundance to the change in total protein abundance. This is detailed in the supplemental method (see below). Whilst this has normalised the data, it has not resulted in an occupancy or stoichiometry measurement, which is measured between 0 and 1 (0% to 100%). The normalisation has probably been sufficient and useful for this analysis, but this section needs to be re-worded to be more precise about what the authors are doing and presenting. These concepts are nicely reviewed by Muneer, Chen & Chen 2025 (PMID: 39696887), who reference seminal papers on determination of phosphopeptide occupancy, and this may be a good place to start. An alternative phrase should be used to describe the ratio of ratios calculated here, not phosphostoichiometry.
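      The distinction between a ratio of ratios and an occupancy can be shown with a short Python sketch. All numbers and function names are hypothetical; the sketch does not reproduce the limma-based contrast test used by the authors, it only shows why a protein-normalised fold change is not bounded between 0 and 1 and why the same normalised change is compatible with very different occupancies.

```python
import math


def log2_ratio_of_ratios(phos_a, phos_b, prot_a, prot_b):
    """log2 fold change of the phosphopeptide minus that of its parent protein."""
    return math.log2(phos_b / phos_a) - math.log2(prot_b / prot_a)


def occupancy(phospho_amount, unmodified_amount):
    """Fraction of molecules phosphorylated at a site (0 to 1).

    Requires absolute amounts of both the modified and unmodified forms,
    which relative label-free intensities do not provide directly.
    """
    return phospho_amount / (phospho_amount + unmodified_amount)


# Hypothetical intensities in condition a and condition b:
# the phosphopeptide rises 8-fold while the protein rises 2-fold.
normalised_change = log2_ratio_of_ratios(
    phos_a=100.0, phos_b=800.0, prot_a=1000.0, prot_b=2000.0
)
print(normalised_change)

# Two hypothetical sites with the same 4-fold occupancy increase
# but very different absolute occupancies.
low_site = (occupancy(1.0, 99.0), occupancy(4.0, 96.0))
high_site = (occupancy(20.0, 80.0), occupancy(80.0, 20.0))
print(low_site, high_site)
```

      Both hypothetical sites show a fourfold increase, yet one moves from 1% to 4% occupied and the other from 20% to 80%, which is the information a ratio of ratios cannot convey.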

      From the authors' methods describing the ratio comparison approach:

      "Another statistical test was performed in a second step: a contrasted t-test was performed to compare the variation in abundance of each modified peptide to the one of its parent unmodified protein using the limma R package {Ritchie, 2015; Smyth, 2005}. This second test allows determining whether the fold-change of a phosphorylated peptide between two conditions is significantly different from the one of its parent and unmodified protein (paragraph 3.9 in Giai Gianetto et al 2023). An adaptive Benjamini-Hochberg procedure was applied on the resulting p-values thanks to the adjust.p function of R package cp4p {Giai Gianetto, 2016} using the Pounds et al {Pounds, 2006} method to control the False Discovery Rate level."

      Several aspects of the figures that contain STRING networks are quite useful, particularly the way colour around the circle of each node is used to denote different molecular functions/biological processes. However, some have descended into "hairball" plots that convey little useful information beyond what could equally be conveyed in a table, for example. Added to this, the points on the figure are identified by gene IDs which, while clear and incontrovertible, lack human readability. I suggest that protein names could be included here too. It is also not clear what STRING data are being plotted here: what are the edges indicating, physical interactions proven in Leishmania, or inferred interactions mapped on from other organisms? Perhaps the Cytoscape network files could be provided as supplemental data so readers can explore the networks themselves?

      The title of columns in table S10 panel A are written in French, which will be ok for many people particularly those familiar with proteomics software outputs, but everything else is in English so perhaps those titles could be made consistent.

      I would suggest that the authors provide a table that has all the gene IDs of the Ld1S2D strain and the orthologs for at least one other species that is in TriTrypDB. This would make it easy to interrogate the data and make it a more useful resource for the community who work on different strains and species of Leishmania. Although this data is available it is a supplemental material file in a previous paper (Bussotti et al PNAS 2021) and not easy to find.

      Figure 5b - from the legend it is not clear where the confidence values were derived in this analysis, although this is explained in the supplemental method. Perhaps the legend can be a bit clearer.

      Can the authors discuss why lactacystin was used? While this is a commonly used proteasome inhibitor in mammalian cells there is concern that it can inhibit other proteases. At the concentration (10 µM) the authors used there are off-target effects in Leishmania, certainly the inhibition of a carboxypeptidase (PMID: 35910377) and potentially cathepsins as is observed in other systems (PMID: 9175783). There is a specific inhibitor of the Leishmania proteasome, LXE-408 (PMID: 32667203), which comes closer to fulfilling the SGC criteria (PMID: 26196764) for a chemical probe; why not use this? Does lactacystin inhibit a different aspect of proteasome activity compared to LXE-408? The application of lactacystin is changing the abundance of a multitude of proteins but no precision follow up is done to identify if those proteins are necessary and/or sufficient for driving/blocking differentiation. Could this be tested using precision edited lines that are unable to be ubiquitinated? There is a lack of direct evidence that the proteins protected from degradation by lactacystin are ubiquitinated. Perhaps some of these could be tagged and IP'd then probed for ubiquitin signal, or Di-Gly proteomics could be used to reveal ubiquitinated proteins? These suggestions should be considered as OPTIONAL experiments in the relevant section above.

      In the data availability RNA-seq section the text for the GEO link is:

      (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc= GSE227637) but the embedded link takes me to (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE165615), which is data for a different study. Also, the link to the GEO site for the DNA-seq isn't working and manual searches with the archive number (BioProject PRJNA1231373) do not appear to find anything. The IDs for the mass spec data in PRIDE/ProteomeXchange don't seem to bring up available datasets: PXD035697 and PXD035698.

      Referee cross-commenting

      It seems that all the reviewers' comments touch on relatively common themes - potential overinterpretation of the RNA-seq data (such as identifying/defining regulons), & statistical analysis of the proteomics/phosphoproteomics. Experimentally adding the positive control for the cell viability assay would increase robustness of that workflow, and follow up of a pathway/interaction would boost the significance of the work. All the other reviews look to be fair and well justified. I don't have anything more to add or change based on reading the other reviews.

      Significance

      Provide contextual information to readers (editors and researchers) about the novelty of the study, its value for the field and the communities that might be interested.

      The following aspects are important:

      General assessment: provide a summary of the strengths and limitations of the study. What are the strongest and most important aspects? What aspects of the study should be improved or could be developed?

      Strengths:

      The molecular pathways that regulate Leishmania life-stage transitions are still poorly understood, with many approaches exploring single proteins/RNAs etc in a reductionist manner. This paper takes a systems-scale approach and does a good job of integrating the disparate -omics datasets to generate hypotheses of the intersections of regulatory proteins that are associated with life-cycle progression.

      The differentiation step studied is from amastigote to promastigote. I am not aware that this has been studied before using phosphoproteomics. The use of the hamster derived amastigotes is a major strength. While a difficult/less common model, the use of hamsters permits the extraction of parasites that are host adapted and represent "normal", host-adapted Leishmania ploidy, and the promastigote experiments are performed at a low passage number. This is a strength of the work as it reduces the interference of the biological plasticity of Leishmania when it is cultured outside the host.

      Limitations:

      Potential lack of appropriate replication (see above).

      Lack of follow up/validation of a novel signalling interaction identified from the systems-wide approach. There is a lack of assessment of whether a single signalling cascade is driving the differentiation or these are all parallel, requisite pathways. The authors state the differentiation is not driven by a single master regulator, but I am not sure there is adequate evidence to rule this in or out.

      Advance: compare the study to the closest related results in the literature or highlight results reported for the first time to your knowledge; does the study extend the knowledge in the field and in which way? Describe the nature of the advance and the resulting insights (for example: conceptual, technical, clinical, mechanistic, functional,...).

      The study applies well established techniques without any particular technical step-change. The application of large-scale multi-omics techniques and integrated comparisons of the different experimental workflows allow a synthesis of data that is a step forward from that existing in the previous Leishmania literature. It allows the generation of new hypotheses about specific regulatory pathways and crosstalk that potentially drive, or are at least active, during amastigote>promastigote differentiation.

      Audience: describe the type of audience ("specialized", "broad", "basic research", "translational/clinical", etc...) that will be interested or influenced by this research; how will this research be used by others; will it be of interest beyond the specific field?

      This manuscript will be of primary interest to researchers studying the molecular and cell biology of Leishmania and other kinetoplastid parasites. The approaches used are quite standard (so not so interesting in terms of methods development etc.) and, given the specific quirks of Leishmania biology, it may not be that relevant to those working more broadly on parasites from different clades/phyla, or to those working on opisthokont systems (yeast, humans etc.). Other Leishmania-focused groups will surely cherry-pick interesting hits from this dataset to advance their studies, so this dataset will form a valuable reference point for hypothesis generation.

      Please define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Relevant expertise: Trypanosoma & Leishmania molecular & cell biology, RNA-seq, proteomics, transcriptional/epigenetic regulation, protein kinases - some experience of UPS system.

      I have not provided comment on the metabolomics as it is outside my core expertise. However, I can see it was performed at one of the leading parasitology metabolomics labs.

    5. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This study confirms and extends previous observations of discordance between RNA and protein abundances, characterized by major counter-correlations, during amastigote-to-promastigote differentiation in Leishmania parasites in culture, using a systems-level approach. The authors leverage robust omics techniques and sophisticated computational analyses to explore potential post-transcriptional and -translational control interplay in Leishmania. However, the narrative becomes somewhat diffuse with the shift to putative multilevel regulatory networks, which would benefit from further experimental validation.

      Major issues

      1. Fig.1D suggests a significant portion of the SNPs are exclusive, with a frequency of zero in one of the two stages. Were only the heterozygous and minor alleles plotted in Fig.1D, since frequencies close to 1 are barely observed? Is the same true in Sup Fig. S2B? Why do chrs 4 and 33 show unusual patterns in S2B?
      2. Chr26 revealed a striking contrasting gene coverage between H-1 and the other two samples. While a peak is observed for H-1 in the middle of this chr, the other two show a decrease in coverage. Is there any correlation with the transcriptomic/proteomic findings?
      3. The term "regulon" is used somewhat loosely in many parts of the text. Evidence of co-transcriptomic patterns alone does not necessarily demonstrate control by a common regulator (e.g., RNA-binding protein), and therefore does not fulfill the strict definition of a regulon. It should be clear whether or not the authors are highlighting potential multiple inferred regulons within a list of genes. Perhaps "functional gene module" or "cluster" would be more appropriate terms.
      4. It is unclear whether the findings in Fig.3E are based on previous analysis of stage-specific rRNA modifications or inferred from the pre-snoRNA transcriptomic data in the current work or something else. I struggle to find the significance of presenting this here.
      5. The protein turnover analysis is missing the critical confirmation of the expected lactacystin activity on the proteasome in both ama and pro. A straightforward experiment would be an anti-polyUb western blotting using a low concentration SDS-PAGE or a proteasome activity assay on total extracts.
      6. The viability tests upon lactacystin treatment need a positive control for the PI and the YoPro staining (i.e., permeabilized or heat-killed promastigotes).
      7. I found that the section on regulatory networks was somewhat speculative and less focused. Several of the associated conclusions are, in some parts, overstated, such as in "uncovered a similar recursive feedback loop" (line 566) or "unprecedented insight into the regulatory landscape" (line 643). It would be important to provide some form of direct evidence supporting a functional connection between phosphorylation/ubiquitination, ribosome biogenesis/proteins and gene expression regulation.

      Minor issues

      1. The ordinal transition words "First,"/"Second," are used too frequently in explanatory sections. I noted six instances. I suggest replacing or rephrasing some to improve flow.
      2. Ln 168: Unformatted citations were given for the Python packages used in the study.
      3. Fig.1D: "SNP frequency" is the preferred term in English.
      4. Fig.2A: not sure what "counts}1" means.
      5. Ln 685: "Transcripts with FC < 2 and adjusted p-value > 0.01 are represented by black dots" -> This sentence is inaccurate. The intended wording might be: "Transcripts with FC < 2 OR adjusted p-value > 0.01 are represented by black dots"
      6. Ln 698: Same as ln 685 mentioned above.
      7. Fig.2B and elsewhere: The legend key for the GO term enrichment is a bit confusing. It seems like the color scales represent the adj. p-values, but the legend keys read "Cluster efficiency" and "Enrichment score", while those values are actually represented by each bar length. Does light blue correspond to a max value of 0.05 in one scale, and dark blue to a max value of 10^-7 in the other scale?
      8. Sup Figure S3A and S4A: The hierarchical clustering dendrograms are barely visible in the heatmaps.
      9. S3A Legend: The following sentence sounds a bit awkward: "Rows and columns have been re-ordered thanks to a hierarchical clustering". I suggest switching "thanks to a hierarchical clustering" to "based on hierarchical clustering".
      10. Fig.5D: The font size everywhere except the legend key is too small. In addition, on the left panel, gene product names are given as a column, while on the right, the names are shown below the GeneIDs. Consistency would make it clearer.
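
      The point raised in minor issues 5 and 6 can be checked with a short sketch. It assumes that "significant" transcripts are those passing both cut-offs and that FC is an absolute fold change; the function names and the four example transcripts are hypothetical and are not taken from the manuscript.

```python
# Thresholds taken from the legend wording quoted in minor issue 5;
# treating FC as an absolute fold change is an assumption.
FC_CUTOFF = 2.0
P_CUTOFF = 0.01

def is_significant(fc, adj_p):
    """Coloured dot: passes BOTH the fold-change and the p-value threshold."""
    return fc >= FC_CUTOFF and adj_p <= P_CUTOFF

def black_dot_or(fc, adj_p):
    """Suggested wording: black if EITHER threshold is failed."""
    return fc < FC_CUTOFF or adj_p > P_CUTOFF

def black_dot_and(fc, adj_p):
    """Legend as written: black only if BOTH thresholds are failed."""
    return fc < FC_CUTOFF and adj_p > P_CUTOFF

# Hypothetical transcripts: (fold change, adjusted p-value).
examples = [(3.0, 0.001), (3.0, 0.5), (1.2, 0.001), (1.2, 0.5)]

# Transcripts that end up neither significant nor black under each wording.
unclassified_or = [t for t in examples
                   if not is_significant(*t) and not black_dot_or(*t)]
unclassified_and = [t for t in examples
                    if not is_significant(*t) and not black_dot_and(*t)]

print("OR wording leaves unclassified:", unclassified_or)
print("AND wording leaves unclassified:", unclassified_and)
```

      With the "OR" wording every transcript is either significant or black, whereas the "AND" wording leaves transcripts that fail only one threshold without a category.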

      Significance

      At its current stage, this work represents a robust resource for molecular parasitology research programs, paving the way for mechanistic studies on multilayered gene expression control. It would benefit from experimental evidence for some of the claims concerning the in silico regulatory networks. Terms like "regulons" and "recursive feedback loop" are employed without solid confirmation or extensive literature support. In my view, the most relevant contribution of this study is the direct association between proteasome-dependent degradation and Leishmania differentiation.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      General Response to Review

      We would like to thank all three reviewers for their encouraging comments on our manuscript. We now submit our revised study after considerable efforts to address each of the reviewers' concerns. I will first provide a response related to a major change we have made in the revision that addresses a concern common to all three reviewers, followed by a point-by-point response to individual comments.

      Replacing LRRK2ARM data with a LRRK2-specific type II kinase inhibitor: The most critical issue for all 3 reviewers was the use of our new CRISPR-generated truncation mutant of LRRK2 that we called LRRK2ARM. We had not provided direct evidence of the protein product of this truncation, which was a significant limitation. To address this we performed proteomics analysis of all clones, and to our surprise, we identified 7 peptides that were C-terminal to the "predicted" stop codon we had engineered into the CRISPR design. A repeat of the deep sequencing analysis in both directions then more clearly revealed site-specific mutations leading to 4 amino acid changes at the junction of exon 19, without introducing a stop codon. Given that we could not detect the protein by western blot (even though proteomics now indicated the region of LRRK2 recognized by our antibodies was present), we decided to remove this clone from the manuscript. In the meantime we had compared the ineffectiveness of MLi-2 in blocking Rab8 phosphorylation during iron overload in the LRRK2G2019S cells with the effect of a type II kinase inhibitor called rebastinib. The data showed very clearly that treatment with rebastinib reversed the iron-induced phospho-Rab8 at the plasma membrane (and by western blot, in new Fig 3). Since this inhibitor is very broad spectrum, inhibiting ~30% of the kinome, we reached out to Sam Reck-Peterson and Andres Leschziner, experts in LRRK2 structure/function, who recently developed much more selective LRRK2-specific type II kinase inhibitors they called RN341 and RN277 (developed with Stefan Knapp PMID: 40465731). These compounds effectively coupled the MLi-2 compound through an indole ring to a rebastinib type II compound to provide LRRK2 binding specificity to the efficient DYG "out" type II inhibitor. As with rebastinib, the new LRRK2-specific kinase inhibitors also effectively reversed the cell surface p-Rab8 seen in LRRK2G2019S, iron loaded cells.

      These new data provide the first biological paradigm where the kinase activity of LRRK2 is resistant to type I MLi-2, yet remains highly sensitive to type II inhibitors. While the loss of our LRRK2ARM clone marks a significant change in the manuscript we believe the main message is stronger with the addition of the new LRRK2 specific type II kinase inhibitor. Our data show that it is indeed the active kinase function of LRRK2G2019S that is impacting the iron phenotypes we observe but highlight the conformational specificity upon iron overload such that MLi-2 is ineffective. The overall phenotypes we observe in LRRK2G2019S macrophages remain unchanged and are now expanded within the manuscript. We hope reviewers will agree that our work provides important new insights into LRRK2 function in iron homeostasis while opening new avenues of research in future studies.

      Given this new information we have changed the title from "LRRK2G2019S acts as a dominant interfering mutant in the context of iron overload" to the more accurate "LRRK2G2019S interferes with NCOA4 trafficking in response to iron overload leading to oxidative stress and ferroptotic cell death."

      Response to Reviewer 1

      Reviewer 1 (R1): There are two major concerns with the data in their present form. In brief, first, the G2019S cells express much less LRRK2 and more Rab8 than the WT cells and this severely affects interpretability.

      Heidi McBride (HM): We agree that the LRRK2G2019S lines express lower levels of LRRK2 than wild type, which is a previously documented phenomenon, presumably as the cell attempts to downregulate the increased kinase activity by reducing protein expression. However, the levels of Rab8 across tens of experiments do not consistently show any differences between the wild type, G2019S and KO. We have provided more comprehensive quantifications of the blots in the revised version, and the Rab8 levels are consistent across all the blots presented in the manuscript (Figure 1A and 1B).

      R1: Second, the investigators used CRISPR to truncate the endogenous LRRK2 locus to produce a hypothetical truncated LRRK2-ARM polypeptide. This appears to have robust effects on NCOA4, in particular, which drives the overall interpretation of the data. However, the expression of this novel LRRK2 species is not confirmed nor compared to WT or G2019S in these cells (although admittedly the investigators did seek to address this with subsequent KO in the ARM cells). It would be premature to account for the changes reported without evidence of protein expression. This latter issue may be more easily addressed and could provide very strong support for a novel function/finding, see more detailed comments below, most seeking clarifications beyond the above.

      HM: As described in my common response above, we have removed the LRRK2ARM data from the manuscript.

      R1: Need to make clear in the results whether the G2019S CRISPR mutant is heterozygous or homozygous (presumably homozygous, same for ARM)

      HM: The RAW cell line we generated is homozygous for the G2019S and the KO alleles. We added this to the beginning of the results section and methods.

      R1: The text of the results implies that MLi2 was used in both WT and G2019S Raw cells, but it's only shown for G2019S. Given the premise for the use of RAW cells, it's important to show that there is basal LRRK2 kinase activity in WT cells to go along with its high protein expression. This is particularly important as the G2019S blot suggests minor LRRK2-independent phosphorylation of Rab8a (and other detected pRabs). One would imagine that pRab8 levels in both WT and G2019S would reduce to the same base line or ratio of total Rab in the presence of MLi2, but WT untreated is similar to G2019S with MLi2. This suggests no basal LRRK2 activity in the Raw cells, but I don't think that is the case.

      HM: We have included the data from MLi-2 treatment of wild type cells in Fig 3C, quantified in D. Again, the baseline levels of Rab8 are unchanged across the genotypes. However, the reviewer is correct that there is some baseline LRRK2 kinase activity that is sensitive to MLi-2 in wild type cells. This is seen most clearly in the autophosphorylation of LRRK2 at S1292 in Fig 3C. The pRab8 blot is not as clear in wild type cells. It is likely that LRRK2 must be actively recruited to membranes (as seen by others with LLOMe, etc.) to easily visualize p-Rabs in wild type cells. Nevertheless, we do clearly see autophosphorylation activity in wild type cells. Therefore, while we understand the reviewer's point that there should be some Rab8 phosphorylation in wild type cells, we don't see a significant, or very convincing, amount of it in our RAW macrophages.

      R1: Also, in terms of these cells, the levels of LRRK2 are surprisingly unmatched (Fig 1A, 1D, 1H, S1D, etc.) as are total levels of Rab8 (but in opposite directions) between the WT and G2019S. This is not mentioned in the Results text and is clearly reproducible and significant. Why do the investigators think this is? If Rab8 plays a role in iron, how do these differences affect the interpretation of the G2019S cells (especially given that MLi2 does not rescue)? Are other LRRK2-related Rabs affected at the protein (not phosphorylation) level? Could reduced levels of LRRK2 or increased Rab8, alone or together, account for some of these differences? Substantial further characterization is required as this seriously affects the interpretability of the data. Since pRab8 is not normalized to total Rab8, this G2019S model may not reflect a total increase in LRRK2 kinase activity, and could in fact have both less LRRK2 protein and less cellular kinase activity than WT (in this case).

      HM: In our hands, the RAW cells with homozygous LRRK2G2019S mutations show clearly that the total protein level of LRRK2 is reduced compared to wild type, which is likely a compensatory effect to reduce cellular kinase activity overall. We understand that some of our previous blots were not so clear on the total Rab8 levels across the different experiments. We have repeated many of these experiments and hope the reviewer can see in Figs 1A, 3C, 3E, 3J, and Sup3A that the total Rab8 levels are stable across the conditions. We also present quantifications from 3 independent experiments normalizing the pRab8/Rab8 levels in all three genotypes in untreated and iron-loaded conditions (Supp Fig 3A and B), and upon MLi-2 treatment (Fig 3C). In 3C and D the data show the effectiveness of MLi-2 in reducing pRab8 in control conditions, but the resistance to MLi-2 in FAS treated cells.

      R1: Presumably, the blots in 1H are whole cell lysates and account for the pooled soluble and insoluble NCOA4 (increased in G2019S), as there is no difference in soluble NCOA4 (Fig 2H). I suspect the prior difference is nicely reflected in the insoluble fraction (Fig 2H). This should be better explained in the Results text. This is a very interesting finding and I wonder what the investigators believe is driving this phenotype? Is the NCOA4 partitioning into a detergent-inaccessible compartment? Does this replicate with other detergents, those perhaps better at solubilizing lipid rafts? Is this a phenotype reversible with MLi2? Very interesting data.

      HM: We apologize for not being clearer in the text describing the behavior of NCOA4. The reviewer is correct that the major change in G2019S is the increased Triton X-100 insoluble NCOA4. Previous work has established that NCOA4 segregates into detergent-insoluble foci upon iron overload as a way to release it from ferritin cages, and this fraction is then internalized into lysosomes through a microautophagy pathway (see Mizushima's work PMID: 36066504). In Fig 1I we show that the elevation in NCOA4 and ferritin heavy chain seen in untreated G2019S cells can be cleared upon iron chelation with DFO, indicating that the canonical NCOA4-mediated ferritinophagy (macroautophagy) pathway remains intact to recycle the iron in conditions of iron starvation. However, in Figure 2 we show that in conditions of iron overload, when NCOA4 segregates from ferritin (to allow cytosolic storage of iron), this form of NCOA4 cannot be degraded within the lysosome through the microautophagy pathway, and begins to accumulate. We see this with our live and fixed imaging compared to wild type cells (Fig 2A,D), and by the lack of clearance seen by western blot (Fig 2E). As for the impact of MLi-2, we observe some reversal of NCOA4 accumulation in untreated cells at 4 and 8 hrs after MLi-2 treatment (Supp Fig 2F). However, in iron loaded conditions the high NCOA4 levels in G2019S cells are MLi-2 insensitive, while the elevated NCOA4 in wild type cells is reduced upon MLi-2 addition (Fig. 2F, compare lanes 3 vs 4 in wt with lanes 7 vs 8 in G2019S). This is consistent with a block in the microautophagy pathway of phase-separated NCOA4 degradation in G2019S cells.

      R1: Figure 2 describes the increased NCOA4-positive iron structures after iron load, but does not emphasize that the G2019S cells begin preloaded with more NCOA4. How do the investigators account for differential NCOA4 in this interpretation? Is this simply a reflection of more NCOA4 available in G2019S cells? This seems reasonable.

      HM: The reviewer is correct. We showed that there is some turnover of NCOA4 in untreated conditions through canonical ferritinophagy, but in iron overload this appears to be blocked: the NCOA4 segregates from ferritin and remains within insoluble, phase-separated structures that cannot be degraded through microautophagy. We have rewritten the text to be clearer on these points.

      R1: These are very long exposures to iron, some as high as 48 hr which will then take into account novel transcriptomic and protein changes. Did the investigators evaluate cell death? Iron uptake would be trackable much quicker.

      HM: We agree that many things will change after our FAS treatments and now provide a full proteomics dataset on wild type and G2019S cells with and without iron overload, which is presented in Figure 4A-B. Indeed Figure 4 is entirely new to this revised submission. The proteomics highlighted a series of cellular changes that reflect major cell stress responses including the upregulation of HMOX1 (western blots to validate in Supp Fig 4A), an NRF2 transcriptional target, consistent with our observation that NRF2 is stabilized and translocated to the nucleus in G2019S iron loaded cells (Sup Fig 4B,C). There are several interesting changes, and we highlighted the three major nodes: changes in iron response proteins; changes in lysosomal proteins, particularly a loss of catalytic enzymes like lysozymes and granzymes, consistent with the loss of hydrolytic capacity we show in Fig. 4C,D; and changes in cytoskeletal proteins, which we suspect are consistent with the "blebbing" of the plasma membrane we see decorated with pRab8 in Fig 3. To test the activation of lipid oxidation likely resulting from the elevation in Fe2+ and oxidation signatures we employed the C11-bodipy probe and observed a strong signal specific to the G2019S iron-loaded cells, particularly labelling endocytic compartments and the cell surface (Fig. 4E-G).

      Lastly, an analysis of SYTOX green uptake experiments was done to monitor the uptake of the dye into cells that have died of cell membrane rupture, commonly used to examine ferroptotic cell death. We now show the G2019S cells are very susceptible to this form of death (Fig 4H,I). These data add new functional evidence for the consequence of the G2019S mutation in an increased susceptibility to iron stress.

      R1: The legend for 2F is awkward (BSADQRED)

      HM: We have changed this to BSA-DQRed, which is a widely used probe to monitor the hydrolytic capacity of the lysosome.

      R1: Why are WT cells not included in Fig 2G?

      HM: We have now included new panels in Fig 3C,D showing wild type and G2019S +/- FAS and +/- MLi-2 with quantifications of pRab8/Rab8.

      R1: The biochemical characterization of NCOA4 in the LRRK2-arm cells is a great experiment and strength of the paper. The field would benefit by a bit further interrogation, other detergents, etc.

      HM: We have removed all of the LRRK2ARM data given our confusion over the impact of the 4 amino acid changes in exon 19 and our inability to monitor this protein by western blot. The concept that NCOA4 enters into TX100 insoluble, phase separated compartments has been well established, so we didn't explore other detergents at this point.

      R1: Have the investigators looked for aberrant Rab trafficking to lysosomes in the LRRK2-arm cells? Is pRab8 mislocalized compared to WT? Other pRabs?

      HM: We did initially show that pRab8 was also at the plasma membrane in the LRRK2ARM cells, and we still focus on this finding for the G2019S, seen in Fig 3A,B,F,H. We did try to look at other p-Rabs known to be targets of LRRK2 but none of them worked in immunofluorescence so we couldn't easily monitor specific traffic and/or localization changes for them.

      R1: The expression levels and therefore stability of the ARM fragment are not shown. This is necessary for interpretation. While very intriguing, the data in Aim 3 rely on the assumption that the ARM fragment is expressed, and at comparable levels to G2019S to account for phenotypes. The generation of a second clone is admirable, but the expression of the protein must be characterized. This is especially true because of the different LRRK2 levels between WT and G2019S. One could easily conceive of exogenous expression of a tagged-ARM fragment into LRRK2 KO cells, for example, as another proof-of-concept experiment. If it is truly dominant, does this effect require or benefit from some FL LRRK2? It seems easy enough to express the LRRK2-ARM in at least WT and KO RAW cells.

      HM: We agree and our attempts to understand this clone resulted in its removal from the manuscript. We did also express cDNA encoding our ARM domain (up to exon 19), but it didn't phenocopy the CRISPR clone, which of course made sense once we had better proteomics and repeated our deep sequencing.

      In our further efforts to understand why our phenotype was MLi-2 resistant upon iron overload we expanded to examine the impact of pan-specific type II kinase inhibitors, and then reached out to the Reck-Peterson and Leschziner labs to obtain a newly developed LRRK2 selective type II kinase inhibitor. These all very efficiently reversed the pRab8 signals seen at the plasma membrane of G2019S cells upon iron overload (Fig 3E-K). Therefore the G2019S is not dominant negative, as we had initially supposed; rather, there is a specific conformation of LRRK2 in high iron that potentially opens the ATP binding pocket to bind the type II inhibitors, but not MLi-2. We do not understand exactly what this conformation is, but it likely involves new protein interactions specific to high iron, or perhaps LRRK2 binds iron directly as a sensor in a way that ultimately leads to the differential sensitivity we observe between type I and type II kinase inhibitors. Our data indicate that MLi-2 treatment in the clinic will not be protective against iron toxicity phenotypes that may contribute to PD, whereas these newer selective type II LRRK2 kinase inhibitors would be effective in this conformation-specific context of iron toxicity.

      R1: Does iron overload induce Rab8a phosphorylation in a LRRK2 KO cell? This would be a solid extension on the ARM data and support the important finding that an additional kinase(s) can phosphorylate Rab8a under these conditions, and while not unexpected, this may not have been demonstrated by others as clearly. It also addresses whether the ARM domain is important to this other putative kinase(s), which may add value to the authors' model.

      HM: Iron overload does not induce pRab8 in LRRK2 KO cells, as seen by immunofluorescence in Fig 3A,B, and western blot in Supp Fig 3 A,B. With our new type II kinase inhibitor data we can confirm that the plasma membrane localized Rab8 is indeed phosphorylated by LRRK2.

      R1: Minor concern - the abstract but not the introduction emphasizes a hypothesis that loss of neuromelanin may promote cell loss in PD (through loss of iron chelation), while post mortem studies are by definition only correlative, early works suggested that the higher melanized DA neurons were preferentially lost when compared to poorly melanized neurons in PD. This speculation in the abstract is not necessary to the novel findings of the paper.

      HM: We appreciate that the links to iron in PD are correlative, but we have maintained some of our discussion on this point within the manuscript given the lack of attention the field has paid to the cell biology of iron homeostasis in PD models. If there is a cell autonomous nature to the loss of DA neurons in PD, iron is very likely to be a part of this specificity in our opinion. Most of the newer MRI studies looking at iron levels in patient brains are showing higher free iron and are working on this as a potential biomarker of disease. The precise timing of this relative to the stability/loss of neuromelanin is, I agree, not really clear.

      R1: (Significance (Required)): This study could shed light on a novel and unexpected behavior of the LRRK2 protein, and open new insights into how pathogenic mutations may affect the cell. While studied in one cell line known for unusually high LRRK2 expression levels, data in this cell type have been broadly applicable elsewhere. Given the link to Parkinson's disease, Rab-dependent trafficking, and iron homeostasis, the findings could have import and relevance to a rather broad audience.

      HM: We are so very appreciative that reviewer 1 feels our work will be of interest to the PD and cell biology communities.

      Response to Reviewer 2

      Reviewer 2 (R2): Major: Please confirm that the observed phenotype is conserved within bone marrow-derived macrophages of LRRK2 G2019S mice. These mice are widely available within the community and frozen bone marrow could be sent to the labs. The main reason for this experiment is that CRISPR macrophage cell lines do sometimes acquire weird phenotypes (at least in our lab they sometimes do!) and it would strengthen the validity of the observations.

      HM: We did a series of experiments on primary BMDM derived from 3 pairs of wild type, LRRK2G2019S and LRRK2KO mice. We examined levels of ferritin heavy and light chains in steady state and with FAS treatment. Unfortunately the data did not phenocopy the RAW macrophage lines we present here, since FTL and FTH were mostly unchanged. We did observe an increase in NCOA4 levels, consistent with potential issues with microautophagy as observed in our RAW system.

      While we understand the danger that our phenotypes are nonspecific and linked to a CRISPR-based anomaly, there are a number of arguments we would make that these data and pathways are potentially very important to our understanding of LRRK2 mutant phenotypes and pathology. The first point is that we now include a LRRK2-specific type II kinase inhibitor that reverses the iron-overload pRab8 accumulation at the plasma membrane in LRRK2G2019S cells, showing that this is at least directly linked to LRRK2 kinase activity, even though it is resistant to MLi2.

      Second, Suzanne Pfeffer recently published their single cell RNAseq datasets from brains of untreated LRRK2G2019S mice (PMID: 39088390). She reported major changes in Ferritin heavy chain (it is lost) in very specific cell types of the brain, astrocytes, microglia and oligodendrocytes, with no changes in other cell types at all (her Fig 6 included left). This is consistent with a very context specific impact of LRRK2 on iron homeostasis that we don't yet understand.

      Third, the labs of Cookson, Mamais and Lavoie have been working on the impact of LRRK2 mutations on iron handling in a few different model systems, including iPSCs, and see changes in transferrin recycling and iron accumulation. Those studies did not go into much detail on ferritin, NCOA4 and other readouts of iron homeostasis but are roughly in agreement with our work here. In the last bioRxiv study, submitted after we sent this work for review, they concluded their phenotypes were reversed by MLi-2 treatment; however, they required 7 days of treatment for a ~20% restoration in iron levels. Given our work it would seem the impact of LRRK2G2019S in high iron conditions is also very resistant to MLi-2 treatment. In all these studies we do not yet know for sure whether iron overload in the brain may be a precursor to DA neuron cell death, which could be exacerbated in G2019S carriers. But we hope the reviewer will agree that our approach and findings will be useful for the field to expand on these concepts within different models of PD.

      R2: Minor comments: Supplementary Fig 1: I don't think one should normalize all controls to 1 and then do a statistical test as obviously the standard deviation of control is 0.

      HM: We agree with the reviewer that statistical testing is not appropriate when the WT control is fixed to a value of 1, as this necessarily eliminates variance in that group; accordingly, we have removed both statistical comparisons and standard deviation from the WT control while retaining variability measures for all experimental conditions. Raw densitometry values could not be pooled across independent experiments due to substantial inter-blot variability, and therefore normalization to the WT control was used solely to allow relative comparison within experiments, acknowledging the inherent quantitative limitations of Western blot densitometry. Ultimately the magnitude of the changes relative to the control lanes in each biological replicate was consistent across experiments, even if the absolute density of the bands between experiments was not always the same.
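
      The statistical point can be illustrated with a minimal sketch. The densitometry numbers below are invented purely for illustration (they are not values from the manuscript), and the helper name `normalize_to_control` is hypothetical.

```python
from statistics import mean, stdev

# Hypothetical densitometry readings (arbitrary units): one (WT, mutant)
# pair per independent blot. Absolute intensities differ widely between blots.
blots = [(1200.0, 2100.0), (450.0, 900.0), (3000.0, 4800.0)]

def normalize_to_control(pairs):
    """Divide every band by the WT band of the same blot."""
    wt_norm = [wt / wt for wt, _ in pairs]    # always exactly 1.0
    mut_norm = [mut / wt for wt, mut in pairs]
    return wt_norm, mut_norm

wt_norm, mut_norm = normalize_to_control(blots)

print("WT normalized:", wt_norm, "SD =", stdev(wt_norm))
print("Mutant normalized:", mut_norm, "mean =", round(mean(mut_norm), 3),
      "SD =", round(stdev(mut_norm), 3))
```

      After per-blot normalization the control group has no variance by construction, so a two-sample comparison against it is not meaningful, while the relative change in the experimental group can still be reported with its spread.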

      R2: The raw data needs to be submitted to PRIDE or similar.

      HM: All of our data are being uploaded to the GEO databases, protocols to protocols.io, and raw data deposited on the Zenodo site, in compliance with our ASAP funding requirements and those of the journal.

      R2: Some of the western blots could be improved. If these are the best shown, I am a little concerned about the reproducibility. How often have they been done?

      HM: We now ensure there is quantification of all the blots for at least 3 independent experiments and have worked to improve the quality of them throughout the revision period.

      R2: (Significance (Required)): Considering the importance of LRRK2 biology in Parkinson's and the new biology shown, this paper will be of great interest to the community and wider research fields.

      HM: We are so very grateful that the reviewer appreciates that the LRRK2 and PD community will find our work of interest. We hope our revisions will prove satisfactory even in the absence of ferritin changes in primary G2019S BMDM.

      Response to Reviewer 3

      Reviewer 3 (R3): What is missing in the study is the physiological relevance of these findings, mainly whether this effect actually results in higher cell death during iron overload. Since iron overload is known to result in ferroptosis, it is surprising that the authors have not checked whether the LRRK2 G2019S and ARM cells undergo more ferroptosis relative to LRRK2 WT cells.

      HM: We thank the reviewer for pushing us to monitor the functional implications of the iron mishandling upon iron overload in the G2019S RAW cell system. We now add a completely new Figure 4 to address these functional points. We employed two tools to look at established aspects of ferroptosis. First, we used the C11-Bodipy probe, which labels oxidized lipids, and we see significant signals specific to the iron-loaded G2019S cells, where it labels endocytic membranes and the cell surface (Fig 4 E-G). This is consistent with the elevation of free ferrous iron (Fe2+). Second, we used the SYTOX green death assay, in which the dye is internalized into cells when the cell surface is ruptured, and show that G2019S cells die upon iron overload, but not the LRRK2 KO or wild-type cells (Fig 4 H,I). Lastly, we performed full proteomics analysis of the WT and G2019S RAW cells in iron overload conditions. These data provide a better view of the full stress response initiated in the G2019S cells, including the upregulation of HMOX1 (an NRF2 target gene), changes in lysosomal hydrolytic enzymes consistent with the reduction in BSA-DQRed signals, and changes in the cytoskeleton, which are consistent with the plasma membrane blebbing phenotypes we see in G2019S cells (Fig. 4A-D and Supp. Fig 4 data). We hope these new data help to position the phenotype into a more physiological output.

      R3: Moreover, their conclusion of the findings as "resistant to LRRK2 kinase inhibitors" is not convincing, since in most of the studies, they have removed the kinase domain, and this description implies the use of pharmacological kinase inhibition which has not been done in this paper.

      HM: We took this comment to heart and, as explained in the general response, we removed the LRRK2ARM clones from the study. To understand the kinase function in the iron overload conditions we first explored the pan-specific type II kinase inhibitor rebastinib, shown to inhibit LRRK2. In contrast to MLi-2, this drug effectively blocked p-Rab8 in G2019S cells exposed to high iron. However, since it is not specific and likely inhibits about 30-40% of all kinases, we reached out to the Reck-Peterson and Leschziner labs, who have developed LRRK2-specific type II kinase inhibitors (published in June 2025, PMID: 40465731). They provided these to us (along with a great deal of discussion) and the two drugs both blocked the effect of LRRK2G2019S on p-Rab8 at the plasma membrane. These data show that the phenotypes we observe are indeed linked to the increased kinase activity of LRRK2, even though they are fully resistant to MLi-2. This suggests that high iron results in some alteration in LRRK2 conformation that impairs the ability of MLi-2 to block the kinase activity, while still allowing the type II kinase inhibitors, which bind deeper in the ATP-binding pocket, to functionally block activity. We believe that these new data remove a great deal of the confusion we had in the initial submission in explaining the MLi-2 resistance.

      R3: There is lower LRRK2 expression in LRRK2 G2019S cells, have the authors checked Rab phosphorylation to validate the mutation?

      HM: We agree that the G2019S mutation leads to a reduction in total LRRK2 levels in the cell, which is likely a compensatory effect to lower kinase activity in the cell. We do show that the G2019S mutation clearly activates phosphorylation of both Rab8 and the autophosphorylation site S1292 of LRRK2, as seen in Fig 1A and quantified in Fig 1B. In untreated conditions, these phosphorylation events are reversible upon treatment with MLi-2. We also provide the sequencing data in the supplement to confirm the presence of the G2019S mutation in this clone, shown in Supp Fig. 1A.

      R3: The authors should specify if their cells are heterozygous or homozygous since they are discussing a dominant interfering mutant.

      HM: The G2019S and LRRK2 KO are both homozygous. We state this early in the results section and the methods.

      R3: The transferrin phenotype validated through proteomics and western blot is solid.

      HM: We agree, thank you very much!

      R3: Quantification in figure 1F-G is problematic, not clear what they mean by "diffuse and lysosomal". Puncta is either colocalising with lysosomes or not colocalising. This needs to be clarified and re-analysed.

      HM: We apologize for the confusion. In control cells the Cherry tagged FTL is efficiently cycling through the lysosomes and we don't see a strong cytosolic (diffuse) pool, which likely reflects the relatively iron-poor culture conditions. However, in G2019S cells, there is a highly elevated amount of FTL, with a strong cytosolic/diffuse stain in steady state, with some flux into lysosomes. In this experiment we chelated iron to test whether this cytosolic pool of FTL was capable of clearing through the lysosomes (ferritinophagy). While there is a cytosolic (diffuse) pool that remains, the pool that fluxes into the lysosome increases in G2019S chelated cells. This is also seen by the reduction in total FTL seen by western blot (endogenous FTL). Our conclusion here is that the general ferritinophagy machinery remains functional in G2019S cells. We have changed the term "diffuse" to "cytosolic" and improved our description of this experiment in the text.

      R3: Text in the first results part called "LRRK2G2019S RAW macrophages have altered iron homeostasis" is very long. It could be divided into more sections to improve readability.

      HM: We have improved the text to be more descriptive of the conclusions and added new sections.

      R3: If the effect is armadillo-dependent, where is LRRK2 G2019S implicated, since there is no kinase domain in these cells?

      HM: Our new data employing the LRRK2-specific type II kinase inhibitors now confirm that the effects of the G2019S mutation on iron overload are indeed kinase dependent; they are simply insensitive to MLi-2.

      R3: The authors do not show any controls (PCR, sequencing) confirming knockout or truncation.

      HM: We did higher resolution proteomics and deep sequencing and learned that the "Arm" mutation was not a truncation but a series of 4 point mutations around exon 19. Therefore we removed all data referring to this clone and replaced it with the type II kinase inhibitor experiments. We feel this removes a lot of confusion and provides much clearer conclusions on the role of the kinase activity in iron overload. We may continue to explore why the 4 amino acid mutations created such strong phenotypes, as it could reflect a critical conformational change that impacts the kinase activity. But that is for future work. We now include the sequencing files of the G2019S and KO clones as Supplementary Data Files 1 and 2.

      R3: The data is interesting and the image quality with the insets is very high.

      HM: We thank the reviewer for their positive comments!

      R3: Mutant not clearly described in text, did the authors remove just the kinase and ROC-COR domains or all the domains downstream of the Armadillo domain? This is not clear.

      HM: We have removed the clone from the manuscript.

      R3: The authors cannot conclude that their phenotype is due to the independence of the kinase domain specifically as they are also interfering with the GTPase activity by removing the ROC-COR domains.

      HM: We agree and our new drugs allow us to confirm that the phenotypes are due to kinase activity, but there is a new conformation of LRRK2 induced in high iron that renders the kinase domain resistant to MLi-2 inhibition. We discuss this in the manuscript now.

      R3: In Figure 3E, is the difference between the "ARM CTRL" and the "ARM FAS" conditions significant? A trend appears to be there, but the p-value is not shown.

      HM: These data are now removed.

      R3: In figure 4A, it would have been important to check if Rab8 phosphorylation is also observed in LRRK2 KO cells after administration of FAS to further evaluate the mechanism through which this Rab8 phosphorylation is occurring.

      HM: We show that the pRab8 is specific to the G2019S lines and not seen in LRRK2 KO (Fig 3A,B, Supp. Fig. 3A,B).

      R3: The vinculin bands in figure 4A are misaligned with the rest of the bands.

      HM: We now provide new blots for all of these experiments (in Fig 3) as we removed the LRRK2ARM data from the manuscript and the appropriate loading controls are all included.

      R3: The authors do not have any controls to validate the pRab8 staining in IF. This is an important caveat and needs to be addressed.

      HM: We now include siRNA validation of Rab8 (vs Rab10) to confirm the specificity of the antibody to pRab8 in IF, where it labels the plasma membrane in G2019S iron-loaded cells.

      R3: The authors should have checked if FAS administration in the LRRK2 G2019S and the ARM cells is leading to ferroptotic cell death (or cell death in general). This is key to validate the link between the altered iron homeostasis in LRRK2 G2019S cells and increased cytotoxicity observed during neurodegeneration.

      HM: As mentioned above, we have added extensively to our new Fig 4 to include full proteomics analysis of the changes in iron loaded G2019S cells, we use C11-Bodipy probes to monitor lipid oxidation, and SYTOX green assays to monitor cell death through cell surface rupture (consistent with ferroptosis). We thank the reviewer for pushing us to do these experiments and provide further relevance to the potential for LRRK2 mutations to promote cell toxicity during neurodegeneration.

      R3: Regarding the literature, the authors are missing some important papers that are preprinted and these studies need to be discussed. This includes a report with opposite findings https://www.biorxiv.org/content/10.1101/2025.09.26.678370v1.full and a report showing kinase independent cell death in macrophages https://www.biorxiv.org/content/10.1101/2023.09.27.559807v1.abstract

      HM: We thank the reviewer for alerting us to the bioRxiv papers, one of which was submitted after we sent our manuscript to review. We are excited to see the growing interest in the impact of LRRK2 function in iron homeostasis and hope our work will contribute to this. In the study from the LaVoie lab, the authors do show some sensitivity of the iron-loaded phenotype in G2019S cells; however, they see a ~20% reduction in lysosomal iron after 7 days of MLi-2 treatment in astrocytes (their Fig 2L). To us, this is very likely an indication of a relatively high resistance to the drug. We expect that if they tried these new type II inhibitors the iron load would be much more rapidly reversed. The specificity of their phenotype to Rab8 is also very interesting considering the cell surface localization we see for pRab8 in our iron-loaded system. Similar comments apply to the Gutierrez study in macrophages. We have included the findings of these papers within the manuscript and thank the reviewer for pointing them out.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In this paper, the authors report an interesting phenotype of the LRRK2 G2019S mutation on iron homeostasis in RAW264.7 macrophages. The phenotype is well characterised through proteomic and western blot approaches investigating transferrin and ferritin trafficking. The study is well conducted and data of high quality. The authors also appear to have discovered a cellular context where Rab8 is phosphorylated independently of LRRK2. This is a major finding which can potentially have an important impact in the LRRK2 field. What is missing in the study is the physiological relevance of these findings, mainly whether this effect actually results in higher cell death during iron overload. Since iron overload is known to result in ferroptosis, it is surprising that the authors have not checked whether the LRRK2 G2019S and ARM cells undergo more ferroptosis relative to LRRK2 WT cells. Moreover, their conclusion of the findings as "resistant to LRRK2 kinase inhibitors" is not convincing, since in most of the studies, they have removed the kinase domain, and this description implies the use of pharmacological kinase inhibition which has not been done in this paper.

      Significance

      Major comments

      In Figure 1:

      • There is lower LRRK2 expression in LRRK2 G2019S cells, have the authors checked Rab phosphorylation to validate the mutation?
      • The authors should specify if their cells are heterozygous or homozygous since they are discussing a dominant interfering mutant.
      • The transferrin phenotype validated through proteomics and western blot is solid.
      • Quantification in figure 1F-G is problematic, not clear what they mean by "diffuse and lysosomal". Puncta is either colocalising with lysosomes or not colocalising. This needs to be clarified and re-analysed.
      • Text in the first results part called "LRRK2G2019S RAW macrophages have altered iron homeostasis" is very long. It could be divided into more sections to improve readability.

      In Figure 2:

      • If the effect is armadillo-dependent, where is LRRK2 G2019S implicated, since there is no kinase domain in these cells?
      • The authors do not show any controls (PCR, sequencing) confirming knockout or truncation.
      • The data is interesting and the image quality with the insets is very high.

      In Figure 3:

      • Mutant not clearly described in text, did the authors remove just the kinase and ROC-COR domains or all the domains downstream of the Armadillo domain? This is not clear.
      • The authors cannot conclude that their phenotype is due to the independence of the kinase domain specifically as they are also interfering with the GTPase activity by removing the ROC-COR domains.
      • In Figure 3E, is the difference between the "ARM CTRL" and the "ARM FAS" conditions significant? A trend appears to be there, but the p-value is not shown.

      In Figure 4:

      • In figure 4A, it would have been important to check if Rab8 phosphorylation is also observed in LRRK2 KO cells after administration of FAS to further evaluate the mechanism through which this Rab8 phosphorylation is occurring.
      • The vinculin bands in figure 4A are misaligned with the rest of the bands.
      • The authors do not have any controls to validate the pRab8 staining in IF. This is an important caveat and needs to be addressed.
      • The authors should have checked if FAS administration in the LRRK2 G2019S and the ARM cells is leading to ferroptotic cell death (or cell death in general). This is key to validate the link between the altered iron homeostasis in LRRK2 G2019S cells and increased cytotoxicity observed during neurodegeneration.

      Regarding the literature, the authors are missing some important papers that are preprinted and these studies need to be discussed. This includes a report with opposite findings https://www.biorxiv.org/content/10.1101/2025.09.26.678370v1.full and a report showing kinase independent cell death in macrophages https://www.biorxiv.org/content/10.1101/2023.09.27.559807v1.abstract

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In this manuscript the authors describe an interesting connection between the Parkinson's kinase LRRK2 and iron trafficking in RAW macrophages. Expression of the LRRK2 G2019S mutation affects the abundance of ferritin heavy and light chains and therefore the uptake and storage of iron. Interestingly, the loss of the kinase domain still had a strong phenotype, suggesting that this is independent of the kinase function.

      The paper is well written and excellently cited. The data is convincing and of good quality.

      I have only one request and else very minor comments:

      Major: Please confirm that the observed phenotype is conserved within bone marrow-derived macrophages of LRRK2 G2019S mice. These mice are widely available within the community and frozen bone marrow could be sent to the labs.

      The main reason for this experiment is that CRISPR macrophage cell lines do sometimes acquire weird phenotypes (at least in our lab they sometimes do!) and it would strengthen the validity of the observations.

      Minor comments:

      Supplementary Fig 1: I don't think one should normalize all controls to 1 and then do a statistical test as obviously the standard deviation of control is 0. I would normalize to the average of the control, which will provide an error for the control.
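      The difference between the two normalization schemes discussed in this comment can be illustrated with a minimal sketch. The densitometry values below are hypothetical and are not taken from the manuscript; the function names are likewise illustrative. Dividing each replicate by its own control fixes every control value at 1 (zero spread), whereas dividing by the mean of the control replicates keeps the control mean at 1 but preserves its replicate-to-replicate variation.

```python
from statistics import mean, stdev

# Hypothetical raw band intensities (arbitrary units), one value per
# biological replicate. These numbers are assumptions for illustration only.
control_raw = [1200.0, 950.0, 1100.0]
mutant_raw = [2500.0, 1800.0, 2300.0]


def normalize_to_control_mean(control, treated):
    """Divide every value by the mean of the control replicates."""
    ref = mean(control)
    return [c / ref for c in control], [t / ref for t in treated]


def normalize_each_to_own_control(control, treated):
    """Divide each replicate by its paired control, fixing the control at 1."""
    return [c / c for c in control], [t / c for c, t in zip(control, treated)]


ctrl_mean_norm, mut_mean_norm = normalize_to_control_mean(control_raw, mutant_raw)
ctrl_fixed, mut_fixed = normalize_each_to_own_control(control_raw, mutant_raw)

print("control, normalized to control mean:", ctrl_mean_norm, "SD:", stdev(ctrl_mean_norm))
print("control, fixed to 1 per replicate:  ", ctrl_fixed, "SD:", stdev(ctrl_fixed))
```

      Note that normalizing to the control mean assumes the raw values are comparable across replicates; when blots differ strongly in absolute intensity, that between-blot variation is carried into the control spread.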

      The raw data needs to be submitted to PRIDE or similar. This has not happened yet.

      Some of the western blots could be improved. If these are the best shown, I am a little concerned about the reproducibility. How often have they been done?

      Significance

      Considering the importance of LRRK2 biology in Parkinson's and the new biology shown, this paper will be of great interest to the community and wider research fields.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Goldman et al describe some novel findings with respect to LRRK2 and iron handling in a series of RAW macrophage cell lines. This cell background is chosen for its recognized high levels of endogenous LRRK2 protein expression, its somewhat broad use in the field, and the investigators add its relevance due to phagocytosis of red blood cells, thus requiring robust iron metabolic processes. Proteomic analyses of WT and G2019S RAW cells revealed multiple iron-related proteins affected by LRRK2 mutation. A deeper candidate-based analysis revealed complex changes in ferritin heavy and light chain and changes in ferric and ferrous iron. Notably, reliable changes in the levels and/or solubility of NCOA4 result from this pathogenic LRRK2 mutation. Unexpectedly, however, these changes were not sensitive to LRRK2 kinase inhibitor treatment. The investigators suggest a dominant effect rather than loss-of-function as subsequent experiments revealed that these effects could be replicated with a LRRK2 variant lacking the kinase domain (LRRK2-ARM) and were not replicated by LRRK2 KO. The data are internally consistent throughout and could certainly shed new important light onto unique and unexpected effects of this LRRK2 mutation.

      There are two major concerns with the data in their present form. In brief, first, the G2019S cells express much less LRRK2 and more Rab8 than the WT cells and this severely affects interpretability. Second, the investigators used CRISPR to truncate the endogenous LRRK2 locus to produce a hypothetical truncated LRRK2-ARM polypeptide. This appears to have robust effects on NCOA4, in particular, which drives the overall interpretation of the data. However, the expression of this novel LRRK2 species is not confirmed nor compared to WT or G2019S in these cells (although admittedly the investigators did seek to address this with subsequent KO in the ARM cells). It would be premature to account for the changes reported without evidence of protein expression. This latter issue may be more easily addressed and could provide very strong support for a novel function/finding. See more detailed comments below, most seeking clarifications beyond the above.

      • Need to make clear in the results whether the G2019S CRISPR mutant is heterozygous or homozygous (presumably homozygous, same for ARM)
      • The text of the results implies that MLi2 was used in both WT and G2019S Raw cells, but it's only shown for G2019S. Given the premise for the use of RAW cells, it's important to show that there is basal LRRK2 kinase activity in WT cells to go along with its high protein expression. This is particularly important as the G2019S blot suggests minor LRRK2-independent phosphorylation of Rab8a (and other detected pRabs). One would imagine that pRab8 levels in both WT and G2019S would reduce to the same base line or ratio of total Rab in the presence of MLi2, but WT untreated is similar to G2019S with MLi2. This suggests no basal LRRK2 activity in the Raw cells, but I don't think that is the case.
      • Also, in terms of these cells, the levels of LRRK2 are surprisingly unmatched (Fig 1A, 1D, 1H, S1D, etc.) as are total levels of Rab8 (but in opposite directions) between the WT and G2019S. This is not mentioned in the Results text and is clearly reproducible and significant. Why do the investigators think this is? If Rab8 plays a role in iron, how do these differences affect the interpretation of the G2019S cells (especially given that MLi2 does not rescue)? Are other LRRK2-related Rabs affected at the protein (not phosphorylation) level? Could reduced levels of LRRK2 or increased Rab8, alone or together, account for some of these differences? Substantial further characterization is required as this seriously affects the interpretability of the data. Since pRab8 is not normalized to total Rab8, this G2019S model may not reflect a total increase in LRRK2 kinase activity, and could in fact have both less LRRK2 protein and less cellular kinase activity than WT (in this case).
      • Presumably, the blots in 1H are whole cell lysates and account for the pooled soluble and insoluble NCOA4 (increased in G2019S), as there is no difference in soluble NCOA4 (Fig 2H). I suspect the prior difference is nicely reflected in the insoluble fraction (Fig 2H). This should be better explained in the Results text. This is a very interesting finding and I wonder what the investigators believe is driving this phenotype? Is the NCOA4 partitioning into a detergent-inaccessible compartment? Does this replicate with other detergents, those perhaps better at solubilizing lipid rafts? Is this a phenotype reversible with MLi2? Very interesting data.
      • Figure 2 describes the increased NCOA4-positive iron structures after iron load, but does not emphasize that the G2019S cells begin preloaded with more NCOA4. How do the investigators account for differential NCOA4 in this interpretation? Is this simply a reflection of more NCOA4 available in G2019S cells? This seems reasonable.
      • These are very long exposures to iron, some as high as 48 hr which will then take into account novel transcriptomic and protein changes. Did the investigators evaluate cell death? Iron uptake would be trackable much quicker.
      • The legend for 2F is awkward (BSADQRED)
      • Why are WT cells not included in Fig 2G?
      • The biochemical characterization of NCOA4 in the LRRK2-arm cells is a great experiment and strength of the paper. The field would benefit by a bit further interrogation, other detergents, etc.
      • Have the investigators looked for aberrant Rab trafficking to lysosomes in the LRRK2-arm cells? Is pRab8 mislocalized compared to WT? Other pRabs?
      • The expression levels and therefore stability of the ARM fragment are not shown. This is necessary for interpretation. While very intriguing, the data in Aim 3 rely on the assumption that the ARM fragment is expressed, and at comparable levels to G2019S to account for phenotypes. The generation of a second clone is admirable, but the expression of the protein must be characterized. This is especially true because of the different LRRK2 levels between WT and G2019S. One could easily conceive of exogenous expression of a tagged-ARM fragment into LRRK2 KO cells, for example, as another proof-of-concept experiment. If it is truly dominant, does this effect require or benefit from some FL LRRK2? It seems easy enough to express the LRRK2-ARM in at least WT and KO RAW cells.
      • Does iron overload induce Rab8a phosphorylation in a LRRK2 KO cell? This would be a solid extension on the ARM data and support the important finding that an additional kinase(s) can phosphorylate Rab8a under these conditions, and while not unexpected, this may not have been demonstrated by others as clearly. It also addresses whether the ARM domain is important to this other putative kinase(s), which may add value to the authors' model.

      Minor concern - the abstract, but not the introduction, emphasizes a hypothesis that loss of neuromelanin may promote cell loss in PD (through loss of iron chelation). While post mortem studies are by definition only correlative, early works suggested that the more highly melanized DA neurons were preferentially lost when compared to poorly melanized neurons in PD. This speculation in the abstract is not necessary to the novel findings of the paper.

      Significance

      This study could shed light on a novel and unexpected behavior of the LRRK2 protein, and open new insights into how pathogenic mutations may affect the cell. While studied in one cell line known for unusually high LRRK2 expression levels, data in this cell type have been broadly applicable elsewhere. Given the link to Parkinson's disease, Rab-dependent trafficking, and iron homeostasis, the findings could have import and relevance to a rather broad audience.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We would like to thank all the reviewers for their comments and suggestions.

      Please find below our point-by-point response to the Reviewers' comments, which details the corrections already made and outlines the planned revisions, experiments, and analyses.

      Reviewer 1

      Major comments:

      • Reviewer 1 commented that the 'manuscript would greatly benefit from having someone spend time on the figures, and associated text, to ensure they are fully comprehensible'. We agree wholeheartedly with the reviewer and apologise. We have now revisited the text, figures, and associated figure legends to ensure that they are more easily accessible and fully comprehensible to readers from across disciplines. This includes adding labels to point out specific anatomical features on images, and ensuring figures and text align. Further specific examples are included in the points below.
      • In response to concerns raised by Reviewer 1 relating to: Figure 1 and the lack of figure citations; 'the persistence of mCherry in the H2B Fucci'; 'how mCherry seems to persist longer in H1 (compare Figs 1D and 1G)':
      • We apologise for the lack of figure citations in the text. We have now reworked the figures relating to the constructs (original Figures 1 and S1) and have made these Figures 1, 2 and S1 in our updated version.
      • Figure 1 is now an introductory background figure which illustrates the differences between Fucci(SA) and Fucci(CA) reporters, with additional details provided in the associated legend, and call outs to the figure starting in the introduction.
      • Regarding 'the persistence of mCherry in the H2B Fucci', what we are trying to articulate is that the mCherry degradation that we observed in the Fucci(2A) expressing DF1 cells extended beyond the end of S phase and into G2/M, compared with what would be expected (Revised Figure 2H, arrows).
      • We have now replaced these montages with a more representative example. Additionally, the new images (Figures 2C and 2G) are synchronised (both starting at G2/M), restricted to a single cell cycle, are larger in size, and have the cell cycle stage labelled. We believe these changes will aid interpretation.
      • Specifically relating to the lack of labelling in Figure 3A, we agree that this figure was not labelled sufficiently, and neither was there enough detail included in the text or figure legend for readers to follow easily and make their own conclusions. We have now added additional labels to this figure, broken the figure down into more panels (Figures 4A-4D in revised manuscript), and included more detailed descriptions in the associated figure legend and text.
      • We thank the reviewer for making the important point that it is 'hard to know where the biosensor is reporting patterns that are already well established (eg neural tube), and where the biosensor is reporting patterns that are novel - and if so, what these patterns are' which was made more challenging by insufficient references to previous studies.
      • Firstly, as for the point above, we have now added labels to many of the panels (Figure 4 in revision), including highlighting features such as the non-proliferative dermal condensates and demarcating the proliferative retinal pigmented epithelium (Figures 4F and 4G in revision). Secondly, we have also now included additional references in the text, specifically relating to the neural tube, digits, and forming feathers, where our proliferation profiles are consistent with previous literature.
      • With regards to the Reviewer's comment regarding the difficulty in drawing conclusions 'about cell cycle in different tissue layers without sectioning' in original Figure 3B we will include more sections of FuChi embryos which include structures such as mesenchymal condensates.
      • To make our data on cell cycle stages as 'cells egress from the primitive streak, to form prechordal plate' clearer, we have added additional labels to the figures (Figures 4B and 6E in revised manuscript). We will complement this by adding sections of gastrulating FuChi embryos to further demonstrate the cell cycle status of cells that form the prechordal plate.

      Minor comments

      • We have added additional references relating to the data in original Figure 3 (now Figure 4 see above), and any new descriptions of known proliferation profiles that we include will have appropriate citations.
      • In this current revision we have addressed figure call out issues, and added labels to enhance readability, clarity and data interpretation.

      Reviewer 2

      Major comments

      • Reviewer 2 rightly pointed out that the 'description of the bicistronic tandem-Fucci(CA) system in paragraph 6 is not consistent with what is described in the original bibliographic reference indicated by the authors'. We have now added additional text to properly explain the CDT1 probe dynamics, as per the cited manuscript, and also referenced the schematics to help readers.
      • To address whether the FuChi model can be accurately 'used to study embryogenesis' and following up on the suggestion to 'indicate if the size of the embryos is comparable to the wildtype' we have now included size comparisons of FuChi and wild-type/non-transgenic embryos at mid (E9) and late (E18) gestational stages demonstrating that there is no significant difference between genotypes during embryogenesis (Figure 3D in revised manuscript). For all earlier stages, we did not see any developmental or size differences. We believe if there were any differences, these would be reflected in size at the mid and late gestational stages we analysed.
      • Reviewer 2 made very valuable observations and suggestions regarding our data and interpretation of somitogenesis, specifically in response to our sentence saying that "the mesenchyme, which is predominantly in G1 as they undergo condensation". Furthermore, they noted that Supplementary Video 4 "shows distinct green fluorescence (S) in the presomitic mesoderm for the first hour or so, only then turning to magenta (G1)". We were asked to review the sentence/video to clarify if this is a significant finding or if this is not representative of our observations.
      • We thank the reviewer for this suggestion. From looking again at our timelapse movies, and also analysing additional static images, we agree that presomitic mesoderm (PSM) does appear to be green (S phase), which then may transition to G1 as the somites form. To address this, we plan to quantify cell cycle status in the PSM on embryos to see if this is a significant finding.
      • We hope this quantification of the PSM may also enable us to include discussion on how our findings relate to the Cell Cycle model for somitogenesis proposed in the Collier et al, 2000 paper suggested by the Reviewer.
      • We agree with the Reviewer that "the fluorescence profiles in original Figure 4C do not seem similar regarding the Myc-tag epitope" and believe this difference is likely just a reflection of the part of the image we used. We will include a more representative image once we have repeated the staining.
      • Reviewer 2 has asked for quantitative support for our fluorescence-based interpretations. We thank the reviewer for this suggestion and are now planning to perform quantitative analyses of different tissues (similar to our quantification in germ cells) and in embryos to support our observations. These will include the PSM (see above), neural tube, intestine, and early embryos (also see Reviewer 3 response for blastoderm quantification).
      • Since our original submission, we have further refined our in situ hybridisation protocol on FuChi embryos (Figures 5A & B in revision), finding that strong reporter expression is maintained for all the fluorescent proteins of the H1-Fucci(CA)2 reporter. Therefore, the "notably fainter" appearance of the hGMNN-mVenus in Figure 4A from the first version of the paper was likely a result of the experimental protocol not being 100% optimal.

      Minor comments

      • We have reordered the paragraphs relating to the different Fucci versions in the introduction as per the suggestions by the reviewer for better clarity.
      • To address the issues with Fucci system nomenclatures which made reading difficult, we have now added a background figure (new Figure 1 in revised draft) which is cited in the introduction, made sure constructs are introduced appropriately, and ensured we are consistent with our nomenclature.
      • Supplementary Figure lettering corrected.
      • All figure panels are now mentioned in the main text, and the incorrect call outs noted by the Reviewer have been corrected.
      • Removed period and included clarifying statement in the figure legend relating to the comment regarding the extraembryonic region in Figure 5 (original) / Figure 6 (revised).
      • Other issues raised relating to reference duplication and missing words have been resolved.
      • We have corrected the legend of Figure 1 of the original paper, see related Reviewer 1 response provided above.

      Reviewer #3

      Minor comments

      • We have corrected all the figure call outs (see responses to similar comments by Reviewers 1 and 2) to ensure that all data presented is accurately reported.
      • We would like to thank the reviewer for suggesting modifications to the cell cycle montages (original figures 1D, 1G and 2F). We agree it would help the reader to enlarge the image, and have therefore reduced the montage to include just one cell cycle, and have also included annotations of cell cycle stages in Figures 2C and 2G of the revised manuscript. We have also added some labels to Figure 3E (original figure 2F) and enlarged this.
      • In response to Reviewer 3's comment regarding fluorescence intensity, we quantified fluorescence levels in multiple individual DF1 cells expressing either the H1.0-Fucci(CA)2 or H2B-Fucci(SA)2 reporters, and this is shown as the fluorescent index in Figures 2D, 2E, 2H and 2I of the revised manuscript, where reporter levels were measured across time. In terms of overall mean intensity levels of the reporters, we found the reporters to be comparable in brightness and to have similar mean intensity levels across the cell populations in the flow cytometry data (Figures 2F and 2J).
      • To enhance speedy interpretation, we will also process our supplementary videos to include annotations and arrows to highlight key cells and events (e.g. a cell undergoing mitosis).
      • As recommended by Reviewer 3, we have now quantified cell cycle status in blastoderm cells, confirming that a high proportion are in the G2/M phase. We will include these data in the final revision, which will complement our planned quantification of cell cycle status in other tissues (see response to Reviewer 2).
      • For our final revision, we will include higher magnification/zoomed in images of selected regions of the somites, neural tube (lumen) and retina (epithelium). Revisiting our images of the neural tube showed that dividing cells at the lumen did so in the perpendicular plane, and we will include these images in our revision to provide further evidence of the fidelity of the FuChi reporter. We thank the reviewer for this excellent idea to show the efficacy of our system.
      • To address the levels of proliferation in somites, we plan to generate a cropped video with a fixed ROI to enable proliferation in individual cells of the forming somites to be more readily visualised. This will be further complemented by the quantification of cell cycle status in forming somites (see responses to other reviewers).
      • We have added lines to the discussion regarding the use of our reporter in other conventional model systems.
    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The manuscript by Sudderick and colleagues describes the development and characterisation of a new generation of cell cycle reporter that can distinguish between cells in G1, S, G2 and M phases. Furthermore, the authors have developed a transgenic chicken line incorporating this reporter and demonstrated faithful discrimination of cell cycle stages in the in vivo context of developing transgenic embryos. Of note is the addition of epitope tags, which facilitate discrimination of cell cycle stages in tissue fixed using various techniques. This is a very important paper for the following reasons:

      • The authors have achieved faithful discrimination of all four cell cycle stages, which is a major advance in itself.
      • This generation of the FuChi transgenic chick is of enormous importance. This will facilitate accurate in vivo studies in a broad range of fixed and living tissue types and is a major milestone in the further establishment of the chick as a transgenic model system.

      The characterisation of the cell cycle reporter as presented is robust and convincing. The authors further demonstrate the potential utility of the FuChi chickens through their observation of partial cell cycle synchrony during onset of development. I therefore only have minor suggestions that may facilitate easier interpretation of their data.

      Results 2

      • I can't see any mention of Figures 1C and D. Presumably the authors have carried out fluorescence intensity measurements using the two cell cycle reporters here, but this is not mentioned in the main text.
      • Figure 1D&G: I find these difficult to follow given the small size of the cells as presented. The authors may consider enlarging these and clearly annotating for cell cycle stage. They may find it helpful to focus on a single cell cycle, although I appreciate that displaying two cell cycles strengthens the claim of efficacy of the newly developed sensor. The supplementary videos associated with these figure panels are excellent as they display several cells with faithful reporter activity, but again, the authors may wish to annotate a few of these cells to enhance speedy interpretation. I have similar comments for Figure 2F and the associated movie.

      Results 4

      • The authors state that a large proportion of blastoderm cells were in G2/M. They may wish to formally quantify this, perhaps by performing simple cell counts in designated regions of interest. A similar quantification for gastrulating embryos would also be helpful.
      • It would be helpful to see zoomed in images of selected regions of the somites, neural tube and retina displayed in Figure 3B. This would be particularly appropriate in the context of the neural tube and retina (which are not discussed in the main text) as the positioning of the nucleus is defined by the stage of the cell cycle and should therefore serve to highlight the efficacy of the reporter.
      • Video 4 beautifully demonstrates the high levels of proliferation in somites, but again, it would be useful to have a zoomed in view. I appreciate the difficulty involved in doing this, given the movement of the embryo, but perhaps the authors could focus on a fixed ROI or present a separate movie of a few cells undergoing a full cell cycle.

      Discussion

      • The authors could perhaps expand on their discussion about potential utility in other conventional model systems (e.g. mouse, fish, etc).

      Significance

      General assessment: A timely piece of work that introduces a faithful cell cycle reporter that will be of broad interest.

      Advance: The ability to discriminate between all four stages of the cell cycle is a clear advance here.

      Audience: Broad interest, including those studying cell cycle and embryonic development in several tissue contexts.

      Expertise: Chick embryology, in vivo live imaging, neurogenesis, cellular developmental biology

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      This work presents a novel transgenic chicken model with fluorescent reporters that allow in vivo monitoring of the four phases of the cell cycle. To achieve this, the authors clearly identify the limitations of previous Fucci systems and developed an optimised reporter construct that overcomes the major technical challenges identified. Addition of epitope tags to cell cycle stage-specific markers further enables antibody detection in fixed tissues. Proof of concept is provided by live imaging of chick embryos in early developmental stages, evidencing dynamic cell cycle states in tissues and migrating cells.

      Major comments:

      1. Introduction: Description of the bicistronic tandem-Fucci(CA) system in paragraph 6 is not consistent with what is described in the original bibliographic reference indicated by the authors. Namely: "...accumulation of the CTD1 probe..." should be expected in the G1-S transition (not S-G2) and the yellow reporter should be expected in G2 and M phases (not S and G2, as described). Please review this portion of the text.
      2. The authors state that "Of note, hatched FuChi chicks are initially smaller than wild type counterparts but grow at comparative rates and are fertile". If the model is to be used to study embryogenesis, it would be useful to indicate if the size of the embryos is comparable to the wildtype, at least for the major developmental stages mentioned in the manuscript.
      3. When referring to somitogenesis, the authors state "...the mesenchyme, which is predominantly in G1 as they undergo condensation". Suppl Video 4, however, shows distinct green fluorescence (S) in the presomitic mesoderm for the first hour or so, only then turning to magenta (G1). The authors should review the sentence/video to clarify if this is a significant finding or if this is not representative of their observations.
      4. (Optional) It would be interesting to describe if the authors' observations of cell cycle dynamics in the presomitic mesoderm support the proposed Cell Cycle model for somitogenesis (Collier et al., J.Theor.Biol.2000).
      5. The fluorescence profiles in Figure 4C do not seem similar regarding the Myc-tag epitope (contrarily to what is stated). The authors should rephrase or revisit this image to clarify their findings.
      6. Quantitative support is needed for several fluorescence-based interpretations made throughout the manuscript. In some instances, conclusions are drawn from qualitative differences in signal intensity. For example, the statement in Fig. 4A that hGMNN-mVenus appears "notably fainter" than the other reporters. Incorporating simple quantitative analyses would strengthen these claims and ensure that observed differences reflect biological behaviour rather than technical or optical factors.

      Minor comments:

      1. Organization of the information in the Introduction: Paragraphs 3-5 introduce sequentially improved versions of the Fucci system. Then, paragraph 6 returns to the system described in the 4th paragraph. Authors should consider including paragraph 5 (description of Fucci4 and its limitations) just prior to the description of chickens as valuable developmental models (current paragraph 8) for clarity of the text.
      2. Fucci system nomenclature. Many different Fucci systems are mentioned, but nomenclature consistency throughout the manuscript is lacking, which makes reading difficult. For example, the terms "Fucci(SA)2" and "Fucci(CA)2" should be defined in the introduction, as they are employed to describe the construction of the new biosensor in the following sections.
      3. Some figure panels are not mentioned in the main text (for ex. Figures 1B and C, Figure 2C)
      4. The legend of Figure 1 (D & G) mentions "denoted by *", but the * seems to be missing in the figure.
      5. Supplementary Figure 1 has two D panels (and is missing the E).
      6. In the main text, where it reads "...Flow cytometry analysis of three independent PGC lines... (Figures 2G & S2E)", S2E should be replaced by S1E.
      7. In the Figure 4A legend, hCDT1-mVenus should be corrected to hCDT1-mCherry. Also, it is not clear why the authors state that "hGMNN-mVenus expression is notably fainter compared with hCDT1-mVenus and H1.0-mCerulean expression".
      8. In Figure 5E, the optical sections "i" seem to pertain to the extraembryonic tissue/area opaca and not to anterior mesoderm, as stated in the figure legend. Also, there is a period between "prechordal plate" and "and" in the legend's last sentence.
      9. Discussion: The last sentence of the third paragraph lacks "to" between "used" and "interrogate".
      10. References 10 and 23 are identical.

      Referee cross-commenting

      I agree with all comments from reviewers 1 and 3

      Significance

      This is a beautiful paper, describing a long sought-after model system to study cell cycle dynamics in vivo. The methodological details are thorough, and the results obtained are clearly presented, highlighting the utility of the new model in various embryonic stages and tissues/organs.

      This work is of pivotal importance to the developmental/stem cell biology community, as well as to the wider community that employs the chicken embryo as a preclinical model to assess therapeutic or teratogenic potential of biologically- or chemically-derived products.

      My expertise is in chicken embryo development, namely gastrulation, somitogenesis and limb bud outgrowth.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The manuscript reports the development of a novel Fucci (Fluorescent Ubiquitination-based cell cycle indicator) system for cell cycle analysis, including live imaging of the cell cycle. The novel biosensor (H1.0-Fucci(CA)2) has been developed for analyses of chick cells and tissues: chick embryos are a valuable developmental model that has particularly informed (and will continue to inform) our understanding of early stages of embryogenesis, and of development of numerous tissues, including the neural tube, somites and limb bud. The authors conclude that the novel system has advantages over previous Fucci systems, including faithful labelling of all four cell cycle phases. Importantly, the authors have generated a stable germline of H1.0-Fucci(CA)2 transgenic chicks, enabling, for the first time, the discrimination and tracking of cells in all 4 phases of the cell cycle - i.e. in vivo studies of cell cycle progression in intact tissues and organs. Additional epitope tags mean that the biosensor can be detected in fixed tissues, enabling comparison of cell cycle with expression of mRNA and proteins that mediate other aspects of development/label particular cells and tissues. The authors map proliferation dynamics across numerous tissues in the developing chick, at numerous stages of development, and conclude in particular that transition from S phase may be a key morphogenetic event in gastrulation, as mesendoderm cells leave the primitive streak to form embryonic structures such as the prechordal plate.

      Major comments:

      The novel biosensor looks to be an incredibly useful tool, and the manuscript suggests patterns of cell cycle progression in different tissues, and at different points in time, that look intriguing. But it is sometimes difficult to draw the strong conclusions suggested by the authors because the text and figures are sometimes difficult to follow. The manuscript would greatly benefit from having someone spend time on the figures, and associated text, to ensure they are fully comprehensible.

      Specifically:

      Conclusion 1: That the new FUCCI biosensor is a superior cell cycle probe, better at discriminating all cell cycle phases than previous versions. I was very convinced by the videos (video 1 and 2) but had problems with Figure 1. Potentially, this is because I am not an expert in these types of analyses - but it was not helped by the fact that components of the figure were not cited in the text. I was particularly confused by the statement remarking on 'the persistence of mCherry in the H2B Fucci' as mCherry seems to persist longer in H1 (compare Figs 1D and 1G). Please explain, in the Figure legend, why this appears to be the case.

      Conclusion 2: that the FuChi chicks are the first viable stably expressing avian cell cycle biosensor model. I agree, and the authors should be congratulated on the development of this important tool.

      Conclusion 3: the authors monitor cell cycle progression in chicks, in vivo, looking at stages from blastoderm, through gastrulation, and into organogenesis, and draw various conclusions

      For example: Fig 3A and text: 'as gastrulation progresses, the primitive streak an presomitic mesoderm display...., whereas the .... And neural plate contains...'

      Figure 3A covers an enormous range of stages and tissues. The figure is barely labelled. The text and figure need to better align, and key features in each figure panel need to be labelled so that the reader can better follow, and draw conclusions.

      Fig 3B: Reports expression in numerous tissues. There are some beautiful examples of cells segregating relative to cell cycle - for instance, in the neural tube. But I found it hard to know where the biosensor is reporting patterns that are already well established (eg neural tube), and where the biosensor is reporting patterns that are novel - and if so, what these patterns are. Again, this is not described adequately in the text (for instance, there is no mention of the neural tube). And in some cases, references are provided (allowing comparison with previous studies) - but in other cases, there are no references to previous studies. The reader must be given the opportunity to compare this study with previous studies.

      Overall - I can appreciate that there are some fascinating patterns, but it is very difficult to draw the conclusions suggested by the authors. Primarily this is due to poor labelling of figures, and lack of clarity between figures and text, and poor referencing. Additionally, it is not clear that strong conclusions can be drawn about cell cycle in different tissue layers without sectioning some embryos.

      Fig 3C: The authors remark 'The results confirm that the ... FuChi embryos recapitulate known cell cycle profiles of those tissues'. See my comments in 3B.

      Conclusion 4: Robust stability of biosensor in fixed tissues. I agree, and the authors should be congratulated for having made a construct that can be paired with in situ hybridisation and immunohistochemistry - this is invaluable.

      Conclusion 5: The authors investigate the potential of the new system for live imaging, and focus on a couple of novel dynamic examples.

      The data indicating that PGCs at initial migratory stages are not undergoing frequent cell division is clear.

      However, the data indicating that cell cycle status changes as cells egress from the primitive streak, to form prechordal plate, is not clear. The figures need to be better labelled, and the text needs to be clearer (eg ' and prechordal plate. and anterior mesoderm').

      Minor comments:

      • Specific experimental issues that are easily addressable.

      I would recommend that the authors section some embryos, to better support key conclusions (eg in figure 3 and 5)

      • Are prior studies referenced appropriately?

      Not always - see comment above (Fig 3)

      • Are the text and figures clear and accurate?

      No - this needs work. Not all figures cited in text, or cited in wrong order; Figures are poorly labelled - making it hard to follow

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      Label figures more carefully and ensure figures and text align

      Referee cross-commenting

      I agree with all comments from reviewers 2 and 3

      Significance

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      Technically this is a fantastic resource. As detailed above, the novel biosensor (H1.0-Fucci(CA)2) has been developed for analyses of chick cells and tissues: chick embryos are a valuable developmental model that have (and in the future, will) particularly informed our understanding of early stages of embryogenesis, and of development of numerous tissues, including the neural tube, somites, limb bud. Increasingly, studies show the importance of cell cycle for development, differentiation and morphogenesis - it is a huge breakthrough to be able to perform in vivo studies of cell cycle progression in intact tissues and organs.

      • State what audience might be interested in and influenced by the reported findings.

      Broad basic research, including developmental biologists, stem cell biologists, modellers.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Developmental biologist, with expertise in chick

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Manuscript number: RC-2025-03280

      Corresponding author(s): Stephan Gruber

      1. General Statements [optional]

      First, we would like to thank the editor at Review Commons for the efficient handling of our manuscript. We also apologize for our delayed response.

      We are grateful to all three reviewers for their careful evaluation of our work and for their constructive feedback, which will provide a valuable basis for improving the figures and the text, as described below. We expect to be able to complete the revision quickly by following this plan.

      We note that the reviewer reports (Rev. #1 and Rev. #3) made us realize that the manuscript text was misleading on the following point. Although we used the purified ATP hydrolysis–deficient Smc protein for sybody isolation, this does not restrict the selection to a specific conformation. As described in detail in Vazquez-Nunez et al. (Figure 5), this mutant displays the ATP-engaged conformation only in a smaller fraction of complexes (~25% in the presence of ATP and DNA), consistent with prior in vivo observations reported by Diebold-Durand et al. (Figure 5). Rather than limiting the selection to a particular configuration, our aim was to reduce the prevalence of the predominant rod state in order to broaden the range of conformations represented during sybody selection. Consistent with this interpretation, only a small number of isolated sybodies show strong conformation-specific binding in the presence or absence of ATP/DNA, as observed by ELISA (now included in the manuscript). We will revise the manuscript text accordingly to clarify this point.

      2. Description of the planned revisions



      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Gosselin et al. develop a method to target protein activity using synthetic single-domain nanobodies (sybodies). They screen a library of sybodies using ribosome/phage display generated against Bacillus Smc-ScpAB complex. Specifically, they use an ATP hydrolysis deficient mutant of SMC so as to identify sybodies that will potentially disrupt Smc-ScpAB activity. They next screen their library in vivo, using growth defects in rich media as a read-out for Smc activity perturbation. They identify 14 sybodies that mirror smc deletion phenotype including defective growth in fast-growth conditions, as well as chromosome segregation defects. The authors use a clever approach by making chimeras between Bacillus and S. pneumoniae Smc to narrow down to specific regions within the Bacillus Smc coiled-coil that are likely targets of the sybodies. Using ATPase assays, they find that the sybodies either impede DNA-stimulated ATP hydrolysis or hyperactivate ATP hydrolysis (even in the absence of DNA). The authors propose that the sybodies may likely be locking Smc-ScpAB in the "closed" or "open" state via interaction with the specific coiled-coil region on Smc. I have a few comments that the authors should consider:

      Major comments:

      1. Lack of direct in vitro binding measurements: The authors do not provide measurements of sybody affinities, binding/unbinding kinetics, or stoichiometries with respect to Smc-ScpAB. Additionally, do the sybodies preferentially interact with Smc in ATP/DNA-bound state? And, do the sybodies affect the interaction of ScpAB with SMC? It is understandable that such measurements for 14 sybodies are challenging, and not essential for this study. Nonetheless, it is informative to have biochemical characterization of sybody interaction with the Smc-ScpAB complex for at least 1-2 candidate sybodies described here.

      We agree with the reviewer that adding such data would be reassuring and that obtaining solid data using purified components is not easy even for a smaller selection of sybodies. We have data that show direct binding of Smc to sybodies by various methods including ELISA, pull-downs and by biophysical methods (GCI). Initially, we omitted these data from the manuscript as we are convinced that the mapping data obtained with chimeric SMC proteins is more definitive and relevant. During the revision we will incorporate the ELISA data showing direct binding and also indicating a lack of preference for a specific state of Smc.

      2. Many modes of sybody binding to Smc are plausible: The authors provide an elaborate discussion of sybodies locking the Smc-ScpAB complex in open/closed states. However, in the absence of structural support, the mechanistic inferences may need to be tempered. For example, is it not also possible for the sybodies to bind the inner interface of the coiled-coil, resulting in steric hindrance to coiled-coil interactions? It is also possible that sybody interaction disrupts ScpAB interaction (as data ruling this possibility out has not been provided). Thus, other potential mechanisms would be worth considering/discussing. In this direction, did AlphaFold reveal any potential insights into putative binding locations?

      We have attempted to map the binding by structure prediction; however, so far, even the latest versions of AlphaFold are not able to clearly delineate the binding interface. Indeed, many ways of binding are possible, including disruption of ScpAB interaction. However, since the main binding site is located on the SMC coiled coils, the latter scenario would likely be an indirect consequence of altered coiled coil configuration, consistent with our current interpretation.

      3. Sybody expression in vivo: Have the authors estimated sybody expression in vivo? Are they all expressed to similar levels?

      We have tagged selected sybodies with GFP and performed live cell imaging. This showed that they are all roughly equally expressed and that they localize as foci in the cell, presumably by binding to Smc complexes loaded onto the chromosome at ParB/parS sites. We will include these data in the revised version of the manuscript.

      4. Sybodies should phenocopy ATP hydrolysis mutant of Smc: The sybodies were screened against an ATP hydrolysis deficient mutant of Smc, with the rationale that these sybodies would interfere with this step of the Smc duty cycle. Does the expression of the sybodies in vivo phenocopy the ATP hydrolysis deficient mutant of Smc? Could the authors consider any phenotypic read-outs that can indicate whether the sybody action results in an smc-null effect or specifically an ATP hydrolysis deficient effect?

      As alluded to above, we think that our selection gave rise to sybodies that bind various, possibly multiple, Smc conformations. Consistent with this idea, the phenotypes are similar to those of the null mutant rather than the ATP-hydrolysis defective EQ mutant, which displays even more severe growth phenotypes. We will add the following notes to the text:

      “These conditions favour ATP-engaged particles alongside the typically predominant ATP-disengaged rod-shaped state (add Vazquez Nunez et al., 2021).”

      “ELISA data confirm that nearly all clones bind Smc-ScpAB; however, their binding shows little or no dependence on the presence of ATP or DNA.”

      Minor comments:

      1. It was surprising that no sybodies were found that could target both bacillus and spneu Smc. For example, sybodies targeting the head regions of Smc that might work in a more universal manner. Could the authors comment on the coverage of the sybodies across the protein structure?

      It is rather common that sybodies (like antibodies and nanobodies) exhibit strong affinity differences between highly conserved proteins (> 90 % identity). The underlying reasons for such strong discrimination are i) location of less conserved residues primarily at the target protein surface and ii) the large interaction interface between sybody and target which offers multiple vulnerabilities for disturbance, in particular through bulky side chains resulting in steric clashes. Another frequently observed phenomenon is sybody binding to a dominant epitope, which also often applies to nanobodies and antibodies. A great example for this are the dominant epitopes on SARS-CoV-2 RBDs.

      Growth curves (Fig. S3) show a large jump in recovery in growth under sybody induction conditions. Could the authors address this observation here and in the text?

      We suppose that this recovery represents suppressor mutants and/or (more likely) improved growth in the absence of functional Smc during nutrient limitation (see Gruber et al., 2013 and Wang et al., 2013). We will add this statement to the text.

      L41- Sentence correction: Loop can be removed.

      Ah, yes, sorry for this confusing error. Thank you.

      4. L525 - bsuSmc 'E': extra E can be removed.

      To do. Thank you.

      5. References need to be properly formatted.

      To do. Thank you.

      6. The authors should add in figure legend for Fig 1i) details on representation of the purple region, and explain the grey strokes for orientation of the loop.

      To do.

      7. How many cells were analysed in the cell biological assays? Legends should include this information.

      To be included.

      Reviewer #1 (Significance (Required)):

      Overall, this is an impressive study that uses an elegant strategy to find inhibitors of protein activity in vivo. The manuscript is clearly written and the experiments are logical and well-designed. The findings from the study will be significant to the broad field of genome biology, synthetic biology and also SMC biology. Specifically, the coiled coil domain of SMC proteins has been proposed to be of high functional value. The authors have elegantly identified key coiled-coil regions that may be important for function, and in parallel demonstrated the potential of synthetic sybody/designed binders for inhibition of protein activity.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Review: "Single Domain Antibody Inhibitors Target the Coiled Coil Arms of the Bacillus subtilis SMC complex" by Ophélie Gosselin et al, Review Commons RC-2025-03280

      Structural Maintenance of Chromosome proteins (SMCs), a family of proteins found in almost all organisms, are organizers of DNA. They accomplish this by a process known as loop extrusion, wherein double-stranded DNA is actively reeled in and extruded into loops. Although SMCs are known to have several DNA binding regions, the exact mechanism by which they facilitate loop extrusion is not understood but is believed to entail large conformational changes. There are currently several models for loop extrusion, including one wherein the coiled coil (CC) arms open, but there is a lack of insightful experimentation and analysis to confirm any of these models. The work presented aims to provide much-needed new tools to investigate these questions: conformation-selective sybodies (synthetic nanobodies) that are likely to alter the CC opening and closing reactions.

      The authors produced, isolated, and expressed sybodies that specifically bound to Bacillus subtilis Smc-ScpAB. Using chimeric Smc constructs, where the coiled coils were partly replaced with the corresponding sequences from Streptococcus pneumoniae, the authors revealed that the isolated sybodies all targeted the same 4N CC element of the Smc arms. This region is likely disrupted by the sybodies either by stopping the arms from opening (correctly) or forcing them to stay open (enough). Disrupting these functional elements is suggested to cause the Smc-dependent chromosome organization lethal phenotype, implying that arm opening and closing is a key regulatory feature of bacterial Smc-ScpAB. In summary, the authors present a new method for trapping bacterial Smc's in certain conformations using synthetic antibodies. Using these antibodies, they have pinpointed the (previously suggested) 4N region of the coiled coils as an essential site for the opening and closing of the Smc coiled coil arms and that hindering these reactions blocks Smc-driven chromosomal organization. The work has important implications for how we might elucidate the mechanism of DNA loop extrusion by SMC complexes.

      Some specific comments:

      Line 75: "likely stabilizing otherwise rare intermediates of the conformational cycle." - sorry, why is that being concluded? Why not stabilizing longer-lived conformations?

      We will clarify this statement!

      Line 89: Sorry, possibly our lack of understanding: why first ribosome and then phage display?

      Ribosome display offers to screen around 10^12 sybodies per selection round (technically unrestricted library size), while for phage display, the library size is restricted to around 10^9 sybodies due to the fact that production of a phage library requires transformation of the phagemid plasmid into E. coli, thereby introducing a diversity bottleneck. This is why the sybody platform starts off with ribosome display. It switches to phage display from round 2 onwards because the output of the initial round of ribosome display is around 10^6 sybodies, which can be easily transferred into the phage display format. Phage display is used to minimize selection biases. For more information, please consult the original sybody paper (PMID: 29792401).

      Line 100: Why was only lethality selected? Less severe phenotypes not clear enough?

      Yes, colony size is more difficult to score robustly, as the sizes of individual transformant colonies can vary quite widely. The number of isolated sybodies was at the limit of further analysis.

      Line 106: Could it be tested somehow if convex and concave library sybodies fold in Bs?

      We did not focus on the non-functional sybody candidates, and only sybodies of the loop library turned out to cause functional consequences at the cellular level. Notably, we will include gfp-imaging showing that non-lethal sybodies are expressed at levels similar to those of toxic sybodies. Given the identical scaffold of concave and loop sybodies (they only differ in their CDR3 length), we expect that the concave sybodies fold in the cytoplasm of B. subtilis. For the convex sybodies, which have a different scaffold, this will be tested.

      Line 125: Could Pxyl be repressed by glucose?

      To our knowledge and experience, repression by glucose (catabolite repression) does not work well in this context in B. subtilis.

      Line 131: The SMC replacement strain is a cool experiment and removes a lot of doubts!

      Thank you! (we agree 😊)

      Line 141: The mapping is good and looks reliable, but looks and feels like a tour de force? Of course, some cryo-EM would have been lovely (lines 228-229 understood, it has been tried!).

      Yes, we have made several attempts at structural biology. Unfortunately, Smc-ScpAB is not well suited for cryo-EM in our hands and crystallography with Smc fragments and sybodies did not yield well-diffracting crystals.

      Line 179: Mmmh. Do we not assume DNA binding on top of the dimerised heads to open the CC (clamp)?

      We will clarify the text here.

      Line 187: Having sybodies that presumably keep the CC together (closing) and some that do not allow them to come together correctly (opening) is really cool and probably important going forward.

      Thank you!

      Figure 1 Ai is not very colour-blind friendly.

      We are sorry for this oversight. We will try to make the color scheme more inclusive. Thank you for the notification.

      Optional: did the authors see any spontaneous mutations emerge that bypass the lethal phenotype of sybody expression?

      No, we did not observe spontaneous mutations suppressing the phenotype, possibly due to the limited number of cell generations observed. We tried to avoid suppressors by limiting growth, but this may indeed be a good future approach to further fine-map the binding sites and to obtain insights into the mechanism of inhibition.

      Optional: we think it would be nice to try some biochemical experiment with BMOE/cysteine-crosslinked B. subtilis Smc in the mid-region (4N or next to it) of the Smc coiled coils to try to further strengthen the story. Some of the authors are experts in this technique and strains might already exist?

      We have indeed tried to study the impact of sybody binding on Smc conformation by cysteine cross-linking. However, we were not convinced by the results and thus prefer not to draw any conclusions from them. We will add a corresponding note to the text.

      Reviewer #2 (Significance (Required)):

      The authors present a new method for trapping bacterial Smc's in certain conformations using synthetic antibodies. Using these antibodies, they have pinpointed the (previously suggested) 4N region of the coiled coils as an essential site for the opening and closing of the Smc coiled coil arms and that hindering these reactions blocks Smc-driven chromosomal organization. The work has important implications for how we might elucidate the mechanism of DNA loop extrusion by SMC complexes.

      Thank you!

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Gosselin et al. use the sybody technology to study effects of in vivo inhibition of the Bacillus subtilis SMC complex. Smc proteins are central DNA binding elements of several complexes that are vital for chromosome dynamics in almost all organisms. Sybodies are selected from three different libraries of the single domain antibodies, using the "transition state" mutant Smc. They identify 14 such mutant sybodies that are lethal when expressed in vivo, because they prevent proper function of Smc. The authors present evidence suggesting that all obtained sybodies bind to a coiled-coil region close to the Smc "neck", and thereby interfere with the Smc activity cycle, as evidenced by defective ATPase activity when Smc is bound to DNA. The study is well done and presented and shows that the strategy is very potent in finding a means to quickly turn off a protein's function in vivo, much quicker than depleting the protein.

      The authors also draw conclusions on the molecular mode of action of the SMC complex. They provide a number of suggestive experiments, but in my view mostly indirect evidence for such a mechanism.

      My main criticism is that the authors have used a single - and catalytically trapped - form of SMC. They speculate why they only obtain sybodies from one library, and then only identify sybodies that bind to a rather small part of the large Smc protein. While the approach is definitely valuable, it is biased towards sybodies that bind to Smc in a quite special way, it seems. Using wild type Smc would be interesting, to make more robust statements about the action of sybodies potentially binding to different parts of Smc.

      As explained above, we are quite confident the Smc ATPase mutation did not bias the selection in an obvious way. The surprising bias towards coiled coil binding sites likely has other explanations; the coiled coils probably form a preferred epitope recognized by sybodies.

      Line 105: Alternatively, the other libraries did not produce good binders or these sybodies were not stably expressed in B. subtilis. This could be tested using Western blotting - I am assuming sybody antibodies are commercially available. However, this test is not important for the overall study, it would just clarify a minor point.

      While there are antibody fragments available to augment the size of sybodies (PMID: 40108246), these recognize 3D-epitopes and are thus not suited for Western blotting. We did not follow up on the negative results much, but would like to point out again that there are several biases that likely emerge for the same reason (bias to library, bias to coiled coil binding site). If correct, then likely few other sybodies are effectively lethal in B. subtilis, with the exception of the ones isolated and characterized. We have added this notion to the manuscript. We have also tested the expression of non-lethal sybodies by gfp-tagging and imaging. These results will be included in the revision.

      Fig. 2B: it is odd to count Spo0J foci per cell, as it is clear from the images that several origins must be present within the fluorescent foci. I am fine with the "counting" method, as the images show there is a clear segregation defect when sybodies are expressed. I believe the authors should state, though, that this is not a replication block, but a failure to segregate origins.

      We agree that this is an important point and will add a corresponding comment to the text.

      Testing binding sites of sybodies to the SMC complex is done in an indirect manner, by using chimeric Smc constructs. I am surprised that the authors have not used in vitro crosslinking: the authors can purify Smc, and mass spectrometry analyses would identify sites where sybodies are crosslinked to Smc. Again, I am fine with the indirect method, but the authors make quite concrete statements on binding based on non-inhibition of chimeric Smc; I can see alternative explanations why a chimera may not be targeted.

      We have made several attempts of testing direct binding with mixed outcomes and decided to not include those results in the light of the stronger and more relevant in vivo mapping. However, we will add ELISA results and briefly discuss grating coupled interferometry (GCI) data and pull-downs.

      Smc-disrupting sybodies affect the ATPase activity in one of two ways. Again, rather indirect experiments. This leads to the point Revealing Smc arm dynamics through synthetic binders in the discussion. The authors are quite careful in stating that their experiments are suggestive for a certain mode of action of Smc, which is warranted.

      In line 245, they state "More broadly, the study demonstrates how synthetic binders can trap, stabilize, or block transient conformations of active chromatin-associated machines, providing a powerful means to probe their mechanisms in living cells." This is of course a possible scenario for the use of sybodies, but the study does not really trap Smc in a transient conformation, at least this is not clearly shown.

      We agree and will carefully rephrase this statement. Thank you.

      Overall, it is an interesting study, with a well-presented novel technology, and a limited gain of knowledge on SMC proteins.

      We respectfully disagree with the last point, since our unique results highlight the importance of the Smc coiled coils, which are otherwise largely neglected in the SMC literature, likely (at least in part) due to the mild effect of single point mutations on coiled coil dynamics.

      Reviewer #3 (Significance (Required)):

      The work describes the generation and use of single-binder antibodies (sybodies) to interfere with the function of proteins in bacteria. Using this technology for the SMC complex, the authors demonstrate that they can obtain a significant number of binders that target a defined region in SMC and thereby interfere with the ATPase cycle.

      The study does not present a strong gain of knowledge of the mode of action of the SMC complex.

      As pointed out above, we respectfully disagree with this assertion.


      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Please insert a point-by-point reply describing the revisions that were already carried out and included in the transferred manuscript. If no revisions have been carried out yet, please leave this section empty.


      4. Description of analyses that authors prefer not to carry out

      Please include a point-by-point response explaining why some of the requested data or additional analyses might not be necessary or cannot be provided within the scope of a revision. This can be due to time or resource limitations or in case of disagreement about the necessity of such additional data given the scope of the study. Please leave empty if not applicable.

      As pointed out above, there are a few minor points that we prefer not to address experimentally. In particular, we do not consider it necessary to determine the expression levels of sybodies which were non-inhibitory. We also wish to note that we attempted to obtain additional structural and biochemical data and to that end performed cryo-EM, crystallography and cysteine cross-linking experiments. Unfortunately, we did not obtain sybody complex structures, and the cross-linking data were not conclusive. We also wish to note that the first author has finished her PhD and left the lab, which limits our capacity to add additional experiments. However, as the reviewers also pointed out, the main conclusions are well supported by the data already.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Gosselin et al. use the sybody technology to study effects of in vivo inhibition of the Bacillus subtilis SMC complex. Smc proteins are central DNA binding elements of several complexes that are vital for chromosome dynamics in almost all organisms. Sybodies are selected from three different libraries of the single domain antibodies, using the "transition state" mutant Smc. They identify 14 such mutant sybodies that are lethal when expressed in vivo, because they prevent proper function of Smc. The authors present evidence suggesting that all obtained sybodies bind to a coiled-coil region close to the Smc "neck", and thereby interfere with the Smc activity cycle, as evidenced by defective ATPase activity when Smc is bound to DNA. The study is well done and presented and shows that the strategy is very potent in finding a means to quickly turn off a protein's function in vivo, much quicker than depleting the protein.

      The authors also draw conclusions on the molecular mode of action of the SMC complex. They provide a number of suggestive experiments, but in my view mostly indirect evidence for such a mechanism.

      My main criticism is that the authors have used a single - and catalytically trapped - form of SMC. They speculate why they only obtain sybodies from one library, and then only identify sybodies that bind to a rather small part of the large Smc protein. While the approach is definitely valuable, it is biased towards sybodies that bind to Smc in a quite special way, it seems. Using wild type Smc would be interesting, to make more robust statements about the action of sybodies potentially binding to different parts of Smc.

      Line 105: Alternatively, the other libraries did not produce good binders or these sybodies were not stably expressed in B. subtilis. This could be tested using Western blotting - I am assuming sybody antibodies are commercially available. However, this test is not important for the overall study, it would just clarify a minor point.

      Fig. 2B: it is odd to count Spo0J foci per cell, as it is clear from the images that several origins must be present within the fluorescent foci. I am fine with the "counting" method, as the images show there is a clear segregation defect when sybodies are expressed. I believe the authors should state, though, that this is not a replication block, but a failure to segregate origins.

      Testing binding sites of sybodies to the SMC complex is done in an indirect manner, by using chimeric Smc constructs. I am surprised that the authors have not used in vitro crosslinking: the authors can purify Smc, and mass spectrometry analyses would identify sites where sybodies are crosslinked to Smc. Again, I am fine with the indirect method, but the authors make quite concrete statements on binding based on non-inhibition of chimeric Smc; I can see alternative explanations why a chimera may not be targeted.

      Smc-disrupting sybodies affect the ATPase activity in one of two ways. Again, rather indirect experiments. This leads to the point Revealing Smc arm dynamics through synthetic binders in the discussion. The authors are quite careful in stating that their experiments are suggestive for a certain mode of action of Smc, which is warranted.

      In line 245, they state "More broadly, the study demonstrates how synthetic binders can trap, stabilize, or block transient conformations of active chromatin-associated machines, providing a powerful means to probe their mechanisms in living cells." This is of course a possible scenario for the use of sybodies, but the study does not really trap Smc in a transient conformation, at least this is not clearly shown.

      Overall, it is an interesting study, with a well-presented novel technology, and a limited gain of knowledge on SMC proteins.

      Significance

      The work describes the generation and use of single-binder antibodies (sybodies) to interfere with the function of proteins in bacteria. Using this technology for the SMC complex, the authors demonstrate that they can obtain a significant number of binders that target a defined region in SMC and thereby interfere with the ATPase cycle.

      The study does not present a strong gain of knowledge of the mode of action of the SMC complex.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Review: "Single Domain Antibody Inhibitors Target the Coiled Coil Arms of the Bacillus subtilis SMC complex" by Ophélie Gosselin et al, Review Commons RC-2025-03280

      Structural Maintenance of Chromosome proteins (SMCs), a family of proteins found in almost all organisms, are organizers of DNA. They accomplish this by a process known as loop extrusion, wherein double-stranded DNA is actively reeled in and extruded into loops. Although SMCs are known to have several DNA binding regions, the exact mechanism by which they facilitate loop extrusion is not understood but is believed to entail large conformational changes. There are currently several models for loop extrusion, including one wherein the coiled coil (CC) arms open, but there is a lack of insightful experimentation and analysis to confirm any of these models. The work presented aims to provide much-needed new tools to investigate these questions: conformation-selective sybodies (synthetic nanobodies) that are likely to alter the CC opening and closing reactions.

      The authors produced, isolated, and expressed sybodies that specifically bound to Bacillus subtilis Smc-ScpAB. Using chimeric Smc constructs, where the coiled coils were partly replaced with the corresponding sequences from Streptococcus pneumoniae, the authors revealed that the isolated sybodies all targeted the same 4N CC element of the Smc arms. This region is likely disrupted by the sybodies either by stopping the arms from opening (correctly) or forcing them to stay open (enough). Disrupting these functional elements is suggested to cause the Smc-dependent chromosome organization lethal phenotype, implying that arm opening and closing is a key regulatory feature of bacterial Smc-ScpAB. In summary, the authors present a new method for trapping bacterial Smc's in certain conformations using synthetic antibodies. Using these antibodies, they have pinpointed the (previously suggested) 4N region of the coiled coils as an essential site for the opening and closing of the Smc coiled coil arms and that hindering these reactions blocks Smc-driven chromosomal organization. The work has important implications for how we might elucidate the mechanism of DNA loop extrusion by SMC complexes.

      Some specific comments:

      Line 75: "likely stabilizing otherwise rare intermediates of the conformational cycle." - sorry, why is that being concluded? Why not stabilizing longer-lived conformations?

      Line 89: Sorry, possibly our lack of understanding: why first ribosome and then phage display?

      Line 100: Why was only lethality selected? Less severe phenotypes not clear enough?

      Line 106: Could it be tested somehow if convex and concave library sybodies fold in Bs?

      Line 125: Could Pxyl be repressed by glucose?

      Line 131: The SMC replacement strain is a cool experiment and removes a lot of doubts!

      Line 141: The mapping is good and looks reliable, but looks and feels like a tour de force? Of course, some cryo-EM would have been lovely (lines 228-229 understood, it has been tried!).

      Line 179: Mmmh. Do we not assume DNA binding on top of the dimerised heads to open the CC (clamp)?

      Line 187: Having sybodies that presumably keep the CC together (closing) and some that do not allow them to come together correctly (opening) is really cool and probably important going forward.

      Figure 1 Ai is not very colour-blind friendly.

      Optional: did the authors see any spontaneous mutations emerge that bypass the lethal phenotype of sybody expression?

      Optional: we think it would be nice to try some biochemical experiment with BMOE/cysteine-crosslinked B. subtilis Smc in the mid-region (4N or next to it) of the Smc coiled coils to try to further strengthen the story. Some of the authors are experts in this technique and strains might already exist?

      Significance

      The authors present a new method for trapping bacterial Smc's in certain conformations using synthetic antibodies. Using these antibodies, they have pinpointed the (previously suggested) 4N region of the coiled coils as an essential site for the opening and closing of the Smc coiled coil arms and that hindering these reactions blocks Smc-driven chromosomal organization. The work has important implications for how we might elucidate the mechanism of DNA loop extrusion by SMC complexes.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Gosselin et al. develop a method to target protein activity using synthetic single-domain nanobodies (sybodies). They screen a library of sybodies using ribosome/phage display generated against the Bacillus Smc-ScpAB complex. Specifically, they use an ATP hydrolysis deficient mutant of SMC so as to identify sybodies that will potentially disrupt Smc-ScpAB activity. They next screen their library in vivo, using growth defects in rich media as a read-out for Smc activity perturbation. They identify 14 sybodies that mirror the smc deletion phenotype, including defective growth in fast-growth conditions, as well as chromosome segregation defects. The authors use a clever approach by making chimeras between Bacillus and S. pneumoniae Smc to narrow down to specific regions within the Bacillus Smc coiled-coil that are likely targets of the sybodies. Using ATPase assays, they find that the sybodies either impede DNA-stimulated ATP hydrolysis or hyperactivate ATP hydrolysis (even in the absence of DNA). The authors propose that the sybodies may be locking Smc-ScpAB in the "closed" or "open" state via interaction with the specific coiled-coil region on Smc. I have a few comments that the authors should consider:

      Major comments:

      1. Lack of direct in vitro binding measurements: The authors do not provide measurements of sybody affinities, binding/unbinding kinetics, or stoichiometries with respect to Smc-ScpAB. Additionally, do the sybodies preferentially interact with Smc in the ATP/DNA-bound state? And do the sybodies affect the interaction of ScpAB with SMC? It is understandable that such measurements for 14 sybodies are challenging, and not essential for this study. Nonetheless, it would be informative to have biochemical characterization of sybody interaction with the Smc-ScpAB complex for at least 1-2 candidate sybodies described here.
      2. Many modes of sybody binding to Smc are plausible: The authors provide an elaborate discussion of sybodies locking the Smc-ScpAB complex in open/closed states. However, in the absence of structural support, the mechanistic inferences may need to be tempered. For example, is it not also possible for the sybodies to bind the inner interface of the coiled-coil, resulting in steric hindrance to coiled-coil interactions? It is also possible that sybody interaction disrupts ScpAB interaction (as data ruling this possibility out have not been provided). Thus, other potential mechanisms would be worth considering/discussing. In this direction, did AlphaFold reveal any potential insights into putative binding locations?
      3. Sybody expression in vivo: Have the authors estimated sybody expression in vivo? Are they all expressed to similar levels?
      4. Sybodies should phenocopy ATP hydrolysis mutant of Smc: The sybodies were screened against an ATP hydrolysis deficient mutant of Smc, with the rationale that these sybodies would interfere with this step of the Smc duty cycle. Does the expression of the sybodies in vivo phenocopy the ATP hydrolysis deficient mutant of Smc? Could the authors consider any phenotypic read-outs that can indicate whether the sybody action results in an smc-null effect or specifically an ATP hydrolysis deficient effect?

      Minor comments:

      1. It was surprising that no sybodies were found that could target both bacillus and spneu Smc. For example, sybodies targeting the head regions of Smc that might work in a more universal manner. Could the authors comment on the coverage of the sybodies across the protein structure?
      2. Growth curves (Fig. S3) show a large jump in recovery in growth under sybody induction conditions. Could the authors address this observation here and in the text?
      3. L41- Sentence correction: Loop can be removed.
      4. L525 - bsuSmc 'E' :extra E can be removed.
      5. References need to be properly formatted.
      6. The authors should add in figure legend for Fig 1i) details on representation of the purple region, and explain the grey strokes for orientation of the loop.
      7. How many cells were analysed in the cell biological assays? Legends should include this information.

      Significance

      Overall, this is an impressive study that uses an elegant strategy to find inhibitors of protein activity in vivo. The manuscript is clearly written and the experiments are logical and well-designed. The findings from the study will be significant to the broad field of genome biology, synthetic biology and also SMC biology. Specifically, the coiled coil domain of SMC proteins has been proposed to be of high functional value. The authors have elegantly identified key coiled-coil regions that may be important for function, and in parallel demonstrated the potential of synthetic sybody/designed binders for inhibition of protein activity.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Response to Reviewers

      We thank the Reviewers for their appreciative comments (Reviewer 1: “first time that a well-established existing mathematical model of signaling response extended and applied to heterogeneous ligand mixtures”) and constructive suggestions for improvement. In this extensive revision, we have not only addressed the suggestions comprehensively but also extended our analysis of signaling antagonism to all doses and to the single-cell level using novel computational workflows. This resulted in the discovery of several mechanisms of antagonism and synergy that are dose-dependent and dependent on the cell-specific state of the signaling network, thereby manifesting in only a subset of cells.

      We have addressed Reviewer comments: we have made substantial revisions to improve clarity, rigor, and biological interpretation. Below we briefly summarize the main concerns raised by Reviewers 1-3 and how we have addressed them.

      • We have rewritten the Methods section to clarify our approaches. We have also added the explanation of methodology and the rationale in the main text to improve readability and comprehensiveness (Addressing Reviewer #1 comments). This includes explaining and justifying the signaling codon approaches (Reviewer 1), our core-module parameter matching methodology and discussion (Reviewer #1, point 11, Reviewer #2, point 1), and the model schematic (Reviewer #1, point 5).
      • For one of our major conclusions – that macrophages may distinguish stimuli in the context of ligand mixtures – we have validated these results with experiments, which increases confidence in this conclusion (Reviewer #2, point 3, Reviewer #3, point 2).
      • We have updated the model for CpG-pIC competition using Michaelis–Menten kinetics without any additional parameters, rather than introducing new free parameters. This change removes parameter freedom for fitting combinatorial conditions, leading to a more constrained and mechanistically grounded model whose predictions align better with experimental data (Updated Figures 2 and S2; Reviewer #2, point 2).
      • We have addressed all other editorial and clarification-related concerns as well, as detailed in our point-by-point response below. In addition, we have extended the scope of the manuscript. We have extended our analysis of ligand combinations across a broad dose range, from non-responsive to saturated conditions. This led to several additional discoveries. For example, we show that ultrasensitive IKK activation can underlie synergistic combinations of ligands at low doses. In contrast, beyond the CpG-poly(I:C) antagonism, we identify that competition for CD14 uptake by LPS and Pam can generate antagonism between these ligands within specific dose ranges.
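      As an illustration of the CpG-pIC competition mentioned in the list above, the following is a minimal sketch of textbook competitive Michaelis–Menten kinetics for two ligands sharing one saturable uptake capacity. It is not the authors' model code: the function name, `vmax`, `km_cpg`, and `km_pic` are hypothetical placeholders, and the exact rate law used in the revised model may differ.

```python
def competitive_uptake_rates(cpg, pic, vmax=1.0, km_cpg=1.0, km_pic=1.0):
    """Return (rate_cpg, rate_pic) for two ligands sharing one saturable carrier.

    Textbook competitive Michaelis-Menten form (an assumption, not the
    published model): each ligand's uptake rate is reduced by the other
    ligand occupying part of the shared transport capacity.
    All parameter values are illustrative placeholders.
    """
    # Shared denominator: free capacity plus capacity occupied by each ligand.
    denom = 1.0 + cpg / km_cpg + pic / km_pic
    rate_cpg = vmax * (cpg / km_cpg) / denom
    rate_pic = vmax * (pic / km_pic) / denom
    return rate_cpg, rate_pic


# Uptake of CpG alone versus CpG in the presence of an equal dose of pIC.
alone_cpg, _ = competitive_uptake_rates(1.0, 0.0)
mixed_cpg, mixed_pic = competitive_uptake_rates(1.0, 1.0)
```

      In this form the CpG rate in the mixture is lower than the rate for CpG alone, and the summed rate can never exceed `vmax`, which is the qualitative signature of antagonism by competition.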

      Importantly, such antagonism or synergy is not evident in all cells in the population. It may also not be picked up by studies of the mean behavior. With our new computational workflow that allows for single-cell resolution we identify the conditions that must be met by the signaling network state, for antagonism or synergy to take place.

      Further, we examine the hypothesis that such signaling pathway interactions affect stimulus-response specificity in combinatorial stimulus conditions. By comparing models with and without this antagonism, we demonstrate that antagonistic interactions can improve stimulus-response specificity in complex ligand mixtures.

      These additional analyses provide a new mechanistic understanding of cellular information processing and elucidate how synergy and antagonism can mechanistically shape signaling fidelity in response to complex ligand mixtures.

      Point-by-Point Response

      Reviewer #1

      Evidence, reproducibility and clarity

      The authors extend an existing mathematical model of NFkB signalling under stimulation of various single receptors, to a model that describes responses to stimulation of multiple receptors simultaneously. They compare this model to experimental data derived from live-cell imaging of mouse macrophages, and modify the model to account for potential antagonism between TLR3 and TLR9 response due to competition for endosomal transport. Using this framework they show that, despite distinguishability decreasing with increasing numbers of heterogenous stimuli, macrophages are still able in principle to distinguish these to a statistically significant degree. I congratulate the authors on an interesting approach that extends and validates an existing mathematical model, and also provides valuable information regarding macrophage response.

      Response: We thank the reviewer for this appreciative assessment and for the careful reading of our work. The constructive comments helped us substantially improve the rigor and clarity of the manuscript.

      In addition to revising the text for clarity, we have extended our analysis to systematically investigate dose-response behavior for each ligand pair. Using the experimentally validated model, we explored 10 ligand pairs across a range of doses from non-responsive to saturating. This allowed us to identify mechanistic regimes in which synergy and antagonism arise at the single-cell level. In particular, we found that low-dose synergy can be explained by ultrasensitive IKK activation (Figure 4 and corresponding supplementary figures), while antagonism can emerge from competition for shared components such as CD14 (Figure 5 and corresponding supplementary figures). We further show that antagonism can enhance condition distinguishability in ligand mixtures, thereby contributing to stimulus-response specificity (Figure 5 and corresponding supplementary figures).

      There are no major issues affecting the scientific conclusions of the paper, however the lack of detail surrounding the mathematical model and the 'signaling codons' that are used throughout the paper make it difficult to read. This is exacerbated by the fact that I was unable to find Ref 25 which apparently describes the model, however I was able to piece together the essential components from the description in Ref 8 and the supplementary material.

      Response: This comment helped us to improve the writing. We apologize that the key reference 25 was still not publicly available. It is now published in Nature Communications. In addition, we have added more details to clarify the mathematical model as well as the signaling codons, in results and in methods. Please see below for details.

      Lots of the minor comments below stem from this, however there are also a few other places that could benefit from some additional clarification and explanation.

      Significance: 1. '...it remains unclear complex...' -> '...it remains unclear whether complex...' Response: We have rewritten the Significance (now it is Synopsis).

      Introduction: 2. 'temporal dynamics of NFkB' - it would be good to be more concrete regarding the temporal dynamics of what aspect of this (expression, binding, conformation, etc), if possible. Response: It refers to the presence of NFκB in the nucleus, which represents active NFκB capable of activating gene expression. We have clarified this (Lines 59-61 in introduction paragraph 2): “Upon stimulation, NFκB translocates into the nucleus, … activating immune gene expression (10, 15–19).”

      'signaling codons' - the behaviour of these is key to the entire paper, so even if they are well described in the reference, it would be good to have a short description as early as possible so that the reader can get an idea in their mind what exactly is being discussed here. Later, it would be good to have concrete description of exactly what these capture.

      Response: We thank the reviewer for this comment. We have added a full paragraph early in the Introduction to describe the concept of Signaling Codons, which allow quantitative characterization of NFκB stimulus-response-specific dynamics (Lines 60-67). We have also added a more concrete description of Signaling Codons in the Results, together with an illustration of the signaling codons (Lines 169-175, Figure S2B).

      'This challenge...population of macrophages' - this seems a bit out of place, and is a bit of a run on sentence, so I suggest moving this to the next paragraph and working it into the first sentence there '...regulatory mechanisms, and this challenge could be addressed with a model parameterised to account for heterogeneous...Early models ...', or something similar.

      Response: We thank the reviewer for this suggestion and have revised the text accordingly. This improves the logical flow (Lines 87-88).

      Ref 25: I can't find a paper with this title anywhere, so if it's an accepted preprint then it would be good to have this available as well. That said, I still think it would be difficult to grasp the work done in this paper without some description of the mathematical model here, at least schematically, if not the full set of ODEs. For example, there are numerous references to how this incorporates heterogeneous responses, the 'core module', etc, and the reader has no context of these if they aren't familiar with the structure of the model. Response: We apologize that Ref 25 was not on PubMed. Now it’s published, and we have updated the corresponding information. This comment also helped us to improve the writing by adding a description of the mathematical model in the Introduction (Lines 95-105), the results (Lines 129-141), and a detailed description of the model in the Methods (Simulation of heterogenous NFκB dynamical responses.)

      We have also added the schematic of the model topology in Figure S1 (adapted from previous publications Guo et al 2025, Adelaja et al 2021) to make sure the paper is self-contained.

      'A key challenge which is...' -> 'A key challenge is...' Response: We have revised the Introduction and removed this sentence.

      'With model simulation ...' -> a bit of a run on sentence, I suggest breaking after 'conditions'. Response: We have revised the introduction and removed this sentence.

      Results:

      1. This section would benefit from a more in-depth description of the model and experimental setup. In particular for the experiment, the reader never really knows what this workflow for this is, nor what the model ingests as input, and what the predictions are of. Response: This comment helped us to improve clarity by adding an in-depth description of the model and experimental setup. We have revised the Results as suggested (Lines 129-141). We also appended the corresponding revision here for reviewer reference.

      “This mechanistic model was trained on single-ligand response experimental datasets, capturing the single-ligand stimulus-response specificity of the population of macrophages while accounting for cellular heterogeneity. Specifically, quantitative NFκB dynamic trajectory data from hundreds of single macrophages responding to five single ligands (TNF, pIC, Pam, CpG, LPS) at 3-5 doses were obtained from live cell imaging experiments. The mathematical model (Figure S1) consists of a 52-dimensional system of ordinary differential equations, including 52 intracellular species, 101 reactions and 133 parameters, and is divided into five receptor modules, which respond to the corresponding ligands respectively, and the IKK-NFκB core module that contains the prominent IκBα negative feedback loop. By fitting the single-cell experimental data set with a non-linear mixed effect statistical model (coupled with the 52-dimensional NFκB ODE model), the parameter distributions for the single-cell population were inferred. Analyzing the resulting simulated NFκB trajectories with information-theoretic and machine learning classification analyses confirmed that the virtual cell model simulations reproduced key SRS performance characteristics of live macrophages.”

      '..mechanistic model was trained...' - trained in this study, or in the previous referenced study? Response: The mechanistic model was trained in a previous study (Guo et al 2025 Nature Comm), and we have clarified this in the revision (Lines 127 - 129).

      1. 'determined parameter distributions' - this is where it would be good to have more background on the model. What parameters are these, and what do they correspond to biologically? It would also be nice to see in the methods or supplementary material how this is done (maximum likelihood, etc). Response: This comment helps us to clarify the predetermined parameter distributions. We have revised the methods to include this information (Simulation of heterogenous NFκB dynamical responses, paragraph 3). We have appended the corresponding text here for reviewer’s convenience.

      “The ODE model was then fitted to the population of single-cell trajectories to recapitulate the cell-to-cell heterogeneity in the experimental data (2). This is achieved by solving the non-linear mixed effects model (NLME) with the stochastic approximation expectation-maximization (SAEM) algorithm (3–6). Seventeen parameters were estimated. Within the core module, the estimated parameters included the rates governing TAK1 activation (k52, k65), the time delays of IκBα transcription regulated by NFκB (k99, k101), and the total cellular NFκB abundance (tot NFκB). Within the receptor module, receptor synthesis rates (k54 for TNF, k68 for Pam, k85 for CpG, k35 for LPS, k77 for pIC), degradation rates of the receptor–ligand complexes (k56, k61, k64 for TNF; k75 for Pam; k93 for CpG; k44 for LPS; k83 for pIC), and endosomal uptake rates (k87 for CpG; k36 and k40 for LPS; k79 for pIC) were fitted. All remaining parameters were fixed at literature-suggested values (1). The single-cell parameters inferred from experimental individual-cell trajectories then served as empirical distributions for generating the new dataset (see Supplementary Dataset 2).”

      'matching cells with similar core model...' - it's difficult to follow the logic as to why this is done, so I think this needs to be a little clearer. My guess would be that the assumption is that simulated cells with similar 'core' parameters have a similar downstream signalling response, and therefore the receptors can be 'transplanted'. So it would be nice to see exactly what these distributions are and what the effect of a bad match would be. Response: We thank the reviewer for this comment. In the revision, we have explained the rationale for matching cells with similar core module (Lines 145-152).

      Previous work determined parameter distributions for only the cognate receptor module (and the core module) that provided the best fit for the relevant single ligand experimental data (Figure 1A, Step 1), but other receptor modules’ parameter values were not determined. To simulate stimulus responses to more than two ligands, we imputed the other ligand-receptor module parameters using shared core-module parameters as common variables and employing nearest-neighbor hot-deck imputation (35). In this setup, the core module functions as an “anchor” to harmonize two or more receptor-specific parameter distributions.

      This nearest-neighbor hot-deck imputation approach (the core module matching method) was shown to outperform other approaches, including random matching and rescaled-similarity matching (Guo et al. 2025, Supplementary Figure S11). For the reviewer’s convenience, we have also appended the corresponding figure below.

      Figure S11 from (Guo et al., 2025). Assessment of matching techniques for predicting single-cell responses to various ligand stimuli (a-d). Heatmaps illustrating the Wasserstein distance between the signaling codon distributions predicted by the model and those observed in experiments. The analysis employs four distinct matching methods to align the five ligand-receptor module parameters: (a) “Random Matching”, (b) “Similarity Matching” (the method used in our study), (c) “Rescaled-Similarity Matching”, and (d) “Sampling Approximated Distribution”. In the heatmaps, rows represent signaling codons, columns denote ligands, and the color intensity indicates the Wasserstein distance, providing a visual metric of similarity between model predictions and experimental data. e-f. Histogram of the average Wasserstein distance between the model-predicted and experimentally observed signaling codon distributions, summarized across signaling codons (e) and ligands (f).

      Some explanation of how this relates to the experimental data the parameters are fit on would also be useful. (a) Is there a correspondence between individual simulated cells and the experimental data for the single ligand stimulation, and then the smallest set of these is taken? Is there also a matching from the simulated multi-receptor modules and the multi-receptor data, and if so, is this done in the same way? Response: This comment helped us clarify the correspondence between model simulations and experimental data.

      Yes, there is a correspondence between individual simulated cells and the previously published experimental data (Guo et al., 2025b) for single-ligand stimulation. We have revised the first paragraph of the Results (Lines 136–148) and the Methods (Lines 544-557) to clarify how the model simulations were fit to the previous experimental dataset. See Reviewer 1, Comment 10 for the updates in Methods. We have pasted in the revised Results section below for the reviewer’s reference.

      By fitting the single-cell experimental data set with a non-linear mixed effect statistical model (coupled with the 52-dimensional NFκB ODE model), the parameter distributions for the single-cell population were inferred.

      'six signaling codons' - here it would be good to recapitulate what these represent, but also what the 'strength' and 'activity' correspond to (total integrated value, maximum value, etc) Response: We thank the reviewer for the suggestion and have clarified this point (Lines 169-175, Figure S2B).

      'pre-defined thresholds' - no need to state these numerically in the text (although giving some sense of how/why these were chosen would give some context), but I couldn't find the values of these, nor values corresponding to the signaling codons. Response: We appreciate the reviewer’s comment. We have added this information in the figure legend (Figure 1B-C) and Method -- “Responder fraction” (Lines 666-672). Specifically, for the model simulation data, the integral thresholds are 0.4 (µM·h), 0.5 (µM·h), and 0.6 (µM·h). The peak thresholds are 0.12 (µM), 0.14 (µM), and 0.16 (µM). For the experimental data, the integral thresholds are 0.2 (A.U.·h), 0.3 (A.U.·h), and 0.4 (A.U.·h). The peak thresholds are 0.14 (A.U.), 0.18 (A.U.), and 0.22 (A.U.). Thresholds were selected so that the medium threshold yields 50% responder cells under single-ligand conditions, while the responder ratio remains unsaturated under three-ligand stimulation.
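      To make the responder-fraction calculation concrete, here is a minimal sketch, not taken from the authors' code. It assumes each trajectory is a list of values sampled at a uniform time step, that the "integral" feature is a trapezoidal time integral, and that a cell counts as a responder when its feature strictly exceeds the threshold. The function names and the toy trajectories are hypothetical.

```python
def trapezoid_integral(values, dt):
    """Trapezoidal time integral of a uniformly sampled trajectory."""
    return sum((a + b) * dt / 2.0 for a, b in zip(values, values[1:]))


def responder_fraction(trajectories, dt, threshold, feature="integral"):
    """Fraction of cells whose chosen feature strictly exceeds the threshold.

    feature: "integral" (time-integrated activity) or "peak" (maximum value).
    The strict ">" comparison is an assumption made for this sketch.
    """
    if feature == "integral":
        scores = [trapezoid_integral(t, dt) for t in trajectories]
    elif feature == "peak":
        scores = [max(t) for t in trajectories]
    else:
        raise ValueError("feature must be 'integral' or 'peak'")
    return sum(score > threshold for score in scores) / len(scores)


# Four toy single-cell trajectories sampled every hour (illustrative values).
toy_cells = [
    [0.0, 0.2, 0.2, 0.0],
    [0.0, 0.1, 0.1, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
frac_integral = responder_fraction(toy_cells, dt=1.0, threshold=0.5)
frac_peak = responder_fraction(toy_cells, dt=1.0, threshold=0.14, feature="peak")
```

      Sweeping the threshold over the low, medium, and high values quoted above would give one responder fraction per threshold and condition.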

      'non-responder cells are likely a result of cellular heterogeneity in receptor modules rather than the core module' - is this the 'ill health' referenced earlier? If so make this clear. Response: Yes, this is the ‘ill health’ referenced earlier, and we have clarified this (Lines 198-199).

      It's also very difficult to follow this chain of logic, given that the reader at this point doesn't have any knowledge of what the 'core' module is, nor the significance of the thresholds on the signaling codons. I would suggest making this much clearer, with reference to each of these. Response: We apologize for the poor explanation. We have now explained in the Introduction (Lines 95-106) and the results (Lines 129-141) how the model is structured into receptor-proximal modules that converge on the common core module. We have also added a schematic for clarity (Figure S1). For further clarification of the math models, we have significantly revised the Methods (Simulation of heterogenous NFκB dynamical responses). The defined thresholds are clarified in the Methods -- “Responder fraction”.

      '...but the model represented these as independent mass action reactions' - the significance of this may not be clear to someone not familiar with biophysical models, so probably better to make it explicit. Response: We thank the reviewer for this reminder, and we have added a description of the significance of this point (Lines 225-227).

      '...we trained a random forest classifier...' - is this trained on the 'raw' experimental time series data, or on the signaling codons? Response: It is trained on the signaling codons calculated from model simulations of NFκB trajectories. We have clarified this (Lines 260-261).

      'We also applied a Long Short-Term Memory (LSTM) machine learning model...' - it might be good to reference these three approaches at the beginning of this section, otherwise they seem to come out of the blue a little. Response: We have added the references of these three approaches in the beginning of this section (Lines 242-246).

      'We then used machine learning classifiers...' - random forests, LSTMs, or a different model? Response: We have clarified that this is a random forest classifier (Line 276).

      Discussion:

      1. '...over statistical models...' - suggest maybe 'purely statistical models' Response: We thank the reviewer for this suggestion. We have rewritten the whole Discussion to include the new insights of antagonism and synergy and their roles in maintaining unexpectedly high SRS performance. Thus, this sentence was removed.

      'We found that endosomal transport...' - A paper by Huang, et. al. (https://www.jneurosci.org/content/40/33/6428) observed a synergistic phagocytic response between CpG and pIC stimulation in microglia. This is still consistent with a saturation effect dependent on dose, but may be worth a mention. Response: We thank the reviewer for referring this interesting paper to us, and this comment helps us to improve the Discussion of inflammatory signaling pathways besides NFκB. This paper demonstrates synergistic effects between CpG and pIC in inhibiting tumor growth and promoting cytokine production (Huang et al., 2020), such as IFN-β and TNF-α, whose expression is also regulated by the IRF and MAPK signaling pathways (Luecke et al., 2021; Sheu et al., 2023). This does not contradict our finding that CpG and pIC act antagonistically in the NFκB signaling pathway, because multiple pathways act combinatorially on gene expression: CpG can activate the MAPK signaling pathway (Luecke et al., 2024) but not the IRF signaling pathway, whereas pIC activates the IRF signaling pathway (Akira and Takeda, 2004) but only weakly the MAPK pathway. Therefore, their combination can synergistically regulate inflammatory responses. We have added this to the discussion (Lines 515-522).

      '...features termed...' -> 'features, termed' Response: We thank the reviewer for their careful reading, and we have rewritten the Discussion.

      '...we applied a Long Short-Term Memory (LSTM) machine learning model..' - maybe make clear that this is on the time-series data (also LSTM has already been defined). Response: We thank the reviewer for their careful reading, and we have rewritten the Discussion.

      Materials and methods:

      1. The descriptions in this section are quite vague, so I would suggest expanding this with more detail from the supplementary material, where things are quite well explained. Response: We thank the reviewer for this suggestion, and we have rewritten the whole Methods as suggested.

      'sampling distribution' - not clear what this refers to in this context Response: We have clarified this in the revision (Methods -- Simulation of heterogenous NFκB dynamical responses, paragraph 3). The single-cell signaling-pathway parameter values used for bootstrap sampling to generate model simulations are given in Supplementary Dataset 2.

      'RelA-mVenus mouse strain' - it would be good to mention the relevance of the reporter for NFkB signaling Response: We have added the relevance of the reporter for NFkB signaling (Methods, Lines 624-626).

      '...A random forest classifier...' -> a random forest classifier

      Response: We have rewritten the methods.

      Significance

      This study provides mechanistically interpretable insight on the important question of how immune cells perform target recognition in realistic scenarios, and also provides validation of existing mathematical models by extending these beyond their original domain. The paper uses 'signaling codons' as a proxy for information processing, however in this instance it is cross-validated with an LSTM model that is applied directly to the time series data. Nevertheless, the scope of the paper is such that it does not deal with the question of how these signals are transmitted or used in a downstream immune response. To my knowledge, this is the first time that a well established existing mathematical model of signalling response has been extended and applied to heterogeneous ligand mixtures. These results will be of interest to those studying immune cell responses, and to those interested in basic research on mathematical models of signaling and cellular information processing more generally.

      My background is in biophysical models, machine learning, and signaling in cancer. I have a basic understanding of immunology, but no experience in experimental cell biology.

      Response: We thank the reviewer for highlighting the novelty of our study. We appreciate the reviewer’s recognition that our work advances the understanding of cellular information processing in the context of ligand mixtures, particularly as the first to extend computational models to investigate signaling fidelity under mixed-ligand conditions.

      We agree that this work will interest computational biologists focused on signaling network modeling and information processing. In addition, we believe it will also be valuable for all signaling biologists, as we provide fundamental insights. For experimental biologists in particular, our model provides an efficient, quantitative framework for exploring and generating testable hypotheses.

      We would also like to gently emphasize that evaluating specificity within signaling pathways is as essential as studying downstream functional responses. While immune function outcomes are certainly important, they rely on the upstream signaling pathways that first respond to environmental cues. Understanding how these signaling pathways achieve specificity and discriminability is therefore crucial. For example, this is particularly relevant for drug development targeting pathways such as NFκB, where assessing the direct signaling output—NFκB activation dynamics—can provide valuable insight into the effects of pharmacological interventions.

      Reviewer #2

      Evidence, reproducibility and clarity

      Guo et al. developed a heterogeneous, single-cell ODE model of NFκB signaling parameterized on five individual ligands (TNF, Pam, LPS, CpG, pIC) and extended it, via core-module parameter matching, to predict responses to all 31 combinations of up to five ligands. They found that simulated responder fractions and signaling codon features generally agreed with live-cell imaging data. A notable discrepancy emerged for the CpG (TLR9) + pIC (TLR3) pair: experiments exhibited non-integrative antagonism unpredicted by the original model. This issue was resolved by incorporating a Hill-type term for competitive, limited endosomal trafficking of these ligands. Finally, by decomposing NFκB trajectories into six "signaling codons" and applying Wasserstein distances plus random-forest and LSTM classifiers, the authors showed that stimulus-response specificity (SRS) declines with ligand complexity but remains statistically significant even for quintuple mixtures. This is a well written and scientifically sound manuscript about complexities of cellular signaling, especially considering the limitations of in vitro experiments in recapitulating in vivo dynamics.

      Response: We thank the reviewer for carefully reading the manuscript and for this endorsement. We have significantly improved the manuscript thanks to the reviewer’s insightful comments (see below for point-by-point responses).

      Besides addressing the reviewer’s questions, we have further extended our work to investigate how ligand pairs interact across all doses and how those interactions affect stimulus-response specificity. As the reviewer pointed out, experimental studies are limited in recapitulating the multitude of complex physiological contexts. The model helps to explore more complex scenarios beyond the feasibility of in vitro experimental setups. Using computational simulations, we have further explored 360 conditions generated from 10 ligand pairs, each evaluated at 6 doses spanning non-responsive to saturating levels, with 1000 cells simulated per condition to capture the heterogeneity of the population.
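      The count of 360 conditions is consistent with every pair drawn from five ligands (10 pairs) crossed with a full 6 x 6 dose grid. The sketch below enumerates such a design. The full-grid layout is an assumption inferred from the arithmetic, the dose levels are represented only by their index because the actual concentrations are not given here, and the function name is hypothetical.

```python
from itertools import combinations, product

LIGANDS = ["TNF", "Pam", "LPS", "CpG", "pIC"]
N_DOSES = 6            # dose levels per ligand (index only; concentrations not given)
CELLS_PER_CONDITION = 1000


def enumerate_pair_conditions(ligands=LIGANDS, n_doses=N_DOSES):
    """List every (ligand_a, dose_index_a, ligand_b, dose_index_b) condition.

    Assumes a full dose grid for each unordered ligand pair.
    """
    conditions = []
    for lig_a, lig_b in combinations(ligands, 2):
        for dose_a, dose_b in product(range(n_doses), repeat=2):
            conditions.append((lig_a, dose_a, lig_b, dose_b))
    return conditions


conditions = enumerate_pair_conditions()
n_pairs = len(list(combinations(LIGANDS, 2)))                # 10 ligand pairs
total_simulated_cells = len(conditions) * CELLS_PER_CONDITION
```

      Under these assumptions the design contains 10 x 36 = 360 conditions and 360,000 simulated single-cell trajectories.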

      From this extended analysis, we identified the mechanistic bases for observations of both synergy and antagonism. Synergy for certain low-dose ligand combinations can be explained by ultrasensitive IKK activation (Figure 4), while antagonism between LPS and Pam arises from competition for the cofactor CD14 (Figure 5). We show that these phenomena are dependent on the signaling network state and therefore are not observed in all cells of the population. We define the network conditions that must be met for antagonism and synergy to occur. Importantly, we then show that antagonism can contribute to stimulus-response specificity in ligand mixtures (Figure 5).
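      The link between ultrasensitivity and low-dose synergy can be illustrated with a toy calculation. This is a sketch under strong assumptions, not the authors' model: the ultrasensitive step is represented by a Hill function, the upstream signals from two ligands are assumed to add before that step, and `hill`, `k`, and `n` are hypothetical names and values.

```python
def hill(signal, k=1.0, n=3.0):
    """Hill-type activation; n > 1 gives an ultrasensitive (sigmoidal) response."""
    return signal ** n / (k ** n + signal ** n)


# Two ligands, each contributing a sub-threshold upstream signal of 0.5.
single = hill(0.5)                  # response to one ligand alone
additive_expectation = 2 * single   # what simple additivity would predict
combined = hill(0.5 + 0.5)          # response when upstream signals sum first

# With n = 3 the combined response exceeds the additive expectation (synergy).
synergy_ultrasensitive = combined > additive_expectation

# With n = 1 (no ultrasensitivity) the same inputs combine sub-additively.
synergy_hyperbolic = hill(1.0, n=1.0) > 2 * hill(0.5, n=1.0)
```

      In this toy setting the supra-additive response appears only at low combined inputs and only when the activation curve is steep, which mirrors the dose dependence described above.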

      Here are a few comments and recommendations:

      1. The modeling approach used in this manuscript, while interesting, might need further validation. Inferring multi-ligand receptor parameters by matching single-ligand cells on core-module similarity may not capture true co-variation in receptor expression or adaptor availability. Single cell measurements of receptor expressions could be done (e.g. via flow cytometry) to ground this assumption in real data. If the authors think this is out of scope for this manuscript, they could fit core-matched single cell models with two receptor modules from scratch to the two-ligand experimental data. Would this fitted model produce similar receptor parameters compared to the presented approach? At least the authors should add a bit more explanation for why their modeling approach is better (or valid) than fitting the models with 2/3/4/5 receptor modules from scratch to the experimental data.

      Response: We thank the reviewer for this comment, which helped us improve the explanation of the methodology, the rationale, and the validation. The methodology is based on the well-established statistical method of nearest-neighbor hot-deck imputation (Andridge and Little, 2010). In this implementation, the core module functions as a stabilizing “anchor” (common variables) to harmonize various receptor-specific parameter distributions. Similar methodologies have been successfully applied to correct batch effects or integrate single-cell RNAseq datasets using anchor cell types (Stuart et al., 2019). Our workflow has been validated on single-ligand stimuli conditions in a previous study (Guo et al., 2025) (see the 3rd paragraph below). Here, we used this method to generate predictions for ligand mixtures and validated them with experimental studies of the dual-ligand stimuli, finding that our predictions align well with the experimental data. As the reviewer suggested in point 3, in the revision we also added experimental validation of the binary classifiers that test whether macrophages can determine if a specific stimulus is present in the ligand mixture. The question we are interested in in this work is how macrophages process ligand-specific information in the context of ligand mixtures. For this question, the experimental results align with the model predictions, reaching consistent conclusions.

      In the revision, we have explained the rationale for using the nearest-neighbor hot-deck imputation by matching cells with similar core module (Lines 143-150).

      Previous work determined parameter distributions for only the cognate receptor module (and the core module) that provided the best fit for the single ligand experimental data (Figure 1A, Step 1), and parameter information for the other receptor modules is missing. To simulate stimulus responses to more than two ligands, we imputed the other ligand–receptor module parameters using shared core-module parameters as common variables and employing nearest-neighbor hot-deck imputation (35). In this setup, the core module functions as an “anchor” to harmonize two or more receptor-specific parameter distributions. This was achieved by minimizing the Euclidean distance between the core module parameters associated with the independently parameterized single-ligand models (Figure 1A, Step 2).
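      The matching step can be sketched as follows. This is an illustrative toy version of nearest-neighbor hot-deck imputation, not the authors' implementation: the data layout (dictionaries with `core` and `receptor` entries), the function name, and the use of unscaled parameter values in the distance are all assumptions.

```python
import math


def impute_receptor_params(recipients, donors):
    """Nearest-neighbor hot-deck imputation on core-module parameters.

    For each recipient cell, find the donor cell with the smallest Euclidean
    distance between core-module parameter vectors and copy that donor's
    receptor-module parameters into the recipient. Donors may be reused.

    recipients, donors: lists of dicts with keys
        "core"     -> list of floats (shared core-module parameters)
        "receptor" -> dict mapping a receptor-module name to its parameter(s)
    """
    matched = []
    for cell in recipients:
        nearest = min(donors, key=lambda d: math.dist(cell["core"], d["core"]))
        matched.append({
            "core": list(cell["core"]),
            # Keep the recipient's own module and add the donor's module.
            "receptor": {**cell["receptor"], **nearest["receptor"]},
        })
    return matched


# Toy example: cells fitted on TNF data borrow LPS-module values from
# cells fitted on LPS data (all numbers are made up for illustration).
tnf_cells = [
    {"core": [1.0, 2.0], "receptor": {"TNF": 0.3}},
    {"core": [5.0, 5.0], "receptor": {"TNF": 0.9}},
]
lps_cells = [
    {"core": [5.1, 4.9], "receptor": {"LPS": 0.7}},
    {"core": [0.9, 2.2], "receptor": {"LPS": 0.1}},
]
virtual_cells = impute_receptor_params(tnf_cells, lps_cells)
```

      Each resulting virtual cell then carries one core-module parameter set plus receptor parameters for both ligands, so it can be simulated under a two-ligand stimulus.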

      In Guo et al. (2025) (see Supplementary Figure S11), the nearest-neighbor hot-deck imputation approach (core module similarity matching method) was compared with other approaches, including random matching and rescaled-similarity matching. The results show that, after matching, the core module method best preserves the single-ligand stimulus signaling codon distributions. For the reviewer’s convenience, we have also appended the figure in the response to Reviewer 1, Comment 11.

      The advantage of our workflow is that it does not need to be fit to new experimental data and still gives reliable predictions of signaling dynamics. For the reviewer’s interest, we also tried to fit core-matched single-cell models with two receptor modules. Parameter fitting requires sufficiently large, high-quality datasets: single-ligand stimulation data with more than 1,000 cells can be adequate to estimate 6~7 parameters (Guo et al., 2025) (approx. 1,400 to 2,000 cells per ligand). However, our current experimental dataset for combinatorial-ligand conditions contains only 500~1,000 cells, and fits to these datasets capture the heterogeneous signaling dynamics poorly. This is due to an insufficient number of cells for estimating 8~10 parameters. We estimate that at least ~1,500 cells would be needed for reliable parameter estimation under dual-ligand stimulation (and more cells may be needed for combinatorial stimuli involving more ligands). This is currently not feasible to obtain for mixed ligands given the large number of combinatorial conditions.

      Overall, in this paper, the nearest-neighbor hot-deck imputation approach is presented as a feasible and acceptable approach that best reflects our current understanding of the signaling network. Importantly, it helps identify potential gaps by highlighting discrepancies between model predictions and experimental observations.

      (a) The refined model posits competitive, saturable endosomal transport for CpG and pIC, but no direct measurements of endosomal uptake rates or compartmental saturation thresholds are provided, leaving the Hill parameters under-constrained. The authors could produce dose-response curves for CpG and pIC individually and in combination across a range of concentrations to fit the Hill parameters for competitive uptake. (b) If this is out of scope for this paper, the authors should at least comment on why the endosome hypothesis is better than others e.g. crosstalks and other parallel pathway activations. Especially given that even the refined model simulations with Hill equations for CpG and pIC do not quite match with the experimental data (Fig 2 B,E).

      Response: (a) The reviewer’s comments helped us improve our work by employing Michaelis-Menten kinetics for substrate-competition reactions, which increases the mathematical rigor of the CpG-pIC competition model. In the updated model there are no free parameters to tune, because all Vmax and Kd values must be consistent with the single-ligand scenario. The Hill coefficient is also the same as in the single-ligand case, i.e., equal to 1.
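
      For readers unfamiliar with substrate competition, the textbook Michaelis-Menten form for two substrates sharing one saturable carrier can be sketched as follows. The manuscript's exact equations are not reproduced in this response, so the function name `competitive_uptake`, its arguments, and the numerical values are assumptions used purely for illustration.

```python
def competitive_uptake(s_a, s_b, vmax_a, kd_a, kd_b):
    """Uptake rate of substrate A when substrate B competes for the
    same carrier (textbook competitive form, Hill coefficient of 1).

    With s_b = 0 this reduces to the single-substrate expression
    vmax_a * s_a / (kd_a + s_a), so no additional parameter is needed.
    """
    return vmax_a * s_a / (kd_a * (1.0 + s_b / kd_b) + s_a)

# Hypothetical values: equal affinities, equal doses.
alone = competitive_uptake(10.0, 0.0, vmax_a=1.0, kd_a=5.0, kd_b=5.0)
mixed = competitive_uptake(10.0, 10.0, vmax_a=1.0, kd_a=5.0, kd_b=5.0)
print(alone, mixed)  # uptake of A is lower when B is present
```

      Because the competitor only rescales the apparent Kd, the mixed-ligand rate is fully determined by the single-ligand Vmax and Kd values, which is the sense in which such a model has no free parameters to tune.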

      The comment on examining dose-response curves for CpG and pIC inspired us to extend the dose-response analysis to all ligand-pair combinations, allowing us to identify synergy in low-dose ligand pairs and antagonism for high-dose LPS-Pam, in addition to CpG-pIC (new Figures 4 and 5).

      (b) Regarding alternative hypotheses for antagonism, such as crosstalk or parallel-pathway activation: any antagonistic effect would have to arise from negative regulation acting within the first 30 min. However, IκBα-mediated feedback only becomes appreciable after ~30 min (Hoffmann et al., 2002), and A20-dependent attenuation requires ≥2 h (Werner et al., 2005). Beyond these delayed feedback mechanisms, NFκB activation depends primarily on phosphorylation and K63-linked ubiquitination, for which no mechanism produces true antagonism; at most, combinatorial inputs saturate the response to the level of the strongest single ligand. We have added this rationale to the Discussion to explain why we favor the endosome saturation hypothesis over other mechanisms (Lines 459-465). While this may not capture every nuance, it represents the simplest model extension capable of reproducing the observed antagonism.

      The authors assess the distinguishability of single-ligand stimuli and combinatorial ligands stimuli using the simulations from the refined model. While this is informative, the simulated data could propagate deviations from the experimental data to the classifiers. How would the classifiers fare when the experimental data is used to assess the single-stimulus distinguishability? The authors could use the experimental data they already have and confirm their main claim of the paper, that cells retain stimulus-response specificity even with multiple ligand exposure. In short, how would Fig 3E look when trained/validated on available experimental data?

      Response: We thank the reviewer for these valuable comments, which helped us strengthen the rigor of our analysis by incorporating cross-model testing. Specifically, we refined our analysis of ligand presence/absence classification by including ROC AUC and balanced accuracy metrics. This adjustment accounts for the fact that the experimental data did not cover all combinatorial conditions, thereby mitigating potential biases from data imbalance and threshold choice. The experimental results are qualitatively consistent with the simulations, though, as expected, they show somewhat lower ligand distinguishability than the noise-free simulated dataset. We have updated Figures 3E–F (previously Figure 3E), added Figure S8, and revised the manuscript accordingly (Lines 292–301). For the reviewer’s convenience, we have also pasted the revised manuscript text below.

      “Classifiers trained to distinguish TNF-present from TNF-absent conditions achieved a Receiver Operating Characteristic-Area Under the Curve (ROC AUC) of 0.96, significantly above the 0.5 baseline (Figure 3D, Figure S8A). Extending this analysis to other ligands, cells detected LPS (0.85), Pam (0.84), pIC (0.73), and CpG (0.63) in mixtures (Figure 3D, S8A). Using experimental data from double- and triple-ligand stimuli (Figure 1D), ROC AUC values were TNF 0.74, LPS 0.74, Pam 0.66, pIC 0.75, and CpG 0.66 (Figure 3E, S8B). Classifier accuracies yielded consistent results (Figure S8C-D). These results indicated a remarkable capability of preserving ligand-specific dynamic features within complex NFκB signal trajectories that enable nuclear detection of extracellular ligands even in complex stimulus mixtures.”
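
      As a reminder of what the two metrics measure and why they are robust to class imbalance, here is a minimal, dependency-free sketch. It is not the analysis code used for the manuscript: the function names, labels, and scores below are hypothetical, and the 0.5 decision threshold is an assumption chosen only for the example.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a randomly chosen positive
    (ligand present) scores higher than a randomly chosen negative
    (ligand absent); ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def balanced_accuracy(labels, predictions):
    """Mean of sensitivity and specificity, so the majority class
    cannot dominate the score."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    return 0.5 * (tp / n_pos + tn / n_neg)

# Hypothetical, imbalanced toy data: 3 ligand-present, 5 ligand-absent cells.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.2, 0.1]

auc = roc_auc(labels, scores)
bal_acc = balanced_accuracy(labels, [int(s >= 0.5) for s in scores])
print(auc, bal_acc)
```

      ROC AUC is independent of any threshold, while balanced accuracy weights the ligand-present and ligand-absent classes equally; a value of 0.5 corresponds to chance for both.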

      While the approach presented here with multiple simultaneous ligand exposures is a major step towards the in vivo-like conditions, the temporal aspect is still missing. That is, temporal phasing i.e. sequential exposure to multiple ligands as one would expect in vivo rather than all at once. This is probably out of scope for this paper but the authors could comment on how their work could be taken forward in such direction and would the SRS be better or worse in such conditions.

      Response: We thank the reviewer for this insightful comment. We have added “the temporal aspect of multiple ligand exposures” to the Discussion (Lines 503-510), and we have pasted the corresponding paragraph here for the reviewer’s reference (black font indicates the previous version and blue font the newly revised text):

      Cells may be expected to interpret not only the combination of signals but also their timing and duration to mount appropriate transcriptional responses (58, 59). For example, acute inflammation integrates pathogen-derived cues with pro- and anti-inflammatory signals over a timeframe of hours to days (58), to coordinate the pathogen removal and tissue repairing process. Investigating sequential stimulus combinations in our model is therefore crucial for understanding how cells process complex physiological inputs. Simulations that account for longer timescales may require additional feedback mechanisms, as described in some of our previous studies for NFκB (15, 60).

      There is no caption for Figure 3F in the figure legend nor a reference in the main text.

      Response: In the revised manuscript we actually removed Figure 3F.

      Significance

      General assessment: This is a good manuscript in it's present form which could get better with revision. There needs more supporting data and validation to back the main claim presented in the manuscript.

      Significance/impact/readership: When revised this manuscript could be of interest to a broad community involving single cells biology, cell and immune signaling, and mathematical modeling. Especially the models presented here could be used a starting point to more complex and detailed modeling approaches.

      Response: We thank the reviewer for this endorsement. The reviewer’s constructive suggestions helped us significantly improve the clarity and rigor of our main conclusion.

      In summary, we have strengthened the computational framework in several ways. We improved the model’s fit to experimental single-ligand training data and reformulated the antagonistic CpG-pIC model using Michaelis–Menten kinetics, thereby reducing parameter arbitrariness and increasing mechanistic interpretability. These changes led to better agreement between model predictions and experimental observations for combinatorial ligand responses (Updated Figure 2 and Figure S2), which we hope will further increase experimentalists’ confidence in the modeling results. We have also validated one key conclusion (“cells retain stimulus-response specificity even with multiple ligand exposure”) using the experimental dataset, and it aligns with the model predictions.

      In addition, we have further extended our analysis and its scope. Inspired by the reviewer’s advice (and Reviewer 3’s comment 1b) on a dose-combination study for the CpG-pIC pair, we expanded our research to dose-response relationships for all dual-ligand combinations (Lines 302-406, Figures 4-5). This additional comprehensive analysis allowed us to identify the mechanism of synergistic and antagonistic effects in single-cell responses and to pinpoint the corresponding dose ranges among different ligand pairs.

      Interestingly, we found that ultrasensitive IKK activation may lead to synergistic responses to low-dose ligand combinations in single cells. We also found that CD14 uptake competition between LPS and Pam may lead to antagonistic/non-integrative combination. Our simulation-based finding of non-integrative combination of LPS-Pam stimuli aligns with a previous independent experimental finding of a non-integrative response to the LPS and Pam combination (Kellogg et al., 2017), and this independent experimental study validated our model prediction.

      We further analyzed stimulus-response specificity under conditions predicted to exhibit synergy or antagonism. Our results indicate that antagonistic combinations of ligands can increase stimulus-response specificity in the context of ligand mixtures.

      Reviewer #3

      Evidence, reproducibility and clarity

      The authors investigate experimentally single macrophages' NF-kB responses to five ligands, separately and to 3 pairs of ligands. Using the single ligand stimulations, they train an existing mathematical model to replicate single-cell NF-kB nuclear trajectories. From what I understand, for each single cell trajectory in response to a given ligand, the best fit parameters of the core module and the receptor module (specific for the given ligand) are found.

      Then (again, from what I understand), single ligand models are used to generate responses to combinations of ligands. The parametrizations of single ligand models (to be combined) are chosen to have the most similar core modules. It is not described how the responses to more than one ligand are calculated - I expect that respective receptor modules work in parallel, providing signals to the core module. After observing that the response to CpG+pIC is lower (in terms of duration and total) than for CpG alone, the model is modified to account for competition for endosomal transport required by both ligands.

      Having the trained model, simulations of responses to all 31 combinations of ligands are performed, and each NF-κB trajectory is described by six signaling codons-Speed, Peak, Duration, Total, Early vs. Late, and Oscillations. Next, these codons are used to reconstruct (using a random forest model) the stimuli (which may be the combination of ligands). The single and even the two ligand stimuli are relatively well recognized, which is interpreted as the ability of macrophages to distinguish ligands even if present in combination.

      Response: We thank the reviewer for the careful reading of the manuscript.

      Major comments

      1) The demonstrated ability to recognize stimuli is based on several key assumptions that can hardly be met in reality.

      Response: We thank the reviewer for this comment, which prompted us to carefully reflect on the rigor of our work, inspired us to extend our analysis to a broad range of ligand-dose combinations, and helped us clarify the limitations of our approach. Please see our detailed responses below.

      a) The cell knows the stimulation time, and then it can use speed as a codon. Look on fig. S4A: The trajectories in response to plC are similar to those in response to TNF, but just delayed.

      Response: We thank the reviewer for this comment. We updated the model parameterization to better fit the single-ligand pIC condition (Lines 557-559). In the updated model, the simulated responses to TNF and pIC are quite different (Fig. S2A-B, Fig. S5A-B). Specifically, the Peak, Duration, EarlyVsLate, and Total signaling codons have different values. In addition, the literature suggests that timing differences in NFκB activation are sufficient to elicit differences in downstream gene expression responses, especially for the early response genes (ERG) and intermediate response genes (ING) (Figure 1 in Ando et al., 2021). For the reviewer’s convenience, we have also appended the figures. Specifically, ctrl exhibits a higher Speed of NFκB activation, and the NFκB-regulated ERG and ING show differences within the first 60 minutes (Fig. 1a,b below). Ando et al. then identified the gene regulatory mechanism that is able to distinguish between differences in the Speed codon. Importantly, this mechanism does not require knowledge of t=0, i.e. when the timer was started.

      The signaling codon Speed, which is based on derivatives, is one way to quantify such timing differences in activation. It was selected from a library of more than 900 different dynamic features using an information maximizing algorithm (Adelaja et al., 2021). It is possible that other ways of measuring time, e.g. time to half-max, might not be distinguished that well by these regulatory mechanisms.

      b) The increase of stimulus concentration typically increases Peak, Duration, and Total, so a similar effect can be achieved by changing the ligand or concentration.

      Response: The statement that “the increase of stimulus concentration typically increases Peak, Duration, and Total” is not an assumption of our analysis. What the reviewer describes (“a similar effect can be achieved by changing the ligand or concentration”) may or may not occur. The six informative signaling codons can vary with ligand or dose. For example, with increasing doses of Pam, the NFκB response shows a higher peak, potentially making it appear more like LPS stimulation. However, as the Pam dose increases, the response duration decreases, which distinguishes it from LPS stimulation (see the experimental data in Figure 4A, second row, and Figure 3A, second row, of Luecke et al. (2024); we have also pasted the corresponding figures below for the reviewer’s convenience).

      Figure 4A and Figure 3A from Luecke et al., (2024). Figure 4A: NFκB activity dynamics in the single cells in response to 0, 0.01, 0.1, 1, 10, and 100 ng/ml P3C4 stimulation. Eight hours were measured by fluorescence microscopy of reporter hMPDMs. Each row of the heatmap represents the p38 or NFκB signaling trajectory of one cell. Trajectories are sorted by the maximum amplitude of p38 activity. Data from two pooled biological replicates are depicted. Total # of cells: 898, 834, 827, 787, 778, and 923. Figure 3A: NFκB activity dynamics in the single cells in response to 100 ng/ml LPS stimulation. Eight hours were measured by fluorescence microscopy of reporter hMPDMs. Each row of the heatmap represents the NFκB signaling trajectory of one cell (with p38 measured shown in the original paper). Trajectories are sorted by the maximum amplitude of p38 activity. Data from two pooled biological replicates are depicted.

      Inspired by the reviewer’s comment (and also Reviewer 2’s comments), in the revision, we expanded our research to dose-response relationships for all dual-ligand combinations (Lines 302-406, Figure 4-5). This additional comprehensive analysis allowed us to identify the mechanism of synergistic and antagonistic effects in single-cell responses and to pinpoint the corresponding dose ranges among different ligand pairs.

      Interestingly, we found that IKK ultrasensitive activation may lead to synergistic responses to low-dose ligand combinations but only in a subset of single cells. We also found that CD14 uptake competition between LPS and Pam may lead to antagonistic/non-integrative combination. Our simulation-based finding of non-integrative combination of LPS-Pam stimuli aligns with previous independent experimental findings of non-integrative response for LPS and Pam combination (Kellogg et al., 2017).

      c) Distinguishing a given ligand in the presence of some others, even stronger bases, on the assumption that these ligands were given at the same time, which is hardly justified.

      Response: We agree with the reviewer that ligands could be given at different times. Considering time delays between ligands (their onset and also their removal) dramatically adds to the combinatorial complexity. Some initial studies by the Tay lab are beginning to explore scenarios of time-shifted ligand pairs (Wang et al., 2025). Here we focus on a systematic exploration of all ligand combinations at 6 different doses. The fact that we do not consider time delays is not an assumption but, admittedly, a limitation that may well be addressed in future studies. We have included a brief discussion of this issue in the Discussion (Lines 503-514), which we have appended here for the reviewer’s convenience.

      Cells may be expected to interpret not only the combination of signals but also their timing and duration to mount appropriate transcriptional responses (Kumar et al., 2004; Son et al., 2023). For example, acute inflammation integrates pathogen-derived cues with pro- and anti-inflammatory signals over a timeframe of hours to days (Kumar et al., 2004), to coordinate the pathogen removal and tissue repairing process. Investigating sequential stimulus combinations in our model is therefore crucial for understanding how cells process complex physiological inputs. Simulations that account for longer timescales may require additional feedback mechanisms, as described in some of our previous studies for NFκB (Werner et al., 2008, 2005).

      We would like to suggest that despite (or maybe because) limiting our study to coincident stimuli, we made some noteworthy discoveries.

      2) For single ligands, it would be nice to see how the random forest classifier works on experimental data, not only on in silico data (even if generated by a fitted model).

      Response: This comment and Reviewer 2’s comment 3 helped us strengthen the rigor of our analysis by incorporating cross-model testing. We have pasted the response below.

      Specifically, we refined our analysis of ligand presence/absence classification by including ROC AUC and balanced accuracy metrics. This adjustment accounts for the fact that the experimental data did not cover all combinatorial conditions, thereby mitigating potential biases from data imbalance and threshold choice. The experimental results are qualitatively consistent with the simulations, though—as expected—they show somewhat lower ligand distinguishability compared to the noise-free simulated dataset. We have updated Figures 3E–F (previously Figure 3E), added Figure S8, and revised the manuscript accordingly (Lines 292–301). For the reviewer’s convenience, we have also included the revised manuscript text below.

      “Classifiers trained to distinguish TNF-present from TNF-absent conditions achieved a Receiver Operating Characteristic-Area Under the Curve (ROC AUC) of 0.96, significantly above the 0.5 baseline (Figure 3D, Figure S8A). Extending this analysis to other ligands, cells detected LPS (0.85), Pam (0.84), pIC (0.73), and CpG (0.63) in mixtures (Figure 3D, S8A). Using experimental data from double- and triple-ligand stimuli (Figure 1D), ROC AUC values were TNF 0.74, LPS 0.74, Pam 0.66, pIC 0.75, and CpG 0.66 (Figure 3E, S8B). Classifier accuracies yielded consistent results (Figure S8C-D). These results indicated a remarkable capability of preserving ligand-specific dynamic features within complex NFκB signal trajectories that enable nuclear detection of extracellular ligands even in complex stimulus mixtures.”

      3) My understanding of ligand discrimination is such that it is rather based on a combination of pathways triggered than solely on a single transcription factor response trajectory, which varies with ligand concentration and ligand concentration time profile (no reason to assume it is OFF-ON-OFF). For example, some of the considered ligands (plC and CpG) activate IRF3/IRF7 in addition to NF-kB, which leads to IFN production and activation of STATs. This should at least be discussed.

      Response: We thank the reviewer for this comment and fully agree. In the previous version, we discussed how different signaling pathways combinatorially distinguish stimuli. In the revision, we have extended this discussion to include the example of pIC and CpG activation, as suggested (Lines 515-522). We have pasted the corresponding text below.

      “Furthermore, innate immune responses do not solely rely on NFκB but also involve the critical functions of AP1, p38, and the IRF3-ISGF3 axis. The additional pathways are likely activated in a coordinated manner and provide additional information (Luecke et al., 2021). This is exemplified by the studies demonstrating synergistic effects between CpG and pIC in inhibiting tumor growth and promoting cytokine production (Huang et al., 2020), such as IFNβ and TNFα, whose expression is also regulated by the IRF and MAPK signaling pathways (Luecke et al., 2021; Sheu et al., 2023). Therefore the inclusion of parallel pathways of AP1 and MAPK, as well as the type I interferon network (Cheng et al., 2015; Davies et al., 2020; Hanson and Batchelor, 2022; Luecke et al., 2024; Paek et al., 2016; Peterson et al., 2022), is a next step for expanding the mathematical models presented here.”

      Technical comments

      1) Reference 25: X. Guo, A. Adelaja, A. Singh, W. Roy, A. Hoffmann, Modeling single-cell heterogeneity in signaling dynamics of macrophages reveals principles of information transmission. Nature Communications (2025) does not lead to any paper with the same or a similar title and author list. This Ref is given as a reference to the model. Fortunately, Ref 8 is helpful. Nevertheless, authors should include a schematic of the model.

      Response: We apologize that the paper was not accessible in time. It is now. We have also added a schematic of the model as suggested (Figure S1) and have added a detailed description of the model and simulations in the Introduction (Lines 95-106), Results (Lines 129-141), and Methods (Simulation of heterogenous NFκB dynamical responses).

      2) Also Mendeley Data DOI:10.17632/bv957x6frk.1 and GitHub https://github.com/Xiaolu-Guo/Combinatorial_ligand_NFkB lead to nowhere.

      Response: We thank the reviewer for this comment, and we have made the GitHub code public. Mendeley Data DOI:10.17632/bv957x6frk.1 can be accessed via the shared link https://data.mendeley.com/preview/bv957x6frk?a=6d56e079-d7b0-482e-951f-8a8e06ee8797 and will be made public once the paper is accepted.

      3) Dataset 1 is not described. Possibly it contains sets of parameters of receptor modules (different numbers of sets for each module, why?), but the names of parameters never appear in the text, which makes it impossible to reproduce the data.

      Response: We thank the reviewer for this comment, and we have added the description of the dataset (S3 SupplementaryDataset2_NFkB_network_single_cell_parameter_distribution.xlsx) and added the parameter names in the methods (Simulation of heterogenous NFκB dynamical responses).


      4) It is difficult to understand how the simulations in response to more than one ligand are performed.

      Response: We thank the reviewer for this comment, and we have improved the explanation of the methods (Results, Lines 145-152) and included a detailed description of the model and simulations for combinatorial ligands (Methods, Predicting heterogeneous single-cell responses to combinatorial-ligand stimulation).

      Significance

      A lot of work has been done, the methodology is interesting, but the biological conclusions are overstated.

      Response: We thank the reviewer for their interest in the methodology. We have revised the title and the abstract, and added discussion of our findings to document more accurately what we have found. In the revision, we have increased the clarity and rigor of the work. For the key conclusion that macrophages maintain some level of NFκB signaling fidelity in response to ligand mixtures, we have validated the binary classifier results on experimental data, as the reviewer suggested.

      In the revision, we have also extended our methodology to explore the dose-response curves for different dose combinations of ligand pairs. This further work allowed us to identify the synergistic and antagonistic regimes. By comparing the stimulus-response specificity of the antagonistic model with that of the non-antagonistic model, we demonstrated that signaling antagonism may increase the distinguishability of the presence or absence of specific ligands within complex ligand mixtures. This provides a mechanism for how signaling fidelity is maintained to the surprising degree we reported.

      REFERENCES

      Adelaja, A., Taylor, B., Sheu, K.M., Liu, Y., Luecke, S., Hoffmann, A., 2021. Six distinct NFκB signaling codons convey discrete information to distinguish stimuli and enable appropriate macrophage responses. Immunity 54, 916-930.e7. https://doi.org/10.1016/j.immuni.2021.04.011

      Akira, S., Takeda, K., 2004. Toll-like receptor signalling. Nat Rev Immunol 4, 499–511. https://doi.org/10.1038/nri1391

      Andridge, R.R., Little, R.J.A., 2010. A Review of Hot Deck Imputation for Survey Non-response. Int Stat Rev 78, 40–64. https://doi.org/10.1111/j.1751-5823.2010.00103.x

      Cheng, Z., Taylor, B., Ourthiague, D.R., Hoffmann, A., 2015. Distinct single-cell signaling characteristics are conferred by the MyD88 and TRIF pathways during TLR4 activation. Sci Signal 8, ra69. https://doi.org/10.1126/scisignal.aaa5208

      Davies, A.E., Pargett, M., Siebert, S., Gillies, T.E., Choi, Y., Tobin, S.J., Ram, A.R., Murthy, V., Juliano, C., Quon, G., Bissell, M.J., Albeck, J.G., 2020. Systems-Level Properties of EGFR-RAS-ERK Signaling Amplify Local Signals to Generate Dynamic Gene Expression Heterogeneity. Cell Systems 11, 161-175.e5. https://doi.org/10.1016/j.cels.2020.07.004

      Guo, X., Adelaja, A., Singh, A., Roy, W., Hoffmann, A., 2025a. Modeling single-cell heterogeneity in signaling dynamics of macrophages reveals principles of information transmission. Nature Communications.

      Guo, X., Adelaja, A., Singh, A., Wollman, R., Hoffmann, A., 2025b. Modeling heterogeneous signaling dynamics of macrophages reveals principles of information transmission in stimulus responses. Nat Commun 16, 5986. https://doi.org/10.1038/s41467-025-60901-3

      Hanson, R.L., Batchelor, E., 2022. Coordination of MAPK and p53 dynamics in the cellular responses to DNA damage and oxidative stress. Molecular Systems Biology 18, e11401. https://doi.org/10.15252/msb.202211401

      Huang, Y., Zhang, Q., Lubas, M., Yuan, Y., Yalcin, F., Efe, I.E., Xia, P., Motta, E., Buonfiglioli, A., Lehnardt, S., Dzaye, O., Flueh, C., Synowitz, M., Hu, F., Kettenmann, H., 2020. Synergistic Toll-like Receptor 3/9 Signaling Affects Properties and Impairs Glioma-Promoting Activity of Microglia. J. Neurosci. 40, 6428–6443. https://doi.org/10.1523/JNEUROSCI.0666-20.2020

      Kellogg, R.A., Tian, C., Etzrodt, M., Tay, S., 2017. Cellular Decision Making by Non-Integrative Processing of TLR Inputs. Cell Rep 19, 125–135. https://doi.org/10.1016/j.celrep.2017.03.027

      Kumar, R., Clermont, G., Vodovotz, Y., Chow, C.C., 2004. The dynamics of acute inflammation. Journal of Theoretical Biology 230, 145–155. https://doi.org/10.1016/j.jtbi.2004.04.044

      Luecke, S., Guo, X., Sheu, K.M., Singh, A., Lowe, S.C., Han, M., Diaz, J., Lopes, F., Wollman, R., Hoffmann, A., 2024. Dynamical and combinatorial coding by MAPK p38 and NFκB in the inflammatory response of macrophages. Molecular Systems Biology 20, 898–932. https://doi.org/10.1038/s44320-024-00047-4

      Luecke, S., Sheu, K.M., Hoffmann, A., 2021. Stimulus-specific responses in innate immunity: Multilayered regulatory circuits. Immunity 54, 1915–1932. https://doi.org/10.1016/j.immuni.2021.08.018

      Paek, A.L., Liu, J.C., Loewer, A., Forrester, W.C., Lahav, G., 2016. Cell-to-Cell Variation in p53 Dynamics Leads to Fractional Killing. Cell 165, 631–642. https://doi.org/10.1016/j.cell.2016.03.025

      Peterson, A.F., Ingram, K., Huang, E.J., Parksong, J., McKenney, C., Bever, G.S., Regot, S., 2022. Systematic analysis of the MAPK signaling network reveals MAP3K-driven control of cell fate. Cell Systems 13, 885-894.e4. https://doi.org/10.1016/j.cels.2022.10.003

      Sheu, K.M., Guru, A.A., Hoffmann, A., 2023. Quantifying stimulus-response specificity to probe the functional state of macrophages. Cell Systems 14, 180-195.e5. https://doi.org/10.1016/j.cels.2022.12.012

      Son, M., Wang, A.G., Keisham, B., Tay, S., 2023. Processing stimulus dynamics by the NF-κB network in single cells. Exp Mol Med 55, 2531–2540. https://doi.org/10.1038/s12276-023-01133-7

      Stuart, T., Butler, A., Hoffman, P., Hafemeister, C., Papalexi, E., Mauck, W.M., Hao, Y., Stoeckius, M., Smibert, P., Satija, R., 2019. Comprehensive Integration of Single-Cell Data. Cell 177, 1888-1902.e21. https://doi.org/10.1016/j.cell.2019.05.031

      Werner, S.L., Barken, D., Hoffmann, A., 2005. Stimulus Specificity of Gene Expression Programs Determined by Temporal Control of IKK Activity. Science 309, 1857–1861. https://doi.org/10.1126/science.1113319

      Werner, S.L., Kearns, J.D., Zadorozhnaya, V., Lynch, C., O’Dea, E., Boldin, M.P., Ma, A., Baltimore, D., Hoffmann, A., 2008. Encoding NF-kappaB temporal control in response to TNF: distinct roles for the negative regulators IkappaBalpha and A20. Genes Dev 22, 2093–2101. https://doi.org/10.1101/gad.1680708

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The authors investigate experimentally single macrophages' NF-kB responses to five ligands, separately and to 3 pairs of ligands. Using the single ligand stimulations, they train an existing mathematical model to replicate single-cell NF-kB nuclear trajectories. From what I understand, for each single cell trajectory in response to a given ligand, the best fit parameters of the core module and the receptor module (specific for the given ligand) are found. Then (again, from what I understand), single ligand models are used to generate responses to combinations of ligands. The parametrizations of single ligand models (to be combined) are chosen to have the most similar core modules. It is not described how the responses to more than one ligand are calculated - I expect that respective receptor modules work in parallel, providing signals to the core module. After observing that the response to CpG+pIC is lower (in terms of duration and total) than for CpG alone, the model is modified to account for competition for endosomal transport required by both ligands.

      Having the trained model, simulations of responses to all 31 combinations of ligands are performed, and each NF-κB trajectory is described by six signaling codons: Speed, Peak, Duration, Total, Early vs. Late, and Oscillations. Next, these codons are used to reconstruct (using a random forest model) the stimuli (which may be a combination of ligands). The single and even the two-ligand stimuli are relatively well recognized, which is interpreted as the ability of macrophages to distinguish ligands even if present in combination.
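
      To make the idea of trajectory features concrete, here is a minimal sketch that computes three of them (Peak, Duration, Total) from a sampled trajectory. The definitions used (maximum value, time above a threshold, trapezoidal integral), the function name `simple_features`, and the toy trajectories are all assumptions for illustration; they are not the definitions used in the manuscript.

```python
from typing import Dict, Sequence


def simple_features(times: Sequence[float], signal: Sequence[float],
                    threshold: float) -> Dict[str, float]:
    """Return illustrative Peak, Duration and Total for one trajectory."""
    peak = max(signal)
    total = 0.0
    duration = 0.0
    for i in range(len(times) - 1):
        dt = times[i + 1] - times[i]
        # Trapezoidal rule for the integrated response.
        total += 0.5 * (signal[i] + signal[i + 1]) * dt
        # Count an interval only if both endpoints are at or above threshold.
        if signal[i] >= threshold and signal[i + 1] >= threshold:
            duration += dt
    return {"peak": peak, "duration": duration, "total": total}


times = [0, 1, 2, 3, 4, 5, 6]
prompt = [0, 2, 4, 2, 0, 0, 0]   # hypothetical early response
delayed = [0, 0, 0, 2, 4, 2, 0]  # same shape, shifted by two time units

features_prompt = simple_features(times, prompt, threshold=1.0)
features_delayed = simple_features(times, delayed, threshold=1.0)
```

      Under these stand-in definitions the two toy trajectories yield identical features, so only a timing-dependent feature could separate them.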

      Major comments

      1. The demonstrated ability to recognize stimuli is based on several key assumptions that can hardly be met in reality.

      a) The cell knows the stimulation time, and then it can use speed as a codon. See Fig. S4A: the trajectories in response to pIC are similar to those in response to TNF, but just delayed.

      b) The increase of stimulus concentration typically increases Peak, Duration, and Total, so a similar effect can be achieved by changing the ligand or concentration.

      c) Distinguishing a given ligand in the presence of others, even stronger ones, rests on the assumption that these ligands were given at the same time, which is hardly justified.

      2. For single ligands, it would be nice to see how the random forest classifier works on experimental data, not only on in silico data (even if generated by a fitted model).

      3. My understanding of ligand discrimination is that it is based on the combination of pathways triggered rather than solely on a single transcription factor response trajectory, which varies with ligand concentration and the ligand concentration time profile (there is no reason to assume it is OFF-ON-OFF). For example, some of the considered ligands (pIC and CpG) activate IRF3/IRF7 in addition to NF-kB, which leads to IFN production and activation of STATs. This should at least be discussed.

      Technical comments

      1. Reference 25: X. Guo, A. Adelaja, A. Singh, W. Roy, A. Hoffmann, Modeling single-cell heterogeneity in signaling dynamics of macrophages reveals principles of information transmission. Nature Communications (2025) does not lead to any paper with the same or a similar title and author list. This reference is given as the reference for the model. Fortunately, Ref 8 is helpful. Nevertheless, the authors should include a schematic of the model.
      2. Also, Mendeley Data DOI:10.17632/bv957x6frk.1 and GitHub https://github.com/Xiaolu-Guo/Combinatorial_ligand_NFkB lead nowhere.
      3. Dataset 1 is not described. Possibly it contains sets of parameters of receptor modules (different numbers of sets for each module, why?), but the names of parameters never appear in the text, which makes it impossible to reproduce the data.
      4. It is difficult to understand how the simulations in response to more than one ligand are performed.

      Significance

      A lot of work has been done, the methodology is interesting, but the biological conclusions are overstated.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Guo et al. developed a heterogeneous, single-cell ODE model of NFκB signaling parameterized on five individual ligands (TNF, Pam, LPS, CpG, pIC) and extended it, via core-module parameter matching, to predict responses to all 31 combinations of up to five ligands. They found that simulated responder fractions and signaling codon features generally agreed with live-cell imaging data. A notable discrepancy emerged for the CpG (TLR9) + pIC (TLR3) pair: experiments exhibited non-integrative antagonism unpredicted by the original model. This issue was resolved by incorporating a Hill-type term for competitive, limited endosomal trafficking of these ligands. Finally, by decomposing NFκB trajectories into six "signaling codons" and applying Wasserstein distances plus random-forest and LSTM classifiers, the authors showed that stimulus-response specificity (SRS) declines with ligand complexity but remains statistically significant even for quintuple mixtures. This is a well-written and scientifically sound manuscript about the complexities of cellular signaling, especially considering the limitations of in vitro experiments in recapitulating in vivo dynamics. Here are a few comments and recommendations:

      1. The modeling approach used in this manuscript, while interesting, might need further validation. Inferring multi-ligand receptor parameters by matching single-ligand cells on core-module similarity may not capture true co-variation in receptor expression or adaptor availability. Single cell measurements of receptor expressions could be done (e.g. via flow cytometry) to ground this assumption in real data. If the authors think this is out of scope for this manuscript, they could fit core-matched single cell models with two receptor modules from scratch to the two-ligand experimental data. Would this fitted model produce similar receptor parameters compared to the presented approach? At least the authors should add a bit more explanation for why their modeling approach is better (or valid) than fitting the models with 2/3/4/5 receptor modules from scratch to the experimental data.
      2. The refined model posits competitive, saturable endosomal transport for CpG and pIC, but no direct measurements of endosomal uptake rates or compartmental saturation thresholds are provided, leaving the Hill parameters under-constrained. The authors could produce dose-response curves for CpG and pIC individually and in combination across a range of concentrations to fit the Hill parameters for competitive uptake. If this is out of scope for this paper, the authors should at least comment on why the endosome hypothesis is better than others, e.g. crosstalk and other parallel pathway activations, especially given that even the refined model simulations with Hill equations for CpG and pIC do not quite match the experimental data (Fig 2 B,E).
      3. The authors assess the distinguishability of single-ligand and combinatorial-ligand stimuli using the simulations from the refined model. While this is informative, the simulated data could propagate deviations from the experimental data to the classifiers. How would the classifiers fare when the experimental data are used to assess single-stimulus distinguishability? The authors could use the experimental data they already have to confirm the main claim of the paper, that cells retain stimulus-response specificity even with multiple ligand exposure. In short, how would Fig 3E look when trained/validated on available experimental data?
      4. While the approach presented here, with multiple simultaneous ligand exposures, is a major step towards in vivo-like conditions, the temporal aspect is still missing: that is, temporal phasing, i.e. sequential exposure to multiple ligands as one would expect in vivo, rather than all at once. This is probably out of scope for this paper, but the authors could comment on how their work could be taken forward in this direction and whether the SRS would be better or worse in such conditions.
      5. There is no caption for Figure 3F in the figure legend nor a reference in the main text.
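
      To illustrate the kind of term at issue in comment 2, here is a minimal sketch of one possible Hill-type expression for two ligands competing for a shared, saturable uptake route. The functional form, the function name `competitive_uptake`, and the parameter values (`vmax`, `k`, `n`) are assumptions chosen for illustration; the manuscript's actual equation and fitted parameters are not reproduced here.

```python
def competitive_uptake(own: float, other: float, vmax: float = 1.0,
                       k: float = 1.0, n: float = 2.0) -> float:
    """Uptake rate of one ligand when a second ligand shares the same route.

    Assumed form: vmax * own^n / (k^n + own^n + other^n), so the competitor
    enters only the denominator and lowers the rate for a fixed own dose.
    """
    return vmax * own ** n / (k ** n + own ** n + other ** n)


# Hypothetical doses in arbitrary units.
rate_alone = competitive_uptake(own=1.0, other=0.0)
rate_with_competitor = competitive_uptake(own=1.0, other=1.0)
```

      With these placeholder values the uptake of the first ligand drops from 1/2 to 1/3 once the competitor is present. Dose-response data of the kind requested in comment 2 would be needed to constrain `k` and `n` in any real fit.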

      Significance

      General assessment: This is a good manuscript in its present form which could get better with revision. More supporting data and validation are needed to back the main claim presented in the manuscript.

      Significance/impact/readership: When revised, this manuscript could be of interest to a broad community involving single-cell biology, cell and immune signaling, and mathematical modeling. In particular, the models presented here could be used as a starting point for more complex and detailed modeling approaches.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The authors extend an existing mathematical model of NFkB signalling under stimulation of various single receptors to a model that describes responses to stimulation of multiple receptors simultaneously. They compare this model to experimental data derived from live-cell imaging of mouse macrophages, and modify the model to account for potential antagonism between TLR3 and TLR9 responses due to competition for endosomal transport. Using this framework they show that, despite distinguishability decreasing with increasing numbers of heterogeneous stimuli, macrophages are still able in principle to distinguish these to a statistically significant degree. I congratulate the authors on an interesting approach that extends and validates an existing mathematical model, and also provides valuable information regarding macrophage response.

      There are no major issues affecting the scientific conclusions of the paper, however the lack of detail surrounding the mathematical model and the 'signaling codons' that are used throughout the paper make it difficult to read. This is exacerbated by the fact that I was unable to find Ref 25 which apparently describes the model, however I was able to piece together the essential components from the description in Ref 8 and the supplementary material.

      Lots of the minor comments below stem from this, however there are also a few other places that could benefit from some additional clarification and explanation.

      Significance:

      '...it remains unclear complex...' -> '...it remains unclear whether complex...'

      Introduction: 'temporal dynamics of NFkB' - it would be good to be more concrete regarding the temporal dynamics of what aspect of this (expression, binding, conformation, etc), if possible.

      'signaling codons' - the behaviour of these is key to the entire paper, so even if they are well described in the reference, it would be good to have a short description as early as possible so that the reader can get an idea in their mind what exactly is being discussed here. Later, it would be good to have concrete description of exactly what these capture.

      'This challenge...population of macrophages' - this seems a bit out of place, and is a bit of a run on sentence, so I suggest moving this to the next paragraph and working it into the first sentence there '...regulatory mechanisms, and this challenge could be addressed with a model parameterised to account for heterogeneous...Early models ...', or something similar.

      Ref 25: I can't find a paper with this title anywhere, so if it's an accepted preprint then it would be good to have this available as well. That said, I still think it would be difficult to grasp the work done in this paper without some description of the mathematical model here, at least schematically, if not the full set of ODEs. For example, there are numerous references to how this incorporates heterogeneous responses, the 'core module', etc, and the reader has no context of these if they aren't familiar with the structure of the model.

      'A key challenge which is...' -> 'A key challenge is...'

      'With model simulation ...' -> a bit of a run on sentence, I suggest breaking after 'conditions'.

      Results:

      This section would benefit from a more in-depth description of the model and experimental setup. In particular for the experiment, the reader never really knows what the workflow is, what the model ingests as input, or what the predictions are of.

      '..mechanistic model was trained...' - trained in this study, or in the previous referenced study?

      'determined parameter distributions' - this is where it would be good to have more background on the model. What parameters are these, and what do they correspond to biologically? It would also be nice to see in the methods or supplementary material how this is done (maximum likelihood, etc).

      'matching cells with similar core model...' - it's difficult to follow the logic as to why this is done, so I think this needs to be a little clearer. My guess would be that the assumption is that simulated cells with similar 'core' parameters have a similar downstream signalling response, and therefore the receptors can be 'transplanted'. So it would be nice to see exactly what these distributions are and what the effect of a bad match would be.

      Some explanation of how this relates to the experimental data the parameters are fit on would also be useful. Is there a correspondence between individual simulated cells and the experimental data for the single ligand stimulation, and then the smallest set of these is taken? Is there also a matching from the simulated multi-receptor modules and the multi-receptor data, and if so, is this done in the same way?

      'six signaling codons' - here it would be good to recapitulate what these represent, but also what the 'strength' and 'activity' correspond to (total integrated value, maximum value, etc)

      'pre-defined thresholds' - no need to state these numerically in the text (although giving some sense of how/why these were chosen would give some context), but I couldn't find the values of these, nor values corresponding to the signaling codons.

      'non-responder cells are likely a result of cellular heterogeneity in receptor modules rather than the core module' - is this the 'ill health' referenced earlier? If so make this clear.

      It's also very difficult to follow this chain of logic, given that the reader at this point doesn't have any knowledge of what the 'core' module is, nor the significance of the thresholds on the signaling codons. I would suggest making this much clearer, with reference to each of these.

      '...but the model represented these as independent mass action reactions' - the significance of this may not be clear to someone not familiar with biophysical models, so probably better to make it explicit.

      '...we trained a random forest classifier...' - is this trained on the 'raw' experimental time series data, or on the signaling codons?

      'We also applied a Long Short-Term Memory (LSTM) machine learning model...' - it might be good to reference these three approaches at the beginning of this section, otherwise they seem to come out of the blue a little.

      'We then used machine learning classifiers...' - random forests, LSTMs, or a different model?

      Discussion:

      '...over statistical models...' - suggest maybe 'purely statistical models'

      'We found that endosomal transport...' - A paper by Huang et al. (https://www.jneurosci.org/content/40/33/6428) observed a synergistic phagocytic response between CpG and pIC stimulation in microglia. This is still consistent with a saturation effect dependent on dose, but may be worth a mention.

      '...features termed...' -> 'features, termed'

      '...we applied a Long Short-Term Memory (LSTM) machine learning model..' - maybe make clear that this is on the time-series data (also LSTM has already been defined).

      Materials and methods:

      The descriptions in this section are quite vague, so I would suggest expanding this with more detail from the supplementary material, where things are quite well explained.

      'sampling distribution' - not clear what this refers to in this context

      'RelA-mVenus mouse strain' - it would be good to mention the relevance of the reporter for NFkB signaling

      '...A random forest classifier...' -> a random forest classifier

      Significance

      This study provides mechanistically interpretable insight on the important question of how immune cells perform target recognition in realistic scenarios, and also provides validation of existing mathematical models by extending these beyond their original domain. The paper uses 'signaling codons' as a proxy for information processing, however in this instance it is cross-validated with an LSTM model that is applied directly to the time series data. Nevertheless, the scope of the paper is such that it does not deal with the question of how these signals are transmitted or used in a downstream immune response. To my knowledge, this is the first time that a well established existing mathematical model of signalling response has been extended and applied to heterogeneous ligand mixtures. These results will be of interest to those studying immune cell responses, and to those interested in basic research on mathematical models of signaling and cellular information processing more generally.

      My background is in biophysical models, machine learning, and signaling in cancer. I have a basic understanding of immunology, but no experience in experimental cell biology.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: This manuscript reports the identification of putative orthologues of mitochondrial contact site and cristae organizing system (MICOS) proteins in Plasmodium falciparum, an organism that unusually shows an acristate mitochondrion during the asexual part of its life cycle, which then develops cristae as the parasite enters the sexual stage of its life cycle and beyond into the mosquito. The authors identify PfMIC60 and PfMIC19 as putative members and study these in detail. The authors add HA tags to both proteins, look at the timing of expression during the parasite life cycle, and attempt (unsuccessfully) to localise them within the parasite. They also genetically deleted both genes singly and in parallel and phenotyped the effect on parasite development. They show that both proteins are expressed in gametocytes and not asexuals, suggesting they are present at the same time as cristae development. They also show that the proteins are dispensable for the entire parasite life cycle investigated (asexuals through to sporozoites); however, there is some reduction in mosquito transmission. Using EM techniques they show that the morphology of gametocyte mitochondria is abnormal in the knockout lines, although there is great variation.

      Major comments: The manuscript is interesting and is an intriguing use of a well studied organism of medical importance to answer fundamental biological questions. My main comments are that there should be greater detail in areas around methodology and statistical tests used. Also, the mosquito transmission assays (which are notoriously difficult to perform) show substantial variation between replicates and the statistical tests and data presentation are not clear enough to conclude the reduction in transmission that is claimed. Perhaps this could be improved with clearer text?

      We would like to thank the reviewer for taking the time to review our manuscript. We are happy to hear the reviewer thinks the manuscript is interesting and thank the reviewer for their constructive feedback.

      To clarify the statistical analyses used, we included a new supplementary dataset with all statistical analyses and p-values indicated per graph. Furthermore, figure legends now include the information on the exact statistical test used in each case.

      Regarding mosquito experiments, while we indeed reported a reduction in transmission and oocyst numbers, we are aware that this effect might be due to the high variability in mosquito feeding assays. To highlight this point, we deleted the sentence "with the transmission reduction of [numbers]...." and we included the sentence "The high variability encountered in the standard membrane feeding assays, though, partially obstructs a clear conclusion on the biological relevance of the observed reduction in oocyst numbers"

      More specific comments to address: Line 101/Fig1E (and figure legend) - What is this heatmap showing? It would be helpful to have a sentence or two linking it to a specific methodology. I could not find details in the M+M section and "specialized, high molecular mass gels" does not adequately explain what experiments were performed. The reference to Supplementary Information 1 also did not provide information.

      We added the information "high molecular mass gels with lower acrylamide percentage" to clarify methodology in the text. Furthermore, we extended the figure legend to include all relevant information. Further experimental details can be found in the study cited in this context, where the dataset originates from (Evers et al., 2021).

      Line 115 and Supplementary Figure 2C + D - The main text says that the transgenic parasites contained a mitochondrially localized mScarlet for visualization and localization, but in the supplementary figure 2 it shows mitotracker labelling rather than mScarlet. This is very confusing. The figure legend also mentions both mScarlet and MitoTracker. I assume that mScarlet was used to view in regular IFAs (Fig S2C) and the MitoTracker was used for the expansion microscopy (Fig S2D)? Please clarify.

      We thank the reviewer for pointing this out - this was indeed incorrectly annotated. We used the endogenous mito-mScarlet signal in IFA and mitoTracker in U-ExM. The figure annotation has now been corrected.

      Figure 2C - what is the statistical test being used (the methods say "Mean oocysts per midgut and statistical significance were calculated using a generalized linear mixed effect model with a random experiment effect under a negative binomial distribution." but what test is this?)?

      The statistical test is now included in the Materials and Methods section with the sentence "The fitted model was used to obtain estimated means and contrasts and were evaluated using Wald Statistics". The test is now also mentioned in the figure legend.
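
      For readers unfamiliar with the term, the following minimal sketch shows how a Wald z-statistic and a two-sided p-value are obtained from a single estimated contrast and its standard error, using the standard normal approximation. The function name `wald_test` and the numerical inputs are hypothetical and are not taken from the manuscript; the actual analysis was performed on the fitted mixed model.

```python
import math
from typing import Tuple


def wald_test(estimate: float, std_error: float) -> Tuple[float, float]:
    """Return the Wald z-statistic and its two-sided normal p-value."""
    z = estimate / std_error
    # For a standard normal Z, P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical contrast on the log scale (e.g. a log ratio of mean counts).
z_stat, p_val = wald_test(estimate=-0.98, std_error=0.5)
```

      In a count model with a log link, the contrast is estimated on the log scale, so exponentiating the estimate gives a ratio of means.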

      Also the choice of a log10 scale for oocyst intensity is an unusual choice - how are the mosquitoes with 0 oocysts being represented on this graph? It looks like they are being plotted at 10^-1 (which would be 0.1 oocysts in a mosquito which would be impossible).

      As the data spans three orders of magnitude with low values being biologically meaningful, we decided that a log scale would best facilitate readability of the graph. As the 0 values are also important to show, we went with a standard approach to handle 0s in log transformed data and substituted the 0s with a small value (0.001). We apologize for not mentioning this transformation in the manuscript. To make this transformation transparent, we added a break at the lower end of the log‑scaled y‑axis and relabelled the lowest tick as '0'. This ensures that mosquitoes with zero oocysts are shown along the x‑axis without being assigned an artificial value on the log scale. We would furthermore like to highlight that for statistics we used the true value 0 and not 0.001.
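
      The separation between displayed and analysed values described above can be sketched as follows. The substitute value 0.001 is the one stated in the response; the function name `display_values` and the example counts are hypothetical.

```python
import statistics
from typing import List, Sequence


def display_values(counts: Sequence[int], floor: float = 0.001) -> List[float]:
    """Replace zeros with a small positive value for log-scale plotting only."""
    return [float(c) if c > 0 else floor for c in counts]


counts = [0, 0, 3, 12, 150]          # hypothetical oocyst counts per midgut
plotted = display_values(counts)     # used only to position points on the axis
mean_true = statistics.mean(counts)  # statistics use the untouched counts
```

      Keeping the two lists separate ensures that the placeholder never enters a mean or a model fit.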

      Figure 2D - it is great that the data from all feeding replicates has been shared; however, it is difficult to conclude any meaningful impact on transmission with the knockout lines when there is so much variation and so few mosquitoes dissected for some datapoints (10 mosquitoes is a very small sample size). For example, Exp1 shows a clear decrease in mic19- transmission, but then Exp2 does not really show as great an effect. Similarly, why does the double knockout have better transmission than the single knockouts? Surely there would be a greater effect?

      We agree with the reviewer and with the new sentence added, as per major point, we hope we clarified the concept. Note that original Figure 2D has been moved to the supplementary information, as per minor comment of another reviewer.

      Figure 3 legend - Please add which statistical test was used and the number of replicates.

      Done

      Figure 4 legend - Please add which statistical test was used and the number of replicates.

      Done. Regarding replicates, note that while we measured over 100 cristae from over 30 mitochondria, these all stem from the same parasite culture.

      Figure 5C - the 3D reconstructions are very nice, but what does the red and yellow coloring show?

      Indeed, the information was missing. We added it to the figure legend.

      Line 352 - "Still, it is striking that, despite the pronounced morphological phenotype, and the possibly high mitochondrial stress levels, the parasites appeared mostly unaffected in life cycle propagation, raising questions about the functional relevance of mitochondria at these stages." How do the authors reconcile this statement with the proven fact that mitochondria-targeted antimalarials (such as atovaquone) are very potent inhibitors of parasite mosquito transmission?

      Our original sentence was reductive. What we wanted to state was related to the functional relevance of crista architecture and overall mitochondrial morphology rather than the general functional relevance of the mitochondria. We changed the sentence accordingly.

      Furthermore, even though we do not discuss this in the article, we are aware of mitochondria-targeting drugs that are known to block mosquito transmission. We want to point out that it is difficult to disentangle the disruption of the ETC, and therefore an impact on energy conversion, from the impact on the essential pathway of pyrimidine synthesis, which is highly relevant in microgamete formation. Still, a recent paper from Sparkes et al. 2024 showed the essentiality of mitochondrial ATP synthesis during gametogenesis, so it is very likely that mitochondrial energy conversion is highly relevant for transmission to the mosquito.

      Reviewer #1 (Significance (Required)):

      This manuscript is a novel approach to studying mitochondrial biology and does open a lot of unanswered questions for further research directions. Currently there are limitations in the use of statistical tests and detail of methodology, but these could easily be addressed with a bit more analysis/better explanation in the text. This manuscript could be of interest to readers with a general interest in mitochondrial cell biology and those within the specific field of Plasmodium research. My expertise is in Plasmodium cell biology.

      We thank the reviewer for the praise.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Major comments: 1) In my opinion, the authors tend to sensationalize or overinterpret their results. The title of the manuscript is very misleading. While MICOS is certainly important for crista formation, it is not the only factor, as ATP synthase dimer rows make a highly significant contribution to crista morphology. Thus, one can argue with equal validity that ATP synthase should be considered the 'architect', as the conformation of the dimers and rows modulates positive curvature. Secondly, while cristae are still formed upon mic60/mic19 gene knockout (KO), they are severely deformed, and likely dysfunctional (see below). Thus, I do not agree with the title that MICOS is dispensable for crista formation, because the authors' results show that it clearly is essential. So, the title should be changed.

      We thank the reviewer for taking the time to review our manuscript.

      Based on the reviewers' interpretation we conclude the title does not come across as intended. We have changed the title to: "The role of MICOS in organizing mitochondrial cristae in malaria parasites"

      The Discussion section starting from line 373 also suffers from overinterpretation as well as being repetitive and hard to understand. The authors infer that MICOS stability is compromised less in the single KOs (sKO) compared to the mic60/mic19 double KO (dKO). MICOS stability was never directly addressed here and the composition of the MICOS complex is unaddressed, so it does not make sense to speculate on the basis of such tenuous connections. The data suggest to me that mic60 and mic19 are equally important for crista formation and crista junction (CJ) stabilization, and the dKO has a more severe phenotype than either KO, further demonstrating neither is epistatic.

      We do agree with the reviewer's notion that we did not address complex stability, and our wording did not make this sufficiently clear. We shortened and rephrased the paragraph in question.

      The following paragraphs (line 387 to 422) continue with such unnecessary overinterpretation to the point that it is confusing and contradictory. Line 387 mentions an 'almost complete loss of CJs' and then line 411 mentions an increase in CJ diameter, both upon Mic60 ablation. I do not think this discussion brings any added value to the manuscript and should be shortened. Yes, maybe there are other putative MICOS subunits that may linger in the KOs that are further destabilized in the dKO, or maybe Mic60 remains in the mic19 KO (and vice versa) to somehow salvage more CJs, which is not possible in the dKO. It is impossible to say with confidence how ATP synthase behaves in the KOs with the current data.

      We shortened this paragraph.

      2) While the authors went to impressive lengths to detect any effect on lifecycle progression, none was found except for a reduction in oocyst count. However, the authors did not address any direct effect on mitochondria, such as OXPHOS complex assembly, respiration, or membrane potential. This seems like a missed opportunity, given the team's previous and very nice work mapping these complexes by complexome profiling. However, I think there are some experiments the authors can still do to address any mitochondrial defects using what they have and not resorting to complexome profiling (although this would be definitive if it is feasible):

      i) Quantification of MitoTracker Red staining in WT and KOs. The authors used this dye to visualize mitochondria to assay their gross morphology, but unfortunately not to assay membrane potential in the mutants. The authors can compare relative intensities of the different mitochondria types they categorized in Fig. 3A in 20-30 cells to determine if membrane potential is affected when the cristae are deformed in the mutants. One would predict they are affected.

      Interesting suggestion. As our staining and imaging conditions are suitable for such analysis (as demonstrated by Sarazin et al., 2025, https://www.biorxiv.org/content/10.1101/2025.11.27.690934v1), we performed the measurements on the same dataset which we collected for Figure 3. We did, however, not detect any difference in mitotracker intensity between the different lines. The result of this analysis is included in the new version of Supplementary figure S6.

      ii) Sporozoites are shown in Fig S5. The authors can use the same set up to track their motion, with the hypothesis that they will be slower in the mutants compared to WT due to less ATP. This assumes that sporozoite mitochondria are active as in gametocytes.

      While theoretically plausible and informative, we currently do not know the relevance of mitochondrial energy conversion for general sporozoite biology or specifically features of sporozoite movement. Given the required resources and time to set this experiment up and the uncertainty whether it is a relevant proxy for mitochondrial functioning, we argue it is out of scope for this manuscript.

      iii) Shotgun proteomics to compare protein levels in mutants compared to WT, with the hypothesis that OXPHOS complex subunits will be destabilized in the mutants with deformed cristae. This could be indirect evidence that OXPHOS assembly is affected, resulting in destabilized subunits that fail to incorporate into their respective complexes.

      While this experiment could potentially further our understanding of the interaction between MICOS and levels of OXPHOS complex subunits we argue that the indirect nature of the evidence does not justify the required investments.

      To expedite resubmission, the authors can restrict the cell lines to WT and the dKO, as the latter has a stronger phenotype than the individual KOs and conclusions from this cell line are valid for overall conclusions about Plasmodium MICOS.

      I will also conclude that complexome/shotgun proteomics may be a useful tool also for identifying other putative MICOS subunits by determining if proteins sharing the same complexome profile as PfMic60 and Mic19 are affected. This would address the overinterpretation problem of point 1.

      3) I am aware of the authors' previous work in which they were not able to detect cristae in ABS, and thus have concluded that these are truly acristate. This can very well be true, or there can be immature cristae forms that evaded detection at the resolution they used in their volumetric EM acquisitions. The mitochondria and gametocyte cristae are pretty small anyway, so it is not unreasonable to assume that putative rudimentary cristae in ABS may be even smaller still. Minute levels of sampled complex III and IV plus complex V dimers in ABS that were detected previously by the authors by complexome profiling would argue for the presence of minuscule and/or very few cristae.

      I think that the authors should hedge their claim that ABS is acristate by briefly stating that there still is a possibility that minuscule cristae may have been overlooked previously.

      We acknowledge that we cannot demonstrate the absolute absence of any membrane irregularities along the inner mitochondrial membrane. At the same time, if such structures were present, they would be extremely small and unlikely to contain the full set of proteins characteristic of mature cristae. For this reason, we consider it appropriate to classify ABS mitochondria as acristate. To reflect the reviewer's point while maintaining clarity for readers, we have slightly adjusted our wording in the manuscript, changing 'fully acristate' to 'acristate'.

      This brings me to the claim that Mic19 and Mic60 proteins are not expressed in ABS. This is based on the lack of signal from the epitope tag; a weak signal is detected in gametocytes. Thus, one can counter that Mic19 and Mic60 are also expressed, but below the expression limits of the assay, as the protein exhibits low expression levels when mitochondrial activity is upregulated.

      We agree with the reviewer that the absence of a detectable epitope‑tag signal does not definitively exclude low‑level expression, and we have therefore replaced the term 'absent' with 'undetectable' throughout the manuscript. In light of previous reports of low-level transcripts for both proteins by López-Barragán et al. and Otto et al., we also added the sentence "The apparent absence could indicate that transcripts are not translated in ABS or that the proteins' expression was below detection limits of western blot analysis." to the discussion. At the same time, we would like to clarify that transcript levels for both genes fall within the

      To address this point, the authors should determine if mature mic60 and mic19 mRNAs are detected in ABS in comparison to the dKO, which will lack either transcript. RT-qPCR using polyT primers can be employed to detect these transcripts. If the levels of these mRNAs in WT ABS are equivalent to those in the dKO, the authors can make a pretty strong case for the absence of cristae in ABS.

      We appreciate the reviewer's suggestion. As noted in the Discussion, existing transcriptomic datasets already show detectable MIC19 and MIC60 mRNAs in ABS. For this reason, we expect RT-qPCR to reveal low (but not absent) levels of both transcripts, unlike the true loss expected to be observed in the dKO. Because such residual signals have been reported previously and their biological relevance remains uncertain, we do not believe transcript levels alone can serve as a definitive indicator of cristae absence in ABS.

      They should highlight the twin CX9C motifs that are a hallmark of Mic19 and other proteins that undergo oxidative folding via the MIA pathway. Interestingly, the Mia40 oxidoreductase that is central to MIA in yeast and animals, is absent in apicomplexans (DOI: 10.1080/19420889.2015.1094593).

      Searching for the CX9C motifs is a valuable suggestion. In response to the reviewer's suggestion, we analysed the conservation of the motif in PfMIC19 and included this in a new figure panel (Figure 1F).

      Did the authors try to align Plasmodium Mic19 orthologs with conventional Mic19s? This may reveal some conserved residues within and outside of the CHCH domain.

      In response to this comment we made Figure 1 F, where we show conserved residues within the CHCH domains of a broad range of MIC19 annotated sequences across the opisthokonts, and show that the Cx9C motifs are conserved also in PfMIC19. Outside the CHCH domain, we did not find any meaningful conservation, as PfMIC19 heavily diverges from opisthokont MIC19.

      5) Statistical significance. Sometimes my eyes see population differences that are considered insignificant by the statistical methods employed by the authors, eg Fig. 4E, mutants compared to WT, especially the dKO. Have the authors considered using other methods such as Student's t-test for pairwise comparisons?

      The graphs in Figures 3, 4 and 5 have been redrawn as violin plots on a linear scale (also following a suggestion from further down in the reviewer's comments). We believe that this improves interpretability. We kept ANOVA as the statistical test because it allows correction for multiple comparisons, which a standard t-test does not provide. A full overview of the statistics and exact p-values can also be found in the newly added Supplementary Information 2.

      Minor comments: Line 33. Anaerobes (eg Giardia) have mitochondria that do not produce ATP, unlike aerobic mitochondria

      We acknowledge that producing ATP via OXPHOS is not a characteristic of all mitochondria-like organelles (e.g. mitosomes), which is why these are typically classified separately from canonical mitochondria. When not considering mitochondria-like organelles, energy conversion is the function that the mitochondrion is most well-known for and the one associated with cristae.

      Line 56: Unclear what authors mean by "canonical model of mitochondria"

      To clarify we changed this to "yeast or human" model of mitochondria.

      Lines 75-76: This applies to Mic10 only

      We removed the "high degree of conservation in other cristate eukaryotes" statement.

      Line 80: Cite DOI: 10.1016/j.cub.2020.02.053

      Done

      Fig 2D: I find this table difficult to read. If the authors keep the table format, at least get rid of the 'mean' column, as this data is better depicted in 2C. I suggest depicting this data as in 3B, showing the proportion of infected vs uninfected flies in all experiments, then moving the modified table to the supplement. It is important to point out that experiment 5 appears to be an outlier with reduced infectivity across all cell lines, including WT.

      To clarify: the mean reported in the table indicates the mean per replicate while the mean reported in figure 2C is the overall mean for a given genotype that corrects for variability within experiments. We agree that moving the table to the supplementary data is a good idea. We decided to not include a graph for infected and non-infected mosquitoes as this information would be partially misleading, highlighting a phenotype we argue to be influenced by the strong variability.

      Fig. 3C-G: I feel like these data repeatedly lead to same conclusions. These are all different ways of showing what is depicted in Fig 2B: mitochondria gross morphology is affected upon ablation of MICOS. I suggest that these graphs be moved to supplement and replaced by the beautiful images.

      Thank you for the nice comment on our images. We have now moved part of the graphs to supplementary figure 6 and only kept the Relative Frequency, Sphericity and total mitochondria volume per cell in the main figure.

      Line 180: Be more specific with which tubulin isoform is used as a male marker and state why this marker was used in supplemental Fig S6.

      We have now specified the exact tubulin isoform used as the male gametocyte marker, both in the main text and in Supplementary Fig. S6. This is a commercial antibody previously known to work as an effective male marker, which is why we selected it for this experiment. This is now clearly stated in the manuscript.

      Line 196 and Fig 3C: the word 'intensities' in this context is very ambiguous. Please choose a different term (puncta, elements, parts?). This is related to major point 2i above.

      To clarify the biological effect that we can conclude from the measurement, we added an explanation in the respective section of the results, and we decided to replace the raw results of the plug-in readout with the deduced relative dispersion.

      Line 222: Report male/female crista measurements

      We added Supplementary Information 2, which contains the exact statistical tests and outcomes for all presented quantifications as well as a per-sex statistical analysis of the data from Figure 4. Correspondingly, we extended Supplementary Information 2 with a per-sex colour code for the thin-section TEM data.

      Fig. 4B-E: depict data as violin plots or scatter plots like Fig. 2C to get a better grasp of how the crista coverage is distributed. It seems like the data spread is wider in the double KO. This would also solve the problem with the standard deviation extending beyond 0%.

      We changed this accordingly.

      Lines 331-333: Please clarify that this applies for some, but not all MICOS subunits. Please also see major point 1 above. Also, the authors should point out that despite their structural divergence, trypanosomal cryptic mitofilins Mic34 and Mic40 are essential for parasite growth, in contrast to their findings with PfMic60 (DOI: https://doi.org/10.1101/2025.01.31.635831).

      This has been changed accordingly.

      Line 320: incorrect citation. Related to point 1 above.

      Correct citation is now included in the text.

      Lines 333-335. This is related to the above. Again, some subunits appear to affect cell growth under lab conditions, and some do not. This and the previous sentence should be rewritten to reflect this.

      This has been changed accordingly.

      Lines 343-345: The sentence and citation 45 are strange. Regarding the former, it is about CHCHD10, whose status as a bona fide MICOS subunit is very tenuous, so I would omit this. About the phenomenon observed, I think it makes more sense to write that Mic60 ablation results in partially fragmented mitochondria in yeast (Rabl et al., 2009 J Cell Biol. 185: 1047-63). Mitochondrial fragmentation is often a physiological response to stress. I would just rewrite this so as not to imply that mitochondrial fission (or fusion) is impaired in these KOs, or at least present this as one of several possibilities.

      The sentence has been replaced as the reviewer suggested. We still include the data from human cells, however, as this has also been shown in Stephens et al. 2020.

      Line 373: 'This indicates' is too strong. I would say 'may suggest' as you have no proof that any of the KOs disrupts MICOS. This hypothesis can be tested by other means, but not by penetrance of a phenotype.

      Done

      Lines 376-377: 'deplete functionality' does not make sense, especially in the context of talking about MICOS subunit stability. In my opinion, this paragraph overinterprets the KO effects on MICOS stability. None of the experiments address this phenomenon, and thus the authors should not try to interpret their results in this context. See major point 1.

      We removed the sentence. Also, the entire paragraph has been shortened, restructured and wording was changed to address major point 1.

      Other suggestions for added value

      1) Does Plasmodium Sam50 co-fractionate with Mic60 and Mic19 in BN PAGE (Fig. 1E)?

      While we did identify SAMM50 in our BN PAGE, the protein does not co-migrate with the MICOS components but instead co-migrates with other components of a putative sorting and assembly machinery (SAM) complex. As SAMM50, the SAM complex and the overarching putative mitochondrial intermembrane space bridging (MIB) complex are not mentioned in the manuscript, we decided not to include this information in the figure.

      Reviewer #2 (Significance (Required)):

      The manuscript by Tassan-Lugrezin is predicated on the idea that Plasmodium represents the only system in which de novo crista formation can be studied. They leverage this system to ask the question whether MICOS is essential for this process. They conclude based on their data that the answer is no, which the authors consider unprecedented. But even if their claim is true that ABS is acristate, this supposed advantage does not really bring any meaningful insight into how MICOS works in Plasmodium.

      First the positives of this manuscript. As has been the case with this research team, the manuscript is very sophisticated in the experimental approaches that are made. The highlights are the beautiful and often conclusive microscopy performed by the authors. Only the localization of Mic60 and Mic19 was inconclusive due to their very low expression unfortunately.

      The examination of the MICOS mutants during the in vitro life cycle of Plasmodium falciparum is extremely impressive and yields convincing results. Mitochondrial deformation is tolerated by life cycle stage differentiation, with a modest but significant reduction of oocyst production being observed.

      However, despite the herculean efforts of the authors, the manuscript as it currently stands represents only a minor advance in our understanding of the evolution of MICOS, which from the title and focus of the manuscript, is the main goal of the authors. In its current form, the manuscript reports some potentially important findings:

      1) Mic60 is verified to play a role in crista formation, as is predicted by its orthology to other characterized Mic60 orthologs.

      2) The discovery of a novel Mic19 analog (since the authors maintain there is no significant sequence homology), which exhibits a similar (or the same?) complexome profile with Mic60. This protein was upregulated in gametocytes like Mic60 and phenocopies Mic60 KO.

      3) Both of these MICOS subunits are essential (not dispensable) for proper crista formation

      4) Surprisingly, neither MICOS subunit is essential for in vitro growth or differentiation from ABS to sexual stages, and from the latter to sporozoites. This says more about the biology of Plasmodium itself than anything about the essentiality of Mic60, ie Plasmodium life cycle progression tolerates defects to mitochondrial morphology. But yes, I agree with the authors that Mic60's apparent insignificance for cell growth in examined conditions does differ with its essentiality in other eukaryotes. But fitness costs were not assayed (eg by competition between mutants and WT in infection of mosquitoes).

      5) Decreased fitness of the mutants is implied by a reduction of oocyst formation.

      While interesting in their own way, collectively they do not represent a major advance in our understanding of MICOS evolution. Furthermore, the findings bifurcate into categories informing MICOS or Plasmodium biology. Both aspects are somewhat underdeveloped in their current form.

      This is unfortunate because there seem to be many missed opportunities in the manuscript that could, with additional experiments, lead to a manuscript with much wider impact. For me, what is remarkable about Plasmodium MICOS that sets it apart from other iterations is the apparent absence of the Mic10 subunit. Purification of Plasmodium MICOS via the epitope-tagged Mic60 and Mic19 could have verified that MICOS is assembled without this core subunit. Perhaps Mic60 and Mic19 are the vestiges of the complex, and thus operate alone in shaping cristae. Such a reduction may also suggest the declining importance of mitochondria in Plasmodium.

      Another missed opportunity was to assay the impact of MICOS depletion on OXPHOS in Plasmodium. This is a salient issue, as crista morphology may be decoupled from OXPHOS capacity in Plasmodium, which links to the apparent tolerance of altered mitochondrial morphology in cell growth and differentiation. I suggested in section A experiments to address this deficit.

      Finally, the authors could assay fitness costs of MICOS ablation and associated phenotypes by assaying whether mosquito infectivity is reduced in the mutants when they are directly competing with WT Plasmodium. Like the authors, I am also surprised that MICOS mutants can pass population bottlenecks represented by differentiation events. Perhaps the apparent robustness of differentiation may contribute to Plasmodium's remarkable ability to adapt.

      I realize that the authors put a lot of effort into their study and again, I am very impressed by the sophistication of the methods employed. Nevertheless, I think there are still better ways to increase the impact of the study aside from overinterpreting the conclusions from the data. But this would require more experiments along the lines I suggest in Section A and here.

      We thank the reviewer for their extensive analysis of the significance of our findings, including the compliments on our microscopy images and the sophisticated experimental approaches. We hope we have convincingly argued why we could or could not include some of the additional analyses suggested by the reviewer in section 1 above.

      With regard to the significance statement, we want to point out that our finding that PfMICOS is not needed for initial formation of cristae (as opposed to organization thereof), is a confirmation of something that has been assumed by the field, without being the actual focus of studies. We argue that the distinction between formation and organization of cristae is important and deserves some attention within the manuscript. The result of MICOS not being involved in the initial formation of cristae, we argue to be relevant in Plasmodium biology and beyond. As for the insights into how MICOS works in Plasmodium we have confirmed that the previously annotated PfMIC60 is indeed involved in the organization of cristae. Furthermore, we have identified and characterized PfMIC19. These findings, we argue, are indeed meaningful insights into PfMICOS.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary:

      MICOS is a conserved mitochondrial protein complex responsible for organising the mitochondrial inner membrane and the maintenance of cristae junctions. This study sheds first light on the role of two MICOS subunits (Mic60 and the newly annotated Mic19) in the malaria parasite Plasmodium falciparum, which forms cristae de novo during sexual development, as demonstrated by EM of thin section and electron tomography. By generating knockout lines (including a double knockout), the authors demonstrate that knockout of both MICOS subunits leads to defects in cristae morphology and a partial loss of cristae junctions. With a formidable set of parasitological assays, the authors show that despite the metabolically important role of mitochondria for gametocytes, the knockout lines can progress through the life stages and form sporozoites, albeit with diminished infection efficiency.

      We thank the reviewer for their time and compliment.

      Major comments:

      1) The authors should improve how they present their findings in the right context, in particular by:

      (i) giving a clearer description in the introduction of what is already known about the role of MICOS. This starts in the introduction, where one main finding is missing: loss of MICOS leads to loss of cristae junctions and the detachment of cristae membranes, which are nevertheless formed, but become membrane vesicles. This needs to be clearly stated in the introduction to allow the reader to understand the consistency of the authors' findings in P. falciparum with previous reports in the literature.

      We extended the introduction to include this information.

      (ii) at the end of the introduction, the motivating hypothesis is formulated ad hoc: "conclusive evidence about its involvement in the initial formation of cristae is still lacking" (line 83). If there is evidence in the literature that MICOS is strictly required for cristae formation in any organism, then this should be explained, because the bona fide role of MICOS is maintenance of cristae junctions (the hypothesis is still plausible and its testing important).

      To clarify we rephrased the sentence to: "Although MICOS has been described as an organizer of crista junctions, its role during the initial formation of nascent cristae has not been investigated."

      2) Line 96-97: "Interestingly, PfMIC60 is much larger than the human MICOS counterpart, with a large, poorly predicted N-terminal extension." This statement is lacking a reference and presumably refers to annotated ORFs. The authors should clarify if the true N-terminus is definitely known - a 120kDa size is shown for the P. falciparum but this is not compared to the expected length or the size in S. cerevisiae.

      To solve the reference issue, we added the uniprot IDs we compared to see that the annotated ORF is bigger in Plasmodium. We also changed the comparison to yeast instead of human, because we realized it is confusing to compare to yeast all throughout the figure, but then talk about human in this specific sentence.

      Regarding whether the true N-terminus is known. Short answer: No, not exactly.

      However, we do know that the Pf version is about double the size of the yeast protein.

      As the reviewer correctly states, we show the size of 120kDa for the tagged protein in Figure 1G. Considering that we tagged the protein C-terminally, and observed a 120kDa product on western blot, it is safe to conclude that the true N-terminus does not deviate massively from the annotated ORF, and hence, that there is a considerable extension of the protein beyond a 60kDa protein. We do not directly compare to yeast MIC60 on our western blots, however, that comparison can be drawn from literature: Tarasenko et al., 2017 showed that purified MIC60 running at ~60kDa on SDS-PAGE actively bends membranes, suggesting that in its active form, the monomer of yeast MIC60 is indeed 60kDa in size.

      To clarify, we now emphasize that we ran the AlphaFold prediction on the annotated open reading frame (annotated and sequenced by Böhme et al. and Chappell et al., now cited in the manuscript), and revised the wording to make clear what we are comparing in which sentence.

      3) lines 244-245: "Furthermore, our data indicates the effect size increases with simultaneous ablation of both proteins". The authors should explain which data they are referring to, as some of the data in Fig 3 and 4 look similar and all significance tests relate to the wild type, not between the different mutants, so it is not clear if any observed differences are significant. The authors repeat this claim in the discussion in lines 368-369 without referring to a specific significance test. This needs to be clarified.

      As a reply to this and other comments from the reviewers we added the multiple testing within all samples. In addition, to clarify statistics used we included a supplementary dataset with all p-values and statistical tests used.

      4) lines 304-306: "Though well established as the cristae organizing system, the role of MICOS in initial formation of cristae remains hidden in model organisms that constitutively display cristae.". This sentence is misleading since even in organisms that display numerous cristae throughout their life cycle, new cristae are being formed as the cells proliferate. Thus, failure to produce cristae in MICOS knockout lines would have been observable but has apparently not been reported in the literature. Thus, the concerted process in P. falciparum makes it a great model organism, but not fundamentally different to what has been studied before in other organisms.

      We deleted this statement.

      5) lines 373-378. "where ablation of just MIC60 is sufficient to deplete functionality of the entire MICOS (11, 15),". The authors' claim appears to be contrary to what is actually stated in ref 15, which they cite:

      "MICOS subunits have non-redundant functions as the absence of both MICOS subcomplexes results in more severe morphological and respiratory growth defects than deletion of single MICOS subunits or subcomplexes."

      This seems in line with what the authors show, rather than "different".

      This sentence has been removed.

      6) lines 380-385: "... thus suggesting that membrane invaginations still arise, but are not properly arranged in these knockout lines. This suggests that MICOS either isn't fully depleted,...". These conclusions are incompatible with findings from ref. 15, which the authors cite. In that study, the authors generated a ∆MICOS line which still forms membrane invaginations, showing that MICOS is not required at all for this process in yeast. Hence the authors' implication that MICOS needs to be fully depleted before membrane invaginations cease to occur is not supported by the literature.

      This sentence has been deleted in the revised version of the manuscript.

      Minor comments:

      7) The authors should consider if the first part of their title could be seen as misleading: It suggests that MICOS is "the architect" in cristae formation, but this is not consistent with the literature nor their own findings.

      The title has been changed accordingly.

      • Line 43, of the three seminal papers describing the discovery of MICOS in 2011, the authors only cite two (refs 6 and 7), but miss the third paper, Hoppins et al, PMID: 21987634, which should probably be corrected.

      Done, the paper is now cited

      • Page 2, line 58: for a more complete picture the authors should also cite the work of others here which shows that although at very low levels, e.g. complex III (a drug target) and ATP synthase do assemble (Nina et al, 2011, JBC).

      Done

      • Page 3, line 80: "Irrespective of the shape of an organism's cristae, the crista junctions have been described as tubular channels that connect the cristae membrane to the inner boundary membrane (22, 24)." This omits the slit-shaped cristae junctions found in yeast (Davies et al, 2011, PNAS), which the authors should include.

      The paper and concept have been added to the manuscript, though the sentence has been moved up in the introduction, when crista junctions are first introduced.

      • Line 97: "poorly predicted N-terminal extension", as there is no experimental structure, we don't know if the prediction is poor. Presumably the authors mean either poorly ordered or the absence of secondary structure elements, or the poor confidence score for that region in the prediction? This should be clarified or corrected.

      We were referring to the poor confidence score. To address this comment as well as major point 2, we rewrote the respective paragraph. It now clearly states that confidence of the prediction is low, and we mention the tool that was used to identify conserved domains (Topology-based Evolutionary Domains).

      • Line 98: "an antiparallel array of ten β-sheets". They are actually two parallel beta-sheets stacked together. The authors could find out the name of this fold, but the confidence of the prediction is marked as low/very low. So, its existence is unknown, not just its "function".

      We adapted the domain description to "a stack of two parallel beta-sheets" and replaced the statement on unknown function by the statement "Because this domain is predicted solely from computational analysis, both its actual existence in the native protein and its biological function remain unknown."

      Fig 1B: The authors show two AlphaFold predictions of S. cerevisiae and P. falciparum Mic60 structures. There is, however, an experimental Mic60/19 (fragment) structure from the former organism (PMID: 36044574), which should be included if possible.

      We appreciate the reviewer's suggestion and note that the available structural data indeed provides valuable insight into how MIC60 and MIC19 interact. However, these structures represent fusion constructs of limited protein fragments and therefore capture only a small portion of each protein, specifically the interaction interface. Because our aim in Fig. 1B is to compare the overall domain architecture of the full‑length proteins, we believe that including fragment‑based structures would be less informative in this context.

      Lines 318-321: "The same trend was observed for PfMIC19 and PfMIC60. Although transcriptomic data suggested that low-level transcripts of PfMIC19 and PfMIC60 are present in ABS (38), we did not detect either of the proteins in ABS by western blot analysis." While this statement is true, the authors should comment on the sensitivity of the respective methods - how well was the antibody working in their hands and how do they interpret the absence of a WB band compared to transcriptomics data?

      The HA antibody used in our experiments is a standard commercial reagent that performs reliably in both WB and IFA, although it shows a low background signal in gametocytes. We agree that the sensitivity of the method and the interpretation of weak or absent bands should be addressed explicitly. Transcript levels for both PfMIC19 and PfMIC60 in asexual blood stages fall within the

      • Lines 322-323: would the authors not typically have expected an IFA signal given the strength of the band in Western blot? If possible, the authors should comment if the negative fluorescence outcome can indeed be explained with the low abundance or if technical challenges are an equally good explanation.

      Considering the nature of the investigated proteins (embedded in the IMM and spread throughout the mitochondria), difficulties in achieving a clear signal in IFA or U-ExM are not very surprising. While epitopes may remain buried in IFA, U-ExM usually increases accessibility for the antibodies. However, U-ExM is prone to dotty background signals, which can hide low-abundance, naturally dotty signals such as those of MICOS proteins that localize to distinct foci (at the CJ) along the mitochondrion. Current literature suggests that, in both human and yeast, STED is the preferred method for accurate spatial resolution of MICOS proteins (https://www.ncbi.nlm.nih.gov/pubmed/32567732, https://www.ncbi.nlm.nih.gov/pubmed/32067344). Unfortunately, we have neither experience with nor access to this technique.

      Lines 357-365: the authors describe limitations of the applied methods adequately. Perhaps it would be helpful to make a similar statement about the analysis of 3D objects like mitochondria and cristae from 2D sections. E.g. the apparent cristae length depends on whether cristae are straight (e.g. coiled structures do not display long cross sections despite their true length in 3D).

      The limitations of other methods are described in the respective results section.

      We added a clarifying sentence in the results section of Figure 4:

      "Note that such measurements do not indicate the true total length or width of cristae, as the data is two-dimensional. The recorded values are to be considered indicative of possible trends, rather than absolute dimensions of cristae."

      This statement refers to the length/width measurements of cristae.

      In the context of Figure 4 D we mention the following (see preprint lines 229 - 230): "We expect this effect to translate into the third dimension and thus conclude that the mean crista volume increases with the loss of either PfMIC19,PfMIC60, or both."

      For Figure 5, we included a clarifying statement in the results section of the preprint (lines 269 - 273): "Note that these mitochondrial volumes are not full mitochondria, but large segments thereof. As a result of the incompleteness of the mitochondria within the section, and the tomography specific artefact of the missing wedge, we were unable to confirm whether cristae were in fact fully detached from the boundary membrane, or just too long to fit within the observable z-range. "

      Line 404: perhaps undetected or similar would be a better description than "hidden"?

      The sentence does not exist in the revised manuscript

      Reviewer #3 (Significance (Required)):

      The main strength of the study is that it provides the first characterisation of the MICOS complex in P. falciparum, a human parasite in which the mitochondrion has been shown to be a drug target. Mic60 and the newly annotated Mic19 are confirmed to be essential for proper cristae formation and morphology, as well as overall mitochondrial morphology. Furthermore, the mutant lines are characterised for their ability to complete the parasite life cycle and defects in infection effectivity are observed. This work is an important first step for deciphering the role of MICOS in the malaria parasite and the composition and function of this complex in this organism. The limitation of the study stems from what is already known about MICOS and its subunits in other organisms. MICOS subunit knockouts have been characterised in great detail in yeast and humans with similar findings regarding loss of cristae and cristae defects. The findings of this study do not provide dramatic new insight on MICOS function or go substantially beyond the vast existing literature in terms of the extent of the study, which focuses on parasitological assays and morphological analysis. Exploring the role of MICOS in an early-divergent organism and human parasite is however important given the divergence found in mitochondrial biology and P. falciparum is a uniquely suited model system. One aspect that would increase the impact of the paper would be if the authors could mechanistically link the observed morphological defects to the decreased infection efficiency, e.g. by probing effects on mitochondrial function. This will likely be challenging as the morphological defects are diverse and the fitness defects appear moderate/mild.

      As suggested by Reviewer 2, we examined mitochondrial membrane potential in gametocytes using MitoTracker staining and did not observe any obvious differences associated with the morphological defects. At present, additional assays to probe mitochondrial function in P. falciparum gametocytes are not sufficiently established, and developing and validating such methods would require substantial work before they could be applied to our mutant lines. For these reasons, a more detailed mechanistic link between the observed morphological changes and the reduced infection efficiency is currently beyond reach.

      The advance presented in this study is to pioneer the study of MICOS in P. falciparum, thus widening our understanding of the role of this complex to a different model organism. This study will likely be mainly of interest for specialised audiences such as basic research parasitologists and mitochondrial biologists. My own field of expertise is mitochondrial biology and structural biology.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      MICOS is a conserved mitochondrial protein complex responsible for organising the mitochondrial inner membrane and the maintenance of cristae junctions. This study sheds first light on the role of two MICOS subunits (Mic60 and the newly annotated Mic19) in the malaria parasite Plasmodium falciparum, which forms cristae de novo during sexual development, as demonstrated by EM of thin section and electron tomography. By generating knockout lines (including a double knockout), the authors demonstrate that knockout of both MICOS subunits leads to defects in cristae morphology and a partial loss of cristae junctions. With a formidable set of parasitological assays, the authors show that despite the metabolically important role of mitochondria for gametocytes, the knockout lines can progress through the life stages and form sporozoites, albeit with diminished infection efficiency.

      Major comments:

      1) The authors should present their findings more clearly in the right context, in particular by:

      (i) giving a clearer description in the introduction of what is already known about the role of MICOS. This starts in the introduction, where one main finding is missing: loss of MICOS leads to loss of cristae junctions and the detachment of cristae membranes, which are nevertheless formed, but become membrane vesicles. This needs to be clearly stated in the introduction to allow the reader to understand the consistency of the authors' findings in P. falciparum with previous reports in the literature.

      (ii) at the end of the introduction, the motivating hypothesis is formulated ad hoc: "conclusive evidence about its involvement in the initial formation of cristae is still lacking" (line 83). If there is evidence in the literature that MICOS is strictly required for cristae formation in any organism, then this should be explained, because the bona fide role of MICOS is maintenance of cristae junctions (the hypothesis is still plausible and its testing important).

      2) Line 96-97: "Interestingly, PfMIC60 is much larger than the human MICOS counterpart, with a large, poorly predicted N-terminal extension." This statement is lacking a reference and presumably refers to annotated ORFs. The authors should clarify if the true N-terminus is definitely known - a 120kDa size is shown for the P. falciparum but this is not compared to the expected length or the size in S. cerevisiae.

      3) lines 244-245: "Furthermore, our data indicates the effect size increases with simultaneous ablation of both proteins?". The authors should explain which data they are referring to, as some of the data in Fig 3 and 4 look similar and all significance tests relate to the wild type, not between the different mutants, so it is not clear if any observed differences are significant. The authors repeat this claim in the discussion in lines 368-369 without referring to a specific significance test. This needs to be clarified.

      4) lines 304-306: "Though well established as the cristae organizing system, the role of MICOS in initial formation of cristae remains hidden in model organisms that constitutively display cristae.". This sentence is misleading since even in organisms that display numerous cristae throughout their life cycle, new cristae are being formed as the cells proliferate. Thus, failure to produce cristae in MICOS knockout lines would have been observable but has apparently not been reported in the literature. Thus, the concerted process in P. falciparum makes it a great model organism, but not fundamentally different to what has been studied before in other organisms.

      5) lines 373-378. "where ablation of just MIC60 is sufficient to deplete functionality of the entire MICOS (11, 15),". The authors' claim appears to be contrary to what is actually stated in ref 15, which they cite:

      "MICOS subunits have non-redundant functions as the absence of both MICOS subcomplexes results in more severe morphological and respiratory growth defects than deletion of single MICOS subunits or subcomplexes."

      This seems in line with what the authors show, rather than "different".

      6) lines 380-385: "... thus suggesting that membrane invaginations still arise, but are not properly arranged in these knockout lines. This suggests that MICOS either isn't fully depleted,...". These conclusions are incompatible with findings from ref. 15, which the authors cite. In that study, the authors generated a ∆MICOS line which still forms membrane invaginations, showing that MICOS is not required at all for this process in yeast. Hence the authors' implication that MICOS needs to be fully depleted before membrane invaginations cease to occur is not supported by the literature.

      7) The authors should consider if the first part of their title could be seen as misleading: It suggests that MICOS is "the architect" in cristae formation, but this is not consistent with the literature nor their own findings.

      Minor comments:

      • Line 43, of the three seminal papers describing the discovery of MICOS in 2011, the authors only cite two (refs 6 and 7), but miss the third paper, Hoppins et al, PMID: 21987634, which should probably be corrected.
      • Page 2, line 58: for a more complete picture the authors should also cite the work of others here which shows that although at very low levels, e.g. complex III (a drug target) and ATP synthase do assemble (Nina et al, 2011, JBC).
      • Page 3, line 80: "Irrespective of the shape of an organism's cristae, the crista junctions have been described as tubular channels that connect the cristae membrane to the inner boundary membrane (22, 24)." This omits the slit-shaped cristae junctions found in yeast (Davies et al, 2011, PNAS), which the authors should include.
      • Line 97: "poorly predicted N-terminal extension", as there is no experimental structure, we don't know if the prediction is poor. Presumably the authors mean either poorly ordered or the absence of secondary structure elements, or the poor confidence score for that region in the prediction? This should be clarified or corrected.
      • Line 98: "an antiparallel array of ten β-sheets". They are actually two parallel beta-sheets stacked together. The authors could find out the name of this fold, but the confidence of the prediction is marked as low/very low. So, its existence is unknown, not just its "function".
      • Fig 1B: The authors show two AlphaFold predictions of S. cerevisiae and P. falciparum Mic60 structures. There is however an experimental Mic60/19 (fragment) structure from the former organism (PMID: 36044574), which should be included if possible.
      • Lines 318-321: "The same trend was observed for PfMIC19 and PfMIC60. Although transcriptomic data suggested that low-level transcripts of PfMIC19 and PfMIC60 are present in ABS (38), we did not detect either of the proteins in ABS by western blot analysis." While this statement is true, the authors should comment on the sensitivity of the respective methods - how well was the antibody working in their hands and how do they interpret the absence of a WB band compared to transcriptomics data?
      • Lines 322-323: would the authors not typically have expected an IFA signal given the strength of the band in Western blot? If possible, the authors should comment if the negative fluorescence outcome can indeed be explained with the low abundance or if technical challenges are an equally good explanation.
      • Lines 357-365: the authors describe limitations of the applied methods adequately. Perhaps it would be helpful to make a similar statement about the analysis of 3D objects like mitochondria and cristae from 2D sections. E.g. the apparent cristae length depends on whether cristae are straight (e.g. coiled structures do not display long cross sections despite their true length in 3D).
      • Line 404: perhaps undetected or similar would be a better description than "hidden"?

      Significance

      The main strength of the study is that it provides the first characterisation of the MICOS complex in P. falciparum, a human parasite in which the mitochondrion has been shown to be a drug target. Mic60 and the newly annotated Mic19 are confirmed to be essential for proper cristae formation and morphology, as well as overall mitochondrial morphology. Furthermore, the mutant lines are characterised for their ability to complete the parasite life cycle and defects in infection effectivity are observed. This work is an important first step for deciphering the role of MICOS in the malaria parasite and the composition and function of this complex in this organism.

      The limitation of the study stems from what is already known about MICOS and its subunits in other organisms. MICOS subunit knockouts have been characterised in great detail in yeast and humans with similar findings regarding loss of cristae and cristae defects. The findings of this study do not provide dramatic new insight on MICOS function or go substantially beyond the vast existing literature in terms of the extent of the study, which focuses on parasitological assays and morphological analysis.

      Exploring the role of MICOS in an early-divergent organism and human parasite is however important given the divergence found in mitochondrial biology and P. falciparum is a uniquely suited model system. One aspect that would increase the impact of the paper would be if the authors could mechanistically link the observed morphological defects to the decreased infection efficiency, e.g. by probing effects on mitochondrial function. This will likely be challenging as the morphological defects are diverse and the fitness defects appear moderate/mild.

      The advance presented in this study is to pioneer the study of MICOS in P. falciparum, thus widening our understanding of the role of this complex to a different model organism. This study will likely be mainly of interest for specialised audiences such as basic research parasitologists and mitochondrial biologists. My own field of expertise is mitochondrial biology and structural biology.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Major comments:

      1) In my opinion, the authors tend to sensationalize or overinterpret their results. The title of the manuscript is very misleading. While MICOS is certainly important for crista formation, it is not the only factor, as ATP synthase dimer rows make a highly significant contribution to crista morphology. Thus, one can argue with equal validity that ATP synthase should be considered the 'architect', as it is the conformation of the dimers and rows that modulates positive curvature. Secondly, while cristae are still formed upon mic60/mic19 gene knockout (KO), they are severely deformed, and likely dysfunctional (see below). Thus, I do not agree with the title that MICOS is dispensable for crista formation, because the authors' results show that it clearly is essential. So, the title should be changed.

      The Discussion section starting from line 373 also suffers from overinterpretation as well as being repetitive and hard to understand. The authors infer that MICOS stability is compromised less in the single KOs (sKO) compared to the mic60/mic19 double KO (dKO). MICOS stability was never directly addressed here and the composition of the MICOS complex is unaddressed, so it does not make sense to speculate by such tenuous connections. The data suggest to me that mic60 and mic19 are equally important for crista formation and crista junction (CJ) stabilization, and the dKO has a more severe phenotype than either KO, further demonstrating neither is epistatic.

      The following paragraphs (line 387 to 422) continue with such unnecessary overinterpretation to the point that it is confusing and contradictory. Line 387 mentions an 'almost complete loss of CJs' and then line 411 mentions an increase in CJ diameter, both upon Mic60 ablation. I do not think this discussion brings any added value to the manuscript and should be shortened. Yes, maybe there are other putative MICOS subunits that may linger in the KOs that are further destabilized in the dKO, or maybe Mic60 remains in the mic19 KO (and vice versa) to somehow salvage more CJs, which is not possible in the dKO. It is impossible to say with confidence how ATP synthase behaves in the KOs with the current data.

      2) While the authors went to impressive lengths to detect any effect on lifecycle progression, none was found except for a reduction in oocyst count. However, the authors did not address any direct effect on mitochondria, such as OXPHOS complex assembly, respiration, membrane potential. This seems like a missed opportunity, given the team's previous and very nice work mapping these complexes by complexome profiling. However, I think there are some experiments the authors can still do to address any mitochondrial defects using what they have and not resorting to complexome profiling (although this would be definitive if it is feasible):

      i) Quantification of MitoTracker Red staining in WT and KOs. The authors used this dye to visualize mitochondria to assay their gross morphology, but unfortunately not to assay membrane potential in the mutants. The authors can compare relative intensities of the different mitochondria types they categorized in Fig. 3A in 20-30 cells to determine if membrane potential is affected when the cristae are deformed in the mutants. One would predict they are affected.

      ii) Sporozoites are shown in Fig S5. The authors can use the same set up to track their motion, with the hypothesis that they will be slower in the mutants compared to WT due to less ATP. This assumes that sporozoite mitochondria are active as in gametocytes.

      iii) Shotgun proteomics to compare protein levels in mutants compared to WT, with the hypothesis that OXPHOS complex subunits will be destabilized in the mutants with deformed cristae. This could be indirect evidence that OXPHOS assembly is affected, resulting in destabilized subunits that fail to incorporate into their respective complexes.

      To expedite resubmission, the authors can restrict the cell lines to WT and the dKO, as the latter has a stronger phenotype than the individual KOs and conclusions from this cell line are valid for overall conclusions about Plasmodium MICOS.

      I will also conclude that complexome/shotgun proteomics may be a useful tool also for identifying other putative MICOS subunits by determining if proteins sharing the same complexome profile as PfMic60 and Mic19 are affected. This would address the overinterpretation problem of point 1.

      3) I am aware of the authors' previous work in which they were not able to detect cristae in ABS, and thus have concluded that these are truly acristate. This can very well be true, or there can be immature cristae forms that evaded detection at the resolution they used in their volumetric EM acquisitions. The mitochondria and gametocyte cristae are pretty small anyway, so it is not unreasonable to assume that putative rudimentary cristae in ABS may be even smaller still. Minute levels of sampled complex III and IV plus complex V dimers in ABS that were detected previously by the authors by complexome profiling would argue for the presence of minuscule and/or very few cristae.

      I think that the authors should hedge their claim that ABS is acristate by briefly stating that there still is a possibility that minuscule cristae may have been overlooked previously.

      This brings me to the claim that Mic19 and Mic60 proteins are not expressed in ABS. This is based on the lack of signal from the epitope tag; a weak signal is detected in gametocytes. Thus, one can counter that Mic19 and Mic60 are also expressed, but below the expression limits of the assay, as the protein exhibits low expression levels when mitochondrial activity is upregulated.

      To address this point, the authors should determine if mature mic60 and mic19 mRNAs are detected in ABS in comparison to the dKO, which will lack either transcript. RT-qPCR using polyT primers can be employed to detect these transcripts. If the levels of these mRNAs in WT ABS are equivalent to those in the dKO, the authors can make a pretty strong case for the absence of cristae in ABS.

      4) The major finding of the manuscript, a Mic19 analog in Plasmodium, should be highlighted. As far as I know, this manuscript could represent the first instance of Mic19 outside of opisthokonts that was not found by sensitive profile HMM searches and certainly the first time such a Mic19 was functionally analyzed.

      They should highlight the twin CX9C motifs that are a hallmark of Mic19 and other proteins that undergo oxidative folding via the MIA pathway. Interestingly, the Mia40 oxidoreductase that is central to MIA in yeast and animals, is absent in apicomplexans (DOI: 10.1080/19420889.2015.1094593).

      Did the authors try to align Plasmodium Mic19 orthologs with conventional Mic19s? This may reveal some conserved residues within and outside of the CHCH domain.

      5) Statistical significance. Sometimes my eyes see population differences that are considered insignificant by the statistical methods employed by the authors, e.g. Fig. 4E, mutants compared to WT, especially the dKO. Have the authors considered using other methods such as Student's t-test for pairwise comparisons?

      Minor comments:

      Line 33. Anaerobes (e.g. Giardia) have mitochondria that do not produce ATP, unlike aerobic mitochondria.

      Line 56: Unclear what authors mean by "canonical model of mitochondria"

      Lines 75-76: This applies to Mic10 only

      Line 80: Cite DOI: 10.1016/j.cub.2020.02.053

      Fig 2D: I find this table difficult to read. If the authors keep the table format, at least get rid of the 'mean' column, as this data is better depicted in 2C. I suggest depicting this data as in 3B, showing the portion of infected vs unaffected flies in all experiments, then moving the modified table to the supplement. It is important to point out that experiment 5 appears to be an outlier with reduced infectivity across all cell lines, including WT.

      Fig. 3C-G: I feel like these data repeatedly lead to same conclusions. These are all different ways of showing what is depicted in Fig 2B: mitochondria gross morphology is affected upon ablation of MICOS. I suggest that these graphs be moved to supplement and replaced by the beautiful images

      Line 180: Be more specific with which tubulin isoform is used as a male marker and state why this marker was used in supplemental Fig S6.

      Line 196 and Fig 3C: the word 'intensities' in this context is very ambiguous. Please choose a different term (puncta, elements, parts?). This is related to major point 2i above.

      Line 222: Report male/female crista measurements

      Fig. 4B-E: depict data as violin plots or scatter plots like Fig. 2C to get a better grasp of how the crista coverage is distributed. It seems like the data spread is wider in the double KO. This would also solve the problem with the standard deviation extending beyond 0%.

      Lines 331-333: Please clarify that this applies for some, but not all MICOS subunits. Please also see major point 1 above. Also, the authors should point out that despite their structural divergence, trypanosomal cryptic mitofilins Mic34 and Mic40 are essential for parasite growth, in contrast to their findings with PfMic60 (DOI: https://doi.org/10.1101/2025.01.31.635831).

      Line 320: incorrect citation. Related to point 1 above.

      Lines 333-335. This is related to the above. Again, some subunits appear to affect cell growth under lab conditions, and some do not. This and the previous sentence should be rewritten to reflect this.

      Line 343-345: The sentence and citation 45 are strange. Regarding the former, it is about CHCHD10, whose status as a bona fide MICOS subunit is very tenuous, so I would omit this. About the phenomenon observed, I think it makes more sense to write that Mic60 ablation results in partially fragmented mitochondria in yeast (Rabl et al., 2009 J Cell Biol. 185: 1047-63). Mitochondrial fragmentation is often a physiological response to stress. I would just rewrite so as not to imply that mitochondrial fission (or fusion) is impaired in these KOs, or at least present this as one of several possibilities.

      Line 373: 'This indicates' is too strong. I would say 'may suggest' as you have no proof that any of the KOs disrupts MICOS. This hypothesis can be tested by other means, but not by penetrance of a phenotype.

      Line 376-377; 'deplete functionality' does not make sense, especially in the context of talking about MICOS subunit stability. In my opinion, this paragraph overinterprets the KO effects on MICOS stability. None of the experiments address this phenomenon, and thus the authors should not try to interpret their results in this context. See major point 1.

      Other suggestions for added value

      1) Does Plasmodium Sam50 co-fractionate with Mic60 and Mic19 in BN PAGE (Fig. 1E)?

      2) Can Alphafold3 predict a heterotetramer of PfMic60? What about the four Mic19 and Mic60 subunits together? Is this tetramer consistent with the Bock-Bierbaum model? Is this model consistent with the CJ diameter measured in Plasmodium, which is perhaps better evidence than that in lines 419-422?

      Significance

      The manuscript by Tassan-Lugrezin is predicated on the idea that Plasmodium represents the only system in which de novo crista formation can be studied. They leverage this system to ask the question whether MICOS is essential for this process. They conclude based on their data that the answer is no, which the authors consider unprecedented. But even if their claim is true that ABS is acristate, this supposed advantage does not really bring any meaningful insight into how MICOS works in Plasmodium.

      First the positives of this manuscript. As has been the case with this research team, the manuscript is very sophisticated in the experimental approaches that are made. The highlights are the beautiful and often conclusive microscopy performed by the authors. Only the localization of Mic60 and Mic19 was inconclusive due to their very low expression unfortunately.

      The examination of the MICOS mutants during the in vitro life cycle of Plasmodium falciparum is extremely impressive and yields convincing results. Mitochondrial deformation is tolerated by life cycle stage differentiation, with a modest but significant reduction of oocyst production being observed.

      However, despite the herculean efforts of the authors, the manuscript as it currently stands represents only a minor advance in our understanding of the evolution of MICOS, which from the title and focus of the manuscript, is the main goal of the authors.

      In its current form, the manuscript reports some potentially important findings:

      1) Mic60 is verified to play a role in crista formation, as is predicted by its orthology to other characterized Mic60 orthologs.

      2) The discovery of a novel Mic19 analog (since the authors maintain there is no significant sequence homology), which exhibits a similar (or the same?) complexome profile with Mic60. This protein was upregulated in gametocytes like Mic60 and phenocopies Mic60 KO.

      3) Both of these MICOS subunits are essential (not dispensable) for proper crista formation

      4) Surprisingly, neither MICOS subunit is essential for in vitro growth or differentiation from ABS to sexual stages, and from the latter to sporozoites. This says more about the biology of plasmodium itself than anything about the essentiality of Mic60, ie plasmodium life cycle progression tolerates defects to mitochondrial morphology. But yes, I agree with the authors that Mic60's apparent insignificance for cell growth in examined conditions does differ with its essentiality in other eukaryotes. But fitness costs were not assayed (eg by competition between mutants and WT in infection of mosquitoes)

      5) Decreased fitness of the mutants is implied by a reduction of oocyst formation.

      While interesting in their own way, collectively they do not represent a major advance in our understanding of MICOS evolution. Furthermore, the findings bifurcate into categories informing MICOS or Plasmodium biology. Both aspects are somewhat underdeveloped in their current form.

      This is unfortunate because there seem to be many missed opportunities in the manuscript that could, with additional experiments, lead to a manuscript with much wider impact.

      For me, what is remarkable about Plasmodium MICOS that sets it apart from other iterations is the apparent absence of the Mic10 subunit. Purification of plasmodium MICOS via the epitope tagged Mic60 and Mic19 could have verified that MICOS is assembled without this core subunit. Perhaps Mic60 and Mic19 are the vestiges of the complex, and thus operate alone in shaping cristae. Such a reduction may also suggest the declining importance of mitochondria in plasmodium.

      Another missed opportunity was to assay the impact of MICOS depletion on OXPHOS in Plasmodium. This is a salient issue as maybe crista morphology is decoupled from OXPHOS capacity in Plasmodium, which links to the apparent tolerance of mitochondrial morphology in cell growth and differentiation. I suggested in section A experiments to address this deficit.

      Finally, the authors could assay fitness costs of MICOS-ablation and associated phenotypes by assaying whether mosquito infectivity is reduced in the mutants when they are directly competing with WT Plasmodium. Like the authors, I am also surprised that MICOS mutants can pass population bottlenecks represented by differentiation events. Perhaps the apparent robustness of differentiation may contribute to Plasmodium's remarkable ability to adapt.

      I realize that the authors put a lot of effort into their study and again, I am very impressed by the sophistication of the methods employed. Nevertheless, I think there are still better ways to increase the impact of the study aside from overinterpreting the conclusions from the data. But this would require more experiments along the lines I suggest in Section A and here.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Summary: This manuscript reports the identification of putative orthologues of mitochondrial contact site and cristae organizing system (MICOS) proteins in Plasmodium falciparum, an organism that unusually has an acristate mitochondrion during the asexual part of its life cycle and then develops cristae as it enters the sexual stage of its life cycle and beyond into the mosquito. The authors identify PfMIC60 and PfMIC19 as putative members and study these in detail. They add HA tags to both proteins, examine the timing of expression during the parasite life cycle, and attempt (unsuccessfully) to localise them within the parasite. They also genetically deleted both genes singly and in combination and phenotyped the effect on parasite development. They show that both proteins are expressed in gametocytes and not asexuals, suggesting they are present at the same time as cristae development. They also show that the proteins are dispensable for the entire parasite life cycle investigated (asexuals through to sporozoites), although there is some reduction in mosquito transmission. Using EM techniques, they show that the morphology of gametocyte mitochondria is abnormal in the knockout lines, although there is great variation.

      Major comments: The manuscript is interesting and is an intriguing use of a well studied organism of medical importance to answer fundamental biological questions. My main comments are that there should be greater detail in areas around methodology and statistical tests used. Also, the mosquito transmission assays (which are notoriously difficult to perform) show substantial variation between replicates and the statistical tests and data presentation are not clear enough to conclude the reduction in transmission that is claimed. Perhaps this could be improved with clearer text?

      More specific comments to address:

      Line 101/Fig1E (and figure legend) - What is this heatmap showing? It would be helpful to have a sentence or two linking it to a specific methodology. I could not find details in the M+M section and "specialized, high molecular mass gels" does not adequately explain what experiments were performed. The reference to Supplementary Information 1 also did not provide information.

      Line 115 and Supplementary Figure 2C + D - The main text says that the transgenic parasites contained a mitochondrially localized mScarlet for visualization and localization, but Supplementary Figure 2 shows MitoTracker labelling rather than mScarlet. This is very confusing. The figure legend also mentions both mScarlet and MitoTracker. I assume that mScarlet was used to view in regular IFAs (Fig S2C) and the MitoTracker was used for the expansion microscopy (Fig S2D)? Please clarify.

      Figure 2C - What is the statistical test being used (the methods say "Mean oocysts per midgut and statistical significance were calculated using a generalized linear mixed effect model with a random experiment effect under a negative binomial distribution." but what test is this?)? Also, a log10 scale for oocyst intensity is an unusual choice - how are the mosquitoes with 0 oocysts being represented on this graph? It looks like they are being plotted at 10^-1 (which would be 0.1 oocysts in a mosquito, which would be impossible).

      Figure 2D - It is great that the data from all feeding replicates has been shared; however, it is difficult to conclude any meaningful impact on transmission with the knockout lines when there is so much variation and so few mosquitoes dissected for some datapoints (10 mosquitoes is a very small sample size). For example, Exp1 shows a clear decrease in mic19- transmission, but then Exp2 does not really show as great an effect. Similarly, why does the double knockout have better transmission than the single knockouts? Surely there would be a greater effect?

      Figure 3 legend - Please add which statistical test was used and the number of replicates.

      Figure 4 legend - Please add which statistical test was used and the number of replicates.

      Figure 5C - The 3D reconstructions are very nice, but what does the red and yellow coloring show?

      Line 352 - "Still, it is striking that, despite the pronounced morphological phenotype, and the possibly high mitochondrial stress levels, the parasites appeared mostly unaffected in life cycle propagation, raising questions about the functional relevance of mitochondria at these stages." How do the authors reconcile this statement with the proven fact that mitochondria-targeted antimalarials (such as atovaquone) are very potent inhibitors of parasite mosquito transmission?

      Significance

      This manuscript is a novel approach to studying mitochondrial biology and does open a lot of unanswered questions for further research directions. Currently there are limitations in the use of statistical tests and detail of methodology, but these could easily be addressed with a bit more analysis/better explanation in the text. This manuscript could be of interest to readers with a general interest in mitochondrial cell biology and those within the specific field of Plasmodium research.

      My expertise is in Plasmodium cell biology.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #3

      Evidence, reproducibility and clarity

      This study builds upon previous work in schizophrenia and other disorders using fibroblasts derived from patients, assessing mitochondrial phenotypes and then using these to identify compounds which reverse these phenotypes. The study is one of the largest of its kind performed to date, with 168 patients included. The authors undertake mitochondrial phenotyping and machine learning of the output images to segregate the patients based on clinical features and the associated cellular phenotype. The authors then go on to virtually screen publicly available datasets of cancer cells treated with compounds and also genetic modulations. In doing so, they can identify compounds which modulate the phenotypes and therefore might be of value to test in the patient-derived lines. The study has strengths in the large number of samples, the advanced machine learning and the virtual screening. Furthermore, the authors highlight and discuss the limitations of the study well. There are some weaknesses which the authors can address. Firstly, although the introduction is comprehensive in some areas, in others, for example in outlining the fibroblast mitochondrial phenotype and the use of patient fibroblasts to identify compounds, significant literature is missing, particularly in Parkinson's Disease, where screening in fibroblasts has resulted in compounds entering Phase 3 clinical trials. In addition, the studies using 100 or more PD patient fibroblast lines for phenotyping and patient stratification have not been included. It would be useful if the authors could comment on the robustness of the phenotypes identified in the fibroblasts over multiple passages. This is important when considering the biological and disease relevance of the phenotypes, and it is not something the authors show or comment on.
      In discussing the genetic manipulations, it would be useful to comment on the genes identified in more detail, particularly those which are not known to be associated with changes in mitochondrial phenotypes.

      Significance

      This study builds on work from multiple labs investigating the utility of fibroblasts to identify phenotypes and find potential novel therapeutics. The size of the cohort and the advanced machine learning methods are a particular strength and this advances the field in this area. The availability of the data and code is a strength to allow others to replicate the findings. The lack of experimental validation of any of the compounds or genes identified by the virtual screening is a weakness which could be addressed.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      In their study, Haghighi et al. seek to build upon prior literature linking alterations in mitochondrial network distribution with various kinds of psychosis. Correlations between subcellular mitochondrial localization and different psychological states are an interesting and potentially fruitful frontier and should be explored; however, despite their ambitious strategy to screen 168 skin fibroblasts from patients experiencing psychosis, and examine various online image databases, there is a concerning number of issues related to the image-analysis approach. The foremost of these is a lack of direct measures of mitochondrial distribution, which might serve to validate their proposed MITO-SLOPE protocol. There is also a worrisome lack of robust controls, which are critical in light of how admittedly subtle some of the distribution phenotypes may be. Overall, the aim to screen differences in mitochondrial distribution is a laudable goal and, in the context of psychological disorders, could be helpful in identifying new therapeutic targets; but the methodology employed in this study does not seem to be sufficiently rigorous to be able to leverage this approach for screening purposes.

      I have extensive experience investigating mitochondria with advanced imaging technologies, including super-resolution microscopy as well as high-throughput and 4D imaging modalities. I am also familiar with standard as well as machine-learning approaches for quantifying mitochondrial morphology as well as distribution or trafficking. In my opinion, this study requires substantial revision, both in terms of the indirect and often opaque image-analysis pipeline as well as the inclusion of orthogonal experiments, which could serve to lessen concerns regarding purported differences in mitochondrial distribution, which are so difficult to discern as to be imperceptible. It is worth noting, too, that this study appears to be predicated, in many ways, upon a 2010 study (Cataldo et al.) of mitochondria in patients with bipolar disorder, which appears to reflect its own lack of critical controls for cell size.

      Major comments:

      The authors state, in the first paragraph of the results section: "By eye, we observed that samples from patients in the control and MDD categories show a more fine-grained, dispersed mitochondrial network extending to the edges of the cell, whereas patients in the categories experiencing psychosis tend to show an agglomerated, thicker network more concentrated around the nucleus. The pattern is subtle and heterogeneous across a cell population." The pattern is indeed subtle. I am concerned that it is so subtle as to be imperceptible. Firstly, it is important to note that the mitochondrial reticulum in BP, SZ, and SZA is more difficult to differentiate, by eye, because the signal appears to be saturated in places, such that the boundaries of individual mitochondria are indistinguishable due to differences in contrast or possibly from the fluorescence intensity itself. Although the authors indicate in the legend that the intensity of the mitochondrial fluorescence was adjusted "for visual clarity," it appears that the contrast needs to be decreased in the BP, SZ, and SZA conditions. It is also important to note that MitoTrackers load into mitochondria in a membrane-potential-dependent fashion. Did the authors detect differences in membrane potential between these groups? While imaging, was the same laser power and gain utilized from condition to condition? With this being said, it is not clear that mitochondria in control and MDD categories have different morphologies from the other conditions. It is also not clear what "fine-grained" means in this context. Is this a comment on aspect ratio? If so, it would be better to use standard terminology. (Why are there large red circular structures in the nucleus? These are likely not mitochondria, so why are they showing up in the channel with MitoTracker?) It is also not evident that one condition has more dispersed mitochondria than another. 
      Given that the authors appear to be making this a central claim of their manuscript, it would seem appropriate to highlight specifically the regions of the different cells that they believe exhibit meaningful differences. If I attempt to look at the merged image, which is important because it is really the only way that one can gauge the relative distance of the mitochondrial network from the edge of the cell, there would seem to be no obvious differences between the conditions. Another key point that I think is important to mention concerns Cataldo et al., 2010, which is frequently referenced in this manuscript and indicates that mitochondria in fibroblasts from patients with bipolar disorder (BD) are more perinuclear than those in control. However, a cursory inspection of the images from this study (e.g., Figure 2A-B; Figure 4A-D; and Figure 6A-H) unambiguously demonstrates that the BD cells are smaller than the control cells. Of course, if the cells are smaller, the distance from the nucleus will tend to be shorter. In Cataldo et al., 2010, the authors state, "We also measured cell area, cell length, cell width, and cell perimeter of the fibroblasts used in this analysis to verify that the observed mitochondrial distributional differences were not simply a result of BD cells being smaller, shorter, or fatter. No significant differences in any of these measurements were seen based on diagnosis after two sample t tests." Notably, the data is not shown, so it is difficult to appreciate what the variance of the population of cells from control and BD would look like, but it must be said, nevertheless, that the representative images in this paper all point to the BD cells being smaller. In light of this, it would be helpful if Haghighi et al. could add scale bars to all the images (e.g., in Figure 2), so readers can ascertain whether all the cells are portrayed at the same scale and are of similar areas.

      As the authors indicate, interpretable measures of mitochondrial morphology include values like size and shape. It is concerning, therefore, that Figure 3 purports to identify a number of significantly different mitochondrial "features" in the patient groups experiencing psychosis, but they do not appear to make an effort to clarify how any of these features might reflect ground truths of mitochondrial architecture, which can be understood directly by values such as aspect ratio, circularity, area, number of organelles, number of nodes or branching points in a network, etc. Unless the authors can specifically tie their machine-learning classifications to standard mitochondrial shape descriptors, their classifications will remain opaque and therefore of limited credibility or value. One way to improve the validation of their machine-learning classification methods would be to use empirically sound methods for manipulating mitochondrial morphology and distribution, which could serve as positive or negative controls. For example, treatment of cells with the uncoupler FCCP would induce mitochondrial fragmentation, treatment with cycloheximide results in stress-induced mitochondrial hyperfusion (SIMH), or treatment with Nocodazole would block mitochondrial trafficking. Treating control cells with these chemicals would help to establish baseline measurements for how far the patient cells are deviating from untreated controls, in one direction or another. Such considerations, I think, are especially important when the mitochondrial phenotypes are so subtle. I agree with the authors' argument that, for the purposes of screening, it is best to focus on a single metric. Based on their apparent discernment of the subtle differences in mitochondrial distribution in patients experiencing psychosis, they opted to examine possible differences in network density. To this end, they developed "MITO-SLOPE."
      Out of multiple categories of features, they highlight the following as the most powerful for establishing differences in mitochondrial network density:

      "(a) A subset of texture measures in the nuclei and cytoplasm area of the mito channel. (b) A subset of features measuring the intensity of the mitochondria area across the cell."

      Within the concentric bins around the cell nuclei, they measure:

      • FracAtD: Fraction of total stain in an object at a given radius.
      • MeanFrac: Mean fractional intensity at a given radius, calculated as the fraction of total intensity normalized by the fraction of pixels at a given radius.
      • RadialCV: Coefficient of variation of intensity within a ring, calculated across 8 slices.
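
      To make the first two quoted definitions concrete, the following is a minimal NumPy sketch of how FracAtD and MeanFrac could be computed for one cell. It is not the authors' pipeline: the function name `radial_fractions`, the use of straight-line distance from a supplied nuclear centre, and the scaling of distance by the farthest in-cell pixel are all assumptions made for illustration.

```python
import numpy as np

def radial_fractions(image, cell_mask, center, n_bins=4):
    """Per-bin FracAtD and MeanFrac for one cell (illustrative sketch only).

    image     : 2D array of mitochondrial stain intensity
    cell_mask : 2D boolean array, True inside the cell
    center    : (row, col) of the nuclear centre (assumed to be given)
    """
    rows, cols = np.indices(image.shape)
    dist = np.hypot(rows - center[0], cols - center[1])
    # Assumption: scale distance so the farthest in-cell pixel sits at 1.0.
    dist_norm = dist / dist[cell_mask].max()
    bin_idx = np.minimum((dist_norm * n_bins).astype(int), n_bins - 1)

    total_intensity = image[cell_mask].sum()
    total_pixels = cell_mask.sum()
    frac_at_d = np.zeros(n_bins)
    mean_frac = np.zeros(n_bins)
    for b in range(n_bins):
        in_bin = cell_mask & (bin_idx == b)
        frac_pixels = in_bin.sum() / total_pixels
        # FracAtD: fraction of the total stain that falls in this ring.
        frac_at_d[b] = image[in_bin].sum() / total_intensity
        # MeanFrac: that fraction normalised by the ring's share of pixels.
        mean_frac[b] = frac_at_d[b] / frac_pixels if frac_pixels > 0 else np.nan
    return frac_at_d, mean_frac
```

      Under these assumptions, a uniformly stained cell gives a MeanFrac of 1 in every ring, so departures from 1 indicate stain that is enriched or depleted at a given radius. Both quantities depend on stain intensity and on where the cell boundary is drawn.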

      While the authors have recommended the use of a single metric for purposes of screening, MITO-SLOPE appears to represent a bundle of metrics, which, in the end, do not amount to a clear readout of what is being measured. From my point of view, if one were interested in measuring mitochondrial distribution, then, in an ideal situation, one would measure the average distance of all the mitochondria from the center of the nucleus. And, since the size of the cell is critical for establishing relative distances to the boundaries or periphery of the cell, one would normalize this metric by cellular area. Thus, the readout would be: [average mitochondrial distance from the nuclear center (µm)]/[cellular area (µm²)]. An even simpler metric could be: [average mitochondrial distance from nuclear center (µm)]/[average cytoplasmic radius (µm)]. When talking about mitochondrial distribution, we typically think in terms of where the mitochondrial network is, on average, in relation to the nucleus (perinuclear) or to the edge of the cell (peripheral). By quantifying the actual mean distance of the mitochondrial network in relation to both the nucleus and the bona fide cell extremities, via the metrics I described above, one can obtain direct measurements of the truly meaningful values related to mitochondrial distribution. It seems that deviating from these approaches introduces more and more opportunities for confounding variables.
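
      As an illustration of the simpler of the two readouts described above, here is a minimal sketch assuming that binary masks for the segmented mitochondria and for the whole cell are available. The function name `normalized_mito_distance` is hypothetical, and the equivalent circular radius sqrt(area/π) stands in for the "average cytoplasmic radius", which is an assumption rather than a definition taken from the manuscript.

```python
import numpy as np

def normalized_mito_distance(mito_mask, cell_mask, nucleus_center):
    """Mean mitochondrial distance from the nuclear centre / cell radius.

    mito_mask      : 2D boolean array, True on segmented mitochondria
    cell_mask      : 2D boolean array, True inside the cell
    nucleus_center : (row, col) of the nuclear centre
    Returns a dimensionless ratio; pixel size cancels out.
    """
    rows, cols = np.nonzero(mito_mask & cell_mask)
    if rows.size == 0:
        return float("nan")  # no mitochondria detected in this cell
    mean_dist = np.hypot(rows - nucleus_center[0],
                         cols - nucleus_center[1]).mean()
    # Assumption: use the radius of a circle with the same area as the cell.
    equiv_radius = np.sqrt(cell_mask.sum() / np.pi)
    return float(mean_dist / equiv_radius)
```

      For a roughly circular cell with mitochondria spread evenly throughout, this ratio is close to 2/3; a perinuclear network gives a smaller value and a peripheral one a larger value. Because the readout uses mask geometry rather than fluorescence intensity, it is less exposed to membrane-potential-dependent dye loading, although it still relies on a trustworthy cell boundary.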

      However, the MITO-SLOPE analysis does not seem to consider this metric. Is this, or a similar variation, not the most direct way to establish differences in the mitochondrial network distribution? I would, of course, at least want to see a discussion of why the authors have not chosen to use the most direct form of quantification for this purely spatial value. Why opt for a multifaceted measurement of a relatively straightforward quantity, when a simpler form of quantification would not only suffice but arguably be more likely to capture the ground truth? With this being said, it is not clear to me why, within MITO-SLOPE there seems to be a reliance on measuring the "intensity" of the mitochondria. (And what intensity is it? Mean intensity per ROI?) Of course, particularly if MitoTrackers were used for staining mitochondria, there will be heterogeneity in fluorescence intensity from organelle to organelle, which introduces potential confounders into the workflow. Furthermore, as indicated above, to know if the subcellular distribution of mitochondria is truly altered, it is essential to know if the cell size has likewise changed. Therefore, any unbiased measure of mitochondrial distribution must take into consideration the size of the cell; however, based on the information provided about MITO-SLOPE, it does not appear that the authors are accounting for possible variations in cell size that might account for alterations in mitochondrial network distribution - i.e., a smaller cell will have a more constrained area in which mitochondria will be able to disperse - thus, not accounting for cell size (area) will yield ambiguous results. For example, how can we know if mitochondrial motility is impaired or if the cell is simply smaller and there is less space in which to move? Another complexity, here, is if the cell boundaries were not accounted for via staining of actin, etc., then establishing a true cell boundary will be very challenging. 
How many bins are sufficient to capture the whole cell? Just 12? Furthermore, human fibroblasts have a tendency to be quite large (sometimes several hundred microns from end to end); how can the authors account for the whole cell, particularly in cases where part of the cell is beyond the field of view or cells are growing on top of each other, as is often the case?

      In Figure 6, there is no control image that could be used as a frame of reference. I have extensive experience imaging A549 cells. The mitochondria in these images appear to be highly fragmented. The staining patterns, particularly of the cells treated with divalproex-sodium, are quite dim, indicating mitochondrial depolarization. Of course, depolarization affects the fluorescence intensity of mitochondria stained with vital dyes, such as MitoTrackers, which will, in turn, presumably affect the values obtained from MITO-SLOPE, which appear to rely on intensity gradients, rather than more concrete spatial coordinates. Also, as indicated above, it is unclear how the authors are establishing the edges of cells without a marker of the plasma membrane or cytoskeleton.

      The authors note that "Divalproex-sodium is a benzodiazepine receptor agonist and HDAC inhibitor (Rahman et al. 2025) used to manage a variety of seizure disorders (Willmore 2003) and bipolar disorder (Bond et al. 2010; Cipriani et al. 2013); it shows a positive MITO-SLOPE which is the direction expected to normalize the centralized mitochondrial localization associated with psychosis." Insofar as this recommends the drug for use in "normalizing" perinuclear mitochondria within neurons, it would seem only prudent to mention that this drug also appears to induce mitochondrial depolarization and fragmentation, which are both associated with a range of severe human pathologies. I would caution the authors to not highlight one potential benefit while omitting an obvious side effect involving what appears to be significant perturbation of mitochondrial structure and function. What is the point of normalizing mitochondrial distribution if the mitochondria being redistributed are dysfunctional?

      The authors note, in Figure 7, that their MITO-SLOPE analysis was unable to discern a statistically significant difference in cells with specific knockouts of genes associated with mitochondrial trafficking. If the MITO-SLOPE cannot discern a difference in the context of a substantial abrogation of mitochondrial transport capacity, how is it that it could detect meaningful differences where there is only a "subtle" change in distribution? This result would seem to militate strongly against the efficacy of this analysis pipeline and would not recommend its use for unbiased screening and discovery.

      Minor comments:

      For Figure 6 b and c, "µm" should be "µM."

      The introduction and discussion could be more concise.

      Significance

      This study attempts to fill an important gap in knowledge relating to mitochondrial distribution and psychological disorders. It aims to perform an initial screen to try to validate a novel analysis pipeline called MITO-SLOPE, however, the study appears to lack analytical rigor, both in terms of the underlying cell biology together with the approach for quantification, itself. Conceptually, this study has great promise, but the authors will need to improve their pipeline prior to publication, which will likely require fundamental revisions, including an array of orthogonal measures (largely lacking here) as well as detailed demonstrations of how the segmentation actually works and ultimately yields data reflecting demonstrable mitochondrial trafficking/distribution defects.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The manuscript by Haghighi and McPhie et al. builds upon their previous findings by exploring the mitochondrial localization as a disease-associated phenotype in mental disorders, particularly in psychotic disorders. They recruited a cohort of patients diagnosed with schizophrenia, schizoaffective disorder, bipolar disorder and MDD. By taking advantage of skin biopsies, they screened patient-derived fibroblasts for aberrant mitochondrial localization and morphology using common staining techniques. Then, they use a machine learning approach to classify patients into their respective groups, which was effective for BP, SZA and pooled psychotic patients. Authors then develop a single feature for phenotyping, Mito-SLOPE, a metric of mitochondria density distribution across a cell by radial areas. With this metric, psychotic patients tend to have more nuclear-localized than edge-localized mitochondria; whereas MDD patients show a trend for higher edge-to-nucleus distribution. To find candidate drugs, authors screen publicly available datasets of cells treated with small compounds using mito-SLOPE. Furthermore, authors then apply mitoSLOPE on a CRISPR screen dataset, showcasing the role of mitochondrial dynamics genes and three genes of interest because of their association with psychosis. Finally, they identified the top genes whose KO or overexpression may explain (or reverse) the mitoSLOPE phenotype.

      Overall, the manuscript is well-written, the conclusions are supported within their limitations and this work represents an advancement in the field. I recommend it for publication provided these concerns are addressed:

      Major comments:

      1. The mitoSLOPE measure is very interesting and most likely reflects subtle changes in mitochondrial transport. What does the microtubule network look like in the patient fibroblasts? Are there obvious alterations in, e.g., their posttranslational modifications? Is there a difference in mito transport speed or pausing frequency?
      2. I concur with the exclusion of compounds that obviously alter cell shape, as the authors mention for the cancer therapeutics. Some cancer therapeutics actually affect microtubule dynamics (see 1st point), which may underlie their effect on both cell shape and mitoSLOPE. To understand the mechanism of action, the top hits should also be tested for the integrity of the microtubular network and mitochondrial transport parameters.
      3. While I agree with the authors' reasoning that the observed phenotype could be a result of the disease or the result of a compensatory mechanism, their hypothesis could be experimentally tested by addition of any of the top hits in order to reverse mitoSLOPE in their patient cell lines. It may not have worked for Lithium in their last manuscript, but the mechanism of action of the novel compounds could be cell intrinsic.
      4. Does recreation of the CRISPR cell line in their hands produce the same phenotype?
      5. Additionally, the observed phenotypes could also be a product of the medication taken by the patients. Deeper patient data from the cohort may be relevant to put the findings in context. How were patients diagnosed? Which medications were the patients taking? Was substance abuse present? In Mertens et al, Lithium responders and Lithium non-responders showed a differential mitochondrial response, how does this affect their dataset?
      6. While MDD itself is not a psychotic disorder, it can still present with psychotic features. Was this evaluated during the recruitment? Also important, were they on antipsychotic medication in addition to antidepressant therapy?
      7. The fact that CACNA1C is excluded from the "unbiased" hit discovery (Fig 8) undermines the power of the filtering criteria selected by the authors. Authors should include some discussion around this.

      Minor comments:

      1. Colored images should be made colorblind-accessible. This applies to microscopy images and graphs.
      2. Fig 3: Exact p-values should be reported in the graphs
      3. Fig. 5 and Fig 7a-b: It is not immediately clear what the lines in these graphs represent. Is it the individual drug/gene hits in a pre-ranked manner?
      4. Fig 6 b-c: should the "m" be capitalized for Molarity?
      5. The annotation of divalproex/valproic acid as a "benzodiazepine receptor agonist" is incorrect. While it is known to enhance GABAergic neurotransmission, the mechanism is thought to be through GABA synthesis rather than through GABA-A receptor agonism (see e.g. PMID: 23407051).
      6. Supplementary Fig 3 and 4 could be swapped to match the main text order.
      7. One reference was inaccessible: Anon, Phenomics-Enabled Discovery and Optimization of Small Molecule RBM39 Degraders as Alternative to CDK12 Targeting in High-Grade Serious Ovarian Cancer (HGSOC).

      Significance

      Recently, mitochondria have emerged as mediators of anxious behavior and are increasingly studied in the context of neuropsychiatric disorders. However, the molecular mechanisms that connect altered mitochondrial performance to specific neuropathological conditions are unknown. This study extends our knowledge in this realm. While it is in principle an extension of earlier work from the authors (Cataldo, A.M. et al. Am. J. Pathol. 2010), it has added value due to the application of their automated analysis to publicly available datasets, providing a clear technical advance. This identified known as well as novel compounds that could revert the mitochondrial phenotype and makes this study specifically interesting to an audience interested in translational research. The strength of the manuscript certainly lies in the large number of examples studied and their well-rounded discussion of their findings. It is limited by the fact that the phenotype of neuropsychiatric conditions is studied in peripheral cells, and thus may not be a simple cell-autonomous response but a compensatory, systemic response that is not easy to replicate in a fibroblast in isolation. No mechanistic insight is gained on the underlying cell biology in the current format.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      General statements.

      We thank the reviewers for their positive response and useful suggestions on our manuscript. They recognize the ‘proof of concept’ nature of the work and the importance of extending the number of human mutation-specific DMD mouse models from one to five for preclinical research. We feel that the quality of the manuscript has been improved upon implementation of the reviewers’ suggestions.

      Reviewer 1.

      OPTIONAL - From the point of view of the reviewer, it seems plausible to use CRISPR/Cas9 to "clean up" the original hDMDmdx mouse line by selectively removing one of the YACs forming the tail-to-tail tandem in the mouse genome. Once such a single-copy mouse line is generated (and proven viable?), any subsequent rearrangement of the hDMD transgene would prove much less challenging. Such a mouse line would also better represent the human situation, where only one DMD copy is carried on the X chromosome.

      The reviewer gives the optional suggestion that the generation of these models could have been combined with the removal of one of the copies of the YAC to extend the use of the new models to CRISPR-based therapies. This is correct, but we note that when the data on the removal of a copy of the YAC were published, our new models were already generated and in different stages of QC, colony building and analysis. The procedure described by Chey et al could be used on our new models, but this would require additional time and funding and is therefore outside the scope of this manuscript.

      The labels in figure 2B and 3A would benefit from showing the PCR fragment lengths as well as the sizes of obtained hDMD exon deletions. One could also include an additional figure panel demonstrating the principle of ASO-induced exon skipping.

      Reviewer #1 also has a minor comment regarding the exact deletions in figure 2B and 3A. For fig. 2B the reviewer suggests including the sizes of the PCR fragments next to the gel. For the gel of PCR1 in particular, which detects the deleted YAC copy, this would not be very informative, as the size can differ (and does differ) between clones depending on the NHEJ-mediated repair in each clone. Sizes are only meaningful for an individual clone, and adding them all would make for a very cluttered figure. The important message from this gel is the presence of any fragment, as the undeleted copy is not amplified under the conditions used. For the gel of PCR2 the opposite is the case: the PCR fragment shown is simply the undeleted YAC copy, and here we are only interested in the absence of the PCR fragment.

      We thank the reviewer for the suggestion of adding the deletion sizes to fig 3A. This made us realize that an additional table with the details of the mutant alleles in all models had been omitted, and we apologize for this error. With the revised version we include details on the size of the deletions and their genomic coordinates (in the human genome as it is in the human YAC) of each of the new models (revised Sup. Table 1). We trust that adding these details will clarify this reviewer’s minor comment.

      The reviewer requests to include an additional figure panel demonstrating the principle of ASO-induced exon skipping. We have now added this to the revised version of the manuscript (new fig. 5).

      The study is fairly limited in scope and will be of primary interest to those working in the DMD field.

      We are aware of 9 clinical trials of exon 51 and 53 skipping that are ongoing or were recently stopped. For four of these compounds, companies have a license to our hDMDdel52/mdx mouse model, and one of these studies has been published. An additional 7 clinical trials are planned or ongoing for exon 44, 45 and 50 skipping, for which the newly developed models are being or can be used for preclinical studies.

      Reviewer 2.

      To further strengthen the rigor of the study, it would be valuable to include an analysis of potential off-target effects of CRISPR editing, particularly given that double targeting of two YAC copies was required. This is especially important for germline edits, as off-target mutations could introduce confounding phenotypes in the resulting mice. Demonstrating minimal or absent off-target activity would increase confidence in the specificity and safety of the generated models.

      There has indeed been one major study suggesting a large number of CRISPR-induced off-target mutations in mouse models. However, this publication was rapidly questioned by multiple groups for having used the wrong control animals and the original publication was retracted (https://doi.org/10.1038/nmeth0518-394a). Another study at that time, using the correct controls, did not find mutations that could be attributed to CRISPR-induced off-target mutations. A more recent study analysed founder animals from transgenic projects using 163 different guide RNAs and concluded ‘In total, only 4.9% (8/163) of guides tested have detectable off-target activity, at a rate of 0.2 Cas9 off-target mutations per founder analysed. In comparison, we observe ~1,100 unique variants in each mouse regardless of genome exposure to Cas9 indicating off-target variants comprise a small fraction of genetic heterogeneity in Cas9-edited mice.’ In short, the background mutation rate in mice is much higher than the Cas9 off-target mutation rate. In addition to this, we only used guide RNAs that did not have any predicted off-target sites (according to the CRISPOR tool; https://crispor.gi.ucsc.edu/crispor.py) on the same chromosome or in protein coding sequences, so that any undetected off-target mutation will rapidly be lost in the subsequent breeding. We also would like to refer the reviewer to the ‘referee cross-commenting remark’ from reviewer #3 on this topic.
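
      The proportions quoted in this reply can be checked with a few lines of Python. This is only an arithmetic sketch based on the figures cited above (8 of 163 guides, 0.2 off-target mutations per founder, roughly 1,100 background variants per mouse); the variable names are illustrative and are not taken from the cited study.

```python
# Figures quoted in the reply above; variable names are illustrative only.
guides_tested = 163
guides_with_off_target = 8
off_target_per_founder = 0.2   # Cas9 off-target mutations per founder
background_variants = 1100     # approximate unique variants per mouse

# Fraction of guides with any detectable off-target activity.
fraction_guides = guides_with_off_target / guides_tested

# Share of all variants in one founder that would be attributable to Cas9,
# assuming the two rates simply add.
share_of_variants = off_target_per_founder / (
    background_variants + off_target_per_founder
)

print(f"Guides with detectable off-target activity: {fraction_guides:.1%}")
print(f"Cas9 off-target share of variants per founder: {share_of_variants:.3%}")
```

      Under these numbers the Cas9-attributable share is well below one tenth of a percent of the variants found in a founder, which is the point made in the reply.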

      The validation of the dystrophic phenotype is generally convincing. However, the authors should clarify how "human dystrophin" is detected in the deletion models. Since only part of the dystrophin gene in these mice is humanized (the remainder is murine), it is important to specify, also in the results, which antibody was used and which epitope/exon it recognizes. If the antibody targets a deleted exon in a given model, this could lead to misinterpretation of the dystrophin signal. Providing this clarification would ensure the conclusions regarding dystrophin expression are fully supported.

      This question is based on the incorrect assumption that only part of the DMD gene in these models is humanized. As described in the original publication on the YAC transgenics, the complete human gene is in the YAC. Here, we deleted a particular exon from this complete human DMD gene. In combination with the mdx allele, these mice lack the full-length mouse and human dystrophin isoforms expressed in muscle. As mentioned in the materials section, the human dystrophin protein was detected with the Mandys 106 antibody (recognizing exon 43; amino acids 2063-2078), which only has reactivity with human dystrophin according to the product specification of Sigma Aldrich. We confirmed this in wild type mouse tissue, which showed no dystrophin signal with this antibody. In fig 4 we confirm lack of human dystrophin in the deletion models using this antibody. Mouse and human dystrophin protein were detected with the AB154168 antibody of Abcam (recognizing the last 100 amino acids of the C-terminal part of the protein), which has reactivity with both mouse and human. Thus, neither antibody targeted a deleted exon. For the exon skipping validation, solely the Abcam antibody was used, as none of the deleted or skipped exons is recognized by this antibody. Information regarding the targeted protein region has now been added to the materials section.

      Additionally, to further strengthen the characterization of the muscular dystrophy phenotype, the authors could quantify muscle fibre size and the percentage of centrally nucleated fibres, both of which are widely accepted quantitative markers of ongoing degeneration/regeneration in DMD models.

      and

      The validation of exon skipping in the new hDMD deletion models is convincing at the molecular level. However, since the ASOs were injected into both gastrocnemius and triceps muscles, it would be helpful to include at least a brief characterization of the triceps, even in supplementary data, as different muscles can show slightly different pathology and responses. Additionally, while the molecular readouts (RT-PCR and Western blot) demonstrate restoration of dystrophin expression, including simple histological analysis, such as H&E staining, could further support functional improvement and reinforce the physiological relevance of exon skipping in these models.

      The proof-of-principle nature of the current manuscript is focused on restoration of dystrophin expression shortly after ASO treatment, and the current sample sizes (n=3 mice per strain) are too limited for actual quantification of histopathological improvements. Furthermore, the timespan between the intramuscular injection and tissue collection (2 weeks) does not allow sufficient time for histopathological improvements to develop. Notably, a large natural history analysis of all these new models is currently ongoing, which includes a large variety of in vivo functional outcome measures and provides a full description of the histopathological aspects of these mice. The proposed characterization of the triceps is now included as supplementary data of the manuscript (Sup. Fig 1).

      Reviewer 3.

      This reviewer starts by pointing out some typos and requesting that some sentences be rephrased for clarity. We appreciate this and have addressed these points in the revised version of the manuscript.

      Generation of the models: it is not clear why the authors generated line 44 in ES cells, then switched to direct gene editing in zygotes. Was this due to advent of electroporation of zygotes at the time? This may need clarification beyond the sentence "Encouraged by the specificity of our new prescreen workflow and the efficiency of correct targeting of human exon 44 in ES cells, we generated additional models ... directly in mouse zygotes".

      The simple answer to this is that we were (pleasantly) surprised ourselves by the efficiency we got in the ES cells (our expectation was based on the previous experience generating the del52 model). For animal welfare reasons we prefer to generate models via ES cells if we expect a long and cumbersome quality control process and/or very low efficiency, as ES cells allow us to do this QC before the actual animals are generated, thus reducing the number of animals generated during the model generation phase. Expecting very low efficiency, we originally picked 10 x 96 well plates of clones for this del44 targeting, but after pre-screening the first two plates (192 clones), we realized that this was far more clones than needed, and the additional 8 plates were not analysed. With this much higher than expected efficiency, and the power of the two-step pre-screen described in the manuscript, we decided to try the next model (the del45) directly in zygotes. This was found efficient enough to also do the last two models directly in zygotes. We can only speculate on why the efficiency was so much higher than observed for the del52 targeting. Clearly the fact that we knew of the double integration this time allowed us to develop the successful 2-step pre-screen. Another difference is that the del52 model was generated using TALENs as genome editors, whereas now we could use CRISPR/Cas9.

      Antisense oligonucleotide treatment: there is no description of the design of the ASOs beyond their sequence in suppl. Table 4. How were they designed? Moreover, they have been injected at two different doses (i.e., 50ul for Exon 51 & 53; 100ul for Exon 44 & 45). What is the rationale for this? There is no justification in the manuscript.

      The requested additional details on ASO design and dosing have been added to the materials section of the revised manuscript. The reviewer also pointed out that fig 4 includes protein samples diluted to 10% from both a C57BL/6J and an hDMD/mdx control mouse, and requested a justification for this. We included samples of both wildtype strains to confirm species reactivity of the dystrophin antibodies used, with the AB154168 antibody reacting with both mouse and human protein (showing a dystrophin band in both wildtype samples), and the Mandys106 antibody being specific to human protein only (showing a dystrophin band in the hDMD/mdx control only).

      Phenotypic validation of the new models: a description of the mdx line with C57BL/6J mice is mentioned. Is this why Fig.4 includes "10% Bl6" and "10% hDMD/mdx"? If so, this should be clarified in the text (or deleted from the figure). The authors mentioned "As expected, the gastrocnemius of healthy hDMD/mdx mice expressed dystrophin of human origin at wildtype levels". Why would this be expected? If 2 copies of the gene, including the human promoter, are integrated, why would one expect a wildtype level of expression? In fact, in the original paper describing the hDMD/mdx model ('t Hoen et al. 2008), the human transcripts are expressed at 2 to 4-fold higher than their endogenous counterparts (which is in line with the integration of 2 copies).

      It is true, as the reviewer points out, that qRT-PCR data in the original YAC transgenic publication showed double expression of the human transcript, consistent with the double integration. However, fig. 3b in the same paper shows that at the protein level the expression of human DMD is comparable to that of the mouse protein. We do not know the reason for the discrepancy between transcript and protein levels in this model, but in the current manuscript we are referring to this protein expression.

      A quantification of the expression levels on Figure 4 should be done (normalized to actinin) to resolve this. The size of the Marker should also be added on Figure 4.

      We feel that proper quantification can only be done with the utilization of a standard curve. As we expected no, or trace levels of dystrophin in the deletion models, we only included wildtype samples diluted to 10% of wildtype protein. This prevents us from accurate quantification of the trace dystrophin levels observed in the del45 and del51 models. However, as can be appreciated from fig 4, expression is very minimal. We added information on the marker in the materials section, and indicated the size (85 kDa) in the figure legend.

      Finally, the authors observed histological hallmarks of the disease in the new models (i.e., muscle degeneration and fibrosis). Although obvious on the images, it may be useful to add indications (e.g., arrows) on the images for readers not familiar with DMD.

      We added the requested arrows to the pictures of fig. 4B to allow distinction between the different histopathological hallmarks, and refer to these in the figure legend.

      Prescreen PCR of hDMD/mdx ES cells (Fig. 2): the authors mentioned that "The PCR conditions were chosen for not being able to amplify the undeleted allele." What does this mean? Was the elongation time reduced? As per the text, the theoretical size of a WT band is around 1.6kb. Yet, on the gel, bands higher than 1kb are visible for some clones.

      This is indeed based on the extension time of the PCR reaction shown in PCR 1 from fig 2B, amplified with primers upstream and downstream of the deleted region (see fig 1 and 2A). However, the approx. 1.6 kb fragment the reviewer refers to is the undeleted-specific amplification shown in Fig 2B PCR 2, which is the result of a primer outside and a primer inside the deleted region (fig 1 and 2A). Amplification of the undeleted copy with the primers used in PCR 1 would give a fragment of 3902 nt. The deletion of exon 44 in the final model is 3584 nt (details are given in the Excel file that was erroneously omitted; see our response to reviewer #1), so the PCR 1 product of the deleted copy in the clone used for the mouse model is 318 nt. It is straightforward to select an extension time that would be insufficient for a 3.9 kb fragment, but which can amplify fragments that are shorter due to the deletion. Even in a clone with a single copy of exon 44 deleted, one would not expect to see the 3902 nt fragment due to preferential amplification of the much shorter mutant band. This has now been clarified in the legend of figure 2 of the revised version of the manuscript.
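
      As an illustration of the size arithmetic in this reply, the following Python sketch derives the PCR 1 product of the deleted copy from the two sizes stated above and checks each product against an extension-time limit. The polymerase rate (about 1 kb per minute) and the 30 s extension time are hypothetical values chosen for the example; the reply does not state the actual cycling conditions.

```python
# Sizes stated in the reply above.
UNDELETED_PCR1_NT = 3902  # PCR 1 product from an undeleted YAC copy
DELETION_NT = 3584        # exon 44 deletion in the final model

# Hypothetical cycling parameters, NOT taken from the manuscript.
ASSUMED_RATE_NT_PER_S = 1000 / 60  # rule-of-thumb rate of about 1 kb/min
ASSUMED_EXTENSION_S = 30


def deleted_product_size(undeleted_nt: int, deletion_nt: int) -> int:
    """Size of the PCR 1 product once the deletion has been removed."""
    return undeleted_nt - deletion_nt


def can_amplify(product_nt: int,
                extension_s: float = ASSUMED_EXTENSION_S,
                rate_nt_per_s: float = ASSUMED_RATE_NT_PER_S) -> bool:
    """True if the product fits within one extension step (simplified model)."""
    return product_nt <= extension_s * rate_nt_per_s


deleted_nt = deleted_product_size(UNDELETED_PCR1_NT, DELETION_NT)
for label, size in (("deleted copy", deleted_nt),
                    ("undeleted copy", UNDELETED_PCR1_NT)):
    print(f"{label}: {size} nt, amplifiable = {can_amplify(size)}")
```

      With these assumed parameters only the short product from the deleted copy is amplified, which matches the behaviour described for PCR 1. The model ignores competition between templates, which the reply notes would further favour the shorter band.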

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The manuscript by van Putten et al. describes the generation and initial characterization of four new mouse models of DMD, based on the previously generated hDMD/mdx murine model, which expressed human dystrophin from a yeast artificial chromosome (YAC) in a DMD null (mdx) background. The four new models are based on the deletion of four Exons (44, 45, 51 & 53), which accounts for most human deletions (hotspot) in DMD.

      The description of the generation of these models using CRISPR/Cas9 gene editing is thorough, and the quality control is adequate. Moreover, preliminary testing of exon skipping therapy using ASO showed it is possible to restore the production of dystrophin protein (albeit truncated) in these models, which increases their translational value. Although the study is valuable and methodologically sound, there are minor points that need to be addressed:

      • A few typos need to be corrected:
        • "Therapeutic approaches aiming to restore dystrophin for DMD are based on the discrepancy between DMD and BMD mutations." This needs to be rephrased to clarify the meaning for readers not familiar with DMD.
        • "Western blot and immune fluorescence analysis on gastrocnemius muscles..." replace "immune fluorescence" with immunofluorescence.
        • "Two weeks after the last injection muscles were isolated, and RNA and protein was isolated from muscle..." protein WERE isolated.
        • "However, gene editing-based therapies could run into the same unpredictable outcome reduced efficiency of a therapy ..." This sentence is confusing, consider rephrasing.
      • Generation of the models: it is not clear why the authors generated line 44 in ES cells, then switched to direct gene editing in zygotes. Was this due to advent of electroporation of zygotes at the time? This may need clarification beyond the sentence "Encouraged by the specificity of our new prescreen workflow and the efficiency of correct targeting of human exon 44 in ES cells, we generated additional models ... directly in mouse zygotes".
      • Antisense oligonucleotide treatment: there is no description of the design of the ASOs beyond their sequence in suppl. Table 4. How were they designed? Moreover, they have been injected at two different doses (i.e., 50ul for Exon 51 & 53; 100ul for Exon 44 & 45). What is the rationale for this? There is no justification in the manuscript.
      • Phenotypic validation of the new models: a description of the mdx line with C57BL/6J mice is mentioned. Is this why Fig.4 includes "10% Bl6" and "10% hDMD/mdx"? If so, this should be clarified in the text (or deleted from the figure). The authors mentioned "As expected, the gastrocnemius of healthy hDMD/mdx mice expressed dystrophin of human origin at wildtype levels". Why would this be expected? If 2 copies of the gene, including the human promoter, are integrated, why would one expect a wildtype level of expression? In fact, in the original paper describing the hDMD/mdx model ('t Hoen et al. 2008), the human transcripts are expressed at 2 to 4-fold higher than their endogenous counterparts (which is in line with the integration of 2 copies). A quantification of the expression levels on Figure 4 should be done (normalized to actinin) to resolve this. The size of the Marker should also be added on Figure 4. Finally, the authors observed histological hallmarks of the disease in the new models (i.e., muscle degeneration and fibrosis). Although obvious on the images, it may be useful to add indications (e.g., arrows) on the images for readers not familiar with DMD.
      • Prescreen PCR of hDMD/mdx ES cells (Fig. 2): the authors mentioned that "The PCR conditions were chosen for not being able to amplify the undeleted allele." What does this mean? Was the elongation time reduced? As per the text, the theoretical size of a WT band is around 1.6kb. Yet, on the gel, bands higher than 1kb are visible for some clones.

      Referee cross-commenting

      The comments from the other reviewers seem fair, reasonable, and should be easily addressed by the authors. The off-target analysis might however be a bit of a stretch, given that (as per published data) the off-target rate is low (i.e., no higher than genetic drift) in mouse zygotes when using CRISPR RNPs, and any potential off-target mutation could easily be segregated out by means of backcrossing.

      Significance

      The four new mouse models generated in this study will advance the field both at the preclinical and the clinical levels, because they more closely recapitulate the human mutations linked to DMD than previous models, while presenting with a translational potential (the authors showed a proof of concept of exon-skipping therapy in these mice).

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      The authors generated four novel humanized DMD mouse models carrying deletions of exons 44, 45, 51, or 53 in the human DMD gene on an mdx C57BL/6J background. They developed an optimized CRISPR-Cas9 pre-screening workflow for embryonic stem cells and zygotes, allowing efficient and precise targeting of the human DMD YAC, which carries a complex double tail-to-tail integration. The models display absent or trace dystrophin and classical DMD muscle pathology, including fibrosis. ASO-mediated exon skipping of flanking exons successfully restores dystrophin expression, validating their use for preclinical testing of mutation-specific therapies. These models address a key limitation of the standard mdx mouse, which carries a mutation only in exon 23, and provide a more clinically relevant platform for evaluating human sequence-specific therapeutic strategies for the most frequently mutated DMD exons.

      Minor comments:

      1. The pre-screen workflow and model generation are impressive and well-optimized. To further strengthen the rigor of the study, it would be valuable to include an analysis of potential off-target effects of CRISPR editing, particularly given that double targeting of two YAC copies was required. This is especially important for germline edits, as off-target mutations could introduce confounding phenotypes in the resulting mice. Demonstrating minimal or absent off-target activity would increase confidence in the specificity and safety of the generated models.
      2. The validation of the dystrophic phenotype is generally convincing. However, the authors should clarify how "human dystrophin" is detected in the deletion models. Since only part of the dystrophin gene in these mice is humanized (the remainder is murine), it is important to specify, also in the results, which antibody was used and which epitope/exon it recognizes. If the antibody targets a deleted exon in a given model, this could lead to misinterpretation of the dystrophin signal. Providing this clarification would ensure the conclusions regarding dystrophin expression are fully supported. Additionally, to further strengthen the characterization of the muscular dystrophy phenotype, the authors could quantify muscle fibre size and the percentage of centrally nucleated fibres, both of which are widely accepted quantitative markers of ongoing degeneration/regeneration in DMD models.
      3. The validation of exon skipping in the new hDMD deletion models is convincing at the molecular level. However, since the ASOs were injected into both gastrocnemius and triceps muscles, it would be helpful to include at least a brief characterization of the triceps, even in supplementary data, as different muscles can show slightly different pathology and responses. Additionally, while the molecular readouts (RT-PCR and Western blot) demonstrate restoration of dystrophin expression, including simple histological analysis, such as H&E staining, could further support functional improvement and reinforce the physiological relevance of exon skipping in these models.

      Significance

      This study presents a clear and technically robust advance in the field of Duchenne muscular dystrophy (DMD) preclinical research. The strongest aspects are the generation of four novel humanized DMD mouse models carrying clinically relevant exon deletions (44, 45, 51, 53) and the development of an optimized CRISPR-Cas9 pre-screening workflow that efficiently and precisely targets the human DMD YAC, despite its complex double tail-to-tail integration. These models display relevant dystrophic phenotypes and are validated for ASO-mediated exon skipping, demonstrating their applicability for preclinical testing of mutation-specific therapies.

      Compared to existing models, such as the standard mdx mouse or previously generated hDMDdel52/mdx line, these new models address the critical limitation that most human DMD mutations cluster outside exon 23, providing a more clinically relevant system. The study extends knowledge both technically, by demonstrating an efficient pre-screening workflow for complex humanized YAC edits, and functionally, by creating models that allow preclinical evaluation of human sequence-specific therapeutic strategies for the most frequent DMD mutations. The audience for this work includes basic and translational researchers in the muscular dystrophy, gene therapy, and genome editing fields, as well as clinicians interested in the development and preclinical testing of exon skipping and gene-editing therapies. These models will likely be widely used to optimize therapy design, dosage, and delivery, enhancing translatability to clinical applications.

      Field of expertise: Duchenne muscular dystrophy, preclinical models, genome editing, exon skipping therapies, regenerative medicine.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The study describes the creation and preliminary validation of four humanized mouse DMD models. The authors utilized the pre-existing hDMDmdx mouse line as a platform to generate clinically relevant models that carry the deletion of exons 44, 45, 51 and 53 of the hDMD transgene using CRISPR/Cas9 technology. This proved somewhat challenging as the hDMD YAC transgene present in the original hDMDmdx line is inserted as a tail-to-tail tandem in the mouse genome. The initial mouse line carrying deletion of exon 44 was generated using a combination of CRISPR/Cas9 and ES cell technologies, whereas the remaining mouse lines were generated by applying CRISPR technology directly to hDMDmdx zygotes. In order to identify and select ES cell lines and animals carrying the desired deletion patterns, the authors devised a two-step PCR-based selection strategy followed by a copy number PCR for individual hDMD exons. Once the desired mouse lines were obtained, the authors performed Western blots and histological staining to prove the loss of the hDMD protein expression and the appearance of associated DMD muscle phenotypes. Finally, an in vivo experiment was carried out in which intramuscular injection of exon-specific ASOs led to exon skipping and partial restoration of the expression of the truncated but potentially functional hDMD protein variants. The experiment was carried out solely as a proof-of-concept and was terminated before any therapeutic effect of the ASOs could be potentially observed. Nevertheless, the authors argue (correctly) that such models can prove useful in future development of treatment strategies for DMD.

      Major comments:

      The study has clear aims and is well described. The performed experiments support the final conclusions presented in the paper. OPTIONAL - From the point of view of the reviewer, it seems plausible to use CRISPR/Cas9 to "clean up" the original hDMDmdx mouse line by selectively removing one of the YACs forming the tail-to-tail tandem in the mouse genome. Once such a single-copy mouse line is generated (and proven viable?) any subsequent rearrangement of the hDMD transgene would prove much less challenging. Such a mouse line would also better represent the human situation, where only one DMD copy is carried on the X chromosome.

      Minor comments:

      The labels in figure 2B and 3A would benefit from showing the PCR fragment lengths as well as the sizes of obtained hDMD exon deletions. One could also include an additional figure panel demonstrating the principle of ASO-induced exon skipping.

      Significance

      General Assessment:

      The study is fairly limited in scope and will be of primary interest to those working in the DMD field. The new patient-derived hDMD exon deletions will allow testing and validation of human therapeutic moieties in mouse models but as such the study does not advance our knowledge about DMD or transgenic mouse model generation.

      Advance:

      Perhaps the only novelty is a very diligent genotyping approach aimed at identifying lines where both exons in the tail-to-tail hDMD tandem have been deleted. Given the extensive work put into this approach, the authors may have missed an opportunity to reengineer the original hDMDmdx mouse line (see OPTIONAL) to generate a mouse line where any future modifications of the hDMD allele would be much more accessible to both CRISPR-mediated NHEJ and HDR approaches.

      Audience:

      The study is fairly limited in scope and will be of primary interest to those working in the DMD field.

      Reviewer's expertise focuses on CRISPR/Cas9 technologies and transgenic mouse model generation.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      COMBINED REVIEW REPORTS

      1.1. The biochemical and biophysical experiments performed in this study were well designed, data were clear and the conclusions were well supported by the results. One potential improvement is to check whether NLS could affect the normal activation targets of ΔNp63α, such as KRT14 and other epithelial genes. This could complement the experiments testing the inhibition effect of ΔNp63α on p53-mediated gene activation. This will be interesting, as ΔNp63α is a master regulator in epithelial cells via regulation of diverse epithelial genes.

      We thank the Reviewer for this useful comment. In order to further investigate the relationship between p63 nuclear import and function, and the importance for the latter of the oligomerization-driven tolerance to point mutations, we have now performed a number of new experiments. First of all, we have included both ΔNp63α NLSn and NLSc mutants in the DNA binding/p53-inhibition assays shown in original Figure 7. The new data are shown in Figure 4E and Supplementary Figure S5. As expected, these mutants had a much smaller effect on DNA binding/p53-inhibition than the NLSbip mutant, further establishing a functional link between p63 nuclear levels and transcriptional activity, and demonstrating the functional relevance of the compensatory mechanism evolved by p63 to tolerate the effect of mutations inactivating either NLSn or NLSc.

      In addition, and as specifically suggested by the Reviewer, we have measured the effect of NLS-impairing mutations on the ability of ΔNp63α to transactivate the K14 and the Bax promoters. Our results, shown in revised Figure 4F and 4G as well as in Supplementary Figure S6, clearly show that both ΔNp63α NLSn and NLSc mutants transactivate the promoters at levels indistinguishable from the wild-type, consistent with their minimal effect on DNA binding and nuclear transport, while the NLSbip mutation, which prevents nuclear localization and DNA binding, also prevents transcriptional transactivation.

      1.2. A minor suggestion: authors could consider using p63 rather than ΔNp63α in the manuscript. The heterogenous sequences of NLS regions are relevant for the delta isoform of p63. In addition, all experiments performed in the study are not necessarily specific for the biology of the ΔNp63α isoform, but they are probably informative for all p63 isoforms.

      We thank the Reviewer for this suggestion. Indeed, we expect the bipartite NLS to mediate nuclear transport of most p63 isoforms, whereas the p63 delta isoform, which lacks NLSn, would be transported into the nucleus by NLSc. We have modified the text in the Discussion section to make this point clearer and more explicit: "the bipartite NLS identified here is responsible for nuclear localization of most p63 isoforms, while p63 delta is transported into the nucleus by NLSc: SIKKRRSPD)." To further corroborate this statement, we have also included new data obtained with the TAp63a and gNp63a isoforms. Our data clearly show that nuclear import of both isoforms depends on the NLSbip identified here and is mediated by the IMPa/b1 heterodimer, so that the findings obtained for the ΔNp63α isoform can be generalized to others. The new data are shown in Figure 3 and in Supplementary Figure S3.

      __1.3. Another minor suggestion: As p63 forms a tetramer when binding to DNA sequence for gene regulation, it would be good for authors to speculate the role of NLS and its variations in tetramerization. __

      We thank the Reviewer for this comment. Since the NLS is located outside of the tetramerization domain, it is not expected to play a direct role in tetramerization. We have addressed this issue by generating computational models of ΔNp63α and DNp63α;mNLS dimers and tetramers to allow a direct comparison. The new data are shown in Figure 5A-D and Supplementary Figure S11A-D. The data suggest that mutation of the NLS residues, which lie outside of the oligomerization domain, does not affect the ability of ΔNp63α to oligomerize, supporting the experimental evidence from Figure 5E (BRET experiments).

      __2.1. In immunofluorescence images it is sometime difficult to see nuclear accumulation. Single channels of the GFP signal may help to make the point.__

      We thank the Reviewer for pointing out this issue. We have provided single channels for every microscopic image in Supplemental Figures.

      __ 2.2. The binding assays in Fig. 3 would profit from using the most efficient imp a variant together with imp beta to show potential cooperative binding.__

      We thank the Reviewer for this comment, which helped enhance the physiological relevance of our binding data. We have now introduced the requested data in Supplementary Figure S2A. In the revised Figure panel, we compared binding of FITC-labelled p63-NLS peptide to full-length IMPa1 alone, IMPa1DIBB and the pre-heterodimerized IMPa1/IMPb1 complex. The data are consistent with a classical binding mode whereby interaction with IMPb1 frees the minor and major binding sites of full-length IMPa1 by engaging the autoinhibitory IBB domain. To corroborate our results even further and demonstrate the bipartite nature of the p63 NLS identified here, we have also performed FP experiments with the p63-NLS and the SV40 LTA NLS (a well-characterized monopartite NLS) in the presence of either wt IMPa1DIBB or its minor and major site mutants. As expected for a bipartite NLS, either mutation impaired binding of the p63-NLS significantly, whereas the mutation of the minor site had a much smaller effect on binding of the SV40 LTA NLS. The new data, shown in Supplementary Figure S2BC and Supplementary Table S3, confirm our hypothesis by highlighting a very strong binding affinity reduction of p63 NLS peptide for IMPa1 major site mutant (

      __2.3. please mention that NTR can also recognize 3D structures of structural RNAs, e.g. tRNAs or miRNAs __

      We thank the Reviewer for this very useful suggestion. We have now introduced this concept in the Introduction and added two references to support our statement. The paragraph is as follows: "Additionally, Exportin 5 and Exportin-T evolved to recognize specific RNA structures within pre-miRNAs and t-RNAs, respectively (5, 6)."

      2.4. longer TA isoforms

      We have corrected the typo and we thank the Reviewer for noticing it.

      __ 2.5. homologues or orthologues? __

      We thank the Reviewer for pointing out this issue. We have corrected the text, so that IMPas and members of the p53 family are now referred to as paralogs and not as orthologs.

      __3.1. The major function of DNp63a seems to be that of a bookmarking factor that ensures the establishment of an epithelial transcriptional program. It is found to bind more to enhancer than to promoter regions. While it might also act for a few genes as a classical transcription factor (K14). this bookmarking and interaction with other transcriptional regulators seems to be its major task. This should be included in the introduction. __

      We thank the Reviewer for this suggestion. The Introduction has been modified as requested to incorporate this important concept "Additionally, p63 has been shown to act as a pioneer factor, shaping the chromatin and enhancer landscape, thus regulating accessibility to activating and repressing transcription factors (18-20)."

      __ 3.2. "DNp63a can be imported into the nucleus as a dimer" What is the evidence that DNp63a is imported as a dimer and not as a tetramer? Although functional not really relevant, because all conclusions drawn for a dimer are true for a tetramer (such as the mutation compensation), this statement (and others in the text) should either be substantiated or modified. __

      The Reviewer is correct in pointing out that, while p63 isoforms bind DNA as tetramers (7), the precise oligomeric state at which nuclear import occurs is not firmly established. Indeed, little is known about the regulation of the p63 oligomerization process during nucleocytoplasmic trafficking. While TA isoforms are generally maintained in an inactive, closed, and dimeric conformation-requiring external stimuli such as phosphorylation to undergo activation and tetramerization-ΔNp63α has been reported to form tetramers even in the absence of such stimuli (4, 8). In light of this, we have modified the text to explicitly acknowledge the possibility that ΔNp63α may be transported into the nucleus either as a dimer or as a tetramer, rather than implying a single obligatory oligomeric state.

      Importantly, to directly address the Reviewer's concern, we have broadened the scope of the manuscript to include additional p63 isoforms, particularly TAp63α, which is predominantly present as a dimer under basal conditions. Our new data (Figure 3) demonstrate that TAp63α is efficiently translocated into the nucleus via the IMPα/β1 heterodimer in an NLSbip-dependent manner. Notably, despite its inability to form tetramers, TAp63α displays a similar tolerance to mutations that inactivate individual basic clusters within the bipartite NLS, analogous to what is observed for ΔNp63α (Supplementary Figure S11).

      Together, these results formally demonstrate that dimerization is sufficient to support efficient nuclear import in the presence of NLS-inactivating mutations, and that higher-order oligomerization (i.e., tetramerization) is not required for this property. We have therefore revised the manuscript accordingly to avoid over-interpretation and to more accurately reflect the experimental evidence.

      __ 3.3. The explanation for the difference in the sensitivity of mutations in the bipartite NLS in the isolated peptide experiments and experiments with the full length DNp63a is intriguing. Unfortunately, it is not based on direct experimental evidence. To proof their model (which is the central claim of this manuscript) they should fuse the bipartite NLS to any dimerization module (e.g. a leucine zipper sequence) and show that by dimerization of the bipartite NLS the same results towards mutations are obtained as for full length DNp63a. This would strongly support their model. __

      We agree that the model for nuclear transport is a central claim of our work and deserves additional experimental validation. To support our hypothesis, in the revised manuscript we have generated a number of additional DNp63a mutants incapable of self-interaction, based on deletion of residues 301-347 (p63-DOD).

      We have now:

      (i) Validated the inability of the DOD mutant to self-interact by means of BRET assays in living cells, whereby a strong decrease in BRET ratio is observed compared to wild-type DNp63a (New Figure 6E and New Supplementary Figure S8).

      (ii) Shown that, in such context, substitution of either the N-terminal or C-terminal basic stretch of amino acids in the NLS is sufficient to impact p63 nuclear import, whereas in the context of the full-length protein, they are not (New Figure 6F-H, and New Supplementary Figure S9).

      (iii) Shown that FLAG-p63 wt could relocalize YFP-p63mNLSbip, but not YFP-p63;DOD;mNLSbip, to the nucleus (New Supplementary Figure S10).

      We believe that these new data further demonstrate the impact of p63 self-association on subcellular localization and strongly support our hypothesis. We greatly thank the Reviewer for their inspiring comment, which led to a significant improvement of our manuscript.

      References

      1. Lotz R, Osterburg C, Chaikuad A, Weber S, Akutsu M, Machel AC, et al. Alternative splicing in the DBD linker region of p63 modulates binding to DNA and iASPP in vitro. Cell Death Dis. 2025;16(1):4.
      2. Ciribilli Y, Monti P, Bisio A, Nguyen HT, Ethayathulla AS, Ramos A, et al. Transactivation specificity is conserved among p53 family proteins and depends on a response element sequence code. Nucleic Acids Res. 2013;41(18):8637-53.
      3. Monti P, Ciribilli Y, Bisio A, Foggetti G, Raimondi I, Campomenosi P, et al. ∆N-P63alpha and TA-P63alpha exhibit intrinsic differences in transactivation specificities that depend on distinct features of DNA target sites. Oncotarget. 2014;5(8):2116-30.
      4. Pitzius S, Osterburg C, Gebel J, Tascher G, Schafer B, Zhou H, et al. TA*p63 and GTAp63 achieve tighter transcriptional regulation in quality control by converting an inhibitory element into an additional transactivation domain. Cell Death Dis. 2019;10(10):686.
      5. Okada C, Yamashita E, Lee SJ, Shibata S, Katahira J, Nakagawa A, et al. A high-resolution structure of the pre-microRNA nuclear export machinery. Science. 2009;326(5957):1275-9.
      6. Kutay U, Lipowsky G, Izaurralde E, Bischoff FR, Schwarzmaier P, Hartmann E, et al. Identification of a tRNA-specific nuclear export receptor. Mol Cell. 1998;1(3):359-69.
      7. Enthart A, Klein C, Dehner A, Coles M, Gemmecker G, Kessler H, et al. Solution structure and binding specificity of the p63 DNA binding domain. Sci Rep. 2016;6:26707.
      8. Deutsch GB, Zielonka EM, Coutandin D, Weber TA, Schafer B, Hannewald J, et al. DNA damage in oocytes induces a switch of the quality control factor TAp63alpha from dimer to tetramer. Cell. 2011;144(4):566-76.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Demarinis et al describe a detailed analysis of different stretches of basic amino acids located between the DBD and the OD of DNp63a to act as nuclear localization signals. They convincingly show that two stretches exist that form a bipartite NLS. They combine both functional import data with structure determination of the NLS sequence with IMP⍺ showing that both parts interact with the major and the minor site. The data are presented well and provide a very good model of how nuclear import is regulated for DNp63a.

      This is a nice study of the bipartite NLS of DNp63a. Most interestingly, the authors show that nuclear import experiments using either the isolated peptide fused to GFP or DNp63a have a different outcome when the individual sequences are mutated. While in the case of the isolated peptide experiments a mutation in either of the two sequences has a measurable effect, this is not the case in the full length DNp63a context. The authors explain this with the oligomeric state of DNp63a, which provides additional sequences from the other monomers within the tetramer, even when one of the NLS sequences is mutated. They provide AlphaFold models to support this explanation. This in trans substitution effect explains why the NLS is not a mutation hotspot for inactivating DNp63a. These results are new and interesting in the context of how DNp63a regulates the development of epithelial tissues.

      Criticism:

      1. The major function of DNp63a seems to be that of a bookmarking factor that ensures the establishment of an epithelial transcriptional program. It is found to bind more to enhancer than to promoter regions. While it might also act for a few genes as a classical transcription factor (K14). this bookmarking and interaction with other transcriptional regulators seems to be its major task. This should be included in the introduction.
      2. "DNp63a can be imported into the nucleus as a dimer" What is the evidence that DNp63a is imported as a dimer and not as a tetramer? Although functional not really relevant, because all conclusions drawn for a dimer are true for a tetramer (such as the mutation compensation), this statement (and others in the text) should either be substantiated or modified.
      3. The explanation for the difference in the sensitivity of mutations in the bipartite NLS in the isolated peptide experiments and experiments with the full length DNp63a is intriguing. Unfortunately it is not based on direct experimental evidence. To prove their model (which is the central claim of this manuscript) they should fuse the bipartite NLS to any dimerization module (e.g. a leucine zipper sequence) and show that by dimerization of the bipartite NLS the same results towards mutations are obtained as for full length DNp63a. This would strongly support their model.

      Significance

      Demarinis et al describe a detailed analysis of different stretches of basic amino acids located between the DBD and the OD of DNp63a to act as nuclear localization signals. They convincingly show that two stretches exist that form a bipartite NLS. They combine both functional import data with structure determination of the NLS sequence with IMP⍺ showing that both parts interact with the major and the minor site. The data are presented well and provide a very good model of how nuclear import is regulated for DNp63a.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      General assessment:

      The authors show a number of generally very solid experiments that consistently support what is stated in the headline and further developed. They use wt and recombinant deltaN63alpha (N63) to sort out a previously published NLS whose inactivation did not lead to preventing nuclear localization of N63. The authors convincingly show that import is governed by a bipartite NLS. The interesting observation is that - when the bipartite stretch is transferred to GFP to drive the import, each motif is required - but the full-length protein tolerates alterations in either motif. The puzzle is solved by further structural analysis of binding of the NLS to importin alpha that shows the bipartite signal to work as expected. However, additional binding studies using BRET demonstrate dimerization that brings two copies of N63 and thus two bipartite signals together that compensate for mutations in one or the other part. Transcriptional activity of p53 can be modulated consistently with nuclear import, i.e. functional NLS motifs.

      The manuscript is overall in a very mature state, and I foresee publication essentially in its present form. A few suggestions may be considered prior to publication:

      1. In immunofluorescence images it is sometimes difficult to see nuclear accumulation. Single channels of the GFP signal may help to make the point.
      2. The binding assays in Fig. 3 would profit from using the most efficient imp a variant together with imp beta to show potential cooperative binding.
      3. wording:

      please mention that NTR can also recognize 3D structures of structural RNAs, e.g. tRNAs or miRNAs

      longer TA isoforms

      homologues or orthologues?

      Significance

      General assessment:

      see above: this is a very consistent and mature study that can be published essentially in its present form.

      Advance:

      Even though the described mechanisms are not novel, they clarify how N63 is imported into human cell nuclei. We now understand this at the level of molecular mechanism and can deduce that the amounts of nuclear N63 are directly linked to its transcriptional response on p53.

      Audience:

      I see that this is interesting to experts in the nucleo-cytoplasmic transport field since it adds a novel aspect how robustness of import via dimerization can be reached. Beyond, the work brings news in translational research for physiology and pathology of epithelial tissue differentiation and homeostasis.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript, authors performed a solid biochemical and biophysical study to identify the nuclear localization signal (NLS) of the transcription factor p63 and its regulatory mechanism. By deletion and mutagenesis experiments, the two partially overlapping NLS were identified. They were shown to have relatively minor consequences for nuclear localization when disrupted individually but the nuclear localization was abolished when both were affected. The nuclear localization was important for transactivation activity but not for dimerization. In addition, authors also performed bioinformatics analysis and showed that sequences of part of these NLS were diverse in p63 in different species. This led to the conclusion that the NLS of p63 is quite robust for nuclear localization, which is not easily affected by sequence divergence. This is important information for the p63 field.

      Major comments

      The biochemical and biophysical experiments performed in this study were well designed, data were clear and the conclusions were well supported by the results. One potential improvement is to check whether NLS could affect the normal activation targets of ΔNp63α, such as KRT14 and other epithelial genes. This could complement the experiments testing the inhibition effect of ΔNp63α on p53-mediated gene activation. This will be interesting, as ΔNp63α is a master regulator in epithelial cells via regulation of diverse epithelial genes.

      Minor comments

      A minor suggestion: authors could consider use p63 rather than ΔNp63α in the manuscript. The heterogenous sequences of NLS regions are relevant for the delta isoform of p63. In addition, all experiments performed in the study are not necessarily specific for the biology of the ΔNp63α isoform, but they are probably informative for all p63 isoforms. Another minor suggestion: As p63 forms a tetramer when binding to DNA sequence for gene regulation, it would be good for authors to speculate the role of NLS and its variations in tetramerization.

      Significance

      In this manuscript, authors performed a biochemical and biophysical study on nuclear localization signal (NLS) of the transcription factor ΔNp63α, a topic that is not yet fully understood. Previous study did not yet provide sufficiently convincing evidence for NLS that is essential for ΔNp63α nuclear localization. Authors also investigated the robustness of the NLS and its function, which provides important information for the field of p63, a key factor in epithelial development and in cancer.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1

      Evidence, reproducibility and clarity

      Authors should be commended for the availability of data/code and detailed methods. Clarity is good. Authors have clearly spent a lot of time thinking about the challenges of metabolomics data analysis.

      Significance

      Schmidt et al. present MetaProViz, a comprehensive and modular platform for metabolomics data analysis. The tool provides a full suite of processing capabilities spanning metabolite annotation, quality control, normalization, differential analysis, integration of prior knowledge, functional enrichment, and visualization. The authors also include example datasets, primarily from renal cancer studies, to demonstrate the functionality of the pipeline. The MetaProViz framework addresses several long-standing challenges in metabolomics data analysis, particularly issues of reproducibility, ambiguous metabolite annotation, and the integration of metabolite features with pathway knowledge. The platform is likely to be a valuable addition for the community, but the reviewer has some comments that need to be addressed prior to publication.

      We thank the reviewer for this positive feedback.

      Comments:

      (1) (Planned)

      The section "Improving the connection between prior knowledge and metabolomics features" could benefit from additional clarification. It is not entirely clear to the reader what specific steps were taken beyond using RaMP-DB to translate metabolite identifiers. For example, how exactly were ambiguous mappings ("different scenarios") handled in practice, and to what extent does this process "fix" or merely flag inconsistencies? A more explicit description or example of how MetaProViz resolves these cases would help readers better understand the improvements claimed.

      We thank the reviewer for pointing this out and we agree that this section requires extension to ensure clarity. Beyond using RaMP-DB, we are characterising the mapping ambiguity (one-to-none, one-to-many, many-to-one, many-to-many) within and across metabolite-sets (i.e. pathways) and return this information to the user together with the translated identifiers. This is important to understand potential inflation/deflation of metabolite-sets that occurs due to the translation. Moreover, we also offer the manually curated amino-acid collection to ensure L-, D- and zwitterion without chirality IDs are assigned for amino acids (Fig. 2b). Ambiguous mappings are handled based on the measured data (Fig. 2e). Indeed, many translation cases that deflate (many-to-one mapping) or inflate (one-to-many mapping) the metabolite-sets are resolved when merging the prior knowledge with actual measured data (e.g. Fig. 2e, one-to-many in scenario 1, which becomes obsolete as only one/none of the many potential metabolite IDs is detected). By sorting each mapping into one of those scenarios, we only flag those cases. The reason for this decision is that in many cases multiple decisions are valid (e.g. Fig. 2e, Scenario 5: here the values of the two detected metabolites could be summed or the metabolite value with the larger Log2FC could be kept) and it should really be up to the user to make those decisions based on their knowledge of the biological system and the analytical LC-MS method used.

      Since these points have not been clear enough, we will add a more explicit description to the results section by showcasing more details on how we exactly tackled this problem in the ccRCC example data. This has also been suggested by Reviewer 3 (Minor Comment 7 and 8), so feel free to also see the responses below.
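
      The four mapping classes named above (one-to-none, one-to-many, many-to-one, many-to-many) can be illustrated with a minimal Python sketch. This is not the MetaProViz implementation or API; the function name `classify_mappings`, the input layout (a list of source/target ID pairs) and the placeholder IDs are assumptions made purely for illustration.

```python
from collections import defaultdict


def classify_mappings(pairs, source_ids):
    """Label every source ID by the cardinality of its translation.

    pairs      -- iterable of (source_id, target_id) translation pairs
    source_ids -- all source IDs of the metabolite set, including
                  those for which no translation exists
    (Hypothetical helper, not part of MetaProViz.)
    """
    targets_of = defaultdict(set)
    sources_of = defaultdict(set)
    for src, tgt in pairs:
        targets_of[src].add(tgt)
        sources_of[tgt].add(src)

    labels = {}
    for src in source_ids:
        targets = targets_of.get(src, set())
        if not targets:
            labels[src] = "one-to-none"
            continue
        many_targets = len(targets) > 1
        # A target is "shared" when more than one source ID maps to it.
        shared = any(len(sources_of[t]) > 1 for t in targets)
        if many_targets and shared:
            labels[src] = "many-to-many"
        elif many_targets:
            labels[src] = "one-to-many"
        elif shared:
            labels[src] = "many-to-one"
        else:
            labels[src] = "one-to-one"
    return labels


# Placeholder IDs only: two alanine forms collapsing onto one entry,
# one ID expanding to two entries, and one ID without translation.
example_pairs = [("L-Ala", "Ala"), ("D-Ala", "Ala"),
                 ("X", "X1"), ("X", "X2"), ("Y", "Y1")]
example_labels = classify_mappings(example_pairs,
                                   ["L-Ala", "D-Ala", "X", "Y", "Z"])
```

      In this sketch the labels are only reported, mirroring the flag-rather-than-fix approach described in the response; deciding what to do with a flagged case is left to the user.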

      (2) (Planned)

      The introduction of MetSigDB is intriguing, but its construction and added value are not sufficiently described. It would be helpful to clarify what specific advantages MetSigDB provides over directly using existing pathway resources such as KEGG, Reactome, or WikiPathways. For example, how many features, interactions, or metabolite-set relationships are included, and in what way are these pathways improved or extended compared to those already available in public databases?

      We thank the reviewer for this valuable comment and we apologise that this was not described sufficiently. One of the major advantages is that all the resources are available in one place following the same table format without the need to visit the different original resources and perform data wrangling prior to enrichment analysis. In addition, where applicable, we have removed metabolites that are not detectable by LC-MS (i.e. ions, H2O, CO2) to circumvent pathway inflation with features that are never within the data and hence impacting the statistical testing in enrichment analysis workflows.

      During the revision, we will compile an Extended Data Table listing all the resources present in MetSigDB, their number of features and interactions. We will also extend the methods section "Prior Knowledge access" about MetSigDB and how we removed metabolites.
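
      As a rough illustration of the filtering step described above, the following Python sketch removes entries that are assumed not to be detectable by LC-MS from each metabolite set. The exclusion list contains only the two named examples from the response (H2O, CO2); the function name `prune_sets`, the pathway name and the data layout are hypothetical and do not reflect how MetSigDB is actually built.

```python
# Assumed exclusion list, seeded with the examples named in the response.
NOT_DETECTABLE = {"H2O", "CO2"}


def prune_sets(metabolite_sets, excluded=NOT_DETECTABLE):
    """Return a copy of the metabolite sets without excluded members.

    metabolite_sets -- dict mapping a set (pathway) name to a list of
                       metabolite names
    (Hypothetical helper, not part of MetaProViz or MetSigDB.)
    """
    return {
        name: [m for m in members if m not in excluded]
        for name, members in metabolite_sets.items()
    }


# Toy pathway: its size drops from 4 to 2, which changes the set size
# that enters any over-representation statistic.
toy_sets = {"toy_pathway": ["H2O", "alanine", "CO2", "pyruvate"]}
pruned = prune_sets(toy_sets)
```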

      (3)

      Figure 1D/1E: The reviewer appreciates the inclusion of the visualizations illustrating the different mapping scenarios, as these effectively convey the complexity of metabolite ID translation. However, it took some time to interpret what each scenario represented. It would be helpful to include brief annotations or explanatory text directly on the figures to clarify what each scenario depicts and how it relates to the underlying issue being addressed.

      *We think the reviewer refers to Fig. 2D/E and we acknowledge that this is a complex problem we try to convey. We received a similar comment from Reviewer 2 (Minor Comment 1), who asked to extend the figure legend description of what the different scenarios display.*

      We have extended the figure legend and specifically explained each displayed case and its meaning (Line 222-242):

      "d-e) Schematics of possible mapping cases between metabolite IDs (= each circle corresponds to one ID) of a pathway-metabolite set (e.g. KEGG) to metabolites IDs of a different database (e.g. HMDB) with (d) showing many-to-many mappings that can occur within and across pathway-metabolite sets and (e) additionally showing the mapping to metabolite IDs that were assigned to the detected peaks within and across pathway-metabolite sets. __(d)__ Translating the metabolite IDs of a pathway-metabolite set can lead to special cases such as many-to-one mappings (Pathway 1), where for example the original resource used the ID for L-Alanine (Pathway 1, green) and D-Alanine (Pathway 1, yellow) in the amino-acid pathway, whilst the translated resources only has an entry for Alanine zwitterion (Pathway 1, blue). Additionally, many-to-one mappings can also occur across pathways (Pathway 2-4), where this mapping is only detected when mappings are analysed taking all pathways into account. Both of these cases deflate the pathways, which can also happen for one-to-none mappings (Pathway 1, white). There are also cases that inflate the pathway such as one-to-many mappings (e.g. Pathway 2-4, orange mapping to pink and violet). __(e)__ Showcasing the different scenarios when merging measured data (detected) based on the translated metabolites within pathways (scenario 1-5) and across pathways (scenario 6-8) highlighting problematic scenarios (4-7) that require further actions. Unproblematic scenarios (1-3 and 8) can include special cases between original and translated (i.e. one-to-many in scenario 1), which become obsolete as only one/none of the many potential metabolite IDs is detected. Yet, if multiple metabolites are detected action is required (scenario 5), which can include building the sum of the multiple detected features or only keeping the one with the highest Log2FC between two conditions. Other special cases between original and translated (i.e. many-to-one in scenario 4 and 6) also depend on what has been mapped to the measured features. If features have been measured in those scenarios, pathway deflation (i.e. only one original entry remains) or measured feature duplication (the same measurement is mapped to many features in the prior knowledge) are the possible results within and across pathways. Those scenarios should be addressed on a case-by-case basis as they also require biological information to be taken into account."

      We have also rearranged the Scenarios in Fig. 2e. We hope that together with the extended figure legend this is now clear.
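
      The two options mentioned for scenario 5 (summing the detected features, or keeping the one with the highest Log2FC) can be sketched in Python as follows. This is an illustration under stated assumptions, not MetaProViz code: the function name, the dictionary keys and the use of the absolute Log2FC as the ranking criterion are all choices made here, since the legend does not say whether the sign is taken into account.

```python
def resolve_multiple_detected(features, strategy="sum"):
    """Collapse several detected features that map to one prior-knowledge entry.

    features -- list of dicts with keys 'id', 'value' and 'log2fc'
    strategy -- 'sum' adds up the values; 'max_log2fc' keeps the feature
                with the largest absolute Log2FC (assumption, see lead-in)
    (Hypothetical helper, not part of MetaProViz.)
    """
    if not features:
        raise ValueError("at least one detected feature is required")
    if strategy == "sum":
        return {
            "id": "+".join(f["id"] for f in features),
            "value": sum(f["value"] for f in features),
        }
    if strategy == "max_log2fc":
        return max(features, key=lambda f: abs(f["log2fc"]))
    raise ValueError(f"unknown strategy: {strategy}")


# Two placeholder features detected for the same translated entry.
detected = [
    {"id": "feat_a", "value": 2.0, "log2fc": 0.5},
    {"id": "feat_b", "value": 3.0, "log2fc": -1.5},
]
summed = resolve_multiple_detected(detected, "sum")
kept = resolve_multiple_detected(detected, "max_log2fc")
```

      Which strategy is appropriate depends on the biology and the analytical method, which is why the response leaves this choice to the user.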

      (4) (Planned)

      "By assigning other potential metabolite IDs and by translating between the present ID types, we not only increase the number of features within all ID types but also increase the feature space with HMDB and KEGG IDs (Fig. 2a, right, SFig. 2 and Supplementary Table 1)". The reviewer would appreciate additional clarification on how this was done. It is not clear what specific steps or criteria were used to assign additional metabolite IDs or to translate between identifier types. The reviewer also appreciates the inclusion of the UpSet plots. However, simply having the plots side-by-side makes it difficult to determine the specific differences. An alternative visualization, such as stacked bar plots, scatter plots summarizing the changes in feature counts, or other representation that more clearly highlights the deltas, might make these results easier to interpret.

      The main Fig. 2a shows the original (left) metabolite ID availability per detected metabolite feature in the ccRCC data and the adapted (right) metabolite IDs. The individual steps taken to extend the metabolite ID coverage of the measured features and obtain Fig. 2a (right) are shown in SFig. 2 for HMDB (SFig. 2a) and KEGG (SFig. 2b). We did not include the plots for the PubChem IDs as they follow the same principle. The individual steps we are showcasing with SFig. 2 are (I) how many of the detected features (577) have an HMDB ID (341, red bar + grey bar), (II) how this distribution changed after equivalent amino-acid IDs were added, which does not change the number of features with an HMDB ID, but the number of features with a single HMDB ID, and (III) how this distribution changed after translating from the other available ID types (KEGG and PubChem) to HMDB IDs using RaMP-DB's knowledge, which leads to 430 detected features with one or multiple HMDB IDs. The exact numbers can be extracted from Supplementary Table 1, Sheet "Feature metadata", where for example N-methylglutamate had no HMDB ID assigned in the original publication (see column HMDB_Original), yet by translating to HMDB from KEGG (see column hmdb_from_kegg) and PubChem (see column hmdb_from_pubchem) we obtain in both cases the same HMDB ID "HMDB0062660". In order to clarify this in the manuscript, we have extended the figure legend of SFig. 2: "a-b) Bargraphs showing the frequency at which a certain number of metabolite IDs per integrated peak are available as per ccRCC patients feature metadata provided in the original publication (left), after potential equivalent IDs for amino-acid and amino-acid-related features were assigned (middle), which increases the number of features with multiple IDs (middle: grey bars), and after IDs were translated from the other available ID types (right). __a)__ Of 577 detected features, 341 had at least one HMDB ID assigned (left graph, red + grey bar) according to the original publication (left). Translating from KEGG-to-HMDB and from PubChem-to-HMDB increased the number of features with an HMDB ID from 341 to 430 (left). __b)__ Of 577 detected features, 306 had at least one KEGG ID assigned (left graph, red + grey bar) according to the original publication (left). Translating from HMDB-to-KEGG and from PubChem-to-KEGG did not increase the total number of features with a KEGG ID (left)."

      We like the suggestion of the reviewer to provide representations of the deltas and will add additional plots to SFig. 2 as part of our planned revision.
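
      The coverage gain described above (features with an HMDB ID before and after translation from the other ID types) can be tallied with a short Python sketch. The column names HMDB_Original, hmdb_from_kegg and hmdb_from_pubchem and the N-methylglutamate example are taken from the response; the second feature, its placeholder ID, the list-of-dicts layout and the function names are assumptions for illustration and do not reproduce the MetaProViz code.

```python
def merge_translated_ids(feature, original_col, translated_cols):
    """Union of the original IDs and all IDs obtained by translation."""
    merged = set(feature.get(original_col) or ())
    for col in translated_cols:
        merged |= set(feature.get(col) or ())
    return merged


def coverage(features, original_col, translated_cols):
    """Count features with at least one ID before and after translation."""
    before = sum(1 for f in features if f.get(original_col))
    after = sum(
        1 for f in features
        if merge_translated_ids(f, original_col, translated_cols)
    )
    return before, after


features = [
    {   # Example named in the response: no original HMDB ID, but both
        # translations return the same ID.
        "name": "N-methylglutamate",
        "HMDB_Original": [],
        "hmdb_from_kegg": ["HMDB0062660"],
        "hmdb_from_pubchem": ["HMDB0062660"],
    },
    {   # Hypothetical feature that already had an ID (placeholder value).
        "name": "hypothetical_feature",
        "HMDB_Original": ["PLACEHOLDER_ID"],
        "hmdb_from_kegg": [],
        "hmdb_from_pubchem": [],
    },
]
translated = ["hmdb_from_kegg", "hmdb_from_pubchem"]
before, after = coverage(features, "HMDB_Original", translated)
```

      Reporting `after - before` per ID type would give the kind of delta the reviewer asked for, independent of how it is eventually plotted.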

      (5) (Planned)

      MetaboAnalyst is mentioned several times in the manuscript. The reviewer is familiar with some of the limitations and practical challenges associated with using MetaboAnalyst and its R package. Given that MetaboAnalyst already offers some overlapping functionality with MetaProViz (and offers it in the form of an interactive website and a sometimes functional R package), a more explicit comparison between the two tools would help readers fully understand the unique advantages and improvements provided by MetaProViz.

      This is a good point the reviewer raises. As part of the revisions, we plan to create a supplementary data table that includes both tools and their respective features. We will refer to this table within the manuscript text.

      (6)

      Page 11: The authors state that they used limma for statistical testing, including for the analysis of exometabolomics data, where the values appear to represent log2-transformed distances or ratios rather than normally distributed intensities. Since limma assumes approximately normal residuals, please provide evidence or justification that this assumption holds for these data types. If the distributions deviate substantially from normality, a non-parametric alternative might be more appropriate.

      For exometabolomics data we use data normalised to media blank and growth factor (formula (1)). Limma is run on those data, not on the log2-transformed distances. The Log2(Distance) is calculated separately from the statistical results using the normalised exometabolomics data. In addition, we always perform the Shapiro-Wilk test on each metabolite as part of the MetaProViz differential analysis function to understand the distribution. In this particular case we have the following distributions:

      | Cell line | Metabolites normal distribution [%] | Metabolites not-normal distribution [%] |
      |---|---|---|
      | HK2 | 82.35 | 17.65 |
      | 786-O | 95.71 | 4.29 |
      | 786-M1A | 97.14 | 2.86 |
      | 786-M2A | 88.57 | 11.43 |
      | OSRC2 | 92.86 | 7.14 |
      | OSLM1B | 85.71 | 14.29 |
      | RFX631 | 97.14 | 2.86 |

      If a user has distributions that deviate substantially from normality, non-parametric alternatives are also available in MetaProViz (see methods section for all options).

      7)

      Page 13: why were young and old defined this way? Authors should provide their reasoning and/or citations for this grouping.

      We thank the reviewer for pointing this out. Our choice of age groups is based purely on the literature:

      First, ccRCC can be sporadic (>96%) or familial (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3308682/pdf/nihms362390.pdf). This was also observed in other cohorts, where of 1233 patients only 93 (7.5%) were under 40 years of age, whilst 1140 (92.5%) were older than 40 years (https://www.europeanurology.com/article/S0302-2838(06)01316-9/fulltext). Second, given the high frequency of sporadic cases it is unsurprising that ccRCC incidences were found to peak in patients aged 60 to 79 years with more male than female incidences (https://journals.lww.com/md-journal/Fulltext/2019/08020/Frequency,_incidence_and_survival_outcomes_of.49.aspx). Third, it was shown that sex impacts renal cancer-specific mortality and that this effect is modified by age, which is a proxy for hormonal status with the premenopausal period below 42 years and the postmenopausal period above 58 years (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4361860/pdf/srep09160.pdf). Putting all of this information together, we decided on our age groups of young (<42 years) and old (>58 years) following the hormonal period in order to account for sex impact. Additionally, our young age group is representative of the age of familial ccRCC, whilst our old age group summarises the age group where incidences were found to peak.

      To make this clear in the manuscript we have extended the method section of the manuscript (Line 547-548):

      "For the patient's ccRCC data, we compared tumour versus normal of two patient subsets, "young" (<42 years) and "old" (>58 years)."

      (8)

      Figure 4e: It may help with interpretation to have these Sankey-like graph edges be proportional to the number of metabolites.

      We thank the reviewer for this suggestion, which we also pondered. When we tested this visualisation, the plot became convoluted, hard to interpret and not all potential flows exist in the data. This is why we have opted to create an overview graph of each potential flow, with each edge representing a potentially existing flow. The number of times a flow exists is shown in Fig. 4f.

      (9)

      Figure 4h: The values appear to be on an intensity scale (e.g., on the order of 3e10), yet some of them are negative, which would not be expected for raw or log-transformed mass spectrometry intensities. It is unclear whether these represent normalized abundance values, distances, or some other transformation. In addition, for the comparison of tumour versus normal tissue, it is not specified what statistical test was applied. Since mass spectrometry data are typically log2-transformed to approximate a log-normal distribution before performing t-tests or similar parametric methods, clarification is needed on how these data were processed.

      Thanks for pointing this out; it made us realize that we need to extend our figure legend for Fig. 4h for clarity (Line 343-345). In both cases we show normalized intensities following the workflow described in Fig. 3a. In the case of the left graph labelled "CoRe", we are plotting an exometabolomics experiment, where the data were additionally normalised using both media blanks (samples in which no cells were cultured) and growth factor (accounts for cell growth during the experiment), as growth rate (accounts for variations in cell proliferation) was not available (see also formula (1) in the methods section). A result has a negative value if the metabolite has been consumed from the media, or a positive value if the metabolite has been released from the cell into the culture media.
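
      The sign convention described above can be illustrated with a minimal sketch. It is not MetaProViz code, and since formula (1) is not reproduced in this reply, the form used here (subtract the mean media blank, then divide by the growth factor) is an assumption; the function name and the intensities are hypothetical.

```python
from statistics import mean


def consumption_release(sample_intensities, blank_intensities, growth_factors):
    """Blank-subtract each sample and scale by its growth factor.

    Assumed form of the normalisation: (sample - mean(blanks)) / growth_factor.
    """
    blank_mean = mean(blank_intensities)
    return [
        (intensity - blank_mean) / growth_factor
        for intensity, growth_factor in zip(sample_intensities, growth_factors)
    ]


# Hypothetical intensities for one metabolite, not taken from the manuscript.
core_values = consumption_release(
    sample_intensities=[8.0e9, 1.4e10],
    blank_intensities=[1.0e10, 1.2e10],
    growth_factors=[2.0, 1.0],
)

# Negative values mean uptake from the media, positive values mean release.
status = ["consumed" if value < 0 else "released" for value in core_values]
```

      Under this assumed form, a sample whose intensity lies below the media blank yields a negative value, which is why intensity-scale values in Fig. 4h can be negative.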

      In addition, the reviewer refers to the comparison of tumour versus normal (Fig. 4a and 4d) and the missing description of the chosen statistical test. We have added the details to the figure legend (Lines 334 and 345).

      Adapted legend Fig. 4: "a) Differential metabolite analysis results for exometabolomics data comparing 786-O versus HK2 cells using ANOVA and false discovery rate (FDR) for p-value adjustment. b) Heatmap of mean consumption-release of the measured metabolites across cell lines. c) Heatmap of normalised ccRCC cell line exometabolomics data for the selected metabolites of amino acid metabolism for a sample subset. d) Differential metabolite analysis results for intracellular data comparing 786-O versus HK2 cells using ANOVA and false discovery rate (FDR) for p-value adjustment. e) Schematics of bioRCM process to integrate exometabolomics with intracellular metabolomics and f) number of metabolites by their combined change patterns in intracellular- and exometabolomics in 786-M1A versus HK2. g) Heatmap of the metabolite abundances in the "Both_DOWN (Released/Consumed)" cluster. h) Bar graphs of normalised methionine intensity for exometabolomics (CoRe: negative value, if the metabolite has been consumed from the media, or a positive value, if the metabolite has been released from the cell into the culture media) and intracellular metabolomics (Intra)."


      (10)

      Figure 5: "Tukey's p.adj

      We thank the reviewer for pointing this out. We have used the TukeyHSD (Tukey's Honestly Significant Difference) test in R on the ANOVA results. We have added more details into the figure legend (Line 384): "(Tukey's post-hoc test after ANOVA p.adj

      (11)

      The potential for multi-omics is mentioned. Please clarify how generalizable this framework is. Can it readily accommodate transcriptomics, proteomics, or fluxomics data, or does it require custom logic or formatting for each new data type?

      Thanks for raising this question. MetaProViz can readily accommodate transcriptomics and proteomics data for combined enrichment analysis using, for example, MetalinksDB metabolite-receptor pairs. Yet, MetaProViz does not support modelling fluxomics data into metabolic networks. We state in the discussion that this could be a future development ("Beyond current capabilities, future developments could also incorporate mechanistic modeling to capture metabolic fluxes, subcellular compartmentalization, enzyme kinetics, regulatory feedback loops, and thermodynamic constraints to dissect metabolic response under perturbations."). To clarify the availability of multi-omics integration for combined enrichment analysis, we have added some more details into the discussion section.

      Line 467-469: "In addition, providing knowledge of receptor-, transporter- and enzyme-metabolite pairs, MetaProViz can readily accommodate transcriptomics and proteomics data for combined enrichment analysis."

      (12)

      Please clarify if/how enrichment analyses account for varying set sizes and redundant metabolite memberships across pathways, which can bias over-representation analysis results.

      This is a very relevant point, which we have already been working on. Indeed, we agree that the results of enrichment analyses can be biased due to varying set sizes and redundant metabolite memberships across pathways. MetaProViz explicitly accounts for varying set sizes when running over-representation analysis (functions standard_ora() and cluster_ora()), which uses a model that computes the p-value under a hypergeometric distribution. Thereby, larger pathways are penalized unless the overlap is proportionally large, while smaller pathways can be significant with fewer overlaps. Hence, the test quantifies whether the observed overlap between the query set and a pathway is larger than would be expected under random sampling. In addition, we explicitly filter by gene‑set size using min_gssize/max_gssize, which further controls for extremely small or large sets. So both the statistical test itself and the size filters incorporate gene‑set size variation.
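
      How set size enters a hypergeometric over-representation test can be sketched in a few lines. This is a generic illustration, not the implementation behind standard_ora() or cluster_ora(); the function names, the default size limits and the counts are hypothetical.

```python
from math import comb


def ora_p_value(n_background, n_pathway, n_query, n_overlap):
    """Upper-tail hypergeometric p-value: P(overlap >= n_overlap)."""
    largest_possible = min(n_pathway, n_query)
    total = comb(n_background, n_query)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_query - k)
        for k in range(n_overlap, largest_possible + 1)
    )
    return tail / total


def passes_size_filter(n_pathway, min_size=10, max_size=1000):
    """Size filter in the spirit of min_gssize/max_gssize (limits are made up)."""
    return min_size <= n_pathway <= max_size


# Same overlap of 3 metabolites, but pathways of very different size.
p_small_pathway = ora_p_value(n_background=500, n_pathway=10, n_query=20, n_overlap=3)
p_large_pathway = ora_p_value(n_background=500, n_pathway=200, n_query=20, n_overlap=3)
```

      With identical overlap, the small pathway receives the much smaller p-value, because three hits in a 200-member pathway are fewer than expected under random sampling.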

      Regarding the redundant metabolite-set (i.e. pathways) memberships, we have now implemented a new function (cluster_pk()) to cluster metabolite-sets like pathways based on overlapping metabolites. Thereby we allow investigation of enrichment results in regard to redundancy and similarity. For given metabolite-sets, the function calculates pathway similarities via either overlap- or correlation-based metrics. After optional thresholding to remove weak similarities, we implemented three clustering algorithms (connected-components clustering, Louvain community detection and hierarchical clustering) to group similar pathways. We then visualize the clustering results as a network graph using the new function viz_graph based on igraph. We have added all information into our methods section "Metabolite-set clustering" (Lines 656-671). In addition, we have also added the results of the clustering into Fig. 5f.
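
      The overlap-based route with connected-components clustering can be illustrated as follows. This is a sketch of the general technique only, not the code of cluster_pk(); the overlap coefficient, the 0.5 threshold and the toy pathway contents are assumptions chosen for the example, and pathways are assumed to be non-empty.

```python
from itertools import combinations


def overlap_coefficient(set_a, set_b):
    """Shared members divided by the size of the smaller set."""
    return len(set_a & set_b) / min(len(set_a), len(set_b))


def cluster_pathways(pathways, threshold=0.5):
    """Connected components of the graph linking sufficiently similar pathways."""
    names = list(pathways)
    neighbours = {name: set() for name in names}
    for a, b in combinations(names, 2):
        if overlap_coefficient(pathways[a], pathways[b]) >= threshold:
            neighbours[a].add(b)
            neighbours[b].add(a)

    clusters, seen = [], set()
    for name in names:
        if name in seen:
            continue
        stack, component = [name], set()
        while stack:  # depth-first walk over one component
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbours[current] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Toy metabolite-sets, for illustration only.
pathways = {
    "Alanine metabolism": {"alanine", "pyruvate", "glutamate"},
    "Glutamate metabolism": {"glutamate", "pyruvate", "glutamine", "2-oxoglutarate"},
    "Methionine cycle": {"methionine", "SAM", "homocysteine"},
}
clusters = cluster_pathways(pathways)
```

      Pathways that land in the same cluster share many metabolites, so their enrichment results should be read as one signal rather than as independent findings.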

      New Fig. 5f:"f) *Network graph of top enriched pathways (p.adjusted

      Reviewer #2

      Evidence, reproducibility and clarity

      Schmidt et al report the development of MetaProViz, an integrated R package to process, analyze and visualize metabolomics data, including integration with prior knowledge. The authors then go on to demonstrate utility by analyzing several metabolomes of cell lines, media and patient samples from kidney cancer. The manuscript provides a concise description of key challenges in metabolomics that the authors identify and address in their software. The examples are helpful and illustrative, although I should point out that I lack the expertise to evaluate the R package itself. I only have a few very minor comments.

      Significance

      This is a very significant advance from one of the leading groups in the field that is likely to enhance metabolomics data analysis in the wider community.

      We thank the reviewer for this positive feedback on our package. We appreciate that there are no major comments from the reviewer.

      Minor comments:

      (1)

      Figure 2D, E: While the schematics are fairly intuitive, a brief figure legend description of what the different scenarios etc. represent would make this easier to grasp.

      We thank the reviewer for pointing this out and we acknowledge that this is a complex problem we try to convey. We received a similar comment from Reviewer 1 (Comment 3), so please see the extensive response there. In brief, we have extended the figure legend and specifically explained each displayed case and its meaning (Line 222-242) and extended the Figure itself by adding additional categories to Fig. 2e.

      Extended legend Fig. 2 d-e: "d-e) Schematics of possible mapping cases between metabolite IDs (= each circle corresponds to one ID) of a pathway-metabolite set (e.g. KEGG) to metabolite IDs of a different database (e.g. HMDB) with (d) showing many-to-many mappings that can occur within and across pathway-metabolite sets and (e) additionally showing the mapping to metabolite IDs that were assigned to the detected peaks within and across pathway-metabolite sets. (d) Translating the metabolite IDs of a pathway-metabolite set can lead to special cases such as many-to-one mappings (Pathway 1), where for example the original resource used the ID for L-Alanine (Pathway 1, green) and D-Alanine (Pathway 1, yellow) in the amino-acid pathway, whilst the translated resource only has an entry for Alanine zwitterion (Pathway 1, blue). Additionally, many-to-one mappings can also occur across pathways (Pathway 2-4), where this mapping is only detected when mappings are analysed taking all pathways into account. Both of these cases deflate the pathways, which can also happen for one-to-none mappings (Pathway 1, white). There are also cases that inflate the pathway such as one-to-many mappings (e.g. Pathway 2-4, orange mapping to pink and violet). (e) Showcasing the different scenarios when merging measured data (detected) based on the translated metabolites within pathways (scenario 1-5) and across pathways (scenario 6-8), highlighting problematic scenarios (4-7) that require further actions. Unproblematic scenarios (1-3 and 8) can include special cases between original and translated (i.e. one-to-many in scenario 1), which become obsolete as only one/none of the many potential metabolite IDs is detected. Yet, if multiple metabolites are detected, action is required (scenario 5), which can include building the sum of the multiple detected features or only keeping the one with the highest Log2FC between two conditions. Other special cases between original and translated (i.e. many-to-one in scenario 4 and 6) also depend on what has been mapped to the measured features. If features have been measured in those scenarios, pathway deflation (i.e. only one original entry remains) or measured feature duplication (the same measurement is mapped to many features in the prior knowledge) are the possible results within and across pathways. Those scenarios should be addressed on a case-by-case basis as they also require biological information to be taken into account."
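
      The mapping cases named in the legend (one-to-none, one-to-many, many-to-one, many-to-many) can be detected from a plain translation table. The following sketch is illustrative only and does not reproduce MetaProViz functions; the function name is hypothetical and the IDs are placeholders rather than real database identifiers.

```python
from collections import defaultdict


def classify_mappings(pairs):
    """Label each source ID by the cardinality of its translation.

    pairs: iterable of (source_id, target_id); target_id is None when the
    source ID has no counterpart in the target database.
    """
    targets_of = defaultdict(set)
    sources_of = defaultdict(set)
    for source, target in pairs:
        targets = targets_of[source]  # registers the source even without a target
        if target is not None:
            targets.add(target)
            sources_of[target].add(source)

    labels = {}
    for source, targets in targets_of.items():
        if not targets:
            labels[source] = "one-to-none"
            continue
        many_targets = len(targets) > 1
        many_sources = any(len(sources_of[t]) > 1 for t in targets)
        if many_targets and many_sources:
            labels[source] = "many-to-many"
        elif many_targets:
            labels[source] = "one-to-many"
        elif many_sources:
            labels[source] = "many-to-one"
        else:
            labels[source] = "one-to-one"
    return labels


# Placeholder IDs standing in for entries of two databases.
translation = [
    ("src:L-Alanine", "tgt:Alanine"),
    ("src:D-Alanine", "tgt:Alanine"),
    ("src:MetaboliteX", None),
    ("src:MetaboliteY", "tgt:Y-form1"),
    ("src:MetaboliteY", "tgt:Y-form2"),
    ("src:MetaboliteZ", "tgt:MetaboliteZ"),
]
labels = classify_mappings(translation)
```

      Running the classification per pathway and again over all pathways together separates the within-pathway cases from those that only appear across pathways.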

      (2) Fig. 4: The authors briefly state that they integrate prior knowledge to identify the changes in methionine metabolism in kidney cancer, but it is not clear how exactly they contribute to this conclusion. It could be helpful to expand a bit on this to better illustrate how MetaProViz can be used to integrate prior knowledge into the analysis workflow.

      We think the reviewer refers to this section in the text (Line 363-370):

      "Next, we focused on the cluster "Both_DOWN (Released-Consumed)" and found that several amino acids are consumed by the ccRCC cell line 786-M1A but released by healthy HK2 cells. At the same time, intracellular levels are significantly lower than in HK2 (Log2FC = -0.9, p.adj = 4.4e-5) (Fig. 4g). To explore the role of these metabolites in signaling, we queried the prior knowledge resource MetalinksDB, which includes metabolite-receptor, metabolite-transporter and metabolite-enzyme relationships, for their known upstream and downstream protein interactors for the measured metabolites (Supplementary Table 5). This approach is especially valuable for exometabolomics, as it allows us to generate hypotheses about cell-cell communication. Notably, we identified links involving methionine (Fig. 4h), enzymes such as BHMT, and transporters such as SLC43A2 that were previously shown to be important in ccRCC25,42 (Supplementary Table 5)."

      We have now extended this part to clearly state that here MetalinksDB is the prior knowledge resource we used to identify the links for methionine (Line 363-364). In addition we have extended our summary statement to ensure clarity for the reader that we combine the biological clustering, which revealed the amino acid changes, with prior knowledge for the mechanistic insight (Line 380-381):

      "In summary, calculating consumption-release and combining it with intracellular metabolomics via biological regulated clustering reveals metabolites of interest. Further combining these results with prior knowledge using the MetaProViz toolkit facilitates biological interpretation of the data."

      (3)

      Given the functional diversity among metabolites -central to diverse pathways, are key signaling molecules, restricted functions, co-variation within a pathway - I wonder how informative approaches such as PCA or enrichment analyses are for identifying metabolic drivers of a (patho)physiological state. To some extent, this can be addressed by integrating prior knowledge, and it would be helpful if the authors could comment on (and if applicable explain) whether/how this is integrated into MetaProViz.

      The reviewer is correct about the functional diversity of metabolites, which is also why prior knowledge is needed to add mechanistic interpretation to the findings from the metadata analysis (as we showcased by focusing on the separation of age (Fig. 5c-d)). We think that approaches such as PCA or enrichment can be helpful, even if admittedly limited. For example, in the metadata analysis presented in Fig. 5b and the subsequent enrichment analysis presented in Fig. 5, we used PCA to extract the eigenvectors and the loadings, which act as weights indicating the contribution of each original metabolite to the separation on that specific principal component. Hence, the eigenvector of a PC shows the metabolite drivers of the separation. This does not necessarily mean that those metabolites are drivers of a (patho)physiological state: the (patho)physiological state can equally be the reason for those metabolites driving the separation on the eigenvectors. Thus, the metadata analysis presented in Fig. 5b enables us to extract the metadata variables, i.e. the (patho)physiological states, that are separated on a PC together with the explained variance. This can also lead to co-variation, when multiple (patho)physiological states are separated on the same PC, as the reviewer correctly points out. Regarding the enrichment analysis, we provide different types of prior knowledge for classical mapping, but also the prior knowledge we used to create the biological regulated clustering, which together help to identify key metabolic groups as we can first cluster the metabolites and afterwards perform functional enrichment. Yet, this does not account for the technical issues of enrichment analysis. In this context, multi-omics integration building metabolic-centric networks could further elucidate the diversity of metabolic pathways and their connection to signalling and co-variation, yet this is not within the scope of MetaProViz. To sum up, we are aware of the limitations of this analysis and the constraints on the downstream interpretation.

      To capture the functional diversity amongst metabolites, which leads to metabolites being present in multiple pathways of metabolite-pathway sets, we have implemented a new function to cluster metabolite-sets like pathways based on overlapping metabolites and visualize redundant metabolite-set (i.e. pathway) memberships (Fig. 5f). For more details also see our response to Reviewer 1, Comment 12. We hope this will help to avoid mis- and over-interpretation of the enrichment results.

      In addition, we have extended the text to include the analysis pitfalls explicitly (Line 416-419): "Another variable explaining the same amount of variance in PC1 is the tumour stage, which could point to adjacent normal tissue metabolic rewiring that happens in relation to stage and showcases that biological data harbour co-variations, which cannot be disentangled by this method."

      Reviewer #3

      Evidence, reproducibility and clarity

      This manuscript introduces an R package MetaProViz for metabolomics data analysis (post-annotation), aiming to solve a poor-analysis-choices problem and enable more people to do the analysis. MetaProViz not only guides people to select the best statistical method, but also enables them to solve previously unsolved problems: e.g. multiple and variable metabolite names in different databases and their connections to prior knowledge. They also created exometabolomics analysis and the needed steps to visualise intra-cell / media processes. The authors demonstrated their new package via a kidney cancer (clear-cell renal cell carcinoma) dataset, stepping one step closer to improving biological interpretability of omics data analysis.

      Significance

      This is a great tool and I can't wait to use it on many upcoming metabolomics projects! Authors tackle multiple ongoing issues within the field: from poor selection of statistical methods (they provide guidance or have default safer options) to the messiness of data annotation between databases and improving data interpretability. The field is still evolving quickly, and it's impossible to solve all problems with one package; thus some limitations within the package could be seen as a bit rigid. Nonetheless, this fully steps toward filling an existing methodological gap. All bioinformaticians doing metabolomic analysis, or those learning how to do it, will greatly benefit from this knowledge.

      I myself lead a team of 6 bioinformaticians, and we do analysis for researchers, clinicians, drug discovery, and various companies. We run internal metabolomics pipelines every day and fully sympathise with the problems addressed by the authors.

      Major comments affecting conclusions

      none.

      We thank the reviewer for this positive feedback on evidence, reproducibility and clarity as well as significance of our work, given the reviewer's stated experience with metabolomics data analysis. We appreciate that there are no major comments from the reviewer.

      Minor comments

      Minor comments, important issues that could be addressed and possibly improve the clarity or general presentation of the tool. Please see all below.

      (1)

      1- You start with separating and talking about metabolomics and lipidomics, but lipidomics quickly disappears (especially beyond abstract/intro) - no real need to discuss lipidomics.

      Thanks, that's a good note and we have removed it from the abstract and introduction.

      (2)

      2- You refer to the MetImp4 imputation web tool, but I cannot find an active website, manuscript, or R package for it, and the cited link does not load. This raises doubts about whether the tool is currently usable. Additionally, imputation choice should be guided by biological context and study design, not just by testing a few methods and selecting the one that performs best.

      We fully agree with the reviewer on imputation handling. The manuscript we cite from Wei et al. (https://doi.org/10.1038/s41598-017-19120-0) compared a multitude of missing value imputation methods and made this comparison strategy available as a web-based tool, not as a code-based package such as an R package. Yet, the reviewer is right, the web-tool is no longer reachable. Hence, we have adapted the statement in our introduction (Line 61-62): "Moreover, there are tools that focus on specific steps of the pre-processing of feature intensities, which encompasses feature selection, missing value imputation (MVI)9 and data normalisation. For example, MetImp4 is a web-tool that includes and compares multiple MVI methods9."

      (3)

      3- The authors address key metabolomics issues such as ambiguous metabolite names and isoforms, and their focus on resolving mapping ambiguities and translating between database identifiers is highly valuable. However, the larger challenge of de novo identification and the "dark matter" of unannotated metabolites remains unresolved (initiatives as MassIVE might help in the future https://massive.ucsd.edu/ProteoSAFe/ ), and readers may benefit from clearer acknowledgement that MetaProViz does not operate on raw spectral data. The introduction currently emphasizes annotation, but since MetaProViz requires already annotated metabolite tables (and then deals with all the messiness), this space might be better used to frame the interpretability and pathway-analysis challenges that the tool directly addresses.

      We appreciate the comment and have highlighted this in the abstract and introduction: "MetaProViz operates on annotated intensity values..." (Line 29 and 88).

      Given the newest advancements in metabolite identification using AI-based methods, the MetaProViz toolkit, with its focus on connecting metabolite IDs to prior knowledge, becomes increasingly valuable. We added this to our discussion (Line 484-488): "Given the imminent shift in metabolite identification through AI-based approaches, including language model-guided48 methods and self-supervised learning49, the growing number of identified metabolites will make the MetaProViz toolkit increasingly valuable for the community to gain functional insights."

      With regard to the introduction, where we mention some tools for peak annotation: the reason we have this paragraph naming peak annotation tools is that we wanted to set the basis by (I) listing the different steps of metabolomics data analysis and (II) pointing to well-known tools for those steps. We also have a dedicated paragraph for pathway-analysis challenges.

      (4)

      4- I also really enjoyed you touching on the point of user-friendly but then inflexible and problem of reproducibility. We truly need well working packages for other bioinformaticians, rather than expecting wet-lab scientists to do all the analysis within the user interface.

      We thank the reviewer for this positive feedback.

      (5)

      5- It would be helpful to explain why the authors chose cancer/RCC samples for the demonstration. Was it because the dataset included both media and cell measurements? Does the tool perform best when multiple layers of information are available from the same experiment?

      We specifically chose the ccRCC cell line data as example since, for a multitude of cell lines, both media (exometabolomics) and intracellular metabolomics had been performed. The combination of both data types is only used in the biological regulated clustering (Fig. 4e-g); all other analyses do not require additional data modalities. We have not specifically tested how performance differs for this particular case as it would require multiple paired data (exometabolomics and intracellular metabolomics) taken at the same time and at different times.

      (6)

      6- Figure 2B: The upset plots effectively show increased overlap after adaptation, but it would be easier to compare changes if the order of the intersection bars in the "adapted" plot matched the original. For example, while total intersections increased (251→285), the PubChem+KEGG overlap decreased (24→5), likely due to reallocation to the full intersection.

      Thanks for raising this point. We had initially ordered the bars based on their intersection size, but we agree with the reviewer that for our point it makes sense to fix the order in the adapted plot to match the order of the original plot. We have done this (Fig. 2a) and also extended the figure legend text of SFig. 2, which shows the individually performed adaptations summarized in Fig. 2a.

      (7) (Planned)

      7- In your example of D-alanine and L-alanine - you mention how chirality is important biological feature, but up to this point it's not clear how do you do translation exactly and in which situations this would be treated just as "alanine" and when the more precise information would be retained? You mention RaMP-DB knowledge and one to X mappings as well as your general guidance in the "methods" part, but it would be useful to describe in this publication how you exactly tackled this problem in the ccRCC case.

      We thank the reviewer for this suggestion. Since this is a complex problem, we will add a more explicit description to the results section by showcasing more details on how we exactly tackled this problem in the ccRCC example data.

      With regard to D- and L-alanine, even though chirality is an important biological feature, in a standard experiment we cannot distinguish whether we detect the L- or D-amino acid. This is why we try to assign all possible IDs to increase the overlap with the prior knowledge. In Fig. 2b we showcase that this can potentially lead to multiple mappings of the same measured feature to multiple pathways. For example, we may measure alanine, assign the PubChem IDs for L-Alanine, D-Alanine and Alanine, and try to map to metabolite-sets that include both L-Alanine and D-Alanine. In turn this could fall into Scenario 6 (Fig. 2e), where across pathways there is a D-Alanine specific one (Pathway 1) and an L-Alanine specific one (Pathway 2). Now we can decide whether we want to allow both mappings (many-to-one) or to exclude D-Alanine because we know our biological system is human and should primarily have L-Alanine.

      (8) (Planned)

      8- In one to many mappings, it would be interesting to see quantification how frequently it was happening within a pathway or across pathways. I.e. Would going into pathway analysis "solve" the issue of "lost in translation" or not really?

      We have quantified the frequency for the example of translating the KEGG metabolite-set into HMDB IDs (Fig. 2c, left panel). Yet, we are not showcasing the quantification across the KEGG metabolite-sets with this plot. During the revision we will add the full results available to the Extended Data Table 2, which currently only includes the results displayed in Fig.2c.

      (9)

      9- QC: the coefficient of variation (CV) helps identify features with high variability and thus low detection accuracy. Here it's important to acknowledge that if the feature is very variable between groups it can be extremely important, but if the feature is very variable within the group - only then one would have low trust in the accuracy.

      Yes, we totally agree with the reviewer on this. For this reason, we have applied CV only in instances where this is not leading to any condition-driven CV differences, but is truly feature-focused: (1) Function pool_estimation performs CV on the pool samples only, which are a homogeneous mixture of all samples, and hence can be used to assess feature variability. (2) Function processing performs CV on exometabolomics media samples (=blanks), which are also not impacted by different conditions.
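
      Restricting the CV to pool samples can be sketched as below. This is a generic illustration rather than the code of pool_estimation; the function names, the 30% cutoff and the intensities are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation relative to the mean, in percent."""
    return stdev(values) / mean(values) * 100.0


def flag_variable_features(pool_intensities, cutoff_percent=30.0):
    """Return the CV per feature and whether it exceeds the cutoff.

    pool_intensities: dict mapping feature name to intensities measured in
    the pooled QC samples only (no biological conditions involved).
    """
    return {
        feature: (cv, cv > cutoff_percent)
        for feature, cv in (
            (name, coefficient_of_variation(values))
            for name, values in pool_intensities.items()
        )
    }


# Hypothetical pool-sample intensities.
pool = {
    "feature_A": [100.0, 102.0, 98.0],
    "feature_B": [100.0, 200.0, 30.0],
}
qc = flag_variable_features(pool)
```

      Because pooled samples are aliquots of the same mixture, a high CV here points to measurement variability of the feature and cannot be caused by differences between conditions.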

      (10)

      10- Missing value imputation - while missing not at random is a great way to deal with missingness, it would be great to have options for others (not just MNAR), as missingness is of a complex nature. If a pretty strong decision has been made, it would be good to support this by some supplementary data (i.e. how results change while applying various combinations of missingness and why choosing MNAR seems to be the most robust).

      We have decided to only offer support for MNAR, since we would recommend MVI only if there is a biological basis for it.

      As mentioned in the response to your minor comment 2, Wei et al. (https://doi.org/10.1038/s41598-017-19120-0) compared a multitude of missing value imputation methods. They compared six imputation methods (i.e., QRILC, Half-minimum, Zero, RF, kNN, SVD) for MNAR and systematically measured the performance of those imputation methods. They showed that QRILC and Half-Minimum produced much smaller SOR values, showing consistently good performance on data with different numbers of missing variables. This was the reason for us to only provide Half-minimum.
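
      Half-minimum imputation itself is simple enough to show in a few lines. The sketch below is generic and not taken from MetaProViz; it assumes the minimum is taken per feature over all observed values, which this reply does not specify, and the function name is hypothetical.

```python
def half_minimum_impute(values):
    """Replace missing entries (None) with half of the smallest observed value."""
    observed = [value for value in values if value is not None]
    if not observed:
        return list(values)  # nothing to base an imputation on
    fill = min(observed) / 2.0
    return [fill if value is None else value for value in values]


# One feature measured across four samples, two values missing.
imputed = half_minimum_impute([4.0, None, 10.0, None])
```

      The imputed value sits below every observed intensity, which matches the MNAR assumption that values are missing because they fall under the detection limit.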

      (11) (Planned)

      11- In the pre-processing and imputation stages - it would be interesting to see a summary table of how many features are left after each stage.

      This is a good suggestion and refers to the steps described in Fig. 3a. We will create an overview table for this, add it into the Extended Data Table and refer to it in the results section.

      (12)

      12- Is there a reason not to do UMAP or PLS-DA graphs for outlier detection? Doing more than PCA would help to have more confidence in removing or retaining outliers in the cases where biological relevance is borderline.

      We chose PCA because it is the standard companion to Hotelling's T² outlier testing. PCA is a linear dimensionality reduction technique that preserves the overall variance in the data and has a clear mathematical foundation linked to the covariance structure, so it fits the assumptions of Hotelling's T², which relies on the properties of the covariance matrix and assumes a multivariate Gaussian distribution. UMAP is a non-linear dimensionality reduction technique that prioritizes preserving local and global structure in a way that often yields good cluster visualization, but it distorts distances between clusters and lacks the rigorous statistical underpinnings of PCA. PLS-DA focuses on maximizing the covariance between the variables and the class labels; although this is not commonly done, one could take the optimal latent variables for discrimination and apply Hotelling's T² to them. Yet PLS-DA is supervised and actively tries to separate data points in the latent space, which can be misleading for outlier detection, where unbiased, unsupervised methods such as PCA that preserve global variance are advantageous.
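
      The link between PCA scores and Hotelling's T² can be sketched as follows. This is a minimal Python illustration, not the package's code: it assumes the PCA scores are already computed and mean-centred, and it leaves out the significance cutoff, which is usually derived from an F distribution.

```python
from statistics import variance


def hotelling_t2(scores):
    """Compute Hotelling's T^2 per sample from PCA scores.

    scores: list of rows, one per sample, each holding that sample's scores
    on the retained principal components (assumed mean-centred). Because
    principal-component scores are uncorrelated, T^2 reduces to the sum of
    squared scores, each divided by the variance of its component.
    """
    n_components = len(scores[0])
    component_var = [
        variance([row[a] for row in scores]) for a in range(n_components)
    ]
    return [
        sum(row[a] ** 2 / component_var[a] for a in range(n_components))
        for row in scores
    ]


# Hypothetical scores on two components; the last two samples lie far out.
scores = [
    [1.0, 0.5], [-1.0, -0.5], [0.5, -1.0],
    [-0.5, 1.0], [6.0, 0.0], [-6.0, 0.0],
]
t2 = hotelling_t2(scores)
```

      Samples with the largest T² values are the outlier candidates; whether they are removed still depends on the chosen confidence level.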

      (13)

      13- Metadata vs metabolite features - can this be used beyond metabolomics (i.e. proteomics, transcriptomics, etc)? It can be always very useful when there are many metadata features and it's hard to pre-select beforehand which ones are the most biologically relevant.

      Yes, definitely. In fact, we have used the metadata analysis strategy also with proteomics data and it will work equally with any omics data type.

      (14)

      14- While authors discussed what KEGG pathways were significantly deregulated, it would be interesting to see all the pathways that were affected (e.g. aPEAR "bubble" graphs can show this (https://github.com/kerseviciute/aPEAR) , or something similar to NES scores). I appreciate the trickiness of it, but it would be quite interesting to see how authors e.g. Figure5e narrowed it down to the two pathways and how all the others looked like.

      We thank the reviewer for the suggestion of the aPEAR graphs. Following this suggestion, we have implemented a new function to enable clustering of the pathways based on overlapping metabolites (cluster_pk()). For more details regarding the method see also our response to Reviewer 1 (Comment 12) and our extended method section "Metabolite-set clustering" (Lines 656-671). We visualize the clustering results as a network graph, which we also included into Fig. 5f.
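
      As a rough illustration of clustering pathways by overlapping metabolites, the following Python sketch links pathways whose metabolite sets overlap and reports connected groups. It is not the cluster_pk() implementation: the Jaccard measure, the 0.5 threshold, the connected-component grouping and the example sets are all assumptions made for this sketch.

```python
def jaccard(a, b):
    """Jaccard overlap between two non-empty metabolite sets."""
    return len(a & b) / len(a | b)


def cluster_pathways(pathways, threshold=0.5):
    """Group pathways whose metabolite sets overlap at or above the threshold.

    Pathways are nodes, an edge joins two pathways with Jaccard >= threshold,
    and each connected component of that graph is reported as one cluster.
    """
    names = list(pathways)
    neighbours = {name: set() for name in names}
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if jaccard(pathways[first], pathways[second]) >= threshold:
                neighbours[first].add(second)
                neighbours[second].add(first)

    clusters, seen = [], set()
    for name in names:
        if name in seen:
            continue
        stack, component = [name], set()
        while stack:  # depth-first walk over the overlap graph
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Hypothetical pathways and metabolite identifiers.
pathways = {
    "pathway_A": {"m1", "m2", "m3"},
    "pathway_B": {"m2", "m3", "m4"},
    "pathway_C": {"m7", "m8"},
}
clusters = cluster_pathways(pathways)
```

      The same overlap graph can be drawn directly as a network, with one node per pathway and one edge per retained overlap.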

      The complete result of the KEGG enrichment can be found in Extended Data Table 1, Sheet 13 (Pathway enrichment analysis using KEGG on Young patient subset). The pathways are ranked by p.adjusted value and also include a score (FoldEnrichment) from the Fisher's exact test (similar to NES scores in GSEA). Here one can find a total of seven pathways with a p.adjusted value For Fig. 5e we narrowed down to these two pathways based on the previous findings of dysregulated dipeptides (Fig. 5d), as we searched for a potential explanation of this observation.
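
      For orientation, an over-representation test of this kind can be sketched as a one-sided Fisher's exact test, which is equivalent to the upper tail of the hypergeometric distribution. This Python sketch is illustrative only: the function name, the example counts and the fold-enrichment definition (observed hit fraction divided by the expected fraction) are assumptions, not the package's documented formula.

```python
from math import comb


def ora_enrichment(hits, selected, pathway_size, background):
    """One-sided Fisher's exact test (hypergeometric upper tail) for one pathway.

    hits: selected metabolites that belong to the pathway
    selected: number of selected (e.g. significantly changed) metabolites
    pathway_size: pathway members present in the background
    background: total number of measured metabolites
    """
    total = comb(background, selected)
    upper = min(selected, pathway_size)
    p_value = sum(
        comb(pathway_size, k) * comb(background - pathway_size, selected - k)
        for k in range(hits, upper + 1)
    ) / total
    fold_enrichment = (hits / selected) / (pathway_size / background)
    return p_value, fold_enrichment


# Hypothetical counts: 4 of 10 selected metabolites fall into a pathway
# that has 5 members among 50 measured metabolites.
p, fold = ora_enrichment(hits=4, selected=10, pathway_size=5, background=50)
```

      In practice the resulting p-values would still need adjustment for multiple testing across all pathways.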

      (15)

      15- Could you comment on the runtime of the pipeline? In particular, do the additional translation steps and use of multiple databases substantially affect computational speed?

      Downloading and parsing databases takes significant time; large ones such as RaMP or HMDB can take minutes on a standard laptop. Our local cache speeds up the process by eliminating the need for repeated downloads. In the future, database access will be even faster: we plan to make all prior knowledge accessible in an already parsed format through our own API (omnipathdb.org). The ambiguity analysis, which is a complex data transformation pipeline, and plotting with ggplot2, another key component of MetaProViz, are the slowest parts, especially on a first run when no cache can be used. This means there are a few slow operations, each of which completes in at most a few dozen seconds. However, the implementation and speed of these solutions do not fall behind what we commonly find in bioinformatics packages and, most importantly, the speed of MetaProViz does not pose an obstacle to using it efficiently in analysis pipelines.

      (16)

      16- I clap to the authors for automated checks if selected methods are appropriate!

      Thank you, this is something we think is important to ensure correct analysis and circumvent misinterpretation.

      (17)

      17- My suggestion would be to also look into power calculation or p-value histogram. In your example you saw some clear signal, but very frequently research studies are under-sampled and while effect can be clearly seen, there are just not enough samples to have statistically significant hits.

      We fully agree that power calculations are very important. Ideally, however, they should be done before the user's experiment, whereas the MetaProViz analysis starts at a later time point. Regarding the p-value histogram, we have implemented a similar measure, a density plot, which is produced as a quality control measure within the MetaProViz differential analysis function. The density plot is a smoothed version of a histogram that represents the distribution as a continuous probability density function and can be used to assess whether the p-values follow a uniform distribution.
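
      The unsmoothed counterpart of such a density plot is a simple binned count of the p-values. The Python sketch below is illustrative only; the function name, the bin count and the example p-values are assumptions.

```python
def pvalue_histogram(p_values, bins=10):
    """Count p-values in equal-width bins on the interval [0, 1]."""
    counts = [0] * bins
    for p in p_values:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"p-value out of range: {p}")
        index = min(int(p * bins), bins - 1)  # p == 1.0 goes into the last bin
        counts[index] += 1
    return counts


# Hypothetical p-values from a differential analysis.
p_values = [0.001, 0.004, 0.02, 0.03, 0.25, 0.45, 0.5, 0.73, 0.9, 1.0]
counts = pvalue_histogram(p_values, bins=5)
```

      Under the null hypothesis the bins should be roughly equal in height, whereas an excess in the first bin suggests that real signal is present.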

      (18)

      18- Overall functional parts are novel and next step in helping with data interpretability, but I still found it hard to read into functionally clear insights (re to pathways / functional groupings of metabolites) - especially as you have e.g. enzyme-metabolite databases etc. I think clarity there could be improved and would help to get your message more widely across.

      Regarding the clarity of the pathway enrichment and its functional insights, we have extended the figure legends of Fig. 4 and 5, now clearly state that MetalinkDB is the prior knowledge resource we used for the functional interpretation to identify the links for methionine (Line 367-368), and have extended our summary statement to highlight that we combine the biological clustering with prior knowledge for the mechanistic insight (Line 380-381).

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      This manuscript introduces an R package, MetaProViz, for metabolomics data analysis (post-annotation), aiming to solve the problem of poor analysis choices and to enable more people to do the analysis. MetaProViz not only guides users in selecting the best statistical method but also solves previously unsolved problems, e.g. multiple and variable metabolite names in different databases and their connections to prior knowledge. The authors also created an exometabolomics analysis and the steps needed to visualise intra-cell / media processes. They demonstrated their new package on a kidney cancer (clear-cell renal cell carcinoma) dataset, stepping one step closer to improving the biological interpretability of omics data analysis.

      Major comments affecting conclusions: none.

      Minor comments, important issues that could be addressed and possibly improve the clarity or generally presentation of the tool. Please see all below.

      1. You start by separating and discussing metabolomics and lipidomics, but lipidomics quickly disappears (especially beyond abstract/intro) - no real need to discuss lipidomics.
      2. You refer to the MetImp4 imputation web tool, but I cannot find an active website, manuscript, or R package for it, and the cited link does not load. This raises doubts about whether the tool is currently usable. Additionally, imputation choice should be guided by biological context and study design, not just by testing a few methods and selecting the one that performs best.
      3. The authors address key metabolomics issues such as ambiguous metabolite names and isoforms, and their focus on resolving mapping ambiguities and translating between database identifiers is highly valuable. However, the larger challenge of de novo identification and the "dark matter" of unannotated metabolites remains unresolved (initiatives as MassIVE might help in the future https://massive.ucsd.edu/ProteoSAFe/ ), and readers may benefit from clearer acknowledgement that MetaProViz does not operate on raw spectral data. The introduction currently emphasizes annotation, but since MetaProViz requires already annotated metabolite tables (and then deals with all the messiness), this space might be better used to frame the interpretability and pathway-analysis challenges that the tool directly addresses.
      4. I also really enjoyed you touching on the point of user-friendly but then inflexible and problem of reproducibility. We truly need well working packages for other bioinformaticians, rather than expecting wet-lab scientists to do all the analysis within the user interface.
      5. It would be helpful to explain why the authors chose cancer/RCC samples for the demonstration. Was it because the dataset included both media and cell measurements? Does the tool perform best when multiple layers of information are available from the same experiment?
      6. Figure 2B: The upset plots effectively show increased overlap after adaptation, but it would be easier to compare changes if the order of the intersection bars in the "adapted" plot matched the original. For example, while total intersections increased (251→285), the PubChem+KEGG overlap decreased (24→5), likely due to reallocation to the full intersection.
      7. In your example of D-alanine and L-alanine - you mention how chirality is important biological feature, but up to this point it's not clear how do you do translation exactly and in which situations this would be treated just as "alanine" and when the more precise information would be retained? You mention RaMP-DB knowledge and one to X mappings as well as your general guidance in the "methods" part, but it would be useful to describe in this publication how you exactly tackled this problem in the ccRCC case.
      8. In one to many mappings, it would be interesting to see quantification how frequently it was happening within a pathway or across pathways. I.e. Would going into pathway analysis "solve" the issue of "lost in translation" or not really?
      9. QC: the coefficient of variation (CV) helps identify features with high variability and thus low detection accuracy. Here it's important to acknowledge that if the feature is very variable between groups it can be extremely important, but if the feature is very variable within the group - only then one would have low trust in the accuracy.
      10. Missing value imputation - while missing not at random is a great way to deal with missingness, it would be great to have options for others (not just MNAR), as missingness is of a complex nature. If a pretty strong decision has been made, it would be good to support this by some supplementary data (i.e. how results change while applying various combinations of missingness and why choosing MNAR seems to be the most robust).
      11. In the pre-processing and imputation stages - it would be interesting to see a summary table of how many features are left after each stage.
      12. Is there a reason not to do UMAP or PLS-DA graphs for outlier detection? Doing more than PCA would help to have more confidence in removing or retaining outliers in the cases where biological relevance is borderline.
      13. Metadata vs metabolite features - can this be used beyond metabolomics (i.e. proteomics, transcriptomics, etc)? It can be always very useful when there are many metadata features and it's hard to pre-select beforehand which ones are the most biologically relevant.
      14. While authors discussed what KEGG pathways were significantly deregulated, it would be interesting to see all the pathways that were affected (e.g. aPEAR "bubble" graphs can show this (https://github.com/kerseviciute/aPEAR) , or something similar to NES scores). I appreciate the trickiness of it, but it would be quite interesting to see how authors e.g. Figure5e narrowed it down to the two pathways and how all the others looked like.
      15. Could you comment on the runtime of the pipeline? In particular, do the additional translation steps and use of multiple databases substantially affect computational speed?
      16. I clap to the authors for automated checks if selected methods are appropriate!
      17. My suggestion would be to also look into power calculation or p-value histogram. In your example you saw some clear signal, but very frequently research studies are under-sampled and while effect can be clearly seen, there are just not enough samples to have statistically significant hits.
      18. Overall functional parts are novel and next step in helping with data interpretability, but I still found it hard to read into functionally clear insights (re to pathways / functional groupings of metabolites) - especially as you have e.g. enzyme-metabolite databases etc. I think clarity there could be improved and would help to get your message more widely across.

      Significance

      This is a great tool and I can't wait to use it on many upcoming metabolomics projects! Authors tackle multiple ongoing issues within the field: from poor selection of statistical methods (they provide guidance or have default safer options) to the messiness of data annotation between databases and improving data interpretability. The field is still evolving quickly, and it's impossible to solve all problems with one package; thus some limitations within the package could be seen as a bit rigid. Nonetheless, this fully steps toward filling an existing methodological gap. All bioinformaticians doing metabolomic analysis, or those learning how to do it, will greatly benefit from this knowledge.

      I myself lead a team of 6 bioinformaticians, and we do analysis for researchers, clinicians, drug discovery, and various companies. We run internal metabolomics pipelines every day and fully sympathise with the problems addressed by the authors.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Schmidt et al report the development of MetaProViz, an integrated R package to process, analyze and visualize metabolomics data, including integration with prior knowledge. The authors then go on to demonstrate utility by analyzing several metabolomes of cell lines, media and patient samples from kidney cancer. The manuscript provides a concise description of key challenges in metabolomics that the authors identify and address in their software. The examples are helpful and illustrative, although I should point out that I lack the expertise to evaluate the R package itself. I only have a few very minor comments.

      Minor comments:

      1. Figure 2D, E: While the schematics are fairly intuitive, a brief figure legend description of what the different scenarios etc. represent would make this easier to grasp.
      2. Fig. 4: The authors briefly state that they integrate prior knowledge to identify the changes in methionine metabolism in kidney cancer, but it is not clear how exactly they contribute to this conclusion. It could be helpful to expand a bit on this to better illustrate how MetaProViz can be used to integrate prior knowledge into the analysis workflow.
      3. Given the functional diversity among metabolites -central to diverse pathways, are key signaling molecules, restricted functions, co-variation within a pathway - I wonder how informative approaches such as PCA or enrichment analyses are for identifying metabolic drivers of a (patho)physiological state. To some extent, this can be addressed by integrating prior knowledge, and it would be helpful if the authors could comment on (and if applicable explain) whether/how this is integrated into MetaProViz.

      Significance

      This is a very significant advance from one of the leading groups in the field that is likely to enhance metabolomics data analysis in the wider community.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Schmidt et al. present MetaProViz, a comprehensive and modular platform for metabolomics data analysis. The tool provides a full suite of processing capabilities spanning metabolite annotation, quality control, normalization, differential analysis, integration of prior knowledge, functional enrichment, and visualization. The authors also include example datasets, primarily from renal cancer studies, to demonstrate the functionality of the pipeline. The MetaProViz framework addresses several long-standing challenges in metabolomics data analysis, particularly issues of reproducibility, ambiguous metabolite annotation, and the integration of metabolite features with pathway knowledge. The platform is likely to be a valuable addition for the community, but the reviewer has some comments that need to be addressed prior to publication.

      The section "Improving the connection between prior knowledge and metabolomics features" could benefit from additional clarification. It is not entirely clear to the reader what specific steps were taken beyond using RaMP-DB to translate metabolite identifiers. For example, how exactly were ambiguous mappings ("different scenarios") handled in practice, and to what extent does this process "fix" or merely flag inconsistencies? A more explicit description or example of how MetaProViz resolves these cases would help readers better understand the improvements claimed.

      The introduction of MetSigDB is intriguing, but its construction and added value are not sufficiently described. It would be helpful to clarify what specific advantages MetSigDB provides over directly using existing pathway resources such as KEGG, Reactome, or WikiPathways. For example, how many features, interactions, or metabolite-set relationships are included, and in what way are these pathways improved or extended compared to those already available in public databases?

      Figure 1D/1E: The reviewer appreciates the inclusion of the visualizations illustrating the different mapping scenarios, as these effectively convey the complexity of metabolite ID translation. However, it took some time to interpret what each scenario represented. It would be helpful to include brief annotations or explanatory text directly on the figures to clarify what each scenario depicts and how it relates to the underlying issue being addressed.

      "By assigning other potential metabolite IDs and by translating between the present ID types, we not only increase the number of features within all ID types but also increase the feature space with HMDB and KEGG IDs (Fig. 2a, right, SFig. 2 and Supplementary Table 1)". The reviewer would appreciate additional clarification on how this was done. It is not clear what specific steps or criteria were used to assign additional metabolite IDs or to translate between identifier types. The reviewer also appreciates the inclusion of the UpSet plots. However, simply having the plots side-by-side makes it difficult to determine the specific differences. An alternative visualization, such as stacked bar plots, scatter plots summarizing the changes in feature counts, or other representation that more clearly highlights the deltas, might make these results easier to interpret.

      MetaboAnalyst is mentioned several times in the manuscript. The reviewer is familiar with some of the limitations and practical challenges associated with using MetaboAnalyst and its R package. Given that MetaboAnalyst already offers some overlapping functionality with MetaProViz (and offers it in the form of an interactive website and a sometimes functional R package), a more explicit comparison between the two tools would help readers fully understand the unique advantages and improvements provided by MetaProViz.

      Page 11: The authors state that they used limma for statistical testing, including for the analysis of exometabolomics data, where the values appear to represent log2-transformed distances or ratios rather than normally distributed intensities. Since limma assumes approximately normal residuals, please provide evidence or justification that this assumption holds for these data types. If the distributions deviate substantially from normality, a non-parametric alternative might be more appropriate.

      Page 13: why were young and old defined this way? Authors should provide their reasoning and/or citations for this grouping.

      Figure 4e: It may help with interpretation to have these Sankey-like graph edges be proportional to the number of metabolites.

      Figure 4h: The values appear to be on an intensity scale (e.g., on the order of 3e10), yet some of them are negative, which would not be expected for raw or log-transformed mass spectrometry intensities. It is unclear whether these represent normalized abundance values, distances, or some other transformation. In addition, for the comparison of tumour versus normal tissue, it is not specified what statistical test was applied. Since mass spectrometry data are typically log2-transformed to approximate a log-normal distribution before performing t-tests or similar parametric methods, clarification is needed on how these data were processed.

      Figure 5: "Tukey's p.adj < 0.05" . Was this a Tukey's post-hoc test? This should be explicitly stated.

      The potential for multi-omics is mentioned. Please clarify how generalizable this framework is. Can it readily accommodate transcriptomics, proteomics, or fluxomics data, or does it require custom logic or formatting for each new data type?

      Please clarify if/how enrichment analyses account for varying set sizes and redundant metabolite memberships across pathways, which can bias over-representation analysis results.

      Significance

      The MetaProViz framework addresses several long-standing challenges in metabolomics data analysis, particularly issues of reproducibility, ambiguous metabolite annotation, and the integration of metabolite features with pathway knowledge. The platform is likely to be a valuable addition for the community, but the reviewer has some comments that need to be addressed prior to publication.

      Authors should be commended for the availability of data/code and detailed methods. Clarity is good. Authors have clearly spent a lot of time thinking about the challenges of metabolomics data analysis.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We are grateful for the reviewers' constructive comments and suggestions, which contributed to improving our manuscript. We are pleased to see that our work was described as an "interesting manuscript in which a lot of work has been undertaken". We are also encouraged by the fact that the experiments were considered "on the whole well done, carefully documented, and support most of the conclusions drawn," and that our findings were viewed as providing "mechanistic insight into how HNRNPK modulates prion propagation" and potentially offering "new mechanical insight of hnRNPK function and its interaction with TFAP2C."

      We conducted several new experiments and revised specific sections of the manuscript, as detailed below in the point-by-point response in this letter.

      Referee #1

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The paper by Sellitto describes studies to determine the mechanism by which hnRNPK modulates the propagation of prions. The authors use cell models lacking HNRNPK, which is lethal, in a CRISPR screen to identify genes that suppress lethality. Based on this screen in 2 different cell lines, a gene termed Tfap2C emerged as a candidate for interaction with HNRNPK. They show that Tfap2C counteracts the actions of HNRNPK with respect to prion propagation. Cells lacking HNRNPK show increased PrPSc levels. Overexpression of Tfap2C suppresses PrPSc levels. These effects on PrPSc are independent of PrPC levels. By RNAseq analysis, the authors hone in on metabolic pathways regulated by HNRNPK and Tfap2C, then follow the data to autophagy regulation by mTor. Ultimately, the authors show that short-term treatment of these cell models with mTor inhibitors causes increased accumulation of PrPSc. The authors conclude that the loss of HNRNPK leads to reduced energy metabolism, causing mTor inhibition, which reduces translation by dephosphorylation of S6.

      Major comments:

      1) Fig H and I, Fig 3L. The interaction between Tfap2C and HNRNPK is pretty weak. The interaction may not be consequential. The experiment seems to be well controlled, yielding limited interaction. The co-ip was done in PBS with no detergent. The authors indicate that the cells were mechanically disrupted. Since both of these are DNA-binding proteins, is it possible that the observed interaction is due to proximity on DNA that links the 2 proteins? Including a DNase treatment would clarify this.

      Response: We agree that the observed co-IP between Tfap2c and hnRNP K is weak (previously Fig. 2H-I and Supp. Fig. 3L, now moved to Supp. Fig. 4C-E), and we now highlight this in the relevant section of the manuscript.

      Importantly, the co-IP was performed using endogenous proteins without overexpression or tagging, which can sometimes artificially enhance protein-protein interactions. However, we acknowledge that the use of a detergent-free lysis buffer and mechanical disruption alone may have limited nuclear protein extraction and solubilization, potentially contributing to the low co-IP signal.

      To address the reviewer's concerns and clarify whether the observed interaction could be DNA-mediated, we repeated the co-IP experiments under low-detergent conditions and included benzonase nuclease treatment to digest nucleic acids (Fig. 2H-I). DNA digestion was confirmed by agarose gel electrophoresis (Supp. Fig. 4F-G). Additionally, we performed the reciprocal IPs using both hnRNP K and Tfap2c antibodies (Fig. 2H-I). Although the level of co-immunoprecipitation remains modest, these updated experiments continue to demonstrate a specific co-immunoprecipitation between Tfap2c and hnRNP K, independent of DNA bridging. These additional controls and experimental refinements strengthen the validity of our findings. These results are also attached here for your convenience.

      2) Supplemental Fig 5B - The western blot images for pAMPK don't really look like a 2 fold increase in phosphorylation in HNRNPK deletion.

      Response: We thank the reviewer for raising this point. We re-examined the original pAMPK western blot (previously Supp. Fig. 5B; now presented as Supp. Fig. 6B) and confirmed the reported results. We note that the overall loading is not perfectly uniform across lanes (as suggested by the actin signal), which may affect the visual impression of band intensity. However, the phosphorylation change reported in the manuscript is based on the pAMPK/total AMPK ratio, which accounts for differences in AMPK expression and accurately reflects relative phosphorylation levels. To further address this concern, we performed three additional independent experiments. These new data reproduce the increase in pAMPK/AMPK upon HNRNPK deletion and are now included in the revised Supplementary Fig. 6B, together with the updated quantification. The new blot and the quantification are also attached here for your convenience.

      3) Fig. 5A - I don't think it is proper to do statistics on an n of 2.

      Response: We believe the reviewer's comment refers to Fig. 5B, as Fig. 5A already has sufficient replication. We have now added two additional replicates, bringing the total to four. The updated statistical analysis corroborates our initial results. The new quantification is provided in the revised manuscript (Fig. 5B) along with the new blot (Supp. Fig. 6C). Both are also attached here for your convenience.

      4) Fig 6D. The data look a bit more complicated than described in the text. At 7 days, compared to 2 days, it looks like there is a decrease in % cells positive for 6D11. Is there clearance of PrPSc or proliferation of un-infected cells?

      Response: We have now reworded our text in the results paragraph as follows:

      "These data show that TFAP2C overexpression and HNRNPK downregulation bidirectionally regulate prion levels in cell culture."

      We have now also included the following comments in the discussion section:

      "However, prion propagation relies on a combination of intracellular PrPSc seeding and amplification, as well as intercellular spread, which together contribute to the maintenance and expansion of infected cells within the cultured population. In this study, we were limited in our ability to dissect which specific steps of the prion life cycle are affected by TFAP2C. We also cannot fully exclude the possibility that TFAP2C overexpression influenced the relative proliferation of prion-infected versus uninfected cells in the PG127-infected HovL culture, thereby contributing to the observed reduction in the percentage of 6D11+ cells and overall 6D11+ fluorescence. However, we did not observe any signs of cell death, growth impairment, or increased proliferation under TFAP2C overexpression in PG127-infected HovL cells compared to NBH controls (data not shown). This suggests that a negative selective pressure on infected cells or a proliferative advantage of uninfected cells is unlikely in this context".

      5) The authors might consider a different order of presenting the data. Fig 6 could follow Fig. 2 before the mechanistic studies in Figs 3-5.

      Response: We believe that the current order of presenting the data is more appropriate. The first part of the manuscript focuses on the genetic and functional interactions between hnRNP K and its partners, particularly TFAP2C, which is a critical point for understanding the broader context before delving into the mechanistic studies involving prion-infected cells.

      6) The authors use SEM throughout the paper and while this is often used, there has been some interest in using StdDev to show the full scope of variability.

      Response: We chose to use SEM as it reflects the precision of the mean, which is central to our statistical comparisons. As the reviewer notes, this is a common and appropriate practice. To address variability, almost all graphs already include individual data points, which provide a direct visual representation of data spread. To further enhance clarity, we have now included StdDev in the Supplementary Source Data table of the revised manuscript.
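
      The relationship between the two measures is simple: the SEM is the sample standard deviation divided by the square root of the number of replicates. A minimal Python sketch, with hypothetical replicate values:

```python
from math import sqrt
from statistics import stdev


def sd_and_sem(values):
    """Return the sample standard deviation and the standard error of the mean."""
    sd = stdev(values)
    return sd, sd / sqrt(len(values))


# Hypothetical measurements from four independent replicates.
sd, sem = sd_and_sem([2.0, 4.0, 4.0, 6.0])
```

      With four replicates the SEM is half the StdDev, so the same data look less variable when plotted with SEM error bars.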

      Discussion:

      The discrepancy between short-term and long-term treatments with mTOR inhibitors is only briefly mentioned with a bit of a hand-waving explanation. The authors may need a better explanation.

      Response: We have now integrated a more detailed explanation in the discussion section of the revised manuscript as follows:

      "Previous studies showed that mTORC1/2 inhibition and autophagy activation generally reduce, rather than increase, PrPSc aggregation (79, 80). The reason for this discrepancy remains unclear and may be multifactorial. First, most prior studies were based on long-term mTOR inhibition, whereas our work examined acute inhibition, mimicking the time frame of HNRNPK and TFAP2C manipulation. Acute inhibition may trigger transient metabolic or signaling shifts that differ from adaptive changes associated with mTOR chronic inhibition, potentially overriding autophagy's effects on prion propagation. Additionally, while previous works were primarily conducted in murine in vivo models, our study focused on a human cell system propagating ovine prions. Differences in species background, model complexity (e.g., interactions between different cell types), and prion strain variability, as certain strains exhibit distinct responses to autophagy and mTOR modulation (https://doi.org/10.1371/journal.pone.0137958), likely contributed to the observed differences".

      Minor comments:

      Page 12 - no mention of chloroquine in the text or related data.

      Page 12 - Supp. Fig. E - should be 5E

      Response: We thank the reviewer for pointing this out. We have now better highlighted the use of chloroquine in Fig. 5B (see reviewer #1 - Point 3 - Major comments) and in the text as follows:

      "Furthermore, in the presence of chloroquine, LC3-II levels rose almost proportionally across all conditions (Fig. 5B), suggesting that the effects of HNRNPK and TFAP2C on autophagy occur at the level of autophagosome formation, rather than autophagosome-lysosome fusion and degradation."

      We have corrected the reference to Supp. Fig. 5E.

      Reviewer #1 (Significance (Required)):

      The study provides mechanistic insight into how HNRNPK modulates prion propagation. The paper is limited to cell models, and the authors note that long-term treatment with mTOR inhibitors reduced PrPSc levels in an in vivo model.

      The primary audience will be other prion researchers. There may be some broader interest in the mTOR pathway and the role of HNRNPK in other neurodegenerative diseases.

      Referee #2

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The manuscript "Prion propagation is controlled by a hierarchical network involving the nuclear Tfap2c and hnRNP K factors and the cytosolic mTORC1 complex" by Sellitto et al. aims to examine how heterogeneous nuclear ribonucleoprotein K (hnRNPK) limits prion propagation. They perform a synthetic-viability CRISPR ablation screen to identify epistatic interactors of HNRNPK. They found that deletion of Transcription factor AP-2γ (TFAP2C) suppressed the death of hnRNP-K depleted LN-229 and U-251 MG cells whereas its overexpression hypersensitized them to hnRNP K loss. Moreover, HNRNPK ablation decreased cellular ATP, downregulated genes related to lipid and glucose metabolism and enhanced autophagy. Simultaneous deletion of TFAP2C reversed these effects, restored transcription and alleviated energy deficiency. They state that HNRNPK and TFAP2C are linked to mTOR signalling and observe that HNRNPK ablation inhibits mTORC1 activity through downregulation of mTOR and Rptor while TFAP2C overexpression enhances mTORC1 downstream functions. In prion infected cells, TFAP2C activation reduced prion levels and countered the increased prion propagation due to HNRNPK suppression. Pharmacological inhibition of mTOR also elevated prion levels and partially mimicked the effects of HNRNPK silencing. They state their study identifies TFAP2C as a genetic interactor of HNRNPK and implicates their roles in mTOR metabolic regulation and establishes a causative link between these activities and prion propagation.

      This is an interesting manuscript in which a lot of work has been undertaken. The experiments are on the whole well done, carefully documented and support most of the conclusions drawn. However, there are places where it was quite difficult to read as some of the important results are in the supplementary Figures and it was necessary to go back and forth between the Figs in the main body of the paper and the supplementary Figs. There are also Figures in the supplementary which should have been presented in the main body of the paper. These are indicated in our comments below.

      We have the following questions /points:

      Major comments:

      1) A plasmid harbouring four guide RNAs driven by four distinct constitutive promoters is used for targeting HNRNPK. Is there a reason for using 4 guides? Is it simply to obtain maximal editing? In their experience, is this required for all genes or specific to HNRNPK?

      Response: The use of four guide RNAs driven by distinct promoters is chosen to maximize editing efficiency for HNRNPK. As previously demonstrated by J. A. Yin et al. (Ref. 32), this system provides better efficiency for gene knockout (or activation). For HNRNPK, achieving full knockout was crucial for observing a complete lethal phenotype, which made the four guide RNAs approach fundamental. However, other knockout systems, while potentially less efficient, have been shown to work well in other circumstances. We have now included this explanation in the revised manuscript as follows:

      "We employed a plasmid harboring quadruple non-overlapping single-guide RNAs (qgRNAs), driven by four distinct constitutive promoters, to target the human HNRNPK gene and maximize editing efficiency in polyclonal LN-229 and U-251 MG cells stably expressing Cas9 (32)."

      2) Is there a minimal amount of Cas9 required for editing?

      Response: We did not observe a correlation between Cas9 levels and activity, yet the C3 clone was the one with higher Cas9 expression and higher activity (Supp. Fig. 1A-B). We agree that comments about the amount of Cas9 expression may be misleading here. Thus, in the first result paragraph of the revised manuscript, we have now modified the text "we isolated by limiting dilutions LN-229 clones expressing high Cas9 levels" to "we isolated by limiting dilutions LN-229 single-cell clones expressing Cas9".

      3) It is stated that cell death is delayed in U251-MG cells compared to LN-229-C3 cells- why? Also, why use glioblastoma cells other than that they have high levels of HNRNPK? Would neuroblastoma cells be more appropriate if they are aiming to test for prion propagation?

      Response: As shown in Fig. 1A, U251-MG cells reached complete cell death at day 13, while LN-229 C3 reached it already at day 10. The percentage of viable U251-MG cells is higher (statistically significant) than LN-229 C3 cells at all time points before day 13, when both lines show complete death. The underlying reasons for this partial and relative resistance are probably multiple, but we clearly showed in Fig. 2 that TFAP2C differential expression is one modulator of cell sensitivity to HNRNPK ablation.

      We selected glioblastoma cells because their high expression of HNRNPK was essential for developing our synthetic lethality screen strategy, and we have now clarified it in the revised manuscript as follows:

      "As model systems, we chose the human glioblastoma-derived LN-229 and U-251 MG cell lines, which express high levels of HNRNPK (2, 3), a key factor for optimizing our synthetic lethality screen."

      While neuroblastoma cells might be more relevant in terms of prion neurotoxicity, glial cells, despite their resistance to prion toxicity, are fully capable of propagating prions. Prion propagation in glial cells has been shown to play crucial roles in mediating prion-dependent neuronal loss in a non-autonomous manner (see 10.1111/bpa.13056). This makes glioblastoma cells a valuable model for studying prion propagation (that is the focus of our study), despite the lack of direct toxicity (which is not the focus of our study). We have now added this explanation to the revised manuscript as follows:

      "Therefore, we continued our experiments using LN-229 cells, which provide a relevant model for studying prions, as glial cells can propagate prions and contribute to prion-induced neuronal loss through non-cell-autonomous mechanisms."

      4) Human CRISPR Brunello pooled library- does the Brunello library use constructs which have four independent guide RNAs as used for the silencing of HNRNPK?

      Response: No, the Human CRISPR Brunello pooled library does not use constructs with four independent guide RNAs (qgRNAs). Instead, each gene is targeted by 4 different single-guide RNAs (sgRNAs), each expressed on a separate plasmid. We have now clarified this in the main text of the revised manuscript as follows:

      "To identify functionally relevant epistatic interactors of HNRNPK, we conducted a whole-genome ablation screen in LN-229 C3 cells using the Human CRISPR Brunello pooled library (33), which targets 19,114 genes with an average of four distinct sgRNAs per gene, each expressed by a separate plasmid (total = 76,441 sgRNA plasmids)."

      5) To rank the 763 enriched genes, they multiply the -log10FDR with their effect size - is this a standard step that is normally undertaken?

      Response: The approach of ranking hits using the product of effect size and statistical significance is a well-established method in CRISPR screening studies. This strategy has been explicitly used in high-impact work by Martin Kampmann and others (see https://doi.org/10.1371/journal.pgen.1009103 and https://doi.org/10.1016/j.neuron.2019.07.014 as references). We have now added both references to the revised manuscript.
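      The ranking described in this response (effect size multiplied by -log10 FDR) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function name `rank_hits`, the `fdr_floor` guard against an FDR of zero, and the gene names and values are hypothetical and are not taken from the screen.

```python
import math


def rank_hits(hits, fdr_floor=1e-300):
    """Rank (gene, effect_size, fdr) tuples by effect_size * -log10(FDR).

    fdr_floor is an assumed guard so that an FDR reported as 0 does not
    cause a math domain error.
    """
    scored = []
    for gene, effect_size, fdr in hits:
        score = effect_size * -math.log10(max(fdr, fdr_floor))
        scored.append((gene, score))
    # Highest combined score first.
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical hits, for illustration only (not values from the study).
example_hits = [
    ("GENE_A", 2.0, 0.01),   # moderate effect, strong significance
    ("GENE_B", 3.0, 0.1),    # large effect, weak significance
    ("GENE_C", 0.5, 0.001),  # small effect, very strong significance
]
for gene, score in rank_hits(example_hits):
    print(f"{gene}: {score:.2f}")
```

      A product score of this kind favors genes that are both strongly enriched and statistically well supported, rather than genes that excel on only one of the two criteria.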

      6) The 32 genes selected- they were ablated individually using constructs with one guide RNA or four guide RNAs?

      Response: The 32 genes selected were ablated individually using constructs with quadruple-guide RNAs (qgRNAs), as this approach was intended to maximize editing efficiency for each gene. We have now clarified this in the main text of the revised manuscript as follows:

      "We ablated each gene individually using qgRNAs and then deleted HNRNPK."

      7) The identified targets were also tested in U251-MG cells and nine were confirmed but the percent viability was variable - is the variability simply a reflection of the different cell line?

      Response: The variability in percent viability observed in U251-MG cells likely reflects the inherent differences between cell lines, which can contribute to varying levels of susceptibility to gene ablation, even for the same targets. We have now highlighted these small differences in the main text of the revised manuscript as follows:

      "We confirmed a total of 9 hits (Fig. 1H), including the ELPs gene IKBKAP and the transcription factor TFAP2C, the two strongest hits identified in LN-229 C3 cells. However, in the U251-Cas9 cells the rescue effect did not always fall within the exact range observed in LN-229 C3 cells, likely due to intrinsic differences between the two cell lines."

      8) The two strongest hits were IKBKAP and TFAP2C. As TFAP2C is a transcription factor - is it known to modulate expression of any of the genes that were identified to be perturbed in the screen? Moreover, it is stated that it regulates expression of several lncRNAs- have the authors looked at expression of these lncRNAs- is the expression affected- can modulation of expression of these lncRNAs modulate the observed phenotypic effects and also some of the targets they have identified in the screen?

      Response: While TFAP2C is a transcription factor known to regulate the expression of several genes and lncRNAs, we did not identify any of its known target genes among the hits of our screen. However, our RNA-seq data and RT-qPCR (data not shown) indicate that the expression of lncRNA MALAT1 and NEAT1 (reported to interact with both HNRNPK and TFAP2C; ref 37, 41, 47) is strongly affected by HNRNPK ablation and to a lesser extent by TFAP2C deletion. However, the double deletion condition does not appear to change these lncRNA levels beyond what is observed with HNRNPK ablation alone. Therefore, we concluded that these changes do not play a primary role in the phenotypic effects observed in our study. Thus, although interesting, we believe that the description of such observations goes beyond the scope of this manuscript and the relevance of this work.

      9) As both HNRNPK and TFAP2C modulate glucose metabolism, the authors have chosen to explore the epistatic interaction. This is most reasonable.

      Response: We do not have further comments on this point.

      10) The orthogonal assay to confirm that deletion of TFAP2C suppresses cell death upon removing HNRNPK- was this done using a single guide RNA or multiple guides - is there a level of suppression required to observe rescue? Interestingly ablation of HNRNPK increases TFAP2C expression in LN-229-C3 whereas in U251-Cas9 cells HNRNPK ablation has the opposite effect- both RNA and protein levels of TFAP2C are decreased - is this the cause of the smaller protective effect of TFAP2C deletion in this cell line?

      Response: TFAP2C deletion was performed using quadruple-guide RNAs (qgRNAs). We have clarified this point in addressing reviewer #2's point 6 in "Major comments".

      We did not directly test the threshold of TFAP2C inhibition required to suppress HNRNPK ablation-induced cell death. We cannot exclude that other effectors play a role in the smaller protective effect of TFAP2C deletion in the U251-Cas9 cells; however, multiple lines of evidence from our study suggest that TFAP2C expression levels influence cellular sensitivity to HNRNPK loss:

      1) Both LN-229 C3 and U251-Cas9 cells are less sensitive to HNRNPK ablation upon TFAP2C deletion (Fig. 1G-H, Fig. 2A-B, Supp. Fig.3A-B).

      2) We observed a correlation between endogenous TFAP2C levels and HNRNPK ablation sensitivity. U251-Cas9 cells, where TFAP2C expression is reduced upon HNRNPK ablation (in contrast to LN-229 C3 cells, where HNRNPK ablation leads to an increase in TFAP2C expression) (Fig. 2C-F), are a) less sensitive to HNRNPK deletion than LN-229 C3 (Fig. 1A, 2A-B) and b) the protective effect of TFAP2C deletion is less pronounced than in LN-229 C3 (Fig. 1G-H, Fig. 2A-B, Supp. Fig.3A-B).

      3) TFAP2C overexpression experiments (Fig. 2G) establish a causal relationship to the former correlation: TFAP2C overexpression increased U251-Cas9 sensitivity to HNRNPK ablation.

      As clearly mentioned in the manuscript, we believe that, taken together, these findings strongly demonstrate a causal role for TFAP2C in modulating sensitivity to HNRNPK loss. Thus, despite the differences in the expression, the proposed viability interaction between TFAP2C and HNRNPK is conserved across cell lines.

      To further strengthen our conclusions, we have now added LN-229 C3 TFAP2C overexpression in Fig. 2G (also attached below for your convenience). As for the U251-Cas9, LN-229 C3 cells show increased sensitivity to HNRNPK ablation upon TFAP2C overexpression.

      11) Nuclear localisation studies indicate that the HNRNPK and TFAP2C proteins colocalise in the nucleus however the co-IP data is not convincing- although appropriate controls are present, the level of interaction is very low - the amount of HNRNPK pulled down by TFAP2C is really very low in the LN-229C3 cells and even lower in the U251-Cas9 cells. Have they undertaken the reciprocal co-IP expt?

      Response: We rephrased our text to better highlight this as also mentioned in our response to reviewer #1 (Point 1 - Major comments). However, as also noted by the reviewer, the experiments included all the relevant controls. Thus, the results are solid and confirm a degree of co-immunoprecipitation (although weak). As detailed in our response to reviewer #1 (Point 1 - Major comments), to strengthen our conclusion, we have now repeated the experiment in low-detergent conditions and used benzonase nuclease for DNA digestion. We also have performed the reciprocal experiment as suggested by the reviewer, confirming the initial results. In our opinion, these additional experiments support the conclusion that Tfap2c and hnRNP K co-immunoprecipitate through a weak, but direct, interaction.

      12) They state that LN-229 C3 ∆TFAP2C and U251-Cas9 ∆TFAP2C were only mildly resistant to the apoptotic action of staurosporine (Fig 3E and F) - I accept they have undertaken the stats which support their statement that at high concentrations of staurosporine the LN-229 C3 ∆TFAP2C cells are less sensitive but the U251-Cas9 ∆TFAP2C decreased sensitivity is hard to believe. Has this been replicated? I agree that HNRNPK deletion causes apoptosis in both LN-229 C3 and U251-Cas9 cells and this is blocked by Z-VAD-FMK - however the block is not complete- the max viability for HNRNPK deletion in LN-229 C3 cells is about 40% whereas for U251-Cas9 cells it is about 30% - does this suggest that cells are being lost by another pathway? Have they tested concentrations higher than 10nM?

      Response: The experiments in Fig. 3E-F have been replicated four times, as stated in the figure legend. We agree that TFAP2C plays a limited role in response to staurosporine-induced apoptosis, particularly in U251-Cas9 cells. To ensure clarity, we have now modified our previous sentence as follows:

      "LN-229 C3ΔTFAP2C cells were only mildly resistant to the apoptotic action of staurosporine, and U251-Cas9ΔTFAP2C showed even lower and minimal recovery (Fig. 3E-F). These results indicate that TFAP2C plays a limited role in apoptosis regulation and suggest that its suppressive effect on HNRNPK essentiality is not mediated through direct modulation of apoptosis but rather through upstream processes that eventually converge on it."

      The incomplete blockade of apoptosis by Z-VAD-FMK suggests that HNRNPK ablation may activate alternative, non-caspase-mediated cell death pathways. Regarding this point, we decided to not test Z-VAD-FMK above 10 nM as we noted that the rescue effect at the lowest concentration (2nM) was not proportionally increasing at higher concentrations, suggesting we already reached saturation. We have now added and clarified these observations in the revised manuscript as follows:

      "Z-VAD-FMK decreased cell death consistently and significantly in LN-229 C3 and U251-Cas9 cells transduced with HNRNPK ablation qgRNAs (Fig. 3C‑D), confirming that HNRNPK deletion promotes cell apoptosis. However, we observed that viability recovery plateaued already at the lowest concentration (2 nM) without further increase at higher doses, suggesting a saturation effect. This indicates that while caspase inhibition alleviates part of the cell death, HNRNPK loss triggers additional mechanisms beyond apoptosis".

      Following the suggestion of the reviewer, we have now also tested two higher concentrations of Z-VAD (20 and 50nM) in LN-229 cells. At these concentrations, we observed a slight decrease in cell viability in the NT condition, with a rescue effect in the HNRNPK-ablated cells comparable to what was observed at 2-10nM Z-VAD. For this reason, we did not include these data in the revised manuscript, and we attached them here for transparency.

      13) The RNA-seq comparisons- the authors use log2 FC

      Response: We used a log2 FC threshold of >0.5 and 0.25) is commonly used in RNA-seq studies to capture biologically relevant shifts (e.g., https://doi.org/10.1371/journal.ppat.1012552; https://doi.org/10.1371/journal.ppat.1008653; https://doi.org/10.1016/j.neuron.2025.03.008; https://doi.org/10.15252/embj.2022112338). We complemented this analysis with Gene Set Enrichment Analysis (GSEA) to assess coordinated changes in biological/genetic pathways, ensuring that our conclusions are not based on isolated, minor expression changes nor on arbitrary thresholds. Finally, to enhance the robustness of our results, we applied False Discovery Rate (FDR) statistics, which is more stringent than a p-value cutoff. We hope this clarification strengthens the reviewer's confidence in the significance of the observed changes.

      14) It is stated: "Accordingly, we observed increased AMPK phosphorylation (pAMPK) upon ablation of HNRNPK, which was consistently reduced in LN-229 C3ΔTFAP2C cells (Supp. Fig. 5B). LN-229 C3ΔTFAP2C; ΔHNRNPK cells also showed a partial reduction of pAMPK relative to LN-229 C3ΔHNRNPK cells (Supp. Fig. 5B). These results suggest that hnRNP K depletion causes an energy shortfall, leading to cell death."

      I am not totally convinced by the data presented in this Fig. The authors have quantified the band intensity and present the ratio of pAMPK to AMPK. Please note that the actin levels are variable across the samples - did they normalise the data using the actin level before undertaking the comparisons? Also, if the authors think this is an important point which supports their conclusion, then it should be in the main body of the paper rather than the supplementary. If AMPK is being phosphorylated, this should lead to activation of the metabolic check point which involves p53 activation by phosphorylation. Activated p53 would turn on p21CIP1 which is a very sensitive indicator of p53 activation.

      Response: We also refer the reviewer to our response to reviewer #1 (Point 2 - Major comments). We understand the point of the reviewer as pAMPK/Actin (absolute AMPK phosphorylation) may provide additional context regarding the downstream effects of AMPK activation, which, however, is not the primary scope of our experiment. We believe that in our specific case, a) the pAMPK/AMPK ratio is the most appropriate metric, as it reflects the energy status of the cell (ATP/AMP levels), which was our main point to assess in this experiment, and b) phospho-protein/total protein is the standard approach for quantifying phosphorylation ratio. For completeness, we have now included pAMPK/Actin quantifications in Supp. Fig. 6B of the revised manuscript (also attached below). pAMPK/Actin levels follow the same trend of pAMPK/AMPK in HNRNPK and TFAP2C single ablations. The pAMPK/AMPK partial rescue in HNRNPK;TFAP2C double ablation relative to HNRNPK single deletion is instead not observed at pAMPK/Actin level. We have now added the pAMPK/Actin quantification and this observation to the revised manuscript as follows:

      "Accordingly, we observed increased AMPK phosphorylation (pAMPK/AMPK ratio and pAMPK/Actin) upon ablation of HNRNPK, with a trend toward reduction in LN-229 C3ΔTFAP2C cells (Supp. Fig. 6B). LN-229 C3ΔTFAP2C;ΔHNRNPK cells also showed a reduction of pAMPK/AMPK ratio relative to LN-229 C3ΔHNRNPK cells, although absolute AMPK phosphorylation (pAMPK/Actin) remained high (Supp. Fig. 6B)."

      We prefer to keep the AMPK blots in Supplementary Fig. 6B, as we believe the main take-home message of the manuscript should remain centered on mTORC1 activity.
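      The two normalizations discussed in this point (phospho-protein over total protein versus phospho-protein over a loading control) can be compared with a short sketch. It is a hedged illustration only: the function name `phospho_ratios` and all band intensities are hypothetical and do not correspond to the blots in the manuscript.

```python
def phospho_ratios(phospho, total, loading):
    """Return both normalizations for one lane of a western blot.

    phospho: band intensity of the phosphorylated protein (e.g. pAMPK)
    total:   band intensity of the total protein (e.g. AMPK)
    loading: band intensity of the loading control (e.g. actin)
    """
    if total <= 0 or loading <= 0:
        raise ValueError("total and loading intensities must be positive")
    return {
        "phospho_over_total": phospho / total,      # fraction of the protein pool that is phosphorylated
        "phospho_over_loading": phospho / loading,  # absolute phospho signal per unit of loaded material
    }


# Hypothetical intensities, for illustration only (not data from the study).
lane_a = phospho_ratios(phospho=20.0, total=100.0, loading=100.0)
lane_b = phospho_ratios(phospho=20.0, total=200.0, loading=100.0)
print(lane_a)
print(lane_b)
```

      In this made-up example the two lanes share the same phospho/loading value but differ in phospho/total, because the total protein level changed. This is one way the two metrics can diverge.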

      15) We also do not understand why the mTOR Suppl. Fig. 5E is not in the main body of the paper. It's clear that RNA and protein levels of mTOR were downregulated in LN-229 C3ΔHNRNPK cells but were partially rebalanced by the ΔTFAP2C- however the ΔTFAP2C;ΔHNRNPK double deletion levels are only slightly higher than the ΔHNRNPK - they are not at the level NT or even ΔTFAP2C (Fig. 4C, Supp. Fig. 5E).

      Response: We moved the mTOR blot to Fig.5D of the revised manuscript. About the low rescue effect, this is in line with all the other observations where a full rescue of the effects of HNRNPK ablation is never achieved, but is only partial. As suggested by reviewer #3 (Figure 5 - Point 2), we have now added RT-qPCR in Fig.5C, which corroborates these data.

      16) The authors state: "Deletion of HNRNPK diminished the highly phosphorylated forms of 4EBP1, which instead were preserved in both LN-229 C3ΔTFAP2C and LN-229 C3ΔTFAP2C;ΔHNRNPK cells (Fig. 5C). Similarly, the S6 phosphorylation ratio was reduced in LN-229 C3ΔHNRNPK cells and was restored in the ΔTFAP2C;ΔHNRNPK double-ablated cells (Fig. 5C)."

      We are not convinced that p4EBP1 is preserved in the LN-229 C3ΔTFAP2C cells - there is a very faint band which is at a lower level than the band in the LN-229 C3ΔHNRNPK cells. However, when both HNRNPK and TFAP2C were ablated, the p4EBP1 band is clear cut. I agree with the quantitation that deletion of HNRNPK and TFAP2C both reduce the level of 4EBP1 - the reduction is greater with TFAP2C but when both are deleted together the levels of 4EBP1 are higher and p4EBP1 is clearly present. In quantifying the S6 and pS6 levels, did the authors consider the actin levels- they present a ratio of the pS6 to S6. I may be lacking some understanding but why is the ratio of pS6/S6 being calculated? Is the level of pS6 not what is important - phosphorylation of S6 should lead it to being activated and thus it's the actual level of pS6 that is important, not the ratio to the non-phosphorylated protein.

      Response: In Fig. 5C, the three-band pattern of 4EBP1 is clearly visible in the NT+NT or WT condition, with the top band representing the highest phosphorylation state. Upon HNRNPK deletion, this top band almost completely disappears, mimicking the effect of our starvation control (Starv.). This top band remains clearly visible in both TFAP2C-ablated and double-ablated cells, supporting our conclusion. In our original text, we referred to the "highly phosphorylated forms" of 4EBP1, which might have caused some confusion, suggesting we were evaluating the two top bands. We are specifically referring only to the very top band (high p4EBP1), which represents the most highly phosphorylated form of 4EBP1. This is the relevant phosphorylated form to focus on, as it is the only one that disappears in the starvation control (Starv.) or upon mTORC1/2 inhibition with Torin-1 (Fig. 7B).

      To better clarify these points, we have now more clearly indicated the "high p4EBP1" band with an asterisk in Fig. 5E, added quantification of high p4EBP1/4EBP1, and rephrased the text as follows:

      "Deletion of HNRNPK diminished the highest phosphorylated form of 4EBP1 (high p4EBP1, marked with an asterisk), mimicking the effect observed in starved cells (Starv.). This high p4EBP1 band was preserved in both LN-229 C3ΔTFAP2C and LN-229 C3ΔTFAP2C;ΔHNRNPK cells (Fig. 5C)."

      Regarding pS6 quantification, we added pS6/Actin quantification in Supp. Fig. 6E and F of the revised manuscript, also attached here for your convenience.

      17) When determining ATP levels, do they control for cell number? HNRNPK depletion results in lower ATP levels, co-deletion of TFAP2C rescues this. But this could be because there is less cell-death? So, more cells express ATP. Have they controlled for relative numbers of cells.

      Response: As described in the Materials and Methods, we normalized ATP levels to total protein content, which is a standard approach for this type of quantification (see DOI:10.1038/nature19312).

      18) The construction of the HovL cell line that propagate ovine prions - very few details are provided of the susceptibility of the cell line to PG127 prions.

      Response: As with other prion-infected cell lines, HovL cells do not exhibit any specific growth defects, susceptibilities, or phenotypes beyond their ability to propagate prions. This is consistent with established observations in prion research, where immortalized cell lines (and in general in vitro cultures) normally do not show cytotoxicity upon prion infection and, therefore, are used as models for prion propagation rather than for prion toxicity (see https://doi.org/10.1111/jnc.14956 for reference).

      We now expanded the relevant section, including technical and conceptual details in the main text of the revised manuscript as follows:

      "As reported for other ovinized cell models (66), HovL cells were susceptible to infection by the PG127 strain of ovine prions and capable of sustaining chronic prion propagation, as shown by proteinase K (PK)-digested western blot and by detection of PrPSc using the anti-PrP antibody 6D11, which selectively stains prion-infected cells after fixation and guanidinium treatment (67) (Supp. Fig. 7C-E). Consistent with most prion-propagating cell lines (68), HovL cells did not exhibit specific growth defects, susceptibilities, or overt phenotypes beyond their ability to propagate prions."

      19) It is stated that HNRNPK depletion from HovL cells increases PrPSc as determined by 6D11 fluorescence, but in the manuscript HNRNPK depletion results in cell death. How does this come together?

      Response: As explicitly stated in the main text and shown in Fig.6-7, HNRNPK is downregulated (via siRNAs) in the prion experiments rather than fully deleted (via CRISPR) as in the first part of the manuscript. As shown in Supp. Fig. 8B, this downregulation does not affect cell viability within the experimental time window. Therefore, the observed increase in PrPSc levels upon HNRNPK downregulation, as determined by western blot and 6D11 staining, is independent of any potential cell death effects. Moreover, the same siRNA downregulation approach was used by M. Avar et al. (Ref. 26) in comparable experiments, yielding similar outcomes.

      20) They show that mTOR inhibition mimics the effect of HNRNPK deletion, why didn't they overexpress mTOR and see if that rescues this? This would indicate a causal relationship.

      Response: We appreciate the reviewer's suggestion. We agree that the proposed rescue strategy would be the best approach to indicate a causal relationship. However, we linked the activity of the mTORC1 complex (and not only that of mTOR) to prion propagation. Overexpression of mTOR alone would not restore full mTORC1 function, as Rptor would still be downregulated in the context of HNRNPK siRNA silencing (Fig. 7A and Supp. Fig. 8E). Moreover, our RNA-seq data (Supp. Table 5) from HNRNPK ablation indicate the downregulation of other mTORC1 components (namely Pras40 (AKT1S1) and mLST8). Therefore, the rescue of the mTORC1 activity by an overexpression strategy would be a very challenging approach. Given these complexities, to infer causality, we used mTORC1 inhibition (via rapamycin and Torin1) to mimic the effects of HNRNPK downregulation in reducing mTORC1 activity (Fig. 7B).

      For clarification, we have now highlighted in Fig. 4C that HNRNPK ablation downregulates also AKT1S1 and mLST8, other than mTOR and Rptor (also attached below), and we have discussed this in the main text as well. We also have clarified in the revised manuscript (where we sometimes inadvertently referred to it as just mTOR inhibition) that the observed effects are due to mTORC1 inhibition, and not simply mTOR inhibition.

      21) Flow cytometric data: supplementary Fig of Fig6d. - when they are looking at fixed cells the gating strategy for cells results in the inclusion of a lot of debris. The gate needs to be moved and be more specific to ensure results are interpreted properly. Same with the singlet gating. It's not tight enough, they include doublets as well which will skew their data. The data need to be regated.

      Response: We have reanalyzed the flow cytometry data in Fig. 6D with a more stringent gating approach to better exclude debris and ensure proper singlet selection. We confirm that there is no change in the final interpretation of the results after applying the updated gating strategy.

      Reviewer #2 (Significance (Required)):

      The manuscript "Prion propagation is controlled by a hierarchical network involving the nuclear Tfap2c and hnRNP K factors and the cytosolic mTORC1 complex" by Sellitto et al. aims to examine how heterogeneous nuclear ribonucleoprotein K (hnRNPK) limits prion propagation. They perform a synthetic-viability CRISPR-ablation screen to identify epistatic interactors of HNRNPK. They found that deletion of transcription factor AP-2γ (TFAP2C) suppressed the death of hnRNP K-depleted LN-229 and U-251 MG cells, whereas its overexpression hypersensitized them to hnRNP K loss. Moreover, HNRNPK ablation decreased cellular ATP, downregulated genes related to lipid and glucose metabolism, and enhanced autophagy. Simultaneous deletion of TFAP2C reversed these effects, restored transcription, and alleviated energy deficiency.

      Referee #3

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: Using a CRISPR-based high-throughput ablation assay, Sellitto et al. identified a list of genes that improve cell viability when deleted in hnRNP K knockout cells. Tfap2c, a transcription factor, was identified as a candidate with potential overlap with an hnRNP K function such as modulating glucose metabolism. The deletion of Tfap2c in the hnRNP K-deletion background prevented the caspase-dependent apoptosis observed in hnRNP K single-deletion cells. Further analysis of bulk RNA-seq in hnRNP K/TFAP2C single- and double-deletion cells revealed an impairment in cellular ATP level. Accordingly, activation of AMPK led to perturbed autophagy in hnRNP K-deleted cells. Moreover, the reduction and/or inactivation of the downstream mTOR protein resulted in reduced phosphorylation of S6. Conversely, the phosphorylation of S6 and 4EBP1 can be increased by TFAP2C overexpression. Finally, the pharmacological inhibition of the mTOR pathway increased the PrPSC level. This is an interesting paper potentially providing new mechanistic insight into hnRNPK function and its interaction with TFAP2C. However, inconsistencies in TFAP2C expression across cell lines and conflicting mechanistic interpretations complicate conclusions. Co-IP experiments suggested hnRNP K and Tfap2c may interact, though further validation is needed. Several figures require additional clarification, statistical analysis, or experimental validation to strengthen conclusions.

      Major comments:

      1) Different responses of the TFAP2C expression level to deletion of hnRNPK in the two cell lines (LN-229 C3 and U251-Cas9) should be more adequately addressed. The manuscript focuses on the interaction between hnRNPK and TFAP2C, yet the hnRNPK deletion causes different changes in TFAP2C level in the two lines. Furthermore, in studies where the mechanistic link between hnRNPK and TFAP2C is being investigated, only results from the LN-229 line are presented (Figure 4-7). Thus, it is not clear whether these mechanisms also apply to the other line, U251-Cas9, where hnRNPK deletion has the opposite effect on the TFAP2C level. Key experiments should therefore be performed in both lines.

      Response: The opposite effects of hnRNPK ablation on TFAP2C expression between LN-229 C3 and U251-Cas9 cells likely reflect intrinsic differences between the two cell lines. However, the viability interaction between hnRNPK and TFAP2C is conserved in both cell models (Fig. 1G-H, 2A-B, Supp. Fig. 3A-B), suggesting that shared molecular functions at the interface of this interaction exist across the lines. In fact, we believe that the opposite effect of hnRNPK ablation on TFAP2C expression in the two lines strengthens (rather than weakens) our model by highlighting how TFAP2C expression modulates cellular sensitivity to HNRNPK ablation, as detailed in our response to Reviewer #2 (Point 10 - Major comments).

      Regarding the mechanistic studies presented in Fig. 4-7, our initial goal in using two cell lines was to validate the functional viability interaction between HNRNPK and TFAP2C, as identified in our screening (performed in LN-229 C3 cells). After confirming this interaction, we chose to focus only on LN-229 C3 (beginning with RNA-seq analysis, which then led to subsequent mechanistic studies), as this provided the necessary foundation to investigate prion propagation in HovL cells (derived from LN-229). As a U251 model propagating prions does not exist, we are technically limited to performing prion experiments in HovL cells, and we do not believe that conducting additional experiments in U251 cells would add substantial value to our work or further our investigation.

      We hope this explanation clarifies our rationale and addresses the reviewer's concerns.

      2) Although a lot of data are presented, it is not clear how deletion of TFAP2C reverses the toxicity caused by deletion of hnRNPK. Specifically, the first half of the paper seems to suggest a mechanism opposite to that of the second half. In Figure 2-4, the authors suggest a model in which TFAP2C deletion has the opposite effect of hnRNPK deletion, thus rescuing toxicity. However, in Figure 5-6, it is suggested that TFAP2C overexpression has the opposite effect of hnRNPK deletion. These two opposite effects of TFAP2C make it difficult to understand the models that the authors are proposing. Please also see below comment 2 for Figure 5.

      Response: We respectfully disagree with the notion that the first and second halves of the manuscript propose contradictory mechanisms.

      In Fig. 2-4, we describe the phenotypic rescue of cell viability upon TFAP2C deletion in hnRNPK-deficient cells. At this stage, we are not proposing a specific molecular mechanism but simply observing a rescue of viability and highlighting underlying transcriptional differences. There is no implication of an opposite molecular mechanism involving the individual activities of hnRNPK and TFAP2C; rather, we focused on the broader effect of TFAP2C deletion on the viability of HNRNPK-lacking cells. In Fig. 5, we isolated a partial mechanism underlying this interaction. We state that: "These data specify a role for TFAP2C in promoting mTORC1-mediated cell anabolism and suggest that its overexpression might hypersensitize cells to HNRNPK ablation by depleting the already limited ATP available, thus making its deletion advantageous". In the discussion, we have now further revised our explanation: "HNRNPK deletion might cause a metabolic impairment leading to a nutritional crisis and a catabolic shift, whereas TFAP2C activation could promote mTORC1 anabolic functions. Thus, Tfap2c removal may rewire the bioenergetic needs of cells by modulating the mTORC1 signaling and augmenting their resilience to metabolic stress like the one induced by HNRNPK ablation". Therefore, we propose that TFAP2C expression might be particularly detrimental in hnRNPK-deficient cells, as it could push the cell into an anabolic biosynthetic state, further depleting energy stores that the cell is attempting to conserve in response to hnRNPK depletion. Removal of TFAP2C alleviates this metabolic strain. In our view, there is no contradiction between our observations.

      We hope this explanation clarifies our rationale and resolves any perceived inconsistency in our model. To make our interpretation easier to follow, we have also replaced Fig. 5E of the original manuscript with a graphical scheme (Fig. 5G of the revised manuscript) that illustrates our model (attached below for your convenience).

      3) Similar to the point above, the first half of the paper focuses on hnRNPK deletion-induced toxicity (Fig. 1-5), while the second half of the paper focuses on hnRNPK deletion-induced PrPSC level (Fig. 6-7). The mechanistic link between these two downstream effects of hnRNPK deletion is not clear and thus, it is difficult to understand the reason that hnRNPK deletion-induced toxicity can be rescued by TFAP2C deletion, while hnRNPK deletion-induced PrPSC level increase can be rescued by TFAP2C overexpression.

      Response: Our study is not aimed at comparing viability and prion propagation as interconnected phenotypes but rather at identifying molecular processes regulated by the HNRNPK-TFAP2C interaction. Our study identifies mTORC1 activity as a molecular process at the interface of the HNRNPK-TFAP2C interaction. HNRNPK knockout (or knockdown, which does not affect viability and is therefore used in the prion section of the manuscript) dampens mTORC1 activity, while TFAP2C overexpression enhances it. This finding suggested an explanation for the viability interaction we observed (see reply to reviewer #3 - Point 2 - Major comments) and provided a partial mechanism (mTORC1 activity) to explain the effect of HNRNPK knockdown and TFAP2C overexpression on prions.

      We hope this clarification addresses the reviewer's concern.

      Abstract:

      1) Please rephrase and clarify "We linked HNRNPK and TFAP2C interaction to mTOR signaling..." by distinguishing functional, genetic, and direct (molecule-to-molecule) interactions.

      Response: 1) We have now clarified it in the text of the revised manuscript as follows:

      "We linked HNRNPK and TFAP2C functional and genetic interaction to mTOR signaling, observing that HNRNPK ablation inhibited mTORC1 activity through downregulation of mTOR and Rptor, while TFAP2C overexpression enhanced mTORC1 downstream functions."

      2) A sentence reads, "...HNRNPK ablation inhibited mTORC1 activity through downregulation of mTOR and Rptor," although the downregulation of Rptor is observed only at the RNA level. The change in Rptor protein expression level is not reported in the manuscript. Please consider adding an experiment to address this or rephrase the sentence.

      Response: 2) We have now added the experiment in Supp. Fig. 9A of the revised manuscript. The blot shows that hnRNP K depletion reduces both mTOR and Rptor protein levels. "hnRNP K depletion inhibited mTORC1 activity through downregulation of mTOR and Rptor".

      Figure 2:

      1. H and I. Co-IP experiments were done using anti-TFAP2C antibody to the bead. Although the TFAP2C bands show robust signals on the blots, indicating successful enrichment of the protein, hnRNP K bands are very faint. Has the experiment been done by conjugating the hnRNP K antibody to the beads instead? Was the input lysate enriched in the nuclear fraction? Did the lysis buffer include nuclease (if so, please indicate in the figure legend and the methods section)? Addressing these would make the argument, "We also observed specific co-immunoprecipitation of hnRNP K and Tfap2c in LN-229 C3 and U251-Cas9 cells (Fig. 2H-I, Supp. Fig. 3L), suggesting that the two proteins form a complex inside the nucleus" stronger, providing information on potential direct binding.

      Response: 1. We refer the reviewer to our response to reviewers #1 and #2 regarding the weak interaction, the nuclease treatment, and the HNRNPK IP (reviewer #1 Point 1 and reviewer #2 Point 11 - Major comments). As for the co-IP input, it was not enriched in the nuclear fraction, but as shown in Supp. Fig. 4A-B hnRNPK and Tfap2c are exclusively nuclear.

      Figure 3:

      1. C and D. Please add a sentence in the figure legend explaining which means the multiple comparisons were made between (DMSO vs each drug concentration?). Graphing individual data points instead of bars would also be helpful and more informative. Please discuss the lack of dose dependency.

      Response: 1. We have now added information about the comparison in the figure legend ("Multiple comparison was made between Z-VAD-FMK and DMSO treatments in ΔHNRNPK cells."), modified the graph to show the individual data points (attached below for your convenience), and expanded the discussion as detailed for reviewer #2 (Point 14 - Major comments). For completeness, we have also modified Supp. Fig. 5F to show individual data points, and we have combined the graphs, as the DMSO control was shared across treatments.

      Supplemental Figure 4 (now Supplemental Figure 5):

      1. A. Although the trend can be observed, the deletion of hnRNP K does not significantly reduce the GPX4 protein level in LN-229 C3. Therefore, the following statement requires more data points and additional statistical analysis to be accurate: "In LN-229 C3 and U251-Cas9 cells, the deletion of HNRNPK reduced the protein level of GPX4, whereas TFAP2C deletion increased it (Supp. Fig. 4A-B)."

      2. A and B. The results are confusing, considering the previous report cited (ref 49) shows an increase in GPX4 with TFAP2C. It may be possible that the deletion of TFAP2C upregulates the expression of proteins with similar functions (e.g., Sp1). If this is the case, the changes in GPX4 expression observed here are a consequence of TFAP2C deletion and may not "suggest a role for HNRNPK and TFAP2C in balancing the protein levels of GPX4."

      Response: 1. We agree with the reviewer that in LN-229 C3 cells the reduction of GPX4 protein levels upon HNRNPK deletion did not reach statistical significance in our initial Western blot analysis. To address this concern, we performed six additional independent experiments and repeated the statistical analysis. Although the trend toward reduced GPX4 protein levels remained consistent, statistical significance was still not achieved (p > 0.05). Importantly, this trend is supported by our RNA-seq dataset (Supplementary Table 5), which shows decreased GPX4 expression upon HNRNPK deletion. We have now revised the text to more accurately reflect the experimental observations and to avoid overstating the effect in LN-229 C3 cells as follows:

      "In LN-229 C3 and U251-Cas9 cells, deletion of HNRNPK was associated with reduced glutathione peroxidase 4 (GPX4) protein abundance (although not statistically significant in LN-229 C3; p ≈ 0.08), whereas deletion of TFAP2C increased it (Supp. Fig. 5A-B)."

      The six new experimental replicates have been added to the uncropped western blot section.

      Response: 2. Concerning the potential role of TFAP2C deletion in upregulating proteins with similar functions, we recognize the reviewer's perspective. However, our primary focus is on the observed trends rather than a definitive mechanistic conclusion. We clarified our wording to acknowledge this possibility while maintaining the relevance of our findings within the broader context of hnRNPK and TFAP2C interactions.

      "This last result was interesting as a previous study reported that Tfap2c enhances GPX4 expression (51). Thus, the observed increase upon TFAP2C deletion suggests additional layers of regulation, potentially involving compensatory mechanisms."

      Supplemental Figure 5 (now Supplemental Figure 6):

      1. B. To obtain statistical significance and strengthen the conclusion, more repeated Western blot experiments can be done to quantify the pAMPK/AMPK ratio.

      Response: We included three more experiments as detailed in our response to reviewer #1 (Point 2 - Major comments) and reviewer #2 (Point 14 - Major comments).

      Figure 5:

      1. B. I believe statistical analysis with two replicates or less is not recommended. Although the assay is robust, and the blot is convincing, please consider adding more replicates if the blot is to be quantified and statistically analyzed.

      2. "Interestingly, RNA and protein levels of mTOR were downregulated in LN-229 C3ΔHNRNPK cells but were partially rebalanced by the ΔTFAP2C;ΔHNRNPK double deletion (Fig. 4C, Supp. Fig. E)." The statement is based on a slight difference at the protein level between the single deletion and the double deletion, as well as the observation from the bulk RNA-seq data. mTOR (and Rptor) mRNA levels can be assessed by RT-qPCR to validate and further support the existing data. It is also curious why deletion of TFAP2C alone also induced a decrease in mTOR, whereas the double deletion rescued the mTOR level slightly compared to deletion of HNRNPK alone.

      3. C. The main text refers to the changes in the level of phosphorylated 4EBP1, stating, "Deletion of HNRNPK diminished the highly phosphorylated forms of 4EBP1, which instead were preserved in both LN-229 C3ΔTFAP2C and LN-229 C3ΔTFAP2C;ΔHNRNPK cells (Fig. 5C)." However, the quantification was done on the total 4EBP1, which may be because separating p4EBP1 and 4EBP1 bands on a blot is challenging. Please consider using a phospho-4EBP1-specific antibody or rephrase the sentence mentioned above. The current data suggest that the single and double deletions of hnRNP K/TFAP2C affect the overall stability of 4EBP1, which may be a correlation and not due to the mTOR activity as claimed in "We conclude that HNRNPK and TFAP2C play an essential role in co-regulating cell metabolism homeostasis by influencing mTOR and AMPK activity and expression." How does the cap-dependent translation (or total protein level) change in TFAP2C-deleted and TFAP2C-overexpressing cells?

      Response: 1. We added two additional experiments as detailed in our response to reviewer #1 (Point 3 - Major comment).

      Response: 2. Deletion of TFAP2C does not decrease mTOR levels, as shown by the quantification in Fig. 5D. To further support our results, we have now included RT-qPCR in Fig. 5C as suggested by the reviewer. Data are also attached here for your convenience.

      Response: 3. Regarding the assessment of phosphorylated 4EBP1, we think we achieved a clear separation of the differently phosphorylated forms of 4EBP1 in our blots, and we have now added the quantification for High p4EBP1/4EBP1 in Fig. 5E (see also our response to reviewer #2 Point 16 - Major comments). The quantification of total 4EBP1 represents an additional dataset, and we do not claim that 4EBP1 stability is affected by HNRNPK and TFAP2C directly through mTOR, which could be, in fact, correlative. We claim that HNRNPK and TFAP2C modulate mTORC1 and AMPK metabolic signaling as shown by the changed phosphorylation of 4EBP1, S6, AMPK, and ULK1 (Fig. 5C-E, Supp. Fig. 6B, D) and by the regulation of autophagy (Fig. 5B, Supp. Fig. 6C); we did not directly check cap-dependent translation.

      We have now rephrased our text to ensure clarity as follows:

      "We conclude that HNRNPK and TFAP2C play a role in co-regulating mTORC1 and AMPK expression, signaling, and activity."

      Figure 6:

      1. A. Did the sihnRNP K increase the TFAP2C level?

      2. A and C. Are the total PrP levels lower in TFAP2C overexpressing cells compared to mCherry cells when they are infected?

      3. D. Do the TFAP2C protein levels differ between 2-day+72-h and 7-day+96-h?

      Response: 1. Yes, it does. We have now provided the quantification in Fig. 6A, C, and Supp. Fig. 8A (also attached below for your convenience).

      Response: 2. We have now provided the quantification in Fig. 6A and Supp. Fig. 8A. The total PrP does not change in TFAP2C-overexpressing cells. Total PrP consists of both PK-resistant PrP (PrPSc) and PK-sensitive PrP (PrPC plus potential other intermediate species), with PrPSc typically present at much lower levels. In our model, PrPC is exogenously expressed at high levels via a vector and remains constant across conditions (Fig. 6C and Supp. Fig. 8C). As a result, any changes in PrPSc may not necessarily be reflected in total PrP levels.

      Response: 3. No, there is no statistically significant change. We have now added a representative western blot and the quantification of 3 independent replicates in Supp. Fig. 8D. The other two western blots are only shown in the uncropped western blots section. This dataset is also attached here for your convenience.

      Figure 7:

      1. I agree with the latter half of the statement: "These findings suggest that HNRNPK influences prion propagation at least in part through mTORC1 signaling, although additional mechanisms may be involved." The first half requires careful rephrasing since (A) independent of the background siRNA treatment, TFAP2C overexpression by itself can modulate PrPSC level as seen in Fig 6A and B, (B) although the increase in TFAP2C level is observed with the hnRNP K deletion (Fig 1; LN-229 C3), sihnRNP K treatment may or may not influence the TFAP2C level (Fig 6; quantified data not provided), and (C) in the sihnRNP K-treated cells, the 4EBP1 level is increased compared to the siNT-treated cells, which was not observed in hnRNP K-deleted cells. Discussions and additional experiments (e.g., mTOR knockdown) addressing these points would be helpful.

      Response: A, B) We respectfully disagree with the possibility that HNRNPK downregulation may increase prion propagation via TFAP2C upregulation. As shown in Fig. 6A-B, D and in Supp. Fig. 8A, TFAP2C overexpression reduces, rather than increases, prion levels. Therefore, it would be inconsistent to suggest that HNRNPK siRNA promotes prion propagation through TFAP2C upregulation (quantification is now provided, see reviewer #3 - Figure 6 - Point 1). C) Concerning 4EBP1 levels, we have quantified the total 4EBP1 (also attached below) and expanded the discussion on potential discrepancies between HNRNPK knockout and knockdown, as the former affects cell viability, while the latter does not. However, as also explained in the previous reply to reviewer #3 - Figure 5 - Point 3, our focus is on the highly phosphorylated band of 4EBP1 (High p4EBP1), which is the direct target of mTORC1 activity. In both the hnRNPK knockout LN-229 C3 (Fig. 5E) and knockdown HovL models (Fig. 7B), phosphorylation of 4EBP1, along with phosphorylation of S6, is clearly reduced (we have now included quantification for Fig. 7B), reinforcing our conclusion that mTORC1 activity is affected by hnRNPK depletion. As the reviewer noted, we do not claim that mTORC1 is the sole mediator of hnRNPK's effect on prion regulation. However, we think that our interpretation of a potential and partial role of mTORC1 inhibition in the effect of HNRNPK downregulation on prion propagation is in line with the data presented in Fig. 6-7 and Supp. Fig. 8-9. For further clarification, we expanded the text according to the new experiments and analysis, and we added mTOR and Raptor siRNA knockdown (Supp. Fig. 9C) to further support our conclusions (also attached below for your convenience).

      Minor comments:

      1. Please clarify "independent cultures." Does this mean technical replicates on the same cell culture plate but different wells or replicated experiments on different days?

      Response: We have now clarified this in each figure legend. "Individually treated wells" means different parental cultures grown and treated separately on the same day. n represents independent experiments on different days.

      1. Fig 2G. Please explain how the sigmoidal curves were fitted to the data points under the materials and methods section.

      2. Fig 3E and F. Please refer to the comment on Fig 2G above.

      Response: We have now added the explanation in Materials and Methods as follows:

      "Curve Fitting

      For sigmoidal curve fitting, we used GraphPad Prism (version X, GraphPad Software). Data in Figure 2G were fitted using nonlinear regression with a least squares regression model. For Figures 3E and 3F, data fitting was performed using an asymmetric sigmoidal model with five parameters (5PL) and log-transformed X-values (log[concentration])."
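      For readers unfamiliar with the five-parameter logistic (5PL) model named above, the following minimal Python sketch shows one common parameterization with log-transformed X-values. It is an illustration only: the function names, parameter names, and example values are assumptions, and it is not the fitting procedure or software used for Figures 3E and 3F.

```python
import math


def log_xb_from_ec50(log_ec50, hill, s):
    """Shift log(EC50) so that the curve is at half-height exactly at EC50.

    With asymmetry s != 1 the inflection point no longer sits at the EC50,
    so the model is written in terms of an auxiliary location parameter.
    """
    return log_ec50 + (1.0 / hill) * math.log10(2.0 ** (1.0 / s) - 1.0)


def five_pl(log_x, bottom, top, log_ec50, hill, s):
    """Asymmetric sigmoidal (5PL) response at log10(concentration) = log_x.

    bottom, top : lower and upper plateaus
    log_ec50    : log10 of the concentration giving the half-maximal response
    hill        : slope factor
    s           : asymmetry factor (s = 1 reduces to the symmetric 4PL curve)
    """
    log_xb = log_xb_from_ec50(log_ec50, hill, s)
    return bottom + (top - bottom) / (1.0 + 10.0 ** ((log_xb - log_x) * hill)) ** s


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to show the shape of the curve.
    for log_conc in (-9.0, -7.0, -6.0, -5.0, -3.0):
        print(log_conc, round(five_pl(log_conc, 0.0, 100.0, -6.0, 1.0, 2.0), 3))
```

      In an actual analysis, the five parameters would be estimated by nonlinear least squares against the measured responses; the sketch only evaluates the model for given parameter values.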

      3. Fig S3 F/H. Quantification of gel bands would be helpful when comparing protein expression changes after different treatments, as band intensities look different across lanes.

      Response: We have now added the quantifications in Supp. Fig. 3D-H (attached below for your convenience). They confirm that there are no significant differences in the means of the normalized values.

      1. Supp Fig 5C and F. These panels can be combined with the corresponding panels in main Figure 5 if space allows so that the readers do not have to flip pages between the main text and Supplemental material.

      Response: We have now combined the panels. Previous Supp. Fig. 5C and F are now shown in Fig. 6C and E, respectively.

      Reviewer #3 (Significance (Required)):

      This is an interesting paper potentially providing new mechanistic insight into hnRNPK function and its interaction with TFAP2C. It is also important to understand how hnRNPK deletion induces prion propagation and to develop methods to mitigate its spread. However, inconsistencies in TFAP2C expression across cell lines and conflicting mechanistic interpretations complicate conclusions. I have expertise in RNA-binding proteins, cell biology, and prion disease.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      Using a CRISPR-based high-throughput ablation assay, Sellitto et al. identified a list of genes that improve cell viability when deleted in hnRNP K knockout cells. Tfap2c, a transcription factor, was identified as a candidate with potential overlap with an hnRNP K function such as modulating glucose metabolism. The deletion of Tfap2c in the hnRNP K-deletion background prevented the caspase-dependent apoptosis observed in hnRNP K single-deletion cells. Further analysis of bulk RNA-seq in hnRNP K/TFAP2C single- and double-deletion cells revealed an impairment in cellular ATP level. Accordingly, activation of AMPK led to perturbed autophagy in hnRNP K-deleted cells. Moreover, the reduction and/or inactivation of the downstream mTOR protein resulted in reduced phosphorylation of S6. Conversely, the phosphorylation of S6 and 4EBP1 can be increased by TFAP2C overexpression. Finally, the pharmacological inhibition of the mTOR pathway increased the PrPSC level. This is an interesting paper potentially providing new mechanistic insight into hnRNPK function and its interaction with TFAP2C. However, inconsistencies in TFAP2C expression across cell lines and conflicting mechanistic interpretations complicate conclusions. Co-IP experiments suggested hnRNP K and Tfap2c may interact, though further validation is needed. Several figures require additional clarification, statistical analysis, or experimental validation to strengthen conclusions.

      Major comments:

      1. Different responses of the TFAP2C expression level to deletion of hnRNPK in the two cell lines (LN-229 C3 and U251-Cas9) should be more adequately addressed. The manuscript focuses on the interaction between hnRNPK and TFAP2C, yet the hnRNPK deletion causes different changes in TFAP2C level in the two lines. Furthermore, in studies where the mechanistic link between hnRNPK and TFAP2C is being investigated, only results from the LN-229 line are presented (Figure 4-7). Thus, it is not clear whether these mechanisms also apply to the other line, U251-Cas9, where hnRNPK deletion has the opposite effect on the TFAP2C level. Key experiments should therefore be performed in both lines.
      2. Although a lot of data are presented, it is not clear how deletion of TFAP2C reverses the toxicity caused by deletion of hnRNPK. Specifically, the first half of the paper seems to suggest a mechanism opposite to that of the second half. In Figure 2-4, the authors suggest a model in which TFAP2C deletion has the opposite effect of hnRNPK deletion, thus rescuing toxicity. However, in Figure 5-6, it is suggested that TFAP2C overexpression has the opposite effect of hnRNPK deletion. These two opposite effects of TFAP2C make it difficult to understand the models that the authors are proposing. Please also see below comment 2 for Figure 5.
      3. Similar to the point above, the first half of the paper focuses on hnRNPK deletion-induced toxicity (Fig. 1-5), while the second half of the paper focuses on hnRNPK deletion-induced PrPSC level (Fig. 6-7). The mechanistic link between these two downstream effects of hnRNPK deletion is not clear and thus, it is difficult to understand the reason that hnRNPK deletion-induced toxicity can be rescued by TFAP2C deletion, while hnRNPK deletion-induced PrPSC level increase can be rescued by TFAP2C overexpression.

      Abstract.

      1. Please rephrase and clarify "We linked HNRNPK and TFAP2C interaction to mTOR signaling..." by distinguishing functional, genetic, and direct (molecule-to-molecule) interactions.
      2. A sentence reads, "...HNRNPK ablation inhibited mTORC1 activity through downregulation of mTOR and Rptor," although the downregulation of Rptor is observed only at the RNA level. The change in Rptor protein expression level is not reported in the manuscript. Please consider adding an experiment to address this or rephrase the sentence.

      Figure 2.

      1. H and I. Co-IP experiments were done using anti-TFAP2C antibody to the bead. Although the TFAP2C bands show robust signals on the blots, indicating successful enrichment of the protein, hnRNP K bands are very faint. Has the experiment been done by conjugating the hnRNP K antibody to the beads instead? Was the input lysate enriched in the nuclear fraction? Did the lysis buffer include nuclease (if so, please indicate in the figure legend and the methods section)? Addressing these would make the argument, "We also observed specific co-immunoprecipitation of hnRNP K and Tfap2c in LN-229 C3 and U251-Cas9 cells (Fig. 2H-I, Supp. Fig. 3L), suggesting that the two proteins form a complex inside the nucleus" stronger, providing information on potential direct binding.

      Figure 3.

      1. C and D. Please add a sentence in the figure legend explaining which means the multiple comparisons were made between (DMSO vs each drug concentration?). Graphing individual data points instead of bars would also be helpful and more informative. Please discuss the lack of dose dependency.

      Supplemental Figure 4.

      1. A. Although the trend can be observed, the deletion of hnRNP K does not significantly reduce the GPX4 protein level in LN-229 C3. Therefore, the following statement requires more data points and additional statistical analysis to be accurate: "In LN-229 C3 and U251-Cas9 cells, the deletion of HNRNPK reduced the protein level of GPX4, whereas TFAP2C deletion increased it (Supp. Fig. 4A-B)."
      2. A and B. The results are confusing, considering the previous report cited (ref 49) shows an increase in GPX4 with TFAP2C. It may be possible that the deletion of TFAP2C upregulates the expression of proteins with similar functions (e.g., Sp1). If this is the case, the changes in GPX4 expression observed here are a consequence of TFAP2C deletion and may not "suggest a role for HNRNPK and TFAP2C in balancing the protein levels of GPX4."

      Supplemental Figure 5.

      1. B. To obtain statistical significance and strengthen the conclusion, more repeated Western blot experiments can be done to quantify the pAMPK/AMPK ratio.

      Figure 5.

      1. B. I believe statistical analysis with two replicates or less is not recommended. Although the assay is robust, and the blot is convincing, please consider adding more replicates if the blot is to be quantified and statistically analyzed.
      2. "Interestingly, RNA and protein levels of mTOR were downregulated in LN-229 C3ΔHNRNPK cells but were partially rebalanced by the ΔTFAP2C;ΔHNRNPK double deletion (Fig. 4C, Supp. Fig. E)." The statement is based on a slight difference at the protein level between the single deletion and the double deletion, as well as the observation from the bulk RNA-seq data. mTOR (and Rptor) mRNA levels can be assessed by RT-qPCR to validate and further support the existing data. It is also curious why deletion of TFAP2C alone also induced a decrease in mTOR, whereas the double deletion rescued the mTOR level slightly compared to deletion of HNRNPK alone.
      3. C. The main text refers to the changes in the level of phosphorylated 4EBP1, stating, "Deletion of HNRNPK diminished the highly phosphorylated forms of 4EBP1, which instead were preserved in both LN-229 C3ΔTFAP2C and LN-229 C3ΔTFAP2C;ΔHNRNPK cells (Fig. 5C)." However, the quantification was done on the total 4EBP1, which may be because separating p4EBP1 and 4EBP1 bands on a blot is challenging. Please consider using a phospho-4EBP1-specific antibody or rephrase the sentence mentioned above. The current data suggest that the single and double deletions of hnRNP K/TFAP2C affect the overall stability of 4EBP1, which may be a correlation and not due to the mTOR activity as claimed in "We conclude that HNRNPK and TFAP2C play an essential role in co-regulating cell metabolism homeostasis by influencing mTOR and AMPK activity and expression." How does the cap-dependent translation (or total protein level) change in TFAP2C-deleted and TFAP2C-overexpressing cells?

      Figure 6.

      1. A. Did the sihnRNP K increase the TFAP2C level?
      2. A and C. Are the total PrP levels lower in TFAP2C overexpressing cells compared to mCherry cells when they are infected?
      3. D. Do the TFAP2C protein levels differ between 2-day+72-h and 7-day+96-h?

      Figure 7.

      1. I agree with the latter half of the statement: "These findings suggest that HNRNPK influences prion propagation at least in part through mTORC1 signaling, although additional mechanisms may be involved." The first half requires careful rephrasing since (A) independent of the background siRNA treatment, TFAP2C overexpression by itself can modulate the PrPSc level as seen in Fig 6A and B, (B) although the increase in TFAP2C level is observed with the hnRNP K deletion (Fig 1; LN-229 C3), sihnRNP K treatment may or may not influence the TFAP2C level (Fig 6; quantified data not provided), and (C) in the sihnRNP K-treated cells, the 4EBP1 level is increased compared to the siNT-treated cells, which was not observed in hnRNP K-deleted cells. Discussions and additional experiments (e.g., mTOR knockdown) addressing these points would be helpful.

      Minor comments:

      1. Please clarify "independent cultures." Does this mean technical replicates on the same cell culture plate but different wells or replicated experiments on different days?
      2. Fig 2G. Please explain how the sigmoidal curves were fitted to the data points under the materials and methods section.
      3. Fig 3E and F. Please refer to the comment on Fig 2G above.
      4. Fig S3 F/H. Quantification of gel bands would be helpful when comparing protein expression changes after different treatments, as band intensities look different across.
      5. Supp Fig 5C and F. These panels can be combined with the corresponding panels in main Figure 5 if space allows so that the readers do not have to flip pages between the main text and Supplemental material.

      Significance

      This is an interesting paper potentially providing new mechanistic insight into hnRNPK function and its interaction with TFAP2C. It is also important to understand how hnRNPK deletion induces prion propagation and to develop methods to mitigate its spread. However, inconsistencies in TFAP2C expression across cell lines and conflicting mechanistic interpretations complicate the conclusions. I have expertise in RNA-binding proteins, cell biology, and prion disease.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The manuscript "Prion propagation is controlled by a hierarchical network involving the nuclear Tfap2c and hnRNP K factors and the cytosolic mTORC1 complex" by Sellitto et al aims to examine how heterogeneous nuclear ribonucleoprotein K (hnRNPK) limits prion propagation. They perform a synthetic-viability CRISPR ablation screen to identify epistatic interactors of HNRNPK. They found that deletion of Transcription factor AP-2 (TFAP2C) suppressed the death of hnRNP K-depleted LN-229 and U-251 MG cells whereas its overexpression hypersensitized them to hnRNP K loss. Moreover, HNRNPK ablation decreased cellular ATP, downregulated genes related to lipid and glucose metabolism and enhanced autophagy. Simultaneous deletion of TFAP2C reversed these effects, restored transcription and alleviated energy deficiency.

      They state that HNRNPK and TFAP2C are linked to mTOR signalling and observe that HNRNPK ablation inhibits mTORC1 activity through downregulation of mTOR and Rptor while TFAP2C overexpression enhances mTORC1 downstream functions. In prion infected cells, TFAP2C activation reduced prion levels and countered the increased prion propagation due to HNRNPK suppression. Pharmacological inhibition of mTOR also elevated prion levels and partially mimicked the effects of HNRNPK silencing. They state their study identifies TFAP2C as a genetic interactor of HNRNPK and implicates their roles in mTOR metabolic regulation and establishes a causative link between these activities and prion propagation.

      This is an interesting manuscript in which a lot of work has been undertaken. The experiments are on the whole well done, carefully documented and support most of the conclusions drawn. However, there are places where it was quite difficult to read as some of the important results are in the supplementary Figures and it was necessary to go back and forth between the Figs in the main body of the paper and the supplementary Figs. There are also Figures in the supplementary which should have been presented in the main body of the paper. These are indicated in our comments below.

      We have the following questions /points:

      1. A plasmid harbouring four guide RNAs driven by four distinct constitutive promoters is used for targetting HNRNPK- is there a reason for using 4 guides- is it simply to obtain maximal editing - in their experience is this required for all genes or specific to HNRNPK?
      2. Is there a minimal amount of Cas9 required for editing?
      3. It is stated that cell death is delayed in U251-MG cells compared to LN-229-C3 cells- why? Also, why use glioblastoma cells other than that they have high levels of HNRNPK? Would neuroblastoma cells be more appropriate if they are aiming to test for prion propagation?
      4. Human CRISPR Brunello pooled library - does the Brunello library use constructs which have four independent guide RNAs as used for the silencing of HNRNPK?
      5. To rank the 763 enriched genes, they multiply the -log10FDR with their effect size - is this a standard step that is normally undertaken?
      6. The 32 genes selected- they were ablated individually using constructs with one guide RNA or four guide RNAs?
      7. The identified targets were also tested in U251-MG cells and nine were confirmed but the percent viability was variable - is the variability simply a reflection of the different cell line?
      8. The two strongest hits were IKBKAP and TFAP2C. As TFAP2C is a transcription factor - is it known to modulate expression of any of the genes that were identified to be perturbed in the screen? Moreover, it is stated that it regulates expression of several lncRNAs - have the authors looked at expression of these lncRNAs - is the expression affected - can modulation of expression of these lncRNAs modulate the observed phenotypic effects and also some of the targets they have identified in the screen?
      9. As both HNRNPK and TFAP2C modulate glucose metabolism, the authors have chosen to explore the epistatic interaction. This is most reasonable.
      10. The orthogonal assay to confirm that deletion of TFAP2C suppresses cell death upon removing HNRNPK - was this done using a single guide RNA or multiple guides - is there a level of suppression required to observe rescue? Interestingly, ablation of HNRNPK increases TFAP2C expression in LN-229-C3 whereas in U251-Cas9 cells HNRNPK ablation has the opposite effect - both RNA and protein levels of TFAP2C are decreased - is this the cause of the smaller protective effect of TFAP2C deletion in this cell line?
      11. Nuclear localisation studies indicate that the HNRNPK and TFAP2C proteins colocalise in the nucleus; however, the co-IP data is not convincing - although appropriate controls are present, the level of interaction is very low - the amount of HNRNPK pulled down by TFAP2C is really very low in the LN-229C3 cells and even lower in the U251-Cas9 cells. Have they undertaken the reciprocal co-IP experiment?
      12. They state that LN-229 C3 TFAP2C and U251-Cas9TFAP2C were only mildly resistant to the apoptotic action of staurosporine (Fig 3E and F) - I accept they have undertaken the stats which support their statement that at high concentrations of staurosporine the LN-229 C3 TFAP2C cells are less sensitive, but the U251-Cas9TFAP2C decreased sensitivity is hard to believe. Has this been replicated? I agree that HNRNPK deletion causes apoptosis in both LN-229 C3 and U251-Cas9 cells and this is blocked by Z-VAD-FMK - however the block is not complete - the max viability for HNRNPK deletion in LN-229 C3 cells is about 40% whereas for U251-Cas9 cells it is about 30% - does this suggest that cells are being lost by another pathway? Have they tested concentrations higher than 10nM?
      13. The RNA-seq comparisons- the authors use log2 FC <0.5 upregulated or genes downregulated by a similar amount- this is a very low cut off and would include essentially minimal changes in expression - not convinced of the significance of such low-level changes.
      14. It is stated: "Accordingly, we observed increased AMPK phosphorylation (pAMPK) upon ablation of HNRNPK, which was consistently reduced in LN-229 C3ΔTFAP2C cells (Supp. Fig. 5B). LN-229 C3ΔTFAP2C; ΔHNRNPK cells also showed a partial reduction of pAMPK relative to LN-229 C3ΔHNRNPK cells (Supp. Fig. 5B). These results suggest that hnRNP K depletion causes an energy shortfall, leading to cell death." I am not totally convinced by the data presented in this Fig. The authors have quantified the band intensity and present the ratio of pAMPK to AMPK. Please note that the actin levels are variable across the samples - did they normalise the data using the actin level before undertaking the comparisons? Also, if the authors think this is an important point which supports their conclusion, then it should be in the main body of the paper rather than the supplementary. If AMPK is being phosphorylated, this should lead to activation of the metabolic checkpoint, which involves p53 activation by phosphorylation. Activated p53 would turn on p21CIP1, which is a very sensitive indicator of p53 activation.
      15. We also do not understand why the mTOR Suppl. Fig. 5E is not in the main body of the paper. It's clear that RNA and protein levels of mTOR were downregulated in LN-229 C3ΔHNRNPK cells but were partially rebalanced by the ΔTFAP2C- however the ΔTFAP2C;ΔHNRNPK double deletion levels are only slightly higher than the ΔHNRNPK - they are not at the level NT or even ΔTFAP2C (Fig. 4C, Supp. Fig. 5E).
      16. The authors state: "Deletion of HNRNPK diminished the highly phosphorylated forms of 4EBP1, which instead were preserved in both LN-229 C3ΔTFAP2C and LN-229 C3ΔTFAP2C;ΔHNRNPK cells (Fig. 5C). Similarly, the S6 phosphorylation ratio was reduced in LN-229 C3ΔHNRNPK cells and was restored in the ΔTFAP2C;ΔHNRNPK double-ablated cells (Fig. 5C)."

      We are not convinced that p4EBP1 is preserved in the LN-229 C3ΔTFAP2C cells - there is a very faint band which is at a lower level than the band in the LN-229 C3ΔHNRNPK cells. However, when both HNRNPK and TFAP2C were ablated, the p4EBP1 band is clear cut. I agree with the quantitation that deletion of HNRNPK and TFAP2C both reduce the level of 4EBP1 - the reduction is greater with TFAP2C but when both are deleted together the levels of 4EBP1 are higher and p4EBP1 is clearly present. In quantifying the S6 and pS6 levels, did the authors consider the actin levels - they present a ratio of the pS6 to S6. I may be lacking some understanding but why is the ratio of pS6/S6 being calculated? Is the level of pS6 not what is important - phosphorylation of S6 should lead it to being activated and thus it's the actual level of pS6 that is important, not the ratio to the non-phosphorylated protein.

      17. When determining ATP levels, do they control for cell number? HNRNPK depletion results in lower ATP levels, and co-deletion of TFAP2C rescues this. But this could be because there is less cell death, so that more cells are producing ATP. Have they controlled for relative numbers of cells?
      18. The construction of the HovL cell line that propagates ovine prions - very few details are provided of the susceptibility of the cell line to PG127 prions.
      19. It is stated that HNRNPK depletion from HovL cells increases PrPSc as determined by 6D11 fluorescence, but in the manuscript HNRNPK depletion results in cell death. How does this come together?
      20. They show that mTOR inhibition mimics the effect of HNRNPK deletion; why didn't they overexpress mTOR and see if that rescues this? This would indicate a causal relationship.
      21. Flow cytometric data: supplementary Fig of Fig6d. - when they are looking at fixed cells the gating strategy for cells results in the inclusion of a lot of debris. The gate needs to be moved and be more specific to ensure results are interpreted properly. The same applies to the singlet gating: it is not tight enough and includes doublets, which will skew the data. The data need to be regated.

      Significance

      The manuscript "Prion propagation is controlled by a hierarchical network involving the nuclear Tfap2c and hnRNP K factors and the cytosolic mTORC1 complex" by Sellitto et al aims to examine how heterogeneous nuclear ribonucleoprotein K (hnRNPK) limits prion propagation. They perform a synthetic-viability CRISPR ablation screen to identify epistatic interactors of HNRNPK. They found that deletion of Transcription factor AP-2 (TFAP2C) suppressed the death of hnRNP K-depleted LN-229 and U-251 MG cells whereas its overexpression hypersensitized them to hnRNP K loss. Moreover, HNRNPK ablation decreased cellular ATP, downregulated genes related to lipid and glucose metabolism and enhanced autophagy. Simultaneous deletion of TFAP2C reversed these effects, restored transcription and alleviated energy deficiency.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The paper by Sellitto et al. describes studies to determine the mechanism by which hnRNPK modulates the propagation of prions. The authors use cell models lacking HNRNPK, which is lethal, in a CRISPR screen to identify genes that suppress lethality. Based on this screen in two different cell lines, a gene termed Tfap2C emerged as a candidate for interaction with HNRNPK. They show that Tfap2C counteracts the actions of HNRNPK with respect to prion propagation. Cells lacking HNRNPK show increased PrPSc levels. Overexpression of Tfap2C suppresses PrPSc levels. These effects on PrPSc are independent of PrPC levels. By RNAseq analysis, the authors hone in on metabolic pathways regulated by HNRNPK and Tfap2C, then follow the data to autophagy regulation by mTor. Ultimately, the authors show that short-term treatments of these cell models with mTor inhibitors cause increased accumulation of PrPSc. The authors conclude that the loss of HNRNPK leads to reduced energy metabolism causing mTor inhibition, which reduces translation by dephosphorylation of S6.

      Major Comments

      Fig H and I, Fig 3L. The interaction between Tfap2C and HNRNPK is pretty weak. The interaction may not be consequential. The experiment seems to be well controlled, yielding limited interaction. The co-IP was done in PBS with no detergent. The authors indicate that the cells were mechanically disrupted. Since both of these are DNA-binding proteins, is it possible that the observed interaction is due to proximity on DNA that is linking the 2 proteins? Including a DNase treatment would clarify this.

      Supplemental Fig 5B - The western blot images for pAMPK don't really look like a 2 fold increase in phosphorylation in HNRNPK deletion.

      Fig. 5A - I don't think it is proper to do statistics on an n of 2.

      Fig 6D. The data look a bit more complicated than described in the text. At 7 days, compared to 2 days, it looks like there is a decrease in % cells positive for 6D11. Is there clearance of PrPSc or proliferation of un-infected cells?

      The authors might consider a different order of presenting the data. Fig 6 could follow Fig. 2 before the mechanistic studies in Figs 3-5.

      The authors use SEM throughout the paper and while this is often used there has been some interest in using StdDev to show the full scope of variability.

      Discussion: The discrepancy between short-term and long-term treatments with mTor inhibitors is only briefly mentioned with a bit of a hand-waving explanation. The authors may need a better explanation.

      Minor Comments

      Page 12 - no mention of chloroquine in the text or related data.

      Page 12 - Supp. Fig. E - should be 5E

      Significance

      The study provides mechanistic insight into how HNRNPK modulates prion propagation. The paper is limited to cell models, and the authors note that long term treatment with mTor inhibitors reduced PrPSc levels in an in vivo model.

      The primary audience will be other prion researchers. There may be some broader interest in the mTor pathway and the role of HNRNPK in other neurodegenerative diseases.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary:

      The study provides a comprehensive overview of genome size variation in two related species of the genus Epidendrum, which appear to be homoploid, although their DNA content more closely corresponds to that of heteroploid species. While I have a few serious concerns regarding the data analysis, the study itself demonstrates a well-designed approach and offers a valuable comparison of different methods for genome size estimation. In particular, I would highlight the analysis of repetitive elements, which effectively explains the observed differences between the species. However, I encourage the authors to adopt a more critical perspective on the k-mer analysis and the potential pitfalls in data interpretation.

      Major comments:

      R1. p. 9: Genome size estimation via flow cytometry is an incorrect approach. The deviation is approximately 19% for E. anisatum and about 25% for E. marmoratum across three repeated measurements of the same tissue over three days? These values are far beyond the accepted standards of best practice for flow cytometry, which recommend a maximum deviation of 2-5% between repeated measurements of the same individual. Such variability indicates a systemic methodological issue or improper instrument calibration. Results with this level of inconsistency cannot be considered reliable estimates of genome size obtained by flow cytometry. If you provide the raw data, I can help identify the likely source of error, but as it stands, these results are not acceptable.

      A: Thanks a lot for pointing out this issue. We have identified the source of the wide interval after consulting with the staff of LabNalCit. We originally used human peripheral blood mononuclear cells (PBMCs) as a reference to estimate the genome size (GS) of P. sativum and used the resulting range to estimate the GS of Epidendrum. We calculated P. sativum's GS using a wide human GS range of 6-7 Gb, which resulted in a wide range of P. sativum GS and, consequently, in a wide range of GS for our samples. Therefore, the wide range reported is not an issue with the instruments, but with the specifics of the analysis.

      We have made the following changes:

      1. We narrowed the range of P. sativum's GS by using a narrower human genome size range (6.41-6.51; Piovesan et al. 2019; DOI: 10.1186/s13104-019-4137-z), and used these intervals to calculate our samples' GS.
      2. We have explained our procedure in the methods, changed our results as required, and included a supplementary table with cytometry data (Supplementary Data Table 1).
      3. Human peripheral blood mononuclear cells (PBMCs) from healthy individuals were used as a standard laboratory reference to calculate the P. sativum genome size. Pisum sativum and the Epidendrum samples were analyzed in a CytoFLEX S flow cytometer (Beckman-Coulter), individually and in combination with the internal references (PBMCs and P. sativum, respectively). Cytometry data analysis was performed using FlowJo® v. 10 (https://www.flowjo.com/). A genome size value for the Epidendrum samples was calculated as the average of the minimum and maximum 1C/2C values obtained from three replicates of the DNA content histograms of each tissue sample. Minimum and maximum values come from the interval of P. sativum estimations based on the human genome size range (human genome size range: 6.41-6.51; Piovesan et al. 2019).
      4. The 1C value in gigabases (Gb; calculated from mass in pg) of E. anisatum ranged from 2.55 to 2.62 Gb (mean 1C value = 2.59 Gb) and that of E. marmoratum from 1.11 to 1.18 Gb (mean 1C value = 1.13 Gb; Supplementary Data Table S1).
      5. We also eliminated from Figure 3 the range we had estimated previously.
      6. Finally, we changed the focus of the comparison and discussion of the evaluation of the bioinformatic estimations, highlighting this deviation rather than whether the GS bioinformatic estimations fall within the cytometric interval. We calculated the Mean Absolute Deviation (MAD) as the absolute difference between the genome size estimates using k-mers and flow cytometry. This meant changing the results in P. 11 and 12 and adding to Fig. 3 two boxplots depicting the MAD. We have also added Supplementary Data Fig. S3 depicting the absolute deviations for E. anisatum and E. marmoratum per tool using the estimates generated from a k-mer counting with a maximum k-mer coverage value of 10,000 using 16 different values of k; a Supplementary Data Figure S5 depicting the mean absolute deviations resulting from the different subsampled simulated depths of coverage of 5×, 10×, 20×, 30×, and 40×; and finally a Supplementary Data Fig. S6 depicting the MAD changes as a function of depth of coverage for E. anisatum and E. marmoratum.
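
      For illustration only, the arithmetic described in the list above can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' actual analysis: the function names and the fluorescence peak positions are hypothetical, and the only values taken from the text are the human genome size range (6.41-6.51) and the mean 1C value of 2.59 Gb. The conversion 1 pg ≈ 0.978 Gb is the widely used mass-to-length factor for DNA.

```python
# Hypothetical sketch: propagate a reference genome-size interval through
# flow cytometry peak ratios, then summarise k-mer estimates by their
# mean absolute deviation (MAD) from the cytometric value.

PG_TO_GB = 0.978  # widely used conversion: 1 pg of DNA is about 0.978 Gb


def scale_by_peak_ratio(ref_size, sample_peak, ref_peak):
    """Sample genome size from a reference size and the ratio of peak positions."""
    return ref_size * sample_peak / ref_peak


def size_interval(ref_interval, sample_peak, ref_peak):
    """Return (low, high, midpoint) for a sample given a reference size interval."""
    low, high = (scale_by_peak_ratio(r, sample_peak, ref_peak) for r in ref_interval)
    return low, high, (low + high) / 2


def mean_absolute_deviation(kmer_estimates, cytometry_value):
    """Mean of |k-mer estimate - flow cytometry estimate| over all estimates."""
    return sum(abs(e - cytometry_value) for e in kmer_estimates) / len(kmer_estimates)


if __name__ == "__main__":
    human_range = (6.41, 6.51)  # range quoted in the reply
    # Peak positions below are made up purely to show the calculation.
    print(size_interval(human_range, sample_peak=140.0, ref_peak=100.0))
    # k-mer estimates below are made up; 2.59 Gb is the mean 1C value from the reply.
    print(mean_absolute_deviation([2.0, 2.2, 2.4], 2.59))
```

      Carrying the lower and upper bounds of the reference through the same ratio keeps the width of the final interval tied to the uncertainty of the reference, which is the effect the reply describes when the human range is narrowed.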

      R1. p. 14 and some parts of Introduction: It may seem unusual, to say the least, to question genome size estimation in orchids using flow cytometry, given that this group is well known for extensive endoreplication. However, what effect does this phenomenon have on genome size analyses based on k-mers, or on the correct interpretation of peaks in k-mer histograms? How can such analyses be reliably interpreted when most nuclei used for DNA extraction and sequencing likely originate from endoreplicated cells? I would have expected a more detailed discussion of this issue in light of your results, particularly regarding the substantial variation in genome size estimates across different k-mer analysis settings. Could endoreplication be a contributing factor?

      A:

      1. We reworded the introduction p.3, 2nd paragraph to make our point on the effect of endoreplication on flow cytometry clearer.
      2. We eliminated the following sentence from discussion p. 15: "Difficulties for cytometric estimation of genome size can thus be taxon-specific. Therefore, cross-validating flow cytometry and bioinformatics results can be the most effective method for estimating plant genome size, especially when only tissues suspected to show significant endoreplication, such as leaves, are available"
      3. We added the following, p. 18: Genome size estimation for non-model species is considered a highly standardized approach. However, tissue availability and intrinsic genome characteristics (large genomes, polyploidy, endoreplication, and the proportion of repetitive DNA) can still preclude genome size estimation (e.g. Kim et al. 2025) using cytometry and bioinformatic tools. Cross-validating flow cytometry and bioinformatics results might be particularly useful in those cases. For example, when only tissues suspected of showing significant conventional endoreplication, such as leaves, are available, bioinformatic tools can help to confirm that the first peak in cytometry histograms corresponds to 2C. Conversely, bioinformatic methods can be hindered by partial endoreplication, which only flow cytometry can detect.
      4. We included a paragraph discussing the effect of CE and PE on bioinformatic GS estimation P. 17:
      

      Besides ploidy level, heterozygosity, and the proportion of repetitive DNA, k-mer distribution can be modified by endoreplication. Since endoreplication of the whole genome (CE) produces genome copies (as in preparation for cell division, but nuclear and cell division do not occur), we do not expect an effect on genome size estimates based on k-mer analyses. In contrast, PE alters coverage of a significant proportion of the genome, affecting k-mer distributions and genome size estimates (Piet et al., 2022). Species with PE might be challenging for k-mer-based methods of genome size estimation.
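
      The reasoning in the paragraph above can be illustrated with a deliberately simplified toy model in Python. It is a sketch under stated assumptions and is not the model of Piet et al. (2022): reads are assumed to be drawn in proportion to the amount of DNA per region, the k-mer estimate is taken as sequenced bases divided by the depth of the main histogram peak, and the main peak is assumed to come from whichever compartment (replicated or not) holds more distinct sequence. The function and parameter names are hypothetical.

```python
def kmer_size_estimate(true_size, replicated_fraction, copies, sequenced_bases=100.0):
    """Toy k-mer genome size estimate when part of the genome is endoreplicated.

    true_size           -- real haploid genome size (arbitrary units)
    replicated_fraction -- share of the genome that is endoreplicated (0 to 1)
    copies              -- copy number of the replicated share relative to the rest
    sequenced_bases     -- total sequencing yield (same units as true_size)
    """
    # Amount of DNA in the extract per unit of genome sequence.
    dna_per_genome = replicated_fraction * copies + (1.0 - replicated_fraction)
    # Sequencing depth of the non-replicated and of the replicated compartment.
    base_depth = sequenced_bases / (true_size * dna_per_genome)
    replicated_depth = copies * base_depth
    # Assumption: the main peak reflects the compartment with more distinct sequence.
    peak_depth = replicated_depth if replicated_fraction >= 0.5 else base_depth
    return sequenced_bases / peak_depth


if __name__ == "__main__":
    # Whole-genome endoreplication (CE): every region is copied equally.
    print(kmer_size_estimate(1.0, replicated_fraction=1.0, copies=2))
    # Partial endoreplication (PE): only part of the genome is copied.
    print(kmer_size_estimate(1.0, replicated_fraction=0.3, copies=2))
    print(kmer_size_estimate(1.0, replicated_fraction=0.7, copies=2))
```

      In this toy model, uniform copying leaves relative depth unchanged and the estimate equals the true size, whereas uneven copying shifts the estimate up or down depending on which compartment forms the main peak.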

      R1. You repeatedly refer to the experiment on genome size estimation using analyses with maximum k-mer coverage of 10,000 and 2 million, under different k values. However, I would like to see a comparison - such as a correlation analysis - that supports this experiment. The results and discussion sections refer to it extensively, yet no corresponding figure or analysis is presented.

      A:

      1. We had previously included the results of the analyses using different k-mer coverage in the Supplementary Data Figure S2. We have added, to formally compare the results using analyses with maximum k-mer coverage of 10,000 and 2 million, a Wilcoxon paired signed-rank test, which showed a significant difference, p. 12: The estimated genome sizes using a maximum count value of 10,000 were generally lower for all tools in both species compared to using a maximum count value of 2 million (median of 2M experiment genome size - median of 10K experiment genome size= 0.24 Gb). The estimated genome size of the 2 million experiment also tended to be closer to the flow cytometry genome size estimation with significantly lower MAD than the 10K experiment (Wilcoxon paired signed-rank test p = 0.0009). In the 10K experiment (Supplementary Data Figure S2; S3), the tool with the lowest MAD for E. anisatum was findGSE-het (0.546 Gb) and for E. marmoratum it was findGSE-hom (0.116 Gb).

       2. We have added a boxplot in the Supplementary Data Figure S3 depicting the mean absolute deviations using maximum k-mer coverage of 10,000 and 2 million compared to flow cytometry.
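
      As an illustration of the paired comparison described above, the following Python sketch implements an exact two-sided Wilcoxon signed-rank test for small samples using only the standard library. It is a minimal sketch, not the authors' analysis: the function name and the deviation values are made up for the example, and established implementations (for instance `scipy.stats.wilcoxon`) would normally be preferred.

```python
from itertools import product


def signed_rank_test(x, y):
    """Exact two-sided Wilcoxon paired signed-rank test (small n only).

    Returns (statistic, p_value), where the statistic is the smaller of the
    positive and negative rank sums. Zero differences are dropped.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    total_rank = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    statistic = min(w_plus, total_rank - w_plus)
    # Exact null distribution: each difference is equally likely to be + or -.
    as_extreme = 0
    for signs in product((0, 1), repeat=n):
        wp = sum(r for r, s in zip(ranks, signs) if s)
        if min(wp, total_rank - wp) <= statistic:
            as_extreme += 1
    return statistic, as_extreme / 2 ** n


if __name__ == "__main__":
    # Made-up absolute deviations (Gb) for six tools under two settings.
    deviation_10k = [0.60, 0.55, 0.70, 0.48, 0.66, 0.52]
    deviation_2m = [0.40, 0.30, 0.45, 0.20, 0.35, 0.25]
    print(signed_rank_test(deviation_10k, deviation_2m))
```

      The test is paired because each tool contributes one deviation under each maximum k-mer coverage setting, so only the within-tool differences enter the ranking.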
      

      Minor comments:

      R1. p. 3: You stated: "Flow cytometry is the gold standard for genome size estimation, but whole-genome endoreplication (also known as conventional endoreplication; CE) and strict partial endoreplication (SPE) can confound this method." What did you mean by this? Endopolyploidy is quite common in plants, and flow cytometry is an excellent tool to detect it and to select the proper nuclei fraction for genome size estimation (if you are aware of possible misinterpretation caused by using inappropriate tissue for analysis). The same applies to partial endoreplication in orchids (see e.g. Travnicek et al 2015). Moreover, the term "strict partial endoreplication" is outdated and is only used by Brown et al. In more recent studies, the term partial endoreplication is used (e.g. Chumova et al. 2021 - 10.1111/tpj.15306 or Piet et al. 2022 - 10.1016/j.xplc.2022.100330).

      A:

      We have reworded the paragraph where we stated "Flow cytometry is the gold standard for genome size estimation", as in the answer to Major comment 2. Additionally, we highlighted in the discussion how, while FC is the gold standard for GS estimation, studying multiple alternatives to it may be important for cases in which live tissue is not available or is available only to a limited extent (i.e. only certain tissues), p. 18. We have changed the term "strict partial endoreplication" to partial endoreplication (PE).

      R1. p. 5: "...both because of its outstanding taxic diversity..." There is no such thing as "taxic" diversity - perhaps you mean taxonomic diversity or species richness.

      A: We have changed "taxic diversity" to "species diversity".

      R1. p. 6: In description of flow cytometry you stated: "Young leaves of Pisum sativum (4.45 pg/1C; Doležel et al. 1998) and peripheral blood mononuclear cells (PBMCs) from healthy individuals...". What does that mean? Did you really use blood cells? For what purpose?

      A: Please find the explanation and the modifications we've made in the answer to major comment 1.

      R1. p. 7: What do you mean by this statement "...reference of low-copy nuclear genes for each species..."? As far as I know, the Granados-Mendoza study used the Angiosperm v.1 probe set, so did you use that set of probes as reference?

      A: We rewrote: "To estimate the allele frequencies, the filtered sequences were mapped to a reference of low-copy nuclear genes for each species" to:

      To estimate the allele frequencies, the filtered sequences were mapped to the Angiosperm v.1 low-copy nuclear gene set of each species.

      R1. p. 7: Chromosome counts - there is a paragraph of methodology used for chromosome counting, but no results of this important part of the study.

      A: We are including a supplementary figure (Supplementary Data Figure 7) with micrographs of the chromosomes of E. anisatum and E. marmoratum.

      R1. p. 12: Depth of coverage used in repeatome analysis - why did you use different coverage for the two species? An explanation is needed.

      A: To make explicit the fact that the depth of coverage is determined automatically by the analysis with no consideration for the amount of input reads, but only of the graph density and the amount of RAM available (Box 3 in Novak et al. 2020), we rewrote:

      "To estimate the proportion of repetitive DNA, the individual protocol analyzed reads corresponding to depths of coverage of 0.06× for Epidendrum anisatum and 0.43× for E. marmoratum." to

      To estimate the proportion of repetitive DNA, the RepeatExplorer2 individual protocol determined a max number of analyzed reads (Nmax) corresponding to depths of coverage of 0.06x for Epidendrum anisatum and 0.43x for E. marmoratum.

      R1. p. 16: The variation in genome size of orchids is even higher, as the highest known DNA amount has been estimated in Liparis purpureoviridis - 56.11 pg (Travnicek et al 2019 - doi: 10.1111/nph.15996)

      A: We have updated it.

      R1. Fig. 1 - Where is the standard peak on Fig. 1? You mention it explicitly on page 9 where you are talking about FCM histograms.

      A: We reworded the results, eliminating the references to the standard internal reference.

      Reviewer #1 (Significance (Required)):

      Significance

      This study provides a valuable contribution to understanding genome size variation in two Epidendrum species by combining flow cytometry, k-mer analysis, and repetitive element characterization. Its strength lies in the integrative approach and in demonstrating how repetitive elements can explain interspecific differences in DNA content. The work is among the first to directly compare flow cytometric and k-mer-based genome size estimates in orchids, extending current knowledge of genome evolution in this complex plant group. However, the study would benefit from a more critical discussion of the limitations and interpretative pitfalls of k-mer analysis and from addressing methodological inconsistencies in the cytometric data. The research will interest a specialized audience in plant genomics, cytogenetics, and genome evolution, particularly those studying non-model or highly endoreplicated species.

      Field of expertise: plant cytogenetics, genome size evolution, orchid genomics.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      With this work, the authors provide genome profiling information on the Epidendrum genus. They performed low-coverage short read sequencing and analysis, as well as flow cytometry approaches to estimate genome size, and performed a comparative analysis of these methods. They also used the WGS dataset to test different approaches and models for genome profiling, as well as repeat abundance estimation, emphasising the importance of genome profiling to provide basic and comparative genomic information in our non-model study species. Results show that the two "closely-related" Epidendrum species analysed (E. marmoratum and E. anisatum) have different genome profiles, exhibiting a 2.3-fold genome size difference, mostly triggered by the expansion of repetitive elements in E. marmoratum, especially of Ty3-Gypsy LTR-retrotransposon and a 172 bp tandem repeat (satellite DNA).

      Major comments:

      Overall, the manuscript is well-written, the aim, results and methods are explained properly, and although I missed some information in the introduction, the paper structure is overall good, and it doesn't lack any important information. The quality of the analysis is also adequate and no further big experiments or analysis would be needed.

      However, from my point of view, two main issues would need to be addressed:

      R2. The methods section is properly detailed and well explained. However, the project data and scripts are not available at the figshare link provided, and the BioProject code provided is not found at SRA. This needs to be solved as soon as possible: if they are not available for review, the reproducibility of the manuscript cannot be fully assessed.

      A: We have made public the .histo files for all depths of coverage and the cluster table files necessary to reproduce the results. We will also make public a fraction of the sequencing data sufficient to reproduce our genome size and repetitive DNA results as soon as the manuscript is formally published. Availability of the whole dataset is pending publication of the whole genome draft.

      R2. The authors specify in the methods that 0.06x and 0.43x sequencing depths were used as inputs for the RE analysis of E. anisatum and E. marmoratum. I understand these are differences based on the data availability and genome size differences. However, they don't correspond to either of the recommendations from Novak et al (2020):

      In the context of individual analysis: "The number of analyzed reads should correspond to 0.1-0.5× genome coverage. In the case of repeat-poor species, coverage can be increased up to 1.0-1.5×." Therefore, using 0.06x for E. anisatum should be justified, or at least addressed in the discussion.

      Moreover, using such a difference in coverage might affect any comparisons made using these results. Given that the number of reads is not limiting in this case, the reason for using such specific coverages should be discussed in detail.

      In the context of comparative analysis: "Because different genomes are being analyzed simultaneously, the user must decide how they will be represented in the analyzed reads, choosing one of the following options. First, the number of reads analyzed from each genome will be adjusted to represent the same genome coverage. This option provides the same sensitivity of repeat detection for all analyzed samples and is therefore generally recommended; however, it requires that genome sizes of all analyzed species are known and that they do not substantially differ. In the case of large differences in genome sizes, too few reads may be analyzed from smaller genomes, especially if many species are analyzed simultaneously. A second option is to analyze the same number of reads from all samples, which will provide different depth of analysis in species differing in their genome sizes, and this fact should be considered when interpreting analysis results. Because each of these analysis setups has its advantages and drawbacks, it is a good idea to run both and cross-check their results."

      Therefore, the coverage used for this approach should be stated (the methods specify it only for the individual analysis), along with the reason for choosing it.

      A: In Box 3, Novak et al (2020) explain that the number of analyzed reads (Nmax) is determined automatically by RepeatExplorer2, based on the graph density and available RAM. Therefore, the reported depths of coverage are results, not the input of the analysis. We tried different amounts of reads as input and got consistently similar results, so we kept the analysis using the whole dataset.

      For the comparative analysis, we have added the resulting depth of coverage and explained that we used the same number of reads for both species.

      Added to methods:

      "For the comparative protocol, we used the same amount of reads for both species".

      Added to results:

      "To estimate the proportion of repetitive DNA, the RepeatExplorer2 individual protocol determined a maximum number of analyzed reads (Nmax) corresponding to depths of coverage of 0.06x for E. anisatum and 0.43x for E. marmoratum. "

      "The RepeatExplorer2 comparative protocol determined a maximum number of analyzed reads (Nmax) corresponding to depths of coverage of approximately 0.14x for E. marmoratum and 0.06x for E. anisatum"

      This is consistent with other works which utilize RepeatExplorer2, for example, Chumová et al (2021; https://doi.org/10.1111/tpj.15306), who wrote: "The final repeatome analysis for each species was done using a maximum number of reads representing between 0.049x and 1.389x of genome coverage."
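
      The relationship between a number of analyzed reads and the resulting depth of coverage can be sketched as follows. This is a minimal illustration, not the RepeatExplorer2 procedure: the read count and read length are hypothetical placeholders, and only the 1C genome sizes come from this reply.

```python
def depth_of_coverage(n_reads: int, read_length_bp: int, genome_size_bp: float) -> float:
    """Depth of coverage = total sequenced bases / haploid (1C) genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return n_reads * read_length_bp / genome_size_bp


# Mean 1C genome sizes reported in the reply, in bp.
GENOME_SIZE_1C_BP = {"E. anisatum": 2.59e9, "E. marmoratum": 1.13e9}

# Hypothetical values, chosen only for illustration.
N_READS = 1_000_000
READ_LENGTH_BP = 150

for species, size_bp in GENOME_SIZE_1C_BP.items():
    depth = depth_of_coverage(N_READS, READ_LENGTH_BP, size_bp)
    print(f"{species}: {depth:.3f}x")
```

      Because the same number of reads is divided by different genome sizes, the smaller genome ends up with the higher depth, which is why analyzing equal read numbers from both species yields unequal coverage.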

      Minor comments:

      General comments:

      • The concept of genome endoreplication and the problem it represents for C-value estimations needs to be better contextualised. It would be nice to have some background information in the introduction on how this is an issue (especially in orchid species). Results shown are valuable and interesting but require a little more context on how frequent this is in plants, especially in orchids, and across different tissues.

      A: We have included information about the variation of conventional and partial endoreplication in plants.

      Differences in CE may also occur between individuals or even respond to environmental factors (Barow 2006). In contrast, PE results in cells that replicate only a fraction (P) of the genome (Brown et al. 2017) and it has only been reported in Orchidaceae (Brown et al. 2017). CE and PE can occur in one or several endoreplication rounds, and different plant tissues may have different proportions of 2C, 4C, 8C ... nC or 2C, 4E, 8E, ... nE nuclear populations, respectively. The 2C nuclear population sometimes constitutes only a small fraction in differentiated somatic tissues and can be overlooked by cytometry (Trávníček et al. 2015). Using plant tissues with a high proportion of the 2C population (such as orchid ovaries and pollinaria) can help overcome this difficulty (Trávníček et al. 2015; Brown et al. 2017).

      Comments and suggestions on the figures:

      R2. In Fig. 1, the flow cytometry histograms need to be more self-explanatory. What are the Y axis "counts" of? Also, please either place the label for both rows or for each, but don't make it redundant. The axis fonts need to be made a bit larger too. If possible, explain briefly in the figure legend (and not only in the text) what each peak means.

      A: We have modified the figure by adding legends for the Y and X axes, eliminating redundant labels, and changing the font size.

      R2. Fig. 5. Horizontal axis labels are illegible. Please make these larger (maybe make the plot wider by moving the plot legend to the top/bottom of the figure? - just a suggestion).

      A: We consider the horizontal axis label to be superfluous and we removed it.

      Small text editing suggestions:

      R2. Methods, "Ploidy level estimation and chromosome counts" section. It would be easier for the reader if this paragraph were either divided into two methods sections, or into two paragraphs at least, since these are two very different approaches and provide slightly different data or information.

      A: We slightly modified: "Chromosome number was counted from developing root tips" to

      "Additionally, to confirm ploidy level, chromosome number was counted from developing root tips" and changed the subtitle to only "Ploidy level estimation".

      R2. Methods, "Genome size estimation by k-mer analysis" section. Please specify whether the coverage simulations (of 5x to 40x) were based on the 1C or the 2C genome size. I assumed haploid genome size but it is best to clarify.

      A: We have added it to P7: "To assess the suitability of the whole dataset and estimate the minimum coverage required for genome size estimation, the depth of coverage of both datasets was calculated based on the flow cytometry 1C genome size values."
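
      Subsampling to a target depth relative to the 1C genome size reduces to a simple read-count calculation, sketched below. The read length and the list of target depths are assumptions made for illustration (the reviewer mentions a 5x to 40x range, but the intermediate steps are not given here); the genome sizes are the mean 1C values from this reply.

```python
import math


def reads_needed(target_depth: float, genome_size_bp: float, read_length_bp: int) -> int:
    """Number of reads whose combined length reaches the target depth on a 1C genome."""
    return math.ceil(target_depth * genome_size_bp / read_length_bp)


# Mean 1C genome sizes from flow cytometry, in bp.
GENOME_SIZE_1C_BP = {"E. anisatum": 2.59e9, "E. marmoratum": 1.13e9}

READ_LENGTH_BP = 150             # assumption: not stated in this reply
TARGET_DEPTHS = [5, 10, 20, 40]  # illustrative steps within the 5x to 40x range

for species, size_bp in GENOME_SIZE_1C_BP.items():
    for depth in TARGET_DEPTHS:
        n = reads_needed(depth, size_bp, READ_LENGTH_BP)
        print(f"{species} at {depth}x: {n:,} reads")
```

      Basing the calculation on the 2C value instead would double every read count, which is why the reviewer's request to state the reference value matters.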

      R2. Results, "Genome size estimation by k-mer analysis and ploidy estimation" section. In the first two paragraphs, the results presented appear to conform to anticipated patterns based on known properties of these types of datasets. Although this information confirms expected patterns, it does not provide new or biologically significant insights into the genomes analysed. It may be beneficial to further summarize these paragraphs so that the focus of this section can shift toward the comparison of methods and the biological interpretation of the genome profiles of Epidendrum.

      A: We agree that those paragraphs deviate a little from the focus of our results. However, we believe they provide useful information both for pattern confirmation in a relatively understudied field and for readers who may not be very familiar with the methods utilized.

      R2. Discussion, "Genome size estimation using flow cytometry" section. In the second paragraph, it is discussed how potential endoduplication events can "trick" the flow cytometry measurements. This has probably been discussed previously in other C-value calculation studies and would benefit from context from the literature. How does this endoduplication really affect C-value measurements across plant taxa? I understand it is a well-known issue, so maybe add some references?

      A: We have included in the Introduction information about CE and PE and their associated references. P. 3 and 4.

      R2. Discussion, "Repetitive DNA composition in Epidendrum anisatum and E. marmoratum" section. In the second paragraph, when mentioning the relative abundance of Ty3-gypsy and Ty1-copia elements, it is also worth mentioning their differences in genomic distribution and the potential structural role of Ty3-gypsy elements.

      A: We added this paragraph in P.20:

      "Ty3-gypsy elements are frequently found in centromeric and pericentromeric regions, and may have an important structural role in heterochromatin (Jin et al. 2004; Neumann et al. 2011; Ma et al. 2023), particularly those with chromodomains in their structure (chromovirus, i.e. Tekay, CRM transposons; Neumann et al. 2011). Conversely, Ty1-copia elements tend to be more frequent in gene-rich regions (Wang et al. 2025A). However, Ty3-gypsy chromovirus elements can be found outside the heterochromatin regions (Neumann et al. 2011), and in Pennisetum purpureum (Poaceae) Ty1-copia elements are more common in pericentromeric regions (Yu et al. 2022)."

      R2. Discussion, "Repetitive DNA composition in Epidendrum anisatum and E. marmoratum" section. In the third paragraph, it is mentioned that both species have 2n=40. I believe these are results from this work since there is a methods section for chromosome counting. This data should therefore go into results.

      A: We have added the chromosome count micrographs as Supplementary Data Fig. S7.

      R2. Discussion, "Repetitive DNA composition in Epidendrum anisatum and E. marmoratum" section. I'd recommend expanding a bit more on repetitive DNA differences based on the RepeatExplorer results. Providing references on whether this has been found in other taxa would be helpful too. For example, Ogre bursts have been previously described in other species (e.g. legumes, Wang et al., 2025). Moreover, I consider it worth highlighting and discussing other interesting differences found, such as the differences in unknown repeats (which could be due to one species having "older" elements, too degraded to give any database hits, compared to the other), or Class II TE differences between species (and how these account less for genome size difference because of their size), etc.

      A: We have rearranged and added discussion expanding on the role of repetitive DNA in E. anisatum and E. marmoratum and how it relates to the repetitive DNA in other species. This includes Ogre transposons, an expanded Ty1-copia vs. Ty3-gypsy discussion, and a section on unclassified repeats and can be found on P.19 to P.21.

      Reviewer #2 (Significance (Required)):

      Overall, this study provides a valuable contribution to our understanding of genome size diversity and repetitive DNA dynamics within Epidendrum, particularly through its combined use of low-coverage sequencing, flow cytometry, and comparative genome profiling. Its strongest aspects lie in the clear methodological framework and the integration of multiple complementary approaches, which together highlight substantial genome size divergence driven by repeat proliferation, an insight of clear relevance for orchid genomics and plant genome evolution more broadly.

      While the work would benefit from improved data availability, additional contextualization of the problem of endoreduplication in flow cytometry, and clarification of some figure elements and methodological details, the study nonetheless advances the field by presenting new comparative genomic information for two understudied species and by evaluating different strategies for genome profiling in non-model taxa.

      The primary audience will include researchers in non-model plant genomics, cytogenetics, and evolutionary biology, although the methodological comparisons may also be useful to a broader community working on genome characterization in diverse lineages. My expertise is in plant genomics, genome size evolution, and repetitive DNA biology; I am not a specialist in flow cytometry instrumentation or cytological methods, so my evaluation of those aspects is based on general familiarity rather than technical depth.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      A review on "Nuclear genome profiling of two Mexican orchids of the genus Epidendrum" by Alcalá-Gaxiola et al. submitted to ReviewCommons

      The present manuscript presents genomic data for two endemic Mexican orchids: Epidendrum anisatum and E. marmoratum. The authors aim to determine the genome size and ploidy using traditional (flow cytometry and chromosome counts) and genomic techniques (k-mer analysis, heterozygosity), along with the characterization of the repetitive DNA composition.

      Considering the genomic composition, the main difference observed in repeat composition between the two species was attributed to the presence of a 172 bp satDNA (AniS1) in E. anisatum, which represents about 11% of its genome but is virtually absent in E. marmoratum. The differences in the genomic proportion of AniS1 and Ty3-gypsy/Ogre lineage TEs between E. anisatum and E. marmoratum are suggested as potential drivers of the GS difference identified between the two species.

      Our main concerns are about the GS estimation and chromosome number determination. Along with many issues related to GS estimations by flow cytometry, results related to chromosome number determination are missing from the manuscript. Improvements in both techniques and results are crucial since the authors aim to compare different methods for GS and ploidy determination.

      R3. Genome size: From the abstract, it is not possible to understand that the authors confirm the GS by flow cytometry, as clarified later in the manuscript. Since the approach used to obtain the results is crucial to this manuscript, please make it clear in the abstract.

      A: We have highlighted the congruence of flow cytometry and bioinformatic approaches in the abstract:

      "Multiple depths of coverage, k values, and k-mer-based tools for genome size estimation were explored and contrasted with cytometry genome size estimations. Cytometry and k-mer analyses yielded a consistently higher genome size for E. anisatum (mean 1C genome size = 2.59 Gb) than E. marmoratum (mean 1C genome size = 1.13 Gb), which represents a 2.3-fold genome size difference."

      R3. Flow cytometry methodology: For a standard protocol, it is mandatory to use at least three individuals, each one analyzed in triplicate. It is also important to check the variation among measurements obtained from the same individual and among the values obtained from different individuals. Such variation should be below 3%. The result should be the average C-value followed by the standard deviation, which informs us of the variation among individuals and measurements.

      A: We have done three technical replicates of each tissue of the individuals of E. anisatum and E. marmoratum. To show the variation among different replicates and tissues, we have included the Supplementary Data Table S1. Intraspecific variation in genome size is beyond the scope of this work.
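
      The variation check requested by the reviewer can be expressed as a coefficient of variation across replicate estimates. The sketch below uses hypothetical replicate values chosen for illustration; they are not the measurements in Supplementary Data Table S1, and the 3% threshold is the one proposed by the reviewer.

```python
from statistics import mean, stdev


def coefficient_of_variation(values: list[float]) -> float:
    """Sample coefficient of variation in percent: 100 * standard deviation / mean."""
    return 100 * stdev(values) / mean(values)


# Hypothetical replicate 1C estimates in Gb (illustrative only).
replicates_gb = [2.55, 2.60, 2.62]

THRESHOLD_PERCENT = 3.0  # threshold suggested by the reviewer

cv = coefficient_of_variation(replicates_gb)
print(f"mean = {mean(replicates_gb):.2f} Gb, SD = {stdev(replicates_gb):.3f} Gb")
print(f"CV = {cv:.2f}% (below {THRESHOLD_PERCENT}%: {cv < THRESHOLD_PERCENT})")
```

      Reporting the mean together with the standard deviation, as the reviewer asks, lets readers recompute this value for any tissue or individual.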

      R3. Checking Fig. 1, we could not see the Pisum peak. If the authors performed an analysis with an external standard, it should be clarified in the Methods. I suggest always using an internal standard.

      Besides, comparing Fig. 1 for leaf and pollinium, it seems necessary to adjust the flow cytometer settings. Note that the 2C peak changes its position when comparing different graphs. The data could be placed more centrally on the x-axis by adjusting the flow cytometer settings.

      Action Required: Considering that the authors want to compare indirect genomic approaches to determine the GS, I suggest the authors improve the GS determination by flow cytometry.

      Please, in the Methodology section, keep both techniques focused on GS close to one another. Follow the same order in the Methodology, Results and Discussion sections.

      A: We have made several changes to the estimation and reporting of the flow cytometry genome size estimates. Among these:

      We have clarified the use of the P. sativum internal standard and PBMCs in the methods (P.6). We have added the associated mean coefficient of variation for both the sample and the internal reference in Supplementary Data Table S1, in order to show that the variation is not the result of an instrument error. We have changed the order of the paragraphs in the methods section to follow the order in other sections.

      R3. Chromosome count: In the Introduction section (page 5), the authors explicitly aim to provide "bioinformatics ploidy level estimation and chromosome counting." Furthermore, the Methods section (page 7, subsection "Ploidy level estimation and chromosome counts") details a specific protocol for chromosome counting involving root tip pretreatment, fixation, and staining. However, no results regarding chromosome counting are presented in the manuscript. There are no micrographs of metaphase plates, no tables with counts, and no mention of the actual counts in the Results section or Supplementary Material. Despite this absence of evidence, the Discussion (Page 18) states: "ploidy and chromosome counts of both E. anisatum and E. marmoratum are the same (2n=40)." The value of 2n=40 is presented as a finding of this study; however, there is no reference to these results.

      Action Required: The authors must resolve this discrepancy by providing the missing empirical data (micrographs and counts). This detail needs to be reviewed with greater care and scientific integrity.

      A: We have added the chromosome count micrographs as Supplementary Data Fig. S7.

      Minor reviews (Suggestions):

      R3. Refining the Title (Optional): Although the current title is descriptive, we believe it undersells the value of the manuscript. Since this study provides the first genome profiling and repeatome characterization for the genus Epidendrum and offers important insights into the calibration of bioinformatics tools and flow cytometry for repetitive genomes, I suggest modifying the title to reflect these aspects. The comparative assessment of GS is also an important feature. This would make the article more attractive to a broader audience interested in genomics of non-model organisms.

      A: We have changed the title to "Nuclear genome profiling of two species of Epidendrum (Orchidaceae): genome size, repeatome and ploidy".

      R3. Botanical Nomenclature (Optional): Although citing taxonomic authorities is not strictly required in all fields of plant sciences, most botanical journals expect the full author citation at the first mention of each species. Including this information would improve the nomenclatural rigor of the manuscript and align it with common practices in botanical publishing.

      A: We have added the citation of the taxonomic authorities:

      "This study aims to use two closely related endemic Mexican species, Epidendrum anisatum Lex and Epidendrum marmoratum A. Rich. & Galeotti, to provide the first genomic profiling for this genus..."

      R3. Abbreviation of Genus Names: I noticed inconsistencies in the abbreviation of scientific names throughout the manuscript. Standard scientific style dictates that the full genus name (Epidendrum) should be written out only at its first mention in the Abstract and again at the first mention in the main text. Thereafter, it should be abbreviated (e.g., E. anisatum, E. marmoratum), unless the name appears at the beginning of a sentence or if abbreviation would cause ambiguity with another genus. Please revise the text to apply this abbreviation consistently.

      A: We have made the changes requested as necessary.

      R3. Genome Size Notation: In the Abstract and throughout the text, genome size estimates are presented using the statistical symbol for the mean (x). While mathematically accurate, this notation is generic and does not immediately inform the reader about the biological nature of the DNA content (i.e., whether it refers to the gametic 1C or somatic 2C value). In plant cytometry literature, it is standard practice to explicitly label these values using C-value terminology to prevent ambiguity and eliminate the effect of the number of chromosome sets (Bennett & Leitch 2005; Greilhuber et al. 2005; Doležel et al. 2018). I strongly suggest replacing references to "x" with "1C" (e.g., changing "x = 2.58 Gb" to "mean 1C value = 2.58 Gb") to ensure immediate clarity and alignment with established conventions in the field.

      A: We have revised the text in every instance, for example, in the results section:

      "The 1C value in gigabases (Gb; calculated from mass in pg) of E. anisatum ranged from 2.55 to 2.62 Gb (mean 1C value = 2.59 Gb) and that of E. marmoratum from 1.11 to 1.18 Gb (mean 1C value = 1.13 Gb; Supplementary Data Table S1)."

      R3. Justification of the Sequencing Method: Although the sequencing strategy is clearly described, the manuscript would benefit from a bit more contextualization regarding the choice of low-pass genome skimming. In the Introduction, a short justification of why this approach is suitable for estimating genome size, heterozygosity, and repeat composition, particularly in plants with large, repeat-rich genomes, would help readers better understand the methodological rationale. Likewise, in the Methods section, briefly outlining why the selected sequencing depth is appropriate, and how it aligns with previous studies using similar coverage levels, would strengthen the clarity of the methodological framework. These additions would make the rationale behind the sequencing approach more transparent and accessible to readers who may be less familiar with low-coverage genomic strategies.

      A: We have added the following short sentence in P.7:

      "This sequencing method produces suitable data sets without systematic biases, allowing the estimation of genome size and the proportion of repetitive DNA. "

      R3. Wording Improvement Regarding RepeatExplorer2 Results: In the Results section, several sentences attribute biological outcomes to the RepeatExplorer2 "protocols" (e.g., "According to this protocol, both species have highly repetitive genomes..."; "The comparative protocol showed a 67% total repeat proportion, which falls between the estimated repeat proportions of the two species according to the results of the individual protocol"). Since the RepeatExplorer2 protocol itself only provides the analytical workflow and not species-specific results, this phrasing may be misleading.

      A: We have rephrased these sections to emphasize that these are "the results of" the protocols and not the protocols themselves.

      Reviewer #3 (Significance (Required)):

      Significance

      General assessment

      Strengths

      1. First Detailed Genomic Profile for the Genus Epidendrum: The study provides the first integrated dataset on genome size, ploidy, heterozygosity, and repeatome for species of the genus Epidendrum, a novel contribution for an extremely diverse and under-explored group in terms of cytogenomics.

      Cross-validation of in vitro and in silico analyses: Flow cytometry is considered the gold standard for genome size (GS) estimation because it physically measures DNA quantity (Doležel et al. 2007; Śliwińska 2018). However, it typically requires fresh tissue, which is not always available. Conversely, k-mer analysis is a rapid bioinformatics technique utilizing sequencing data that does not rely on a reference genome. Nevertheless, it is frequently viewed with skepticism or distrust due to discrepancies with laboratory GS estimates (Pflug et al. 2020; Hesse 2023). In this study, by comparing computational results with flow cytometry data, the authors were able to validate the reliability of computational estimates for the investigated species. Since the 'true' GS was already established via flow cytometry, the authors used this value as a benchmark to test various software tools (GenomeScope, findGSE, CovEst) and parameters. This approach allowed for the identification of which tools perform best for complex genomes. For instance, they found that tools failing to account for heterozygosity (such as findGSE-hom) drastically overestimated the genome size of E. anisatum, whereas GenomeScope and findGSE-het (which account for heterozygosity) yielded results closer to the flow cytometry values. Thus, they demonstrated that this cross-validation is an effective method for estimating plant genome sizes with greater precision. This integrative approach is essential not only for defining GS but also for demonstrating how bioinformatics methods must be calibrated (particularly regarding depth of coverage and maximum k-mer coverage) to provide accurate data for non-model organisms when flow cytometry is not feasible.

      Limitations

      1. Limited Taxonomic Sampling: The study analyzes only two species of Epidendrum, which restricts the ability to make broad inferences regarding genome evolution across the genus. Given the outstanding diversity of Epidendrum (>1,800 species), the current sampling is insufficient to propose generalized evolutionary patterns. As the authors state at the end of the Discussion (page 18), "Future work should investigate to what extent LTR transposons and satellite DNA have been responsible for shaping genome size variation in different lineages of Epidendrum, analyzing a greater portion of its taxic diversity in an evolutionary context."

      2. Lack of Cytogenetic Results and Mapping: One of the major findings of this study is the identification of the AniS1 satellite as a potential key driver of the genome size difference between the species, occupying ~11% of the E. anisatum genome and virtually absent in E. marmoratum. While the authors use bioinformatic metrics (C and P indices) to infer a dispersed organization in the Discussion (Page 18), the study lacks physical validation via Fluorescence in situ Hybridization (FISH) and a basic validation of the chromosome number. Without cytogenetic mapping, it is impossible to confirm the actual chromosomal distribution of this massive repetitive array, for instance, whether it has accumulated in specific heterochromatic blocks (e.g., centromeric or subtelomeric regions) or if it is genuinely interspersed along the chromosome arms. I suggest acknowledging this as a limitation in the Discussion, as the physical organization of such abundant repeats has significant implications for understanding the structural evolution of the species' chromosomes.

      Advance

      To the best of our knowledge, this study represents the first comprehensive genome profiling and repeatome characterization for any species of the genus Epidendrum. By integrating flow cytometry, k-mer-based approaches, and low-pass sequencing, the authors provide the first insights into the genomic architecture of Epidendrum, including quantitative assessments of transposable elements, lineage-specific satellite DNA, and repeat-driven genome expansion. This constitutes both a technical and a conceptual advance: technically, the study demonstrates the feasibility and limitations of combining in vitro and in silico methods for genome characterization in large, repeat-rich plant genomes; conceptually, it offers new evolutionary perspectives on how repetitive elements shape genome size divergence within a highly diverse orchid lineage. These results broaden the genomic knowledge base for Neotropical orchids and establish a foundational reference for future comparative, cytogenomic, and phylogenomic studies within Epidendrum and related groups.

      Audience

      This study will primarily interest a broad audience, including researchers in plant genomics, evolutionary biology, cytogenomics, and bioinformatics, especially those working with non-model plants or groups with large, repetitive genomes. It also holds relevance for scientists engaged in genome size evolution, repetitive DNA biology, and comparative genomics. Other researchers are likely to use this work as a methodological reference for genome profiling in non-model taxa, especially regarding the integration of flow cytometry and k-mer-based estimations and the challenges posed by highly repetitive genomes. The detailed repeatome characterization, including identification of lineage-specific satellites and retrotransposon dynamics, will support comparative genomic analyses, repeat evolution studies, and future cytogenetic validation (e.g., FISH experiments). Additionally, this dataset establishes a genomic baseline that can inform phylogenomic studies, species delimitation, and evolutionary inference within Epidendrum and related orchid groups.

      Reviewer's Backgrounds

      The review was prepared by two reviewers. Our expertise lies in evolution and biological diversity, with a focus on cytogenomic and genome size evolution. Among the projects in development, the cytogenomic evolution of Neotropical orchids is one of the main studies (also focused on Epidendrum). These areas shape our perspective in evaluating the evolutionary, cytogenomic, and biological implications of the study. However, we have limited expertise in methodologies related to k-mer-based genome profiling and heterozygosity modeling. Therefore, our evaluation does not deeply assess the technical validity of these analytical pipelines.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      A review of "Nuclear genome profiling of two Mexican orchids of the genus Epidendrum" by Alcalá-Gaxiola et al., submitted to Review Commons

      The manuscript presents genomic data for two endemic Mexican orchids: Epidendrum anisatum and E. marmoratum. The authors aim to determine genome size and ploidy using traditional techniques (flow cytometry and chromosome counts) and genomic techniques (k-mer analysis, heterozygosity), along with a characterization of the repetitive DNA composition.

      Considering the genomic composition, the main difference observed in repeat composition between the two species was attributed to the presence of a 172 bp satDNA (AniS1) in E. anisatum, which represents about 11% of its genome but is virtually absent in E. marmoratum. The differences in the genomic proportion of AniS1 and Ty3-gypsy/Ogre lineage TEs between E. anisatum and E. marmoratum are suggested as potential drivers of the GS difference identified between the two species.

      Our main concerns are about the GS estimation and the chromosome number determination. Along with many issues related to the GS estimates by flow cytometry, the results of the chromosome number determination are missing from the manuscript. Improvements in both techniques and results are crucial, since the authors aim to compare different methods for GS and ploidy determination.

      Genome size: From the abstract, it is not possible to tell that the authors confirmed the GS by flow cytometry, as is clarified later in the manuscript. Since the approach used to obtain the results is crucial to this manuscript, please make it clear in the abstract.

      Flow cytometry methodology: A standard protocol requires at least three individuals, each analyzed in triplicate. It is also important to check the variation among measurements obtained from the same individual and among the values obtained from different individuals. Such variation should be below 3%. The result should be reported as the average C-value with its standard deviation, which informs us of the variation among individuals and measurements. In Fig. 1, we could not see the Pisum peak. If the authors performed the analysis with an external standard, this should be clarified in the Methods. I suggest always using an internal standard. Besides, comparing the leaf and pollinium panels of Fig. 1, the flow cytometer settings appear to need adjustment: note that the 2C peak changes its position between graphs. The data could be placed more centrally on the x-axis by adjusting the instrument settings.

      Action Required: Considering that the authors want to compare indirect genomic approaches for determining the GS, I suggest that they improve the GS determination by flow cytometry. In the Methodology section, please keep the two GS techniques close to one another, and follow the same order in the Methodology, Results and Discussion sections.
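      The replicate scheme requested above can be made concrete with a short sketch. It assumes an internal standard, three individuals measured in triplicate, and entirely hypothetical peak positions. The 2C value of the standard (`STANDARD_2C_PG`) is a placeholder rather than the value of any particular cultivar, the function names are ours, and expressing variation as the range relative to the mean is only one of several possible conventions.

```python
from statistics import mean, stdev

# Placeholder 2C value (pg) for the internal standard; an assumption for
# illustration only. Substitute the published value of the standard used.
STANDARD_2C_PG = 9.0


def sample_2c(sample_peak, standard_peak, standard_2c=STANDARD_2C_PG):
    """Sample 2C = (sample G1 peak mean / standard G1 peak mean) * standard 2C."""
    return sample_peak / standard_peak * standard_2c


def percent_range(values):
    """Difference between the largest and smallest value, as % of the mean."""
    return (max(values) - min(values)) / mean(values) * 100


def summarize(runs, max_variation=3.0):
    """runs maps individual -> list of (sample_peak, standard_peak) replicates."""
    per_individual = {
        name: [sample_2c(s, p) for s, p in reps] for name, reps in runs.items()
    }
    individual_means = [mean(v) for v in per_individual.values()]
    return {
        "mean_2c": mean(individual_means),
        "sd_2c": stdev(individual_means),
        # Individuals whose replicates differ by more than the threshold.
        "flagged": [
            name for name, v in per_individual.items()
            if percent_range(v) > max_variation
        ],
        "between_individuals_pct": percent_range(individual_means),
    }


# Hypothetical G1 peak means (sample, standard) on a linear fluorescence scale.
runs = {
    "ind1": [(200.0, 400.0), (202.0, 400.0), (201.0, 400.0)],
    "ind2": [(203.0, 400.0), (204.0, 400.0), (202.0, 400.0)],
    "ind3": [(199.0, 400.0), (230.0, 400.0), (200.0, 400.0)],  # one outlier run
}
report = summarize(runs)
print(report)
```

      In this made-up data set, the third individual is flagged because one of its replicates deviates strongly from the other two; such a run would need to be repeated before reporting a mean C-value.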

      Chromosome count: In the Introduction (page 5), the authors explicitly aim to provide "bioinformatics ploidy level estimation and chromosome counting." Furthermore, the Methods section (page 7, subsection "Ploidy level estimation and chromosome counts") details a specific protocol for chromosome counting involving root tip pretreatment, fixation, and staining. However, no results regarding chromosome counting are presented in the manuscript. There are no micrographs of metaphase plates, no tables with counts, and no mention of the actual counts in the Results section or Supplementary Material. Despite this absence of evidence, the Discussion (page 18) states: "ploidy and chromosome counts of both E. anisatum and E. marmoratum are the same (2n=40)." The value of 2n=40 is presented as a finding of this study; however, no such results are reported.

      Action Required: The authors must resolve this discrepancy by providing the missing empirical data (micrographs and counts). This detail needs to be reviewed with greater care and scientific integrity.

      Minor reviews (Suggestions):

      Refining the Title (Optional): Although the current title is descriptive, we believe it undersells the value of the manuscript. Since this study provides the first genome profiling and repeatome characterization for the genus Epidendrum and offers important insights into the calibration of bioinformatics tools and flow cytometry for repetitive genomes, I suggest modifying the title to reflect these aspects. The comparative assessment of GS is also an important feature. This would make the article more attractive to a broader audience interested in the genomics of non-model organisms.

      Botanical Nomenclature (Optional): Although citing taxonomic authorities is not strictly required in all fields of plant sciences, most botanical journals expect the full author citation at the first mention of each species. Including this information would improve the nomenclatural rigor of the manuscript and align it with common practices in botanical publishing.

      Abbreviation of Genus Names: I noticed inconsistencies in the abbreviation of scientific names throughout the manuscript. Standard scientific style dictates that the full genus name (Epidendrum) should be written out only at its first mention in the Abstract and again at the first mention in the main text. Thereafter, it should be abbreviated (e.g., E. anisatum, E. marmoratum), unless the name appears at the beginning of a sentence or if abbreviation would cause ambiguity with another genus. Please revise the text to apply this abbreviation consistently.

      Genome Size Notation: In the Abstract and throughout the text, genome size estimates are presented using the statistical symbol for the mean (x). While mathematically accurate, this notation is generic and does not immediately inform the reader about the biological nature of the DNA content (i.e., whether it refers to the gametic 1C or somatic 2C value). In plant cytometry literature, it is standard practice to explicitly label these values using C-value terminology to prevent ambiguity and eliminate the effect of the number of chromosome sets (Bennett & Leitch 2005; Greilhuber et al. 2005; Doležel et al. 2018). I strongly suggest replacing references to "x" with "1C" (e.g., changing "x = 2.58 Gb" to "mean 1C value = 2.58 Gb") to ensure immediate clarity and alignment with established conventions in the field.

      Justification of the Sequencing Method: Although the sequencing strategy is clearly described, the manuscript would benefit from a bit more contextualization regarding the choice of low-pass genome skimming. In the Introduction, a short justification of why this approach is suitable for estimating genome size, heterozygosity, and repeat composition, particularly in plants with large, repeat-rich genomes, would help readers better understand the methodological rationale. Likewise, in the Methods section, briefly outlining why the selected sequencing depth is appropriate, and how it aligns with previous studies using similar coverage levels, would strengthen the clarity of the methodological framework. These additions would make the rationale behind the sequencing approach more transparent and accessible to readers who may be less familiar with low-coverage genomic strategies.

      Wording Improvement Regarding RepeatExplorer2 Results: In the Results section, several sentences attribute biological outcomes to the RepeatExplorer2 "protocols" (e.g., "According to this protocol, both species have highly repetitive genomes..."; "The comparative protocol showed a 67% total repeat proportion, which falls between the estimated repeat proportions of the two species according to the results of the individual protocol"). Since the RepeatExplorer2 protocol itself only provides the analytical workflow and not species-specific results, this phrasing may be misleading.

      Significance

      General assessment

      Strengths

      1. First Detailed Genomic Profile for the Genus Epidendrum: The study provides the first integrated dataset on genome size, ploidy, heterozygosity, and repeatome for species of the genus Epidendrum, a novel contribution for an extremely diverse and under-explored group in terms of cytogenomics.
      2. Cross-validation of in vitro and in silico analyses: Flow cytometry is considered the gold standard for genome size (GS) estimation because it physically measures DNA quantity (Doležel et al. 2007; Śliwińska 2018). However, it typically requires fresh tissue, which is not always available. Conversely, k-mer analysis is a rapid bioinformatics technique utilizing sequencing data that does not rely on a reference genome. Nevertheless, it is frequently viewed with skepticism or distrust due to discrepancies with laboratory GS estimates (Pflug et al. 2020; Hesse 2023). In this study, by comparing computational results with flow cytometry data, the authors were able to validate the reliability of computational estimates for the investigated species. Since the 'true' GS was already established via flow cytometry, the authors used this value as a benchmark to test various software tools (GenomeScope, findGSE, CovEst) and parameters. This approach allowed for the identification of which tools perform best for complex genomes. For instance, they found that tools failing to account for heterozygosity (such as findGSE-hom) drastically overestimated the genome size of E. anisatum, whereas GenomeScope and findGSE-het (which account for heterozygosity) yielded results closer to the flow cytometry values. Thus, they demonstrated that this cross-validation is an effective method for estimating plant genome sizes with greater precision. This integrative approach is essential not only for defining GS but also for demonstrating how bioinformatics methods must be calibrated (particularly regarding depth of coverage and maximum k-mer coverage) to provide accurate data for non-model organisms when flow cytometry is not feasible.
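      To illustrate why the maximum k-mer coverage setting matters for the cross-validation described above, the basic arithmetic behind k-mer-based estimates can be sketched as follows. This is a deliberately naive calculation (total k-mers divided by the depth of the main peak) and not the model fitted by GenomeScope, findGSE, or CovEst; the histogram, the thresholds, and the function name are hypothetical.

```python
def estimate_genome_size(histogram, min_depth=5, max_depth=None):
    """Naive genome size estimate from a k-mer depth histogram.

    histogram maps depth -> number of distinct k-mers seen at that depth.
    k-mers below min_depth are treated as sequencing errors and dropped;
    k-mers above max_depth (if given) are dropped as well, which mimics a
    maximum k-mer coverage cutoff.
    """
    kept = {
        depth: count
        for depth, count in histogram.items()
        if depth >= min_depth and (max_depth is None or depth <= max_depth)
    }
    # Assumption: the tallest remaining peak is the homozygous peak.
    peak_depth = max(kept, key=kept.get)
    total_kmers = sum(depth * count for depth, count in kept.items())
    return total_kmers / peak_depth


# Hypothetical toy histogram: error k-mers at depth 1-2, a main peak near
# depth 20, some two-copy sequence at depth 40, and one high-copy repeat.
toy = {1: 1000, 2: 200, 19: 50, 20: 100, 21: 50, 40: 10, 1000: 1}

full = estimate_genome_size(toy)                      # repeat included
truncated = estimate_genome_size(toy, max_depth=100)  # repeat cut off
print(full, truncated)
```

      In the toy example the single high-copy k-mer accounts for a noticeable share of the estimate, so a low cutoff shrinks the result; in repeat-rich genomes the same effect would be expected to be much larger. The sketch also shows the weakness of the homozygous-peak assumption: if the tallest peak were the heterozygous one, the estimate would be inflated.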

      Limitations

      1. Limited Taxonomic Sampling: The study analyzes only two species of Epidendrum, which restricts the ability to make broad inferences regarding genome evolution across the genus. Given the outstanding diversity of Epidendrum (>1,800 species), the current sampling is insufficient to propose generalized evolutionary patterns. As the authors state at the end of the Discussion (page 18): "Future work should investigate to what extent LTR transposons and satellite DNA have been responsible for shaping genome size variation in different lineages of Epidendrum, analyzing a greater portion of its taxic diversity in an evolutionary context."
      2. Lack of Cytogenetic Results and Mapping: One of the major findings of this study is the identification of the AniS1 satellite as a potential key driver of the genome size difference between the species, occupying ~11% of the E. anisatum genome and virtually absent in E. marmoratum. While the authors use bioinformatic metrics (C and P indices) to infer a dispersed organization in the Discussion (page 18), the study lacks physical validation via fluorescence in situ hybridization (FISH), as well as a basic validation of the chromosome number. Without cytogenetic mapping, it is impossible to confirm the actual chromosomal distribution of this massive repetitive array, for instance, whether it has accumulated in specific heterochromatic blocks (e.g., centromeric or subtelomeric regions) or is genuinely interspersed along the chromosome arms. I suggest acknowledging this as a limitation in the Discussion, as the physical organization of such abundant repeats has significant implications for understanding the structural evolution of the species' chromosomes.

      Advance

      To the best of our knowledge, this study represents the first comprehensive genome profiling and repeatome characterization for any species of the genus Epidendrum. By integrating flow cytometry, k-mer-based approaches, and low-pass sequencing, the authors provide the first insights into the genomic architecture of Epidendrum, including quantitative assessments of transposable elements, lineage-specific satellite DNA, and repeat-driven genome expansion. This constitutes both a technical and a conceptual advance: technically, the study demonstrates the feasibility and limitations of combining in vitro and in silico methods for genome characterization in large, repeat-rich plant genomes; conceptually, it offers new evolutionary perspectives on how repetitive elements shape genome size divergence within a highly diverse orchid lineage. These results broaden the genomic knowledge base for Neotropical orchids and establish a foundational reference for future comparative, cytogenomic, and phylogenomic studies within Epidendrum and related groups.

      Audience

      This study will primarily interest a broad audience, including researchers in plant genomics, evolutionary biology, cytogenomics, and bioinformatics, especially those working with non-model plants or groups with large, repetitive genomes. It also holds relevance for scientists engaged in genome size evolution, repetitive DNA biology, and comparative genomics. Other researchers are likely to use this work as a methodological reference for genome profiling in non-model taxa, especially regarding the integration of flow cytometry and k-mer-based estimations and the challenges posed by highly repetitive genomes. The detailed repeatome characterization, including identification of lineage-specific satellites and retrotransposon dynamics, will support comparative genomic analyses, repeat evolution studies, and future cytogenetic validation (e.g., FISH experiments). Additionally, this dataset establishes a genomic baseline that can inform phylogenomic studies, species delimitation, and evolutionary inference within Epidendrum and related orchid groups.

      Reviewer's Backgrounds

      The review was prepared by two reviewers. Our expertise lies in evolution and biological diversity, with a focus on cytogenomic and genome size evolution. Among our projects in development, the cytogenomic evolution of Neotropical orchids is one of the main studies (also focused on Epidendrum). These areas shape our perspective in evaluating the evolutionary, cytogenomic, and biological implications of the study. However, we have limited expertise in methodologies related to k-mer-based genome profiling and heterozygosity modeling. Therefore, our evaluation does not deeply assess the technical validity of these analytical pipelines.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      With this work, the authors provide genome profiling information on the genus Epidendrum. They performed low-coverage short-read sequencing and analysis, as well as flow cytometry, to estimate genome size, and compared these methods. They also used the WGS dataset to test different approaches and models for genome profiling and repeat abundance estimation, emphasising the importance of genome profiling for providing basic and comparative genomic information on non-model study species. Results show that the two "closely-related" Epidendrum species analysed (E. marmoratum and E. anisatum) have different genome profiles, exhibiting a 2.3-fold genome size difference, mostly driven by the expansion of repetitive elements in E. marmoratum, especially Ty3-gypsy LTR retrotransposons and a 172 bp tandem repeat (satellite DNA).

      Major comments:

      Overall, the manuscript is well-written, the aim, results and methods are explained properly, and although I missed some information in the introduction, the paper structure is overall good, and it doesn't lack any important information. The quality of the analysis is also adequate and no further big experiments or analysis would be needed. However, from my point of view, two main issues would need to be addressed:

      • The methods section is properly detailed and well explained. However, the project data and scripts are not available at the figshare link provided, and the BioProject code provided is not found at SRA. This needs to be solved as soon as possible, because if they are not available for review, the reproducibility of the manuscript cannot be fully assessed.
      • The authors specify in the methods that 0.06x and 0.43x sequencing depths were used as inputs for the RE analysis of E. anisatum and E. marmoratum. I understand these are differences based on the data availability and genome size differences. However, they don't correspond to either of the recommendations from Novak et al (2020):

      In the context of individual analysis: "The number of analyzed reads should correspond to 0.1-0.5× genome coverage. In the case of repeat-poor species, coverage can be increased up to 1.0-1.5×." Therefore, using 0.06x for E. anisatum should be justified, or at least addressed in the discussion. Moreover, such a difference in coverage might affect any comparisons made using these results. Given that the number of reads is not limiting in this case, the reasons for using these specific coverages should be discussed in detail.

      In the context of comparative analysis: "Because different genomes are being analyzed simultaneously, the user must decide how they will be represented in the analyzed reads, choosing one of the following options. First, the number of reads analyzed from each genome will be adjusted to represent the same genome coverage. This option provides the same sensitivity of repeat detection for all analyzed samples and is therefore generally recommended; however, it requires that genome sizes of all analyzed species are known and that they do not substantially differ. In the case of large differences in genome sizes, too few reads may be analyzed from smaller genomes, especially if many species are analyzed simultaneously. A second option is to analyze the same number of reads from all samples, which will provide different depth of analysis in species differing in their genome sizes, and this fact should be considered when interpreting analysis results. Because each of these analysis setups has its advantages and drawbacks, it is a good idea to run both and cross-check their results." Therefore, the coverage used for the comparative analysis should be stated (the methods only specify the coverage used for the individual analysis), together with the reasoning behind it.
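      The two sampling options quoted above differ only in which quantity is held constant, which a small calculation makes explicit. The sketch below uses hypothetical 1C genome sizes differing 2.3-fold and a hypothetical read length; none of these numbers are taken from the manuscript, and the function names are ours.

```python
def reads_for_coverage(genome_size_bp, coverage, read_length_bp):
    """Number of reads needed to reach a given coverage of a 1C genome."""
    return round(genome_size_bp * coverage / read_length_bp)


def coverage_from_reads(n_reads, read_length_bp, genome_size_bp):
    """Coverage of a 1C genome obtained from a fixed number of reads."""
    return n_reads * read_length_bp / genome_size_bp


# Hypothetical 1C genome sizes (bp) and read length (bp); assumptions only.
genomes = {"smaller genome": 1_000_000_000, "larger genome": 2_300_000_000}
READ_LENGTH = 150

# Option 1: equal coverage (here 0.1x), so read numbers differ per species.
equal_coverage = {
    name: reads_for_coverage(size, 0.1, READ_LENGTH)
    for name, size in genomes.items()
}

# Option 2: equal read numbers, so coverage differs per species.
N_READS = 1_000_000
equal_reads = {
    name: coverage_from_reads(N_READS, READ_LENGTH, size)
    for name, size in genomes.items()
}

print(equal_coverage)
print(equal_reads)
```

      Reporting both the read numbers and the resulting coverage per species, as in the two dictionaries here, would let readers judge how comparable the individual and comparative analyses are.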

      Minor comments:

      General comments:

      • The concept of genome endoreplication and the problem it represents for C-value estimations needs to be better contextualised. It would be nice to have some background information in the introduction on how this is an issue (especially in orchid species). The results shown are valuable and interesting but require a little more context on how frequent this is in plants, especially in orchids, and across different tissues.

      Comments and suggestions on the figures:

      • In fig 1, the flow cytometry histograms need to be more self-explanatory. What are the Y axis "counts" of? Also, please either place the label for both rows or for each, but don't make it redundant. The axis fonts need to be made a bit larger too. If possible, explain briefly in the figure legend (and not only in the text) what each peak means.
      • Fig 5. Horizontal axis labels are illegible. Please make these larger (maybe make the plot wider by moving the plot legend to the top/bottom of the figure? - just a suggestion).

      Small text editing suggestions:

      • Methods, "Ploidy level estimation and chromosome counts" section. It would be easier for the reader if this paragraph was either divided into two methods sections, or into two paragraphs at least, since these are two very different approaches and provide slightly different data or information.
      • Methods, "Genome size estimation by k-mer analysis" section. Please specify whether the coverage simulations (of 5x to 40x) were based on the 1C or the 2C genome size. I assumed the haploid genome size, but it is best to clarify.
      • Results, "Genome size estimation by k-mer analysis and ploidy estimation" section. In the first two paragraphs, the results presented appear to conform to anticipated patterns based on known properties of these types of datasets. Although this information confirms expected patterns, it does not provide new or biologically significant insights into the genomes analysed. It may be beneficial to further summarize these paragraphs so that the focus of this section can shift toward the comparison of methods and the biological interpretation of the genome profiles of Epidendrum.
      • Discussion, "Genome size estimation using flow cytometry" section. In the second paragraph, it is discussed how potential endoreduplication events can "trick" the flow cytometry measurements. This has probably been discussed previously in other C-value calculation studies and would benefit from context from the literature. How does endoreduplication really affect C-value measurements across plant taxa? I understand it is a well-known issue, so maybe add some references?
      • Discussion, "Repetitive DNA composition in Epidendrum anisatum and E. marmoratum" section. In the second paragraph, when mentioning the relative abundance of Ty3-gypsy and Ty1-copia elements, it is also worth mentioning their differences in genomic distribution and the potential structural role of Ty3-gypsy elements.
      • Discussion, "Repetitive DNA composition in Epidendrum anisatum and E. marmoratum" section. In the third paragraph, it is mentioned that both species have 2n=40. I believe these are results from this work since there is a methods section for chromosome counting. This data should therefore go into results.
      • Discussion, "Repetitive DNA composition in Epidendrum anisatum and E. marmoratum" section. I'd recommend expanding a bit more on repetitive DNA differences based on the RepeatExplorer results. Providing references on whether this has been found in other taxa would be helpful too. For example, Ogre bursts have been previously described in other species (e.g. legumes, Wang et al., 2025). Moreover, I consider it worth highlighting and discussing other interesting differences found, such as the differences in unknown repeats (which could be due to one species having "older" elements, too degraded to give any database hits, compared to the other), or Class II TE differences between species (and how these account less for the genome size difference because of their size), etc.

      Significance

      Overall, this study provides a valuable contribution to our understanding of genome size diversity and repetitive DNA dynamics within Epidendrum, particularly through its combined use of low-coverage sequencing, flow cytometry, and comparative genome profiling. Its strongest aspects lie in the clear methodological framework and the integration of multiple complementary approaches, which together highlight substantial genome size divergence driven by repeat proliferation, an insight of clear relevance for orchid genomics and plant genome evolution more broadly.

      While the work would benefit from improved data availability, additional contextualization of the problem of endoreduplication in flow cytometry, and clarification of some figure elements and methodological details, the study nonetheless advances the field by presenting new comparative genomic information for two understudied species and by evaluating different strategies for genome profiling in non-model taxa.

      The primary audience will include researchers in non-model plant genomics, cytogenetics, and evolutionary biology, although the methodological comparisons may also be useful to a broader community working on genome characterization in diverse lineages. My expertise is in plant genomics, genome size evolution, and repetitive DNA biology; I am not a specialist in flow cytometry instrumentation or cytological methods, so my evaluation of those aspects is based on general familiarity rather than technical depth.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The study provides a comprehensive overview of genome size variation in two related species of the genus Epidendrum, which appear to be homoploid, although their DNA content more closely corresponds to that of heteroploid species. While I have a few serious concerns regarding the data analysis, the study itself demonstrates a well-designed approach and offers a valuable comparison of different methods for genome size estimation. In particular, I would highlight the analysis of repetitive elements, which effectively explains the observed differences between the species. However, I encourage the authors to adopt a more critical perspective on the k-mer analysis and the potential pitfalls in data interpretation.

      Major comments:

      p. 9: Genome size estimation via flow cytometry is an incorrect approach. The deviation is approximately 19% for E. anisatum and about 25% for E. marmoratum across three repeated measurements of the same tissue over three days? These values are far beyond the accepted standards of best practice for flow cytometry, which recommend a maximum deviation of 2-5% between repeated measurements of the same individual. Such variability indicates a systemic methodological issue or improper instrument calibration. Results with this level of inconsistency cannot be considered reliable estimates of genome size obtained by flow cytometry. If you provide the raw data, I can help identify the likely source of error, but as it stands, these results are not acceptable.

      p. 14 and some parts of Introduction: It may seem unusual, to say the least, to question genome size estimation in orchids using flow cytometry, given that this group is well known for extensive endoreplication. However, what effect does this phenomenon have on genome size analyses based on k-mers, or on the correct interpretation of peaks in k-mer histograms? How can such analyses be reliably interpreted when most nuclei used for DNA extraction and sequencing likely originate from endoreplicated cells? I would have expected a more detailed discussion of this issue in light of your results, particularly regarding the substantial variation in genome size estimates across different k-mer analysis settings. Could endoreplication be a contributing factor?

      You repeatedly refer to the experiment on genome size estimation using analyses with maximum k-mer coverage of 10,000 and 2 million, under different k values. However, I would like to see a comparison - such as a correlation analysis - that supports this experiment. The results and discussion sections refer to it extensively, yet no corresponding figure or analysis is presented.

      Minor comments:

      p. 3: You stated: "Flow cytometry is the gold standard for genome size estimation, but whole-genome endoreplication (also known as conventional endoreplication; CE) and strict partial endoreplication (SPE) can confound this method." What do you mean by this? Endopolyploidy is quite common in plants, and flow cytometry is an excellent tool for detecting it and for selecting the proper nuclei fraction for genome size estimation (provided you are aware of the possible misinterpretation caused by using inappropriate tissue for analysis). The same applies to partial endoreplication in orchids (see e.g. Travnicek et al 2015). Moreover, the term "strict partial endoreplication" is outdated and is only used by Brown et al. In more recent studies, the term partial endoreplication is used (e.g. Chumova et al. 2021 - 10.1111/tpj.15306 or Piet et al. 2022 - 10.1016/j.xplc.2022.100330).

      p. 5: "...both because of its outstanding taxic diversity..." There is no such thing as "taxic" diversity - perhaps you mean taxonomic diversity or species richness.

      p. 6: In description of flow cytometry you stated: "Young leaves of Pisum sativum (4.45 pg/1C; Doležel et al. 1998) and peripheral blood mononuclear cells (PBMCs) from healthy individuals...". What does that mean? Did you really use blood cells? For what purpose?

      p. 7: What do you mean by this statement "...reference of low-copy nuclear genes for each species..."? As far as I know, the Granados-Mendoza study used the Angiosperm v.1 probe set, so did you use that set of probes as reference?

      p. 7: Chromosome counts - there is a paragraph of methodology used for chromosome counting, but no results of this important part of the study.

      p. 12: Depth of coverage used in repeatome analysis - why did you use a different coverage for each species? An explanation is needed.

      p. 16: The variation in genome size of orchids is even higher, as the highest known DNA amount has been estimated in Liparis purpureoviridis - 56.11 pg (Travnicek et al 2019 - doi: 10.1111/nph.15996)

      Fig. 1 - Where is the standard peak on Fig. 1? You mention it explicitly on page 9 where you are talking about FCM histograms.

      Significance

      This study provides a valuable contribution to understanding genome size variation in two Epidendrum species by combining flow cytometry, k-mer analysis, and repetitive element characterization. Its strength lies in the integrative approach and in demonstrating how repetitive elements can explain interspecific differences in DNA content. The work is among the first to directly compare flow cytometric and k-mer-based genome size estimates in orchids, extending current knowledge of genome evolution in this complex plant group. However, the study would benefit from a more critical discussion of the limitations and interpretative pitfalls of k-mer analysis and from addressing methodological inconsistencies in the cytometric data. The research will interest a specialized audience in plant genomics, cytogenetics, and genome evolution, particularly those studying non-model or highly endoreplicated species.

      Field of expertise: plant cytogenetics, genome size evolution, orchid genomics.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2025-03131

      Corresponding author(s): Ginto George and Adriana Ordoñez

      1. General Statements

      We thank the reviewers for their careful evaluation of our work and for their constructive and insightful comments. We are pleased that both reviewers found the study to be well executed, clearly presented, and of interest to the ER stress and UPR community. We have carefully considered all comments and revised the manuscript accordingly. We believe these revisions have substantially strengthened the clarity, robustness, and conceptual impact of the study.

      2. Point-by-point description of the revisions

      Below we provide a detailed, point-by-point response to the reviewers' comments and describe the revisions and new data included in the revised manuscript.

      Reviewer 1 & 2 (common points)

      Description of the BiP::GFP reporter as a readout of ATF6α activity.

      • Comment: Both reviewers are concerned about whether BiP::GFP is a reliable and specific reporter for ATF6α.
      • Response: We have clarified in the revised manuscript the details of the BiP promoter fragment used in this reporter, explicitly noting the presence of an ERSE-I motif (CCAAT-N9-CCACG), the element most specifically and robustly activated by ATF6α (new Supplemental Fig. S1). This reporter was first characterised in our recently published study (Tung et al., 2024 eLife), where we demonstrated that BiP::GFP expression is ATF6α dependent, as CRISPR/Cas9-mediated disruption of endogenous ATF6α resulted in a marked reduction in BiP::GFP fluorescence compared with parental cells. Furthermore, treatment with ER stress in the presence of Ceapin-A7 (a small molecule that blocks ATF6α activation by tethering it to the lysosome) effectively blocked activation of the ATF6α fluorescent reporter, whereas the S1P inhibitor partially attenuated the BiP::sfGFP signal in stressed cells (Tung et al., 2024 eLife; Supplemental S1D). We have now reproduced these findings in the present study, further confirming that the BiP::GFP reporter is highly dependent on ATF6α activation, and we present these data in a new Supplemental Fig. S1B.

      Correlation between BiP::GFP reporter activity and BiP expression levels.

      • Comment: Both reviewers requested correlation of the BiP::GFP reporter activity and endogenous BiP levels.
      • Response: To address this point, we have measured BiP mRNA levels in parental and Slc33a1-depleted cells under both basal conditions and ER stress conditions. These measurements correlated well with the BiP::GFP reporter activity assessed by flow cytometry and are shown in Supplemental Fig. S3E.

      Does ATF6α respond to other ER stressors in Slc33a1-deleted cells?

      • Comment: Both reviewers accepted our claim that ATF6α activation is partially attenuated in Slc33a1-deleted cells exposed to ER stressors tunicamycin (Tm) and 2-Deoxy-D-glucose (2DG) but raised the possibility that ATF6α signalling might respond differently to other ER stressors.
      • Response: To address this point, we have performed new experiments assessing ATF6α activation (BiP::GFP activity) in both Slc33a1-deleted and parental cells in response to additional ER stressors, including dithiothreitol (DTT) and thapsigargin (Tg). These new data, presented in a new Supplemental Fig. S3B and S3C, show that Slc33a1-deletion also attenuates ATF6α signalling in cells treated with dithiothreitol (DTT) and thapsigargin (Tg).

      Deletion of all NAT8 family members.

      • Comment: Both reviewers suggested that deletion of all NAT8 family members was required to conclusively distinguish their role from that of SLC33A1.
      • Response: We agree with this assessment and have now generated cells in which both Nat8 and Nat8b are simultaneously deleted. These new data, included in a new Supplemental Fig. S9, strengthen the comparison with SLC33A1 deficiency and rule out potential redundancy among NAT8 family members. Notably, simultaneous inactivation of Nat8 and Nat8b resulted in the same phenotype observed upon single Nat8 deletion, namely activation of both the IRE1 and ATF6α branches of the UPR. These findings (discussed in detail) are consistent with previous studies implicating protein acetylation in ER proteostasis but suggest that a defect in protein acetylation is unlikely to contribute to the consequences of SLC33A1 deficiency in terms of ATF6α

      Generalisability beyond CHO-K1 cells.

      • Comment: Reviewer 1 raised concerns regarding validation of our findings beyond CHO-K1 cells.
      • Response: While we acknowledge that validation in additional cell types would further strengthen the study, we now explicitly discuss the technical challenges encountered when attempting to generate clonal Slc33a1 knockouts in aneuploid human cell lines, such as HeLa. This limitation is now clearly acknowledged in the revised version, and our conclusions are framed accordingly.

      Relationship between basal ATF6 and IRE1 signalling.

      • Comment: Both reviewers argued that BiP::GFP does not appear to be active under basal conditions in parental cells, and therefore a failure to activate ATF6 would not be expected to affect the condition of the cells basally, thereby questioning how attenuated basal ATF6 activity in the Slc33a1-deleted cells could account for the derepression observed in the IRE1 pathway.
      • Response: The logic of the reviewers' critique is impeccable, and we thank them for the opportunity to clarify this important issue. Whilst the basal fluorescent signal arising from BiP::GFP (the ATF6α reporter) is indeed weak, it is not null. This is evident by comparing the BiP::GFP signal in wildtype and ATF6α-deleted cells (new Supplemental Fig. S1B). These experiments revealed a significant reduction in basal BiP::GFP fluorescence in ATF6αΔ cells compared with parental dual-reporter cells, indicating that the BiP::GFP reporter has basal activity that is dependent on ATF6α. These new data are consistent with previously published observations demonstrating that treatment with Ceapin, an ATF6α-specific inhibitor, lowered BiP::GFP fluorescence in tunicamycin-treated cells to levels below those observed in untreated controls (Tung et al., eLife 2024). Together these observations indicate that ATF6α is active basally in CHO-K1 cells. The established cross-pathway repression of IRE1 by ATF6α signalling renders plausible our suggestion that the activation of XBP1::mCherry (the IRE1 reporter) observed basally in the Slc33a1-deleted cells arises from the partial interruption of ATF6α

      Reviewer 1 (additional points)

      Effect of deleting sialic acid-modifying acetyltransferases.

      • Comment: Reviewer 1 suggested that comparing the consequences of deleting Slc33a1 and the sialic acid-modifying acetyltransferases that operate downstream of the putative acetyl-CoA transporter could be informative.
      • Response: In response to this valuable suggestion, we have now examined the impact on ATF6α activity of deleting Casd1, the gene encoding the Golgi acetyltransferase responsible for modifying sialic acids, comparing the consequences to those of Slc33a1 deletion. New Supplemental Fig. S8 reveals partial phenotypic overlap between the two deletions, suggesting that the loss of SLC33A1 exerts some of its effects on CHO cells by compromising sialic acid modification.

      Potential effects on ATF6-like proteins (SREBP1/2, CREB3L).

      • Comment: Reviewer 1 suggested that we evaluate the effect of SLC33A1 loss on other ATF6-like transcription factors.
      • Response: We took this advice to heart, but our attempts to compare SREBP2 processing in wildtype and SLC33A1 knockout cells were frustrated by the low quality of the antibodies available to us.

      Reviewer 2 (additional points)

      Physiological state and clonal adaptation of Slc33a1-deleted cells.

      • Comment: Reviewer 2 raised concerns regarding the physiological state of the Slc33a1-deleted cells and the potential impact of clonal adaptation or selection pressure on the consequences of genetic manipulation.
      • Response: This is a valid concern. Deconvoluting direct from indirect effects is a challenge in any genetics-based experiment. To address this point, we compared the proliferation capacity of three pairs of parental CHO-K1 clones with their derivative Slc33a1-deletion variants using the IncuCyte assay. As shown in new Supplemental Fig. S2D, the Slc33a1 deletion variants had no consistent fitness disadvantage revealed by this assay. Whilst cell mass accretion is only one measure of comparability between cell lines, we deem these observations to indicate that a comparison between SLC33A1 wildtype and mutant CHO-K1 cells is unlikely to be compromised by gross underlying differences in cell fitness.

      Responsiveness of PERK signalling to ER stress.

      • Comment: Reviewer 2 asked whether PERK signalling, which appears basally activated due to higher basal IRE1 signalling in the Slc33a1-deleted cells, remains responsive to ER stress.
      • Response: To address this point, we treated cells with ER stressors and assessed PERK pathway activation. As shown in new Supplemental Fig. S4C, PERK signalling remains functional and responsive to ER stress in Slc33a1-depleted cells.

      In addition to the points above, we have addressed several presentation and clarity issues raised by the reviewers, including figure labelling, image presentation, and schematic models. The Discussion has also been revised to more explicitly acknowledge the current limitations of the study while emphasising its central conceptual advance: namely, that loss of SLC33A1 results in a discordant UPR state in which IRE1 and PERK are activated, whereas ATF6α trafficking and transcriptional output are selectively compromised.

      The following table summarises the major changes made to the figures in the revised manuscript, to facilitate tracking of the modifications introduced.

      | Figure | Figure Panels | Amendment (if any) |
      |---|---|---|
      | Fig 4 | 4B (modified) | Scale bar added. |
      | Fig 5 | 5B (modified) | Labelling correction according to the reviewer. |
      | Fig S1 (new) | S1A-S1B | New data detailing the BiP promoter fragment and the reliability of the BiP::GFP reporter as a readout for ATF6α activity in cells. |
      | Fig S2 (modified) | S2D (new) | New IncuCyte data added. |
      | Fig S3 (modified) | S3B, S3C and S3E (new) | Panels B and C: New data from DTT and thapsigargin treatments, respectively. Panel E: New data from BiP mRNA levels under 2DG treatment in parental and Slc33a1-deleted cells. |
      | Fig S4 (new) | S4C (new) | Panels A and B: Previously shown as panels in Fig. S2C and S2D. Panel C: New data on the PERK response to ER stress in Slc33a1-deleted cells. |
      | Fig S7 (new) | S7A-S7C (new) | New Sanger sequencing chromatograms displaying the targeted exonic regions of Casd1, Nat8 and Nat8b. |
      | Fig S8 (new) | S8A-S8B (new) | Casd1-deleted data added. |
      | Fig S9 (new) | Unique panel | New data comparing Nat8/Nat8b-deleted cells with single Nat8-deleted cells. |

      We thank the reviewers again for their insightful comments, which have significantly strengthened the manuscript. We believe the revised study clarifies key mechanistic points and provides a stronger conceptual advance regarding the role of SLC33A1 in UPR regulation.

      Sincerely,

      Adriana Ordóñez

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      The authors employed a genome-wide CRISPR-Cas9 screen to search for the genes selectively involved in the activation of the ER stress sensor ATF6. Deletion of Slc33a1, which encodes a transporter of acetyl-CoA into the ER lumen, compromised the ATF6 pathway (as assessed by the BiP::GFP reporter), while IRE1 and PERK were activated in basal conditions, in the absence of ER stress (as assessed by the XBP1s::mCherry reporter and endogenous XBP1s, and by the CHOP::GFP reporter). Moreover, IRE1, but not ATF6, responded to ER stress. Consistently, in Slc33a1Δ cells upon ER stress the levels of the processed N-ATF6α were significantly lowered compared to the parental cells, and a microscopy study showed that in Slc33a1-deficient cells ATF6 is translocated to the Golgi even in the absence of ER stress, but fails to reach the nucleus even after ER stress is imposed. Golgi-type sugar modification of ATF6α is decreased in Slc33a1Δ cells. These data show the importance of SLC33A1 for ATF6 processing and functioning through a mechanism that remains to be revealed.

      Major comments.

      Taken together, the reported data do support the conclusion about the role of SLC33A1 functioning in post-ER maturation of ATF6. Data and methods are presented in a reproducible way. Still, there are several issues worth attention.

      1. While BiP::GFP reporter is very useful, it would be more convincing to show the level of BiP in Slc33a1Δ cells by WB.
      2. Another concern is the state of Slc33a1Δ cells. While adaptation is a general problem of clonal cells, the cells used in this study (with XBP1 highly spliced, CHOP upregulated, and ATF6 pro-survival pathway inhibited) are probably very sick, and the selection pressure/adaptation is very strong in this cell line. I would suggest that the authors clarify this issue.
      3. The authors showed that, based on CHOP::GFP reporter data, PERK was activated in the absence of ER stress and the activation was due to IRE1 signalling. But did PERK respond to ER stress?
      4. An important question is the subcellular location of SLC33A1. Huppke et al. (cited in the manuscript) showed that FLAG- and GFP-tagged SLC33A1 was colocalized with Golgi markers. While that may be due to overexpression of the protein, it deserves consideration, given that ATF6 is stuck in the Golgi upon depletion of SLC33A1.
      5. OPTIONAL. Regarding the role of acetylation in compromising ATF6 function: since both SLC33A1 deficiency and depletion of Nat8 have broad effects, glycosylation of ATF6 upon depletion of Nat8 should be assessed (similarly to Fig 5), to demonstrate the difference in glycosylation pattern upon the absence of SLC33A1 and Nat8 and strengthen the conclusions.

      Minor comments.

      1. Apart from the table of the cell lines, it would be useful to add to the supplementary a simple-minded scheme of the reporters used in this study (BiP::GFP, CHOP::GFP, XBP1s::mCherry) specifying the mechanism of the readout and the harbored protein and other important details (e.g., whether mRNA of XBP1s::mCherry reporter could be processed by IRE1).
      2. Fig 2B and Fig 3A - the percentage of spliced XBP1 in parental cells is about 30% according to the graphs, but it looks more like 5%.
      3. Fig 3B - It would probably be better to demonstrate the processing of endogenous ATF6. It could help to avoid the problems with alternative translation (even though anti-ATF6 antibodies are known to be tricky).
      4. In Fig 4B, it could be better to show the Golgi marker separately. In Fig 4B and E the scale bars are missing (and cells in Fig 4B look bigger than in Fig 4E). Magnification of the insets should be further increased.
      5. As the authors mention, 2-deoxy-D-glucose (2DG) is known to be the ER stress inducer, acting via prevention of N-glycosylation of proteins. Also, N-glycosylation state of ATF6 has been suggested to influence its trafficking. Thus, even if the control cells were treated in the same way, 2DG may not be the best ER-stress inducer to study ATF6 trafficking. Indeed, altered sugar modification of ATF6α in Slc33a1Δ cells (Fig 5) was tracked using Thapsigargin.
      6. Minor comment on Fig 7 - recent data (Belyy et al., 2022) suggest IRE1 is a dimer even in the absence of ER stress.

      Referee cross-commenting

      I agree with Reviewer 1 that the authors need to clarify better how exactly the BiP::GFP reporter works and whether it reflects ATF6 activation (Reviewer 1 pointed to the unclear responsiveness of the reporter to ATF6, and I asked to show the level of BiP by WB and a scheme of the readout mechanisms of the reporters).

      I also agree with the comment on 2-DG, which for some experiments may not be the best choice to activate the UPR (or, as Reviewer 1 pointed out, shouldn't be the only one used to induce the UPR). I still think that there's no contradiction between partial cleavage of ATF6 and suppression of BiP::GFP in Slc33a1Δ cells if (as the authors show) it then doesn't reach the nucleus.

      Significance

      General assessment. The article shows the necessity of SLC33A1, a transporter of acetyl-CoA in ER lumen, for ATF6 processing and functioning. It is well-written. However, the molecular mechanism which underlies the link is yet to be discovered (and this is clearly mentioned by the authors).

      The study is of interest for the basic research and of potential interest for clinical research.

      My main field of expertise is UPR. While I have broad knowledge and interest in protein science in general, my experience with protein glycosylation is rather limited.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      In this manuscript, the authors follow up on the results from a previous CRISPR screen in CHO-K1 cells demonstrating that knockout of the ER acetyl-CoA transporter Slc33a1 suppresses ATF6 activation. The authors show in these cells that, in response to 2-DG, the Slc33a1 deletion results in constitutive activation of the UPR except for the ATF6 pathway, which appears to traffic constitutively to the Golgi but to not be cleaved there. They show using an uncleavable ATF6 that loss of Slc33a1 delays formation of an O-glycosylated form of at least this version of the protein, and they also find that single deletion of the ER acetyltransferases NAT8 and NAT8B also constitutively activates the UPR, but that activation in this case includes activation of ATF6. The mechanism by which Acetyl-CoA might impact ATF6 activation is not elucidated.

      Major Comments:

      The following conclusions are well-supported:

      • That loss of Slc33a1 results in IRE1 and PERK activation but not ATF6 activation
      • That ATF6 traffics at least to some degree constitutively to the Golgi when Slc33a1 is deleted, which is a counterintuitive finding given the apparent lack of ATF6 activation
      • That loss of Slc33a1 can alter the level of O-glycosylation and the preponderance of sialylated N-glycans on at least ATF6
      • Generally speaking, I find the wording to be careful and precise

      The following claims are less convincing:

      • That loss of Slc33a1 results in universal suppression of ATF6 activation. The effect in response to 2-DG is unquestionably strong at least at the level of Bip-GFP reporter (although it's not clear from this paper nor the previous one from this group how much of the Bip promoter this reporter encodes-which is important because only a minimal Bip promoter is exclusively responsive to ATF6). However, the impairment of ATF6 activation in response to tunicamycin (Fig. 1C) is very modest, and no other stressors were tested (DTT and TG were used for other purposes, not to test ATF6 activation). One might actually expect this pathway, if it affects glycosylation pathways, to be particularly sensitive to a stressor like 2-DG that would have knock-on effects on glycosylation. Admittedly, it does seem to be true in the basal condition (i.e., absent an exogenous ER stress) that IRE1 and PERK are activated where ATF6 is not. At some level, it's hard to reconcile the almost complete suppression of Bip-GFP induction in Slc33a1 cells in response to 2DG with the fact that in Fig. 3, cleavage clearly seems to be occurring, albeit to a lesser extent.
      • That regulation of ATF6 is a broadly applicable consequence of Slc33a1 action. Unless I've missed it, all experiments are performed in CHO-K1 cells, so how broadly applicable this pathway is remains unclear.
      • That loss of Slc33a1 "deregulated activation of the IRE1 branch of the UPR." It is clear that IRE1 is activated when Slc33a1 is deleted (that the authors show this repeatedly in different parental cell lines provides a high degree of rigor). However, at least through the CHOP-GFP reporter, PERK is activated as well. Although 4u8C suppresses this activation, the suppression is not complete, there are no orthogonal ways of showing this (e.g., loss or KD of IRE1), and the converse experiment (examining IRE1 activation when PERK is lost or inhibited) was not done. Thus, while I agree that the data shown are consistent with PERK activation being downstream of IRE1, they are not definitive enough to, in my opinion, rule out the more parsimonious explanation for their own data and what is already published in the field that loss of Slc33a1 causes ER stress (thus in principle activating all 3 pathways of the UPR-including ATF6 transit to the Golgi) but that it also, separately, inhibits activation of ATF6 (and possibly other things? See below)-a possibility acknowledged towards the end of the Discussion.
      • That "Nat8 and Slc33a1 influence ER homeostasis and ATF6 signaling through distinct mechanisms". This conclusion would require simultaneous deletion of both Nat8 and NAT8B because of possible redundancy/compensatory effects.
      • If I'm understanding the authors' argument correctly, they seem to be invoking that the ATF6 activation defect underlies/is upstream of the activation of IRE1 in Slc33a1 KO cells. But if that understanding is correct, it seems fairly unlikely, as the authors' data show no evidence that ATF6 is activated in parental cells under basal conditions (Fig. 3B) and thus no reason to expect that failure to activate ATF6 by itself would result in appreciable phenotype in cells-an idea also consistent with the general lack of phenotype in ATF6-null MEF and other cells.

      Minor Comments:

      • The alteration in O-glycosylation levels of ATF6 is interesting, but it might or might not be relevant to ATF6 activation, and if it isn't, then the paper provides no mechanism for why loss of Slc33a1 has the effects on ATF6 that it does. What about other similar molecules, like ATF6B (surprising that this was not examined), SREBP1/2, a non-glycosylatable ATF6, and/or one of the other CREB3L proteins?
      • Does Slc33a1 deletion cause other ER resident proteins to constitutively mislocalize to the Golgi?
      • As mentioned above, does loss/knockdown of Slc33a1 activate IRE1 and PERK but not ATF6 in other cell types?
      • Also as mentioned above, how do the UPR (all 3 branches) in cells lacking Slc33a1 respond to TG or DTT? This and the preceding comments are important toward making the claim that Slc33a1 is actually a regulator of ATF6. The time required to do these experiments will depend on whether creation of more stable lines is required, and whether they are worth doing depends on how broad the authors wish the scope of the paper to be.
      • It's surprising that the authors didn't do comparable experiments to what is shown in Fig. 6 but deleting the acetyltransferases that modify sialic acids, which I believe are known.
      • The authors mis-describe the data from Fig. 5B. EndoH and PNGaseF should collapse ATF6 to a 0N form, not a 1N form (what is labeled as 2N should be 1N, and it looks like the true 2N band is partially obscured by the strong 3N band).

      Referee cross-commenting

      While reviewer #2 and I have somewhat different opinions on the strength of the evidence, we seem fairly well-aligned on the overall significance of the work.

      Significance

      The conceptual advance in this paper is that, while loss of Slc33a1 seems widely disruptive to ER function-an idea that has been advanced in the literature before-it seems to have unique and discordant effects on ATF6 relative to the other UPR pathways. The paper does not offer a conclusive mechanism by which these effects are realized, and the sole focus on ATF6 makes it difficult to fully contextualize the findings, but the data are of high quality and, while the scope is somewhat narrow, the phenotype is likely to be of interest to those concerned with ER stress and UPR signaling, which also describes my own expertise.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary:

      Damaris et al. perform what is effectively an eQTL analysis on microbial pangenomes of E. coli and P. aeruginosa. Specifically, they leverage a large dataset of paired DNA/RNA-seq information for hundreds of strains of these microbes to establish correlations between genetic variants and changes in gene expression. Ultimately, their claim is that this approach identifies non-coding variants that affect expression of genes in a predictable manner and explain differences in phenotypes. They attempt to reinforce these claims through use of a widely regarded promoter calculator to quantify promoter effects, as well as some validation studies in living cells. Lastly, they show that these non-coding variations can explain some cases of antibiotic resistance in these microbes.

      Major comments

      Are the claims and the conclusions supported by the data or do they require additional experiments or analyses to support them?

      The authors convincingly demonstrate that they can identify non-coding variation in pangenomes of bacteria and associate these with phenotypes of interest. What is unclear is the extent to which they account for covariation among genetic variants. Are the SNPs they implicate truly responsible for the changes in expression they observe, or are they merely genetically linked to the true causal variants? This has been solved by other GWAS studies but isn't discussed here as far as I can tell.

      We thank the reviewer for their effective summary of our study. Regarding our ability to identify variants that are causal for gene expression changes versus those that only “tag” the causal ones, here we have to again offer our apologies for not spelling out the limitation of GWAS approaches, namely the difficulty in separating associated from causal variants. This inherent difficulty is the main reason why we added the in-silico and in-vitro validation experiments; while they each have their own limitations, we argue that they all point towards providing a causal link between some of our associations and measured gene expression changes. We have amended the discussion section (e.g. at L548) to spell our intention out better and provide better context for readers that are not familiar with the pitfalls of (bacterial) GWAS.

      They need to justify why they consider the 30bp downstream of the start codon as non-coding. While this region certainly has regulatory impact, it is also definitely coding. To what extent could this confound results and how many significant associations to expression are in this region vs upstream?

      We agree with the reviewer that defining this region as “non-coding” is formally not correct, as it includes the first 10 codons of the focal gene. We have amended the text to change the definition to “cis regulatory region” and avoided using the term “non-coding” throughout the manuscript. Regarding the relevance of including the early coding region, we have looked at the distribution of associated hits in the cis regulatory regions we have defined; the results are shown in Supplementary Figure 3.

      We quantified the distribution of cis associated variants and compared it to 2,000 permutations restricted to the -200bp to +30bp window in both E. coli (panel A) and P. aeruginosa (panel B). As can be seen, the associated variants that we have identified are mostly present in the -200bp region, and the +30bp region shows a mild depletion relative to the random expectation, which we derived through a variant position shuffling approach (2,000 replicates). Therefore, we believe that the inclusion of the early coding region results in an appreciable number of associations, which in our opinion justifies its inclusion as a putative “cis regulatory region”.
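
      The position-shuffling comparison described above can be sketched as follows. This is only an illustration under stated assumptions: positions are drawn uniformly within the -200 to +30 window, the summary statistic is the fraction of variants upstream of the start codon, and the function names and toy positions are hypothetical. The response does not specify the exact shuffling scheme, so this should not be read as the authors' implementation.

```python
import random


def upstream_fraction(positions):
    """Fraction of variant positions upstream of the start codon (position < 0)."""
    return sum(p < 0 for p in positions) / len(positions)


def permutation_test(observed, window=(-200, 30), n_perm=2000, seed=0):
    """Compare the observed upstream fraction with position-shuffled replicates.

    Assumption: each shuffled variant is placed uniformly at random in the window
    [-200, +30), i.e. 200 upstream positions and 30 early-coding positions.
    """
    rng = random.Random(seed)
    lo, hi = window
    obs = upstream_fraction(observed)
    null = []
    for _ in range(n_perm):
        shuffled = [rng.randrange(lo, hi) for _ in observed]
        null.append(upstream_fraction(shuffled))
    # One-sided empirical p-value with the usual +1 correction.
    p_value = (1 + sum(f >= obs for f in null)) / (n_perm + 1)
    return obs, sum(null) / n_perm, p_value


# Hypothetical associated-variant positions, all upstream, for illustration only.
toy_positions = list(range(-150, -50))
obs, null_mean, p_value = permutation_test(toy_positions)
```

      A p-value near the lower bound of 1/(n_perm + 1) would indicate that the observed enrichment upstream is rarely matched by shuffled positions.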

      The claim that promoter variation correlates with changes in measured gene expression is not convincingly demonstrated (although, yes, very intuitive). Figure 3 is a convoluted way of demonstrating that predicted transcription rates correlate with measured gene expression. For each variant, can you do the basic analysis of just comparing differences in promoter calculator predictions and actual gene expression? I.e. correlation between (promoter activity variant X)-(promoter activity variant Y) vs (measured gene expression variant X)-(measured gene expression variant Y). You'll probably have to

      We realize that we may have failed to properly explain how we carried out this analysis, which we did exactly in the way the reviewer suggests here. We had in fact provided four example scatterplots of the kind the reviewer was requesting as part of Figure 4. We have added a mention of their presence in the caption of Figure 3.

      Figure 7: it is unclear what this experiment was. How were they tested? Did you generate the data yourselves? Did you do RNA-seq (which is what is described in the methods) or just test and compare known genomic data?
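
      The comparison the reviewer asks for, differences in predicted promoter activity against differences in measured expression for pairs of variants, can be sketched as below. The variant labels and values are hypothetical, and the Pearson correlation is an assumed choice of statistic; the manuscript's exact procedure is not given in this exchange.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def pairwise_delta_correlation(predicted, measured):
    """Correlate (predicted X - predicted Y) with (measured X - measured Y) over all variant pairs."""
    d_pred, d_meas = [], []
    for a, b in combinations(sorted(predicted), 2):
        d_pred.append(predicted[a] - predicted[b])
        d_meas.append(measured[a] - measured[b])
    return pearson(d_pred, d_meas)


# Hypothetical promoter-variant values for illustration only.
predicted = {"A": 1.0, "B": 2.0, "C": 4.0}   # e.g. predicted transcription rate
measured = {"A": 0.5, "B": 1.0, "C": 2.0}    # e.g. measured expression
r = pairwise_delta_correlation(predicted, measured)
```

      Note that pairwise differences are not independent of one another, so a significance test on this correlation would need to account for that.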

      We apologize for the lack of clarity here; we have amended the figure’s caption and the corresponding section of the results (i.e. L411 and L418) to better highlight how the underlying drug susceptibility data and genomes came from previously published studies.

      Are the data and the methods presented in such a way that they can be reproduced?

      No, this is the biggest flaw of the work. The RNA-Seq experiment to start this project is not described at all as well as other key experiments. Descriptions of methods in the text are far too vague to understand the approach or rationale at many points in the text. The scripts are available on github but there is no description of what they correspond to outside of the file names and none of the data files are found to replicate the plots.

      We have taken this critique to heart, and have given more details about the experimental setup for the generation of the RNA-seq data in the methods as well as the results sections. We have also thoroughly reviewed any description of the methods we have employed to make sure they are more clearly presented to the readers. We have also updated our code repository in order to provide more information about the meaning of each script provided, although we would like to point out that we have not made the code to be general purpose, but rather as an open documentation on how the data was analyzed.

      Figure 8B is intended to show that the WaaQ operon is connected to known Abx resistance genes but uses the STRING method. This requires a list of genes but how did they build this list? Why look at these known ABx genes in particular? STRING does not really show evidence, these need to be substantiated or at least need to justify why this analysis was performed.

      We have amended the Methods section (“Gene interaction analysis”, L799) to better clarify how the network shown in this panel was obtained. In short, we have filtered the STRING database to identify genes connected to members of the waa operon with an interaction score of at least 0.4 (“moderate confidence”), excluding the “text mining” field. Antimicrobial resistance genes were identified according to the CARD database. We believe these changes will help the readers to better understand how we derived this interaction.
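
      The filtering step described above can be sketched on a toy edge list. The per-channel scores, gene names other than waaQ, and the way channels are combined here (a simple noisy-OR over the remaining channels) are all assumptions made for illustration; STRING's own combined score may apply further corrections that this sketch omits.

```python
def combined_score(channels, exclude=("textmining",)):
    """Noisy-OR combination of per-channel scores, skipping excluded channels.

    Simplifying assumption: no prior correction is applied.
    """
    remaining = 1.0
    for name, score in channels.items():
        if name not in exclude:
            remaining *= 1.0 - score
    return 1.0 - remaining


def neighbours(edges, query_genes, min_score=0.4):
    """Keep edges touching a query gene whose recomputed score is at least min_score."""
    kept = []
    for gene_a, gene_b, channels in edges:
        if gene_a in query_genes or gene_b in query_genes:
            score = combined_score(channels)
            if score >= min_score:
                kept.append((gene_a, gene_b, round(score, 3)))
    return kept


# Hypothetical edges with scores on a 0-1 scale, for illustration only.
toy_edges = [
    ("waaQ", "geneX", {"experiments": 0.3, "coexpression": 0.2, "textmining": 0.9}),
    ("waaQ", "geneY", {"textmining": 0.95}),
    ("geneX", "geneY", {"experiments": 0.8}),
]
result = neighbours(toy_edges, {"waaQ"})
```

      In this toy example the waaQ-geneY edge is dropped because its only support is text mining, which mirrors the intent of excluding that field.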

      Are the experiments adequately replicated and statistical analysis adequate?

      An important claim on MIC of variants for supplementary table 8 has no raw data and no clear replicates available. Only figure 6, the in vitro testing of variant expression, mentions any replicates.

      We have expanded the relevant section in the Methods (“Antibiotic exposure and RNA extraction”, L778) to provide more information on the way these assays were carried out. In short, we carried out three biological replicates, and the average MIC of the two replicates in closest agreement was taken as the representative MIC for the strain. We believe that we have followed standard practice in the field of microbiology, but we agree that more details needed to be provided in order for readers to appreciate this.
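
      The rule stated above (three replicates, average of the two in closest agreement) can be written as a short function. This is a sketch under the assumption that agreement is judged by absolute difference on the linear MIC scale; the response does not say whether a log2 scale was used, and the function name is hypothetical.

```python
from itertools import combinations


def representative_mic(replicates):
    """Average the two of three replicate MICs that agree most closely.

    Assumption: closeness is the absolute difference on the linear scale.
    If two pairs are equally close, the first pair in input order is used.
    """
    if len(replicates) != 3:
        raise ValueError("expected exactly three biological replicates")
    a, b = min(combinations(replicates, 2), key=lambda pair: abs(pair[0] - pair[1]))
    return (a + b) / 2


# Hypothetical MIC values for illustration only.
example = representative_mic([2.0, 4.0, 16.0])
```

      With the hypothetical values above, the outlying replicate (16.0) is ignored and the two closest values are averaged.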

      Minor comments

      Specific experimental issues that are easily addressable.

      Are prior studies referenced appropriately?

      There should be a discussion of eQTLs in this, although these have mostly been studied in eukaryotes: https://doi.org/10.1038/s41588-024-01769-9 ; https://doi.org/10.1038/nrg3891.

      We have added these two references, which provide a broader context to our study and methodology, in the introduction.

      Line 67. Missing important citation for Ireland et al. 2020 https://doi.org/10.7554/eLife.55308

      Line 69. Should mention Johns et al. 2018 (https://doi.org/10.1038/nmeth.4633) where they study promoter sequences outside of E. coli

      Line 90 - replace 'hypothesis-free' with unbiased

      We have implemented these changes.

      Line 102 - state % of DEGs relative to the entire pan-genome

      Given that the study is focused on identifying variants that were associated with changes in expression for reference genes (i.e. those present in the reference genome), we think that providing this percentage would give the false impression that our analysis includes accessory genes that are not encoded by the reference isolate, which is not what we have done.

      Figure 1A is not discussed in the text

      We have added an explicit mention of the panels in the relevant section of the results.

      Line 111: it is unclear what enrichment was being compared between. Figures 1C/D show 'Gene counts', but are these out of the total DEGs? How is the p-value derived, and what statistical test was performed? Is DEG enrichment compared against the pangenome? The K12 genome?

      We have amended the results and methods section, as well as Figure 1’s caption to provide more details on how this analysis was carried out.

      Line 122-123: State what letters correspond to these COG categories here

      We have implemented the clarifications and edits suggested above.

      Line 155: Need to clarify how you use k-mers in this and how they are different than SNPs. are you looking at k-mer content of these regions? K-mers up to hexamers or what? How are these compared. You can't just say we used k-mers.

      We have amended that line in the results section to refer more explicitly to the actual encoding of the k-mer variants: presence/absence patterns for k-mers extracted from each target gene’s promoter region separately, using a method we developed, called panfeed. We note that more details were already given in the methods section, but we recognize that it is better to clarify this in the results section, so that readers who skip the methods still get the proper information about this class of genetic variants.
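
      To make the encoding concrete, here is a minimal sketch of how per-gene k-mer presence/absence patterns can be built. This is not panfeed's implementation; the function names, the input layout (one promoter sequence per isolate for a single gene) and the choice of `k` are all assumptions made for illustration.

```python
def kmers(seq, k):
    """Return the set of all k-mers found in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def presence_absence(promoters, k):
    """Encode k-mers from one gene's promoter region as 0/1 patterns.

    promoters: dict mapping isolate name -> promoter sequence (hypothetical input).
    Returns a dict mapping each k-mer to a tuple of 0/1 values,
    ordered by sorted isolate name.
    """
    per_isolate = {iso: kmers(seq.upper(), k) for iso, seq in promoters.items()}
    isolates = sorted(promoters)
    all_kmers = sorted(set().union(*per_isolate.values()))
    return {
        kmer: tuple(int(kmer in per_isolate[iso]) for iso in isolates)
        for kmer in all_kmers
    }


# Two isolates differing by a single SNP at the last position.
patterns = presence_absence({"isolate_A": "ACGT", "isolate_B": "ACGA"}, k=3)
for kmer, pattern in patterns.items():
    print(kmer, pattern)
```

      In this toy case a single SNP creates one k-mer that is specific to each isolate, which is how short variants end up represented in the k-mer encoding.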

      Line 172: It would be VERY helpful to have a supplementary figure describing these types of variants, perhaps a multiple-sequence alignment containing each example

      We thank the reviewer for this suggestion. We have now added Supplementary Figure 3, which shows the sequence alignments of the cis-regulatory regions underlying each class of the genetic marker for both E. coli and P. aeruginosa.

      Figure 4: This figure is too small. Why are WaaQ and UlaE being used as examples here when you are supposed to be explicitly showing variants with strong positive correlations?

      We rearranged the figure’s layout to improve its readability. We agree that the correlation for waaQ and ulaE is weaker than for yfgJ and kgtP, but our intention was not simply to cherry-pick strong examples, but to also include those for which the link between predicted promoter strength and recorded gene expression was less obvious.

      Figure 4: Why is there variation between variants present and variant absent? Is this due to other changes in the variant? Should mention this in the text somewhere

      Variability in the predicted transcription rate for isolates encoding the same variant is due to the presence of other (different) variants in the region surrounding the target variant. PromoterCalculator uses nucleotide regions of variable length (78 to 83 bp) to make its predictions, while the variants we are focusing on are typically shorter (as shown in Figure 4). This results in other variants being included in the calculation and therefore slightly different predicted transcription rates for each strain. We have amended the caption of Figure 4 to provide a succinct explanation of these differences.

      Line 359: Need to talk about each supplementary figure 4 to 9 and how they demonstrate your point.

      We have expanded this section to more explicitly mention the contents of these supplementary figures and why they are relevant for the findings of this section (L425).

      Are the text and figures clear and accurate?

      Figure 4 too small

      We have fixed the figure, as described above.

      Acronyms are defined multiple times in the manuscript, sometimes not the first time they are used (e.g. SNP, InDel)

      Figure 8A - Remove red box, increase label size

      Figure 8B - Low resolution, grey text is unreadable and should be darker and higher resolution

      Line 35 - be more specific about types of carbon metabolism and catabolite repression

      Line 67 - include citation for ireland et al. 2020 https://doi.org/10.7554/eLife.55308

      Line 74 - You talk about looking in cis but don't specify how far away cis is

      Line 75 - we encoded genetic variants..... It is unclear what you mean here

      Line 104 - 'were apart of operons' should clarify you mean polycistronic or multi-gene operons. Single genes may be considered operonic units as well.

      We have addressed all the issues indicated above.

      Figure 2: There is no axis for the percents and the percents don't make sense relative to the bars they represent??

      We realize that this visualization might not have been the clearest for readers, and have made the following improvement: we have added the number of genes with at least one association before the percentage. We note that the x-axis is in log scale, which may make it seem like the light-colored bars are off. With the addition of the actual number of associated genes we think that this confusion has been removed.

      Figure 2: Figure 2B legend should clarify that these are individual examples of Differential expression between variants

      Line 198-199: This sentence doesn't make sense, 'encoded using kmers' is not descriptive enough

      Line 205: Should be upfront about that you're using the Promoter Calculator that models biophysical properties of promoter sequences to predict activity.

      Line 251: 'Scanned the non-coding sequences of the DEGs'. This is far too vague of a description of an approach. Need to clarify how you did this and I didn't see in the method. Is this an HMM? Perfect sequence match to consensus sequence? Some type of alignment?

      Line 257-259: This sentence lacks clarity

      We have implemented all the suggested changes and clarified the points that the reviewer has highlighted above.

      Line 346: How were the E. coli isolates tested? Was this an experiment you did? This is a massive undertaking (1600 isolates * 12 conditions) if so, so it should be clearly defined

      While we have indicated in the previous paragraph that the genomes and antimicrobial susceptibility data were obtained from previously published studies, we have now modified this paragraph (e.g. L411 and L418) slightly to make this point even clearer.

      Figure 6A: The tile plot on the right side is not clearly labeled and it is unclear what it is showing and how that relates to the bar plots.

      In the revised figure, we have clarified the labeling of the heatmap to now read “Log2(Fold Change) (measured expression)” to indicate that it represents each gene’s fold changes obtained from our initial transcriptomic analysis. We have also included this information in the caption of the figure, making the relationship between the measured gene expression (heatmap) and the reporter assay data (bar plots) clear to the reader.

      Figure 6B: typo in legend 'Downreglation'

      We thank the reviewer for pointing this out. The typo has been corrected to “Down regulation” in the revised figure.

      Line 398: Need to state rationale for why the waaQ operon is being investigated here. Why did you look into an individual example?

      We thank the reviewer for asking for a clarification here. Our decision to investigate the waaQ gene was one of both biological relevance and empirical evidence. In our analysis associating non-coding variants with antimicrobial resistance using the Moradigaravand et al. dataset, we identified a T>C variant at position 3808241 that was associated with resistance to Tobramycin. We also observed this variant in our strain collection, where it was associated with expression changes of the gene, suggesting a possible functional impact. The waa operon is involved in LPS synthesis, a central determinant of the bacteria’s outer membrane integrity and a well established virulence factor. This provided a plausible biological mechanism through which variation could influence antimicrobial susceptibility. As its role in resistance has not been extensively characterized, this represents a good candidate for our experimental validation. We have now included this rationale in our revised manuscript (i.e. L476).

      Figure 8: Can get rid of red box

      We have now removed the red box from Figure 8 in the revised version.

      Line 463 - 'account for all kinds' is too informal

      Mix of font styles throughout document

      We have implemented all the suggestions and formatting changes indicated above.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In their manuscript "Cis non-coding genetic variation drives gene expression changes in the E. coli and P. aeruginosa pangenomes", Damaris and co-authors present an extensive meta-analysis, plus some useful follow up experiments, attempting to apply GWAS principles to identify the extent to which differences in gene expression between different strains within a given species can be directly assigned to cis-regulatory mutations. The overall principle, and the question raised by the study, is one of substantial interest, and the manuscript here represents a careful and fascinating effort at unravelling these important questions. I want to preface my review below (which may otherwise sound more harsh than I intend) with the acknowledgment that this is an EXTREMELY difficult and challenging problem that the authors are approaching, and they have clearly put in a substantial amount of high quality work in their efforts to address it. I applaud the work done here, I think it presents some very interesting findings, and I acknowledge fully that there is no one perfect approach to addressing these challenges, and while I will object to some of the decisions made by the authors below, I readily admit that others might challenge my own suggestions and approaches here. With that said, however, there is one fundamental decision that the authors made which I simply cannot agree with, and which in my view undermines much of the analysis and utility of the study: that decision is to treat both gene expression and the identification of cis-regulatory regions at the level of individual genes, rather than transcriptional units. Below I will expand on why I find this problematic, how it might be addressed, and what other areas for improvement I see in the manuscript:

      We thank the reviewer for their praise of our work. A careful set of replies to the major and minor critiques are reported below each point.

      In the entire discussion from lines roughly 100-130, the authors frequently dissect out apparently differentially expressed genes from non differentially expressed genes within the same operons... I honestly wonder whether this is a useful distinction. I understand that by the criteria set forth by the authors it is technically correct, and yet, I wonder if this is more due to thresholding artifacts (i.e., some genes passing the authors' reasonable-yet-arbitrary thresholds whereas others in the same operon do not), and in the process causing a distraction from an operon that is in fact largely moving in the same direction. The authors might wish to either aggregate data in some way across known transcriptional units for the purposes of their analysis, and/or consider a more lenient 'rescue' set of significance thresholds for genes that are in the same operons as differentially expressed genes. I would favor the former approach, performing virtually all of their analysis at the level of transcriptional units rather than individual genes, as much of their analysis in any case relies upon proper assignment of genes to promoters, and this way they could focus on the most important signals rather than get lost sometimes in the weeds of looking at every single gene when really what they seem to be looking at in this paper is a property OF THE PROMOTERS, not the genes. (of course there are phenomena, such as rho dependent termination specifically titrating expression of late genes in operons, but I think on the balance the operon-level analysis might provide more insights and a cleaner analysis and discussion).

      We agree with the reviewer that the peculiar nature of transcription in bacteria has to be taken into account in order to properly quantify the influence of cis variants in gene expression changes. We therefore added the exact analysis the reviewer suggested; that is, we ran associations between the variants in cis to the first gene of each operon and a phenotype that considered the fold-change of all genes in the operon, via a weighted average (see Methods for more details). As reported in the results section (L223), we found a similar trend as with the original analysis: we found the highest proportion of associations when encoding cis variants using k-mers (42% for E. coli and 45% for P. aeruginosa). More importantly, we found a high degree of overlap between this new “operon-level” association analysis and the original one (only including the first gene in each operon). We found a range of 90%-94% of associations overlapping for E. coli and between 75% and 91% for P. aeruginosa, depending on the variant type. We note that operon definitions are less precise for P. aeruginosa, which might explain the higher variability in the level of overlap. We have added the results of this analysis in the results section.
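
      For readers who want a concrete picture of an operon-level phenotype, the sketch below collapses per-gene log2 fold changes into a single weighted average per operon. The weighting scheme actually used is described in the Methods; here the weights are simply supplied by the caller (for example mean expression per gene), which is an assumption, as are the function and variable names.

```python
def operon_fold_change(log2fc, weights):
    """Weighted average of per-gene log2 fold changes within one operon.

    log2fc:  dict gene -> log2 fold change for a given isolate.
    weights: dict gene -> non-negative weight (hypothetical; e.g. mean expression).
    """
    total = sum(weights[gene] for gene in log2fc)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(log2fc[gene] * weights[gene] for gene in log2fc) / total


# Toy operon with two genes; the first gene carries three times the weight.
print(operon_fold_change({"geneA": 2.0, "geneB": 1.0}, {"geneA": 3.0, "geneB": 1.0}))
```

      The resulting value would then be used as the phenotype in associations with the cis variants of the first gene in the operon.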

      This also leads to a more general point, however, which I think is potentially more deeply problematic. At the end of the day, all of the analysis being done here centers on the cis regulatory logic upstream of each individual open reading frame, even though in many cases (i.e., genes after the first one in multi-gene operons), this is not where the relevant promoter is. This problem, in turn, raises potential misattributions of causality running in both directions, where the causal impact of a bona fide promoter mutation on many genes in an operon may only be associated with the first gene, or on the other side, where a mutation that co-occurs with, but is causally independent from, an actual promoter mutation may be flagged as the one driving an expression change. This becomes an especially serious issue in cases like ulaE, for genes that are not the first gene in an operon (at least according to standard annotations, the UlaE transcript should be part of a polycistronic mRNA beginning from the ulaA promoter, and the role played by cis-regulatory logic immediately upstream of ulaE is uncertain and certainly merits deeper consideration). I suspect that many other similar cases likewise lurk in the dataset used here (perhaps even more so for the Pseudomonas data, where the operon definitions are likely less robust). Of course there are many possible explanations, such as a separate ulaE promoter only in some strains, but this should perhaps be carefully stated and explored, and seems likely to be the exception rather than the rule.

      While we again agree with the reviewer that some of our associations might not result in a direct causal link because the focal variant may not belong to an actual promoter element, we also want to point out how the ability to identify the composition of transcriptional units in bacteria is far from a solved problem (see references at the bottom of this comment, two in general terms, and one characterizing a specific example), even for a well-studied species such as E. coli. Therefore, even if carrying out associations at the operon level (e.g. by focusing exclusively on variants in cis for the first gene in the operon) might be theoretically correct, a number of the associations we find further down the putative operons might be the result of a true biological signal.

      1. Conway, T., Creecy, J. P., Maddox, S. M., Grissom, J. E., Conkle, T. L., Shadid, T. M., Teramoto, J., San Miguel, P., Shimada, T., Ishihama, A., Mori, H., & Wanner, B. L. (2014). Unprecedented High-Resolution View of Bacterial Operon Architecture Revealed by RNA Sequencing. mBio, 5(4), 10.1128/mbio.01442-14. https://doi.org/10.1128/mbio.01442-14

      2. Sáenz-Lahoya, S., Bitarte, N., García, B., Burgui, S., Vergara-Irigaray, M., Valle, J., Solano, C., Toledo-Arana, A., & Lasa, I. (2019). Noncontiguous operon is a genetic organization for coordinating bacterial gene expression. Proceedings of the National Academy of Sciences, 116(5), 1733–1738. https://doi.org/10.1073/pnas.1812746116

      3. Zehentner, B., Scherer, S., & Neuhaus, K. (2023). Non-canonical transcriptional start sites in E. coli O157:H7 EDL933 are regulated and appear in surprisingly high numbers. BMC Microbiology, 23(1), 243. https://doi.org/10.1186/s12866-023-02988-6

      Another issue with the current definition of regulatory regions, which should perhaps also be accounted for, is that it is likely that for many operons, the 'regulatory regions' of one gene might overlap the ORF of the previous gene, and in some cases actual coding mutations in an upstream gene may contaminate the set of potential regulatory mutations identified in this dataset.

      We agree that defining regulatory regions might be challenging, and that those regions might overlap with coding regions, either for the focal gene or the one immediately upstream. For these reasons we have defined a wide region to identify putative regulatory variants (-200 to +30 bp around the start codon of the focal gene). We believe this relatively wide region allows us to capture most of the cis genetic variation.

      Taken together, I feel that all of the above concerns need to be addressed in some way. At the absolute barest minimum, the authors need to acknowledge the weaknesses that I have pointed out in the definition of cis-regulatory logic at a gene level. I think it would be far BETTER if they performed a re-analysis at the level of transcriptional units, which I think might substantially strengthen the work as a whole, but I recognize that this would also constitute a substantial amount of additional effort.

      As indicated above, we have added a section in the results section to report on the analysis carried out at the level of operons as individual units, with more details provided in the methods section. We believe these results, which largely overlap with the original analysis, are a good way to recognize the limitation of our approach and to acknowledge the importance of gaining a better knowledge of the number and composition of transcriptional units in bacteria, for which, as the references above indicate, we still have an incomplete understanding.

      Having reached the end of the paper, and considering the evidence and arguments of the authors in their totality, I find myself wondering how much local x background interactions - that is, the effects of cis regulatory mutations (like those being considered here, with or without the modified definitions that I proposed above) IN THE CONTEXT OF A PARTICULAR STRAIN BACKGROUND, might matter more than the effects of the cis regulatory mutations per se. This is a particularly tricky problem to address because it would require a moderate number of targeted experiments with a moderate number of promoters in a moderate number of strains (which of course makes it maximally annoying since one can't simply scale up hugely on either axis individually and really expect to tease things out). I think that trying to address this question experimentally is FAR beyond the scope of the current paper, but I think perhaps the authors could at least begin to address it by acknowledging it as a challenge in their discussion section, and possibly even identify candidate promoters that might show the largest divergence of activities across strains when there IS no detectable cis regulatory mutation (which might be indicative of local x background interactions), or those with the largest divergences of effect for a given mutation across strains. A differential expression model incorporating shrinkage is essential in such analysis to avoid putting too much weight on low expression genes with a lot of Poisson noise.

      We again thank the reviewer for their thoughtful comments on the limitations of correlative studies in general, and microbial GWAS in particular. With regard to microbial GWAS, we feel we may have failed to properly explain how the implementation we have used allows us to correct, at least partially, for population structure effects. That is, the linear mixed model we have used relies on population structure to remove the part of the association signal that is due to the genetic background and thus focus the analysis on the specific loci. Obviously, examples in which strong epistatic interactions are present would not be accounted for, but those would be extremely challenging to measure or predict at scale, as the reviewer rightfully suggests. We have added a brief recap of the ability of microbial GWAS to account for population structure in the results section (“A large fraction of gene expression changes can be attributed to genetic variations in cis regulatory regions”, e.g. L195).

      I also have some more minor concerns and suggestions, which I outline below:

      It seems that the differential expression analysis treats the lab reference strains as the 'centerpoint' against which everything else is compared, and yet I wonder if this is the best approach... it might be interesting to see how the results differ if the authors instead take a more 'average' strain (either chosen based on genetics or transcriptomics) as a reference and compared everything else to that.

      While we don’t necessarily disagree with the reviewer that a “wild” strain would be better to compare against, we think that our choice to go for the reference isolates is still justified on two grounds. First, while it is true that comparing against a reference introduces biases in the analysis, this concern would not be removed had we chosen another strain as reference; which strain would then be best as a reference to compare against? We think that the second point provides an answer to this question; the “traditional” reference isolates have a rich ecosystem of annotations, experimental data, and computational predictions. These can in turn be used for validation and hypothesis generation, which we have done extensively in the manuscript. Had we chosen a different reference isolate we would have had to still map associations to the traditional reference, resulting in a probable reduction in precision. An example that will likely resonate with this reviewer is that we have used experimentally-validated and high quality computational operon predictions to look into likely associations between cis-variants and “operon DEGs”. This analysis would have likely been of worse quality had we used another strain as reference, for which operon definitions would have had to come from lower-quality predictions or be “lifted” from the traditional reference.

      Line 104 - the statement about the differentially expressed genes being "part of operons with diverse biological functions" seems unclear - it is not apparent whether the authors are referring to diversity of function within each operon, or between the different operons, and in any case one should consider whether the observation reflects any useful information or is just an apparently random collection of operons.

      We agree that this formulation could create confusion and we have elected to remove the expression “with diverse biological functions”, given that we discuss those functions immediately after that sentence.

      Line 292 - I find the argument here somewhat unconvincing, for two reasons. First, the fact that only half of the observed changes went in the same direction as the GWAS results would indicate, which is trivially a result that would be expected by random chance, does not lend much confidence to the overall premise of the study that there are meaningful cis regulatory changes being detected (in fact, it seems to argue that the background in which a variant occurs may matter a great deal, at least as much as the cis regulatory logic itself). Second, in order to even assess whether the GWAS is useful to "find the genetic determinants of gene expression changes" as the authors indicate, it would be necessary to compare to a reasonable, non-straw-man, null approach simply identifying common sequence variants that are predicted to cause major changes in sigma 70 binding at known promoters; such a test would be especially important given the lack of directional accuracy observed here. Along these same lines, it is perhaps worth noting, in the discussion beginning on line 329, that the comparison is perhaps biased in favor of the GWAS study, since the validation targets here were prioritized based on (presumably strong) GWAS data.

      We thank the reviewer for prompting us to reason further about the results of the in-vitro validation experiments. We agree that the measured gene expression changes agree only partly with those measured with the reporter system, and that this discrepancy could likely be attributed to regulatory elements that are not in cis, and thus were not present in the in-vitro reporter system. We have noted this possibility in the discussion. Additionally, we have amended the results section to note that even though the prediction of the direction of gene expression change was not as accurate as could be expected, the accuracy of the prediction of whether a change would be present (thus ignoring directionality) was much higher.

      I don't find the Venn diagrams in Fig 7C-D useful or clear given the large number of zero-overlap regions, and would strongly advocate that the authors find another way to show these data.

      While we are aware that alternative ways to show overlap between sets exist, such as upset plots, we don’t actually find them that much easier to parse. We think that the simple and direct Venn diagrams we have drawn convey the clear message that overlaps only exist between certain drug classes in E. coli, and virtually none for P. aeruginosa. We have added a comment on the lack of overlap between all drug classes and the differences between the two species in the results section (i.e. L436 and L465).

      In the analysis of waa operon gene expression beginning on line 400, it is perhaps important to note that most of the waa operon doesn't do anything in laboratory K12 strains due to the lack of complete O-antigen... the same is not true, however, for many wild/clinical isolates. It would be interesting to see how those results compare, and also how the absolute TPMs (rather than just LFCs) of genes in this operon vary across the strains being investigated during TOB treatment.

      We thank the reviewer for this helpful suggestion. We examined the absolute expression (TPMs) of waa operon genes at baseline (A) and following exposure to Tobramycin (B). The representative TPMs per strain were obtained by averaging across biological replicates. We observed constitutive expression of the genes in the reference strain (MG1655) and the other isolates containing the variant of interest (MC4100, BW25113). In contrast, strains lacking the variant of interest (IAI76 and IAI78) showed lower expression of these operon genes under both conditions. Strain IAI77, on the other hand, displayed increased expression of a subset of waa genes after Tobramycin exposure, indicating strain-specific variation in transcriptional response. While the reference isolate might not have the O-antigen, it certainly expresses the waa operon, both constitutively and under TOB exposure.

      I don't think that the second conclusion on lines 479-480 is fully justified by the data, given both the disparity in available annotation information between the two species, AND the fact that only two species were considered.

      While we feel that the “Discussion” section of a research paper allows for speculative statements, we have to concede that we have perhaps overreached here. We have amended this sentence to be more cautious and not mislead readers.

      Line 118: "Double of DEGs"

      Line 288 - presumably these are LOG fold changes

      Fig 6b - legend contains typos

      Line 661 - please report the read count (more relevant for RNA-seq analysis) rather than Gb

      We thank the reviewer for pointing out the need to make these edits. We have implemented them all.

      Source code - I appreciate that the authors provide their source code on github, but it is very poorly documented - both a license and some top-level documentation about which code goes with each major operation/conclusion/figure should be provided. Also, ipython notebooks are in general a poor way in my view to distribute code, due to their encouragement of nonlinear development practices; while they are fine for software development, actual complete python programs along with accompanying source data would be preferable.

      We agree with the reviewer that a software license and some documentation about what each notebook is about is warranted, and we have added them both. While we agree that for “consumer-grade” software jupyter notebooks are not the most ergonomic format, we believe that as a documentation of how one-time analyses were carried out they are actually one of the best formats we could think of. They in fact allow for code and outputs to be presented alongside each other, which greatly helped us to iterate on our research and to ensure that what was presented in the manuscript matched the analyses we reported in the code. This is of course up for debate and ultimately specific to someone’s taste, and so we will keep the reviewer’s critique in mind for our next manuscript. And, if we ever decide to package the analyses presented in the manuscript as a “consumer-grade” application for others to use, we would follow higher standards of documentation and design.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Damaris et al. collected genome sequences and transcriptomes from isolates from two bacterial species. Data for E. coli were produced for this paper, while data for P. aeruginosa had been measured earlier. The authors integrated these data to detect genes with differential expression (DE) among isolates as well as cis-expression quantitative trait loci (cis-eQTLs). The authors used sample sizes that were adequate for an initial exploration of gene regulatory variation (n=117 for E. coli and n=413 for P. aeruginosa) and were able to discover cis eQTLs at about 39% of genes. In a creative addition, the authors compared their results to transcription rates predicted from a biophysical promoter model as well as to annotated transcription factor binding sites. They also attempted to validate some of their associations experimentally using GFP-reporter assays. Finally, the paper presents a mapping of antibiotic resistance traits. Many of the detected associations for this important trait group were in non-coding genome regions, suggesting a role of regulatory variation in antibiotic resistance.

      A major strength of the paper is that it covers an impressive range of distinct analyses, some of which in two different species. Weaknesses include the fact that this breadth comes at the expense of depth and detail. Some sections are underdeveloped, not fully explained and/or thought-through enough. Important methodological details are missing, as detailed below.

      We thank the reviewer for highlighting the strengths of our study. We hope that our replies to their comments and the other two reviewers will address some of the limitations.

      Major comments:

      1. An interesting aspect of the paper is that genetic variation is represented in different ways (SNPs & indels, IRG presence/absence, and k-mers). However, it is not entirely clear how these three different encodings relate to each other. Specifically, more information should be given on these two points:

      2. it is not clear how "presence/absence of intergenic regions" are different from larger indels.

      In order to better guide readers through the different kinds of genetic variants we considered, we have added a brief explanation about what “promoter switches” are in the introduction (“meaning that the entire promoter region may differ between isolates due to recombination events”, L56). We believe this clarifies how they are very different in character from a large deletion. We have kept the reference to the original study (10.1073/pnas.1413272111) describing how widespread these switches are in E. coli as a way for readers to discover more about them.

      • I recommend providing more narration on how the k-mers compare to the more traditional genetic variants (SNPs and indels). It seems like the k-mers include the SNPs and indels somehow? More explanation would be good here, as k-mer based mapping is not usually done in other species and is not standard practice in the field. Likewise, how is multiple testing handled for association mapping with k-mers, since presumably each gene region harbors a large number of k-mers, potentially hugely increasing the multiple testing burden?

      We agree with the reviewer that representing genetic variants as k-mers encompasses short variants (SNPs/InDels) as well as larger variants and promoter presence/absence patterns. We believe that this assumption is supported by the fact that we identify the highest proportion of DEGs with a significant association when using this representation of variants (Figure 2A, 39% for both species). We have added a reference to a recent review on the advantages of k-mer methods for population genetics (10.1093/molbev/msaf047) in the introduction. Regarding multiple testing correction, we have employed a commonly used approach that, unlike a crude Bonferroni correction based on the number of tested variants, allows for a realistic correction of association p-values. We counted the number of unique presence/absence patterns, each of which can be shared between multiple genetic variants, and applied a Bonferroni correction using this number rather than the number of variants tested. We have expanded the corresponding section in the methods (e.g. L697) to better explain this point for readers not familiar with this approach.
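      The pattern-based correction described in this reply can be illustrated with a minimal sketch. It assumes each variant is encoded as a tuple of 0/1 presence/absence calls across isolates; the function name, the toy variants, and the choice not to merge complementary patterns are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Dict, Tuple


def pattern_bonferroni_threshold(
    variants: Dict[str, Tuple[int, ...]], alpha: float = 0.05
) -> Tuple[int, float]:
    """Return (number of unique presence/absence patterns, corrected threshold).

    Variants that share the same pattern across isolates are counted once,
    so the Bonferroni denominator is the number of distinct patterns rather
    than the number of tested variants.
    """
    if not variants:
        raise ValueError("at least one variant is required")
    unique_patterns = set(variants.values())
    n_patterns = len(unique_patterns)
    return n_patterns, alpha / n_patterns


# Hypothetical toy data: four variants scored in five isolates.
toy_variants = {
    "kmer_A": (1, 0, 1, 1, 0),
    "kmer_B": (1, 0, 1, 1, 0),  # same pattern as kmer_A, counted once
    "kmer_C": (0, 1, 0, 0, 1),
    "snp_1": (1, 1, 0, 1, 0),
}

n_patterns, threshold = pattern_bonferroni_threshold(toy_variants)
print(n_patterns, threshold)
```

      In this toy case four variants collapse to three patterns, so the threshold is alpha/3 rather than alpha/4; with many linked k-mers per region the difference would be far larger.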

      1. What was the distribution of association effect sizes for the three types of variants? Did IRGs have larger effects than SNPs as may be expected if they are indeed larger events that involve more DNA differences? What were their relative allele frequencies?

      We appreciate the reviewer's suggestion to examine the distribution of effect sizes by variant type. We have now evaluated the distribution of the effect sizes and allele frequencies for the genetic markers (SNPs/InDels, IGRs, and k-mers) for both species (Supplementary Figure 2). In E. coli, IGR variants showed somewhat larger median effect sizes (|β| = 4.5) than SNPs (|β| = 3.8), whereas k-mers displayed the widest distribution (median |β| = 5.2). In P. aeruginosa, the trend differed, with IGRs exhibiting smaller effects (median |β| = 3.2) compared to SNPs/InDels (median |β| = 5.1) and k-mers (median |β| = 6.2). With respect to allele frequencies, SNPs/InDels generally occurred at lower frequencies (median AF = 0.34 for E. coli, median AF = 0.33 for P. aeruginosa), whereas IGRs (median AF = 0.65 for E. coli and 0.75 for P. aeruginosa) and k-mers (median AF = 0.71 for E. coli and 0.65 for P. aeruginosa) were more often found at intermediate to higher frequencies. We have added a visualization for the distribution of effect sizes (Supplementary Figure 2).

      1. The GFP-based experiments attempting to validate the promoter effects for 18 genes are laudable, and the fact that 16 of them showed differences is nice. However, the fact that half of the validation attempts yielded effects in the opposite direction of what was expected is quite alarming. I am not sure this really "further validates" the GWAS in the way the authors state in line 292 - in fact, quite the opposite in that the validations appear random with regards to what was predicted from the computational analyses. How do the authors interpret this result? Given the higher concordance between GWAS, promoter prediction, and DE, are the GFP assays just not relevant for what is going on in the genome? If not, what are these assays missing? Overall, more interpretation of this result would be helpful.

      We thank the reviewer for their comment, which is similar in nature to that raised by reviewer #2 above. As noted in our reply above, we have amended the results and discussion to indicate that although the direction of the gene expression change was not predicted with high accuracy, focusing on the magnitude (that is, on whether there would be a change in gene expression, regardless of its direction) resulted in a higher accuracy. We postulate that the cases in which the direction of the change was not correctly identified could be due to the influence of other genetic elements acting in trans on the gene of interest.

      1. On the same note, it would be really interesting to expand the GFP experiments to promoters that did not show association in the GWAS. Based on Figure 6, effects of promoter differences on GFP reporters seem to be very common (all but three were significant). Is this a higher rate than for the average promoter with sequence variation but without detected association? A handful of extra reporter experiments might address this. My larger question here is: what is the null expectation for how much functional promoter variation there is?

      We thank the reviewer for this comment. We agree that estimating the null expectation for functional promoter variation would require testing promoter alleles with sequence variation that are not associated in the GWAS. Such experiments, which would directly address whether the observed effects in our study exceed background, would have required us to prepare multiple constructs, which was unfortunately not possible for us due to staff constraints. We therefore elected to clarify the scope of our GFP reporter assays instead. These experiments were designed as a paired comparison of the wild-type and the GWAS-associated variant alleles of the same promoter in an identical reporter background, with the aim of testing allele-specific functional effects for GWAS hits (Supplementary Figure 6). We also included a comparison in GFP fluorescence between the promoterless vector (pOT2) and promoter-containing constructs; we observed higher GFP signals in all but four (yfgJ, fimI, agaI, and yfdQ) variant-containing promoter constructs, which indicates that for most of the constructs we cloned active promoter elements. We have revised the manuscript text accordingly to reflect this clarification and included the control in the supplementary information as Supplementary Figure 6.

      1. Were the fold-changes in the GFP experiments statistically significant? Based on Figure 6 it certainly looks like they are, but this should be spelled out, along with the test used.

      We thank the reviewer for pointing this out. We have revised Figure 6 to indicate significant differences between the test and control reporter constructs. We used the paired Student’s t-test on the matched plate/time point measurements. We also corrected for multiple testing using the Benjamini-Hochberg correction. As seen in the updated Figure 6A, 16 out of the 18 reporter constructs displayed significant differences (adjusted p-value
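      As a rough illustration of the testing scheme named in this reply, the sketch below computes a paired t statistic on matched measurements and applies a Benjamini-Hochberg adjustment to a list of p-values. It uses only the Python standard library; the function names and all numbers are hypothetical, and converting the t statistic to a p-value is left out (in practice a routine such as SciPy's `scipy.stats.ttest_rel` would typically be used for that step).

```python
import math
import statistics


def paired_t_statistic(variant, control):
    """Paired t statistic for matched measurements (e.g. same plate/time point)."""
    if len(variant) != len(control) or len(variant) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [v - c for v, c in zip(variant, control)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical fluorescence readings for one promoter pair.
t_stat = paired_t_statistic([2.0, 2.5, 3.0], [1.0, 1.0, 1.0])

# Hypothetical raw p-values, one per reporter construct.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
print(t_stat, adjusted)
```

      A construct would then be called significant when its adjusted p-value falls below the chosen false discovery rate.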

      1. What was the overall correlation between GWAS-based fold changes and those from the GFP-based validation? What does Figure 6A look like as a scatter plot comparing these two sets of values?

      We thank the reviewer for this helpful suggestion, which allows us to more closely look into the results of our in-vitro validation. We performed a direct comparison of RNAseq fold changes from the GWAS (x-axis) with the GFP reporter measurements (y-axis) as depicted in the figure above. The overall correlation between the two was weak (Pearson r = 0.17), reflecting the lack of thorough agreement between the associations and the reporter construct. We however note that the two metrics are not directly comparable in our opinion, since on the x-axis we are measuring changes in gene expression and on the y-axis changes in fluorescence expression, which is downstream from it. As mentioned above and in reply to a comment from reviewer 2, the agreement between measured gene expression and all other in-silico and in-vitro techniques increases when ignoring the direction of the change. Overall, we believe that these results partly validate our associations and predictions, while indicating that other factors in trans with the regulatory region contribute to changes in gene expression, which is to be expected. The scatter plot has been included as a new supplementary figure (Supplementary Figure 7).
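      The comparison described in this reply, a correlation between two sets of fold changes plus agreement with and without regard to direction, can be sketched as follows. The function names, the change threshold, and the toy log2 fold changes are assumptions chosen for illustration and are not the values from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def direction_agreement(x, y):
    """Fraction of pairs whose fold changes have the same sign."""
    return sum(1 for a, b in zip(x, y) if a * b > 0) / len(x)


def change_agreement(x, y, threshold=1.0):
    """Fraction of pairs that agree on whether |fold change| reaches the
    threshold, ignoring the direction of the change."""
    return sum(
        1 for a, b in zip(x, y) if (abs(a) >= threshold) == (abs(b) >= threshold)
    ) / len(x)


# Hypothetical log2 fold changes: RNA-seq (x) versus GFP reporter (y).
rnaseq_lfc = [2.0, -1.5, 3.0, -2.5, 0.2]
gfp_lfc = [1.8, 1.2, -2.0, -2.2, 0.1]

print(pearson_r(rnaseq_lfc, gfp_lfc))
print(direction_agreement(rnaseq_lfc, gfp_lfc))
print(change_agreement(rnaseq_lfc, gfp_lfc))
```

      In this toy data two of the five pairs disagree in sign, yet every pair agrees on whether a change occurred, which mirrors the pattern of higher agreement when direction is ignored.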

      1. Was the SNP analyzed in the last Results section significant in the gene expression GWAS? Did the DE results reported in this final section correspond to that GWAS in some way?

      The T>C SNP upstream of waaQ did not show a significant association with gene expression in our cis GWAS analysis. Instead, this variant was associated with resistance to tobramycin when referencing data from Danesh et al., and we observed the variant in our strain collection. We subsequently investigated whether this variant also influenced expression of the waa operon under sub-inhibitory tobramycin exposure. The differential expression results shown in the final section therefore represent a functional follow-up experiment, and not a direct replication of the GWAS presented in the first part of the manuscript.

      1. Line 470: "Consistent with the differences in the genetic structure of the two species" It is not clear what differences in genetic structure this refers to. Population structure? Genome architecture? Differences in the biology of regulatory regions?

      The awkwardness of that sentence is perhaps the consequence of our assumption that readers would be aware of the population genetics differences between the two species. We have however realized that little literature, if any, is available about these differences, which we have observed during the course of this and other studies we have carried out. As a result, we agree that we cannot assume that the reader is similarly familiar with these differences, and have changed that sentence (i.e. L548) to more directly address the differences between the two species, which will presumably result in a diverse population structure. We thank the reviewer for making us aware of a gap in the literature concerning the comparison of pangenome structures across relevant species.

      1. Line 480: the reference to "adaption" is not warranted, as the paper contains no analyses of evolutionary patterns or processes. Genetic variation is not the same as adaptation.

      We have amended this sentence to be more adherent to what we can conclude from our analyses.

      1. There is insufficient information on how the E. coli RNA-seq data was generated. How was RNA extracted? Which QC was done on the RNA; what was its quality? Which library kits were used? Which sequencing technology? How many reads? What QC was done on the RNA-seq data? For this section, the Methods are seriously deficient in their current form and need to be greatly expanded.

      We thank the reviewer for highlighting the need for clearer methodological detail. We have expanded this section (i.e. L608) to fully describe the generation and quality control of the E. coli RNA-seq data including RNA extraction and sequencing platform.

      1. How were the DEG p-values adjusted for multiple testing?

      As indicated in the methods section (“Differential gene expression and functional enrichment analysis”), we have used DESeq2 for E. coli, and LPEseq for P. aeruginosa. Both methods use the statistical framework of the False Discovery Rate (FDR) to compute an adjusted p-value for each gene. We have added a brief note to the methods stating that we followed the standard practice recommended by both software packages.

      1. Were there replicates for the E. coli strains? The methods do not say, but there is a hint there might have been replicates given their absence was noted for the other species.

      In the context of providing more information about the transcriptomics experiments for E. coli, we have also more clearly indicated that we have two biological replicates for the E. coli dataset.

      1. There needs to be more information on the "pattern-based method" that was used to correct the GWAS for multiple tests. How does this method work? What genome-wide threshold did it end up producing? Was there adjustment for the number of genes tested in addition to the number of variants? Was the correction done per variant class or across all variant classes?

      In line with an earlier comment from this reviewer, we have expanded the section in the Methods (e.g. L689) that explains how this correction worked to include as many details as possible, in order to provide the readers with the full context under which our analyses were carried out.

      1. For a paper that, at its core, performs a cis-eQTL mapping, it is an oversight that there seems not to be a single reference to the rich literature in this space, comprising hundreds of papers, in other species ranging from humans, many other animals, to yeast and plants.

      We thank both reviewer #1 and #3 for pointing out this lack of references to the extensive literature on the subject. We have added a number of references about the applications of eQTL studies, and specifically its application in microbial pangenomes, which we believe is more relevant to our study, in the introduction.

      Minor comments:

      1. I wasn't able to understand the top panels in Figure 4. For ulaE, most strains have the solid colors, and the corresponding bottom panel shows mostly red points. But for waaQ, most strains have solid color in the top panel, but only a few strains in the bottom panel are red. So solid color in the top does not indicate a variant allele? And why are there so many solid alleles; are these all indels? Even if so, for kgtP, the same colors (i.e., nucleotides) seem to seamlessly continue into the bottom, pale part of the top panel. How are these strains different genotypically? Are these blocks of solid color counted as one indel or several SNPs, or somehow as k-mer differences? As the authors can see, these figures are really hard to understand and should be reworked. The same comment applies to Figure 5, where it seems that all (!) strains have the "variant"?

      We thank the reviewer for pointing out some limitations with our visualizations, most importantly with the way we explained how to read those two figures. We have amended the captions to more explicitly explain what is shown. The solid colors in the “sequence pseudo-alignment” panels indicate the focal cis variant, which is indicated in red in the corresponding “predicted transcription rate” panels below. In the case of Figure 5, the solid color indicates instead the position of the TFBS in the reference.

      1. Figure 1A & B: It would be helpful to add the total number of analyzed genes somewhere so that the numbers denoted in the colored outer rings can be interpreted in comparison to the total.

      We have added the total number of genes being considered for either species in the legend.

      1. Figure 1C & D: It would be better to spell out the COG names in the figure, as it is cumbersome for the reader to have to look up what the letters stand for in a supplementary table in a separate file.

      While we do not disagree with the awkwardness of having to move to a supplementary table to identify the full name of a COG category, we also would like to point out that the very long names of each category would clutter the figure to a degree that would make it difficult to read. We had indeed attempted something similar to what the reviewer suggests in early drafts of this manuscript, leading to small and hard to read labels. We have therefore left the full names of each COG category in Supplementary Table 3.

      1. Line 107: "Similarly," does not fit here as the following example (with one differentially expressed gene in an operon) is conceptually different from the one before, where all genes in the operon were differentially expressed.

      We agree and have amended the sentence accordingly.

      1. Figure 5 bottom panel: it is odd that on the left the swarm plots (i.e., the dots) are on the inside of the boxplots while on the right they are on the outside.

      We have fixed the position of the dots so that they are centered with respect to the underlying boxplots.

      1. It is not clear to me how only one or a few genes in an operon can show differential mRNA abundance. Aren't all genes in an operon encoded by the same mRNA? If so, shouldn't this mRNA be up- or downregulated in the same manner for all genes it encodes? As I am not closely familiar with bacterial systems, it is well possible that I am missing some critical fact about bacterial gene expression here. If this is not an analysis artifact, the authors could briefly explain how this observation is possible.

      We thank the reviewer for their comment, which again echoes one of the main concerns from reviewer #2. As noted in our reply above, it has been established in multiple studies (see the three we have indicated above in our reply to reviewer #2) that bacteria encode multiple “non-canonical” transcriptional units (i.e. operons), due to the presence of accessory terminators and promoters. This, together with other biological effects, such as the presence of mRNA molecules of different lengths due to active transcription and degradation, and with technical noise induced by RNA isolation and sequencing, can result in variability in the estimation of abundance for each gene.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #3

      Evidence, reproducibility and clarity

      In this manuscript, Damaris et al. collected genome sequences and transcriptomes from isolates from two bacterial species. Data for E. coli were produced for this paper, while data for P. aeruginosa had been measured earlier. The authors integrated these data to detect genes with differential expression (DE) among isolates as well as cis-expression quantitative trait loci (cis-eQTLs). The authors used sample sizes that were adequate for an initial exploration of gene regulatory variation (n=117 for E. coli and n=413 for P. aeruginosa) and were able to discover cis eQTLs at about 39% of genes. In a creative addition, the authors compared their results to transcription rates predicted from a biophysical promoter model as well as to annotated transcription factor binding sites. They also attempted to validate some of their associations experimentally using GFP-reporter assays. Finally, the paper presents a mapping of antibiotic resistance traits. Many of the detected associations for this important trait group were in non-coding genome regions, suggesting a role of regulatory variation in antibiotic resistance. A major strength of the paper is that it covers an impressive range of distinct analyses, some of which in two different species. Weaknesses include the fact that this breadth comes at the expense of depth and detail. Some sections are underdeveloped, not fully explained and/or thought-through enough. Important methodological details are missing, as detailed below.

      Major comments:

      1. An interesting aspect of the paper is that genetic variation is represented in different ways (SNPs & indels, IRG presence/absence, and k-mers). However, it is not entirely clear how these three different encodings relate to each other. Specifically, more information should be given on these two points:

      2. it is not clear how "presence/absence of intergenic regions" are different from larger indels.

      3. I recommend providing more narration on how the k-mers compare to the more traditional genetic variants (SNPs and indels). It seems like the k-mers include the SNPs and indels somehow? More explanation would be good here, as k-mer based mapping is not usually done in other species and is not standard practice in the field. Likewise, how is multiple testing handled for association mapping with k-mers, since presumably each gene region harbors a large number of k-mers, potentially hugely increasing the multiple testing burden?

      4. What was the distribution of association effect sizes for the three types of variants? Did IRGs have larger effects than SNPs as may be expected if they are indeed larger events that involve more DNA differences? What were their relative allele frequencies?
      5. The GFP-based experiments attempting to validate the promoter effects for 18 genes are laudable, and the fact that 16 of them showed differences is nice. However, the fact that half of the validation attempts yielded effects in the opposite direction of what was expected is quite alarming. I am not sure this really "further validates" the GWAS in the way the authors state in line 292 - in fact, quite the opposite in that the validations appear random with regards to what was predicted from the computational analyses. How do the authors interpret this result? Given the higher concordance between GWAS, promoter prediction, and DE, are the GFP assays just not relevant for what is going on in the genome? If not, what are these assays missing? Overall, more interpretation of this result would be helpful.
      6. On the same note, it would be really interesting to expand the GFP experiments to promoters that did not show association in the GWAS. Based on Figure 6, effects of promoter differences on GFP reporters seem to be very common (all but three were significant). Is this a higher rate than for the average promoter with sequence variation but without detected association? A handful of extra reporter experiments might address this. My larger question here is: what is the null expectation for how much functional promoter variation there is?
      7. Were the fold-changes in the GFP experiments statistically significant? Based on Figure 6 it certainly looks like they are, but this should be spelled out, along with the test used.
      8. What was the overall correlation between GWAS-based fold changes and those from the GFP-based validation? What does Figure 6A look like as a scatter plot comparing these two sets of values?
      9. Was the SNP analyzed in the last Results section significant in the gene expression GWAS? Did the DE results reported in this final section correspond to that GWAS in some way?
      10. Line 470: "Consistent with the differences in the genetic structure of the two species" It is not clear what differences in genetic structure this refers to. Population structure? Genome architecture? Differences in the biology of regulatory regions?
      11. Line 480: the reference to "adaption" is not warranted, as the paper contains no analyses of evolutionary patterns or processes. Genetic variation is not the same as adaptation.
      12. There is insufficient information on how the E. coli RNA-seq data was generated. How was RNA extracted? Which QC was done on the RNA; what was its quality? Which library kits were used? Which sequencing technology? How many reads? What QC was done on the RNA-seq data? For this section, the Methods are seriously deficient in their current form and need to be greatly expanded.
      13. How were the DEG p-values adjusted for multiple testing?
      14. Were there replicates for the E. coli strains? The methods do not say, but there is a hint there might have been replicates given their absence was noted for the other species.
      15. There needs to be more information on the "pattern-based method" that was used to correct the GWAS for multiple tests. How does this method work? What genome-wide threshold did it end up producing? Was there adjustment for the number of genes tested in addition to the number of variants? Was the correction done per variant class or across all variant classes?
      16. For a paper that, at its core, performs a cis-eQTL mapping, it is an oversight that there seems not to be a single reference to the rich literature in this space, comprising hundreds of papers, in other species ranging from humans, many other animals, to yeast and plants.

      Minor comments:

      1. I wasn't able to understand the top panels in Figure 4. For ulaE, most strains have the solid colors, and the corresponding bottom panel shows mostly red points. But for waaQ, most strains have solid color in the top panel, but only a few strains in the bottom panel are red. So solid color in the top does not indicate a variant allele? And why are there so many solid alleles; are these all indels? Even if so, for kgtP, the same colors (i.e., nucleotides) seem to seamlessly continue into the bottom, pale part of the top panel. How are these strains different genotypically? Are these blocks of solid color counted as one indel or several SNPs, or somehow as k-mer differences? As the authors can see, these figures are really hard to understand and should be reworked. The same comment applies to Figure 5, where it seems that all (!) strains have the "variant"?
      2. Figure 1A & B: It would be helpful to add the total number of analyzed genes somewhere so that the numbers denoted in the colored outer rings can be interpreted in comparison to the total.
      3. Figure 1C & D: It would be better to spell out the COG names in the figure, as it is cumbersome for the reader to have to look up what the letters stand for in a supplementary table in a separate file.
      4. Line 107: "Similarly," does not fit here as the following example (with one differentially expressed gene in an operon) is conceptually different from the one before, where all genes in the operon were differentially expressed.
      5. Figure 5 bottom panel: it is odd that on the left the swarm plots (i.e., the dots) are on the inside of the boxplots while on the right they are on the outside.
      6. It is not clear to me how only one or a few genes in an operon can show differential mRNA abundance. Aren't all genes in an operon encoded by the same mRNA? If so, shouldn't this mRNA be up- or downregulated in the same manner for all genes it encodes? As I am not closely familiar with bacterial systems, it is well possible that I am missing some critical fact about bacterial gene expression here. If this is not an analysis artifact, the authors could briefly explain how this observation is possible.

      Significance

      To my knowledge, this work represents the first cis-eQTL mapping in bacteria. As such, it is a useful and interesting exploration of this space that complements the large body of literature on this question in eukaryotic systems. This expansion to bacterial systems is especially interesting given the unique features of bacterial compared to eukaryotic genomes, including a small (10-15%) noncoding fraction of the genome and gene organization in operons. The work will be of interest to readers in the fields of complex trait genetics, gene expression, and regulatory variation. For context of this assessment, I am an expert in genomics and the study of genetic variation in gene expression in eukaryotic systems. I have limited knowledge about bacterial genetics and biology, as well as of antibiotic resistance.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      In their manuscript "Cis non-coding genetic variation drives gene expression changes in the E. coli and P. aeruginosa pangenomes", Damaris and co-authors present an extensive meta-analysis, plus some useful follow up experiments, attempting to apply GWAS principles to identify the extent to which differences in gene expression between different strains within a given species can be directly assigned to cis-regulatory mutations. The overall principle, and the question raised by the study, is one of substantial interest, and the manuscript here represents a careful and fascinating effort at unravelling these important questions. I want to preface my review below (which may otherwise sound more harsh than I intend) with the acknowledgment that this is an EXTREMELY difficult and challenging problem that the authors are approaching, and they have clearly put in a substantial amount of high quality work in their efforts to address it. I applaud the work done here, I think it presents some very interesting findings, and I acknowledge fully that there is no one perfect approach to addressing these challenges, and while I will object to some of the decisions made by the authors below, I readily admit that others might challenge my own suggestions and approaches here. With that said, however, there is one fundamental decision that the authors made which I simply cannot agree with, and which in my view undermines much of the analysis and utility of the study: that decision is to treat both gene expression and the identification of cis-regulatory regions at the level of individual genes, rather than transcriptional units. Below I will expand on why I find this problematic, how it might be addressed, and what other areas for improvement I see in the manuscript:

      In the entire discussion from lines roughly 100-130, the authors frequently dissect out apparently differentially expressed genes from non-differentially expressed genes within the same operons... I honestly wonder whether this is a useful distinction. I understand that by the criteria set forth by the authors it is technically correct, and yet, I wonder if this is more due to thresholding artifacts (i.e., some genes passing the authors' reasonable-yet-arbitrary thresholds whereas others in the same operon do not), and in the process causing a distraction from an operon that is in fact largely moving in the same direction. The authors might wish to either aggregate data in some way across known transcriptional units for the purposes of their analysis, and/or consider a more lenient 'rescue' set of significance thresholds for genes that are in the same operons as differentially expressed genes. I would favor the former approach, performing virtually all of their analysis at the level of transcriptional units rather than individual genes, as much of their analysis in any case relies upon proper assignment of genes to promoters, and this way they could focus on the most important signals rather than sometimes get lost in the weeds of looking at every single gene when really what they seem to be looking at in this paper is a property OF THE PROMOTERS, not the genes. (Of course there are phenomena, such as rho-dependent termination specifically titrating expression of late genes in operons, but I think on balance the operon-level analysis might provide more insights and a cleaner analysis and discussion.)
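      One way to picture the aggregation suggested above is to sum per-gene read counts within each annotated transcriptional unit before any differential expression testing. The sketch below is a minimal illustration under stated assumptions: the function name, the gene-to-unit mapping, and the counts are hypothetical, and genes without an annotated unit are simply treated as their own unit.

```python
def aggregate_counts_by_unit(gene_counts, gene_to_unit):
    """Sum per-sample read counts of genes that share a transcriptional unit.

    gene_counts: dict mapping gene name -> list of counts, one per sample.
    gene_to_unit: dict mapping gene name -> transcriptional unit name.
    Genes missing from gene_to_unit are kept as single-gene units.
    """
    unit_counts = {}
    for gene, counts in gene_counts.items():
        unit = gene_to_unit.get(gene, gene)
        if unit not in unit_counts:
            unit_counts[unit] = list(counts)
        else:
            unit_counts[unit] = [a + b for a, b in zip(unit_counts[unit], counts)]
    return unit_counts


# Hypothetical counts for three samples; the unit label is illustrative only.
toy_gene_counts = {
    "ulaA": [120, 95, 300],
    "ulaE": [80, 60, 210],
    "waaQ": [40, 42, 39],
}
toy_gene_to_unit = {"ulaA": "ula_unit", "ulaE": "ula_unit"}

toy_unit_counts = aggregate_counts_by_unit(toy_gene_counts, toy_gene_to_unit)
print(toy_unit_counts)
```

      The unit-level table could then be passed to the same differential expression workflow, with the region upstream of the first gene of each unit used for the cis association tests.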

      This also leads to a more general point, however, which I think is potentially more deeply problematic. At the end of the day, all of the analysis being done here centers on the cis regulatory logic upstream of each individual open reading frame, even though in many cases (i.e., genes after the first one in multi-gene operons), this is not where the relevant promoter is. This problem, in turn, potentially raises misattributions of causality running in both directions, where the causal impact of a bona fide promoter mutation on many genes in an operon may only be associated with the first gene, or on the other side, where a mutation that co-occurs with, but is causally independent from, an actual promoter mutation may be flagged as the one driving an expression change. This becomes an especially serious issue in cases like ulaE, for genes that are not the first gene in an operon (at least according to standard annotations, the UlaE transcript should be part of a polycistronic mRNA beginning from the ulaA promoter), and the role played by cis-regulatory logic immediately upstream of ulaE is uncertain and certainly merits deeper consideration. I suspect that many other similar cases likewise lurk in the dataset used here (perhaps even more so for the Pseudomonas data, where the operon definitions are likely less robust). Of course there are many possible explanations, such as a separate ulaE promoter only in some strains, but this should perhaps be carefully stated and explored, and seems likely to be the exception rather than the rule. Another issue with the current definition of regulatory regions, which should perhaps also be accounted for, is that it is likely that for many operons, the 'regulatory regions' of one gene might overlap the ORF of the previous gene, and in some cases actual coding mutations in an upstream gene may contaminate the set of potential regulatory mutations identified in this dataset.
Taken together, I feel that all of the above concerns need to be addressed in some way. At the absolute barest minimum, the authors need to acknowledge the weaknesses that I have pointed out in the definition of cis-regulatory logic at a gene level. I think it would be far BETTER if they performed a re-analysis at the level of transcriptional units, which I think might substantially strengthen the work as a whole, but I recognize that this would also constitute a substantial amount of additional effort. Having reached the end of the paper, and considering the evidence and arguments of the authors in their totality, I find myself wondering how much local x background interactions - that is, the effects of cis regulatory mutations (like those being considered here, with or without the modified definitions that I proposed above) IN THE CONTEXT OF A PARTICULAR STRAIN BACKGROUND, might matter more than the effects of the cis regulatory mutations per se. This is a particularly tricky problem to address because it would require a moderate number of targeted experiments with a moderate number of promoters in a moderate number of strains (which of course makes it maximally annoying since one can't simply scale up hugely on either axis individually and really expect to tease things out). I think that trying to address this question experimentally is FAR beyond the scope of the current paper, but I think perhaps the authors could at least begin to address it by acknowledging it as a challenge in their discussion section, and possibly even identify candidate promoters that might show the largest divergence of activities across strains when there IS no detectable cis regulatory mutation (which might be indicative of local x background interactions), or those with the largest divergences of effect for a given mutation across strains. 
A differential expression model incorporating shrinkage is essential in such analysis to avoid putting too much weight on low expression genes with a lot of Poisson noise.
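
To make the point about shrinkage concrete, here is a deliberately simplified sketch of how a noisy log fold change is pulled toward zero more strongly than a precise one. It assumes a zero-centred normal prior with a fixed variance (the `prior_var` value is an arbitrary assumption) and is not the estimator used by any particular differential expression package.

```python
def shrink_lfc(lfc, se, prior_var=1.0):
    """Posterior mean of a log fold change under a zero-centred normal prior.

    lfc       : observed log fold change
    se        : standard error of the observed log fold change
    prior_var : assumed prior variance of true log fold changes (illustrative)
    """
    # Weight approaches 1 for precise estimates and 0 for very noisy ones.
    weight = prior_var / (prior_var + se ** 2)
    return weight * lfc


# Two hypothetical genes with the same observed LFC but different noise levels.
well_measured = shrink_lfc(2.0, se=0.1)   # high-count gene, small standard error
poorly_measured = shrink_lfc(2.0, se=1.0) # low-count gene, large standard error
```

The low-count gene ends up with a much smaller effect estimate than the well-measured one, which is the behaviour that keeps Poisson noise from dominating a ranking of differentially expressed genes.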

      I also have some more minor concerns and suggestions, which I outline below: It seems that the differential expression analysis treats the lab reference strains as the 'centerpoint' against which everything else is compared, and yet I wonder if this is the best approach... it might be interesting to see how the results differ if the authors instead take a more 'average' strain (either chosen based on genetics or transcriptomics) as a reference and compared everything else to that.

      Line 104 - the statement about the differentially expressed genes being "part of operons with diverse biological functions" seems unclear - it is not apparent whether the authors are referring to diversity of function within each operon, or between the different operons, and in any case one should consider whether the observation reflects any useful information or is just an apparently random collection of operons. Line 292 - I find the argument here somewhat unconvincing, for two reasons. First, the fact that only half of the observed changes went in the same direction as the GWAS results would indicate, which is trivially a result that would be expected by random chance, does not lend much confidence to the overall premise of the study that there are meaningful cis regulatory changes being detected (in fact, it seems to argue that the background in which a variant occurs may matter a great deal, at least as much as the cis regulatory logic itself). Second, in order to even assess whether the GWAS is useful to "find the genetic determinants of gene expression changes" as the authors indicate, it would be necessary to compare to a reasonable, non-straw-man, null approach simply identifying common sequence variants that are predicted to cause major changes in sigma 70 binding at known promoters; such a test would be especially important given the lack of directional accuracy observed here. Along these same lines, it is perhaps worth noting, in the discussion beginning on line 329, that the comparison is perhaps biased in favor of the GWAS study, since the validation targets here were prioritized based on (presumably strong) GWAS data.

      I don't find the Venn diagrams in Fig 7C-D useful or clear given the large number of zero-overlap regions, and would strongly advocate that the authors find another way to show these data.

      In the analysis of waa operon gene expression beginning on line 400, it is perhaps important to note that most of the waa operon doesn't do anything in laboratory K12 strains due to the lack of complete O-antigen... the same is not true, however, for many wild/clinical isolates. It would be interesting to see how those results compare, and also how the absolute TPMs (rather than just LFCs) of genes in this operon vary across the strains being investigated during TOB treatment.

      I don't think that the second conclusion on lines 479-480 is fully justified by the data, given both the disparity in available annotation information between the two species, AND the fact that only two species were considered.

      Line 118: "Double of DEGs"

      Line 288 - presumably these are LOG fold changes

      Fig 6b - legend contains typos

      Line 661 - please report the read count (more relevant for RNA-seq analysis) rather than Gb

      Source code - I appreciate that the authors provide their source code on github, but it is very poorly documented - both a license and some top-level documentation about which code goes with each major operation/conclusion/figure should be provided. Also, ipython notebooks are in general a poor way in my view to distribute code, due to their encouragement of nonlinear development practices; while they are fine for software development, actual complete python programs along with accompanying source data would be preferable.

      Significance

      Overall the key strength of the study is the heroic merging of large genetic and transcriptomic datasets to address the question of how much variation in gene expression can be assigned to cis regulatory mutations in E. coli and in P. aeruginosa. The authors find that only a minority of genes can have such an assignment explaining expression variation, which highlights both the many factors (local and global) impacting gene expression, and the difficulty in trying to predict and understand expression patterns in different strains. I believe that with suitable modification, the manuscript will be of great interest to a broad audience interested in bacterial genomics, gene regulation, and systems/synthetic biology.

      Reviewer Expertise: I consider myself a bacterial systems biologist and routinely use high throughput experiments to understand bacterial gene regulation.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      Damaris et al. perform what is effectively an eQTL analysis on microbial pangenomes of E. coli and P. aeruginosa. Specifically, they leverage a large dataset of paired DNA/RNA-seq information for hundreds of strains of these microbes to establish correlations between genetic variants and changes in gene expression. Ultimately, their claim is that this approach identifies non-coding variants that affect expression of genes in a predictable manner and explain differences in phenotypes. They attempt to reinforce these claims through use of a widely regarded promoter calculator to quantify promoter effects, as well as some validation studies in living cells. Lastly, they show that these non-coding variations can explain some cases of antibiotic resistance in these microbes.

      Major comments

      Are the claims and the conclusions supported by the data or do they require additional experiments or analyses to support them?

      The authors convincingly demonstrate that they can identify non-coding variation in pangenomes of bacteria and associate these with phenotypes of interest. What is unclear is the extent to which they account for covariation among genetic variants. Are the SNPs they implicate truly responsible for the changes in expression they observe? Or are they merely genetically linked to the true causal variants? This has been solved by other GWAS studies but isn't discussed here as far as I can tell.

      They need to justify why they consider the 30bp downstream of the start codon as non-coding. While this region certainly has regulatory impact, it is also definitely coding. To what extent could this confound results and how many significant associations to expression are in this region vs upstream?

      The claim that promoter variation correlates with changes in measured gene expression is not convincingly demonstrated (although, yes, very intuitive). Figure 3 is a convoluted way of demonstrating that predicted transcription rates correlate with measured gene expression. For each variant, can you do the basic analysis of just comparing differences in promoter calculator predictions and actual gene expression? I.e. correlation between (promoter activity variant X)-(promoter activity variant Y) vs (measured gene expression variant X)-(measured gene expression variant Y). You'll probably have to

      Figure 7: it is unclear what this experiment was. How were they tested? Did you generate the data yourselves? Did you do RNA-seq (which is what is described in the methods) or just test and compare known genomic data?

      Are the data and the methods presented in such a way that they can be reproduced?

      No, this is the biggest flaw of the work. The RNA-Seq experiment that started this project is not described at all, nor are other key experiments. Descriptions of methods in the text are far too vague to understand the approach or rationale at many points in the text. The scripts are available on github but there is no description of what they correspond to beyond the file names, and none of the data files needed to replicate the plots can be found.

      Figure 8B is intended to show that the WaaQ operon is connected to known Abx resistance genes but uses the STRING method. This requires a list of genes, but how did they build this list? Why look at these known Abx genes in particular? STRING does not really provide evidence; these connections need to be substantiated, or the authors at least need to justify why this analysis was performed.

      Are the experiments adequately replicated and statistical analysis adequate?

      An important claim on MIC of variants for supplementary table 8 has no raw data and no clear replicates available. Only figure 6, the in vitro testing of variant expression, mentions any replicates.

      Minor comments

      Specific experimental issues that are easily addressable. Are prior studies referenced appropriately?

      There should be a discussion of eQTLs in this manuscript, although these have mostly been studied in eukaryotes: https://doi.org/10.1038/s41588-024-01769-9 ; https://doi.org/10.1038/nrg3891

      Line 67. Missing important citation for Ireland et al. 2020 https://doi.org/10.7554/eLife.55308

      Line 69. Should mention Johns et al. 2018 (https://doi.org/10.1038/nmeth.4633) where they study promoter sequences outside of E. coli

      Line 90 - replace 'hypothesis-free' with unbiased

      Line 102 - state % of DEGs relative to the entire pan-genome

      Figure 1A is not discussed in the text

      Line 111: it is unclear what the enrichment was being compared between. Figures 1C/D have 'Gene counts', but is this out of the total DEGs? How is the p-value derived, and what statistical test was performed? Is DEG enrichment being compared against the pangenome? The K12 genome?

      Line 122-123: State what letters correspond to these COG categories here

      Line 155: Need to clarify how you use k-mers in this and how they are different from SNPs. Are you looking at k-mer content of these regions? K-mers up to hexamers or what? How are these compared? You can't just say we used k-mers.

      Line 172: It would be VERY helpful to have a supplementary figure describing these types of variants, perhaps a multiple-sequence alignment containing each example

      Figure 4: This figure is too small. Why are WaaQ and UlaE being used as examples here when you are supposed to be explicitly showing variants with strong positive correlations?

      Figure 4: Why is there variation between variants present and variant absent? Is this due to other changes in the variant? Should mention this in the text somewhere

      Line 359: Need to talk about each supplementary figure 4 to 9 and how they demonstrate your point.

      Are the text and figures clear and accurate?

      Figure 4 too small

      Acronyms are defined multiple times in the manuscript, sometimes not the first time they are used (e.g. SNP, InDel)

      Figure 8A - Remove red box, increase label size

      Figure 8B - Low resolution, grey text is unreadable and should be darker and higher resolution

      Line 35 - be more specific about types of carbon metabolism and catabolite repression

      Line 67 - include citation for Ireland et al. 2020 https://doi.org/10.7554/eLife.55308

      Line 74 - You talk about looking in cis but don't specify how far away cis is

      Line 75 - we encoded genetic variants..... It is unclear what you mean here

      Line 104 - 'were apart of operons' should clarify you mean polycistronic or multi-gene operons. Single genes may be considered operonic units as well.

      Figure 2: There is no axis for the percents and the percents don't make sense relative to the bars they represent??

      Figure 2: Figure 2B legend should clarify that these are individual examples of differential expression between variants

      Line 198-199: This sentence doesn't make sense, 'encoded using kmers' is not descriptive enough

      Line 205: Should be upfront that you're using the Promoter Calculator, which models biophysical properties of promoter sequences to predict activity.

      Line 251: 'Scanned the non-coding sequences of the DEGs'. This is far too vague a description of an approach. Need to clarify how you did this; I didn't see it in the methods. Is this an HMM? Perfect sequence match to consensus sequence? Some type of alignment?

      Line 257-259: This sentence lacks clarity

      Line 346: How were the E. coli isolates tested? Was this an experiment you did? This is a massive undertaking (1600 isolates * 12 conditions) if so, so it should be clearly defined

      Figure 6A: The tile plot on the right side is not clearly labeled and it is unclear what it is showing and how that relates to the bar plots.

      Figure 6B: typo in legend 'Downreglation'

      Line 398: Need to state rationale for why the WaaQ operon is being investigated here. Why did you look into an individual example?

      Figure 8: Can get rid of red box

      Line 463 - 'account for all kinds' is too informal

      Mix of font styles throughout document

      Significance

      Provide contextual information to readers (editors and researchers) about the novelty of the study, its value for the field and the communities that might be interested. The following aspects are important:

      General assessment: provide a summary of the strengths and limitations of the study. What are the strongest and most important aspects? What aspects of the study should be improved or could be developed?

      This study applies eQTL concepts to bacterial pangenomes to understand how genetic variation in non-coding regions contributes to microbial phenotypes, which is clever and has not been done in bacterial communities (although has been done in yeast isolates, see citation above). They characterize these same variants using in silico promoter predictions, in vitro experiments, layer biological mechanism via transcription factor binding site mapping, and study associated antibiotic resistance phenotypes. These are all good ideas, but none of these points are very developed. The antibiotic work in particular was a missed opportunity as this is the most impactful demonstration of their approach. For instance, to what extent do these eQTLs explain resistance across isolates vs coding changes? Are non-coding variants more responsible for antibiotic resistance than coding variants? Given how easy it is to adapt gene expression vs establishing other mechanisms, this is plausible. How could knowing this change how we treat infections? While a general overview of their strategy is fine, the approaches are under-described and unclear so difficult to truly assess.

      Advance: compare the study to the closest related results in the literature or highlight results reported for the first time to your knowledge; does the study extend the knowledge in the field and in which way? Describe the nature of the advance and the resulting insights (for example: conceptual, technical, clinical, mechanistic, functional,...).

      To my knowledge and from a cursory search, this is the first pan-genome mapping of non-coding variants to transcriptional changes in bacteria. This is a good idea that could be applied to any microbe for which large transcriptomic datasets of strains are available or could be generated and is helpful for understanding genetic variation and the architecture of bacterial regulatory systems.

      Audience: describe the type of audience ("specialized", "broad", "basic research", "translational/clinical", etc...) that will be interested or influenced by this research; how will this research be used by others; will it be of interest beyond the specific field?

      This would be of interest to individuals interested in population genetics, gene regulation, and microbial evolution. It could inspire similar studies of other microbes to understand the contribution of non-coding changes to phenotypes across whole genomes.

      Please define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      I am an expert on bacterial gene regulation, especially concerning how promoter activity is encoded within sequences. I have less experience on using GWAS.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      In our manuscript, we describe a role for the nuclear mRNA export factor UAP56 (a helicase) during metamorphic dendrite and presynapse pruning in flies. We characterize a UAP56 ATPase mutant and find that it rescues the pruning defects of a uap56 mutant. We identify the actin severing enzyme Mical as a potentially crucial UAP56 mRNA target during dendrite pruning and show alterations at both the mRNA and protein level. Finally, loss of UAP56 also causes presynapse pruning defects with actin abnormalities. Indeed, the actin disassembly factor cofilin is required for pruning specifically at the presynapse.

      We thank the reviewers for their constructive comments, which we tried to address experimentally as much as possible. To summarize briefly, while all reviewers saw the results as interesting (e.g., Reviewer 3's significance assessment: "Understanding how post-transcriptional events are linked to key functions in neurons is important and would be of interest to a broad audience") and generally methodologically strong, they thought that our conclusions regarding the potential specificity of UAP56 for Mical mRNA were not fully supported by the data. To address this criticism, we added more RNAi analyses of other mRNA export factors and rephrased our conclusions towards a more careful interpretation, i.e., we now state that the pruning process is particularly sensitive to loss of UAP56. In addition, reviewer 1 had technical comments regarding some of our protein and mRNA analyses. We added more explanations and an additional control for the MS2/MCP system. Reviewers 2 and 3 wanted to see a deeper characterization of the ATPase mutant provided. We generated an additional UAP56 mutant transgene, improved our analyses of UAP56 localization, and added a biochemical control experiment. We hope that our revisions make our manuscript suitable for publication.

      1. Point-by-point description of the revisions

      This section is mandatory. Please insert a point-by-point reply describing the revisions that were already carried out and included in the transferred manuscript.


      Comments by reviewer 1.

      Major comments

      1.

      For Figure 4, the MS2/MCP system is not quantitative. Using this technique, it is impossible to determine how many RNAs are located in each "dot". Each of these dots looks quite large and likely corresponds to some phase-separated RNP complex where multiple RNAs are stored and/or transported. Thus, these data do not support the conclusion that Mical mRNA levels are reduced upon UAP56 knockdown. A good quantitative microscopic assay would be something like smFISH. Additionally, the localization of Mical mRNA dots to dendrites is not convincing, as in regions where there are dendritic swellings the background generally looks brighter.

      Our response

      We indeed found evidence in the literature that mRNPs labeled with the MS2/MCP or similar systems form condensates (Smith et al., JCB 2015). Unfortunately, smFISH is not established for this developmental stage and would likely be difficult due to the presence of the pupal case. To address whether the Mical mRNPs in control and UAP56 KD neurons are comparable, we characterized the MCP dots in the respective neurons in more detail and found that their sizes did not differ significantly between control and UAP56 KD neurons. To facilitate interpretability, we also increased the individual panel sizes and included larger panels that only show the red (MCP::RFP) channel. We think these changes improved the figure. Thanks for the insight.

      Changes introduced: Figure 5 (former Fig. 4): Increased panel size for MCP::RFP images, left out GFP marker for better visibility. Added new analysis of MCP::RFP dot size (new Fig. 5 I).

      1.

      Alternatively, levels of Mical mRNA could be verified by qPCR in the larval brain following pan-neuronal UAP56 knockdown or in FACS-sorted fluorescently labeled da sensory neurons. Protein levels could be analyzed using a similar approach.

      Our response

      We thank the reviewer for this comment. Unfortunately, these experiments are not doable as neuron-wide UAP56 KD is lethal (see Flybase entry for UAP56). From our own experience, FACS-sorting of c4da neurons would be extremely difficult as the GFP marker fluorescence intensity of UAP56 KD neurons is weak - this would likely result in preferential sorting of subsets of neurons with weaker RNAi effects. In addition, FACS-sorting whole neurons would not discriminate between nuclear and cytoplasmic mRNA.

      The established way of measuring protein content in the Drosophila PNS system is immunofluorescence with strong internal controls. In our case, we also measured Mical fluorescence intensity of neighboring c1da neurons that do not express the RNAi and show expression levels as relative intensities compared to these internal controls. This procedure rules out the influence of staining variation between samples and is used by other labs as well.

      1.

      In Figure 5, the authors state that Mical expression could not be detected at 0 h APF. The data presented in Fig. 5C, D suggest the opposite as there clearly is some expression. Moreover, the data shown in Fig. 5D looks significantly brighter than the Orco dsRNA control and appears to localize to some type of cytoplasmic granule. So the expression of Mical does not look normal.

      Our response

      We thank the reviewer for this comment. In the original image in Fig. 5 C, the c4da neuron overlaps with the dendrite from a neighboring PNS neuron (likely c2da or c3da). The latter neuron shows strong Mical staining. We agree that this image is confusing and exchanged this image for another one from the same genotype.

      Changes introduced: Figure 5 L (former Fig. 5 C): Exchanged panel for image without overlap from other neuron.

      1.

      Sufficient data are not presented to conclude any specificity in mRNA export pathways. Data is presented for one export protein (UAP56) and one putative target (Mical). To adequately assess this, the authors would need to do RNA-seq in UAP56 mutants.

      Our response

      We thank the reviewer for this comment. To address this, we tested whether knockdown of three other mRNA export factors (NXF1, THO2, THOC5) causes dendrite pruning defects, which was not the case (new Fig. S1). While these data are consistent with specific mRNA export pathways, we agree that they are not proof. We therefore toned down our interpretation and removed the conclusion about specificity. Instead, we now use the more neutral term "increased sensibility (to loss of UAP56)".

      Changes introduced: Added new Figure S1: RNAi analyses of NXF1, THO2 and THOC5 in dendrite pruning. Introduced concluding sentence at the end of first Results paragraph: We conclude that c4da neuron dendrite pruning is particularly sensitive to loss of UAP56. (p. 6)

      1.

      In summary, better quantitative assays should be used in Figures 4 and 5 in order to draw conclusions about the expression levels of either mRNA or protein. In its current form, this study demonstrates the novel finding that UAP56 regulates dendrite and presynaptic pruning, potentially via regulation of the actin cytoskeleton. However, these data do not convincingly demonstrate that UAP56 controls these processes by regulating Mical expression and definitely not by controlling export from the nucleus.

      Our response

      We hope that the changes we introduced above help clarify this.

      1.

      While there are clearly dendrites shown in Fig. 1C', the cell body is not readily identifiable. This makes it difficult to assess attachment and suggests that the neuron may be dying. This should be replaced with an image that shows the soma.

      Our response

      We thank the reviewer for this comment. Changes introduced: we replaced the picture in the panel with one where the cell body is more clearly visible.

      1.

      The level of knockdown in the UAP56 RNAi and P element insertion lines should be determined. It would be useful to mention the nature of the RNAi lines (long/short hairpin). Some must be long since Dcr has been co-expressed. Another issue raised by this is the potential for off-target effects. shRNAi lines would be preferable because these effects are minimized.

      Our response

      We thank the reviewer for this comment. Assessment of knockdown efficiency is a control to make sure the manipulations work the way they are intended to. As mRNA isolation from Drosophila PNS neurons is extremely difficult, RNAi or mutant phenotypes in this system are controlled by performing several independent manipulations of the same gene. In our case, we used two independent RNAi lines (both long hairpins from VDRC/Bloomington and an additional insertion of the VDRC line, see Table S1) as well as a mutant P element in a MARCM experiment, i.e., a total of three independent manipulations that all cause pruning defects, and the VDRC RNAi lines do not have any predicted off-targets (not known for the Bloomington line). Had any of these manipulations not matched, we would have generated sgRNA lines for CRISPR to confirm.

      Minor comments:

      1.

      The authors should explain what EB1:GFP is marking when introduced in the text.


      Our response

      We thank the reviewer for this comment. Changes introduced: we now explain the EB1::GFP assay.

      1.

      The da neuron images throughout the figures could be a bit larger.

      Our response

      We thank the reviewer for this comment. Changes introduced: we changed the figure organization to be able to use larger panels:

      • the pruning analysis of the ATPase mutations (formerly Fig. 2) is now its own figure (Figure 3).

      • we increased the panel sizes of the MCP::RFP images (Figure 5 A - I, formerly Fig. 4).

      Reviewer #1 (Significance (Required)):

      Strengths:

      The methodology used to assess dendrite and presynaptic pruning is strong and the phenotypic analysis is conclusive.

      Our response

      We thank the reviewer for this comment.

      Weakness:

      The evidence demonstrating that UAP56 regulates the expression of Mical is unconvincing. Similarly, no data is presented to show that there is any specificity in mRNA export pathways. Thus, these major conclusions are not adequately supported by the data.

      Our response

      We hope the introduced changes address this comment.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this paper, the authors describe dendrite pruning defects in c4da neurons in the DEXD box ATPase UAP56 mutant or in neuronal RNAi knockdown. Overexpression of UAP56::GFP or of UAP56::GFPE194Q, which lacks ATPase activity, can rescue dendrite pruning defects in the UAP56 mutant. They further characterized the mis-localization of UAP56::GFPE194Q and its binding to nuclear export complexes. Both microtubules and the ubiquitin-proteasome system are intact in UAP56RNAi neurons. However, they suggest a specific effect on MICAL mRNA nuclear export, shown by using the MS2-MCP system, resulting in a delay of MICAL protein expression in pruned neurons. Furthermore, the authors show that UAP56 is also involved in presynaptic pruning of c4da neurons in the VNC, and Mical and actin are also required for actin disassembly in presynapses. They propose that UAP56 is required for dendrite and synapse pruning through actin regulation in Drosophila. Following are my comments.

      Major comments

      1.

      The result that UAP56::GFPE194Q rescues the mutant phenotype while the protein is largely mis-localized suggests a novel mechanism or, as the authors suggested, rescue from a combination of residual activities. The latter possibility requires further support, which is important to support the role of mRNA export in dendrite and pre-synapse pruning. One approach would be to examine whether other export components like REF1 and NXF1 show similar mutant phenotypes. Alternatively, depleting residual activity, for example by using null mutant alleles or combining more copies of RNAi transgenes, could help.

      Our response

      We thank the reviewer for this comment. We agree that the mislocalization phenotype is interesting and could inform further studies on the mechanism of UAP56. To further investigate this and to exclude that this could represent a gain-of-function due to the introduced mutation, we made and characterized a new additional transgene, UAP56::GFP E194A. This mutant shows largely the same phenotypes as E194Q, with enhanced interactions with Ref1 and partial mislocalization to the cytoplasm. In addition, we tested whether knockdown of THO2, THOC5 or NXF1 causes pruning defects, which was not the case.

      Changes introduced:

      • added new Figure S1: RNAi analyses of NXF1, THO2 and THOC5 in dendrite pruning.

      • made and characterized a new transgene UAP56 E194A (new Fig. 2 B, E, E', 3 C, C', E, F).

      1.

      The localization of UAP56::GFP (and E194Q) should be analyzed in more detail. It is not clear whether the images in Fig. 2A and 2B are from single confocal sections or merged multiple sections. The localization of UAP56::GFP to the nuclear periphery is not clear, and the existence of the E194Q derivative in both nucleus and cytosol (or whether there is still some peripheral enrichment) is not clear if the images are stacked.

      Our response

      We thank the reviewer for this comment. It is correct that the profiles in the old Figure 2 were from single confocal sections from the displayed images. As it was difficult to create good average profiles with data from multiple neurons, we now introduce an alternative quantification based on categories (nuclear versus dispersed) which includes data from several neurons for each genotype, including the new E194A transgene (new Fig 3 G). Upon further inspection, the increase at the nuclear periphery was not always visible and may have been a misinterpretation. We therefore removed this statement.

      Changes introduced:

      • added new quantitative analysis of UAP56 wt and E/A, E/Q mutant localization (new Fig 3 G).

      3.

      The Ub-VV-GFP is a new reagent, and its use to detect active proteasomal degradation is by the lack of GFP signals, which could also be due to the lack of expression. The use of Ub-QQ-GFP cannot confirm the expression of Ub-VV-GFP. The proteasomal subunit RPN7 has been shown to be a prominent component in the dendrite pruning pathway (Development 149, dev200536). Immunostaining using RPN7 antibodies to measure the RPN7 expression level could be a direct way to address the issue of whether the proteasomal pathway is affected or not.

      Our response

      We thank the reviewer for this comment. We agree that it is wise to not only introduce a positive control for the Ub-VV-GFP sensor (the VCP dominant-negative VCP QQ), but also an independent control. As mutants with defects in proteasomal degradation accumulate ubiquitinated proteins (see, e. g., Rumpf et al., Development 2011), we stained controls and UAP56 KD neurons with antibodies against ubiquitin and found that they had similar levels (new Fig. S3).

      Changes introduced:

      • added new ubiquitin immunofluorescence analysis (new Fig. S3).

      4.

      Using the MS2/MCP system to detect the export of MICAL mRNA is a nice approach to confirm the UAP56 activity; lack of UAP56 by RNAi knockdown delays the nuclear export of MS2-MICAL mRNA. The rescue experiment by UAS transgenes could not be performed due to the UAS gene dosage, as suggested by the authors. However, this MS2-MICAL system is also a good assay for the requirement of UAP56 ATPase activity (absent in the E194Q mutant) in this process. Could the authors use MARCM (thus reducing the use of UAS-RNAi transgenes) for the rescue experiment? Also, the c4da neuronal marker UAS-CD8-GFP used in Fig. 4 could be replaced by a marker gene directly fused to the ppk promoter, which can save a copy of a UAS transgene. The results from the rescue experiment would test the dependence on ATPase activity of the nuclear export of MICAL mRNA.

      Our response

      We thank the reviewer for this comment. This is a great idea but unfortunately, this experiment was not feasible due to the (rare) constraints of Drosophila genetics. The MARCM system with rescue already occupies all available chromosomes (X: FLPase, 2nd: FRT, GAL80 + mutant, 3rd: GAL4 + rescue construct), and we would have needed to introduce three additional ones (MCP::RFP and two copies of unmarked genomic MICAL-MS2, all on the third chromosome) that would have needed to be introduced by recombination. Any Drosophilist will see that this is an extreme, likely undoable project :-(

      5.

      The UAP56 is also involved in presynaptic pruning through regulating actin assembly, and the authors suggest that Mical and cofilin are involved in the process. However, direct observation of lifeact::GFP in Mical or cofilin RNAi knockdown is important to support this conclusion.

      Our response

      We thank the reviewer for this comment. In response, we analyzed the lifeact::GFP patterns of control and cofilin knockdown neurons and found that loss of cofilin also leads to actin accumulation (new Fig. 7 I, J).

      Changes introduced:

      • new lifeact analysis (new Fig. 7 I, J).

      Minor comments:

      1.

      RNA localization is important for dendrite development in larval stages (Brechbiel JL, Gavis ER. Curr Biol. 20;18(10):745-750). Yet, the role of UAP56 is relatively specific and shown only in later-stage pruning. It would need thorough discussion.


      Our response

      We thank reviewer 2 for this comment. We added the following paragraph to the discussion: "UAP56 has also been shown to affect cytoplasmic mRNA localization in Drosophila oocytes (Meignin and Davis, 2008), opening up the possibility that nuclear mRNA export and cytoplasmic transport are linked. It remains to be seen whether this also applies to dendritic mRNA transport (Brechbiel and Gavis, 2008)." (p.13)

      2.

      Could the authors elaborate on the possible upstream regulators that might be involved, as described in "alternatively, several cofilin upstream regulators have been described (Rust, 2015) which might also be involved in presynapse pruning and subject to UAP56 regulation" in the Discussion?

      Our response

      We thank reviewer 2 for this comment. In the corresponding paragraph, we cite as example now that cofilin is regulated by Slingshot phosphatases and LIM kinase (p.14).

      3.

      In the Discussion, the role of cofilin in pre- and post-synaptic processes was described. The role of Tsr/Cofilin in regulating actin behaviors in dendrite branching, which has been described in c3da and c4da neurons (Nithianandam and Chien, 2018 and other references), should be included in the Discussion.

      Our response

      We thank reviewer 2 for this comment. In response we tested whether cofilin is required for dendrite pruning and found that this, in contrast to Mical, is not the case (new Fig. S6). We cite the above paper in the corresponding results section (p.12).

      Changes introduced:

      • new cofilin dendrite pruning analysis (new Fig. S6).

      • added cofilin reference in Results.

      4.

      In the Discussion, the authors speculate that distinct actin structures have to be disassembled in dendrite and presynapse pruning. The possible actin structures at both sites could be elaborated.

      Our response

      We thank reviewer 2 for this comment. In response, we specify in the Discussion: "As Mical is more effective in disassembling bundled F-actin than cofilin (Rajan et al., 2023), it is interesting to speculate that such bundles are more prevalent in dendrites than at presynapses." (p.14)

      Reviewer #2 (Significance (Required)):

      The study initiated a genetic screen for factors involved in a dendrite pruning system and reveals that nuclear mRNA export is an important event in this process. They further identified the mRNA of the actin disassembly factor MICAL as a candidate substrate in the exporting process. This is consistent with the previous finding that MICAL has to be transcribed and translated when pruning is initiated. As the presynapses of the model c4da neuron in this study are also pruned, the dependence on nuclear export and local actin remodeling was also shown. Thus, this study has added another layer of regulation (nuclear mRNA export) in c4da neuronal pruning, which would be important for the audience interested in neuronal pruning. The study is limited by the confusing result regarding whether the ATPase activity of the exporting factor is required.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: In the manuscript by Frommeyer, Gigengack et al. entitled "The UAP56 mRNA Export Factor is Required for Dendrite and Synapse Pruning via Actin Regulation in Drosophila" the authors surveyed a number of RNA export/processing factors to identify any required for efficient dendrite and/or synapse pruning. They describe a requirement for a general poly(A) RNA export factor, UAP56, which functions as an RNA helicase. They also study links to aspects of actin regulation.

      Overall, while the results are interesting and the impact of loss of UAP56 on the pruning is intriguing, some of the data are overinterpreted as presented. The argument that UAP56 may be specific for the MICAL RNA is not sufficiently supported by the data presented. The two stories about poly(A) RNA export/processing and the actin regulation seem to not quite be connected by the data presented. The events are rather distal within the cell, making connecting the nuclear events with RNA to events at the dendrites/synapse challenging.

      Our response

      We thank reviewer 3 for this comment. To address this, we tested whether knockdown of three other mRNA export factors (NXF1, THO2, THOC5) causes dendrite pruning defects, which was not the case (new Fig. S1). While these data are consistent with specific mRNA export pathways, we agree that they are not proof. We therefore toned down our interpretation and removed the conclusion about specificity. Instead, we now use the more neutral term "increased sensibility (to loss of UAP56)".

      We agree that it is a little hard to tie cofilin to UAP56, as we currently have no evidence that cofilin levels are affected by loss of UAP56, even though both seem to affect lifeact::GFP in a similar way (new Fig. 7 I, J). However, a dysregulation of cofilin can also occur through dysregulation of upstream cofilin regulators such as Slingshot and LIM kinase, making such a relationship possible.

      Changes introduced:

      • added new Figure S1: RNAi analyses of NXF1, THO2 and THOC5 in dendrite pruning.

      • introduced concluding sentence at the end of first Results paragraph: "We conclude that c4da neuron dendrite pruning is particularly sensitive to loss of UAP56." (p. 6)

      • added new lifeact::GFP analysis of cofilin KD (new Fig. 7 I, J).

      • identified potential other targets from the literature in the Discussion (Slingshot phosphatases and LIM kinase, p.14).

      There are a number of specific statements that are not supported by references. See, for example, these sentences within the Introduction- "Dysregulation of pruning pathways has been linked to various neurological disorders such as autism spectrum disorders and schizophrenia. The cell biological mechanisms underlying pruning can be studied in Drosophila." The Drosophila sentence is followed by some specific examples that do include references. The authors also provide no reference to support the variant that they create in UAP56 (E194Q) and whether this is a previously characterized fly variant or based on an orthologous protein in a different system. If so, has the surprising mis-localization been reported in another system?

      Our response

      We thank reviewer 3 for this comment. We added the following references on pruning and disease:

      1) Howes, O.D., Onwordi, E.C., 2023. The synaptic hypothesis of schizophrenia version III: a master mechanism. Mol. Psychiatry 28, 1843-1856.

      2) Tang, G., et al., 2014. Loss of mTOR-dependent macroautophagy causes autistic-like synaptic pruning deficits. Neuron 83, 1131-43.

      To better introduce the E194 mutations, we explain the position of the DECD motif in the Walker B domain, give the corresponding residues in the human and yeast homologues and cite papers demonstrating the importance of this residue for ATPase activity:

      3) Saguez, C., et al., 2013. Mutational analysis of the yeast RNA helicase Sub2p reveals conserved domains required for growth, mRNA export, and genomic stability. RNA 19:1363-71.

      4) Shen, J., et al., 2007. Biochemical Characterization of the ATPase and Helicase Activity of UAP56, an Essential Pre-mRNA Splicing and mRNA Export Factor. J. Biol. Chem. 282, P22544-22550.

      We are not aware of other studies looking at the relationship between the UAP56 ATPase and its localization. Thank you for pointing this out!

      Specific Comments:

      Specific Comment 1: Figure 1 shows the impact of loss of UAP56 on neuron dendrite pruning. The experiment employs both two distinct dsRNAs and a MARCM clone, providing confidence that there is a defect in pruning upon loss of UAP56. As the authors mention screening against 92 genes that caused splicing defects in S2 cells, inclusion of some examples of these genes that do not show such a defect would enhance the argument for specificity with regard to the role of UAP56. This control would be in addition to the more technical control that is shown, the mCherry dsRNA.

      Our response

      We thank reviewer 3 for this comment. To address this, we included the full list of screened genes with their phenotypic categorization regarding pruning (103 RNAi lines targeting 64 genes) as Table S1. In addition, we also tested four RNAi lines targeting the nuclear mRNA export factors Nxf1, THO2 and THOC5 which do not cause dendrite pruning defects (Fig. S1).

      Changes introduced:

      • added RNAi screen results as a list in Table S1.

      • added new Figure S1: RNAi analyses of NXF1, THO2 and THOC5 in dendrite pruning.

      Specific Comment 2: Later the authors demonstrate a delay in the accumulation of the Mical protein, so if they assayed these pruning events at later times, would the loss of UAP56 cause a delay in these events as well? Such a correlation would enhance the causality argument the authors make for Mical levels and these pruning events.

      Our response

      We thank reviewer 3 for this comment. Unfortunately, this is somewhat difficult to assess, as shortly after the 18 h APF timepoint, the epidermal cells that form the attachment substrate for c4da neuron dendrites undergo apoptosis. Where assessed (e. g., Wang et al., 2017, Development 144: 1851–1862), this process, together with the reduced GAL4 activity of our ppk-GAL4 during the pupal stage (our own observations), eventually leads to pruning, but the causality cannot be easily attributed anymore. We therefore use the 18 h APF timepoint essentially as an endpoint assay.

      Specific Comment 3: Figure 2 provides data designed to test the requirement for the ATPase/helicase activity of UAP56 for these trimming events. The first observation, which is surprising, is the mislocalization of the variant (E194Q) that the authors generate. The data shown do not indicate how many cells the results represent, as a single image and trace is shown for the UAP56::GFP wildtype control and the E194Q variant.

      Our response

      We thank reviewer 3 for this comment. It is correct that the traces shown are from single confocal sections. To better display the phenotypic penetrance, we now added a categorical analysis that shows that the UAP56 E194Q mutant is completely mislocalized in the majority of cells assessed (and the newly added E194A mutant in a subset of cells).

      Changes introduced:

      • added categorical quantification of UAP56 variant localization (new Fig. 2 G).

      Specific Comment 4: Given the rather surprising finding that the ATPase activity is not required for the function of UAP56 characterized here, the authors do not provide sufficient references or rationale to support the ATPase mutant that they generate. The E194Q likely lies in the Walker B motif and is equivalent to human E218Q, which can prevent proper ATP hydrolysis in the yeast Sub2 protein. There is no reference to support the nature of the variant created here.

      Our response

      We thank reviewer 3 for this comment. To better introduce the E194 mutations, we explain the position of the DECD motif in the Walker B domain, give the corresponding residues in the human and yeast homologues (Sub2) and cite papers demonstrating the importance of this residue for ATPase activity:

      1) Saguez, C., et al., 2013. Mutational analysis of the yeast RNA helicase Sub2p reveals conserved domains required for growth, mRNA export, and genomic stability. RNA 19:1363-71.

      2) Shen, J., et al., 2007. Biochemical Characterization of the ATPase and Helicase Activity of UAP56, an Essential Pre-mRNA Splicing and mRNA Export Factor. J. Biol. Chem. 282, P22544-22550.

      Specific Comment 5: Given the surprising results, the authors could have included additional variants to ensure the change has the biochemical effect that the authors claim. Previous studies have defined missense mutations in the ATP-binding site- K129A (Lysine to Alanine): This mutation, in both yeast Sub2 and human UAP56, targets a conserved lysine residue that is critical for ATP binding. This prevents proper ATP binding and consequently impairs helicase function. There are also missense mutations in the DEAD-box motif (Asp-Glu-Ala-Asp) involved in ATP binding and hydrolysis. Mutations in this motif, such as D287A in yeast Sub2 (corresponding to D290A in human UAP56), can severely disrupt ATP hydrolysis, impairing helicase activity. In addition, mutations in the Walker A (GXXXXGKT) and Walker B motifs can impair ATP binding and hydrolysis in DEAD-box helicases. Missense mutations in these motifs, like G137A (in the Walker A motif), can block ATP binding, while E218Q (in the Walker B motif)- which seems to be the basis for the variant employed here- can prevent proper ATP hydrolysis.

      Our response

      We thank reviewer 3 for this comment. Our cursory survey of the literature suggested that mutations in the Walker B motif are the most specific, as they still preserve ATP binding, and that their effects have not been well characterized overall. In addition, these mutations can create strong dominant-negatives in related helicases (e. g., Rode et al., 2018 Cell Reports, our lab). To better characterize the role of the Walker B motif in UAP56, we generated and characterized an alternative mutant, UAP56 E194A. While the E194A variant does not show the same penetrance of localization phenotypes as E194Q, it is also partially mislocalized, shows stronger binding to Ref1 and also rescues the uap56 mutant phenotypes without an obvious dominant-negative effect, thus confirming our conclusions regarding E194Q.

      Changes introduced:

      • added biochemical, localization and phenotypic analysis of newly generated UAP56 E194A variant (new Figs. 2 B, 2 E, E', 3 C, C').

      • added categorical quantification of UAP56 variant localization (new Fig. 2 G).

      Specific Comment 6: The co-IP results shown in Figure 2C would also seem to have multiple potential interpretations beyond what the authors suggest, an inability to disassemble a complex. The change in protein localization with the E194Q variant could impact the interacting proteins. There is no negative control to show that the UAP56-E194Q variant is not just associated with many, many proteins. Another myc-tagged protein that does not interact would be an ideal control.

      Our response

      We thank reviewer 3 for this comment. To address this comment, we tried to co-IP UAP56 wt or UAP56 E194Q with a THO complex subunit THOC7 (new Fig. S2). The results show that neither UAP56 variant can co-IP THOC7 under our conditions (likely because the UAP56/THO complex intermediate during mRNA export is disassembled in an ATPase-independent manner (Hohmann et al., Nature 2025)).

      Changes introduced:

      • added co-IP experiment between UAP56 variants and THOC7 (new Fig. S2).

      Specific Comment 7: With regard to Figure 3, the authors never define EB1::GFP in the text of the Results, so a reader unfamiliar with this system has no idea what they are seeing. Reading the Materials and Methods does not mitigate this concern as there is only a brief reference to a fly line and how the EB1::GFP is visualized by microscopy. This makes interpretation of the data presented in Figure 3A-C very challenging.

      Our response

      We thank reviewer 3 for pointing this out. We added a description of the EB1::GFP analysis in the corresponding Results section (p.8).

      Specific Comment 8: The data shown for MICAL MS2 reporter localization in Figure 4 is nice, but is also fully expected on many former studies analyzing loss of UAP56 or UAP56 hypomorphs in different systems. While creating the reporter is admirable, to make the argument that MICAL localization is in some way preferentially impacted by loss of UAP56, the authors would need to examine several other transcripts. As presented, the authors can merely state that UAP56 seems to be required for the efficient export of an mRNA transcript, which is predicted based on dozens of previous studies dating back to the early 2000s.

      Our response

      Firstly, thank you for commenting on the validity of the experimental approach! The primary purpose of this experiment was to test whether the mechanism of UAP56 during dendrite pruning conforms with what is known about UAP56's cellular role - which it apparently does. We also noted that our statements regarding the specificity of UAP56 for Mical over other transcripts are difficult to support. While our experiments would be consistent with such a model, they do not prove it. We therefore toned down the corresponding statements (e. g., the concluding sentence at the end of the first Results paragraph is now: "We conclude that c4da neuron dendrite pruning is particularly sensitive to loss of UAP56." (p. 6)).

      Minor (and really minor) points:

      In the second sentence of the Discussion, the word 'developing' seems to be mis-typed "While a general inhibition of mRNA export might be expected to cause broad defects in cellular processes, our data in develoing c4da neurons indicate that loss of UAP56 mainly affects pruning mechanisms related to actin remodeling."

      Sentence in the Results (lack of page numbers makes indicating where exactly a bit tricky)- "We therefore reasoned that Mical expression could be more challenging to c4da neurons." This is a complete sentence as presented, yet, if something is 'more something'- the thing must be 'more than' something else. Presumably, the authors mean that the length of the MICAL transcript could make the processing and export of this transcript more challenging than typical fly transcripts (raising the question of the average length of a mature transcript in flies?).

      Our response

      Thanks for pointing these out. The typo is fixed, page numbers are added. We changed the sentence to: "Because of the large size of its mRNA, we reasoned that MICAL gene expression could be particularly sensitive to loss of export factors such as UAP56." (p.9) We hope this is more precise language-wise.

      Reviewer #3 (Significance (Required)):

      Understanding how post-transcriptional events are linked to key functions in neurons is important and would be of interest to a broad audience.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      In the manuscript by Frommeyer, Gigengack et al. entitled "The UAP56 mRNA Export Factor is Required for Dendrite and Synapse Pruning via Actin Regulation in Drosophila" the authors surveyed a number of RNA export/processing factors to identify any required for efficient dendrite and/or synapse pruning. They describe a requirement for a general poly(A) RNA export factor, UAP56, which functions as an RNA helicase. They also study links to aspects of actin regulation.

      Overall, while the results are interesting and the impact of loss of UAP56 on the pruning is intriguing, some of the data are overinterpreted as presented. The argument that UAP56 may be specific for the MICAL RNA is not sufficiently supported by the data presented. The two stories about poly(A) RNA export/processing and the actin regulation seem to not quite be connected by the data presented. The events are rather distal within the cell, making connecting the nuclear events with RNA to events at the dendrites/synapse challenging.

      There are a number of specific statements that are not supported by references. See, for example, these sentences within the Introduction- "Dysregulation of pruning pathways has been linked to various neurological disorders such as autism spectrum disorders and schizophrenia. The cell biological mechanisms underlying pruning can be studied in Drosophila." The Drosophila sentence is followed by some specific examples that do include references. The authors also provide no reference to support the variant that they create in UAP56 (E194Q) and whether this is a previously characterized fly variant or based on an orthologous protein in a different system. If so, has the surprising mis-localization been reported in another system?

      Specific Comments:

      Figure 1 shows the impact of loss of UAP56 on neuron dendrite pruning. The experiment employs both two distinct dsRNAs and a MARCM clone, providing confidence that there is a defect in pruning upon loss of UAP56. As the authors mention screening against 92 genes that caused splicing defects in S2 cells, inclusion of some examples of these genes that do not show such a defect would enhance the argument for specificity with regard to the role of UAP56. This control would be in addition to the more technical control that is shown, the mCherry dsRNA. Later the authors demonstrate a delay in the accumulation of the Mical protein, so if they assayed these pruning events at later times, would the loss of UAP56 cause a delay in these events as well? Such a correlation would enhance the causality argument the authors make for Mical levels and these pruning events.

      Figure 2 provides data designed to test the requirement for the ATPase/helicase activity of UAP56 for these trimming events. The first observation, which is surprising, is the mislocalization of the variant (E194Q) that the authors generate. The data shown do not indicate how many cells the results represent, as a single image and trace is shown for the UAP56::GFP wildtype control and the E194Q variant.

      Given the rather surprising finding that the ATPase activity is not required for the function of UAP56 characterized here, the authors do not provide sufficient references or rationale to support the ATPase mutant that they generate. The E194Q likely lies in the Walker B motif and is equivalent to human E218Q, which can prevent proper ATP hydrolysis in the yeast Sub2 protein. There is no reference to support the nature of the variant created here.

      Given the surprising results, the authors could have included additional variants to ensure the change has the biochemical effect that the authors claim. Previous studies have defined missense mutations in the ATP-binding site- K129A (Lysine to Alanine): This mutation, in both yeast Sub2 and human UAP56, targets a conserved lysine residue that is critical for ATP binding. This prevents proper ATP binding and consequently impairs helicase function. There are also missense mutations in the DEAD-box motif (Asp-Glu-Ala-Asp) involved in ATP binding and hydrolysis. Mutations in this motif, such as D287A in yeast Sub2 (corresponding to D290A in human UAP56), can severely disrupt ATP hydrolysis, impairing helicase activity. In addition, mutations in the Walker A (GXXXXGKT) and Walker B motifs can impair ATP binding and hydrolysis in DEAD-box helicases. Missense mutations in these motifs, like G137A (in the Walker A motif), can block ATP binding, while E218Q (in the Walker B motif)- which seems to be the basis for the variant employed here- can prevent proper ATP hydrolysis.

      The co-IP results shown in Figure 2C would also seem to have multiple potential interpretations beyond what the authors suggest, an inability to disassemble a complex. The change in protein localization with the E194Q variant could impact the interacting proteins. There is no negative control to show that the UAP56-E194Q variant is not just associated with many, many proteins. Another myc-tagged protein that does not interact would be an ideal control.

      With regard to Figure 3, the authors never define EB1::GFP in the text of the Results, so a reader unfamiliar with this system has no idea what they are seeing. Reading the Materials and Methods does not mitigate this concern as there is only a brief reference to a fly line and how the EB1::GFP is visualized by microscopy. This makes interpretation of the data presented in Figure 3A-C very challenging. The data shown for MICAL MS2 reporter localization in Figure 4 is nice, but is also fully expected on many former studies analyzing loss of UAP56 or UAP56 hypomorphs in different systems. While creating the reporter is admirable, to make the argument that MICAL localization is in some way preferentially impacted by loss of UAP56, the authors would need to examine several other transcripts. As presented, the authors can merely state that UAP56 seems to be required for the efficient export of an mRNA transcript, which is predicted based on dozens of previous studies dating back to the early 2000s.

      Minor (and really minor) points:

      In the second sentence of the Discussion, the word 'developing' seems to be mis-typed "While a general inhibition of mRNA export might be expected to cause broad defects in cellular processes, our data in develoing c4da neurons indicate that loss of UAP56 mainly affects pruning mechanisms related to actin remodeling."

      Sentence in the Results (lack of page numbers makes indicating where exactly a bit tricky)- "We therefore reasoned that Mical expression could be more challenging to c4da neurons." This is a complete sentence as presented, yet, if something is 'more something'- the thing must be 'more than' something else. Presumably, the authors mean that the length of the MICAL transcript could make the processing and export of this transcript more challenging than typical fly transcripts (raising the question of the average length of a mature transcript in flies?).

      Significance

      Understanding how post-transcriptional events are linked to key functions in neurons is important and would be of interest to a broad audience.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      In this paper, the authors describe dendrite pruning defects in c4da neurons in the DEXD box ATPase UAP56 mutant or in neuronal RNAi knockdown. Overexpression of UAP56::GFP or UAP56::GFPE194Q without ATPase activity can rescue dendrite pruning defects in the UAP56 mutant. They further characterized the mis-localization of UAP56::GFPE194Q and its binding to nuclear export complexes. Both microtubules and the ubiquitin-proteasome system are intact in UAP56 RNAi neurons. However, they suggest a specific effect on MICAL mRNA nuclear export, shown by using the MS2-MCP system, resulting in a delay of MICAL protein expression in pruned neurons. Furthermore, the authors show that UAP56 is also involved in presynaptic pruning of c4da neurons in the VNC and that Mical and cofilin are also required for actin disassembly in presynapses. They propose that UAP56 is required for dendrite and synapse pruning through actin regulation in Drosophila. My comments follow.

      Major comments

      1. The result that UAP56::GFPE194Q rescues the mutant phenotype while the protein is largely mis-localized suggests a novel mechanism or, as the authors suggested, rescue from a combination of residual activities. The latter possibility requires further support, which is important to support the role of mRNA export in dendrite and pre-synapse pruning. One approach would be to examine whether other export components like REF1 and NXF1 show similar mutant phenotypes. Alternatively, depleting residual activity, for example by using null mutant alleles or combining more copies of RNAi transgenes, could help.

      2. The localization of UAP56::GFP (and E194Q) should be analyzed in more detail. It is not clear whether the images in Fig. 2A and 2B are from confocal single sections or merged multiple sections. The localization to the nuclear periphery of UAP56::GFP is not clear, and the existence of the E194Q derivative in both nucleus and cytosol (or whether there is still some peripheral enrichment) is not clear if the images are stacked.

      3. The Ub-VV-GFP is a new reagent, and its use to detect active proteasomal degradation is by the lack of GFP signals, which could also be due to the lack of expression. The use of Ub-QQ-GFP cannot confirm the expression of Ub-VV-GFP. The proteasomal subunit RPN7 has been shown to be a prominent component in the dendrite pruning pathway (Development 149, dev200536). Immunostaining using RPN7 antibodies to measure the RPN7 expression level could be a direct way to address the issue of whether the proteasomal pathway is affected or not.

      4. Using the MS2/MCP system to detect the export of MICAL mRNA is a nice approach to confirm the UAP56 activity; lack of UAP56 by RNAi knockdown delays the nuclear export of MS2-MICAL mRNA. The rescue experiment by UAS transgenes could not be performed due to the UAS gene dosage, as suggested by the authors. However, this MS2-MICAL system is also a good assay for the requirement of UAP56 ATPase activity (absent in the E194Q mutant) in this process. Could the authors use MARCM (thus reducing the use of the UAS-RNAi transgene) for the rescue experiment? Also, the c4da neuronal marker UAS-CD8-GFP used in Fig. 4 could be replaced by a marker gene directly fused to the ppk promoter, which would save a copy of the UAS transgene. The results from the rescue experiment would test the dependence of nuclear export of MICAL mRNA on ATPase activity.

      5. UAP56 is also involved in presynaptic pruning through regulating actin assembly, and the authors suggest that Mical and cofilin are involved in the process. However, direct observation of lifeact::GFP in Mical or cofilin RNAi knockdown is important to support this conclusion.

      Minor comments

      1. RNA localization is important for dendrite development in larval stages (Brechbiel JL, Gavis ER. Curr Biol. 20;18(10):745-750). Yet, the role of UAP56 is relatively specific and shown only in later-stage pruning. It would need thorough discussion.

      2. Could authors elaborate on the possible upstream regulators that might be involved, as described in "alternatively, several cofilin upstream regulators have been described (Rust, 2015) which might also be involved in presynapse pruning and subject to UAP56 regulation" in Discussion?

      3. In the Discussion, the role of cofilin in pre- and post-synaptic processes was described. The role of Tsr/Cofilin in regulating actin behaviors in dendrite branching has been described in c3da and c4da neurons (Nithianandam and Chien, 2018 and other references) and should be included in the Discussion.

      4. In the Discussion, the authors speculate that distinct actin structures have to be disassembled in dendrite and presynapse pruning. The possible actin structures at both sites could be elaborated.

      Significance

      The study initiated a genetic screen for factors involved in a dendrite pruning system and reveals that nuclear mRNA export is an important event in this process. The authors further identified the mRNA of the actin disassembly factor MICAL as a candidate substrate in the export process. This is consistent with the previous finding that MICAL has to be transcribed and translated when pruning is initiated. As the presynapses of the model c4da neuron in this study are also pruned, the dependence on nuclear export and local actin remodeling was also shown. Thus, this study has added another layer of regulation (the nuclear mRNA export) in c4da neuronal pruning, which would be important for the audience interested in neuronal pruning. The study is limited by the confusing result regarding whether ATPase activity of the export factor is required.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      This manuscript by Frommeyer et al. explores the role of the helicase and regulator of nuclear export, UAP56, in the control of dendrite and presynaptic pruning in Drosophila larval da sensory neurons. The authors present evidence showing that UAP56 regulates these processes via the actin cytoskeleton and suggest that this occurs by controlling the expression of the actin severing enzyme, Mical.

      Major comments:

      The most significant issue with the manuscript is that some of the major conclusions are not supported by the data. Additional experiments would need to be completed in order to support these claims. These (and other) major comments are as follows:

      1. For Figure 4, the ms2/MCP system is not quantitative. Using this technique, it is impossible to determine how many RNAs are located in each "dot". Each of these dots looks quite large and likely corresponds to some phase-separated RNP complex where multiple RNAs are stored and/or transported. Thus, these data do not support the conclusion that Mical mRNA levels are reduced upon UAP56 knockdown. A good quantitative microscopic assay would be something like smFISH. Additionally, the localization of Mical mRNA dots to dendrites is not convincing, as in regions where there are dendritic swellings the background looks generally brighter.

      2. Alternatively, levels of Mical mRNA could be verified by qPCR in the larval brain following pan-neuronal UAP56 knockdown or in FACS-sorted fluorescently labeled da sensory neurons. Protein levels could be analyzed using a similar approach.

      3. In Figure 5, the authors state that Mical expression could not be detected at 0 h APF. The data presented in Fig. 5C, D suggest the opposite as there clearly is some expression. Moreover, the data shown in Fig. 5D looks significantly brighter than the Orco dsRNA control and appears to localize to some type of cytoplasmic granule. So the expression of Mical does not look normal.

      4. Sufficient data are not presented to conclude any specificity in mRNA export pathways. Data is presented for one export protein (UAP56) and one putative target (Mical). To adequately assess this, the authors would need to do RNA-seq in UAP56 mutants.

      5. In summary, better quantitative assays should be used in Figures 4 and 5 in order to draw conclusions about the expression levels of either mRNA or protein. In its current form, this study demonstrates the novel finding that UAP56 regulates dendrite and presynaptic pruning, potentially via regulation of the actin cytoskeleton. However, these data do not convincingly demonstrate that UAP56 controls these processes by regulating Mical expression and definitely not by controlling export from the nucleus.

      6. While there are clearly dendrites shown in Fig. 1C', the cell body is not readily identifiable. This makes it difficult to assess attachment and suggests that the neuron may be dying. This should be replaced with an image that shows the soma.

      7. The level of knockdown in the UAP56 RNAi and P element insertion lines should be determined. It would be useful to mention the nature of the RNAi lines (long/short hairpin). Some must be long since Dcr has been co-expressed. Another issue raised by this is the potential for off-target effects. shRNAi lines would be preferable because these effects are minimized.

      Minor comments:

      1. The authors should explain what EB1:GFP is marking when introduced in the text.

      2. The da neuron images throughout the figures could be a bit larger.

      Significance

      Strengths:

      The methodology used to assess dendrite and presynaptic pruning is strong and the phenotypic analysis is conclusive.

      Weakness:

      The evidence demonstrating that UAP56 regulates the expression of Mical is unconvincing. Similarly, no data is presented to show that there is any specificity in mRNA export pathways. Thus, these major conclusions are not adequately supported by the data.

      Advance:

      The findings that UAP56 regulates dendrite and synaptic pruning are novel, as is its specific regulation of the actin cytoskeleton. These findings are restricted to a phenotypic analysis and do not show that the phenotype is not simply due to the disruption of general mRNA export.

      Audience:

      In its current form the manuscript would be of interest to an audience who specializes in the study of RNA binding proteins in the control of neurodevelopment. This would include scientists who work in invertebrate and vertebrate model systems.

      My expertise:

      My lab uses Drosophila to study the role of RNA binding proteins in neurodevelopment and neurodegeneration. Currently, we use flies as a model to better understand the molecular pathogenesis of neurodevelopmental disorders such as FXS and ASD.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      *Reviewer #1 (Evidence, reproducibility and clarity (Required)): *

      *Using genetics and microscopy approaches, Cabral et al. investigate how fission yeast regulates its length and width in response to osmotic, oxidative, or low glucose stress. Miller et al. have recently found that the cell cycle regulators Cdc25, Cdc13 and Cdr2 integrate information about cell volume, time and cell surface area into the cellular decision when to divide. Cabral et al. now build on this work and test how disruption of these regulators affects cell size adaptation. They find that each stress condition shows a distinct dependence on the individual regulators, suggesting that the complex size control network enables optimized size adaptation for each condition. Overall, the manuscript is clear and the detailed methods ensure that the experiments can be replicated.

      Major comments:

      1.) It would be much easier to follow the authors' conclusions, if in addition to surface area to volume ratio, length and width, they would also plot cell volume at division in Figs. 1-4.*

      AUTHOR RESPONSE: Due to space constraints in the main (and supplemental) figures, we focused on SA:Vol ratio together with cell length and width, which directly define cell geometry in rod-shaped fission yeast. Surface area and volume are derived from these measurements and can be misleading when considered alone, as similar surface area or volume values can arise from distinct combinations of length and width. The SA:Vol ratio therefore serves as a robust integrative metric for capturing coordinated changes in length and width that reshape cell geometry. We would be happy to include individual surface area and volume plots if requested.

      2.) To me, it seems that maybe even more than upon osmotic stress, the cdc13-2x strain differs qualitatively from WT in low glucose conditions, where the increased SA-V ratio is almost completely abolished.

      AUTHOR RESPONSE: We agree with the reviewer and have revised the manuscript text to point out this difference. The newly added text states: “Under low glucose, cdc13-2x cells also showed a WT-like response, decreasing length and increasing in SA:Vol ratio (Figures 3B-D). However, this SA:Vol increase was reduced compared to WT (1% vs 8.5%; Figures 1D and 3B), suggesting impaired geometric remodeling under glucose limitation.”

      3.) It is not entirely clear to me why two copies of Cdc13 would qualitatively affect the responses. Shouldn't the extra copy behave similarly to the endogenous one and therefore only lead to quantitative changes? Maybe the authors can discuss this more clearly or even test a strain in which Cdc13 function is qualitatively disrupted.

      AUTHOR RESPONSE: Increased Cdc13 protein concentration in cdc13-2x cells disrupts the typical time-scaling of Cdc13 protein. Consistent with this, cdc13-2x cells enter mitosis at a smaller cell size. We have modified the text to clarify this point. The new text states: “To assess the role of the Cdc13 time-sensing pathway, we disrupted Cdc13 protein abundance by creating a cdc13-2x strain carrying an additional copy of cdc13 integrated at an exogenous locus. cdc13-2x cells divided at a smaller size than WT, reflecting accelerated mitotic entry upon disruption of typical time-scaling of Cdc13 protein (Figure S1A).”

      4.) I don't see why the authors come to the conclusion that under osmotic stress cells would maximize cell volume. It leads to a decreased cell length, doesn't it?

      AUTHOR RESPONSE: WT cells under osmotic stress do decrease in length, but this is accompanied by an increase in cell width. Because width contributes disproportionately to cell volume in rod-shaped cells, this change results in a modest but reproducible reduction in the SA:Vol ratio relative to WT cells in control medium (Figure 1D). We note that the degree of this change under osmotic stress is small (-0.4%), although statistically significant (p * Likewise, in Figure 2B, they interpret tiny changes in the SA/V. By my estimation, the difference between control and osmotic stress is only 2% (1.195/1.17), less than the wild-type case, which appears to be twice that (which is still pretty modest). The small amplitude of these changes is obscured by the fact that the graphs do not have a baseline at zero, which, as a matter of good data-presentation practice, they should.

      *

      AUTHOR RESPONSE: We appreciate the reviewer’s distinction between statistical and biological significance and agree that this is an important point to clarify. We now note in the revised text that changes in SA:Vol ratio under osmotic stress are numerically small and should not be overinterpreted. Our revised text now states: “Under oxidative and osmotic stress, the SA:Vol ratio decreased, indicating greater cell volume expansion relative to surface area (Figure 1D). However, we note that the reduction in SA:Vol under osmotic stress, while statistically significant, was modest in magnitude (−0.4%).”

      Although small in absolute terms, even subtle geometric changes can be biologically meaningful in fission yeast due to the small size of these cells, where minor shifts in length or width translate into measurable differences in membrane area relative to cytoplasmic volume. Importantly, in Figure 2B, the key observation is not the magnitude of the change but its direction: cdc25-degron-DaMP cells exhibit a ~2% increase in SA:Vol ratio under osmotic stress, in contrast to the decrease observed in WT cells under the same condition. This opposite response reflects altered cell geometry and is supported by corresponding changes in cell length and width. We have revised the Results text to emphasize both the modest magnitude and the directional nature of these effects: “Under osmotic stress, cdc25-degron-DaMP cells exhibited a ~2% increase in SA:Vol ratio, opposite to the modest decrease observed in WT cells. This increase arose from increased cell length and reduced width (Figures 2B-D).”

      Regarding data presentation, because SA:Vol ratios vary over a narrow numerical range, setting the y-axis minimum to zero would compress the data and obscure all detectable differences. Instead, we have modified our SA:Vol ratio graphs in Fig. 1-4 to have consistent axis scaling across panels to accurately convey relative changes while maintaining visual clarity. We are happy to provide full data tables and statistical outputs upon request.

      * I am also concerned about the use of manual measurement of width at a single point along the cell. This approach is very sensitive to the choice of width point and to non-cylindrical geometries, several of which are evident in the images presented. MATLAB will return the ??? as well as the length from a mask, but even better, one can more accurately calculate the surface area and volume by assuming rotational symmetry of the mask. Given that surface area and volume calculation need to be redone anyway, as discussed below, I encourage the authors to calculate them directly from the mask, instead of using the cylindrical assumption.*

      AUTHOR RESPONSE: In initial experiments to calculate surface area and volume of fission yeast cells for prior work (Miller et al., 2023, Current Biology) we found that automated width measurements by MATLAB or ImageJ were inaccurate for a subset of cells leading to noisy cell surface area and volume values. Measuring cell width by hand and assuming that each cell in a given strain had the same cell radius (average of population) for calculation of cell surface area and volume gave more consistent results and recapitulated established conclusions regarding size control mechanisms.

      In this previous work and the current study, abnormally skinny or wide regions of a cell were avoided when drawing a line to measure the cell width by hand. For each strain and condition, an average cell width was determined per independent experiment and used for surface area and volume calculations. Additionally, previous analysis demonstrated that this approach yields results consistent with a rotation method derived directly from cell masks, which does not assume a cylindrical cell shape (Facchetti et al., 2019, Current Biology; Miller et al., 2023, Current Biology).

      To test the validity of our size measurements and confirm the robustness of our results in this study we compared the surface area and volume of cells by this rotation method. We have added this additional information to our revised methods section and also added SA:Vol ratio graphs generated from the rotation size measurement to our revised Figure S1 E-J. Importantly, both approaches used to measure cell size gave consistent results and supported the same conclusions.*
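
The rotation approach mentioned above can be illustrated with a minimal sketch. This is not the authors' analysis code: the function names are hypothetical, and the radius profile is assumed to be half the mask width sampled at positions along the long axis of the cell. Each pair of neighbouring samples is treated as a thin conical frustum, and the lateral areas and volumes of the frusta are summed, so no cylindrical shape is assumed.

```python
import math

def solid_of_revolution(xs, radii):
    """Surface area and volume of a profile rotated about its long axis.

    xs    : positions along the long axis (hypothetical input, e.g. in um)
    radii : half-widths of the mask at those positions (same units)
    Neighbouring samples are joined into conical frusta and summed.
    """
    area = 0.0
    volume = 0.0
    for i in range(len(xs) - 1):
        dx = xs[i + 1] - xs[i]
        r1, r2 = radii[i], radii[i + 1]
        slant = math.hypot(dx, r2 - r1)          # slant height of the frustum
        area += math.pi * (r1 + r2) * slant      # lateral area of the frustum
        volume += math.pi * dx * (r1 * r1 + r1 * r2 + r2 * r2) / 3.0
    return area, volume

def capsule_profile(length, radius, n=2000):
    """Radius profile of an ideal capsule, used here only as a test shape."""
    xs = [length * i / n for i in range(n + 1)]
    radii = []
    for x in xs:
        if x < radius:
            d = radius - x                        # inside the first hemisphere
        elif x > length - radius:
            d = x - (length - radius)             # inside the second hemisphere
        else:
            d = 0.0                               # cylindrical mid-section
        radii.append(math.sqrt(max(radius ** 2 - d ** 2, 0.0)))
    return xs, radii

xs, radii = capsule_profile(14.0, 1.8)
area, volume = solid_of_revolution(xs, radii)
print(area / volume)
```

For an ideal capsule the summed values converge on the closed-form capsule formulas, which is one way such a method could be checked against the cylinder-with-hemispherical-ends calculation; real masks would deviate wherever the cell is not cylindrical.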

      The authors also need to be more careful about their claims about size-dependent scaling. The concentration of both Cdc13 and Cdc25 scale with size (perhaps indirectly, in the case of Cdc13), but Cdr2 does not. Cdr2 activity has been proposed to scale with size, and its density at cortical nodes has been reported to scale with size, although that claim has been challenged.*

      AUTHOR RESPONSE: We have modified text in the Introduction and Results to address this point. Our revised text in the introduction states: “Recent work has shown that Cdk1 activation integrates size- and time-dependent inputs: the Wee1-inhibitory kinase Cdr2 cortical node density scales with cell surface area (Pan et al., 2014; Facchetti et al., 2019); Cdc25 nuclear accumulation scales with cell volume; and cyclin Cdc13 accumulates over time in the nucleus (Miller et al., 2023) (Figure 1B).” Our revised text in the results section states: “Cdr2 functions as a cortical scaffold that regulates Wee1 activity in relation to cell size, with Cdr2 nodal density reported to scale with cell surface area, enforcing a surface area threshold for mitotic entry (Pan et al., 2014; Allard et al., 2018; Facchetti et al., 2019; Sayyad and Pollard, 2022).”*

      Even taking the authors approach at face value, there are observations that do not seem to make sense, which led me to realize that the wrong formulae were used to calculate surface area and volume.

      In Figure 1E,F, the KCl-treated cells get shorter and wider; surely, that should result in a lower SA/V ratio. However, as noted above, in Figure 1D, they are shown to have a similar ratio. As a sanity check, I eye-balled the numbers off of the figure (control: 14 µm x 3.6 µm and KCl: 11 µm x 3.8 µm) and calculated their surface area and volume using the formula for a capsule (i.e., a cylinder with hemispheric ends).

      SA = the surface area of the two hemispheres + the surface area of the cylinder in between = 4*pi*(width/2)^2 + pi*width*(length-width), where the length-width term calculates the side length of the capsule (length without the hemispheres) from the full length of the capsule (length including the hemispheres)

      V = the volume of the two hemispheres + the volume of the cylinder in between = 4/3*pi*(width/2)^3 + pi*(width/2)^2*(length-width).

      I got SA/V ratios of around 2, which are way off from what is presented in Figure 1D, but my calculated ratio goes down in KCl, as expected, but not as reported.

      To make sure I was not doing something wrong, I was going to repeat my calculations with the formulae in Table 1, which made me realize both are incorrect. The stated formula for the cell surface area (2*pi*RL) only represents the surface area of the cylindrical side of the cell, not its hemispherical ends. And it is not even the correct formula for the surface area of the side, because that calls for L to be the length of the side (without the hemispherical ends) not the length of the cell (which includes the hemispherical ends). L here is stated to be cell length (which is what is normally measured in the field, and which is consistent with the reported length of control cells in Figure 1E being 14 µm). The formula for the volume of a capsule in the form used in Table 1 (volume of a cylinder of length L - the volume excluded from the hemispherical ends) is pi*R^2*L - (8-(4/3*pi))*R^3.

      Given these problems, I think I spent too much time thinking about the rest of the paper, because all of the calculations, and perhaps their interpretations, need to be redone.*

      AUTHOR RESPONSE: The surface area and volume equations for a cylinder with hemispherical ends used in our study and listed in our table are correct and widely used in other work with fission yeast cells (Navarro and Nurse, 2012; Pan et al., 2014; Facchetti et al., 2019; BayBay et al., 2020; and Miller et al., 2023). We write our equations with variables for cell length and radius because these are biologically relevant and measured parameters for fission yeast cells. Cell length (L) refers to the total tip-to-tip length of the cell, including the hemispherical ends, and radius (R) refers to half the measured cell width. We have revised the Methods section to clarify this definition and avoid ambiguity (Please see methods section “Cell geometry measurements”)

      Additionally, SA or Vol calculations were performed using the length of each individual cell and the average cell radius of the population. We did not use mean cell length of the population for our calculations like the reviewer assumed in their “sanity check” above. Please see methods section “Cell geometry measurements”. We hope that these clarifications and text revisions improve transparency and reproducibility.
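
The capsule geometry under discussion can be written out as a short illustrative check. This is a sketch, not the authors' code: the function names are hypothetical, L is taken as the total tip-to-tip length and R as half the measured width (the definitions given in the response), and the example dimensions are the approximate values the reviewer read off Figure 1E,F, used here only for illustration. With these definitions, the two-part surface area (two hemispheres plus the cylindrical side of length L - 2R) simplifies algebraically to 2*pi*R*L, since 4*pi*R^2 + 2*pi*R*(L - 2R) = 2*pi*R*L.

```python
import math

def capsule_surface_area(length, radius):
    """Two hemispherical ends plus the cylindrical side of length L - 2R."""
    hemispheres = 4.0 * math.pi * radius ** 2
    side = 2.0 * math.pi * radius * (length - 2.0 * radius)
    return hemispheres + side

def capsule_volume(length, radius):
    """Sphere formed by the two ends plus the cylinder of length L - 2R."""
    ends = 4.0 / 3.0 * math.pi * radius ** 3
    cylinder = math.pi * radius ** 2 * (length - 2.0 * radius)
    return ends + cylinder

def sa_to_vol(length, width):
    """SA:Vol ratio from total cell length and cell width."""
    radius = width / 2.0
    return capsule_surface_area(length, radius) / capsule_volume(length, radius)

# Illustrative dimensions only (um): approximate values read off a figure.
control = sa_to_vol(14.0, 3.6)
kcl = sa_to_vol(11.0, 3.8)
print(control, kcl)

# The two-part surface area equals 2*pi*R*L when L is the total length.
print(math.isclose(capsule_surface_area(14.0, 1.8), 2 * math.pi * 1.8 * 14.0))
```

Note that a ratio computed from population-mean length and width, as here, is not the same quantity as the mean of per-cell ratios computed from individual lengths, so it is only a rough cross-check of plotted values.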

      * Minor Points:

      Strains should be identified by strain number in the text and figure legends.*

      AUTHOR RESPONSE: For clarity and readability, we refer to strains by genotype in the main text and figure legends, which we believe is more informative for readers than strain numbers. All strain numbers corresponding to each genotype are provided in Table S1, ensuring traceability and reproducibility without compromising clarity in data presentation.*

      In the Introduction, "Most cell control their size" should be "Most eukaryotic cell control their size".*

      • *

      AUTHOR RESPONSE: The text has been corrected as suggested.*

      Reviewer #2 (Significance (Required)):

      Nothing to add.*

      *Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary This manuscript reports that fission yeast cells exhibit distinct cell size and geometry when exposed to osmotic, oxidative, or low-glucose stress. Based on quantitative measurements of cell length and width, the authors propose that different stress conditions trigger specific 'geometric adaptation' patterns, suggesting that cell size homeostasis is flexibly modulated depending on environmental cues. The study provides phenotypic evidence that multiple environmental stresses lead to distinct outcomes in the balance between cell surface area and volume, which the authors interpret as stress-specific modes of size control.

      Major comments 1) The authors define the 48-hour time point as the 'long-term response', but no justification is provided for why 48 hours represents a physiologically relevant adaptation phase. It is unclear whether the size-control mode has stabilized by that time, or whether it may continue to change afterward. At minimum, the authors should provide a rationale (e.g., growth recovery dynamics, transcriptional adaptation plateau, or pilot time-course observations) to demonstrate that 48 hours corresponds to the steady-state adaptive phase rather than an arbitrarily selected time point.*

      AUTHOR RESPONSE: We thank the reviewer for this important point and agree that the definition of the long-term response should be clarified. We have addressed this with new experiments and revised text. We now incorporate growth curve data and doubling time analyses for all yeast strains grown under control and stress conditions (See new Figure S3). These analyses show that following an initial transient stress-induced cell cycle delay, growth rates stabilize well before 48 hours. Notably, the slowest growth rate observed was in 1M KCl, with a doubling time of ~4 hours across all yeast strains tested. Thus, by 48 hours, cells in this condition have undergone more than 12 generations of growth, while cells in all other conditions with shorter doubling times have undergone even more divisions. So by allowing cells to grow for 48 hours prior to imaging, we are capturing cells that have resumed sustained cell cycle progression following transient stress-induced cell cycle delays. Because cell size control is tightly linked to the cell cycle, we define 48 hours as a physiologically relevant time point where cells have adapted to stress conditions.

      Our revised methods now state: “Cultures were incubated at 25°C while shaking at 180 rpm for 48 h prior to imaging. This time point was chosen to ensure that cells had progressed beyond the initial transient stress response and reached a stable, condition-specific growth state, as confirmed by growth curve and doubling time analyses showing stabilization well before 48 h (Figure S3), including in the slowest growing condition (1 M KCl; doubling time ~4 h).”
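
The arithmetic behind the generation count can be sketched as follows. This is an illustrative example, not the authors' analysis: the function names and the density readings are hypothetical, and the doubling time is estimated by an ordinary least-squares fit of ln(density) against time, which assumes the readings come from the exponential phase of growth.

```python
import math

def doubling_time(times_h, densities):
    """Doubling time (h) from a log-linear least-squares fit.

    Assumes exponential growth over the sampled window, so that
    ln(density) is linear in time with slope ln(2) / doubling time.
    """
    ys = [math.log(d) for d in densities]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(ys) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, ys))
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    slope = sxy / sxx
    return math.log(2) / slope

def generations(elapsed_h, doubling_time_h):
    """Number of population doublings completed in the elapsed time."""
    return elapsed_h / doubling_time_h

# Hypothetical optical-density readings that double every 4 h.
times = [0, 2, 4, 6, 8]
ods = [0.05 * 2 ** (t / 4) for t in times]

td = doubling_time(times, ods)
print(td, generations(48, td))
```

With a doubling time of about 4 h, 48 h of growth corresponds to roughly 12 doublings; any condition with a shorter doubling time yields more.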

      * 2*)Related to the above comment, the authors propose that different stresses lead to distinct cell size adaptations, yet the rationale for the chosen stress intensities and exposure times is insufficiently described. It remains unclear whether the osmotic, oxidative, and low-glucose conditions used here induce comparable levels of cellular stress. Dose-response and time-course analyses would greatly strengthen the conclusions. Without such analyses, it is difficult to support the interpretation that geometry modulation represents a direct adaptive response.

      AUTHOR RESPONSE: * *We selected the specific stress conditions based on previously published work showing that these doses elicit robust responses while preserving overall cell viability and the capacity for recovery. We note that osmotic, oxidative, and low glucose conditions perturb fundamentally different cellular systems (turgor pressure and cell wall mechanics, redox balance, and metabolism etc.) and therefore do not generate directly comparable levels of cellular stress in a quantitative sense. Our goal was not to equalize stress intensity across conditions, but to examine how cells change their geometry in response to distinct classes of stressors.

      We have clarified the rationale for specific stress conditions in the revised methods: “These stress intensities were selected based on prior studies demonstrating robust cellular responses while preserving cell viability and the capacity for recovery (Fantes and Nurse, 1977, Shiozaki and Russell, 1995, Degols, et al., 1996; López-Avilés et al., 2008; Sansó et al., 2008; Satioh et al., 2015, Salat-Canela et al., 2021, Bertaux et al., 2023).”

      * 3) The authors describe stress-induced size changes as an 'adaptive' response. While this is an appealing hypothesis, the presented data do not demonstrate that the change in cell size itself confers a fitness advantage. Evidence showing that blocking the size change reduces stress survival-or that the altered size improves growth recovery- would be required to support this claim. Without such data, the use of the term 'geometric adaptation' seems overstated.*

      AUTHOR RESPONSE: We have revised the text to remove the term “adaptive” and now describe stress-induced size changes in descriptive terms. As discussed further in response to Comment 4, new growth curve and doubling time analyses show that defects in surface area or volume expansion do not uniformly impair growth or survival over the stress exposure examined here, reinforcing the decision to avoid fitness-based language.*

      4) The authors conclude that mutants exhibit no major defects in growth or viability during 48-hour stress exposure based on comparable septation index values (Fig. S2). However, septation index alone does not fully capture growth performance or cell-cycle progression and is not sufficient to support claims regarding fitness or robustness of proliferation. If the authors intend to make statements about 'growth', 'viability', or 'cell-cycle progression', additional quantitative measures (e.g., growth curves, doubling time, colony-forming units, or microcolony growth measurements) would be necessary. Alternatively, the claims should be toned down to align with the measurements currently provided.*

      AUTHOR RESPONSE: We have addressed this concern with new experiments and revised text. In addition to septation index measurements (now analyzed using chi-square tests of proportions; Figure S2), we performed growth curve experiments and doubling time analyses for all genotypes under control and stress conditions (new Figure S3). These additional data show that growth rates are largely comparable across genotypes in control, oxidative, and low-glucose conditions, with more pronounced genotype-dependent differences emerging under osmotic stress. Defects in surface area or volume expansion did not uniformly correspond to impaired population growth, indicating that geometric remodeling is not strictly required for proliferation over the 48-hour stress exposure examined here. We have refined our conclusion to emphasize that defects in surface area or volume expansion do not uniformly impair growth or survival. See revised Results text under the heading “Defects in surface area or volume expansion do not uniformly compromise growth or survival”.*
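
A chi-square test of proportions of the kind mentioned above can be sketched for a single pairwise comparison. This is a minimal illustration under stated assumptions, not the authors' procedure: the function name and the cell counts are hypothetical, the test is the Pearson chi-square for a 2x2 table without continuity correction, and both groups are assumed to contain septated and non-septated cells so that no margin is zero.

```python
import math

def chi_square_two_proportions(septated_a, total_a, septated_b, total_b):
    """Pearson chi-square (1 degree of freedom) for two septation indices.

    Returns the statistic and its p-value. No continuity correction is
    applied, and all row and column totals are assumed to be non-zero.
    """
    a, b = septated_a, total_a - septated_a      # group A: septated / not
    c, d = septated_b, total_b - septated_b      # group B: septated / not
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom the upper tail is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts: 20 of 200 cells septated vs 50 of 200.
chi2, p = chi_square_two_proportions(20, 200, 50, 200)
print(chi2, p)
```

Comparing several genotypes or conditions at once would call for a larger contingency table or a correction for multiple comparisons, which this sketch does not cover.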

      5) Related to the above comment, the manuscript does not adequately rule out the possibility that the decreased division size simply results from slower growth or delayed cell-cycle progression rather than a shift in the size-control mechanism. Measurements and normalizations of growth rate are required; without them, the interpretation remains speculative.*

      AUTHOR RESPONSE: We agree that changes in growth rate or altered cell cycle timing are important to consider. We have revised our text: “Changes in growth rate or cell cycle progression under stress may influence division size by altering mitotic regulator accumulation. Future studies measuring mitotic regulator dynamics alongside growth rates will be needed to distinguish direct changes in size control mechanisms from growth- or timing-dependent effects.”

      * 6) Regarding the phenotypes of wee1-2x cells, it is interesting that they increase the SA:Vol ratio under all stress conditions and show phenotypes distinct from cdr2Δ cells. From these observations, the authors claim that Cdr2 and Wee1 function as a surface-area-sensing module that complements the volume-sensing and time-sensing pathways to maintain geometric homeostasis. To support this interpretation, the authors could consider additional experiments, such as analyzing cdr2Δ + wee1-2x cells under the same stress conditions. Such data would test whether increased Wee1 can rescue or modify the cdr2Δ phenotype, providing functional evidence for the proposed Cdr2-Wee1-Cdk1 regulatory relationship. Measurements of cell length, width, SA:Vol ratio, and, if feasible, Cdk1 activity markers in the strain would greatly strengthen the mechanistic claims.*

      AUTHOR RESPONSE: We thank the reviewer for this insightful suggestion. While analysis of a cdr2Δ wee1-2x strain could provide additional mechanistic detail, such experiments address a distinct question beyond the scope of our current study, which focuses on how cell geometry changes under different stress conditions in cells with perturbed surface area-, volume-, or time-sensing pathways. Our conclusions regarding a surface area-sensing role for Cdr2-Wee1 signaling are based on previous studies (Pan et al., 2014; Facchetti et al., 2019; Miller et al., 2023) and the cell geometry phenotypes we observe of cdr2Δ and wee1-2x cells under stress conditions. *

      Minor comments 1) The manuscript focuses on adaptation through changes in the surface-to-volume ratio; however, only the ratio is shown. Presenting the underlying values of surface area and volume would clarify which geometric parameter primarily contributes to the observed changes.*

      AUTHOR RESPONSE: Please see our response to Reviewer 1 major comment 1.*

      *2) Statistical analysis for Fig. S2 should be provided.

      AUTHOR RESPONSE: We have completed this. See revised Figure S2 and methods.*

      3) The paper by Kellogg and Levin 2022 is missing from the reference list.*

      AUTHOR RESPONSE: Thank you for catching this. This reference has now been added. *

      **Referees cross-commenting**

      After reading the other reviewers' reports, I recognize that focal points differ, but they appear sequential rather than contradictory.

      Reviewer 2 raises concerns regarding the surface area/volume calculations, which, if incorrect, would influence many of the quantitative conclusions. I agree that confirming the validity of these calculations (and recalculating if necessary) should be the top priority before evaluating the biological interpretations.

      Reviewer 1 raises more mechanistic biological questions. These are certainly important, but in my view they depend on the robustness of the quantitative analysis highlighted by Reviewer 2.

      Therefore, I regard the reports as complementary rather than conflicting. Once the analytical issue pointed out by Reviewer 2 is resolved, the field will be in a better position to assess the significance of the mechanistic points raised by Reviewer 1 (as well as those in my own report).

      Reviewer #3 (Significance (Required)):

      General assessment One of the major strengths of this manuscript is its quantitative, side-by-side comparison of multiple environmental stresses under a unified experimental and analytical framework. The authors provide well-controlled morphometric measurements, allowing direct comparison of geometry changes that would otherwise be difficult to evaluate across studies. The observation that different stress types generate distinct geometric outcomes is particularly intriguing and has the potential to stimulate new conceptual thinking in the field of size control. However, the strength of the conceptual conclusion is currently limited by several aspects of the experimental design and interpretation. In particular, it remains unclear whether the observed geometry changes represent active adaptive responses rather than non-specific consequences of prolonged or strong stress exposure. Demonstrating whether geometry remodeling provides a fitness advantage, clarifying whether the changes reach a steady state rather than reflecting slow drift over time, or identifying upstream stress pathways that govern the response would substantially strengthen the conceptual advance. Even if additional mechanistic or fitness-related data cannot be added, refining the interpretation so that it remains aligned with the present evidence will enhance the clarity and impact of the study.

      Advance Previous studies, including the 2023 publication by the James B. Moseley group, established that fission yeast integrates distinct size-control pathways related to surface area, volume, and time under normal growth conditions. The present manuscript extends this line of work to stressed environments and argues that each stress condition elicits a distinct size-control pattern. To our knowledge, a systematic comparison of cell geometry across multiple stress types in the context of size-control pathways has not been reported, and this represents a potentially valuable conceptual advance. The advance is primarily phenomenological and conceptual rather than mechanistic: the work presents new correlations between stress types and geometry but does not yet elucidate the pathways governing these responses or demonstrate a functional advantage. With additional evidence, or with qualifiers ensuring that claims match the current data, the study could make an important contribution to understanding how cells integrate environmental cues into size-control strategies.

      Audience Although the primary audience consists of researchers in the fields of cell growth, cell-cycle control, and stress responses in yeast, the conceptual contribution may interest broader fields such as growth homeostasis, metabolic adaptation, and pathological cell size changes in higher eukaryotes. Beyond yeast biology, the modular view of size regulation proposed here may inspire new investigations in stem cell biology, cancer research, and biotechnology where environmental adaptation and cell size are closely linked.

      Expertise: nuclear morphology; cell morphology; cell growth; cell cycle; cytoskeleton*

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      This manuscript reports that fission yeast cells exhibit distinct cell size and geometry when exposed to osmotic, oxidative, or low-glucose stress. Based on quantitative measurements of cell length and width, the authors propose that different stress conditions trigger specific 'geometric adaptation' patterns, suggesting that cell size homeostasis is flexibly modulated depending on environmental cues. The study provides phenotypic evidence that multiple environmental stresses lead to distinct outcomes in the balance between cell surface area and volume, which the authors interpret as stress-specific modes of size control.

      Major comments

      1) The authors define the 48-hour time point as the 'long-term response', but no justification is provided for why 48 hours represents a physiologically relevant adaptation phase. It is unclear whether the size-control mode has stabilized by that time, or whether it may continue to change afterward. At minimum, the authors should provide a rationale (e.g., growth recovery dynamics, transcriptional adaptation plateau, or pilot time-course observations) to demonstrate that 48 hours corresponds to the steady-state adaptive phase rather than an arbitrarily selected time point.

      2) Related to the above comment, the authors propose that different stresses lead to distinct cell size adaptations, yet the rationale for the chosen stress intensities and exposure times is insufficiently described. It remains unclear whether the osmotic, oxidative, and low-glucose conditions used here induce comparable levels of cellular stress. Dose-response and time-course analyses would greatly strengthen the conclusions. Without such analyses, it is difficult to support the interpretation that geometry modulation represents a direct adaptive response.

      3) The authors describe stress-induced size changes as an 'adaptive' response. While this is an appealing hypothesis, the presented data do not demonstrate that the change in cell size itself confers a fitness advantage. Evidence showing that blocking the size change reduces stress survival, or that the altered size improves growth recovery, would be required to support this claim. Without such data, the use of the term 'geometric adaptation' seems overstated.

      4) The authors conclude that mutants exhibit no major defects in growth or viability during 48-hour stress exposure based on comparable septation index values (Fig. S2). However, septation index alone does not fully capture growth performance or cell-cycle progression and is not sufficient to support claims regarding fitness or robustness of proliferation. If the authors intend to make statements about 'growth', 'viability', or 'cell-cycle progression', additional quantitative measures (e.g., growth curves, doubling time, colony-forming units, or microcolony growth measurements) would be necessary. Alternatively, the claims should be toned down to align with the measurements currently provided.

      5) Related to the above comment, the manuscript does not adequately rule out the possibility that the decreased division size simply results from slower growth or delayed cell-cycle progression rather than a shift in the size-control mechanism. Measurements and normalizations of growth rate are required; without them, the interpretation remains speculative.

      6) Regarding the phenotypes of wee1-2x cells, it is interesting that they increase the SA:Vol ratio under all stress conditions and show phenotypes distinct from cdr2Δ cells. From these observations, the authors claim that Cdr2 and Wee1 function as a surface-area-sensing module that complements the volume-sensing and time-sensing pathways to maintain geometric homeostasis. To support this interpretation, the authors could consider additional experiments, such as analyzing cdr2Δ + wee1-2x cells under the same stress conditions. Such data would test whether increased Wee1 can rescue or modify the cdr2Δ phenotype, providing functional evidence for the proposed Cdr2-Wee1-Cdk1 regulatory relationship. Measurements of cell length, width, SA:Vol ratio, and, if feasible, Cdk1 activity markers in the strain would greatly strengthen the mechanistic claims.

      Minor comments

      1) The manuscript focuses on adaptation through changes in the surface-to-volume ratio; however, only the ratio is shown. Presenting the underlying values of surface area and volume would clarify which geometric parameter primarily contributes to the observed changes.

      2) Statistical analysis for Fig. S2 should be provided.

      3) The paper by Kellogg and Levin 2022 is missing from the reference list.

      Referees cross-commenting

      After reading the other reviewers' reports, I recognize that focal points differ, but they appear sequential rather than contradictory.

      Reviewer 2 raises concerns regarding the surface area/volume calculations, which, if incorrect, would influence many of the quantitative conclusions. I agree that confirming the validity of these calculations (and recalculating if necessary) should be the top priority before evaluating the biological interpretations.

      Reviewer 1 raises more mechanistic biological questions. These are certainly important, but in my view they depend on the robustness of the quantitative analysis highlighted by Reviewer 2.

      Therefore, I regard the reports as complementary rather than conflicting. Once the analytical issue pointed out by Reviewer 2 is resolved, the field will be in a better position to assess the significance of the mechanistic points raised by Reviewer 1 (as well as those in my own report).

      Significance

      General assessment

      One of the major strengths of this manuscript is its quantitative, side-by-side comparison of multiple environmental stresses under a unified experimental and analytical framework. The authors provide well-controlled morphometric measurements, allowing direct comparison of geometry changes that would otherwise be difficult to evaluate across studies. The observation that different stress types generate distinct geometric outcomes is particularly intriguing and has the potential to stimulate new conceptual thinking in the field of size control. However, the strength of the conceptual conclusion is currently limited by several aspects of the experimental design and interpretation. In particular, it remains unclear whether the observed geometry changes represent active adaptive responses rather than non-specific consequences of prolonged or strong stress exposure. Demonstrating whether geometry remodeling provides a fitness advantage, clarifying whether the changes reach a steady state rather than reflecting slow drift over time, or identifying upstream stress pathways that govern the response would substantially strengthen the conceptual advance. Even if additional mechanistic or fitness-related data cannot be added, refining the interpretation so that it remains aligned with the present evidence will enhance the clarity and impact of the study.

      Advance

      Previous studies, including the 2023 publication by the James B. Moseley group, established that fission yeast integrates distinct size-control pathways related to surface area, volume, and time under normal growth conditions. The present manuscript extends this line of work to stressed environments and argues that each stress condition elicits a distinct size-control pattern. To our knowledge, a systematic comparison of cell geometry across multiple stress types in the context of size-control pathways has not been reported, and this represents a potentially valuable conceptual advance. The advance is primarily phenomenological and conceptual rather than mechanistic: the work presents new correlations between stress types and geometry but does not yet elucidate the pathways governing these responses or demonstrate a functional advantage. With additional evidence, or with qualifiers ensuring that claims match the current data, the study could make an important contribution to understanding how cells integrate environmental cues into size-control strategies.

      Audience

      Although the primary audience consists of researchers in the fields of cell growth, cell-cycle control, and stress responses in yeast, the conceptual contribution may interest broader fields such as growth homeostasis, metabolic adaptation, and pathological cell size changes in higher eukaryotes. Beyond yeast biology, the modular view of size regulation proposed here may inspire new investigations in stem cell biology, cancer research, and biotechnology where environmental adaptation and cell size are closely linked.

      Expertise: nuclear morphology; cell morphology; cell growth; cell cycle; cytoskeleton.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Cabral et al. present an analysis of the effects of environmental stress on cellular geometry in the fission yeast S. pombe. The stresses they study (oxidative, osmotic and nutritional) have previously been shown to affect cell size in fission yeast. Here, the authors do a more sophisticated analysis, measuring surface area as well as volume (for which length had previously been used as a proxy, assuming fission yeast cells are cylinders of constant width). In addition, they investigate the effect of mutations in three cell-cycle control proteins that have been proposed to regulate cell geometry: Cdc13, Cdc25 and Cdr2. It is an interesting study that could provide insight into cell-size control and environmental-stress response in fission yeast. However, I have serious concerns about the analysis of the data. In fact, as I was writing up my concerns, I noticed that the formulae in Table 1 for surface area and volume are incorrect, so the whole paper appears to require reanalysis.

      One general problem is that the authors seem to confuse statistical significance with biological significance. They claim that both oxidative and osmotic stress cause a reduction in SA/V ratio. For oxidative stress, the difference is evident, but the control and KCl-treated cells appear to have indistinguishable distributions. Perhaps there is a statistically significant difference between them, but I am skeptical. (I would ask for the data table to try out the stats myself, but given the revelation below that the numbers will all need to be recalculated, that point is moot.) In any case, the difference is certainly not biologically significant.

      Likewise, in Figure 2B, they interpret tiny changes in the SA/V ratio. By my estimation, the difference between control and osmotic stress is only 2% (1.195/1.17), less than the wild-type case, which appears to be twice that (which is still pretty modest). The small amplitude of these changes is obscured by the fact that the graphs do not have a baseline at zero, which, as a matter of good data-presentation practice, they should.

      I am also concerned about the use of manual measurement of width at a single point along the cell. This approach is very sensitive to the choice of width point and to non-cylindrical geometries, several of which are evident in the images presented. MATLAB will return the ??? as well as the length from a mask, but even better, one can more accurately calculate the surface area and volume by assuming rotational symmetry of the mask. Given that the surface area and volume calculations need to be redone anyway, as discussed below, I encourage the authors to calculate them directly from the mask, instead of using the cylindrical assumption.
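
      As a rough illustration of the rotational-symmetry idea, the sketch below sums conical frusta along a radius profile. It is a minimal example under stated assumptions, not the authors' pipeline or a MATLAB routine: the function names `revolve_profile` and `capsule_profile` are hypothetical, the radius profile is assumed to have been extracted already from a segmentation mask (for example, half the mask width at evenly spaced positions along the long axis), and the dimensions in the usage lines are illustrative only.

```python
import math


def revolve_profile(radii, dx):
    """Return (surface_area, volume) of a solid of revolution.

    radii -- outline radius sampled at evenly spaced points along the long axis
    dx    -- spacing between neighbouring samples (same units as radii)

    Each pair of neighbouring samples is treated as a conical frustum. The
    profile is assumed to start and end at radius 0 (the cell tips); flat
    end faces are not added.
    """
    area = 0.0
    volume = 0.0
    for r1, r2 in zip(radii, radii[1:]):
        slant = math.hypot(dx, r2 - r1)          # slanted side of the frustum
        area += math.pi * (r1 + r2) * slant      # lateral area of the frustum
        volume += math.pi * dx * (r1 * r1 + r1 * r2 + r2 * r2) / 3.0
    return area, volume


def capsule_profile(length, width, n=2000):
    """Hypothetical test profile: an ideal capsule sampled at n + 1 points."""
    r = width / 2.0
    dx = length / n
    radii = []
    for i in range(n + 1):
        x = i * dx
        if x < r:                                # first hemispherical cap
            d = r - x
        elif x > length - r:                     # second hemispherical cap
            d = x - (length - r)
        else:                                    # cylindrical mid-section
            d = 0.0
        radii.append(math.sqrt(max(r * r - d * d, 0.0)))
    return radii, dx


if __name__ == "__main__":
    # Illustrative dimensions in micrometres, not measured data.
    radii, dx = capsule_profile(14.0, 3.6)
    area, volume = revolve_profile(radii, dx)
    print(f"SA = {area:.2f}, V = {volume:.2f}, SA/V = {area / volume:.3f}")
```

      Running the sketch on an ideal capsule is a convenient sanity check, because the result can be compared with the closed-form capsule values; for a real mask, the same summation would follow whatever bulges or tapers the outline contains.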

      The authors also need to be more careful about their claims about size-dependent scaling. The concentration of both Cdc13 and Cdc25 scale with size (perhaps indirectly, in the case of Cdc13), but Cdr2 does not. Cdr2 activity has been proposed to scale with size, and its density at cortical nodes has been reported to scale with size, although that claim has been challenged <https://pubmed.ncbi.nlm.nih.gov/36093997>.

      Even taking the authors' approach at face value, there are observations that do not seem to make sense, which led me to realize that the wrong formulae were used to calculate surface area and volume.

      In Figure 1E,F, the KCl-treated cells get shorter and wider; surely, that should result in a lower SA/V ratio. However, as noted above, in Figure 1D, they are shown to have a similar ratio. As a sanity check, I eye-balled the numbers off of the figure (control: 14 µm x 3.6 µm and KCl: 11 µm x 3.8 µm) and calculated their surface area and volume using the formula for a capsule (i.e., a cylinder with hemispheric ends).

      SA = the surface area of the two hemispheres + the surface area of the cylinder in between = 4·π·(width/2)^2 + π·width·(length − width). The (length − width) term gives the side length of the capsule (length without the hemispheres) from the full length of the capsule (length including the hemispheres).

      V = the volume of the two hemispheres + the volume of the cylinder in between = (4/3)·π·(width/2)^3 + π·(width/2)^2·(length − width).

      I got SA/V ratios of around 2, which are way off from what is presented in Figure 1D, but my calculated ratio goes down in KCl, as expected, but not as reported.
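
      The capsule formulae above can be written as a short script for repeating this kind of sanity check. This is a minimal sketch, not the referee's or the authors' code: the function names are hypothetical, and the inputs are the eyeballed dimensions quoted above, so any output inherits their uncertainty.

```python
import math


def capsule_surface_area(length, width):
    """Two hemispherical ends plus the cylindrical side between them."""
    r = width / 2.0
    return 4.0 * math.pi * r ** 2 + math.pi * width * (length - width)


def capsule_volume(length, width):
    """Two hemispherical ends plus the cylinder between them."""
    r = width / 2.0
    return (4.0 / 3.0) * math.pi * r ** 3 + math.pi * r ** 2 * (length - width)


def sa_to_vol(length, width):
    """Surface-area-to-volume ratio of a capsule (units: 1/length)."""
    return capsule_surface_area(length, width) / capsule_volume(length, width)


if __name__ == "__main__":
    # Eyeballed (length, width) in micrometres, as quoted in the text above.
    for label, length, width in [("control", 14.0, 3.6), ("KCl", 11.0, 3.8)]:
        print(label, round(sa_to_vol(length, width), 3))
```

      Here `length` is the full cell length including both hemispherical ends, matching the definition used in the formulae above.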

      To make sure I was not doing something wrong, I was going to repeat my calculations with the formulae in Table 1, which made me realize both are incorrect. The stated formula for the cell surface area, 2πRL, only represents the surface area of the cylindrical side of the cell, not its hemispherical ends. And it is not even the correct formula for the surface area of the side, because that calls for L to be the length of the side (without the hemispherical ends), not the length of the cell (which includes the hemispherical ends). L here is stated to be cell length (which is what is normally measured in the field, and which is consistent with the reported length of control cells in Figure 1E being 14 µm). The formula for the volume of a capsule in the form used in Table 1 (volume of a cylinder of length L minus the volume excluded from the hemispherical ends) is πR^2L − (2π − (4/3)π)R^3.

      Given these problems, I think I spent too much time thinking about the rest of the paper, because all of the calculations, and perhaps their interpretations, need to be redone.

      Minor Points:

      Strains should be identified by strain number in the text and figure legends.

      In the Introduction, "Most cell control their size" should be "Most eukaryotic cell control their size".

      Significance

      Nothing to add.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Using genetics and microscopy approaches, Cabral et al. investigate how fission yeast regulates its length and width in response to osmotic, oxidative, or low glucose stress. Miller et al. have recently found that the cell cycle regulators Cdc25, Cdc13 and Cdr2 integrate information about cell volume, time and cell surface area into the cellular decision of when to divide. Cabral et al. now build on this work and test how disruption of these regulators affects cell size adaptation. They find that each stress condition shows a distinct dependence on the individual regulators, suggesting that the complex size control network enables optimized size adaptation for each condition. Overall, the manuscript is clear and the detailed methods ensure that the experiments can be replicated.

      Major comments:

      1. It would be much easier to follow the authors' conclusions if, in addition to surface area to volume ratio, length and width, they also plotted cell volume at division in Figs. 1-4.
      2. To me, it seems that maybe even more than upon osmotic stress, the cdc13-2x strain differs qualitatively from WT in low glucose conditions, where the increased SA-V ratio is almost completely abolished.
      3. It is not entirely clear to me why two copies of Cdc13 would qualitatively affect the responses. Shouldn't the extra copy behave similarly to the endogenous one and therefore only lead to quantitative changes? Maybe the authors can discuss this more clearly or even test a strain in which Cdc13 function is qualitatively disrupted.
      4. I don't see why the authors come to the conclusion that under osmotic stress cells would maximize cell volume. It leads to a decreased cell length, doesn't it?

      Significance

      Fission yeast has long been used as a model for eukaryotic cell size regulation. So far, this research has been mostly focused on steady state size regulation. While it has long been clear that cells across organisms adapt their size in response to environmental changes, little is known about how these external inputs are processed through the size control network. Dissecting how disruption of the various branches of the size control network affects size adaptation is an important step towards a mechanistic understanding of this process. Future studies will have to build on these observations and investigate how each stress mechanistically affects the respective regulator(s). While the details of the molecular players and their contribution to size adaptation are likely specific to fission yeast, the concept of stress type-specific size adaptation that is mediated through different regulators is likely conserved and thus of broader relevance.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      We sincerely appreciate the referees' feedback, attention to detail and timeliness in reviewing our manuscript. Below, we provide a point-by-point response to all comments from the referees, detailing the changes we have already made and those that are in progress. Referees' comments appear in bold text, while our responses are unbolded. Any text quoted directly from the manuscript is italicised and contained within "quotation marks". Additionally, we have grouped all comments into four categories (structural changes, minor text changes, experimental changes, figure changes); comments are numbered 1-n within each category. Please note: this response to the reviewers' comments included some images that cannot be embedded in this text-only section.

      1. General Statements

      We appreciate the overall highly positive and enthusiastic comments from all reviewers, who clearly appreciated the technical difficulty of this study and noted, amongst other things, that this study represents "a major contribution to the future advancement of oocyst-sporozoite biology" and described the development of the segmentation score for oocysts as a "major advance[ment]". We apologise for the omission of line numbers on the document sent to reviewers; we removed these for the bioRxiv submission without considering that this PDF would be transferred across to Review Commons.

      We have responded to all reviewers' comments through a variety of text changes, experimental inclusions, or direct responses to queries. Significant changes to the manuscript since initial submission are as follows:

      1. Refinement of rhoptry biogenesis model: Reviewers requested more detail around the content of the AORs, which we had previously suggested were a vehicle for rhoptry biogenesis because we saw that they carried the rhoptry neck protein RON4. We first attempted to address this using antibodies against rhoptry bulb proteins, but were unsuccessful. We then developed a *P. berghei* line in which the rhoptry bulb protein RhopH3 was GFP-tagged. Using this parasite line, we observed that the earliest rhoptry-like structure, which we had previously interpreted as an AOR, contained RhopH3. By contrast, RhopH3 was absent from AORs. Reflecting these observations, we have renamed this initial structure the 'pre-rhoptry' and suggested a model for rhoptry biogenesis in which rhoptry neck cargo are trafficked via the AOR but rhoptry bulb cargo are trafficked by small vesicles that move along the rootlet fibre (previously observed by EM).
      2. Measurement of rhoptry neck vs bulb: While not directly suggested by the reviewers, we have also included an analysis that estimates the proportion of the sporozoite rhoptry that represents the rhoptry neck. In contrast to merozoite rhoptries, which we show consist overwhelmingly of the rhoptry bulb, the vast majority of the sporozoite rhoptry is rhoptry neck.
      3. Measurement of subpellicular microtubules: One reviewer asked if we could measure the length of subpellicular microtubules, which we had previously observed to be longer on one side of the sporozoite than the other. We have now provided absolute and relative (% sporozoite length) length measurements for these subpellicular microtubules and also calculated the proportion of the microtubule that is polyglutamylated.
      4. More detailed analysis of RON11cKD rhoptries: Multiple comments suggested a more detailed analysis of the rhoptries that were formed/not formed in RON11cKD. We have included an updated analysis that shows the relative position of these rhoptries in sporozoites.

      2. Point-by-point description of the revisions

      Reviewer #1

      Minor text changes (Reviewer #1)

      1. __Text on page 12 could be condensed to highlight the new data of ron4 staining of the AOR.__

      We agree that this is a reasonable suggestion. Now that we have obtained additional data on the contents of the AOR (as described in General Statements #1), this section has been significantly rewritten to highlight these findings.

      2. __Add reference on page 3 after 'disrupted parasites'__

      This sentence has been rewritten slightly with some references included and now reads: "Most data on these processes comes from electron microscopy studies6-8, with relatively few functional reports on gene deleted or disrupted parasites9-11."

      3. __Change 'the basal complex at the leading edge' - this seems counterintuitive__

      This change has been made.

      4. __Change 'mechanisms underlying SG are poorly' - what mechanisms? of invasion or infection?__

      This was supposed to read "SG invasion" and has now been fixed.

      5. __On page 4: 'handful of proteins'__

      This error has been corrected.

      6. __What are the 'three microtubule spindle structures'?__

      The three microtubule spindle structures (hemispindle, mitotic spindle, and interpolar spindle) are now listed explicitly in the text.

      7. __On page 5: 'little is known' - please describe what is known, also in other stages. At the end of the paper I would like to know what is the key difference to rhoptry function in other stages?__

      The following sentence already detailed that we had recently used U-ExM to visualise rhoptry biogenesis in blood-stage parasites, but the following two sentences have been added to provide extra detail on these findings: "In that study, we defined the timing of rhoptry biogenesis, showing that it began prior to cytokinesis and completed approximately coincident with the final round of mitosis. Additionally, we observed that rhoptry duplication and inheritance were coupled with centriolar plaque duplication and nuclear fission."

      8. __change 'rhoptries golgi-derived, made de novo'__

      This has been fixed.

      9. __change 'new understand to'__

      This change has been made.

      10. __'rhoptry malformations' seem to be similar in sporozoites and merozoites. Is that surprising/new?__

      We assume this is in reference to the mention of "rhoptry malformations" in the abstract. In the RON11 merozoite study (PMID:39292724) the authors noted no gross rhoptry malformations, only that one was not formed/missing. The abstract sentence has been changed to the following to better reflect this nuance: "*We show that stage-specific disruption of RON11 leads to the formation of sporozoites that contain only half the number of rhoptries of controls, as in merozoites; however, unlike in merozoites, the majority of rhoptries appear grossly malformed.*"

      11. __What is known about crossing the basal lamina? Were rhoptries thought to be involved in this process? Or is it proteins on the surface or in other secretory organelles?__

      We are unaware of any studies that specifically look at sporozoites crossing the SG basal lamina. A review, although now ~15 years old, stated that "No information is available as to how the sporozoites traverse the basal lamina" (PMID:19608457), and we are not aware of any further information since then. To try to better define our understanding of rhoptry secretion during SG invasion, we have added the following sentence:

      "It is currently unclear precisely when during these steps of SG invasion rhoptry proteins are required, but rhoptry secretion is thought to begin in the haemolymph before SG invasion16."

      12. __On page change/specify: 'wide range of parasite structures'__

      The structures observed have been listed: centriolar plaque, rhoptry, apical polar rings, rootlet fibre, basal complex, apicoplast.

      13. __On page 7: is Airyscan2 a particular method or a specific microscope?__

      Airyscan2 is a detector setup on Zeiss LSM microscopes; this was already detailed in the materials and methods sections, but figure legends have been clarified to read: "...imaged by an LSM900 microscope with an Airyscan2 detector".

      14. __How large is RON11?__

      RON11 is 112 kDa in *P. berghei*, as noted in the text.

      15. __There is no causal link between ookinete invasion and oocyst developmental asynchrony__

      We have deleted the sentence that implied that ookinete invasion was responsible for oocyst asynchrony. This section now simply states that "Development of each oocyst within a midgut is asynchronous..."

      16. __First sentence of page 24 appears to contradict what is written in results__ __I don't understand the first two sentences in the paragraph titled Comparison between Plasmodium spp__

      This sentence was worded confusingly, making it appear contradictory when that was not the intention. The sentence has been changed to more clearly support what is written in the discussion and now reads: "Our extensive analysis only found one additional ultrastructural difference between Plasmodium spp."

      17. __On page 25 or before, the vast number of electron microscopy studies should be discussed and compared with the authors' new data.__

      It is not entirely clear which new data should be specifically discussed based on this comment. However, we have added a new paragraph that broadly compares MoTissU-ExM and our findings with other imaging methods previously used on mosquito-stage malaria parasites:

      "*Comparison of MoTissU-ExM and other imaging modalities

      Prior to the development of MoTissU-ExM, imaging of mosquito-stage malaria parasites in situ had been performed using electron microscopy7,8,11,28, conventional immunofluorescence assays (IFA)10, and live-cell microscopy25. MoTissU-ExM offers significant advantages over electron microscopy techniques, especially volume electron microscopy, in terms of accessibility, throughput, and detection of multiple targets. While we have benchmarked many of our observations against previous electron microscopy studies, the intracellular detail that can be observed by MoTissU-ExM is not as clear as electron microscopy. For example, previous electron microscopy studies have observed Golgi-derived vesicles trafficking along the rootlet fibre8 and distinguished the apical polar rings44; both of which we could not observe using MoTissU-ExM. Compared to conventional IFA, MoTissU-ExM dramatically improves the number and detail of parasite structures/organelles that can be visualised while maintaining the flexibility of target detection. By contrast, it can be difficult or impossible to reliably quantify fluorescence intensity in samples prepared by expansion microscopy, something that is routine for conventional IFA. For studying temporally complex processes, live-cell microscopy is the 'gold-standard' and there are some processes that fundamentally cannot be studied or observed in fixed cells. We attempt to increase the utility of MoTissU-ExM in discerning temporal relationships through the development of the segmentation score but note that this cannot be applied to the majority of oocyst development. Collectively, MoTissU-ExM offers some benefits over these previously applied techniques but does not replace them and instead serves as a novel and complementary tool in studying the cell biology of mosquito-stage malaria parasites.**"

      *

      __First sentence on page 27: there are many studies on parasite proteins involved in salivary gland invasion that could be mentioned/discussed. __

      The sentence in question is "To the best of our knowledge, the ability of sporozoites to cross the basal lamina and accumulate in the SG intercellular space has never previously been reported."

      This sentence has now been changed to read as follows: "While numerous studies have characterized proteins whose disruption inhibited SG invasion9,10,15,59-63, to the best of our knowledge the ability of sporozoites to cross the basal lamina and accumulate in the SG intercellular space has never previously been reported."

      __On page 10 I suggest to qualify the statement 'oocyst development has typcially been inferred by'. There seem a few studies that show that size doesn't reflect maturation. __

      In our opinion, this statement is already qualified in the following sentence, which reads: "Recent studies have shown that while oocysts increase in size initially, their size eventually plateaus (11 days post infection (dpi) in P. falciparum4)."

      __On page 16 the authors state that different rhoptries might have different function. This is an interesting hypothesis/result that could be mentioned in the abstract. __

      The abstract already contains the following statement: "...and provide the first evidence that rhoptry pairs are specialised for different invasion events." We see this as an equivalent statement.


      Experimental changes (Reviewer #1)

      1. On page 19: do the parasites with the RON11 knockout only have the cytoplasmic or only the apical rhoptries?

      The answer to this is not completely clear. We have added the following data to Figures 6 and 8 where we quantify the proportion of rhoptries that are either apical or cytoplasmic: In both wildtype parasites and RON11ctrl parasites, oocyst spz rhoptries are roughly 50:50 apical:cytoplasmic (with a small but consistent majority apical), while almost all rhoptries are found at the apical end (>90%) in SG spz. Presumably, after the initial apical rhoptries are 'used up' during SG invasion, the rhoptries that were previously cytoplasmic take their place. In RON11cKD the ratio of apical:cytoplasmic rhoptries is fairly similar to control oocyst spz. In RON11cKD SG spz, the proportion of cytoplasmic rhoptries decreases but not to the same extent as in wildtype or RON11Ctrl. From this, we infer that the two rhoptries that are lost/not made in RON11cKD sporozoites are likely a combination of both the apical and cytoplasmic rhoptries we find in control sporozoites.

      __in panel G: Are the dense granules not micronemes? What are the dark lines? Rhoptries?? __

      We have labelled all of Figure 1 more clearly to point out that the 'dark lines' are indeed rhoptries. Additionally, we have renamed the 'protein-dense granules' to 'protein-rich granules', as the original name appeared to suggest that these structures are dense granules, the secretory organelle. At this stage we simply do not know what all of these granules are. The observation that some but not all of these granules contain CSP (Supplementary Figure 2) suggests that they may represent heterogeneous structures. It is indeed possible that some are micronemes; however, we think it is unlikely that they are all micronemes for a number of reasons: (1) micronemes are not nearly this protein dense in other Plasmodium lifecycle stages, (2) some of them carry CSP, which has not been demonstrated to be micronemal, (3) very few of these granules are present in SG sporozoites, which would be unexpected because microneme secretion is required for hepatocyte invasion.

      __Figure 2 seems to add little extra compared to the following figures and could in my view go to the supplement. __

      We agree that Figure 2b adds little and so have moved it to Supplementary Figure 2, but we think that the relative ease with which one can distinguish whether sporozoites are in the secretory cavity or the SG epithelial cell is a key observation because of the difficulty of doing this by conventional IFA.

      __On page 8 the authors mention a second layer of CSP but do not further investigate it. It is likely hard to investigate this further but to just let it stand as it is seems unsatisfactory, considering that CSP is the malaria vaccine. What happens if you add anti-CSP antibodies? I would suggest to shorten the opening paragraphs of this paper and to focus on the rhoptries. This could be done be toning down the text on all aspects that are not rhoptries and point to the open question some of the observations such as the CSP layers raise for future studies. __

      When writing the manuscript, we were unsure whether to include this data at all as it is a purely incidental finding. We had no intention of investigating CSP specifically, but anti-CSP antibodies were included in most of the salivary gland imaging experiments so we could more easily find sporozoites. Given the tremendous importance of CSP to the field, we figured that these observations were potentially important enough that they should be reported in the literature even though they are not something we have the intention or resources to investigate subsequently. Additionally, after consultation with other microscopists we think there is a reasonable chance that this double-layer effect could be a product of chemical fixation. To account for this, we have qualified the paragraph on CSP with this sentence:

      "We cannot determine if there is any functional significance of this second CSP layer and considering that it has not been observed previously it may well represent an artefact of chemical (paraformaldehyde) fixation."

      __Maybe include more detail of the differences between species on rhoptry structure into Figure 4. I would encourage to move the Data on rhoptries in Figure S6 to the main text ie to Figure 4. __

      We have moved the images of developing rhoptries in *P. falciparum* (previously Figure S6a and b) into Figure 4, which now looks as follows:

      Figure S8 (previously S6c) now consists only of the MG spz rhoptry quantification

      Manuscript structural changes (Reviewer #1)

      1. Abstract: don't focus on technique but on the questions you tried to answer (ie rewrite or delete the 3rd and 4th sentence)

      2. 'range of cell biology processes' - I understand the paper that the key discovery concerns rhoptry biogenesis and function, so focus on that, all other aspects appear rather peripheral.

      3. 'Much of this study focuses on the secretory organelles': I would suggest to rewrite the intro to focus solely on those, which yield interesting findings.

      4. Page 11: I am tempted to suggest the authors start their study with Figure 3 and add panel A from Figure 2 to it. This leads directly to their nice work on rhoptries. Other features reported in Figures 1 and 2 are comparatively less exciting and could be moved to the supplement or reported in a separate study.____ Page 23: I suggest to delete the first sentence and focus on the functional aspects and the discoveries.

      5. __Maybe add a conclusion section rather than a future application section, which reads as if you want to promoted the use of ultrastructure expansion microscopy. To my taste the technological advance is a bit overplayed considering the many applications of this techniques over the last years, especially in parasitology, where it seems widely used. In any case, please delete 'extraordinarily' __

      Response to Reviewer#1 manuscript structural changes 1-5: This reviewer considers the findings related to rhoptry biology as the most significant aspect of the study and suggests rewriting the manuscript to emphasize these findings specifically. Doing so might make the key findings easier to interpret. However, in our view, this approach could misrepresent how the study originated and what we see as the most important outcomes. We did not develop MoTissU-ExM specifically to investigate rhoptry biology. Instead, this technique was created independently of any particular biological question, and once established, we asked what questions it could answer, using rhoptry biology as a proof of concept. Given the authors' previous work and available resources, we chose to focus on rhoptry biology. Since this was driven by basic research rather than a specific hypothesis, it's important to acknowledge this in the manuscript. While we agree that the findings related to rhoptry biology are valuable, we believe that highlighting the technique's ability to observe organelles, structures, and phenotypes with unprecedented ease and detail is more important than emphasizing the rhoptry findings alone. For these reasons, we have decided not to restructure the manuscript as suggested.


      Reviewer #2

      Minor text changes (Reviewer #2)

      1. __The 'image Z-depth' value indicated in the figures is ambiguous. It is not clear whether this refers to the distance from the coverslip surface or the starting point of the z-stack image acquisition. A precise definition of this parameter would be beneficial. __

      In the legend of Figure 1, the image Z-depth has been clarified as "sum distance of Z-slices in max intensity projection".

      2. __Paragraph 3 of the introduction - line 7, "handful or proteins" should be handful of proteins __

      This has been corrected.

      3. __Paragraph 5 of the introduction - line 7, "also able to observed" should be observe __

      This has been changed.

      4. __In the final paragraph of the introduction - line 1, "leverage this new understand" should be understanding __

      This has been fixed.

      5. __The first paragraph of the discussion summary contains an incomplete sentence on line 7, "PbRON11ctrl-infected SGs." __

      This has been removed.

      6. __The second paragraph of the discussion - line 10, "until cytokinesis beings" should be begins __

      This mistake has been corrected.

      7. __One minor point that author suggest that oocyst diameter is not appropriate for the development of sporozoite develop. This is not so true as oocyst diameter tells between cell division and cell growth so it is important parameter especially where the proliferation with oocyst does not take place but the growth of oocyst takes place. __

      We agree that this was not highlighted enough in the text. The final sentence of the results section about this now reads:

      "While diameter is a useful readout for oocyst development in the early stages of its growth, this suggests that diameter is a poor readout for oocyst development once sporozoite formation has begun and highlights the usefulness of the segmentation score as an alternative.", and the final sentence of the discussion section about this now reads "Considering that oocyst size does not plateau until cytokinesis begins4, measuring oocyst diameter may represent a useful biological clock specifically when investigating the early stages of oocyst development."

      8. __How is the apical polarity different to merozoite as some conoid genes are present in ookinete and sporozoite but not in merozoite. __

      Our hypothesis is that apical polarity is established by the positioning and attachment of the centriolar plaque to the parasite plasma membrane in both forming merozoites and sporozoites. While the apical polar ring proteins are obviously present at the apical end, and have important functions, we think that they themselves are unlikely to regulate polarity establishment directly. Additionally, it seems that the apical polar rings are visible in forming sporozoites far before the comparable stages of merozoite formation. An important note here is that, at this point, these are largely inferences based on observational differences and there is relatively little functional data on proteins that regulate polarity establishment at any stage of the Plasmodium lifecycle.

      9. __Therefore, I think that electron microscopy remains essential for the observation of such ultra-fine structures __

      We have added a paragraph in the discussion that provides a clearer comparison between MoTissU-ExM and other imaging modalities previously applied to mosquito-stage parasites (see response to Reviewer#1 (Minor text changes) comment #17).

      10. __The author have not mentioned that sometimes the stage oocyst development is also dependent on the age of mosquito and it vary between different mosquito gut even if the blood feed is done on same day. __

      In our opinion this can be inferred through the more general statement that "development of each oocyst within a midgut is asynchronous..."


      Figure changes (Reviewer #2)

      1. __Fig 3B: stage 2 and 6 does not show the DNA cyan, it would-be good show the sate of DNA at that particular stage, especially at stage 2 when APR is visible. And box the segment in the parent picture whose subset is enlarged below it. __

      We completely agree with the reviewer that the stage 2 image would benefit from the addition of a DNA stain. Many of the images in Figure 3b were acquired from samples that did not have a DNA stain, and so in these *P. yoelii* samples we did not find examples of all segmentation scores with the DNA stain. Examples of segmentation score 2 and 6 for *P. berghei*, and 6 for *P. falciparum*, can be found with DNA stains in Figure S8.

      2. __For clarity, it would be helpful to add indicators for the centriolar plaques in Figure 1b, as their locations are not immediately obvious. __

      The CPs in Figure 1a and 1b have been circled on the NHS ester only panel for clarity.

      __Regarding Figure 1c, the authors state that 'the rootlet fiber is visible'. However, such a structure cannot be confirmed from the provided NHS ester image. Can the authors present a clearer image where the rootlet fibre is more distinct? Furthermore, please provide the basis for identifying this structure as a rootlet fiber based on the NHS ester observation alone. __

      The image in Figure 1c has been replaced with one that more clearly shows the rootlet fibre.

      Based on electron microscopy studies, the rootlet fibre has been defined as a protein dense structure that connects the centriolar plaque to the apical polar rings (PMID: 17908361). Through NHS ester and tubulin staining, we could identify the apical polar rings and centriolar plaque as sites on the apical end of the parasite and nucleus that microtubules are nucleated from. There is a protein dense fibre that connects these two structures. Based on the fact that the protein density of this structure was previously considered sufficient for its identification by electron microscopy, we consider its visualisation by NHS ester staining sufficient for its identification by U-ExM.

      __Fig 1B - could the tubulin image in the hemispindle panel be made brighter? __

      The tubulin staining in this panel was not saturated, and so this change has been made.

      __Fig 4A - the green text in the first image panel is not visible. Also, the cyan text in the 3rd image in Fig 1A is also difficult to see. There's a few places where this is the case __

      We have made all microscopy labels legible at least when printed in A4/Letter size.

      __Fig 6A - how do the authors know ron11 expression is reduced by 99%? Did they test this themselves or rely on data from the lab that gifted them the construct? Also please provide mention the number of oocyst and sporozoites were observed. __

      The way Figure 6a was previously designed and described was an oversight that wrongly suggested we had quantified a >99% reduction in *ron11* expression. The 99% reduction has been removed from Figure 6a and the corresponding part of the figure legend has been rewritten to emphasise that this was previously established:

      "(a) Schematic showing previously established Ron11Ctrl and Ron11cKD parasite lines where ron11 expression was reduced by >99%9."

      As to the second part of the question, we did not independently test either protein or RNA level expression of RON11, but we were gifted the clonal parasite lines established by Prof. Ishino's lab in PMID: 31247198 not just the genetic constructs.

      __Fig 6E - are the data point colours the wrong way round on this graph? Just looking at the graph it looks as though the RON11cKD has more rhoptries than the control which does not match what is said in the text. __

      Thank you for pointing out this mistake, the colours have now been corrected.

      __Fig S8C, PbRON11 ctrl, pie chart shows 89.7 % spz are present in the secretory cavity while the text shows 100 %, 35/35 __

      The text saying 100% (35/35) only considered salivary glands that were infected (i.e. uninfected SGs were removed from the count). The two sentences that report this data have been clarified to reflect this better:

      "Of *PbRON11ctrl SGs that were infected (35/39), 100% (35/35) contained sporozoites in the secretory cavity (Figure S8c). Conversely of infected PbRON11cKD SGs (59/82), only 24% (14/59) contained sporozoites within the secretory cavity (Figure S9d)."

      *
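The difference between the pie-chart values quoted by the reviewer and the percentages in the text comes down to the choice of denominator. The following is a minimal sketch of that arithmetic, using only the counts quoted above; the function and variable names are illustrative assumptions and are not taken from the manuscript.

```python
def percent(numerator: int, denominator: int) -> float:
    """Return a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)

# Counts quoted in the clarified sentences (SGs = salivary glands).
ctrl_total, ctrl_infected, ctrl_cavity = 39, 35, 35
ckd_total, ckd_infected, ckd_cavity = 82, 59, 14

# Denominator = all SGs examined (matches the pie-chart values cited by the reviewer).
ctrl_of_all = percent(ctrl_cavity, ctrl_total)
ckd_of_all = percent(ckd_cavity, ckd_total)

# Denominator = infected SGs only (matches the values reported in the text).
ctrl_of_infected = percent(ctrl_cavity, ctrl_infected)
ckd_of_infected = percent(ckd_cavity, ckd_infected)

print(ctrl_of_all, ctrl_of_infected)
print(ckd_of_all, ckd_of_infected)
```

With all SGs as the denominator the values are 89.7% and 17.1%; with infected SGs only they are 100% and 23.7% (reported as 24%).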

      __Fig S9D shows that RON11 ckd contains 17.1% sporozoites in secretory cavity while the text says 24%. __

      Please see the response to Reviewer#2 Figure Changes Comment #8 where this was addressed.


      Experimental changes (Reviewer #2)

      1. __Why do the congruent rhoptries have similar lengths to each other, while the dimorphic rhoptries have different lengths? Is this morphological difference related to the function of these rhoptries? __

      We hypothesise that this morphological difference arises because the congruent rhoptries are 'used' during SG invasion, while the dimorphic rhoptries are utilized during hepatocyte invasion. It is not straightforward to test this functionally at this point, as no protein is known to have differential localization between the two. Additionally, RON11 is likely directly involved in both SG and hepatocyte invasion through a secreted portion of the protein (as seen in RBC invasion). Therefore, RON11cKD sporozoites may have combined defects, meaning we cannot assume any defect is solely due to the absence of two rhoptries. Determining this functionally is of high interest to our research groups and remains an area of ongoing study, but it is beyond the scope of this study.

      2. Would it be possible to show whether RON11 localises to the dimorphic rhoptries, the congruent rhoptries, or both, by using expansion microscopy and a parasite line that expresses RON11 tagged with GFP or a peptide tag?

      We do not have access to a parasite line that expresses a tagged copy of RON11, or to anti-PbRON11 antibodies. Based on previously published localisation data, however, it seems likely that RON11 localises to both sets of rhoptries. Below are excerpts from Figure 1c of PMID: 31247198, where RON11 (in green) seems to have a more basally-extended localisation in midgut (MG) sporozoites than in salivary gland (SG) sporozoites. From this we infer that in the MG sporozoite RON11 is visible in both pairs of rhoptries, but in the SG sporozoite only in the one remaining pair.


      __The knockdown of RON11 disrupts the rhoptry structure, making the dimorphic and congruent rhoptries indistinguishable. Does this suggest that RON11 is important for the formation of both types of rhoptries? I believe that it would be crucial to confirm whether RON11 localises to all rhoptries or is restricted to specific rhoptries for a more precise discussion of RON11's function. __

      Based on our analysis, it does indeed seem that RON11 is important for both types of rhoptries, as when RON11 isn't expressed sporozoites still have both apical and cytoplasmic rhoptries (i.e. not just one pair is lost; see Reviewer #1 Experimental changes comment #1).

      __The authors state that 64% of RON11cKD SG sporozoites contained no rhoptries at all. Does this mean RON11cKD SG sporozoites used up all rhoptries corresponding to the dimorphic and congruent pairs during SG invasion? If so, this contradicts your claims that sporozoites are 'leaving the dimorphic rhoptries for hepatocyte invasion' and that 'rhoptry pairs are specialized for different invasion events'. If that is not the case, does it mean that RON11cKD sporozoites failed to form the rhoptries corresponding to the dimorphic pair? A more detailed discussion would be needed on this point and, as I mentioned above, on the specific role of RON11 in the formation of each rhoptry pair. __

      We do not agree that this constitutes a contradiction; instead, more nuance is needed to fully explain the phenotype. As shown in the new graph added in response to Reviewer#1 Figure changes comment #1 in RON11cKD oocyst sporozoites, 64% of all rhoptries are located at the apical end. Our hypothesis is that these rhoptries are used for SG invasion and, therefore, would not be present in RON11cKD SG sporozoites. Consequently, the fact that 64% of RON11cKD sporozoites lack rhoptries is exactly what we would expect. Essentially, we predict three slightly different 'pathways' for RON11cKD sporozoites: If they had 2 apical rhoptries in the oocyst, we predict they would have zero rhoptries in the SG. If they had 2 cytoplasmic rhoptries in the oocyst, we predict they would have two rhoptries in the SG. If they had one apical and one cytoplasmic rhoptry in the oocyst, we predict they would have one rhoptry in the SG. In any case, we expect the apical rhoptries to be 'used up,' which appears to be supported by the data.
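The three predicted 'pathways' can be written out as a simple rule. This is only an illustrative sketch of the hypothesis as stated above (apical rhoptries are used up during SG invasion, cytoplasmic rhoptries are retained); the function name and its arguments are hypothetical, and it is not an analysis performed in the manuscript.

```python
def predicted_sg_rhoptries(apical_in_oocyst: int, cytoplasmic_in_oocyst: int) -> int:
    """Predicted rhoptry count in an SG sporozoite, given its oocyst-stage rhoptries.

    Assumption (the hypothesis in the response): every apical rhoptry is
    discharged during SG invasion and every cytoplasmic rhoptry is retained.
    """
    if apical_in_oocyst < 0 or cytoplasmic_in_oocyst < 0:
        raise ValueError("rhoptry counts must be non-negative")
    # Apical rhoptries are 'used up', so only cytoplasmic ones remain.
    return cytoplasmic_in_oocyst

# The three cases described for RON11cKD sporozoites, which carry two rhoptries.
for apical, cytoplasmic in [(2, 0), (0, 2), (1, 1)]:
    remaining = predicted_sg_rhoptries(apical, cytoplasmic)
    print(f"{apical} apical + {cytoplasmic} cytoplasmic -> {remaining} in SG")
```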

      __Out of pure curiosity, is it possible to measure the length and number of subpellicular microtubules in the sporozoites observed in this study using expansion microscopy? __

      We have performed an analysis of subpellicular microtubules which is now included as Supplementary Figure 2. We could not always distinguish every SPMT from each other and so have not quantified SPMT number. We have, however, quantified their absolute length on both the 'long side' and 'short side', their relative length (as % sporozoite length) and the degree to which they are polyglutamylated.

      A description of this analysis is now found in the results section as follows: "*We quantified the length and degree of polyglutamylation of SPMTs on the 'long side' and 'short side' of the sporozoite (Figure S2). 'Short side' SPMTs were on average 33% shorter (mean = 3.6 µm ± SD 1.0 µm) than 'long side' SPMTs (mean = 5.3 µm ± SD 1.5 µm) and extended 17.4% less of the total sporozoite length. While 'short side' SPMTs were significantly shorter, a greater proportion of their length (87.9% ± SD 11.2%) was polyglutamylated compared to 'long side' SPMTs (69.4% ± SD 13.8%)." *

      Supplementary Figure 2: Analysis of sporozoite subpellicular microtubules. Isolated P. yoelii salivary gland sporozoites were prepared by U-ExM and stained with anti-tubulin (microtubules) and anti-PolyE (polyglutamylated SPMTs) antibodies. SPMTs were defined as being on either the 'long side' (nucleus distant from plasma membrane) or 'short side' (nucleus close to plasma membrane) of the sporozoite as depicted in Figure 1f. (a) SPMT length along with (b) SPMT length as a proportion of sporozoite length were both measured. (c) Additionally, the proportion of the SPMT that was polyglutamylated was measured. Analysis comprises 25 SPMTs (11 long side, 14 short side) from 6 SG sporozoites. ** = p

      The following section has also been added to the methods to describe this analysis: * "Subpellicular microtubule measurement

      • To measure subpellicular microtubule length and polyglutamylation maximum intensity projections were made of sporozoites stained with NHS Ester, anti-tubulin and anti-PolyE antibodies, and SYTOX Deep Red. The side where the nucleus was closest to the parasite plasma membrane was defined as the 'short side', while the side where the nucleus was furthest from the parasite plasma membrane was defined as the 'long side'. Subpellicular microtubules were then measured using a spline contour from the apical end of the sporozoite to the basal-most end of the microtubule with fluorescence intensity across the contour plotted (Zeiss ZEN 3.8). Sporozoite length was defined as the distance from the sporozoite apical polar rings to the basal complex, measuring through the centre of the cytoplasm. The percentage of the subpellicular microtubule that was polyglutamylated was determined by assessing when along the subpellicular microtubule contour the anti-PolyE fluorescence intensity last dropped below a pre-defined threshold."

      *
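The thresholding step in the methods text above can be illustrated with a short sketch. It assumes the contour is exported as paired lists of positions (apical to basal) and anti-PolyE intensities, and it interprets "last dropped below" as the position of the last sample at or above the threshold. The function name, the data layout and the example values are all hypothetical; the actual measurement was made in Zeiss ZEN, not with this code.

```python
def polyglutamylated_fraction(positions_um, intensities, threshold):
    """Fraction of a contour, measured from its apical end, that is polyglutamylated.

    positions_um: distances along the contour, ordered apical to basal.
    intensities: anti-PolyE fluorescence intensity at each position.
    threshold: pre-defined intensity cut-off (an assumed input).
    """
    if len(positions_um) != len(intensities) or len(positions_um) < 2:
        raise ValueError("need matching position and intensity lists of length >= 2")
    start = positions_um[0]
    total_length = positions_um[-1] - start
    if total_length <= 0:
        raise ValueError("positions must increase from apical to basal end")

    # Keep track of the most basal sample that is still at or above the threshold;
    # beyond this point the signal has dropped below the threshold for the last time.
    last_above = None
    for position, value in zip(positions_um, intensities):
        if value >= threshold:
            last_above = position

    if last_above is None:
        return 0.0
    return (last_above - start) / total_length


# Made-up example profile: signal falls below the threshold after 2 µm of a 4 µm contour.
example = polyglutamylated_fraction([0, 1, 2, 3, 4], [10, 9, 8, 2, 1], threshold=5)
print(example)
```

Taking the last above-threshold sample, rather than the first drop, means a transient dip in signal part-way along the microtubule does not shorten the measured polyglutamylated length.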

      __In addition to the previous point, in the text accompanying Figure 7a, the authors claim that "64% of PbRON11cKD SG sporozoites contained no rhoptries at all, while 9% contained 1 rhoptry and 27% contained 2 rhoptries". Could this data be used to infer which rhoptry pair are missing from the RON11cKD oocyst sporozoites? Can it be inferred that the 64% of salivary gland sporozoites that had no rhoptries in fact had 2 congruent rhoptries in the oocyst sporozoite stage and that these have been discharged already? __

      Please see the response to Reviewer #2 Experimental Changes Comment #4.

      __Is it possible that the dimorphic rhoptries are simply precursors to the congruent rhoptries? Could it be that after the congruent rhoptries are used for SG invasion, new congruent rhoptries are formed from the dimorphic ones and are then used for the next invasion?____ Would it be possible to investigate this by isolating sporozoites some time after they have invaded the SG and performing expansion microscopy? This would allow you to confirm whether the dimorphic rhoptries truly remain in the same form, or if new congruent rhoptries have been formed, or if there have been any other changes to the morphology of the dimorphic rhoptries. __

      In theory, it is possible that the dimorphic rhoptries are precursors to the congruent rhoptries, although it is unclear how the larger of the two rhoptries in the dimorphic pair could be a precursor. The smaller one might be, but we have no evidence to suggest that this rhoptry lengthens after SG invasion. We are interested in isolating sporozoites from SGs to add a temporal perspective, but currently this isn't feasible. When sporozoites are isolated from SGs, they are collected at all stages of invasion. Additionally, we don't know how long each step of SG invasion takes, so a time-based method might not be effective either. We are developing an assay to better determine the timing of events during SG invasion with MoTissU-ExM, but this is beyond the scope of this study.

      __In the section titled "Presence of PbRON11cKD sporozoites in the SG intercellular space", the authors state that "the majority of PbRON11cKD-infected mosquitoes contained some sporozoites in their SGs, but these sporozoites were rarely inside either the SG epithelial cell or secretory cavity". - this is suggestive of an invasion defect as the authors suggest. Could the authors collect these sporozoites and see if liver hepatocyte infection can be established by the mutant sporozoites? They previously speculate that the two different types of rhoptries (congruent and dimorphic) may be specific to the two invasion events (salivary gland epithelial cell and liver cell infection). __

      It has already been shown that RON11cKD sporozoites fail hepatocyte invasion (PMID: 31247198), even when isolated from the haemolymph and so it seems very unlikely that they would be invasive following SG isolation. As mentioned in the discussion, RON11 in merozoites has a 'dual-function' where it is partially secreted during merozoite invasion in addition to its rhoptry biogenesis functions. Assuming this is also the case in sporozoites, using the RON11cKD parasite line we cannot differentiate these two functions and therefore cannot ascribe invasion defects purely to issues with rhoptry biogenesis. In order to answer this question functionally, we would need to identify a protein that only has roles in rhoptry biogenesis and not invasion directly.

      Reviewer #3

      Minor text changes (Reviewer #3)

      1. __Page 3 last paragraph: ...the molecular mechanisms underlying SG (invasion?) are poorly understood. __

      This has been corrected.

      2. __The term "APR" does not refer to a tubulin structure per se, but rather to the proteinaceous structure to which tubulin anchors. Are there any specific APR markers that can be used in Figure 1C? If not, I recommend avoiding the use of "APR" in this context. __

      The text does not state that the APR is a tubulin structure. Given that it is a proteinaceous structure, we visualise the APRs through protein density (NHS Ester). It has been standard for decades to define APRs by protein density using electron microscopy, and it has previously been sufficient in Plasmodium using expansion microscopy (PMIDs: 41542479, 33705377), so it is unclear why it should not be done so in this study.

      3. __I politely disagree with the bold statements 'Little is known about cell biology of sporozoite formation.....from electron microscopy studies now decades old' (p.3, 2nd paragraph); 'To date, only a handful of (instead of 'or') proteins have been implicated in SG invasion' (p. 4, 1st paragraph). These claims may overlook existing studies; a more thorough review of the literature is recommended. __

      This study includes at least 50 references from papers broadly related to sporozoite biology, covering publications from every decade since the 1970s. The most recent review that discusses salivary gland invasion cites 11 proteins involved in SG invasion. We have replaced "handful" with a more precise term, as it is not the best adjective, but it is hardly an exaggeration.


      Figure changes (Reviewer #3)

      1. __The hypothesis that Plasmodium utilizes two distinct rhoptry pairs for invading the salivary gland and liver cells is intriguing but remains clearly speculative. Are the "cytoplasmic pair" and "docked pair" composed of the same secretory proteins? Are the paired rhoptries identical? How does the parasite determine which pair to use for salivary gland versus liver cell invasion? Is there any experimental evidence showing that the second pair is activated upon successful liver cell invasion? Without such data this hypothesis seems rather premature. __

      We are unaware of any direct protein localisation evidence suggesting that the rhoptry pairs may carry different cargo. However, only a few proteins have been localised in a way that would allow us to determine if they are associated with distinct rhoptry pairs, so this possibility cannot be ruled out either. It seems unlikely that the parasite 'selects' a specific pair, as rhoptries are typically always found at the apical end. What appears more plausible is that the "docked pair" forms first and immediately occupies the apical docking site, preventing the cytoplasmic pair from docking there. Regarding any evidence that the second pair is activated during liver cell invasion, it has been well documented over decades that rhoptries are involved in hepatocyte invasion. If the dimorphic rhoptries are the only ones present in the parasite during hepatocyte invasion, then they must be used for this process.

      2. __The quality of the "Rootlet fibre" image is not good and resembles background noise from PolyE staining. Additional or alternative images should be provided to convincingly demonstrate that PolyE staining indeed visualizes the rootlet fibre. It is puzzling that the structure is visible with PolyE staining but not with tubulin staining.__

      This is a logical misinterpretation based on the image provided in Figure 1c. Our intention was not to imply that PolyE staining enables us to see the rootlet fibre, but that PolyE and tubulin allow us to see the APR to which the rootlet fibre is connected. There is some PolyE staining, likely corresponding to the early SPMTs, that in 1c appears to run along the rootlet fibre, but this is a product of the max-intensity projection. Please see Reviewer #2 Figure Changes Comment #3 for the updated Figure 1c.

      3. __More arrows should be added to Figures 6b and 6c to guide readers and improve clarity.__

      We have added arrows to Figures 6b and 6c, which point out more clearly what we have defined as normal and aberrant rhoptries. These panels now look like this:

      4. __Figure 2a zoomed image of P. yoelii infected SG is different than the highlighted square.__

      We agree that the highlighted square and the zoomed area appear different, but this is due to the differing amounts of light captured by the objectives used in these two panels. The entire SG panel was captured with a 5x objective, while the zoomed panel was captured with a 63x objective. Because of this difference, the plane of focus of the zoomed area is hard to distinguish in the whole SG image. The zoomed image is on the 'top' of the SG (closest to the coverslip), while most of the signal you see in the whole SG image comes from the 'middle' of the SG. To demonstrate this more clearly, we have provided the exact region of interest shown in the 63x image alongside a 5x image and an additional 20x image, all of which are clearly superimposable.

      5. __Figure 3 legend: "P. yoelii infected midguts harvested on day 15" should be corrected. More generally, yes, "...development of each oocyst within a single midgut is asynchronous." but it is still required to provide the dissection days.__

      We are unsure what the suggested change here is. We do not know what is wrong with the statement about day 15 post infection; that is when these midguts were dissected.


      Experimental changes (Reviewer #3)

      1. __The proposed role of AOR in rhoptry biogenesis appears highly speculative. It is unclear how the authors conclude that "AORs carry rhoptry cargo" solely based on the presence of RON4 within the structure. Inclusion of additional markers to characterize the content of AOR and rhoptries will be essential to substantiate the hypothesis that this enigmatic structure supports rhoptry biogenesis. __

      It is important to note that the hypothesis that AORs, or rhoptry anlagen, carry rhoptry cargo and serve as vehicles of rhoptry biogenesis was proposed long before this study (PMID: 17908361). In that study, it was assumed that structures now called AORs or rhoptry anlagen were developing rhoptries. Although often visualised by EM and presumed to carry rhoptry cargo (PMID: 33600048, 26565797, 25438048), it was only more recently that AORs became the subject of dedicated investigation (PMID: 31805442), where the authors stated that "...AORs could be immature rhoptr[ies]...". We therefore consider our observation that AORs contain the rhoptry protein RON4, which is not known to localize to any other organelle, sufficient to conclude that AORs carry rhoptry cargo and are thus vehicles for rhoptry biogenesis.

      2. __The study of RON11 appears to be a continuation of previous work by a collaborator in the same group. However, neither this study nor the previous one adequately addresses the evolutionary context or structural characteristics of RON11. Notably, the presence of an EF-hand motif is an important feature, especially considering the critical role of calcium signaling in parasite stage conversion. Given the absence of a clear ortholog, it would be interesting to know whether other Apicomplexan parasites harbor rhoptry proteins with transmembrane domains and EF-hand motifs, and if these proteins might respond similarly to calcium stimulation. Investigating mutations within the EF-hand domain could provide valuable functional insights into RON11.__

      We are unsure what suggests that RON11 lacks a clear orthologue. RON11 is conserved across all apicomplexans and is also present in Vitrella brassicaformis (OrthoMCL orthogroup: OG7_0028843). A phylogenetic comparison of RON11 across apicomplexans has previously been performed (PMID: 31247198), and this study provides a structural prediction of PbRON11 with the dual EF-hand domains annotated (Supplementary Figure 9).

      3. __The study cannot directly confirm that membrane fusion occurs between rhoptries and AORs.__

      This is already stated verbatim in the results: "Our data cannot directly confirm that membrane fusion occurs between rhoptries and AORs..."

      4. __It is unclear what leads to the formation of the aberrant rhoptries observed in RON11cKD sporozoites. Since mosquitoes were not screened for infection prior to salivary gland dissection, The defect reports and revisited of RON11 knockdown does not aid in interpreting rhoptry pair specialization, as there was no consistent trend as to which rhoptry pair was missing in RON11cKD oocyst sporozoites. The notion that RON11cKD parasites likely have 'combinatorial defects that effect both rhoptry biogenesis and invasion' poses challenges to understand the molecular role(s) of RON11 on biogenesis versus invasion. Of note, RON11 also plays a role in merozoite invasion.__

      We are unclear about the comment or suggestion here, as the claims that RON11cKD does not help interpret rhoptry pair specialization, and that these parasites have combined defects, are both directly stated in the manuscript.

      5. __Do all SG PbRON11cKD sporozoites lose their reduced number of rhoptries during SG invasion as in Figure 7a (no rhoptries)?__

      Not all RON11cKD SG sporozoites 'use up' their rhoptries during SG invasion. This is quantified in both Figure 7a and the text, which states: "64% of PbRON11cKD SG sporozoites contained no rhoptries at all, while 9% contained 1 rhoptry and 27% contained 2 rhoptries."

      6. __Different mosquito species/strains are used for P. yoelii, P. berghei, and P. falciparum. Does it affect oocyst sizes/stages? Is it ok to compare?__

      We agree that a direct comparison between, for example, P. yoelii and P. berghei oocyst size would be inappropriate. However, Figure 3c and Supplementary Figure 4 are not direct comparisons between two species, but a summation of all oocysts measured in this study to indicate that the trends we observe transcend parasite/mosquito species differences. Our study was not set up with the experimental power to determine whether mosquito host species alter oocyst size.

      7. __While I acknowledge that UExM has significantly advanced resolution capabilities in parasite studies, the value of standard microscopy technique should not be overlooked. Particularly, when discussing the function of RON11, relevant IFA and electron microscopy (EM) images should be included to support claims about RON11's role in rhoptry biogenesis. This would complement the UExM data and substantially strengthen the conclusions. Importantly, UExM can sometimes produce unexpected localization patterns due to the denaturation process, which warrants caution.__

      The purpose of this study is not to discredit, undermine, or supersede other imaging techniques. It is simply to use U-ExM to answer biological questions that cannot be, or have not been, answered using other techniques. Please refer to Reviewer #1 Minor text changes comment #17 to see the new paragraph "Comparison of MoTissU-ExM and other imaging modalities" that addresses this.

      Both conventional IFA and immunoEM have already been performed on RON11 in sporozoites before (PMID: 31247198). When assessing defects caused by RON11 knockdown, conventional IFA isn't especially helpful because it doesn't allow visualization of individual rhoptries. Thin-section TEM also doesn't provide the whole-cell view needed to draw these kinds of conclusions. Volume EM could likely support these observations, but we don't have access to or expertise in this technique, and we believe it is beyond the scope of this study. It's also important to note that for the defect we observe (missing or abnormal rhoptries), the visualization with NHS ester isn't significantly different from what would be seen with EM-based techniques, where rhoptries are easily identified based on their protein density.

      The statement that "UExM can sometimes produce unexpected localisation patterns due to the denaturation process..." is partially correct but lacks important nuance in this context. Based on our extensive experience with U-ExM, there are two main reasons why the localisation of a single protein may look different when comparing U-ExM and traditional IFA images. First, denaturation: in conventional IFAs, antibodies need to recognize conformational epitopes to bind to their target, whereas in U-ExM, antibodies must recognize linear epitopes. This doesn't mean the target protein's localisation changes, only that the antibody's ability to recognize it does. Second, antibody complexes seem unable to freely diffuse out of the gel, which can result in highly fluorescent signals not related to the target protein appearing in the image, as we have previously reported (PMID: 36993603). Importantly, neither of these factors applies to our phenotypic analysis of RON11 knockdown. All phenotypes described are based solely on NHS Ester (total protein) staining, so the considerations about changes in the localisation of individual proteins are not relevant.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Overall, the manuscript is well-written and -structured. However, I would like to raise several major points for consideration:

      1. While I acknowledge that UExM has significantly advanced resolution capabilities in parasite studies, the value of standard microscopy technique should not be overlooked. Particularly, when discussing the function of RON11, relevant IFA and electron microscopy (EM) images should be included to support claims about RON11's role in rhoptry biogenesis. This would complement the UExM data and substantially strengthen the conclusions. Importantly, UExM can sometimes produce unexpected localization patterns due to the denaturation process, which warrants caution.
      2. The proposed role of AOR in rhoptry biogenesis appears highly speculative. It is unclear how the authors conclude that "AORs carry rhoptry cargo" solely based on the presence of RON4 within the structure. Inclusion of additional markers to characterize the content of AOR and rhoptries will be essential to substantiate the hypothesis that this enigmatic structure supports rhoptry biogenesis.
      3. The hypothesis that Plasmodium utilizes two distinct rhoptry pairs for invading the salivary gland and liver cells is intriguing but remains clearly speculative. Are the "cytoplasmic pair" and "docked pair" composed of the same secretory proteins? Are the paired rhoptries identical? How does the parasite determine which pair to use for salivary gland versus liver cell invasion? Is there any experimental evidence showing that the second pair is activated upon successful liver cell invasion? Without such data this hypothesis seems rather premature.
      4. The study of RON11 appears to be a continuation of previous work by a collaborator in the same group. However, neither this study nor the previous one adequately addresses the evolutionary context or structural characteristics of RON11. Notably, the presence of an EF-hand motif is an important feature, especially considering the critical role of calcium signaling in parasite stage conversion. Given the absence of a clear ortholog, it would be interesting to know whether other Apicomplexan parasites harbor rhoptry proteins with transmembrane domains and EF-hand motifs, and if these proteins might respond similarly to calcium stimulation. Investigating mutations within the EF-hand domain could provide valuable functional insights into RON11.
      5. The study cannot directly confirm that membrane fusion occurs between rhoptries and AORs.
      6. It is unclear what leads to the formation of the aberrant rhoptries observed in RON11cKD sporozoites. Since mosquitoes were not screened for infection prior to salivary gland dissection, The defect reports and revisited of RON11 knockdown does not aid in interpreting rhoptry pair specialization, as there was no consistent trend as to which rhoptry pair was missing in RON11cKD oocyst sporozoites. The notion that RON11cKD parasites likely have 'combinatorial defects that effect both rhoptry biogenesis and invasion' poses challenges to understand the molecular role(s) of RON11 on biogenesis versus invasion. Of note, RON11 also plays a role in merozoite invasion.

      I like the introduction of a segmentation score to Plasmodium oocyst maturation.

      Minor comments:

      1. The term "APR" does not refer to a tubulin structure per se, but rather to the proteinaceous structure to which tubulin anchors. Are there any specific APR markers that can be used in Figure 1C? If not, I recommend avoiding the use of "APR" in this context.
      2. The quality of the "Rootlet fibre" image is not good and resembles background noise from PolyE staining. Additional or alternative images should be provided to convincingly demonstrate that PolyE staining indeed visualizes the rootlet fibre. It is puzzling that the structure is visible with PolyE staining but not with tubulin staining.
      3. Figure 2a zoomed image of P. yoelii infected SG is different than the highlighted square.
      4. Figure 3 legend: "P. yoelii infected midguts harvested on day 15" should be corrected. More generally, yes, "...development of each oocyst within a single midgut is asynchronous." but it is still required to provide the dissection days.
      5. More arrows should be added to Figures 6b and 6c to guide readers and improve clarity.
      6. Do all SG PbRON11cKD sporozoites lose their reduced number of rhoptries during SG invasion as in Figure 7a (no rhoptries)?
      7. Different mosquito species/strains are used for P. yoelii, P. berghei, and P. falciparum. Does it affect oocyst sizes/stages? Is it ok to compare?
      8. I politely disagree with the bold statements 'Little is known about cell biology of sporozoite formation...from electron microscopy studies now decades old' (p. 3, 2nd paragraph); 'To date, only a handful of (instead of 'or') proteins have been implicated in SG invasion' (p. 4, 1st paragraph). These claims may overlook existing studies; a more thorough review of the literature is recommended.
      9. Page 3 last paragraph: ...the molecular mechanisms underlying SG (invasion?) are poorly understood.

      Significance

      In this study, the authors explore Ultrastructure Expansion Microscopy (U-ExM) in Plasmodium-infected mosquito tissue with the aim to enhance the visualization of parasite ultrastructure. For this purpose, they revisit sporogony, the maturation of sporozoites inside oocysts, and sporozoite invasion of salivary glands, which has been studied both by cell biological methods and experimental genetics over four decades. They focus their analysis on the biogenesis and function of key secretory organelles, termed rhoptries, which are central to parasite invasion and, again, have been studied extensively.

      This study is a follow-up to a previous study by the same authors (Ref. 19). In the former study, the authors showed that U-ExM allows visualization of subcellular structures in sporozoites, including the nucleus, rhoptries, Golgi, apical polar rings (APR), and basal complex, as well as midgut-associated oocysts with developing sporozoites. Here, the authors claim a new finding by stating that sporozoites possess two distinct rhoptry pairs. Supposedly, only one pair is utilized during salivary gland invasion. The authors suggest specialization of rhoptries for different cell invasion events. The authors also revisit a RON11 knock-down parasite line, which was previously shown to be deficient in salivary gland invasion, host cell attachment, gliding locomotion, and liver invasion (Ref. 14).

      I find it difficult to estimate the significance. Obviously, attention will be limited to Plasmodium researchers only, as this study is descriptive and revisits a well-studied aspect of the Plasmodium life cycle in the Anopheles vector.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      In this manuscript, Liffner et al. use a modified expansion microscopy method, which they term Mosquito Tissue Ultrastructure Expansion Microscopy (MoTissU-ExM), to study the cell biology of the temporal development of malaria parasite sporozoites within the mosquito host. They employed three malaria parasite models, Plasmodium yoelii, P. berghei and P. falciparum, in infected mosquito hosts.

      The application of MoTissU-ExM to infected mosquito tissues is a significant technical advance, enabling visualizations previously only achievable with electron microscopy.

      The major conclusions and advances are as follows:

      • The establishment of a "segmentation score" as a great tool for staging asynchronous oocyst development.
      • The location of centriolar plaques, rootlet fibres and other structures that are difficult to analyse.
      • The first detailed timeline for sporozoite rhoptry biogenesis.
      • Clear quantification showing that sporozoites possess four rhoptries and utilise two during salivary gland (SG) invasion.
      • A characterization of the RON11 knockout phenotype, linking it to defects in rhoptry biogenesis and a specific block in SG epithelial cell invasion.

      The following points are intended to further strengthen the paper for publication.

      Points for Revision

      1. For clarity, it would be helpful to add indicators for the centriolar plaques in Figure 1b, as their locations are not immediately obvious.
      2. The 'image Z-depth' value indicated in the figures is ambiguous. It is not clear whether this refers to the distance from the coverslip surface or the starting point of the z-stack image acquisition. A precise definition of this parameter would be beneficial.
      3. Regarding Figure 1c, the authors state that 'the rootlet fiber is visible'. However, such a structure cannot be confirmed from the provided NHS ester image. Can the authors present a clearer image where the rootlet fibre is more distinct? Furthermore, please provide the basis for identifying this structure as a rootlet fiber based on the NHS ester observation alone.
      4. Why do the congruent rhoptries have similar lengths to each other, while the dimorphic rhoptries have different lengths? Is this morphological difference related to the function of these rhoptries?
      5. Would it be possible to show whether RON11 localises to the dimorphic rhoptries, the congruent rhoptries, or both, by using expansion microscopy and a parasite line that expresses RON11 tagged with GFP or a peptide tag?
      6. The knockdown of RON11 disrupts the rhoptry structure, making the dimorphic and congruent rhoptries indistinguishable. Does this suggest that RON11 is important for the formation of both types of rhoptries? I believe that it would be crucial to confirm whether RON11 localises to all rhoptries or is restricted to specific rhoptries for a more precise discussion of RON11's function.
      7. The authors state that 64% of RON11cKD SG sporozoites contained no rhoptries at all. Does this mean RON11cKD SG sporozoites used up all rhoptries corresponding to the dimorphic and congruent pairs during SG invasion? If so, this contradicts your claims that sporozoites are 'leaving the dimorphic rhoptries for hepatocyte invasion' and that 'rhoptry pairs are specialized for different invasion events'. If that is not the case, does it mean that RON11cKD sporozoites failed to form the rhoptries corresponding to the dimorphic pair? A more detailed discussion would be needed on this point and, as I mentioned above, on the specific role of RON11 in the formation of each rhoptry pair.
      8. Out of pure curiosity, is it possible to measure the length and number of subpellicular microtubules in the sporozoites observed in this study using expansion microscopy?
      9. Is it possible that the dimorphic rhoptries are simply precursors to the congruent rhoptries? Could it be that after the congruent rhoptries are used for SG invasion, new congruent rhoptries are formed from the dimorphic ones and are then used for the next invasion? Would it be possible to investigate this by isolating sporozoites some time after they have invaded the SG and performing expansion microscopy? This would allow you to confirm whether the dimorphic rhoptries truly remain in the same form, or if new congruent rhoptries have been formed, or if there have been any other changes to the morphology of the dimorphic rhoptries.
      10. In addition to the previous point, in the text accompanying Figure 7a, the authors claim that "64% of PbRON11cKD SG sporozoites contained no rhoptries at all, while 9% contained 1 rhoptry and 27% contained 2 rhoptries". Could this data be used to infer which rhoptry pair are missing from the RON11cKD oocyst sporozoites? Can it be inferred that the 64% of salivary gland sporozoites that had no rhoptries in fact had 2 congruent rhoptries in the oocyst sporozoite stage and that these have been discharged already?
      11. In the section titled "Presence of PbRON11cKD sporozoites in the SG intercellular space", the authors state that "the majority of PbRON11cKD-infected mosquitoes contained some sporozoites in their SGs, but these sporozoites were rarely inside either the SG epithelial cell or secretory cavity". - this is suggestive of an invasion defect as the authors suggest. Could the authors collect these sporozoites and see if liver hepatocyte infection can be established by the mutant sporozoites? They previously speculate that the two different types of rhoptries (congruent and dimorphic) may be specific to the two invasion events (salivary gland epithelial cell and liver cell infection).

      There are a few typing errors in the document:

      1. Paragraph 3 of the introduction - line 7, "handful or proteins" should be handful of proteins
      2. Paragraph 5 of the introduction - line 7, "also able to observed" should be observe
      3. In the final paragraph of the introduction - line 1, "leverage this new understand" should be understanding
      4. The first paragraph of the discussion summary contains an incomplete sentence on line 7, "PbRON11ctrl-infected SGs."
      5. The second paragraph of the discussion - line 10, "until cytokinesis beings" should be begins

      Some suggestions for figures

      Fig 1B - could the tubulin image in the hemispindle panel be made brighter?

      Fig 3B - stages 2 and 6 do not show the DNA (cyan). It would be good to show the state of the DNA at those stages, especially at stage 2, when the APR is visible. Also, please box the region in the parent image that is enlarged below it.

      Fig 4A - the green text in the first image panel is not visible. Also, the cyan text in the 3rd image in Fig 1A is difficult to see. There are a few places where this is the case.

      Fig 6A - how do the authors know ron11 expression is reduced by 99%? Did they test this themselves or rely on data from the lab that gifted them the construct? Also, please state the number of oocysts and sporozoites that were observed.

      Fig 6E - are the data point colours the wrong way round on this graph? Just looking at the graph it looks as though the RON11cKD has more rhoptries than the control which does not match what is said in the text.

      Fig S8C - for PbRON11ctrl, the pie chart shows that 89.7% of sporozoites are present in the secretory cavity, while the text states 100% (35/35).

      Fig S9D - the figure shows that 17.1% of RON11cKD sporozoites are in the secretory cavity, while the text says 24%.

      Some points to discuss

      1. One minor point: the authors suggest that oocyst diameter is not an appropriate measure of sporozoite development. This is not entirely true, as oocyst diameter distinguishes between cell division and cell growth, so it is an important parameter, especially where proliferation within the oocyst does not take place but growth of the oocyst does.
      2. The authors have not mentioned that the stage of oocyst development sometimes also depends on the age of the mosquito, and that it varies between mosquito guts even if the blood feed is done on the same day.
      3. How does apical polarity differ from that of the merozoite, given that some conoid genes are present in ookinetes and sporozoites but not in merozoites?

      Significance

      The following aspects are important:

      This is a novel and more cell-biological approach to studying a challenging stage of the malaria parasite within the mosquito. By using MoTissU-ExM, the authors have enabled the three-dimensional observation of ultrastructures of oocyst-sporozoite development that were previously difficult to observe with conventional electron microscopy alone. This includes the developmental process and entire ultrastructure of oocysts and sporozoites, and even the tissue architecture of the mosquito salivary gland and its epithelial cells.

      Advances:

      By observing sporozoite formation within the oocyst and the overall ultrastructure of the sporozoite with MoTissU-ExM, the authors have provided detailed descriptions of the complete structure and three-dimensional spatial relationships of the rhoptries, rootlet fibre, nucleus, and other organelles. Furthermore, their detailed localisation analysis of sporozoites within the salivary gland is also a great achievement. Considering that such observations were technically very difficult and laborious with conventional electron microscopy, enabling these analyses with higher efficiency and relatively lower difficulty represents a major contribution to the future advancement of oocyst-sporozoite biology. The development of the 'segmentation score' for sporozoite formation within the oocyst is another major advance. I think this will enable detailed descriptions of structural changes at each developmental stage and of the molecular mechanisms involved in the development of oocysts and sporozoites. This has its advantages if antibodies can be used, and it somewhat reduces the need for immuno-EM.

      Secondly, in terms of sporozoite rhoptry biology, Schrevel et al. (Parasitology, 2007) seem to focus only on oocyst sporozoite rhoptries, as they say that the sporozoites have 4 rhoptries. This study, on the other hand, also looks at salivary gland sporozoites and shows that there are potentially important differences between the two, namely the reduction from 4 rhoptries to two. This also leads to further questions about the different types of rhoptries in oocyst sporozoites and whether they are adapted to invasion of different cell types (salivary gland epithelial cells or liver hepatocytes).

      Limitation

      Expansion microscopy alone still has its limits when it comes to observing ultra-fine structures, for example visualising the small vesicular structures that Schrevel et al. observed in detail with electron microscopy, or resolving ultra-high-resolution details such as the fusion of membrane structures and their interactions with structures like the rootlet fibre and microtubules. Therefore, I think that electron microscopy remains essential for the observation of such ultra-fine structures.

      The real impact of this work will mostly be on cell biologists working with malaria parasites, particularly the mosquito stages. However, the approaches can be applied to material from any species in which temporal dynamics need to be studied alongside tissue-related structures and where UExM can be applied.

      I am a parasite cell biologist working with parasite stages within the mosquito vector host.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      In this paper the authors use ultrastructure expansion microscopy to investigate the mosquito stages of the malaria parasite, specifically the stage called the oocyst and the process of sporozoite development. They report a number of observations, of which the ones concerning rhoptries are the most interesting. There are four of these organelles in the first form of sporozoites in the oocyst and only two in the mature form in the salivary gland. Using a gene knockout of a protein that was reported to be important for rhoptry formation in merozoites, the parasite forms that invade human red blood cells, they found that fewer rhoptries are also formed in sporozoites and that these can no longer enter the salivary gland cells. The presented data are in my view conclusive and no additional experiments are needed for this work to be published. The described experiments should be readily reproducible and have a high statistical power. The text is mostly clearly written but could be improved to make it more concise and more precise and to avoid overstatements. Some references could be added. It would have helped to have line numbers in the manuscript. My suggestions are as follows:

      Abstract: don't focus on technique but on the questions you tried to answer (i.e. rewrite or delete the 3rd and 4th sentences).

      Page 3: Add a reference after 'disrupted parasites'. Change 'the basal compelx at the leading edge'; this seems counterintuitive. Change 'mechanisms underlying SG are poorly': which mechanisms, of invasion or of infection?

      Page 4: 'handful of proteins', 'range of cell biology processes': as I understand the paper, the key discovery concerns rhoptry biogenesis and function, so focus on that; all other aspects appear rather peripheral. What are the 'three microtubule spindle structures'? 'Much of this study focuses on the secretory organelles': I would suggest rewriting the introduction to focus solely on those, which yield the interesting findings.

      Page 5: 'little is known': please describe what is known, also in other stages. At the end of the paper I would like to know what the key difference is to rhoptry function in other stages. Change 'rhoptries golgi-derived, made de novo'. Change 'new understand to'. 'Rhoptry malformations' seem to be similar in sporozoites and merozoites. Is that surprising or new? What is known about crossing the basal lamina? Were rhoptries thought to be involved in this process? Or is it proteins on the surface or in other secretory organelles?

      On page change/specify: 'wide range of parasite structures'

      Page 7: Is Airyscan2 a particular method or a specific microscope? What are the dark lines in panel E? In panel G: are the dense granules not micronemes? What are the dark lines? Rhoptries?

      Page 8: The authors mention a second layer of CSP but do not investigate it further. It is likely hard to investigate this further, but just letting it stand as it is seems unsatisfactory, considering that CSP is the malaria vaccine antigen. What happens if you add anti-CSP antibodies? I would suggest shortening the opening paragraphs of this paper and focusing on the rhoptries. This could be done by toning down the text on all aspects that are not rhoptries and pointing to the open questions that some of the observations, such as the CSP layers, raise for future studies. Figure 2 seems to add little compared to the following figures and could in my view go to the supplement.

      Page 10: I suggest qualifying the statement 'oocyst development has typcially been inferred by'. There seem to be a few studies showing that size does not reflect maturation.

      Page 11: I am tempted to suggest the authors start their study with Figure 3 and add panel A from Figure 2 to it. This leads directly to their nice work on rhoptries. Other features reported in Figures 1 and 2 are comparatively less exciting and could be moved to the supplement or reported in a separate study.

      Page 12: The text could be condensed to highlight the new data on ron4 staining of the AOR. Maybe include more detail on the differences in rhoptry structure between species in Figure 4. I would encourage moving the data on rhoptries in Figure S6 to the main text, i.e., to Figure 4.

      Page 16: The authors state that different rhoptries might have different functions. This is an interesting hypothesis/result that could be mentioned in the abstract. How large is RON11?

      Page 19: Do the parasites with the RON11 knockout have only the cytoplasmic or only the apical rhoptries?

      Page 23: I suggest deleting the first sentence and focusing on the functional aspects and the discoveries. There is no causal link between ookinete invasion and oocyst developmental asynchrony.

      Page 24: The first sentence appears to contradict what is written in the results.

      I don't understand the first two sentences in the paragraph titled 'Comparison between Plasmodium spp'.

      Page 25 or before: The vast number of electron microscopy studies should be discussed and compared with the authors' new data.

      Page 27: First sentence: there are many studies on parasite proteins involved in salivary gland invasion that could be mentioned or discussed.

      Maybe add a conclusion section rather than a future application section, which reads as if you want to promote the use of ultrastructure expansion microscopy. To my taste the technological advance is a bit overplayed considering the many applications of this technique over the last years, especially in parasitology, where it seems widely used. In any case, please delete 'extraordinarily'.

      Significance

      This interesting study investigates the development of malaria parasites in the mosquito using ultrastructure expansion microscopy adapted to mosquito tissue. It provides new and beautiful views of the process of sporozoite formation. The authors discovered that four secretory vesicles called rhoptries are formed in the sporozoites, with two pairs being important for distinct functions: one pair functions during invasion of the salivary glands of the mosquito and the other in liver infection, although the latter is not shown but inferred from prior data.

      This study will thus be of interest to scientists investigating malaria parasites in the mosquito as well as to scientists working on vesicle secretion and invasion in these parasites.

      The authors use a previously generated parasite line that lacks a protein to investigate its function in rhoptry biogenesis and find that its absence leads to fewer rhoptries, which impairs the capacity of the parasite to enter salivary gland cells. This is a nice functional addition to an otherwise largely descriptive study, but it largely mirrors the previously reported results from the blood stages. It is not clear to this reviewer how much the study advances the field over the many previous electron microscopy studies. This could be better elaborated in the text.

      Strength of the study: beautiful microscopy, new insights into rhoptry formation and function, new technique to study malaria parasites in the mosquito

      Weakness of the study: Some loose ends in the description of spindles and CSP layers, text could be more focussed on the key advancements reported

  4. Jan 2026
    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1

      Reviewer 1 Point 1- The authors describe cortical neuronal counts across several mammalian species, which is quite impressive, but the information on the methods of counting is lacking: how representative are the data used / shown; how many individuals / brains / sections were used for each species considered? Much more detailed description of the quantifications should be provided to judge the validity of this first conclusion.

      Response: We sincerely thank the reviewer for this insightful and constructive suggestion. We agree that the methodological description of our comparative histological analysis, which is the fundamental basis of this study, was insufficient in the original manuscript. Following the reviewer’s advice, we have extensively revised the Materials and Methods section entitled “Nissl staining and neuronal cell number count” (Page 32, Line 15).

      Reviewer 1 Point 2- The authors use several markers of cortical neuron identity to confirm their neuron number measurements, but from the data shown in Figure 1D,E it seems that only some markers (Satb2) show species-differences while others do not (CTIP2 / Tbr1). How do the authors explain this discrepancy - does this mean that it is mainly Satb2 neurons that are increased in number? But if so how to explain the relative increase in subcortical projections shown in Figure S7?

      Response: We appreciate the reviewer’s insightful comments regarding the marker expression patterns. Upon re-evaluating our data in light of your feedback, we agree that the species differences in deep-layer (DL) markers such as Ctip2 and Tbr1 in the adult stage appear relatively modest compared to the robust differences observed in Satb2 and the projection data shown in Figure S8.

      To address this point, we have incorporated a comparison between the adult data (Figure 1) and our findings from P7 (Figure S2). As shown in the revised manuscript, the species differences for all markers are significantly more pronounced at P7 than in the adult. Notably, in the lower layers, rats exhibit a significantly higher number of marker-positive cells across all markers, including those newly added in this revision, compared to mice.

      We offer the following interpretation regarding these temporal differences:

      1. Developmental Relevance: The marker molecules analyzed are well-established regulators of neuronal subtype fate and projection identity during development. Their critical fate-determining functions are primarily exercised during the migration and maturation phases of nascent neurons.
      2. Postnatal Expression Shifts: Whether these molecules maintain functional roles in the fully matured adult brain remains less certain. It is plausible that marker expression may diminish in certain neuronal populations during late postnatal development, leading to the attenuated species differences observed in adults. Consequently, we believe the strong correlation between P7 quantitative data and projection fate provides a biologically sound validation of our hypothesis.

      While we have kept the discussion in the main text concise to maintain focus for the general reader, we have provided comprehensive data in Figure 1 and Figure S2. This ensures that the necessary evidence is readily available for specialists interested in these developmental dynamics.

      Reviewer 1 Point 3- The authors focus their study almost exclusively on somatosensory cortex, but can they comment on other areas (motor, visual for instance)? It would be nice to provide additional comparative data on other areas, at least for some of the parameters examined across mouse and rat. Alternatively the authors should be more explicit in the abstract and description of the study that it is limited to a single area.

      Response: We sincerely appreciate the reviewer’s insightful comment. As suggested, we have revised the Abstract to explicitly state that our current analysis is focused on the somatosensory cortex. Furthermore, based on the data in Figure 1B, we have added a discussion of the possibility that the species differences observed in the primary somatosensory cortex are a general feature shared across the entire cerebral cortex, as follows: “This DL-biased thickening in rats was evident in the primary somatosensory area, but is consistently observed throughout the rostral-caudal cortical regions.” (Page 19, Lines 29-31)

      Reviewer 1 Point 4- The authors provide convincing evidence of increased Wnt signaling pathway in the rat. They should show more explicitly how other classical pathways of neurogenic balance / temporal patterning are expressed in their mouse and rat transcriptome data sets. These would include Notch, FGF, BMP, for which all the data should be available to provide meaningful species comparison.

      Response: We sincerely thank the reviewer for this insightful suggestion. Following your advice, we have newly included comparative data across species on key signaling pathways essential for cortical development, namely Wnt, FGF, Notch, mTOR, SHH, and BMP. These results are now presented in Figure S17. Rat progenitors show patterns comparable to other species for FGF, mTOR, and Notch signaling, but elevated Wnt and BMP expression, especially at early stages. A detailed heatmap of raw Wnt pathway gene expression across species is also included in the same supplementary figure. We believe these additions provide a more comprehensive evolutionary perspective and significantly strengthen our findings.

      Reviewer 1 Point 5- The alignment of mouse and rat trajectories is very nicely showing a delay at early-mid-corticogenesis. But there is also heterochronic transcriptome at latest stages (end of 5). How can this be interpreted? Does this mean potentially prolonged astrogliogenesis in the rat cortex?

      Response: We sincerely appreciate the reviewer’s insightful comment and the meticulous attention given to our data. Regarding the heterochronic shift observed at Day 5, we agree that this point was not sufficiently addressed in the original manuscript.

      We would like to clarify the two primary reasons for this omission, which are inherent to the current study’s design:

      1. Resolution of Stage Alignment at Temporal Extremes: In our developmental stage alignment analysis, corresponding stages are defined by pairs showing the highest transcriptomic similarity within the sampled range. By definition, the precision of this alignment tends to decrease at the earliest and latest time points of a dataset. Since the "true" biological equivalent might lie outside our sampling window, we must be cautious in interpreting shifts at these temporal boundaries.
      2. Difference in Validation Rigor: Our study prioritized the early stages of deep-layer (DL) neuron production. Consequently, we rigorously defined the onset of neurogenesis in rats (Day 1) using multiple independent methods, including clonal analysis, immunohistochemistry, and gene expression. In contrast, Day 5 was defined simply as five days post-initiation of neurogenesis, without equivalent multi-modal validation. Given that our primary focus is the early phase of neurogenesis, the precision of the transition from late neurogenesis to gliogenesis is relatively lower.

      For these reasons, we believe that an in-depth discussion of the heterochronic shift at Day 5 might lead to over-interpretation. To reflect this more accurately and avoid misleading the reader, we have revised Figure 6F to de-emphasize the Day 5 shift. In addition, we revised the manuscript as follows: “Importantly, while this analysis identified stage pairs with the highest similarity, the correspondence at the edges of the temporal sampling window is inherently less certain than at the center. Consequently, we focus on the notable reflection point at the center of our dataset.” (Page 13, Lines 37-39).
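      The boundary caveat in this response can be illustrated with a minimal sketch. It assumes each stage is summarized by a mean expression vector and that "similarity" is cosine similarity; both are illustrative assumptions, not the authors' stated method. The function names and the toy profiles are hypothetical.

```python
from math import sqrt

def cosine_similarity(x, y):
    """Cosine similarity between two equal-length expression vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = sqrt(sum(a * a for a in x))
    norm_y = sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def align_stages(query, reference):
    """For each query stage, pick the most similar reference stage.

    Returns (query_index, best_reference_index, at_boundary) tuples.
    at_boundary is True when the best match is the first or last sampled
    reference stage, where the true counterpart may lie outside the window.
    """
    last = len(reference) - 1
    result = []
    for qi, q in enumerate(query):
        sims = [cosine_similarity(q, r) for r in reference]
        best = max(range(len(sims)), key=sims.__getitem__)
        result.append((qi, best, best in (0, last)))
    return result

# Hypothetical toy profiles (three genes per stage), not data from the study.
mouse = [[5.0, 1.0, 0.5], [3.0, 3.0, 1.0], [1.0, 4.0, 4.0]]
rat = [[5.0, 1.2, 0.4], [4.5, 1.5, 0.6], [3.0, 3.1, 0.9], [0.8, 4.2, 4.1]]
alignment = align_stages(mouse, rat)
```

      In this toy case only the middle query stage is matched to an interior reference stage; the first and last matches are flagged as boundary matches and would be interpreted with caution.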

      We believe these changes more faithfully represent the biological scope of our data while maintaining the scientific integrity of our primary conclusions.

      Reviewer 1 Point 6- Figure 7: description implies that module 3 is a subset of module 4, but this is not obvious at all from the panels shown. Please clarify.

      Response: We sincerely appreciate the reviewer’s careful reading of our manuscript. As suggested, we have revised Figure 7 to clarify the hierarchical relationship between Module 3 and Module 4, ensuring that their inclusion is now explicitly presented.


      Reviewer #2

      Reviewer 2 Point 1. The introduction lacks sufficient background and fails to convey the significance of the study. Specifically, why the research was undertaken, what knowledge gap it addresses, and how the findings could be applied. Addressing these questions already in the introduction would enhance the impact of the work and broaden its readership.

      Response: We sincerely appreciate the reviewer’s insightful comment on this point. Our study reports evolutionary insights gained through an unconventional approach: a single-cell level comparison between mice and rats. We agree that clarifying the necessity of this specific approach is crucial for the manuscript. Accordingly, we have added the following two points to the Introduction:

      1. At the end of the first paragraph, we emphasized the current lack of research on the evolutionary adaptation of cortical circuits, despite the established functional importance of evolutionarily conserved circuits (Page 3, Lines 7-10): “Paradoxically, despite the importance of these variations, research has predominantly focused on the conserved aspects of cortical architecture. Consequently, the degree of evolutionary plasticity inherent in these circuits and the cell-intrinsic mechanisms driving their modification remain profoundly enigmatic.”
      2. At the end of the third paragraph, we revised and added text (Page 3, Lines 26-27): “This lack of comparative insight represents a significant gap in our understanding of how conserved developmental programs give rise to species-specific brain architectures.”

      Reviewer 2 Point 2. In Figure 5 the authors conclude that "differences in cell cycle kinetics and indirect neurogenesis are unlikely to be the primary factors driving the species-specific variation in DL neuron production. Instead, the temporal regulation of progenitor neurogenic competence, which determines the duration of the DL production phase, provides a more plausible explanation for the greater number of DL subtypes observed in rats". It is not clear to this reviewer how the authors come to this conclusion. The authors observe a significant proportion of mitotic cells in the rat VZ from day 1, and a consistently higher proportion of mitotic progenitors in the rat SVZ compared to mouse (Figure 5C). This points to an early difference in mitotic progenitors that may also lead to increased IP numbers, and potentially an increased number of DL cells, even before day 1. In addition, the higher abundance of IPs in the G2/S phase (statistically significant at 4 of the 7 time points) (Figure 5F) would suggest that this difference might play a role in the species-specific variation of DL neuron production. The authors should estimate cell cycle length instead of just measuring proportions to conclude something about cell cycle kinetics. They can then model growth curves to predict the effect that differences in cell cycle length between equivalent cell types across species would have.

      Response: We sincerely thank the reviewer for their careful reading of our manuscript and for pointing out the overstatements in our original descriptions. We agree that a more nuanced interpretation of the data was necessary. In response to these constructive suggestions, we have made the following revisions:

      1. Refinement of Descriptions: We have revised the text to more accurately reflect our findings, specifically noting that the increase in RG division on Day 1 and IP proliferation throughout the neurogenic period showed a significant trend. These features are now described more fairly and cautiously in the revised manuscript (Page 11, Lines 42-46): “Remarkably, while the temporal dynamics of mitotic density were strikingly conserved between the two species, subtle yet discernible species-specific signatures emerged. Specifically, rats exhibited a higher ratio of mitotic cells in the VZ at the onset of neurogenesis, the precise period when DL subtypes are generated in both species. Further assessment of G2/S-phase cells via pulse-EdU labeling (Figure 5D, E)”
      2. Inclusion of Time-lapse Imaging Data: The reviewer is correct that measuring the proportions of M and G2/S phases provides only a limited snapshot of cell cycle dynamics. To gain a more precise insight, we performed primary cultures of neural progenitor cells (NPCs) from Day 1 and conducted live-cell time-lapse imaging. This allowed us to directly quantify the cell cycle duration of mouse and rat NPCs (Figure S9A-C).
      3. Comparative Analysis and Mathematical Modeling: Our new data revealed that the cell cycle lengths of the two species are remarkably similar, with no significant differences observed under these culture conditions. Furthermore, to validate the impact of these findings on overall brain development, we developed a mathematical model based on our experimental data. This model predicts the total number of cells produced over the five-day neurogenic period, providing a more robust theoretical framework for our conclusions (Figure S9D). We believe these additions significantly strengthen the manuscript and address the reviewer's concerns regarding the physiological relevance of our observations.
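      To make the reviewer's growth-curve argument concrete, the following is a minimal sketch of how cell cycle length could feed into predicted neuron output over a five-day window. It is not the model reported in Figure S9D; the discrete-generation scheme, the function name, and all parameter values are illustrative assumptions.

```python
def simulate_neuron_output(cycle_hours, renew_fraction, days=5.0, founders=100.0):
    """Discrete-generation model of a progenitor pool (hypothetical sketch).

    Every `cycle_hours`, each progenitor divides into two daughters. A
    fraction `renew_fraction` of the daughters stays in the progenitor pool
    and the rest exit the cell cycle as neurons. Only complete cycles that
    fit into `days` are simulated. Returns (progenitors, neurons).
    """
    n_cycles = int(days * 24 // cycle_hours)
    progenitors, neurons = founders, 0.0
    for _ in range(n_cycles):
        daughters = 2.0 * progenitors
        progenitors = daughters * renew_fraction
        neurons += daughters * (1.0 - renew_fraction)
    return progenitors, neurons

# Assumed values: identical fate choice, different cell cycle lengths.
fast = simulate_neuron_output(cycle_hours=20, renew_fraction=0.5)
slow = simulate_neuron_output(cycle_hours=24, renew_fraction=0.5)
```

      Under these assumptions a 20-hour cycle fits six divisions into five days and a 24-hour cycle fits five, so the predicted outputs differ only modestly; near-identical cycle lengths would leave the duration of the production phase as the main lever in such a model.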

      Reviewer 2 Point 3. In Figure 6 the authors focus only on the mouse and rat datasets. Given the availability of the primate datasets that the authors already used for Figure 7, integrating these datasets into the analysis for Figure 6 would give the reader a broader perspective; in particular, it would be interesting to integrate them into the pseudotime alignment of cortical progenitors. How would the human and/or macaque early and late neurogenic phases compare to mouse and rat in this model?

      Response: We sincerely appreciate the reviewer’s insightful suggestion. In accordance with this comment, we have now incorporated pseudotime alignments of cortical progenitors between primates (human, macaque) and rodents (mouse, rat), presented as pairwise gene expression distance matrices with dynamic time warping in Figure S13. These heatmaps illustrate temporal compression or stretching in progenitor gene expression progression across species. Notably, macaque progenitors show no definitive deviations from rodents, whereas human progenitors exhibit distinct protraction relative to rats and even more so to mice. These additions provide a more comprehensive cross-species perspective without altering the study's core conclusions.
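      For readers unfamiliar with dynamic time warping on a pairwise distance matrix, the following is a minimal sketch of the textbook algorithm. It is not the authors' pipeline; the function name and the toy matrix are hypothetical, and the step pattern (match, or stretch either series by one stage) is the standard unconstrained one.

```python
def dtw_path(dist):
    """Dynamic time warping over a precomputed distance matrix.

    dist[i][j] is the distance between stage i of species A and stage j of
    species B. Returns (total_cost, path), where path lists the aligned
    (i, j) index pairs from the first to the last stage.
    """
    n, m = len(dist), len(dist[0])
    inf = float("inf")
    cost = [[inf] * m for _ in range(n)]
    cost[0][0] = dist[0][0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best_prev = min(
                cost[i - 1][j] if i > 0 else inf,
                cost[i][j - 1] if j > 0 else inf,
                cost[i - 1][j - 1] if i > 0 and j > 0 else inf,
            )
            cost[i][j] = dist[i][j] + best_prev
    # Backtrack from the last cell to recover the warping path.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((cost[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((cost[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((cost[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return cost[n - 1][m - 1], path

# Hypothetical toy matrix: species B lingers in the first state for two
# sampled stages, so stage 0 of A should align to stages 0 and 1 of B.
toy_dist = [
    [0, 0, 2, 4],
    [2, 2, 0, 2],
    [4, 4, 2, 0],
]
total_cost, warp_path = dtw_path(toy_dist)
```

      A horizontal run in the path, such as (0, 0) followed by (0, 1), is what a protracted progression in the second species looks like in this representation.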

      Reviewer 2 Point 4. In Figures 6C and 6D, the authors distinguish between cycling and non-cycling NECs and RGCs. Could the authors clarify the rationale behind making this distinction? Could the authors comment on how they interpret the impact of cycling versus non-cycling states on species-specific non-uniform scaling? Do they consider the observed non-linear correspondences to be driven by differences in cell cycle activity?

      Response: We are grateful to the reviewer for their insightful observation. We agree that our initial classification of neural progenitor cell (NPC) populations based on proliferation marker expression levels followed a convention used in other studies but was, in the context of this work, unnecessary and potentially misleading. To avoid further confusion and focus on the core biological question, we have re-organized the data by pooling these populations into a single group. Regarding the concern about species differences in cell cycle kinetics, we believe there is no significant divergence between mice and rats that could explain the observed developmental patterns in temporal progression of neurogenesis. This is supported by two lines of evidence:

      1. Quantitative analysis of pH3-positive cells (Figure 5).
      2. New time-lapse imaging data of primary cultured NPCs, which show no substantial difference in cell cycle length between the two species (Figure S9).

      These results indicate that the species-specific differences in deep-layer (DL) neuron production are not driven by cell division kinetics. Consequently, we conclude that the non-linear developmental progression of NPCs occurs independently of cell cycle regulation.

      Reviewer 2 Point 5. For the non-uniform scaling in Figure 6F, the authors identify critical inflection points and mention that "the largest delay in rat progenitors occurring where Day 1 and Day 3 progenitors overlapped". It would be good if the authors could discuss what they think all the inflection points represent. How much can it be explained by the heterogeneity within progenitors per time point? There is a clearly higher spread of histograms at days 3 and 5, and the histogram at day 5 almost overlaps with day 1. I wonder if the same conclusion about non-uniform scaling would be reached if the distance matrix was built separately for specific cell types, for example only looking at NECs or RGCs.

      Response: We sincerely appreciate the reviewer’s insightful perspective on this point. In alignment with the suggestions from both this reviewer and Reviewer 1 (Point 5), we have updated the manuscript to discuss all identified inflection points. Specifically, we have clarified why our discussion focuses on the correspondence between Mouse Day 1 and Rat Day 3.

      A recognized limitation of our current analytical approach is that it identifies the closest matching expression profiles within the specific timeframes sampled for each species. For stages at the beginning or end of our sampling window, the "true" corresponding stage in the other species may lie outside our sampled range, which naturally limits the strength of any conclusions regarding those boundary points. Consequently, while we can confidently confirm the correspondence between Mouse Day 1 and Rat Day 3—both of which sit centrally within our sampled window—we have intentionally avoided over-interpreting data near the temporal boundaries.

      Regarding the cell types analyzed, this specific analysis was conducted exclusively on NECs and RGs (now shown in Figure 6F). Extensive prior research (Susan McConnell lab, Sally Temple lab, Fumio Matsuzaki lab, Dennis Jabaudon lab, and more) has established that the time-dependent mechanisms governing the fate determination of cortical excitatory neuron subtypes are encoded within RGs. Therefore, we focused our investigation on these lineages and did not include other cell types in this study. We believe this focused approach maintains the highest degree of biological relevance for our conclusions.

      Reviewer 2 Point 6. The authors conclude that the elevated and prolonged expression of Wnt-ligand genes in rat RGs extends the DL neurogenic window and contributes to rat-specific expansion of the deep cortical layers. To validate this finding, it would be good for the authors to perform a perturbation experiment and reduce Wnt signalling/Axin2 levels in rats or deplete the Lmx1a and Lhx2 double-positive population.

      Response: We thank the reviewer for this insightful suggestion. We agree that providing direct experimental evidence is crucial to demonstrating that elevated Wnt signaling in RG progenitors drives the production of DL subtype neurons in rats. To address this, we performed a functional intervention on Day 3, a stage when Wnt signaling (indicated by Axin2 expression) is significantly higher in rats than in mice (Figure 7C, D). By introducing a dominant-negative form of TCF7L2 (dnTCF7L2) to inhibit Wnt signaling specifically in RG progenitors, we tracked the fate of the resulting neurons (Figure 7I, J). Our results showed a clear reduction in the proportion of DL neurons, accompanied by a reciprocal increase in upper-layer (UL) neurons. These findings demonstrate that maintained high levels of Wnt signaling are essential for the prolonged neurogenic capacity for DL neurons in rats. These new data have been incorporated into Figure 7.

      Reviewer 2 Point 7. The authors conclude that Wnt signaling is a rat-specific effect since they did not observe any clear temporal change in Wnt receptors in gyrencephalic species, and only a subset of RG in rats co-express Lmx1a and Lhx2. However, specific Wnt ligands and receptors (Wnt5a, Fzd and Lrp6) seem to be upregulated in human as well (Fig 7G), non-RG cells could act as Wnt ligand inducers in other species, and it has not been demonstrated that Lmx1a and Lhx2 are the source of Wnt ligand production. I wonder if the authors can completely rule out a role for Wnt in the protracted neurogenesis of other species.

      Response: We sincerely appreciate the reviewer’s insightful and broad perspective regarding Wnt signaling dynamics across diverse species. In this study, our primary focus was to elucidate the specific mechanisms underlying the differences between mice and rats. Consequently, we did not initially explore Wnt dynamics in other species or their roles in developmental timing in great depth in the original manuscript. We fully acknowledge that lineage-specific adaptations occur at the individual gene level; for instance, Silver and colleagues have reported that human-specific upregulation of Wnt receptor gene FZD8 modulates neural progenitor behavior (Boyd et al., Current Biology 2008, Liu et al., Nature 2025). However, our comparative analysis of five mammalian species—carefully aligned by developmental stage—reveals a distinct global trend. While individual gene variations exist like human FZD8, the expression levels of multiple Wnt-related genes, particularly ligands, are markedly higher in rats than in the other four species.

      Following the reviewer’s insightful suggestion, we examined the potential role of Lmx1a in activating Wnt ligand transcription in rat cortical progenitors by analyzing their expression correlation at the single-cell level. Our analysis revealed that several Wnt ligand genes are co-expressed with Lmx1a with a remarkably strong positive correlation. While we have not yet experimentally demonstrated the direct transcriptional activation of Wnt ligands by Lmx1a in these cells, this robust correlation at single-cell resolution strongly suggests that Lmx1a regulates Wnt ligand expression. These new findings are now included in Figure 7 and Figure S16, and the corresponding results section (Page 15, Lines 42-44) has been revised accordingly.
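      The kind of single-cell co-expression check described above can be sketched as follows. This assumes a Pearson correlation over normalized per-cell expression values; the response does not state which statistic was used, and the gene labels and numbers below are hypothetical placeholders rather than data from the study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length per-cell vectors.

    Returns 0.0 when either vector has zero variance, since the
    correlation is undefined in that case.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0.0 or sd_y == 0.0:
        return 0.0
    return cov / (sd_x * sd_y)

def coexpression_with(reference, genes):
    """Correlate each gene's per-cell expression with a reference gene."""
    return {name: pearson(reference, values) for name, values in genes.items()}

# Hypothetical normalized expression across six cells.
lmx1a = [0.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ligands = {
    "ligand_A": [0.1, 0.0, 1.1, 2.1, 2.9, 4.2],  # tracks the reference gene
    "ligand_B": [2.0, 2.0, 2.0, 2.1, 1.9, 2.0],  # roughly flat across cells
}
scores = coexpression_with(lmx1a, ligands)
```

      A high score for a ligand indicates co-expression in the same cells; as the response itself notes, it does not by itself demonstrate direct transcriptional activation.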

      Reviewer 2 Point 8. Minor comments: The RNAscope experiment is currently qualitative. Is the mRNA copy number per cell equal in both species, with more cells being positive in rat, or are there differences in the number of mRNA molecules as well? It is not indicated whether the RNAscope probes are the same for mouse and rat.

      Response: We sincerely thank the reviewer for this insightful suggestion. Following the comment, we performed RNAscope analysis for Axin2 in both mice and rats and quantified the results (now included in Figure 7D). The new data successfully validate the species differences initially observed in our scRNAseq analysis: specifically, the period of high-level Axin2 expression is significantly extended in rats compared to mice. These findings provide histological evidence that reinforces our conclusions regarding the distinct temporal dynamics between the two species.

      Regarding probe design, the Axin2 RNAscope probes target conserved and corresponding sequences between mouse and rat, with species-specific probes optimized for each organism to ensure maximal specificity and sensitivity. We have updated the Methods section ("Fluorescent in situ hybridization with RNAscope") to include these details.

      Reviewer #3

      Reviewer 3 Point 1. Satb2 is also widely recognized as a deep layer marker. The authors need to perform analysis and quantification in Figs 1 and 4 with other II/III and IV markers such as Cux1 and Rorb.

      Response: We thank the reviewer for their insightful comments regarding the marker specificity. We fully agree that while Satb2 is a robust marker for callosal projection identity, its broad distribution across both deep and upper layers limits its utility as a layer-specific marker. As the reviewer suggested, Cux1 (Layers 2/3) and Rorb (Layer 4) are indeed superior markers for defining laminar identity.

      To address this, we have incorporated new immunohistochemical data for these markers in both the quantification of somatosensory cortical neurons (Figure S2) and the birth-dating analysis (Figure 4).

      Our new findings are as follows:

      1. Layer Quantification (Figure S2): By utilizing Cux1 and Rorb as more specific upper-layer (UL) markers, we confirmed that there are no significant differences in the number of these neurons between mice and rats.
      2. Birth-dating Analysis (Figure 4): These markers allowed us to more precisely define the timing of Cux1/Rorb-positive cell generation, revealing subtle but important differences between the two species.

      While these additions do not alter the fundamental narrative of the original manuscript, they have significantly enhanced the precision and rigor of our analysis. We are grateful to the reviewer for guiding us toward this more robust validation.

      Reviewer 3 Point 2. Rats have larger cortices. Therefore, quantification of neurons should also be normalized to cortical thickness in Fig 1E and also represented with individual data points.

      Response: We sincerely appreciate the reviewer’s constructive suggestion. We agree that normalizing the number of cortical neurons by thickness provides a more rigorous comparison. Accordingly, we have calculated the neuronal density (cell count per unit thickness) for Tbr1- and Ctip2-positive cells and included these data in Figure S2C. Our analysis confirms that these populations are distributed at a significantly higher density in mice compared to rats.

      Furthermore, we have updated the visualization in Figure 1E to display individual data points, ensuring full transparency of the underlying distribution. We believe these revisions, prompted by the reviewer’s insight, have substantially strengthened the clarity and persuasiveness of our manuscript.
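      The normalization discussed in this response amounts to dividing each section's cell count by that section's cortical thickness. A minimal sketch, assuming one count and one thickness measurement per section; the function name, units, and numbers are hypothetical and not taken from the study:

```python
def density_per_thickness(cell_counts, thickness_um):
    """Cells per micrometre of cortical thickness, one value per section."""
    if len(cell_counts) != len(thickness_um):
        raise ValueError("need one thickness measurement per counted section")
    return [count / thickness for count, thickness in zip(cell_counts, thickness_um)]

# Hypothetical counts and thickness values for three sections.
section_density = density_per_thickness([120, 130, 125], [1000.0, 1040.0, 1000.0])
```

      Keeping one density value per section, rather than a single pooled ratio, also provides the individual data points that the reviewer asked to see plotted.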

      Reviewer 3 Point 3. The clonal analysis in Figs 2 and 3 quantifies GFP and RFP and reports these as neurons. However, without using cell-specific markers, it seems the authors cannot exclude that some progeny are also glia derived from a radial glial progeny. I don't expect all experiments to have this but they must have some measures of both populations to address this possibility. This needs to be addressed to build confidence in the conclusion that there is clonal production of neurons.

      Related to this, the relationship between position and fate is not always 1 to 1. The data summarized in Fig 2G are based on position and not using subtype markers. They should include assessment of markers as they do in Fig 4.

      Response: We sincerely thank the reviewer for this insightful comment. We agree that a clear definition of cell types is essential for the accuracy of clonal analysis.

      In this study, we primarily identified neurons based on their distinct morphological characteristics and performed measurements specifically on these cells. To validate this approach, we confirmed that the vast majority of cells identified as neurons were positive for NeuN and cortical excitatory neuron markers, while remaining negative for glial markers such as Olig2 and SOX9. (Notably, at postnatal day 7, most cells in the glial lineage exist as undifferentiated Olig2-positive progenitors). These observations support our conclusion that the cells analyzed based on morphology are indeed cortical excitatory neurons.

      As the reviewer rightly pointed out, evaluating cell composition using fate-specific marker expression is the ideal approach. However, our current experimental setup required multiple fluorescence channels for DAPI staining (to assess tissue architecture) and immunostaining for GFP and RFP (to identify labeled clones). Due to these technical constraints regarding available detection channels and host species compatibility, we relied on morphological criteria for the primary analysis.

      To address this concern and ensure the reliability of our findings, we performed additional analyses using a subset of samples. By co-staining retrovirally labeled neurons with cell-fate markers, we obtained results consistent with our other data (Figures 1 and 4) regarding laminar position and marker expression. Based on this consistency, we are confident that our classification based on morphology and laminar position does not alter the fundamental conclusions of this study.

      Reviewer 3 Point 4. In Fig 5, the authors use PH3 as well as EdU to measure differences in indirect neurogenesis. Using EdU and Tbr2 they report more dividing IPs. However they need to measure this over the total number of Tbr2 cells as it is not normalized to differences in Tbr2 cells between species. Are there total differences in Tbr2+ cells when normalized to DAPI as well? Moreover, little analysis is performed to measure any impact on radial glia. As no striking differences were observed in IPs this leaves the cellular mechanism a bit unclear and begs the impact on radial glia. Measuring PH3+ cells in VZ and SVZ is not cell specific nor does it yield information to support the prolonged neurogenesis.

      Response: We sincerely thank the reviewer for this insightful suggestion. We agree that quantifying Tbr2+/EdU+ double-positive cells alone was insufficient to fully capture the IP dynamics. Following the reviewer’s advice, we have now quantified the total population of Tbr2+ cells, normalized to the number of DAPI-stained nuclei. This new analysis reveals that mice and rats exhibit nearly indistinguishable temporal dynamics (Figure S10). When integrated with the original Tbr2+/EdU+ data in Figure 5, these findings suggest that rats maintain a slightly higher IP pool throughout the neurogenic period. This implies that the increased neuronal production in rats is not restricted to a specific phase, but rather occurs consistently across all developmental stages. We believe these additional data significantly strengthen our conclusions.
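      The two normalizations discussed here (total Tbr2+ cells over DAPI-stained nuclei, and Tbr2+/EdU+ cells over total Tbr2+ cells) can be sketched as below. The function names and counts are hypothetical and serve only to illustrate the arithmetic; they are not taken from the manuscript.

```python
def fraction(numerator: int, denominator: int) -> float:
    """Return numerator / denominator after basic sanity checks."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    if not 0 <= numerator <= denominator:
        raise ValueError("numerator must lie between 0 and the denominator")
    return numerator / denominator


def ip_metrics(dapi: int, tbr2: int, tbr2_edu: int) -> dict:
    """Compute both normalized intermediate-progenitor (IP) readouts for one region."""
    return {
        # Size of the IP pool relative to all nuclei in the counted region.
        "tbr2_per_dapi": fraction(tbr2, dapi),
        # Share of the IP pool that incorporated EdU.
        "edu_labeled_ip_fraction": fraction(tbr2_edu, tbr2),
    }


# Hypothetical counts for a single section; NOT data from the study.
example = ip_metrics(dapi=1000, tbr2=250, tbr2_edu=50)
```

      Reporting both ratios separates a change in the size of the IP pool from a change in the proportion of IPs that are labeled.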


      Reviewer 3 Point 5. The sc-seq is done in rat and compared to published mouse data from corresponding stages. They conclude species specific differences in progenitor gene expression. I am unsure how appropriate this is. Are similar sequencing platforms used? Can they find similar results if using multiple datasets? There are other datasets that may be used to validate these findings beyond DiBella et al.

      Response: We sincerely thank the reviewer for this insightful comment. We agree that establishing the validity of our analytical approach is crucial for the reader’s confidence in our findings. To address this, we have explicitly stated in the revised manuscript that both our rat scRNAseq data and the publicly available datasets were generated using consistent experimental platforms. This ensures that the integration process is technically sound.

      Revised text (Page 13, Lines 16-18): “After quality control, we integrated these profiles with previously published mouse cortical cell data from corresponding neurogenic stages, which is prepared using the consistent platform with ours (35) (Figure S11).”

      Furthermore, to ensure the robustness of our comparative analysis, we have incorporated an additional independent dataset (Ruan et al., PNAS 2021) in addition to the Di Bella et al. Nature 2021 data used in the original manuscript. We confirmed that the results obtained using this second dataset are highly consistent with our initial findings, further validating our conclusions across different studies (Figure S13A).

      Reviewer 3 Point 6. Wnt ligand analysis requires validation in situ across developmental stages, to support their conclusions. Ideally they might consider doing some manipulations to provide context to this observation.

      Response: We sincerely thank the reviewer for these insightful suggestions. We agree that validating the spatial expression patterns of Wnt ligands and confirming their expression in rat-specific RG, as suggested by our scRNAseq data, is crucial for strengthening our conclusions.

      Regarding the expression of Wnt3a, a key ligand in cortical development: although immunohistochemical analysis clearly identified Wnt3a expression in the cortical hem, the expression levels in RG within the cortical area were substantially lower than those in the hem, making definitive visualization challenging. To complement these findings and provide more robust evidence, we performed the following additional experiments:

      1. Validation of Wnt signaling levels: Using RNAscope-based in situ hybridization for Axin2, we successfully confirmed the elevated Wnt signaling levels in rat-specific RG (Figure 7C, D), consistent with our scRNAseq findings.
      2. Correlated expression: We identified a strikingly high correlation between the expression of Lmx1a and Wnt ligand genes in rat cortical progenitors in our scRNAseq dataset (Figure S16B).
      3. Functional analysis: To test the functional significance of this signaling, we inhibited Wnt signaling by electroporating dominant-negative TCF7L2 into rat RG at E15.5. This manipulation resulted in a subtype shift of the generated neurons toward an upper-layer identity (Figure 7I, J).

      These new results demonstrate that the rat-specific extension of high Wnt signaling levels serves as a fundamental mechanism for the prolonged production of deep-layer (DL) neurons. We are grateful to the reviewer for these suggestions; these additional data have significantly strengthened our core argument that the heterochronic regulation of Wnt signaling states drives the evolution of cortical neuronal composition.

      Reviewer 3 Point 7. Minor concerns-1

      Please separate images in Fig 1D it is very strange to have them all on top of each other.

      Response: We sincerely thank the reviewer for this suggestion. As requested, we have provided individual channel images alongside the merged multicolor panels. We agree that this modification significantly enhances the clarity of our data and makes the results much easier to interpret.

      Reviewer 3 Point 8. Minor concerns-2

      Are data in Fig 4E Edu+Tbr1+EdU+? This should be clarified and would be most accurate.

      Response: We appreciate the reviewer’s suggestion. We have added Y-axis labels to the plots in Figure 4E-K. The cell-counting procedure for these analyses is documented in the caption of Figure 4E-K: “Normalized counts of neurons colabeled for EdU and projection-specific markers, relative to the peak of EdU+ and marker+ cells.”

      Reviewer 3 Point 9. Minor concerns-3

      Fig 4 graphs only have titles without Y axis. Please adjust location of title or repeat for clarity.

      Response: We thank the reviewer for this helpful suggestion. To clarify the definition of the Y-axis, we have now added a descriptive label to the axis in the revised figure.

      Reviewer 3 Point 10. Minor concerns-4

      Fig 4A implies cumulative incorporation which I don't think is being performed here. They should clarify this in the figure.

      Response: We appreciate the reviewer’s insightful comment. To avoid any potential misunderstanding regarding the additivity of the effect, we have revised the illustration in Figure 4A for greater clarity.

      Reviewer 3 Point 11. Minor concerns-5

      Fig 5 needs labels for the actual stages assayed, as illustrated in Fig 4A.

      Response: We thank the reviewer for this helpful suggestion. Following your comment, we have added the developmental stage information (expressed as embryonic days) for both mice and rats in the revised manuscript.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #3

      Evidence, reproducibility and clarity

      In this study the authors investigate differences between two closely related species, rats and mice, in terms of cortical development and neuronal composition. They first perform comparative analysis of cortical layers which revealed the density and markers of deep layer neurons of rats is disproportionately larger compared to adult mice. They then use retroviruses for lineage analysis from embryonic stages to P7. They find in general that there are temporal differences in when mice and rats produce upper versus deep layer neurons, with the process being protracted in rats. EdU injections were used to report differences in the timing of cortical neuron generation between species and they note no striking differences in IPs. Sc-sequencing of rat cortices at different stages was then used to measure temporal changes in gene expression and compared to published mouse data. They note that rats have sustained Wnt ligand expression in radial glia highlighting that as a potential mechanism of action.

      Major concerns

      1. Satb2 is also widely recognized as a deep layer marker. The authors need to perform analysis and quantification in Figs 1 and 4 with other II/III and IV markers such as Cux1 and Rorb.
      2. Rats have larger cortices. Therefore, quantification of neurons should also be normalized to cortical thickness in Fig 1E and also represented with individual data points.
      3. The clonal analysis in Figs 2 and 3 quantifies GFP and RFP and reports these as neurons. However, without using cell-specific markers, it seems the authors cannot exclude that some progeny are also glia derived from a radial glial progeny. I don't expect all experiments to have this but they must have some measures of both populations to address this possibility. This needs to be addressed to build confidence in the conclusion that there is clonal production of neurons. Related to this, the relationship between position and fate is not always 1 to 1. The data summarized in Fig 2G are based on position and not using subtype markers. They should include assessment of markers as they do in Fig 4.
      4. In Fig 5, the authors use PH3 as well as EdU to measure differences in indirect neurogenesis. Using EdU and Tbr2 they report more dividing IPs. However they need to measure this over the total number of Tbr2 cells as it is not normalized to differences in Tbr2 cells between species. Are there total differences in Tbr2+ cells when normalized to DAPI as well? Moreover, little analysis is performed to measure any impact on radial glia. As no striking differences were observed in IPs this leaves the cellular mechanism a bit unclear and begs the impact on radial glia. Measuring PH3+ cells in VZ and SVZ is not cell specific nor does it yield information to support the prolonged neurogenesis.
      5. The sc-seq is done in rat and compared to published mouse data from corresponding stages. They conclude species specific differences in progenitor gene expression. I am unsure how appropriate this is. Are similar sequencing platforms used? Can they find similar results if using multiple datasets? There are other datasets that may be used to validate these findings beyond DiBella et al.
      6. Wnt ligand analysis requires validation in situ across developmental stages, to support their conclusions. Ideally they might consider doing some manipulations to provide context to this observation.

      Minor concerns

      1. Please separate images in Fig 1D it is very strange to have them all on top of each other.
      2. Are data in Fig 4E Edu+Tbr1+EdU+? This should be clarified and would be most accurate.
      3. Fig 4 graphs only have titles without Y axis. Please adjust location of title or repeat for clarity.
      4. Fig 4A implies cumulative incorporation which I don't think is being performed here. They should clarify this in the figure.
      5. Fig 5 needs labels for the actual stages assayed, as illustrated in Fig 4A.

      Significance

      Strengths:

      • The finding that there are differences in cortical composition between rats and mice and that this is linked to prolonged neurogenesis in rats
      • Use of careful and detailed lineage analysis to define differences in temporal production of neurons
      • Inclusion of single cell sequencing

      Limitations:

      • Largely descriptive
      • Requires additional investigation to support some conclusions about neurons
      • Concerns about inferring too much from single cell sequencing done by the authors but compared to publication

      Advance: Finding that there are differences in neurogenesis between closely related species is interesting and provides insight into mechanisms of cortical evolution.

      Audience: Evolution, cortical development

      Expertise: Cortical development, evolution

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      Yamauchi et al. performed a comparative anatomical analysis of the layer architecture in the primary somatosensory cortex across 8 mammalian species. Unlike primates, which show an expansion of upper layers (UL), rodents, especially rats, display a pronounced thickening of deep layers (DL). In this study they focus on comparing rats and mice, given the higher abundance of DL neuron subtypes in rats. Using histological analysis, they showed that rats possess significantly more DL neurons per cortical column than mice, while UL neuron counts remain similar. Clonal lineage tracing showed that rat radial glial (RG) progenitors generate more DL neurons, indicating species-specific differences in progenitor neurogenic activity. Birth dating assays confirmed an extended DL neurogenesis phase in rats, followed by a conserved UL generation phase. Single-cell RNA sequencing further revealed that rats maintain an early progenitor state longer than mice, marked by sustained expression of DL-associated genes. Specifically, rat RG progenitors exhibit prolonged and elevated expression of Wnt signaling genes, particularly Wnt ligands. Comparative analysis of published single-cell RNA-Seq across species highlighted that this extended Wnt-high period in rats is exceptional, suggesting a species-specific extension of a conserved neurogenic program.

      Major comments:

      This reviewer thinks the topic is exciting, and the experiments elegant, insightful and well described. The paper is well written and follows a very logical flow, the conclusion for each experiment is supported by the data and they are carefully stated. This reviewer really appreciated the summary illustration included as a panel in each figure, they think that this greatly enhanced the clarity and accessibility of the data presented, especially because species comparison can be difficult to follow.

      In this reviewer's opinion, there are some aspects of the findings that the authors would need to clarify or address to explain the phenotype observed and to enhance the overall significance of this very well-made paper:

      1. The introduction lacks sufficient background and fails to convey the significance of the study. Specifically, why the research was undertaken, what knowledge gap it addresses, and how the findings could be applied. Addressing these questions already in the introduction would enhance the impact of the work and broaden its readership.
      2. In figure 5 the authors conclude that "differences in cell cycle kinetics and indirect neurogenesis are unlikely to be the primary factors driving the species-specific variation in DL neuron production. Instead, the temporal regulation of progenitor neurogenic competence, which determines the duration of the DL production phase, provides a more plausible explanation for the greater number of DL subtypes observed in rats". It is not clear to this reviewer how the authors come to this conclusion. Authors observe a significant proportion of mitotic cells in rat VZ from day 1, and a higher constant proportion of mitotic progenitors in SVZ rats compared to mouse (Figure 5C). This points to an early difference in mitotic progenitors that may also lead to increased IP numbers, and potentially an increased number in DL cells, even before day 1. In addition, the higher abundance of IPs in the G2/S phase (statistically significant in 4 of the 7 time points) (Figure 5F), would suggest that this difference might play a role in the species-specific variation of DL neuron production. The authors should estimate cell cycle length instead of just measuring proportions to conclude something about cell cycle kinetics. They can then model growth curves to predict the effect caused if there were differences in cell cycle length between equivalent cell types across species.
      3. In Figure 6 the authors focus only on the mouse and rat datasets. Given the availability of datasets from primates that the authors already used for Figure 7, it would give the reader a broader perspective if these datasets were also integrated in the analysis done for Figure 6; in particular, it would be interesting to integrate them in the pseudotime alignment of cortical progenitors. How would the human and/or macaque early and late neurogenic phases compare to mouse and rat in this model?
      4. In Figures 6C and 6D, the authors distinguish between cycling and non-cycling NECs and RGCs. Could the authors clarify the rationale behind making this distinction? Could the authors comment on how they interpret the impact of cycling versus non-cycling states on species-specific non-uniform scaling? Do they consider the observed non-linear correspondences to be driven by differences in cell cycle activity?
      5. For the non-uniform scaling in Figure 6F, the authors identify critical inflection points and mention that "the largest delay in rat progenitors occurring where Day 1 and Day 3 progenitors overlapped". It would be good if the authors could discuss what they think all the inflection points represent. How much can it be explained by the heterogeneity within progenitors per time point? There is a clear higher spread of histograms at days 3 and 5, and the histogram at day 5 almost overlaps with day 1. I wonder if the same conclusion about non-uniform scaling would be detected if the distance matrix was built separately for specific cell types, for example only looking at NECs or RGCs.
      6. The authors conclude that the elevated and prolonged expression of Wnt-ligand genes in rat RGs extends the DL neurogenic window and contributes to the rat-specific expansion of the deep cortical layer. In order to validate this finding it would be good for the authors to perform a perturbation experiment and reduce Wnt signalling/Axin2 levels in rats or deplete the Lmx1a and Lhx2 double-positive population.
      7. The authors conclude that Wnt signaling is a rat-specific effect since they did not observe any clear temporal change in Wnt receptors in gyrencephalic species, and only a subset of RG in rats co-express Lmx1a and Lhx2. However, specific Wnt ligands and receptors (Wnt5a, Fzd and Lrp6) seem to be upregulated in human as well (Fig 7G), non-RG cells could act as Wnt ligand inducers in other species, and it has not been demonstrated that Lmx1a and Lhx2 are the source for Wnt ligand production. I wonder if the authors can completely rule out a role for Wnt in the protracted neurogenesis of other species.
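      The growth-curve modelling suggested in point 2 could, in its simplest form, look like the sketch below. It assumes purely proliferative divisions with a fixed cell cycle length, so that the pool doubles once per cycle; this is a deliberate oversimplification chosen for illustration, and the function names and numbers are hypothetical rather than taken from the manuscript.

```python
def progenitor_pool(n0: float, hours: float, cycle_length_h: float) -> float:
    """Pool size after `hours`, assuming one doubling per cell cycle."""
    if cycle_length_h <= 0:
        raise ValueError("cell cycle length must be positive")
    return n0 * 2 ** (hours / cycle_length_h)


def fold_difference(hours: float, cycle_a_h: float, cycle_b_h: float) -> float:
    """Ratio of pool sizes for two cell cycle lengths, starting from equal pools."""
    return progenitor_pool(1.0, hours, cycle_a_h) / progenitor_pool(1.0, hours, cycle_b_h)


# Hypothetical example: 12 h versus 24 h cycles compared over a 24 h window.
ratio_after_one_day = fold_difference(24.0, 12.0, 24.0)
```

      A realistic model would also need the fraction of neurogenic versus proliferative divisions, which this sketch leaves out.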

      Minor comments:

      The RNAscope experiment is currently qualitative. Is the mRNA copy number per cell equal in both species but more cells are positive in rat, or are there differences in the number of mRNA molecules as well? It is not indicated whether the RNAscope probes are the same for mouse and rat.

      Significance

      How different species achieve such remarkable differences in brain shape and size remains poorly understood. A critical aspect of this process is the duration of the neurogenic phase: the period during which neural progenitors generate neurons. This phase tends to be extended in species with larger brains and contains multiple neuronal stem cell types in varying proportions. It is thought that this accounts for their increased neuronal numbers. In their search for mechanisms that prolong neurogenesis across species, the authors propose a rat-specific role for Wnt ligands in expanding the neurogenic period in the rat brain. Importantly, they rule out that this mechanism operates in other species, such as primates or ferrets, to achieve similar extensions.

      The study is of high quality, incorporating rigorous lineage-tracing experiments in two species and single-cell RNA sequencing. Previous work established a role for Wnt signaling in regulating early neurogenesis in mice. Here, the authors characterize a novel population of radial glial cells (Lmx1a and Lhx2 double-positive) that may explain increased Wnt ligand secretion in rats. However, functional validation of this mechanism is still lacking. To strengthen its evolutionary relevance, it would be important to determine whether similar effects occur during earlier neural stages in other species (such as neuroepithelium thickening), or whether other cell types have co-opted the proposed Lmx1a-Lhx2 regulatory module in other species.

      From the perspective of a researcher with a stem cell and developmental background focused on neural evo-devo, this manuscript represents a solid and novel contribution. The proposed model of a rat-specific mechanism for extending the neurogenic phase contrasts with the prevailing concept of convergence in mechanisms underlying species-specific cortical development. This raises intriguing questions about how multiple molecular pathways have been co-opted to achieve similar developmental outcomes. Furthermore, we know very little about what determines the duration of specific developmental processes. This work suggests that extended Wnt signaling may account for prolonged neurogenesis in rats compared to mice. Future studies should aim to validate the proposed rat-specific co-option of an Lmx1a-Wnt ligand cascade in cortical radial glia, potentially through relief of Lhx2-mediated repression of Lmx1a.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #1

      Evidence, reproducibility and clarity

      Yamauchi and colleagues explore how species-specific differences in timing of neurogenesis may contribute to cell composition in the mature brain, using rat and mouse cortex as a main model of study. They first estimate and compare among 8 mammalian species the number of cortical neurons corresponding to deep layer (DL) and upper layer (UL) neurons. They find a species-specific relative increase of DL/UL neurons in rats, compared with all other species tested. They then explore the cellular mechanisms underlying these differences in mouse and rat, using retrovirus-based clonal analyses and EdU nuclear labeling, as well as axonal projection retrograde tracing. They conclude that the increased number of DL neurons in the rat is correlated with an increase in the period of DL neuron generation at early stages of corticogenesis. They also report a lack of obvious difference in cell cycle kinetics and indirect vs direct neurogenesis that could explain the DL/UL differences. Finally, they perform comparative scRNAseq analysis in mouse vs rat embryonic cortical cells. This first confirms at the transcriptomic level an apparent prolonged period of early neurogenesis in the rat cortex. Moreover they find among modules of co-expression detectable at these stages an increase in genes corresponding to Wnt signalling, a pathway previously linked to increased self-renewal and delayed differentiation of radial glial progenitors. They thus conclude that the species-differences in neuronal number in the rat is linked to increased Wnt signaling at a critical time of corticogenesis.

      Overall this is a thorough and elegant study focused on a timely and interesting topic. The data shown are convincing and carefully interpreted. I have however a couple of comments and questions to make the study fully clear and convincing.

      • The authors describe cortical neuronal counts across several mammalian species, which is quite impressive, but the information on the methods of counting is lacking: how representative are the data used / shown; how many individuals / brains / sections were used for each species considered? A much more detailed description of the quantifications should be provided to judge the validity of this first conclusion.
      • The authors use several markers of cortical neuron identity to confirm their neuron number measurements, but from the data shown in Figure 1D,E it seems that only some markers (Satb2) show species-differences while others do not (CTIP2 / Tbr1). How do the authors explain this discrepancy - does this mean that it is mainly Satb2 neurons that are increased in number? But if so how to explain the relative increase in subcortical projections shown in Figure S7?
      • The authors focus their study almost exclusively on somatosensory cortex, but can they comment on other areas (motor, visual for instance)? It would be nice to provide additional comparative data on other areas, at least for some of the parameters examined across mouse and rat. Alternatively the authors should be more explicit in the abstract and description of the study that it is limited to a single area.
      • The authors provide convincing evidence of increased Wnt signaling pathway in the rat. They should show more explicitly how other classical pathways of neurogenic balance / temporal patterning are expressed in their mouse and rat transcriptome data sets. These would include Notch, FGF, BMP, for which all the data should be available to provide meaningful species comparison.
      • The alignment of mouse and rat trajectories very nicely shows a delay at early-mid-corticogenesis. But the transcriptome is also heterochronic at the latest stages (end of 5). How can this be interpreted? Does this mean potentially prolonged astrogliogenesis in the rat cortex?
      • Figure 7: description implies that module 3 is a subset of module 4, but this is not obvious at all from the panels shown. Please clarify.

      Significance

      The topic of the study is of general interest and original, and the conclusions original and important. The approaches used are state of the art and applied in an elegant fashion to the topic. This study should be of broad interest to developmental neurobiologists, but also developmental biologists interested in temporal patterning and developmental timing across species.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      We thank the reviewers and editors for their careful evaluation of our manuscript and their positive comments on the importance and rigor of the work. Below you will find our point-by-point response to each reviewer's suggestions. We believe that we have addressed (in the response and the revised manuscript) all of the concerns. Please note that in some cases, we have numbered a reviewer's comments for clarity, however beyond this, we have not altered any of the reviewers' text.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Lo et al., report a high-throughput functional profiling study on the gene encoding for argininosuccinate synthase (ASS1), done in a yeast experimental system. The study design is robust (see lines 141-143, main text, Methods), whereby "approximately three to four independent transformants of each variant would be isolated and assayed." (lines 140 - 141, main text, Methods). Such a manner of analysis will allow for uncertainty of the functional readout for the tested variants to be accounted for.

      This is an outstanding study providing insights on the functional landscape of ASS1. Functionally impaired ASS1 may cause citrullinemia type I, and disease severity varies according to the degree of enzyme impairment (line 30, main text; Abstract). Data from this study forms a valuable resource in allowing for functional interpretation of protein-altering ASS1 variants that could be newly identified from large-scale whole-genome sequencing efforts done in biobanks or national precision medicine programs. I have some suggestions for the Authors to consider:

      1. The specific function of ASS1 is to condense L-citrulline and L-aspartate to form argininosuccinate. Instead of measuring either depletion of substrate or formation of product, the Authors elected to study 'growth' of the yeast cells. This is a broader phenotype which could be determined by other factors outside of ASS1. While I agree that the experiments were beautifully done, the selection of an indirect phenotype such as ability of the yeast cells to grow could be more vigorously discussed.

      We appreciate the reviewer's point regarding the indirect nature of growth as a functional readout. In our system, yeast growth is tightly and specifically coupled to ASS enzymatic activity. The strains used are isogenic and lack the native yeast argininosuccinate synthetase, such that arginine biosynthesis, and therefore yeast replication on minimal medium lacking arginine, depends exclusively on the activity of human ASS1. Under these defined and limiting conditions, growth provides a quantitative proxy for ASS1 function. However, we acknowledge that this assay does not resolve specific molecular mechanisms underlying reduced function, such as altered catalytic activity versus effects on protein stability. We have updated the text to clarify these points.

      "While growth is an indirect phenotype relative to direct measurement of substrate turnover or product formation, it is tightly coupled to ASS enzymatic activity in this system and is expected to be impaired by amino acid substitutions that reduce catalytic activity or protein stability. Therefore, growth on minimal medium lacking arginine is a quantitative measure of ASS enzyme function, allowing the impact of ASS1 missense variants to be assessed at scale through a high-throughput growth assay, in a single isogenic strain background, under controlled, defined conditions that limit confounding factors unrelated to ASS1 activity. We expect that the assay will detect reductions in both catalytic activity and protein stability but will not distinguish between these mechanisms."

      2. One of the key reasons why studies such as this one are valuable is due to the limitations of current variant classification methods that rely on 'conservation' status of amino acid residues to predict which variants might be 'pathogenic' and which variants might be 'likely benign'. However, there are serious limitations, and Figures 2 and 6 in the main text show this clearly. Specifically, there is an appreciable number of variants that, despite being classified as "ClinVar Pathogenic", were shown by the assay to be unlikely to be functionally impaired. This should be discussed vigorously. Could these inconsistencies be potentially due to the read out (growth instead of a more direct evaluation of ASS1 function)?

      We interpret this discrepancy as reflecting a sensitivity limitation of the growth-based readout rather than a fundamental disagreement between functional effect and clinical annotation. Specifically, we believe that our assay is unable to resolve the very mildest hypomorphic variants from true wild type, i.e., the residual activity of these variants is sufficient to fully support yeast growth under the conditions used. On this basis, we have chosen not to treat wild-type-like growth in our assay as informative for benignity; conversely, reduced growth provides evidence supporting pathogenicity (all clinically validated variants examined in this range are pathogenic).

      We have revised the manuscript to clarify this point explicitly and to frame these variants as lying outside the effective resolution limit of the assay rather than representing true false positives. Additional discussion of this limitation and its implications is provided in our responses to Reviewer 2 (points 1 and 4) along with specific changes made to the text.

      1. Figure 3 is very interesting, showing a continuum of functional readout ranging from 'wild-type' to 'null'. It is very interesting that the Authors used a threshold of less than 0.85 as functionally hypomorphic. What does this mean? It would be very nice if they have data from patients carrying two hypomorphic ASS1 alleles, and correlate their functional readout with severity of clinical presentation. The reader might be curious as to the clinical presentation of individuals carrying, for example, two ASS1 alleles with normalized growth of 0.7 to 0.8.

      I hope you will find these suggestions helpful.

      We thank the reviewer for this thoughtful comment. Figure 3 indeed illustrates a continuum of functional effects, and we agree that careful interpretation of the thresholds used is important. To clarify the rationale for the hypomorphic threshold, the interpretation of intermediate growth values, and to emphasize that these labels reflect only behavior in the functional assay, we have rewritten the relevant section of the Results:

      "The normalized growth scores of the 2,193 variants tested in our functional assay form a clear bimodal distribution (Figure 3), with two distinct peaks corresponding to functional extremes, as is commonly reported in large-scale functional assays of protein function [9, 10]. The smaller peak, centered around the null control (normalized growth = 0), represents variants that fail to support growth in the assay (growth 0.85). Variants with growth values falling between these two peak-based thresholds display partial functional impairment and are classified as functionally hypomorphic (n = 323). Crucially, these classifications are entirely derived from the observed peaks in the distribution of growth values and reflect differences in functional activity under the assay conditions. They do not provide direct evidence for clinical pathogenicity or benignity and should not be used for clinical variant interpretation without proper benchmarking against clinical reference datasets, as implemented below within an OddsPath framework."

      We agree with the reviewer that correlating functional measurements with clinical severity in individuals carrying two hypomorphic ASS1 alleles would be highly informative, particularly given that ASS1 deficiency is an autosomal recessive disorder. While mild hypomorphic variants (for example, variants with normalized growth values of 0.7-0.8 in our assay) could plausibly contribute to disease when paired with a complete loss-of-function allele, systematic analysis of combinatorial genotype effects and genotype-phenotype correlations is beyond the scope of the present study, which focuses on the functional effects of individual variants. We view this as an important direction for future work.
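
      The peak-based labels described in this response can be sketched as a simple two-threshold rule. This is a minimal illustration, not the authors' analysis code: the upper threshold of 0.85 is taken from the text, whereas the lower threshold is left as a required argument because its value is not restated here, and the strict-inequality boundary handling is an assumption.

```python
def classify_growth(growth, lower, upper=0.85):
    """Label a normalized growth score relative to two thresholds.

    upper=0.85 is the threshold quoted in the response. `lower` has no
    default because its value is not restated in this text. Treating
    scores exactly on a boundary as hypomorphic is an assumption.
    """
    if lower >= upper:
        raise ValueError("lower threshold must be below upper threshold")
    if growth < lower:
        return "amorphic"
    if growth > upper:
        return "unimpaired"
    return "hypomorphic"


# Illustrative scores only; 0.2 is a placeholder lower threshold,
# not the value used in the manuscript.
for score in (0.02, 0.75, 0.97):
    print(score, classify_growth(score, lower=0.2))
```

      As the response stresses, these labels describe behavior in the assay only; a label of "unimpaired" here carries no implication of clinical benignity.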

      Reviewer #1 (Significance (Required)):

      This is an outstanding study providing insights on the functional landscape of ASS1. Functionally impaired ASS1 may cause citrullinemia type I, and disease severity varies according to the degree of enzyme impairment (line 30, main text; Abstract). Data from this study forms a valuable resource in allowing for functional interpretation of protein-altering ASS1 variants that could be newly identified from large-scale whole-genome sequencing efforts done in biobanks or national precision medicine programs.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Lo et al. characterize the phenotypic effect of ~90% of all possible ASS1 missense mutations using an elegant yeast-based system, and use this dataset to aid the interpretation of clinical ASS1 variants. Overall, the manuscript is well-written and the experimental data are interpreted rigorously. Of particular interest is the identification of pairs of deleterious alleles that rescue ASS1 activity in trans. My comments mainly pertain to the relevance of using a yeast screening methodology to infer functional effects of human ASS1 mutations.

      1. Since human ASS1 is heterologously expressed in yeast for this mutational screen, direct comparison of native expression levels between human cells and yeast is not possible. Could the expression level of human ASS1 (driven by the pARG1 promoter) in yeast alter the measured fitness defect of each variant? For instance, if ASS1 expression in yeast is sufficiently high to mask modest reductions in catalytic activity, such variants may be misclassified as hypomorphic rather than amorphic. Conversely, if expression is intrinsically low, even mild catalytic impairments could appear deleterious. While it is helpful that the authors used non-human primate SNV data to calibrate their assay, experiments could be performed to directly address this possibility.

      The nature of the relationship between yeast growth and availability of functional ASS1 could also influence the interpretation of results from the yeast-based screen. Does yeast growth scale proportionately with ASS1 enzymatic activity?

      We completely agree that the expression level of human ASS1 in yeast could influence the measured fitness effects of individual variants. We expect the rank ordering of variants in our growth assay to reflect their relative enzymatic activity (i.e. a monotonic relationship) but acknowledge that the precise mapping between activity and growth is unknown and may include ceiling and floor effects that limit the assay's dynamic range. As the reviewer notes, under high expression conditions moderate loss-of-function variants could appear indistinguishable from wild type (ceiling effect), whereas under lower expression the same variants could behave closer to the null control (floor effect).
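
      The ceiling and floor effects described above can be illustrated with a toy model. This is a hypothetical sketch: the piecewise-linear form and the `floor` and `ceiling` values are assumptions chosen for illustration, since the actual mapping between ASS activity and yeast growth is stated to be unknown.

```python
def toy_growth(activity, floor=0.1, ceiling=0.5):
    """Toy monotonic map from relative enzyme activity to normalized growth.

    activity: relative activity, with 1.0 meaning wild type.
    floor:    hypothetical activity at or below which growth is 0.
    ceiling:  hypothetical activity at or above which growth saturates at 1.
    """
    if not floor < ceiling:
        raise ValueError("floor must be below ceiling")
    scaled = (activity - floor) / (ceiling - floor)
    # Clamp to [0, 1] so the map is monotonic with a floor and a ceiling.
    return max(0.0, min(1.0, scaled))


# A mild loss-of-function variant (60% activity) saturates like wild type,
# while a severe one (5% activity) is indistinguishable from the null control.
for activity in (1.0, 0.6, 0.3, 0.05):
    print(activity, toy_growth(activity))
```

      Under such a model, rank order is preserved between the floor and the ceiling, but variants beyond either limit collapse onto the same growth value, which is the behavior the response attributes to high-growth pathogenic variants.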

      In our system, ASS1 is expressed from the pARG1 promoter, chosen under the assumption that the native expression level of ARG1 (the yeast ASS1 ortholog) is appropriately tuned for yeast growth. Crucially, rather than assuming a fixed mapping from assay growth to clinical pathogenicity (given potential nonlinearities in the relationship between ASS function and growth), we benchmark the assay against external data, including known pathogenic and benign variants and non-human primate SNVs, to calibrate thresholds and guide interpretation within an OddsPath framework. This benchmarking indicates that ceiling effects are likely present, with some mild loss-of-function pathogenic variants appearing indistinguishable from wild type in the growth assay. We explicitly account for this by not using high-growth scores as evidence toward benignity. We have made the following changes to the manuscript:

      "A subset of clinically pathogenic ASS1 variants exhibit near-wild-type growth in our yeast assay. In general, we expect a monotonic relationship between ASS function and yeast growth, but with the potential for floor and ceiling effects that constrain the assay's dynamic range. In this context, we interpret high-growth pathogenic variants as likely causing mild loss of function that cannot be distinguished from wild type in our assay."

      "Based on these findings and given that 22/56 pathogenic variants show >85% growth, we conclude that growth above this threshold should not be used as evidence toward benignity."

      1. It would be helpful to add an additional diagram to Figure 1A explaining how the screen was performed, in particular: when genotype and phenotype were measured, relative to plating on selective vs non-selective media? This is described in "Variant library sequence confirmation" and "Measuring the growth of individual isolates" of the Methods section but could also be distilled into a diagram.

      We thank the reviewer for this helpful suggestion. We have updated Figure 1 by adding a new schematic panel (Figure 1C) that distills the experimental workflow into a visual overview. This diagram is intended to complement the detailed descriptions in the Methods and improve clarity for the reader.

      1. The authors rationalize the biochemical consequences of ASS1 mutations in the context of ASS1 per se - for example, mutations in the active site pocket impair substrate binding and therefore catalytic activity, which is expected. Does ASS1 physically interact with other proteins in human cells, and could these interactions be altered in the presence of specific ASS1 mutations? Such effects may not be captured by performing mutational scanning in yeast.

      We are not aware of any specific protein-protein interactions involving ASS that are required for its enzymatic function. However, we agree that ASS could engage in non-essential interactions with other human proteins that might be altered by specific missense variants and that such interactions would not necessarily be captured in a yeast-based assay.

      Importantly, our complementation system depends on human ASS providing the essential enzymatic activity required for arginine biosynthesis in yeast. If ASS1 required obligate human-specific protein interactions to function, even the wild-type enzyme would fail to support yeast growth, which is clearly not the case. We therefore conclude that the assay robustly reports on the intrinsic enzymatic activity of ASS, while acknowledging that non-essential human-specific interactions may not be assessed. We have updated the manuscript to reflect this point.

      "Importantly, successful functional complementation indicates that ASS enzymatic activity does not depend on any obligate human-specific protein interactions."

      1. The authors note that only a small number (2/11) of mutations at the ASS1 monomer-monomer interface lead to growth defects in yeast. It would be helpful for the authors to discuss this further.

      As discussed in response to the reviewer's comments on the relationship between ASS activity and yeast growth (point 1 above), we expect growth to be a monotonic but nonlinear function of enzymatic activity, with potential ceiling effects at high activity. Under this model, variants causing weak or moderate loss of function may remain indistinguishable from wild type when residual activity is sufficient to support normal growth. We favor this explanation for the observation that only 2/11 interface variants show reduced growth, as many pathogenic interface substitutions are associated with milder disease presentations, consistent with higher residual enzyme function. Consistent with this interpretation, variants affecting the active site, where substitutions are expected to cause large reductions in catalytic activity, are readily detected by the assay.

      Although we cannot exclude partial buffering of dimerization defects in yeast, we interpret the reduced sensitivity to interface variants primarily as a general limitation of growth-based assays. Accordingly, our decision not to use growth >85% as evidence toward benignity is conservative relative to approaches that would classify high-growth variants as benign except at the monomer-monomer interface, avoiding reliance on structural subclassification and minimizing the risk of false benign interpretation. Reduced growth, by contrast, provides strong evidence of loss of ASS1 function and pathogenicity, validated under the OddsPath framework.

      We have updated the Results and Discussion sections to clarify these points (also see response to the reviewer's point 1).

      "A subset of clinically pathogenic ASS1 variants exhibit near-wild-type growth in our yeast assay. In general, we expect a monotonic relationship between ASS function and yeast growth, but with the potential for floor and ceiling effects that constrain the assay's dynamic range. In this context, we interpret high-growth pathogenic variants as likely causing mild loss of function that cannot be distinguished from wild type in our assay. Consistent with this view, many pathogenic variants with high assay growth are located at the monomer-monomer interface rather than the active site, and are associated with milder or later-onset clinical presentations, suggesting partial enzymatic impairment that is clinically relevant in humans but not resolved by the yeast assay."

      "Based on these findings and given that 22/56 pathogenic variants show >85% growth, we conclude that growth above this threshold should not be used as evidence toward benignity. Notably, this approach is conservative relative to treating high-growth variants as benign except at the monomer-monomer interface, avoiding reliance on structural subclassification and minimizing the risk of false benign interpretation arising from assay ceiling effects. Conversely, the variants with

      Reviewer #2 (Significance (Required)):

      This study presents the first comprehensive mutational profiling of human ASS1 and would be of broad interest to clinical geneticists as well as those seeking biochemical insights into the enzymology of ASS1. The authors' use of a yeast system to profile human mutations would be particularly useful for researchers performing deep mutational scans, given that it provides functional insights in a rapid and inexpensive manner.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Section 1 - Evidence, reproducibility, and clarity Summary This manuscript presents a comprehensive functional profiling of 2,193 ASS1 missense variants using a yeast complementation assay, providing valuable data for variant interpretation in the rare disease citrullinemia type I. The dataset is extensive, technically sound, and clinically relevant. The demonstration of intragenic complementation in ASS1 is novel and conceptually important. Overall, the study represents a substantial contribution to functional genomics and rare disease variant interpretation.

      Major comments 1. This is an exciting paper as it can provide support to clinicians to make actionable decisions when diagnosing infants. I have a few major comments, but I want to emphasize that the label "functionally unimpaired" is misleading. The authors explain that there are several pathogenic ClinVar variants that fall into this category (above the 0.85 growth threshold), but I think this category needs a more specific name and I would ask the authors to reiterate the shortcomings of the assay again in the Discussion section.

      We thank the reviewer for raising this important point. We agree that the label "functionally unimpaired" could be misleading if interpreted as implying clinical benignity rather than assay behavior. We have therefore clarified that this designation refers strictly to variant behavior in the yeast growth assay and does not imply absence of pathogenicity.

      In addition, we have expanded the Discussion to explicitly address the existence of clinically pathogenic variants with high growth scores (>0.85), emphasizing that these likely reflect a ceiling effect of the assay and represent a key limitation for interpretation. This clarification reiterates that high-growth scores should not be used as evidence toward benignity, while reduced growth provides strong functional evidence of pathogenicity. Relevant revisions are described in our responses to Reviewers 1 and 2.

      1. I think there's an important discussion to be had here: is the assay detecting variants that alter the function of ASS, or is it detecting a complete ablation of enzymatic activity? The results might be strengthened with a follow-up experiment that identifies stably expressed ASS1 variants.

      We agree with the reviewer that distinguishing between stability and enzyme activity would be valuable information. Unfortunately, we do not currently have the resources to perform this type of large-scale study. We have acknowledged in the text that our assay does not distinguish between enzyme activity and protein stability:

      "We expect that the assay will detect reductions in both catalytic activity and protein stability, but will not distinguish between these mechanisms."

      At the very least, it would be great to see the authors replicate some of their interesting results from the high-throughput screen by down-selecting to ~12 variants of uncertain significance that could be newly considered pathogenic.

      We have included new analysis of all 25 VUS variants falling in the pathogenic range of our assay (Supplemental Table S7). Reclassification under current guidelines (in the absence of our data) shifts six variants to Pathogenic/Likely Pathogenic and 11 more are reclassified to Likely Pathogenic with the application of our functional data as PS3_Supporting. The remaining eight VUS are all reclassified to Likely Pathogenic when inclusion of homozygous PrimateAI-benign variants allows the assay to satisfy full PS3 criteria.

      1. I would ask the authors to provide more citations of the literature in the introduction of the manuscript. I would be especially interested in knowing more about human ASS being identified as a homolog of yeast ARG1, as they share little sequence similarity (27.5%) at the protein level. That said, I find the yeast complementation assay exciting.

      We thank the reviewer for this suggestion. Human ASS and yeast Arg1 catalyze the same biochemical reaction and share approximately 49% amino acid sequence identity. We have revised the Introduction to clarify this relationship and to note explicitly that the Saccharomyces Genome Database (SGD) identifies the human gene encoding argininosuccinate synthase (ASS1) as the ortholog of yeast ARG1. An appropriate citation has been added to support this statement. The protein alignments have been provided as File S2.

      "This assay is based on the ability of human ASS to functionally replace (complement) its yeast ortholog (Arg1) in S. cerevisiae (Saccharomyces Genome Database, 2026). Importantly, successful functional complementation indicates that ASS enzymatic activity does not depend on any obligate human-specific protein interactions. At the protein level, human ASS and yeast Arg1 display 49% sequence identity (File S2) and share identical enzymatic roles in converting citrulline and aspartate into argininosuccinate."

      1. I appreciate the efforts made by the authors to share their work and make this study more reproducible, such as sharing the hASS1 and yASS1 plasmids on NCBI GenBank (Line 121) and publishing the ONT reads on SRA (Line 154). I made requests for additional data to be shared, such as the custom method/code for codon optimization and a table of Twist variant cassettes that were ordered. I would also love to see these results shared on MaveDB.org.

      We thank the reviewer for these suggestions regarding data sharing and reproducibility. As requested, we have provided the custom codon optimization script as File S1 and the amino acid alignment used to perform codon harmonization as File S2. The sequence of the underlying variant cassette is included in the corresponding GenBank entry, and we have clarified this point in the legend of Figure 1. For each amino acid substitution, Twist Bioscience used a yeast-specific codon scheme with a single consistent codon per amino acid; accordingly, the sequence of each variant cassette can be inferred from the base construct and the specified amino acid change. A complete list of variant amino acid substitutions used in this study is provided in Table S3.
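
      The statement that each variant cassette can be inferred from the base construct and the amino acid change can be sketched as follows. This is a hypothetical illustration: the codon table below is a placeholder with one codon per amino acid and is not the actual scheme used by Twist Bioscience, and the short base sequence is a toy open reading frame rather than the real construct.

```python
# Placeholder one-codon-per-amino-acid table (assumption; NOT the vendor's scheme).
HYPOTHETICAL_CODONS = {
    "A": "GCT", "D": "GAT", "E": "GAA", "L": "TTG",
    "M": "ATG", "T": "ACT", "V": "GTT",
}


def variant_cassette(base_orf, position, new_aa, codon_table=HYPOTHETICAL_CODONS):
    """Return the ORF with the codon at a 1-based amino acid position replaced."""
    if len(base_orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    n_codons = len(base_orf) // 3
    if not 1 <= position <= n_codons:
        raise ValueError("position outside the ORF")
    start = 3 * (position - 1)
    # Swap exactly one codon; the rest of the base construct is unchanged.
    return base_orf[:start] + codon_table[new_aa] + base_orf[start + 3:]


toy_orf = "ATGGCTTTGGAA"  # toy ORF encoding M-A-L-E, for illustration only
print(variant_cassette(toy_orf, 2, "V"))  # Ala2Val in the toy ORF
```

      Because a single consistent codon is used per amino acid, the substitution list alone is enough to regenerate every cassette sequence from the base construct.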

      1. I find this manuscript very exciting as the authors have a compelling assay that identifies pathogenic variants, but I was generally disappointed by the quality and organization of the figures. For example, Figure 4 provides very little insight, but could be dramatically improved with an overlay of the normalized growth score data or highlighting variants surrounding the substrate or ATP interfaces. There are some very interesting aspects of this manuscript that could shine through with some polished figures.

      We thank the reviewer for this feedback and agree that clear and well-organized figures are essential for conveying the key results of the study. In response, we have substantially revised Figure 4 by adding colored overlays showing residue conservation and median normalized growth scores (new panels Figure 4C and 4D), which more directly link structural context to functional outcomes and highlight patterns surrounding the active site and substrate interfaces.

      I would also encourage the authors to generate a heatmap of the data represented in Figure 2 (see Fowler and Fields 2014 PMID 25075907, Figure 2); this would be a more helpful reference for readers.

      The reviewer also suggested that a heatmap representation, similar to that used in Fowler and Fields (2014), might aid interpretation of the data shown in Figure 2. Because our dataset consists of sparse single-amino acid substitutions rather than a complete mutational scan, such heatmaps are inherently less dense and less effective at conveying patterns than in saturation mutagenesis studies. Nevertheless, to aid readers who may find this visualization useful, we have generated and included a single-nucleotide variant heatmap as Supplemental Figure S1.

      My major comments are as follows: 6. Citations needed - especially in the introduction and for establishing that hASS is a homolog of yARG1

      We have added the requested citations and clarified the ASS1-ARG1 orthology in the Introduction, as described in our response to point 3 above.

      1. Generally, the authors do a nice job distinguishing the ASS1 gene from the ASS enzyme, though I found some ambiguities (Line 685). Please double-check the use of each throughout the manuscript.

      We have edited the manuscript to ensure consistent and unambiguous use of gene and enzyme nomenclature throughout.

      1. Generally, I'm confused about what strain was used for integrating all these variants: was it the arg1 knock-out strain from the yeast knockout collection or was it FY4? I think FY4 was used for the preliminary experiments, then the KO collection strain was used for making the variant library, but I think this could be made clearer in the text and figures. Lines 226-229 describe introducing the hASS1 and yASS1 sequences into the native ARG1 locus in strain FY4, but the Fig1A image depicts the ASS1 variants going into the arg1 KO locus. Fig1A should be moved to Fig2.

      We agree that the strain construction steps were not described as clearly as they could have been. We have therefore clarified the strain construction workflow in the Materials & Methods and Results sections, as well as in the Figure 1 legend, to explicitly distinguish preliminary experiments performed in strain FY4 from construction of the variant library in the arg1 knockout background.

      As we have also added an additional panel to Figure 1 that schematically explains how the screen was performed (per Reviewer #2's request), we believe that Figure 1A is appropriately placed and should remain in Figure 1.

      1. Line 303 - "We classify these variants as 'functionally unimpaired'", this is not an accurate description of these variants as Figure 2 highlights 24 pathogenic ClinVar variants that would fall into this category of "functionally unimpaired". The yeast growth assay appears to capture pathogenic variants, but there is likely some nuance of human ASS functionality that is not being assessed here. I would make the language more specific, e.g. "complementary to Arg1" or "growth-compatible".

      We agree that the label "functionally unimpaired" could be misinterpreted if read as implying clinical benignity. We have therefore clarified within the manuscript that this designation refers strictly to variant behavior in the yeast growth assay (i.e., wild-type-like growth under assay conditions) and does not imply absence of pathogenicity. We also expanded the Discussion to explicitly address the subset of clinically pathogenic variants with high growth scores (>0.85), consistent with a ceiling effect of the assay and a key limitation for interpretation. See response to reviewer #3 point 1. Relevant revisions are also discussed in our responses to Reviewers #1 and #2.

      1. Lines 345-355 - It is interesting that there are variants that appear functional at the substrate interfacing sites. Is there anything common across these variants? Are they maintaining the polarity or hydrophobicity of the WT residue? Are any of these variants included in ClinVar or gnomAD? Are pathogenic variants found at any of these sites?

      Yes. For highly sensitive active-site residues that have few permissible variants, the vast majority of amino acid substitutions that do retain activity preserve key physicochemical properties of the wild-type residue, such as hydrophobicity or charge. We have added this important observation to the manuscript:

      "Any variants at these sensitive residues that are permissive for activity in our assay retain hydrophobicity or charged states relative to the original amino acid side chain (Figure 5A & Table S5)."

      None of these variants are present in ClinVar. Only L15V and E191D are present in gnomAD (Table S4).

      1. Lines 423-430 - The OddsPath calculation would seem to rely heavily on the thresholds of .85 for normalized growth. The OddsPath calculation could be bolstered with some additional analysis that emphasizes the robustness to alternative thresholds.

      We agree that the sensitivity of the OddsPath calculation to the choice of growth thresholds is an important consideration. In our assay, benign ClinVar variants and non-human primate variants are observed exclusively within the peak centered on wild-type growth, whereas clinically annotated variants falling below this peak are exclusively pathogenic. On this basis, we defined the upper boundary of the assay range interpreted as supporting pathogenicity as the lower boundary of the wild-type-centered peak in the growth distribution (as defined in Figure 3), rather than selecting a cutoff by direct optimization of the OddsPath. This choice reflects the observed concordance, in our dataset, between the onset of measurable functional impairment in the assay and clinical pathogenic annotation. Importantly, in practice the OddsPath value is locally robust to the precise placement of this boundary, remaining invariant across the range 0.82-0.88. Supporting our chosen threshold of 0.85, the lowest-growth benign or primate variant observed has a normalized growth value of 0.88, while the lowest growth observed among variants present as homozygotes in gnomAD was 0.86. We have clarified this rationale and analysis in the revised manuscript.
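
      The local robustness argument above can be illustrated with a small sketch. It assumes the commonly used form OddsPath = [P2 × (1 − P1)] / [(1 − P2) × P1], where P1 is the proportion of pathogenic variants among all control variants and P2 is the proportion among controls with an abnormal readout. The control scores below are invented for illustration, and counting one hypothetical benign variant in the abnormal class (so that P2 stays below 1) is a simplifying assumption rather than the authors' stated procedure.

```python
def odds_path(controls, threshold):
    """Estimate OddsPath for 'growth below threshold' as the abnormal readout.

    controls: list of (normalized_growth, is_pathogenic) tuples.
    """
    n_total = len(controls)
    n_path = sum(1 for _, is_path in controls if is_path)
    if n_path == 0 or n_path == n_total:
        raise ValueError("need both pathogenic and benign controls")
    abnormal = [is_path for growth, is_path in controls if growth < threshold]
    p1 = n_path / n_total
    # Assumption: add one hypothetical benign variant to the abnormal class
    # so that P2 < 1 and the odds stay finite.
    p2 = sum(abnormal) / (len(abnormal) + 1)
    return (p2 * (1 - p1)) / ((1 - p2) * p1)


# Invented control set: no control falls between 0.80 and 0.88.
toy_controls = (
    [(g, True) for g in (0.0, 0.1, 0.3, 0.5, 0.7, 0.80, 0.95, 1.0)]
    + [(g, False) for g in (0.88, 0.92, 1.0, 1.02)]
)
for cutoff in (0.82, 0.85, 0.88, 0.90):
    print(cutoff, odds_path(toy_controls, cutoff))
```

      In this toy set, the estimate is unchanged for any cutoff placed in the gap between the highest-scoring abnormal control and the lowest-scoring benign control, and it changes only once the cutoff captures a benign control. This mirrors the reasoning given for the 0.82-0.88 range.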

      "Among all nine of the human ASS1 missense variants observed as homozygotes in gnomAD which were tested as amino acid substitutions in our assay, the lowest observed growth value was 0.86 (Ala258Val), consistent with the lower boundary of the PrimateAI variants, which was a growth value of 0.87 (Ala81Thr) (Figure 6), and with our use of a 0.85 classification threshold."

      "If we treat PrimateAI variants as benign (solely for OddsPath calculation purposes), the OddsPath for growth

      1. Lines 432-441 - This is an interesting idea to use variants observed in primates, has ACMG weighed in on this? I understand that CTLN1 is an autosomal recessive disorder but I'd still be interested in seeing how the observed ASS1 missense variants in gnomAD perform in your growth assay, possibly a supplemental figure?

      To our knowledge, the ACMG/AMP guidelines do not currently address the use of homozygous missense variants observed in non-human primates. We are currently in discussion with two ClinGen working groups about the possibility of formalizing the use of this data source.

      We agree that comparison with human population data is also important. Accordingly, total gnomAD allele counts and homozygous counts for all applicable ASS1 missense variants are provided in Table S4, and the growth behavior of ASS1 missense variants observed in the homozygous state in gnomAD is shown in Figure 6. These homozygous variants uniformly exhibit high growth in our assay, consistent with the absence of strong loss-of-function effects. We have updated the manuscript text to clarify these points.

      Minor comments 1. Lines 53-59 - This paragraph needs to cite the literature, especially lines 56, 57, and 59 2. Line 61 - no need to repeat "citrullinemia type I", just use the abbreviation as it was introduced in the paragraph above 3. Lines 61-71 - again, this paragraph needs more literature citations 4. Line 62 - change to "results"

      The changes suggested in points 1-4 have all been implemented in the revised manuscript.

      1. Line 74-75 - "RUSP" acronym not needed as it's never used in the manuscript, the same goes for "HHS"

      We agree that the acronyms "RUSP" and "HHS" are not reused elsewhere in the manuscript. We have nevertheless retained them at first mention, alongside the expanded names, because these acronyms are commonly used in newborn screening and public health policy contexts and may be more familiar to some readers than the expanded terms. We would be happy to remove the acronyms if preferred.

      1. Line 86 - "ASS1" I think is referring to the enzyme and should just be "ASS"? If referring to the gene then italicize to "ASS1"
      2. Lines 91-93 - It would be helpful to mention this is a functional screen in yeast
      3. Line 101 - It would be helpful to the readers to define SD before using the acronym, consider changing to "minimal synthetic defined (SD) medium" and afterwards can refer to as "SD medium"
      4. Lines 109-114 - It would be great if you could share your method for designing the codon-harmonized yASS1 gene, consider sharing as a supplemental script or creating a GitHub repository linked to a Zenodo DOI for publication.

      The changes suggested in points 6-9 have all been implemented in the revised manuscript. The codon harmonization script has been provided as File S1.

      1. Lines 135-137 - I think it's helpful to provide a full table of the cassettes ordered from Twist as well as the primers used to amplify them, consider a supplemental table.

      Details of the Twist cassettes and the primer sequences used for amplification have been added to the Materials & Methods.

      1. Line 138 - "standard methods" is a bit vague, I'm guessing this is a Gietz and Schiestl 2007 LiAc/ssDNA protocol (PMID 17401334)? Also, was ClonNAT used to select for natMX colonies?

      The reviewer is correct about which protocol was used, and we have added the citation. We have also clarified that selection was carried out based on resistance to nourseothricin.

      1. Line 150 - change to "sequence the entire open reading frame, as previously described [4]."
      2. Line 222-223 - remove "replace" and just use "complement" (and remove the parenthesis)
      3. Line 249 - It would be great to see a supplemental alignment of the hASS1 and yASS1 sequences.
      4. Line 261 - spelling "citrullemia" should be corrected to "citrullinemia"
      5. Line 280 - "using Oxford Nanopore sequencing" is a bit vague, I suggest specifying the equipment used (e.g. Oxford Nanopore Technologies MinION platform) or simplify to "via long-read sequencing (see Materials & Methods)"

      The changes suggested in points 12-16 have all been implemented in the revised manuscript. An alignment of the ASS and Arg1 protein sequences has been provided as File S2.

      1. Line 287-289 - It would be great to see the average number of isolates per variant, as well as a plot of the variant growth estimate vs individual isolate growth.

      We agree with the reviewer that conveying measurement precision is important. The number of isolates assayed per variant is provided in Table S4, and we have added explicit mention of this in the text. Because variants were assayed with a mixture of 1, 2, or ≥3 independent isolates, a scatterplot of variant-level growth estimates versus individual isolate measurements would be difficult to interpret and potentially misleading. Instead, we report standard error estimates for each variant in Table S4, derived from the linear model used to estimate growth effects, which more appropriately summarizes measurement uncertainty given the experimental design.

      1. Lines 324-25 - consider removing the last sentence of this paragraph, it is redundant as the following paragraph starts with the same statement.

      We have removed this sentence.

      1. Lines 327-335 - This is interesting and would benefit from its own subpanel or plot in which the normalized growth score is plotted against variants that are at conserved or diverse residues in human ASS, and see if there's a statistical difference in score between the two groupings.

      As suggested by the reviewer, we have added Supplemental Figure 2 (Figure S2) in which the normalized growth score of each variant is plotted against the conservation of the corresponding residue, as measured by ConSurf. The manuscript already includes a statistical analysis of the relationship between residue conservation and functional impact, showing that amorphic variants occur significantly more frequently at highly conserved residues than unimpaired variants do (one-sided Fisher's exact test). We now refer to this new supplemental figure in the relevant Results section.
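      The one-sided Fisher's exact test mentioned above can be illustrated with a minimal sketch. This is not the authors' analysis code: the function name and the counts in the example table are hypothetical placeholders, and the layout (rows = amorphic vs. unimpaired, columns = highly conserved vs. other residues) is an assumption about how such a comparison would typically be set up.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null, i.e. the probability
    of seeing at least `a` row-1 observations in column 1 by chance.
    """
    row1 = a + b          # e.g. amorphic variants (assumed layout)
    col1 = a + c          # e.g. variants at highly conserved residues
    n = a + b + c + d     # all variants in the table
    upper = min(row1, col1)
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n, col1)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not taken from the manuscript):
    # amorphic:   40 at conserved residues, 10 elsewhere
    # unimpaired: 30 at conserved residues, 70 elsewhere
    p = fisher_exact_greater(40, 10, 30, 70)
    print(f"one-sided p-value: {p:.3g}")
```

      The exact sum over the hypergeometric tail avoids any dependency beyond the standard library; for large tables a statistics package would be the more practical choice.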

      1. Lines 339-341 - As written, it is unclear if aspartate interacts with all of the same residues as citrulline or just Asn123 and Thr119.
      2. Lines 345-355 - As with my above comment, I find this interesting and would
      3. Line 353 - add a period to "al" in "Diez-Fernandex et al."

      The issues raised in points 20 and 22 have all been addressed. Point 21 appears to be truncated.

      1. Figure 1 a. Remove "Figure" from the subpanels and show just "A" and "B" (as you do for Figure 4) and combine the two images into a single image. Also make this correction to Figure 5 and Figure 8. b. Panel A - I thought the hASS1 and yASS1 were dropped into FY4, not the arg1 KO strain. This needs clarification. c. Panel A - I'm assuming the natMX cassette contains its own promoter, you could use a right-angled arrow to indicate where the promoters are in your construct. d. Panel B - I'm not sure the bar graph is necessary, it would be more helpful to see calculations of the colony size (or growth curves for each strain) and plot the raw values (maybe pixel counts?) for each replicate rather than normalizing to yeast ARG1. It would be great to have a supplemental figure showing all the replicates side-by-side. e. Panel B - Would be helpful to denote the pathogenic and benign ClinVar variants with an icon or colored text.

      f. Figure 1 Caption - make "A)" and "B)" bold.

      We have implemented the requested changes in Figure 1 with the following exceptions. We have retained panels A and B as separate subfigures because they illustrate distinct experimental concepts. In addition, we respectfully disagree with point (d). The bar graph is intended to provide a clear, high-level comparison of functional complementation by hASS1 versus yASS1 and to illustrate the gross differences in growth between benign and pathogenic proof-of-principle variants. As the bar graph includes error bars for standard deviations, presenting raw colony size measurements or growth curves for individual replicates would substantially complicate the figure without materially improving interpretability for this purpose.

      1. Figure 2 a. "Shown in magenta are amino acid substitutions corresponding to ClinVar pathogenic, pathogenic/likely pathogenic, and likely pathogenic variants" is repeated in the figure caption. b. "Shown in green are amino acid substitutions corresponding to ClinVar benign and likely benign variants." I don't see any green points. c. Identify the colors used for ASS1 substrate binding residues. d. This plot would benefit from a depiction of the human ASS secondary structure and any protein domains (nucleotide-binding domain, synthase domain, and C-terminal helix from Fig4B)

      e. Line 685 - "ASS1" is being used in reference to the enzyme, is this correct or should it be "ASS"?

      We have made the requested changes to Figure 2. The repeated caption text has been removed, and references to green points have been corrected to orange points to match the figure. The colors used to indicate ASS substrate-binding residues are explicitly described in the figure key. Secondary structure annotations have been added. References to the enzyme have been corrected to "ASS" rather than "ASS1" where appropriate.

      1. Figure 3 a. Rename the "unimpaired" category as there are several pathogenic ClinVar variants that fall into this category.

      To address this point, we have clarified the labeling by adding "in our yeast assay" to the figure legend, making explicit that the "unimpaired" category refers only to wild-type-like behavior under assay conditions and does not imply clinical benignity. See also response to Reviewer #3, Major Comment 1.

      1. Figure 4 a. List the PDB or AlphaFold accession used for this structure b. Panel A - state which colors are used to depict each monomer. It is confusing to see several shades of pink/purple used to depict a single monomer in Panel A. c. It is very difficult to make out the aspartate and citrulline substrates in the catalytic binding activity, consider making an inset zooming-in on this domain and displaying a ribbon diagram of the structure rather than the surface. d. Generally, it would be more helpful here to label any particular residues that were identified as pathogenic from your screen, or to overlay average growth scores per residue onto the structure

      We have implemented the requested changes to Figure 4. The relevant PDB/AlphaFold accession is now listed, and the colors used to depict each monomer in Panel A are clarified in the figure legend. An inset focusing on the active site has been added to improve visualization of the citrulline and aspartate substrates. In addition, we have added new panels (Figure 4C and 4D) overlaying pathogenic residues and average growth scores onto the structure to more directly link structural context with functional data.

      1. Figure 5 a. Line 716 - Insert a page break to place Figure 5 on its own page b. I suggest using a heatmap for this type of plot, as it is very difficult to track which color corresponds to which residue.

      c. Fig5A - This plot could be improved by identifying which residue positions interface with which substrate.

      We have placed Figure 5 on its own page and added information to the legend identifying which residue positions interface with each substrate. We have retained the active-site variant strip charts raised in point (b), as we believe they effectively illustrate how the distribution of variant effects differs between residues. In addition, we have provided a supplemental heatmap showing variant growth across the entire protein (Figure S1), and individual variant scores for all residues are provided in Table S4.

      1. Figure 7 a. Line 735 - Insert page break to place figure on a new page

      b. List the PDB accession used for these images. c. For clarity I would mention "human ASS" in the figure title d. State the colors of the substrates e. Panels A and B could be combined into a single panel, making it easier to distinguish the active site and dimerization variants.

      f. Could be interesting to get SASA scores for the ClinVar structural variants to determine if they are surface-accessible

      We have implemented the requested changes in Figure 7 with the following exceptions. For point (e), there is no single orientation of the structure that allows a clear simultaneous view of both active-site and dimerization variants; accordingly, we have retained panels A and B as separate subfigures to preserve clarity. With respect to point (f), we agree that solvent accessibility analysis could be informative in other contexts. However, such an analysis does not integrate naturally with the functional and assay-based framework of the present study and was therefore not included.

      1. Figure 8 a. Panel B - overlay a square frame in the larger protein structure that depicts where the below inset is focused, and frame inset image as well.

      We have framed the inset image as requested. We did not add a corresponding frame to the full protein structure, as doing so obscured structural details in the region of interest.

      Reviewer #3 (Significance (Required)):

      Section 2 - Significance This study represents a substantial technical, functional, and translational advance in the interpretation of missense variation in ASS1, a gene of high clinical relevance for the rare disease citrullinemia type I. Its principal strength lies in the generation of an experimentally validated functional atlas of ASS1 missense variants that covers ~90% of all SNV-accessible substitutions. The scale, internal reproducibility, and careful benchmarking of the yeast complementation assay against known pathogenic and benign variants provide a robust foundation for identifying pathogenic ASS1 variants. Particularly strong aspects include the rigorous quality control of variant identities, the quantitative nature of the functional readout, and the thoughtful integration of results into the ACMG/AMP OddsPath framework. The discovery of intragenic complementation between variants affecting distinct structural regions of the enzyme is a notable conceptual and mechanistic contribution. Limitations include the assay's reduced sensitivity to variants impacting oligomerization or subtle folding defects, and the use of yeast as a heterologous system, which may mask disease-relevant mechanisms as several pathogenic ClinVar variants were found to be "functionally unimpaired". Future work extending functional testing to additional cellular contexts or expanding genotype-level combinatorial analyses would further enhance clinical applicability. Relative to prior studies, which have relied on small numbers of patient-derived variants or low-throughput biochemical assays, this work extends the field decisively by delivering a comprehensive, variant-resolved functional map for ASS1. 
      To the best of my current knowledge, this is the first systematic functional screen of ASS1 at this scale and the first direct experimental demonstration that ASS active sites span multiple subunits, enabling intragenic complementation consistent with Crick and Orgel's classic variant sequestration model. As such, the advance is simultaneously technical (high-throughput functional genomics), mechanistic (defining structural contributors to catalysis and epistasis), and clinical (enabling evidence-based reclassification of VUS). I find the use of homozygous non-human primate variants as an orthogonal benign calibration set both creative and controversial; my hope would be that this manuscript will prompt a productive discussion.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #3

      Evidence, reproducibility and clarity

      Summary

      This manuscript presents a comprehensive functional profiling of 2,193 ASS1 missense variants using a yeast complementation assay, providing valuable data for variant interpretation in the rare disease citrullinemia type I. The dataset is extensive, technically sound, and clinically relevant. The demonstration of intragenic complementation in ASS1 is novel and conceptually important. Overall, the study represents a substantial contribution to functional genomics and rare disease variant interpretation.

      Major comments

      This is an exciting paper as it can provide support to clinicians to make actionable decisions when diagnosing infants. I have a few major comments, but I want to emphasize that the label "functionally unimpaired" is misleading. The authors explain that there are several pathogenic ClinVar variants that fall into this category (above the >.85 growth threshold) but I think this category needs a more specific name and I would ask the authors to reiterate the shortcomings of the assay again in the Discussion section. I think there's an important discussion to be had here: is the assay detecting variants that alter the function of ASS or is it detecting a complete ablation of enzymatic activity? The results might be strengthened with a follow-up experiment that identifies stably expressed ASS1 variants. At the very least, it would be great to see the authors replicate some of their interesting results from the high-throughput screen by down-selecting to ~12 variants of uncertain significance that could be newly considered pathogenic. I would ask the authors to provide more citations of the literature in the introduction of the manuscript. I would be especially interested in knowing more about human ASS being identified as a homolog of yeast ARG1, as they share little sequence similarity (27.5%) at the protein level. That said, I find the yeast complementation assay exciting. I appreciate the efforts made by the authors to share their work and make this study more reproducible, such as sharing the hASS1 and yASS1 plasmids on NCBI GenBank (Line 121) and publishing the ONT reads on SRA (Line 154). I made requests for additional data to be shared, such as the custom method/code for codon optimization and a table of Twist variant cassettes that were ordered. I would also love to see these results shared on MaveDB.org.

      I find this manuscript very exciting as the authors have a compelling assay that identifies pathogenic variants, but I was generally disappointed by the quality and organization of the figures. For example, Figure 4 provides very little insight, but could be dramatically improved with an overlay of the normalized growth score data or highlighting variants surrounding the substrate or ATP interfaces. There are some very interesting aspects of this manuscript that could shine through with some polished figures. I would also encourage the authors to generate a heatmap of the data represented in Figure 2 (see Fowler and Fields 2014 PMID 25075907, Figure 2), this would be a more helpful reference for the readers.

      My major comments are as follows:

      1. Citations needed - especially in the introduction and for establishing that hASS is a homolog of yARG1
      2. Generally, the authors do a nice job distinguishing the ASS1 gene from the ASS enzyme, though I found some ambiguities (Line 685). Please double-check the use of each throughout the manuscript
      3. Generally, I'm confused about what strain was used for integrating all these variants, was it the arg1 knock-out strain from the yeast knockout collection or was it FY4? I think FY4 was used for the preliminary experiments, then the KO collection strain was used for making the variant library but I think this could be made more clear in the text and figures. Lines 226-229 describe introducing the hASS1 and yASS1 sequences into the native ARG1 locus in strain FY4, but the Fig1A image depicts the ASS1 variants going into arg1 KO locus. Fig1A should be moved to Fig2.
      4. Line 303 - "We classify these variants as 'functionally unimpaired'", this is not an accurate description of these variants as Figure 2 highlights 24 pathogenic ClinVar variants that would fall into this category of "functionally unimpaired". The yeast growth assay appears to capture pathogenic variants, but there is likely some nuance of human ASS functionality that is not being assessed here. I would make the language more specific, e.g. "complementary to Arg1" or "growth-compatible".
      5. Lines 345-355 - It is interesting that there are variants that appear functional at the substrate interfacing sites. Is there anything common across these variants? Are they maintaining the polarity or hydrophobicity of the WT residue? Are any of these variants included in ClinVar or gnomAD? Are pathogenic variants found at any of these sites?
      6. Lines 423-430 - The OddsPath calculation would seem to rely heavily on the thresholds of <.05 and >.85 for normalized growth. The OddsPath calculation could be bolstered with some additional analysis that emphasizes the robustness to alternative thresholds.
      7. Lines 432-441 - This is an interesting idea to use variants observed in primates, has ACMG weighed in on this? I understand that CTLN1 is an autosomal recessive disorder but I'd still be interested in seeing how the observed ASS1 missense variants in gnomAD perform in your growth assay, possibly a supplemental figure?
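      The robustness check requested in point 6 could be sketched as follows. This is a minimal illustration, not the authors' method: the control set is hypothetical, the exact boundary handling at 0.05 and 0.85 is an assumption, "abnormal" is assumed to mean a normalized growth score below the chosen cut-off, and the OddsPath expression is the commonly used form P2(1 - P1) / ((1 - P2) P1), where P1 is the proportion of pathogenic variants among all controls and P2 is the proportion among controls with an abnormal readout.

```python
def classify(score, low=0.05, high=0.85):
    """Assign a functional class from a normalized growth score.

    The thresholds follow the values quoted in the review; treating the
    boundaries as half-open intervals is an assumption of this sketch.
    """
    if score < low:
        return "amorphic"
    if score < high:
        return "hypomorphic"
    return "unimpaired"


def odds_path(controls, cut):
    """OddsPath for the 'abnormal' readout at a given cut-off.

    controls: list of (normalized_growth, is_pathogenic) tuples.
    Returns None when the value is undefined (no abnormal controls, or
    controls of only one class) and inf when every abnormal control is
    pathogenic.
    """
    n_total = len(controls)
    n_path = sum(1 for _, is_path in controls if is_path)
    abnormal = [(s, is_path) for s, is_path in controls if s < cut]
    if n_total == 0 or n_path in (0, n_total) or not abnormal:
        return None
    p1 = n_path / n_total
    p2 = sum(1 for _, is_path in abnormal if is_path) / len(abnormal)
    if p2 == 1.0:
        return float("inf")
    return (p2 * (1 - p1)) / ((1 - p2) * p1)


if __name__ == "__main__":
    # Hypothetical control variants (score, is_pathogenic); illustration only.
    controls = [
        (0.00, True), (0.02, True), (0.50, True), (0.90, True),
        (0.30, False), (1.00, False), (0.95, False), (0.99, False),
    ]
    # Sweep the upper threshold to see how sensitive OddsPath is to it.
    for cut in (0.75, 0.80, 0.85, 0.90, 0.95):
        print(f"cut={cut:.2f}  OddsPath={odds_path(controls, cut)}")
```

      Reporting the resulting values across a small grid of cut-offs would show directly whether the evidence strength assigned to the assay depends on the particular thresholds chosen.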

      Minor comments

      1. Lines 53-59 - This paragraph needs to cite the literature, especially lines 56, 57, and 59
      2. Line 61 - no need to repeat "citrullinemia type I", just use the abbreviation as it was introduced in the paragraph above
      3. Lines 61-71 - again, this paragraph needs more literature citations
      4. Line 62 - change to "results"
      5. Line 74-75 - "RUSP" acronym not needed as it's never used in the manuscript, the same goes for "HHS"
      6. Line 86 - "ASS1" I think is referring to the enzyme and should just be "ASS"? If referring to the gene then italicize to "ASS1"
      7. Lines 91-93 - It would be helpful to mention this is a functional screen in yeast
      8. Line 101 - It would be helpful to the readers to define SD before using the acronym, consider changing to "minimal synthetic defined (SD) medium" and afterwards can refer to as "SD medium"
      9. 109-114 - It would be great if you could share your method for designing the codon-harmonized yASS1 gene, consider sharing as a supplemental script or creating a GitHub repository linked to a Zenodo DOI for publication.
      10. Lines 135-137 - I think it's helpful to provide a full table of the cassettes ordered from Twist as well as the primers used to amplify them, consider a supplemental table
      11. Line 138 - "standard methods" is a bit vague, I'm guessing this is a Geitz and Schiestl 2007 LiAc/ssDNA protocol (PMID 17401334)? Also, was ClonNAT used to select for natMX colonies?
      12. Line 150 - change to "sequence the entire open reading frame, as previously described [4]."
      13. Line 222-223 - remove "replace" and just use "complement" (and remove the parenthesis)
      14. Line 249 - It would be great to see a supplemental alignment of the hASS1 and yASS1 sequences
      15. Line 261 - spelling "citrullemia" should be corrected to "citrullinemia"
      16. Line 280 - "using Oxford Nanopore sequencing" is a bit vague, I suggest specifying the equipment used (e.g. Oxford Nanopore Technologies MinION platform) or simplify to "via long-read sequencing (see Materials & Methods)"
      17. Line 287-289 - It would be great to see the average number of isolates per variant, as well as a plot of the variant growth estimate vs individual isolate growth
      18. Lines 324-25 - consider removing the last sentence of this paragraph, it is redundant as the following paragraph starts with the same statement
      19. Lines 327-335 - This is interesting and would benefit from its own subpanel or plot in which the normalized growth score is plotted against variants that are at conserved or diverse residues in human ASS, and see if there's a statistical difference in score between the two groupings
      20. Lines 339-341 - As written, it is unclear if aspartate interacts with all of the same residues as citrulline or just Asn123 and Thr119.
      21. Lines 345-355 - As with my above comment, I find this interesting and would
      22. Line 353 - add a period to "al" in "Diez-Fernandex et al."
      23. Figure 1

      a. Remove "Figure" from the subpanels and show just "A" and "B" (as you do for Figure 4) and combine the two images into a single image. Also make this correction to Figure 5 and Figure 8

      b. Panel A - I thought the hASS1 and yASS1 were dropped into FY4, not the arg1 KO strain. This needs clarification

      c. Panel A - I'm assuming the natMX cassette contains its own promoter, you could use a right-angled arrow to indicate where the promoters are in your construct

      d. Panel B - I'm not sure the bar graph is necessary, it would be more helpful to see calculations of the colony size (or growth curves for each strain) and plot the raw values (maybe pixel counts?) for each replicate rather than normalizing to yeast ARG1. It would be great to have a supplemental figure showing all the replicates side-by-side

      e. Panel B - Would be helpful to denote the pathogenic and benign ClinVar variants with an icon or colored text

      f. Figure 1 Caption - make "A)" and "B)" bold

      24. Figure 2

      a. "Shown in magenta are amino acid substitutions corresponding to ClinVar pathogenic, pathogenic/likely pathogenic, and likely pathogenic variants" is repeated in the figure caption

      b. "Shown in green are amino acid substitutions corresponding to ClinVar benign and likely benign variants." I don't see any green points

      c. Identify the colors used for ASS1 substrate binding residues

      d. This plot would benefit from a depiction of the human ASS secondary structure and any protein domains (nucleotide-binding domain, synthase domain, and C-terminal helix from Fig4B)

      e. Line 685 - "ASS1" is being used in reference to the enzyme, is this correct or should it be "ASS"?

      25. Figure 3

      a. Rename the "unimpaired" category as there are several pathogenic ClinVar variants that fall into this category

      26. Figure 4

      a. List the PDB or AlphaFold accession used for this structure

      b. Panel A - state which colors are used to depict each monomer. It is confusing to see several shades of pink/purple used to depict a single monomer in Panel A

      c. It is very difficult to make out the aspartate and citrulline substrates in the catalytic binding activity, consider making an inset zooming-in on this domain and displaying a ribbon diagram of the structure rather than the surface.

      d. Generally, it would be more helpful here to label any particular residues that were identified as pathogenic from your screen, or to overlay average growth scores per residue onto the structure

      27. Figure 5

      a. Line 716 - Insert a page break to place Figure 5 on its own page

      b. I suggest using a heatmap for this type of plot, as it is very difficult to track which color corresponds to which residue

      c. Fig5A - This plot could be improved by identifying which residue positions interface with which substrate

      28. Figure 7

      a. Line 735 - Insert page break to place figure on a new page

      b. List the PDB accession used for these images

      c. For clarity I would mention "human ASS" in the figure title

      d. State the colors of the substrates

      e. Panels A and B could be combined into a single panel, making it easier to distinguish the active site and dimerization variants

      f. Could be interesting to get SASA scores for the ClinVar structural variants to determine if they are surface-accessible

      29. Figure 8

      a. Panel B - overlay a square frame in the larger protein structure that depicts where the below inset is focused, and frame inset image as well.

      Significance

      This study represents a substantial technical, functional, and translational advance in the interpretation of missense variation in ASS1, a gene of high clinical relevance for the rare disease citrullinemia type I. Its principal strength lies in the generation of an experimentally validated functional atlas of ASS1 missense variants that covers ~90% of all SNV-accessible substitutions. The scale, internal reproducibility, and careful benchmarking of the yeast complementation assay against known pathogenic and benign variants provide a robust foundation for identifying pathogenic ASS1 variants. Particularly strong aspects include the rigorous quality control of variant identities, the quantitative nature of the functional readout, and the thoughtful integration of results into the ACMG/AMP OddsPath framework. The discovery of intragenic complementation between variants affecting distinct structural regions of the enzyme is a notable conceptual and mechanistic contribution. Limitations include the assay's reduced sensitivity to variants impacting oligomerization or subtle folding defects, and the use of yeast as a heterologous system, which may mask disease-relevant mechanisms as several pathogenic ClinVar variants were found to be "functionally unimpaired". Future work extending functional testing to additional cellular contexts or expanding genotype-level combinatorial analyses would further enhance clinical applicability.

      Relative to prior studies, which have relied on small numbers of patient-derived variants or low-throughput biochemical assays, this work extends the field decisively by delivering a comprehensive, variant-resolved functional map for ASS1. To the best of my current knowledge, this is the first systematic functional screen of ASS1 at this scale and the first direct experimental demonstration that ASS active sites span multiple subunits, enabling intragenic complementation consistent with Crick and Orgel's classic variant sequestration model. As such, the advance is simultaneously technical (high-throughput functional genomics), mechanistic (defining structural contributors to catalysis and epistasis), and clinical (enabling evidence-based reclassification of VUS). I find the use of homozygous non-human primate variants as an orthogonal benign calibration set both creative and controversial, my hope would be that this manuscript will prompt a productive discussion.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      In this manuscript, Lo et al. characterize the phenotypic effect of ~90% of all possible ASS1 missense mutations using an elegant yeast-based system, and use this dataset to aid the interpretation of clinical ASS1 variants. Overall, the manuscript is well-written and the experimental data are interpreted rigorously. Of particular interest is the identification of pairs of deleterious alleles that rescue ASS1 activity in trans. My comments mainly pertain to the relevance of using a yeast screening methodology to infer functional effects of human ASS1 mutations.

      • Since human ASS1 is heterologously expressed in yeast for this mutational screen, direct comparison of native expression levels between human cells and yeast is not possible. Could the expression level of human ASS1 (driven by the pARG1 promoter) in yeast alter the measured fitness defect of each variant? For instance, if ASS1 expression in yeast is sufficiently high to mask modest reductions in catalytic activity, such variants may be misclassified as hypomorphic rather than amorphic. Conversely, if expression is intrinsically low, even mild catalytic impairments could appear deleterious. While it is helpful that the authors used non-human primate SNV data to calibrate their assay, experiments could be performed to directly address this possibility.
      • The nature of the relationship between yeast growth and availability of functional ASS1 could also influence the interpretation of results from the yeast-based screen. Does yeast growth scale proportionately with ASS1 enzymatic activity?
      • It would be helpful to add an additional diagram to Figure 1A explaining how the screen was performed, in particular: when genotype and phenotype were measured, relative to plating on selective vs non-selective media? This is described in "Variant library sequence confirmation" and "Measuring the growth of individual isolates" of the Methods section but could also be distilled into a diagram.
      • The authors rationalize the biochemical consequences of ASS1 mutations in the context of ASS1 per se - for example, mutations in the active site pocket impair substrate binding and therefore catalytic activity, which is expected. Does ASS1 physically interact with other proteins in human cells, and could these interactions be altered in the presence of specific ASS1 mutations? Such effects may not be captured by performing mutational scanning in yeast.
      • The authors note that only a small number (2/11) of mutations at the ASS1 monomer-monomer interface lead to growth defects in yeast. It would be helpful for the authors to discuss this further.

      Significance

      This study presents the first comprehensive mutational profiling of human ASS1 and would be of broad interest to clinical geneticists as well as those seeking biochemical insights into the enzymology of ASS1. The authors' use of a yeast system to profile human mutations would be particularly useful for researchers performing deep mutational scans, given that it provides functional insights in a rapid and inexpensive manner.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      Lo et al. report a high-throughput functional profiling study on the gene encoding argininosuccinate synthase (ASS1), done in a yeast experimental system. The study design is robust (see lines 141-143, main text, Methods), whereby "approximately three to four independent transformants of each variant would be isolated and assayed." (lines 140 - 141, main text, Methods). Such a manner of analysis will allow for uncertainty of the functional readout for the tested variants to be accounted for.

      This is an outstanding study providing insights on the functional landscape of ASS1. Functionally impaired ASS1 may cause citrullinemia type I, and disease severity varies according to the degree of enzyme impairment (line 30, main text; Abstract). Data from this study forms a valuable resource in allowing for functional interpretation of protein-altering ASS1 variants that could be newly identified from large-scale whole-genome sequencing efforts done in biobanks or national precision medicine programs. I have some suggestions for the Authors to consider:

      1. The specific function of ASS1 is to condense L-citrulline and L-aspartate to form argininosuccinate. Instead of measuring either depletion of substrate or formation of product, the Authors elected to study 'growth' of the yeast cells. This is a broader phenotype which could be determined by other factors outside of ASS1. While I agree that the experiments were beautifully done, the selection of an indirect phenotype such as ability of the yeast cells to grow could be more vigorously discussed.
      2. One of the key reasons why studies such as this one are valuable is due to the limitations of current variant classification methods that rely on 'conservation' status of amino acid residues to predict which variants might be 'pathogenic' and which variants might be 'likely benign'. However, there are serious limitations, and Figures 2 and 6 in the main text show this clearly. Specifically, there is an appreciable number of variants that, despite being classified as "ClinVar Pathogenic", were shown by the assay to be unlikely to be functionally impaired. This should be discussed vigorously. Could these inconsistencies be potentially due to the read out (growth instead of a more direct evaluation of ASS1 function)?
      3. Figure 3 is very interesting, showing a continuum of functional readout ranging from 'wild-type' to 'null'. It is very interesting that the Authors used a threshold of less than 0.85 as functionally hypomorphic. What does this mean? It would be very nice if they have data from patients carrying two hypomorphic ASS1 alleles, and correlate their functional readout with severity of clinical presentation. The reader might be curious as to the clinical presentation of individuals carrying, for example, two ASS1 alleles with normalized growth of 0.7 to 0.8.

      I hope you will find these suggestions helpful.

      Significance

      This is an outstanding study providing insights on the functional landscape of ASS1. Functionally impaired ASS1 may cause citrullinemia type I, and disease severity varies according to the degree of enzyme impairment (line 30, main text; Abstract). Data from this study forms a valuable resource in allowing for functional interpretation of protein-altering ASS1 variants that could be newly identified from large-scale whole-genome sequencing efforts done in biobanks or national precision medicine programs.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements [optional]

      We thank all three Reviewers for appreciating our work and for sharing constructive feedback to further enhance the quality of our study. It is really gratifying to read that the Reviewers believe that this work is interesting, novel and of interest to a broad audience. Therefore, we believe that it will be suitable for a high-profile journal. Further, the experiments suggested by the Reviewers have added value to the work and have substantiated our findings. It is important to highlight that we have performed all the suggested experiments. Please find below the detailed point-by-point response to the Reviewers’ comments.

      2. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The manuscript entitled, "IP3R2 mediated inter-organelle Ca2+ signaling orchestrates melanophagy" is a rather diffuse study of the relationship between IP3R2 and melanin production. While this is an interesting and understudied area, the study lacks a clear focus. The model seems to be that IP3R2 is essential for mitochondrial calcium loading. And that its absence increases lysosomal calcium loading. There are also a number of incomplete and/or unconvincing links to autophagy/melanophagy, TMEM165, TRPML1 and even gene transcription. In this kind of diffuse study, each step needs to be convincing to get to the next one, which is not the case here. There are also references to altered proteasome function, despite the total absence of any direct data on the proteasome. Finally, I felt it was sometimes unclear whether the authors were referring to melanosomes or lysosomes at various points throughout the study.

      While I suspect that, somewhere in here, there are some novel relationships worthy of further investigation, this is a case where the many parts make the overall product less convincing. What effects here are directly relevant to IP3R2? This study should stop there, leaving investigations of peripheral factors for future investigations, as the further you get from where you start, the less clear what you are studying becomes. And the less direct.

      Response: We thank the Reviewer for finding our study interesting and recognizing that this is an understudied area. Further, we appreciate the constructive feedback given by the Reviewer. We have addressed all the Reviewer’s comments. Please find below point-wise responses to the comments.

      Specific Comments:

      Comment 1. The separation of Figures 1F and 1J makes it impossible to assess the effect of αMSH on IP3R2 expression. This presentation makes interpretation difficult; a simple 4 lane Western would be more informative.

      Response: We apologize to the Reviewer for not being very clear. We have separated these data sets because they come from two independent experimental conditions. Figure 1F illustrates data from the LD-based pigmentation model, whereas Supplementary Figure 1K (previously Fig 1J) depicts data from the α-MSH–induced pigmentation model.

      Comment 2. One of the most attractive points made by this study is that there is a specific link between IP3R2 and melanin production. In my opinion, the null hypothesis is that this is just about the amount of IP3Rs expressed per cell. To reject this concept, the authors should show data demonstrating the relative expression of all 3 IP3Rs. Without this information, the null hypothesis that IP3R2 is the most expressed IP3R isoform and that's why its knockdown has the most dramatic effect cannot be rejected. It would also be helpful to show where the different IP3Rs are expressed within the cell.

      Response: We thank the Reviewer for raising this interesting point and for the constructive comment. We would like to clarify that the relative expression of all three IP₃R isoforms has already been analyzed in our study. Specifically, in Figure 1B, we demonstrate the expression pattern of IP₃R isoforms in our experimental system, where IP₃R2 shows the highest expression level, followed by IP₃R3 and IP₃R1 (IP₃R2 > IP₃R3 > IP₃R1). Further, in the revised manuscript, we additionally analyzed publicly available datasets for IP₃R expression. “The Human Protein Atlas” reports a higher expression of IP₃R2 in melanocytes compared to the other IP₃R isoforms (Supplementary Fig 1A). Therefore, we agree with the Reviewer’s proposed concept that the relatively higher expression of IP₃R2 can be one of the important factors that regulate pigmentation levels. Indeed, our analysis of a microarray dataset from African vs Caucasian skin revealed a greater IP₃R2 expression in African skin compared to Caucasian skin (Figure 1L).

      With respect to subcellular localization, all three IP₃R isoforms are predominantly localized to the endoplasmic reticulum, consistent with their established role as ER-resident Ca²⁺ release channels. However, their expression levels are known to be highly cell and tissue specific (Bartok et al., Nature Communications 2019), supporting the idea that higher IP₃R2 levels play a functionally specialized role in melanogenesis.

      Comment 3. It would be helpful to label Figs 3F-I with the conditions used. The description in the text is of increased LC3II levels, however, the ratio of LC3I to LC3II might be more meaningful. Irrespective, although the graph shows an increase in LC3II, the Western really doesn't show much. As a standalone finding, I don't find this figure to be very convincing; there are better options to demonstrate this proposed relationship between IP3R2 and autophagy than what is shown.

      Response: We sincerely thank the Reviewer for this thoughtful and critical evaluation, which has helped us improve the clarity and precision of this analysis. To address this concern, in the revised manuscript, we have now labeled ‘LD’ in the Supplementary Fig 2A-B (Previously, Fig 4F-I) with the corresponding experimental conditions for clarity. In addition, we reanalyzed the data by calculating the LC3II/LC3I ratio in all the figures of the revised manuscript that include LC3II expression, which provides a more meaningful and robust assessment of autophagic flux. This revised analysis yields a clearer representation of LC3 dynamics and strengthens the interpretation of the western blotting data in support of the relationship between IP₃R2 and autophagy. Further, to demonstrate the relationship between IP3R2 and autophagy, we have shown by confocal imaging that IP3R2 silencing significantly reduced the GFP/RFP ratio of the pMRX-IP-GFP-LC3-RFP reporter system in comparison to the control condition (Fig 4M-N). Collectively, these autophagy flux assays and biochemical experiments clearly demonstrate a direct relationship between IP3R2 and autophagy.
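      For readers less familiar with the readout, the LC3II/LC3I ratio described above can be sketched as a simple densitometry calculation. This is only an illustrative sketch: the function names and the band intensities below are hypothetical and are not taken from the manuscript, and the normalization to the control lanes is an assumption about how such ratios are commonly reported.

```python
from statistics import mean


def lc3_ratio(lc3_ii: float, lc3_i: float) -> float:
    """Return the LC3-II/LC3-I ratio for a single lane."""
    if lc3_i <= 0:
        raise ValueError("LC3-I intensity must be positive")
    return lc3_ii / lc3_i


def fold_change(treated, control) -> float:
    """Mean LC3-II/LC3-I ratio of treated lanes relative to control lanes.

    Each lane is a (LC3-II intensity, LC3-I intensity) pair.
    """
    treated_mean = mean(lc3_ratio(ii, i) for ii, i in treated)
    control_mean = mean(lc3_ratio(ii, i) for ii, i in control)
    return treated_mean / control_mean


# Hypothetical band intensities (arbitrary units), two replicate lanes each.
control_lanes = [(100.0, 200.0), (110.0, 220.0)]
knockdown_lanes = [(300.0, 200.0), (330.0, 220.0)]

relative_ratio = fold_change(knockdown_lanes, control_lanes)
print(f"LC3-II/LC3-I relative to control: {relative_ratio:.2f}")
```

      Taking the ratio within each lane before averaging means that lane-to-lane loading differences cancel for that lane, which is one reason a ratio can be easier to interpret than LC3II intensity alone.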

      Comment 4. The following statement at the beginning of page 22 "We observed an impaired proteasomal degradation of critical melanogenic proteins localized on melanosomes in the IP3R2 knockdown condition" is insufficiently supported by data to be made. Even if I was convinced that autophagy was enhanced, there is no data of any kind about the proteasome in this manuscript.

      Response: We appreciate the Reviewer’s careful scrutiny of this statement and the opportunity to clarify and strengthen our interpretation. To directly address the concern regarding proteasomal involvement, in the revised manuscript, we performed additional experiments using MG132, a well-established inhibitor of proteasomal degradation. These experiments were designed to assess whether the altered stability of melanogenic proteins observed upon IP₃R2 knockdown could be attributed to changes in proteasome-mediated turnover.

      In the revised manuscript, our new data show that treatment with MG132 leads to a marked reduction in the levels of melanosome-associated melanogenic proteins, including GP100 and DCT, compared to the DMSO control (Fig. 4A–D). This response contrasts with that of non-melanosomal proteins, such as IP₃R2 and Calnexin, which are localized to the endoplasmic reticulum and exhibit increased accumulation upon MG132 treatment (Fig. 4E–H), consistent with canonical proteasomal inhibition. These differential outcomes suggest that melanosome-resident proteins respond distinctly to proteasomal blockade, likely due to their compartmentalized localization on melanosomes.

      Previous studies have shown that impairment of proteasomal function can activate autophagy as a compensatory, cytoprotective mechanism (Williams et al, 2013; Li et al, 2019; Su & Wang, 2020; Pan et al, 2020). Indeed, we observed a significant increase in LC3II/LC3I levels in IP3R2 knockdown plus MG132 treatment condition in comparison to IP3R2 knockdown plus the DMSO control (Fig. 4I–J).

      To investigate whether impairment of proteasomal degradation upon IP3R2 silencing alone or together with MG132 selectively triggers melanophagy, we assessed melanophagy using the melanophagy reporter mCherry-Tyrosinase-eGFP following IP3R2 silencing along with MG132 treatment. Our observations revealed an increase in melanophagy flux with IP3R2 silencing and MG132 treatment compared to siNT with DMSO control (Fig 5K-L). This suggests that inhibition of proteasomal degradation induced by IP3R2 silencing activates melanophagy. Taken together, these findings indicate that compromised proteasomal degradation engages the autophagy machinery, providing a mechanistic link between proteasome dysfunction, enhanced autophagy, and altered melanogenic protein turnover.

      Comment 5. In figure 5, the authors create a new ratiometric dye to detect melanosome stability based on the principle that tyrosinase is exclusively found in melanosomes. Unfortunately, there is no validation that this new construct is found exclusively in melanosomes upon expression. In addition, there is discussion about the pH of lysosomes, but not of melanosomes. Ultimately, this data cannot be considered at face value without any type of validation; I also note that the pictures lack sufficient detail to support identification of these structures as melanosomes. While I maintain the above concerns, I note that the data in supplemental figure 3 is MUCH more convincing than what is in the figure. Both the writing and the figure design should be rethought.

      Response: We appreciate the Reviewer’s thorough evaluation and constructive critique of Figure 5, which has helped us to better clarify and validate this aspect of the study. In the revised manuscript, to directly address the concern regarding the subcellular specificity of the ratiometric probes, we performed detailed colocalization analysis using established melanosome markers. Specifically, we assessed the localization of the melanophagy detection probes mCherry–Tyr–eGFP and tyrosinase–mKeimaN1 with the melanosome-resident protein GP100 detected by anti-HMB45 (Supplementary Fig 2E-F and 2K-L). These analyses revealed a very high degree of colocalization, reflected by strong Pearson’s correlation and overlap coefficients, thereby validating that the expressed probes are predominantly localized to melanosomes.
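      As an illustration of the two colocalization metrics named above, the following minimal sketch computes Pearson's correlation coefficient and the Manders overlap coefficient from paired pixel intensities. It assumes the two channels have already been flattened into equal-length lists; the function names and the pixel values are hypothetical and do not reproduce the imaging pipeline used in the manuscript.

```python
from math import sqrt


def pearson_coefficient(ch1, ch2) -> float:
    """Pearson's correlation between paired pixel intensities of two channels."""
    if len(ch1) != len(ch2) or not ch1:
        raise ValueError("channels must be non-empty and of equal length")
    n = len(ch1)
    mean1 = sum(ch1) / n
    mean2 = sum(ch2) / n
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(ch1, ch2))
    var1 = sum((a - mean1) ** 2 for a in ch1)
    var2 = sum((b - mean2) ** 2 for b in ch2)
    if var1 == 0 or var2 == 0:
        raise ValueError("a channel with constant intensity has no defined correlation")
    return cov / sqrt(var1 * var2)


def overlap_coefficient(ch1, ch2) -> float:
    """Manders overlap coefficient: sum(a*b) / sqrt(sum(a^2) * sum(b^2))."""
    if len(ch1) != len(ch2) or not ch1:
        raise ValueError("channels must be non-empty and of equal length")
    numerator = sum(a * b for a, b in zip(ch1, ch2))
    denominator = sqrt(sum(a * a for a in ch1) * sum(b * b for b in ch2))
    if denominator == 0:
        raise ValueError("at least one channel is entirely zero")
    return numerator / denominator


# Hypothetical pixel intensities: probe channel and GP100 (HMB45) channel.
probe_pixels = [10.0, 20.0, 30.0, 40.0]
gp100_pixels = [20.0, 40.0, 60.0, 80.0]

print(pearson_coefficient(probe_pixels, gp100_pixels))
print(overlap_coefficient(probe_pixels, gp100_pixels))
```

      Pearson's coefficient subtracts each channel's mean and so ranges from -1 to 1, whereas the overlap coefficient uses raw intensities and ranges from 0 to 1 for non-negative images; reporting both, as the response does, covers these two complementary views.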

      Regarding lysosomal/melanosomal pH considerations, our ratiometric melanophagy detection probes, mCherry–Tyrosinase–eGFP (sensitive to acidic pH via eGFP) and tyrosinase–mKeimaN1 (sensitive to acidic pH via Keima), are specifically designed to identify melanosome degradation, which happens upon melanosome fusion with the lysosome. Consequently, the observed signal shifts indicate melanosome turnover rather than merely reflecting the lysosomal pH.

      To further corroborate the microscopic observations, we performed biochemical assays to study melanophagy flux upon IP3R2 silencing. We employed Bafilomycin A1, an inhibitor of autophagosome-lysosome fusion, to examine melanosomal protein accumulation. Upon Bafilomycin A1 treatment, IP3R2-silenced cells showed enhanced accumulation of melanosomes, as indicated by elevated tyrosinase levels compared with siNT controls (Supplementary Fig 3C-D), indicating elevated melanophagy flux upon IP3R2 knockdown. In the revised manuscript, we employed additional melanophagy detection strategies to further strengthen our findings. Specifically, we used Retagliptin phosphate (RTG), a well-established selective inducer of melanophagy, and observed a marked increase in melanophagy using the mCherry–Tyrosinase–eGFP melanophagy probe (Supplementary Fig 2G-H). Additionally, we performed independent validation by assessing colocalization of melanosomes (recognized by the anti-HMB45 antibody, which identifies the melanosomal structural protein GP100) with LC3 (Supplementary Fig 3A-B). This analysis revealed a significant increase in melanosome colocalization with LC3 upon IP₃R2 silencing compared to control conditions.

      Collectively, these independent approaches clearly demonstrate that the melanophagy probes localize to melanosomes and detect melanophagy (by responding to melanosome fusion with lysosomes).

      Comment 6. Given the increase in ER Ca2+ content after IP3R2 knockdown, ER calcium content should be emptied before attempting to estimate lysosomal Ca2+ content with GPN or Bafilomycin. Otherwise, the source of calcium is less than clear.

      Response: We appreciate the Reviewer’s careful consideration of the Ca²⁺ source, which is critical for accurate interpretation of these experiments. Therefore, as suggested, in the revised manuscript, we conducted experiments involving Thapsigargin (Tg) pre-treatment to deplete ER Ca²⁺ reserves before examining lysosomal Ca²⁺ release using GPN or Bafilomycin (Supplementary Fig 6I-N). Even under these conditions, we noted increased lysosomal Ca²⁺ release in IP₃R2 knockdown cells, thus confirming that the observed Ca²⁺ signals originate from lysosomes rather than any remaining ER Ca²⁺. Importantly, this approach allowed us to minimize ER-derived Ca²⁺ contributions to changes in the lysosomal Ca²⁺ release.


      Reviewer #1 (Significance (Required)):

      The manuscript entitled, "IP3R2 mediated inter-organelle Ca2+ signaling orchestrates melanophagy" is a rather diffuse study of the relationship between IP3R2 and melanin production. While this is an interesting and understudied area, the study lacks a clear focus. The model seems to be that IP3R2 is essential for mitochondrial calcium loading. And that its absence increases lysosomal calcium loading. There are also a number of incomplete and/or unconvincing links to autophagy/melanophagy, TMEM165, TRPML1 and even gene transcription. In this kind of diffuse study, each step needs to be convincing to get to the next one, which is not the case here. There are also references to altered proteasome function, despite the total absence of any direct data on the proteasome. Finally, I felt it was sometimes unclear whether the authors were referring to melanosomes or lysosomes at various points throughout the study.

      Response: We thank the Reviewer for finding our work interesting and appreciating that this is an understudied field. Further, we thank him/her for the constructive feedback on our study. We have performed several additional experiments and significantly revised the manuscript to address all the comments of the Reviewer.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In the present manuscript, Saurav et al. identify IP3R2-mediated ER calcium release as a key suppressor of melanophagy, thereby sustaining pigmentation in melanocytes. Using in vitro (B16 murine melanoma cells, primary human melanocytes) and in vivo (zebrafish) models, the authors report that IP3R2 expression is positively correlated with pigmentation. They then investigate the impact of IP3R2 knockdown and find that IP3R2 silencing enhances the stability of melanogenic proteins, while also inducing autophagic degradation of melanosomes (i.e., melanophagy). Concomitantly, they find that IP3R2 silencing decreases mitochondrial calcium uptake, increases lysosomal calcium loading, and lowers lysosomal pH. They propose a pathway wherein in IP3R2 knockdown cells impaired mitochondrial calcium uptake induces the activation of AMPK-ULK1, and increased lysosomal calcium activates TRPML1 via TMEM165 and closer proximity interactions between ER and lysosomes, TFEB nuclear translocation, and upregulation of melanophagy-related genes, namely OPTN and RCHY1. The work is placed within the context of emerging roles of organelle calcium signaling in pigmentation biology, where extracellular calcium influx pathways are known regulators, but the contribution of ER-mitochondria-lysosome crosstalk to melanosome turnover remains largely unknown.

      Response: We thank the Reviewer for appreciating our work and highlighting that the contribution of ER-mitochondria-lysosome crosstalk to melanosome turnover remains largely unappreciated.

      Major comments:

      Comment 1- The central finding is that IP3R2 knockdown induces melanophagy and reduces pigmentation. However, the manuscript does not identify any physiological or pathological context in which IP3R2 expression or activity is naturally downregulated in melanocytes. Without such context, the knockdown may represent an artificial perturbation that broadly alters ER calcium handling and triggers melanophagy as part of a general stress-induced autophagy response. This raises uncertainty about whether the pathway operates in vivo under normal or disease conditions. It would strengthen the study to identify upstream cues that reduce IP3R2 function and to test whether these also trigger melanophagy through the proposed mechanism.


      Response: We thank the Reviewer for asking such an important question. The Reviewer asked us to identify any physiological or pathological context in which IP3R2 expression is naturally downregulated in melanocytes. To address this question, in the revised manuscript, we analyzed publicly available microarray datasets comparing skin samples from Caucasian and African populations (Yin et al., Experimental Dermatology 2014). This unbiased analysis revealed considerably lower IP₃R2 expression in Caucasian skin as compared to African skin (Fig. 1L). These data support a physiological correlation between IP₃R2 expression and pigmentation level, reinforcing the physiological relevance of the proposed pathway.


      Comment 2- While the data link IP3R2 knockdown to decreased pigmentation and increased melanophagy, the causality between altered organelle calcium dynamics and the melanophagy induction is inferred from correlation and partial rescue experiments. More direct interventions in the proposed downstream pathways (e.g., acute mitochondrial calcium uptake restoration, lysosomal calcium buffering) would strengthen mechanistic claims.

      Response: We appreciate the Reviewer’s recommendation on strengthening the mechanistic causality between organelle Ca²⁺ dynamics and melanophagy. As suggested, in the revised manuscript, we restored acute mitochondrial Ca²⁺ uptake by MCU over-expression in the IP₃R2 knockdown background, which resulted in a marked reduction in melanophagy along with increased mitochondrial Ca²⁺ uptake in comparison to control (Fig 6I-L). These data clearly demonstrate that, downstream of IP₃R2 silencing, mitochondrial Ca²⁺ restoration rescues the melanophagy phenotype, thereby revealing a mechanistic causality between mitochondrial Ca²⁺ dynamics and melanophagy.

      Similarly, to assess the causality between lysosomal Ca²⁺ dynamics and melanophagy, we silenced TMEM165 in the IP₃R2 knockdown background. Excitingly, upon TMEM165 knockdown we observed a reduction in melanophagy, concomitant with a decrease in lysosomal Ca²⁺ levels under IP₃R2 silencing conditions (Supplementary Fig 7I-L). Together, these direct manipulations support a causal role for altered organelle Ca²⁺ dynamics in driving melanophagy.


      We believe that these experiments address the Reviewer’s concern. However, if there are any other specific experiments that the Reviewer would like us to perform, we would be happy to carry them out as well.

      Comment 3- Zebrafish assays convincingly show altered pigmentation with altered IP3R2 levels, but do not connect this to in vivo melanophagy measurements or TRPML1/TFEB activity, which would link the cell biology to organismal phenotype more directly.

      Response: We thank the Reviewer for appreciating our in vivo zebrafish experiments. Further, we acknowledge the Reviewer’s point about linking the cellular mechanisms to organismal phenotypes in vivo. Therefore, as suggested, we activated TRPML1 in the zebrafish model system. In the revised manuscript, we investigated the role of the TRPML1–TFEB axis in pigmentation in vivo by pharmacological activation of TRPML channels with MLSA1. The MLSA1 treatment resulted in a marked reduction in zebrafish pigmentation compared to vehicle-treated controls (Fig. 8M). This phenotypic change was further substantiated by quantitative melanin content assays, which confirmed a significant decrease in melanin levels following MLSA1 treatment (Fig. 8M–N). These in vivo findings support the involvement of TRPML1-mediated lysosomal signaling in pigmentation regulation.

      Comment 4- The work suggests therapeutic potential for pigmentary disorders, but no disease models are tested. It is unclear whether the observed mechanisms operate under physiological stressors.

      Response: We appreciate the Reviewer’s comment regarding physiological relevance and disease context. As addressed in Comment 1, we examined publicly available human skin microarray datasets for IP₃R2 expression in Caucasian and African populations. This analysis revealed a positive correlation between IP₃R2 expression and human skin pigmentation, supporting that modulation of IP₃R2 occurs under physiological conditions rather than representing an artificial perturbation.

      While formal pigmentary disease models were not examined in this study, the observed correlation between IP₃R2 expression and physiological pigmentation differences along with our robust in vivo zebrafish data suggests that IP₃R2 plays an important role in physiological pigmentation. As highlighted by Reviewer 1 and Reviewer 3, the manuscript is already too long. Therefore, we plan to delineate the precise role of IP₃R2 in pigmentary disorders as an independent study.

      Comment 5- The paradox between the observed enhanced stability of melanogenic proteins and increased melanophagy is insufficiently addressed. DCT, Tyrosinase and GP100 are all melanosome-associated and their stability or degradation is in prior literature often interpreted as reflecting melanosome biogenesis and turnover. This discrepancy needs to be resolved, as it complicates interpretation of melanophagy assays.

      Response: We appreciate the Reviewer’s careful consideration of this apparent paradox. This point was also raised by Reviewer 1. We have addressed the query in detail in response to Comment 4 of Reviewer 1. Briefly, the enhanced stability of melanosome-associated proteins reflects impaired proteasomal degradation and prolonged protein half-life, while the concurrent increase in melanophagy represents a compensatory turnover mechanism for degrading such dysfunctional melanosomes.

      Thus, increased melanophagy and apparent stabilization of melanogenic proteins are not contradictory but instead represent parallel outcomes of disrupted proteostasis. This interpretation is supported by our proteasomal inhibition experiments (Fig 4A-H) and autophagy analyses (Fig 4I-P), which collectively reconcile the observed protein stability with enhanced melanosome turnover.


      Comment 6- The authors propose that mitophagy and ER-phagy are reduced in IP3R2 knockdown cells, suggesting specific induction of melanophagy, but the rationale for why increased autophagic flux only targets melanosomes is insufficiently addressed. Also, these conclusions are solely based on Keima assays, and positive controls for mitophagy and ER-phagy are lacking.

      Response: We appreciate the Reviewer’s critical assessment of the specificity of autophagic targeting in the IP₃R2 knockdown condition and the need for appropriate validation controls. In the revised manuscript, we have repeated both the mitophagy and ER-phagy assays with well-established positive controls. Carbonyl cyanide-p-trifluoromethoxyphenylhydrazone (FCCP) was employed as a positive control to robustly induce mitophagy (Supplementary Fig 4E-F), while 4-phenylbutyric acid (4PBA) was used as a positive control for ER-phagy/reticulophagy (Supplementary Fig 4G-H). Secondly, we have validated the microscopy data with biochemical assays by examining the levels of ER-resident proteins (Fig 4E-H) and the mitochondria-resident protein MCU.

      To provide a mechanistic rationale for the specific induction of melanophagy, we examined recently identified regulators of melanophagy, RCHY1 and OPTN (Lee et al., PNAS 2024). Bioinformatic analysis identified multiple TFEB binding sites on the promoters of both genes, which was supported by increased RCHY1 and OPTN expression following IP₃R2 knockdown. Further, in the revised manuscript, we performed additional loss-of-function experiments to demonstrate that co-silencing IP3R2 along with RCHY1 or OPTN significantly reduced melanophagy flux compared to IP₃R2 knockdown alone (Fig. 9H–K). Taken together, these data explain why enhanced autophagic flux downstream of IP₃R2 silencing is preferentially directed toward melanosomes.

      Comment 7- The melanophagy probes are novel and validated with rapamycin/bafilomycin, but quantitative calibration of GFP/mCherry or Keima signal to actual lysosomal delivery rates is missing; photobleaching, pH heterogeneity (incl., observed decrease in lysosomal pH), and melanin autofluorescence (see below) could confound ratios. Also, side-by-side comparison with other melanophagy detection approaches (e.g., colocalization of melanosomes with LC3) is lacking.

      Response: We appreciate the Reviewer’s careful evaluation of the melanophagy probes and the potential technical confounders. In the revised manuscript, we have performed a variety of experiments to further characterize and validate the probes. First of all, the ratiometric melanophagy detection probes (mCherry–Tyrosinase–eGFP and tyrosinase–mKeimaN1) are built on well-established and extensively validated backbones. Further, we used appropriate controls (empty vectors/non-targeting siRNAs/vehicle controls) in all experiments to analyze the relative fluorescence changes in the test condition versus control. The confounding factors, if any, should be present for both test and control. Therefore, we initially did not perform side-by-side comparison with other melanophagy detection approaches.

      In the revised manuscript, as suggested by the Reviewer, we employed additional melanophagy detection strategies to further strengthen our findings. Specifically, we used Retagliptin phosphate (RTG), a well-established selective inducer of melanophagy, and observed a marked increase in melanophagy using the mCherry–Tyrosinase–eGFP melanophagy probe (Supplementary Fig 2G-H). Additionally, we performed independent validation by assessing colocalization of melanosomes (recognized by the anti-HMB45 antibody, which identifies the melanosomal structural protein GP100) with LC3 (Supplementary Fig 3A-B). This analysis revealed a significant increase in melanosome colocalization with LC3 upon IP₃R2 silencing compared to control conditions. Further, to minimize the contribution of melanin autofluorescence, non-transfected cells were imaged under identical settings, and background signals obtained from these cells were subtracted during fluorescence quantitation from all acquired images. Potential effects of photobleaching and pH heterogeneity were minimized by uniform acquisition parameters and ratiometric analysis. Taken together, we believe these complementary approaches address the Reviewer’s concerns and reinforce the robustness of our melanophagy measurements.

      Comment 8- Melanosomes exhibit broad autofluorescence, particularly upon excitation at 405-488 nm and extending into the red channel. This signal can overlap with the detection ranges for GFP, mCherry, and mKeima reporters, potentially confounding quantitative readouts unless appropriate controls (e.g., untransfected cells, spectral unmixing) are used. Throughout this manuscript, it is not addressed how melanosome autofluorescence was controlled for or excluded in the reported fluorescence measurements.

      Response: We apologize to the Reviewer for not clearly stating that melanosome autofluorescence was controlled by imaging non-transfected cells under identical settings, and that these background signals were subtracted from the acquired images during quantitation. Specifically, to rigorously control this issue, autofluorescence was systematically evaluated using non-transfected control cells imaged under the same excitation and emission settings used for the GFP, mCherry, and mKeima reporters. These controls allowed us to define the baseline autofluorescence profile arising from melanosomes across the relevant spectral ranges. These details are included in the methods section.
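      The background-subtraction step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: it assumes a single mean autofluorescence value per channel estimated from non-transfected cells, clips negative values to zero, and uses hypothetical function names and intensities.

```python
from statistics import mean


def subtract_background(signal, background):
    """Subtract the mean autofluorescence (from non-transfected cells)
    from each measured intensity, clipping negative results to zero."""
    baseline = mean(background)
    return [max(value - baseline, 0.0) for value in signal]


def corrected_ratio(green, red, green_background, red_background) -> float:
    """Green/red ratio computed after per-channel background subtraction."""
    green_total = sum(subtract_background(green, green_background))
    red_total = sum(subtract_background(red, red_background))
    if red_total == 0:
        raise ValueError("no red signal remains after background subtraction")
    return green_total / red_total


# Hypothetical per-cell intensities (arbitrary units).
egfp = [60.0, 80.0]               # transfected cells, green channel
mcherry = [110.0, 130.0]          # transfected cells, red channel
egfp_autofluor = [10.0, 10.0]     # non-transfected cells, green channel
mcherry_autofluor = [20.0, 20.0]  # non-transfected cells, red channel

ratio = corrected_ratio(egfp, mcherry, egfp_autofluor, mcherry_autofluor)
print(f"background-corrected eGFP/mCherry ratio: {ratio:.2f}")
```

      Because the correction is applied to each channel before the ratio is taken, autofluorescence that differs between the green and red channels does not bias the ratio in one direction.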

      Comment 9- While OPTN and RCHY1 expression is elevated upon IP3R2 knockdown, functional engagement (e.g., OPTN localization to melanosomes, melanosome ubiquitination by RCHY1), or necessity (e.g., siRNA knockdown of these in the IP3R2-deficient background), are not tested.

      Response: We appreciate the Reviewer’s point on establishing the necessity of OPTN and RCHY1 in IP₃R2 knockdown–induced melanophagy. In the revised manuscript, we performed targeted loss-of-function analyses for both OPTN and RCHY1 in the IP₃R2-deficient background. We assessed melanophagy using the mCherry–Tyrosinase–eGFP melanophagy probe following co-silencing of IP₃R2 with either OPTN or RCHY1. Quantitative analysis revealed a significant reduction in melanophagy flux upon co-silencing of either gene compared to IP₃R2 silencing alone (Fig. 9H–K). These findings establish the functional requirement of OPTN and RCHY1 downstream of IP₃R2 loss to drive melanophagy. Since functional engagement of OPTN and RCHY1 on melanosomes is already well-established (Lee et al. PNAS 2024 and Park et al. Autophagy 2024), we have not repeated these experiments. Taken together, our data demonstrate that OPTN and RCHY1 are not only overexpressed but also act as critical mediators of melanophagy downstream of IP₃R2 silencing.

      Comment 10- While siRNA/shRNA efficacy is shown, functional rescue with pore-dead mutants sometimes fails to return to control values. The possibility of partial off-target or compensatory effects is not fully excluded.

      Response: We thank the Reviewer for raising this point. In this study, we employed pore-dead mutants of IP₃R2 (IP₃R2-M) and TRPML1 (TRPML1-M), both of which are well characterized, widely validated, and extensively used by a number of leading groups in the field. In our review of the literature, we came across multiple studies in which a partial rescue effect was reported with these pore-dead mutants. Therefore, we believe it is not surprising that we also observe partial rescue in some of our assays.

      It is important to note that we observe rescue of the function and phenotype in every experiment carried out with the mutants. We agree with the Reviewer that the extent of rescue does not reach control levels in a few experiments. This can be attributed to differences in the extent of mutant expression across experiments. However, we have validated the results with multiple independent approaches. Collectively, the use of these independent approaches, along with genetic silencing and pharmacological inhibition/activation, supports the specificity of the observed phenotypes.

      Comment 11- The mitochondrial and lysosomal calcium measurements are largely endpoint peak quantifications; kinetic analyses and buffering capacity measurements would provide more mechanistic depth, especially for the TMEM165 contribution. Also, TMEM165 necessity for melanophagy induction upon IP3R2 knockdown has not been directly addressed.

      Response: We appreciate the Reviewer’s request for greater mechanistic depth regarding organelle Ca²⁺ dynamics and the specific contribution of TMEM165. We had previously demonstrated that TMEM165 silencing decreases lysosomal Ca²⁺ levels using Oregon BAPTA–dextran–based measurements (Supplementary Fig 7C-D), establishing its role in regulating lysosomal Ca²⁺ buffering. Building on this, in the revised manuscript, we performed kinetic analyses of lysosomal Ca²⁺ levels following IP₃R2 and TMEM165 silencing. These kinetic analyses validated our endpoint measurements: IP₃R2 knockdown leads to an increase in lysosomal Ca²⁺ levels, whereas TMEM165 silencing results in a decrease in lysosomal Ca²⁺ content in comparison to control. These results highlight distinct and opposing effects of IP₃R2 and TMEM165 on lysosomal Ca²⁺ kinetics.

      Further, we directly evaluated the necessity of TMEM165 for melanophagy induction in the IP₃R2-deficient background. TMEM165 knockdown alone resulted in a significant reduction in melanophagy (Supplementary Fig 7G-H). Further, co-silencing of TMEM165 with IP₃R2 also attenuated melanophagy compared to IP₃R2 knockdown alone (Supplementary Fig 7K-L). Collectively, these kinetic Ca²⁺ assays and genetic loss-of-function analyses provide mechanistic depth to the organelle Ca²⁺ measurements and establish TMEM165 as a critical regulator of melanophagy downstream of IP₃R2 silencing.

      Comment 12- The proximity ligation assay between VAP-A and LAMP1 is interpreted as showing increased ER-lysosome contacts in IP3R2 knockdown cells. However, additional controls are needed and quantitative TEM should be included to substantiate changes in organelle contact frequency and distance.

      Response: We thank the Reviewer for the emphasis on strengthening the validation of the proximity ligation assay (PLA) findings and on providing ultrastructural evidence to support altered organelle interactions. The PLA data revealed a significant increase in VAP-A–LAMP1 interaction signals in IP₃R2-silenced cells compared to control conditions (Fig. 7L–M). In the revised manuscript, this increase was not observed upon treatment with bafilomycin A1, a specific inhibitor of lysosomal acidification, or when one of the primary antibodies was omitted, confirming the specificity of the PLA signal (Fig. 7L–M). These controls support the interpretation that IP₃R2 downregulation enhances ER–lysosome interactions.

      To further substantiate the changes in organelle contact frequency and distance, we performed ultrastructural analyses using transmission electron microscopy (TEM). The quantitative TEM measurements revealed no significant change in the frequency of ER–mitochondria or ER–lysosome contacts upon IP₃R2 silencing (Fig. 7N–P). Similarly, ER–mitochondria distances remained unchanged. However, we observed a significant reduction in the distance between the ER and lysosomes in IP₃R2 knockdown cells compared to control (Fig. 7N, 7Q–R). Together, these complementary approaches demonstrate that IP₃R2 silencing specifically increases ER–lysosome proximity without altering overall contact frequency, thereby strengthening the conclusion that IP₃R2 regulates ER–lysosome coupling.

      Comment 13- Some assays report small biological n (e.g., three independent experiments with relatively small per-condition cell counts).

      Response: We appreciate the Reviewer’s comment regarding sample size. All experiments were performed with a minimum of three independent biological replicates, which is consistent with standard practice in the field. For imaging-based assays, multiple fields of view and cells were analyzed per condition in each independent experiment, and quantitative analyses were performed on pooled data across replicates. As suggested by the Reviewer, we have increased the cell numbers in some experiments. The detailed information on biological replicates and cell numbers analyzed is provided in the respective figure legends.

      Minor comments:

      Comment 1- The title "IP3R2-mediated inter-organelle Ca2+ signaling orchestrates melanophagy" could be misread as indicating IP3R2 'promotes' melanophagy; consider rewording to make clear that IP3R2 suppresses melanophagy to maintain pigmentation. Similarly, the running title "IP3R2 negatively regulates melanophagy" would be clearer as "IP3R2 suppresses melanophagy".

      Response: As suggested by the Reviewer, we have modified the title and running title in the revised manuscript.

      Comment 2- Unify the framing of "positively regulates pigmentation" vs. "negatively regulates melanophagy" in the Introduction/Discussion.

      Response: As recommended, we have unified the framing in the suggested sections.

      Comment 3- Adding schematic flow diagrams summarizing each pathway at the end of relevant results (figure) sections could help accessibility.

      Response: We appreciate the Reviewer’s suggestion to improve accessibility of the presented pathways. Accordingly, we have included schematic diagrams at the end of the relevant figures. These schematics summarize: (i) ER–mitochondria interactions in the context of melanophagy (Fig. 6P); (ii) differences in Ca²⁺ and pH regulation between wild-type and IP₃R2-silenced cells (Fig. 7S); and (iii) TRPML1-mediated Ca²⁺ release driving melanophagy via TFEB translocation (Fig. 9L). Together, these diagrams provide a concise visual overview of the key mechanistic pathways described in the study.

      Comment 4- While the introduction summarizes extracellular calcium signaling in pigmentation, there is less coverage of recent work on selective autophagy of other lysosome-related organelles (e.g., platelet dense granules, lytic granules), which could provide broader mechanistic context.

      Response: As suggested by the Reviewer, we have discussed selective autophagy of other lysosome-related organelles in the introduction.

      Reviewer #2 (Significance (Required)):

      This study addresses an important gap in pigmentation biology by identifying IP3R2-mediated ER calcium release as a suppressor of melanophagy and a positive regulator of pigmentation. The strongest aspects are the integration of in vitro and in vivo models, the multi-faceted mechanistic exploration linking altered organelle calcium dynamics to selective melanosome turnover, and the development of novel ratiometric fluorescent probes for live-cell melanophagy measurement. Conceptually, the work extends prior literature that has focused on extracellular calcium influx and melanosome biogenesis, revealing a new inter-organelle calcium signaling module that controls melanosome degradation via AMPK-ULK1 and TMEM165-TRPML1-TFEB pathways.

      However, several limitations reduce the strength of the mechanistic claims. Some key pathway steps are inferred from correlation and partial rescue rather than direct necessity/sufficiency tests (e.g., mitochondrial calcium uptake restoration, lysosomal calcium buffering). The paradoxical observation that IP3R2 knockdown both increases melanophagy and stabilizes melanosome-resident proteins (DCT, Tyrosinase, GP100) is not resolved, complicating interpretation of the melanophagy assays. The specificity for melanophagy over other selective autophagy pathways is asserted but not fully explained mechanistically, and positive controls for mitophagy/ER-phagy are missing. Potential technical confounds, such as melanin autofluorescence in the detection ranges of GFP, mCherry, and mKeima, are not explicitly addressed and alternative assays for these key data were insufficiently employed. In vivo results do not yet connect altered pigmentation to melanophagy readouts or downstream TRPML1/TFEB activation. Importantly, the study does not identify any physiological or pathological scenario in which IP3R2 expression or activity is naturally reduced in melanocytes. In the absence of such upstream cues, IP3R2 knockdown may represent an artificial perturbation that triggers melanophagy as part of a broader stress-induced autophagy response, raising questions about the in vivo relevance of the proposed pathway.

      The work's primary audience is specialized, cell biologists, autophagy researchers, and pigmentation/skin biology specialists, but the mechanistic framework on organelle crosstalk and selective autophagy will interest a broader basic research readership, including those studying lysosome-related organelles in other systems. The ratiometric probes could be adapted for future melanophagy research, and the pathway insights may guide translational studies in pigmentary disorders or melanoma. My expertise is in mitochondrial and lysosomal calcium signaling, autophagy, and microscopy-based functional assays; I do not have detailed expertise in zebrafish developmental genetics, though the phenotypic analysis appears sound.

      Response: We thank the Reviewer for appreciating our work and stating that our study “addresses an important gap in pigmentation biology”. We also thank the Reviewer for noting that this work will be of interest to a broad basic research readership, and for valuing the potential significance of the ratiometric melanophagy probes generated in this study. Finally, we acknowledge the Reviewer’s constructive feedback, which has helped us enhance the quality of our manuscript. We have performed a variety of additional in vitro experiments and in vivo zebrafish studies, and have significantly revised the manuscript to address all of the Reviewer’s comments.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This is a robust and extensive study showing that IP3R2 selectively initiates a calcium signalling pathway leading to melanophagy, that is the degradation of melanosomes. This reduces pigmentation and UV light protection. A strength of the paper is that it combines detailed cellular studies with in vivo studies in the zebrafish model. They show that knockdown of IP3R2 reverses this process perhaps leading to a strategy to enhance melanosome number and hence to afford protection from UV irradiation. The authors use a battery of fluorescent probes (mainly genetically encoded reporters) to investigate the signalling cascade leading to melanophagy or its reduction. This involves reporters for a number of different organelles involved in this process. The experiments are generally well performed with clear controls for the probes in many cases. My main issue is the panels contain too much data which may obscure the message, and a good deal could be moved to supplementary data. The manuscript investigates many mechanisms in distinct organelles which is remarkable for a two author paper. Particularly interesting was the design of novel fluorescent protein reporters for melanophagy itself. One area not explored is ion fluxes across melanosomes themselves which are lysosome-related organelles and may exhibit similar properties and signalsomes of lysosomes.

      Specifically, the authors show that a REDUCTION of IP3R2-mediated calcium release leads to a calcium flux from the ER by a different mechanism (possibly via TMBIM6). This increases calcium loading of the lysosome via TMEM165, at the expense of calcium transfer to mitochondria, and an acidification.

      This leads to TRPML1 activation and the lysosomal calcium release activates TFEB translocation to the nucleus increases the transcription of autophagy/melanophagy genes and activation of the AMPK-ULK1 pathway (rather than mTOR). This is a complex pathway and evidence is presented for many of the steps involved.

      This is a tour de force investigating organelle communication during the process of melanophagy, that is little understood. It highlights many important organelle ion transport events that are important findings in their own right. For example, the importance of TMEM165 in calcium filling of lysosomes.

      Response: We thank the Reviewer for appreciating our study and considering it a robust and extensive study in a highly understudied area. We appreciate the Reviewer’s acknowledgement that our manuscript combines detailed cellular studies with in vivo studies in the zebrafish model. Further, we thank the Reviewer for the constructive feedback on our work.

      Major points:

      Comment 1- The authors state that TPC activation does not activate TFEB translocation to the nucleus. This is now not the case and should be at least looked at. What is the role of endolysosomal channels on the melanosomes themselves in melanophagy?

      Response: We appreciate the Reviewer’s comment regarding the potential contribution of TPC channels to TFEB activation and melanophagy. In the revised manuscript, we assessed Ca²⁺ release from TPC2 under IP₃R2 knockdown conditions using the selective TPC2 agonist TPC2-A1-N (Supplementary Fig 9G-H). Additionally, we evaluated TFEB nuclear translocation following TPC2-mediated Ca²⁺ release using TPC2-A1-N (Supplementary Fig 9I-J). Our analyses revealed no significant differences in TPC2 activity or TFEB nuclear translocation upon IP₃R2 silencing compared to control conditions. These findings suggest that, in our system, TPC2-mediated Ca²⁺ signaling does not contribute significantly to TFEB activation or melanophagy downstream of IP₃R2 silencing, indicating a more prominent role for TRPML1-dependent Ca²⁺ signaling in this context.

      Comment 2- How does reduction in IP3R2 mediated calcium fluxes enhance lysosomal acidity?

      Response: We thank the Reviewer for the question regarding the mechanistic link between reduced IP₃R2-mediated Ca²⁺ flux and enhanced lysosomal acidity. In the revised manuscript, we show that IP₃R2 silencing results in a significant upregulation of the lysosomal proton pump H⁺-ATPase subunits ATP6V0D1 and ATP6V1H (Supplementary Fig 6E-F). Increased H⁺-ATPase expression is expected to promote proton influx into the lysosomal lumen, thereby enhancing lysosomal acidification. These findings provide a mechanistic basis for how IP₃R2 silencing can drive increased lysosomal acidity.

      Comment 3- What mediates the ER source for calcium filling of lysosomes?

      Response: We appreciate the Reviewer’s interest in the mechanism underlying ER-to-lysosome Ca²⁺ transfer. Recently, an independent study also reported that IP₃R2 silencing enhances lysosomal Ca²⁺ levels and lysosomal Ca²⁺ release (Zheng et al. Cell 2022). The literature suggests that lysosomal Ca²⁺ refilling depends on Ca²⁺ fluxes originating from the endoplasmic reticulum, particularly through ER Ca²⁺ leak pathways at ER–lysosome contact sites. In this context, ER-resident Ca²⁺ leak channels such as TMBIM6 (also known as Bax inhibitor-1) play an important role in maintaining basal cytosolic Ca²⁺ levels that can be subsequently taken up by lysosomes (Kim et al. Autophagy 2020). TMBIM6-mediated Ca²⁺ leak from the ER provides a continuous, low-level Ca²⁺ source that supports lysosomal Ca²⁺ loading (Kim et al. Autophagy 2020). This mechanism allows lysosomes to replenish their Ca²⁺ stores via Ca²⁺ uptake systems operating at ER–lysosome contact sites. Thus, ER Ca²⁺ leak channels represent a key conduit linking ER Ca²⁺ homeostasis to lysosomal Ca²⁺ filling and function.

      Recently, lysosome localized TMEM165 was identified to play an important role in Ca²⁺ filling of lysosomes (Zajac et al. Science Advances 2024). Here, in our study, we observe that TMEM165 drives lysosomal Ca²⁺ influx in melanocytes.

      Comment 4- Oregon-green-dextran is not a great probe for lysosomal calcium. Its Kd is 170nM and even in the acidic environment this may be lowered to low micromolar which may not be great for measuring changes around luminal concentrations of around 500uM. Additionally, it is usual to correct for pH effects simultaneously since the dye is also a pH reporter and has been used as such. However, I take the point that they still see an increase in fluorescence whilst pH falls probably indicating an increase in luminal lysosomal calcium confirmed by increased perilysosomal calcium.

      Response: We thank the Reviewer for the careful and balanced assessment of the Oregon Green–dextran measurements. We appreciate the acknowledgment that, despite the known limitations of this probe and its pH sensitivity, the observed increase in fluorescence concurrent with reduced lysosomal pH is consistent with elevated luminal lysosomal Ca²⁺ levels. We are grateful for this positive interpretation, which strengthens our conclusions when considered alongside the large amount of supporting data.
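
      The affinity concern raised in Comment 4 can be made concrete with a simple single-site binding estimate. The calculation below is only a sketch: it assumes 1:1 binding, takes the Kd of 170 nM and the luminal concentration of about 500 µM quoted by the Reviewer, and uses 2 µM as a purely hypothetical stand-in for the "low micromolar" Kd at acidic pH.

```latex
\theta = \frac{[\mathrm{Ca}^{2+}]}{[\mathrm{Ca}^{2+}] + K_d}

% Kd = 0.17 uM (value quoted by the Reviewer), [Ca2+] = 500 uM
\theta = \frac{500}{500 + 0.17} \approx 0.9997

% Kd = 2 uM (hypothetical acidic-pH value), [Ca2+] = 500 uM
\theta = \frac{500}{500 + 2} \approx 0.996

% Same hypothetical Kd, luminal Ca2+ doubled to 1000 uM
\theta = \frac{1000}{1000 + 2} \approx 0.998
```

      Under these assumptions the dye is more than 99% occupied in every case, so even a doubling of luminal Ca²⁺ would change occupancy by only about 0.2%, which is the sense in which the Reviewer describes the probe as poorly suited to this concentration range.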

      Comment 5- The major point is to reduce the number of main data panels with consignment of some controls perhaps to supplementary. This would increase the comprehensibility of the paper.

      Response: We thank the Reviewer for this constructive suggestion. As suggested, we have moved a considerable amount of data to the supplementary figures. However, because of the additional experiments performed to address the concerns of the other Reviewers, the main data panels may still look a little busy. We hope the Reviewer will understand our situation.

      Minor points

      Comment 1- Fig 10 needs a clear legend with symbols in the diagram explained. eg ER calcium release proteins.

      Response: We thank the Reviewer for this helpful and constructive comment. Accordingly, we have revised the Figure 10 legend to clearly explain all symbols used in the schematic illustration.

      Reviewer #3 (Significance (Required)):

      This is a tour de force investigating organelle communication during the process of melanophagy, that is little understood. It highlights many important organelle ion transport events that are important findings in their own right. For example, the importance of TMEM165 in calcium filling of lysosomes.

      Response: We sincerely thank the Reviewer for considering our work “a tour de force” and for appreciating that our study presents several important organelle ion transport events.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      This is a robust and extensive study showing that IP3R2 selectively initiates a calcium signalling pathway leading to melanophagy, that is the degradation of melanosomes. This reduces pigmentation and UV light protection. A strength of the paper is that it combines detailed cellular studies with in vivo studies in the zebrafish model. They show that knockdown of IP3R2 reverses this process perhaps leading to a strategy to enhance melanosome number and hence to afford protection from UV irradiation. The authors use a battery of fluorescent probes (mainly genetically encoded reporters) to investigate the signalling cascade leading to melanophagy or its reduction. This involves reporters for a number of different organelles involved in this process. The experiments are generally well performed with clear controls for the probes in many cases. My main issue is the panels contain too much data which may obscure the message, and a good deal could be moved to supplementary data. The manuscript investigates many mechanisms in distinct organelles which is remarkable for a two author paper. Particularly interesting was the design of novel fluorescent protein reporters for melanophagy itself. One area not explored is ion fluxes across melanosomes themselves which are lysosome-related organelles and may exhibit similar properties and signalsomes of lysosomes. Specifically the authors show that a REDUCTION of IP3R2-mediated calcium release leads to a calcium flux from the ER by a different mechanism (possibly via TMBIM6). This increases calcium loading of the lysosome via TMEM165, at the expense of calcium transfer to mitochondria, and an acidification. This leads to TRPML1 activation and the lysosomal calcium release activates TFEB translocation to the nucleus increases the transcription of autophagy/melanophagy genes and activation of the AMPK-ULK1 pathway (rather than mTOR). This is a complex pathway and evidence is presented for many of the steps involved.

      This is a tour de force investigating organelle communication during the process of melanophagy, that is little understood. It highlights many important organelle ion transport events that are important findings in their own right. For example, the importance of TMEM165 in calcium filling of lysosomes.

      Major points:

      1. The authors state that TPC activation does not activate TFEB translocation to the nucleus. This is now not the case and should be at least looked at. What is the role of endolysosomal channels on the melanosomes themselves in melanophagy?
      2. How does reduction in IP3R2 mediated calcium fluxes enhance lysosomal acidity?
      3. What mediates the ER source for calcium filling of lysosomes?
      4. Oregon-green-dextran is not a great probe for lysosomal calcium. Its Kd is 170nM and even in the acidic environment this may be lowered to low micromolar which may not be great for measuring changes around luminal concentrations of around 500uM. Additionally, it is usual to correct for pH effects simultaneously since the dye is also a pH reporter and has been used as such. However, I take the point that they still see an increase in fluorescence whilst pH falls probably indicating an increase in luminal lysosomal calcium confirmed by increased perilysosomal calcium.

      The major point is to reduce the number of main data panels with consignment of some controls perhaps to supplementary. This would increase the comprehensibility of the paper.

      Minor points

      1. Fig 10 needs a clear legend with symbols in the diagram explained. eg ER calcium release proteins

      Significance

      This is a tour de force investigating organelle communication during the process of melanophagy, that is little understood. It highlights many important organelle ion transport events that are important findings in their own right. For example, the importance of TMEM165 in calcium filling of lysosomes.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      In the present manuscript, Saurav et al. identify IP3R2-mediated ER calcium release as a key suppressor of melanophagy, thereby sustaining pigmentation in melanocytes. Using in vitro (B16 murine melanoma cells, primary human melanocytes) and in vivo (zebrafish) models, the authors report that IP3R2 expression is positively correlated with pigmentation. They then investigate the impact of IP3R2 knockdown and find that IP3R2 silencing enhances the stability of melanogenic proteins, while also inducing autophagic degradation of melanosomes (i.e., melanophagy). Concomitantly, they find that IP3R2 silencing decreases mitochondrial calcium uptake, increases lysosomal calcium loading, and lowers lysosomal pH. They propose a pathway wherein in IP3R2 knockdown cells impaired mitochondrial calcium uptake induces the activation of AMPK-ULK1, and increased lysosomal calcium activates TRPML1 via TMEM165 and closer proximity interactions between ER and lysosomes, TFEB nuclear translocation, and upregulation of melanophagy-related genes, namely OPTN and RCHY1. The work is placed within the context of emerging roles of organelle calcium signaling in pigmentation biology, where extracellular calcium influx pathways are known regulators, but the contribution of ER-mitochondria-lysosome crosstalk to melanosome turnover remains largely unknown.

      Major comments:

      • The central finding is that IP3R2 knockdown induces melanophagy and reduces pigmentation. However, the manuscript does not identify any physiological or pathological context in which IP3R2 expression or activity is naturally downregulated in melanocytes. Without such context, the knockdown may represent an artificial perturbation that broadly alters ER calcium handling and triggers melanophagy as part of a general stress-induced autophagy response. This raises uncertainty about whether the pathway operates in vivo under normal or disease conditions. It would strengthen the study to identify upstream cues that reduce IP3R2 function and to test whether these also trigger melanophagy through the proposed mechanism.
      • While the data link IP3R2 knockdown to decreased pigmentation and increased melanophagy, the causality between altered organelle calcium dynamics and the melanophagy induction is inferred from correlation and partial rescue experiments. More direct interventions in the proposed downstream pathways (e.g., acute mitochondrial calcium uptake restoration, lysosomal calcium buffering) would strengthen mechanistic claims.
      • Zebrafish assays convincingly show altered pigmentation with altered IP3R2 levels, but do not connect this to in vivo melanophagy measurements or TRPML1/TFEB activity, which would link the cell biology to organismal phenotype more directly.
      • The work suggests therapeutic potential for pigmentary disorders, but no disease models are tested. It is unclear whether the observed mechanisms operate under physiological stressors.
      • The paradox between the observed enhanced stability of melanogenic proteins and increased melanophagy is insufficiently addressed. DCT, Tyrosinase and GP100 are all melanosome-associated and their stability or degradation is in prior literature often interpreted as reflecting melanosome biogenesis and turnover. This discrepancy needs to be resolved, as it complicates interpretation of melanophagy assays.
      • The authors propose that mitophagy and ER-phagy are reduced in IP3R2 knockdown cells, suggesting specific induction of melanophagy, but the rationale for why increased autophagic flux only targets melanosomes is insufficiently addressed. Also, these conclusions are solely based on Keima assays, and positive controls for mitophagy and ER-phagy are lacking.
      • The melanophagy probes are novel and validated with rapamycin/bafilomycin, but quantitative calibration of GFP/mCherry or Keima signal to actual lysosomal delivery rates is missing; photobleaching, pH heterogeneity (incl., observed decrease in lysosomal pH), and melanin autofluorescence (see below) could confound ratios. Also, side-by-side comparison with other melanophagy detection approaches (e.g., colocalization of melanosomes with LC3) is lacking.
      • Melanosomes exhibit broad autofluorescence, particularly upon excitation at 405-488 nm and extending into the red channel. This signal can overlap with the detection ranges for GFP, mCherry, and mKeima reporters, potentially confounding quantitative readouts unless appropriate controls (e.g., untransfected cells, spectral unmixing) are used. Throughout this manuscript, it is not addressed how melanosome autofluorescence was controlled for or excluded in the reported fluorescence measurements.
      • While OPTN and RCHY1 expression is elevated upon IP3R2 knockdown, functional engagement (e.g., OPTN localization to melanosomes, melanosome ubiquitination by RCHY1), or necessity (e.g., siRNA knockdown of these in the IP3R2-deficient background), are not tested.
      • While siRNA/shRNA efficacy is shown, functional rescue with pore-dead mutants sometimes fails to return to control values. The possibility of partial off-target or compensatory effects is not fully excluded.
      • The mitochondrial and lysosomal calcium measurements are largely endpoint peak quantifications; kinetic analyses and buffering capacity measurements would provide more mechanistic depth, especially for the TMEM165 contribution. Also, TMEM165 necessity for melanophagy induction upon IP3R2 knockdown has not been directly addressed.
      • The proximity ligation assay between VAP-A and LAMP1 is interpreted as showing increased ER-lysosome contacts in IP3R2 knockdown cells. However, additional controls are needed and quantitative TEM should be included to substantiate changes in organelle contact frequency and distance.
      • Some assays report small biological n (e.g., three independent experiments with relatively small per-condition cell counts).

      Minor comments:

      • The title "IP3R2-mediated inter-organelle Ca2+ signaling orchestrates melanophagy" could be misread as indicating IP3R2 'promotes' melanophagy; consider rewording to make clear that IP3R2 suppresses melanophagy to maintain pigmentation. Similarly, the running title "IP3R2 negatively regulates melanophagy" would be clearer as "IP3R2 suppresses melanophagy".
      • Unify the framing of "positively regulates pigmentation" vs. "negatively regulates melanophagy" in the Introduction/Discussion.
      • Adding schematic flow diagrams summarizing each pathway at the end of relevant results (figure) sections could help accessibility.
      • While the introduction summarizes extracellular calcium signaling in pigmentation, there is less coverage of recent work on selective autophagy of other lysosome-related organelles (e.g., platelet dense granules, lytic granules), which could provide broader mechanistic context.

      Significance

      This study addresses an important gap in pigmentation biology by identifying IP3R2-mediated ER calcium release as a suppressor of melanophagy and a positive regulator of pigmentation. The strongest aspects are the integration of in vitro and in vivo models, the multi-faceted mechanistic exploration linking altered organelle calcium dynamics to selective melanosome turnover, and the development of novel ratiometric fluorescent probes for live-cell melanophagy measurement. Conceptually, the work extends prior literature that has focused on extracellular calcium influx and melanosome biogenesis, revealing a new inter-organelle calcium signaling module that controls melanosome degradation via AMPK-ULK1 and TMEM165-TRPML1-TFEB pathways.

      However, several limitations reduce the strength of the mechanistic claims. Some key pathway steps are inferred from correlation and partial rescue rather than direct necessity/sufficiency tests (e.g., mitochondrial calcium uptake restoration, lysosomal calcium buffering). The paradoxical observation that IP3R2 knockdown both increases melanophagy and stabilizes melanosome-resident proteins (DCT, Tyrosinase, GP100) is not resolved, complicating interpretation of the melanophagy assays. The specificity for melanophagy over other selective autophagy pathways is asserted but not fully explained mechanistically, and positive controls for mitophagy/ER-phagy are missing. Potential technical confounds, such as melanin autofluorescence in the detection ranges of GFP, mCherry, and mKeima, are not explicitly addressed, and alternative assays for these key data were insufficiently employed. In vivo results do not yet connect altered pigmentation to melanophagy readouts or downstream TRPML1/TFEB activation. Importantly, the study does not identify any physiological or pathological scenario in which IP3R2 expression or activity is naturally reduced in melanocytes. In the absence of such upstream cues, IP3R2 knockdown may represent an artificial perturbation that triggers melanophagy as part of a broader stress-induced autophagy response, raising questions about the in vivo relevance of the proposed pathway.

      The work's primary audience is specialized (cell biologists, autophagy researchers, and pigmentation/skin biology specialists), but the mechanistic framework on organelle crosstalk and selective autophagy will interest a broader basic research readership, including those studying lysosome-related organelles in other systems. The ratiometric probes could be adapted for future melanophagy research, and the pathway insights may guide translational studies in pigmentary disorders or melanoma. My expertise is in mitochondrial and lysosomal calcium signaling, autophagy, and microscopy-based functional assays; I do not have detailed expertise in zebrafish developmental genetics, though the phenotypic analysis appears sound.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The manuscript entitled, "IP3R2 mediated inter-organelle Ca2+ signaling orchestrates melanophagy" is a rather diffuse study of the relationship between IP3R2 and melanin production. While this is an interesting and understudied area, the study lacks a clear focus. The model seems to be that IP3R2 is essential for mitochondrial calcium loading. And that its absence increases lysosomal calcium loading. There are also a number of incomplete and/or unconvincing links to autophagy/melanophagy, TMEM165, TRPML1 and even gene transcription. In this kind of diffuse study, each step needs to be convincing to get to the next one, which is not the case here. There are also references to altered proteasome function, despite the total absence of any direct data on the proteasome. Finally, I felt it was sometimes unclear whether the authors were referring to melanosomes or lysosomes at various points throughout the study. While I suspect that, somewhere in here, there are some novel relationships worthy of further investigation, this is a case where the many parts make the overall product less convincing. What effects here are directly relevant to IP3R2? This study should stop there, leaving investigations of peripheral factors for future investigations, as the further you get from where you start, the less clear what you are studying becomes. And the less direct.

      Specific Comments:

      1. The separation of Figures 1F and 1J makes it impossible to assess the effect of αMSH on IP3R2 expression. This presentation makes interpretation difficult; a simple 4 lane Western would be more informative.
      2. One of the most attractive points made by this study is that there is a specific link between IP3R2 and melanin production. In my opinion, the null hypothesis is that this is just about the amount of IP3Rs expressed per cell. To reject this concept, the authors should show data demonstrating the relative expression of all 3 IP3Rs. Without this information, the null hypothesis that IP3R2 is the most expressed IP3R isoform and that's why its knockdown has the most dramatic effect cannot be rejected. It would also be helpful to show where the different IP3Rs are expressed within the cell.
      3. It would be helpful to label Figs 3F-I with the conditions used. The description in the text is of increased LC3II levels, however, the ratio of LC3I to LC3II might be more meaningful. Irrespective, although the graph shows an increase in LC3II, the Western really doesn't show much. As a standalone finding, I don't find this figure to be very convincing; there are better options to demonstrate this proposed relationship between IP3R2 and autophagy than what is shown.
      4. The following statement at the beginning of page 22 "We observed an impaired proteasomal degradation of critical melanogenic proteins localized on melanosomes in the IP3R2 knockdown condition" is insufficiently supported by data to be made. Even if I was convinced that autophagy was enhanced, there is no data of any kind about the proteasome in this manuscript.
      5. In figure 5, the authors create a new ratiometric dye to detect melanosome stability based on the principle that tyrosinase is exclusively found in melanosomes. Unfortunately, there is no validation that this new construct is found exclusively in melanosomes upon expression. In addition, there is discussion about the pH of lysosomes, but not of melanosomes. Ultimately, this data cannot be considered at face value without any type of validation; I also note that the pictures lack sufficient detail to support identification of these structures as melanosomes.

      While I maintain the above concerns, I note that the data in supplemental figure 3 is MUCH more convincing than what is in the figure. Both the writing and the figure design should be rethought.

      6. Given the increase in ER Ca2+ content after IP3R2 knockdown, ER calcium content should be emptied before attempting to estimate lysosomal Ca2+ content with GPN or Bafilomycin. Otherwise, the source of calcium is less than clear.

      Significance

      The manuscript entitled, "IP3R2 mediated inter-organelle Ca2+ signaling orchestrates melanophagy" is a rather diffuse study of the relationship between IP3R2 and melanin production. While this is an interesting and understudied area, the study lacks a clear focus. The model seems to be that IP3R2 is essential for mitochondrial calcium loading. And that its absence increases lysosomal calcium loading. There are also a number of incomplete and/or unconvincing links to autophagy/melanophagy, TMEM165, TRPML1 and even gene transcription. In this kind of diffuse study, each step needs to be convincing to get to the next one, which is not the case here. There are also references to altered proteasome function, despite the total absence of any direct data on the proteasome. Finally, I felt it was sometimes unclear whether the authors were referring to melanosomes or lysosomes at various points throughout the study.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We are grateful to the Review Commons reviewers for their constructive feedback, which has significantly strengthened the manuscript. In response, we have performed additional experiments, revised and expanded multiple figures, incorporated new statistical and functional analyses, and carefully edited the text to improve clarity and precision. A detailed point-by-point response to all reviewer comments, together with a summary of revised figures, is provided.

      To address the reviewers' suggestions, we have conducted additional experiments that are now incorporated into new figures, or we have added new images to several existing figures where appropriate.

      For this reason, please note that all figures have been renumbered to improve clarity and facilitate cross-referencing throughout the text. As recommended by Referee #3, all figure legends have been thoroughly revised to reflect these updates and are now labeled following the standard A-Z panel format, enhancing readability and ensuring easier identification. In addition, all figure legends now include the sample size for each statistical analysis.

      For clarity and ease of reference, we provide below a comprehensive list of all figures included in the revised version. Figures that have undergone modifications are underlined.

      Figure 1. The first spermatogenesis wave in prepuberal mice.

      This figure now includes amplified images of representative spermatocytes and a summary schematic illustrating the timeline of spermatogenesis. In addition, it now presents the statistical analysis of spermatocyte quantification to support the visual data.

      Figure 2. Cilia emerge across all stages of prophase I in spermatocytes during the first spermatogenesis wave.

      The images of this figure remain unchanged from the original submission, but all graphs now present the statistical analysis of spermatocyte quantification.

      Figure 3. Ultrastructure and markers of prepuberal meiotic cilia.

      This figure remains unchanged from the original submission; however, we have replaced the ARL3-labelled spermatocyte image (A) with one displaying a clearer and more representative signal.

      Figure 4. Testicular tissue presents spermatocyte cysts in prepuberal mice and adult humans.

      This figure remains unchanged from the original submission.

      Figure 5. Cilia and flagella dynamics are correlated during prepuberal meiosis.

      This figure remains unchanged from the original submission.

      Figure 6. Comparative proteomics identifies potential regulators of ciliogenesis and flagellogenesis.

      This figure remains unchanged from the original submission.

      Figure 7. Deciliation induces persistence of DNA damage in meiosis.

      This figure has been substantially revised and now includes additional experiments analyzing chloral hydrate treatment, aimed at more accurately assessing DNA damage under both control and treated conditions. Images F-I and graph J are new.

      Figure 8. Aurora kinase A is a regulator of cilia disassembly in meiosis.

      This figure has been remodelled because the original version contained a mistake in the previous panel II; the graph in the new Fig. 8 I has been corrected accordingly. In addition, it now contains additional α-Tubulin staining data in arrested ciliated metaphases I after AURKA inhibition (new panel L1´).

      Figure 9. Schematic representation of the prepuberal versus adult seminiferous epithelium.

      This figure remains unchanged from the original submission.

      Supplementary Figure 1. Meiotic stages during the first meiotic wave.

      This figure remains unchanged from the original submission.

      Supplementary Figure 2 (new).

      This is a new figure that includes additional data requested by the reviewers. It includes additional markers of cilia in spermatocytes (glutamylated Tubulin/GT335) and the control data of cilia markers in non-ciliated spermatocytes. It now also includes the separate quantification of ciliated spermatocytes for each stage, as requested by the reviewers, complementing the graphs included in Figure 2.

      Please note that with the inclusion of this new Supplementary Figure 2, the numbering of subsequent supplementary figures has been updated accordingly.

      Supplementary Figure 3 (previously Suppl. Fig. 2). Ultrastructure of prophase I spermatocytes.

      This figure is identical in content to the original submission, but some annotations have been added.

      Supplementary Figure 4 (previously Suppl. Fig. 3). Meiotic centrosome under the electron microscope.

      This figure remains unchanged from the original submission, but additional annotations have been included.

      Supplementary Figure 5 (previously Suppl. Fig. 4). Human testis contains ciliated spermatocytes.

      This figure has been revised and now includes additional H2AX staining to better determine the stage of ciliated spermatocytes and improve their identification.

      Supplementary Figure 6 (previously Suppl. Fig. 5). GLI1 and GLI3 readouts of Hedgehog signalling are not visibly affected in prepuberal mouse testes.

      This figure has been remodeled and now includes the quantification of GLI1 and GLI3 and its corresponding statistical analysis. It also includes the control data for Tubulin, instead of GAPDH.

      Supplementary Figure 7 (previously Suppl. Fig. 6). CH and MLN8237 optimization protocol.

      This figure has been remodeled to incorporate control experiments using 1-hour organotypic culture treatment.

      Supplementary Figure 8 (previously Suppl. Fig. 7). Tracking first meiosis wave with EdU pulse injection during prepubertal meiosis.

      This figure remains unchanged from the original submission.

      Supplementary Figure 9 (previously Suppl. Fig. 8). PLK1 and AURKA inhibition in cultured spermatocytes.

      This figure has been remodeled and now includes additional data on spindle detection in control and AURKA-inhibited spermatocytes (both ciliated and non ciliated).

      DETAILED POINT-BY-POINT RESPONSE TO THE REVIEWERS

      We will submit both the PDF version of the revised manuscript and the Word file with tracked changes relative to the original submission. Each modification made in response to reviewers' suggestions is annotated in the Word document within the corresponding section of the text. All new figures have also been uploaded to the system.

      Response to the Referee #1

      In this manuscript by Perez-Moreno et al., titled "The dynamics of ciliogenesis in prepubertal mouse meiosis reveal new clues about testicular maturation during puberty", the authors characterize the development of primary cilia during meiosis in juvenile male mice. The authors catalog a variety of testicular changes that occur as juvenile mice age, such as changes in testis weight and germ cell-type composition. They next show that meiotic prophase cells initially lack cilia, and ciliated meiotic prophase cells are detected after 20 days postpartum, coinciding with the time when post-meiotic spermatids within the developing testes acquire flagella. They describe that germ cells in juvenile mice harbor cilia at all substages of meiotic prophase, in contrast to adults where only zygotene stage meiotic cells harbor cilia. The authors also document that cilia in juvenile mice are longer than those in adults. They characterize cilia composition and structure by immunofluorescence and EM, highlighting that cilia polymerization may initially begin inside the cell, followed by extension beyond the cell membrane. Additionally, they demonstrate ciliated cells can be detected in adult human testes. The authors next perform proteomic analyses of whole testes from juvenile mice at multiple ages, which may not provide direct information about the extremely small numbers of ciliated meiotic cells in the testis, and is lacking follow up experiments, but does serve as a valuable resource for the community. Finally, the authors use a seminiferous tubule culturing system to show that chemical inhibition of Aurora kinase A likely inhibits cilia depolymerization upon meiotic prophase I exit and leads to an accumulation of metaphase-like cells harboring cilia. They also assess meiotic recombination progression using their culturing system, but this is less convincing.

      Author response: We sincerely thank Ref #1 for the thorough and thoughtful evaluation of our manuscript. We are particularly grateful for the reviewer's careful reading and constructive feedback, which have helped us refine several sections of the text and strengthen our discussion. All comments and suggestions have been carefully considered and addressed, as detailed below.

      __Major comments: __

      1. There are a few issues with the experimental set up for assessing the effects of cilia depolymerization on DNA repair (Figure 7-II). First, how were mid pachytene cells identified and differentiated from early pachytene cells (which would have higher levels of gH2AX) in this experiment? I suggest either using H1t staining (to differentiate early/mid vs late pachytene) or the extent of sex chromosome synapsis. This would ensure that the authors are comparing similarly staged cells in control and treated samples. Second, what were the gH2AX levels at the starting point of this experiment? A more convincing set up would be if the authors measure gH2AX immediately after culturing in early and late cells (early would have higher gH2AX, late would have lower gH2AX), and then again after 24hrs in late cells (upon repair disruption the sampled late cells would have high gH2AX). This would allow them to compare the decline in gH2AX (i.e., repair progression) in control vs treated samples. Also, it would be informative to know the starting gH2AX levels in ciliated vs non-ciliated cells as they may vary.

      Response:

      We thank Ref #1 for this valuable comment, which significantly contributed to improving both the design and interpretation of the cilia depolymerization assay.

      Following this suggestion, we repeated the experiment including 1-hour (immediately after culturing), and 24-hour cultures for both control and chloral hydrate (CH)-treated samples (n = 3 biological replicates). To ensure accurate staging, we now employ triple immunolabelling for γH2AX, SYCP3, and H1T, allowing clear distinction of zygotene (H1T−), early pachytene (H1T−), and late pachytene (H1T+) cells. The revised data (Figure 7) now provide a more complete and statistically robust analysis of DNA damage dynamics. These results confirm that CH-induced deciliation leads to persistence of the γH2AX signal at 24 hours, indicating impaired DNA repair progression in pachytene spermatocytes. The new images and graphs are included in the revised Figure 7.

      Regarding the reviewer's final point about the comparison of γH2AX levels between ciliated and non-ciliated cells, we regret that direct comparison of γH2AX levels between ciliated and non-ciliated cells is not technically feasible. To preserve cilia integrity, all cilia-related imaging is performed using the squash technique, which maintains the three-dimensional structure of the cilia but does not allow reliable quantification of DNA damage markers due to nuclear distortion. Conversely, the nuclear spreading technique, used for DNA damage assessment, provides optimal visualization of repair foci but results in the loss of cilia due to cytoplasmic disruption during the hypotonic step. Given that spermatocytes in juvenile testes form developmentally synchronized cytoplasmic cysts, we consider that analyzing a statistically representative number of spermatocytes offers a valid and biologically meaningful measure of tissue-level effects.

      In conclusion, we believe that the additional experiments and clarifications included in revised Figure 7 strengthen our conclusion that cilia depolymerization compromises DNA repair during meiosis. Further functional confirmation will be pursued in future works, since we are currently generating a conditional genetic model for a ciliopathy in our laboratory.

      The authors analyze meiotic progression in cells cultured with/without AURKA inhibition in Figure 8-III and conclude that the distribution of prophase I cells does not change upon treatment. Is Figure 8-III A and B the same data? The legend text is incorrect, so it's hard to follow. Figure 8-III A shows a depletion of EdU-labelled pachytene cells upon treatment. Moreover, the conclusion that a higher proportion of ciliated zygotene cells upon treatment (Figure 8-II C) suggests that AURKA inhibition delays cilia depolymerization (page 13 line 444) does not make sense to me.

      Response:

      We thank Ref#1 for identifying this issue and for the careful examination of Figure 8. We discovered that the submitted version of Figure 8 contained a mismatch between the figure legend and the figure panels. The legend text was correct; however, the figure inadvertently included a non-corresponding graph (previously panel II-A), which actually belonged to Supplementary Figure 7 in the original submission. We apologize for this mistake.

      This error has been corrected in the revised version. The updated Figure 8 now accurately presents the distribution of EdU-labelled spermatocytes across prophase I substages in control and AURKA-inhibited cultures (previously Figure 8-II B, now Figure 8-A). The corrected data show no significant differences in the proportions of EdU-labelled spermatocytes among prophase I substages after 24 hours of AURKA inhibition, confirming that meiotic progression is not delayed and that no accumulation of zygotene cells occurs under this treatment. Therefore, the observed increase in ciliated zygotene spermatocytes upon AURKA inhibition (new Figure 8 H-I) is best explained by a delay in cilia disassembly, rather than by an arrest or slowdown in meiotic progression. The figure legend and main text have been revised accordingly.

      How do the authors know that there is a monopolar spindle in Figure 8-IV treated samples? Perhaps the authors can use a different Tubulin antibody (that does not detect only acetylated Tubulin) to show that there is a monopolar spindle.

      Response:

      We thank Ref #1 for this excellent suggestion. In the original submission (lines 446-447), we described that ciliated metaphase I spermatocytes in AURKA-inhibited samples exhibited monopolar spindle phenotypes. This description was based on previous reports showing that AURKA or PLK1 inhibition produces metaphases with monopolar spindles characterized by aberrant yet characteristic SYCP3 patterns, abnormal chromatin compaction, and circular bivalent alignment around non-migrated centrosomes (1). In our study, we observed SYCP3 staining consistent with these characteristic features of monopolar metaphases I.

      However, we agree with Ref #1 that this could be better supported with data. Following the reviewer's suggestion, we performed additional immunostaining for α-Tubulin, which labels total microtubules rather than only the acetylated fraction. The revised Figure 8 now includes α-Tubulin staining in the same ciliated metaphase I cells shown in the original submission, confirming the presence of defective microtubule polymerization and defective spindle organization. For clarity, we now refer to these ciliated metaphases I as "arrested MI". These new data further support our conclusion that AURKA inhibition disrupts spindle bipolarization and prevents cilia depolymerization, indicating that cilia maintenance and bipolar spindle organization are mechanistically incompatible events during male meiosis. The Abstract, Results, and Discussion sections have been revised accordingly, emphasizing that the persistence of cilia may interfere with microtubule polymerization and centrosome separation under AURKA inhibition. The Discussion also contrasts this with invertebrate systems, e.g. Drosophila (2) and P. brassicae (3), in which meiotic cilia persist through metaphase I without impairing bipolar spindle assembly.

      1. Alfaro et al. EMBO Rep 22 (2021). DOI: 10.15252/embr.202051030 (PMID: 33615693)
      2. Riparbelli et al. Dev Cell (2012). DOI: 10.1016/j.devcel.2012.05.024 (PMID: 22898783)
      3. Gottardo et al. Cytoskeleton (Hoboken) (2023). DOI: 10.1002/cm.21755 (PMID: 37036073)

      The authors state in the abstract that they provide evidence suggesting that centrosome migration and cilia depolymerization are mutually exclusive events during meiosis. This is not convincing with the data present in the current manuscript. I suggest amending this statement in the abstract.

      Response:

      We thank Ref#1 for this valuable observation, with which we fully agree. To avoid overstatement, the original statement has been removed from the Abstract, Results, and Discussion, and replaced with a more accurate formulation indicating that cilia maintenance and bipolar spindle formation are mutually exclusive events during mouse meiosis.

      This revised statement is now directly supported by the new data presented in Figure 8, which demonstrate that AURKA inhibition prevents both spindle bipolarization and cilia depolymerization. We are grateful to the reviewer for highlighting this important clarification.

      Minor comments:

      The presence of cilia in all stages of meiotic prophase I in juvenile mice is intriguing. Why is the cellular distribution and length of cilia different in prepubertal mice compared to adults (where shorter cilia are present only in zygotene cells)? What is the relevance of these developmental differences? Do cilia serve prophase I functions in juvenile mice (in leptotene, pachytene etc.) that are perhaps absent in adults?

      Related to the above point, what is the relevance of the absence of cilia during the first meiotic wave? If cilia serve a critical function during prophase I (for instance, facilitating DSB repair), does the lack of cilia during the first wave imply differing cilia (and repair) requirements during the first vs latter spermatogenesis waves?

      In my opinion, these would be interesting points to discuss in the discussion section.

      Response:

      We thank the reviewer for these thoughtful observations, which we agree are indeed intriguing.

      We believe that our findings likely reflect a developmental role for primary cilia during testicular maturation. We hypothesize that primary cilia at this stage might act as signaling organelles, receiving cues from Sertoli cells or neighboring spermatocytes and transmitting them through the cytoplasmic cysts shared by spermatocytes. Such intercellular communication could be essential for coordinating tissue maturation and meiotic entry during puberty. Although speculative, this hypothesis aligns with the established role of primary cilia as sensory and signaling hubs for GPCR and RTK pathways regulating cell differentiation and developmental patterning in multiple tissues (e.g., 1, 2). The Discussion section has been expanded to include these considerations.

      1. Goetz et al. Nat Rev Genet (2010). DOI: 10.1038/nrg2774 (PMID: 20395968)
      2. Naturky et al. Cell (2019). DOI: 10.1038/s41580-019-0116-4 (PMID: 30948801)

      Our study focuses on the first spermatogenic wave, which represents the transition from the juvenile to the reproductive phase. It is therefore plausible that the transient presence of longer cilia during this period reflects a developmental requirement for external signaling that becomes dispensable in the mature testis. Given that this is only the second study to date examining mammalian meiotic cilia, there remains a vast area of research to explore. We plan to address potential signaling cascades involved in these processes in future studies.

      On the other hand, while we cannot confirm that the cilia observed in zygotene spermatocytes persist until pachytene within the same cell, it is reasonable to speculate that they do, serving as longer-lasting signaling structures that facilitate testicular development during the critical pubertal window. In addition, the observation of ciliated spermatocytes at all prophase I substages at 20 dpp, together with our proteomic data, supports the idea that the emergence of meiotic cilia exerts a significant developmental impact on testicular maturation.

      In summary, although we cannot yet define specific prophase I functions for meiotic cilia in juvenile spermatocytes, our data demonstrate that the first meiotic wave differs from later waves in cilia dynamics, suggesting distinct regulatory requirements between puberty and adulthood. These findings underscore the importance of considering developmental context when using the first meiotic wave as a model for studying spermatogenesis.

      The authors state on page 9 lines 286-288 that the presence of cytoplasmic continuity via intercellular bridges (between developmentally synchronous spermatocytes) hints towards a mechanism that links cilia and flagella formation. Please clarify this statement. While the correlation between the timing of appearance of cilia and flagella in cells that are located within the same segment of the seminiferous tubule may be hinting towards some shared regulation, how would cytoplasmic continuity participate in this regulation? Especially since the cytoplasmic continuity is not between the developmentally distinct cells acquiring the cilia and flagella?

      Response:

      We thank Ref#1 for this excellent question and for the opportunity to clarify our statement.

      The presence of intercellular bridges between spermatocytes is well known and has long been proposed to support germ cell communication and synchronization (1,2) as well as sharing mRNA (3) and organelles (4). A classic example is the Akap gene, located on the X chromosome and essential for the formation of the sperm fibrous sheath; cytoplasmic continuity through intercellular bridges allows Akap-derived products to be shared between X- and Y-bearing spermatids, thereby maintaining phenotypic balance despite transcriptional asymmetry (5). In addition, more recent work has further demonstrated that these bridges are critical for synchronizing meiotic progression and for processes such as synapsis, double-strand break repair, and transposon repression (6).

      In this context, and considering our proteomic data (Figure 6), our statement did not intend to imply direct cytoplasmic exchange between ciliated and flagellated cells. Although our current methods do not allow comprehensive tracing of cytoplasmic continuity from the basal to the luminal compartment of the seminiferous epithelium, we plan to address this limitation using high-resolution 3D and ultrastructural imaging approaches in future studies.

      Based on our current data, we propose that cytoplasmic continuity within developmentally synchronized spermatocyte cysts could facilitate the coordinated regulation of ciliogenesis, and similarly enable the sharing of regulatory factors controlling flagellogenesis within spermatid cysts. This coordination may occur through the diffusion of centrosomal or ciliary proteins, mRNAs, or signaling intermediates involved in the regulation of microtubule dynamics. However, we cannot exclude the possibility that such cytoplasmic continuity extends across all spermatocytes derived from the same spermatogonial clone, potentially providing a larger regulatory network. This mechanism could help explain the temporal correlation we observe between the appearance of meiotic cilia and the onset of flagella formation in adjacent spermatids within the same seminiferous segment.

      We have revised the Discussion to explicitly clarify this interpretation and to note that, although hypothetical, it is consistent with established literature on cytoplasmic continuity and germ cell coordination.

      1. Dym et al. Biol. Reprod. (1971). DOI: 10.1093/biolreprod/4.2.195 (PMID: 4107186)
      2. Braun et al. Nature (1989). DOI: 10.1038/337373a0 (PMID: 2911388)
      3. Greenbaum et al. Proc. Natl. Acad. Sci. USA (2006). DOI: 10.1073/pnas.0505123103 (PMID: 16549803)
      4. Ventelä et al. Mol Biol Cell (2003). DOI: 10.1091/mbc.e02-10-0647 (PMID: 12857863)
      5. Turner et al. Journal of Biological Chemistry (1998). DOI: 10.1074/jbc.273.48.32135 (PMID: 9822690)
      6. Sorkin et al. Nat Commun (2025). DOI: 10.1038/s41467-025-56742-9 (PMID: 39929837)

      Note: due to manuscript-length limitations, not all cited references can be included in the text; they are listed here to substantiate our response.

      Individual germ cells in H&E-stained testis sections in Figure 1-II are difficult to see. I suggest adding zoomed-in images where spermatocytes/round spermatids/elongated spermatids are clearly distinguishable.

      Response:

      We agree with Ref#1's suggestion. We have revised Figure 1 to improve the quality of the H&E-stained testis sections and have added zoomed-in panels in which spermatocytes, round spermatids, and elongated spermatids are clearly distinguishable. These additions significantly enhance the clarity and interpretability of the figure.

      In Figure 2-II B, the authors document that most ciliated spermatocytes in juvenile mice are pachytene. Is this because most meiotic cells are pachytene? Please clarify. If the data are available (perhaps could be adapted from Figure 1-III), it would be informative to see a graph representing what proportions of each meiotic prophase substages have cilia.

      Response:

      We thank the reviewer for this valuable observation. Indeed, the predominance of ciliated pachytene spermatocytes reflects the fact that most meiotic cells in juvenile testes are at the pachytene stage (Figure 1). We have clarified this point in the text and have added a new supplementary figure (Supplementary Figure 2, new figure) presenting a graph showing the proportion of spermatocytes at each prophase I substage that possess primary cilia. This visualization provides a clearer quantitative overview of ciliation dynamics across meiotic substages.
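The distinction raised by the reviewer comes down to which denominator is used. The sketch below is illustrative only: the function names and all cell counts are hypothetical placeholders, not data from the manuscript. It contrasts the fraction of cells in each substage that are ciliated with the share of all ciliated cells contributed by each substage.

```python
from typing import Dict, Tuple

# stage -> (number of ciliated cells, total cells scored at that stage)
Counts = Dict[str, Tuple[int, int]]

def ciliated_fraction_per_stage(counts: Counts) -> Dict[str, float]:
    """Fraction of cells within each substage that carry a cilium."""
    return {
        stage: ciliated / total
        for stage, (ciliated, total) in counts.items()
        if total > 0  # skip stages with no scored cells
    }

def stage_share_among_ciliated(counts: Counts) -> Dict[str, float]:
    """Share of all ciliated cells that belong to each substage."""
    all_ciliated = sum(ciliated for ciliated, _ in counts.values())
    if all_ciliated == 0:
        return {stage: 0.0 for stage in counts}
    return {
        stage: ciliated / all_ciliated
        for stage, (ciliated, _) in counts.items()
    }

# Hypothetical counts chosen so that every substage is ciliated at the
# same rate while pachytene cells are simply the most abundant.
example_counts: Counts = {
    "leptotene": (2, 20),
    "zygotene": (4, 40),
    "pachytene": (20, 200),
    "diplotene": (4, 40),
}

per_stage = ciliated_fraction_per_stage(example_counts)
share = stage_share_among_ciliated(example_counts)
```

With these placeholder numbers, pachytene cells account for two thirds of all ciliated cells even though each substage is ciliated at the same rate, which is why the per-substage proportion is the more informative plot.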

      I suggest annotating the EM images in Sup Figure 2 and 3 to make it easier to interpret.

      Response:

      We thank the reviewer for this helpful suggestion. We have now added annotations to the EM images in Supplementary Figures 3 and 4 to facilitate their interpretation. These visual guides help readers more easily identify the relevant ultrastructural features described in the text.

      The authors claim that the ratio between GLI3-FL and GLI3-R is stable across their analyzed developmental window in whole testis immunoblots shown in Sup Figure 5. Quantifying the bands and normalizing to the loading control would help strengthen this claim as it is hard to interpret the immunoblot in its current form.

      Response:

      We thank the reviewer for this valuable suggestion. Following this recommendation, Supplementary Figure 5 has been revised to include quantification of GLI1 and GLI3 protein levels, normalized to the loading control.

      After quantification, we observed statistically significant differences across developmental stages. Specifically, GLI1 expression is slightly higher at 21 dpp compared to 8 dpp. For GLI3, we performed two complementary analyses:

      • Total GLI3 protein (sum of full-length and repressor forms normalized to loading control) shows a progressive decrease during development, with the lowest levels at 60 dpp (Supplementary Figure 5D).
      • GLI3 activation status, assessed as the GLI3-FL/GLI3-R ratio, is highest during the 19-21 dpp window, compared to 8 dpp and 60 dpp. Although these results suggest a possible transient activation of GLI3 during testicular maturation, we caution that this cannot automatically be attributed to increased Hedgehog signaling, as GLI3 processing can also be affected by other processes, such as changes in ciliogenesis. Furthermore, because the analysis was performed on whole-testis protein extracts, these changes cannot be specifically assigned to ciliated spermatocytes.
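The two GLI3 readouts described above can be written out as a small sketch. Everything here is an assumption made for illustration: the `Lane` structure, the function names, and the band intensities are hypothetical and do not come from the manuscript's densitometry.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Lane:
    """Densitometry readings for one immunoblot lane (arbitrary units)."""
    age_dpp: int
    gli3_fl: float   # full-length GLI3 band
    gli3_r: float    # GLI3 repressor band
    loading: float   # loading-control band

def total_gli3(lane: Lane) -> float:
    """Sum of both GLI3 forms, normalized to the loading control."""
    return (lane.gli3_fl + lane.gli3_r) / lane.loading

def fl_to_r_ratio(lane: Lane) -> float:
    """GLI3-FL/GLI3-R ratio; the loading control cancels out here."""
    return lane.gli3_fl / lane.gli3_r

# Hypothetical placeholder intensities, not measured values.
lanes = [
    Lane(age_dpp=8, gli3_fl=400.0, gli3_r=200.0, loading=100.0),
    Lane(age_dpp=60, gli3_fl=100.0, gli3_r=50.0, loading=100.0),
]

summary = {
    lane.age_dpp: (total_gli3(lane), fl_to_r_ratio(lane)) for lane in lanes
}
```

In these placeholder lanes the total GLI3 signal drops fourfold while the ratio stays the same, showing that the two readouts are independent: total abundance can change without any change in processing status.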

      We have expanded the Discussion to address these findings and to highlight the potential involvement of the Desert Hedgehog (DHH) pathway, which plays key roles in testicular development, Sertoli-germ cell communication, and spermatogenesis (1, 2, 3). We plan to investigate these pathways further in future studies.

      1. Bitgood et al. Curr Biol. (1996). DOI: 10.1016/s0960-9822(02)00480-3 (PMID: 8805249)
      2. Clark et al. Biol Reprod. (2000) DOI: 10.1095/biolreprod63.6.1825 (PMID: 11090455)
      3. O'Hara et al. BMC Dev Biol. (2011) DOI: 10.1186/1471-213X-11-72 (PMID: 22132805)

      *note: due to manuscript-length limitations, not all cited references can be included in the text; they are listed here to substantiate our response.

      There are a few typos throughout the manuscript. Some examples: page 5 line 172, Figure 3-I legend text, Sup Figure 5-II callouts, Figure 8-III legend, page 15 line 508, page 17 line 580, page 18 line 611.

      Response:

      We thank the reviewer for detecting this. All typographical errors have been corrected, and figure callouts have been reviewed for consistency.

      Response to the Referee #2

      This study focuses on the dynamic changes of ciliogenesis during meiosis in prepubertal mice. It was found that primary cilia are not an intrinsic feature of the first wave of meiosis (initiating at 8 dpp); instead, they begin to polymerize at 20 dpp (after the completion of the first wave of meiosis) and are present in all stages of prophase I. Moreover, prepubertal cilia (with an average length of 21.96 μm) are significantly longer than adult cilia (10 μm). The emergence of cilia coincides temporally with flagellogenesis, suggesting a regulatory association in the formation of axonemes between the two. Functional experiments showed that disruption of cilia by chloral hydrate (CH) delays DNA repair, while the AURKA inhibitor (MLN8237) delays cilia disassembly, and centrosome migration and cilia depolymerization are mutually exclusive events. These findings represent the first detailed description of the spatiotemporal regulation and potential roles of cilia during early testicular maturation in mice. The discovery of this phenomenon is interesting; however, there are certain limitations in functional research.

      We thank Referee #2 for their careful reading of the manuscript and for highlighting important limitations regarding functional interpretation.

      Our primary objective in this study was to provide a rigorous structural, temporal, and developmental characterization of meiotic ciliogenesis in the mammalian testis, a process for which almost no prior data exist. Given this lack of foundational information, we focused on establishing when, where, and in which meiotic stages primary cilia form during prepubertal development, and on identifying candidate regulatory pathways using complementary imaging, proteomic, and pharmacological approaches.

      We agree that genetic ablation models would provide the most direct means to test ciliary function during spermatogenesis. However, we believe that such functional analyses must be preceded by a detailed developmental and phenotypic framework, which was previously unavailable. The present study therefore represents a necessary first step, defining the dynamics, ultrastructure, and molecular context of meiotic cilia during the transition from juvenile to adult spermatogenesis. We are currently generating conditional genetic models to directly address functional mechanisms in future work.

      Regarding the temporal coincidence between the emergence of meiotic cilia and the onset of flagellogenesis, we do not interpret this observation as evidence of stochastic or non-functional protein expression. Rather, we present it as a developmental correlation that may reflect shared regulatory constraints on axonemal assembly during testicular maturation. We have clarified in the revised manuscript that this relationship is descriptive and hypothesis-generating, and we avoid assigning direct causal roles.

      With respect to the proteomic analysis, we agree that proteomics alone cannot establish function. Our intent was not to assign causality, but to provide a developmental, hypothesis-generating dataset identifying candidate regulators that are enriched at the precise developmental window when both meiotic cilia and spermatid flagella first emerge. We have revised the text to explicitly frame these data as a resource for future mechanistic studies, rather than as direct functional evidence.

      Taken together, we believe that the revised manuscript now more accurately reflects the scope and limitations of the study, while providing a robust and much-needed developmental framework for future genetic and functional analyses of meiotic ciliogenesis in mammals. We would be happy to further clarify any aspect of these interpretations if the reviewer or editor considers it helpful.

      Major points:

      1. The prepubertal cilia in spermatocytes discovered by the authors lack specific genetic ablation to block their formation, making it impossible to evaluate whether such cilia truly have functions. Because neither in the first wave of spermatogenesis nor in adult spermatogenesis does this type of cilium seem to be essential. In addition, the authors also imply that the formation of such cilia appears to be synchronized with the formation of sperm flagella. This suggests that the production of such cilia may merely be transient protein expression noise rather than a functionally meaningful cellular structure.

      Response:

      We agree that a genetic ablation model would represent the ideal approach to directly test cilia function in spermatogenesis. However, given the complete absence of prior data describing the dynamics of ciliogenesis during testis development, our priority in this study was to establish a rigorous structural and temporal characterization of this process in the main mammalian model organism, the mouse. This systematic and rigorous phenotypic characterization is a necessary first step before any functional genetics could be meaningfully interpreted.

      To our knowledge, this study represents the first comprehensive analysis of ciliogenesis during prepubertal mouse meiosis, extending our previous work on adult spermatogenesis (1). Beyond these two contributions, only four additional studies have addressed meiotic cilia: two in zebrafish (2, 3), with Mytlis et al. also providing preliminary observations relevant to prepubertal male meiosis that we discuss in the present work; one in Drosophila (4); and a recent one in butterfly (5). No additional information exists for mammalian gametogenesis to date.

      1. López-Jiménez et al. Cells (2022) DOI: 10.3390/cells12010142 (PMID: 36611937)
      2. Mytlis et al. Science (2022) DOI: 10.1126/science.abh3104 (PMID: 35549308)
      3. Xie et al. J Mol Cell Biol (2022) DOI: 10.1093/jmcb/mjac049 (PMID: 35981808)
      4. Riparbelli et al. Dev Cell (2012) DOI: 10.1016/j.devcel.2012.05.024 (PMID: 22898783)
      5. Gottardo et al. Cytoskeleton (Hoboken) (2023) DOI: 10.1002/cm.21755 (PMID: 37036073)

      We therefore consider this descriptive and analytical foundation to be essential before the development of functional genetic models. Indeed, we are currently generating a conditional genetic model for a ciliopathy in our laboratory. These studies are ongoing and will directly address the type of mechanistic questions raised here, but they extend well beyond the scope and feasible timeframe of the present manuscript.

      We thus maintain that the present work constitutes a necessary and timely contribution, providing a robust reference dataset that will facilitate and guide future functional studies in the field of cilia and meiosis.

      Taking this into account, we would be very pleased to address any additional, concrete suggestions from Ref#2 that could further strengthen the current version of the manuscript.

      The high expression of axoneme assembly regulators such as TRiC complex and IFT proteins identified by proteomic analysis is not particularly significant. This time point is precisely the critical period for spermatids to assemble flagella, and TRiC, as a newly discovered component of flagellar axonemes, is reasonably highly expressed at this time. No intrinsic connection with the argument of this paper is observed. In fact, this testicular proteomics has little significance.

      Response:

      We appreciate this comment but respectfully disagree with the reviewer's interpretation of our proteomic data. To our knowledge, this is the first proteomic study explicitly focused on identifying ciliary regulators during testicular development at the precise window (19-21 dpp) when both meiotic cilia and spermatid flagella first emerge.

      While Piprek et al. (1) analyzed the expression of primary cilia in developing gonads, proteomic data specifically covering the developmental transition at 19-21 dpp were not previously available. Furthermore, a recent cell-sorting study (2) detected expression of cilia proteins in pachytene spermatocytes compared to round spermatids, but did not explore their functional relevance or integrate these data with developmental timing or histological context.

      In contrast, our dataset integrates histological staging, high-resolution microscopy, and quantitative proteomics, revealing a set of candidate regulators (including DCAF7, DYRK1A, TUBB3, TUBB4B, and TRiC) potentially involved in cilia-flagella coordination. We view this as a hypothesis-generating resource that outlines specific proteins and pathways for future mechanistic studies on both ciliogenesis and flagellogenesis in the testis.

      Although we fully agree that proteomics alone cannot establish causal function, we believe that dismissing these data as having little significance overlooks their value as the first molecular map of the testis at the developmental window when axonemal structures arise. Our dataset provides, for the first time, an integrated view of proteins associated with ciliary and flagellar structures at the developmental stage when both axonemal organelles first appear. We thus believe that our proteomic dataset represents an important and novel contribution to the understanding of testicular development and ciliary biology.

      Considering this, we would again welcome any specific suggestions from Ref#2 on additional analyses or clarifications that could make the relevance of this dataset even clearer to readers.

      1. Piprek et al. Int J Dev Biol. (2019) doi: 10.1387/ijdb.190049rp (PMID: 32149371).
      2. Fang et al. Chromosoma. (1981) doi: 10.1007/BF00285768 (PMID: 7227045).

      Response to the Referee #3

      In "The dynamics of ciliogenesis in prepubertal mouse meiosis reveals new clues about testicular development" Pérez-Moreno, et al. explore primary cilia in prepubertal mouse spermatocytes. Using a combination of microscopy, proteomics, and pharmacological perturbations, the authors carefully characterize prepubertal spermatocyte cilia, providing foundational work regarding meiotic cilia in the developing mammalian testis.

      Response: We sincerely thank Ref#3 for their positive assessment of our work and for the thoughtful suggestions that have helped us strengthen the manuscript. We are pleased that the reviewer recognizes both the novelty and the relevance of our study in providing foundational insights into meiotic ciliogenesis during prepubertal testicular development. All specific comments have been carefully considered and addressed as detailed below.

      Major concerns:

      1. The authors provide evidence consistent with cilia not being present in a larger percentage of spermatocytes or in other cells in the testis. The combination of electron microscopy and acetylated tubulin antibody staining establishes the presence of cilia; however, proving a negative is challenging. While acetylated tubulin is certainly a common marker of cilia, it is not in some cilia such as those in neurons. The authors should use at least one additional cilia marker to better support their claim of cilia being absent.

      Response:

      We thank the reviewer for this helpful suggestion. In the revised version, we have strengthened the evidence for cilia identification by including an additional ciliary marker, glutamylated tubulin (GT335), in combination with acetylated tubulin and ARL13B (which were included in the original submission). These data are now presented in the new Supplementary Figure 2, which also includes an example of a non-ciliated spermatocyte showing absence of both ARL13B and AcTub signals.

      Taken together, these markers provide a more comprehensive validation of cilia detection and confirm the absence of ciliary labelling in non-ciliated spermatocytes.

      The conclusion that IFT88 localizes to centrosomes is premature as key controls for the IFT88 antibody staining are lacking. Centrosomes are notoriously "sticky", often showing non-specific antibody staining. The authors must include controls to demonstrate the specificity of the staining they observe such as staining in a genetic mutant or an antigen competition assay.

      Response:

      We appreciate the reviewer's concern and fully agree that antibody specificity is critical when interpreting centrosomal localization. The IFT88 antibody used in our study is commercially available and has been extensively validated in the literature as both a cilia marker (1, 2) and a centrosome marker in somatic cells (3). Labelling of IFT88 in centrosomes has also been previously described using other antibodies (4, 5). In our material, the IFT88 signal consistently appears at one of the duplicated centrosomes and at both spindle poles, patterns identical to those reported in somatic cells. We therefore consider the reported meiotic IFT88 staining to be specific and biologically reliable.

      That said, we agree that genetic validation would provide the most definitive confirmation. We are currently generating a conditional genetic model for a ciliopathy in our laboratory that will directly assess both antibody specificity and the functional consequences of cilia loss during meiosis. These experiments are in progress and will be reported in a follow-up study.

      1. Wong et al. Science (2015). DOI: 10.1126/science.aaa5111 (PMID: 25931445)
      2. Ocbina et al. Nat Genet (2011). DOI: 10.1038/ng.832 (PMID: 21552265)
      3. Vitre et al. EMBO Rep (2020). DOI: 10.15252/embr.201949234 (PMID: 32270908)
      4. Robert A. et al. J Cell Sci (2007). DOI: 10.1242/jcs.03366 (PMID: 17264151)
      5. Singla et al. Developmental Cell (2010). DOI: 10.1016/j.devcel.2009.12.022 (PMID: 20230748)

      *note: due to manuscript-length limitations, not all cited references can be included in the text; they are listed here to substantiate our response.

      There are many inconsistent statements throughout the paper regarding the timing of the first wave of spermatogenesis. For example, the authors state that round spermatids can be detected at 21 dpp on line 161, but on line 180, say round spermatids can be detected at 19 dpp. Not only does this lead to confusion, but such discrepancies undermine the validity of the rest of the paper. A summary graphic displaying key events and their timing in the first wave of spermatogenesis would be instrumental for reader comprehension and could be used by the authors to ensure consistent claims throughout the paper.

      Response:

      We thank the reviewer for identifying this inconsistency and apologize for the confusion. We confirm that early round spermatids first appear at 19 dpp, as shown in the quantitative data (Figure 1J). They can be detected in squashed spermatocyte preparations, where individual spermatocytes and spermatids can be accurately quantified. The original text contained an imprecise reference to the histological image of 21 dpp (previous line 161), since certain H&E sections did not clearly show all cell types simultaneously. We have now revised Figure 1, improving the image quality and adding a zoomed-in panel highlighting early round spermatids. The image of 19 dpp mice in Figure 1D shows early spermatids that are still aflagellated. The first ciliated spermatocytes and the earliest flagellated spermatids are observed at 20 dpp. This has been clarified in the text.

      In addition, we also thank the reviewer for the suggestion of adding a summary graphic, which we agree greatly facilitates reader comprehension. We have added a new schematic summary (Figure 1K) illustrating the key stages and timing of the first spermatogenic wave.

      In the proteomics experiments, it is unclear why the authors assume that changes in protein expression are predominantly due to changes within the germ cells in the developing testis. The analysis is on whole testes including both the somatic and germ cells, which makes it possible that protein expression changes in somatic cells drive the results. The authors need to justify why and how the conclusions drawn from this analysis warrant such an assumption.

      Response:

      We agree with the reviewer that our proteomic analysis was performed on whole testis samples, which contain both germ and somatic cells. Although isolation of pure spermatocyte populations by FACS would provide higher resolution, obtaining sufficient prepubertal material for such analysis would require an extremely large number of animals. To remain compliant with the 3Rs principle for animal experimentation, we therefore used whole-testis samples from three biological replicates per age.

      We acknowledge that our assumption (that the main differences arise from germ cells) is a simplification. However, germ cells constitute the vast majority of testicular cells during this developmental window and are the population undergoing major compositional changes between 15 dpp and adulthood. It is therefore reasonable to expect that a substantial fraction of the observed proteomic changes reflects alterations in germ cells. We have clarified this point in the revised text and have added a statement noting that changes in somatic cells could also contribute to the proteomic profiles.

      The authors should provide details on how proteins were categorized as being involved in ciliogenesis or flagellogenesis, specifically in the distinction criteria. It is not clear how the categorizations were determined or whether they are valid. Thus, no one can repeat this analysis or perform this analysis on other datasets they might want to compare.

      Response:

      We thank the reviewer for this opportunity to clarify our approach. The categorization of proteins as being involved in ciliogenesis or flagellogenesis was based on their Gene Ontology (GO) cellular component annotations obtained from the PANTHER database (Version 19.0), using the gene IDs of the Differentially Expressed Proteins (DEPs). Specifically, we used the GO terms cilium (GO:0005929) and motile cilium (GO:0031514). Since motile cilium is a subcategory of cilium, proteins annotated only with the general cilium term, but not included under motile cilium, were considered to be associated with primary cilia or with shared structural components common to different types of cilia. These GO terms are represented in the bottom panel of Figure 6.
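The categorization rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used for the manuscript: only the two GO identifiers come from the text, while the function name, category labels, protein identifiers, and annotation sets are hypothetical, and the GO annotations are assumed to have already been retrieved (for example from PANTHER) as one set of GO IDs per protein.

```python
from typing import Dict, Set

GO_CILIUM = "GO:0005929"         # cilium (from the response text)
GO_MOTILE_CILIUM = "GO:0031514"  # motile cilium (from the response text)

def classify_protein(go_terms: Set[str]) -> str:
    """Assign one category from a protein's GO cellular component terms.

    The motile cilium term is checked first because it is a subcategory
    of cilium; a protein carrying it is assigned to the motile group.
    """
    if GO_MOTILE_CILIUM in go_terms:
        return "motile_cilium"
    if GO_CILIUM in go_terms:
        # Annotated with the general term only: primary cilia or
        # structural components shared between cilium types.
        return "cilium_only"
    return "not_ciliary"

def classify_all(annotations: Dict[str, Set[str]]) -> Dict[str, str]:
    """Classify every protein in a protein -> GO term set mapping."""
    return {
        protein: classify_protein(terms)
        for protein, terms in annotations.items()
    }

# Hypothetical placeholder annotations, not entries from the dataset.
example_annotations: Dict[str, Set[str]] = {
    "PROTEIN_A": {"GO:0005929", "GO:0031514"},
    "PROTEIN_B": {"GO:0005929"},
    "PROTEIN_C": {"GO:0005634"},  # nucleus
}

categories = classify_all(example_annotations)
```

Checking the more specific term first makes the two ciliary categories mutually exclusive, so each differentially expressed protein is counted only once.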

      This information has been added to the Methods section and referenced in the Results for transparency and reproducibility.

      In the pharmacological studies, the authors conclude that the phenotypes they observe (DNA damage and reduced pachytene spermatocytes) are due to loss of or persistence of cilia. This overinterprets the experiment. Chloral hydrate and MLN8237 certainly impact ciliation as claimed, but have additional cellular effects. Thus, it is possible that the observed phenotypes were not a direct result of cilia manipulation. Either additional controls must address this or the conclusions need to be more specific and toned down.

      Response:

      We thank the reviewer for this fair observation and have taken steps to strengthen and refine our interpretation. In the revised version, we now include data from 1-hour and 24-hour cultures for both control and chloral hydrate (CH)-treated samples (n = 3 biological replicates). The triple immunolabelling with γH2AX, SYCP3, and H1T allows accurate staging of zygotene (H1T⁻), early pachytene (H1T⁻), and late pachytene (H1T⁺) spermatocytes.

      The revised Figure 7 now provides a more complete and statistically supported analysis of DNA damage dynamics, confirming that CH-induced deciliation leads to persistent γH2AX signal at 24 hours, indicative of delayed or defective DNA repair progression. We have also toned down our interpretation in the Discussion, acknowledging that CH could affect other cellular pathways.

      As mentioned before, the conditional genetic model that we are currently generating will allow us to evaluate the role of cilia in meiotic DNA repair in a more direct and specific way.

      Assuming the conclusions of the pharmacological studies hold true with the proper controls, the authors still conflate their findings with meiotic defects. Meiosis is not directly assayed, which makes this conclusion an overstatement of the data. The conclusions need to be rephrased to accurately reflect the data.

      Response:

      We agree that this aspect required clarification. As noted above, we have refined both the Results and Discussion sections to make clear that our assays specifically targeted meiotic spermatocytes.

      We now present data for the zygotene, early pachytene, and late pachytene stages of meiosis. Staging relies on labelling for SYCP3 and H1T, both specific markers of meiosis that are not detectable in non-meiotic cells. We believe that this does constitute an assay of meiotic cells; nevertheless, we now specify in the text that we are analysing potential defects in meiotic progression. We apologize if this was not properly explained in the original manuscript; it has been rephrased in both the Results and Discussion sections of the new version.

      It is not clear why the authors chose not to use widely accepted assays of Hedgehog signaling. Traditionally, pathway activation is measured by transcriptional output, not GLI protein expression because transcription factor expression does not necessarily reflect transcription levels of target genes.

      Response:

      We agree with the reviewer that measuring mRNA levels of Hedgehog pathway target genes, typically GLI1 and PTCH1, is the most common method for measuring pathway activation, and is widely accepted by researchers in the field. However, the methods we use in this manuscript (GLI1 and GLI3 immunoblots) are also quite common and widely accepted:

      Regarding GLI1 immunoblot, many articles have used this method to monitor Hedgehog signaling, since GLI1 protein levels have repeatedly been shown to also go up upon pathway activation, and down upon pathway inhibition, mirroring the behavior of GLI1 mRNA. Here are a few publications that exemplify this point:

      • Banday et al. 2025 Nat Commun. DOI: 10.1038/s41467-025-56632-0 (PMID: 39894896)
      • Shi et al 2022 JCI Insight DOI: 10.1172/jci.insight.149626 (PMID: 35041619)
      • Deng et al. 2019 eLife, DOI: 10.7554/eLife.50208 (PMID: 31482846)
      • Zhu et al. 2019 Nat Commun, DOI: 10.1038/s41467-019-10739-3 (PMID: 31253779)
      • Caparros-Martin et al 2013 Hum Mol Genet, DOI: 10.1093/hmg/dds409 (PMID: 23026747) *note: due to manuscript-length limitations, not all cited references can be included in the text; they are listed here to substantiate our response.

      As for GLI3 immunoblot, Hedgehog pathway activation is well known to inhibit GLI3 proteolytic processing from its full length form (GLI3-FL) to its transcriptional repressor (GLI3-R), and such processing is also commonly used to monitor Hedgehog signal transduction, of which the following are but a few examples:

      • Pedraza et al 2025 eLife, DOI: 10.7554/eLife.100328 (PMID: 40956303)
      • Somatilaka et al 2020 Dev Cell, DOI: 10.1016/j.devcel.2020.06.034 (PMID: 32702291)
      • Infante et al 2018, Nat Commun, DOI: 10.1038/s41467-018-03339-0 (PMID: 29515120)
      • Wang et al 2017 Dev Biol DOI: 10.1016/j.ydbio.2017.08.003 (PMID: 28800946)
      • Singh et al 2015 J Biol Chem DOI: 10.1074/jbc.M115.665810 (PMID: 26451044) *note: due to manuscript-length limitations, not all cited references can be included in the text; they are listed here to substantiate our response.

      In summary, we think that we have used two well established markers to look at Hedgehog signaling (three, if we include the immunofluorescence analysis of SMO, which we could not detect in meiotic cilia).

      These Hh pathway analyses did not provide any convincing evidence that the prepubertal cilia we describe here are actively involved in this pathway, even though Hh signaling is cilia-dependent and is known to be active in the male germline (Sahin et al 2014 Andrology PMID: 24574096; Mäkelä et al 2011 Reproduction PMID: 21893610; Bitgood et al 1996 Curr Biol. PMID: 8805249).

      That said, we fully agree that our current analyses do not allow us to draw definitive conclusions regarding Hedgehog pathway activity in meiotic cilia, and we now state this explicitly in the revised Discussion.

      Also in the Hedgehog pathway experiment, it is confusing that the authors report no detection of SMO yet detect little to no expression of GLIR in their western blot. Undetectable SMO indicates Hedgehog signaling is inactive, which results in high levels of GLIR. The impact of this is that it is not clear what is going on with Hh signaling in this system.

      Response:

      It is true that, when Hh signaling is inactive (and hence SMO not ciliary), the GLI3FL/GLI3R ratio tends to be low.

      Although our data in prepubertal mouse testes show a strong reduction in total GLI3 protein levels (GLI3FL+GLI3R) as these mice grow older, this downregulation of total GLI3 occurs without any major changes in the GLI3FL/GLI3R ratio, which is only modestly affected (Supplementary Figure 6).

      Hence, since it is the ratio that correlates with Hh signaling rather than total levels, we do not think that the GLI3R reduction we see is incompatible with our non-detection of SMO in cilia: it seems more likely that overall GLI3 expression is being downregulated in developing testes via a Hh-independent mechanism.

      Also potentially relevant here is the fact that some cell types depend more on GLI2 than on GLI3 for Hh signaling. For instance, in mouse embryos, Hh-mediated neural tube patterning relies more heavily on GLI2 processing into a transcriptional activator than on the inhibition of GLI3 processing into a repressor. In contrast, the opposite is true during Hh-mediated limb bud patterning (Nieuwenhuis and Hui 2005 Clin Genet. PMID: 15691355). We have not looked at GLI2, but it is conceivable that it could play a bigger role than GLI3 in our model.

      Moreover, several forms of GLI-independent non-canonical Hh signaling have been described, and they could potentially play a role in our model, too (Robbins et al 2012 Sci Signal. PMID: 23074268).

      We have revised the discussion to clarify some of these points.

      All in all, we agree that our findings regarding Hh signaling are not conclusive, but we still think they add important pieces to the puzzle that will help guide future studies.

      There are multiple instances where it is not clear whether the authors performed statistical analysis on their data, specifically when comparing the percent composition of a population. The authors need to include appropriate statistical tests to make claims regarding this data. While the authors state some impressive sample sizes, once evaluated in individual categories (eg specific cell type and age) the sample sizes of evaluated cilia are as low as 15, which is likely underpowered. The authors need to state the n for each analysis in the figures or legends.

      We thank the reviewer for highlighting this important issue. We have now included the sample size (n) for every analysis directly in the figure legends. Although this adds length, it improves transparency and reproducibility.

      Regarding the concerns of Referee #3 about the different sample sizes, the number of spermatocytes quantified at each stage reflects the distribution of stages in meiosis. For example, pachytene lasts for 10 days, so this stage is widely represented in the preparations, whereas metaphases I are much more difficult to quantify because the stage itself lasts less than 24 hours and is therefore less represented. Taking this into account, we ensured that all analyses remain statistically valid and representative, applying the appropriate statistical tests for each dataset. These details are now clearly indicated in the revised figures and legends.

      Minor concerns:

      1. The phrase "lactating male" is used throughout the paper and is not correct. We assume this term to mean male pups that have yet to be weaned from their lactating mother, but "lactating male" suggests a rare disorder requiring medical intervention. Perhaps "pre-weaning males" is what the authors meant.

      Response:

      We thank the reviewer for noticing this terminology error. The expression has been corrected to "pre-weaning males" throughout the manuscript.

      The convention used to label the figures in this paper is confusing and difficult to read as there are multiple panels with the same letter in the same figure (albeit distinct sections). Labeling panels in the standard A-Z format is preferred. "Panel Z" is easier to identify than "panel III-E".

      Response:

      We thank the reviewer for this suggestion. All figures have been relabelled using the standard A-Z panel format, ensuring consistency and easier readability across the manuscript.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      In "The dynamics of ciliogenesis in prepubertal mouse meiosis reveals new clues about testicular development" Pérez-Moreno, et al. explore primary cilia in prepubertal mouse spermatocytes. Using a combination of microscopy, proteomics, and pharmacological perturbations, the authors carefully characterize prepubertal spermatocyte cilia, providing foundational work regarding meiotic cilia in the developing mammalian testis.

      Major concerns:

      1. The authors provide evidence consistent with cilia not being present in a larger percentage of spermatocytes or in other cells in the testis. The combination of electron microscopy and acetylated tubulin antibody staining establishes the presence of cilia; however, proving a negative is challenging. While acetylated tubulin is certainly a common marker of cilia, it is not in some cilia such as those in neurons. The authors should use at least one additional cilia marker to better support their claim of cilia being absent.

      2. The conclusion that IFT88 localizes to centrosomes is premature as key controls for the IFT88 antibody staining are lacking. Centrosomes are notoriously "sticky", often showing non-specific antibody staining. The authors must include controls to demonstrate the specificity of the staining they observe, such as staining in a genetic mutant or an antigen competition assay.

      3. There are many inconsistent statements throughout the paper regarding the timing of the first wave of spermatogenesis. For example, the authors state that round spermatids can be detected at 21 dpp on line 161, but on line 180 say round spermatids can be detected at 19 dpp. Not only does this lead to confusion, but such discrepancies undermine the validity of the rest of the paper. A summary graphic displaying key events and their timing in the first wave of spermatogenesis would be instrumental for reader comprehension and could be used by the authors to ensure consistent claims throughout the paper.

      4. In the proteomics experiments, it is unclear why the authors assume that changes in protein expression are predominantly due to changes within the germ cells in the developing testis. The analysis is on whole testes including both the somatic and germ cells, which makes it possible that protein expression changes in somatic cells drive the results. The authors need to justify why and how the conclusions drawn from this analysis warrant such an assumption.

      5. The authors should provide details on how proteins were categorized as being involved in ciliogenesis or flagellogenesis, specifically in the distinction criteria. It is not clear how the categorizations were determined or whether they are valid. Thus, no one can repeat this analysis or perform this analysis on other datasets they might want to compare.

      6. In the pharmacological studies, the authors conclude that the phenotypes they observe (DNA damage and reduced pachytene spermatocytes) are due to loss of or persistence of cilia. This overinterprets the experiment. Chloral hydrate and MLN8237 certainly impact ciliation as claimed, but have additional cellular effects. Thus, it is possible that the observed phenotypes were not a direct result of cilia manipulation. Either additional controls must address this or the conclusions need to be more specific and toned down.

      7. Assuming the conclusions of the pharmacological studies hold true with the proper controls, the authors still conflate their findings with meiotic defects. Meiosis is not directly assayed, which makes this conclusion an overstatement of the data. The conclusions need to be rephrased to accurately reflect the data.

      8. It is not clear why the authors chose not to use widely accepted assays of Hedgehog signaling. Traditionally, pathway activation is measured by transcriptional output, not GLI protein expression because transcription factor expression does not necessarily reflect transcription levels of target genes.

      9. Also in the Hedgehog pathway experiment, it is confusing that the authors report no detection of SMO yet detect little to no expression of GLIR in their western blot. Undetectable SMO indicates Hedgehog signaling is inactive, which results in high levels of GLIR. The impact of this is that it is not clear what is going on with Hh signaling in this system.

      10. There are multiple instances where it is not clear whether the authors performed statistical analysis on their data, specifically when comparing the percent composition of a population. The authors need to include appropriate statistical tests to make claims regarding these data. While the authors state some impressive sample sizes, once evaluated in individual categories (e.g., specific cell type and age) the sample sizes of evaluated cilia are as low as 15, which is likely underpowered. The authors need to state the n for each analysis in the figures or legends.

      Minor concerns:

      1. The phrase "lactating male" is used throughout the paper and is not correct. We assume this term to mean male pups that have yet to be weaned from their lactating mother, but "lactating male" suggests a rare disorder requiring medical intervention. Perhaps "pre-weaning males" is what the authors meant.

      2. The convention used to label the figures in this paper is confusing and difficult to read as there are multiple panels with the same letter in the same figure (albeit distinct sections). Labeling panels in the standard A-Z format is preferred. "Panel Z" is easier to identify than "panel III-E".

      Significance

      Overall, this is a well-done body of work that deserves recognition for the novel and implicative discoveries it presents. Assuming the conclusions hold true following appropriate statistical analysis and rephrasing, this paper would report the first documented evidence of meiotic cilia in the developing mammalian testis with sufficient rigor to become the foundational work on this topic.

      This paper will be of interest to communities focused on germ cell development, cilia, and Hedgehog signaling. It may prompt a new perspective on Desert Hedgehog signaling as it pertains to spermatogenesis. Further, this work will be of interest to those studying male fertility, as it highlights the potential role of cilia in spermatogenesis.

      Further, the proteomic analysis presented has the potential to invoke hypotheses and experimentation investigating the role of several proteins with previously uncharacterized roles in ciliogenesis, flagellogenesis, and/or spermatogenesis. The finding that the onset of ciliogenesis and flagellogenesis appear to be temporally linked has the potential to prompt research regarding shared molecular mechanisms dictating axonemal formation. We believe this paper has the potential to have an impact in its respective field, underscored by the exquisite microscopy and detailed characterization of meiotic cilia.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      This study focuses on the dynamic changes of ciliogenesis during meiosis in prepubertal mice. It was found that primary cilia are not an intrinsic feature of the first wave of meiosis (initiating at 8 dpp); instead, they begin to polymerize at 20 dpp (after the completion of the first wave of meiosis) and are present in all stages of prophase I. Moreover, prepubertal cilia (with an average length of 21.96 μm) are significantly longer than adult cilia (10 μm). The emergence of cilia coincides temporally with flagellogenesis, suggesting a regulatory association in the formation of axonemes between the two. Functional experiments showed that disruption of cilia by chloral hydrate (CH) delays DNA repair, while the AURKA inhibitor (MLN8237) delays cilia disassembly, and centrosome migration and cilia depolymerization are mutually exclusive events. These findings represent the first detailed description of the spatiotemporal regulation and potential roles of cilia during early testicular maturation in mice. The discovery of this phenomenon is interesting; however, there are certain limitations in functional research.

      Major points:

      1. The authors lack a specific genetic ablation to block the formation of the prepubertal spermatocyte cilia they discovered, making it impossible to evaluate whether such cilia truly have functions, particularly because this type of cilium does not seem to be essential in either the first wave of spermatogenesis or adult spermatogenesis. In addition, the authors also imply that the formation of such cilia appears to be synchronized with the formation of sperm flagella. This suggests that the production of such cilia may merely be transient protein expression noise rather than a functionally meaningful cellular structure.

      2. The high expression of axoneme assembly regulators such as TRiC complex and IFT proteins identified by proteomic analysis is not particularly significant. This time point is precisely the critical period for spermatids to assemble flagella, and TRiC, as a newly discovered component of flagellar axonemes, is reasonably highly expressed at this time. No intrinsic connection with the argument of this paper is observed. In fact, this testicular proteomics has little significance.

      Significance

      Strengths: The discovery of a very interesting time window for ciliary growth in spermatocytes.

      Weaknesses: Insufficient analysis of the function of such cilia.

      Readers: Developmental biologists, reproductive biologists

      My expertise: Spermatogenesis, genetics

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      In this manuscript by Perez-Moreno et al., titled "The dynamics of ciliogenesis in prepubertal mouse meiosis reveal new clues about testicular maturation during puberty", the authors characterize the development of primary cilia during meiosis in juvenile male mice. The authors catalog a variety of testicular changes that occur as juvenile mice age, such as changes in testis weight and germ cell-type composition. They next show that meiotic prophase cells initially lack cilia, and ciliated meiotic prophase cells are detected after 20 days postpartum, coinciding with the time when post-meiotic spermatids within the developing testes acquire flagella. They describe that germ cells in juvenile mice harbor cilia at all substages of meiotic prophase, in contrast to adults where only zygotene stage meiotic cells harbor cilia. The authors also document that cilia in juvenile mice are longer than those in adults. They characterize cilia composition and structure by immunofluorescence and EM, highlighting that cilia polymerization may initially begin inside the cell, followed by extension beyond the cell membrane. Additionally, they demonstrate ciliated cells can be detected in adult human testes. The authors next perform proteomic analyses of whole testes from juvenile mice at multiple ages, which may not provide direct information about the extremely small numbers of ciliated meiotic cells in the testis and lack follow-up experiments, but do serve as a valuable resource for the community. Finally, the authors use a seminiferous tubule culturing system to show that chemical inhibition of Aurora kinase A likely inhibits cilia depolymerization upon meiotic prophase I exit and leads to an accumulation of metaphase-like cells harboring cilia. They also assess meiotic recombination progression using their culturing system, but this is less convincing.

      Few suggestions/comments are listed below:

      Major comments

      1. There are a few issues with the experimental setup for assessing the effects of cilia depolymerization on DNA repair (Figure 7-II). First, how were mid pachytene cells identified and differentiated from early pachytene cells (which would have higher levels of gH2AX) in this experiment? I suggest either using H1t staining (to differentiate early/mid vs late pachytene) or the extent of sex chromosome synapsis. This would ensure that the authors are comparing similarly staged cells in control and treated samples. Second, what were the gH2AX levels at the starting point of this experiment? A more convincing setup would be for the authors to measure gH2AX immediately after culturing in early and late cells (early would have higher gH2AX, late would have lower gH2AX), and then again after 24 h in late cells (upon repair disruption the sampled late cells would have high gH2AX). This would allow them to compare the decline in gH2AX (i.e., repair progression) in control vs treated samples. Also, it would be informative to know the starting gH2AX levels in ciliated vs non-ciliated cells as they may vary.

      2. The authors analyze meiotic progression in cells cultured with/without AURKA inhibition in Figure 8-III and conclude that the distribution of prophase I cells does not change upon treatment. Is Figure 8-III A and B the same data? The legend text is incorrect, so it's hard to follow. Figure 8-III A shows a depletion of EdU-labelled pachytene cells upon treatment. Moreover, the conclusion that a higher proportion of ciliated zygotene cells upon treatment (Figure 8-II C) suggests that AURKA inhibition delays cilia depolymerization (page 13 line 444) does not make sense to me.

      3. How do the authors know that there is a monopolar spindle in Figure 8-IV treated samples? Perhaps the authors can use a different Tubulin antibody (that does not detect only acetylated Tubulin) to show that there is a monopolar spindle.

      4. The authors state in the abstract that they provide evidence suggesting that centrosome migration and cilia depolymerization are mutually exclusive events during meiosis. This is not convincing with the data present in the current manuscript. I suggest amending this statement in the abstract.

      Minor comments

      1. The presence of cilia in all stages of meiotic prophase I in juvenile mice is intriguing. Why is the cellular distribution and length of cilia different in prepubertal mice compared to adults (where shorter cilia are present only in zygotene cells)? What is the relevance of these developmental differences? Do cilia serve prophase I functions in juvenile mice (in leptotene, pachytene etc.) that are perhaps absent in adults?

      Related to the above point, what is the relevance of the absence of cilia during the first meiotic wave? If cilia serve a critical function during prophase I (for instance, facilitating DSB repair), does the lack of cilia during the first wave imply differing cilia (and repair) requirements during the first vs latter spermatogenesis waves?

      In my opinion, these would be interesting points to discuss in the discussion section.

      2. The authors state on page 9 lines 286-288 that the presence of cytoplasmic continuity via intercellular bridges (between developmentally synchronous spermatocytes) hints towards a mechanism that links cilia and flagella formation. Please clarify this statement. While the correlation between the timing of appearance of cilia and flagella in cells that are located within the same segment of the seminiferous tubule may be hinting towards some shared regulation, how would cytoplasmic continuity participate in this regulation? Especially since the cytoplasmic continuity is not between the developmentally distinct cells acquiring the cilia and flagella?

      3. Individual germ cells in H&E-stained testis sections in Figure 1-II are difficult to see. I suggest adding zoomed-in images where spermatocytes/round spermatids/elongated spermatids are clearly distinguishable.

      4. In Figure 2-II B, the authors document that most ciliated spermatocytes in juvenile mice are pachytene. Is this because most meiotic cells are pachytene? Please clarify. If the data are available (perhaps could be adapted from Figure 1-III), it would be informative to see a graph representing what proportion of each meiotic prophase substage has cilia.

      5. I suggest annotating the EM images in Sup Figure 2 and 3 to make them easier to interpret.

      6. The authors claim that the ratio between GLI3-FL and GLI3-R is stable across their analyzed developmental window in whole testis immunoblots shown in Sup Figure 5. Quantifying the bands and normalizing to the loading control would help strengthen this claim, as it is hard to interpret the immunoblot in its current form.

      7. There are a few typos throughout the manuscript. Some examples: page 5 line 172, Figure 3-I legend text, Sup Figure 5-II callouts, Figure 8-III legend, page 15 line 508, page 17 line 580, page 18 line 611.

      Significance

      This work provides new information about an important but poorly understood cellular structure present in meiotic cells, the primary cilium. More generally, this work expands on our understanding of testis development in juvenile mice. The microscopy images presented here are beautiful. The work is mostly descriptive but lays the groundwork for future investigations. I believe that this study would be of interest to the germ cell, meiosis, and spermatogenesis communities, and with a few modifications, is suitable for publication.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1 (R1)

      R1 General statement: Here, Escalera-Maurer and colleagues present an up-to-date distribution of homologues of Hok toxic proteins belonging to the well-annotated, but otherwise functionally obscure, hok/Sok type I toxin-antitoxin system, across the RefSeq database. Although such computational analyses have been done in the past, the authors here find many more hok homologs than described before, and they categorise their distribution based on whether they are encoded on chromosomes, plasmids, or (pro)phages. These computational analyses are in general tricky with T1TAs, as their toxins are quite short (~50 amino acids, as is the case for Hok), which is why the authors here used three separate approaches to expand their search (nucleotide-level BLAST, protein-homology, or both combined with Infernal). The authors cluster the Hok homologues they find based on a 60% sequence identity cut-off (expanding the known clusters in the process), and proceed to test 31 candidates belonging to 15 sequence-clusters for their toxicity in Salmonella Typhimurium LT2, showing that 30/31 were toxic upon induction. An interesting finding from their endeavours is that hok/Sok homologues are enriched within prophages and large plasmids, but are not enriched near bacterial anti-phage defense systems (in contrast to the SymE/SymR T1TA). The findings suggest that hok/Sok are indeed sometimes linked to phage and plasmid biology, although they might not be antiphage defenses per se (they have been clearly shown in the past to be addiction modules, and this is still clearly true).

      Authors' answer to R1 General statement: We do not state here that hok/Sok are not anti-phage defense systems; we simply observe that they do not cluster with anti-phage defense systems. We have also observed (unpublished data) that known defense systems do not systematically cluster together with other defense systems. Therefore, a strong association with other defense systems would have been a strong indication of a function in phage defense, but the fact that we did not observe any such association does not exclude their involvement in phage defense.

      R1_C1: My expertise lies towards the experimental side of the authors' work, I thus cannot comment on the accuracy/robustness of the computational analyses performed here. The authors do a fine job in clearly stating their findings overall; I could follow most of the conclusions, and I deemed that most of them were supported by their work. Additionally, I find that this paper is a missed opportunity to uncover even more novel biology connected to the interesting hok/Sok T1TAs. The paper does not provide a new framework to think about what is the function of the chromosomal/prophage hok/Sok T1TA systems, although I realize that this is very difficult to accomplish, especially when considering that hok/Sok systems have been around in the literature for almost 40 years.

      Authors' answer to R1_C1: We agree with the reviewer, as we indeed performed this analysis with the aim of clarifying the role of hok/Sok systems. However, we still believe that our survey of Hok loci brings to light their enrichment in various mobile genetic elements, such as prophages and large conjugative plasmids, which is undoubtedly linked to their function. In addition, our study will guide future experimental efforts to uncover the function of these systems, for example by helping researchers select relevant homologs to test for a specific function.

      R1_C2: My major comment is in regard to the Hok toxicity assays (Fig. 2). The authors state in the discussion that "Hok peptides originating from chromosomes are as toxic as those from plasmids", but I believe that the way that they tested their constructs might not have allowed them to see toxicity differences between the two groups. Specifically, using the multi-copy plasmid pAZ3 (pBR322 origin of replication; ~15-20 plasmid copies per chromosome) to induce the different Hok toxin homologues in Salmonella Typhimurium LT2 with arabinose might have masked toxicity differences that would otherwise be apparent on the chromosomal expression-level.

      Some of the authors themselves have previously used the FASTBAC-Seq method to study the Hok homologue from plasmid R1, a useful technique during which a toxin is integrated in the chromosome, in order to study their toxicity under natural levels of expression. I believe that an ideal scenario would be to apply FASTBAC-seq to some of the 31 Hok homologues described here (e.g., a subset of plasmidic vs chromosomal Hok homologues) to shed light on potential toxicity differences between the Hok clusters. This would increase the value of the presented study.

      Alternatively, the authors could employ an L-arabinose concentration gradient to titrate the expression levels of the Hok toxins in order to potentially see different toxicity levels from the different homologues. However, this is not going to work in the system as they are using it now for two reasons:

      a) the S. Typhimurium LT2 (STm) used here has its arabinose utilization operon intact (araBAD), which means that Salmonella can catabolize arabinose to use it as a carbon source. This catabolization process interferes with the arabinose induction (i.e., Salmonella eats arabinose instead of using it as the Hok inducer). To ameliorate this, the authors could delete the araBAD operon in STm, rendering STm incapable of catabolizing arabinose, and repeat the experiments in that strain. Or use E. coli BW25113 as the expression host, which already has the araBAD operon deleted (it is not clear to me why the different Hok homologues would not be toxic in E. coli, as the different Hok homologues are widely diverse in sequence, as the authors found here).
      b) Even with the araBAD operon deleted, the arabinose induction would be bimodally on or off in the population, due to the bimodal expression of the arabinose transporter (AraE; see Khlebnikov et al., 2002). This would again not allow for titratable arabinose-inducible expression from different concentrations of arabinose. The solution for this would be to co-express a separate plasmid with araE, which would render every cell the same in regards to arabinose permeability, and thus the system would be titratable (as explained in Khlebnikov et al., 2002). Therefore, if the authors would be interested to go towards this route, they would have to first delete the araBAD from STm, then transform STm with an araE plasmid, and redo the experiments. In addition, I would propose to the authors to use the drop plate method (agar plate-based), which is more sensitive compared to the liquid assays employed here.

      Having said all that, I understand that all this experimental work would be strenuous and time-consuming, and although I would like to see it happen, this is not my paper. I would be content therefore if the authors toned down the claim that plasmidic vs chromosomal Hok homologues have the same toxicity, and discuss that chromosomal levels of toxicity are an important caveat that has not been explored here.

      Authors' answer to R1_C2: We thank the reviewer for the detailed suggestion on how to better assess toxicity differences by using an araBAD deletion mutant overexpressing araE. We repeated the arabinose induction assays using drop assays and strain BW25223 carrying plasmid pJAT13araE and our pAZ3-based plasmids encoding the Hok CDS homologs. However, we obtained similar data and were not able to distinguish between the toxicity of chromosomal and plasmidic CDS, even when using different concentrations of arabinose. This is probably because low concentrations of the Hok protein are sufficient for activity, and because here we bypass all post-transcriptional silencing by the native hok mRNAs by expressing the protein directly from a multicopy plasmid. We have now included the 0.01% arabinose induction drop assays in the manuscript, as the data obtained with other arabinose concentrations did not provide new information. In any case, we are still not assessing native expression levels, for the following reasons: 1/ chromosomal levels of toxicity were not explored here and 2/ only the toxicity of the coding sequence, not the full mRNA, was tested. Indeed, we do not know the exact sequence of the hok homolog mRNAs, and determining it is beyond the scope of the study. These caveats have been added to the discussion.

      We agree that the sentence "Hok peptides originating from chromosomes are as toxic as those from plasmids" was too strong and we have added the caveats of our experimental design in the discussion. While we indeed did not compare the toxicity of the peptides, we still showed that chromosomal Hok can be toxic upon overexpression, which would not be the case if the sequences were degenerated.

      The reviewer also suggests the use of the FASTBAC-Seq method, which we previously used to study Hok from the R1 plasmid and which allows type I toxins to be studied at the native expression level. While FASTBAC-Seq identifies loss-of-function mutants of the systems, it does not allow differences in toxicity between systems to be determined per se. In addition, FASTBAC-Seq has always been performed in the context of the full mRNA, not only the coding sequence, and these sequences are presently unknown for most homologs.

      Other comments:

      R1_C3: a) There is barely any discussion of the Sok component (RNA antitoxin) of the homologues; why is that? Could you please discuss Sok differences across the homologues, or at least explain why this is not discussed at all in the paper (e.g., in the discussion)?

      Authors' answer to R1_C3: It is not trivial to identify the Sok RNA sequence, which is why this was not done in this study. A paragraph explaining this has been added to the discussion.

      R1_C4: b) In the results section, the Hok clusters are referred to as 62 in number ("Because Hok sequences were too short and variable to construct a meaningful phylogenetic tree, we clustered the Hok sequences with a 60% identity threshold and obtained 62 clusters"), but then in the discussion section, the cluster number becomes 74 ("We highlighted the high sequence variability within Hok peptides by obtaining a total of 74 clusters with 60% identity (Fig. S7)."). Which one is the right number, and why is there a discrepancy?

      Authors' answer to R1_C4: We apologize for the discrepancy between the numbers. The first number corresponded to the Hok hits from RefSeq; we then added the Hok hits from the plasmid and virus databases (performed later in the manuscript). We have clarified this information in both the results and discussion texts (61 clusters from RefSeq and 79 in total; 74 was a typo).

      R1 Significance: The most well-clarified aspect of the paper presented here is the distribution of Hok homologues, with the novel aspect of the location in which the hok/Sok T1TAs reside (i.e., chromosome, plasmid, or phage). There is room for the molecular genetics part to be developed further, as I discussed earlier, however this study is the most up-to-date characterization of the diversity of Hok homologues, and will be of interest to the T1TA and the general toxin-antitoxin field.

      Reviewer #2 (R2)

      R2 General statement: The authors examined how the Hok toxins are spread across bacterial genomes. The manuscript, including its figures, is hard to read and understand. I commented on Figure 1 in detail, but similar comments apply to the other figures. Overall, the data lack clarity and precision. Finding information about sequences and clusters in the supplementary materials was not easy. The manuscript should be thoroughly revised. In addition, I believe that other aspects should be developed to expand the interest of the study, such as the co-occurrence of multiple systems in chromosomes and on plasmids, and whether they are able to crosstalk. This might provide some evolutionary insights into the biology of these toxins.

      __Authors' answer to R2 General statement: __We designed all figures according to established standards for scientific data visualization, although we recognize that different presentations may work better for different audiences. In our detailed response to Figure 1A, we explain how UpSet plots are constructed and interpreted, which we hope clarifies the visualization approach for the full dataset. We are open to discussing specific improvements if the reviewer has suggestions for enhanced clarity. To address concerns about accessibility, we want to clarify that all sequences are compiled in Table S1 with their clus100 identifiers, making them easy to locate. We are open to reorganizing supplementary materials if a different structure would be more user-friendly. Finally, we agree that an extensive analysis of co-occurrences and crosstalks would be valuable. However, predicting crosstalk bioinformatically for all genomes presents challenges, as it would require predicting RNA:RNA interactions between hok mRNA and Sok sequences, which are currently unknown. Given these limitations, this analysis was beyond the scope of the current study.

      R2_C1: The introduction lacks information regarding the Hok protein (size, structure prediction, localization) as well as a bit of explanation about the reason of looking at these toxins. The description of the potential roles should be a bit expanded.

      Authors' answer to R2_C1: Following the comment from the reviewer, we have provided additional information about Hok in the introduction.

      __R2_C2: __When the authors talk about 'loci', they mean genes encoding Hok homologs if I understand correctly. They did not look for the Sok sequences (hok-sok loci).

      __Authors' answer to R2_C2: __Indeed, we did not look for the Sok sequences; we only describe Hok homolog loci, which may either encode or lack a Sok homolog.

      __R2_C3: __It is not clear what the authors did with the sequences for which they could not detect a start codon and a SD (although it is unusual to refer to SD in the context of protein sequence)

      Authors' answer to R2_C3: The peptides were annotated by extending the initial hit until the first start codon. Therefore, all annotated peptides have a start codon. Shine-Dalgarno sequences were annotated when confidently predicted, to provide additional information. Sequences were not excluded based on the presence or absence of the SD.

      __R2_C4: __Figure 1A is not clear. The total of the bars equal 32,532 which is the number of 'loci' detected by the combination of the different methods. However, it is not clear to me how many are redundant. For instance, I suppose that all the 8483 sequences that were retrieved using blastn and Infernal were retrieved using MMseqs2, blastn and Infernal. So, what is the actual number of sequences that were found? When the authors talk about 1264 distinct peptides, what do they mean? What are the numbers on the X axis (18209, 2260, 27728)?

      Authors' answer to R2_C4: Figure 1A is a standard "UpSet" plot, as indicated in the legend (A. Lex, N. Gehlenborg, H. Strobelt, R. Vuillemot and H. Pfister, "UpSet: Visualization of Intersecting Sets," in IEEE Transactions on Visualization and Computer Graphics, vol. 20, no. 12, pp. 1983-1992, 31 Dec. 2014, doi: 10.1109/TVCG.2014.2346248). UpSet plots are a visualization method for data with more than two intersecting sets. The Hok sequence hits were obtained by three different methods, listed on the rows (MMseqs2, blastn and Infernal); the number 18209 is therefore the number of hits by MMseqs2, 22680 the number of hits by blastn and 27728 the number of hits by Infernal. The columns show the intersections between these three sets. For example, the 8483 sequences mentioned (second column) were found only by blastn and Infernal, not by MMseqs2. The actual total number of sequences found is indeed 32532. The 1264 distinct peptides are peptides with different sequences: after removing false positives, degenerated sequences and small peptides, we obtained 1264 unique Hok sequences, which are found in the 32532 bacterial loci.
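
      To make the reading of such a plot concrete, the following minimal sketch computes the quantities an UpSet plot displays for three sets: the per-method totals (rows) and the exclusive intersections (columns). The hit identifiers are toy values and the function name is hypothetical; this is an illustration of the principle, not the code used for Figure 1A.

```python
def exclusive_intersections(sets):
    """Group items by the exact combination of sets that contain them.

    Each key is a frozenset of set names (one UpSet column); each value is the
    set of items found in those sets and in no others.
    """
    membership = {}
    for name, items in sets.items():
        for item in items:
            membership.setdefault(item, set()).add(name)

    columns = {}
    for item, names in membership.items():
        columns.setdefault(frozenset(names), set()).add(item)
    return columns


# Toy hits (hypothetical identifiers, not the manuscript's data).
toy_hits = {
    "MMseqs2": {"a", "b", "c"},
    "blastn": {"b", "c", "d", "e"},
    "Infernal": {"c", "d", "e", "f"},
}

columns = exclusive_intersections(toy_hits)
set_sizes = {name: len(items) for name, items in toy_hits.items()}  # row totals
total_found = sum(len(items) for items in columns.values())         # sum of column bars
```

      Because every item falls in exactly one column, the column bars sum to the number of distinct items, whereas the row totals count shared items once per method and therefore add up to more.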

      __R2_C5: __About Infernal: first the authors state that only 8% of the sequences are lost when not considering the mRNA structure, which they seem to consider negligible. Then, in the next section, they state that Infernal is the best tool for identifying clusters that are not detected otherwise. This seems a bit contradictory.

      __Authors' answer to R2_C5: __We appreciate the reviewer pointing out this apparent contradiction and have clarified this part in the revised manuscript. Infernal uses both sequence and structure information simultaneously for homology detection. Although only 8% of Infernal's hits are detected uniquely when structural information is considered, these sequences account for 9 additional clusters with notably high sequence diversity, which would otherwise have gone undetected. Therefore, we believe that Infernal is the best tool to capture novel cluster diversity.

      __R2_C6: __Cluster determination. The threshold was put at 60% identity. What is the rationale for the 60% identity? Given that the Hok sequences (like toxins and antitoxins from TA systems in general) are highly variable, this leads to a high number of clusters. I'm not sure of the relevance of these clusters. Are there any other criteria to define clusters?

      Authors' answer to R2_C6: We selected 60% identity as a balance between capturing sequence diversity and generating interpretable results. We also tested 70, 80 and 90% and obtained 128, 221 and 377 clusters, respectively, which would be too many for meaningful visualization and interpretation. The best approach would be to construct a phylogenetic tree. However, as explained in the discussion, the high sequence diversity prevented the construction of a reliable phylogenetic tree, so clustering was used as an alternative strategy to identify and interpret patterns of sequence variability.
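
      The dependence of cluster number on the identity threshold can be illustrated with a minimal sketch. It assumes a simple greedy scheme (each sequence joins the first cluster whose representative it matches at or above the threshold) and equal-length toy peptides compared position by position; the clustering tool and identity definition actually used in the manuscript may differ, and all names below are hypothetical.

```python
def matches(a, b):
    """Number of identical positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b))


def greedy_cluster(seqs, min_identity_pct):
    """Assign each sequence to the first cluster whose representative it matches.

    A sequence joins a cluster when its identity to the cluster representative
    (the first member) is at least min_identity_pct; otherwise it founds a new
    cluster. Integer arithmetic avoids floating-point edge cases at the cutoff.
    """
    clusters = []  # each cluster is a list; element 0 is the representative
    for seq in seqs:
        for cluster in clusters:
            rep = cluster[0]
            if matches(seq, rep) * 100 >= min_identity_pct * len(rep):
                cluster.append(seq)
                break
        else:
            clusters.append([seq])
    return clusters


# Toy peptides (hypothetical, not Hok sequences).
toy_peptides = ["MKLPQ", "MKLPR", "MKAPR", "MTAGR", "QTAGR"]
cluster_counts = {pct: len(greedy_cluster(toy_peptides, pct)) for pct in (60, 80, 90)}
```

      On this toy input the number of clusters rises from 2 at 60% to 3 at 80% and 5 at 90%, mirroring the trend reported above for the real data.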

      __R2_C7: __The authors claim that most of the Hok diversity is found on chromosomes. However, the number of chromosomal Hok is higher than that located on plasmids, which might be related to the different sizes of the different replicons ie, chromosomes being larger than plasmids. Is there a way to normalize by determining the density per size?

      Authors' answer to R2_C7: We do not claim that chromosomes contain most of the Hok diversity, as this would indeed be influenced by biases in the databases. We only report that we found most of the diversity in chromosomes; we cannot conclude whether this is a true representation of the frequencies in nature.

      R2_C8: '46 of the 62 clusters contained 10 or less distinct sequences and might be in the process of degenerating'. The authors also linked this with SD detection. Please explain. From what was indicated earlier, I understand that sequences with premature stop codons or short sequences (<40aa) were removed from the analysis earlier. Lacking an SD is a sign of decay? Were these sequences lacking SD not discarded before starting the analysis? Did the authors experimentally validate some of these sequences?

      Authors' answer to R2_C8: We did not remove sequences for which we could not predict the SD. Indeed, the lack of an SD is a sign that the hok mRNA might not be able to play its biological role and would indicate that the sequences have degenerated. To evaluate this hypothesis, we experimentally tested 5 sequences without a predicted SD, and two of those were not toxic (see Table S2). To assess whether the low-abundance clusters contained degenerated sequences, we experimentally tested representatives from some of the clusters with only one Hok CDS and found most of them to be toxic.

      R2_C9: 'Only 7.3% of the unique sequences were found on both plasmids and chromosomes'. From this observation, the authors conclude that 'there is little stable transfer from chromosomes to plasmids or vice-versa'. I don't understand what this means. Do they mean identical sequences? The fact that sequences differ from chromosomes to plasmids does not rule out 'stable transfer'. What do they actually mean by stable transfer? Once the gene is horizontally transferred, it is fixed and vertically transmitted? Same comments apply to the inter-genera horizontal transfer by plasmids.

      __Authors' answer to R2_C9: __Because a reliable phylogenetic tree could not be constructed, we used the identity of sequences across different localizations or genera as our marker for recent, stable transfer events. We define stable transfer as the persistence of sequences in an unchanged form following horizontal transfer, long enough to be detected in current databases. Our approach likely underestimates the total number of transfer events, as sequences that accumulate mutations after transfer would not be captured. If frequent exchange were occurring, we would expect to observe numerous identical sequences across plasmids and chromosomes, unless rapid mutation after the transfer prevented their detection as identical sequences. We have added a sentence to clarify this in the manuscript and removed the term stable transfer.
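
      The identity-based marker described above amounts to counting the unique sequences that occur on more than one replicon type. A minimal sketch follows, assuming each locus is recorded as a (peptide sequence, replicon type) pair; the data and function name are hypothetical and do not reproduce the manuscript's pipeline.

```python
from collections import defaultdict


def shared_fraction(loci):
    """Return the fraction of unique sequences seen on both replicon types.

    loci is an iterable of (sequence, replicon) pairs, where replicon is
    "plasmid" or "chromosome". Only exact sequence identity is considered, so
    copies that diverged after transfer are, by construction, not counted.
    """
    replicons_by_seq = defaultdict(set)
    for seq, replicon in loci:
        replicons_by_seq[seq].add(replicon)

    shared = sorted(
        seq for seq, replicons in replicons_by_seq.items()
        if {"plasmid", "chromosome"} <= replicons
    )
    return len(shared) / len(replicons_by_seq), shared


# Toy loci (hypothetical sequences, not the manuscript's data).
toy_loci = [
    ("MKQ", "chromosome"),
    ("MKQ", "plasmid"),
    ("MKR", "chromosome"),
    ("MTR", "plasmid"),
    ("MTR", "plasmid"),
    ("MAA", "chromosome"),
]
fraction, shared_seqs = shared_fraction(toy_loci)
```

      Here one of four unique sequences is found on both replicon types; a sequence that changed by even one residue after transfer would be counted as two unshared sequences, which is why the measure is a lower bound.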

      __R2_C10: __I don't understand the next section about 'family'. What do the authors mean by 'family'? Genera? The same applies to the next section about the Y to C recoding. Did the authors make point mutations in the conserved amino acids/codons to test whether they are important for toxicity? Some Hok variants lack some of the conserved amino acids and are toxic (under overexpression conditions in Salmonella). What about T18, C31 and E42?

      Authors' answer to R2_C10: Families (Enterobacteriaceae, Vibrionaceae, etc.) and genera (Escherichia, Salmonella, etc.) refer to the taxonomic categories. Following the reviewer's comment, we experimentally assessed the toxicity of Hok from the R1 plasmid after mutating the conserved amino acids to alanine residues. All the mutants were found to be toxic under our expression conditions.

      __R2_C11: __The prevalence of Hok in chromosomes or on plasmids might depend on various confounding parameters, such as the size, number of sequences available among others. The authors should find methods to correct for all that.

      Authors' answer to R2_C11: Normalization would indeed be needed if we were comparing the prevalence on chromosomes with the prevalence on plasmids. Here, we do not claim that Hok homologs are more prevalent on plasmids or on chromosomes; we only describe where we found them.

      __R2_C12: __Link with defense systems. The threshold was set at 20 kb. Why this threshold?

      Authors' answer to R2_C12: The size of defense islands in a previous report was approximately 40 kb (https://doi.org/10.1126/science.aar4120). By setting a 20 kb threshold, we searched for defense systems in a 40 kb region around each of the homologs. If a specific homolog were part of a defense island, we would expect it to lie less than 20 kb from a defense system.
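
      The proximity criterion can be sketched as follows, assuming linear (start, end) coordinates in bp on a single replicon and ignoring wrap-around on circular replicons. The system names, coordinates and function names are hypothetical; this is an illustration of the 20 kb rule, not the analysis code.

```python
def gap(a, b):
    """Distance in bp between two (start, end) intervals; 0 if they overlap."""
    return max(0, max(a[0], b[0]) - min(a[1], b[1]))


def nearby_defense_systems(hok, defense_systems, max_gap=20_000):
    """Names of defense systems lying less than max_gap bp from the hok locus."""
    return [
        name for name, coords in defense_systems.items()
        if gap(hok, coords) < max_gap
    ]


# Toy coordinates (hypothetical, not taken from any genome).
hok_locus = (100_000, 100_150)
systems = {
    "RM_typeII": (85_000, 90_000),    # 10,000 bp upstream
    "CRISPR": (125_000, 130_000),     # 24,850 bp downstream
    "Abi": (119_000, 121_000),        # 18,850 bp downstream
}
neighbours = nearby_defense_systems(hok_locus, systems)
```

      With a 20 kb cutoff on each side, the searched neighbourhood spans roughly 40 kb, matching the island size quoted above; in this toy case two of the three systems fall inside it.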

      __R2 Significance: __The paper in its current state appears to serve the role of a data repository rather than a thorough and original analysis. It requires extensive revisions before it can be of interest to experts in the toxin-antitoxin field.

      __Reviewer #3 (R3): __

      R3 General statement: In the manuscript, "The Hok bacterial toxin: diversity, toxicity, distribution and genomic localization," by Escalera-Maurer et al., the authors investigate the distribution of Hok type I toxin proteins across bacterial species. The Hok-Sok type I toxin-antitoxin system was first described on plasmids where it serves to maintain the plasmid in a population of bacterial cells: translation of the hok mRNA is prevented via the small antitoxin RNA Sok. Upon plasmid loss, with no new transcription of sok, the highly stable hok mRNA is translated into a small protein, killing the plasmid-less cell. Homologues to the system were identified in the chromosome of E. coli in the 1990s, and subsequent analyses have identified identical systems in other bacterial chromosomes, though they are close relatives to E. coli. Given the increased number of bacterial genomes sequenced, the group examined how widespread Hok may be across bacteria. They used a combination of BLASTn, MMseqs2 (protein) and Infernal (RNA) to identify, as best possible, all possible homologs. They then used sequence identity cut-offs to form Hok "clusters," and identified key features of the clusters as well as tested the toxicity of overproduction of 31 homologs in a strain of Salmonella. Overall, through a variety of bioinformatic predictions and analyses, the manuscript identifies an expanded number of Hok members not previously identified and broadens the species it is found in, supports that Hok is not associated with defense systems, and provides additional support that horizontal transfer of hok genes is likely via plasmids (where hok is presumed to have originated).

      Major comments: There are some areas of the text that are a bit too definitive (these can be fixed or better explained in the text) and a few questions raised about the analyses and interpretations.

      Authors' answer to R3 Major Comment: As suggested by the reviewer, we rephrased parts of the manuscript.

      __These are the specific comments: __

      Introduction R3_C1: First paragraph: "Toxin production leads to the death of the cell encoding it" For many chromosomally encoded systems, toxicity has only been observed via artificial overexpression. This is an important point, as for many systems, a true biological function remains unknown. Further, add caveats regarding toxin function (for systems with validated function, they are involved in...). Again, there are still many questions for many t-at systems, in particular the Type I systems.

      __Authors' answer to R3_C1: __Indeed, the function of type 1 TA systems, in particular chromosomal ones, is still a matter of debate. While we previously showed for hok/Sok R1 that expression at the chromosomal level leads to death, this has not been shown for all TA systems (Le Rhun et al., NAR, 2023). We now state that toxin production could lead to either the death or the growth arrest of the cell, and we incorporated the reviewer's suggested changes regarding function.

      __R3_C2: __Introduction: type I's are more narrow in distribution, but much of this is due to their size and lack of biochemical domains. Again, please clarify more here.

      __Authors' answer to R3_C2: __We added the reviewer's suggestion to the text.

      __R3_C3: __Introduction: while Hok's have been found on chromosomes, in E. coli strains, there is clear evidence that many are inactive. This comes up in the discussion, but it is worth including briefly in the introduction.

      Authors' answer to R3_C3: We have now added in the introduction that in the K12 laboratory strain, most chromosomal hok/Sok were found to be inactive.

      __R3_C4: __For the predicted transmembrane domain: it would be worth to include a box/indication as to where that is within the peptide (with the understanding it may not be exact). Is there more/less variation here? I'm assuming all clusters/family have a predicted TM domain?

      __Authors' answer to R3_C4: __Using DeepTMHMM 1.0 (https://services.healthtech.dtu.dk/services/DeepTMHMM-1.0/), 227 of the 1264 unique Hok sequences are predicted to have a TM (transmembrane) domain, 7 to have both an SP (signal peptide) and a TM, and 1025 to have an SP. For the consensus sequence (most abundant amino acid) shown in Fig. 1D, the region A8 to L25 is predicted to be inserted in the membrane, with the N-terminus inside and the C-terminus outside.

      __R3_C5: __What is the cutoff for being a Hok? Did they take the "last hit" and use that in additional searches to see if more appeared? If that was done, and the search was exhaustive, this is really important to add for the reader.

      Authors' answer to R3_C5: The MMseqs2 search was performed using 5 iterations, as indicated in the M&M, meaning that the hits of each search were used to search the database again, five times in a row. Importantly, increasing the number of iterations to 10 did not significantly increase the number of hits. Therefore, at least for the MMseqs2 search in the RefSeq database, we are close to being exhaustive.

      __R3_C6: __Figure S4: the authors state that there was no difference in the degree of toxicity between the clusters. There do appear to be some peptides tested that at the arabinose concentration used did not repress growth as immediately as others. If higher arabinose concentration is used, does that eliminate these differences? OR are many of these suppressors-if diluted back again, do they grow as if they are non-toxic in arabinose?

      Authors' answer to R3_C6: As suggested by Reviewer 1 (R1_C2), we performed an arabinose titration in a system overexpressing araE in a ΔaraBAD background but were not able to find differences in toxicity under our conditions; see also our answer to R1_C2.

      __R3_C7: __Discussion: "because non-functional homologs are expected to quickly accumulate mutations..." is a bit problematic. Hok is highly regulated, as are some of the other well-described type I toxins. In MG1655, while the coding sequence may be intact, there are other mutations and/or insertion elements that prevent expression (and by extension, function). Given the lack of consensus data for type Is, it is best to provide more context for this. If the authors wish to argue that they should quickly accumulate mutations, it would be good to provide additional rates/evidence (even for other loci) from the Enterobacteriaceae.

      __Authors' answer to R3_C7: __We agree this statement might need to be supported further. We have removed this sentence to address this concern.

      __Minor comments: __

      __R3_C8: __For the sequences used in the search: please provide the sequence used in addition to the reference to the T1TAdb. Was the full-length hok mRNA, including mok, used? Please provide the nucleic acid sequence (and include description of whether full-length, etc.) in Materials and Methods or in Supplemental.

      __Authors' answer to R3_C8: __Sequences and code were deposited at https://gitub.u-bordeaux.fr/alerhun/Escalera-Maurer_2025. The files curated_Hok.fasta and hok.fa, corresponding to the Hok protein and mRNA sequences respectively, are available in the file "T1TAdb input".

      __R3_C9: __60% identity was used for clustering. Did this become a problem-meaning separation of same property amino acid?

      __Authors' answer to R3_C9: __We checked amino acid signatures for each cluster (Fig S2), but could not find anything relevant.

      __R3_C10: __Fig. S2: for the clusters shown, please add in HokB, HokE, etc., to better correspond to Figure 1 in the main text.

      __Authors' answer to R3_C10: __The clusters were annotated according to the suggestion.

      __R3_C11: __Fig S1: this figure is challenging to orient-what are the numbers (8_10_85)?

      Authors' answer to R3_C11: The figure was generated using the CLANS tool, with each unique sequence retrieved by our analysis shown as a dot. Hok homologous sequences are in red and cluster together; the outlier clusters are annotated with the numbers corresponding to their 60% identity cluster. We understand that separating the numbers with an underscore could lead to confusion, so we now separate them with a comma.

      __R3_C12: __Please make a separate table or sheet for the experimentally tested peptides. Table S1 is quite large and a separate table/sheet would make this easier to find. If possible, please give the files names a more descriptive title (Table S1 in the name for example). This may be an issue with Review Commons but the individual file names were non-descript and the descriptions on the webpage did not indicate what the file contained.

      __Authors' answer to R3_C12: __We named the files Table S1 and File_S1 to S7. We added a Table S2 with the experimentally tested peptides. Note that identical peptides can sometimes be found in several bacterial loci.

      __R3_C13: __Figure S9: the black arrow for Hok is hard to see; it appears that the long grey bar going through multiple loci is indicative of Hok. Perhaps label this differently to make it easier on the reader (the line initially seemed to be a formatting issue and not indicative of the position of Hok).

      __Authors' answer to R3_C13: __We have now added a new label to indicate where Hok is, and clarified it in the figure legend.

      __R3_C14: __While the authors focused on Hok for this approach, which is fine and appropriate, can they comment at all about where mok is there in these new clusters/sub-families? Sok potential?

      __Authors' answer to R3_C14: __We added a paragraph about Mok in the discussion.

      __R3 Significance: __Overall the paper is a sound bioinformatic exercise and is improved with the testing of numerous "new" Hok proteins. Most of the comments can be done with some clarifications and maybe some additional analyses and/or verification which should take minimal time. The authors are over-emphatic at points as indicated and need to be more careful and precise with their language.

      In terms of advancement, it advances the distribution of these systems and adds to the depth of sub-classes. The audience will be more specialized to those who study these systems.

      Expertise: I have been studying type I toxin-antitoxin systems since the mid-2000s. We published a study examining (and mentioned well by this article!) the distribution in chromosomes of type I toxin-antitoxin systems, identified brand-new systems (that were chromosomally-limited at the time). My lab has continued to study regulation of type I toxins and distribution of chromosomally-only-encoded systems (so not Hok).

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In the manuscript, "The Hok bacterial toxin: diversity, toxicity, distribution and genomic localization," by Escalera-Maurer et al., the authors investigate the distribution of Hok type I toxin proteins across bacterial species. The Hok-Sok type I toxin-antitoxin system was first described on plasmids where it serves to maintain the plasmid in a population of bacterial cells: translation of the hok mRNA is prevented via the small antitoxin RNA Sok. Upon plasmid loss, with no new transcription of sok, the highly stable hok mRNA is translated into a small protein, killing the plasmid-less cell. Homologues to the system were identified in the chromosome of E. coli in the 1990s, and subsequent analyses have identified identical systems in other bacterial chromosomes, though they are close relatives to E. coli. Given the increased number of bacterial genomes sequenced, the group examined how widespread Hok may be across bacteria. They used a combination of BLASTn, MMseqs2 (protein) and Infernal (RNA) to identify, as best possible, all possible homologs. They then used sequence identity cut-offs to form Hok "clusters," and identified key features of the clusters as well as tested the toxicity of overproduction of 31 homologs in a strain of Salmonella. Overall, through a variety of bioinformatic predictions and analyses, the manuscript identifies an expanded number of Hok members not previously identified and broadens the species it is found in, supports that Hok is not associated with defense systems, and provides additional support that horizontal transfer of hok genes is likely via plasmids (where hok is presumed to have originated).

      Major comments: There are some areas of the text that are a bit too definitive (these can be fixed or better explained in the text) and a few questions raised about the analyses and interpretations. These are the specific comments:

      Introduction

      First paragraph: "Toxin production leads to the death of the cell encoding it" For many chromosomally encoded systems, toxicity has only been observed via artificial overexpression. This is an important point, as for many systems, a true biological function remains unknown. Further, add caveats regarding toxin function (for systems with validated function, they are involved in...). Again, there are still many questions for many t-at systems, in particular the Type I systems. Introduction: type I's are more narrow in distribution, but much of this is due to their size and lack of biochemical domains. Again, please clarify more here.

      Introduction: while Hok's have been found on chromosomes, in E. coli strains, there is clear evidence that many are inactive. This comes up in the discussion, but it is worth including briefly in the introduction.

      For the predicted transmembrane domain: it would be worth to include a box/indication as to where that is within the peptide (with the understanding it may not be exact). Is there more/less variation here? I'm assuming all clusters/family have a predicted TM domain?

      What is the cutoff for being a Hok? Did they take the "last hit" and use that in additional searches to see if more appeared? If that was done, and the search was exhaustive, this is really important to add for the reader.

      Figure S4: the authors state that there was no difference in the degree of toxicity between the clusters. There do appear to be some peptides tested that at the arabinose concentration used did not repress growth as immediately as others. If higher arabinose concentration is used, does that eliminate these differences? OR are many of these suppressors-if diluted back again, do they grow as if they are non-toxic in arabinose?

      Discussion: "because non-functional homologs are expected to quickly accumulate mutations..." is a bit problematic. Hok is highly regulated, as are some of the other well-described type I toxins. In MG1655, while the coding sequence may be intact, there are other mutations and/or insertion elements that prevent expression (and by extension, function). Given the lack of consensus data for type Is, it is best to provide more context for this. If the authors wish to argue that they should quickly accumulate mutations, it would be good to provide additional rates/evidence (even for other loci) from the Enterobacteriaceae.

      Minor comments:

      For the sequences used in the search: please provide the sequence used in addition to the reference to the T1TAdb. Was the full-length hok mRNA, including mok, used? Please provide the nucleic acid sequence (and include description of whether full-length, etc.) in Materials and Methods or in Supplemental.

      60% identity was used for clustering. Did this become a problem-meaning separation of same property amino acid? Fig. S2: for the clusters shown, please add in HokB, HokE, etc., to better correspond to Figure 1 in the main text.

      Fig S1: this figure is challenging to orient-what are the numbers (8_10_85)?

      Please make a separate table or sheet for the experimentally tested peptides. Table S1 is quite large and a separate table/sheet would make this easier to find. If possible, please give the files names a more descriptive title (Table S1 in the name for example). This may be an issue with Review Commons but the individual file names were non-descript and the descriptions on the webpage did not indicate what the file contained.

      Figure S9: the black arrow for Hok is hard to see; it appears that the long grey bar going through multiple loci is indicative of Hok. Perhaps label this differently to make it easier on the reader (the line initially seemed to be a formatting issue and not indicative of the position of Hok).

      While the authors focused on Hok for this approach, which is fine and appropriate, can they comment at all about where mok is there in these new clusters/sub-families? Sok potential?

      Significance

      Overall the paper is a sound bioinformatic exercise and is improved with the testing of numerous "new" Hok proteins. Most of the comments can be done with some clarifications and maybe some additional analyses and/or verification which should take minimal time. The authors are over-emphatic at points as indicated and need to be more careful and precise with their language.

      In terms of advancement, it advances the distribution of these systems and adds to the depth of sub-classes.

      The audience will be more specialized to those who study these systems.

      Expertise: I have been studying type I toxin-antitoxin systems since the mid-2000s. We published a study examining (and mentioned well by this article!) the distribution in chromosomes of type I toxin-antitoxin systems, identified brand-new systems (that were chromosomally-limited at the time). My lab has continued to study regulation of type I toxins and distribution of chromosomally-only-encoded systems (so not Hok).

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      The authors examined how the Hok toxins are spread across bacterial genomes. The manuscript, including its figures, is hard to read and understand. I commented on Figure 1 in detail, but similar comments apply to the other figures. Overall, the data lack clarity and precision. Finding information about sequences and clusters in the supplementary materials was not easy. The manuscript should be thoroughly revised. In addition, I believe that other aspects should be developed to expand the interest of the study, such as the co-occurrence of multiple systems in chromosomes and on plasmids, and whether they are able to crosstalk. This might provide some evolutionary insights into the biology of these toxins.

      Introduction:

      The introduction lacks information regarding the Hok protein (size, structure prediction, localization) as well as a bit of explanation about the reason of looking at these toxins. The description of the potential roles should be a bit expanded.

      Results:

      When the authors talk about 'loci', they mean genes encoding Hok homologs if I understand correctly. They did not look for the Sok sequences (hok-sok loci).

      It is not clear what the authors did with the sequences for which they could not detect a start codon and a SD (although it is unusual to refer to SD in the context of protein sequence)

      Figure 1A is not clear. The total of the bars equal 32,532 which is the number of 'loci' detected by the combination of the different methods. However, it is not clear to me how many are redundant. For instance, I suppose that all the 8483 sequences that were retrieved using blastn and Infernal were retrieved using MMseqs2, blastn and Infernal. So, what is the actual number of sequences that were found? When the authors talk about 1264 distinct peptides, what do they mean? What are the numbers on the X axis (18209, 2260, 27728)?

      About Infernal: first the authors state that only 8% of the sequences are lost when not considering the mRNA structure, which they seem to consider negligible. Then, in the next section, they state that Infernal is the best tool for identifying clusters that are not detected otherwise. This seems a bit contradictory.

      Cluster determination. The threshold was put at 60% identity. What is the rationale for the 60% identity? Given that the Hok sequences (like toxins and antitoxins from TA systems in general) are highly variable, this leads to a high number of clusters. I'm not sure of the relevance of these clusters. Are there any other criteria to define clusters?

      The authors claim that most of the Hok diversity is found on chromosomes. However, the number of chromosomal Hok is higher than that located on plasmids, which might be related to the different sizes of the different replicons ie, chromosomes being larger than plasmids. Is there a way to normalize by determining the density per size?

      '46 of the 62 clusters contained 10 or less distinct sequences and might be in the process of degenerating'. The authors also linked this with SD detection. Please explain. From what was indicated earlier, I understand that sequences with premature stop codons or short sequences (<40aa) were removed from the analysis earlier. Lacking an SD is a sign of decay? Were these sequences lacking SD not discarded before starting the analysis? Did the authors experimentally validate some of these sequences?

      'Only 7.3% of the unique sequences were found on both plasmids and chromosomes'. From this observation, the authors conclude that 'there is little stable transfer from chromosomes to plasmids or vice-versa'. I don't understand what this means. Do they mean identical sequences? The fact that sequences differ from chromosomes to plasmids does not rule out 'stable transfer'. What do they actually mean by stable transfer? Once the gene is horizontally transferred, it is fixed and vertically transmitted? Same comments apply to the inter-genera horizontal transfer by plasmids.

      I don't understand the next section about 'family'. What do the authors mean by 'family'? Genera? The same applies to the next section about the Y to C recoding. Did the authors do point mutations in the conserved amino acids/codons to test whether they are important for toxicity? Some Hok variants lack some of the conserved amino acids and are toxic (under overexpression conditions in Salmonella). What about T18, C31 and E42?

      The prevalence of Hok in chromosomes or on plasmids might depend on various confounding parameters, such as the size, number of sequences available among others. The authors should find methods to correct for all that.

      Link with defense systems. The threshold was set at 20 kb. Why this threshold?

      Significance

      The paper in its current state appears to serve the role of a data repository rather than a thorough and original analysis. It requires extensive revisions before it can be of interest to experts in the toxin-antitoxin field.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Here, Escalera-Maurer and colleagues present an up-to-date distribution of homologues of Hok toxic proteins belonging to the well-annotated, but otherwise functionally obscure, hok/Sok type I toxin-antitoxin system, across the RefSeq database. Although such computational analyses have been done in the past, the authors here find many more hok homologs than described before, and they categorise their distribution based on whether they are encoded on chromosomes, plasmids, or (pro)phages. These computational analyses are in general tricky with T1TAs, as their toxins are quite short (~50 amino acids, as is the case for Hok), which is why the authors here used three separate approaches to expand their search (nucleotide-level BLAST, protein-homology, or both combined with Infernal). The authors cluster the Hok homologues they find based on a 60% sequence identity cut-off (expanding the known clusters in the process), and proceed to test 31 candidates belonging to 15 sequence-clusters for their toxicity in Salmonella Typhimurium LT2, showing that 30/31 were toxic upon induction. An interesting finding from their endeavours is that hok/Sok homologues are enriched within prophages and large plasmids, but are not enriched near bacterial anti-phage defense systems (in contrast to the SymE/SymR T1TA). The findings suggest that hok/Sok are indeed sometimes linked to phage and plasmid biology, although they might not be antiphage defenses per se (they have been clearly shown in the past to be addiction modules, and this is still clearly true).

      My expertise lies towards the experimental side of the authors' work, I thus cannot comment on the accuracy/robustness of the computational analyses performed here. The authors do a fine job in clearly stating their findings overall; I could follow most of the conclusions, and I deemed that most of them were supported by their work. Additionally, I find that this paper is a missed opportunity to uncover even more novel biology connected to the interesting hok/Sok T1TAs. The paper does not provide a new framework to think about what is the function of the chromosomal/prophage hok/Sok T1TA systems, although I realize that this is very difficult to accomplish, especially when considering that hok/Sok systems have been around in the literature for almost 40 years.

      My major comment is in regard to the Hok toxicity assays (Fig. 2). The authors state in the discussion that "Hok peptides originating from chromosomes are as toxic as those from plasmids", but I believe that the way that they tested their constructs might not have allowed them to see toxicity differences between the two groups. Specifically, using the multi-copy plasmid pAZ3 (pBR322 origin of replication; ~15-20 plasmid copies per chromosome) to induce the different Hok toxin homologues in Salmonella Typhimurium LT2 with arabinose might have masked toxicity differences that would otherwise be apparent on the chromosomal expression-level.

      Some of the authors themselves have previously used the FASTBAC-Seq method to study the Hok homologue from plasmid R1, a useful technique during which a toxin is integrated in the chromosome, in order to study their toxicity under natural levels of expression. I believe that an ideal scenario would be to apply FASTBAC-seq to some of the 31 Hok homologues described here (e.g., a subset of plasmidic vs chromosomal Hok homologues) to shed light on potential toxicity differences between the Hok clusters. This would increase the value of the presented study.

      Alternatively, the authors could employ an L-arabinose concentration gradient to titrate the expression levels of the Hok toxins in order to potentially see different toxicity levels from the different homologues. However, this is not going to work in the system as they are using it now for two reasons:

      a) the S. Typhimurium LT2 (STm) used here has its arabinose utilization operon intact (araBAD), which means that Salmonella can catabolize arabinose to use it as a carbon source. This catabolization process interferes with the arabinose induction (i.e., Salmonella eats arabinose instead of using it as the Hok inducer). To ameliorate this, the authors could delete the araBAD operon in STm, rendering STm incapable of catabolizing arabinose, and repeat the experiments in that strain. Or use E. coli BW25113 as the expression host, which already has the araBAD operon deleted (it is not clear to me why the different Hok homologues would not be toxic in E. coli, as the different Hok homologues are widely diverse in sequence, as the authors found here).

      b) Even with the araBAD operon deleted, the arabinose induction would be bimodally on or off in the population, due to the bimodal expression of the arabinose transporter (AraE; see Khlebnikov et al., 2002). This would again not allow for titratable arabinose-inducible expression from different concentrations of arabinose. The solution for this would be to co-express a separate plasmid with araE, which would render every cell the same in regards to arabinose permeability, and thus the system would be titratable (as explained in Khlebnikov et al., 2002).

      Therefore, if the authors would be interested to go towards this route, they would have to first delete the araBAD from STm, then transform STm with an araE plasmid, and redo the experiments. In addition, I would propose to the authors to use the drop plate method (agar plate-based), which is more sensitive compared to the liquid assays employed here.

      Having said all that, I understand that all this experimental work would be strenuous and time-consuming, and although I would like to see it happen, this is not my paper. I would be content therefore if the authors toned down the claim that plasmidic vs chromosomal Hok homologues have the same toxicity, and discuss that chromosomal levels of toxicity are an important caveat that has not been explored here.

      Other comments:

      a) There is barely any discussion of the Sok component (RNA antitoxin) of the homologues; why is that? Could you please discuss Sok differences across the homologues, or at least explain why this is not discussed at all in the paper (e.g., in the discussion)?

      b) In the results section, the Hok clusters are referred to as 62 in number ("Because Hok sequences were too short and variable to construct a meaningful phylogenetic tree, we clustered the Hok sequences with a 60% identity threshold and obtained 62 clusters"), but then in the discussion section, the cluster number becomes 74 ("We highlighted the high sequence variability within Hok peptides by obtaining a total of 74 clusters with 60% identity (Fig. S7)."). Which one is the right number, and why is there a discrepancy?

      Significance

      The most well-clarified aspect of the paper presented here is the distribution of Hok homologues, with the novel aspect of the location in which the hok/Sok T1TAs reside (i.e., chromosome, plasmid, or phage). There is room for the molecular genetics part to be developed further, as I discussed earlier, however this study is the most up-to-date characterization of the diversity of Hok homologues, and will be of interest to the T1TA and the general toxin-antitoxin field.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the Reviewers for their positive assessment and recognition of the paper's achievements. The insightful comments will strengthen the data and manuscript.

      Referee #1

      Minor comments

      1. Fig 1B - add arrows showing mRNAs being translated or not (the latter mentioned in line 113 is not so easy to see).

      We have magnified the inset of the colocalisation in the right column and added arrows and arrowheads to distinguish bcd that is colocalised with translating SunTag from bcd that is not.

      2. Fig 2A - add a sentence explaining why 1,6HD, 2,5HD and NaCl disrupt P bodies.

      We have added the information on the use of 1,6HD, 2,5HD, and NaCl to disrupt P-bodies as below. Revised line 158: “To further show that bcd storage in P bodies is required for translational repression, we treated mature eggs with chemicals known to disrupt RNP granule integrity (31, 37, 69-72). Previous work has shown that the physical properties of P bodies in mature Drosophila oocytes can be shifted from an arrested to a more liquid-like state by addition of the aliphatic alcohol hexanediol (HD) (Sankaranarayanan et al., 2021, Ribbeck and Görlich, 2002; Kroschwald et al., 2017). While 1,6 HD has been widely used to probe the physical state of phase-separated condensates both in vivo and in vitro (Alberti et al., 2019; McSwiggen et al., 2019; Gao et al., 2022), in some cells it appears to have unwanted cellular consequences (Ulianov et al., 2021). These include potentially lethal cellular consequences that may indirectly affect the ability of condensates to form (Kroschwald et al., 2017) and wider cellular implications thought to alter the activity of kinases (Düster et al., 2021). While we did not observe any noticeable cellular issues in mature Drosophila oocytes with 1,6 HD, we also used 2,5 HD, known to be less problematic in most tissues (Ulianov et al., 2021) and the monovalent salt sodium chloride (NaCl), which changes electrostatic interactions (Sankaranarayanan et al., 2021).”

      Fig 4C - explain in the legend what the white lines drawn over the image represent. And why is there such an obvious distinction in the staining where suddenly the DAPI is much more evident (is the image from tile scans)?

      Figure 4C is a tile-scan image of an n.c.10 embryo, and the white lines divide the image into four quadrants. We used this image to quantify the extent of bcd (magenta) colocalisation with SunTag (green) in the anterior and posterior domains of the embryo, shown in the bar graph in panel C’. There is a formatting error in the image, which we will correct in the revised version. We will also describe the white lines in the legend. Finally, based on further reviewer comments, these data are moved to the supplementary information in the revised version.

      • Line 215 - 'We did not see any significant differences in the translation of bcd based on their position, however, there appears an enhanced translation of bcd localised basally to the nuclei (Figure S5).' Since the difference is not significant, I do not think the authors should conclude that translation is enhanced basally.

      We agree with the reviewer. In this preliminary revision we have changed this statement to: “We did not see any differences in the translation of bcd based on their position with respect to the nuclei position (Figure S5)” (revised line 238-239).

      Line 218: 'The interphase nuclei and their subsequent mitotic divisions appeared to displace bcd towards the apical surface (Figure S6B).' Greater explanation is needed in the legend to Fig S6B to support this statement as the data just seem to show a nuclear division - I would have thought an apical-basal view is needed to conclude this.

      We have rearranged this figure and now show clearly, in Figure S8, the apical-basal view of the blastoderm nuclei and the displacement of bcd from the surface of the blastoderm.

      New Figure S8: n.c.8 - pre-cortical migration; n.c.12,14 - post-cortical migration; mitosis stages of n.c.9-10. The cortical interphase nuclei at n.c. 12,14 displace bcd. The nuclear area (DAPI, cyan) does not show any bcd particles (magenta), as indicated by blue stars. The mitotic nuclei (yellow arrowheads, yellow stars) displace bcd along the plane of nuclear division (double-headed yellow arrows).

      Fig 5B - the authors compare Bcd protein distribution across developmental time. However, in the early time points cytoplasmic Bcd is measured (presumably as it does not appear nuclear until nc8 onwards) and compared to nuclear Bcd intensities from nc9 onwards. Is most/all of the Bcd protein nuclear localised from nc9 to validate the nuclear quantitation? Does the distribution look the same if total Bcd protein is measured per volume rather than just the nuclear signal? Are the authors assuming a constant fast rate of nuclear import?

      From n.c.8 onwards, the Bcd signal in interphase nuclei builds up, with the nuclear intensity becoming very high compared to cytoplasmic Bcd. However, we do see significant Bcd signal in the cytoplasm (i.e., above background). In earlier work, gradients of the nuclear Bcd and nuclear-import mutant Bcd overlapped closely (Figure 1B, Grimm et al., 2010). This essentially suggests the nuclear Bcd gradient reflects the corresponding gradient of cytoplasmic Bcd. Further, the nuclear import of Bcd occurs rapidly after photobleaching (Gregor et al., 2007). Based on these observations, and our own measurements, prior to n.c. 9, the cytoplasmic gradient is likely a good approximation of the overall shape, whereas post n.c. 9 the Bcd signal is largely nuclear localised. Further, the overall profile is not dependent on the nuclear volume.

      • Line 259 - 'We then asked if considering the spatiotemporal pattern of bcd translation' - the authors should clarify what new information was included in the model. Similarly in line 286, 'By including more realistic bcd mRNA translation' - what does this actually mean? In line 346, 'We see that the original SDD model .... was too simple.' It would be nice to compare the outputs from the original vs modified SDD models to support the statement that the original model was too simple.

      We will improve the linking of the results to the model. The important point is that when and where Bcd production occurs is represented more faithfully than in previous approximations. By including more realistic production domains, we can replicate the observed Bcd gradient within the SDD paradigm without resorting to more complex models.

      Fig S1A - clarify what the difference is between the 2 +HD panels shown.

      The two +HD panels at stage 14 indicate that upon the addition of HD, there are no particles in 70% of the embryos, and 30% show reduced particles. We will add this information to the figure legend.

      • Fig S2E - the graph axis label/legend says it is intensity/molecule. Since intensity/molecule is higher in the anterior for bcd RNAs, is this because there are clumps of mRNAs (in which case it's actually intensity/puncta)?

      The density of mRNA is very high in the anterior pole; there is a chance that more than one bcd particle is within the imaged puncta (due to optical resolution limitations). We will change the y-axis label from average intensity per molecule to average intensity per puncta.


      • Fig S4 - I think this line is included in error: '(B) The line plots of bcd spread on the Dorsal vs. Ventral surfaces.'

      Yes, we will correct this in the revision.

      • In B, D, E - is the plot depth from the dorsal surface? I would have preferred to see actual mRNA numbers rather than normalised mRNAs. In Fig S4D moderate, from 10um onwards there are virtually no mRNA counts based on the normalised value, but what is the actual number? The equivalent % translated data in Fig S4E look noisy so I wonder if this is due to there being a tiny mRNA number. The same is true for Figs S4D, E 10um+ in the low region.

      Beyond 10um from the dorsal surface, the number of bcdsun10 counts is very low. It becomes negligible at the moderate and low domains. We will attach the actual counts of mRNA in all these domains as a supplementary table in the revised version.

      General assessment Strengths are: 1) the data are of high quality; 2) the study advances the field by directly visualising Bcd mRNA translation during early Drosophila development; 3) the data showing re-localisation of bcd mRNAs to P bodies nc14 provides new mechanistic insight into its degradation; 4) a new SDD model for Bcd gradient formation is presented. Limitations of the study are: 1) there was already strong evidence (but no direct demonstration) that bcd mRNA translation was associated with release from P bodies at egg activation; 2) it is not totally clear to me how exactly the modified SDD model varies from the original one both in terms of parameters included and model output.

      This is the first direct demonstration of the translation of bcd mRNA released as a single mRNA from P bodies. Previously, we have shown that P body disruption releases single bcd mRNAs from the condensates (31). Here, we have captured the status of individual bcd translation events, from their release from P bodies at the end of oocyte maturation until the end of blastoderm formation.

      The underlying SDD model – that of localised production, diffusion, and degradation – is still the same (up to spatially varying diffusion). Yet the model as originally formulated did not fit all aspects of the data, especially with regards to the system dynamics. Here, we demonstrate that by including more accurate approximations of when and where Bcd is produced, we can explain the formation of the Bcd morphogen gradient without recourse to any further mechanism.
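
      For reference, the SDD paradigm referred to here can be written in its simplest one-dimensional form as below. The symbols are chosen here for illustration: $C$ is the Bcd concentration, $D$ a constant diffusion coefficient, $k$ the degradation rate and $s(x,t)$ the production term. The manuscript's own formulation, which allows spatially varying diffusion, may differ.

```latex
\begin{align}
\frac{\partial C}{\partial t} &= D\,\frac{\partial^{2} C}{\partial x^{2}} - k\,C + s(x,t), \\
C_{\mathrm{ss}}(x) &= C_{0}\,e^{-x/\lambda}, \qquad \lambda = \sqrt{D/k}.
\end{align}
```

      The second line is the textbook steady state for a point source at $x = 0$ on a domain much longer than $\lambda$. In this notation, the change described above amounts to replacing an idealised source with a production term $s(x,t)$ informed by when and where translation is observed.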


      Referee #2

      1. Line 114: The authors claim to have validated the SunTag using a fluorescent reporter, but do not show any data. Ref 60 is a general reference to the SunTag, and not the Bcd results in this paper. Perhaps place their data into a supplemental figure or movie?

      To validate our bcdSun32 line, we have composed a new Figure S1 that shows translating bcdSun32 (magenta) colocalising with ScFV-mSGFP2 (green). Yellow arrowheads in the zoom (right panel) point to translating bcdSun32 (magenta) and red arrowheads point to untranslated bcdSun32. In addition, we show the validation of bcdSun32 by anti-GCN4 staining in main Figure 1B.

      Further, we have dedicated supplementary Figure S3 (previously Figure S2) to the validation of our bcdSun10 construct. Briefly, bcdSun10 is inserted into the att40 site of chr.2. We did a rescue experiment, in which bcdSun10 rescued the lethality of the homozygous bcdE1 null mutant. We then performed a colocalisation experiment using smFISH, which demonstrated that almost all bcd in the anterior pole is of the bcdSun10 type. We targeted specific fluorescent FISH probes against the 10xSunTag sequence (magenta, Figure S2A) and the bcd coding sequence (magenta, Figure S2A). Upon colocalisation, we found that ~90% of the mRNAs are of the bcdSun10 type. The remaining 10% is likely attributable to noise (Figure S2B). We will make sure these points are clear in the revised manuscript.

      Line 128 and Fig. 1E: The claim that bcd becomes dispersed is difficult to verify by looking at the image. The language could also be more precise. What does it mean to lose tight association? Perhaps the authors could quantify the distribution, and summarize it by a length scale parameter? This is one of the main claims of the paper (cf. Line 23 of the abstract) but it is described vaguely and tersely here.

      We have changed the text from, “We also confirmed that bcd becomes dispersed, losing its tight association with the anterior cortex (Figure 1E) (31)” to, “We also confirmed that bcd is released from the anterior cortex at egg activation (Figure 1E) (31, 21).” (Revised line 131).

      The release of bcd mRNA at egg activation was first shown in 2008 (Ref 21, Figure 4, D-E) and again in 2021 (Ref 31, Figure 7 B and E). The main point of lines 127-128 is that “P bodies disassembled and bcd was no longer colocalised with P bodies”, and the novel aspect of line 23 is “translation observed”. The distribution of bcd mRNA after egg activation was not the point of this section. We have improved the writing in the revision to make this clearer.

      Line 146, Fig. 1G: This is a really important figure in the paper, but it is confusing because it seems the authors use the word "translation," when they mean "presence of Bcd protein." In other places in the paper, the authors give the impression that "bcd translation" means translation in progress (assayed by the colocalization of GCN4 and bcd mRNA). However, in Fig. 1G, the focus is only on GCN4. Detecting Bcd protein only at the anterior does not mean that translation happens only at the anterior (e.g., diffusion or spatially-restricted degradation could be in play).

      In Figure 1G, we have shown only the “translated” Bcd by staining with anti-GCN4. We have changed line 146 from, “Consistent with previous findings, we only observed bcd translation at the anterior of the activated egg and early embryo (Figure 1G-H) (3, 68)” to, “Consistent with previous findings, we only observed the presence of Bcd protein at the anterior of the activated egg and early embryo (Figure 1G-H) (3, 68).” (Revised line 151-153). We will use “translating bcd” or “bcd in translation” where we show colocalisation of bcd with BcdSun10 or BcdSun32 elsewhere in the manuscript.

      We did not mean to claim that translation occurred only in the anterior pole. We show that the abundance of bcd is very high in the anterior pole (in agreement with previous work) and that this is where the majority of observed translation events took place. Indeed, we have also shown that posteriorly localised mRNAs have the same BcdSun10 intensity per bcd puncta from the posterior pole (Figure 3B & 4C’ and Figure S2 E), but these are much fewer in number.

      It would also be helpful to show a plot with quantification of Bcd detection (or translation) on the y-axis and a continuous AP coordinate on the x-axis, instead of just two points (anterior and posterior poles, the latter of which is uninteresting because observing no Bcd at the posterior pole is expected).

      In Figure 1G,H, our aim was to test whether release from P bodies allowed for bcd mRNA to be translated. We used the presence of Bcd protein at the anterior domain of the oocytes to show this. The posterior pole was included as an internal control. To show the spatial distribution of bcd mRNA and its translation, we used early blastoderm (Figure 3, Figure S4).

      Another issue with Fig. 1G is that the A and P panels presumably have different brightness and contrast. If not, just from looking at the A and P panels, the conclusion would be that Bcd protein is diffuse (and abundant) in the posterior and concentrated into puncta in the anterior. The authors should either make the brightness and contrast consistent or state that the P panel had a much higher brightness than the A panel.

      We agree with this shortcoming. We have now added the following to the Figure 1 legend to clarify this observation. “G: Representative fixed 10 µm Z-stack images (from 10 samples) showing BcdSun32 protein (anti-GCN4) is only present at the anterior of an in vitro activated egg or early embryo 30 minutes post fertilization. BcdSun32 protein is not detected in these samples at the posterior pole (image contrast increased to highlight the lack of distinct particles at the posterior). BcdSun32 protein is also not detected at the anterior or posterior of a mature oocyte or an in vitro activated egg incubated with NS8953 (images have the contrast increased to highlight the lack of distinct particles). Scale bar: 20 µm; zoom 2 µm.” (Revised line 623).

      • Line 176: This section is very confusing, because at this point the authors already addressed the spatial localization of translation in Fig. 1G,H (see my above comment). However, here it seems the authors have switched the definition of translation back to "translation in progress." Therefore, the confusion here could be eliminated by addressing the above point.

      In the revised version, we will use “Bcd protein” where it is shown by anti-GCN4 staining. We will use “translating bcd” or “bcd in translation” where we show colocalisation of bcd with anti-GCN4 signal (BcdSun10 or BcdSun32). We will change this in the corresponding text.

      Line 185: The sentence here is seemingly contradictory: "most...within 100 microns" implies that at least some are beyond 100 microns, while the sentence ends with "[none]...more than 100 microns." The language could perhaps be altered to be less vague/contradictory.

      We will clarify this in the revised version. A few particles are visible beyond 100 um. In the lower panel of Figure 3B, the posterior domain shows a few particles. However, their number is negligible compared to the bcd counts within the first 100 um (Figure 3C). Nonetheless, the few bcd particles we observe do seem to be under translation (quantified in Figure 4C’ and Figure S2E).

      • Line 204: It would be really nice to have quantification of the translation events, such as curves of rate of translation as a function of a continuous AP coordinate, and a curve for each nc.

      In the revised version we will provide results quantifying the translation events across the anterior-posterior axis. This will clarify the presence of bcd and its translation in the posterior domain over time.

      Our colocalisation analysis is semi-automated. It includes an automated counting of the individual bcd particle counts and a manual judgement of the colocalised BcdSun10 protein (distinct spots, above noise) to bcd particles (Figure S3D). The bcd particle counts ran into thousands in each cyan square box (measuring 50um radius and ~ 20um deep from the dorsal surface). We selected three such boxes covering 150um (continuously) from the anterior pole across A-P axis and 20um deep of the flattened embryo mounts across D-V axis (Figure 3A-C, Figure S4). We have also scanned scarce particles in the posterior; however, bcd counts are very low compared to the anterior. Further, in Figure 4 we have repeated the same technique to measure translation of bcd particles in embryos at different nuclear cycles.
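
      For readers unfamiliar with this kind of measurement, a fully automated stand-in for the colocalisation step could look like the sketch below. This is not the semi-automated procedure described above: the nearest-neighbour criterion, the function name and the 0.3 um distance threshold are all assumptions made for illustration.

```python
import math

def fraction_colocalised(mrna_xyz, protein_xyz, max_dist_um=0.3):
    """Fraction of mRNA spots with at least one protein spot within max_dist_um.

    Both inputs are sequences of (x, y, z) coordinates in micrometres.
    The 0.3 um default is a hypothetical threshold, not a value from the study.
    """
    if not mrna_xyz:
        return float("nan")
    colocalised = sum(
        1
        for m in mrna_xyz
        if any(math.dist(m, p) <= max_dist_um for p in protein_xyz)
    )
    return colocalised / len(mrna_xyz)

# Toy coordinates: the first mRNA spot has a protein spot 0.1 um away, the second has none.
mrna = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
protein = [(0.1, 0.0, 0.0)]
fraction = fraction_colocalised(mrna, protein)
```

      A distance threshold of this kind would need to be justified against the optical resolution and checked against manual calls before replacing them.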

      We have also shown continuous intensity measurements of bcd particles with their respective BcdSun10 gradient in Figure 5 across the A-P axis at different nuclear cycles. Here, we know BcdSun10 intensity is not only from the “translating” bcd (colocalised BcdSun10 to bcd particles) but also from the translated BcdSun10 freely diffusing (non-colocalised BcdSun10 to bcd particles). As asked by the reviewer, in the revised version we will add bcd counts and their translation status from anterior to posterior axis for each of the nuclear cycles.

      In future work, we plan to generate MS2-tagged bcdSun10 to measure translation rates live across all nuclear cycles.

      Line 209 and Fig 4C: The authors use the terms "intensity of translation events" or "translation intensity" without clearly defining them. From the figure (specifically from the y-axis label), it looks like the authors are quantifying the intensity per molecule (which is not clearly the same thing as "translation intensity"), but it would be nice if that were stated explicitly.

      In the relevant result section, we have changed the results text to “the intensity of translation events” for explaining the results of Figure 4C’.

      • In addition, the authors again quantify only two points. This is a continuously frustrating part of the manuscript, which applies to nearly all figures where the authors looked only at two points in space. At a typical sample size of N = 3, it seems well within time constraints to image at multiple points along the AP axis.

      In addition to the quantification shown at the anterior and posterior locations of the embryo in Figures 3 and 4, the revised version will show the quantification of translation events across all locations from the anterior to the posterior. We will use three embryos for each nuclear cycle from n.c.1 to 14.

      • Furthermore, it sounds like the authors are saying the "translation intensity" is the same in anterior and the posterior, which is counterintuitive. The expectation is that translation would be undetectable at the posterior end, in part because bcd mRNA would not be present. (Note that this expectation is even acknowledged by the authors on Line 185, which I comment on above, and also on Line 197). There should also be very low levels of Bcd protein (possibly undetectable) at the posterior pole. As such, the authors should explain how they think their claim of the same "translation intensities" in the anterior vs posterior fits into the bigger picture of what we know about Bcd and what they have already stated in the manuscript. They should also explain how they observed enough molecules to quantify at the posterior end. The authors should also disclose how many points are in each box in the boxplot. For example, the sample size is N = 3 embryos. In just three embryos, how many bcd/GCN4 colocalizations did the authors observe at the posterior end of the embryo?

      In n.c.4 in Figure 3, we saw few bcd particles in the posterior. However, at n.c.10 in Figure 4C’, the number of posterior bcd particles is higher than at the early stages, and we have quantified them in Figure 4C’. We will clarify this in the revision using the new quantification of translation across the A-P axis that we are currently undertaking.

      Finally, we will also provide the bcd particle counts and their colocalisation with anti-GCN4 as a supplementary table.

      • Line 215: The sentence that starts on this line seems self-contradictory: I cannot tell whether or not there is a difference in translation based on position.

      We have not observed any difference in the translation of bcd particles depending on the position along the Z-axis. We will edit this in our revised version.

      • Line 229: Long-ranged is a relative term. From the graph, one could state there is some spatial extent to the mRNA gradient, so it is unclear what the authors mean when they say it is not "long-ranged." Could the mRNA gradient be quantified, such as with a spatial length scale? This would provide more information for readers to make their own conclusions about whether it is long-ranged.

      We have quantified the bcd mRNA gradient for each n.c. (Figure 5B-C): absolute bcd intensities are shown in Figure 5B (left panel) and normalised intensities in Figure 5C. The extent of the mRNA spread appears constant, with a half-length maximum of ~75 µm across all nuclear cycles. Our conclusion of a long-ranged Bcd gradient is based on comparing the half-length maximum measurements of bcd particles and BcdSun10 (Figure 5D).
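
One way to make the half-length maximum measurement concrete is sketched here in Python. The sketch assumes that the term refers to the distance from the anterior at which the intensity profile first falls to half of its peak value. The function name `half_max_length` and the synthetic profile are illustrative and are not taken from the manuscript.

```python
import numpy as np

def half_max_length(x_um, intensity):
    """Return the first position (µm) posterior to the peak where the profile falls to half maximum.

    Assumes x_um is sampled in increasing order along the A-P axis.
    """
    x_um = np.asarray(x_um, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    peak = int(np.argmax(intensity))
    half = intensity[peak] / 2.0
    # Walk posteriorly from the peak until the profile first drops to or below half maximum.
    for i in range(peak + 1, len(x_um)):
        if intensity[i] <= half:
            # Linear interpolation between the two samples that bracket the crossing.
            x0, x1 = x_um[i - 1], x_um[i]
            y0, y1 = intensity[i - 1], intensity[i]
            return x0 + (half - y0) * (x1 - x0) / (y1 - y0)
    return float("nan")  # the profile never falls to half maximum in the sampled range

# Synthetic example (hypothetical data): an exponential profile with a 100 µm decay length.
x = np.linspace(0.0, 500.0, 501)
profile = np.exp(-x / 100.0)
print(half_max_length(x, profile))  # close to 100 * ln(2), i.e. roughly 69.3 µm
```

Reporting this single number per nuclear cycle, for both bcd mRNA and BcdSun10, would let readers compare the two spatial extents directly.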

      • Line 230: When the authors claim the Bcd gradient is steeper earlier, a quantification of the spatial extent (exponential decay length scale) would be appropriate. Indeed, lambda as a function of time would be beneficial. It should also be placed in context of earlier papers that claim the spatial length scale is constant.

      We will show this effectively from the live movies of bcdSun10/nanos-scFv-sGFP2 in the revised version.
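
The exponential decay length scale requested by the reviewer could be estimated with a log-linear fit, sketched here in Python. The sketch assumes that the gradient is approximately C(x) = A exp(-x/λ) above a known background. The function name `fit_decay_length`, the background handling, and the synthetic data are illustrative assumptions, not the authors' analysis pipeline.

```python
import numpy as np

def fit_decay_length(x_um, intensity, background=0.0):
    """Estimate lambda (µm) from a straight-line fit of log(intensity - background) against x."""
    x_um = np.asarray(x_um, dtype=float)
    signal = np.asarray(intensity, dtype=float) - background
    keep = signal > 0  # the logarithm is only defined for positive signal
    slope, _intercept = np.polyfit(x_um[keep], np.log(signal[keep]), 1)
    return -1.0 / slope

# Synthetic example (hypothetical data): decay length 90 µm on top of a background of 2.
x_demo = np.linspace(0.0, 400.0, 81)
profile_demo = 5.0 * np.exp(-x_demo / 90.0) + 2.0
lambda_demo = fit_decay_length(x_demo, profile_demo, background=2.0)
```

Applying such a fit to the profile from each nuclear cycle would give λ as a function of time, which is the comparison with earlier constant-length-scale reports that the reviewer asks for.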

      • Lines 235-236: The two sentences that start on these two lines are vague and seemingly contradictory. The first sentence says there is a spatial shift, but the second sentence sounds like it is saying there is no spatial change. The language could be more precise to explain the conclusions.

      We agree with the reviewer. We will edit this in revision.

      Minor comments

        • Line 81: Probably meant "evolutionarily conserved"

      Yes, we have changed “P bodies are an evolutionarily cytoplasmic RNP granule” to “P bodies are an evolutionarily conserved cytoplasmic RNP granule” (revised lines 84-85).

      • Figure 1 legend: part B says "from 15 samples" but also says N = 20. Which is it, or do these numbers refer to different things?

      We have edited this from, “early embryo (from 15 samples)” to, “early embryo (from 20 samples)”. (Revised line 602).

      • Line 217: migration of what?

      Edited to “cortical nuclear migration”.

      • Line 228: "early embryo" is vague. The authors should give specific time points or nuclear cycle numbers.

      Edited to “nuclear cycles 1-8”.

      • Line 301: Other locations in the paper say 75 microns or 100 microns.

      We will make the changes. It is 100 µm.

      • Fig. 5: all images should be oriented such that the dorsal midline is on the upper half of the embryo/image.

      We will flip the image to match.

      • Fig. 5B: There are light tan and/or light orange curves (behind the bold curves) that are not explained.

      These curves show the standard deviation. This will be explained.

      • Fig. 5C: the plot says "normalized" but nowhere do the authors describe what the curves are normalized to. There is also no explanation for what the broad areas of light color correspond to.

      The curves are normalised to the bcd intensity maxima. This will be explained.

      Significance

      The results, if upheld, are highly significant, as they are foundational measurements addressing a longstanding question of how morphogen gradients are formed, using Bcd (the foundational morphogen gradient) as a model. They also address fundamental questions in genetics and molecular biology: namely, control of mRNA distribution and translation.

      We thank Reviewer 2 for highlighting the importance of our work in the field. We are confident that we address the issues raised by Reviewer 2 with the new set of quantifications we are currently working on.

      Referee #3

        • It is not evident from the main results and methods text that the new SDD model incorporates the phenomenon reported in figure 4B. From my reading, the parameter beta accounts for the Bcd translation rate, which according to figure 7B(ii) effectively switches from off to on around fertilization and thereafter remains constant. Figure 4B shows that the fraction of bcd mRNA engaged in translation decreases beginning around NC12/13, and this is one of the more powerful results that comes from monitoring translation in addition to RNA localization/abundance/stability. My expectation based on figure 4B would be that parameter beta should decrease over time beginning around 90-100 minutes and approach zero by ~150 minutes. This rate could be fit to the experimental data that yields figure 4B. The modeling should be repeated while including this information.

      This is a good observation. Currently, the reduced rate of bcd translation is modelled by incorporating an increased rate of bcd mRNA degradation. Of course, the same effect could also arise from a direct change in the rate of translation. As stated already, the beta parameter is the least well characterised. In the revision, we will include a model where beta changes but not the mRNA degradation rate. We will improve the discussion to make this point clearer.

      • The presentation of the SDD model should be expanded to address how well the characteristic decay length fits A) measured Bcd protein distributions, B) measured at different nuclear cycles. This would strengthen the claim that the new SDD model better captures gradient dynamics given the addition of translation and RNA distribution. These experimental data already exist as reported in Figure 5. In the current Figure 7, panels D and D' add little to the story and could be moved to a supplement if the authors want to include it (in any case, please fix the typo on the time axis of fig 7D' to read "hours"). The model per cell cycle and the comparison of experimental and modeled decay lengths could replace current D and D'.

      Originally, we kept discussion of the SDD model only to core points. It is clear from all Reviewers that expanding this discussion is important. In the revision, we will refocus Figure 7 on describing new results that we can learn. As outlined in the responses above, this paper reveals an important insight: the SDD model – with suitable modifications such as temporally restricted Bcd production – can explain all observed properties of Bcd gradient formation. Other mechanisms – such as bcd mRNA gradients – are not required.
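
For readers unfamiliar with the model, a minimal one-dimensional sketch of an SDD simulation with a time-dependent production rate is given here in Python. It assumes the standard form dC/dt = D d²C/dx² - μC + β(t)s(x), with production s(x) confined to an anterior region. All parameter values, the function names `simulate_sdd` and `ramp_down_beta`, and the ramp timing (taken loosely from Referee #3's suggestion of a decline from 90-100 minutes to ~150 minutes) are illustrative assumptions, not the values or code used in the manuscript.

```python
import numpy as np

def simulate_sdd(beta_of_t, D=4.0, mu=1.0e-3, length_um=500.0, dx=5.0,
                 source_width_um=50.0, t_end_s=9000.0):
    """Explicit finite-difference solution of dC/dt = D C_xx - mu C + beta(t) s(x).

    D is in µm^2/s and mu in 1/s; both defaults are placeholders, not fitted values.
    """
    n = int(length_um / dx) + 1
    x = np.linspace(0.0, length_um, n)
    source = (x <= source_width_um).astype(float)  # production restricted to the anterior
    dt = 0.2 * dx**2 / D  # below the explicit-scheme stability limit dx^2 / (2 D)
    c = np.zeros(n)
    t = 0.0
    while t < t_end_s:
        padded = np.pad(c, 1, mode="edge")  # copies edge values: approximate no-flux boundaries
        laplacian = (padded[2:] - 2.0 * c + padded[:-2]) / dx**2
        c = c + dt * (D * laplacian - mu * c + beta_of_t(t) * source)
        t += dt
    return x, c

def ramp_down_beta(t_s, start_s=95 * 60.0, end_s=150 * 60.0):
    """Illustrative production rate: 1 until start_s, then a linear fall to 0 at end_s."""
    return float(np.clip((end_s - t_s) / (end_s - start_s), 0.0, 1.0))

# Compare constant production with production that shuts down late in development.
x_sdd, c_const = simulate_sdd(lambda t: 1.0)
_, c_ramp = simulate_sdd(ramp_down_beta)
```

With constant production, the profile away from the source region relaxes towards an exponential with decay length sqrt(D/μ), so the simulated and measured decay lengths can be compared per nuclear cycle. Replacing the constant β with a declining β(t) lowers the late-stage concentration without any change to the mRNA degradation rate, which is the alternative the referee raises.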

      • The exposition of the manuscript would benefit significantly by including a section either in the introduction or the appropriate section of the results that defines the competing models for gradient formation. In the current version, these models are only cited, and the key details only come out late (e.g., lines 302 onward, in the Discussion). Nevertheless, some of the results are presented as if in dialog with these models, but it reads as a one-sided conversation. For instance: Figure 3. The undercurrent in this figure is the RNA-gradient model. In the context of this model, the results clearly show that translation of bcd is restricted to the anterior. Without this context, Figure 3 could read as a fairly unremarkable observation that translation occurs wherever there is mRNA. Restructuring the manuscript to explicitly name competing models and to address how experimental results support or detract from each competing model would greatly enhance the impact of the exposition.

      We thank the reviewer for this suggestion. We will add the current models of Bcd gradient formation in the introduction section and will change the narrative of results in the section explaining the models.

      (4A) Related to point 3: The entire results text surrounding Figure 2 should be revised to include more detail about A) what specific hypotheses are being tested; and B) to critically evaluate the limitations of the experimental approaches used to evaluate these hypotheses. Hexanediol and high salt conditions are not named explicitly in the text, but the text touts these as "chemicals" that "disrupt P-body integrity." This implies that the treatments are specific to P-bodies. Neither of these approaches are only disrupting P Body integrity. This does not invalidate this approach, but the manuscript needs to state what hypothesis HD and NaCl treatment addresses, and acknowledge the caveats of the approach (such as the non-specificity and the assumptions about the mechanism of action for HD).

      We have made the following edits to resolve this point. Revised line 158: “To further show that bcd storage in P bodies is required for translational repression, we treated mature eggs with chemicals known to disrupt RNP granule integrity (31, 37, 69-72). Previous work has shown that the physical properties of P bodies in mature Drosophila oocytes can be shifted from an arrested to a more liquid-like state by addition of the aliphatic alcohol hexanediol (HD) (Sankaranarayanan et al., 2021, Ribbeck and Görlich, 2002; Kroschwald et al., 2017). While 1,6 HD has been widely used to probe the physical state of phase-separated condensates both in vivo and in vitro (Alberti et al., 2019; McSwiggen et al., 2019; Gao et al., 2022), in some cells it appears to have unwanted cellular consequences (Ulianov et al., 2021). These include potentially lethal cellular consequences that may indirectly affect the ability of condensates to form (Kroschwald et al., 2017) and wider cellular implications thought to alter the activity of kinases (Düster et al., 2021). While we did not observe any noticeable cellular issues in mature Drosophila oocytes with 1,6 HD, we also used 2,5 HD, known to be less problematic in most tissues (Ulianov et al., 2021) and the monovalent salt sodium chloride (NaCl), which changes electrostatic interactions (Sankaranarayanan et al., 2021).”

      (4B) Continuing the comment above: it is good that the authors checked that HD and NaCl treatment does not cause egg activation. But no one outside of the field of Drosophila egg activation knows what the 2-minute bleach test is and shouldn't have to delve into the literature to understand this sentence. Please explain in one sentence that "if eggs are activated, then x happens following a short exposure to bleach (citations). We exposed HD and NaCl treated eggs to bleach and observed... ."

      We have made the following edits to resolve this point. Revised line 174: “After treating mature eggs with these solutions, we observed BcdSun32 protein in the oocyte anterior (Figure 2A-B). One caveat to this experiment could be that treating mature eggs with these chemicals results in egg activation which would in turn generate Bcd protein. To eliminate this possibility, we first screened for phenotypic egg activation markers, including swelling and a change in the chorion (73). We also applied the classic approach of bleaching eggs for two minutes which causes lysis of unactivated eggs (74). All chemically treated eggs failed this bleaching test meaning they were not activated (74). While we are unable to rule out non-specific actions of these treatments, these experiments corroborate that storage in P bodies that adopt an arrested physical state is crucial to maintain bcd translational repression (31).”

      (4C) Continuing the comment above: The section of the results related to the endos mutation needs additional information. It is not apparent to the average reader how the endos mutation results in changes in RNP granules, nor what the expected outcome of such an effect would "further test the model" set up by the HD and NaCl experiments. The average reader needs more hand-holding throughout this entire section (related to figure 2) to follow the exposition of the results.

      We have made the following edits to resolve this point. Edited line 185: “Finally, we used a genetic manipulation to change the physical state of P bodies in mature oocytes. Mutations in Drosophila Endosulfine (Endos), which is part of the conserved phosphoprotein α-endosulfine (ENSA) family (75), caused a liquid-like P body state after oocyte maturation, similar to that observed with chemical treatment (Figure 2C) (31). This temporal effect matched the known roles of Endos as the master regulator of oocyte maturation (75, 76). endos mutant oocytes lost the colocalisation of bcd mRNA and P bodies, concurrent with P bodies becoming less viscous during oocyte maturation (Figure 2D, Figure S1). Particle size and position analysis showed that bcd mRNA prematurely exhibits an embryo distribution in these mutants (Figure 2E). Due to genetic and antibody constraints, we are unable to test for translation of bcd in the endos mutant. However, it follows that bcd observed in this diffuse distribution outside of P bodies would be translationally active (Figure 2E-F).”

      • (4D) Continuing the comment above: The average reader also needs a better explanation of what hypothesis is being tested in Figure 1 with the pharmacological inhibition of calcium.

      We have made the following edits to resolve this point. Revised line 138: “We next sought to maintain the relationship between bcd mRNA and P bodies through egg activation. This would act as a control to further test if colocalisation of bcd to P bodies was necessary for its translational repression. Previous work has shown that a calcium wave is required at egg activation for further development (references to add Kaneuchi et al., 2015; York-Anderson et al., 2019; Hu and Wolfner, 2019). Chemical treatment with NS8593 disrupts this calcium wave, while other phenotypic markers of egg activation are still observed (58). Using NS8593 to disrupt the calcium wave in the activated egg, we show P bodies are retained during ex vivo egg activation (Figure 1E). In these treated eggs, bcd mRNA remains colocalised with the retained P bodies (Figure 1F). Based on these results and previous observations (31, 66), we hypothesised that the loss of colocalisation between bcd and P bodies correlates with bcd translation.”

      • It is unclear why Bcd translation could not be measured in the endos mutant background, but it would be necessary to measure Bcd translation in the endos background. If genotypically it is not possible/inconvenient to invoke the suntag reporter in the endos background, would it not be sufficient to immunostain against Bcd itself? Different Bcd antisera have recently been reported and distributed by the Wieschaus and the Zeitlinger groups.

      We have recently received the Bcd antibody from the Zeitlinger group. This has not been shown to work for immunostaining. It remains unclear if it will be successful in this capacity, but we are currently testing it and will include this experiment in the revision if successful.

      • Figure 4 overall is glorious, but there is a problem with panel C. What are the white lines? Why does the intensity for the green and magenta channel change abruptly in the middle of the embryo?

      These white lines divide the embryo into 4 compartments. We used this method to quantify the intensity of Bcd translation with respect to the bcd puncta. We will correct this image as there is a problem in formatting.

      • It is noted that neither the methods section nor the supplement contains any mention of how the modeling was performed. How was parameter beta fit? At least a brief section should be added to the methods describing how beta was fit (pending adjustments suggested in comment 1 above). A platinum-level addition would include a modeling supplement that reports the sensitivity of model outcomes to changes in parameters.

      We apologise for this omission and will include full methodological details in the revision.

      Minor Comments:

        • Line 28: "Source-Diffusion-Degradation" should be changed to "Synthesis-..."

      We will edit in the revised version.

      • Line 39: "blastocyst" should be "blastoderm stage embryo".

      We will edit in the revised version.

      • Line 81: "P bodies are an evolutionarily cytoplasmic RNP granule." is "conserved" missing here?

      We will edit in the revised version.

      • Throughout the manuscript, there should be better reporting of the imaged genotypes and whether the suntag is being visualized by indirect immunostaining of fixed tissues or through an encoded nanobody-GFP fusion.

      We will explain in detail in the revised version.

      • Figure 1G: Why is the background staining so different across conditions? Is this a normalization artifact?

      We agree with this shortcoming. We have now added the following to the figure legend to clarify this observation. “G: Representative fixed 10 µm Z-stack images (from 10 samples) showing BcdSun32 protein (anti-GCN4) is only present at the anterior of an in vitro activated egg or early embryo 30 minutes post fertilization. BcdSun32 protein is not detected in these samples at the posterior pole (image contrast increased to highlight the lack of distinct particles at the posterior). BcdSun32 protein is also not detected at the anterior or posterior of a mature oocyte or an in vitro activated egg incubated with NS8593 (images have the contrast increased to highlight the lack of distinct particles). Scale bar: 20 µm; zoom 2 µm.” (Revised line 623).

      • Figure 2 legend: what is +Sch in the x-axis labels of figure 2B? The legend says that 2B is the quantification of the data in 2A, but there is no (presumed control) +Sch image in 2A.

      Thank you for this suggestion. We have added the data to Figure 2A.

      • Figure 5A largely repeats information presented in figure 4A. Please consider moving to a supplement. Also, please re-orient embryos to follow the convention that dorsal-most surfaces be presented on the top of the displayed images.

      Thank you for this suggestion. We will consider moving Figure 5A to the supplement.

      • The lower-case roman numerals referred to in the text for figure 7B are not included in the corresponding figure panel.

      We will edit in the revised version.

      • Figure 7C y-axis typo (concentration).

      We will edit in the revised version.

      • Line 222: "make a long-range functional gradient": more accurate to say, "but also marks mature, Bcd protein which resolves in the expected long-range gradient."

      We will edit in the revised version.

      • Methods: Please check that all buffers referred to as acronyms are both compositionally defined in the reagents table, and that full names are written out at the time of first mention in the presented order. For instance, Schneider's media is referred to a few times before defining the acronym about midway through the methods section.

      We have added the following to the Figure 2B legend: “Quantification of experiments shown in A. The number of oocytes that displayed Bcd protein at the anterior as measured by the presence of BcdSun32 at the anterior of the oocyte, but not the posterior. Schneider’s Insect Medium (+Sch) used as a negative control. N = 30 oocytes for each treatment. Scale bar: 5 µm.” (Revised line 646).

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      This is a review of "Dynamics of bicoid mRNA localization and translation dictate morphogen gradient formation" by Athilingam et al. In this manuscript, the authors perform quantification of mRNA localization and translation of bicoid, spanning oogenesis through the maternal to zygotic transition, yielding a definitive characterization of Bicoid gradient formation. The experiments, analysis, and interpretation are on the whole performed rigorously. I very much enjoyed this paper, partly for incorporating the aspects of bcd regulation during oogenesis, which compared to embryonic function of bcd is relatively under-studied. Also valuable is improving the characterization of how bcd expression is shut down at NC14. I have several major comments for revision, and a few minor comments. I should stress that none of the major comments are terrible but are intended to improve the impact/readability/flow of this nice manuscript. With the exception of a straightforward immunostaining experiment, all major comments constitute reworking of the model or the text.

      Major Comments:

      1) It is not evident from the main results and methods text that the new SDD model incorporates the phenomenon reported in figure 4B. From my reading, the parameter beta accounts for the Bcd translation rate, which according to figure 7B(ii) effectively switches from off to on around fertilization and thereafter remains constant. Figure 4B shows that the fraction of bcd mRNA engaged in translation decreases beginning around NC12/13, and this is one of the more powerful results that comes from monitoring translation in addition to RNA localization/abundance/stability. My expectation based on figure 4B would be that parameter beta should decrease over time beginning around 90-100 minutes and approach zero by ~150 minutes. This rate could be fit to the experimental data that yields figure 4B. The modeling should be repeated while including this information.

      2) The presentation of the SDD model should be expanded to address how well the characteristic decay length fits A) measured Bcd protein distributions, B) measured at different nuclear cycles. This would strengthen the claim that the new SDD model better captures gradient dynamics given the addition of translation and RNA distribution. These experimental data already exist as reported in Figure 5. In the current Figure 7, panels D and D' add little to the story and could be moved to a supplement if the authors want to include it (in any case, please fix the typo on the time axis of fig 7D' to read "hours"). The model per cell cycle and the comparison of experimental and modeled decay lengths could replace current D and D'.

      3) The exposition of the manuscript would benefit significantly by including a section either in the introduction or the appropriate section of the results that defines the competing models for gradient formation. In the current version, these models are only cited, and the key details only come out late (e.g., lines 302 onward, in the Discussion). Nevertheless, some of the results are presented as if in dialog with these models, but it reads as a one-sided conversation. For instance: Figure 3. The undercurrent in this figure is the RNA-gradient model. In the context of this model, the results clearly show that translation of bcd is restricted to the anterior. Without this context, Figure 3 could read as a fairly unremarkable observation that translation occurs wherever there is mRNA. Restructuring the manuscript to explicitly name competing models and to address how experimental results support or detract from each competing model would greatly enhance the impact of the exposition.

      4A) Related to point 3: The entire results text surrounding Figure 2 should be revised to include more detail about A) what specific hypotheses are being tested; and B) to critically evaluate the limitations of the experimental approaches used to evaluate these hypotheses. Hexanediol and high salt conditions are not named explicitly in the text, but the text touts these as "chemicals" that "disrupt P-body integrity." This implies that the treatments are specific to P-bodies. Neither of these approaches are only disrupting P Body integrity. This does not invalidate this approach, but the manuscript needs to state what hypothesis HD and NaCl treatment addresses, and acknowledge the caveats of the approach (such as the non-specificity and the assumptions about the mechanism of action for HD).

      4B) Continuing the comment above: it is good that the authors checked that HD and NaCl treatment does not cause egg activation. But no one outside of the field of Drosophila egg activation knows what the 2-minute bleach test is and shouldn't have to delve into the literature to understand this sentence. Please explain in one sentence that "if eggs are activated, then x happens following a short exposure to bleach (citations). We exposed HD and NaCl treated eggs to bleach and observed... ."

      4C) Continuing the comment above: The section of the results related to the endos mutation needs additional information. It is not apparent to the average reader how the endos mutation results in changes in RNP granules, nor what the expected outcome of such an effect would "further test the model" set up by the HD and NaCl experiments. The average reader needs more hand-holding throughout this entire section (related to figure 2) to follow the exposition of the results.

      4D) Continuing the comment above: The average reader also needs a better explanation of what hypothesis is being tested in Figure 1 with the pharmacological inhibition of calcium.

      5) It is unclear why Bcd translation could not be measured in the endos mutant background, but it would be necessary to measure Bcd translation in the endos background. If genotypically it is not possible/inconvenient to invoke the suntag reporter in the endos background, would it not be sufficient to immunostain against Bcd itself? Different Bcd antisera have recently been reported and distributed by the Wieschaus and the Zeitlinger groups.

      6) Figure 4 overall is glorious, but there is a problem with panel C. What are the white lines? Why does the intensity for the green and magenta channel change abruptly in the middle of the embryo?

      7) It is noted that neither the methods section nor the supplement contains any mention of how the modeling was performed. How was parameter beta fit? At least a brief section should be added to the methods describing how beta was fit (pending adjustments suggested in comment 1 above). A platinum-level addition would include a modeling supplement that reports the sensitivity of model outcomes to changes in parameters.

      Minor Comments:

      • Line 28: "Source-Diffusion-Degradation" should be changed to "Synthesis-..."
      • Line 39: "blastocyst" should be "blastoderm stage embryo".
      • Line 81: "P bodies are an evolutionarily cytoplasmic RNP granule." is "conserved" missing here?
      • Throughout the manuscript, there should be better reporting of the imaged genotypes and whether the suntag is being visualized by indirect immunostaining of fixed tissues or through an encoded nanobody-GFP fusion.
      • Figure 1G: Why is the background staining so different across conditions? Is this a normalization artifact?
      • Figure 2 legend: what is +Sch in the x-axis labels of figure 2B? The legend says that 2B is the quantification of the data in 2A, but there is no (presumed control) +Sch image in 2A.
      • Figure 5A largely repeats information presented in figure 4A. Please consider moving to a supplement. Also, please re-orient embryos to follow the convention that dorsal-most surfaces be presented on the top of the displayed images.
      • The lower-case roman numerals referred to in the text for figure 7B are not included in the corresponding figure panel.
      • Figure 7C y-axis typo (concentration).
      • Line 222: "make a long-range functional gradient": more accurate to say, "but also marks mature, Bcd protein which resolves in the expected long-range gradient."
      • Methods: Please check that all buffers referred to as acronyms are both compositionally defined in the reagents table, and that full names are written out at the time of first mention in the presented order. For instance, Schneider's media is referred to a few times before defining the acronym about midway through the methods section.

      Referees cross-commenting

      OK, we've been asked to comment on each others' reviews. I am reviewer 3. We have not been asked, as far as I can tell, to come up with a consensus review.

      Overall, I feel that we are all generally enthusiastic about this manuscript. From most to least enthusiastic, we have reviewer 1, 3, and finally 2. But all three of us are apparently advocating positively and encouraging revision and clarification because, as we all agree, these results are important to publish.

      Consensus Strengths:

      1. The experimental approach is elegant, rigorous, and innovative, especially the real-time visualization of Bcd translation.
      2. The data provide new mechanistic insight into when and where bcd is translated and how this changes over developmental time.
      3. The relocalization of bcd mRNAs to P bodies during nc14 and the implications for RNA degradation are particularly compelling.
      4. The manuscript establishes a path toward refining reaction-diffusion models of morphogen gradients using direct measurements of translation dynamics.

      I agree with all of Reviewer 1's minor points.

      I agree with Reviewer 2's points about:

      • Showing the SunTag validation data using the fluorescent reporter.
      • Clarifying the noted "translation" vs. "protein" issues. This bothered me too, but I wasn't able to articulate the issue as well as done here. This major issue summarizes several of the Reviewer's comments.
      • Generally tightening the precision with which the results are discussed.

      Overall: we have all provided favorable reviews that require mostly tightening of the text, showing some control datasets, maybe quantifying more points across the AP axis, and presenting the SDD model more comprehensively (comparing with old/translation-agnostic model, reporting characteristic decay lengths at different nuclear cycles, incorporating the reported change in translation rate across nuclear cycles (if this survives the clarification of what 'translation' means per Reviewer 2's comments), and perhaps providing more methodological detail on how parameters were fit).

      Significance

      The importance of this study is at several levels. For the developmental biologist, it addresses important mechanisms of translational control and RNA stability over the functional lifetime of a single, critical biological cue that governs embryonic patterning. Not only do the experiments provide quantification of these features, but also point to likely candidates (P-bodies) for gating bcd's translation in the narrow window between egg activation and cellular blastoderm. For the biophysically-inclined, this adds critical quantitative information of translational state that allows for further refining computational models for how this manifestation of a reaction-diffusion system actually comes together in a complex biological context.

      The primary audience for this work will be the two groups above: developmental biologists and scientists interested in the quantitative modeling of biological phenomena.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      In this manuscript by Athilingam et al., the authors study the translation of the morphogen Bicoid (Bcd), which is involved in anterior-posterior patterning of the blastoderm Drosophila embryo. They have used an array of SunTag elements in the 5' UTR of Bcd to detect the localization of translation. They found that, not only is Bcd not translated until egg activation, but it can only be translated at the anterior pole, even though bcd mRNA has a broader spatial distribution.

      In general, the paper uses a cutting-edge methodology to address one of the foundational questions of the best-studied morphogen gradient: namely, what is the spatial distribution of the Bcd source? Together with the dynamics of its spreading (which they addressed in a separate study in 2024) and Bcd degradation, their results point to a modified form of the synthesis/diffusion/degradation (SDD) model of Bcd gradient formation, which they have analyzed in the final subsection of the results. However, there are several major issues that erode the validity and impact of the paper, most of which can be put into the category of vague explanations, missing information, or contradictory statements, making it hard to understand/verify what conclusions can be drawn. This is also coupled with vague figures and captions. We describe these, and a few minor issues, in detail below:

      • Line 114: The authors claim to have validated the SunTag using a fluorescent reporter, but do not show any data. Ref 60 is a general reference to the SunTag, and not the Bcd results in this paper. Perhaps place their data into a supplemental figure or movie?
      • Line 128 and Fig. 1E: The claim that bcd becomes dispersed is difficult to verify by looking at the image. The language could also be more precise. What does it mean to lose tight association? Perhaps the authors could quantify the distribution, and summarize it by a length scale parameter? This is one of the main claims of the paper (cf. Line 23 of the abstract) but it is described vaguely and tersely here.
      • Line 146, Fig. 1G: This is a really important figure in the paper, but it is confusing because it seems the authors use the word "translation," when they mean "presence of Bcd protein." In other places in the paper, the authors give the impression that "bcd translation" means translation in progress (assayed by the colocalization of GCN4 and bcd mRNA). However, in Fig. 1G, the focus is only on GCN4. Detecting Bcd protein only at the anterior does not mean that translation happens only at the anterior (e.g., diffusion or spatially-restricted degradation could be in play).

      It would also be helpful to show a plot with quantification of Bcd detection (or translation) on the y-axis and a continuous AP coordinate on the x-axis, instead of just two points (anterior and posterior poles, the latter of which is uninteresting because observing no Bcd at the posterior pole is expected).

      Another issue with Fig. 1G is that the A and P panels presumably have different brightness and contrast. If not, just from looking at the A and P panels, the conclusion would be that Bcd protein is diffuse (and abundant) in the posterior and concentrated into puncta in the anterior. The authors should either make the brightness and contrast consistent or state that the P panel had a much higher brightness than the A panel.

      • Line 176: This section is very confusing, because at this point the authors already addressed the spatial localization of translation in Fig. 1G,H (see my above comment). However, here it seems the authors have switched the definition of translation back to "translation in progress." Therefore, the confusion here could be eliminated by addressing the above point.
      • Line 185: The sentence here is seemingly contradictory: "most...within 100 microns" implies that at least some are beyond 100 microns, while the sentence ends with "[none]...more than 100 microns." The language could perhaps be altered to be less vague/contradictory.
      • Line 204: It would be really nice to have quantification of the translation events, such as curves of rate of translation as a function of a continuous AP coordinate, and a curve for each nc.
      • Line 209 and Fig 4C: The authors use the terms "intensity of translation events" or "translation intensity" without clearly defining them. From the figure (specifically from the y-axis label), it looks like the authors are quantifying the intensity per molecule (which is not clearly the same thing as "translation intensity"), but it would be nice if that were stated explicitly.

      In addition, the authors again quantify only two points. This is a continuously frustrating part of the manuscript, which applies to nearly all figures where the authors looked only at two points in space. At a typical sample size of N = 3, it seems well within time constraints to image at multiple points along the AP axis.

      Furthermore, it sounds like the authors are saying the "translation intensity" is the same in anterior and the posterior, which is counterintuitive. The expectation is that translation would be undetectable at the posterior end, in part because bcd mRNA would not be present. (Note that this expectation is even acknowledged by the authors on Line 185, which I comment on above, and also on Line 197). There should also be very low levels of Bcd protein (possibly undetectable) at the posterior pole. As such, the authors should explain how they think their claim of the same "translation intensities" in the anterior vs posterior fits into the bigger picture of what we know about Bcd and what they have already stated in the manuscript. They should also explain how they observed enough molecules to quantify at the posterior end. The authors should also disclose how many points are in each box in the boxplot. For example, the sample size is N = 3 embryos. In just three embryos, how many bcd/GCN4 colocalizations did the authors observe at the posterior end of the embryo?

      • Line 215: The sentence that starts on this line seems self-contradictory: I cannot tell whether or not there is a difference in translation based on position.
      • Line 229: Long-ranged is a relative term. From the graph, one could state there is some spatial extent to the mRNA gradient, so it is unclear what the authors mean when they say it is not "long-ranged." Could the mRNA gradient be quantified, such as with a spatial length scale? This would provide more information for readers to make their own conclusions about whether it is long-ranged.
      • Line 230: When the authors claim the Bcd gradient is steeper earlier, a quantification of the spatial extent (exponential decay length scale) would be appropriate. Indeed, lambda as a function of time would be beneficial. It should also be placed in context of earlier papers that claim the spatial length scale is constant.
      • Lines 235-236: The two sentences that start on these two lines are vague and seemingly contradictory. The first sentence says there is a spatial shift, but the second sentence sounds like it is saying there is no spatial change. The language could be more precise to explain the conclusions.
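
      The length-scale quantification requested in the comments on Lines 229 and 230 can be sketched as follows. This is a minimal illustration, assuming a single-exponential profile and a known background level; the function name `fit_decay_length` and the synthetic profile are hypothetical and are not taken from the manuscript.

```python
import numpy as np

def fit_decay_length(x_um, intensity, background=0.0):
    """Estimate an exponential decay length (same units as x_um).

    Fits log(intensity - background) = log(A) - x / lambda by least squares.
    Assumes a single-exponential profile, which is a simplification.
    """
    x = np.asarray(x_um, dtype=float)
    y = np.asarray(intensity, dtype=float) - background
    keep = y > 0  # the logarithm is only defined for positive signal
    slope, intercept = np.polyfit(x[keep], np.log(y[keep]), 1)
    return -1.0 / slope, float(np.exp(intercept))

# Synthetic profile with a known decay length of 100 um (illustrative only).
x = np.linspace(0, 400, 41)
profile = 50.0 * np.exp(-x / 100.0)
lam, amplitude = fit_decay_length(x, profile)
```

      Repeating such a fit for each nuclear cycle would give lambda as a function of time, which is the quantity the reviewer asks to see.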

      Minor issues/typos (still must be addressed for content):

      • Line 81: Probably meant "evolutionarily conserved"
      • Figure 1 legend: part B says "from 15 samples" but also says N = 20. Which is it, or do these numbers refer to different things?
      • Line 217: migration of what?
      • Line 228: "early embryo" is vague. The authors should give specific time points or nuclear cycle numbers.
      • Line 301: Other locations in the paper say 75 microns or 100 microns.
      • Fig. 5: all images should be oriented such that the dorsal midline is on the upper half of the embryo/image.
      • Fig. 5B: There are light tan and/or light orange curves (behind the bold curves) that are not explained.
      • Fig. 5C: the plot says "normalized" but nowhere do the authors describe what the curves are normalized to. There is also no explanation for what the broad areas of light color correspond to.

      Referees cross-commenting

      This is Reviewer 2. Yes, I am enthusiastic about the work: it is a much needed set of experiments and it fits well into the overall goal of quantitatively understanding the processes that establish the Bcd gradient. My main concern(s) about this paper is the loose and vague way they described their experiments and the interpretations. My hope is they will use the revision as an opportunity to more precisely explain their work.

      Other than that, I am in agreement with the other reviewers on the need to revise for clarity and publish this important work.

      Significance

      The results, if upheld, are highly significant, as they are foundational measurements addressing a longstanding question of how morphogen gradients are formed, using Bcd (the foundational morphogen gradient) as a model. They also address fundamental questions in genetics and molecular biology: namely, control of mRNA distribution and translation.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      In this paper the authors use the Suntag system to visualise bcd mRNA translation in the Drosophila embryo. They elucidate the relationship between bcd mRNA translation and P body localisation. In the oocyte, bcd mRNAs are localised in P bodies and translationally repressed, but upon egg activation bcd mRNAs are released from P bodies and translated. In addition, during mid-nc14, bcd mRNAs become localised to embryonic P bodies and degraded. The authors use their data to modify the Synthesis, Diffusion, Degradation model of Bcd gradient formation, which recapitulates the Bcd gradient detected experimentally.

      Overall, I think the data are of high quality and support the authors' conclusions. I only have minor comments, as follows:

      Fig 1B - add arrows showing mRNAs being translated or not (the latter mentioned in line 113 is not so easy to see).

      Fig 2A - add a sentence explaining why 1,6HD, 2,5HD and NaCl disrupt P bodies.

      Fig 4C - explain in the legend what the white lines drawn over the image represent. And why is there such an obvious distinction in the staining where suddenly the DAPI is much more evident (is the image from tile scans)?

      Line 215 - 'We did not see any significant differences in the translation of bcd based on their position, however, there appears an enhanced translation of bcd localised basally to the nuclei (Figure S5).' Since the difference is not significant, I do not think the authors should conclude that translation is enhanced basally.

      Line 218: 'The interphase nuclei and their subsequent mitotic divisions appeared to displace bcd towards the apical surface (Figure S6B).' Greater explanation is needed in the legend to Fig S6B to support this statement as the data just seem to show a nuclear division - I would have thought an apical-basal view is needed to conclude this.

      Fig 5B - the authors compare Bcd protein distribution across developmental time. However, in the early time points cytoplasmic Bcd is measured (presumably as it does not appear nuclear until nc8 onwards) and its distribution is compared to nuclear Bcd intensities from nc9 onwards. Is most/all of the Bcd protein nuclear localised from nc9, to validate the nuclear quantitation? Does the distribution look the same if total Bcd protein is measured per volume rather than just the nuclear signal? Are the authors assuming a constant fast rate of nuclear import?

      Line 259 - 'We then asked if considering the spatiotemporal pattern of bcd translation' - the authors should clarify what new information was included in the model. Similarly in line 286, 'By including more realistic bcd mRNA translation' - what does this actually mean? In line 346, 'We see that the original SDD model .... was too simple.' It would be nice to compare the outputs from the original vs modified SDD models to support the statement that the original model was too simple.
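
      The kind of comparison requested here can be illustrated with a generic steady-state SDD calculation. This is only a sketch under stated assumptions: the values of `D`, `k`, the embryo length and the two source shapes are hypothetical, and the code does not reproduce the authors' modified model. It contrasts a point source at the anterior pole with a spatially extended source of equal total production.

```python
import numpy as np

def sdd_steady_state(source, dx, D, k):
    """Solve D*C'' - k*C + source = 0 on a 1D grid with no-flux boundaries."""
    n = len(source)
    A = np.zeros((n, n))
    c = D / dx**2
    for i in range(n):
        if i > 0:            # exchange with the left neighbour
            A[i, i - 1] = c
            A[i, i] -= c
        if i < n - 1:        # exchange with the right neighbour
            A[i, i + 1] = c
            A[i, i] -= c
        A[i, i] -= k         # first-order degradation
    return np.linalg.solve(A, -np.asarray(source, dtype=float))

# Hypothetical parameters: sqrt(D / k) = 100 um on a 500 um axis.
D, k, dx = 1.0, 1e-4, 1.0
x = np.arange(0, 501) * dx

point = np.zeros_like(x)
point[0] = 1.0                             # all synthesis at the anterior pole
extended = np.exp(-x / 50.0)               # synthesis spread over ~50 um
extended *= point.sum() / extended.sum()   # same total production rate

c_point = sdd_steady_state(point, dx, D, k)
c_ext = sdd_steady_state(extended, dx, D, k)
```

      With equal total synthesis, the extended source lowers the anterior peak while the far-field decay length stays at sqrt(D/k); plotting both outputs side by side is one way to show what a change in the source term alone contributes.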

      Fig S1A - clarify what the difference is between the 2 +HD panels shown.

      Fig S2E - the graph axis label/legend says it is intensity/molecule. Since intensity/molecule is higher in the anterior for bcd RNAs, is this because there are clumps of mRNAs (in which case it's actually intensity/puncta)?

      Fig S4 - I think this line is included in error: '(B) The line plots of bcd spread on the Dorsal vs. Ventral surfaces.' In B, D, E - is the plot depth from the dorsal surface? I would have preferred to see actual mRNA numbers rather than normalised mRNAs. In Fig S4D moderate, from 10um onwards there are virtually no mRNA counts based on the normalised value, but what is the actual number? The equivalent % translated data in Fig S4E look noisy so I wonder if this is due to there being a tiny mRNA number. The same is true for Figs S4D, E 10um+ in the low region.

      Referees cross-commenting

      I think the concerns raised by reviewers 2 and 3 are valid, and that it is feasible for the authors to address all the reviewers' concerns in order to improve the manuscript.

      Significance

      General assessment

      Strengths are: 1) the data are of high quality; 2) the study advances the field by directly visualising Bcd mRNA translation during early Drosophila development; 3) the data showing re-localisation of bcd mRNAs to P bodies in nc14 provide new mechanistic insight into its degradation; 4) a new SDD model for Bcd gradient formation is presented. Limitations of the study are: 1) there was already strong evidence (but no direct demonstration) that bcd mRNA translation was associated with release from P bodies at egg activation; 2) it is not totally clear to me how exactly the modified SDD model varies from the original one both in terms of parameters included and model output.

      Advance

      The advance is conceptual, technical and mechanistic.

      Audience

      The results will be important to a broad range of researchers interested in the formation of developmental morphogen gradients and the post-transcriptional regulation of gene expression, particularly the relationship with P bodies.

      My expertise

      Wetlab developmental biologist

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      __Reviewer #1__

      *This study "Interpreting the Effects of DNA Polymerase Variants at the Structural Level" comprises an in-depth analysis of protein sequence variants in two DNA polymerase enzymes with particular emphasis on deducing the mechanistic impact in the context of cancer. The authors identify numerous variants for prioritisation in further studies, and showcase the effectiveness of integrating various data sources for inferring the mechanistic impact of variants.*

      *All the comments below are minor; I think the manuscript is exceptionally well written.*

      *> The main body of the manuscript has almost as much emphasis on usage of the MAVISp tool as analysis of the polymerase variants. I don't think this is an issue, as an illustrated example of proper usage is very handy. I do, however, think that the title and abstract should better reflect this emphasis. E.g. "Interpreting the Effects of DNA Polymerase Variants at the Structural Level with MAVISp". This would make the paper more discoverable to people interested in learning about the tool.*

      We have changed the manuscript title according to the reviewer’s suggestions, and the current title is “Interpreting the Effects of DNA Polymerase Variants at the Structural Level using MAVISp and molecular dynamics simulations.”

      *> Figure 1. I don't believe there is much value in showing the intersection between the datasets (especially since the in-silico saturation dataset intersects perfectly with all the others). As an alternative, I suggest a flow-chart or similar visual overview of the analysis pipeline.*

      We moved the former Figure 1 to SI. We decided to keep it at least in SI because it provides guidance on the number of variants relative to the total reported across the different disease-related datasets annotated with the MAVISp toolkit. On the other hand, the suggestion of a visual scheme for the pipeline followed in the analyses is a great idea. We have thus added Figure 1, which illustrates the pipeline workflows for analysis of known pathogenic variants and for discovery of VUS and other unknown variants, as suggested by the reviewer.

      *> Please note in the MAVISp dot-plot figure legends that the second key refers to the colour of the X-axis labels rather than the dots.*

      We have revised the code that produces the dotplot so that the second key is placed closer to the x-axis and is clearer to read.

      Missing figure reference (Figure XXX) at the bottom of page 16

      We apologize for this mistake. Figures, contents, and the order have changed significantly to address all reviewers’ comments; this statement is no longer included. Also, we have carefully proofread the final version of the manuscript before resubmitting it.


      __Reviewer #2__

      This manuscript reports a comprehensive study of POLE and POLD1 annotated clinical variants using a recently developed framework, MAVISp, that leverages scores and classifications from evolutionary-based variant effect predictors. The resource can be useful for the community. However, I have a number of major concerns regarding the methodology and the presentation of the results.

      **On the choice of tools in MAVISp and interpretation of their outputs**

      - Based on the ProteinGym benchmark (https://proteingym.org/benchmarks), GEMME outperforms EVE for predicting the pathogenicity of ClinVar mutations, with an AUC of 0.919 for GEMME compared to 0.914 for EVE. Thus, it is not clear to me why the authors chose to put more emphasis on EVE for predicting mutation pathogenicity. It seems that GEMME can better predict this property, without any adaptation or training on clinical labels.

      We appreciate this comment, but we should not exclude EVE entirely from our data collection or from VEP coverage under MAVISp, based on a difference in AUC of 0.005. It was not our intention to place more emphasis on EVE predictions, and we have revised it accordingly. We would like to clarify the workflow we use for applications of the MAVISp framework in “discovery mode,” i.e., for variants not reported as pathogenic in ClinVar. This relies on AlphaMissense to prioritize the pathogenic variants and then retain further only the ones that also have an impact according to DeMaSk, which provides further indication for loss/gain-of-fitness. DeMaSk nicely fits the MAVISp framework, as it was trained on data from experimental deep mutational scans, which we generally import in the EXPERIMENTAL_DATA module. We have revised the text to make this clearer. GEMME and EVE (or REVEL) can be used for complementary analysis in the discovery workflow. Other users of MAVISp data might want to combine them with a different design, and they have access to all the original scores in the MAVISp database CSV file and the code for downstream analysis to do so. The choice for our MAVISp discovery workflow is mainly dictated by the fact that we have noticed we do not always have full coverage of all variants in many protein instances for EVE, GEMME, and REVEL. In particular, since the reviewer highlights GEMME over EVE, GEMME is currently unavailable for a few cases in the MAVISp database. This is because we need to rely on an external web server to collect the data, which slows down data collection on our end.

      Additionally, we have encountered instances where GEMME was unable to provide an output for inclusion in the MAVISp entries. When we designed the workflow for variant characterization in focused studies, we also made practical considerations. We are also exploring the possibility of using pre-calculated GEMME scores from https://datadryad.org/dataset/doi:10.5061/dryad.vdncjsz1s, but we have encountered some challenges that deserve further investigation. For example, MAVISp annotations rely on the canonical isoform as reported in UniProt, which can lead to mismatches with the GEMME pre-computed scores. So far, we have identified a couple of entries whose canonical isoforms no longer match the one in the pre-computed GEMME score dataset. Another limitation is the absence of the original MSA files in the dataset, which we would need for a more in-depth comparison with the ones we used for our calculations. We are facing some challenges in reproducing the MSA output from the MMseqs2-based ColabFold protocol in this context that need to be solved first. Overall, the dataset shows potential for integration into MAVISp, but we need to define the inclusion criteria and compare it with the existing results in more detail.

      Additionally, since the principle behind MAVISp is to provide a framework rooted in protein structure, AlphaMissense was the most reasonable choice for us as the primary indicator among the VEPs for our discovery workflow, and it has performed reasonably well in this case study and others.

      Of course, our discovery design is one of the many applications and designs that could be envisioned using the data provided and collected by MAVISp. We also include all raw scores in the database's final CSV files, allowing other end users to decide how to use them in their own computational design. The design choice we made for the discovery phase of focused studies, using MAVISp to identify variants of interest for further studies, has been applied in other publications (see https://elelab.gitbook.io/mavisp/overview/publications-that-used-mavisp-data) in some cases together with experiments. It is also a fair choice for the application, as the ultimate goal is to provide a catalog of variants for further studies that may have a potentially damaging impact, along with a corresponding structural mechanism.

      We have now revised the results section text where Table 1 is cited to clarify this. We also revised the terminology because we are using the VEPs' capability to predict damaging variants, rather than the pathogenic variants themselves. Experiments on disease models should validate our predictions before concluding whether a variant is pathogenic in a disease context, and we want to avoid misunderstandings among readers regarding our stance on this matter.

      - Which of the predictors, among AM, EVE, GEMME, and DeMaSk, provide a classification of variants and which ones provide continuous scores? This should be clarified in the text. If some predictors do not output a classification, then evaluating their performance on a classification task is unfair. The MAVISp framework sets thresholds on the predicted scores to perform the classification and it is unclear from reading the manuscript whether these thresholds are optimal or whether using universal cutoff values is pertinent. For instance, for GEMME, a recent study shows that fitting a Gaussian mixture to the predicted score distribution yields higher accuracy than setting a universal threshold (https://doi.org/10.1101/2025.02.09.637326). Along this line, for predictors that do not provide a classification, I am not convinced of the benefit for the users of having access to only binary labels, instead of the continuous scores. The users currently do not have any idea of whether each variant is borderline (close to threshold) or confident (far from threshold).

      We agree with the reviewer; the confusion arose because we were not sufficiently clear in the manuscript. We have now revised the first part of the results to clarify this and to explain how we use the MAVISp data for application to focused studies, where the goal is to identify the most interesting variants that are potentially damaging and have a linked structural mechanism. Of course, there are other applications for leveraging the data in the database. We do provide scores for variants, not just classification labels, in the MAVISp csv file. They can be accessed, together with the full dataset, through the MAVISp website and reused for any application.

      Additionally, we used the scores in the revised manuscript for the VUS variant ranking (Figure 5), applying a strategy recently designed as an addition to the downstream analysis toolkit of MAVISp (https://github.com/ELELAB/MAVISp_downstream_analysis), thereby allowing the scores themselves to be taken into account. Also, in the final part of the manuscript, the VEP scores have been used to introduce the ACMG-like classification of the variants in response to reviewer 3 (Figure 9 and Tables S3-S4). We absolutely agree that it is informative to keep the continuous scores, and we have never overlooked this aspect. However, we also need a strategy with a simpler classification to highlight the most interesting variants among thousands or more to start an exploration. This is why we included the support with dotplots and lolliplots, for example. Our purpose here is to identify, among many cases, those with a potentially damaging signature (and thus we need a binary classification for simplicity). Next, we evaluate whether this signature entails a fitness effect (with DeMaSk), and finally, retain only the cases we can identify with a structural mechanism to study further.
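
      The three-step discovery filter described above (damaging signature, then fitness effect, then structural mechanism) can be sketched in a few lines. This is an illustration only: the field names, class labels and placeholder variants below are hypothetical and do not correspond to the actual column names in the MAVISp csv file or to the downstream analysis code.

```python
def discovery_filter(variants):
    """Return the variants that pass all three steps of the sketch.

    Each variant is a dict with hypothetical keys:
      'variant'             - identifier
      'alphamissense_class' - 'pathogenic', 'benign' or 'ambiguous'
      'demask_class'        - 'loss_of_fitness', 'gain_of_fitness' or 'neutral'
      'mechanisms'          - list of structural mechanisms found (may be empty)
    """
    selected = []
    for v in variants:
        # Step 1: keep only variants with a potentially damaging signature.
        if v["alphamissense_class"] != "pathogenic":
            continue
        # Step 2: keep only variants with a predicted effect on fitness.
        if v["demask_class"] not in {"loss_of_fitness", "gain_of_fitness"}:
            continue
        # Step 3: keep only variants with at least one structural mechanism.
        if not v["mechanisms"]:
            continue
        selected.append(v["variant"])
    return selected

# Placeholder records, not real POLE or POLD1 variants.
example = [
    {"variant": "V1", "alphamissense_class": "pathogenic",
     "demask_class": "loss_of_fitness", "mechanisms": ["stability"]},
    {"variant": "V2", "alphamissense_class": "pathogenic",
     "demask_class": "neutral", "mechanisms": ["stability"]},
    {"variant": "V3", "alphamissense_class": "benign",
     "demask_class": "loss_of_fitness", "mechanisms": []},
    {"variant": "V4", "alphamissense_class": "pathogenic",
     "demask_class": "gain_of_fitness", "mechanisms": []},
]
kept = discovery_filter(example)
```

      Because the continuous scores remain in the underlying records, a user could swap any of the three tests for a different rule without changing the overall structure.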

      The thresholds we set as the default for data analysis of dotplots in GEMME and DeMaSk are discussed in Supplementary Text S3 of the original MAVISp article. In brief, we carried out an ROC analysis against the scores for known pathogenic and benign variants in ClinVar with review status higher than 2. For application purposes, one could design other strategies to analyze the MAVISp data too; it is not limited to the workflow we decided to set as the primary one for our focused studies, as already mentioned above.
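
      As an illustration of how a cutoff can be derived from such an ROC analysis, the sketch below selects the threshold that maximises Youden's J (true-positive rate minus false-positive rate). The choice of Youden's J is an assumption made for this example; the criterion actually used is the one described in the original MAVISp article. The scores and labels are toy values, and the code assumes that higher scores are more damaging (the sign would need flipping for predictors scored the other way).

```python
import numpy as np

def youden_threshold(scores, labels):
    """Pick the cutoff maximising TPR - FPR.

    labels: 1 = known pathogenic, 0 = known benign.
    A variant is called damaging when score >= threshold.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    best_t, best_j = None, -1.0
    for t in np.unique(scores):              # candidate cutoffs, ascending
        pred = scores >= t
        tpr = np.mean(pred[labels == 1])     # sensitivity at this cutoff
        fpr = np.mean(pred[labels == 0])     # 1 - specificity at this cutoff
        if tpr - fpr > best_j:
            best_t, best_j = t, tpr - fpr
    return float(best_t), float(best_j)

# Toy data, not ClinVar variants.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]
threshold, j_stat = youden_threshold(toy_scores, toy_labels)
```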

      We have now also included classification based on the GMM model applied to GEMME scores for POLE and POLD1, so it can be evaluated against other designs for our protein of interest (see Table 1 in the revised version). The method section has been revised to include this part, and the ProteoCast pre-print is cited as a reference. We have not yet officially included this classification in the MAVISp database because we must first follow internal protocols to meet the inclusion criteria for new methods or analyses. We will do so by performing a similar comparison on the entire MAVISp dataset, focusing on variants with high-quality ClinVar annotations, as we did to set the current thresholds for GEMME in Supplementary Table S3 of the original MAVISp article. We need to allocate time and resources to this pilot, which is scheduled for Q1 2026.
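
      The general idea of a mixture-based classification can be sketched with a two-component, one-dimensional Gaussian mixture fitted by expectation-maximisation. This is a generic illustration, not the ProteoCast procedure: the number of components, the initialisation and the synthetic scores are all assumptions made for the example.

```python
import numpy as np

def fit_two_gaussians(x, n_iter=200):
    """Fit a two-component 1D Gaussian mixture by plain EM."""
    x = np.asarray(x, dtype=float)
    mu = np.array([np.percentile(x, 25), np.percentile(x, 75)])  # crude start
    sigma = np.array([x.std(), x.std()])
    w = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each score.
        dens = w * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2)
        dens /= sigma * np.sqrt(2.0 * np.pi)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: update weights, means and standard deviations.
        nk = resp.sum(axis=0)
        w = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk) + 1e-9
    return w, mu, sigma, resp

# Synthetic scores drawn from two populations (illustrative only).
rng = np.random.default_rng(0)
scores = np.concatenate([rng.normal(-5.0, 1.0, 200), rng.normal(0.0, 1.0, 200)])
weights, means, sigmas, resp = fit_two_gaussians(scores)
component = resp.argmax(axis=1)  # most likely component for each score
```

      The responsibilities in `resp` also give a per-variant measure of how confidently a score belongs to one component, which speaks to the borderline versus confident distinction raised by the reviewer.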

      **On the presentation and impact of the results**

      • While reading the manuscript, it is difficult to grasp the main messages. The text contains abundant discussion about the potential caveats of the framework, the care that should be taken in interpreting the results, and the dependency on the clinical context. Although these aspects are certainly important, this extensive discussion (spread throughout the manuscript) obscures the results. Moreover, the way variants are catalogued throughout the text makes it difficult to grasp key highlights. The reader is left unsure about whether the framework can actually help the clinical practitioners.

      We have revised the text to make it easier to read, including additional MD simulations of three variants of interest and more downstream analyses to clarify the mechanisms of action. We also added a recap of the most interesting variants and their associated mechanisms, along with the ranking of the variants using the different features available in the MAVISp csv file for the VUS. We hope that this makes it more accessible and valuable. In the original publication, Table 2 aimed to provide a summary of the interesting variants, and we have revised it now in light of the ranking results and the additional analyses that allow us to clarify the mechanisms of action further. We have also introduced Figure 9 and Tables S3 and S4, which present data on ACMG-like classification for VUS that can fall into the likely pathogenic or benign categories.

      • In many cases, the authors state that experimental validation is required to validate the results. Could they be more explicit on the experimental design and the expected outcome?

      We have added a section on the point above on pages 21 and 30, where, alongside the summary of mechanisms per variant, we propose the experimental readouts to use based on known MAVE assays or assays that could be designed.

      • AlphaMissense seems to tend to over-predict pathogenicity. Could the authors comment on that?

      We are unsure whether this comment relates to our specific case or to a general feature of AlphaMissense.

      In the latest iteration of our small benchmarking dataset for POLE and POLD1 (as shown in the paper), we achieve a sensitivity of 1 and a balanced specificity of 0.96 for AlphaMissense, which suggests that AlphaMissense does not over-predict pathogenicity very significantly in these proteins, predicting true negatives (i.e., non-pathogenic) mutations quite accurately. As performance was sufficient in our case, we deemed recalibrating the classification threshold for AlphaMissense unnecessary.
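
      For readers who want to repeat this kind of gene-specific check, the two standard metrics can be computed as below. This is a minimal sketch using the textbook definitions of sensitivity and specificity; the toy labels are hypothetical and do not reproduce the POLE and POLD1 benchmark, and the code does not attempt to reproduce the balancing mentioned above.

```python
def sensitivity_specificity(predicted_damaging, truly_pathogenic):
    """Return (sensitivity, specificity) for paired boolean labels."""
    pairs = list(zip(predicted_damaging, truly_pathogenic))
    tp = sum(1 for p, t in pairs if p and t)          # true positives
    fn = sum(1 for p, t in pairs if not p and t)      # false negatives
    tn = sum(1 for p, t in pairs if not p and not t)  # true negatives
    fp = sum(1 for p, t in pairs if p and not t)      # false positives
    return tp / (tp + fn), tn / (tn + fp)

# Toy labels: 4 pathogenic and 6 benign variants, one benign call is wrong.
predicted = [True, True, True, True, True, False, False, False, False, False]
truth = [True, True, True, True, False, False, False, False, False, False]
sens, spec = sensitivity_specificity(predicted, truth)
```

      A predictor that over-calls pathogenicity would show up here as high sensitivity paired with low specificity.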

      We are aware that this is not necessarily the case for every gene, e.g., it has been shown that AlphaMissense shows lower specificity in some cases (see e.g. 10.3389/fgene.2024.1487608, 10.1038/s41375-023-02116-3). This is also why we found it essential to evaluate its performance with its recommended classification on a gene-specific basis, as done here. In the future, we will keep a critical eye on our predictors to understand whether they are suitable for the specific case of study, or whether they require threshold recalibration or the use of a different predictor.

      **On specific variants**

      • The mention of H1066R, H1068, and D1068Y is very confusing. There seems to be a confusion between residue numbers and amino acid types.

      We have revised the text for typos and errors. This part of the text changed, so these specific variants are no longer mentioned.

      • A major limitation of the 3D modeling is the impossibility of including Zn2+ coordination by cysteine residues. This limitation holds for both POLE and POLD1. Could the authors comment on the implications of this limitation for interpreting the mechanistic impact of variants? In particular, several variants reported in the study consist of a gain of cysteines. The authors discuss the potential impact of some of these mutations on the structural stability but not their impact on Zn coordination or the formation of disulphide bridges.

      This is a great suggestion. We had, for a long time, a plan in the pipeline to include a module to tackle changes in cysteines. We have now used this occasion to include a new module that identifies mutations that 1) are likely to disrupt native disulphide bridges, which are annotated as damaging, or 2) could form de novo disulphide bridges upon mutation of a residue to a cysteine, which are also annotated as damaging with respect to the original functionality. We also included a step that evaluates whether the protein target is eligible for the analysis based on its cellular localization, since in specific compartments (such as the nucleus) the redox conditions would not favour disulphide bridges. The module has been added to MAVISp, and we are collecting data with the module for the existing entries in the database to be able to release them in one of the following updates. More details are on the website in the Documentation section (https://services.healthtech.dtu.dk/services/MAVISp-1.0/). We could not apply the module to POLE and POLD1 since they are nuclear proteins, and it would not be meaningful to look into this structural aspect either in connection with loss of native cysteines or de novo disulphide bridge formation upon mutations that change a wild-type residue to a cysteine.

      We would like to clarify that, because this is a focused study rather than a high-throughput data collection for first inclusion in the MAVISp database, the structures we use have been modelled with zinc at the correct position. Only the first layer of high-throughput collection with MAVISp uses models without cofactors, unless the biocurator attempts to model them or we move on to collect further data for research studies (as done here). Prompted by this confusion, we have now added a field to the metadata of a MAVISp entry indicating the cofactor state. Nevertheless, the RaSP stability prediction does not account for the cofactor's presence, even when it is bound in the model. This is discussed in the Methods section. Because of these limitations, we did not further analyze the variants at sites directly coordinating the metal groups.

      • MAVISp does not identify any mechanistic effect for a substantial portion of variants labelled as pathogenic. Could the authors comment on this point?

      We are not sure how to interpret this question, which can be read in two ways. The reviewer may be asking either about the known pathogenic ClinVar variants without mechanistic indicators or, more generally, about the variants that we label “pathogenic” in discovery (in the dotplots we more usually refer to them as damaging) and for which we cannot associate a mechanism.

      As a general consideration, it would be challenging to envision a mechanism for each variant predicted to be functionally damaging. For example, in the case of POLE and POLD1, we still lack data for complexes whose models did not meet the quality-control and inclusion criteria for the binding-free-energy scheme used by the LOCAL INTERACTION module. Also, when it comes to effects on catalysis or to analyzing effects in more detail at the cofactor sites, we could miss effects that would require QM/MM calculations. Other points we have not yet covered include changes in protein abundance due to degron exposure for degradation, which is one of the mechanistic indicators we are currently developing. Moreover, we used only unbiased molecular simulations of the free protein, and we would need future studies with enhanced sampling approaches and longer timescales to better address conformational changes and changes in the population of different protein conformational states induced by the mutation (including DNA). This can be handled formally by the MAVISp framework using metadynamics approaches, but it is outside the scope of this work and is a direction for future studies on a subset of variants to investigate in even greater detail.

      Furthermore, post-translational modifications other than phosphorylation are not yet covered. In any case, our scope is to use the platform to provide structure-based characterization of either known pathogenic variants or potentially damaging ones predicted by VEPs, and to focus on more detailed analyses of those. As we develop MAVISp further and design new modules, we will also be able to tackle other mechanistic aspects. This discussion, however, is more relevant to the MAVISp method paper itself.

      Moreover, none of the variants discussed are associated with allosteric effect. Is this expected?


      In general, allosteric mutations are rare. Nevertheless, in these case studies, the size of the proteins under investigation also poses some challenges for the underlying coarse-grained model used in the simple mode to generate the allosteric signalling map, as we have found that it performs best on protein structures below 1000 residues.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The manuscript utilized the MAVISp framework to characterize 64,429 missense variants (43,415 in POLE and 21,014 in POLD1) through computational saturation mutagenesis. The authors integrate protein stability predictions with pathogenicity predictors to provide mechanistic insights into DNA polymerase variants relevant to cancer predisposition and immunotherapy response. There are discussions of known PPAP-associated variants and somatic cancer mutations in the context of known data and some proposed variants of interest (which are not validated).

      Major comments:

      I was unaware of the MAVISp framework. It concerns me that, albeit this paper has a lot of technical details about the framework, it's not the paper about the framework. I did look into the paper https://www.biorxiv.org/content/10.1101/2022.10.22.513328v5 which keeps being updated (version five now) for three years, but I do not see a peer-reviewed version. It would be unfair of me to peer review the underlying framework of the work but together with the previous comments, I am a bit concerned.

      We have intentionally left the MAVISp resource paper as a living pre-print until we have sufficient data in the database that could be useful to the rest of the community. We have been actively revising the manuscript, thanks to comments from users on previous versions, to ensure it provides a solid resource. Approximately one and a half years ago we attempted a submission to a high-impact journal and even addressed the reviewers’ comments there. Still, we did not receive feedback for a long time, and ultimately the manuscript was not sent to the reviewers again despite more than six months of work on our side. After that, we realized that we would benefit from collecting a larger dataset, so we invested time and effort in that and submitted again, this time through Review Commons in the summer of 2025. The paper has since been peer-reviewed by three reviewers through Review Commons. We submitted the revised version and response to reviewers, and it is now under revision with Protein Science. The reviewers’ comments and our responses can be found in the “Latest Refereed Preprints” on the Review Commons website with the date of 17th of October 2025.

      We would also like to clarify another point on this. In our experience, it is common practice to keep software on bioRxiv even for a long time and to bring it to a more complete form in parallel with the community already applying it. This allows broad feedback from peers. We had similar experiences with MoonlightR, where the first publications with applications within the TCGA-PanCancer papers came before the publication of the tool itself, and the same has been true for any of our main workflows, such as MutateX or RosettaDDGPrediction, which are widely used by the community. Finally, the MAVISp framework has already been used in different published peer-reviewed studies (since 2023), attesting to its integrity and potential. Here, the reviewer can read more about the studies that used MAVISp data or modules: https://elelab.gitbook.io/mavisp/overview/publications-that-used-mavisp-data

      For example, the authors are using AlphaFold models to predict DDG values. Delgado et al. (2025, Bioinformatics) explicitly tested FoldX on such models and concluded that "AlphaFold2 models are not suitable for point mutation ΔΔG estimation" after observing a correlation of 0.06 between experimental and calculated values. AlphaFold's own documentation states it "has not been validated for predicting the effect of mutations". Pak et al. (2023, PLOS ONE) showed correlation between AlphaFold confidence metrics and experimental ΔΔG of -0.17. Needless to say that these concerns seriously undermine the validity of a major part of the study.

      We appreciate the reviewer’s comments and would like to clarify a point regarding the MAVISp STABILITY module, which we believe may have been misunderstood. Based on the studies cited by the reviewer, which critique the use of AF-generated mutant structures for assessing stability effects, we understand that this assumption may have led to the concern.

      The STABILITY module utilises three in silico tools (FoldX, Rosetta, and RaSP) to assess changes in protein stability resulting from missense mutations. Importantly, the input to these assessments consists of AF models of the WT protein structures, not of AF-generated mutant structures. The mutants are generated using the FoldX and Rosetta protocols, along with estimates of the changes in free energy. For further details and clarification, we kindly refer the reviewer to the MAVISp original publication.

      Also, one should consider the goal of our use of free energy calculations: not to reproduce exact ΔΔG values, but to obtain classifications that correlate with data from in vitro biophysical experiments or from cellular experiments such as MAVEs. We, and other researchers, have shown good agreement (see the PTEN case study in the original MAVISp publication, as well as https://pmc.ncbi.nlm.nih.gov/articles/PMC5980760/, https://pubmed.ncbi.nlm.nih.gov/28422960/, and doi:10.7554/eLife.49138). Also, before even designing the STABILITY module for MAVISp, we had verified that we can use WT structures from AlphaFold (upon proper trimming and quality control with PROCHECK) instead of experimental structures without compromising accuracy, in the publications of the two main protocols of the STABILITY module (MutateX and RosettaDDGPrediction) and in a case study on p53 (https://doi.org/10.1093/bib/bbac074, https://doi.org/10.1002/pro.4527). In the focused studies, we also carefully consider whether the prediction is at a site with a low pLDDT score or surrounded by other sites with a low pLDDT score before reaching any conclusions. The pLDDT score is reported in the MAVISp csv file exactly so that it can be used for flagging variants or looking closer at them, as we discuss in this study (see, for example, Figure 2). Additionally, it should be noted that we employ a consensus approach across the two classes of methods in MAVISp to account for the limitations arising from their empirical energy functions or backbone stiffness. Furthermore, in the focused studies, we also collected molecular dynamics simulations for the ensemble mode and reassessed the stability on different conformations from the trajectory to compensate for the issues with backbone stiffness of the FoldX, RaSP, and Rosetta ΔΔG protocols.
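
      As an illustration of how a consensus across two ΔΔG predictors can be combined with a pLDDT flag, here is a minimal sketch. The thresholds (3 and 2 kcal/mol, pLDDT of 70), the label names, and the function names are assumptions made for this example and should be checked against the MAVISp documentation before any use.

```python
def classify_ddg(ddg, damaging=3.0, neutral=2.0):
    """Map one predicted folding ddG (kcal/mol) to a class.

    The default cutoffs are illustrative assumptions.
    """
    if ddg >= damaging:
        return "destabilizing"
    if ddg <= -damaging:
        return "stabilizing"
    if abs(ddg) < neutral:
        return "neutral"
    return "uncertain"


def stability_consensus(foldx_ddg, second_ddg, plddt, plddt_min=70.0):
    """Combine two predictors and flag sites with low model confidence.

    second_ddg stands for the Rosetta- or RaSP-based estimate.
    """
    first = classify_ddg(foldx_ddg)
    second = classify_ddg(second_ddg)
    # The two methods must agree, otherwise the variant stays uncertain.
    label = first if first == second else "uncertain"
    return {"label": label, "low_confidence_site": plddt < plddt_min}
```

      Requiring agreement keeps a single method's artefact from driving the call, and the separate flag lets a reader inspect variants at poorly modelled sites instead of silently discarding them.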

      I have to add that this is also true for other technical choices: Several integrated predictors (DeMaSk, GEMME) are outperformed by newer methods according to benchmarking studies (https://www.embopress.org/doi/full/10.15252/msb.202211474). AlphaMissense, while state-of-the-art, shows substantial overcalling of pathogenic variants. Could ensemble meta-predictors (REVEL, BayesDel) improve accuracy?

      The MAVISp framework includes REVEL as one of the VEPs available for data analysis, so one of the ensemble meta-predictors is already represented. This is explained in the original MAVISp paper. We were not aware of BayesDel, which we will consider for one of the next pilot projects to assess new tools for the framework (see more details below on how we generally proceed). Currently, we cannot use REVEL for all variants because we do not necessarily have genomic coordinates for them. We retrieve genomic-level variants corresponding to our protein variants from mutation databases, where available (e.g., ClinVar, COSMIC, or cBioPortal). However, as we strive to cover every possible mutation, several of the variants in MAVISp are not in these databases, which means we do not have the corresponding genomic variation for them, limiting our ability to annotate them with VEPs. In the future (see GitHub issue https://github.com/ELELAB/cancermuts/issues/235), we will revise the code to identify the genomic variants that could give rise to each protein mutation of interest, thereby increasing the coverage of VEP annotations.
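
      The idea of finding the genomic changes that could produce a given protein mutation can be sketched at the codon level as follows. This is not the cancermuts implementation referenced above; the function name is hypothetical, only single-nucleotide changes are considered, and a real implementation would also need the transcript sequence, strand, and genomic coordinates.

```python
BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO)
)


def single_nucleotide_routes(ref_codon, target_aa):
    """List single-base changes of ref_codon that encode target_aa.

    Returns tuples of (0-based codon position, alternative base, new codon).
    """
    routes = []
    for pos, ref_base in enumerate(ref_codon):
        for alt in BASES:
            if alt == ref_base:
                continue
            new_codon = ref_codon[:pos] + alt + ref_codon[pos + 1:]
            if CODON_TABLE[new_codon] == target_aa:
                routes.append((pos, alt, new_codon))
    return routes
```

      For a phenylalanine encoded by `TTT`, for instance, a serine can be reached by one change at the second codon position, whereas a tryptophan cannot be reached by any single-nucleotide change and would return an empty list.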

      We can see from the work cited by the reviewer that ESM-1v, EVE, and DeepSequence are among the top performers, whereas reviewer 2 cited another work in which GEMME outperforms EVE. We have been covering all of them, except ESM-1v, in our framework. We are planning to evaluate some of the new top-performing predictors, including ESM-1v, for inclusion in MAVISp in Q2 2026 (according to the protocol described later in this answer), which is why it is not available yet.

      In our discovery protocol (i.e., when we work on VUS or variants not classified in ClinVar), we generally use AlphaMissense as the first indicator of potentially damaging variants. EVE, REVEL, or GEMME can be used when AlphaMissense data are missing, or as a second layer of evidence when we want, for example, to select a smaller pool of variants for experimental validation in a protein target with many uncharacterized variants, too many of which pass the evaluation with our discovery workflow. Finally, we rely on DeMaSk, as it also provides information on possible loss- or gain-of-fitness signatures, to further filter the variants of interest in the search for mechanistic indicators. Since the MAVISp framework is modular, other users may want to use the data differently and design a different workflow. They have access to the scores and classifications through the web portal. Combining AlphaMissense with DeMaSk adds a further filtering step and can mitigate the risk that AlphaMissense over-predicts pathogenicity.
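
      The order of operations described above can be sketched as a small filter. The fallback order among EVE, REVEL, and GEMME, the label strings, and the function names are assumptions made for this illustration; the text does not prescribe them.

```python
# Assumed priority: AlphaMissense first, then fallbacks in an arbitrary order.
VEP_PRIORITY = ("AlphaMissense", "EVE", "REVEL", "GEMME")


def first_indicator(calls):
    """Return (predictor, call) for the first predictor with data."""
    for name in VEP_PRIORITY:
        call = calls.get(name)
        if call is not None:
            return name, call
    return None, None


def select_for_mechanism(calls):
    """Keep a variant only if it is called damaging and DeMaSk
    reports a loss- or gain-of-fitness signature."""
    name, call = first_indicator(calls)
    if call != "damaging":
        return None
    signature = calls.get("DeMaSk")
    if signature in ("loss_of_fitness", "gain_of_fitness"):
        return {"vep": name, "signature": signature}
    return None
```

      Variants that survive this filter are the ones for which mechanistic indicators would then be searched, so the second step acts as a brake on over-calling by the first predictor.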

      In general, we work to keep MAVISp up-to-date, and we have developed a protocol for the inclusion of new methodologies in the available modules before generating and releasing data with new tools in the database. In particular, we perform comparative studies using data already available in the database to evaluate the performance of new approaches against that of the tools already included. Depending on the module, we use different gold standards, which we are also curating in parallel and which are chosen to suit that specific module. For example, if the question is to evaluate a VEP, we would compare it against known ClinVar variants with good review status. If the VEP performs better than the currently included ones, we can include it as an additional source of annotations and evaluate whether we could change the protocol for the discovery/characterization of variants. We operate similarly for the structural modules. For example, for stability, we are importing experimental data from MAVE assays on protein abundance and use them as a gold standard to evaluate new approaches against the current FoldX- and Rosetta-based consensus for changes in folding free energies. If we find evidence that switching to a new method or integrating it would be beneficial, we will do so as a result of these investigations. An example of our working mode for evaluating tools for inclusion in the framework is how we handled the comparison between RaSP and Rosetta in the original MAVISp article (Supplementary file S2) before officially switching to RaSP for high-throughput data collection. We still maintain Rosetta, especially in focused studies, to further validate variants classified as uncertain.
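
      A comparison of a candidate predictor against a curated reference set can be sketched as below. The choice of the Matthews correlation coefficient as the metric, the data layout, and the function names are assumptions for this example; the text does not state which metric is used.

```python
from math import sqrt


def confusion(predicted, truth):
    """Count TP, FP, TN, FN over variants present in both dictionaries.

    Both dictionaries map a variant label to True (damaging or
    pathogenic) or False (neutral or benign).
    """
    tp = fp = tn = fn = 0
    for variant, is_pathogenic in truth.items():
        if variant not in predicted:
            continue  # variants without a prediction are not scored
        called = predicted[variant]
        if is_pathogenic and called:
            tp += 1
        elif is_pathogenic and not called:
            fn += 1
        elif called:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient, 0.0 when undefined."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

      A candidate tool would be considered for inclusion only if its score on the shared reference set exceeds that of the tools already in the module, which is the decision rule described in the paragraph above.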

      Further, I found the web site of the framework, where I looked for the data on these models, rather user unfriendly. Selecting POLD1, POLD2, or POLE tells me I am viewing entries A2ML1, ABCB11, ABCB6 respectively, when I search for POL and then click: these are the first three entries of the table, not what I click on. Displaying the whole table and clicking on POLD1 gets me to POLD1. However, when I selected "Damaging mutations on structure" I get "Could not fetch protein structure model from the AlphaFold Protein Structure Database". Many other features are not working (Safari or Chrome, on a Mac). That is a concern for the usability of the dataset.


      We have been able to reproduce the bugs identified by the reviewer and have fixed them. The second was connected to recent updates to the AlphaFold Protein Structure Database. We are not sure how to act on the “other features that are not working” because this comment lacks specificity. Still, we have worked to make the website more robust: the coauthors of this work and other colleagues in the MAVISp team have extensively tested it across different proteins and with various browsers and operating systems, and we have fixed all identified issues. We also have a GitHub repository where users can open issues to share problems they have been experiencing with the website, which we will fix as promptly as we can (https://www.github.com/ELELAB/MAVISp), as we do for any of the tools we develop and maintain. If the reviewer were to come across other specific problems with the website, we recommend opening issues (anonymously) on the MAVISp repository so that they can be described in more detail and dealt with appropriately.

      This comment seems more related to the MAVISp paper itself than to the POLE and POLD1 entries. We have been making several revisions to the web app to improve it over time. We are afraid that the reviewer consulted it during one of these changes, and we hope it is better now. For POLE and POLD1, the CSV files were, in any case, also available through the MAVISp website itself (https://services.healthtech.dtu.dk/services/MAVISp-1.0/), as well as in the OSF repository connected to this paper (https://osf.io/z8x4j/overview), in case the reader needed to consult them or as a reference for the analyses reported in this paper.

      Albeit this is a thorough analysis with the existing tools, and the authors make some sparse attempts to put the mutants classification in context with examples, the work stays descriptive for known effects in literature, or points out that e.g. "further functional and in vitro assays are required". The examples are not presented in a systematic way, or in an appealing manner. Thus, what this manuscript adds to the web site is unclear. It is a description of content, which could be at least more appealing if examples would be more clearly outlined in a conceptual framework, and illustrated more consistently. For example, I read in the middle of page 16 "One such example is the F931S (p.Phe931Ser) variant (Figure 5A)" and then I see "F931 forms contacts with D626, a critical residue for the coordination of Mg2+ which is essential for the correct orientation of the incoming nucleotide (Figure XXX)". Figure 5B is not XXX as this has just many mutations labeled. These issues are very discouraging. I would recommend to put much more effort in examples, put them in clearer paragraphs, and describe results rather than the methodology. Doing both in an intermingled way clearly does not work for me.

      We have revised the storyline to make it more straightforward for the reader, focusing on the essential messages and avoiding excessive description in the results section, instead conveying the key points directly. We also included new simulation data on three variants and downstream analyses of other variants. We revised the section to focus less on methodologies and more on the actual biological results. We have also added a ranking approach for the VUS and an ACMG-like classification to facilitate the identification of the most important results.

      Additionally, we included a summary Table (Table 2) and Figure 9 that present the main findings on the VUS, and we discussed in the text the possible associated experimental validation.

      We also do not fully understand the reviewer’s comment “the work stays descriptive for known effects in literature”. We agree that we should make a better effort to write the results in a logical and easy-to-follow manner, without risking the reader getting lost in too many details, and with more dedicated subsections. However, the paper does not describe just known effects in the literature. In the previous version, we had a section aimed at identifying mechanistic indicators for ClinVar-reported variants that are also (in some cases) functionally characterized. That section is only the very first part of the results, and it still adds structure-based knowledge on these variants. After this, we also reported predicted results with mechanisms for VUS and variants in other databases. We took the opportunity in this revised version to elaborate more on the results for the variants reported in COSMIC and cBioPortal.

      We are afraid that we also do not fully understand the reviewer's comment on the fact that “Thus, what this manuscript adds to the website is unclear.” We have generated POLE and POLD1 data with the MAVISp toolkit in both ensemble and simple mode, and the whole pool of local interactions with other proteins and DNA, specifically for this publication. It should be acknowledged that we have generated new data in ensemble mode, which relies on all-atom microsecond molecular dynamics simulations, and additional modules for the simple mode, including calculations with the flexddg protocol of Rosetta, which is also computationally demanding, to provide a comprehensive overview of the effects of variants in POLE and POLD1. The two proteins were available in the database only in simple mode with the basic default modules, and the remaining data were collected during this research article. This can also be inferred by the references in the csv file of the ensemble mode, which refer only to the DOI of the pre-print of this article. This entails a substantial effort in computing and analysis. The website is the repository for data that researchers collect using the MAVISp protocols or modules; in our opinion, it cannot replace a research project. We designed the database to store the data generated by the framework for others to consult and use for various purposes (e.g., biological studies, preparing datasets for benchmarking approaches against existing ones, or using features for machine learning applications). The entry point in the database is the simple mode, along with some compulsory modules (VEPs, STABILITY, PTM, EFOLDMINE, SASA). After this initial entry point, a biocurator or a team of researchers can decide to expand data coverage by moving into the other modules. 
Still, at some point, one would need to design focused studies to have a comprehensive overview of the effects on specific targets, as we did here, or, for example, in the publication https://doi.org/10.1016/j.bbadis.2024.167260.

      Furthermore, there are analyses here, especially in the simulations, that are not directly available from consulting the database; in these cases, one needs to use other resources beyond MAVISp to investigate further the mechanisms underlying the predicted mechanistic indicators. We also included simulations of mutant variants to validate the hypothesis further. And another example is the analysis of the effects on the splicing site that is not covered by a structure-based framework, such as MAVISp, but is still an essential aspect in the analysis of the variants' effects.

      Will the community find this analysis useful?

      The analysis provided here will be helpful, especially for researchers interested in experimental studies of these enzymes, because the study gives them an extensive portfolio of structural data to consult, including a list of variants ranked by class of effect. We originally started designing MAVISp because we realized it was needed by our experimental collaborators, both in cellular biology and in more clinical research, whenever they needed to predict or simulate variants, and we expanded the concept into a robust, versatile framework for broader use. Especially for genes where extensive MAVE data are not available (as in this case), having a set of variants to test experimentally is crucial support, as it provides the potential mechanism behind each predicted damaging variant.

      How many ClinVar VUS could be reclassified using MAVISp data under current ACMG/AMP guidelines?


      The ACMG/AMP variant classification guidelines, to the best of our knowledge, include computational evidence (PP3/BP4) and well-established functional studies (PS3/BS3). Because MAVISp provides multi-level mechanistic predictions derived from structural modelling, these data formally fall within the PP3/BP4 computational category. They cannot be used on their own to reclassify ClinVar VUS under ACMG/AMP rules. Reclassification is also not the goal of our framework, which is to provide a structure-based approach for investigating potentially damaging variants predicted by VEPs. However, the reviewer's suggestion is something we had wanted to explore more generally with MAVISp data but had not yet done for lack of time. We have now checked the requirements for PP3, BP4, and PM1 and developed a classifier for VUS reported in ClinVar, using MAVISp features in accordance with the ACMG/AMP guidelines. Using ClinVar pathogenic and benign variants with at least a review status of 1 for calibration, we obtained thresholds for all MAVISp-supported VEPs (REVEL, AlphaMissense, EVE, GEMME, and DeMaSk). These thresholds were then applied to all ClinVar VUS to determine PP3 (pathogenic-supporting) and BP4 (benign-supporting) evidence. In parallel, we constructed a PM1-like mechanistic evidence category that integrates MAVISp structural stability, protein–protein interactions, DNA interactions, long-range allosteric paths, functional sites, and PTM-mediated regulatory effects. Variants classified as damaging in MAVISp according to such criteria were assigned PM1-like support. These evidence tags provide mechanistic insight to support VUS classification for polymerase proofreading genes. The workflow and complete annotated VUS table are now included in the revised manuscript and in the OSF repository. 
Although these findings cannot formally reclassify variants under ACMG/AMP criteria, they provide prioritization for PS3/BS3 experimental validation and highlight variants that are likely to be reclassified once supporting functional evidence becomes available.
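
      One simple way to derive a score threshold from labelled pathogenic and benign variants is to maximise Youden's J statistic, sketched below. This is a stand-in chosen for brevity and is not necessarily the calibration used for the revised manuscript; likelihood-ratio-based calibrations typically yield separate PP3 and BP4 thresholds with an indeterminate interval between them. The function names and the single-threshold tagging are assumptions.

```python
def youden_threshold(pathogenic_scores, benign_scores):
    """Return the observed score that maximises sensitivity + specificity - 1.

    Higher scores are assumed to indicate a more damaging variant.
    """
    best_threshold, best_j = None, -1.0
    for threshold in sorted(set(pathogenic_scores) | set(benign_scores)):
        sensitivity = sum(s >= threshold for s in pathogenic_scores) / len(pathogenic_scores)
        specificity = sum(s < threshold for s in benign_scores) / len(benign_scores)
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold


def evidence_tag(score, threshold):
    """Assign a supporting-level tag to a VUS from one predictor score."""
    return "PP3" if score >= threshold else "BP4"
```

      In this toy form every VUS receives one of the two tags; adding a second, lower threshold would leave intermediate scores untagged, which is closer to how supporting evidence is usually assigned.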

      How do MAVISp predictions meet calibrated thresholds, as in https://genomemedicine.biomedcentral.com/articles/10.1186/s13073-023-01234-y for the exonuclease domain of POLE and POLD1?


      Mur et al. (Genome Medicine 2023) restricted their ACMG/AMP recommendations to the exonuclease domain (ED) because (i) nearly all known pathogenic germline variants in POLE/POLD1 cluster within the ED, (ii) the ED has a well-characterised structure–function architecture, and (iii) sufficient pathogenic and benign variants exist only within the ED to support empirical calibration. To mirror this approach, we performed the calibration workflow exclusively on ED variants (POLE residues 268–471; POLD1 residues 304–533). For these ED-restricted variants, we recalibrated all MAVISp-derived computational predictors (REVEL, AlphaMissense, EVE, GEMME, DeMaSk) using ClinVar P/LP and B/LB variants. We applied the resulting POLE/POLD1-specific thresholds to all ClinVar VUS within the ED. We also applied our PM1-like structural/functional evidence exclusively to ED variants. The results of this ED-specific analysis are now reported in the revised manuscript (Figure 9, Supplementary Tables S3 and S4), as also explained in the response to the previous question. This ensures that MAVISp predictions are applied in a manner that is consistent with the principles of Mur et al. and ACMG/AMP variant interpretation.
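
      Restricting a variant list to the exonuclease domain amounts to a residue-range filter, sketched below with the ranges quoted above. The function name and the one-letter variant format (for example `F931S`) are assumptions for this illustration, and the label `A300V` used in the usage note is hypothetical.

```python
import re

# Exonuclease-domain residue ranges as stated in the response above.
ED_RANGES = {"POLE": (268, 471), "POLD1": (304, 533)}

_VARIANT = re.compile(r"([A-Z])(\d+)([A-Z])")


def in_exonuclease_domain(gene, variant):
    """Return True if a missense variant such as 'F931S' lies in the ED."""
    match = _VARIANT.fullmatch(variant)
    if match is None:
        raise ValueError(f"unrecognised variant label: {variant!r}")
    start, end = ED_RANGES[gene]
    return start <= int(match.group(2)) <= end
```

      For example, `in_exonuclease_domain("POLE", "F931S")` is false because residue 931 lies outside 268–471, whereas a hypothetical label such as `A300V` would pass the filter for POLE but not for POLD1.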

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The manuscript used the MAVISp framework to characterize 64,429 missense variants (43,415 in POLE, 21,014 in POLD1) through computational saturation mutagenesis. The authors integrate protein stability predictions with pathogenicity predictors to provide mechanistic insights into DNA polymerase variants relevant to cancer predisposition and immunotherapy response. There are discussions of known PPAP-associated variants and somatic cancer mutations in the context of known data and some proposed variants of interest (which are not validated).

      Major comments:

      I was unaware of the MAVISp framework. It concerns me that, albeit this paper has a lot of technical details about the framework, it's not the paper about the framework. I did look into the paper https://www.biorxiv.org/content/10.1101/2022.10.22.513328v5 which keeps being updated (version five now) for three years, but I do not see a peer-reviewed version. It would be unfair of me to peer review the underlying framework of the work but together with the previous comments, I am a bit concerned. For example, the authors are using AlphaFold models to predict DDG values. Delgado et al. (2025, Bioinformatics) explicitly tested FoldX on such models and concluded that "AlphaFold2 models are not suitable for point mutation ΔΔG estimation" after observing a correlation of 0.06 between experimental and calculated values. AlphaFold's own documentation states it "has not been validated for predicting the effect of mutations". Pak et al. (2023, PLOS ONE) showed correlation between AlphaFold confidence metrics and experimental ΔΔG of -0.17. Needless to say that these concerns seriously undermine the validity of a major part of the study. I have to add that this is also true for other technical choices: Several integrated predictors (DeMaSk, GEMME) are outperformed by newer methods according to benchmarking studies (https://www.embopress.org/doi/full/10.15252/msb.202211474). AlphaMissense, while state-of-the-art, shows substantial overcalling of pathogenic variants. Could ensemble meta-predictors (REVEL, BayesDel) improve accuracy?

      Further, I found the web site of the framework, where I looked for the data on these models, rather user unfriendly. Selecting POLD1, POLD2, or POLE tells me I am viewing entries A2ML1, ABCB11, ABCB6 respectively, when I search for POL and then click: these are the first three entries of the table, not what I click on. Displaying the whole table and clicking on POLD1 gets me to POLD1. However, when I selected "Damaging mutations on structure" I get "Could not fetch protein structure model from the AlphaFold Protein Structure Database". Many other features are not working (Safari or Chrome, on a Mac). That is a concern for the usability of the dataset.

      Albeit this is a thorough analysis with the existing tools, and the authors make some sparse attempts to put the mutants classification in context with examples, the work stays descriptive for known effects in literature, or points out that e.g. "further functional and in vitro assays are required". The examples are not presented in a systematic way, or in an appealing manner. Thus, what this manuscript adds to the web site is unclear. It is a description of content, which could be at least more appealing if examples would be more clearly outlined in a conceptual framework, and illustrated more consistently. For example, I read in the middle of page 16 "One such example is the F931S (p.Phe931Ser) variant (Figure 5A)" and then I see "F931 forms contacts with D626, a critical residue for the coordination of Mg2+ which is essential for the correct orientation of the incoming nucleotide (Figure XXX)". Figure 5B is not XXX as this has just many mutations labeled. These issues are very discouraging. I would recommend to put much more effort in examples, put them in clearer paragraphs, and describe results rather than the methodology. Doing both in an intermingled way clearly does not work for me.

      Will the community find this analysis useful? How many ClinVar VUS could be reclassified using MAVISp data under current ACMG/AMP guidelines? How do MAVISp predictions meet calibrated thresholds as in https://genomemedicine.biomedcentral.com/articles/10.1186/s13073-023-01234-y for the exonuclease domain of POLE and POLD1? Such questions might undermine the appeal of the work and could have been looked into.

      Referee cross-commenting

      I agree with all the comments raised by Reviewer 2; she/he elaborates more on some issues I brought up too briefly (e.g. the choice of GEMME), while other issues that I commented on more extensively are also mentioned. I only want to note that the statement "A major limitation of the 3D modeling is this impossibility to include Zn2+ coordination by cysteine residues" is not accurate, as there are many 3D structure prediction and modeling tools that are capable of handling zinc ions coordinated by cysteines.

      While I respect that Referee 1 is clearly more positive and less concerned by methodological issues, I note that while I agree that "The authors identify numerous variants for prioritisation in further studies" (albeit in a sparse and not well organised manner in my view), I am not convinced by the present manuscript that "the effectiveness of integrating various data sources for inferring the mechanistic impact of variants" is really shown: there are hypotheses generated, but none are tested, so the effectiveness of the approach remains to be proven in my view.

      I still view this as a thorough study and a very brave attempt to be integrative and inclusive, but several methodological limitations and the lack of concrete novel insight seriously dampen my enthusiasm.

      Significance

      Strengths:

      A very comprehensive analysis of POLE and POLD1 missense variants (64,429 total), approximately 600-fold more coverage than the ~100 experimentally characterized variants in the PolED database. The multi-layered MAVISp approach provides mechanistic interpretability beyond simple pathogenic/benign classifications, potentially valuable for understanding variant effects on stability, DNA binding, protein interactions, and allosteric communication. The clinical context is highly relevant given POLE/POLD1 roles in disease.

      Limitations:

      The methodological concerns were outlined above. No solid, validated examples of new insight are provided. Examples of how the datasets can actually be used are not well organised, as they appear interleaved with the description of the approach in a confusing manner.

      Advance:

      The advance is primarily technical and database-driven rather than conceptually novel. The scale, multi-dimensional assessment, mechanistic insight and consideration of clinical framework integration are a clear advance.

      Audience:

      The audience is POLD1/POLE experts; I doubt, however, that clinical scientists will find the paper useful, especially given the absence of a dedicated resource, the fact that the entries in the MAVISp web tool are not easily and intuitively accessible, and that clinical requirements (e.g. integration with ACMG/AMP classification frameworks) are not clearly met.

      Reviewer expertise: I am a structural biologist with experience in structure analysis of experimental and predicted models, but no specific expertise or interest in polymerases.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #2

      Evidence, reproducibility and clarity

      This manuscript reports a comprehensive study of POLE and POLD1 annotated clinical variants using a recently developed framework, MAVISp, that leverages scores and classifications from evolutionary-based variant effect predictors. The resource can be useful for the community. However, I have a number of major concerns regarding the methodology, the presentation of the results and the impact of the work.

      On the choice of tools in MAVISp and interpretation of their outputs

      • Based on the ProteinGym benchmark: https://proteingym.org/benchmarks, GEMME outperforms EVE for predicting the pathogenicity of ClinVar mutations, with an AUC of 0.919 for GEMME compared to 0.914 for EVE. Thus, it is not clear to me why the authors chose to put more emphasis on EVE for predicting mutation pathogenicity. It seems that GEMME can better predict this property, without any adaptation or training on clinical labels.
      • Which of the predictors, among AM, EVE, GEMME, and DeMaSk, provide a classification of variants and which ones provide continuous scores? This should be clarified in the text. If some predictors do not output a classification, then evaluating their performance on a classification task is unfair. I would guess that the MAVISp framework sets thresholds on the predicted scores to perform the classification, and it is unclear from reading the manuscript whether these thresholds are optimal or whether using universal cutoff values is pertinent. For instance, for GEMME, a recent study shows that fitting a Gaussian mixture to the predicted score distribution yields higher accuracy than setting a universal threshold (https://doi.org/10.1101/2025.02.09.637326). Along this line, for predictors that do not provide a classification, I am not convinced of the benefit for the users of having access to only binary labels, instead of the continuous scores. The users currently do not have any idea of whether each variant is borderline (close to the threshold) or confident (far from the threshold).
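      To make the point about continuous scores concrete, the following is a minimal sketch of a two-component Gaussian mixture fitted to a score distribution, written in plain Python. It is not the MAVISp, GEMME, or cited-study implementation: the scores are synthetic, the function names are hypothetical, and it assumes that lower scores indicate more damaging variants. The posterior probability it reports is one way a user could tell a borderline variant from a confident one.

```python
import math
import random


def normal_pdf(x, mu, var):
    """Density of a 1D Gaussian with mean mu and variance var."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def prob_low_component(x, w, mu, var):
    """Posterior probability that score x belongs to component 0."""
    p0 = w[0] * normal_pdf(x, mu[0], var[0])
    p1 = w[1] * normal_pdf(x, mu[1], var[1])
    total = p0 + p1
    return p0 / total if total > 0 else 0.5


def fit_two_gaussians(scores, n_iter=100):
    """Fit a two-component 1D Gaussian mixture by expectation-maximisation.

    Returns (weights, means, variances), ordered so that component 0
    has the lower mean.
    """
    xs = sorted(scores)
    half = len(xs) // 2
    # Initialise the means from the lower and upper halves of the sorted scores.
    mu = [sum(xs[:half]) / half, sum(xs[half:]) / (len(xs) - half)]
    var = [1.0, 1.0]
    w = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of component 0 for every score.
        r0 = [prob_low_component(x, w, mu, var) for x in xs]
        # M-step: update weight, mean and variance of each component.
        for k in range(2):
            r = r0 if k == 0 else [1.0 - v for v in r0]
            nk = sum(r)
            mu[k] = sum(ri * x for ri, x in zip(r, xs)) / nk
            var[k] = max(sum(ri * (x - mu[k]) ** 2 for ri, x in zip(r, xs)) / nk, 1e-6)
            w[k] = nk / len(xs)
    order = sorted(range(2), key=lambda k: mu[k])
    return [w[k] for k in order], [mu[k] for k in order], [var[k] for k in order]


# Synthetic scores (assumption): a "damaging-like" mode and a "neutral-like" mode.
rng = random.Random(0)
scores = [rng.gauss(-4.0, 1.0) for _ in range(300)]
scores += [rng.gauss(0.0, 0.8) for _ in range(300)]

w, mu, var = fit_two_gaussians(scores)
for s in (-4.5, -2.0, 0.3):
    # Values near 0.5 flag a borderline variant; values near 0 or 1 a confident call.
    print(s, round(prob_low_component(s, w, mu, var), 3))
```

      Reporting such a posterior alongside the binary label would let users see how far each variant lies from the decision boundary, rather than relying on a single universal cutoff.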

      On the presentation and impact of the results

      • While reading the manuscript, it is difficult to grasp the main messages. The text contains abundant discussion about the potential caveats of the framework, the care that should be taken in interpreting the results and the dependency on the clinical context. Although these aspects are certainly important, this extensive discussion (spread throughout the manuscript) obscures the results. Moreover, the way variants are catalogued throughout the text makes it difficult to grasp key highlights. The reader is left unsure about whether the framework can actually help clinical practitioners.
      • In many cases, the authors state that experimental validation is required to validate the results. Could they be more explicit on the experimental design and the expected outcome?
      • AlphaMissense seems to have a tendency to over-predict pathogenicity. Could the authors comment on that?

      On specific variants

      • The mention of H1066R, H1068, and D1068Y is very confusing. There seems to be a confusion between residue numbers and amino acid types.
      • A major limitation of the 3D modeling is this impossibility to include Zn2+ coordination by cysteine residues. This limitation holds for both POLE and POLD1. Could the authors comment on the implication of this limitation for interpreting the mechanistic impact of variants? In particular, several variants reported in the study consist of gains of cysteines. The authors discuss the potential impact of some of these mutations on structural stability, but not their impact on Zn coordination or the formation of disulphide bridges.
      • MAVISp does not identify any mechanistic effect for a substantial portion of variants labelled as pathogenic. Could the authors comment on this point? Moreover, none of the variants discussed is associated with an allosteric effect. Is this expected?

      Referee cross-commenting

      I agree with the comments and overall assessment of Reviewer 3. I would like to take this opportunity to clarify that I did not mean that 3D modelling of zinc ion coordination by Cys is impossible in general. I wanted to emphasise that the exclusion of some zinc-binding sites in the present study is a limitation.

      Significance

      The work's strength is its comprehensive analysis. The weaknesses are a methodology that does not seem mature and outputs that are still difficult to predict. In addition, it seems that a lot of expertise and manual curation based on metadata (phenotype, functional state...) is needed for users to benefit from the analysis. The manuscript reads a bit like a catalogue, from which it is difficult to understand to what extent the results are significant and impactful.

      I have expertise in computational modelling, protein sequence-structure-function relationship and prediction of variant effects.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #1

      Evidence, reproducibility and clarity

      This study "Interpreting the Effects of DNA Polymerase Variants at the Structural Level" comprises an in-depth analysis of protein sequence variants in two DNA polymerase enzymes with particular emphasis on deducing the mechanistic impact in the context of cancer. The authors identify numerous variants for prioritisation in further studies, and showcase the effectiveness of integrating various data sources for inferring the mechanistic impact of variants.

      All the comments below are minor; I think the manuscript is exceptionally well written.

      • The main body of the manuscript places almost as much emphasis on usage of the MAVISp tool as on analysis of the polymerase variants. I don't think this is an issue, as an illustrated example of proper usage is very handy. I do, however, think that the title and abstract should better reflect this emphasis, e.g. "Interpreting the Effects of DNA Polymerase Variants at the Structural Level with MAVISp". This would make the paper more discoverable to people interested in learning about the tool.
      • Figure 1. I don't believe there is much value in showing the intersection between the datasets (especially since the in-silico saturation dataset intersects perfectly with all the others). As an alternative, I suggest a flow-chart or similar visual overview of the analysis pipeline.
      • Please note in the MAVISp dot-plot figure legends that the second key refers to the colour of the X-axis labels rather than the dots.
      • There is a missing figure reference (Figure XXX) at the bottom of page 16.

      Significance

      In addition to identifying a large number of variants in POLE and POLD1 that are good candidates for further investigation, this study acts as a showcase for how evidence from different sources can be combined in a context-dependent manner (in this case, cancer). In terms of limitations, the lack of structures for POLD1 proved a hindrance several times in this study, obscuring some potentially pathogenic variants.

      This manuscript bears some similarity to the MAVISp paper (10.1101/2022.10.22.513328v6), which gives a number of brief examples of analyses that can be conducted with the pipeline. This paper differs in that it shows a full start-to-end analysis and goes into considerable detail about possible mechanisms of pathogenesis. The mechanistic detail and potential clinical relevance set this paper apart.

      This paper would most likely interest those involved in VUS interpretation, both in a clinical and research capacity. Those specialised in DNA polymerase enzymes would also likely find this paper of interest. This paper may also be useful for future research into the impact of the highlighted variants.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #3

      Evidence, reproducibility and clarity

      Wiesner et al. use a combination of state-of-the-art imaging techniques to visualize the exocytosis of vesicles labeled with VAMP2-pHluorin. This work builds on previous findings of the group and aims to quantify vesicle exocytosis along the axon and relate it to the location of actin rings (MPS). Exocytosis indeed occurs in axonal and dendritic regions; however, it does so at a significantly lower rate than in presynaptic terminals. Exceptionally, the AIS shows a remarkably high exocytosis rate compared to other axonal regions. Perturbation of the MPS with swinholide increases the nonsynaptic release of VAMP2-pHluorin. The spots supporting exocytosis along the axon lack spectrin but are spatially segregated from regions used for CCP formation.

      This work takes advantage of last-generation optical microscopy approaches to provide a quantitative analysis of exocytosis along the axon in nonsynaptic regions. Findings are solid, and the segregation of spots supporting exocytosis and endocytosis is intriguing. However, it is unclear to me whether the results obtained reflect a general mechanism or if they are biased by the experimental approach. Specifically, I have these major comments:

      1) Use of the term "spontaneous." I understand that the term "spontaneous" refers to exocytosis that "just" occurs. But exocytosis cannot be evaluated without considering electrical activity. VAMP2-pHluorin has been extensively used to investigate neurotransmitter release. Since spontaneous neurotransmitter release occurs in the absence of action potentials, it is important to know how the rates of exocytosis are affected after incubation with TTX. These experiments are necessary to show whether vesicles are indeed released spontaneously or whether they require the presence of action potentials.

      2) The rates of spontaneous exocytosis are expressed in μm²/hour because these events are quite infrequent. According to the methods section, typical recording times are 5 minutes or less (lines 522-533). It would be more appropriate to express values per minute to establish comparisons with other works. The goal is to understand what sort of vesicles are being exocytosed. This is a key question that must be addressed before exploring other aspects such as the relationship to spectrin or endocytosis. If the authors can provide more information about the types of vesicles being exocytosed, this work becomes very relevant. Since I am aware of the technical difficulties associated with this, some suggestions are: use a VAMP2-APEX2-pHluorin construct and confirm vesicle identity by EM, or use iGluSnFR to confirm neurotransmitter release along axons.
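      As a rough illustration of the unit change suggested in comment 2, the sketch below converts a rate to a per-minute value and estimates how many events a short recording would capture. It assumes the reported unit means events per μm² per hour; the rate and area used are hypothetical placeholders, not values from the manuscript.

```python
def per_hour_to_per_minute(rate_per_um2_per_hour):
    """Convert events/um^2/hour to events/um^2/minute."""
    return rate_per_um2_per_hour / 60.0


def expected_events(rate_per_um2_per_hour, area_um2, minutes):
    """Expected number of events in a recording of the given area and duration."""
    return per_hour_to_per_minute(rate_per_um2_per_hour) * area_um2 * minutes


# Hypothetical numbers for illustration only.
rate = 0.6    # events per um^2 per hour (assumed)
area = 100.0  # imaged membrane area in um^2 (assumed)

print(per_hour_to_per_minute(rate))       # rate per um^2 per minute
print(expected_events(rate, area, 5.0))   # events expected in a 5-minute recording
```

      Expressed this way, it becomes easier to judge whether a 5-minute recording samples enough events to support a rate estimate.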

      Minor comments:

      1) Since culture conditions promote synapse formation, could the spontaneous exocytosis found along axons be related to synapse formation? This aspect could be tested by co-staining with PSD-95 after fixation.

      Significance

      Significance:

      This is a state-of-the-art study of the cell biology of neurons. The work demonstrates that vesicles are exocytosed along the axon and describes the molecular characteristics of the cytoskeletal elements involved.

      General assessment:

      The main strength of the study is the quality and diversity of imaging approaches used. The main limitation is defining the type of vesicle being exocytosed. It is important to know if vesicles imaged contain neurotransmitters.

      Advance:

      This paper is technically sound and provides interesting new concepts about how exocytosis occurs in nonsynaptic regions.

      Audience:

      This paper is appropriate for an audience familiar with cell biology or cellular and molecular neuroscience.