10,000 Matching Annotations
  1. Dec 2025
    1. You'll notice that the links here point to much older, smaller numbers. 1013 3667. These are ideas from thousands of entries ago. Ideas I'm still grappling with today. That's what I really love about my link book. The way old and new ideas collide on paper. That's where the real creative sparks happen. And that's why I say the linkbook is an innovation engine. It doesn't just store ideas. It helps them grow, interact, and evolve.<br /> —Michael Herrick [2:45](https://youtu.be/30_v2FHJ9e4?si=HclrmkAMnd6LVca_&t=165)

      Michael Herrick noticing what others have seen in the past. He doesn't give the idea a new name like he's done with "Linkbook" for commonplacing or various other iterations.

    1. .gdprSupport( builder -> builder.uouAndUserSupport() // ← Use this )

      Wrong. Use `.gdpr(gdprSpec -> gdprSpec.uouAndUserSupport(AppDefId, contextBuilder.secrets().appSecret()))` instead. Before the change it was `.gdpr(gdprSpec -> gdprSpec.userSupportOnly(AppDefId, contextBuilder.secrets().appSecret()))`.

    2. Add a Translator

      Add a note to put this in a new translators file, because in the class mappers the transformers aren't yet built, so the reference to the transformer is unknown.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2025-03091

      Corresponding author(s): Chia-Tsen, Tsai, Liuh-Yow Chen

      1. General Statements [optional]

      We thank the reviewers for their valuable time and constructive feedback on our study, which ultimately improved our manuscript. Herein, we provide a detailed response to each of the reviewers' comments, supported by new data that have been integrated into both the main text and the supplementary figures.

      2. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary This manuscript builds upon the authors' prior findings that targeting COUP-TF2 to TRF1 induces ALT-associated phenotypes and G2-mediated synthesis in telomerase-immortalised BJT human fibroblasts. In this study, the authors show that telomere-coupled COUP-TF2 promotes H3K9me3 enrichment in these cells, and that this effect is blocked by TRIM28 depletion. Furthermore, TRIM28 depletion also suppresses the formation of ALT phenotypes in VA13 ALT cells. Given that TRIM28 has been implicated in regulating H3K9me3 deposition via SETDB1, and has been reported to co-purify with TR2 and TR4 (though not previously in the context of ALT telomeres), these findings add mechanistic depth to how heterochromatin regulators contribute to ALT activity. Overall, the manuscript's conclusions are generally supported by the presented data, but several aspects require clarification or additional experimental validation.

      The authors report a modest reduction in telomeric H3K9me3 following COUP-TF2 and TR4 depletion in U-2 OS and VA13 cells (Figure 1B). To strengthen the claim that these orphan receptors specifically regulate H3K9me3, the authors should 1) Assess additional heterochromatic histone marks (e.g., H4K20me3) at telomeres, 2) Normalize telomeric signals to both parental histone levels and input, and 3) Evaluate whether global H3K9me3 levels also decrease upon receptor depletion.

      Response: We appreciate the reviewer's suggestion. To address the concern regarding specificity, we assessed H3K27me3 and H4K20me3 levels upon COUP-TF2/TR4 depletion and found no significant changes (Supplementary Fig. 1C). Furthermore, we reprocessed the telomeric ChIP data, normalizing to both input DNA and parental histone levels (Figure 1B). This refined analysis reinforces our original conclusion. Finally, Western blot analysis showed no significant changes in global H3 or H3K9me3 levels upon COUP-TF2/TR4 depletion (Figure 1A). Altogether, these results further support the specificity of COUP-TF2/TR4 for H3K9me3 at telomeres. We have revised the main text (page 3) and updated Figure 1A, 1B, and Supplementary Figure 1C for these changes.
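The double normalization mentioned in this response (to input DNA and to parental histone levels) can be illustrated with a minimal sketch. It assumes quantified telomeric signal intensities, an input sample that represents a known fraction of the chromatin, and that "parental histone" means a total H3 ChIP from the same chromatin. The function names and all numbers are hypothetical and are not taken from the authors' pipeline.

```python
def percent_input(ip_signal: float, input_signal: float, input_fraction: float) -> float:
    """Express a ChIP signal as a percentage of the total input chromatin.

    input_fraction is the share of chromatin loaded as input (e.g. 0.01 for 1%).
    """
    if input_signal <= 0 or not 0 < input_fraction <= 1:
        raise ValueError("input_signal must be > 0 and input_fraction in (0, 1]")
    # Scale the input up to 100% of the chromatin, then take the ratio.
    return 100.0 * ip_signal * input_fraction / input_signal


def histone_normalized(mark_pct_input: float, h3_pct_input: float) -> float:
    """Divide the mark's % input by total H3 % input to correct for histone occupancy."""
    if h3_pct_input <= 0:
        raise ValueError("h3_pct_input must be > 0")
    return mark_pct_input / h3_pct_input


# Hypothetical intensities: (H3K9me3 IP, H3 IP, 1% input) per condition.
control = histone_normalized(percent_input(200, 50, 0.01), percent_input(400, 50, 0.01))
knockdown = histone_normalized(percent_input(120, 50, 0.01), percent_input(400, 50, 0.01))
relative_h3k9me3 = knockdown / control  # knockdown level relative to control
```

Normalizing to H3 separates a genuine loss of the modification from a loss of nucleosomes, which is the distinction the reviewer asked for.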

      Most experiments explore chromatin changes in telomerase-positive BJT fibroblasts (Figure 2, Figure 4D). It remains unclear whether similar manipulations in ALT cells yield consistent effects, which would give a broader context for ALT phenotype induction. Are ALT phenotypes similarly induced in ALT cells? Does altered chromatin status affect telomere length or telomerase recruitment/activity? Can these pathways drive ALT phenotypes in non-immortalised cells?

      Response: We appreciate the reviewer's suggestion and have explored chromatin changes in telomerase-negative BJ and IMR90 primary fibroblasts (Supplementary Fig. 2C, D). Consistent with the results in BJ-telomerase cells, we found that VP64-TRF1 decreased telomeric H3, H4, and H3K9me3 levels, whereas KRAB-TRF1 increased these marks. Moreover, expression of either VP64-TRF1 or KRAB-TRF1 was sufficient to induce APB formation and ATDS in BJ and IMR90 cells. These results indicate that chromatin changes at telomeres can drive ALT phenotypes in both primary and telomerase-immortalized fibroblasts.

      Additionally, regarding whether chromatin alteration affects telomere length or telomere regulation, we have explored telomere length changes in BJT cells expressing vector, TRF1, KRAB-TRF1 or VP64-TRF1. The telomere restriction fragment (TRF) assay showed that cells under all conditions maintained stable telomere lengths through 30 days in culture (data shown below), suggesting that the chromatin alterations may not impact telomerase recruitment or activity. As this result is beyond the scope of the current study, these data are shown only here in the rebuttal letter for reference and are not included in the revised manuscript.

      Moreover, according to the reviewer's suggestion, we also carried out VP64-TRF1 or KRAB-TRF1 expression experiments in WI38-VA13/2RA cells that express high TERRA and have altered chromatin structures. Our data revealed that VP64-TRF1 suppresses telomere H3K9me3 and ALT activity, while KRAB-TRF1 increases both (Supplementary Figure 2E), suggesting an association of heterochromatin state with ALT activation in WI38-VA13/2RA cells.

      The observation that VP64-TRF1 reduces ALT activity in WI38-2RA/VA13 cells contrasts with findings in BJT cells. It is worth noting that studies from the Azzalin and Lingner groups demonstrated that experimentally induced TERRA expression promotes ALT activity in ALT and non-ALT cells (PMID: 36122232, PMID: 40624280). Therefore, we propose that TERRA upregulation by VP64-TRF1 may contribute to the ALT induction observed in BJT cells (Supplementary Figure 2A, B), whereas the ability of VP64-TRF1 to suppress ALT activity in WI38-2RA/VA13 cells could be attributed to the reduction of telomere H3K9me3 and heterochromatin loss. Importantly, KRAB-TRF1 concurrently enhanced histone H3, H4, and H3K9me3 occupancy and ALT activity in both human fibroblasts and ALT cells. Altogether, these results support the notion that heterochromatin formation triggers ALT.

      We also examined TRIM28 recruitment to telomeres by telomere-ChIP and found that COUP-TF2LBD-TRF1 promotes TRIM28 telomere enrichment in BJ, IMR90 and U2OS, similar to BJT cells (Supplementary Fig. 5A-D). Moreover, in ALT cell lines WI38-2RA/VA13, U2OS, and Saos-2, depletion of COUP-TF2 or TR4 reduced TRIM28 telomeric association (Figure 4A, B). Together, the data from human fibroblasts and ALT cells support a role of orphan NRs in recruiting TRIM28 to ALT telomeres.

      We acknowledge the reviewer's suggestions, which allowed us to clarify and strengthen the conclusions. The corresponding data are presented in Figure 4A-B and Supplementary Figure 2B-D and 5E-F, and the main text has been modified on pages 4-6 in the revised manuscript.

      When referring to Figure 3G, the authors state that telomeric H3K9me3 was abolished upon depleting TRIM28 from the U2OS and WI38-VA13/2RA cells. Abolished is a strong word for a 50% decrease, and this sentence should be revised. The reduction appears greater than that seen with COUP-TF2/TR4 depletion. Are the effects additive? If so, might TRIM28 act, at least in part, independently of COUP-TF2/TR4?

      Response: We appreciate the reviewer's comments. We have revised the manuscript on page 5, replacing "abolished" with "significantly reduced" to better describe the effect of TRIM28 depletion on telomeric H3K9me3. To further investigate the interplay between TRIM28 and orphan NRs in regulating telomeric H3K9me3, we conducted single and combined knockdown experiments in U2OS and WI38-VA13/2RA cells, followed by telomere-ChIP analysis (Supplementary Figures 4D, E). Our results showed that single depletion of either orphan NRs or TRIM28 led to a similar decrease in telomeric H3K9me3, and that the combined knockdown did not result in any further reduction. These findings support an epistatic interaction between orphan NRs and TRIM28 in the regulation of telomeric H3K9me3. We have expanded on this interpretation in the main text (page 6) and included the relevant data in Supplementary Figures 4D, E.
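The reasoning behind this epistasis argument can be made explicit with a small sketch. It assumes a simple multiplicative null model: if the two factors acted independently, the fractions of H3K9me3 signal remaining after each single knockdown would multiply in the double knockdown. That model, the tolerance, and every number below are illustrative assumptions, not values or statistics from the manuscript.

```python
def expected_if_independent(remaining_a: float, remaining_b: float) -> float:
    """Fraction of signal expected after a double knockdown under independent action."""
    return remaining_a * remaining_b


def looks_epistatic(remaining_a: float, remaining_b: float,
                    remaining_double: float, tolerance: float = 0.1) -> bool:
    """True when the double knockdown is no stronger than the stronger single knockdown.

    All arguments are fractions of the control signal (1.0 = unchanged).
    tolerance is an arbitrary allowance for experimental noise.
    """
    strongest_single = min(remaining_a, remaining_b)
    return remaining_double >= strongest_single - tolerance


# Hypothetical values: each single knockdown leaves 50% of the control signal.
independent_prediction = expected_if_independent(0.5, 0.5)  # further drop expected
same_pathway = looks_epistatic(0.5, 0.5, remaining_double=0.5)
separate_pathways = looks_epistatic(0.5, 0.5, remaining_double=0.25)
```

A real analysis would test the difference between the single and combined knockdowns statistically across replicates rather than apply a fixed tolerance.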

      VA13 cells consistently exhibit stronger effects than U-2 OS (e.g., Figures 1 and 3). This discrepancy could be linked to the high content of variant repeats in VA13 cells. The authors should assess whether variant repeat content underlies the differential response. Repeating key experiments in additional ALT lines with varied repeat compositions would be informative.

      Response: We appreciate the reviewer's suggestion and have extended our analyses to two additional ALT osteosarcoma cell lines, SAOS-2 and G292. In both lines, depletion of orphan NRs resulted in a consistent decrease in telomeric H3K9me3 levels (Supplementary Figures 1A, B). We also examined the contribution of TRIM28 to telomeric H3K9me3 in these cells. siRNA-mediated depletion of TRIM28 in SAOS-2 and G292 cells similarly caused a significant reduction in telomeric H3K9me3 and ALT phenotypes (Supplementary Figure 4A-C). Together, these results from four ALT cell lines confirm that orphan NRs and TRIM28 promote telomeric H3K9me3 formation in ALT cells. We have modified the main text on pages 3 and 5-6 for these results.

      In line with the previous point, it would be useful to show whether TRIM28 telomeric enrichment is affected by COUP-TF2/TR4 depletion in U2OS cells (Figure 4C). To improve confidence in these findings, the authors should perform telomeric ChIP assays, especially with the COUP-TF2^LBDΔAF2-TRF1 mutant construct.

      Response: Following the reviewer's suggestion, we performed telomere-ChIP assays to assess TRIM28 enrichment at telomeres upon expression of COUP-TF2LBD-TRF1 and its ΔAF2 mutant in U2OS cells. Consistent with our immunofluorescence results, telomere-ChIP revealed that COUP-TF2LBD-TRF1 expression promotes TRIM28 telomere enrichment, while the AF2 deletion mutant failed to recruit TRIM28 (Supplementary Figure 5D). We have modified the main text on page 6 for this result.

      The immunoprecipitation experiments showing TRIM28 association with orphan receptors should include benzonase treatment to rule out DNA-mediated co-association (Figure 4F-G).

      Response: We appreciate the reviewer's suggestion. To address the possibility of DNA-mediated interactions, we pre-incubated cell lysates with benzonase prior to Co-IP (Page 7). This treatment did not disrupt the association between TRIM28 and COUP-TF2 or TR4 in WI38-VA13/2RA and BJT cells (Supplementary Figures 5E-G), indicating a DNA-independent interaction. We have modified the main text on page 7 for this result.

      The study would benefit from a direct assessment of whether COUP-TF2LBDΔAF2-TRF1 fails to induce ALT phenotypes in BJT fibroblasts.

      Response: We thank the reviewer for this suggestion. As the role of the COUP-TF2 AF2 domain in ALT induction in BJT fibroblasts has recently been thoroughly investigated and published by our group (PMID: 38752489), we have directed the current study towards a more detailed mechanistic question. Specifically, we have carried out experiments to further demonstrate that COUP-TF2 recruits TRIM28 to telomeres via its AF2 domain in both human fibroblasts and ALT cells (Supplementary Figures 5A-D). On Page 6, we have modified the main text for these results and included a citation to our previous publication to provide the necessary background.

      The experiments performed in Figure 5E-H lack a vector-only + siCtrl control. In Figure 5E, the observation that APB formation is restored in siTRIM28 + Vector-treated cells is unexpected. The authors should address this finding and clarify whether this reflects biological noise or a compensatory effect.

      Response: We thank the reviewer for this suggestion. We have repeated the experiments with a revised design, ensuring a consistent vector background across all groups (Vector + siCtrl, Vector + siTRIM28, TRIM28 WT + siTRIM28, and TRIM28 ΔRBCC + siTRIM28) (Figure 5E-H). This improved design confirms that expression of wild-type TRIM28, but not TRIM28 ΔRBCC, restores APB formation, ATDS, ssTeloC, and telomeric H3K9me3 levels in TRIM28-depleted cells. The updated dataset also resolves the previous unexpected increase in APB formation in the siTRIM28 + Vector condition, which is now excluded. We have modified the main text accordingly on page 8.

      Reviewer #1 (Significance (Required)):

      This work offers valuable mechanistic insight into how COUP-TF2 and TRIM28 coordinate to regulate heterochromatin deposition and ALT phenotype formation. It adds to the growing understanding of chromatin-mediated telomere regulation. What remains unclear is how important this interaction is for ALT maintenance, as H3K9me3 is only moderately altered upon TRIM28 depletion in ALT cells. Depletion of TRIM28 has been shown previously to induce APB formation and telomere elongation in U-2 OS ALT cells (Wang et al., 2021), the opposite to what the authors observed here in VA13 cells (Figure 5E-H). Clarifying whether these differences are variant repeat-dependent, or reflect intrinsic features of specific ALT cell lines, would substantially elevate the study's impact.

      Response: We appreciate the reviewer's recognition of the significance of our work in elucidating the molecular basis of ALT regulation through COUP-TF2-TRIM28-mediated heterochromatin formation. In response to the reviewer's insightful comment regarding the importance of this interaction for ALT maintenance, we have expanded our study. We now include data from three additional primary human fibroblasts and a total of four ALT cancer cell lines (Figure 4, Supplementary Figure 4). These new data further strengthen the conclusion that TRIM28 promotes telomeric H3K9me3 and ALT-associated features. Furthermore, our rescue experiments support the model that the ALT-promoting function of TRIM28 in both fibroblasts and ALT cell lines is mediated through its physical interaction with COUP-TF2 (Supplementary Figure 5). We believe these results provide a solid foundation for demonstrating a cooperative role of COUP-TF2 and TRIM28 in ALT maintenance, and address the reviewer's concern regarding the generalizability of our findings.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary This manuscript investigates the role of orphan nuclear receptors (ORs), specifically COUP-TF2 and TR4, in promoting H3K9me3 enrichment at ALT telomeres via recruitment of TRIM28 (KAP1). The authors propose that the AF2 domain of COUP-TF2, located in its ligand-binding domain (LBD), is sufficient to recruit TRIM28 to telomeres. This, in turn, promotes heterochromatinization and induces hallmarks of the Alternative Lengthening of Telomeres (ALT) pathway, including APB formation and telomeric DNA synthesis outside of S-phase. This study addresses one important and unresolved question in the field: by what mechanism is the heterochromatic state established at ALT telomeres? Another timely question, not addressed here is: how is heterochromatin (specifically H3K9me3) functionally linked to ALT? The findings are potentially novel and mechanistically insightful. However, key elements of the study, particularly the central tethering experiments, require stronger quantification and clarity. Additional mechanistic tests and literature adjustments would also improve the manuscript.

      Major Concerns

      Central TRF1-COUP-TF2-LBD result lacks quantification and clarity: the tethering of COUP-TF2's LBD to telomeres via TRF1 is a core result of the paper. This experiment demonstrates that this domain is sufficient to induce weak H3K9me3 enrichment and ALT features (APBs and ATDS). However, the supporting ALT data are presented only in Supplementary Figures S1A and S1B, and are not quantified. These data should be quantified with appropriate statistics and moved to a main figure.

      Response: The current study builds upon our recent publication (PMID: 38752489), which comprehensively analyzed ALT induction (APBs, ATDS, C-circles, T-SCEs) by orphan NR-TRF1 expression (COUP-TF1, COUP-TF2, TR2, and TR4; full-length and LBD) in various human fibroblast cell lines. To avoid potential duplicate publication concerns, particularly regarding APB and ATDS results for COUP-TF2LBD-TRF1 in BJT cells, we have put the data with revised quantification results in Supplementary Figure 1D-E. We will follow the reviewer's suggestion and move this data to the main figures if the editors agree.

      Furthermore, the broader functional implication is not explored. Does this tethering induce a fully functional ALT pathway? For example, can telomerase knockout cells expressing TRF1-COUP-TF2-LBD maintain long-term proliferation? Such evidence would significantly strengthen the impact of the study.

      Response: While COUP-TF2LBD-TRF1 expression rapidly induces key ALT phenotypes, we acknowledge that this alone is insufficient to directly promote telomere lengthening and long-term proliferation of primary fibroblasts, as discussed in Gaela et al., 2024 (PMID: 38752489). However, our ongoing, unpublished studies indicate that COUP-TF2LBD-TRF1 can drive immortalization of primary BJ fibroblasts expressing SV40LT by promoting ALT-mediated telomere elongation (Attached Figure A-C; additional data not shown). These findings suggest that COUP-TF2 may cooperate with additional genetic or epigenetic alterations to facilitate ALT development. We appreciate the reviewer's recognition of this critical aspect. As our immortalization study is still in progress and will be the subject of a separate manuscript, we hope the reviewer understands that the data shown in this letter will not be included in the revised manuscript.

      Chromatin manipulation experiments lead to ambiguous conclusions: the authors propose that telomeric heterochromatin promotes ALT activity, but their own experiments (e.g., Figure 2) show that both heterochromatin-inducing (KRAB-TRF1) and euchromatin-inducing (VP64-TRF1) tethering can trigger ALT-like features. This makes it difficult to conclude that heterochromatin is specifically required.

      To clarify:

      -Did the authors express TRF1-VP64 in an ALT cell line? According to their model, this should suppress ALT activity.

      -More broadly, do chromatin alterations per se (regardless of direction) trigger ALT features? Clarifying these points is important for interpretation.

      Response: In response to the reviewer's suggestion, we expressed VP64-TRF1 and KRAB-TRF1 in WI38-2RA/VA13 cells to investigate telomere chromatin changes and ALT activity. Our data indeed revealed that VP64-TRF1 suppresses telomere H3K9me3 and ALT activity, while KRAB-TRF1 increases both (Supplementary Figure 2E), suggesting that heterochromatin triggers ALT activation.

      The observation that VP64-TRF1 reduces ALT activity in WI38-2RA/VA13 cells contrasts with findings in BJT cells. Of note, studies from the Azzalin and Lingner groups demonstrated that experimentally induced TERRA expression promotes ALT activity in ALT and non-ALT cells (PMID: 36122232, PMID: 40624280). Therefore, we propose that TERRA upregulation may contribute to the ALT induction observed in BJT cells (Figure 2A, Supplementary Figure 2A, B). Given the high basal TERRA expression, expression of VP64-TRF1 and KRAB-TRF1 did not result in a consistent change in TERRA levels (Supplementary Figure 2F). Thus, the ability of VP64-TRF1 to suppress ALT activity in WI38-2RA/VA13 cells could be attributed to the reduction of telomere H3K9me3 and heterochromatin loss. Altogether, our results support the hypothesis that heterochromatin formation, rather than euchromatin, triggers ALT.

      We thank the reviewer for the insightful comments, which have allowed us to resolve the ambiguity of our results and strengthen the notion that heterochromatin formation promotes ALT. We think that the heterochromatin features and high TERRA expression represent two independent, coexisting mechanisms within ALT cancer cells to guarantee ALT activation. We have modified the main text on pages 4-5 accordingly.

      TERRA downregulation contradicts current models: while TERRA upregulation is often observed in ALT cells and is thought to contribute to replication stress and recombination at telomeres, the authors show that TRF1-KAP1 expression induces ALT features while TERRA is downregulated. This observation is not addressed in the manuscript. The authors should at least discuss this discrepancy and propose whether this reflects a cell line-specific phenomenon or a decoupling between TERRA levels and ALT induction in this context.

      Response: We thank the reviewer for the comments. As mentioned above (Major Concerns 2), heterochromatin formation and TERRA expression are two mechanisms that can independently promote ALT. Unlike ALT cell lines that have high TERRA levels, human BJ fibroblasts have low TERRA levels that do not induce ALT phenotypes. Thus, the effect of KRAB-TRF1 on ALT induction in BJ cells could be attributed to heterochromatin formation rather than to the reduction of TERRA. We have modified the main text on page 5 to clarify the result.

      Minor Comments

      Introduction (p. 3): The authors cite Episkopou et al. as showing increased H3K9me3 at ALT telomeres. This is incorrect; that paper suggests the opposite. The first study to clearly demonstrate H3K9me3 enrichment at ALT telomeres is Cubiles et al., 2018 and should be cited instead. Results (p. 5, first paragraph): The manuscript should cite Déjardin and Kingston, 2009 as the first to report COUP-TF2 and TR4 localization at ALT telomeres. The studies by Conomos et al., 2012 and Gaela et al., 2024 build on this prior evidence. Please also include this citation in the bibliography.

      Response: We appreciate the reviewer's careful reading and thank them for pointing out these errors. The citation errors on pages 2 and 3 have now been corrected.

      Broader relevance of TRIM28-OR interaction: TRIM28 is a complex protein with roles in SUMOylation, heterochromatin formation, and transcriptional initiation/elongation regulation. The authors should explore whether similar COUP-TF2/TRIM28 interactions occur at other genomic loci. Public ChIP-seq data for COUP-TF2, TR4, and TRIM28 could be mined to investigate whether these factors co-occupy regulatory regions elsewhere in the genome, and how this relates to gene expression states.

      Response: We appreciate the reviewer's insightful suggestion regarding a potential genome-wide functional interaction between TRIM28 and COUP-TF2. To address this, we analyzed public ENCODE ChIP-seq data from K562 cells (TRIM28: ENCSR000BRW; COUP-TF2: ENCSR000BRS). This analysis revealed 3,326 co-binding sites for TRIM28 and COUP-TF2 (Attached Figure A). Interestingly, these co-binding sites were preferentially located within gene bodies (70.7%) and promoter regions (4.3%) (Attached Figures B-D), suggesting a potential cooperative role in gene regulation that aligns with our observation of physical interaction. While the finding is intriguing, a full exploration is beyond the scope of this manuscript, which focuses on ALT telomere regulation. We consider this an important insight and have briefly noted it in the discussion (p. 9), although the corresponding analyses are not included in the revised manuscript.
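A co-binding count of the kind reported here comes down to an interval-overlap test between two peak sets. The sketch below shows one simple way to compute it, assuming BED-style half-open peak coordinates and counting a TRIM28 peak as co-bound when it overlaps any COUP-TF2 peak on the same chromosome. The overlap criterion, the function name, and the toy peaks are assumptions; the response does not describe the authors' actual method.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Peak = Tuple[str, int, int]  # (chromosome, start, end), half-open as in BED


def count_cobound(peaks_a: List[Peak], peaks_b: List[Peak]) -> int:
    """Count peaks in peaks_a that overlap at least one peak in peaks_b."""
    by_chrom: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for chrom, start, end in peaks_b:
        by_chrom[chrom].append((start, end))

    count = 0
    for chrom, start, end in peaks_a:
        # Half-open intervals overlap when each one starts before the other ends.
        if any(s < end and start < e for s, e in by_chrom.get(chrom, [])):
            count += 1
    return count


# Toy peak sets (hypothetical coordinates, not ENCODE data).
trim28_peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
coup_tf2_peaks = [("chr1", 150, 250), ("chr2", 150, 300), ("chr3", 0, 100)]
n_cobound = count_cobound(trim28_peaks, coup_tf2_peaks)
```

This pairwise scan is quadratic per chromosome and is only meant to show the criterion. Genome-scale peak sets are normally intersected with a sorted sweep or an interval tree.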

      Reviewer #2 (Significance (Required)):

      This work contributes mechanistic insight into how heterochromatin is established at ALT telomeres, an important and timely question in telomere biology and cancer research. It offers a noncanonical recruitment mechanism for TRIM28, independent of KRAB-ZNFs, and highlights the functional role of orphan nuclear receptors in telomeric chromatin regulation. The study has potential implications for understanding ALT regulation and for identifying new intervention points in ALT-positive cancers. The work is conceptually interesting, but the conclusions are currently limited by insufficient quantification, some interpretative ambiguities, and a few overlooked references. Addressing the concerns listed above would significantly enhance the rigor and impact of the manuscript.

      Response: We appreciate the reviewer's recognition of the significance of our work in elucidating the molecular basis of ALT regulation through COUP-TF2-TRIM28-mediated heterochromatin formation. We also thank the reviewer for the valuable feedback, which has significantly strengthened our manuscript.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      This manuscript investigates the role of orphan nuclear receptors (ORs), specifically COUP-TF2 and TR4, in promoting H3K9me3 enrichment at ALT telomeres via recruitment of TRIM28 (KAP1). The authors propose that the AF2 domain of COUP-TF2, located in its ligand-binding domain (LBD), is sufficient to recruit TRIM28 to telomeres. This, in turn, promotes heterochromatinization and induces hallmarks of the Alternative Lengthening of Telomeres (ALT) pathway, including APB formation and telomeric DNA synthesis outside of S-phase. This study addresses one important and unresolved question in the field: by what mechanism is the heterochromatic state established at ALT telomeres? Another timely question, not addressed here is: how is heterochromatin (specifically H3K9me3) functionally linked to ALT? The findings are potentially novel and mechanistically insightful. However, key elements of the study, particularly the central tethering experiments, require stronger quantification and clarity. Additional mechanistic tests and literature adjustments would also improve the manuscript.

      Major Concerns

      1. Central TRF1-COUP-TF2-LBD result lacks quantification and clarity: the tethering of COUP-TF2's LBD to telomeres via TRF1 is a core result of the paper. This experiment demonstrates that this domain is sufficient to induce weak H3K9me3 enrichment and ALT features (APBs and ATDS). However, the supporting ALT data are presented only in Supplementary Figures S1A and S1B, and are not quantified. These data should be quantified with appropriate statistics and moved to a main figure. Furthermore, the broader functional implication is not explored. Does this tethering induce a fully functional ALT pathway? For example, can telomerase knockout cells expressing TRF1-COUP-TF2-LBD maintain long-term proliferation? Such evidence would significantly strengthen the impact of the study.
      2. Chromatin manipulation experiments lead to ambiguous conclusions: the authors propose that telomeric heterochromatin promotes ALT activity, but their own experiments (e.g., Figure 2) show that both heterochromatin-inducing (KRAB-TRF1) and euchromatin-inducing (VP64-TRF1) tethering can trigger ALT-like features. This makes it difficult to conclude that heterochromatin is specifically required. To clarify:
         - Did the authors express TRF1-VP64 in an ALT cell line? According to their model, this should suppress ALT activity.
         - More broadly, do chromatin alterations per se (regardless of direction) trigger ALT features? Clarifying these points is important for interpretation.
      3. TERRA downregulation contradicts current models: while TERRA upregulation is often observed in ALT cells and is thought to contribute to replication stress and recombination at telomeres, the authors show that TRF1-KAP1 expression induces ALT features while TERRA is downregulated. This observation is not addressed in the manuscript. The authors should at least discuss this discrepancy and propose whether this reflects a cell line-specific phenomenon or a decoupling between TERRA levels and ALT induction in this context.

      Minor Comments

      Introduction (p. 3): The authors cite Episkopou et al. as showing increased H3K9me3 at ALT telomeres. This is incorrect; that paper suggests the opposite. The first study to clearly demonstrate H3K9me3 enrichment at ALT telomeres is Cubiles et al., 2018 and should be cited instead.

      Results (p. 5, first paragraph): The manuscript should cite Déjardin and Kingston, 2009 as the first to report COUP-TF2 and TR4 localization at ALT telomeres. The studies by Conomos et al., 2012 and Gaela et al., 2024 build on this prior evidence. Please also include this citation in the bibliography.

      Broader relevance of TRIM28-OR interaction: TRIM28 is a complex protein with roles in SUMOylation, heterochromatin formation, and transcriptional initiation/elongation regulation. The authors should explore whether similar COUP-TF2/TRIM28 interactions occur at other genomic loci. Public ChIP-seq data for COUP-TF2, TR4, and TRIM28 could be mined to investigate whether these factors co-occupy regulatory regions elsewhere in the genome, and how this relates to gene expression states.

      Significance

      This work contributes mechanistic insight into how heterochromatin is established at ALT telomeres, an important and timely question in telomere biology and cancer research. It offers a noncanonical recruitment mechanism for TRIM28, independent of KRAB-ZNFs, and highlights the functional role of orphan nuclear receptors in telomeric chromatin regulation. The study has potential implications for understanding ALT regulation and for identifying new intervention points in ALT-positive cancers.

      The work is conceptually interesting, but the conclusions are currently limited by insufficient quantification, some interpretative ambiguities, and a few overlooked references. Addressing the concerns listed above would significantly enhance the rigor and impact of the manuscript.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Summary

      This manuscript builds upon the authors' prior findings that targeting COUP-TF2 to TRF1 induces ALT-associated phenotypes and G2-mediated synthesis in telomerase-immortalised BJT human fibroblasts. In this study, the authors show that telomere-coupled COUP-TF2 promotes H3K9me3 enrichment in these cells, and that this effect is blocked by TRIM28 depletion. Furthermore, TRIM28 depletion also suppresses the formation of ALT phenotypes in VA13 ALT cells. Given that TRIM28 has been implicated in regulating H3K9me3 deposition via SETDB1, and has been reported to co-purify with TR2 and TR4 (though not previously in the context of ALT telomeres), these findings add mechanistic depth to how heterochromatin regulators contribute to ALT activity. Overall, the manuscript's conclusions are generally supported by the presented data, but several aspects require clarification or additional experimental validation.

      • The authors report a modest reduction in telomeric H3K9me3 following COUP-TF2 and TR4 depletion in U-2 OS and VA13 cells (Figure 1B). To strengthen the claim that these orphan receptors specifically regulate H3K9me3, the authors should: 1) assess additional heterochromatic histone marks (e.g., H4K20me3) at telomeres; 2) normalize telomeric signals to both parental histone levels and input; and 3) evaluate whether global H3K9me3 levels also decrease upon receptor depletion.
      • Most experiments explore chromatin changes in telomerase-positive BJT fibroblasts (Figure 2, Figure 4D). It remains unclear whether similar manipulations in ALT cells yield consistent effects, which would give a broader context for ALT phenotype induction. Are ALT phenotypes similarly induced in ALT cells? Does altered chromatin status affect telomere length or telomerase recruitment/activity? Can these pathways drive ALT phenotypes in non-immortalised cells?
      • When referring to Figure 3G, the authors state that telomeric H3K9me3 was abolished upon depleting TRIM28 from the U2OS and WI38-VA13/2RA cells. "Abolished" is a strong word for a 50% decrease, and this sentence should be revised. The reduction appears greater than that seen with COUP-TF2/TR4 depletion. Are the effects additive? If so, might TRIM28 act, at least in part, independently of COUP-TF2/TR4?
      • VA13 cells consistently exhibit stronger effects than U-2 OS (e.g., Figures 1 and 3). This discrepancy could be linked to the high content of variant repeats in VA13 cells. The authors should assess whether variant repeat content underlies the differential response. Repeating key experiments in additional ALT lines with varied repeat compositions would be informative.
      • In line with the previous point, it would be useful to show whether TRIM28 telomeric enrichment is affected by COUP-TF2/TR4 depletion in U-2 OS cells (Figure 4C). To improve confidence in these findings, the authors should perform telomeric ChIP assays, especially with the COUP-TF2^LBDΔAF2-TRF1 mutant construct.
      • The immunoprecipitation experiments showing TRIM28 association with orphan receptors should include benzonase treatment to rule out DNA-mediated co-association (Figure 4F-G).
      • The study would benefit from a direct assessment of whether COUP-TF2^LBDΔAF2-TRF1 fails to induce ALT phenotypes in BJT fibroblasts.
      • The experiments performed in Figure 5E-H lack a vector-only + siCtrl control.
      • In Figure 5E, the observation that APB formation is restored in siTRIM28 + Vector-treated cells is unexpected. The authors should address this finding and clarify whether this reflects biological noise or a compensatory effect.

      Significance

      This work offers valuable mechanistic insight into how COUP-TF2 and TRIM28 coordinate to regulate heterochromatin deposition and ALT phenotype formation. It adds to the growing understanding of chromatin-mediated telomere regulation. What remains unclear is how important this interaction is for ALT maintenance, as H3K9me3 is only moderately altered upon TRIM28 depletion in ALT cells. Depletion of TRIM28 has been shown previously to induce APB formation and telomere elongation in U-2 OS ALT cells (Wang et al., 2021), the opposite to what the authors observed here in VA13 cells (Figure 5E-H). Clarifying whether these differences are variant repeat-dependent, or reflect intrinsic features of specific ALT cell lines, would substantially elevate the study's impact.

    1. (1) Accurately reports information from the sources using different phrases and sentences; (2) Organized in such a way that readers can immediately see where the information from the sources overlaps; (3) Makes sense of the sources and helps the reader understand them in greater depth.

      Keep these key features in mind when it comes to writing. Make a checklist and check your boxes as you go.

    1. Uninstall

      add is described as "adding" while remove is described as "uninstalling", so the wording is inconsistent. Please standardize on either add/remove or install/uninstall.

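    2. It automatically installs and runs the command for you.

      nits: I wondered whether this avoids polluting the existing environment, so I think a sentence that makes something like the following clear would be more helpful.

      "uvx installs into a temporary virtual environment that is isolated from the current project"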

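    3. A script that prints 「今年の残りは: ○ヶ月と○日」 ("Time left this year: ○ months and ○ days")

      Around here the same file seems to be updated repeatedly, so including the file name in the caption would make it clearer that the same file is being changed. (The same applies to the code before this point.)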

    4. Rather than this procedure, I think it would be better to create a new directory, run uv init, then copy over only pyproject.toml and uv.lock and run uv sync.

      Match the actual use case.

    5. It might be worth mentioning that, unlike uv add, uv pip install does not update pyproject.toml or uv.lock.

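    6. Check in advance whether uv.lock would be updated

      It looks at the settings in pyproject.toml and checks whether the packages in the current uv.lock file can stay as they are or whether newer versions exist.

      I think that is roughly what it means.

      Also, I thought the main use of this was more like having written hoge>4.0.0 and checking whether a newer version of hoge has been released (like pip list -O). Is that not the case?

      Rewriting pyproject.toml was only needed here for the sake of the example; normally, I would expect to use uv sync after rewriting pyproject.toml (I am a uv beginner, so apologies if I am off the mark).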

    7. in a dependency relationship

      Rather than "dependency relationship", wouldn't an explanation like "the package we just added" be enough?

    8. The Python package is inst

      Something like "just as with the pip command" might be good to add.

      Please also mention that packages are downloaded from PyPI and installed.

    9. manage dependencies

      I felt that "managing dependencies" is a little hard to picture.

      Perhaps "manage the libraries used in the project"?

    10. uv-example

      So this was created in a different directory from before? There is no mkdir uv-example in the steps, so it may be a bit confusing.

    11. the message "A virtual environment already exists at `.venv`. Do you want to replace it? [y/n]"

      This is long, and I think the meaning would still come across without it.

      A confirmation message is displayed.

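    12. Please also mention that when the virtual environment is activated, the project (directory name) is shown in the prompt.

      With venv, the name of the venv you created was shown in the prompt, which is why I ask.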

    1. Kidney structure

      High-Level Summary

      The kidneys are bean-shaped organs protected by three outer layers and organized internally into the cortex, medulla, and renal pelvis. Nephrons in the cortex filter blood supplied by a highly branched vascular network that enters and exits through the renal hilum. Urine formed by nephrons flows through the renal pyramids into calyces, then the renal pelvis, and finally the ureter. Each kidney contains over one million nephrons, which are either cortical or juxtamedullary, depending on their position relative to the medulla.

      Study Notes: Kidney Structure

      1. External Kidney Structure

      The kidney is surrounded by three protective layers (outer → inner):

      • Renal fascia: tough connective tissue; anchors the kidney to surrounding structures
      • Perirenal fat capsule: cushions and stabilizes the kidney
      • Renal capsule: thin, tough layer directly covering the kidney surface

      2. Internal Kidney Regions

      The kidney has three main internal regions:

      • Renal cortex (outer region): granular appearance; contains nephrons (functional units of the kidney); site of blood filtration
      • Renal medulla (middle region): made of renal pyramids (cone-shaped tissue masses); each kidney has ~8 pyramids; renal columns lie between pyramids and carry blood vessels; pyramid tips = renal papillae, which point toward the pelvis
      • Renal pelvis (inner region): located at the hilum; funnel-shaped urine collection area; drains urine into the ureter

      3. Hilum of the Kidney

      • Concave region of the kidney
      • Entry/exit point for renal arteries, renal veins, and nerves
      • Exit point for the ureter

      4. Urine Flow Pathway

      Minor calyces → Major calyces → Renal pelvis → Ureter → Urinary bladder

      5. Renal Lobes

      • A renal lobe = one renal pyramid + surrounding cortical tissue
      • Functional subdivision of the kidney

      Blood Supply of the Kidney (In Order)

      1. Aorta
      2. Renal arteries
      3. Segmental arteries
      4. Interlobar arteries (run through renal columns)
      5. Arcuate arteries (arch at cortex–medulla boundary)
      6. Cortical radiate arteries
      7. Afferent arterioles
      8. Glomerular capillaries (nephrons)

      Venous return: Veins follow the same path in reverse. Same names as arteries except no segmental veins. Drain into the inferior vena cava.

      Nephrons (Functional Units)

      • Each kidney contains >1 million nephrons
      • Located mainly in the renal cortex

      Types of Nephrons

      • Cortical nephrons (≈85%): located deep in cortex; short loops of Henle
      • Juxtamedullary nephrons: located near cortex–medulla boundary; long loops of Henle; important for urine concentration

      Parts of a Nephron

      • Renal corpuscle
      • Renal tubule
      • Associated capillary network

    1. Mental Status Exam

      1. Appearance: how the person looks (clothing and physical appearance) 2. Mood: how emotions show themselves 3. Cognition: awareness of time and location 4. Insight and judgement: awareness of the illness itself 5. Intellectual functioning: thoughts are expressed without disruption and have a flow

    1. So, the Romanization of the church: 1. Language shifted from Greek to Latin 2. Centralization of Christianity 3. Emperors began convening councils 4. Defining heresy 5. Promoting conversion -> eventually coercion 6. A hierarchical structure emerged: bishops were given public functions like praetors 7. More dispute-resolution rules were introduced, making it more a law of conflicts

    1. A smoker develops damage to several alveoli that then can no longer function. How does this affect gas exchange?

      Gas exchange relies on Fick’s Law of Diffusion, which states that the rate of diffusion is directly proportional to the surface area of the membrane. Damaged alveoli reduce that surface area, so oxygen cannot enter the bloodstream fast enough to meet the body's demands, and carbon dioxide cannot exit efficiently. The alveolar walls contain the dense network of capillaries where the actual exchange takes place. When the walls are destroyed, the capillaries are destroyed with them. This creates a "dead space" effect where there is air in the lungs, but insufficient blood flow to pick up the oxygen. Healthy alveoli are elastic: they snap back to push air out during exhalation. Smoking destroys the elastin fibers that provide this recoil, so stale air remains trapped in the lungs after exhalation. When the smoker inhales fresh air, it mixes with this stale, trapped air. This lowers the partial pressure of oxygen (PO2) within the alveoli, reducing the driving force that pushes oxygen into the blood.
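
      The proportionality described above can be sketched numerically. This is a minimal illustration, assuming the simplified form of Fick's law (rate proportional to surface area times partial-pressure difference, divided by membrane thickness). The function name and all numbers are hypothetical relative units chosen for illustration, not physiological measurements.

```python
def diffusion_rate(area, delta_p, thickness, diffusion_coeff=1.0):
    """Relative gas diffusion rate from a simplified Fick relation.

    rate = diffusion_coeff * area * delta_p / thickness
    All inputs are illustrative relative units, not measured values.
    """
    if thickness <= 0:
        raise ValueError("thickness must be positive")
    return diffusion_coeff * area * delta_p / thickness


# Hypothetical baseline: full surface area and a normal pressure gradient.
healthy = diffusion_rate(area=1.0, delta_p=1.0, thickness=1.0)

# Hypothetical damaged lung: 40% of the exchange surface is lost, and
# trapped stale air lowers alveolar PO2 so the gradient falls by 20%.
damaged = diffusion_rate(area=0.6, delta_p=0.8, thickness=1.0)

print(f"relative rate, healthy: {healthy:.2f}")
print(f"relative rate, damaged: {damaged:.2f}")
```

      In this sketch the two losses multiply, which is why a moderate loss of surface area combined with a moderate drop in alveolar PO2 cuts the relative exchange rate by about half.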

    1. 1. Allow your prewriting to be exploratory. Reflective writing encourages you to explore an experience and explain or ponder the individual choices you have made. Stand back and view the experience from an objective point of view. 2. While reflective writing asks you to write about your own experience, you should be as thorough as you would for any other writing task. Remember to keep your reader in mind. Try to remove your emotions from the experience. Rather than blame yourself for a specific choice, consider the reasoning for that decision and explore what you’ve learned. 3. Avoid focusing on writing about every moment of the event or process. Reflective writing should focus on specific snapshots of your experience, so avoid spending too much time narrating. Instead, reflect on how a specific choice impacted the experience. Ultimately, your essay’s goal is not to create a narrative but to speculate about the significance of your experience.

      Keep these tips in mind when it comes to writing about your own reflections.

    2. 3. Avoid focusing on writing about every moment of the event or process. Reflective writing should focus on specific snapshots of your experience, so avoid spending too much time narrating. Instead, reflect on how a specific choice impacted the experience. Ultimately, your essay’s goal is not to create a narrative but to speculate about the significance of your experience.

      Try to keep it short and to the point so it isn't drawn out and wordy.

    1. To understand the relationship between model scale, training efficiency, and downstream performance, we trained the Tx1 model series at three scales: 70M, 1B, and 3B parameters. Fig. 7A shows the training cost versus computational budget (measured in FLOPs) for Tx1 compared to other single-cell foundation models including SE-600M, scGPT, and nv-Geneformer variants. Tx1 achieves substantially improved training efficiency, with 3–30× better compute efficiency relative to these prior models.

      Thank you for sharing this dataset and model (as well as the SCVI model). In terms of training cost versus computational budget, how would the smaller training subsets factor into efficiency for the smaller models? It's interesting to consider training compute normalized by fraction of the data on which the model was trained. Is it possible that training the 3B model on only a subset of the dataset would not hurt performance and therefore improve training efficiency metrics? I appreciate this deep analysis of the training process.
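
      One way to make the normalization in this question concrete is sketched below. It assumes the common rule of thumb of roughly 6 FLOPs per parameter per training token for dense transformers, which is not taken from the paper. The token counts, data fractions, and function names are hypothetical placeholders, not the Tx1 training configuration.

```python
def approx_training_flops(n_params, n_tokens):
    """Rough training compute using the ~6 * params * tokens rule of thumb.

    This approximation is an assumption made for illustration only.
    """
    return 6.0 * n_params * n_tokens


def flops_per_data_fraction(total_flops, data_fraction):
    """Training compute divided by the fraction of the dataset seen."""
    if not 0.0 < data_fraction <= 1.0:
        raise ValueError("data_fraction must be in (0, 1]")
    return total_flops / data_fraction


FULL_DATASET_TOKENS = 1e11  # hypothetical dataset size

# Hypothetical setups: a small model on 10% of the data, a large one on all.
small = approx_training_flops(70e6, 0.1 * FULL_DATASET_TOKENS)
large = approx_training_flops(3e9, 1.0 * FULL_DATASET_TOKENS)

small_norm = flops_per_data_fraction(small, 0.1)
large_norm = flops_per_data_fraction(large, 1.0)

print(f"70M model: {small:.2e} FLOPs, {small_norm:.2e} per full-data pass")
print(f"3B model:  {large:.2e} FLOPs, {large_norm:.2e} per full-data pass")
```

      Under these assumptions the normalized figure removes the advantage a small model gets simply from seeing less data, so the comparison reflects model size alone.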

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      1. General Statements

      In this study, we mechanistically define a new molecular interaction linking two of the cell's major morphological regulatory pathways: the Rho GTPase and Hippo signaling networks. These two major signaling pathways are both required for life across huge swaths of the tree of life. They are required for the dynamic organization and reorganization of proteins, lipids, and genetic material that occurs in essential cellular processes such as division, motility and differentiation. For decades these pathways have been studied almost exclusively independently, yet they are known to act in concert in cancer to drive cytoskeletal remodeling and morphological changes that promote proliferation and metastasis. Mechanistic insight into how they are coordinated is lacking.

      Our data reveal a mechanistic model where coordination is mediated by the RhoA GTPase-activating protein ARHGAP18, which forms molecular interactions with both the tumor suppressor Merlin (NF2) and the transcriptional co-regulator YAP (YAP1). Using a combination of state-of-the-art super-resolution microscopy (STORM, SORA-confocal) in cultured human cells, biochemical pulldown assays with purified proteins, and analyses of tissue-derived samples, we characterize ARHGAP18's function from the molecular to the tissue level in both native and cancer model systems.

      Together, these findings establish a previously unrecognized molecular connection between the RhoA and Hippo pathways and culminate in a working model that integrates our current results with prior work from our group and decades of prior studies. This model provides a new conceptual framework for understanding how RhoA and Hippo signaling are coordinated to regulate cell morphology and tumor progression in human cells.

      In this substantially revised manuscript, we have addressed all comments from the expert reviewers, described point-by-point below. A shared major comment from the reviewers was the request for direct evidence of the proposed mechanistic model. To address these constructive comments, we've added new experiments, new quantification, new text, new control data, and two expert authors, who contributed super-resolution mouse tissue imaging data for the study of endogenous ARHGAP18 in its native condition. We believe that these additions greatly enhance the manuscript and address the overall message of the reviewers' collective comments.

      2. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This manuscript describes a dual mechanism by which ARHGAP18 regulates the actin cytoskeleton. The authors propose that in addition to the known role for ARHGAP18 in regulating Rho GTPases, it also affects the cytoskeleton through regulation of the Hippo pathway transcriptional regulator YAP. ARHGAP18 knockout Jeg3 cells were generated and show a clear loss of basal stress fiber like F-actin bundles. The authors further characterize the effects of ARHGAP18 knockout and overexpression. It is also discovered that ARHGAP18 binds to the Hippo pathway regulator Merlin and to YAP. Ultimately it is concluded that ARHGAP18 regulates the F-actin cytoskeleton through dual regulation of RHO GTPases and of YAP. While the phenotype of the ARHGAP18 knockout and the association of ARHGAP18 with Merlin and YAP is interesting, I found the authors' conclusion that these phenotypes are due to ARHGAP18 regulation of both RHO and YAP to be based on largely correlative evidence and sometimes lacking in controls or tests for significance. In addition, the authors often make overly strong conclusions based on the experimental evidence. In some instances, the rationale for how the experimental results support the conclusion is insufficiently articulated, making evaluation challenging. In general, although the authors have some interesting observations, more definitive experiments with proper controls and statistical tests for significance and reproducibility are needed to justify their overall conclusions.

      We appreciate the reviewers' constructive comments and have added substantial new data and quantifications to address their concerns. We have focused these new data on directly testing the proposed mechanisms, adding controls, and performing quantitative analysis with statistical testing. Additionally, we have edited our language to make our rationale clearer and to present our conclusions as a more moderate assessment of our experimental results. Below we respond to the specific comments made by the reviewer, followed by a list of additional editorial changes we've made based on the reviewer's overarching comments on clarity and rationale.

      Specific Comments

      1) The authors make a big point about the effects of ARHGAP18 on myosin light chain phosphorylation. However, this result is not quantified and tested for statistical significance and reproducibility.

      We thank the reviewer for their comments on our western blotting quantification, which in the original submission version had quantification of RhoA downstream signaling of pCofilin/Cofilin and pLIMK/LIMK. We had withheld the pMLC and MLC quantification as the result was previously published with quantification, reproducibility, and statistical significance by our group in our prior manuscript on ARHGAP18 published in eLife in 2024 (Fig. 4E of https://doi.org/10.7554/eLife.83526). However, these prior results lacked the new overexpression data. We recognize the need to add these data to this manuscript as requested by the reviewer.

      To address the reviewer's comment, we have added quantification of pMLC/MLC (Fig. 1F).

      2) Along similar lines in Figure 2C they state that overexpression of ARHGAP18 causes cells to invade over the top of their neighbors. This might be true and interesting, but only a single cell is shown and there is no quantification or controls for simply overexpressing something in that cell. The authors also conclude from this image that the overexpression phenotype is independent of its GAP activity on Rho. It is not clear how this conclusion is made based on the data. It would seem like a more definitive experiment would be to see if a similar phenotype was induced by an ARHGAP18 mutant deficient in GAP activity.

      Based on the reviewer's comment, we recognize that the qualitative statements made in Figure 2C (now Figure 3) should have been made more quantitative. We have added the control of Jeg3 WT cells expressing an empty FLAG vector to show that WT cells do not invade over the top of each other (Fig. 3F). Additionally, we have added the quantification found in Fig. 3E, which shows the percentage of invasive versus non-invasive cells in WT and ARHGAP18-overexpressing cells. We have clarified our conclusions to make clear that these data do not directly test whether the invasive phenotype derives from a Rho-independent mechanism. The text now states the following conclusion alongside others, which can be seen in our tracked changes:

      "These data support the conclusion that ARHGAP18 acts to regulate basal and junctional actin. However, it was not clear whether this activity occurred through a Rho-independent or a Rho-dependent mechanism."

      We have added new data of cells expressing an ARHGAP18 mutant deficient in GAP activity, which is explained in detail in the following response below.

      3) In Figure 3 the authors compare gene expression profiles of ARHGAP18 knockout cells to wild-type cells. They see lots of differences in focal adhesion and cytoskeletal proteins and conclude that this supports their conclusion that ARHGAP18 is not just acting through RHO. The rationale for this is not clear. In addition, they observe changes in expression profiles consistent with changes in YAP activity. They conclude that the effects are direct. This very well might be true. However, RHO is a potent regulator of YAP activity and the results seem quite consistent with ARHGAP18 acting through RHO to affect YAP.

      We thank the reviewer for their comment and believe the revised manuscript now presents direct evidence to support the conclusions, through edited text and the incorporation of new data.

      First, the reviewer highlighted that we were not clear in our rationale and explanation of the conclusions made from our RNAseq data in the new Figure 4 (previously Figure 3). We agree with the reviewer that the RNAseq data alone is not sufficient rationale for the conclusion that ARHGAP18 is acting through YAP directly. In the revised manuscript, the conclusion is now made based on the combination of our multi-faceted investigation of the relationship between ARHGAP18 and YAP (most importantly, new Figure 5). It's important for us to argue that our RNAseq analysis is much more robust and specific than simply reporting a descriptive assay seeing lots of differences in cytoskeletal proteins. We recruited an outside RNAseq expert collaborator, Dr. Yongho Bae, to perform state-of-the-art IPA analysis and a grueling manual curation of the top hit genes to identify the predominant signaling pathways linking the loss of ARHGAP18 to known YAP translational products. We've provided a supplemental table listing each citation supporting the identified YAP pathway associations from this manual curation. We have also added a new discussion paragraph on RNAseq data to clarify our specific RNAseq data results and analysis. In the revised manuscript, we have moderated our language in the results text regarding the RNAseq data to reflect the reviewer's suggestion:

      "Our RNAseq data alone could not independently confirm if the alterations to transcriptional signaling and expression of actin cytoskeleton proteins were through a Rho-dependent or Rho-independent mechanism."

      Second, in this comment and the above, the reviewer highlights the need for a new experiment to directly test the Rho-independent effects of ARHGAP18, which we now provide in the new Figure 5. In these new data, we've applied an experimental design suggested by reviewer 2 regarding the same concern. In short, we've produced and expressed a point mutant variant, ARHGAP18(R365A), which abolishes the Rho GAP activity while leaving the remainder of the protein intact. This construct allows us to directly test the effects of ARHGAP18 independent of its RhoA GAP activity. We find that the GAP-deficient ARHGAP18 is able to fully rescue basal focal adhesions, indicating that the basal actin phenotype is at least in part regulated through a Rho-independent mechanism.

      We believe the revised manuscript, when taken in totality, provides the definitive proof requested by the reviewer. Specifically, the combination of Figure 5, where we show new data using the ARHGAP18(R365A) variant, and the result that ARHGAP18 forms a stable complex with YAP (Fig. 6G) or Merlin (Fig. 6A), is supportive of direct Rho-independent molecular interactions between YAP, Merlin, and ARHGAP18.

      4) In Figure 4A showing Merlin binding to ARHGAP18 there is no control for the amount of Merlin sticking to the column as was done in Figure 4F for binding experiments with YAP. This makes it difficult to determine the significance of the observed binding.

      We have performed the requested control experiment and added the results to Figure 6A.

      5) The images in Figure 4C show YAP being maintained in the nucleus more in ARHGAP18 knockout cells than in wild-type cells. However, the images only show a few cells and YAP localization can be highly variable depending on where you look in a field. Images with more cells and some sort of quantification would bolster this result.

      We have provided quantification (Figure 6D) of what was originally Figure 4C (now Figure 6C).

      Reviewer #1 (Significance (Required)):

      While the phenotype of the ARHGAP18 knockout and the association of ARHGAP18 with Merlin and YAP is interesting, I found the authors' conclusion that these phenotypes are due to ARHGAP18 regulation of both RHO and YAP to be based on largely correlative evidence and sometimes lacking in controls or tests for significance. In addition, the authors often make overly strong conclusions based on the experimental evidence. In some instances, the rationale for how the experimental results support the conclusion is insufficiently articulated, making evaluation challenging. In general, although the authors have some interesting observations, more definitive experiments with proper controls and statistical tests for significance and reproducibility are needed to justify their overall conclusions.

      In the above comments, we detail the specific definitive experiments, proper controls, and statistical tests for significance, requested by the reviewer, which we believe greatly strengthen our manuscript.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      This manuscript investigates the Rho effector ARHGAP18 in Jeg3 cells, a trophoblastic cell line. It presents a number of new pieces of data, which increase our understanding of the importance of this GAP on cell function and explain at a molecular level previous results of other workers in the field. ARHGAP18 was originally given the name "conundrum" and continues to stand apart from the majority of other GAP proteins and their functions. Hence the data here is significant and of high standard.

      The data is clear, and the images are of high quality and extremely impressive in their resolution. It is significant and adds a further layer to our understanding of the regulation of cell migration, particularly in the formation and resolution of microvilli.

      We appreciate the reviewer's comments and supportive insights.

      The data is based on the use of the cell line Jeg3. Even the authors' previous publication in eLife is based only on this cell line. They need to show the conclusions are general and not specific to this line of cells. As an extension of this, is the ARHGAP18 function shown here only in transformed cells? Do the same mechanisms operate in normal cells, which respond to activation to proliferate or migrate?

      We respectfully point out that the critical experiments of the prior eLife publication were validated in DLD-1 colorectal cells and not Jeg-3 cells alone (Figure 1-figure supplement 2). Our newly independent lab, established just over a year ago, is unable to perform a full expansion of the manuscript using untransformed cells. However, we agree with the reviewer's perspective and wish to address the comment to the best of our current capability. To answer the reviewer's suggestions, we have recruited Dr. Christine Schaner Tooley, an expert in mouse model system studies. In the revised manuscript, we've added new super-resolution SORA confocal images of endogenous ARHGAP18's localization in the intact intestinal villi tissue and apical junctions of WT mice (Fig. 1A-C). These data indicate that endogenous ARHGAP18 is enriched (but not exclusively localized) at the apical plasma membranes of normal WT epithelial cells. This localization, where both Merlin and Ezrin are present at apical membrane/junctions under normal conditions, is a major component of the working model proposed in Fig. 7. These data also indicate that ARHGAP18 is capable of entering the nucleus in WT cells, another critical aspect of our proposed model. Collectively, our DLD-1 studies published previously and our new studies using WT mouse tissue samples support the conclusion that at least some of ARHGAP18's functions described in this manuscript are not limited to Jeg3 cells.

      In endothelial cells, Lovelace et al 2017 showed localization to microtubules and that depletion of ARHGAP18 resulted in microtubule instability. The authors may like to comment on the differences. Is this a cell type difference or RhoA versus RhoC difference?

      In our previous publication (Lombardo, eLife), we validated the finding that ARHGAP18 forms a complex with microtubules, as we detected tubulin in the ARHGAP18 pulldown experiment (Figure 1- Source Data). However, our data indicate that in Jeg3 cells ARHGAP18 does not localize to the same microtubule-associated spheres observed in the Lovelace publication. We now comment on the shared conclusions and differences between this manuscript and Lovelace et al., 2017 in the discussion section.

      "In endothelial cells, ARHGAP18 has been reported to localize to microtubules and plays a role in maintaining proper microtubule stability (Lovelace et al., 2017). In our epithelial cell culture models and WT mouse intestine, we have been unable to detect ARHGAP18 at microtubules, suggesting ARHGAP18 may have additional functions in various cell types."

      On pages 7,9 they conclude that MLC and basal and junctional actin are regulated through a GAP independent mechanism. The best way to show this is with overexpression of a GAP mutant.

      We appreciate the reviewer's insight and have produced and expressed a GAP mutant, ARHGAP18(R365A), in our cells, directly testing our conclusion that ARHGAP18 has a GAP-independent function. These data are now presented in revised Figure 5 and explained further in response to reviewer #1.

      There is a huge amount of data presented in Figure 3, but their 2 genes which they focus on, LOP1 and CORO1A, are discussed but no actual data presented in support.

      We now validate CORO1A by qPCR in Figure 4J.

      Reviewer #2 (Significance (Required)):

      The data is significant and adds a further layer to our understanding of the regulation of cell migration, particularly in the formation and resolution of microvilli. This manuscript will be of significance to a basic science audience in the field of RhoGTPases and cell migration.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The study by Murray et al explores the effects of ARHGAP18 on the actin cytoskeleton, Rho effector kinases, non-muscle myosin, and transcription. Using super resolution microscopy, they show that in ARHGAP18 KO cells there is a mixed and unexpected cytoskeleton phenotype where myosin phosphorylation appears to be increased, but actin is disorganised with reduced stress fibres, diminished focal adhesions and augmented invasiveness. They conclude that the underlying mechanisms are likely independent from RhoA. Next, they perform RNAseq using the KO cells and identify an array of dysregulated genes, including those that play crucial roles in microvilli (related to previously published findings). Analysis of the data identifies gene expression changes that are relevant for altered focal adhesion (integrins). Further analysis reveals that a large cohort of the dysregulated genes are YAP targets. They then show that in ARHGAP18 KO cells YAP nuclear localization, as detected by immunostaining, is augmented; and demonstrate that immobilized ARHGAP18 protein can bind the Hippo regulator Merlin as well as YAP itself.

      Major comments:

      1. The premise of the study (that ARHGAP18 is a RhoA effector or may act independently of RhoA) remains not proven.

      We have added new evidence of direct RhoA-independent activity for ARHGAP18, as described in the responses above. Specifically, we've added data using a RhoA-GAP-dead variant of ARHGAP18 in Figure 5, which we believe addresses this comment.

      At several places (including in the title) the authors refer to ARHGAP18 as a Rho effector, which would suggest that it is downstream from Rho, but the basis for this is not clear. In fact, their own previous study suggested that ARHGAP is a RhoA regulator, rather than an effector. In general, the connection of the described effects to RhoA remains unclear, and not addressed in this study. The authors seem to go back and forth in their conclusions regarding the connection between ARHGAP18 and RhoA. For example, the first section of results is finished by stating (line 194): "These data support the conclusion that ARHGAP18 acts to regulate basal and junctional actin through Rho-independent mechanism". But the next section starts by stating (line 198): "We hypothesized that the invasive and cytoskeletal phenotypes observed at the basal surface of cells devoid of ARHGAP18 may be a result of changes in regulation at the transcriptional level either directly through RhoA signaling or through an additional mechanism specific to ARHGAP18". The paper would be strengthened by adding data that show whether the effects are indeed downstream from RhoA or RhoA independent. If there is no sufficient demonstration that ARHGAP18 is downstream of RhoA and is an effector, this needs to be stated explicitly, and the wording should be changed.

      We now provide new data in Figure 5, which directly test the RhoA-independent functions of ARHGAP18 as recommended by the reviewer. Our understanding of the term effector is 'a molecule that activates, controls, or inactivates a process or action.' Based on this understanding, we used the term to convey ARHGAP18's functional role within the feedback loop, rather than to imply that it acts exclusively downstream.

      We wish to clarify our position regarding the reviewer's assertion that we go "back and forth" on whether ARHGAP18 functions in a Rho-dependent or Rho-independent manner. It was our intent to propose a model where ARHGAP18 acts in two separate circuits that regulate cell signaling. The first circuit involves ARHGAP18's canonical RhoA GAP activity, which involves ERMs and LOK/SLK, and is limited to the apical plasma membrane. This first signaling circuit was characterized in our prior eLife manuscript (Lombardo et al., 2024) and in an earlier JCB manuscript (Zaman and Lombardo et al., 2021). In this newly revised manuscript, we provide a partial mechanistic characterization of the second circuit, which we freely admit is much more complex and will likely require additional study to fully characterize.

      As both circuits operate as signaling feedback loops, we find the terms 'upstream' and 'downstream' to be of limited value, and we attempt to avoid their use when possible. We retain their use only when referring to the Hippo and ROCK signaling cascades, where these designations are well established. We suggest that the conceptual inconsistencies of Conundrum/ARHGAP18 may have arisen from the tendency to view it in strictly binary terms as upstream or downstream. Here, we propose a third possibility that ARHGAP18 functions as both, participating in a negative feedback loop.

      In response to the reviewer's comments, we have added data testing whether the effects are Rho-independent and have revised the discussion text to clarify the molecular function of ARHGAP18:

      "Additionally, focal adhesions and basal actin bundles are restored to WT levels when the ARHGAP18(R365A) GAP-ablated mutant is expressed in ARHGAP18 KO cells (Fig. 5A, B). These results represent the strongest argument that ARHGAP18 functions in additional pathways to RhoA/C alone. Our data suggests that at least one of the alternative pathways is through ARHGAP18's interaction with YAP and Merlin. From these data we conclude that ARHGAP18 has important functions in both RhoA signaling through both its GAP activity and in Hippo signaling through its GAP independent binding partners."

      The study is descriptive and contains a series of observations that are not connected. Because of this, the study's conclusions are not well supported, and key mechanistic insight is limited. The study feels like a set of separate observations, that remain incompletely worked out and have some preliminary feel to them. The model in the last figure also seems to contain hypotheses based on the observations, several of which remain to be proven.

      We present our revised manuscript, in which we've more clearly outlined our rationale and conclusions, as detailed in the above responses, to emphasize the overall connectivity of the study. To make it clear that we are presenting a working model, which has elements that will require additional investigation, we have also updated the title of Figure 7 to read "Theoretical Model of ARHGAP18's coordination of RhoA and Hippo signaling pathways in Human epithelial cells." Throughout the manuscript, we highlight the unknown elements that remain to be tested and other outstanding questions. Thus, we do not aim to characterize this complex signaling coordination completely. Instead, this manuscript represents the third iteration in our systematic advances to describe this entirely new signaling pathway. We agree that, despite three separate manuscripts (this one included) to date, this work represents an early stage in understanding the system, and many additional studies will be needed to characterize this signaling system fully. Figure 7 is presented as a working model that results from a thoughtful combination of our collective data and that of other researchers, derived from numerous species across decades of study. We firmly believe that proposing such integrative models is valuable for advancing the field. We also recognize the importance of clearly indicating which aspects remain hypothetical. We now explicitly note in several places within the discussion which components of the model will require further validation and experimental confirmation. For example, regarding our theoretical mechanism in Figure 7, we state:

      "Validation of the direct mechanism by which YAP/TAZ transcriptional changes drive basal actin changes in ARHGAP18 KO cells will require further investigation based on predictions from RNAseq results."

      Addressing any possible connection between key effects of ARHGAP18 KO (changes in actin, focal adhesion, integrins, Yap and merlin binding) could strengthen the manuscript. One such specific question is whether the changes in integrin expression (RNAseq) are indeed connected to the actin alterations and reduction in focal adhesions (Fig 1). Staining for these integrins to show they are indeed altered, and/or manipulating any of them to reproduce changes could provide an exciting addition.

      We attempted to stain cells for integrins by purchasing three separate antibodies. However, despite extensive optimization and careful selection of the specific integrins using our RNAseq results, we were unable to get any of these antibodies to work in any cell type or condition. We believe that there is a technical challenge to staining for integrins due to their transmembrane and extracellular components, which we were unable to overcome. To address the reviewer's comment, we instead stained cells for paxillin, which directly binds the cytoplasmic tails of integrins (Figs. 3 and 5).

      Some of the experimental findings are not convincing or lack controls. Fig 1: some of the western blots are not convincing or poor quality. [...] On the same figure, the quality of LIM kinase blots is poor. [...] The signal is weak, and the blot does not appear to support the quantification. The last condition (expression of flag-ARHGAP18) results in a large drop in pLIMK and pcofilin on the blot, which is not reflected by the graph. Addition of a better blot and the use of a strong positive or negative control would boost confidence in these data.

      In response to this and other reviewers' comments, we have added new western data and quantification to Figure 1. We now focus on MLC/pMLC data as we believe these data highlight the potential Rho-independent mechanism of ARHGAP18, and we were able to greatly improve the quality of the blots through careful optimization. We hope the reviewer finds these blots and quantifications (Fig. 1E and F) more convincing.

      We note that phospho-specific Western blotting presents considerably greater technical challenges than conventional blotting. We believe that an attractive-looking blot does not always correlate with quality or reproducibility, and we have focused on taking extraordinarily careful steps in the blotting of our phospho-specific antibodies, which at times comes at the cost of the blot's appearance. For example, all phospho-specific antibodies are run using two-color fluorescent markers to blot against both the total protein and the phospho-protein on the same blot. This approach often leads to blots that have reduced signal to noise compared to chemiluminescent Westerns. Additionally, we use phospho-specific blocking buffer reagents which do not contain phosphate-based buffers or agents that attract non-specific phospho-staining signals. These blocking buffers are not as effective as non-fat milk in PBS at blocking the background signal; however, they are ultimately cleaner for phospho-specific primary antibodies. We use carefully optimized protocols, from cell treatment to lysis, transfer, and antibody incubation, including methods developed by laboratories where the corresponding author of the manuscript was trained. Nonetheless, despite these efforts, we have now removed the LIMK and cofilin data because we deemed them unnecessary for the main conclusions of this manuscript and were unable to improve their quality to satisfy the reviewer.

      The changes in pMLC on the western blots are very small, and for any conclusion, these studies require quantification. Further, the expression levels of Flag-ARHGAP18 need to be shown to support the statement that the protein is expressed, and indeed overexpressed under these conditions (vs just re-expressed).

      Further to the above response, we have made a significant effort to improve the quality of our pMLC western blots and now provide quantification in Figure 1. We also now provide the Flag-ARHGAP18 signal as requested by the reviewer.

      Fig 4: the differences in YAP nuclear localization under the various conditions are not well visible. Quantitation of nuclear/cytosolic signal ratio should be provided. Please provide a rationale and more context for using serum starvation and re-addition. What is the expected effect? Serum removal and addition is referred to as nutrient removal and re-addition, but this is inaccurate, as it does not equal nutrient removal, since serum contains a variety of other important components, e.g. growth factors too.

      We have provided new quantification of the nuclear/cytosolic signal ratio in Figure 6D. We have explained our rationale for the study through the following new text:

      "Merlin is activated and localized to junctions upon signaling, promoting growth and proliferation; among these signals is the availability of growth factors and other components of serum (Bretscher et al., 2002). We hypothesized that since ARHGAP18 formed a complex with Merlin that ARHGAP18's localization may localize to junctions under conditions which promote Merlin activation."

      We have changed our use of "nutrient removal" to "serum removal".
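
      As an illustration of the kind of nuclear/cytosolic signal ratio the reviewer requested, the sketch below divides the mean YAP intensity inside a nuclear mask by the mean intensity in the rest of the cell. It is a minimal example under stated assumptions and is not the authors' analysis pipeline: the function name `nuclear_cytosolic_ratio`, the mask inputs, and the synthetic pixel values are all hypothetical, and the source does not say how the masks or the ratio in Figure 6D were obtained.

```python
import numpy as np

def nuclear_cytosolic_ratio(yap, nucleus_mask, cell_mask):
    """Mean intensity in the nucleus divided by mean intensity in the cytosol.

    Hypothetical helper: `yap` is a 2D intensity image, `nucleus_mask` and
    `cell_mask` are boolean arrays of the same shape (e.g. from a nuclear
    stain and a cell-outline segmentation, which are assumed to exist).
    """
    yap = np.asarray(yap, dtype=float)
    nucleus = np.asarray(nucleus_mask, dtype=bool)
    # Cytosol is everything inside the cell outline that is not nucleus.
    cytosol = np.asarray(cell_mask, dtype=bool) & ~nucleus
    if not nucleus.any() or not cytosol.any():
        raise ValueError("nuclear and cytosolic regions must both be non-empty")
    return yap[nucleus].mean() / yap[cytosol].mean()

# Synthetic 6x6 image (illustrative values only, not experimental data).
cell_mask = np.zeros((6, 6), dtype=bool)
cell_mask[1:5, 1:5] = True          # 16-pixel cell
nucleus_mask = np.zeros((6, 6), dtype=bool)
nucleus_mask[2:4, 2:4] = True       # 4-pixel nucleus inside the cell
yap = np.where(nucleus_mask, 30.0, 10.0)

ratio = nuclear_cytosolic_ratio(yap, nucleus_mask, cell_mask)
print(ratio)
```

      Computing the ratio per cell, rather than per field, keeps bright and dim cells on the same scale, which matters when YAP localization varies across a field.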

      The binding between ARHGAP18 and merlin is interesting, but a key limitation is the use of expressed proteins. Can the binding be shown for the endogenous proteins (IP, colocalization)? Another important unaddressed question is the relevance of this binding, and the relation of this to altered YAP nuclear localization.

      Our data in Fig. 6G show binding of resin-bound human ARHGAP18 to endogenous YAP from human cells, as suggested by the reviewer. In Fig. 6A, we chose to use GFP-Merlin because Merlin shares approximately 60% sequence identity with Ezrin, Radixin, and Moesin (ERMs). Their similarity is such that Merlin was named for Moesin-Ezrin-Radixin-Like Protein. In our experience, nearly all Merlin or ERM antibodies have some cross-contaminating signal. Thus, a major concern is that if we were to blot for endogenous Merlin in the pull-down experiment, we may see a band that could in fact be ERMs. To avoid this, we tagged Merlin with GFP to ensure that the product pulled down by ARHGAP18 was Merlin, not an ERM. Regarding the ARHGAP18 resin-bound column, our homemade ARHGAP18 antibody is polyclonal. We have extensive experience in pulldown assays and have found that the binding of a polyclonal antibody to the bait protein can produce less accurate results, as the binding site for the antibody is unknown and can sterically hinder attachment of target proteins like Merlin. In our experience, attachment to a Flag tag, which is expressed after a flexible linker at the N- or C-terminus, allows us to overcome this limitation, and this is the approach we've used in this manuscript.

      Minor comments:

      Introduction line 99: "When localized to the nucleus, YAP/TAZ promotes the activation of cytoskeletal transcription factors associated with cell proliferation and actin polymerization" Please clarify what you mean by this statement, that is inaccurate in its present form. Did you mean effects on transcription factors that control cytoskeletal proteins, or do you mean that Yap/Taz affect these proteins? Please also provide reference for this.

      We've altered the sentence as suggested by the reviewer, which now reads as follows:

      "When localized to the nucleus, YAP/TAZ promotes transcriptional changes associated with cell proliferation and actin polymerization."

      The full mechanism by which YAP/TAZ promotes proliferation and actin polymerization is currently debated. We do not think introducing the various proposed models is required for this manuscript; we simply intend to convey that, when in the nucleus, YAP/TAZ promotes transcriptional changes that drive actin polymerization and cell proliferation.

      -What is the cell confluence in these experiments? For epithelial cells confluence affects actin structure. Please comment on similarity of confluency across experimental conditions?

      All cellular experiments are paired: WT and ARHGAP18 KO cells are plated at the same time under identical conditions. For imaging, we plate all cells onto glass coverslips in a 6-well dish so that each condition is literally in the same cell culture plate and gets identical treatment. In our prior eLife paper studying ARHGAP18, we showed that ARHGAP18 KO cells and WT cells divide at a similar rate and have similar proliferation characteristics. The epithelial cell cultures are maintained for experiments at around 70-80% confluency. For the focal adhesion staining experiments, the confluency is slightly lower, between 50-60%, to capture the focal adhesions towards the leading edge. We have added the following new text to further describe these methods: "Cell cultures for experiments were maintained at 70%-80% confluency. For focal adhesion experiments, the cell cultures were maintained at 50%-60% confluency."

      -Fig 2 legend: please indicate that the protein detected was non-muscle myosin heavy chain (distinct from the light chain detected in Fig 1).

      We have altered the legend of the original Figure 2 (new Figure 3) accordingly.

      -Line 339-340: please check the syntax of this sentence

      -Western blot quantification: the comparison of experiments with samples run on different gels/blots requires careful normalization and experimental consistency. Please describe how this was achieved.

      We have added the following new text to further describe these methods:

      "For blots which required quantification of antibodies that were only rabbit primaries (e.g., pMLC/MLC antibodies listed above), samples were loaded onto a single gel and transferred onto a single membrane at the same time. After transfer, the membrane was cut in half and subsequent steps were done in parallel. All quantified blots were checked for equal loading using either anti-tubulin as a housekeeping protein or total protein as detected by Coomassie staining"
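
      To make the arithmetic behind a pMLC/MLC quantification concrete, the sketch below computes a phospho-to-total ratio per lane and expresses it relative to a reference lane. This is an assumption about a common way to normalize such blots, not a description of the authors' exact procedure: the function name `normalized_phospho_ratio`, the choice of the first lane as the reference, and the band intensities are all hypothetical.

```python
def normalized_phospho_ratio(phospho, total, reference_lane=0):
    """Return phospho/total ratios per lane, scaled so the reference lane is 1.

    Hypothetical helper: `phospho` and `total` are background-subtracted band
    intensities for the same lanes on the same membrane.
    """
    if len(phospho) != len(total):
        raise ValueError("phospho and total must have one value per lane")
    if any(t <= 0 for t in total):
        raise ValueError("total-protein intensities must be positive")
    # Dividing by total protein corrects for lane-to-lane loading differences.
    ratios = [p / t for p, t in zip(phospho, total)]
    reference = ratios[reference_lane]
    if reference == 0:
        raise ValueError("reference lane has no phospho signal")
    return [r / reference for r in ratios]

# Illustrative intensities only (not experimental data); lane 0 is the reference.
phospho = [100.0, 150.0, 90.0]
total = [200.0, 200.0, 180.0]
fold_change = normalized_phospho_ratio(phospho, total)
print(fold_change)
```

      Because each lane is divided by its own total-protein signal before comparison, a lane with less material loaded (the third lane here) is not misread as having less phosphorylation.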

      Reviewer #3 (Significance (Required)):

      Rho signalling is a central regulator of an array of normal and pathological cell functions, and our understanding of the context dependent regulation of this key pathway remains very incomplete. Therefore, new knowledge on the role of specific regulators, such as ARHGAP18, is of interest to a very broad range of researchers. A further exciting aspect of this protein is that, despite indications by many studies that it acts as a GAP (inhibitor) for Rho proteins, there are findings in the literature that suggest that its manipulation can affect actin in an unexpected (opposite) manner. These point to possible Rho-independent roles, and warrant further in-depth exploration.

      One of the strengths of the study is that it explores possible roles of ARHGAP18 beyond RhoA and describes some new and interesting observations, which advance our knowledge. The authors use some excellent tools (e.g. ARHGAP KO cells and re-expression) and approaches (e.g. super resolution microscopy to analyze actin changes, RNAseq and bioinformatics to find genes that may be downstream from ARHGAP18). A key limitation of the study, however, is that it is not clear whether the observed findings are indeed independent from RhoA. A further limitation is that potential causal relationships between the described findings are not studied, and therefore the findings are in some cases overinterpreted, and limited mechanistic insights are provided. In some cases the exclusive use of expressed proteins is also a limitation. Finally, some of the experiments also need improvement.

      Reviewer expertise: RhoA signalling, guanine nucleotide exchange factors, epithelial biology, cell migration, intercellular junctions.

      In the above responses, we detail the new experimental data addressing reviewer 3's listed key limitations. We've added new data using the Rho GAP-deficient ARHGAP18(R365A) variant, which allows for the direct characterization of ARHGAP18's Rho-independent activity. We have introduced new data in WT cells studying endogenous proteins to address the limitations from expressed proteins. Finally, we have moderated our language to address overinterpretation. Collectively, we believe that our revised manuscript addresses the reviewer's constructive comments.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The study by Murray et al explores the effects of ARHGAP18 on the actin cytoskeleton, Rho effector kinases, non-muscle myosin, and transcription. Using super resolution microscopy, they show that in ARHGAP18 KO cells there is a mixed and unexpected cytoskeleton phenotype where myosin phosphorylation appears to be increased, but actin is disorganised with reduced stress fibres, diminished focal adhesions and augmented invasiveness. They conclude that the underlying mechanisms are likely independent from RhoA. Next, they perform RNAseq using the KO cells and identify an array of dysregulated genes, including those that play crucial roles in microvilli (related to previously published findings). Analysis of the data identifies gene expression changes that are relevant for altered focal adhesion (integrins). Further analysis reveals that a large cohort of the dysregulated genes are YAP targets. They then show that in ARHGAP18 KO cells YAP nuclear localization, as detected by immunostaining, is augmented; and demonstrate that immobilized ARHGAP18 protein can bind the Hippo regulator merlin as well as YAP itself.

      Major comments:

      1. The premise of the study (that ARHGAP18 is a RhoA effector or may act independently of RhoA) remains not proven. At several places (including in the title) the authors refer to ARHGAP18 as a Rho effector, which would suggest that it is downstream from Rho, but the basis for this is not clear. In fact, their own previous study suggested that ARHGAP is a RhoA regulator, rather than an effector. In general, the connection of the described effects to RhoA remains unclear, and not addressed in this study. The authors seem to go back and forth in their conclusions regarding the connection between ARHGAP18 and RhoA. For example, the first section of results is finished by stating (line 194): "These data support the conclusion that ARHGAP18 acts to regulate basal and junctional actin through Rho-independent mechanism". But the next section starts by stating (line 198): "We hypothesized that the invasive and cytoskeletal phenotypes observed at the basal surface of cells devoid of ARHGAP18 may be a result of changes in regulation at the transcriptional level either directly through RhoA signaling or through an additional mechanism specific to ARHGAP18". The paper would be strengthened by adding data that show whether the effects are indeed downstream from RhoA or RhoA independent. If there is no sufficient demonstration that ARHGAP18 is downstream of RhoA and is an effector, this needs to be stated explicitly and the wording should be changed.
      2. The study is descriptive and contains a series of observations that are not connected. Because of this, the study's conclusions are not well supported, and key mechanistic insight is limited. The study feels like a set of separate observations, that remain incompletely worked out and have some preliminary feel to them. The model in the last figure also seems to contain hypotheses based on the observations, several of which remain to be proven. Addressing any possible connection between key effects of ARHGAP18 KO (changes in actin, focal adhesion, integrins, Yap and merlin binding) could strengthen the manuscript. One such specific question is whether the changes in integrin expression (RNAseq) are indeed connected to the actin alterations and reduction in focal adhesions (Fig 1). Staining for these integrins to show they are indeed altered, and/or manipulating any of them to reproduce changes could provide an exciting addition.
      3. Some of the experimental findings are not convincing or lack controls.

      Fig 1: some of the western blots are not convincing or poor quality. The changes in pMLC on the western blots are very small, and for any conclusion, these studies require quantification. Further, the expression levels of Flag-ARHGAP18 need to be shown to support the statement that the protein is expressed, and indeed overexpressed under these conditions (vs just re-expressed). On the same figure, the quality of LIM kinase blots is poor. The signal is weak, and the blot does not appear to support the quantification. The last condition (expression of flag-ARHGAP18) results in a large drop in pLIMK and pcofilin on the blot, which is not reflected by the graph. Addition of a better blot and the use of a strong positive or negative control would boost confidence in these data.

      Fig 4: the differences in YAP nuclear localization under the various conditions are not well visible. Quantitation of nuclear/cytosolic signal ratio should be provided.

      4. Please provide a rationale and more context for using serum starvation and re-addition. What is the expected effect? Serum removal and addition is referred to as nutrient removal and re-addition, but this is inaccurate, as it does not equal nutrient removal, since serum contains a variety of other important components, e.g. growth factors too.
      5. The binding between ARHGAP18 and merlin is interesting, but a key limitation is the use of expressed proteins. Can the binding be shown for the endogenous proteins (IP, colocalization)? Another important unaddressed question is the relevance of this binding, and the relation of this to altered YAP nuclear localization.

      Minor comments:

      • Introduction line 99: "When localized to the nucleus, YAP/TAZ promotes the activation of cytoskeletal transcription factors associated with cell proliferation and actin polymerization" Please clarify what you mean by this statement, that is inaccurate in its present form. Did you mean effects on transcription factors that control cytoskeletal proteins, or do you mean that Yap/Taz affect these proteins? Please also provide reference for this.
      • What is the cell confluence in these experiments? For epithelial cells confluence affects actin structure. Please comment on similarity of confluency across experimental conditions?
      • Fig 2 legend: please indicate that the protein detected was non-muscle myosin heavy chain (distinct from the light chain detected in Fig 1).
      • Line 339-340: please check the syntax of this sentence
      • Western blot quantification: the comparison of experiments with samples run on different gels/blots requires careful normalization and experimental consistency. Please describe how this was achieved.

      Significance

      Rho signalling is a central regulator of an array of normal and pathological cell functions, and our understanding of the context dependent regulation of this key pathway remains very incomplete. Therefore, new knowledge on the role of specific regulators, such as ARHGAP18, is of interest to a very broad range of researchers. A further exciting aspect of this protein is that, despite indications by many studies that it acts as a GAP (inhibitor) for Rho proteins, there are findings in the literature that suggest that its manipulation can affect actin in an unexpected (opposite) manner. These point to possible Rho-independent roles, and warrant further in-depth exploration. One of the strengths of the study is that it explores possible roles of ARHGAP18 beyond RhoA and describes some new and interesting observations, which advance our knowledge. The authors use some excellent tools (e.g. ARHGAP KO cells and re-expression) and approaches (e.g. super resolution microscopy to analyze actin changes, RNAseq and bioinformatics to find genes that may be downstream from ARHGAP18). A key limitation of the study, however, is that it is not clear whether the observed findings are indeed independent from RhoA. A further limitation is that potential causal relationships between the described findings are not studied, and therefore the findings are in some cases overinterpreted, and limited mechanistic insights are provided. In some cases the exclusive use of expressed proteins is also a limitation. Finally, some of the experiments also need improvement.

      Reviewer expertise: RhoA signalling, guanine nucleotide exchange factors, epithelial biology, cell migration, intercellular junctions.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This manuscript investigates the Rho effector, ARHGAP18, in Jeg3 cells, a trophoblastic cell line. It presents a number of new pieces of data, which increase our understanding of the importance of this GAP on cell function and explain at a molecular level previous results of other workers in the field. ARHGAP18 was originally given the name "conundrum" and continues to stand apart from the majority of other GAP proteins and their functions. Hence the data here is significant and of high standard.

      The data is clear, and the images are of high quality and extremely impressive in their resolution. It is significant and adds a further layer to our understanding of the regulation of cell migration, particularly in the formation and resolution of microvilli.

      The data is based on the use of the cell line Jeg3. Even the authors' previous publication in eLife is based only on this cell line. They need to show the conclusions are general and not specific to this line of cells. As an extension of this, is the ARHGAP18 function shown here only in transformed cells? Do the same mechanisms operate in normal cells, which respond to activation to proliferate or migrate? In endothelial cells, Lovelace et al 2017 showed localisation to microtubules and that depletion of ARHGAP18 resulted in microtubule instability. The authors may like to comment on the differences. Is this a cell type difference or RhoA versus RhoC difference?

      On pages 7,9 they conclude that MLC and basal and junctional actin are regulated through a GAP independent mechanism. The best way to show this is with overexpression of a GAP mutant.

      There is a huge amount of data presented in Figure 3, but their 2 genes which they focus on, LOP1 and CORO1A, are discussed but no actual data presented in support.

      Significance

      The data is significant and adds a further layer to our understanding of the regulation of cell migration, particularly in the formation and resolution of microvilli.

      This manuscript will be of significance to a basic science audience in the field of RhoGTPases and cell migration.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This manuscript describes a dual mechanism by which ARHGAP18 regulates the actin cytoskeleton. The authors propose that in addition to the known role for ARHGAP18 in regulating Rho GTPases, it also affects the cytoskeleton through regulation of the Hippo pathway transcriptional regulator YAP. ARHGAP18 knockout Jeg3 cells were generated and show a clear loss of basal stress fiber like F-actin bundles. The authors further characterize the effects of ARHGAP18 knockout and overexpression. It is also discovered that ARHGAP18 binds to the Hippo pathway regulator Merlin and to YAP. Ultimately it is concluded that ARHGAP18 regulates the F-actin cytoskeleton through dual regulation of RHO GTPases and of YAP. While the phenotype of the ARHGAP18 knockout and the association of ARHGAP18 with Merlin and YAP is interesting, I found the authors' conclusion that these phenotypes are due to ARHGAP18 regulation of both RHO and YAP to be based on largely correlative evidence and sometimes lacking in controls or tests for significance. In addition the authors often make overly strong conclusions based on the experimental evidence. In some instances, the rationale for how the experimental results support the conclusion is insufficiently articulated, making evaluation challenging. In general although the authors have some interesting observations, more definitive experiments with proper controls and statistical tests for significance and reproducibility are needed to justify their overall conclusions.

      Specific Comments

      1) The authors make a big point about the effects of ARHGAP18 on myosin light chain phosphorylation. However this result is not quantified and tested for statistical significance and reproducibility.

      2) Along similar lines in Figure 2C they state that overexpression of ARHGAP18 causes cells to invade over the top of their neighbors. This might be true and interesting, but only a single cell is shown and there is no quantification or controls for simply overexpressing something in that cell. The authors also conclude from this image that the overexpression phenotype is independent of its GAP activity on Rho. It is not clear how this conclusion is made based on the data. It would seem like a more definitive experiment would be to see if a similar phenotype was induced by an ARHGAP18 mutant deficient in GAP activity.

      3) In Figure 3 the authors compare gene expression profiles of ARHGAP18 knockout cells to wild-type cells. They see lots of differences in focal adhesion and cytoskeletal proteins and conclude that this supports their conclusion that ARHGAP18 is not just acting through RHO. The rationale for this is not clear. In addition, they observe changes in expression profiles consistent with changes in YAP activity. They conclude that the effects are direct. This very well might be true. However, RHO is a potent regulator of YAP activity and the results seem quite consistent with ARHGAP18 acting through RHO to affect YAP.

      4) In Figure 4A showing Merlin binding to ARHGAP18 there is no control for the amount of Merlin sticking to the column as was done in Figure 4F for binding experiments with YAP. This makes it difficult to determine the significance of the observed binding.

      5) The images in Figure 4C show YAP being maintained in the nucleus more in ARHGAP18 knockout cells compared to wild-type. However, the images only show a few cells and YAP localization can be highly variable depending on where you look in a field. Images with more cells and some sort of quantification would bolster this result.

      Significance

      While the phenotype of the ARHGAP18 knockout and the association of ARHGAP18 with Merlin and YAP are interesting, I found the authors' conclusion that these phenotypes are due to ARHGAP18 regulation of both RHO and YAP to be based on largely correlative evidence and sometimes lacking in controls or tests for significance. In addition, the authors often make overly strong conclusions based on the experimental evidence. In some instances, the rationale for how the experimental results support the conclusion is insufficiently articulated, making evaluation challenging. In general, although the authors have some interesting observations, more definitive experiments with proper controls and statistical tests for significance and reproducibility are needed to justify their overall conclusions.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Dear editor and reviewers,

      We sincerely thank you for your thoughtful comments and constructive suggestions, which have greatly improved the quality and clarity of our manuscript. In response, we have implemented all requested changes, which are highlighted in yellow throughout the revised text, and updated several figures accordingly. Furthermore, we have performed all additional experiments recommended by the reviewers and incorporated the new data into the manuscript. To enhance clarity, we have also included a schematic representation of our proposed model in an additional figure, providing a concise visual summary of our findings.

      We hope that these revisions fully address all concerns raised by the reviewers and meet all the expectations for publication.

      Below, we answer the reviewers point by point (in blue).


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In this paper, the authors address the important question of the role of centrosomes during neuronal development. They use Drosophila as an in vivo model. The field is somewhat unclear on the role and importance of centrosomes during neuronal development, although the current data would suggest they are dispensable for axon specification and growth. Early studies in cultured mammalian neurons showed that centrosomes are active and that their microtubules can be cut and transported into the neurites. But a study then showed that centrosomes in these cultured neurons are deactivated relatively early during neuronal development in vitro and that ablating centrosomes even when they are active had no obvious effect on axon specification and growth. Consistent with this, a study in Drosophila provided evidence that centrosomes were not active or necessary in different types of neurons. More recently, a study showed that centrosomal microtubules are dispensable for axon specification and growth in mice in vivo but are required for neuronal migration in the cerebral cortex. However, another study has linked the generation of acetylated microtubules at centrosomes with axon development. In this current study, the authors examine the effect of centrosome loss on various motor and sensory neurons and muscles mainly by examining mutants in essential centriole duplication genes. They associate axonal routing and morphology defects with centrosome loss and provide some evidence that centrosomes could still be active in the developing neurons. Overall, they conclude that centrosomes are active during at least early neuronal development and that this activity is important for proper axonal morphology and routing.

      While I think this study addresses a very interesting and important question, I think as it stands the data is not sufficient to be conclusive on a role for centrosomes during neuronal development. My biggest concern is that most phenotypes have not yet been shown to be cell autonomous, as whole animal mutants have been analysed rather than analysing the effect of cell-specific depletion, and the evidence for active centrosomes needs to be strengthened. If the authors can provide stronger evidence for a role of centrosomes in axonal development then the paper will certainly be of interest to a broad readership.

      We thank the reviewer for the clear and concise summary and fully agree that our study addresses a critical gap in understanding. Centrosomes have long been implicated in morphogenesis, yet their precise contribution to nervous system development has remained unclear. Our findings provide compelling evidence that centrosomes are indispensable for proper nervous system formation and that their absence also triggers muscular defects, highlighting their broader role in tissue organization.

      We acknowledge that the original manuscript lacked some key details; therefore, we have now strengthened our conclusions with additional experiments. Specifically, we demonstrate that these effects are cell-autonomous by using two independent RNAi lines targeted to a subset of motor neurons. Furthermore, we present new data showing that neuronal centrosomes remain active during the early stages of axonal development, emphasising their functional relevance in morphogenesis. All new experiments, figures, and corresponding text revisions are detailed below.

      Major comments

      1) The sas-6 transallelic combination shows only 17% embryonic lethality compared to 50% embryonic lethality with sas-4 mutants. Given that both mutants should result in the same degree of centrosome loss (this should be quantified in sas-6 mutants) it would suggest that either sas-4 has other roles away from centrosomes or that the sas-4 mutant chromosome used in the experiment has other mutations that affect viability. The effect of picking up "second-site lethal" mutations on mutant chromosomes is common and so I would not be surprised if this is the reason for the difference in phenotypes. This can be addressed either by "cleaning up" the sas-4 mutant chromosome by backcrossing to wild-type lines, allowing recombination to occur and replace the potential second site mutations, or by using transallelic combinations of sas-4, as they did for sas-6. The "easier" option may just be to analyse all the phenotypes with the sas-6 transallelic combination.

      We appreciate this comment, as it brought to light an issue with the CRISPR line Sas-6-Δa. Upon reanalysing all the data, we determined that this line is embryonic lethal both in homozygosis and when combined with the deficiency uncovering the genomic region, Df(3R)BSC794. In contrast, Sas-6-Δb homozygotes are viable. The inconsistency between these results raised concerns about whether the Δa and Δb Sas-6 mutants carry deletions confined to the Sas-6 coding region. Although this would not hinder our cell biology analysis, it could represent a problem in viability tests. To address this, we repeated all analyses using Sas-6-Δb homozygotes and Sas-6-Δb combined with Df(3R)BSC794. These new results are more consistent and indicate that approximately 50% of Sas-6/Def individuals hatch as adults. Fig. 3 was redone and the manuscript text changed in view of these results.

      2) Using "whole animal" mutants for assessing neuronal morphology is risky due to non-cell-autonomous effects. The authors have carried out some phenotypic analysis of neurons depleted of Sas-4 by cell-specific RNAi, but I feel they need to do this for all of their analysis. This includes embryonic lethality measures, quantification of centrosome numbers, and all axonal phenotypes in Sas-4 RNAi neurons. It would also be prudent to use 2 distinct RNAi lines to help ensure any phenotypes are not off-target effects (and this may help clarify why the authors see some additional phenotypes with RNAi). Indeed, there are relatively weak phenotypes in muscles when using RNAi compared to the mutants and these potential non-cell-autonomous effects could then have a knock-on effect on neuronal morphology. If the authors were concerned that RNAi is not very efficient (explaining any potential weaker phenotypes than in mutants) the authors could examine the effectiveness of RNAi lines by analysing protein depletion by western blotting or mRNA depletion by rt-qPCR (although this has to be done in a different cell type due to the difficulty in obtaining a neuronal extract).

      We have now added a new panel to supplementary Figure 1, showing how the expression of a different Sas-4 RNAi line (2) induces similar nervous system phenotypes when expressed only in aCC, pCC and RP2 pioneer neurons (Sup. Fig. 1 M-O).

      3) When analysing centriole presence or absence it is a good idea to stain with two different centriole markers e.g. Asl and Plp. This helps rule out unspecific staining. It is clear from the images that similar sized foci can be observed outside of the cells (see Figure 5A for example), so clearly some of the foci that appear to be within the cells may also be unspecific staining.

      In a new supplementary figure, we now show that Asl and Plp colocalize and quantify how often we find this colocalization in neurons (Suppl. Fig. 3). We also apologise for the confusion: foci appear outside the marked cells because these are whole-mount embryonic stainings and the anti-Plp antibody marks all centrosomes in all cells in the embryo.

      4) The evidence for active centrosomes is not that convincing. Acetylated tubulin is associated with stable MTs, which are not normally organised by "active" centrosomes that nucleate dynamic microtubules. Moreover, it is plausible that centriole foci happen to overlap with the acetylated tubulin staining by chance. This would explain why not all centrosomes colocalise with acetylated tubulin signal. The authors could better test centrosome activity by performing live imaging with EB1-GFP. If centrosomes are active, it is very easy to observe the many comets produced by the centrosomes.

      We appreciate the reviewer’s comment and agree that acetylated tubulin alone is not an ideal marker for centrosome activity. To address this, we performed live imaging of aCC neurons expressing EB1-GFP together with Asl-Tomato. This was technically challenging because we were imaging only two neurons per segment in live embryos, under significant limitations in fluorescence detection and timing. Despite these constraints, we were able to clearly observe EB1 comets emerging from the centrosome and moving toward the cell periphery, providing direct evidence of microtubule nucleation from centrosomes in neurons.

      Importantly, we complemented this with a microtubule depolymerization/polymerization assay, which provides unequivocal evidence that polymerization initiates at the centrosome. After depolymerization, we observed microtubule regrowth from the centrosome, confirming its role as an active microtubule-organizing centre in these neurons. Together, we hope that these results are enough to demonstrate that neuronal centrosomes are functionally active during early axonal development. These experiments are presented in Figure 6 and corresponding text in the manuscript.

      5) If the authors believe that centrosomes have a role in axon pathfinding in sensory neurons, they should show that these centrosomes are active, at least during early stages (again using EB1-GFP imaging).

      We appreciate the reviewer’s suggestion and agree that EB1-GFP imaging would be the most direct way to assess centrosome activity in sensory neurons. However, performing time-lapse imaging in these neurons is technically very demanding due to their location and accessibility in live embryos, and we did not attempt this approach. Instead, we now provide new evidence showing that sensory neuron centrosomes colocalize with both α-tubulin and γ-tubulin. This strongly supports that these centrosomes are associated with microtubule nucleation machinery and are as likely as motor neuron centrosomes to be active during early stages of axon development. These new data have been included in the revised manuscript (see Figure 5 and corresponding text).

      6) The authors mention in the discussion that "increased JNK activity, can result in axonal wiggliness (Karkali et al, 2023)". I therefore wonder whether centrosome loss may induce JNK activation (the stress response), as this would then indicate an indirect effect of centrosome loss on axonal structure rather than a direct influence of centrosome-generated microtubules. The authors could assess whether the DNK-JNK pathway is activated in neurons lacking centrosomes by expressing UAS-Puc-GFP and quantifying the nuclear signal.

      In a new supplementary figure, we now show by using a reporter for JNK signalling, as requested, that Sas-4 neurons do not activate the JNK pathway (Supl. Fig 4).

      7) In Figure 5, the authors claim that they find "a correlation between axonal guidance phenotypes and the numbers of centrioles per embryo". I don't think this is a strong correlation. The difference in centriole number between embryos with no defects and those with defects is very small. In contrast, the difference between centriole numbers in control (no defects) and mutant (no defects) is very large. So, there does not appear to be a strong correlation between centrosome number and phenotype.

      We agree and we have corrected this sentence to better explain the results.

      Minor comments

      1) I don't understand Figure 3C - why do the % of surviving homozygotes and heterozygotes add up to 100%? Should the grey boxes not relate to dead and the white to surviving?

      Thank you for pointing this out. Figures 1B and 3C represent only the surviving individuals. The grey boxes correspond to surviving homozygotes, and the white boxes correspond to surviving heterozygotes. The percentages add up to 100% only at embryonic stages because all embryos reach late embryonic stages. The grey and white boxes reflect the proportion of these two genotypes among the survivors, not the total number of embryos including those that died. We have changed the text to convey this.

      2) "In mouse fibroblasts, myoblasts and endothelial cells, centrosome orientation is important for nuclear positioning and cell migration(Chang et al, 2015; Gomes et al, 2005; Kushner et al, 2014)." Do you mean "centrosome position"?

      Yes, text changed, thank you for spotting it.

      3) In the introduction, the authors mention Meka et al. when saying the centrosomal microtubules are important for axonal development, but they should also discuss the counter argument from Vinopal et al., 2023 (Neuron) that showed how centrosomes were required for neuronal migration but not axon growth, which was instead mediated by Golgi-derived microtubules.

      Done, thank you very much.

      4) Lines 228-230 - repeated sentence

      Corrected, thank you very much.

      5) Additionally, we did not detect centrioles in the quadrant opposite the axon exit point (Fig. 2B n=75) - this data is not in Fig 2B

      Correct, it is in figure 4B, thank you very much.

      6) "This significant decrease in the humber of centrioles further supports the critical role of Sas-4 in pioneer neurons of the ventral nerve cord (VNC) during Drosophila embryogenesis". It rather highlights that Sas-4 is required for centriole formation in these neurons. Also, humber = number.

      We agree, and have changed the text, thank you very much.

      7) Result title: Non-ciliated sensory neurons have centrioles. This is kind of obvious. A better title may be "axon phenotypes correlate with centriole numbers in sensory neurons" but unfortunately I don't think there is good evidence for this (see major point above).

      We agree and have changed the title. We now believe we have strong evidence to support it, and we hope the additional data presented in the revision convincingly demonstrate this point.

      Reviewer #1 (Significance (Required)):

      As mentioned above, the advance will be important if more evidence is provided. In this case, the paper will be interesting to a broad readership. But currently the paper is limited by the lack of evidence for centrosome function and activity in the neurons.

      We hope that Reviewer 1 now considers that the manuscript is no longer limited and that it shows convincing evidence for centrosome function and activity in embryonic neurons.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary: In this manuscript, Gonzalez et al. examine the potential function of centrosomes in the neurons and muscle cells of Drosophila embryos. By studying various mutant and RNAi lines in which centriole duplication has been disrupted, they conclude that the loss of centrioles disrupts axonal pathfinding and muscle integrity.

      Major points:

      1. Throughout the manuscript, the phenotypes presented are often quite subtle. For this reason, I would really recommend that these experiments are scored blind. Perhaps the authors did this, but I didn't see any mention of this.

      All our phenotypic analyses are performed blind. We apologize for not having originally included this information in the Methods section; it has now been added. Embryos are stained using colorimetric methods (DAB) to label the nervous system, while balancer chromosomes are marked with a fluorescent antibody. This approach allows us to assess and quantify phenotypes using white light without knowing whether the embryos are homozygous mutants or heterozygous, which can only be detected by changing the channels to fluorescence.

      1. The authors conclude that neurons have active centrioles that function as centrosomes (Figure 6), but the data here is confusing. The authors state that in these cells they observe acetylated MTs extending from the centrosomes and these colocalised with g-tubulin. But the authors don't show the overlap between centrosomes, g-tubulin and MTs, as they stain for these separately. This is problematic, as it was not clear from these images that the majority of the MTs really are extending from the centrosome: the centrosome may just associate or be close by to these MT cables (Figure 6A,B). Moreover, the authors show that only a fraction of the centrosomes in these cells associate with g-tubulin, so presumably in cells where the centrosomes lack g-tubulin they would not expect the centrosomes to be associated with the MTs, but they do not show that this is the case. Perhaps the authors can't test this, but an alternative would be to show that these MT arrays are absent in Sas-4 mutants. This would give more confidence that these MTs arise from the centrosomes.

      We agree that the initial data based on acetylated microtubules and γ-tubulin colocalization were not sufficient to conclude that microtubules originate from the centrosome, as these markers can only suggest association. To address this, we have now included additional experiments that provide direct evidence of centrosome activity.

      First, we performed live imaging of aCC neurons expressing EB1-GFP together with Asl-Tomato. Despite the technical challenges of imaging only two neurons per segment in live embryos under strict fluorescence and timing constraints, we were able to clearly observe EB1 comets emerging from the centrosome and moving toward the cell periphery. This demonstrates active microtubule nucleation from centrosomes rather than mere proximity to microtubule bundles.

      Second, we carried out a microtubule depolymerization/polymerization assay, which provides unequivocal evidence that polymerization initiates at the centrosome. After depolymerization, microtubules regrew from the centrosome, confirming its role as an active microtubule-organizing center. These experiments go beyond colocalization and directly address the concern that centrosomes might simply be adjacent to microtubule cables.

      Regarding the suggestion to use Sas-4 mutants, while we did not perform this experiment, the regrowth assay combined with EB1 imaging strongly supports that these microtubules originate from the centrosome. All new data are presented in Figure 6 and the corresponding text in the revised manuscript.

      1. The authors show that muscle cell integrity is compromised by centriole-loss (Figure 2). This is very surprising as it is widely believed that centrosomes are non-functional in muscle cells, and the MTs are instead organised around the nuclear envelope. I'm not aware of the situation in Drosophila muscle cells, but the authors should ideally try to examine if the centrioles are functioning as centrosomes in these cells. At the very least they should discuss how they think centriole-loss is influencing the muscle integrity when it is widely believed they are inactive in these cells.

      We do not claim that centrosomes are active in muscle cells at these developmental stages. The observed muscle defects could result from earlier processes such as cell division, migration, or muscle fusion. We agree that this is an intriguing observation; however, pursuing this question further would go beyond the scope of the current manuscript. As requested by the reviewer, we have now expanded the discussion to consider how centriole loss might impact muscle integrity.

      Regardless of the strength of the supporting data, I think the authors should tone down their conclusions. The title and abstract led me to believe that centriole loss would cause significant problems in axonal pathfinding and muscle integrity. In all the mutant specimens examined (and certainly the low magnification views shown in Figure 1D'-F', Figure 1I'-K' and Figure 2D'-F') the mutants look very similar to the WT. Many readers may not get past the title and abstract, so the authors should make it clearer that these defects are very subtle.

      We have changed the text to convey this idea.

      Minor points:

      1. In Figures 4 and 5, CP309 staining is relied on to identify centrioles, but there is quite a background of non-specific dots, making it hard to be certain what is a centriole and what isn't. For example, in Figure 5D' there are lots of dots within some of the cells - are any of these centrioles? How can the authors be certain which dot is a centriole in some of the cells shown in Figure 5C'? Is it possible to use a second marker and only count as centrioles dots that are recognised by both antibodies?

      We thank the reviewer for this suggestion and agree that using a second marker improves confidence in centriole identification. In a new supplementary figure (Supplementary Fig. 3), we now show that Asl and Plp colocalize in neurons and provide a quantification of the frequency of this colocalization. This dual labelling confirms the identity of centrioles and addresses the concern about non-specific background.

      We also apologize for any confusion regarding the presence of foci outside the marked cells. These images are whole-mount embryonic stainings, and the anti-Plp antibody labels all centrosomes in all cells of the embryo, which explains the additional foci observed.

      In the abstract the authors state that traditionally centrosomes have been considered to be non-essential in terminally differentiated cells. I don't think this is correct. In the standard "textbook" view of a cell, the centrosome is normally positioned in the centre of the cell organising an extensive array of MTs that are thought to play an important role in organising intracellular transport, the positioning and movement of organelles and the maintenance and establishment of cell polarity. I don't think it is only recent evidence that suggests they play vital roles in terminally differentiated cells.

      We thank the reviewer for this correction and we have changed the text accordingly.

      1. Line 162 the authors state that in the RNAi knockdown lines they observe several additional phenotypes, but then in the same sentence (Line 164) they say that these defects were also observed in the original mutant and mutant/Df lines.

      We apologise for this confusion; we have rearranged the sentence for clarity.

      The sentences in Lines 281-287 don't reference any of the Figures, so it seems the authors are just stating these results without presenting any data (e.g. "Significantly, we also found a correlation between axonal guidance phenotypes and the numbers of centrioles per embryo"). If they've tested this correlation, they should show it.

      We have rearranged the sentences for better understanding.

      In Figure 7 I did not understand how the authors measured tortuosity (wiggliness) and could see no description in the methods. This is important as, again, the defect seems quite subtle, but perhaps I am not understanding which bits of the axon are being measured. Is it just the small bit of the axons close to the asterisks that is being measured, or the whole FasII track?

      We have now added another quantification and additional descriptions in the methods section.

      Reviewer #2 (Significance (Required)):

      The potential function of centrosomes in axonal outgrowth is quite controversial, so this study is potentially of considerable interest.

      However, several aspects of the data presented here were confusing or not terribly convincing. In its present state, I don't think the main conclusions are strongly enough supported by the data.

      We hope that Reviewer 2 now considers that the manuscript is no longer confusing and that it shows convincing evidence for centrosome function and activity in embryonic neurons.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The manuscript of González et al. entitled "Centriole Loss in Embryonic Development Disrupts Axonal Pathfinding and Muscle Integrity" deals with the role of centrosomes in shaping axonal morphology. To this aim the AA analysed Drosophila Sas-4 mutants that are reported to develop until adult stage without centrioles. Remarkably, the AA observe that 50% of the homozygous mutant embryos fail to hatch as larvae. The present observations suggest that centrosome loss results in axonemal shaping defects and muscle developmental abnormalities. Finally, the AA show the presence of functional centrosomes in neurons. In my opinion, the manuscript is interesting because it shows unexpected findings. However, to justify these new findings the AA are required to improve some experimental observations.

      We thank the reviewer for his summary of our work and for considering it interesting. We have taken into account all the comments and believe that these have helped improve our manuscript.

      Major:

      Abstract: It is unclear in which phenotypic condition the observations of centrosome loss or centrosome presence have been found. Please explain better. l. 36: embryos, larvae, adult, from Sas4 or controls? If mutants, the observations are very interesting since Sas4 would be without centrioles. Indeed, Basto et al. show that chemosensory neurons do not develop an axoneme in the absence of centrioles, but extend dendrites toward the sensory bristle.

      We have made clear which refer to wild-type and which are Centriole Loss (CL) conditions. CL conditions refer to mutant and downregulation conditions, whereas targeted downregulation refers to RNAi downregulation only in neurons.

      I do not think the use of "centriole" in the main title is appropriate, since the centrioles would be localized by true centriolar antigens rather than by centrosomal antigens. This problem occurs throughout the text and some figures where the AA image centrioles by centrosomal material. Only in Fig. 5A do the AA properly look at Asl localization. The other pictures of presumptive centrioles or centriole quantification report CP309 dots. This localization does not unequivocally reveal centrioles, since CP309 is essentially required for centrosome-mediated Mt nucleation. There are differentiated Drosophila tissues in which centrioles are present, but inactivated, and unable to recruit pericentriolar material. Mt are nucleated by ncMTOCs that contain centrosomal material and gamma-tubulin. Thus, the centrosomal antigens do not colocalize with centrioles.

      We have changed centrioles to centrosomes in the title and most sections in the manuscript. We have also included an extra control, showing that Asl and Plp colocalize and quantifying how often we find this colocalization in neurons (Suppl. Fig. 3). Asl is a reliable and widely used marker for centrioles, as it localizes specifically to the centriole structure (Varmark H, Llamazares S, Rebollo E, Lange B, Reina J, Schwarz H, Gonzalez C. Asterless is a centriolar protein required for centrosome function and embryo development in Drosophila. Curr Biol. 2007 Oct 23;17(20):1735-45. doi: 10.1016/j.cub.2007.09.031. PMID: 17935995).

      Minor:

      l. 58. The early arrest is mainly due to a checkpoint control. In double mutants for Sas4 and P53 the embryos survive longer, even if their further development is arrested.

      We thank the reviewer for this comment, and we have changed the text accordingly.

      1. Previous works, also quoted by the AA, reported that in mature neurons the centrosomes are inactivated, whereas the present manuscript describes functional centrosomes in Drosophila motor and peripheral nervous system. This is an intriguing observation that needs a better explanation in the Discussion section.

      We thank the reviewer for this comment, and we have changed the discussion accordingly.

      l.143-145. I understand that 50% of the Sas4 embryos that reach the adult stage have centrioles. Is that correct? If so, how do the AA explain the absence of centrioles in sensory neurons of adult flies as reported by Basto et al.?

      According to our results they already have fewer centrioles than controls at embryonic stages. In addition, as reported in Basto et al., they continue losing centrioles during larval stages and metamorphosis, which explains why centrioles are not detected at adult stages.

      l.215. It is unclear to me why the AA analyse Sas6 flies, unless they explain the mutant phenotype.

      We analysed Sas-6 mutants to strengthen our conclusions with Sas-4 and to exclude the possibility that the observed phenotypes arise from a centrosome-independent function of Sas-4. We have taken additional steps to confirm that the effects are specifically due to centrosome loss, and the use of Sas-6 mutants is one of these.

      l.221. How have the centrioles been quantified? What antibody did the AA use?

      We have quantified centrosomes using antibodies against Plp (CP309) and Asl-YFP expression.

      l.244. and Fig 4C,D. I see high background with CP309. As reported previously, I think it would be better to use antibodies against centriolar proteins, such as Sas6, Ana1, Asl, or Sas4 (if centrioles are present in 50% of mutants as the AA claim, the antibody could also be useful). In addition, I can see some CP309 spots in Fig 4E,F. Are they centrioles?

      Indeed, as we report, Sas-4 mutant embryos are not totally devoid of centrosomes. In addition, we apologise for the confusion: foci appear outside the marked cells in control embryos because these are whole-mount embryonic stainings, and the anti-Plp antibody marks all centrosomes in all cells of the embryo, not just in the neurons.

      l.270 and Fig. 5A and Fig.5 C-E. Why do the AA localize CP309 and not Asl (Fig. 5A) to detect centrioles?

      In a new supplementary figure, we now show that Asl and Plp colocalize, and we quantify how often we find this colocalization in neurons (Suppl. Fig. 3). We can therefore use CP309 in neurons to the same effect as Asl.

      L295-296. I cannot see Mts, but only a diffuse staining. I am expecting to see distinct Mt bundles.

      In Figure 5, the MT bundles are now easier to see in the new experiment in Fig. 5F-I, where we performed MT depolymerisation/repolymerisation. Nevertheless, we need to stress that we are doing these analyses in whole-mount embryonic stainings.

      L. 326-327. How do the AA explain this different lethality, even if both proteins are involved in centriole assembly?

      We have now redone all the viability and mutant phenotype analyses using the Sas-6 CRISPR mutant over the Deficiency, which is a better way to assess the phenotype.

      l. 335-337. In my opinion the quoted publications are not relevant.

      We believe that these references back up our hypothesis because:

      • Metzger et al. 2012 stress the importance of nuclear position in muscle development in Drosophila.
      • Loh et al. 2023 relate centrosomes to nuclear migration in Drosophila.
      • Tillery et al. 2018 is a review describing MTs in muscle development in Drosophila.

      l. 358-359. Does maternal contribution persist after gastrulation?

      While bulk degradation occurs by midblastula transition, some stable maternal products persist beyond gastrulation. In our case, if centrioles are formed due to the maternal contribution, they will only be diluted by cell division, which explains why we can detect centrioles at late embryonic stages.

      l.366. This is an intriguing point, but as previously observed I have some problem with centriole localization.

      References. Please make journal abbreviations uniform and check page numbers.

      We hope we have clarified this problem with the new experiments showing MT repolymerisation from the centrosomes in neurons.

      Reviewer #3 (Significance (Required)):

      The manuscript is potentially interesting for people working in cell and molecular biology, and development. However, the paper needs additional work to be suitable for publication.

      We hope that Reviewer 3 considers that the additional work and revision make this manuscript suitable for publication.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      The manuscript of González et al. entitled "Centriole Loss in Embryonic Development Disrupts Axonal Pathfinding and Muscle Integrity" deals with the role of centrosomes in shaping axonal morphology. To this aim the AA analysed Drosophila Sas-4 mutants that are reported to develop until adult stage without centrioles. Remarkably, the AA observe that 50% of the homozygous mutant embryos fail to hatch as larvae. The present observations suggest that centrosome loss results in axonemal shaping defects and muscle developmental abnormalities. Finally, the AA show the presence of functional centrosomes in neurons. In my opinion, the manuscript is interesting because it shows unexpected findings. However, to justify these new findings the AA are required to improve some experimental observations.

      Major:

      Abstract- It is unclear in which phenotypic condition the observations of centrosome loss or centrosome presence have been found. Please better explain. l.36. embryos, larvae, adult, from Sas4 or controls? If mutants, the observations are very interesting since Sas4 would be without centrioles. Indeed, Basto et al., show that chemosensory neurons do not develop an axoneme in the absence of centrioles, but extend dendrites toward the sensory bristle.

      I do not think the use of "centriole" in the main title is appropriate, since centrioles would be localized by true centriolar antigens rather than by centrosomal antigens. This problem occurs throughout the text and in some figures where the AA image centrioles by centrosomal material. Only in Fig. 5A do the AA properly look at Asl localization. The other pictures of presumptive centrioles or centriole quantification report CP309 dots. This localization does not unequivocally reveal centrioles, since CP309 is essentially required for centrosome-mediated Mt nucleation. There are differentiated Drosophila tissues in which centrioles are present, but inactivated, and unable to recruit pericentriolar material. Mt are nucleated by ncMTOCs that contain centrosomal material and gamma-tubulin. Thus, the centrosomal antigens do not colocalize with centrioles.

      Minor:

      l. 58. The early arrest is mainly due to a checkpoint control. In double mutants for Sas4 and P53 the embryos survive longer, even if their further development is arrested.

      l. 102. Previous works, also quoted by the AA, reported that in mature neurons the centrosomes are inactivated, whereas the present manuscript describes functional centrosomes in the Drosophila motor and peripheral nervous system. This is an intriguing observation that needs a better explanation in the Discussion section.

      l.143-145. I understand that 50% of the Sas4 embryos that reach the adult stage have centrioles. Is it correct? But if so, how do the AA explain the absence of centrioles in sensory neurons of adult flies as reported by Basto et al.?

      l.215. It is unclear to me why the AA analyse Sas6 flies, unless they explain the mutant phenotype.

      l.221. How have the centrioles been quantified? What antibody did the AA use?

      l.244. and Fig 4C,D. I see high background with CP309. As reported previously, I think it would be better to use antibodies against centriolar proteins, such as Sas6, Ana1, Asl, or Sas4 (if centrioles are present in 50% of mutants as the AA claim, the antibody could also be useful). In addition, I can see some CP309 spots in Fig 4E,F. Are they centrioles?

      l.270 and Fig. 5A and Fig.5 C-E. Why do the AA localize CP309 and not Asl (Fig. 5A) to detect centrioles?

      L295-296. I cannot see Mts, but only a diffuse staining. I am expecting to see distinct Mt bundles.

      L. 326-327. How do the AA explain this different lethality, even if both proteins are involved in centriole assembly?

      l. 335-337. In my opinion the quoted publications are not relevant.

      l. 358-359. Does maternal contribution persist after gastrulation?

      l.366. This is an intriguing point, but as previously observed I have some problem with centriole localization.

      References. Please make journal abbreviations uniform and check page numbers.

      Significance

      The manuscript is potentially interesting for people working in cell and molecular biology, and development. However, the paper needs additional work to be suitable for publication.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary: In this manuscript, Gonzalez et al. examine the potential function of centrosomes in the neurons and muscle cells of Drosophila embryos. By studying various mutant and RNAi lines in which centriole duplication has been disrupted, they conclude that the loss of centrioles disrupts axonal pathfinding and muscle integrity.

      Major points:

      1. Throughout the manuscript, the phenotypes presented are often quite subtle. For this reason, I would really recommend that these experiments are scored blind. Perhaps the authors did this, but I didn't see any mention of this.
      2. The authors conclude that neurons have active centrioles that function as centrosomes (Figure 6), but the data here is confusing. The authors state that in these cells they observe acetylated MTs extending from the centrosomes and these colocalised with g-tubulin. But the authors don't show the overlap between centrosomes, g-tubulin and MTs, as they stain for these separately. This is problematic, as it was not clear from these images that the majority of the MTs really are extending from the centrosome: the centrosome may just associate or be close by to these MT cables (Figure 6A,B). Moreover, the authors show that only a fraction of the centrosomes in these cells associate with g-tubulin, so presumably in cells where the centrosomes lack g-tubulin they would not expect the centrosomes to be associated with the MTs-but they do not show that this is the case. Perhaps the authors can't test this, but an alternative would be to show that these MT arrays are absent in Sas-4 mutants. This would give more confidence that these MTs arise from the centrosomes.
      3. The authors show that muscle cell integrity is compromised by centriole-loss (Figure 2). This is very surprising as it is widely believed that centrosomes are non-functional in muscle cells, and the MTs are instead organised around the nuclear envelope. I'm not aware of the situation in Drosophila muscle cells, but the authors should ideally try to examine if the centrioles are functioning as centrosomes in these cells. At the very least they should discuss how they think centriole-loss is influencing the muscle integrity when it is widely believed they are inactive in these cells.
      4. Regardless of the strength of the supporting data, I think the authors should tone down their conclusions. The title and abstract led me to believe that centriole loss would cause significant problems in axonal pathfinding and muscle integrity. In all the mutant specimens examined (and certainly the low magnification views shown in Figure 1D'-F', Figure 1I'-K' and Figure 2D'-F') the mutants look very similar to the WT. Many readers may not get past the title and abstract, so the authors should make it clearer that these defects are very subtle.

      Minor points:

      1. In Figures 4 and 5, CP309 staining is relied on to identify centrioles, but there is quite a background of non-specific dots, making it hard to be certain what is a centriole and what isn't. For example, in Figure 5D' there are lots of dots within some of the cells - are any of these centrioles? How can the authors be certain which dot is a centriole in some of the cells shown in Figure 5C'? Is it possible to use a second marker and only count as centrioles dots that are recognised by both antibodies?
      2. In the abstract the authors state that traditionally centrosomes have been considered to be non-essential in terminally differentiated cells. I don't think this is correct. In the standard "textbook" view of a cell, the centrosome is normally positioned in the centre of the cell, organising an extensive array of MTs that are thought to play an important role in organising intracellular transport, the positioning and movement of organelles, and the establishment and maintenance of cell polarity. I don't think it is only recent evidence that suggests they play vital roles in terminally differentiated cells.
      3. Line 162 the authors state that in the RNAi knockdown lines they observe several additional phenotypes, but then in the same sentence (Line 164) they say that these defects were also observed in the original mutant and mutant/Df lines.
      4. The sentences in Lines 281-287 don't reference any of the Figures, so it seems the authors are just stating these results without presenting any data (e.g. "Significantly, we also found a correlation between axonal guidance phenotypes and the numbers of centrioles per embryo"). If they've tested this correlation, they should show it.
      5. In Figure 7 I did not understand how the authors measured tortuosity (wiggliness) and could see no description in the methods. This is important as, again, the defect seems quite subtle, but perhaps I am not understanding which bits of the axon are being measured. Is it just the small bit of the axons close to the asterisks that is being measured, or the whole FasII track?
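
      The question in minor point 5 turns on how tortuosity is defined. The manuscript's own definition is not given here, so the sketch below only illustrates one common convention, assumed for illustration and not confirmed as the authors' method: the ratio of the traced path length to the straight-line distance between the two ends of the trace. The function name `tortuosity` and the point-list input are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def tortuosity(points: Sequence[Point]) -> float:
    """Arc-chord ratio of a traced path (an assumed definition, not the authors').

    A perfectly straight trace gives 1.0; wigglier traces give larger values.
    """
    if len(points) < 2:
        raise ValueError("need at least two points to define a path")
    # Path length: sum of distances between consecutive traced points.
    path_length = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    # Chord length: straight-line distance between first and last point.
    chord_length = math.dist(points[0], points[-1])
    if chord_length == 0:
        raise ValueError("start and end points coincide; ratio is undefined")
    return path_length / chord_length


straight = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
zigzag = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
print(tortuosity(straight), tortuosity(zigzag))
```

      Under this convention the result depends strongly on which stretch of the axon is traced, which is why the measured segment (a short region versus the whole FasII track) needs to be stated.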

      Significance

      The potential function of centrosomes in axonal outgrowth is quite controversial, so this study is potentially of considerable interest.

      However, several aspects of the data presented here were confusing or not terribly convincing. In its present state, I don't think the main conclusions are strongly enough supported by the data.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      In this paper, the authors address the important question of the role of centrosomes during neuronal development. They use Drosophila as an in vivo model. The field is somewhat unclear on the role and importance of centrosomes during neuronal development, although the current data would suggest they are dispensable for axon specification and growth. Early studies in cultured mammalian neurons showed that centrosomes are active and that their microtubules can be cut and transported into the neurites. But a study then showed that centrosomes in these cultured neurons are deactivated relatively early during neuronal development in vitro and that ablating centrosomes even when they are active had no obvious effect on axon specification and growth. Consistent with this, a study in Drosophila provided evidence that centrosomes were not active or necessary in different types of neurons. More recently, a study showed that centrosomal microtubules are dispensable for axon specification and growth in mice in vivo but are required for neuronal migration in the cerebral cortex. However, another study has linked the generation of acetylated microtubules at centrosomes with axon development. In this current study, the authors examine the effect of centrosome loss on various motor and sensory neurons and muscles mainly by examining mutants in essential centriole duplication genes. They associate axonal routing and morphology defects with centrosome loss and provide some evidence that centrosomes could still be active in the developing neurons. Overall, they conclude that centrosomes are active during at least early neuronal development and that this activity is important for proper axonal morphology and routing.

      While I think this study is addressing a very interesting and important question, I think as it stands the data is not sufficient to be conclusive on a role for centrosomes during neuronal development. My biggest concern is that most phenotypes have not yet been shown to be cell autonomous, as whole animal mutants have been analysed rather than analysing the effect of cell-specific depletion, and the evidence for active centrosomes needs to be strengthened. If the authors can provide stronger evidence for a role of centrosomes in axonal development then the paper will certainly be of interest to a broad readership.

      Major comments

      1. The sas-6 transallelic combination shows only 17% embryonic lethality compared to 50% embryonic lethality with sas-4 mutants. Given that both mutants should result in the same degree of centrosome loss (this should be quantified in sas-6 mutants) it would suggest that either sas-4 has other roles away from centrosomes or that the sas-4 mutant chromosome used in the experiment has other mutations that affect viability. The effect of picking up "second-site lethal" mutations on mutant chromosomes is common and so I would not be surprised if this is the reason for the difference in phenotypes. This can be addressed either by "cleaning up" the sas-4 mutant chromosome by backcrossing to wild-type lines, allowing recombination to occur and replace the potential second site mutations, or by using transallelic combinations of sas-4, as they did for sas-6. The "easier" option may just be to analyse all the phenotypes with the sas-6 transallelic combination.
      2. Using "whole animal" mutants for assessing neuronal morphology is risky due to non-cell-autonomous effects. The authors have carried out some phenotypic analysis of neurons depleted of Sas-4 by cell-specific RNAi, but I feel they need to do this for all of their analysis. This includes embryonic lethality measures, quantification of centrosome numbers, and all axonal phenotypes in Sas-4 RNAi neurons. It would also be prudent to use 2 distinct RNAi lines to help ensure any phenotypes are not off-target effects (and this may help clarify why the authors see some additional phenotypes with RNAi). Indeed, there are relatively weak phenotypes in muscles when using RNAi compared to the mutants and these potential non-cell-autonomous effects could then have a knock-on effect on neuronal morphology. If the authors were concerned that RNAi is not very efficient (explaining any potential weaker phenotypes than in mutants) the authors could examine the effectiveness of RNAi lines by analysing protein depletion by western blotting or mRNA depletion by rt-qPCR (although this has to be done in a different cell type due to the difficulty in obtaining a neuronal extract).
      3. When analysing centriole presence or absence it is a good idea to stain with two different centriole markers e.g. Asl and Plp. This helps rule out unspecific staining. It is clear from the images that similar sized foci can be observed outside of the cells (see Figure 5A for example), so clearly some of the foci that appear to be within the cells may also be unspecific staining.
      4. The evidence for active centrosomes is not that convincing. Acetylated tubulin is associated with stable MTs, which are not normally organised by "active" centrosomes that nucleate dynamic microtubules. Moreover, it is plausible that centriole foci happen to overlap with the acetylated tubulin staining by chance. This would explain why not all centrosomes colocalise with acetylated tubulin signal. The authors could better test centrosome activity by performing live imaging with EB1-GFP. If centrosomes are active, it is very easy to observe the many comets produced by the centrosomes.
      5. If the authors believe that centrosomes have a role in axon pathfinding in sensory neurons, they should show that these centrosomes are active, at least during early stages (again using EB1-GFP imaging).
      6. The authors mention in the discussion that "increased JNK activity, can result in axonal wiggliness (Karkali et al, 2023)". I therefore wonder whether centrosome loss may induce JNK activation (the stress response), as this would then indicate an indirect effect of centrosome loss on axonal structure rather than a direct influence of centrosome-generated microtubules. The authors could assess whether the DNK-JNK pathway is activated in neurons lacking centrosomes by expressing UAS-Puc-GFP and quantifying the nuclear signal.
      7. In Figure 5, the authors claim that they find "a correlation between axonal guidance phenotypes and the numbers of centrioles per embryo". I don't think this is a strong correlation. The difference in centriole number between embryos with no defects and those with defects is very small. In contrast, the difference between centriole numbers in control (no defects) and mutant (no defects) is very large. So, there does not appear to be a strong correlation between centrosome number and phenotype.

      Minor comments

      1. I don't understand Figure 3C - why do the % of surviving homozygotes and heterozygotes add up to 100%? Should the grey boxes not relate to dead and the white to surviving?
      2. "In mouse fibroblasts, myoblasts and endothelial cells, centrosome orientation is important for nuclear positioning and cell migration(Chang et al, 2015; Gomes et al, 2005; Kushner et al, 2014)." Do you mean "centrosome position"?
      3. In the introduction, the authors mention Meka et al. when saying the centrosomal microtubules are important for axonal development, but they should also discuss the counter argument from Vinopal et al., 2023 (Neuron) that showed how centrosomes were required for neuronal migration but not axon growth, which was instead mediated by Golgi-derived microtubules.
      4. Lines 228-230 - repeated sentence
      5. Additionally, we did not detect centrioles in the quadrant opposite the axon exit point (Fig. 2B n=75) - this data is not in Fig 2B
      6. "This significant decrease in the humber of centrioles further supports the critical role of Sas-4 in pioneer neurons of the ventral nerve cord (VNC) during Drosophila embryogenesis". It rather highlights that Sas-4 is required for centriole formation in these neurons. Also, humber = number.
      7. Result title: Non-ciliated sensory neurons have centrioles. This is kind of obvious. A better title may be "axon phenotypes correlate with centriole numbers in sensory neurons" but unfortunately I don't think there is good evidence for this (see major point above).

      Significance

      As mentioned above, the advance will be important if more evidence is provided. In this case, the paper will be interesting to a broad readership. But currently the paper is limited by the lack of evidence for centrosome function and activity in the neurons.

    1. Reviewer #1 (Public Review):

      Summary:

      Argunşah et al. describe and investigate the mechanisms underlying the differential response dynamics of barrel vs septa domains in the whisker-related primary somatosensory cortex (S1). Upon repeated stimulation, the authors report that the response ratio between multi- and single-whisker stimulation increases in layer (L) 4 neurons of the septal domain, while remaining constant in barrel L4 neurons. The authors attribute this divergence to differences in short-term synaptic plasticity, particularly within somatostatin-expressing (SST⁺) interneurons. This interpretation is supported by 1) the increased density of SST+ neurons in L4 of the septa compared to barrel domain, 2) the stronger response of (L2/3) SST+ neurons to repeated multi- vs single-whisker stimulation and 3) the reduced functional difference in single- versus multi-whisker response ratios across barrel and septal domains in Elfn1 KO mice, which lack a synaptic protein that confers characteristic short-term plasticity, notably in SST+ neurons. Consistently, a decoder trained on WT data fails to generalize to Elfn1 KO responses. Finally, the authors report a relative enrichment of S2- and M1-projecting cell densities in L4 of the septal domain compared to the barrel domain, suggesting that septal and barrel circuits may differentially route information about single vs multi-whisker stimulation downstream of S1.

      Strengths:

      This paper describes and aims to study a circuit underlying the differential response between barrel columns and septal domains of the primary somatosensory cortex. This work supports the view that these two domains contribute distinctly to the processing of single- versus multi-whisker inputs and highlights the role of SST+ neurons and their short-term plasticity. Together, this study suggests that the barrel cortex multiplexes whisker-derived sensory information across its domains, enabling parallel processing within S1.

      Weaknesses:

      Although the divergence in responses to repeated single- versus multi-whisker stimulation between barrel and septal domains is consistent with a role for SST⁺ neuron short-term plasticity, the evidence presented does not conclusively demonstrate that this mechanism is the critical driver of the difference. The lack of targeted recordings and manipulations limits the strength of this conclusion: SST⁺ neuron activity is not measured in L4, nor is it assessed in a domain-specific manner. The Elfn1 knockout manipulation does not appear to selectively affect either stimulus condition, domain or interneuron subtype. Finally, all experiments were performed under anesthesia, which raises concerns about how well the reported dynamics generalize to awake cortical processing.

    2. Reviewer #2 (Public review):

      Summary:

      Argunsah and colleagues demonstrate that SST expressing interneurons are concentrated in the mouse septa and differentially respond to repetitive multi-whisker inputs. Identifying how a specific neuronal phenotype impacts responses is an advance.

      Strengths:

      (1) Careful physiological and imaging studies.

      (2) Novel result showing the role of SST+ neurons in shaping responses.

      (3) Good use of a knockout animal to further the main hypothesis.

      (4) Clear analytical techniques.

      Comments on revisions:

      The authors have effectively responded to my initial critiques - I have no further concerns.

    3. Reviewer #3 (Public review):

      Summary:

      This study investigates the functional differences between barrel and septal columns in the mouse somatosensory cortex, focusing on how local inhibitory dynamics (particularly involving SST⁺ interneurons) may mediate temporal integration of multi-whisker (MW) stimuli in septa. Using a combination of in vivo multi-unit recordings, calcium imaging, and anatomical tracing, the authors propose a model in which Elfn1-dependent synaptic facilitation onto SST⁺ interneurons contributes to the distinct sensory responses to MW input in barrels and septa, enabling functional segregation between these domains.

      Strengths:

      The study presents a thought-provoking and useful conceptual model for understanding sensory processing in the somatosensory cortex. While barrel columns have been widely studied, septal regions remain relatively understudied in mice. If septa indeed act as selective integrators of distributed sensory input, this would suggest a novel computational role for cortical microcircuits beyond the classical view focused on barrels. Although still hypothetical, the proposed model in which SST⁺ interneurons contribute to domain-specific sensory responses between barrel and septal domains is intriguing and opens new avenues for investigating inhibitory circuit mechanisms.

      Weaknesses:

      The primary limitation of this study lies in the spatial and cellular specificity of the recording techniques. The physiological data rely predominantly on unsorted multi-unit activity (MUA) recorded with low-channel-count silicon probes. Because MUA aggregates signals from multiple neurons over a radius of approximately 50-100 µm (often wider than the typical septal width in mice), this approach makes it difficult to confidently isolate activity originating strictly from within septal domains. The manuscript would benefit from additional analyses to validate the spatial specificity of these recordings, such as systematically varying spike detection thresholds to test the robustness of domain attribution, as suggested by the reviewer. Furthermore, although the authors now appropriately frame their findings in the Elfn1 knockout mice as indirect evidence, it is worth emphasizing that the study lacks direct in vivo, cell-type-specific recordings and manipulations to more definitively test the proposed mechanism.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Reviews):

      Summary:

      Argunşah et al. describe and investigate the mechanisms underlying the differential response dynamics of barrel vs septa domains of the whisker-related primary somatosensory cortex (S1). Upon repeated stimulation, the authors report that the response ratio between multi- and single-whisker stimulation increases in layer (L) 4 neurons of the septal domain, while remaining constant in barrel L4 neurons. This difference is attributed to the short-term plasticity properties of interneurons, particularly somatostatin-expressing (SST+) neurons. This claim is supported by the increased density of SST+ neurons found in L4 of the septa compared to barrels, along with a stronger response of (L2/3) SST+ neurons to repeated multi- vs single-whisker stimulation. The role of the synaptic protein Elfn1 is then examined. Elfn1 KO mice exhibited little to no functional domain separation between barrel and septa, with no significant difference in single- versus multi-whisker response ratios across barrel and septal domains. Consistently, a decoder trained on WT data fails to generalize to Elfn1 KO responses. Finally, the authors report a relative enrichment of S2- and M1-projecting cell densities in L4 of the septal domain compared to the barrel domain.

      Strengths:

      This paper describes and aims to study a circuit underlying differential response between barrel columns and septal domains of the primary somatosensory cortex. This work supports the view that barrel and septal domains contribute differently to processing single versus multi-whisker inputs, suggesting that the barrel cortex multiplexes sensory information coming from the whiskers in different domains.

      We thank the reviewer for the very neat summary of our findings that barrel cortex multiplexes converging information in separate domains.

      Weaknesses:

      While the observed divergence in responses to repeated SWS vs MWS between the barrel and septal domains is intriguing, the presented evidence falls short of demonstrating that short-term plasticity in SST+ neurons critically underpins this difference. The absence of a mechanistic explanation for this observation limits the work’s significance. The measurement of SST neurons’ response is not specific to a particular domain, and the Elfn1 manipulation does not seem to be specific to either stimulus type or a particular domain.

      We appreciate the reviewer’s perspective. Although further research is needed to understand the circuit mechanisms underlying the observed phenomenon, we believe our data suggest that altering the short-term dynamics of excitatory inputs onto SST neurons reduces the divergent spiking dynamics in barrels versus septa during repetitive single- and multi-whisker stimulation. Future work could examine how SST neurons, whose somata reside in barrels and septa, respond to different whisker stimuli and the circuits in which they are embedded. At this time, however, the authors believe there is no alternative way to test how the short-term dynamics of excitatory inputs onto SST neurons, as a whole, contribute to the temporal aspects of barrel versus septa spiking.

      The study's reach is further constrained by the fact that results were obtained in anesthetized animals, which may not generalize to awake states.

      We appreciate the reviewer’s concern regarding the generalizability of our findings from anesthetized animals to awake states. Anesthesia was employed to ensure precise stimulation of individual whiskers (and of multiple whiskers in the same animal), which is challenging in awake rodents due to active whisking. While anesthesia may alter higher-order processing, core mechanisms, such as short- and long-term plasticity in the barrel cortex, are preserved under anesthesia (Martin-Cortecero et al., 2014; Mégevand et al., 2009).

      The statistical analysis appears inappropriate, with the use of repeated independent tests, dramatically boosting the false positive error rate.

      Thank you for your feedback on our analysis, which used independent rank-based tests for each time point in wild-type (WT) animals. To address concerns regarding multiple comparisons and temporal dependencies (so far for Figures 1F and 4D; we will add more in our revision), we performed a repeated measures ANOVA for WT animals (13 Barrel, 8 Septa, 20 time points), which revealed a significant main effect of Condition (F(1,19) = 16.33, p < 0.001) and a significant Condition-Time interaction (F(19,361) = 2.37, p = 0.001). Post hoc tests confirmed significant differences between Barrel and Septa at multiple time points (e.g., p < 0.0025 at times 3, 4, 6, 7, 8, 10, 11, 12, 16, 19 after Bonferroni correction), supporting a differential multi-whisker vs. single-whisker ratio response in WT animals. In contrast, a repeated measures ANOVA for knock-out (KO) animals (11 Barrel, 7 Septa, 20 time points) showed no significant main effect of Condition (F(1,14) = 0.17, p = 0.684) or Condition-Time interaction (F(19,266) = 0.73, p = 0.791), indicating that the Barrel-Septa difference observed in WT animals is absent in KO animals.
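
      The corrected threshold quoted above is consistent with dividing a family-wise alpha by the number of time points (0.05 / 20 = 0.0025). The following is a minimal sketch of that arithmetic, assuming a family-wise alpha of 0.05, which the response does not state explicitly. The function names and the p-values are hypothetical and are not the study's data.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant_time_points(p_values, alpha=0.05):
    """Return 1-based indices of time points whose p-value survives correction."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [i for i, p in enumerate(p_values, start=1) if p < threshold]


# Hypothetical p-values for 20 time points (illustrative only).
example_p = [0.04, 0.2, 0.001, 0.002] + [0.5] * 16

print(bonferroni_threshold(0.05, 20))       # approximately 0.0025
print(significant_time_points(example_p))   # [3, 4]
```

      Note that the uncorrected p = 0.04 at the first time point would pass a per-test alpha of 0.05 but does not survive the correction, which is the false-positive inflation the reviewer was concerned about.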

      Furthermore, the manuscript suffers from imprecision; its conclusions are occasionally vague or overstated. The authors suggest a role for SST+ neurons in the observed divergence in SWS/MWS responses between barrel and septal domains. However, this remains speculative, and some findings appear inconsistent. For instance, the increased response of SST+ neurons to MWS versus SWS is not confined to a specific domain. Why, then, would preferential recruitment of SST+ neurons lead to divergent dynamics between barrel and septal regions? The higher density of SST+ neurons in septal versus barrel L4 is not a sufficient explanation, particularly since the SWS/MWS response divergence is also observed in layers 2/3, where no difference in SST+ neuron density is found.

      Moreover, SST+ neuron-mediated inhibition is not necessarily restricted to the layer in which the cell body resides. It remains unclear through which differential microcircuits (barrel vs septum) the enhanced recruitment of SST+ neurons could account for the divergent responses to repeated SWS versus MWS stimulation.

      We fully appreciate the reviewer’s comment. We currently do not provide any evidence on the contribution of L4 SST neurons in barrels versus septa to the divergence in spiking responses observed between SWS and MWS. We only show that these neurons are differentially distributed across the two domains in this layer. There is known molecular and circuit-based diversity among SST-positive neurons in different layers of the cortex, so it is plausible that this diversity extends to cells located in the two domains of vS1, something that has not been examined so far. Our data on their distribution are one piece of evidence that SST neurons may play a differential role in inhibiting barrel stellate cells versus septal ones. Morphological reconstructions of SST neurons in L4 of the somatosensory barrel cortex have shown that their dendrites and axons project locally and may be confined to individual domains, although this was not specifically examined (Fig. 3 of Scala F et al., 2019). The same study also showed that L4 SST cells receive excitatory input from local stellate cells, and they are known to be directly excited by thalamocortical fibers as well (Beierlein et al., 2003; Tan et al., 2008); both of these inputs facilitate.

      As shown in our supplementary figure, the divergence is also observed in L2/3 where, as the reviewer also points out, we do not find a differential distribution of SST cells, at least based on a columnar analysis extending from L4. Multiple scenarios could explain this “discrepancy” and would need to be examined in future studies. The most straightforward is that the divergence in spiking in L2/3 domains is inherited from the L4 domains on which L4 SST neurons act. Another is that, even though L2/3 SST neurons are not biased in their distribution, their input-output function is, which would need to be examined with detailed in vitro electrophysiological and perhaps optogenetic approaches in S1. Despite the distinctive differences that have been found between the L4 circuitry in S1 and V1 (Scala F et al., 2019), recent observations indicate that small but regular patches of V1 marked by the absence of muscarinic receptor 2 (M2) have high temporal acuity (Ji et al., 2015), and selectively receive input from SST interneurons (Meier et al., 2025). Regions lacking M2 have distinct input and output connectivity patterns from those that express M2 (Meier et al., 2021; Burkhalter et al., 2023). These findings, together with ours, suggest that SST cells preferentially innervate and regulate specific domains (columns) in sensory cortices.

      Regardless of the mechanism, the Elfn1 knock-out mouse line almost exclusively affects the incoming excitability onto SST neurons (see also reply to comment below), hence what can be supported by our data is that changing the incoming short-term synaptic plasticity onto these neurons brings the spiking dynamics between barrels and septa closer together.

      The Elfn1 KO mouse model seems too unspecific to suggest the role of the short-term plasticity in SST+ neurons in the differential response to repeated SWS vs MWS stimulation across domains. Why would Elfn1-dependent short-term plasticity in SST+ neurons be specific to a pathway, or a stimulation type (SWS vs MWS)? Moreover, the authors report that Elfn1 knockout alters synapses onto VIP+ as well as SST+ neurons (Stachniak et al., 2021; previous version of this paper), so why attribute the phenotype solely to SST+ circuitry? In fact, the functional distinctions between barrel and septal domains appear largely abolished in the Elfn1 KO.

      Previous work by others and us has shown that globally removing Elfn1 selectively removes a synaptic process from the brain without altering brain anatomy or structure. This allows us to study how the temporal dynamics of inhibition shape activity, as opposed to inhibition from particular cell types. We will nevertheless update the text to discuss more global implications for SST interneuron dynamics and include a reference to VIP interneurons that contain Elfn1.

      When comparing SWS to MWS, we find that MWS replaces the neighboring excitation which would normally be preferentially removed by short-term plasticity in SST interneurons, thus providing a stable control comparison across animals and genotypes. On average, VIP interneurons failed to show modulation by MWS. We were unable to measure a substantial contribution of VIP cells to this process and also note that the Elfn1 expressing multipolar neurons comprise only ~5% of VIP neurons (Connor and Peters, 1984; Stachniak et al., 2021), a fraction that may be lost when averaging from 138 VIP cells. Moreover, the effect of Elfn1 loss on VIP neurons is quite different and marginal compared to that of SST cells, suggesting that the primary impact of Elfn1 knockout is mediated through SST+ interneuron circuitry. Therefore, even if we cannot rule out that these 5% of VIP neurons contribute to barrel domain segregation, we are of the opinion that their influence would be very limited if any.

      Reviewer #2 (Public Reviews):

      Summary:

      Argunsah and colleagues demonstrate that SST-expressing interneurons are concentrated in the mouse septa and differentially respond to repetitive multi-whisker inputs. Identifying how a specific neuronal phenotype impacts responses is an advance.

      Strengths:

      (1)  Careful physiological and imaging studies.

      (2)  Novel result showing the role of SST+ neurons in shaping responses.

      (3)  Good use of a knockout animal to further the main hypothesis.

      (4)  Clear analytical techniques.

      We thank the reviewer for their appreciation of the study.

      Weaknesses:

      No major weaknesses were identified by this reviewer. Overall, I appreciated the paper but feel it overlooked a few issues and had some recommendations on how additional clarifications could strengthen the paper. These include:

      (1) Significant work from Jerry Chen on how S1 neurons that project to M1 versus S2 respond in a variety of behavioral tasks should be included (e.g. PMID: 26098757). Similarly, work from Barry Connor’s lab on intracortical versus thalamocortical inputs to SST neurons, as well as excitatory inputs onto these neurons (e.g. PMID: 12815025) should be included.

      We thank the reviewer for pointing us to these valuable resources, which we overlooked. We will include Chen et al. (2015), Cruikshank et al. (2007) and Gibson et al. (1999) to contextualize S1 projections and SST+ inputs, strengthening the study’s foundation, as well as Beierlein et al. (2003), which nicely shows both local and thalamocortical facilitation of excitatory inputs onto L4 SST neurons, in contrast to PV cells. That paper also shows the gradual recruitment of SST neurons by thalamocortical inputs to provide feed-forward inhibition onto stellate (regular-spiking) cells in L4 of the rat barrel cortex.

      (2) Using Layer 2/3 as a proxy to what is happening in layer 4 (~line 234). Given that layer 2/3 cells integrate information from multiple barrels, as well as receiving direct VPm thalamocortical input, and given the time window that is being looked at can receive input from other cortical locations, it is not clear that layer 2/3 is a proxy for what is happening in layer 4.

      We agree with the reviewer that what we observe in L2/3 is not necessarily what is taking place in L4 SST-positive cells. The L2/3 data were included to show that these cells, as a population, can respond differently to SWS and MWS, which is not seen in L2/3 VIP neurons. Regardless of the underlying mechanisms, our overall data support two points: SST-positive neurons can change their activation depending on the type of whisker stimulus, and when the excitatory input dynamics onto these neurons change due to the removal of Elfn1, the temporal profile of barrel versus septa spiking changes. Having said that, the data shown in Supplementary Figure 3 on the response properties of L2/3 neurons above the septa versus above the barrels (one would say in the respective columns) do show the same divergence as in L4. This suggests that a circuit motif common to both layers may exist, involving SST neurons that sit in L4, L5 or even L2/3. It implies that, despite the differences in the distribution of SST neurons between septa and barrels in L4, an as yet unidentified input-output spatial connectivity motif is engaged in both L2/3 and L4. Please also see our response to a similar point raised by reviewer 1.

      (3) Line 267, when discussing distinct temporal response, it is not well defined what this is referring to. Are the neurons no longer showing peaks to whisker stimulation, or are the responses lasting a longer time? It is unclear why PV+ interneurons which may not be impacted by the Elfn1 KO and receive strong thalamocortical inputs, are not constraining activity.

      We thank the reviewer for their comment and will clarify the statement.

      This convergence of response profiles was further evident in stimulus-aligned stacked images, where the emergent differences between barrels and septa under SWS were largely abolished in the KO (Figure 4B). A distinction between directly stimulated barrels and neighboring barrels persisted in the KO. In addition, the initial response continued to differ between barrel and septa, and between septa and neighbor (Figure 4B). This initial stimulus selectivity potentially represents distinct feedforward thalamocortical activity, which includes PV+ interneuron recruitment that is not directly impacted by the Elfn1 KO (Sun et al., 2006; Tan et al., 2008). PV+ cells are strongly excited by thalamocortical inputs, but these exhibit short-term depression, as does their output, contrasting with the sustained facilitation observed in SST+ neurons. These findings suggest that in WT animals, activity spillover from principal barrels is normally constrained by the progressive engagement of SST+ interneurons in septal regions, driven by Elfn1-dependent facilitation at their excitatory synapses. In the absence of Elfn1, this local inhibitory mechanism is disrupted, leading to longer responses in barrels, delayed but stronger responses in septa, and persistently stronger responses in unstimulated neighbors, resulting in a loss of distinction between the responses of barrel and septa domains that normally diverge over time (see Author response image 1 below).

      Author response image 1.

      (A) Barrel responses are longer following whisker stimulation in KO. (B) Septal responses are slightly delayed but stronger in KO. (C) Unstimulated neighbors show longer persistent responses in KO.

       

      (4) Line 585 “the earliest CSD sink was identified as layer 4…” were post-hoc measurements made to determine where the different shank leads were based on the post-hoc histology?

      Post hoc histology was performed on plane-aligned brain sections, which allowed us to detect barrels and septa and so confirm the insertion domain of each recorded shank. The layer specificity of each electrode could therefore not be confirmed by histology, as we did not have coronal sections in which to measure electrode depth.

      (5) For the retrograde tracing studies, how were the M1 and S2 injections targeted (stereotaxically or physiologically)? How was it determined that the injections were in the whisker region (or not)?

      During the retrograde virus injection, the location of M1 and S2 injections was determined by stereotaxic coordinates (Yamashita et al., 2018). After acquiring the light-sheet images, we were able to post hoc examine the injection site in 3D and confirm that the injections were successful in targeting the regions intended. Although it would have been informative to do so, we did not functionally determine the whisker-related M1 and whisker-related S2 region in this experiment.

      (6) Were there any baseline differences in spontaneous activity in the septa versus barrel regions, and did this change in the KO animals?

      Thank you for this interesting question. Our previous study found that there was a reduction in baseline activity in L4 barrel cortex of KO animals at postnatal day (P)12, but no differences were found at P21 (Stachniak et al., 2023).

      Reviewer #3 (Public Reviews):

      Summary:

      This study investigates the functional differences between barrel and septal columns in the mouse somatosensory cortex, focusing on how local inhibitory dynamics, particularly involving Elfn1-expressing SST⁺ interneurons, may mediate temporal integration of multiwhisker (MW) stimuli in septa. Using a combination of in vivo multi-unit recordings, calcium imaging, and anatomical tracing, the authors propose that septa integrate MW input in an Elfn1-dependent manner, enabling functional segregation from barrel columns.

      Strengths:

      The core hypothesis is interesting and potentially impactful. While barrels have been extensively characterized, septa remain less understood, especially in mice, and this study's focus on septal integration of MW stimuli offers valuable insights into this underexplored area. If septa indeed act as selective integrators of distributed sensory input, this would add a novel computational role to cortical microcircuits beyond what is currently attributed to barrels alone. The narrative of this paper is intellectually stimulating.

      We thank the reviewer for finding the study intellectually stimulating.

      Weaknesses:

      The methods used in the current study lack the spatial and cellular resolution needed to conclusively support the central claims. The main physiological findings are based on unsorted multi-unit activity (MUA) recorded via low-channel-count silicon probes. MUA inherently pools signals from multiple neurons across different distances and cell types, making it difficult to assign activity to specific columns (barrel vs. septa) or neuron classes (e.g., SST⁺ vs. excitatory).

      The recording radius (~50-100 µm or more) and the narrow width of septa (~50-100 µm or less) make it likely that MUA from "septal" electrodes includes spikes from adjacent barrel neurons.

      The authors do not provide spike sorting, unit isolation, or anatomical validation that would strengthen spatial attribution. Calcium imaging is restricted to SST⁺ and VIP⁺ interneurons in superficial layers (L2/3), while the main MUA recordings are from layer 4, creating a mismatch in laminar relevance.

      We thank the reviewer for pointing out the possibility of contamination in septal electrodes. Although it is reported in the methods (line 583), we may not have highlighted that we used an extremely high threshold (7.5 std) for spike detection precisely to overcome the issue raised here; this threshold limits such spatial contamination. Since spike amplitude decays rapidly with distance, at high thresholds only nearby neurons, potentially one or two, contribute to our analysis. We believe that this approach provides a very close approximation of single unit activity (SUA) in our reported data. We will include a sentence earlier in the manuscript to make this explicit and prevent further confusion.
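
      To illustrate why a high threshold restricts detection to large, and therefore nearby, events, here is a minimal sketch of threshold-crossing detection at 7.5 standard deviations. The function name, the use of the whole-trace standard deviation as the noise estimate, the assumption of negative-going spikes, and the toy trace are all illustrative assumptions; the manuscript's actual detection pipeline is not described in this response.

```python
import statistics


def detect_spikes(trace, n_std=7.5):
    """Return sample indices where the trace first drops below mean - n_std * SD.

    Assumes negative-going extracellular spikes and uses the population
    standard deviation of the whole trace as the noise estimate
    (an illustrative choice, not necessarily the authors' method).
    """
    mean = statistics.fmean(trace)
    sd = statistics.pstdev(trace)
    threshold = mean - n_std * sd
    crossings = []
    below = False
    for i, value in enumerate(trace):
        if value < threshold and not below:
            crossings.append(i)  # keep only the first sample of each crossing
        below = value < threshold
    return crossings


# Toy trace: +/-1 background, one small event (distant unit) and one large
# event (nearby unit). Amplitudes are made up for illustration.
trace = [1.0 if i % 2 == 0 else -1.0 for i in range(1000)]
trace[200] = -5.0    # small-amplitude event
trace[500] = -50.0   # large-amplitude event

print(detect_spikes(trace))             # [500]
print(detect_spikes(trace, n_std=2.0))  # [200, 500]
```

      In this toy case the small event is picked up at a permissive threshold but excluded at 7.5 SD, which mirrors the argument that only the largest, closest units contribute at high thresholds.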

      Regarding the point that calcium imaging was performed on L2/3 SST and VIP cells instead of L4, reviewers 1 and 2 raised the same issue and we responded as follows. As shown in our supplementary figure, the divergence is also observed in L2/3 where we do not find a differential distribution of SST cells, at least based on a columnar analysis extending from L4. Multiple scenarios could explain this “discrepancy” and would need to be examined in future studies. The most straightforward is that the divergence in spiking in L2/3 domains is inherited from the L4 domains on which L4 SST neurons act. Another is that, even though L2/3 SST neurons are not biased in their distribution, their input-output function is, which would need to be examined with detailed in vitro electrophysiological and perhaps optogenetic approaches in S1. Despite the distinctive differences that have been found between the L4 circuitry in S1 and V1 (Scala F et al., 2019), recent observations indicate that small but regular patches of V1 marked by the absence of muscarinic receptor 2 (M2) have high temporal acuity (Ji et al., 2015), and selectively receive input from SST interneurons (Meier et al., 2025). Regions lacking M2 have distinct input and output connectivity patterns from those that express M2 (Meier et al., 2021; Burkhalter et al., 2023). These findings, together with ours, suggest that SST cells preferentially innervate and regulate specific domains (columns) in sensory cortices.

      Furthermore, while the role of Elfn1 in mediating short-term facilitation is supported by prior studies, no new evidence is presented in this paper to confirm that this synaptic mechanism is indeed disrupted in the knockout mice used here.

      We thank Reviewer #3 for noting the absence of new evidence confirming that short-term facilitation is disrupted in our knockout mice. We acknowledge that our study relies on strong previously published data demonstrating that Elfn1 mediates short-term synaptic facilitation of excitatory inputs onto SST+ interneurons (Sylwestrak and Ghosh, 2012; Tomioka et al., 2014; Stachniak et al., 2019, 2023). These studies consistently show that Elfn1 knockout abolishes facilitation in SST+ synapses, leading to altered temporal dynamics, which we hypothesize underlies the observed loss of barrel-septa response divergence in our Elfn1 KO mice (Figure 4). Nevertheless, to address the point raised, we will clarify in the revised manuscript (around lines 245-247 and 271-272) that our conclusions are based on these established findings, stating: “Building on prior evidence that Elfn1 knockout disrupts short-term facilitation in SST+ interneurons (Sylwestrak and Ghosh, 2012; Tomioka et al., 2014; Stachniak et al., 2019, 2023), we attribute the abolished barrel-septa divergence in Elfn1 KO mice to altered SST+ synaptic dynamics, though direct synaptic measurements were not performed here.”

      Additionally, since Elfn1 is constitutively knocked out from development, the possibility of altered circuit formation, including changes in barrel structure and interneuron distribution, cannot be excluded and is not addressed.

      We thank Reviewer #3 for raising the valid concern that constitutive Elfn1 knockout could potentially alter circuit formation, including barrel structure and interneuron distribution. To address this, we will clarify in the revised manuscript (around line 271 and in the Discussion) that in our previous studies, which included both whole-cell patch-clamp in acute brain slices ranging from postnatal day 11 to 22 (P11 - P21) and in vivo recordings from barrel cortex at P12 and P21, we saw no gross abnormalities in barrel structure, with Layer 4 barrels maintaining their characteristic size and organization, consistent with wildtype (WT) mice (Stachniak et al., 2019, 2023). While we cannot fully exclude subtle developmental changes, prior studies indicate that Elfn1 primarily modulates synaptic function rather than cortical cytoarchitecture (Tomioka et al., 2014). Elfn1 KO mice show no gross morphological or connectivity differences, and the pattern and abundance of Elfn1-expressing cells (assessed by LacZ knock-in) appear normal (Dolan and Mitchell, 2013).

      We will add the following to the Discussion: “Although Elfn1 is constitutively knocked out, we find here and in previous studies that barrel structure is preserved (Stachniak et al., 2019, 2023). Further, the distribution of Elfn1 expressing interneurons is not different in KO mice, suggesting minimal developmental disruption (Dolan and Mitchell, 2013).

      Nonetheless, we acknowledge that subtle circuit changes cannot be ruled out without the use of a time-dependent conditional knockout of the gene.”

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors): 

      (1) My biggest concern is regarding statistics. Did the authors repeatedly apply independent tests (Mann-Whitney) without any correction for multiple comparisons (Figures 1 and 4)? In that case, the chances of a spurious "significant" result rise dramatically. 

      In response to the reviewer’s comment, we now present new statistical results obtained with ANOVA and have integrated them into the manuscript between lines 172 and 192 for WT data and between lines 282 and 298 for Elfn1 KO data. This new statistical approach shows the same differences as we had previously reported, thereby consolidating the statements made.

      (2) The findings only hint at a mechanism involving SST+ neurons for how SWS and MWS are processed differently in the barrel vs septal domains. As a direct test of SST+ neuron involvement in the divergence of barrel and septal responses, the authors might consider SST-specific manipulations - for example, inhibitory chemo- or optogenetics during SWS and MWS stimulation.

      We thank the reviewer for this comment and agree that direct manipulation of SST+ neurons via inhibitory chemo- or optogenetics could provide further supporting evidence for the main claims of our study. We have opted not to perform these experiments for this manuscript, as we feel they can be part of a future study. At the same time, it is conceivable that such manipulations, depending on how they are performed, may lead to larger and non-specific effects on cortical activity, since SST neurons would likely be completely shut down. So even though we certainly appreciate and value the strengths of such approaches, our experiments addressed a more nuanced hypothesis, namely that the synaptic dynamics onto SST+ neurons matter for the response divergence of septa versus barrels, which could not have been easily and concretely addressed by manipulating SST+ cell firing activity.

      (3) In general, it is hard to comprehend what microcircuit could lead to the observed divergence in the MWS/SWS ratio in the barrel vs septal domain. The preferential recruitment of SST+ neurons during MWS is not specific to a particular domain, and the higher density of SST+ neurons specifically in L4 septa cannot per se explain the diverging MWS/SWS ratio in L4 septal neurons, since similar ratio divergence is observed across domains in L2/3 neurons without increased SST+ neuron density in L2/3. This view would also assume that SST+ inhibition remains contained to its own layer and domain. Is this the case? Is it that different microcircuits between barrels and septa differently shape the response to repeated MWS? This is partially discussed in the paper; can the authors develop on that? What would the proposed mechanism be? Can the short-term plasticity of the thalamic inputs (VPM vs POm) be part of the picture?

      We thank the reviewer for raising this important point. We propose that the divergence in MWS/SWS ratios across barrel and septal domains arises from dynamic microcircuit interactions rather than from static anatomical features such as the SST+ density we describe, which can provide only a hint. In L2/3, where SST+ density is uniform, divergence persists, suggesting that trans-laminar and trans-domain interactions are key. Barrel domains, primarily receiving VPM inputs, exhibit short-term depression onto excitatory cells and engage PV+ and SST+ neurons to stabilize the MWS/SWS ratio, with Elfn1-dependent facilitation of SST+ neurons gradually increasing inhibition during repetitive SWS. Septal domains, in contrast, are targeted by facilitating POm inputs, combined with higher L4 SST+ density and Elfn1-mediated facilitation, producing progressive inhibitory buildup that amplifies the MWS/SWS ratio. SST+ projections in septa may extend trans-laminarly and laterally, influencing L2/3 and neighboring barrels, thereby explaining L2/3 divergence despite uniform SST+ density in L2/3. In this regard, direct laminar-dependent manipulations will be required to confirm whether L2/3 divergence is inherited from L4 dynamics. In Elfn1 KO mice, the loss of facilitation in SST+ neurons likely flattens these dynamics, disrupting functional segregation. Future experiments using VPM/POm-specific optogenetic activation and SST+ silencing will be critical to directly test this model.

      We expanded the discussion accordingly.

      (4) Can the decoder generalize between SWS and MWS? In this condition, if the decoder accuracy is higher for barrels than septa, it would support the idea that septa are processing the two stimuli differently. 

      Our results show that septal decoding accuracy is generally higher than barrel accuracy when generalizing from multi-whisker stimulation (MWS) to single-whisker stimulation (SWS), indicating distinct information processing in septa compared to barrels.

      In wild-type (WT) mice, septal accuracy exceeds barrel accuracy across all time windows (1-50ms, 51-95ms, 1-95ms), with the largest difference in the 51-95ms window (0.9944 vs. 0.9214 at pulse 20, 10Hz stimulation). This septal advantage grows with successive pulses, reflecting robust, separable neural responses, likely driven by the posterior medial nucleus (POm)’s strong MWS integration contrasting with minimal SWS activation. Barrel responses, driven by consistent ventral posteromedial nucleus (VPM) input for both stimuli, are less distinguishable, leading to lower accuracy.

      In Elfn1 knockout (KO) mice, in which the excitatory drive to somatostatin-positive (SST+) interneurons is disrupted, barrel accuracy is initially higher in the 1-50ms window (0.8045 vs. 0.7500 at pulse 1), suggesting reduced early septal distinctiveness. However, septal accuracy surpasses barrel accuracy in later pulses and time windows (e.g., 0.9714 vs. 0.9227 in 51-95ms at pulse 20), indicating restored septal processing. This supports the role of SST+ interneurons in shaping distinct MWS responses in septa, particularly in late-phase responses (51-95ms), where inhibitory modulation is prominent, as confirmed by calcium imaging showing stronger SST+ activation during MWS.

      These findings demonstrate that septa process SWS and MWS differently, with higher decoding accuracy reflecting structured, POm- and SST+-driven response patterns. In Elfn1 KO mice, early deficits in septal processing highlight the importance of SST+ interneurons, with later recovery suggesting compensatory mechanisms. 

      We have added Supplementary Figure 4 and included this interpretation between lines 338-353.

      We thank the reviewer for suggesting this analysis.

      (5) It is not clear to me how the authors achieve SWS. How is it that the pipette tip "placed in contact with the principal whisker" does not detach from the principal whisker or stimulate other whiskers? Please clarify the methods. 

      Targeting the specific principal whisker is performed under the stereoscope.  

      Specifically, we have added this statement in line 628:

      “We trimmed the whiskers where necessary, to avoid them touching each other and to avoid stimulating other whiskers. By putting the pipette tip very close (almost touching) to the principal whisker, the movement of the tip (limited to 1mm) would reliably move the targeted whisker. The specificity of the stimulation of the selected principal whisker was observed under the stereoscope.”

      (6) The method for calculating decoder accuracy is not clearly described. How can accuracy exceed 1? The authors should clarify this metric and provide measures of variability (e.g., confidence intervals or standard deviations across runs) to assess the significance of their comparisons. Additionally, using a consistent scale across all plots would improve interpretability.

      We thank the reviewer for raising this point. We have now changed the way accuracies are calculated and adopted a common scale among different plots (see updated Figure 5). We have also changed the methods section accordingly.

      (7) Figure 1: The sample size is not specified. It looks like the numbers match the description in the methods, but the sample size should be clearly stated here. 

      The sample sizes are as follows.

      WT: a 280 × 95 × 20 matrix for the stimulated barrel (14 barrels, 95 ms, 20 pulses), a 180 × 95 × 20 matrix for the septa (9 septa, 95 ms, 20 pulses), and a 360 × 95 × 20 matrix for the neighboring barrel (18 neighboring barrels, 95 ms, 20 pulses). N=4 mice.

      KO: 11 barrel columns, 7 septal columns, and 11 unstimulated neighbors from N=4 mice.

      Panels D-F are missing axes and axis labels (firing rate, p-value). Panel D is mislabeled (left, middle, and right). I can't seem to find the yellow line. 

      Thank you for this observation. We made changes in the figures to make them easier to navigate based on the collective feedback from the reviewers.

      Why change the way of comparing the responses to repeated stimulation between SWS and MWS?

      To assess temporal accumulation of information, we compared responses to repeated single-whisker stimulation (SWS) and multi-whisker stimulation (MWS) using an accumulative decoding approach rather than simple per-pulse firing rates. This method captures domain-specific integration dynamics over successive pulses.

      The use of the term "principal whisker" is confusing, as it could refer to the whisker that corresponds to the recorded barrel. 

      When we use the term principal whisker, the intention is indeed to refer to the whisker corresponding to the recorded barrel during single whisker stimulation. The term principal whisker has been removed from the legends of Figure 1 and Figure S1C, where it may have led to ambiguity.

      Why the statement "after the start of active whisking"? Mice are under anesthesia here; it does not appear to be relevant for the figure. 

      “After the start of active whisking” refers to the state of the barrel cortex circuitry at the time of recordings. The particular reference we use comes from the habit of assessing sensory processing also from a developmental point of view. The reviewer is correct that it has nothing to do with the status of the experiment. Nevertheless, since the reviewer found that it may create confusion, we have now taken it out.

      (8) Figure 3: The y-axis label is missing for panel C. 

      This is now fixed. (dF/F).

      (9) Figure 4: Axis labels are missing.

      Added.

      Minor: 

      (10) Line 36: "progressive increase in septal spiking activity upon multi-whisker stimulation". There is no increase in septal spiking activity upon MWS; the ratio MWS/SWS increases.

      We have changed the sentence as follows: Genetic removal of Elfn1, which regulates the incoming excitatory synaptic dynamics onto SST+ interneurons, leads to the loss of the progressive increase in septal spiking ratio (MWS/SWS) upon stimulation.

      (11) Line 105: domain-specific, rather than column-specific, for consistency.

      We have changed it.

      (12) Lines 173-174: "a divergence between barrel and septa domain activity also occurred in Layer 4 from the 2nd pulse onward (Figure 1E)". The authors only show a restricted number of comparisons. Why not show the p-values as for SWS?

      The statistics are now presented in the current Figure 1E.

      (13) Lines 151-153: "Correspondingly, when a single whisker is stimulated repeatedly, the response to the first pulse is principally bottom-up thalamic-driven responses, while the later pulses in the train are expected to also gradually engage cortico-thalamo-cortical and cortico-cortical loops." Can the authors please provide a reference?

      We have now added the following references: (Kyriazi and Simons, 1993; Middleton et al., 2010; Russo et al., 2025).

      (14) Lines 184-186: "Our electrophysiological experiments show a significant divergence of responses over time upon both SWS and MWS in L4 between barrels (principal and neighboring) and adjacent septa, with minimal initial difference". The only difference between the neighboring barrel and septa is the responses to the initial pulse. Can the author clarify? 

      We have now changed the sentence as follows: Our electrophysiological experiments show a significant divergence of responses between domains upon both SWS and MWS in L4. (Line 198 now)

      (15) Line 214: "suggest these interneurons may play a role in diverging responses between barrels and septa upon SWS". Why SWS specifically?

      We have changed the sentence as follows: These results confirmed that SST+ and VIP+ interneurons have higher densities in septa compared to barrels in L4 and suggest these interneurons may play a role in diverging responses between barrels and septa. (Line 231 now).

      (16) Line 235: "This result suggests that differential activation of SST+ interneurons is more likely to be involved in the domain-specific temporal ratio differences between barrels and septa". Why? The results here are not domain-specific.

      We have now revised this statement to: This result suggested that temporal ratio differences specific to barrels and septa might involve differential activation of SST+ interneurons rather than VIP+ interneurons.

      (17) Lines 241-243: "SST+ interneurons in the cortex are known to show distinct short-term synaptic plasticity, particularly strong facilitation of excitatory inputs, which enables them to regulate the temporal dynamics of cortical circuits." Please provide a reference.

      We have now added the following references: (Grier et al., 2023; Liguz-Lecznar et al., 2016).

      (18) Lines 245-247: "A key regulator of this plasticity is the synaptic protein Elfn1, which mediates short-term synaptic facilitation of excitation on SST+ interneurons (Stachniak et al., 2021, 2019; Tomioka et al., 2014)". Is Stachniak et al., 2021 not about the role of Elfn1 in excitatory-to-VIP+ neuron synapses?

      The reviewer correctly spotted this discrepancy. This reference has now been removed from this statement.

      (19) Lines 271-272: "Building on our findings that Elfn1-dependent facilitation in SST+ interneurons is critical for maintaining barrel-septa response divergence". The authors did not show that.

      We have now changed the statement to: Building on our findings that Elfn1 is critical for maintaining barrel-septa response divergence.

      (20) Line 280: second firing peak, not "peal".

      Thank you, it is now fixed.

      (21) Lines 304-305: "These results highlight the critical role of Elfn1 in facilitating the temporal integration of sensory inputs through its effects on SST+ interneurons". This claim is also overstated.

      We have now changed the statement to: These results highlight the contribution of Elfn1 to the temporal integration of sensory inputs. (Line 362)

      (22) Line 329: Any reason why not cite Chen et al., Nature 2013?

      We have now added this reference, as also pointed out by reviewer 1.

      (23) Line 341-342: "wS1" and "wS2" instead of S1 and S2 for consistency.

      Thanks, we have now updated the terms.

      Reviewer #2 (Recommendations for the authors): 

      (1) Figure 3D - the SW conditions are labeled but not the MW conditions (two right graphs) - they should be labeled similarly (SSTMW, VIPMW). 

      The two right graphs in Figure 3D represent paired SW vs MW comparisons of the evoked responses for SST and VIP populations, respectively.

      (2) Figure 6D and E: I think it would be better if the depth measurements were on the y-axis, which is more typical of these types of plots.

      We thank the reviewer for this comment. Although we appreciate this may be the case, we feel that the current presentation may be easier for the reader to navigate, and we have hence kept it. 

      (3) Having an operational definition of septa versus barrel would be useful. As the authors point out, this is a tough distinction in a mouse, and often you read papers that use Barrel Wall versus Barrel Hollow/Center - operationally defining how these areas were distinguished would be helpful. 

      We thank the reviewer for this comment and understand the point made.

      We have now updated the methods section in line 611: 

      DiI marks contained within the vGlut2 staining were defined as barrel recordings, while DiI marks outside vGlut2 staining were septal recordings.

      Reviewer #3 (Recommendations for the authors): 

      To support the manuscript's major claims, the authors should consider the following:

      (1) Validate the septal identity of the neurons studied, either anatomically or functionally at the single-cell level (e.g., via Ca²⁺ imaging with confirmed barrel/septa mapping). 

      We thank the reviewer for this suggestion, but we feel that these extensive experiments are beyond the scope of this study. 

      (2) Provide both anatomical and physiological evidence to assess the possibility of altered cortical development in Elfn1 KO mice, including potential changes in barrel structure or SST⁺ cell distribution. 

      To address the reviewer’s point, we have now added the following to the Discussion: “Although Elfn1 is constitutively knocked out, we find here and in previous studies that barrel structure is preserved (Stachniak et al., 2019, 2023). Further, the distribution of Elfn1 expressing interneurons is not different in KO mice, suggesting minimal developmental disruption (Dolan and Mitchell, 2013). Nonetheless, we acknowledge that subtle circuit changes cannot be ruled out without conditional knockouts.”

      (3) Examine the sensory responses of SST⁺ and VIP⁺ interneurons in deeper cortical layers, particularly layer 4, which is central to the study's main conclusions.

      We thank the reviewer for this suggestion and appreciate the value it would bring to the study. We nevertheless feel that these extensive experiments are beyond the scope of this study and hence opted out from performing them. 

      Minor Comments:

      (1)  The authors used a CLARITY-based passive clearing protocol, which is known to sometimes induce tissue swelling or distortion. This may affect anatomical precision, especially when assigning neurons to narrow domains such as septa versus barrels. Please clarify whether tissue expansion was measured, corrected, or otherwise accounted for during analysis.

      Yes, tissue expansion was accounted for during analysis for the laminar specification. We excluded brains with severe distortion.

      (2) While the anatomical data are plotted as a function of "depth from the top of layer 4," the manuscript does not specify the precise depth ranges used to define individual cortical layers in the cleared tissue. Given the importance of laminar specificity in projection and cell type analyses, the criteria and boundaries used to delineate each layer should be explicitly stated.

      Thank you for pointing this out. We now include the criteria for delineating each layer in the manuscript. “Given that the depth of Layer 4 (L4) can be reliably measured due to its well-defined barrel boundaries, and that the relative widths of other layers have been previously characterized (El-Boustani et al., 2018), we estimated laminar boundaries proportionally. Specifically, Layer 2/3 was set to approximately 1.3–1.5 times the width of L4, Layer 5a to ~0.5 times, and Layer 5b to a similar width as L4. Assuming uniform tissue expansion across the cortical column, we extrapolated the remaining laminar thicknesses proportionally.”
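
      The proportional rule quoted above can be illustrated with a minimal sketch. It is not the authors' code: the function name `laminar_boundaries`, the choice of 1.4 as a midpoint of the stated 1.3–1.5 range for Layer 2/3, and the 200 µm example thickness are all assumptions made for illustration. Depths are expressed relative to the top of L4, so Layer 2/3 has negative depths.

```python
def laminar_boundaries(l4_thickness_um, l23_ratio=1.4, l5a_ratio=0.5, l5b_ratio=1.0):
    """Estimate (top, bottom) depths in µm for each layer, relative to the top of L4.

    The ratios are multiples of the measured L4 thickness. l23_ratio=1.4 is an
    assumed midpoint of the 1.3-1.5 range quoted in the text, not a reported value.
    """
    if l4_thickness_um <= 0:
        raise ValueError("L4 thickness must be positive")
    l4 = float(l4_thickness_um)
    l4_bottom = l4
    l5a_bottom = l4_bottom + l5a_ratio * l4   # L5a is about half the width of L4
    l5b_bottom = l5a_bottom + l5b_ratio * l4  # L5b is about as wide as L4
    return {
        "L2/3": (-l23_ratio * l4, 0.0),       # sits above L4, hence negative depth
        "L4": (0.0, l4_bottom),
        "L5a": (l4_bottom, l5a_bottom),
        "L5b": (l5a_bottom, l5b_bottom),
    }


# Hypothetical example: an L4 measured at 200 µm in the cleared tissue.
for layer, (top, bottom) in laminar_boundaries(200.0).items():
    print(f"{layer}: {top:.0f} to {bottom:.0f} µm")
```

      Because every boundary scales with the measured L4 thickness, uniform tissue expansion changes all layers by the same factor and leaves the layer assignment of a cell unchanged, which is the assumption stated in the quoted passage.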

      (3) In several key comparisons (e.g., SST⁺ vs. VIP⁺ interneurons, or S2-projecting vs. M1-projecting neurons), it is unclear whether the same barrel columns were analyzed across conditions. Given the anatomical and functional heterogeneity across wS1 columns, failing to control for this may introduce significant confounds. We recommend analyzing matched columns across groups or, if not feasible, clearly acknowledging this limitation in the manuscript.

      We thank the reviewer for raising this important point. For the comparison of SST⁺ versus VIP⁺ interneurons, it would in principle have been possible to analyze the same barrel columns across groups. However, because some of the cleared brains did not reach the optimal level of clarity, our choice of columns was limited, and we were not always able to obtain sufficiently clear data from the same columns in both groups. Similarly, for the analysis of S2- versus M1-projecting neurons, variability in the position and spread of retrograde virus injections made it difficult to ensure measurements from identical barrel columns. We have now added a statement in the Discussion to acknowledge this limitation.

      (4) Figure 1C: Clarify what each point in the t-SNE plot represents (e.g., a single trial, a recording channel, or an averaged response). Also, describe the input features used for dimensionality reduction, including time windows and preprocessing steps.

      In response to the reviewer’s comment, we have now added the following in the methods: In summary, each point in the t-SNE plots represents an averaged response across 20 trials for a specific domain (barrel, septa, or neighbor) and genotype (WT or KO), with approximately 14 points per domain derived from the 280 trials in each dataset. The input features are preprocessed by averaging blocks of 20 trials into 1900-dimensional vectors (95ms × 20), which are then reduced to 2D using t-SNE with the specified parameters. This approach effectively highlights the segregation and clustering patterns of neural responses across cortical domains in both WT and KO conditions.
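
      The preprocessing described above (blocks of 20 trials averaged into 1900-dimensional vectors) can be sketched as follows. This is a hedged illustration, not the authors' pipeline: the function name `block_average_features` and the nested-list layout (trials × 95 time bins × 20 pulses) are assumptions, and the t-SNE step itself is left out because its parameters are not given in this response.

```python
def block_average_features(trials, block_size=20):
    """Average consecutive blocks of trials into one flattened feature vector each.

    trials: list of trials; each trial is a list of time bins; each time bin is
            a list of per-pulse values (e.g. 280 x 95 x 20, an assumed layout).
    Returns a list with one vector of length (time bins * pulses) per block.
    """
    if block_size <= 0 or len(trials) % block_size != 0:
        raise ValueError("number of trials must be a multiple of block_size")
    points = []
    for start in range(0, len(trials), block_size):
        block = trials[start:start + block_size]
        # Flatten each trial (time bins x pulses) into a single vector.
        flat = [[value for time_bin in trial for value in time_bin] for trial in block]
        # Average element-wise across the trials of this block.
        points.append([sum(column) / block_size for column in zip(*flat)])
    return points


# Hypothetical data with the shape quoted for the WT stimulated barrel.
trials = [[[float(t)] * 20 for _ in range(95)] for t in range(280)]
points = block_average_features(trials)
print(len(points), len(points[0]))  # number of points, features per point
```

      With 280 trials and blocks of 20, this yields 14 vectors of 95 × 20 = 1900 features, matching the "approximately 14 points per domain" in the text; each vector would then be passed to a t-SNE implementation to obtain the 2D embedding.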

      (5) Figures 1D, E (left panels): The y-axes lack unit labeling and scale bars. Please indicate whether values are in spikes/sec, spikes/bin, or normalized units.

      We have now clarified this. 

      (6) Figures 1D, E (right panels): The color bars lack units. Specify whether the values represent raw firing rates, z-scores, or other normalized measures. Replace the vague term "Matrix representation" with a clearer label such as "Pulse-aligned firing heatmap."

      Thank you, we have now done it.

      (7) Figure 1E (bottom panel): There appears to be no legend referring to these panels. Please define labels such as "B" and "S." 

      Thank you, we have now done it.

      (8) Figure 1E legend: If it duplicates the legend from Figure 1D, this should be made explicit or integrated accordingly. 

      We have changed the structure of this figure.

      (9) Figure 1F: Define "AUC" and explain how it was computed (e.g., area under the firing rate curve over 0-50 ms). Indicate whether the plotted values represent percentages and, if so, label the y-axis accordingly. If normalization was applied, describe the procedure. Include sample sizes (n) and specify what each data point represents (e.g., animal, recording site). 

      The following paragraph has been added in the methods section:

      The Area Under the Curve (AUC) was computed as the integral of the smoothed firing rate (spikes per millisecond) over a 50ms window following each whisker stimulation pulse, using trapezoidal integration. Firing rate data for layer 4 barrel and septal regions in wild-type (WT) and knockout (KO) mice were smoothed with a 3-point moving average and averaged across blocks of 20 trials. Plotted values represent the percentage ratio of multi-whisker (MW) to single whisker (SW) AUC with error bars showing the standard error of the mean. Each data point reflects the mean AUC ratio for a stimulation pulse across approximately 11 blocks (220 trials total). The y-axis indicates percentages.
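
      As a minimal sketch of the AUC ratio described above, the steps (3-point smoothing, trapezoidal integration over a 50 ms window, MW/SW ratio as a percentage) could look like the following. The function names, the 1 ms bin width, the edge handling of the moving average, and the choice to crop to the window before smoothing are assumptions for illustration; the response does not specify them.

```python
def moving_average_3(values):
    """3-point centered moving average; edge bins average over the neighbours
    that exist (an assumed edge rule)."""
    n = len(values)
    smoothed = []
    for i in range(n):
        window = values[max(0, i - 1):min(n, i + 2)]
        smoothed.append(sum(window) / len(window))
    return smoothed


def trapezoid_auc(rate, dt_ms=1.0):
    """Trapezoidal integral of a firing-rate trace sampled every dt_ms."""
    return sum((rate[i] + rate[i + 1]) * 0.5 * dt_ms for i in range(len(rate) - 1))


def mw_sw_auc_percent(mw_rate, sw_rate, window_ms=50, dt_ms=1.0):
    """MW/SW AUC ratio in percent over the first window_ms after a pulse."""
    n_bins = int(window_ms / dt_ms)
    mw_auc = trapezoid_auc(moving_average_3(mw_rate[:n_bins]), dt_ms)
    sw_auc = trapezoid_auc(moving_average_3(sw_rate[:n_bins]), dt_ms)
    if sw_auc == 0:
        raise ValueError("SW AUC is zero; the ratio is undefined")
    return 100.0 * mw_auc / sw_auc


# Hypothetical block-averaged traces (spikes per ms) for one pulse, 95 bins each.
sw = [1.0] * 95
mw = [2.0] * 95
print(mw_sw_auc_percent(mw, sw))
```

      A value above 100 means the multi-whisker response exceeds the single-whisker response for that pulse, which is how the percentage y-axis of Figure 1F would be read under these assumptions.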

      (10) Figure 3C: Add units to the vertical axis.

      We have added them.

      (11) Figure 3D: Specify what each line represents (e.g., average of n cells, individual responses?). 

      Each line represents an average response of a neuron.  

      (12) Figure 4C legend: Same with what? No legend refers to the bottom panels - please revise to clarify.

      Thank you. We have now changed the figure structure and legends and fixed the missing information issue.

      (13) Supplementary Figure 1B: Indicate the physical length of the scale bar in micrometers. 

      This has been fixed. The scale bar is 250 µm.

      (14) Indicate the catalog number or product name of the 8×8 silicon probe used for recordings.

      We have added this information. It is the A8x8-Edge-5mm-100-200-177-A64.

      References

      (1) Beierlein, M., Gibson, J. R. & Connors, B. W. (2003). Two dynamically distinct inhibitory networks in layer 4 of the neocortex. J. Neurophysiol. 90, 2987–3000.

      (2) Burkhalter, A., D’Souza, R. D. & Ji, W. (2023). Integration of feedforward and feedback information streams in the modular architecture of mouse visual cortex. Annu. Rev. Neurosci. 46, 259–280.

      (3) Chen, J. L., Margolis, D. J., Stankov, A., Sumanovski, L. T., Schneider, B. L. & Helmchen, F. (2015). Pathway-specific reorganization of projection neurons in somatosensory cortex during learning. Nat. Neurosci. 18, 1101–1108.

      (4) Connor, J. R. & Peters, A. (1984). Vasoactive intestinal polypeptide-immunoreactive neurons in rat visual cortex. Neuroscience 12, 1027–1044.

      (5) Cruikshank, S. J., Lewis, T. J. & Connors, B. W. (2007). Synaptic basis for intense thalamocortical activation of feedforward inhibitory cells in neocortex. Nat. Neurosci. 10, 462–468.

      (6) Dolan, J. & Mitchell, K. J. (2013). Mutation of Elfn1 in mice causes seizures and hyperactivity. PLoS One 8, e80491.

      (7) Gibson, J. R., Beierlein, M. & Connors, B. W. (1999). Two networks of electrically coupled inhibitory neurons in neocortex. Nature 402, 75–79.

      (8) Ji, W., Gămănuţ, R., Bista, P., D’Souza, R. D., Wang, Q. & Burkhalter, A. (2015). Modularity in the organization of mouse primary visual cortex. Neuron 87, 632–643.

      (9) Martin-Cortecero, J. & Nuñez, A. (2014). Tactile response adaptation to whisker stimulation in the lemniscal somatosensory pathway of rats. Brain Res. 1591, 27–37.

      (10) Mégevand, P., Troncoso, E., Quairiaux, C., Muller, D., Michel, C. M. & Kiss, J. Z. (2009). Long-term plasticity in mouse sensorimotor circuits after rhythmic whisker stimulation. J. Neurosci. 29, 5326–5335.

      (11) Meier, A. M., Wang, Q., Ji, W., Ganachaud, J. & Burkhalter, A. (2021). Modular network between postrhinal visual cortex, amygdala, and entorhinal cortex. J. Neurosci. 41, 4809–4825.

      (12) Meier, A. M., D’Souza, R. D., Ji, W., Han, E. B. & Burkhalter, A. (2025). Interdigitating modules for visual processing during locomotion and rest in mouse V1. bioRxiv 2025.02.21.639505.

      (13) Scala, F., Kobak, D., Shan, S., Bernaerts, Y., Laturnus, S., Cadwell, C. R., Hartmanis, L., Froudarakis, E., Castro, J. R., Tan, Z. H., et al. (2019). Layer 4 of mouse neocortex differs in cell types and circuit organization between sensory areas. Nat. Commun. 10, 4174.

      (14) Stachniak, T. J., Sylwestrak, E. L., Scheiffele, P., Hall, B. J. & Ghosh, A. (2019). Elfn1-induced constitutive activation of mGluR7 determines frequency-dependent recruitment of somatostatin interneurons. J. Neurosci. 39, 4461–4475.

      (15) Stachniak, T. J., Kastli, R., Hanley, O., Argunsah, A. Ö., van der Valk, E. G. T., Kanatouris, G. & Karayannis, T. (2021). Postmitotic Prox1 expression controls the final specification of cortical VIP interneuron subtypes. J. Neurosci. 41, 8150–8166.

      (16) Stachniak, T. J., Argunsah, A. Ö., Yang, J. W., Cai, L. & Karayannis, T. (2023). Presynaptic kainate receptors onto somatostatin interneurons are recruited by activity throughout development and contribute to cortical sensory adaptation. J. Neurosci. 43, 7101–7118.

      (17) Sun, Q.-Q., Huguenard, J. R. & Prince, D. A. (2006). Barrel cortex microcircuits: Thalamocortical feedforward inhibition in spiny stellate cells is mediated by a small number of fast-spiking interneurons. J. Neurosci. 26, 1219–1230.

      (18) Sylwestrak, E. L. & Ghosh, A. (2012). Elfn1 regulates target-specific release probability at CA1-interneuron synapses. Science 338, 536–540.

      (19) Tan, Z., Hu, H., Huang, Z. J. & Agmon, A. (2008). Robust but delayed thalamocortical activation of dendritic-targeting inhibitory interneurons. Proc. Natl. Acad. Sci. USA 105, 2187–2192.

      (20) Tomioka, N. H., Yasuda, H., Miyamoto, H., Hatayama, M., Morimura, N., Matsumoto, Y., Suzuki, T., Odagawa, M., Odaka, Y. S., Iwayama, Y., et al. (2014). Elfn1 recruits presynaptic mGluR7 in trans and its loss results in seizures. Nat. Commun. 5, 4501.

      (21) Yamashita, T., Vavladeli, A., Pala, A., Galan, K., Crochet, S., Petersen, S. S. & Petersen, C. C. (2018). Diverse long-range axonal projections of excitatory layer 2/3 neurons in mouse barrel cortex. Front. Neuroanat. 12, 33.

    1. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Pereira de Castro and coworkers are studying potential competition between a more standard splicing factor SF1 and an alternative splicing factor called QK1. This is interesting because they bind to overlapping sequence motifs and could potentially have opposing effects on promoting the splicing reaction. To test this idea, the authors KD either SF1 or QK1 in mammalian cells and uncover several exons whose splicing regulation follows the predicted pattern of being promoted for splicing by SF1 and repressed by QK1. Importantly, these have introns enriched in SF1 and QK1 motifs. The authors then focus on one exon in particular with two tandem motifs to study the mechanism of this in greater detail and their results confirm the competition model. Mass spec analysis largely agrees with their proposal; however, it is complicated by apparently quick transition of SF1 bound complexes to later splicing intermediates. An inspired experiment in yeast shows how QK1 competition could potentially have a detrimental impact on splicing in an orthogonal system. Overall these results show how splicing regulation can be achieved by competition between a "core" and alternative splicing factor and provide additional insight into the complex process of branch site recognition. The manuscript is exceptionally clear and the figures and data very logically presented. The work will be valuable to those in the splicing field who are interested in both mechanism and bioinformatics approaches to deconvolve any apparent "splicing code" being used by cells to regulate gene expression.

      Strengths:

      (1) The main discovery of the manuscript involving evidence for SF1/QK1 competition is quite interesting and important for this field. This evidence has been missing and may change how people think about branch site recognition.

      (2) The experiments and the rationale behind them are clearly and logically presented.

      (3) The experiments are carried out to a high standard and well-designed controls are included.

      (4) The extrapolation of the result to yeast in order to show the potentially devastating consequences of QK1 competition was creative and informative.

      Weaknesses:

      Overall the weaknesses are relatively minor and involve cases where conclusions could potentially have been strengthened with additional experimentation. For example, pull-down of the U2 snRNP could be strengthened by detection of the snRNA whereas the proteins may themselves interact with these factors in the absence of the snRNA. In addition the discussion is a bit speculative given the data, but compelling nonetheless.

    2. Reviewer #3 (Public review):

      Summary:

      In this manuscript the authors were trying to establish whether competition between the RNA binding proteins SF1 and QKI controlled splicing outcomes. These two proteins have similar binding sites and protein sequences, but SF1 lacks a dimerization motif and seems to bind a single version of the binding sequence. Importantly, these binding sequences correspond to branchpoint consensus sequences, with SF1 binding leading to productive splicing, but QKI binding leading instead to association with paraspeckle proteins. They show that in human cells SF1 generally activates exons and QKI represses, and a large group of the jointly regulated exons (43% of joint targets) are reciprocally controlled by SF1 and QKI. They focus on one of these exons RAI14 that shows this reciprocal pattern of regulation, and has 2 repeats of the binding site that make it a candidate for joint regulation, and confirm regulation within a minigene context. The authors used assembly of proteins within nuclear extracts to explain the effect of QKI versus SF1 binding. Finally the authors show that expression of QKI is lethal in yeast, and causes splicing defects.

      How this fits in the field. This study is interesting and provides a conceptual advance by providing a general rule how SF1 and QKI interact with relation to binding sites, and the relative molecular fates followed, so is very useful. Most of the analysis seems to focus on one example, but the choice of this example was carefully explained in the text. The molecular analysis and global work significantly adds to the picture from the previously published paper about NUMB joint regulation by QKI and SF (Zong et al, cited in text as reference 50, that looked at SF1 and QKI binding in relation to a duplicated binding site/branchpoint sequence in NUMB).

      Strengths:

      The data presented are strong and clear. The ideas discussed in this paper are of wide interest, and present a simple model where two binding sites generates a potentially repressive QKI response, whereas exons that have a single upstream sequence are just regulated by SF1. The assembly of splicing complexes on RNAs derived from RAI14 in nuclear extracts, followed by mass spec gave interesting mechanistic insight into what was occurring as a result of QKI versus SF1 binding.

      Weaknesses:

      The authors have addressed the previous weaknesses of the study, resulting in a much stronger manuscript.

    3. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This important manuscript provides insights into the competition between Splicing Factor 1 (SF1) and Quaking (QKI) for binding at the ACUAA branch point sequence in a model intron, regulating exon inclusion. The study employs rigorous transcriptomic, proteomic, and reporter assays, with both mammalian cell culture and yeast models. Nevertheless, while the data are convincing, broadening the analysis to additional exons and narrowing the manuscript's title to better align with the experimental scope would strengthen the work.

      Public Reviews:

      Reviewer #1 (Public review):

      In this manuscript, the authors aimed to show that SF1 and QKI compete for the intron branch point sequence ACUAA and provide evidence that QKI represses inclusion when bound to it.

      Major strengths of this manuscript include:

      (1) Identification of the ACUAA-like motif in exons regulated by QKI and SF1.

      (2) The use of the splicing reporter and mutant analysis to show that upstream and downstream ACUAAC elements in intron 10 of RAI are required for repressing splicing.

      (3) The use of proteomics to identify proteins in C2C12 nuclear extract that bind to the wild type and mutant sequence.

      (4) The yeast studies showing ectopic lethality when Qki5 expression was induced, due to increased mis-splicing of transcripts that contain the ACUAA element.

      The authors conclusively show that the ACUAA sequence is bound by QKI and provide strong evidence that this leads to differences in exon inclusion and exclusion. In animal cells, and especially in human, branchpoint sequences are degenerate but seem to be recognized by specific splicing factors. Although a subset of splicing factors shows tissue-specific expression patterns most don't, suggesting that yet-to-be-identified mechanisms regulate splicing. This work suggests that an alternate mechanism could be related to the binding affinity of specific RNA binding factors for branchpoint sequences coupled with the level of these different splicing factors in a given cell.

      We thank the reviewer for the positive comments.

      Reviewer #2 (Public review):

      Summary:

      In this manuscript, Pereira de Castro and coworkers are studying potential competition between a more standard splicing factor SF1, and an alternative splicing factor called QK1. This is interesting because they bind to overlapping sequence motifs and could potentially have opposing effects on promoting the splicing reaction. To test this idea, the authors KD either SF1 or QK1 in mammalian cells and uncover several exons whose splicing regulation follows the predicted pattern of being promoted for splicing by SF1 and repressed by QK1. Importantly, these have introns enriched in SF1 and QK1 motifs. The authors then focus on one exon in particular with two tandem motifs to study the mechanism of this in greater detail and their results confirm the competition model. Mass spec analysis largely agrees with their proposal; however, it is complicated by the apparently quick transition of SF1-bound complexes to later splicing intermediates. An inspired experiment in yeast shows how QK1 competition could potentially have a detrimental impact on splicing in an orthogonal system. Overall, these results show how splicing regulation can be achieved by competition between a "core" and alternative splicing factor and provide additional insight into the complex process of branch site recognition. The manuscript is exceptionally clear and the figures and data are very logically presented. The work will be valuable to those in the splicing field who are interested in both mechanism and bioinformatics approaches to deconvolve any apparent "splicing code" being used by cells to regulate gene expression. Criticisms are minor and the most important of them stem from overemphasis on parts of the manuscript on the evolutionary angle when evolution itself wasn't analyzed per se.

      We thank the reviewer for the positive comments and very clear and fair critical points.

      Strengths:

      (1) The main discovery of the manuscript involving evidence for SF1/QK1 competition is quite interesting and important for this field. This evidence has been missing and may change how people think about branch site recognition.

      (2) The experiments and the rationale behind them are exceptionally clearly and logically presented. This was wonderful!

      Thank you so much. We felt the overall flow of the paper and data make for a nice “story” that conveys a relatively easy-to-understand explanation for a complex subject.

      (3) The experiments are carried out to a high standard and well-designed controls are included.

      (4) The extrapolation of the result to yeast in order to show the potentially devastating consequences of the QK1 competition was very exciting and creative.

      We agree this is a very exciting result and finding! Thanks.

      Weaknesses:

      Overall the weaknesses are relatively minor and involve cases where clarification is necessary, some additional analysis could bolster the arguments, and suggestions for focusing the manuscript on its strengths.

      (1) The title (Ancient...evolutionary outcomes), abstract, and some parts of the discussion focus heavily on the evolutionary implications of this work. However, evolutionary analysis was not performed in these studies (e.g., when did QK1 and SF1 proteins arise and/or diverge? How does this line up with branch site motifs and evolution of U2? Any insight from recent work from Scott Roy et al?). I think this aspect either needs to be bolstered with experimental work/data or this should be tamped down in the manuscript. I suggest highlighting the idea expressed in the sentence "A nuanced implication of this model is that loss-of-function...". To me, this is better supported by the data and potentially by some analysis of mutations associated with human disease.

      We have revised the title and dampened the evolutionary aspects of the previous version of the manuscript.

      (2) One paper that I didn't see cited was that by Tanackovic and Kramer (Mol Biol Cell 2005). This paper is relevant because they KD SF1 and found it nonessential for splicing in vivo. Do their results have implications for those here? How do the results of the KD compare? Could QK1 competition have influenced their findings (or does their work influence the "nuanced implication" model referenced above?)?

      This is an interesting point, and thank you for the suggestion. We have now included a brief description of this study in the Introduction of the revised manuscript and do note that the authors measured intron retention of a beta globin reporter and SF3A1, SF3A2, and SF3A3 during SF1 knockdown, but did not detect elevated unspliced RNA in these targets.

      (3) Can the authors please provide a citation for the statement "degeneracy is observed to a higher degree in organisms with more alternative splicing"? Does recent evolutionary analysis support this?

      We have removed the statement, as it did not add much to the content and I am not sure I can state the concept I was attempting to convey in a simple manner with few citations.

      (4) For the data in Figure 3, I was left wondering if NMD was confounding this analysis. Can the authors respond to this and address this concern directly?

      We have not measured if the reporters used in Figure 3 produce protein(s). Presumably, though, all spliced reporter RNA would be degraded equally (the included/skipped isoforms’ “reading frames” are not altered from one another). This would not be the case for unspliced nuclear reporter RNA, however. Given this difference, and that our analysis cannot resolve the subcellular localization of the different reporter species, we have removed the measurement of and subsequent results describing unspliced reporter RNA from Figure 3.

      (5) To me, the idea that an engaged U2 snRNP was pulled down in Figure 4F would be stronger if the snRNA was detected. Was that able to be observed by northern or primer extension? Would SF1 be enriched if the U2 snRNA was degraded by RNaseH in the NE?

      We did not measure any co-associating RNAs in this experimental approach, but agree that this approach would strengthen the evidence for it.

      (6) I'm wondering how additive the effects of QK1 and SF1 are... In Figure 2, if QK1 and SF1 are both knocked down, is the splicing of exon 11 restored to "wt" levels?

      This is an interesting question that we were unfortunately unable to address experimentally here.

      (7) The first discussion section has two paragraphs that begin "How does competition between SF1..." and "Relatively little is known about how...". I found the discussion and speculation about localization, paraspeckles, and lncRNAs interesting but a bit detracting from the strengths of the manuscript. I would suggest shortening these two paragraphs into a single one.

      We have revised the Discussion.

      Reviewer #3 (Public review):

      Summary:

      In this manuscript, the authors were trying to establish whether competition between the RNA-binding proteins SF1 and QKI controlled splicing outcomes. These two proteins have similar binding sites and protein sequences, but SF1 lacks a dimerization motif and seems to bind a single version of the binding sequence. Importantly, these binding sequences correspond to branchpoint consensus sequences, with SF1 binding leading to productive splicing, but QKI binding leading instead to association with paraspeckle proteins. They show that in human cells SF1 generally activates exons and QKI represses, and a large group of the jointly regulated exons (43% of joint targets) are reciprocally controlled by SF1 and QKI. They focus on one of these exons RAI14 that shows this reciprocal pattern of regulation, and has 2 repeats of the binding site that make it a candidate for joint regulation, and confirm regulation within a minigene context. The authors used the assembly of proteins within nuclear extracts to explain the effect of QKI versus SF1 binding. Finally, the authors show that the expression of QKI is lethal in yeast, and causes splicing defects.

      How this fits in the field. This study is interesting and provides a conceptual advance by providing a general rule on how SF1 and QKI interact in relation to binding sites, and the relative molecular fates followed, so is very useful. Most of the analysis seems to focus on one example, although the molecular analysis and global work significantly add to the picture from the previously published paper about NUMB joint regulation by QKI and SF1 (Zong et al, cited in text as reference 50, that looked at SF1 and QKI binding in relation to a duplicated binding site/branchpoint sequence in NUMB).

      Thank you for the encouraging remarks.

      Strengths:

      The data presented are strong and clear. The ideas discussed in this paper are of wide interest, and present a simple model where two binding sites generate a potentially repressive QKI response, whereas exons that have a single upstream sequence are just regulated by SF1. The assembly of splicing complexes on RNAs derived from RAI14 in nuclear extracts, followed by mass spec gave interesting mechanistic insight into what was occurring as a result of QKI versus SF1 binding.

      Weaknesses:

      I did not think the title best summarises the take-home message and could be perhaps a bit more modest. Although the authors investigated splicing patterns in yeast and human cells, yeast do not have QKI so there is no ancient competition in that case, and the study did not really investigate physiological or evolutionary outcomes in splicing, although it provides interesting speculation on them. Also as I understood it, the important issue was less conserved branchpoints in higher eukaryotes enabling alternative splicing, rather than competition for the conserved branchpoint sequence. So despite the data being strong and properly analysed and discussed in the paper, could the authors think whether they fit best with the take-home message provided in the title? Just as a suggestion (I am sure the authors can do a better job), maybe "molecular competition between variant branchpoint sequences predict physiological and evolutionary outcomes in splicing"?

      Thank you for this point (Reviewer 2 had a similar comment) and the suggestion. We have revised the title.

      Although the authors do provide some global data, most of the detailed analysis is of RAI14. It would have been useful to examine members of the other quadrants in Figure 1C as well for potential binding sites to give a reason why these are not co-regulated in the same way as RAI14. How many of the RAI14 quadrants had single/double sites (the motif analysis seemed to pull out just one), and could one of the non-reciprocally regulated exons be moved into a different quadrant by addition or subtraction of a binding site or changing the branchpoint (using a minigene approach for example).

      This is an interesting point that we have considered. Our intent with the focus on RAI14 was to use a naturally occurring intron bps with evidence of strong QKI binding that did not require a high degree of sequence manipulation or engineering.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) Most of my recommendations are really centered on the figures. In their current state, they detract from the data shown and could be improved: I recommend the authors use a uniform font. For example, Figure 1E and F have at least three different fonts of varying sizes making it very messy. In Figure 1C, the authors could bold the RAI14 ex11 or simply indicate that the blue is this exon in the legend, thus removing the text from this very busy graph. In Figure 4F, I would recommend having all the labels the same size and putting those genes of interest like Sf3a1 in bold. This could also be done in Figure 4E.

      Thank you for the suggestion and we have edited these (FYI the font in Fig’s 1E and 1F were from the rMAPS default output, but I agree, it gives a sloppy appearance).

      (2) In Figures 4D and 4G, is there QKI binding to the downstream deletion mutant after 30 minutes? Also, in Figure 4G, are these all from the same blot? The band sizes seem to be very different between lanes. If these were not on the same blot, the original gels should be submitted.

      A small amount of Qki appears to be binding after 30 min. All lanes/blots are from the same gels/membranes; see new Supplemental Figure 4 for the original (uncropped) images of the blots.

      (3) The authors should indicate, the source and concentration of the antibodies used for their WB. They should also indicate the primers used for RT-PCRs.

      We have revised the methods to include the antibody information and have uploaded a supplemental table 8 with all oligonucleotide sequences used (which I (Sam Fagg) neglected to do initially, so that’s my bad).

      Reviewer #2 (Recommendations for the authors):

      (1) This may come down to the author's preference but branch point and branch site are frequently two words, not a single compound word (branch point vs. branchpoint). In addition, the authors may want to use branchsite with the abbreviation BS more frequently since they often don't describe the specific point of branching, and bp and bps could be confused for the more frequent abbreviations for base pair(s).

      Good suggestion; we have edited the text accordingly.

      (2) In general the addition of page numbers and line numbers to the manuscript would greatly aid reviewers!

      Point taken…

      (3) Introduction; "...under normal growth conditions they are efficiently spliced". I would say MOST introns in yeast are efficiently spliced. This is definitely not universal.

      Text edited to indicate that most are efficiently spliced.

      (4) Introduction; " recognition of the bps by SF1 (mammals) (20)". The choice of reference 20 is an odd one here. I think the Robin Reed and Michael Rosbash paper was the first to show SF1 was the human homolog of BBP.

      Got it, thanks (added #14 here and kept #20 also since it shows the structure of SF1 in complex with a UACUAAC bps.)

      (5) Results; "QK1 and SF1 co-regulate.."; it may be useful for the reader if you could explain in more detail why exon inclusion and intron retention are expected outcomes for QK1 knockdown and vice versa for SF1. The exon inclusion here is more obvious than the intron retention phenotype. (In other words, if more exons are included shouldn't it follow that more introns are removed?)

      We explain the expected results for exon inclusion in the Introduction and this paragraph of the Results. Although we have observed more intron retention under QKI loss-of-function approaches before, I am uncertain where the reviewer sees that we indicate any expected result for intron retention from either QKI or SF1 knockdown. I believe the statement you refer to might be on line 162 and starts with: “Consistent with potentially opposing functions in splicing…” ?

      Also, I agree that if SF1 is a “splicing activator,” one might expect more IR in its absence (but this is not the case; there is, in fact, less), but nonetheless, the opposite outcome is observed with QKI knockdown (more IR). It is unclear why this is the case, and we did not investigate it.

      (6) Results; "QK1 and SF1 co-regulate.."; "Thus the most highly represented set.." To me, the most highly represented set is those which are not both QK1-repressed and SF1-activated. Does this indicate that other factors are involved at most sites than simple competition between these two?

      We have revised the sentence in question to include the text “by quadrant” in order to convey our meaning more precisely.

      (7) Throughout the manuscript, 5 apostrophes and 3 apostrophes are used instead of 5 prime symbols and 3 prime symbols.

      Thank you for pointing that out. We have fixed each instance of this.

      (8) Sometimes SF1 is written as Sf1. (also Tatsf1)

      This was a mouse/human gene/protein nomenclature error that we have fixed; thank you for pointing this out.

      (9) You may want to make sure that figures are labeled consistently with the manuscript text. In Figure 1B, it is RI rather than IR. In Figure 4 it is myoblast NE rather than C2C12 nuclear extract.

      We have fixed these, checked for other examples, and where relevant, edited those too.

      (10) I think Figure 1A could be improved by also including a depiction of the domain arrangements of SF1 and QK1.

      Done.

      (11) I was a bit confused with all the lines in Figure 1E and 1F. What is the difference between the log (pVal) and upregulated plots? Can these figures be simplified or explained more thoroughly?

      Based on this comment and one from Reviewer 1, we have slightly revised the wording (and font) on the output, which hopefully clarifies. These are motif enrichment plots generated by rMAPS (Refs 61 and 62) analysis of rMATS (Ref 60) data for exons more included (depicted by the red lines) or more skipped (depicted by the blue lines) compared to control versus a “background” set of exons that are detectable but unchanged. The -log<sub>10</sub> P-value (dotted line) indicates the significance of exons more included in shRNA treatment vs control shRNA (previously read “upregulated”) compared to background exons that are detectable but unchanged; the solid lines indicate the motif score; these are described in the references indicated.

      (12) Figure 1B, it is a bit hard to conclude that there is more AltEx or "RI/IR" in one sample vs. the other from these plots since the points overlay one another. Can you include numbers here?

      Added (and deleted Suppl Fig S1, which was simply a chart showing the numbers).

      (13) How was PSI calculated in Figure 2A?

      VAST-tools (we state this in the legend in the revised version).

      You may want to include rel protein (or the lower limit of detection) for Figure 2B to be consistent with 2C. Why is KD of SF1 so poor and variable between 2C and 2D?

      We have not investigated this, but these blots show an optimized result that we were able to obtain for the knockdown in each cell type. It may be that HEK293 cells (Fig 2B) have a stronger requirement for SF1 than C2C12 cells…? I would argue that it is not necessarily “poor” in Fig 2C, as we observe ~70% depletion of the protein.

      Why are two bands present in the gel?

      Two to three isoforms of SF1 are present in most cell types.

      A good (or bad, really) example of an SF1 western blot (and knockdown of ~35% in K562 or ~45% in HepG2) can also be seen on the ENCODE project website, for reference:

      https://www.encodeproject.org/documents/6001a414-b096-4073-94ff-3af165617eb5/@@download/attachment/SF1_BGKLV28-49.pdf

      By comparison, I think ours are much more cosmetically pleasing, and our knockdown (especially in C2C12) is much more efficient.

      (14) Figure 3, The asterisk refers to a cryptic product. Can the uaAcuuuCAG be used as a branch point? Presumably the natural 3' SS is now too close so this would result in activation of a downstream 3'SS?

      We did not pursue determining the identity of this minor and likely artefactual product, but we (and others) have observed a similar phenomenon when using splicing reporter-based mutational approaches.

      (15) For the methods. The "RNA extraction, RT -PCR,..." subheading needs to be on its own line. Please add (w/v) or (v/v) to percentages where appropriate. Please convert ug to the symbol for "micro".

      Thank you, we have made these changes.

      (16) In Figure 4B, the text here and legend are microscopic. Even with reading glasses, I couldn't make anything out!

      We have increased the font sizes for the text and scale bar…when referring to “legend” does the reviewer mean the scale bar?

      (17) As a potential discussion item, it is worth noting that SF1 could also repress splicing if it could either not engage with U2AF or be properly displaced by U2 snRNP so the snRNA could pair. I was wondering if QK1 could similarly be activating if it could engage with U2AF. I'm unsure if this could be tested by domain swaps (and is beyond the scope of this paper). It just may be worth speculating about.

      Good point and suggestion…we are looking into this.

      Reviewer #3 (Recommendations for the authors):

      (1) Is the reference in the text to Figure 5F correct for actin splicing (this is just before the discussion)?

      I see references several lines up from this, but I do not see a reference just before the discussion…?

      (2) I was not sure why the minigene experiments showed such high levels of intron retention that seemed to be impacted also by deletion of the branchpoint sequences, and suggest that the two branchpoints are not equal in strength.

      Neither were we, but Reviewer 2 has suggested that degradation of the spliced products could be rapid (NMD substrates), which could complicate the interpretation of what appears to be higher levels of intron retention. Given the possibility that this could be a non-physiological artefact, we have removed the measurement of unspliced reporter and now only show the spliced products (equally subject to degradation) and report their percent inclusion.
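
      As a minimal sketch of what "percent inclusion" of the spliced products means here, the snippet below computes it from the included and skipped isoform signals only, leaving unspliced RNA out of the denominator. The function name and the band intensities are hypothetical and are not taken from the manuscript; the authors' actual quantification pipeline is not described in this response.

```python
def percent_spliced_in(included: float, skipped: float) -> float:
    """Percent inclusion among spliced products only (unspliced RNA is ignored)."""
    total = included + skipped
    if total <= 0:
        raise ValueError("no spliced product detected")
    return 100.0 * included / total


# Hypothetical band intensities (arbitrary units), not data from the study.
print(percent_spliced_in(included=30.0, skipped=10.0))  # 75.0
```

      Because both spliced isoforms are assumed to be degraded at the same rate, their ratio is unaffected by that degradation, which is the rationale given above for dropping the unspliced species.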

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors investigate the nanoscopic distribution of glycine receptor subunits in the hippocampus, dorsal striatum, and ventral striatum of the mouse brain using single-molecule localization microscopy (SMLM). They demonstrate that only a small number of glycine receptors are localized at hippocampal inhibitory synapses. Using dual-color SMLM, they further show that clusters of glycine receptors are predominantly localized within gephyrin-positive synapses. A comparison between the dorsal and ventral striatum reveals that the ventral striatum contains approximately eight times more glycine receptors and this finding is consistent with electrophysiological data on postsynaptic inhibitory currents. Finally, using cultured hippocampal neurons, they examine the differential synaptic localization of glycine receptor subunits (α1, α2, and β). This study is significant as it provides insights into the nanoscopic localization patterns of glycine receptors in brain regions where this protein is expressed at low levels. Additionally, the study demonstrates the different localization patterns of GlyR in distinct striatal regions and its physiological relevance using SMLM and electrophysiological experiments. However, several concerns should be addressed.

      Specific comments on the original version:

      (1) Colocalization analysis in Figure 1A. The colocalization between Sylite and mEos-GlyRβ appears to be quite low. It is essential to assess whether the observed colocalization is not due to random overlap. The authors should consider quantifying colocalization using statistical methods, such as a pixel shift analysis, to determine whether colocalization frequencies remain similar after artificially displacing one of the channels.

      (2) Inconsistency between Figure 3A and 3B. While Figure 3B indicates an ~8-fold difference in the number of mEos4b-GlyRβ detections per synapse between the dorsal and ventral striatum, Figure 3A does not appear to show a pronounced difference in the localization of mEos4b-GlyRβ on Sylite puncta between these two regions. If the images presented in Figure 3A are not representative, the authors should consider replacing them with more representative examples or providing expanded images with multiple representative examples. Alternatively, if this inconsistency can be explained by differences in spot density within clusters, the authors should explain that.

      (3) Quantification in Figure 5. It is recommended that the authors provide quantitative data on cluster formation and colocalization with Sylite puncta in Figure 5 to support their qualitative observations.

      (4) Potential for pseudo-replication. It's not clear whether they're performing statistical tests across biological replicates, images, or even synapses. They often quote mean +/- SEM with n = 1000s, so does that mean they're doing tests on those 1000s? This needs to be clarified.

      (5) Does mEos affect expression levels or function of the protein? Can't see any experiments done to confirm this. Could suggest WB on homogenate, or mass spec?

      (6) Quantification of protein numbers is challenging with SMLM. Issues include i) some of FP not correctly folded/mature, and ii) dependence of localisation rate on instrument, excitation/illumination intensities, and also the thresholds used in analysis. Can the authors compare with another protein that has known expression levels- e.g. PSD95? This is quite an ask, but if they could show copy number of something known to compare with, it would be useful.

      (7) Rationale for doing nanobody dSTORM not clear at all. They don't explain the reason for doing the dSTORM experiments. Why not just rely on PALM for coincidence measurements, rather than tagging mEoS with a nanobody, and then doing dSTORM with that? Can they explain? Is it to get extra localisations- i.e. multiple per nanobody? If so, localising same FP multiple times wouldn't improve resolution. Also, no controls for nanobody dSTORM experiments- what about non-spec nb, or use on WT sections?

      (8) What resolutions/precisions were obtained in SMLM experiments? Should perform Fourier Ring Correlation (FRC) on SR images to state resolutions obtained (particularly useful for when they're presenting distance histograms, as this will be dependent on resolution). Likewise for precision, what was mean precision? Can they show histograms of localisation precision?

      (9) Why were DBSCAN parameters selected? How can they rule out multiple localisations per fluor? If low copy numbers (<10), then why bother with DBSCAN? Could just measure distance to each one.

      (10) For microscopy experiment methods, state power densities, not % or "nominal power".

      (11) In general, not much data presented. Any SI file with extra images etc.?

      (12) Clarification of the discussion on GlyR expression and synaptic localization: The discussion on GlyR expression, complex formation, and synaptic localization is sometimes unclear, and needs terminological distinctions between "expression level", "complex formation" and "synaptic localization". For example, the authors state: "What then is the reason for the low protein expression of GlyRβ? One possibility is that the assembly of mature heteropentameric GlyR complexes depends critically on the expression of endogenous GlyR α subunits." Does this mean that GlyRβ proteins that fail to form complexes with GlyRα subunits are unstable and subject to rapid degradation? If so, the authors should clarify this point. The statement "This raises the interesting possibility that synaptic GlyRs may depend specifically on the concomitant expression of both α1 and β transcripts." suggests a dependency on α1 and β transcripts. However, is the authors' focus on synaptic localization or overall protein expression levels? If this means synaptic localization, it would be beneficial to state this explicitly to avoid confusion. To improve clarity, the authors should carefully distinguish between these different aspects of GlyR biology throughout the discussion. Additionally, a schematic diagram illustrating these processes would be highly beneficial for readers.

      (13) Interpretation of GlyR localization in the context of nanodomains. The distribution of GlyR molecules on inhibitory synapses appears to be non-homogeneous, instead forming nanoclusters or nanodomains, similar to many other synaptic proteins. It is important to interpret GlyR localization in the context of nanodomain organization.

      Significance:

      The paper presents biological and technical advances. The biological insights revolve mostly around the documentation of glycine receptors in particular synapses in forebrain, where they are typically expressed at very low levels. The authors provide compelling data indicating that the expression is of physiological significance. The authors have done a nice job of combining genetically tagged mice with advanced microscopy methods to tackle the question of distributions of synaptic proteins. Overall, these advances are more incremental than groundbreaking.

      Comments on revised version:

      The authors have addressed the majority of the significant issues raised in the review and revised the manuscript appropriately. One issue that can be further addressed relates to the issue of pseudo-replication. The authors state in their response that "All experiments were repeated at least twice to ensure reproducibility (N independent experiments). Statistical tests were performed on pooled data across the biological replicates; n denotes the number of data points used for testing (e.g., number of synaptic clusters, detections, cells, as specified in each case).". This suggests that they're not doing their stats on biological replicates, and instead are pseudo replicating. It's not clear how they have ensured reproducibility, when the stats seem to have been done on pooled data across repeats.
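
      The distinction raised here can be sketched as follows: pooling every synapse from every experiment inflates n, whereas collapsing each independent experiment to a single summary value keeps n equal to the number of biological replicates. The numbers below are invented purely for illustration and the helper name is hypothetical; this is not the authors' analysis.

```python
from statistics import mean


def replicate_means(groups):
    """Collapse each independent experiment to one mean value."""
    return [mean(g) for g in groups]


# Hypothetical detections per synapse from N = 3 independent experiments.
condition_a = [[2, 3, 4], [3, 4, 5], [2, 2, 5]]

pooled = [x for g in condition_a for x in g]
print(len(pooled))                        # n if all synapses are pooled
print(len(replicate_means(condition_a)))  # n if tested at replicate level
```

      A test run on the replicate means would use N = 3 per condition, while a test on the pooled list treats synapses from the same experiment as if they were independent.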

    2. Reviewer #3 (Public review):

      In this study, Camuso et al., make use of a knock-in mouse model expressing endogenously mEos4b-tagged GlyRβ subunits to detect endogenous glycine receptors in mouse brain using single-molecule localization microscopy (SMLM). At synapses in the hippocampus GlyRβ molecules are detected at very low copy numbers. Assuming that each detected GlyRβ molecule is incorporated in a pentameric glycine receptor, it was estimated that while the majority of hippocampal inhibitory synapses do not contain glycine receptors, a small population of inhibitory synapses contain a few (up to 10) glycine receptors. Using dual-color SMLM approaches it is furthermore confirmed that the detected GlyRβ molecules are embedded in the postsynaptic domain marked by gephyrin. In contrast to the hippocampus, at inhibitory synapses in the striatum GlyRβ molecules were detected at considerably higher copy numbers. Interestingly, the observed number of GlyRβ detections was significantly higher in the ventral striatum compared to the dorsal striatum. These findings are corroborated by electrophysiological recordings showing that postsynaptic glycinergic currents can be readily detected in the ventral striatum but are almost absent in the dorsal striatum. Using lentiviral overexpression of recombinant GlyRalpha1, alpha2, and beta subunits in cultured hippocampal neurons, it is shown that GlyR alpha1 subunits are readily detectable at synapses, but overexpressed GlyRalpha2 and beta subunits did not strongly enrich at synapses. This could indicate that GlyRa1 expression is limiting the synaptic expression of GlyRβ-containing glycine receptors in hippocampal neurons.

      Comments on revised version:

      This revised manuscript is significantly improved. New experimental and quantitative analyses are presented that strengthen the conclusions. Overall, the results presented in this manuscript are based on carefully performed SMLM experiments and are well-presented and described. The knock-in mouse with endogenously tagged GlyRβ molecules is a very strong aspect of this study and provides confidence in the labeling; the combination with SMLM is also very strong, as it provides high sensitivity and spatial resolution. These results confirm previous studies and will be of interest to a specialised audience interested in glycine receptors, inhibitory synapse biology and super-resolution microscopy.

    3. Author response:

      The following is the authors’ response to the current reviews.

      We thank the editors of eLife and the reviewers for their thorough evaluation of our study. As regards the final comments of reviewer 1, please note that all experimental replicates were first analyzed separately and were then pooled, since the observed changes were comparable between experiments. This means that statistical analyses were done on pooled biological replicates.


      The following is the authors’ response to the original reviews.

      General Statements

      We thank the reviewers for their thorough and constructive evaluation of our work. We have revised the manuscript carefully and addressed all the criticisms raised, in particular the issues mentioned by several of the reviewers (see point-by-point response below). We have also added a number of explanations in the text for the sake of clarity, while trying to keep the manuscript as concise as possible.

      In our view, the novelty of our research is two-fold. From a neurobiological point of view, we provide conclusive evidence for the existence of glycine receptors (GlyRs) at inhibitory synapses in various brain regions including the hippocampus, dentate gyrus and sub-regions of the striatum. This solves several open questions and has fundamental implications for our understanding of the organisation and function of inhibitory synapses in the telencephalon. Secondly, our study makes use of the unique sensitivity of single molecule localisation microscopy (SMLM) to identify low protein copy numbers. This is a new way to think about SMLM as it goes beyond a mere structural characterisation and towards a quantitative assessment of synaptic protein assemblies.

      Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity): 

      In this manuscript, the authors investigate the nanoscopic distribution of glycine receptor subunits in the hippocampus, dorsal striatum, and ventral striatum of the mouse brain using single-molecule localization microscopy (SMLM). They demonstrate that only a small number of glycine receptors are localized at hippocampal inhibitory synapses. Using dual-color SMLM, they further show that clusters of glycine receptors are predominantly localized within gephyrin-positive synapses. A comparison between the dorsal and ventral striatum reveals that the ventral striatum contains approximately eight times more glycine receptors and this finding is consistent with electrophysiological data on postsynaptic inhibitory currents. Finally, using cultured hippocampal neurons, they examine the differential synaptic localization of glycine receptor subunits (α1, α2, and β). This study is significant as it provides insights into the nanoscopic localization patterns of glycine receptors in brain regions where this protein is expressed at low levels. Additionally, the study demonstrates the different localization patterns of GlyR in distinct striatal regions and its physiological relevance using SMLM and electrophysiological experiments. However, several concerns should be addressed.

      The following are specific comments: 

      (1) Colocalization analysis in Figure 1A. The colocalization between Sylite and mEos-GlyRβ appears to be quite low. It is essential to assess whether the observed colocalization is not due to random overlap. The authors should consider quantifying colocalization using statistical methods, such as a pixel shift analysis, to determine whether colocalization frequencies remain similar after artificially displacing one of the channels. 

      Following the suggestion of reviewer 1, we re-analysed CA3 images of Glrb<sup>eos/eos</sup> hippocampal slices by applying a pixel-shift type of control, in which the Sylite channel (in far red) was horizontally flipped relative to the mEos4b-GlyRβ channel (in green, see Methods). As expected, the number of mEos4b-GlyRβ detections per gephyrin cluster was markedly reduced compared to the original analysis (revised Fig. 1B), confirming that the synaptic mEos4b detections exceed chance levels (see page 5). 
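
      The logic of this flipped-channel control can be sketched in a few lines: detections are counted inside the synaptic mask, the mask is then mirrored horizontally, and the count is repeated to estimate chance overlap. The toy mask, the coordinates, and the function names are hypothetical; the actual analysis operates on SMLM detections and segmented Sylite puncta as described in the Methods.

```python
def count_in_mask(detections, mask):
    """Count detections (row, col) that fall on True pixels of a 2D mask."""
    return sum(1 for r, c in detections if mask[r][c])


def flip_horizontal(mask):
    """Mirror the mask left-to-right, keeping puncta size and number unchanged."""
    return [row[::-1] for row in mask]


# Toy 3x4 synaptic mask and detection coordinates (hypothetical values).
mask = [
    [True,  False, False, False],
    [True,  True,  False, False],
    [False, False, False, False],
]
detections = [(0, 0), (1, 0), (1, 1), (2, 3)]

observed = count_in_mask(detections, mask)
chance = count_in_mask(detections, flip_horizontal(mask))
print(observed, chance)  # 3 0
```

      Flipping, rather than randomising, preserves the number, size and density of the puncta, so any drop in the count reflects the loss of spatial correspondence between the two channels.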

      (2) Inconsistency between Figure 3A and 3B. While Figure 3B indicates an ~8-fold difference in the number of mEos4b-GlyRβ detections per synapse between the dorsal and ventral striatum, Figure 3A does not appear to show a pronounced difference in the localization of mEos4b-GlyRβ on Sylite puncta between these two regions. If the images presented in Figure 3A are not representative, the authors should consider replacing them with more representative examples or providing expanded images with multiple representative examples. Alternatively, if this inconsistency can be explained by differences in spot density within clusters, the authors should explain that.

      The pointillist images in Fig. 3A are essentially binary (red-black). Therefore, the density of detections at synapses cannot be easily judged by eye. For clarity, the original images in Fig. 3A have been replaced with two other examples that better reflect the different detection numbers in the dorsal and ventral striatum. 

      (3) Quantification in Figure 5. It is recommended that the authors provide quantitative data on cluster formation and colocalization with Sylite puncta in Figure 5 to support their qualitative observations. 

      This is an important point that was also raised by the other reviewers. We have performed additional experiments to increase the data volume for analysis. For quantification, we used two approaches. First, we counted the percentage of infected cells in which synaptic localisation of the recombinant receptor subunit was observed (Fig. 5C). We found that mEos4b-GlyRa1 consistently localises at synapses, indicating that all cells express endogenous GlyRb. When neurons were infected with mEos4b-GlyRb, fewer cells had synaptic clusters, meaning that indeed, GlyR alpha subunits are the limiting factor for synaptic targeting. In cultures infected with mEos4b-GlyRa2, only very few neurons displayed synaptic localisation (as judged by epifluorescence imaging). We think this shows that GlyRa2 is less capable of forming heteromeric complexes than GlyRa1, in line with our previous interpretation (see pp. 9-10, 13). 

      Secondly, we quantified the total intensity of each subunit at gephyrin-positive domains, both in infected neurons as well as non-infected control cultures (Fig. 5D). We observed that mEos4b-GlyRa1 intensity at gephyrin puncta was higher than that of the other subunits, again pointing to efficient synaptic targeting of GlyRa1. Gephyrin cluster intensities (Sylite labelling) were not significantly different in GlyRb and GlyRa2 expressing neurons compared to the uninfected control, indicating that the lentiviral expression of recombinant subunits does not fundamentally alter the size of mixed inhibitory synapses in hippocampal neurons. Interestingly, gephyrin levels were slightly higher in hippocampal neurons expressing mEos4b-GlyRa1. In our view, this comes from an enhanced expression and synaptic targeting of mEos4b-GlyRa1 heteromers with endogenous GlyRb, pointing to a structural role of GlyRa1/b in hippocampal synapses (pp. 10, 13).

      The new data and analyses have been described and illustrated in the relevant sections of the manuscript.

      (4) Potential for pseudo replication. It's not clear whether they're performing stats tests across biological replica, images, or even synapses. They often quote mean +/- SEM with n = 1000s, and so does that mean they're doing tests on those 1000s? Need to clarify. 

      All experiments were repeated at least twice to ensure reproducibility (N independent experiments). Statistical tests were performed on pooled data across the biological replicates; n denotes the number of data points used for testing (e.g., number of synaptic clusters, detections, cells, as specified in each case). We have systematically given these numbers in the revised manuscript (n, N, and other experimental parameters such as the number of animals used, coverslips, images or cells). Data are generally given as mean +/- SEM or as mean +/- SD as indicated.
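
To make the distinction between n and N concrete, the following sketch (illustrative only, with invented values and hypothetical experiment names) computes mean +/- SEM on pooled data points and also lists the per-experiment means that a replicate-level analysis would use instead.

```python
import statistics

def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of values."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / len(values) ** 0.5
    return mean, sem

# Hypothetical detections per synapse from two independent experiments.
experiments = {
    "exp1": [2, 4, 3, 5],
    "exp2": [6, 8, 7, 7],
}

# Pooled analysis: n = number of data points, N = number of experiments.
pooled = [v for values in experiments.values() for v in values]
n = len(pooled)
N = len(experiments)
pooled_mean, pooled_sem = mean_sem(pooled)

# Replicate-level summary: one mean per independent experiment.
per_experiment_means = [statistics.mean(v) for v in experiments.values()]

print(f"n = {n}, N = {N}, pooled mean = {pooled_mean:.2f} +/- {pooled_sem:.2f} (SEM)")
print("per-experiment means:", per_experiment_means)
```

With pooled data the SEM shrinks as n grows, which is why both n and N need to be reported alongside each test.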

      (5) Does mEoS affect expression levels or function of the protein? Can't see any experiments done to confirm this. Could suggest WB on homogenate, or mass spec? 

      The Glrb<sup>eos/eos</sup> knock-in mouse line has been characterised previously and does not display any ultrastructural or functional deficits at inhibitory synapses (Maynard et al. 2021 eLife). GlyRβ expression and glycine-evoked responses were not significantly different to those of the wildtype. The synaptic localisation of mEos4b-GlyRb in KI animals demonstrates correct assembly of heteromeric GlyRs and synaptic targeting. Accordingly, the animals do not display any obvious phenotype. We have clarified this in the manuscript (p. 4). In the case of cultured neurons, long-term expression of fluorescent receptor subunits with lentivirus has proven ideal to achieve efficient synaptic targeting. The low and continuous supply of recombinant receptors ensures assembly with endogenous subunits to form heteropentameric receptor complexes (e.g. [Patrizio et al. 2017 Sci Rep]). In the present study, lentivirus infection did not induce any obvious differences in the number or size of inhibitory synapses compared to control neurons, as judged by Sylite labelling of synaptic gephyrin puncta (new Fig. 5D).

      (6) Quantification of protein numbers is challenging with SMLM. Issues include i) some of FP not correctly folded/mature, and ii) dependence of localisation rate on instrument, excitation/illumination intensities, and also the thresholds used in analysis. Can the authors compare with another protein that has known expression levels- e.g. PSD95? This is quite an ask, but if they could show copy number of something known to compare with, it would be useful. 

      We agree that absolute quantification with SMLM is challenging, since the number of detections depends on fluorophore maturation, photophysics, imaging conditions, and analysis thresholds (discussed in Patrizio & Specht 2016, Neurophotonics). For this reason, only very few datasets provide reliable copy numbers, even for well-studied proteins such as PSD-95. One notable exception is the study by Maynard et al. (eLife 2021) that quantified endogenous GlyRβ-containing receptors in spinal cord synapses using SMLM combined with correlative electron microscopy. The strength of this work was the use of a KI mouse strain, which ensures that mEos4b-GlyRβ expression follows intrinsic regional and temporal profiles. The authors reported a stereotypic density of ~2,000 GlyRs/µm² at synapses, corresponding to ~120 receptors per synapse in the dorsal horn and ~240 in the ventral horn, taking into account various parameters including receptor stoichiometry and the functionality of the fluorophore. These values are very close to our own calculations of GlyR numbers at spinal cord synapses that were obtained slightly differently in terms of sample preparation, microscope setup, imaging conditions, and data analysis, lending support to our experimental approach. Nevertheless, the obtained GlyR copy numbers at hippocampal synapses clearly have to be taken as estimates rather than precise figures, because the number of detections from a single mEos4b fluorophore can vary substantially, meaning that the fluorophores are not represented equally in pointillist images. This can affect the copy number calculation for a specific synapse, in particular when the numbers are low (e.g. in hippocampus); however, it should not alter the average number of detections (Fig. 1B) or the (median) molecule numbers of the entire population of synapses (Fig. 1C). We have discussed the limitations of our approach (p. 11).

      (7) Rationale for doing nanobody dSTORM not clear at all. They don't explain the reason for doing the dSTORM experiments. Why not just rely on PALM for coincidence measurements, rather than tagging mEoS with a nanobody, and then doing dSTORM with that? Can they explain? Is it to get extra localisations- i.e. multiple per nanobody? If so, localising same FP multiple times wouldn't improve resolution. Also, no controls for nanobody dSTORM experiments- what about non-spec nb, or use on WT sections? 

      As discussed above (point 6), the detection of fluorophores with SMLM is influenced by many parameters, not least the noise produced by emitting molecules other than the fluorophore used for labelling. Our study is exceptional in that it attempts to identify extremely low molecule numbers (down to 1). To verify that the detections obtained with PALM correspond to mEos4b, we conducted robust control experiments (including pixel-shift as suggested by the reviewer, see point 1, revised Fig. 1B). The rationale for the nanobody-based dSTORM experiments was twofold: (1) to have an independent readout of the presence of low-copy GlyRs at inhibitory synapses and (2) to analyse the nanoscale organisation of GlyRs relative to the synaptic gephyrin scaffold using dual-colour dSTORM with spectral demixing (see p. 6). The organic fluorophores used in dSTORM (AF647, CF680) ensure high photon counts, essential for reliable co-localisation and distance analysis. PALM and dSTORM cannot be combined in dual-colour mode, as they require different buffers and imaging conditions. 

      The specificity of the anti-Eos nanobody was demonstrated by immunohistochemistry in spinal cord cultures expressing mEos4b-GlyRb and wildtype control tissue (Fig. S3). In response to the reviewer's remarks, we also performed a negative control experiment in Glrb<sup>eos/eos</sup> slices (dSTORM), in which the nanobody was omitted (new Fig. S4F,G). Under these conditions, spectral demixing produced a single peak corresponding to CF680 (gephyrin) without any AF647 contribution (Fig. S4F). The background detection of "false" AF647 detections at synapses was significantly lower than in the slices labelled with the nanobody. We conclude that the fluorescence signal observed in our dual-colour dSTORM experiments arises from the specific detection of mEos4b-GlyRb by the nanobody, rather than from background, cross-reactivity or wrong attribution of colour during spectral demixing. We have added these data and explanations in the results (p. 7) and in the figure legend of Fig. S4F,G.

      (8) What resolutions/precisions were obtained in SMLM experiments? Should perform Fourier Ring Correlation (FRC) on SR images to state resolutions obtained (particularly useful for when they're presenting distance histograms, as this will be dependent on resolution). Likewise for precision, what was mean precision? Can they show histograms of localisation precision. 

      This is an interesting question in the context of our experiments with low-copy GlyRs, since the spatial resolution of SMLM is limited also by the density of molecules, i.e. the sampling of the structure in question (Nyquist-Shannon criterion). Accordingly, the priority of the PALM experiments was to improve the sensitivity of SMLM for the identification of mEos4b-GlyRb subunits, rather than to maximize the spatial resolution. The mean localisation precision in PALM was 33 +/- 12 nm, as calculated from the fitting parameters of each detection (Zeiss, ZEN software), which ultimately result from their signal-to-noise ratio. This is a relatively low precision for SMLM, which can be explained by the low brightness of mEos4b compared to organic fluorophores together with the elevated fluorescence background in tissue slices.

      In the case of dSTORM, the aim was to study the relative distribution of GlyRs within the synaptic scaffold, for which a higher localisation precision was required (p. 6). Therefore, detections with a precision ≥ 25 nm were filtered out during analysis with NEO software (Abbelight). The retained detections had a mean localisation precision of 12 +/- 5 nm for CF680 (Sylite) and 11 +/- 4 nm for AF647 (nanobody). These values are given in the revised manuscript (pp. 18, 22).
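
As a minimal sketch of this filtering step (not the NEO implementation; the record layout and the values are hypothetical), detections are kept only if their localisation precision is below the 25 nm cutoff, and the mean +/- SD of the retained precisions is then reported.

```python
import statistics

PRECISION_CUTOFF_NM = 25.0  # detections with precision >= 25 nm are discarded

# Hypothetical detections; field names are illustrative assumptions.
detections = [
    {"x": 100.0, "y": 220.0, "precision_nm": 9.0},
    {"x": 104.0, "y": 218.0, "precision_nm": 12.0},
    {"x": 98.0,  "y": 225.0, "precision_nm": 15.0},
    {"x": 310.0, "y": 40.0,  "precision_nm": 25.0},   # at the cutoff: removed
    {"x": 512.0, "y": 77.0,  "precision_nm": 40.0},   # imprecise: removed
]

retained = [d for d in detections if d["precision_nm"] < PRECISION_CUTOFF_NM]

precisions = [d["precision_nm"] for d in retained]
mean_precision = statistics.mean(precisions)
sd_precision = statistics.stdev(precisions)

print(f"{len(retained)} detections retained, "
      f"precision {mean_precision:.0f} +/- {sd_precision:.0f} nm (mean +/- SD)")
```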

      (9) Why were DBSCAN parameters selected? How can they rule out multiple localisations per fluor? If low copy numbers (<10), then why bother with DBSCAN? Could just measure distance to each one. 

      Multiple detections of the same fluorophore are intrinsic to dSTORM imaging and have not been eliminated from the analysis. Small clusters of detections likely represent individual molecules (e.g. single receptors in the extrasynaptic regions, Fig. 2A). DBSCAN is a robust clustering method that is quite insensitive to minor changes in the choice of parameters. For dSTORM of synaptic gephyrin clusters (CF680), a relatively small search radius (80 nm) together with a high number of detections (≥ 50 neighbours) were chosen to reconstruct the postsynaptic domain with high spatial resolution (see point 8). In the case of the GlyR (nanobody-AF647), the clustering was done mostly for practical reasons, as it provided the coordinates of the centre of mass of the detections. The low stringency of this clustering (200 nm radius, ≥ 5 neighbours) effectively filters out single detections that can result from background noise or incorrect demixing. An additional reference explaining the use of DBSCAN including the choice of parameters is given on p. 22 (see also R2 point 4).
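
The effect of the low-stringency setting can be illustrated with a minimal, self-contained DBSCAN sketch. This is not the NEO implementation: the coordinates are invented, and counting a point as its own neighbour is an assumption about how the "≥ 5 neighbours" criterion is applied.

```python
import math

def dbscan(points, eps, min_neighbours):
    """Minimal DBSCAN; returns one label per point (-1 = noise)."""
    def neighbours(i):
        xi, yi = points[i]
        # Assumption: a point counts as its own neighbour.
        return [j for j, (xj, yj) in enumerate(points)
                if math.hypot(xi - xj, yi - yj) <= eps]

    labels = [None] * len(points)
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_neighbours:
            labels[i] = -1            # provisionally noise
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster_id    # noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            reachable = neighbours(j)
            if len(reachable) >= min_neighbours:
                queue.extend(reachable)   # j is a core point: keep expanding
    return labels

def centroid(points):
    """Centre of mass of a set of detections."""
    xs, ys = zip(*points)
    return sum(xs) / len(xs), sum(ys) / len(ys)

# Hypothetical AF647 detections (nm): one small group plus one isolated detection.
detections = [(0, 0), (10, 0), (0, 10), (10, 10), (5, 5), (1000, 1000)]

labels = dbscan(detections, eps=200, min_neighbours=5)
cluster0 = [p for p, label in zip(detections, labels) if label == 0]

print(labels)              # the isolated detection is labelled -1 (noise)
print(centroid(cluster0))  # centre of mass of the retained cluster
```

The isolated detection is discarded as noise, while the grouped detections yield a single centre of mass that can be used for distance measurements.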

      (10) For microscopy experiment methods, state power densities, not % or "nominal power". 

      Done. We now report the irradiance (laser power density) instead of nominal power (pp. 18, 21). 

      (11) In general, not much data presented. Any SI file with extra images etc.? 

      The original submission included four supplementary figures with additional data and representative images that should have been available to the reviewer (Figs. S1-S4). The SI file has been updated during revision (new Fig. S4E-G). 

      (12) Clarification of the discussion on GlyR expression and synaptic localization: The discussion on GlyR expression, complex formation, and synaptic localization is sometimes unclear, and needs terminological distinctions between "expression level", "complex formation" and "synaptic localization". For example, the authors state:"What then is the reason for the low protein expression of GlyRβ? One possibility is that the assembly of mature heteropentameric GlyR complexes depends critically on the expression of endogenous GlyR α subunits." Does this mean that GlyRβ proteins that fail to form complexes with GlyRα subunits are unstable and subject to rapid degradation? If so, the authors should clarify this point. The statement "This raises the interesting possibility that synaptic GlyRs may depend specifically on the concomitant expression of both α1 and β transcripts." suggests a dependency on α1 and β transcripts. However, is the authors' focus on synaptic localization or overall protein expression levels? If this means synaptic localization, it would be beneficial to state this explicitly to avoid confusion. To improve clarity, the authors should carefully distinguish between these different aspects of GlyR biology throughout the discussion. Additionally, a schematic diagram illustrating these processes would be highly beneficial for readers. 

      We thank the reviewer for pointing this out. We are dealing with several processes: protein expression that determines subunit availability and the assembly of pentameric GlyR complexes, surface expression, membrane diffusion, and accumulation of GlyRb-containing receptor complexes at inhibitory synapses. We have edited the manuscript, particularly the discussion, and tried to be as clear as possible in our wording.

      We chose not to add a schematic illustration for the time being, because any graphical representation is necessarily a simplification. Instead, we preferred to summarise the main numbers in tabular form (Table 1). We are of course open to any other suggestions.

      (13) Interpretation of GlyR localization in the context of nanodomains. The distribution of GlyR molecules on inhibitory synapses appears to be non-homogeneous, instead forming nanoclusters or nanodomains, similar to many other synaptic proteins. It is important to interpret GlyR localization in the context of nanodomain organization. 

      The dSTORM images in Fig. 2 are pointillist representations that show individual detections rather than molecules. Small clusters of detections are likely to originate from a single AF647 fluorophore (in the case of nanobody labelling) and therefore represent single GlyRb subunits. Since GlyR copy numbers are so low at hippocampal synapses (≤ 5), the notion of nanodomain is not directly applicable. Our analysis therefore focused on the integration of GlyRs within the postsynaptic scaffold, rather than attempting to define nanodomain structures (see also response to point 8 of R1). A clarification has been added in the revised manuscript (p. 6).

      Reviewer #1 (Significance): 

      The paper presents biological and technical advances. The biological insights revolve mostly on the documentation of Glycine receptors in particular synapses in forebrain, where they are typically expressed at very low levels. The authors provide compelling data indicating that the expression is of physiological significance. The authors have done a nice job of combining genetically-tagged mice with advanced microscopy methods to tackle the question of distributions of synaptic proteins. Overall these advances are more incremental than groundbreaking. 

      We thank the reviewer for acknowledging both the technical and biological advances of our study. While we recognize that our work builds upon established models, we consider that it also addresses important unresolved questions, namely that GlyRs are present and specifically anchored at inhibitory synapses in telencephalic regions, such as the hippocampus and striatum. From a methodological point of view, our study demonstrates that SMLM can be applied not only for structural analysis of highly abundant proteins, but also to reliably detect proteins present at very low copy numbers. This ability to identify and quantify sparse molecule populations adds a new dimension to SMLM applications, which we believe increases the overall impact of our study beyond the field of synaptic neuroscience.

      Reviewer #2 (Evidence, reproducibility and clarity): 

      In their manuscript "Single molecule counting detects low-copy glycine receptors in hippocampal and striatal synapses" Camuso and colleagues apply single molecule localization microscopy (SMLM) methods to visualize low copy numbers of GlyRs at inhibitory synapses in the hippocampal formation and the striatum. SMLM analysis revealed higher copy numbers in striatum compared to hippocampal inhibitory synapses. They further provide evidence that these low copy numbers are tightly linked to post-synaptic scaffolding protein gephyrin at inhibitory synapses. Their approach profits from the high sensitivity and resolution of SMLM and challenges the controversial view on the presence of GlyRs in these formations although there are reports (electrophysiology) on the presence of GlyRs in these particular brain regions. These new datasets in the current manuscript may certainly assist in understanding the complexity of fundamental building blocks of inhibitory synapses. 

      However I have some minor points that the authors may address for clarification: 

      (1) In Figure 1 the authors apply PALM imaging of mEos4b-GlyRß (knockin) and here the corresponding Sylite label seems to be recorded in widefield, it is not clearly stated in the figure legend if it is widefield or super-resolved. In Fig 1 A - is the scale bar 5 µm? Some Sylite spots appear to be sized around 1 µm, especially the brighter spots, but maybe this is due to the lower resolution of widefield imaging? Regarding the statistical comparison: what method was chosen to test for normality distribution, I think this point is missing in the methods section. 

      This is correct; the apparent size of the Sylite spots does not reflect the real size of the synaptic gephyrin domain due to the limited resolution of widefield imaging, including the detection of out-of-focus light. We have clarified in the legend of Fig. 1A that Sylite labelling was imaged with classic epifluorescence microscopy. The scale bar in Fig. 1A corresponds to 5 µm. Since the data were not normally distributed, nonparametric tests (Kruskal-Wallis one-way ANOVA with Dunn’s multiple comparison test or Mann-Whitney U-test for pairwise comparisons) were used (p. 23). 

      Moreover I would appreciate a clarification and/or citation that the knockin model results in no structural and physiological changes at inhibitory synapses, I believe this model has been applied in previous studies and corresponding clarification can be provided. 

      The Glrb<sup>eos/eos</sup> mouse model has been described previously and does not exhibit any structural or physiological phenotypes (Maynard et al. 2021 eLife). The issue was also raised by reviewer R1 (point 5) and has been clarified in the revised manuscript (p. 4).

      (2) In the next set of experiments the authors switch to demixing dSTORM experiments - an explanation why this is performed is missing in the text - I guess better resolution to perform more detailed distance measurements? For these experiments: which region of the hippocampus did the authors select, I cannot find this information in legend or main text. 

      Yes, the dSTORM experiments enable dual-colour structural analysis at high spatial resolution (see response to R1 point 7). An explanation has been added (p. 6).

      (3) Regarding parameters of demixing experiments: the number of frames (10.000) seems quite low and the exposure time higher than expected for Alexa 647. Can the authors explain the reason for choosing these particular parameters (low expression profile of the target - so better separation?, less fluorophores on label and shorter collection time?) or is there a reference that can be cited? The laser power is given in the methods in percentage of maximal output power, but for better comparison and reproducibility I recommend to provide the values of a power meter (kW/cm2) as lasers may change their maximum output power during their lifetime. 

      Acquisition parameters (laser power, exposure time) for dSTORM were chosen to obtain a good localisation precision (~12 nm; see R1 point 8). The number of frames is adequate to obtain well sampled gephyrin scaffolds in the CF680 channel. In the case of the GlyR (nanobody-AF647), the concept of spatial resolution does not really apply due to the low number of targets (see R1, point 13). Power density (irradiance) values have now been given (pp. 18, 21).

      (4) For analysis of subsynaptic distribution: how did the authors decide to choose the parameters in the NEO software for DBSCAN clustering - was a series of parameters tested to find optimal conditions and did the analysis start with an initial test if data is indeed clustered (K-ripley) or is there a reference in literature that can be provided? 

      DBSCAN parameters were optimised manually, by testing different values. Identification of dense and well-delimited gephyrin clusters (CF680) was achieved with a small radius and a high number of detections (80 nm, ≥ 50 neighbours), whereas filtering of low-density background in the AF647 channel (GlyRs) required less stringent parameters (200 nm, ≥ 5) due to the low number of target molecules. Similar parameters were used in a previous publication (Khayenko et al. 2022, Angewandte Chemie). The reference has been provided on p. 22 (see also R1 point 9).

      (5) A conclusion/discussion of the results presented in Figure 5 is missing in the text/discussion. 

      This part of the manuscript has been completely overhauled. It includes new experimental data, quantification of the data (new Fig.5), as well as the discussion and interpretation of our findings (see also R1, point 3). In agreement with our earlier interpretation, the data confirm that low availability of GlyRa1 subunits limits the expression and synaptic targeting of GlyRa1/b heteropentamers. The observation that GlyRa1 overexpression with lentivirus increases the size of the postsynaptic gephyrin domain further points to a structural role, whereby GlyRs can enhance the stability (and size) of inhibitory synapses in hippocampal neurons, even at low copy numbers (pp. 13-14). 

      (6) In line 552 "suspension" is misleading, better use "solution" 

      Done.

      Reviewer #2 (Significance): 

      Significance: The manuscript provides new insights to presence of low-copy numbers by visualizing them via SMLM. This is the first report that visualizes GlyR optically in the brain applying the knock-in model of mEOS4b tagged GlyRß and quantifies their copy number comparing distribution and amount of GlyRs from hippocampus and striatum. Imaging data correspond well to electrophysiological measurements in the manuscript. 

      Field of expertise: Super-Resolution Imaging and corresponding analysis 

      Reviewer #4 (Evidence, reproducibility and clarity): 

      In this study, Camuso et al., make use of a knock-in mouse model expressing endogenously mEos4b-tagged GlyRβ to detect endogenous glycine receptors using single-molecule localization microscopy. The main conclusion from this study is that in the hippocampus GlyRβ molecules are barely detected, while inhibitory synapses in the ventral striatum seem to express functionally relevant GlyR numbers. 

      I have a few points that I hope help to improve the strength of this study. 

      - In the hippocampus, this study finds that the numbers of detections are very low. The authors perform adequate controls to indicate that these localizations are above noise level. Nevertheless, it remains questionable that these reflect proper GlyRs. The suggestion that in hippocampal synapses the low numbers of GlyRβ molecules "are important in assembly or maintenance of inhibitory synaptic structures in the brain" is in itself interesting, but is not at all supported. It is also difficult to envision how such low numbers could support the structure of a synapse. A functional experiment showing that knockdown of GlyRs affects inhibitory synapse structure in hippocampal neurons would be a minimal test of this. 

      It is not clear what the reviewer means by “it remains questionable that these reflect proper GlyRs”. The PALM experiments include a series of stringent controls (see R1, point 1) demonstrating the existence of low-copy GlyRs at inhibitory synapses in the hippocampus (Fig. 1) and in the striatum (Fig. 3), and are backed up by dSTORM experiments (Fig. 2). We have no reason to doubt that these receptors are fully functional (as demonstrated for the ventral striatum, Fig. 4). However, due to their low number, a role in inhibitory synaptic transmission is clearly limited, at least in the hippocampus and dorsal striatum. 

      We therefore propose a structural role, where the GlyRs could be required to stabilise the postsynaptic gephyrin domain in hippocampal neurons. This is based on the idea that the GlyR-gephyrin affinity is much higher than that of the GABAAR-gephyrin interaction (reviewed in Kasaragod & Schindelin 2018 Front Mol Neurosci). Accordingly, there is a close relationship between GlyRs and gephyrin numbers, sub-synaptic distribution, and dynamics in spinal cord synapses that are mostly glycinergic (Specht et al. 2013 Neuron; Maynard et al. 2021 eLife; Chapdelaine et al. 2021 Biophys J). It is reasonable to assume that low-copy GlyRs could play a similar structural role at hippocampal synapses. A knockdown experiment targeting these few receptors is technically very challenging and beyond the scope of this study. However, in response to the reviewer's question we have conducted new experiments in cultured hippocampal neurons (new Fig. 5). They demonstrate that overexpression of GlyRa1/b heteropentamers increases the size of the postsynaptic domain in these neurons, supporting our interpretation of a structural role of low-copy GlyRs (p. 14).

      - The endogenous tagging strategy is a very strong aspect of this study and provides confidence in the labeling of GlyRβ molecules. One caveat however, is that this labeling strategy does not discriminate whether GlyRβ molecules are on the cell membrane or in internal compartments. Can the authors provide an estimate of the ratio of surface to internal GlyRβ molecules? 

      Gephyrin is known to form a two-dimensional scaffold below the synaptic membrane to which inhibitory GlyRs and GABAARs attach (reviewed in Alvarez 2017 Brain Res). The majority of the synaptic receptors are therefore thought to be located in the synaptic membrane, which is supported by the close relationship between the sub-synaptic distribution of GlyRs and gephyrin in spinal cord neurons (e.g. Maynard et al. 2021 eLife). To demonstrate the surface expression of GlyRs at hippocampal synapses we labelled cultured hippocampal neurons expressing mEos4b-GlyRa1 with anti-Eos nanobody in non-permeabilised neurons (see Author response image 1). The close correspondence between the nanobody (AF647) and the mEos4b signal confirms that the majority of the GlyRs are indeed located in the synaptic membrane.

      Author response image 1.

      Left: Lentivirus expression of mEos4b-GlyRa1 in fixed and non-permeabilised hippocampal neurons (mEos4b signal). Right: Surface labelling of the recombinant subunit with anti-Eos nanobody (AF647). 

      - “We also estimated the absolute number of GlyRs per synapse in the hippocampus. The number of mEos4b detections was converted into copy numbers by dividing the detections at synapses by the average number of detections of individual mEos4b-GlyRβ containing receptor complexes”. In essence this is a correct method to estimate copy numbers, and the authors discuss some of the pitfalls associated with this approach (i.e., maturation of fluorophore and detection limit). Nevertheless, the authors did not subtract the number of background localizations determined in the two negative control groups. This is critical, particularly at these low-number estimations. 

      We fully agree that background subtraction can be useful with low detection numbers. In the revised manuscript, copy numbers are now reported as background-corrected values. Specifically, the mean number of detections measured in wildtype slices was used to calculate an equivalent receptor number, which was then subtracted from the copy number estimates across hippocampus, spinal cord and striatum. This procedure is described in the methods (p. 20) and results (p. 5, 8), and mentioned in the figure legends of Fig. 1C, 3C. The background corrected values are given in the text and Table 1.
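
The arithmetic of this correction can be sketched as follows. This is an illustration under stated assumptions, not the analysis script used in the study: the function names and all numerical values are hypothetical.

```python
def copy_number(detections, detections_per_complex):
    """Convert a number of mEos4b detections into an equivalent receptor number."""
    return detections / detections_per_complex

def background_corrected_copy_number(detections_at_synapse,
                                     wildtype_mean_detections,
                                     detections_per_complex):
    """Subtract the receptor-equivalent background measured in wildtype slices."""
    raw = copy_number(detections_at_synapse, detections_per_complex)
    background = copy_number(wildtype_mean_detections, detections_per_complex)
    return raw - background

# Hypothetical values, for illustration only.
detections_at_synapse = 30      # detections at one synapse (knock-in slice)
wildtype_mean_detections = 5    # mean detections per synapse in wildtype slices
detections_per_complex = 10     # mean detections from a single receptor complex

corrected = background_corrected_copy_number(
    detections_at_synapse, wildtype_mean_detections, detections_per_complex)
print(corrected)
```

Because the background is expressed in the same receptor-equivalent units, it can be subtracted directly from the estimates for each region.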

      - Furthermore, the authors state that "The advantage of this estimation is that it is independent of the stoichiometry of heteropentameric GlyRs". However, if the stoichiometry is unknown, the number of counted GlyRβ subunits cannot simply be reported as the number of GlyRs. This should be discussed in more detail, and more carefully reported throughout the manuscript. 

      The reviewer is right to point this out. There is still some debate about the stoichiometry of heteropentameric GlyRs. Configurations with 2a:3b, 3a:2b and 4a:1b subunits have been advanced (e.g. Grudzinska et al. 2005 Neuron; Durisic et al. 2012 J Neurosci; Patrizio et al. 2017 Sci Rep; Zhu & Gouaux 2021 Nature). We have therefore chosen a quantification that is independent of the underlying stoichiometry. Since our quantification is based on very sparse clusters of mEos4b detections that likely originate from a single receptor complex (irrespective of its stoichiometry), the reported values actually reflect the number of GlyRs (and not GlyRb subunits). We have clarified this in the results (p. 5) and throughout the manuscript (Table 1). 

      - The dual-color imaging provides insights in the subsynaptic distribution of GlyRβ molecules in hippocampal synapses. Why are similar studies not performed on synapses in the ventral striatum where functionally relevant numbers of GlyRβ molecules are found? Here insights in the subsynaptic receptor distribution would be of much more interest as it can be tied to the function. 

      This is an interesting suggestion. However, the primary aim of our study was to identify the existence of GlyRs in hippocampal regions. At low copy numbers, the concept of sub-synaptic domains (SSDs, e.g. Yang et al. 2021 EMBO Rep) becomes irrelevant (see R1 point 13). It should be pointed out that the dSTORM pointillist images (Fig. 2A) represent individual GlyR detections rather than clusters of molecules. In the striatum, our specific purpose was to solve an open question about the presence of GlyRs in different subregions (putamen, nucleus accumbens).

      - It is unclear how the experiments in Figure 5 add to this study. These results are valid, but do not seem to directly test the hypothesis that "the expression of α subunits may be limiting factor controlling the number of synaptic GlyRs". These experiments simply test if overexpressed α subunits can be detected. If the α subunits are limiting, measuring the effect of α subunit overexpression on GlyRβ surface expression would be a more direct test. 

      Both R1 and R2 have also commented on the data in Fig. 5 and their interpretation. We have substantially revised this section as described before (see R1 point 3) including additional experiments and quantification of the data (new Fig. 5). The findings lend support to our earlier hypothesis that GlyR alpha subunits (in particular GlyRa1) are the limiting factor for the expression of heteropentameric GlyRa/b in hippocampal neurons (pp. 13-14). Since the GlyRa1 subunit itself does not bind to gephyrin (Patrizio et al. 2017 Sci Rep), the synaptic localisation of the recombinant mEos4b-GlyRa1 subunits is proof that they have formed heteropentamers with endogenous GlyRb subunits and driven their membrane trafficking, which the GlyRb subunits are incapable of doing on their own.

      Reviewer #4 (Significance): 

      These results are based on carefully performed single-molecule localization experiments, and are well-presented and described. The knockin mouse with endogenously tagged GlyRβ molecules is a very strong aspect of this study and provides confidence in the labeling, the combination with single-molecule localization microscopy is very strong as it provides high sensitivity and spatial resolution. 

      The conceptual innovation however seems relatively modest, these results confirm previous studies but do not seem to add novel insights. This study is entirely descriptive and does not bring new mechanistic insights. 

      This study could be of interest to a specialized audience interested in glycine receptor biology, inhibitory synapse biology and super-resolution microscopy. 

      My expertise is in super-resolution microscopy, synaptic transmission and plasticity 

      As we have stated before, the novelty of our study lies in the use of SMLM for the identification of very small numbers of molecules, which requires careful control experiments. This is something that has not been done before and that can be of interest to a wider readership, as it opens up SMLM for ultrasensitive detection of rare molecular events. Using this approach, we solve two open scientific questions: (1) the demonstration that low-copy GlyRs are present at inhibitory synapses in the hippocampus, (2) the sub-region specific expression and functional role of GlyRs in the ventral versus dorsal striatum.

      The following review was provided later under the name “Reviewer #4”. To avoid confusion with the last reviewer from above we will refer to this review as R4-2.

      Reviewer #4-2 (Evidence, reproducibility and clarity):  

      Summary:

      Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate).

      The authors investigate the presence of synaptic glycine receptors in the telencephalon, whose presence and function is poorly understood. 

      Using a transgenically labeled glycine receptor beta subunit (Glrb-mEos4b) mouse model together with super-resolution microscopy (SMLM, dSTORM), they demonstrate the presence of a low but detectable amount of synaptically localized GLRB in the hippocampus. While they do not perform a functional analysis of these receptors, they do demonstrate that these subunits are integrated into the inhibitory postsynaptic density (iPSD) as labeled by the scaffold protein gephyrin. These findings demonstrate that a low level of synaptically localized glycine receptor subunits exists in the hippocampal formation, although whether or not they have a functional relevance remains unknown.

      They then proceed to quantify synaptic glycine receptors in the striatum, demonstrating that the ventral striatum has a significantly higher amount of GLRB co-localized with gephyrin than the dorsal striatum or the hippocampus. They then recorded pharmacologically isolated glycinergic miniature inhibitory postsynaptic currents (mIPSCs) from striatal neurons. In line with their structural observations, these recordings confirmed the presence of synaptic glycinergic signaling in the ventral striatum, and an almost complete absence in the dorsal striatum. Together, these findings demonstrate that synaptic glycine receptors in the ventral striatum are present and functional, while an important contribution to dorsal striatal activity is less likely.

      Lastly, the authors use existing mRNA and protein datasets to show that the expression level of GLRA1 across the brain positively correlates with the presence of synaptic GLRB.

      The authors use lentiviral expression of mEos4b-tagged glycine receptor alpha1, alpha2, and beta subunits (GLRA1, GLRA2, GLRB) in cultured hippocampal neurons to investigate the ability of these subunits to cause the synaptic localization of glycine receptors. They suggest that the alpha1 subunit has a higher propensity to localize at the inhibitory postsynapse (labeled via gephyrin) than the alpha2 or beta subunits, and may therefore contribute to the distribution of functional synaptic glycine receptors across the brain.

      Major comments:

      - Are the key conclusions convincing?

      The authors are generally precise in the formulation of their conclusions.

      (1) They demonstrate a very low, but detectable, amount of a synaptically localized glycine receptor subunit in a transgenic (GlrB-mEos4b) mouse model. They demonstrate that the GLRB-mEos4b fusion protein is integrated into the iPSD as determined by gephyrin labelling. The authors do not perform functional tests of these receptors and do not state any such conclusions.

      (2) The authors show that GLRB-mEos4b is clearly detectable in the striatum and integrated into gephyrin clusters at a significantly higher rate in the ventral striatum compared to the dorsal striatum, which is in line with previous studies.

      (3) Adding to their quantification of GLRB-mEos4b in the striatum, the authors demonstrate the presence of glycinergic miniature IPSCs in the ventral striatum, and an almost complete absence of mIPSCs in the dorsal striatum. These currents support the observation that GLRB-mEos4b is more synaptically integrated in the ventral striatum compared to the dorsal striatum.

      (4) The authors show that lentiviral expression of GLRA1-mEos4b leads to a visually higher number of GLR clusters in cultured hippocampal neurons, and a co-localization of some clusters with gephyrin. The authors claim that this supports the idea that GLRA1 may be an important driver of synaptic glycine receptor localization. However, no quantification or statistical analysis of the number of puncta or their colocalization with gephyrin is provided for any of the expressed subunits. Such a claim should be supported by quantification and statistics.

      A thorough analysis and quantification of the data in Fig.5 has been carried out as requested by all the other reviewers (e.g. R1, point 3). The new data and results have been described in the revised manuscript (pp. 9-10, 13-14).

      - Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      One unaddressed caveat is the fact that a GLRB-mEos4b fusion protein may behave differently in terms of localization and synaptic integration than wild-type GLRB. While unlikely, it is possible that mEos4b interacts either with itself or synaptic proteins in a way that changes the fused GLRB subunit’s localization. Such an effect would be unlikely to affect synaptic function in a measurable way, but might be detected at a structural level by highly sensitive methods such as SMLM and STORM in regions with very low molecule numbers (such as the hippocampus). Since reliable antibodies against GLRB in brain tissue sections are not available, this would be difficult to test. Considering that no functional measures of the hippocampal detections exist, we would suggest that this possible caveat be mentioned for this particular experiment.

      This question has also been raised before (R1, point 5). According to an earlier study the mEos4b-GlyRb knock-in does not cause any obvious phenotypes, with the possible exception of minor loss of glycine potency (Maynard et al. 2021 eLife). The fact that the synaptic levels in the spinal cord in heterozygous animals are precisely half of those of homozygous animals argues against differences in receptor expression, heteropentameric assembly, forward trafficking to the plasma membrane and integration into the synaptic membrane as confirmed using quantitative super-resolution CLEM (Maynard et al. 2021 eLife). Accordingly, we did not observe any behavioural deficits in these animals, making it a powerful experimental model. We have added this information in the revised manuscript (p. 4). 

      In addition, without any quantification or statistical analysis, the authors’ claims regarding the necessity of GLRA1 expression for the synaptic localization of glycine receptors in cultured hippocampal neurons should probably be described as preliminary (Fig. 5).

      As mentioned before, we have substantially revised this part (R1, point 3). The quantification and analysis in the new Fig. 5 support our earlier interpretation.

      - Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      The authors show that there is colocalization of gephyrin with the mEos4b-GlyRβ subunit using dual-colour SMLM. This is a powerful approach that allows for a claim to be made on the synaptic location of the glycine receptors. The images presented in Figure 1, together with the distance analysis in Figure 2, display the co-localization of the fluorophores. The co-localization images in all the selected regions, hippocampus and striatum, also show detections outside of the gephyrin clusters, which the authors refer to as extrasynaptic. These small punctate clusters seem to have the same size as the ones detected and assigned as part of the synapse. It would be informative if the authors analysed the distribution, density and size of these nonsynaptic clusters, presented the data in the manuscript, and compared them against the synaptic ones. Validating this extrasynaptic signal by staining for a dendritic marker, such as MAP-2, or perhaps a somatic marker, and assessing the co-localization with the non-synaptic clusters would add further credibility to them being extrasynaptic. 

      The existence of extrasynaptic GlyRs is well attested in spinal cord neurons (e.g. Specht et al. 2013 Neuron; this study see Fig. S2). The fact that these appear as small clusters of detections in SMLM recordings results from the fact that a single fluorophore can be detected several times in consecutive image frames and because of blinking. Therefore, small clusters of detections likely represent single GlyRs (that can be counted), and not assemblies of several receptor complexes. Due to their diffusion in the neuronal membrane, they are seen as diffuse signals throughout the somatodendritic compartment in epifluorescence images (e.g. Fig. 5A). SMLM recordings of the same cells resolve this diffuse signal into discrete nanoclusters representing individual receptors (Fig. 5B). It is not clear what information co-localisation experiments with specific markers could provide, especially in hippocampal neurons, in which the copy numbers (and density) of GlyRs are next to zero.
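
      The counting idea in the reply above (repeated detections of one fluorophore form a small cluster that is counted once) can be illustrated with a minimal sketch. This is not the authors' analysis pipeline: the function name `count_molecules`, the merge distance `radius_nm`, and the example coordinates are all hypothetical, and the sketch uses plain single-linkage grouping in space only, ignoring the temporal information that a real SMLM analysis would likely use.

```python
from math import hypot


def count_molecules(detections, radius_nm=50.0):
    """Group detections by single-linkage clustering and return the cluster count.

    detections: list of (x_nm, y_nm) localization coordinates.
    radius_nm: hypothetical merge distance; two detections within this
               distance are assumed to come from the same fluorophore.
    """
    parent = list(range(len(detections)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (xi, yi) in enumerate(detections):
        for j in range(i + 1, len(detections)):
            xj, yj = detections[j]
            if hypot(xi - xj, yi - yj) <= radius_nm:
                # Merge the two clusters that contain detections i and j.
                parent[find(i)] = find(j)

    # Each distinct root corresponds to one putative molecule.
    return len({find(i) for i in range(len(detections))})


# Made-up example: three detections of one emitter, two of a distant second one.
example = [(100, 100), (105, 98), (97, 103), (900, 400), (905, 402)]
n_molecules = count_molecules(example)
```

      The merge distance is the critical assumption here: it only gives a meaningful count when clusters are sparse relative to that distance, which is the low-density regime the reply describes.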

      In addition we would encourage the authors to quantify the clustering and co-localization of virally expressed GLRA1, GLRA2, and GLRB with gephyrin in order to support the associated claims (Fig. 5). Preferably, the density of GLR and gephyrin clusters (at least on the somatic surface, the proximal dendrites, or both) as well as their co-localization probability should be quantified if a causal claim about subunit-specific requirements for synaptic localization is to be made.

      Quantification of the data has been carried out (new Fig. 5C,D). The results have been described before (R1, point 3) and support our earlier interpretation of the data (pp. 13-14).

      Lastly, even though it may be outside of the scope of such a study analysing other parts of the hippocampal area could provide additional important information. If one looks at the Allen Institute’s ISH of the beta subunit the strongest signal comes from the stratum oriens in the CA1 for example, suggesting that interneurons residing there would more likely have a higher expression of the glycine receptors. This could also be assessed by looking more carefully at the single cell transcriptomics, to see which cell types in the hippocampus show the highest mRNA levels. If the authors think that this is too much additional work, then perhaps a mention of this in the discussion would be good. 

      We have added the requested information from the ISH database of the Allen Institute in the discussion as suggested by the reviewer (p. 12). However, in combination with the transcriptomic data (Fig. S1), our findings strongly suggest that the expression of synaptic GlyRs depends on the availability of alpha subunits rather than on the presence of the GlyRb transcript. This is obvious when one compares the mRNA levels in the hippocampus with those in the basal ganglia (striatum) and medulla. While the transcript concentrations of GlyRb are elevated in all three regions and essentially the same, our data show that the GlyRb copy numbers at synapses differ by more than 2 orders of magnitude (Fig. 1B, Table 1). 

      - Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      Since the labeling and some imaging has been performed already, the requested experiment would be a matter of deploying a method of quantification. In principle, it should not require any additional wet-lab experiments, although it may require additional imaging of existing samples.

      - Are the data and the methods presented in such a way that they can be reproduced?

      Yes, for the most part.

      - Are the experiments adequately replicated and statistical analysis adequate?

      Yes

      Minor comments:

      - Specific experimental issues that are easily addressable.

      N/A

      - Are prior studies referenced appropriately?

      Yes

      - Are the text and figures clear and accurate?

      Yes, although quantification in figure 5 is currently not present.

      A quantification has been added (see R1, point 3).

      - Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      This paper presents a method that could be used to localize receptors and perhaps other proteins that are in low abundance or for which a detailed quantification is necessary. I would therefore suggest that Figure S4 is included into Figure 2 as the first panel, showcasing the demixing, followed by the results. 

      We agree in principle with this suggestion. However, the revised Fig. S4 is more complex and we think that it would distract from the data shown in Fig. 2. Given that Fig. S4 is mostly methodological and not essential to understand the text, we have kept it in the supplement for the time being. We leave the final decision on this point to the editor.

      Reviewer #4-2 (Significance): 

      [This review was supplied later]

      - Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      Using a novel, high-resolution method, the authors have provided strong evidence for the presence of glycine receptors in the murine hippocampus and in the dorsal striatum. The number of receptors calculated is small compared to the numbers found in the ventral striatum. This is the first study to quantify receptor numbers in these regions. In addition, it also lays out a roadmap for future studies addressing similar questions. 

      - Place the work in the context of the existing literature (provide references, where appropriate).

      This is done well by the authors in the curation of the literature. As stated above, the authors have filled a gap in the presence of glycine receptors in different brain regions, a subject of importance in understanding the role they play in brain activity and function. 

      - State what audience might be interested in and influenced by the reported findings.

      Neuroscientists working at the synaptic level, on inhibitory neurotransmission and on fundamental mechanisms of expression of genes at low levels and their relationship to the presence of the protein would be interested. Furthermore, researchers in neuroscience and cell biology may benefit from and be inspired by the approach used in this manuscript, to potentially apply it to address their own aims. 

      We thank the reviewer for the positive assessment of the technical and biological implications of our work, as well as the interest of our findings to a wide readership of neuroscientists and cell biologists. 

      - Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Synaptic transmission, inhibitory cells and GABAergic synapses functionally and structurally, cortex and cortical circuits. No strong expertise in super-resolution imaging methods.

    1. Focus is a publication by The International Defence Aid Fund.

      Focus on Political Repression in Southern Africa was a serial publication produced by the International Defence and Aid Fund (IDAF), an organization dedicated to documenting and challenging apartheid-era injustice. Published from the 1960s through the 1980s, Focus provided clear, accessible reporting on political trials, security laws, censorship, prisoner conditions, and everyday experiences of state repression across South Africa and neighboring countries.

      Written for an international audience, the publication combined investigative reporting, legal analysis, and first-hand testimony to expose abuses that were otherwise hidden from the global public. This digital collection brings together issues of Focus to support research, teaching, and general understanding of how apartheid systems operated—and how people and organizations resisted them.

    1. Reviewer #3 (Public review):

      Summary:

      This work provides an overview of the motor neuron landscape in the male reproductive system. Some work had been done to elucidate the circuits of ejaculation in the spine, as well as the cord, but this work fills a gap in knowledge at the level of the reproductive organs. Using complementary approaches, the authors show that there are two types of motor neurons that are mutually exclusive: neurons that co-express octopamine and glutamate and neurons that co-express serotonin and glutamate. They also show evidence that both types of neurons express large dense core vesicles, indicating that neuropeptides play a role in male fertility. This paper provides a thorough characterization of the expression of the different glutamate, octopamine and serotonin receptors in the different organs and tissues of the male reproductive system. The differential expression in different tissues and organs allows building initial theories on the control of emission and expulsion. Additionally, the authors characterize the expression of synaptic proteins and the neuromuscular junction sites. On a mechanistic level, the authors show that neither octopamine/glutamate neuron transmission nor glutamate transmission in serotonin/glutamate neurons is required for male fertility. This final result is quite surprising and opens up many questions on how ejaculation is coordinated.

      Strengths:

      This work fills an important gap in the characterization of innervation of the male reproductive system by providing an extensive characterization of the motor neurons and the potential receptors of motor neuron release. The authors show convincing evidence of glutamate/monoamine co-release and of mutual exclusivity of serotonin/glutamate and octopamine/glutamate neurons.

      Weaknesses:

      The experiment looking at peristaltic waves in the male organs is missing labeling of the different regions and quantification of the observed waves.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review): 

      Summary: 

      This very thorough anatomical study addresses the innervation of the Drosophila male reproductive tract. Two distinct glutamatergic neuron types were classified: serotonergic (SGNs) and octopaminergic (OGNs). By expansion microscopy, it was established that glutamate and serotonin/octopamine are co-released. The expression of different receptors for 5-HT and OA in muscles and epithelial cells of the innervation target organs was characterized. The pattern of neurotransmitter receptor expression in the target organs suggests that seminal fluid and sperm transport and emission are subjected to complex regulation. While silencing of abdominal SGNs leads to male infertility and prevents sperm from entering the ejaculatory duct, silencing of OGNs does not render males infertile. 

      Strengths: 

      The studied neurons were analysed with different transgenes and methods, as well as antibodies against neurotransmitter synthesis enzymes, building a consistent picture of their neurotransmitter identity. The careful anatomical description of innervation patterns together with receptor expression patterns of the target organs provides a solid basis for advancing the understanding of how seminal fluid and sperm transport and emission are subjected to complex regulation. The functional data showing that SGNs are required for male fertility and for the release of sperm from the seminal vesicle into the ejaculatory duct is convincing. 

      Weaknesses: 

      The functional analysis of the characterized neurons is not as comprehensive as the anatomical description, and phenotypic characterization was limited to simple fertility assays. It is understandable that a full functional dissection is beyond the scope of the present work. The paper contains experiments showing neuron-independent peristaltic waves in the reproductive tract muscles, which are thematically not very well integrated into the paper. Although very interesting, one wonders if these experiments would not fit better into a future work that also explores these peristaltic waves and their interrelation with neuromodulation mechanistically. 

      Reviewer #2 (Public review): 

      Summary: 

      Cheverra et al. present a comprehensive anatomical and functional analysis of the motor neurons innervating the male reproductive tract in Drosophila melanogaster, addressing a gap in our understanding of the peripheral circuits underlying ejaculation and male fertility. They identify two classes of multi-transmitter motor neurons-OGNs (octopamine/glutamate) and SGNs (serotonin/glutamate)-with distinct innervation patterns across reproductive organs. The authors further characterize the differential expression of glutamate, octopamine, and serotonin receptors in both epithelial and muscular tissues of these organs. Behavioral assays reveal that SGNs are essential for male fertility, whereas OGNs and glutamatergic transmission are dispensable. This work provides a high-resolution map linking neuromodulatory identity to organ-specific motor control, offering a valuable framework to explore the neural basis of male reproductive function. 

      Strengths: 

      Through the use of an extensive set of GAL4 drivers and antibodies, this work successfully and precisely defines the neurons that innervate the male reproductive tract, identifying the specific organs they target and the nature of the neurotransmitters they release. It also characterizes the expression patterns and localization of the corresponding neurotransmitter receptors across different tissues. The authors describe two distinct groups of dual-identity neurons innervating the male reproductive tract: OGNs, which co-express octopamine and glutamate, and SGNs, which co-express serotonin and glutamate. They further demonstrate that the various organs within the male reproductive system differentially express receptors for these neurotransmitters. Based on these findings, the authors propose that a single neuron capable of co-releasing a fast-acting neurotransmitter alongside a slower-acting one may more effectively synchronize and stagger events that require precise timing. This, together with the differential expression of ionotropic glutamate receptors and metabotropic aminergic receptors in postsynaptic muscle tissue, adds an additional layer of complexity to the coordinated regulation of fluid secretion, organ contractility, and directional sperm movement-all contributing to the optimization of male fertility. 

      Weaknesses: 

      The main weakness of the manuscript is the lack of detail in the presentation of the results. Specifically, all microscopy image figures are missing information about the number of samples (N), and in the case of colocalization experiments, quantitative analyses are not provided. Additionally, in the first behavioral section, it would be beneficial to complement the data table with figures similar to those presented later in the manuscript for consistency and clarity. 

      Wider context: 

      This study delivers the first detailed anatomical map connecting multi-transmitter motor neurons with specific male reproductive structures. It highlights a previously unrecognized functional specialization between serotonergic and octopaminergic pathways and lays the groundwork for exploring fundamental neural mechanisms that regulate ejaculation and fertility in males. The principles uncovered here may help explain how males of Drosophila and other organisms adjust reproductive behaviors in response to environmental changes. Furthermore, by shedding light on how multi-transmitter systems operate in reproductive control, this model could provide insights into therapeutic targets for conditions such as male infertility and prostate cancer, where similar neuronal populations are involved in humans. Ultimately, this genetically accessible system serves as a powerful tool for uncovering how multi-transmitter neurons orchestrate coordinated physiological actions necessary for the functioning of complex organs. 

      Reviewer #3 (Public review): 

      Summary: 

      This work provides an overview of the motor neuron landscape in the male reproductive system. Some work had been done to elucidate the circuits of ejaculation in the spine, as well as the cord, but this work fills a gap in knowledge at the level of the reproductive organs. Using complementary approaches, the authors show that there are two types of motor neurons that are mutually exclusive: neurons that co-express octopamine and glutamate and neurons that co-express serotonin and glutamate. They also show evidence that both types of neurons express large dense core vesicles, indicating that neuropeptides play a role in male fertility. This paper provides a thorough characterization of the expression of the different glutamate, octopamine, and serotonin receptors in the different organs and tissues of the male reproductive system. The differential expression in different tissues and organs allows building initial theories on the control of emission and expulsion. Additionally, the authors characterize the expression of synaptic proteins and the neuromuscular junction sites. On a mechanistic level, the authors show that neither octopamine/glutamate neuron transmission nor glutamate transmission in serotonin/glutamate neurons is required for male fertility. This final result is quite surprising and opens up many questions on how ejaculation is coordinated. 

      Strengths: 

      This work fills an important gap in the characterization of innervation of the male reproductive system by providing an extensive characterization of the motor neurons and the potential receptors of motor neuron release. The authors show convincing evidence of glutamate/monoamine co-release and of mutual exclusivity of serotonin/glutamate and octopamine/glutamate neurons. 

      Weaknesses: 

      (1) Often, it is mentioned that the expression is higher or lower or regional without quantification or an indication of the number of samples analysed. 

      (2) The experiment aimed at tracking sperm in the male reproductive system is difficult to interpret when it is not assessed whether ejaculation has occurred. 

      (3) The experiment looking at peristaltic waves in the male organs is missing labeling of the different regions and quantification of the observed waves. 

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors): 

      (1) While the peripheral innervations are very carefully described, it is not clear to which SGNs and OGNs (i.e., cell bodies in the central nervous system) these innervations belong. Are SV, AG, and ED innervated by branches of one neuron or by separate neurons? Multi-color flip-out experiments could provide an answer to this. 

      We agree this is important and are planning these experiments for a follow-up study.

      (2) In contrast, for the analysis of the VT19028 split line (Figure 9), only vnc and cell body images are shown. How do the arborisations of these split combinations look in the periphery? Are the same reproductive organs innervated as shown in Figure 2?

      Figure 9S3 was inadvertently omitted from the initial submission. That figure is now included and shows that the VT019028 split broadly innervates the SV, AG, and ED.

      (3) In the discussion, I think it would be helpful to offer some potential explanations for the role of octopaminergic and glutamatergic signaling. If not required for basic fertility, they probably have some other role.

      Thank you, we have included speculation in the Discussion section "Potential for adaptation to environment".

      (4) Line 543: Figure 8S4 E, (not 8E). 

      Correction made.

      Reviewer #2 (Recommendations for the authors): 

      (1) Line 213-217 

      Comment:

      The use of "significantly less expression" may be misleading, as no quantification or statistical analysis is provided to support this comparison. 

      Suggestion:

      Consider using a more neutral term, such as "markedly less" or "noticeably less," unless quantitative data and statistical analysis are included to substantiate the claim.

      Good recommendation. This suggestion has been incorporated.

      (2) Line 264-267 

      Comment:

      The observation regarding the distinct morphology of SGNs and OGNs is interesting and could strengthen the argument regarding functional differences. 

      Suggestion: 

      Consider including a quantification of morphological complexity (e.g., branching) to support the claim. A method such as Sholl analysis (Sholl, 1953), as adapted in Fernández et al., 2008, could be applied. 

      This is a good suggestion, and we will consider it as part of a follow-up study.

      (3) Line 269-271 

      Comment:

      The anatomical context of the observation is not explicitly stated. 

      Suggestion:

      Add "in the ED" for clarity: "With the TRH-GAL4 experiment in the ED, vGlut-40XMYC (Figure 5S1, A and E) and 6XV5-vMAT (Figure 5S1, B and F) were both present with a highly overlapping distribution (Figure 5S1, I)." 

      Suggestion has been incorporated.

      (4) Line 275-276 

      Comment:

      The claim about the reduced ability to distinguish SGNs and OGNs in the ED would benefit from quantitative support. 

      Suggestion:

      Include a morphological comparison or quantification between SGNs and OGNs in the ED and SV to reinforce this point.

      Some morphological comparisons can be inferred from the images themselves, and we will include quantification in a follow-up study.

      (5) Line 277-279 

      Comment:

      As with line 269, the anatomical site could be specified more clearly. 

      Suggestion: 

      Rephrase as: "With the Tdc2-GAL4 experiment in the ED, vGlut-40XMYC (Figure 5S1, M and Q) and 6XV5-vMAT (Figure 5S1, N and R) were both observed in a highly overlapping distribution (Figure 5S1, U)." 

      Suggestion has been incorporated.

      (6) Line 348-350 

      Comment:

      The phrase "significantly higher density" implies a statistical comparison that is not shown. 

      Suggestion:

      If no quantification is provided, replace with a qualitative term such as "visibly higher" or "notably more dense." Alternatively, add a quantitative analysis with statistical testing to justify the use of "significantly." 

      Suggestion has been incorporated.

      (7) Lines 415-458 (Section comment) 

      Comment:

      There appears to be differential localization of neurotransmitter receptor expression (glutamate in muscle vs. 5-HT in epithelium or neurons), which could have functional implications. 

      Suggestion:

      Expand this section to briefly discuss the differential localization patterns of these receptors and potential implications for signal transduction in male reproductive tissues. 

      (8) Lines 638-682 (Section comment) 

      Comment:

      The table summarizing fertility phenotypes would be more informative with additional detail on experimental outcomes. 

      Suggestion:

      Add a column showing the number of fertile males over the total tested (e.g., "n fertile / n total"). Also, clarify whether the fertility assays are identical to those reported in Figure 10S2, and whether similar analyses were conducted for females. Consider including a figure summarizing fertility results for all genotypes listed in the table, similar to Figure 10S2. 

      The fertility tests reported in Table 1 were separate from those reported in Figure 10S2.  For these tests, the results were clear-cut with 100% of males and females reported as infertile exhibiting the infertile phenotype.  For the males and females reported as fertile, it was also clear-cut with nearly 100% showing fertility at a high level.  In subsequent figures we attempted to assess degrees of fertility.

      (9) Line 724-727 

      Comment:

      There seems to be a mistake in the identification of the driver lines used to silence OA neurons. Also, figure references might be incorrect. 

      Suggestion:

      The OA neuron driver line should be corrected to "Tdc2-GAL4-DBD ∩ AbdB-AD" instead of TRH-GAL4. Additionally, the figure references should be verified; specifically, the letter "B" (in "Figure 10B, D" and "10B, E") appears to be unnecessary or misplaced.

      Thanks for catching this, the corrections have been made.

      (10) Line 872-877 

      Comment:

      The discussion on the co-release of fast-acting glutamate and slower aminergic neurotransmitters is interesting and well-articulated. However, it remains somewhat disconnected from the behavioral findings. 

      Suggestion:

      Consider linking this proposed mechanism to the results observed in the mating duration assays. For instance, the sequential action of neurotransmitters described here could potentially underlie the prolonged mating observed when specific neuromodulators are active, helping to functionally integrate molecular and behavioral data. 

      (11) Line 926-928 

      Comment:

      The interpretation of 5-HT7 receptor expression in the sphincter is compelling, suggesting a role in regulating its function. However, this anatomical observation could be further contextualized with the functional data. 

      Suggestion:

      It may strengthen the interpretation to explicitly connect this finding with the fertility assays, where SGNs - presumably acting via serotonergic signaling - are shown to be necessary for male fertility. This would support a functional role for 5-HT7 in reproductive success via sphincter regulation.

      This has been added. 

      (12) Figure 1 

      Comment:

      The figure legend is generally clear, but could benefit from more consistency and precision in the color-coded labeling. Additionally, the naming of some structures could be more explicit. 

      Suggestion: 

      Revise the figure and the legend as follows:

      Figure 1. The Drosophila male reproductive system. A) Schematic diagram showing paired testes (colour), SVs (green), AGs (purple), Sph (red), ED (gray), and EB (colour). B) Actual male reproductive system. Te - testes, SV - seminal vesicle, AG - accessory gland, Sph - singular sphincter, ED - ejaculatory duct, EB - ejaculatory bulb. Scale bar: 200 µm.

      This suggestion has been incorporated.

      (13) Figure 3S2 

      Comment:

      There appears to be a typographical error in the description of the genotypes, which may lead to confusion. 

      Suggestion:

      Correct the legend to reflect the appropriate genotypes:

      Figure 3S2. Expression of vGlut-LexA and Tdc2-GAL4 in the Drosophila male reproductive system. A, D, G, J, M, P) vGlut-LexA, LexAop-6XmCherry; B, E, H, K, N, Q) Tdc2-GAL4, UAS-6XGFP; C, F, I, L, O, R) Overlay. Scale bars: O - 50 µm; R - 10 µm.

      The corrections have been made.

      (14) Figure 3S3

      Comment:

      The genotypes for panels D and E appear to be incomplete; the DBD component of the split-GAL4 drivers is missing. 

      Suggestion:

      Update the figure legend to: 

      Figure 3S3. Fruitless and Doublesex expression in the Drosophila male reproductive system. A) fru-GAL4, UAS-6XGFP; B) vGlut-LexA, LexAop-6XmCherry; C) Overlay; D) Tdc2-AD ∩ dsx-GAL4-DBD; E) TRH-AD ∩ dsx-GAL4-DBD. Scale bar: 200 µm.

      The corrections have been made.

      (15) Figure 4S4 

      Comment: 

      There is a repeated segment in the figure legend, which makes it unclear and redundant. 

      Suggestion:

      Edit the legend to remove the duplicated lines: 

      Figure 4S4. Expression of vGlut, TβH-GFP, and 5-HT at the junction of the SV and AGs with the ED of the Drosophila male reproductive system. A) vGlut-40XV5; B) TβH-GFP; C) 5-HT; D) vGlut-40XV5, TβH-GFP overlay; E) vGlut-40XV5, 5-HT overlay; F) TβH-GFP, 5-HT overlay. Scale bar: 50 µm.

      The correction has been made.

      (16) Figure 6S5 

      Comment:

      Within this figure, the orientation and/or scale of the tissue varies noticeably between individual panels, making it difficult to directly compare the different experimental conditions. 

      Suggestion:

      For improved clarity and interpretability, consider standardizing the orientation and size of the tissue shown across all panels within the figure. Consistent presentation will facilitate direct comparisons between treatments or genotypes. 

      There is often variation in the size of the male reproductive organs. All images were acquired at the same magnification. The only point of this figure is that there is no vGAT or vAChT at these NMJs, and the result is unambiguously negative. 

      (17) Figure 10 

      Comment:

      Panel A appears redundant, as it shows the same information as the other panels but without indicating statistical significance. 

      Suggestion:

      Consider removing panel A and keeping only the remaining four graphs, which include relevant statistical comparisons and clearly show significant differences.

      We realize there is some redundancy of panel A with the other panels, but we feel there is value in having all the genotypes in a single panel for comparison.

      Reviewer #3 (Recommendations for the authors): 

      Here are some suggestions to improve the manuscript: 

      (1) Prot B GFP experiment: the authors should explain better the time chosen to look at the sperm content of the male reproductive system. At 10 minutes, it is expected that the male has already ejaculated, and therefore, a failure to ejaculate would result in more sperm in the reproductive system, not less. Since we are not certain when the male ejaculates, it would be important to do the analysis at different time points.

      In the Prot-GFP experiments, the 10-minute time point was chosen because we nearly always observe sperm in the ejaculatory duct of control males. In the experimental males, we never observed sperm in the ejaculatory duct at this time point. Also, no Prot-GFP sperm were observed in the reproductive tract of females mated to experimental males even when mating was allowed to go to completion, while abundant sperm were found in females mated to Prot-GFP controls. Figure 10S1 has been updated to include images of these female reproductive systems. The absence of Prot-GFP sperm in the reproductive tract of females mated to experimental males indicates that sperm transfer in these males is not occurring earlier during copulation than in control males, and that we did not miss it by examining only the ejaculatory duct.

      (2) Discuss what may be the role of the octopamine/glutamate neurons and glutamate transmission in serotonin/glutamate neurons in the male reproductive system, given that they are not required for fertility (at least under the context in which it was tested). It is quite a striking result that deserves some attention. 

      We agree it is a surprising result and have included speculation on the role of glutamate and octopamine in male reproduction in the Discussion section "Potential for adaptation to environment".

      (3) Very important: 

      (a) Figure 3 is present in the Word document but not the PDF. 

      (b) Figure 9S3 is not present 

      (c) In Figure 5 X), the legend does not correspond to the panel.

      All of these corrections have been made. 

      (4) Other suggestions:

      (a) A summary schematic (or several) of the findings would make it an easier read.

      (b) Explain why the ejaculatory bulb was left out of the analysis.

      (c) Explain in the main text some of the tools, such as BONT-C and the conditional vGlut mutation.

    1. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Arnould et al. develop an unbiased, affinity-guided reagent to label P2X7 receptor and use super-resolution imaging to monitor P2X7 redistribution in response to inflammatory signaling.

      Strengths:

      I think the X7-uP probe that they developed is very useful for visualizing localization of P2X7 receptor. They convincingly show that under inflammatory conditions, there is a reorganization of P2X7 localization into receptor clusters. Moreover, I think they have shown a very clever way to specifically label any receptor of interest. This has broad appeal.

      I think the authors have done a very nice job addressing my original concerns. Here are those original concerns and my new comments related to how the authors address them.

      (1) While the authors state that chemical modification of AZ10606120 to produce the X7-UP reagent has "minimal impact" on the inhibition of P2X7, we can see from Figure 2A and 2B that it does not antagonize P2X7 as effectively as the original antagonist. For the sake of completeness and quantitation, I think it would be great if the authors could determine the IC50 for X7-uP and compare it to the IC50 of AZ10606120.

      The authors now show the relative inhibition of X7-uP compared to AZ10606120 at different concentrations. This provides a nice comparison to give the reader an idea of how effectively X7-uP inhibits P2X7 receptor. This is great.

      (2) Do the authors know whether modification of the lysines with biotin affects the receptor's affinity for ATP (or ability to be activated by ATP)? What about P2X7 that has been modified with biotin and then labeled with Alexa 647? For the sake of completeness and quantitation, I think it would be great if the authors could determine the EC50 of biotinylated P2X7 for ATP as well as biotinylated and then Alexa 647 labeled P2X7 for ATP and compare these values to the affinity of unmodified WT P2X7 for ATP.

      I agree with the authors that assessing the functional integrity of P2X7 following biotinylation and fluorophore labeling is outside the scope of this paper but would be important for studies involving dynamic or post-labeling functional analyses such as live trafficking.

      (3) It is a little misleading to color the fluorescence signal from mScarlet green (for example, in Figure 3 and Figure 4). The fluorescence is not at the same wavelength as GFP. In fact, the wavelength (570 nm - 610 nm) for emission is closer to orange/red than to green. I think this color should be changed to differentiate the signal of mScarlet from the GFP signal used for each of the other P2X receptor subtypes.

      The authors have now changed the mScarlet color to orange, which solves my concern.

      (4) It is my understanding that P2X6 does not form homotrimers. Thus, I was a little surprised to see that the density and distribution of P2X6-GFP in Figure 3 looks very similar to the density and distribution of the other P2X subtypes. Do the authors have an explanation for this? Are they looking at P2X6 protomers inserted into the plasma membrane? Does the cell line have endogenous P2X receptor subtypes? Is Figure 3 showing heterotrimers with P2X6 receptor? A little explanation might be helpful.

      The authors address this point very well and include nice data to show that P2X6 does not insert into the plasma membrane as a homotrimer.

      (5) It is easy to overlook the fact that the antagonist leaves the binding pocket once the biotin has been attached to the lysines. It might be helpful if the authors made this a little more apparent in Figure 1 or in the text describing the NASA chemistry reaction.

      The authors have modified Figure 1 to make it easier to understand the NASA chemistry reaction.

      I congratulate the authors on an outstanding paper!

    2. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review): 

      Summary: 

      In this paper, the authors developed a chemical labeling reagent for P2X7 receptors, called X7-uP. This labeling reagent selectively labels endogenous P2X7 receptors with biotin based on ligand-directed NASA chemistry (Ref. 41). After labeling the endogenous P2X7 receptor with biotin, the receptor can be fluorescently labeled with streptavidin-AlexaFluor647. The authors carefully examined the binding properties and labeling selectivity of X7-uP to P2X7, characterized the labeling site of P2X7 receptors, and demonstrated fluorescence imaging of P2X7 receptors. The data obtained by SDS-PAGE, Western blot, and fluorescence microscopy clearly show that X7-uP labels the P2X7 receptor. Finally, the authors fluorescently labeled the endogenous P2X7 in BV2 cells, which are a murine microglia model, and used dSTORM to reveal a nanoscale P2X7 redistribution mechanism under inflammatory conditions at high resolution. 

      Strengths: 

      X7-uP selectively labels endogenous P2X7 receptors with biotin. Streptavidin-AlexaFluor647 binds to the biotin labeled to the P2X7 receptor, allowing visualization of endogenous P2X7 receptors. 

      We thank the reviewer for their positive comment.

      Weaknesses: 

      Weaknesses & Comments 

      (1) The P2X7 receptor exists in a trimeric form. If it is not a monomer under the conditions of the pull-down assay in Figure 2C, the quantitative values may not be accurate. 

      We thank the reviewer for this comment. As shown in Figure 2C, the band observed on the denaturing SDS-PAGE corresponds to the monomeric form of the P2X7 receptor. While we cannot exclude the presence of non-monomeric species under native conditions, no such higher-order forms are visible in the gel. This observation supports the conclusion that the quantitative values presented are based on the monomeric form and are therefore reliable.

      (2) In Figure 3, GFP fluorescence was observed in the cell. Are all types of P2X receptors really expressed on the cell surface? 

      We thank the reviewer for this excellent comment, which was also raised by reviewer 2. To address this concern, we performed a commercial cell-surface protein biotinylation assay to assess whether GFP-tagged P2X receptors reach the plasma membrane. As expected, all P2X subtypes except P2X6 were detected at the cell surface in HEK293T cells, thereby validating our confocal fluorescence microscopy assay. These new data are now included in Figure 3 — figure supplement 1.

      (3) The reviewer was not convinced of the advantages of the approach taken in this paper, because the endogenous receptor labeling in this study could also be done using conventional antibody-based labeling methods. 

      We thank the reviewer for raising this important point and would like to highlight several advantages of our approach compared to conventional antibody-based labeling.

      First, commercially available P2X7 antibodies often suffer from poor specificity and are generally not suitable for reliably detecting endogenous P2X7 receptors, as documented in previous studies (e.g., PMID: 16564580 and PMID: 15254086). While recent advances have been made using nanobodies with improved specificity for P2X7 (e.g., PMID: 30074479 and PMID: 38953020), our strategy is distinct and complementary to nanobody-based approaches.

      Second, antibodies rely on non-covalent interactions with the receptor, which can result in dissociation over time. In contrast, our X7-uP probe covalently biotinylates lysine residues on the P2X7 receptor through stable amide bond formation. This covalent labeling ensures that the biotin moiety remains permanently attached, an advantage not afforded by reversible binding strategies.

      Third, by selectively biotinylating P2X7 receptors, our method provides a versatile platform for the chemical attachment of a wide range of probes or functional moieties. Although we did not demonstrate this application in the current study, we believe this modularity represents an additional advantage of our approach.

      We have now revised the discussion to highlight these key advantages, allowing the reader to form their own opinion. We hope this addresses the reviewer’s concerns and clarifies the benefits of our approach.

      (4) Although P2X7 was successfully labeled in this paper, it is not new as a chemistry. There is a need for more attractive functional evaluation such as live trafficking analysis of endogenous P2X7. 

      We agree with the reviewer that the underlying chemistry is not novel per se. However, to our knowledge, it has not previously been applied to the P2X7 receptor, and thus constitutes a novel application with specific relevance for studying native P2X7 biology.

      We also appreciate the reviewer’s suggestion regarding live trafficking analysis of endogenous P2X7. While this is indeed a valuable and interesting direction, we believe it lies beyond the scope of the present study, as it would first require demonstrating that the labeling itself does not affect P2X7 function (see below). This important step would necessitate additional experiments, which we consider more appropriate for a follow-up investigation.

      (5) The reviewer has concerns that the use of the large-size streptavidin to label the P2X7 receptor may perturb the dynamics of the receptor. 

      We thank the reviewer for raising this important point. Although we did not directly measure receptor dynamics, it is indeed possible that tetrameric streptavidin (tStrept-A 647) could promote P2X7 clustering by cross-linking nearby receptors due to its tetravalency (see also point 7 raised by the reviewer). To address this concern, we performed additional dSTORM experiments using a monomeric form of streptavidin-Alexa 647 (mSA) (see PMID: 26979420). Owing to its reduced size and lack of tetravalency, mSA has been shown to minimize artificial crosslinking of synaptic receptors (PMID: 26979420). A drawback of using mSA, however, is that the monomeric form carries only two fluorophores (estimated degree of labeling, DOL ≈ 2, PMID: 26979420), whereas the tetrameric form, according to the manufacturer’s certificate of analysis (Invitrogen S21374), has an average DOL of three fluorophores per monomer, resulting in a total of ~12 fluorophores per streptavidin.

      We tested three conditions with mSA incubation: (i) control BV2 cells (without X7-uP), (ii) untreated X7-uP-labeled BV2 cells, and (iii) X7-uP-labeled BV2 cells treated with LPS and ATP (using the same concentrations and incubation times described in the manuscript). As shown in Author response image 1, only LPS+ATP treatment induced a clear increase in the mean cluster density compared to quiescent (untreated) BV2 cells. This effect closely matches the results obtained with tStrept-A 647, supporting the conclusion that the tetrameric streptavidin does not artificially promote P2X7 clustering. It is also possible that the cellular environment of BV2 microglia differs from the confined architecture of synapses, which may further explain why cross-linking effects are less pronounced in our system.

      As expected, the overall fluorescence signal with mSA was about tenfold lower than with tStrept-A 647, consistent with the expected fluorophore stoichiometry. This lower signal may explain why the values for the untreated condition appeared slightly higher than for the control, although the difference was not statistically significant (P = 0.1455).

      We hope these additional experiments adequately address the reviewer’s concerns.

      Author response image 1.

      BV2 labeling with monomeric streptavidin–Alexa 647 (mSA). (A) Bright-field and dSTORM images of BV2 cells labeled with mSA in the presence (untreated and LPS+ATP) or absence (control) of 1 µM X7-uP. Treatment: LPS (1 µg/mL for 24 hours) and ATP (1 mM for 30 minutes). Scale bars, 10 µm. Insets: Magnified dSTORM images. Scale bars, 1 µm. (B) Quantification of the number of localizations (n = 2 independent experiments). Bars represent mean ± s.e.m. One-way ANOVA with Tukey’s multiple comparisons (P values are indicated above the graph).

      (6) It is better to directly label Alexa647 to the P2X7 receptor to avoid functional perturbation of P2X7. 

      Direct labeling of the P2X7 receptor with Alexa647 would require the design and synthesis of a novel probe, which is currently not available. Implementing such a strategy would involve substantial new experimental work that lies beyond the scope of the present study.

      (7) In all imaging experiments, the addition of streptavidin, which acts as a cross-linking agent, may induce P2X7 receptor clustering. This concern would be dispelled if the receptors were labeled with a fluorescent dye instead of biotin and observed. 

      We refer the reviewer to our response in point 5, where we addressed this concern by comparing tetrameric and monomeric streptavidin conjugates. As noted above (see also point 6), directly labeling the receptor with a fluorescent dye would require the development of a new probe, which is outside the scope of the present study.

      (8) There are several mentions of microglia in this paper, even though they are not used. This can lead to misunderstanding for the reader. The author conducted functional analysis of the P2X7 receptor in BV-2 cells, which are a model cell line but not microglia themselves. The text should be reviewed again and corrected to remove the misleading parts that could lead to misunderstanding. e.g. P8. lines 361-364

      First, it combines N-cyanomethyl NASA chemistry with the high-affinity AZ10606120 ligand, enabling rapid labeling in microglia (within 10 min)

      P8. lines 372-373 

      Our results not only confirm P2X7 expression in microglia, as previously reported (6, 26-33), but also reveal its nanoscale localization at the cell surface using dSTORM. 

      We agree with the reviewer’s comment. We have now modified the text, including the title.

      Reviewer #2 (Public review): 

      Summary: 

      In this manuscript, Arnould et al. develop an unbiased, affinity-guided reagent to label P2X7 receptor and use super-resolution imaging to monitor P2X7 redistribution in response to inflammatory signaling. 

      Strengths: 

      I think the X7-uP probe that they developed is very useful for visualizing localization of P2X7 receptor. They convincingly show that under inflammatory conditions, there is a reorganization of P2X7 localization into receptor clusters. Moreover, I think they have shown a very clever way to specifically label any receptor of interest. This has broad appeal.

      We thank the reviewer for their positive comment.

      Weaknesses: 

      Overall, the manuscript is novel and interesting. However, I do have some suggestions for improvement. 

      (1) While the authors state that chemical modification of AZ10606120 to produce the X7-UP reagent has "minimal impact" on the inhibition of P2X7, we can see from Figure 2A and 2B that it does not antagonize P2X7 as effectively as the original antagonist. For the sake of completeness and quantitation, I think it would be great if the authors could determine the IC50 for X7-uP and compare it to the IC50 of AZ10606120. 

      We thank the reviewer for this insightful comment. Unfortunately, due to the limited availability of X7-uP, we were not able to establish a complete concentration–response curve to determine its IC<sub>50</sub>, which would require testing at concentrations >1 µM. Nevertheless, to estimate the effect of the modification, we assessed current inhibition at 300 µM X7-uP and compared it with the reported IC<sub>50</sub> of AZ10606120 (10 nM). Under these conditions, both compounds produced a similar level of inhibition, indicating that while the chemical modification reduces potency relative to AZ10606120, X7-uP still functions as an effective probe for P2X7. We have now included these data in Figure 2 and revised the text accordingly.

      (2) Do the authors know whether modification of the lysines with biotin affects the receptor's affinity for ATP (or ability to be activated by ATP)? What about P2X7 that has been modified with biotin and then labeled with Alexa 647? For the sake of completeness and quantitation, I think it would be great if the authors could determine the EC50 of biotinylated P2X7 for ATP as well as biotinylated and then Alexa 647 labeled P2X7 for ATP and compare these values to the affinity of unmodified WT P2X7 for ATP.

      We thank the reviewer for raising this important point. At present, we have not determined whether modification of lysine residues with biotin, or subsequent labeling with Alexa647, affects the ATP sensitivity or functional properties of P2X7. However, we believe this does not impact the conclusions of the current study, as all functional assays were conducted prior to X7-uP labeling. The labeling is used here as a terminal "snapshot" to visualize the endogenous receptor without interfering with the functional characterization.

      We fully agree that assessing the functional integrity of P2X7 following biotinylation and fluorophore labeling—such as by determining the EC<sub>50</sub> for ATP—would be essential for studies involving dynamic or post-labeling functional analyses, such as live trafficking. However, as noted earlier in our response to Reviewer 1 (point 4), these experiments lie beyond the scope of the current study.

      (3) It is a little misleading to color the fluorescence signal from mScarlet green (for example, in Figure 3 and Figure 4). The fluorescence is not at the same wavelength as GFP. In fact, the wavelength (570 nm - 610 nm) for emission is closer to orange/red than to green. I think this color should be changed to differentiate the signal of mScarlet from the GFP signal used for each of the other P2X receptor subtypes. 

      As suggested, we changed the mScarlet color to orange for all relevant figures.

      (4) It is my understanding that P2X6 does not form homotrimers. Thus, I was a little surprised to see that the density and distribution of P2X6-GFP in Figure 3 looks very similar to the density and distribution of the other P2X subtypes. Do the authors have an explanation for this? Are they looking at P2X6 protomers inserted into the plasma membrane? Does the cell line have endogenous P2X receptor subtypes? Is Figure 3 showing heterotrimers with P2X6 receptor? A little explanation might be helpful.

      We thank the reviewer for raising this important point. Indeed, it is well established that P2X6 does not form functional channels, which supports the conclusion that it does not form homotrimeric complexes. Although previous studies have shown that P2X6–GFP expression is generally lower, more diffuse, and not efficiently targeted to the cell surface compared with other P2X subtypes (see PMID: 12077178), the similar fluorescence distribution and density observed in our Figure 3 do not imply that P2X6 forms homotrimers.

      We did not directly assess the presence of endogenous P2X6 in our HEK293T cells; however, according to the Human Protein Atlas, there is no detectable P2X6 RNA expression in HEK293 cells (nTPM = 0), indicating that endogenous P2X6 is not expressed in this cell line. To further investigate surface expression (see also point 2 of reviewer 1), we performed a commercial cell-surface protein biotinylation assay to assess whether GFP-tagged P2X6 reaches the plasma membrane. As expected, P2X6 was not detected at the cell surface in HEK293T cells, whereas GFP-tagged P2X1 to P2X5 were readily detected. These results further support the conclusion that P2X6 does not insert into the plasma membrane as a homotrimer, thereby validating our confocal fluorescence microscopy assay. These new data are now included in Figure 3 — figure supplement 1.

      (5) It is easy to overlook the fact that the antagonist leaves the binding pocket once the biotin has been attached to the lysines. It might be helpful if the authors made this a little more apparent in Figure 1 or in the text describing the NASA chemistry reaction.

      We thank the reviewer for this insightful suggestion. To address this, we have modified Figure 1A and updated the legend.

      Reviewer #3 (Public review): 

      Summary: 

      This manuscript describes the development of a covalent labeling probe (X7-uP) that selectively targets and tags native P2X7 receptors at the plasma membrane of BV2 microglial cells. Using super-resolution imaging (dSTORM), the authors demonstrate that P2X7 receptors form nanoscale clusters upon microglial activation by lipopolysaccharide (LPS) and ATP, correlating with synergistic IL-1β release. These findings advance understanding of P2X7 reorganization during inflammation and provide a generalizable labeling strategy for monitoring endogenous P2X7 in immune cells. 

      Strengths: 

      (1) The authors designed X7-uP by coupling a high-affinity, P2X7-specific antagonist (AZ10606120) with N-cyanomethyl NASA chemistry to achieve site-directed biotinylation. This approach offers high specificity, minimal off-target reactivity, and a straightforward pull-down/imaging readout. 

      (2) The results connect P2X7's nanoscale clustering directly with IL-1β secretion in microglia, reinforcing the role of P2X7 in inflammation. By localizing endogenous P2X7 at single-molecule resolution, the authors reveal how LPS priming and ATP stimulation synergistically reorganize the receptor. 

      (3) The authors systematically validate their method in recombinant systems (HEK293 cells) and in BV2 cells, showing selective inhibition, mutational confirmation of the binding site, and Western blot pulldown experiments.

      We thank the reviewer for their positive comment.

      Weaknesses: 

      (1) While the data strongly indicate that P2X7 clustering contributes to IL-1β release, the manuscript would benefit from additional experiments (if feasible) or discussion on how receptor clustering interfaces with downstream inflammasome assembly. Clarification of whether the P2X7 clusters physically colocalize with known inflammasome proteins would solidify the mechanism. 

      We thank the reviewer for this valuable suggestion. Determining the physical colocalization of P2X7 clusters with known inflammasome components would provide important insight into the molecular partners involved in inflammasome activation. However, we believe that such an investigation would constitute a substantial study on its own and therefore lies beyond the scope of the present work.

      Nevertheless, in response to the reviewer’s suggestion, we have added a short paragraph at the end of the Discussion section addressing potential mechanisms by which P2X7 clustering may contribute to downstream inflammasome activation. We also revised the text to tone down the hypothesis of physical colocalization.

      (2) The authors might expand on the scope of X7-uP in other native cells that endogenously express P2X7 (e.g., macrophages, dendritic cells). Although they mention the possibility, demonstrating the probe's applicability in at least one other primary immune cell type would strengthen its general utility. 

      We thank the reviewer for this valuable suggestion. Again, we believe that such an investigation would constitute a substantial study on its own and therefore lies beyond the scope of the present work.

      (3) The authors do include appropriate negative controls, yet providing additional details (e.g., average single-molecule on-time or blinking characteristics) in supplementary materials could help readers assess cluster calculations. 

      As suggested, we have included additional data showing single-molecule blinking events in untreated and LPS+ATP-treated BV2 cells, along with the corresponding movies. The data are now presented in Figure 5—supplement figure 3A and B and Figure 5—Videos 1 and 2.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors): 

      (1) On line 96, the authors refer to the "ballast" domain of P2X7 receptor but do not cite the original article from which this nomenclature originated (McCarthy et al., 2019, Cell). This article should be cited to give appropriate credit. 

      Done.

      (2) On line 602, the authors state that they use models from PDB 1MK5 and 6U9W to generate the cartoons in Figure 6. The manuscripts from which these PDB files were generated need to be appropriately cited. 

      Done.

      (3) On line 319, the authors say "300 mM BzATP" but I think they mean 300 uM.

      Done. Thank you for catching the typo.

      Reviewer #3 (Recommendations for the authors): 

      Overall, excellent data quality. The paper would benefit from a discussion of the physiological implications of clustering. It would also be helpful to elaborate on the potential mechanisms for clustering: diffusion and/or insertion. Finally, the authors should comment on work by the MacKinnon (PMID: 39739811) and Santana (PMID: 31371391) labs on two distinct models for clustering of proteins. 

      As suggested by the reviewer, we have revised the discussion to incorporate their comments. First, we have added the following text:

      “Upon BV2 activation, we observed significant nanoscale reorganization of P2X7. Both LPS and ATP (or BzATP) trigger P2X7 upregulation and clustering, increasing the overall number of surface receptors and the number of receptors per cluster, from one to three (Figure 6). By labeling BV2 cells with X7-uP shortly after IL-1β release, we were able to correlate the nanoscale distribution of P2X7 with the functional state of BV2 cells, consistent with the two-signal, synergistic model for IL-1β secretion observed in microglia and other cell types (Ferrari et al, 1996; Perregaux et al, 2000; Ferrari et al, 2006; Di Virgilio et al, 2017; He et al, 2017; Swanson et al, 2019). In this model, LPS priming leads to intracellular accumulation of pro-IL-1β, while ATP stimulation activates P2X7, triggering NLRP3 inflammasome activation and the subsequent release of mature IL-1β.

      What is the mechanism underlying P2X7 upregulation that leads to an overall increase in surface receptors: does it result from the lateral diffusion of previously masked receptors already present at the plasma membrane, or from the insertion of newly synthesized receptors from intracellular pools in response to LPS and ATP? Although our current data do not distinguish between these possibilities, a recent study suggests that the α1 subunit of the Na<sup>+</sup>/K<sup>+</sup>-ATPase (NKAα1) forms a complex with P2X7 in microglia, including BV2 cells, and that LPS+ATP induces NKAα1 internalization (Huang et al, 2024). This internalization appears to release P2X7 from NKAα1, allowing P2X7 to exist in its free form. We speculate that the internalization of NKAα1 induced by both LPS and ATP exposes previously masked P2X7 sites, including the allosteric AZ10606120 sites, thus making them accessible for X7-uP labeling.”

      Second, we have added a short paragraph at the end of the Discussion section addressing potential mechanisms by which P2X7 clustering may contribute to downstream inflammasome activation:

      “What mechanisms underlie P2X7 clustering in response to inflammatory signals? Several models have been proposed to explain membrane protein clustering, including recruitment to structural scaffolds (Feng & Zhang, 2009), partitioning into membrane domains enriched in specific chemical components such as lipid rafts (Simons & Ikonen, 1997), and self-assembly mechanisms (Sieber et al, 2007). These self-assembly mechanisms include an irreversible stochastic model (Sato et al, 2019) and a more recent reversible self-oligomerization model which gives rise to higher-order transient structures (HOTS) (Zhang et al, 2025). Supported by cryogenic optical localization microscopy with very high resolution (~5 nm), the HOTS model has been observed in various membrane proteins, including ion channels and receptors (Zhang et al, 2025). Furthermore, HOTS are suggested to be dynamically modulated and to play a functional role in cell signaling, potentially influencing both physiological and pathological processes (Zhang & MacKinnon, 2025). While this hypothesis is compelling, our current dSTORM data lack sufficient spatial resolution to confirm whether P2X7 trimers form HOTS via self-oligomerization. Further biophysical and ultra-high-resolution imaging studies are required to test this model in the context of P2X7 clustering.”

    1. Reviewer #3 (Public review):

      Summary:

      The authors present a clearly written and beautifully presented piece of work demonstrating clear evidence to support the idea that BK channels and Cav1.3 channels can co-assemble prior to their insertion in the plasma membrane.

      Strengths:

      The experimental records shown back up their hypotheses and the authors are to be congratulated for the large number of control experiments shown in the ms.

    2. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public review):

      Summary:

      This manuscript by Pournejati et al investigates how BK (big potassium) channels and CaV1.3 (a subtype of voltage-gated calcium channels) become functionally coupled by exploring whether their ensembles form early, during synthesis and intracellular trafficking, rather than only after insertion into the plasma membrane. To this end, the authors use the PLA technique to assess the formation of ion channel associations in the different compartments (ER, Golgi or PM), single-molecule RNA in situ hybridization (RNAscope), and super-resolution microscopy.

      Strengths:

      The manuscript is well written and addresses an interesting question, combining a range of imaging techniques. The findings are generally well-presented and offer important insights into the spatial organization of ion channel complexes, both in heterologous and endogenous systems.

      Weaknesses:

      The authors have improved their manuscript after revisions, and some previous concerns have been addressed.

      Still, the main concern about this work is that the current experiments do not quantitatively or mechanistically link the ensembles observed intracellularly (in the endoplasmic reticulum (ER) or Golgi) to those found at the plasma membrane (PM). As a result, it is difficult to fully integrate the findings into a coherent model of trafficking. Specifically, the manuscript does not address what proportion of ensembles detected at the PM originated in the ER. Without data on the turnover or half-life of these ensembles at the PM, it remains unclear how many persist through trafficking versus forming de novo at the membrane. The authors report the percentage of PLA-positive ensembles localized to various compartments, but this only reflects the distribution of pre-formed ensembles. What remains unknown is the proportion of total BK and Ca<sub>V</sub>1.3 channels (not just those in ensembles) that are engaged in these complexes within each compartment. Without this, it is difficult to determine whether ensembles form in the ER and are then trafficked to the PM, or if independent ensemble formation also occurs at the membrane. To support the model of intracellular assembly followed by coordinated trafficking, it would be important to quantify the fraction of the total channel population that exists as ensembles in each compartment. A comparable ensemble-to-total ratio across ER and PM would strengthen the argument for directed trafficking of pre-assembled channel complexes.

      We appreciate the reviewer’s thoughtful comment and agree that quantitatively linking intracellular hetero-clusters to those at the plasma membrane is an important and unresolved question. Our current study does not determine what proportion of ensembles at the plasma membrane originated during trafficking. It also does not quantify the fraction of total BK and Ca<sub>V</sub>1.3 channels engaged in these complexes within each compartment. Addressing this requires simultaneous measurement of multiple parameters—total BK channels, total Ca<sub>V</sub>1.3 channels, hetero-cluster formation (via PLA), and compartment identity—in the same cell. This is technically challenging. The antibodies used for channel detection are also required for the proximity ligation assay, which makes these measurements incompatible within a single experiment.

      To overcome these limitations, we are developing new genetically encoded tools to enable real-time tracking of BK and Ca<sub>V</sub>1.3 dynamics in live cells. These approaches will enable us to monitor channel trafficking and the formation of hetero-clusters, as detected by colocalization. These experiments will provide insight into their origin and turnover. While these experiments are beyond the scope of the current study, the findings in our current manuscript provide the first direct evidence that BK and CaV channels can form hetero-clusters intracellularly prior to reaching the plasma membrane. This mechanistic insight reveals a previously unrecognized step in channel organization and lays the foundation for future work aimed at quantifying ensemble-to-total ratios and determining whether coordinated trafficking of pre-assembled complexes occurs.

      This limitation is acknowledged in the discussion section, page 23. It reads: “Our findings highlight the intracellular assembly of BK-Ca<sub>V</sub>1.3 hetero-clusters, though limitations in resolution and organelle-specific analysis prevent precise quantification of the proportion of intracellular complexes that ultimately persist on the cell surface.”

      Reviewer #2 (Public review):

      Summary:

      The co-localization of large conductance calcium- and voltage activated potassium (BK) channels with voltage-gated calcium channels (CaV) at the plasma membrane is important for the functional role of these channels in controlling cell excitability and physiology in a variety of systems.

      An important question in the field is where and how do BK and CaV channels assemble as 'ensembles' to allow this coordinated regulation - is this through preassembly early in the biosynthetic pathway, during trafficking to the cell surface or once channels are integrated into the plasma membrane. These questions also have broader implications for assembly of other ion channel complexes.

      Using an imaging-based approach, this paper addresses the spatial distribution of BK-CaV ensembles using both overexpression strategies in tsa201 and INS-1 cells and analysis of endogenous channels in INS-1 cells using proximity ligation and super-resolution approaches. In addition, the authors analyse the spatial distribution of mRNAs encoding BK and Cav1.3.

      The key conclusion of the paper that BK and Ca<sub>V</sub>1.3 are co-localised as ensembles intracellularly in the ER and Golgi is well supported by the evidence. However, whether they are preferentially co-translated at the ER requires further work. Moreover, whether intracellular pre-assembly of BK-Ca<sub>V</sub>1.3 complexes is the major mechanism for functional complexes at the plasma membrane in these models requires more definitive evidence including both refinement of analysis of current data as well as potentially additional experiments.

      The reviewer raises the question of whether BK and Ca<sub>V</sub>1.3 channels are preferentially co-translated. In fact, we would like to propose that co-translation has not yet been clearly defined for this type of interaction between ion channels. In our current work, we 1) observed the colocalization between BK and Ca<sub>V</sub>1.3 mRNAs and 2) determined that 70% of BK mRNA in active translation also colocalizes with Ca<sub>V</sub>1.3 mRNA. We think these results favor the idea of translational complexes that can underlie the process of co-translation. However, and in total agreement with the Reviewer, the conclusion that the mRNA for the two ion channels is co-translated would require further experimentation. For instance, mRNA coregulation is one aspect that could help to define co-translation. 

      To avoid overinterpretation, we have revised the manuscript to remove references to “co-translation” in the Results section and included the word “potential” when referring to co-translation in the Discussion section. We also clarified the limitations of our evidence in the Discussion that can be found on page 25: “It is important to note that while our data suggest mRNA coordination, additional experiments are required to directly assess co-translation.”

      Strengths & Weaknesses

      (1) Using proximity ligation assays of overexpressed BK and CaV1.3 in tsa201 and INS-1 cells the authors provide strong evidence that BK and CaV can exist as ensembles (ie channels within 40 nm) at both the plasma membrane and intracellular membranes, including ER and Golgi. They also provide evidence for endogenous ensemble assembly at the Golgi in INS-1 cells and it would have been useful to determine if endogenous complexes are also observed in the ER of INS-1 cells. There are some useful controls but the specificity of ensemble formation would be better determined using other transmembrane proteins rather than peripheral proteins (eg Golgi 58K).

      We thank the reviewer for their thoughtful feedback and for recognizing the strength of our proximity ligation assay data supporting BK–Ca<sub>V</sub>1.3 hetero-cluster formation at both the plasma membrane and intracellular compartments. As for specificity controls, we appreciate the suggestion to use transmembrane markers. To strengthen our conclusion, we have performed an additional experiment comparing the number of PLA puncta formed by the interaction of Ca<sub>V</sub>1.3 and BK channels with the number of PLA puncta formed by the interaction of Ca<sub>V</sub>1.3 channels and ryanodine receptors in INS-1 cells. As shown in the figure below, the number of interactions between Ca<sub>V</sub>1.3 and BK channels is significantly higher than that between Ca<sub>V</sub>1.3 and RyR<sub>2</sub>. Of note, RyR<sub>2</sub> is a protein resident of the ER. These results provide additional evidence of the existence of endogenous complex formation in INS-1 cells. We have added this figure as a supplement.

      (2) Ensemble assembly was also analysed using super-resolution (dSTORM) imaging in INS-1 cells. In these cells only 7.5% of BK and CaV particles (endogenous?) co-localise, which was only marginally above chance based on scrambled images. More detailed quantification and validation of potential 'ensembles' needs to be made, for example by exploring nearest neighbour characteristics (but see point 4 below) to define proportion of ensembles versus clusters of BK or Cav1.3 channels alone etc. For example, it is mentioned that a distribution of distances between BK and Cav is seen but data are not shown.

      We thank the reviewer for this comment. To address the request for more detailed quantification and validation of ensembles, we performed additional analyses:

      Proportion of ensembles vs isolated clusters: We quantified clusters within 200 nm and found that 37 ± 3% of BK clusters are near one or more CaV1.3 clusters, whereas 15 ± 2% of CaV1.3 clusters are near BK clusters (Figure 8–Supplementary 1A).

      Distance distribution: As shown in Figure 8–Supplementary 1B, the nearest-neighbor distance distribution for BK-to-CaV1.3 in INS-1 cells (magenta) is shifted toward shorter distances compared to randomized controls (gray), supporting preferential localization of BK–CaV1.3 hetero-clusters.

      Together, these analyses confirm that BK–CaV1.3 ensembles occur more frequently than expected by chance and exhibit an asymmetric organization favoring BK proximity to CaV1.3 in INS-1 cells. We have included these data and figures in the revised manuscript, as well as description in the Results section. 
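
      The two analyses described above (fraction of clusters within 200 nm of a partner, and comparison against a randomized control) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the function names, the toy centroid coordinates, and the use of a uniform random scatter as the control are hypothetical and are not taken from the authors' pipeline, whose randomization procedure is not described in this response. The sketch only shows why the measure is directional, so that BK-to-CaV1.3 and CaV1.3-to-BK fractions can differ.

```python
import math
import random


def nearest_distance(point, others):
    """Distance (nm) from `point` to the closest centroid in `others`."""
    return min(math.dist(point, other) for other in others)


def fraction_near(sources, targets, threshold_nm=200.0):
    """Fraction of `sources` with at least one `targets` centroid within the threshold."""
    near = sum(1 for s in sources if nearest_distance(s, targets) <= threshold_nm)
    return near / len(sources)


def randomized_fraction(sources, n_targets, field_nm, threshold_nm=200.0,
                        n_trials=200, seed=0):
    """Mean fraction when target centroids are scattered uniformly at random.

    Assumption: a uniform scatter over a square field stands in for the
    randomized control; the real control may be constructed differently.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        scattered = [(rng.uniform(0, field_nm), rng.uniform(0, field_nm))
                     for _ in range(n_targets)]
        total += fraction_near(sources, scattered, threshold_nm)
    return total / n_trials


# Hypothetical cluster centroids in nm (toy data, not measurements).
bk = [(0, 0), (1000, 0), (5000, 5000)]
cav = [(100, 0), (1050, 50), (9000, 9000), (9500, 9500), (8000, 200), (200, 8000)]

bk_near_cav = fraction_near(bk, cav)   # BK clusters with a CaV1.3 neighbour
cav_near_bk = fraction_near(cav, bk)   # CaV1.3 clusters with a BK neighbour
chance_level = randomized_fraction(bk, len(cav), field_nm=10_000)
```

      In this toy layout the more abundant population has the smaller fraction of members near a partner, which is the kind of asymmetry that makes it useful to report the measure in both directions alongside a chance-level estimate.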

      (3) The evidence that the intracellular ensemble formation is in large part driven by co-translation, based on co-localisation of mRNAs using RNAscope, requires additional critical controls and analysis. The authors now include data of co-localised BK protein that is suggestive but does not show co-translation. Secondly, while they have improved the description of some controls, mRNA co-localisation needs to be measured in both directions (eg BK - SCN9A as well as SCN9A to BK) especially if the mRNAs are expressed at very different levels. The relative expression levels need to be clearly defined in the paper. Authors also use a randomized image of BK mRNA to show specificity of co-localisation with Cav1.3 mRNA, however the mRNA distribution would not be expected to be random across the cell but constrained by ER morphology if co-translated so using ER labelling as a mask would be useful?

      We thank the reviewer for these constructive suggestions. We measured mRNA colocalization in both directions as recommended. As shown in the figure below, colocalization between KCNMA1 and SCN9A transcripts was comparable in both directions, with no statistically significant difference, supporting the specificity of the observed associations. We decided not to add this to the original figure to keep the figure simple. 

      We agree that co-localization of BK protein with BK mRNA is not conclusive evidence of co-translation, and we do not intend to mislead readers in our conclusion. Consequently, we were careful to avoid the term co-translation in the Results section and added the word “potential” when referring to co-translation in the Discussion section. We added a sentence in the Discussion to caution our interpretation: “It is important to note that while our data suggest mRNA coordination, additional experiments are required to directly assess co-translation.”

      Author response image 1.

      (4) The authors attempt to define if plasma membrane assemblies of BK and CaV occur soon after synthesis. However, because the expression of BK and CaV occur at different times after transient transfection of plasmids more definitive experiments are required. For example, using inducible constructs to allow precise and synchronised timing of transcription. This would also provide critical evidence that co-assembly occurs very early in synthesis pathways - ie detecting complexes at ER before any complexes 

      We appreciate the reviewer’s insightful suggestion regarding the use of inducible constructs to synchronize transcription timing. This is an excellent approach and would allow direct testing of whether co-assembly occurs early in the synthesis pathway, including detection of complexes at the ER prior to plasma membrane localization. These experiments are beyond the scope of the present work but represent an important direction for future studies.

      We have added the following sentence to the Discussion section (page 24) to highlight this idea. “Future experiments using inducible constructs to precisely control transcription timing will enable more precise quantification of hetero-cluster formation in the ER compartment prior to plasma membrane insertion and reduce the variability introduced by differences in expression timing after plasmid transfection.” 

      (5) While the authors have improved the definition of hetero-clusters etc it is still not clear in super-resolution analysis, how they separate a BK tetramer from a cluster of BK tetramers with the monoclonal antibody employed ie each BK channel will have 4 binding sites (4 subunits in tetramer) whereas Cav1.3 has one binding site per channel. Thus, how do authors discriminate between a single BK tetramer (molecular cluster) with potential 4 antibodies bound compared to a cluster of 4 independent BK channels.

      We appreciate the reviewer’s thoughtful comment regarding the interpretation of super-resolution data. We agree that distinguishing a single BK tetramer from a cluster of multiple BK channels is challenging when using an antibody that can bind up to four sites per channel. To clarify, our analysis does not attempt to resolve individual subunits within a tetramer; rather, it focuses on the nanoscale spatial proximity of BK and Ca<sub>V</sub>1.3 signals.

      We want to note that this limitation applies only to the super-resolution maps in Figures 8C and 9D and does not affect Airyscan-based analyses or measurements of BK–Ca<sub>V</sub>1.3 proximity.

      To address how we might distinguish between a single BK tetramer and a cluster of multiple BK channels, we considered two contrasting scenarios. In the first case, we assume that all four α-subunits within a tetramer are labeled. Based on cryo-EM structures, a BK tetramer measures approximately 13 nm × 13 nm (≈169 nm²). Adding two antibody layers (primary and secondary) would increase the footprint by ~14 nm in each direction, resulting in an estimated area of ~41 nm × 41 nm (≈1681 nm²). Under this assumption, particles smaller than ~1681 nm² would likely represent individual tetramers, whereas larger particles would correspond to clusters of multiple tetramers. 
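
      The footprint arithmetic for this first scenario can be reproduced with a short sketch. The function and constant names are hypothetical, and the classification rule simply restates the size cutoff given in the paragraph above; it holds only under the assumption that all four subunits are labeled.

```python
# Values quoted in the response; names are illustrative only.
CHANNEL_SIDE_NM = 13.0        # approximate side of a BK tetramer (cryo-EM)
ANTIBODY_EXTENSION_NM = 14.0  # primary + secondary antibody, added on each side


def labeled_side(channel_side_nm, extension_nm):
    """Side length after the antibody layers extend the channel on both sides."""
    return channel_side_nm + 2 * extension_nm


def footprint_area(side_nm):
    """Area of a square footprint, in nm^2."""
    return side_nm ** 2


def classify_particle(area_nm2, threshold_nm2):
    """Scenario-1 interpretation: below the cutoff is read as one tetramer."""
    if area_nm2 < threshold_nm2:
        return "single tetramer"
    return "cluster of tetramers"


bare_area = footprint_area(CHANNEL_SIDE_NM)                        # 13 x 13
side = labeled_side(CHANNEL_SIDE_NM, ANTIBODY_EXTENSION_NM)        # 13 + 2 * 14
threshold = footprint_area(side)                                   # 41 x 41
```

      A particle of, say, 900 nm² would be read as a single tetramer under this cutoff and one of 2500 nm² as a cluster, but that reading does not carry over to the second scenario, in which only one antibody binds per tetramer.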

      In the second scenario, we propose that steric constraints at the S9–S10 segment, where the antibody binds, limit labeling to a single antibody per tetramer. If true, the localization precision would approximate 14 nm × 14 nm—the combined size of the antibody complex and the channel—close to the resolution limit of the microscope. To test this, we performed a control experiment using two antibodies targeting the BK C-terminal domain, raised in different species and labeled with distinct fluorophores. Super-resolution imaging revealed that only ~12% of particles were colocalized, suggesting that most channels bind a single antibody.

      If multiple antibodies could bind each tetramer, we would expect much greater colocalization.

      Although these data are not included in the manuscript, we have added the following clarification to the Results section (page 19): “It is important to note that this technique does not allow us to distinguish between labeling of four BK α-subunits within a tetramer and labeling of multiple BK channel clusters. Hence, particles smaller than ~1680 nm² may represent either a single tetramer or a cluster. This limitation applies to Figures 8C and 9D and does not affect measurements of BK–Ca<sub>V</sub>1.3 proximity.”

      Author response image 2.

      (6) The post-hoc tests used for one way ANOVA and ANOVA statistics need to be defined throughout

      We thank the reviewer for highlighting the need for clarity regarding our statistical analyses. We have now specified the post-hoc tests used for all one-way ANOVA and ANOVA comparisons throughout the manuscript, and updated figure legends.

      Reviewer #3 (Public review):

      Summary:

      The authors present a clearly written and beautifully presented piece of work demonstrating clear evidence to support the idea that BK channels and Cav1.3 channels can co-assemble prior to their insertion in the plasma membrane.

      Strengths:

      The experimental records shown back up their hypotheses and the authors are to be congratulated for the large number of control experiments shown in the ms.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      The authors have sufficiently addressed the specific points previously raised and the manuscript has improved clarity in those aspects. My main concern, which still remains, is stated in the public review.

      Reviewer #3 (Recommendations for the authors):

      I am content that the authors have attempted to fully address my previous criticisms.

      I have only three suggestions

      (1) I think the word Homo-clusters at the bottom right of Figure 1 is erroneously included.

      We thank the reviewer for bringing this to our attention. The figure has been corrected accordingly.

      (2) The authors should, for completeness, refer to the beta, gamma and LINGO subunit families in the Introduction and include appropriate references:

      Knaus, H. G., Folander, K., Garcia-Calvo, M., Garcia, M. L., Kaczorowski, G. J., Smith, M., & Swanson, R. (1994). Primary sequence and immunological characterization of beta-subunit of high conductance Ca2+-activated K+ channel from smooth muscle. The Journal of Biological Chemistry, 269(25), 17274-17278.

      Brenner, R., Jegla, T. J., Wickenden, A., Liu, Y., & Aldrich, R. W. (2000a). Cloning and functional characterization of novel large conductance calcium-activated potassium channel beta subunits, hKCNMB3 and hKCNMB4. The Journal of Biological Chemistry, 275(9), 6453-6461.

      Yan, J & R.W. Aldrich. (2010) LRRC26 auxiliary protein allows BK channel activation at resting voltage without calcium. Nature. 466(7305):513-516

      Yan, J & R.W. Aldrich. (2012) BK potassium channel modulation by leucine-rich repeat-containing proteins. Proceedings of the National Academy of Sciences 109(20):7917-22

      Dudem, S, Large RJ, Kulkarni S, McClafferty H, Tikhonova IG, Sergeant, GP, Thornbury, KD, Shipston, MJ, Perrino BA & Hollywood MA (2020). LINGO1 is a novel regulatory subunit of large conductance, Ca2+-activated potassium channels. Proceedings of the National Academy of Sciences 117 (4) 2194-2200

      Dudem, S., Boon, P. X., Mullins, N., McClafferty, H., Shipston, M. J., Wilkinson, R. D. A., Lobb, I., Sergeant, G. P., Thornbury, K. D., Tikhonova, I. G., & Hollywood, M. A. (2023). Oxidation modulates LINGO2-induced inactivation of large conductance, Ca2+-activated potassium channels. The Journal of Biological Chemistry, 299 (3) 102975.

      We agree with the reviewer’s suggestion and have revised the Introduction to include references to the beta, gamma, and LINGO subunit families. Appropriate citations have been added to ensure completeness and contextual relevance.

      Additionally, BK channels are modulated by auxiliary subunits, which fine-tune BK channel gating properties to adapt to different physiological conditions. The β, γ, and LINGO1 subunits each contribute distinct structural and regulatory features: β-subunits modulate Ca²⁺ sensitivity and can induce inactivation; γ-subunits shift voltage-dependent activation to more negative potentials; and LINGO1 reduces surface expression and promotes rapid inactivation (18-24). These interactions ensure precise control over channel activity, allowing BK channels to integrate voltage and calcium signals dynamically in various cell types.

      (3) I think it may be more appropriate to include the sentence "The probes against the mRNAs of interest and tested in this work were designed by Advanced Cell Diagnostics." (P16, right hand column, L12-14) in the appropriate section of the Methods, rather than in Results.

      We thank the reviewer for this helpful suggestion. In response, we have relocated the sentence to the appropriate section of the Methods, where it now appears with relevant context.

    1. Market justice preferences

      Suggested structure: 1. Welfare states, commodification & policy feedbacks 2. Distributive justice & market justice 3. Key studies on market justice & main findings 4. Pensions 5. Operationalization

    1. Note: This response was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity):

      A previous study by Komada et al. demonstrated that MAP7 is expressed in both Sertoli and germ cells, and that Map7 gene-trap mutant mice display disrupted microtubule bundle formation in Sertoli cells, accompanied by defects in spermatid manchettes and germ cell loss. In the current study, Kikuchi et al. investigated the role of MAP7 in the formation of the Sertoli cell apical domain during the first wave of spermatogenesis. They generated a GFP-tagged MAP7 mouse line and demonstrated that the endogenous MAP7 protein localizes to the apical microtubules in Sertoli cells and to the manchette microtubules in step 9-11 spermatids. They also generated a new Map7 knockout (KO) mouse line in a genetic background distinct from the one used in the previous study. Focusing on stages before the emergence of step 9-11 spermatids, the authors aimed to isolate defects caused by the function of MAP7 in Sertoli cells. They report that loss of MAP7 impairs Sertoli cell polarity and apical domain formation, accompanied by the microtubule remodeling defect. Using the GFP-tagged MAP7 line, they performed immunoprecipitation-mass spectrometry and identified several MAP7-interacting proteins in the testis, including MYH9. They further observed that MAP7 deletion alters the distribution of MYH9. Single-cell RNA sequencing revealed that the loss of MAP7 in Sertoli cells resulted in slight transcriptomic shifts but had no significant impact on their functional differentiation. Single-cell RNA sequencing analysis also showed delayed meiotic progression in the MAP7-deficient testis. Overall, while the study provides some interesting discoveries of early Sertoli cell defects in MAP7-deficient testes, some conclusions are premature and not fully supported by the presented data. The mechanistic investigations remain limited in depth.

      Response: We thank the reviewer for this insightful summary. We agree that some of our initial interpretations were speculative and have revised the relevant sections to more accurately reflect the limitations of the current data. We also acknowledge that further mechanistic studies will be important to strengthen our conclusions, and we have outlined these plans in the individual responses below.

      Major comments:

      Although the infertility phenotype of the Map7 gene-trap mutant mice has been reported previously, it remains essential to assess fertility in this newly generated MAP7 knockout line. While the authors present testis size and histological differences between WT and KO mice (Extended Fig. 2e and 2f), there is no corresponding description or interpretation in the main text regarding fertility outcomes.

      Response: We thank the reviewer for raising this point. Although we had presented the differences in testis size and histology between wild-type and Map7-/- mice, we agree that a description of the corresponding fertility outcomes was missing from the main text. We have now revised the relevant part of the Results section as follows: “Consistent with observations in Map7 gene-trap mice, Map7-/- males exhibited reduced testis size and spermatogenic defects (Supplemental Fig. 2E, F). Notably, the cauda epididymis of Map7-/- males contained no mature spermatozoa (Supplemental Fig. 2F), indicating male infertility.” (page 5, line 33–page 6, line 2)

      • In Figure 2C, the authors identified Sertoli cells, spermatogonia cells, and spermatocytes using SEM, based on their cell morphology and adhesion to the basement membrane. Given that the loss of MAP7 disrupts the polarity and architecture of Sertoli cells, the position of germ cells will be affected, making this identification criterion less reliable.

      Response: We appreciate the reviewer’s comment. While the reviewer notes that cell identification was based on cell morphology and adhesion to the basement membrane, we clarify that nuclear morphology was also considered, as described in the original manuscript. Specifically, germ cells have spherical nuclei, whereas Sertoli cell nuclei are irregularly shaped (representative segmentation results can be provided as an additional Supplemental Figure upon request). Round spermatids at P21 can be distinguished from spermatocytes by their smaller nuclear size. In addition, spermatogonia remain attached to the basement membrane even in Map7-/- testes, as confirmed by GFRα1-positive spermatogonial stem cells (Figure 6A). Together, these features ensure reliable identification of each cell type, independent of the altered polarity observed in Map7-deficient Sertoli cells.

      • In Figure 2e, the number of Sox9-positive Sertoli cells in MAP7 knockout mice appears higher than that in the control at P17. Quantification of total Sox9-positive cells should be done to determine whether MAP7 deletion increases Sertoli cell numbers.

      Response: As suggested by the reviewer, we will quantify the density of SOX9-positive Sertoli cells per unit area of seminiferous tubule at P10 and P17 in Map7+/- and Map7-/- testes, and include the results in the revised manuscript.

      • To determine whether MAP7's role in regulating Sertoli cell polarity relies on germ cells, the authors treated mice with busulfan at P28 to delete germ cells, a stage after Sertoli cell polarity defect has developed in MAP7 knockout mice. This data is insufficient to support the conclusion that MAP7 regulates Sertoli cell polarity independently of the presence of germ cells. Germ cell deletion should be done before the Sertoli cell defect develops to address this question.

      Response: We appreciate the reviewer’s thoughtful comment regarding the interpretation of the busulfan experiments. While depletion of germ cells at P28 enabled us to assess Sertoli cell polarity in the absence of postnatal spermatogonia, these experiments do not definitively determine whether MAP7 regulates Sertoli cell polarity independently of germ cells. Neonatal germ-cell depletion would more directly test germ cell–independent effects; however, systemic busulfan administration at early developmental stages is highly toxic, often causing bone marrow failure and multi-organ damage, which precludes survival and confounds analysis of testis-specific effects. Although germ cell ablation could, in principle, be achieved using transgenic approaches that exploit the natural resistance of mice to diphtheria toxin (DTX) (reviewed in Smith et al., Andrology, 2015), these strategies require multiple transgenes and show variable efficiency, making them impractical for our current experiments. Generating the necessary genetic combinations would also require considerable time. We therefore plan to pursue alternative genetic approaches in future work.

      In the revised manuscript, we have modified the relevant section to more accurately reflect the limitations of the current experiments, as follows: “Busulfan was administered at P28, and testes were analyzed 6 weeks later, after complete elimination of germ cell lineages. Following treatment, Map7+/- mice showed testis-to-body weight ratios comparable to untreated Map7-/- mice (Supplemental Fig. 3D), and hematoxylin-eosin (HE) staining confirmed germ cell depletion (Fig. 2F; Supplemental Fig. 3E). In Map7+/- testes, most Sertoli nuclei remained basally positioned, indicating that once apical–basal polarity is established, it is stably maintained even in the absence of germ cells. In contrast, Map7-/- Sertoli nuclei were frequently misoriented toward the lumen under the same conditions (Fig. 2F; Supplemental Fig. 3E), suggesting that polarity defects in Map7-deficient Sertoli cells occur independently of germ cell presence.” (page 7, lines 20–28)

      In addition, we have added the following sentences to the Discussion section to highlight the implication of these findings: “In addition, even after germ cell depletion by busulfan treatment, Map7-deficient Sertoli cells failed to reestablish basal nuclear positioning, indicating that loss of MAP7 causes an intrinsic polarity defect. These findings suggest that MAP7 acts as a cell-autonomous regulator of Sertoli cell polarity, rather than mediating effects indirectly through germ cell–Sertoli cell interactions.” (page 15, lines 17–21)

      • The resolution of the SEM images in Figure 3c is insufficient to evaluate tight and adherens junctions clearly. As such, these images do not convincingly support the claim that adherens junctions are absent in the KO testes.

      Response: We thank the reviewer for this insightful comment. Tight junctions can be reliably identified in SEM images as dense intercellular structures accompanied by endoplasmic reticulum aligned along the cell boundaries. The region immediately apical to the tight junctions likely corresponds to adherens junctions, which are also associated with the endoplasmic reticulum. Unlike tight junctions, these regions exhibit wider intercellular spaces, consistent with the looser membrane apposition characteristic of adherens junctions, although they cannot be unambiguously distinguished from gap junctions or desmosomes based on morphology alone. In the original figure, 2× binning reduced image resolution, which may have contributed to the reviewer’s concern.

      In the revised manuscript, we have re-acquired the SEM images in high-resolution mode, focusing on the relevant regions. The new high-resolution images have replaced the original panels in revised Figure 3C, providing clearer visualization of junctional structures at P10 and P21 in Map7+/- and Map7-/- testes. The original Figure 3C images have been moved to Supplemental Figure 4B for reference.

      The corresponding section in the Results has been revised as follows in the updated manuscript: “We then performed SEM to examine the effects of Map7 KO. In P21 Map7+/- testes, electron-dense regions along the basal side of Sertoli–Sertoli junctions corresponded to tight junctions closely associated with the endoplasmic reticulum, consistent with previous reports (Luaces et al. 2023) (Fig. 3C; Supplemental Fig. 4B). The region immediately apical to the tight junctions likely represents adherens junctions, which were also associated with the endoplasmic reticulum. Unlike tight junctions, these regions displayed wider intercellular spaces, reflecting the looser membrane apposition typical of adherens junctions, though they could not be definitively distinguished from gap junctions or desmosomes based on morphology alone (Fig. 3C; Supplemental Fig. 4B). At P10, both Map7+/- and Map7-/- testes lacked clearly defined tight junctions and adherens junction–like structures (Fig. 3C; Supplemental Fig. 4B). In P21 Map7-/- mice, Sertoli cells formed expanded basal tight junctions but failed to establish adherens junction–like structures (Fig. 3C; Supplemental Fig. 4B).” (page 8, line 34–page 9, line 12)

      • GFP-tagged reporter mice and HeLa cells were used for immunoprecipitation-mass spectrometry to identify proteins that interact with MAP7. Given that the authors aimed to elucidate the mechanism by which MAP7 regulates Sertoli cell cytoskeleton organization, the rationale for including HeLa cells is unclear and should be better justified or reconsidered.

      Response: We thank the reviewer for this comment. MAP7-egfpKI HeLa cells were used as a complementary system to identify MAP7-associated proteins, providing sufficient material and a controlled environment for robust detection. By comparing IP-MS results from MAP7-egfpKI HeLa cells and P17–P20 Map7-egfpKI testes, we can distinguish proteins that are specific to polarized Sertoli cells: proteins detected exclusively in P17–P20 testes may be involved in Sertoli cell polarization, whereas proteins detected in both systems likely represent general MAP7-associated factors that are not specific to Sertoli cell polarity.

      This rationale has been clarified in the revised manuscript by adding the following sentences to the Results section: “MAP7-egfpKI HeLa cells were used as a complementary system, providing sufficient material and a controlled environment for robust detection of MAP7-associated proteins. Comparison of IP-MS results between MAP7-egfpKI HeLa cells and P17–P20 Map7-egfpKI testes allows identification of MAP7-associated proteins that are specific to polarized Sertoli cells, whereas proteins detected in both systems likely represent general MAP7-associated proteins.” (page 9, lines 27-32)

      • The authors observed that MYH9, one of the MAP7-interacting proteins, does not colocalize with ectopic microtubule and F-actin structures in MAP7 KO testes and concluded that MAP7 facilitates the integration of microtubules and F-actin via interaction with NMII heavy chains. This conclusion is speculative and not adequately supported by the presented data.

      Response: We thank the reviewer for this insightful comment. We agree that our initial conclusion was speculative and have revised the relevant section to more accurately reflect the limitations of the current data. The revised text now reads as follows: “These findings indicate that MYH9 localization at the luminal interface depends on MAP7, and suggest that MAP7 helps coordinate microtubules and F-actin, potentially via its association with NMII heavy chains.” (page 10, lines 13–15)

      To further elucidate this mechanism, we will perform biochemical domain-mapping to define the MAP7 region responsible for MYH9 complex formation. We have already established a series of human MAP7 deletion mutants (as reported previously, EMBO Rep., 2018) and will conduct co-immunoprecipitation assays in HEK293 cells to identify the specific MAP7 domain required for complex formation with MYH9. Based on these results, we plan to use AlphaFold3 to predict the three-dimensional structure of the MAP7–MYH9 complex. These analyses will help clarify how MAP7 associates with the actomyosin network and provide additional mechanistic insights that complement our in vivo observations of MYH9 mislocalization in Map7-/- testes.

      • The authors used Spearman correlation coefficients to analyze six Sertoli cell clusters and generated a minimum spanning tree to infer differentiation trajectories. However, details on the method used for constructing the tree are lacking. Moreover, relying solely on Spearman correlation to define differentiation topology is oversimplified.

      Response: We appreciate the reviewer’s valuable feedback. We agree that Spearman correlation alone is insufficient to infer differentiation topology. In response, we reanalyzed the data using Monocle3, which implements branch-aware pseudotime inference to capture both cluster continuity and differentiation directionality. This reanalysis provides a more accurate reconstruction of differentiation trajectories among the six Sertoli cell clusters. Although the overall trajectories appeared different and a higher proportion of Map7-/- Sertoli cells exhibited very low pseudotime values, comparison of the control and Map7-/- trajectories revealed that the average node degree was nearly identical, indicating that the local graph structure—reflecting the connectivity among neighboring cells—was largely preserved. The numbers of branch points and the graph diameter differed slightly, likely due to differences in sample size (311 control vs. 434 Map7-/- Sertoli cells) and distribution bias rather than major topological changes. Accordingly, Figures 5C and 5D have been replaced with the updated Monocle3-based trajectory analysis, and the corresponding text in the Results section and figure legend have been revised as follows:

      “To reconstruct differentiation trajectories among the six Sertoli cell clusters, we reanalyzed the datasets using Monocle3, which incorporates branch-aware pseudotime inference. Cluster C1 was selected as the root based on shared, specificity, and entropy scores, consistent with its metabolically active and transcriptionally diverse profile (Fig. 5B, C; Supplemental Fig. 7). While the overall trajectories appeared altered, the proportion of Map7-/- Sertoli cells with very low pseudotime values was only modestly increased (Fig. 5D). Comparison with controls showed that the average node degree was nearly identical (Fig. 5C), indicating that the local graph structure, reflecting connectivity among neighboring cells, remained largely intact. Minor differences in branch points and graph diameter likely reflect inherent variability in the data rather than major topological changes (Supplemental Fig. 6B). Consistent with this, the relative proportions of the six clusters showed only modest shifts, suggesting that the overall architecture of Sertoli cell differentiation is largely preserved in the absence of MAP7.” (page 11, lines 7-18)

      “(C) Control and Map7-/- Sertoli cells were visualized separately using UMAPs constructed in Seurat. Using the same datasets, pseudotime trajectories were inferred with Monocle3. For root selection, shared_score (cluster overlap), specificity_score (cluster uniqueness), and entropy_score (transcriptional diversity) were computed, resulting in cluster 1 being selected as the root. The numbers of nodes, edges, branch points, average degree, and diameter of each trajectory are shown below the corresponding UMAPs. (D) Parallel comparison of pseudotime distributions between control and Map7-/- populations.” (page 30, lines 5-12)
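      For readers unfamiliar with the graph statistics named in the legend above (nodes, edges, branch points, average degree, diameter), the following is a minimal sketch of how such values can be computed for an undirected graph. It is an illustration only and not the Monocle3 code used in the analysis: the function name `graph_summary`, the toy edge list, and the definition of a branch point as a node with three or more neighbours are assumptions made for this example.

```python
from collections import deque


def graph_summary(edges):
    """Summarize an undirected, connected graph given as (u, v) pairs.

    Assumes no self-loops. A "branch point" is taken here to be any node
    with three or more neighbours (an assumption for this sketch).
    """
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    n_nodes = len(adjacency)
    # Each undirected edge appears in two neighbour sets.
    n_edges = sum(len(nbrs) for nbrs in adjacency.values()) // 2
    branch_points = sum(1 for nbrs in adjacency.values() if len(nbrs) >= 3)
    average_degree = 2 * n_edges / n_nodes

    def eccentricity(start):
        # Breadth-first search gives shortest hop counts from `start`.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in adjacency[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        return max(dist.values())

    # Diameter: the longest shortest path between any two nodes.
    diameter = max(eccentricity(node) for node in adjacency)

    return {
        "nodes": n_nodes,
        "edges": n_edges,
        "branch_points": branch_points,
        "average_degree": average_degree,
        "diameter": diameter,
    }


# Hypothetical toy trajectory graph with a single branch at node "c".
toy_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("c", "e"), ("e", "f")]
summary = graph_summary(toy_edges)
print(summary)
```

      In this toy case the average degree reflects only local connectivity, whereas branch points and diameter depend on the overall shape and size of the graph, which is why the latter two are more sensitive to differences in cell number between samples.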

      Minor comments:

      • Several extended data figures are redundant with main figures and do not provide additional value (e.g., Fig. 2d vs. Extended Data Fig. 3a; Fig. 2f vs. Extended Data Fig. 3d; Fig. 2C vs. Extended Data Fig. 4b; Fig. 3d vs. Extended Data Fig. 4c). The authors should consolidate or remove duplicates.

      Response: Regarding the concerns about redundancy between main and Supplemental figures, we would like to clarify the rationale for retaining certain Supplemental figures.

      Fig. 2D vs. Supplemental Fig. 3A: Due to space limitations in the main figure, only the merged three-color image was shown. We believe that the single-color grayscale images in Supplemental Fig. 3A provide additional clarity, allowing easier visualization of SOX9-positive Sertoli cell distribution and differences in F-actin structure.

      Fig. 2F vs. Supplemental Fig. 3E: In the main figure, only the high-magnification image was shown due to space constraints. The lower-magnification image in Supplemental Fig. 3E demonstrates that the selected field was not chosen arbitrarily, providing context for the observed structures. In addition, Supplemental Fig. 3E includes both low- and high-magnification images of age-matched busulfan (-) testes as a control for the busulfan (+) condition, further supporting the validity of the comparison.

      For the above-mentioned cases (Fig. 2D vs. Supplemental Fig. 3A; Fig. 2F vs. Supplemental Fig. 3E), as well as other potentially overlapping figures (e.g., Fig. 3D vs. Supplemental Fig. 4C), we believe that the additional single-channel and lower-magnification images provide important context that cannot be fully conveyed in the main figures due to space limitations. Nevertheless, to address the reviewer’s concern, we will (i) clearly state the purpose of each Supplemental figure in the corresponding legends, and (ii) re-evaluate all figures to consolidate or remove any truly redundant panels. Our goal is to ensure that all figures collectively convey the data in the most concise and informative manner.

      • Figure citations in the main text do not consistently match figure content. For example, on page 7 (lines 5-6), the text refers to Extended Data Fig. 4a for SOX9 staining, yet it is Extended Data Fig. 3a that contains the relevant data. Similarly, the reference to Extended Data Fig. 4b and 4c on page 7 (lines 7-8) for adult defects is inaccurate.

      Response: We thank the reviewer for drawing attention to these inconsistencies. We have carefully checked all figure citations throughout the main text and corrected them so that they consistently match the figure content. The revised manuscript reflects these corrections.

      • In Figure 2e, percentages of Sertoli cells across three layers are shown. The figure legend should specify which layer(s) show statistically significant differences between WT and KO.

      Response: We are grateful to the reviewer for highlighting this point. Statistical comparisons were performed between Map7+/- and Map7-/- mice within each corresponding layer at P17. Statistical significance was assessed using Student’s t-test, and all three layers showed significant differences between Map7+/- and Map7-/- (P < 2.20 × 10⁻⁴). The figure legend has been revised accordingly as follows: “Statistical comparisons between Map7+/- and Map7-/- mice were performed for each corresponding layer at P17 using Student’s t-test. All three layers showed significant differences between Map7+/- and Map7-/- mice (*, P<2.20 × 10⁻⁴).” (page 28, lines 5-8)
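      The per-layer comparison described above can be sketched as follows. This is a minimal illustration of the pooled-variance (Student's) two-sample t statistic, applied separately to each layer; the function name `student_t` and all numerical values are hypothetical placeholders, not data from the study. Converting the statistic to a P value requires the t distribution, which is omitted here (a statistics library such as SciPy's `scipy.stats.ttest_ind` would normally be used).

```python
from math import sqrt
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for a pooled-variance t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled estimate of the common variance (equal-variance assumption).
    pooled_var = (
        (n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)
    ) / df
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(
        pooled_var * (1 / n_a + 1 / n_b)
    )
    return t_stat, df


# Hypothetical percentages of Sertoli cells per layer (placeholder values):
# each entry is (Map7+/- replicates, Map7-/- replicates).
hypothetical_layers = {
    "layer 1": ([80.0, 82.0, 78.0], [55.0, 60.0, 50.0]),
    "layer 2": ([15.0, 14.0, 17.0], [30.0, 28.0, 33.0]),
    "layer 3": ([5.0, 4.0, 5.0], [15.0, 12.0, 17.0]),
}

for layer, (het, ko) in hypothetical_layers.items():
    t_stat, df = student_t(het, ko)
    print(f"{layer}: t = {t_stat:.2f}, df = {df}")
```

      Running one test per layer, as in the loop, mirrors the within-layer comparison stated in the revised legend; whether a multiple-comparison correction is appropriate is a separate design choice not addressed by this sketch.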

      • The current color scheme for F-actin and TUBB3 in Figure 3 lacks sufficient contrast. Adjusting to more distinguishable colors would improve readability.

      Response: We thank the reviewer for this helpful suggestion. In the original merged images, four channels (DNA, TUBB3, F-actin, and β-catenin) were displayed together, which reduced contrast between cytoskeletal signals. To improve clarity, we generated new merged images showing only TUBB3 and F-actin, allowing better visual distinction between these components. In addition, β-catenin and DNA are now displayed together as a separate merged image (β-catenin in yellow and DNA in blue) in the final column, highlighting the altered localization of β-catenin in Map7-/- testes.

      • Since multiple scale bars with different units are present within the same figures, adding units directly above or beside each scale bar would improve readability.

      Response: We thank the reviewer for the suggestion. Following this recommendation, we have added units directly above each scale bar in all figures to improve readability.

      • It is recommended to directly mark Sertoli cells, spermatogonia, and spermatocytes on the SEM images in Figure 2C for clearer visualization.

      Response: We thank the reviewer for the suggestion. We will follow this recommendation by performing segmentation and directly marking Sertoli cells, spermatogonia, and spermatocytes on the SEM images in Figure 2C to improve visualization.

      • The quantification of Sertoli cell positioning shown in Fig. 2C is already described in the main text and is unnecessary in the figure.

      Response: We appreciate the reviewer’s comment regarding the quantification of Sertoli cell positioning. Although the results are described in the main text, we believe that the visual presentation in Figure 2C is essential for conveying the spatial distribution pattern in an intuitive and comparative manner. To address the concern about redundancy, we have slightly revised the figure legend (page 27, lines 28–29) to clarify that this panel provides a visual summary of the quantitative data described in the text, thereby improving clarity without unnecessary duplication.

      _Referee cross-commenting_

      I concur with Reviewer 2 that the Map7-eGFP mouse model is a valuable tool for the research community. I also agree that performing MAP7-MYH9 double immunofluorescence staining to demonstrate their colocalization would further strengthen the authors' conclusions regarding their interaction. My overall assessment of the manuscript remains unchanged: the study represents an incremental advance that extends previous findings on MAP7 function but provides limited new mechanistic insight.

      Reviewer #1 (Significance):

      This study investigates the role of the microtubule-associated protein MAP7 in Sertoli cell polarity and apical domain formation during early stages of spermatogenesis. Using GFP-tagged and MAP7 knockout mouse models, the authors show that MAP7 localizes to apical microtubules and is required for Sertoli cell cytoskeletal organization and germ cell development. While the study identifies early Sertoli cell defects and candidate MAP7-interacting proteins, the mechanistic insights remain limited, and several conclusions require stronger experimental support. Overall, the discovery represents an incremental advance that extends prior findings on MAP7 function, providing additional but modest insights into the role of MAP7 in cytoskeletal regulation in male reproduction.

      Response: We thank the reviewer for their constructive comments and thoughtful evaluation of our manuscript. We appreciate the positive feedback regarding the value of the Map7-egfpKI mouse model for the research community. We also thank the reviewer for the suggestion to perform MAP7–MYH9 double immunofluorescence staining to demonstrate colocalization, which we agree will further strengthen the mechanistic support.

      We would like to clarify that several aspects of our findings represent novel contributions within a field where the mechanisms of microtubule remodeling during apical domain formation have remained largely unresolved. In particular, our study provides evidence that MAP7 is asymmetrically enriched at the apical microtubule network in Sertoli cells and contributes to the directional organization of these microtubules—an aspect of Sertoli cell polarity that has not been previously characterized. Our results further indicate that dynamic microtubule turnover, rather than stabilization alone, is required for proper apical domain formation, addressing a gap in current understanding of how microtubules are reorganized during early polarity establishment. In addition, the data support a role for MAP7 in coordinating microtubule and actomyosin organization, suggesting a scaffolding function that links these cytoskeletal systems. We also observe that Sertoli cell polarity can be functionally separated from cell identity and that disruptions in apical domain architecture precede delays in germ cell developmental progression. Taken together, these observations provide mechanistic insight that expands upon previous studies of MAP7 function at the cellular level.

      The conclusions are supported by multiple, complementary lines of evidence, including knockout and Map7-egfpKI mouse models, high-resolution electron microscopy, immunoprecipitation–mass spectrometry, and single-cell RNA sequencing. While we agree that further experiments, such as MAP7–MYH9 double staining, will strengthen the mechanistic framework, we will also perform complementary biochemical analyses to provide additional insight. Specifically, we plan to conduct domain-mapping experiments to identify the MAP7 region required for MYH9 complex formation, coupled with co-immunoprecipitation assays in cultured cells to validate this association.

      Although generating new mutant mouse lines is not feasible within the scope of this revision, and no in vitro system fully recapitulates Sertoli cell polarization, these complementary approaches will provide further mechanistic support. We believe that these planned experiments, together with the current dataset, will clarify the underlying mechanisms and reinforce the significance of our findings, while appropriately acknowledging the current limits of experimental evidence.

      Reviewer #2 (Evidence, reproducibility and clarity):

      In this manuscript the authors evaluate the role of Microtubule Associated Protein 7 (MAP7) in postnatal Sertoli cell development. The authors build two novel transgenic mouse lines (Map7-eGFP, Map7 knockout) which will be useful tools to the community. The transgenic mouse lines are used in paired advanced sequencing experiments and advanced imaging experiments to determine how Sertoli cell MAP7 is involved in the first wave of spermatogenesis. The authors identify MAP7 as an important regulator of Sertoli cell polarity and junction formation with loss of MAP7 disrupting intracellular microtubule and F-actin arrangement and Sertoli cell morphology. These structural issues impact the first wave of spermatogenesis causing a meiotic delay that limits round spermatid numbers. The authors also identify possible binding partners for MAP7, key among those MYH9.

      The authors did a great job building a complex multi-modal project that addressed the question of MAP7 function from many angles. There is an excellent balance of using many advanced methods while still keeping the project narrowed, to use only tools to address the real questions. The lack of quality testing on the germ cells outside of TUNEL is disappointing, but the Conclusion section implies that this sort of work is being done currently so the omission in this manuscript is acceptable. However, there is an issue with the imaging portion of the work on MYH9. The conclusions from the MYH9 data are currently overstated; super-resolution imaging of Map7 knockouts with microtubule and F-actin stains, and imaging that uses MYH9 with either Map7-eGFP or anti-MAP7, are also needed to support both the MAP7-MYH9 interaction under normal conditions and the failure of MYH9 to localize to microtubules and F-actin in knockouts. Since a Leica SP8 was used for the imaging, using either Leica LIGHTNING or just higher magnification will likely be the easiest solution.

      Response: We sincerely appreciate the reviewer’s thorough and positive evaluation of our study. We are encouraged that the reviewer recognized the overall strength of our multi-modal approach and the scientific value of the Map7-egfp knock-in and Map7 knockout genome-edited mouse models that we generated. We also thank the reviewer for highlighting the balance between methodological breadth and focused, hypothesis-driven investigation in our work.

      Regarding the reviewer’s valuable comments on the imaging data, we have addressed them as follows. We improved the cytoskeletal imaging data as described in response to the reviewer’s minor comments. Specifically, in the revised Figure 3B, we replaced the original images with higher-resolution confocal images to provide a clearer view of cytoskeletal organization. In addition, following Reviewer #1’s suggestion, we modified the panel layout to enlarge each field and enhance the contrast between TUBB3 and F-actin channels, allowing better visualization of their altered localization in Map7-/- testes.

      We agree that super-resolution imaging comparing control and Map7-/- testes stained for TUBB3 and F-actin would further strengthen the analysis. If the current resolution is still considered insufficient, we plan to perform additional imaging using a Carl Zeiss Airyscan or Leica Stellaris 5 system to further improve spatial resolution and confirm the observed cytoskeletal phenotypes. Finally, we will perform co-imaging of MYH9 with MAP7 to validate their spatial relationship under normal conditions, complementing the existing data obtained from Map7-/- testes.

      This manuscript is nicely organized with almost all of the results spelled out very clearly and almost always paired with figures that make compelling and convincing support for the conclusions. There are minor revision suggestions for improving the manuscript listed below. These include synching up Figure and Supplemental Figure reference mismatches. There are also many minor, but important, details that need to be added to the Methods section including many catalog numbers and some references.

      - Some of the imaging, especially Fig4F could benefit and be more convincing with super-resolution imaging in the 150nm range (SIM, Airyscan, LIGHTNING, SoRa) possibly even just imaging with a higher magnification objective (60x or 100x)

      Response: We appreciate the reviewer’s suggestion to improve the resolution of the imaging data. In addition to revising Figure 3B as described above, we have also replaced the images in Figure 4F with higher-resolution confocal images to provide a clearer view of MYH9 localization relative to microtubules and F-actin. These revised images highlight that MYH9 specifically accumulates at apical regions where microtubules and F-actin intersect, forming the apical ES, but is not localized to the basal ES-associated F-actin structures. To retain spatial context and allow readers to appreciate the overall distribution pattern, the original lower-magnification images from Figure 4F have been moved to Supplemental Figure 5.

      - SuppFig1D: Please add context in the legend to the meaning of the Yellow Stars and "O->U" labels. The latter would seem to be to indicate the Ovarian and Uterine sides of the image

      Response: In response to this comment, we revised the figure legend to clarify the annotations. The legend now states: “O, ovary side; U, uterus side. Asterisks indicate secretory cells that lack planar cell polarity.”

      - Pg6Line7: up to P23 or up to P35?

      Response: We appreciate the reviewer’s attention to this detail. The text has been revised for clarity as follows: “To examine the temporal dynamics of Sertoli cell polarity establishment, we analyzed seminiferous tubule morphology across the first wave of spermatogenesis, from postnatal day (P)10 to P35. To specifically assess the role of MAP7 in Sertoli cells while minimizing contributions from germ cells, our analysis focused on stages up to P23, before MAP7 expression becomes detectable in step 9–11 spermatids (Fig. 1), to exclude potential secondary effects resulting from MAP7 loss in germ cells.” (page 6, lines 5-10)

      - SuppFig4B: Does SuppFig4B reference back to Fig3B or Fig3C? If the latter please update this in the legend.

      - Pg7Line21-23: Is SuppFig3D,E meant to be referenced and not SuppFig5A,B?

      - Pg8Line22-25: Is SuppFig4A meant to be referenced and not SuppFig5?

      - Pg8Line34-Pg9Line: Is SuppFig4B meant to be referenced and not SuppFig5B?

      Response: We appreciate the reviewer’s careful reading. All mismatches in Supplemental figure references have been corrected, ensuring that each reference in the text now accurately corresponds to the appropriate data.

      - Pg9Line28-33: Would the authors be willing to rework this figure to include images that more closely match the reported findings? The current version does not strongly support the idea that MYH9 fails to localize to microtubule and F-actin domains in Map7 knockout P17 seminiferous tubules. This could also just be a matter of acquiring these images at a higher magnification or with a lower-end (150nm range) super-resolution system (SIM, Airyscan, LIGHTNING, SoRa etc)

      Response: Following the reviewer’s recommendation, we replaced the images in Figure 4F with higher-resolution confocal images to better visualize MYH9 localization relative to microtubules and F-actin in Map7+/- and Map7-/- testes. These revised images demonstrate that MYH9 specifically accumulates at apical regions where microtubules and F-actin intersect, but not at the basal ES-associated F-actin structures. To preserve spatial context, the original low-magnification images have been moved to Supplemental Figure 5. If additional resolution is required, we are prepared to acquire further images using an Airyscan or Stellaris 5 system.

      - SuppFig7A: The legend notes these are P23 samples but the image label says 8W. Please update this to whichever is the correct age.

      Response: We thank the reviewer for pointing out this discrepancy. The figure legend for Supplemental Figure 7A (now revised as Supplemental Figure 8A) has been corrected to indicate that the samples are from 8-week-old mice, consistent with the image label.

      - Pg16Line4-5: Please include in the text the vendor and catalog number for the C57BL/6 mice

      Response: The text now specifies: “C57BL/6NJcl mice were purchased from CLEA Japan (Tokyo, Japan)” (page 17, line 4). CLEA Japan does not assign catalog numbers to mouse strains.

      - Pg16Line18-19: Please include in the text the catalog number for the DMEM

      - Pg16Line19-20: Please include in the text the vendor and catalog number for the FBS

      - Pg16Line20: Please include in the text the vendor and catalog number for the Pen-Strep

      Response: We have added vendor and catalog information as follows: “Wild-type and MAP7-EGFPKI HeLa cells were cultured in Dulbecco’s Modified Eagle Medium (DMEM, 043-30085; Fujifilm Wako Pure Chemical, Osaka, Japan) supplemented with 10% fetal bovine serum (FBS, 35-015-CV; Corning, Corning, NY, USA) and penicillin–streptomycin (26253-84; Nacalai, Kyoto, Japan) at 37 °C in a humidified atmosphere containing 5% CO₂ 18.” (page 17, lines 18-22)

      - Pg17Line6-12: Thank you for including organized and detailed information about the primers, please also define the PCR protocol used including temperatures, timing, and cycles for Map7 knockout genotyping

      - Pg17Line20-27: Thank you for including organized and detailed information about the primers, please also define the PCR protocol used including temperatures, timing, and cycles for Map7-eGFP genotyping

      Response: The text has been updated to include the PCR conditions used for genotyping as follows: “Genotyping PCR was routinely performed as follows. Genomic DNA was prepared by incubating a small piece of the cut toe in 180 µL of 50 mM NaOH at 95 °C for 15 min, followed by neutralization with 20 µL of 1 M Tris-HCl (pH 8.0). After centrifugation for 20 min, 1 µL of the resulting DNA solution was used as the PCR template. Each reaction (8 µL total volume) contained 4 µL of Quick Taq HS DyeMix (DTM-101; Toyobo, Osaka, Japan) and a primer mix. PCR cycling conditions were as follows: 94 °C for 2 min; 35 cycles of 94 °C for 30 s, 65 °C for 30 s, and 72 °C for 1 min; followed by a final extension at 72 °C for 2 min and a hold at 4 °C. PCR products were analyzed using agarose gel electrophoresis. This protocol was also applied to other mouse lines and alleles generated in this study.” (page 18, lines 17–25)
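      As an illustration only, the cycling conditions quoted above can be written out as data and summed to give the programmed hold time. This is a minimal sketch, not the authors' software: the variable names and tuple layout are hypothetical, and ramp times between temperatures and the final 4 °C hold are ignored.

```python
# Thermocycler program transcribed from the genotyping protocol quoted above.
# Names and layout are illustrative assumptions, not part of the manuscript.
INITIAL_DENATURATION = (94, 120)               # (temperature in deg C, seconds)
CYCLE_STEPS = [(94, 30), (65, 30), (72, 60)]   # denature, anneal, extend
N_CYCLES = 35
FINAL_EXTENSION = (72, 120)


def programmed_seconds():
    """Total programmed hold time, excluding ramping and the 4 deg C hold."""
    per_cycle = sum(seconds for _, seconds in CYCLE_STEPS)
    return INITIAL_DENATURATION[1] + N_CYCLES * per_cycle + FINAL_EXTENSION[1]


if __name__ == "__main__":
    print(f"Programmed hold time: {programmed_seconds() / 60:.0f} min")
```

      Under these assumptions the program sums to 74 min of hold time; the actual run time depends on the instrument's ramp rate.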

      - Pg17Line30: Please include in the text the vendor and catalog number for the Laemmli sample buffer

      Response: We clarified that the buffer was prepared in-house.

      - Pg17Line32&SuppTable1: Thank you for including an organized and detailed table for the primary antibodies used, please also make either a similar table or expand the current table to include secondary antibody information

      - Pg17Line32: Please note in the text which primary antibodies and secondary antibodies from Supp Table 1

      Response: Supplementary Table 1 has been updated to include both primary and HRP-conjugated secondary antibodies. In the Immunoblotting section of the Materials and Methods, we specified the antibodies used: “The following primary antibodies were used: mouse anti-Actin (C4, 0869100-CF; MP Biomedicals, Irvine, CA, USA), mouse anti-Clathrin heavy chain (610500; BD Biosciences, Franklin Lakes, NJ, USA), rat anti-GFP (GF090R; Nacalai, 04404-84), rabbit anti-MAP7 (SAB1408648; Sigma-Aldrich, St. Louis, MO, USA), rabbit anti-MAP7 (C2C3, GTX120907; GeneTex, Irvine, CA, USA), and mouse anti-α-tubulin (DM1A, T6199; Sigma-Aldrich). Corresponding HRP-conjugated secondary antibodies were used for detection: goat anti-mouse IgG (12-349; Sigma-Aldrich), goat anti-rabbit IgG (12-348; Sigma-Aldrich), and goat anti-rat IgG (AP136P; Sigma-Aldrich). Detailed information for all primary and secondary antibodies is provided in Supplementary Table 1.” (page 19, lines 14-22)

      - Pg18Line2: Please include in the text the vendor and catalog number for the Bouin's

      Response: The text has been updated to indicate that Bouin’s solution was prepared in-house.

      - Pg18Line3: Please include in the text the catalog number for the CREST-coated glass slides

      - Pg18Line7: Please include in the text the catalog number for the OCT compound

      - Pg18Line11: Please include in the text the vendor and catalog number for the Donkey Serum

      - Pg18Line11: Please include in the text the vendor and catalog number for the Goat Serum

      Response: The text now includes vendor and catalog information for all these reagents, including CREST-coated slides (SCRE-01; Matsunami Glass, Osaka, Japan), OCT compound (4583; Sakura Finetechnical, Tokyo, Japan), donkey serum (017-000-121; Jackson ImmunoResearch Laboratories, PA, USA), and goat serum (005-000-121; Jackson ImmunoResearch Laboratories).

      - Pg18Line13: Thank you for including an organized and detailed table for the primary antibodies used, please also make either a similar table or expand the current table to include secondary antibody information

      Response: We thank the reviewer for the suggestion. Supplementary Table 1 already includes information for the antibodies used for immunoblotting, and we have now added information for the Alexa Fluor-conjugated secondary antibodies used for immunofluorescence in this study.

      - Pg18Line18: Please include in the text the vendor and catalog number for the DAPI

      Response: The text has been updated to include the vendor and catalog number for DAPI (D9542; Sigma-Aldrich).

      - Pg18Line19: Please also include information about the objectives used including catalog numbers, detectors used (PMT vs HyD)

      Response: We thank the reviewer for the suggestion. The following information has been added to the Histological analysis section in Materials and Methods: “Objectives used were HC PL APO 40×/1.30 OIL CS2 (11506428; Leica) and HC PL APO 63×/1.40 OIL CS2 (11506350; Leica), with digital zoom applied as needed for high-magnification imaging. DAPI was detected using PMT detectors, while Alexa Fluor 488, 594, and 647 signals were captured using HyD detectors. Images were acquired in sequential mode with detector settings adjusted to prevent signal bleed-through.” (page 20, lines 13-17)

      - Pg18Line23: Please cite in the text the reference paper for Fiji (Schindelin et al. 2012 Nature Methods PMID: 22743772) and note the version of Fiji used

      - Pg18Line24: Please note the version of Aivia used

      Response: We have revised the text accordingly by citing the reference paper for Fiji (Schindelin et al., 2012, Nature Methods, PMID: 22743772) and noting the version used (v.2.16/1.54p). In addition, we have added the version of Aivia used in this study (version 14.1).

      - Pg18Line25: If possible, please use a more robust and reliable system than Microsoft Excel to do statistics (Graphpad Prism, Stata, R, etc), if this is not possible please note the version of Microsoft Excel used

      Response: We appreciate the reviewer’s suggestion. For basic statistical analyses such as the Student’s t-test, we used Microsoft Excel (Microsoft Office LTSC Professional Plus 2021), which has been sufficient for these standard calculations. For more advanced analyses, including ANOVA and single-cell RNA-seq analyses, we used R. These details have now been added to the text.

      - Pg18Line25: Please cite in the text the reference paper for R (R Core Team 2021 R Foundation for Statistical Computing "R: A Language and Environment for Statistical Computing") and note the version of R used

      - Pg18Line25: Please note the specific R package with version used to do ANOVA, and cite in the text the reference for this package

      Response: We have cited the reference for R (R Core Team, 2021. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria) and noted the version used (version 4.4.0) in the text. In addition, regarding ANOVA, we have added the following description: “For ANOVA analysis, linear models were fitted using the base stats package (lm function), and analysis of variance was conducted with the anova function.” (page 20, lines 23-25)
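      For readers who want to see what the `lm` plus `anova` combination computes in the simplest single-factor case, the following Python sketch derives the same F statistic by hand. It is an assumption-laden illustration: the function name is hypothetical, the input values are made up, and the manuscript does not state which model terms were actually fitted.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way layout.

    `groups` is a list of lists of numeric observations, one list per level.
    This mirrors the F test reported for a single-factor linear model.
    """
    all_values = [v for g in groups for v in g]
    n_total = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n_total

    # Sum of squares explained by group membership.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Residual sum of squares within each group.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Made-up measurements for three hypothetical groups.
    toy = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
    print(one_way_anova_f(toy))
```

      The p-value would then come from the F distribution with the returned degrees of freedom, which R's `anova` reports directly.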

      - Pg18Line25: Please clarify, was a R package called "AVNOVA" used to do ANOVA or is this a typo?

      Response: We thank the reviewer for pointing this out. It was a typographical error — the correct term is “ANOVA”. The text has been corrected accordingly.

      - Pg18Line32: Please include in the text the catalog number for the EPON 812 Resin

      - Pg19Line3: Please include the version number for Stacker Neo

      - Pg19Line5: Please include the vendor and version number for Amira 2022

      - Pg19Line5: Please include the version number for Microscopy Image Browser

      - Pg19Line5: Please include the version number for MATLAB that was used to run Microscopy Image Browser

      Response: We added the catalog number for the EPON 812 resin and the vendor and version information for the software used. The following details have been included in the revised text:

      EPON 812 resin: TAAB Embedding Resin Kit with DMP-30 (T004; TAAB Laboratory and Microscopy, Berks, UK)

      Stacker Neo: version 3.5.3.0; JEOL

      Amira 2022: version 2022.1; Thermo Fisher Scientific

      Microscopy Image Browser: version 2.91

      Note that although Microscopy Image Browser is written in MATLAB, we used the standalone version that does not require a separate MATLAB installation.

      - Pg19Line: 9-10: Please include in the text the catalog number for the complete protease inhibitor

      - Pg19Line14: Please include in the text the catalog number for the Magnetic Agarose Beads

      - Pg19Line16: Please include in the text the catalog number for the GFP-Trap Magnetic Agarose Beads

      Response: We have added the catalog numbers for the complete protease inhibitor (4693116001), control magnetic agarose beads (bmab), and GFP-Trap magnetic agarose beads (gtma).

      - Pg19Line21: Please note in the text which primary antibodies and secondary antibodies from Supp Table 1

      - Pg19Line21-22: Please include in the text the catalog number for the ECL Prime

      Response: We thank the reviewer for the helpful suggestions. The description regarding immunoblotting (“Eluted samples were separated by SDS–PAGE, transferred to PVDF membranes…”) was reorganized: overlapping content has been removed, and the necessary information has been integrated into the “Immunoblotting” section, where details of the primary and secondary antibodies (listed in Supplementary Table 1) are already provided. In addition, the information for ECL Prime has been updated to “Amersham ECL Prime (RPN2236; Cytiva, Tokyo, Japan)”.

      - Pg20Line2: Please include the version number for Xcalibur

      Response: The version of Xcalibur used in this study (version 4.0.27.19) has been added to the text.

      - Pg20Line5: Please cite in the text the reference paper for SWISS-PROT (Bairoch and Apweiler 1999 Nucleic Acid Research PMID: 9847139)

      Response: The reference paper for SWISS-PROT (Bairoch and Apweiler, 1999, Nucleic Acids Research, PMID: 9847139) has been cited in the text.

      - Pg19Line26: Please include in the text the catalog number for the NuPAGE gels

      - Pg19Line28: Please include in the text the catalog number for the SimplyBlue SafeStain

      Response: Both catalog numbers have been added in the Mass spectrometry section as follows: 4–12% NuPAGE gels (NP0321PK2; Thermo Fisher Scientific) and SimplyBlue SafeStain (LC6060; Thermo Fisher Scientific).

      - Pg20Line26: Please include in the text the catalog number for the Chromium Single Cell 3' Reagent Kits v3

      Response: The catalog number for the Chromium Single Cell 3′ Reagent Kits v3 (PN-1000075; 10x Genomics) has been added to the text.

      - Pg21Line3: Please cite in the text the reference paper for R (R Core Team 2021 R Foundation for Statistical Computing "R: A Language and Environment for Statistical Computing")

      Response: The reference for R (R Core Team, 2021. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria) has already been cited in the “Histological analysis” section, where ANOVA analysis is described.

      - Pg21Line3 Please cite in the text the reference for RStudio (Posit team (2025). RStudio: Integrated Development Environment for R. Posit Software, PBC, Boston, MA. URL http://www.posit.co/.)

      Response: The reference for RStudio (Posit team, 2025. RStudio: Integrated Development Environment for R. Posit Software, PBC, Boston, MA, USA. URL: http://www.posit.co/) has been added to the text.

      - Pg21Line23: Please include the version number for Metascape

      Response: The version of Metascape used in this study (v3.5.20250701) has been added to the text.

      - SuppFig12: please update the legend to include a description after the title and update the figure labeling to correspond to the legend. Also, this figure is currently not referenced anywhere in the text.

      Response: We have updated the legend for Supplemental Figure 12 (now Supplemental Figure 13) to include a descriptive sentence after the title and have adjusted the figure labeling to match the legend. The revised legend now reads: “Full-scan images of the agarose gels shown in Supplemental Figs. 1B and 2C are displayed in the upper and lower left panels, respectively, while the corresponding full-scan images of the immunoblots shown in Supplemental Figs. 1C and 2D are presented in the upper and lower right panels, respectively.”

      As these images serve as source data, they are not referenced directly in the main text.

      _Referee cross-commenting_

      I generally agree with Reviewer 1 and specifically concur related to adding details about fertility assessment of the Map7 Knockout line, and enhancing the SEM imaging.

      Response: As noted in our response to Reviewer #1, we have re-acquired the SEM images in high-resolution mode, focusing on the relevant regions. The new high-resolution images have replaced the original panels in revised Figure 3C, providing clearer visualization of junctional structures at P10 and P21 in Map7+/- and Map7-/- testes. The original Figure 3C images have been moved to Supplemental Figure 4B for reference.

      Reviewer #2 (Significance):

      There are mouse lines, and datasets that will be useful resources to the field. This work also advances our understanding of a period in Sertoli cell development that is critical to fertility but very understudied.

      Response: We thank the reviewer for the positive comments and for recognizing the potential value of our mouse lines and datasets to the field, as well as the significance of our work in advancing the understanding of this critical but understudied period in Sertoli cell development.

    1. Note: This response was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity):

      Summary:

      The manuscript titled "Unravelling the Progression of the Zebrafish Primary Body Axis with Reconstructed Spatiotemporal Transcriptomics" presents a comprehensive analysis of the development of the primary body axis in zebrafish by integrating bulk RNA-seq, 3D images, and Stereo-Seq. The authors first clearly demonstrate the application of Palette for integrating RNA-seq and Stereo-Seq using published spatial transcriptomics data of Drosophila embryos. Subsequently, they produced serial bulk RNA-seq data for certain developmental stages of Danio rerio embryos and utilized published Stereo-Seq data. Through robust validation, the authors observe the molecular network involved in AP axis formation. While the authors show that integrating bulk RNA-seq data with Stereo-Seq improves spatial resolution, additional proof is required to demonstrate the extent of this improvement.

      Response: We thank the reviewer for the positive feedback on our Palette pipeline, the construction of zSTEP, and the analysis of primary body axis development. We appreciate the constructive suggestions, which we can implement to improve our manuscript. As the reviewer pointed out, some analysis procedures were not described in sufficient detail. To address this, we have added explanatory text and schematic diagrams to make the methods clearer. We also thank the reviewer for the meticulous reading and for reminding us to include parameters, references and essential text, which makes the manuscript more rigorous. Furthermore, as the reviewer noted, the extent of the improvement in spatial resolution was not clearly demonstrated in the manuscript. We have therefore provided an additional figure comparing the original expression on the stacked Stereo-seq slices and on the 3D live image with the expression from zSTEP; the results indicate that zSTEP provides better, more continuous expression patterns. Two tasks remain, which we expect to complete within the next month. We hope our responses have addressed the reviewer's concerns, and we are pleased to provide any additional proof as needed.

      Major Comments:

      1. Lines 66-68: Discuss the limitations of existing tools and explicitly state the advantages of using Palette.

      Response: We thank the reviewer for the valuable suggestion. We have added the following text after line 68 to emphasize the features and advantages of Palette.

      "Newly developed tools are committed to integrating bulk and/or scRNA-seq data with ST data to enhance spatial resolution, focusing on expression at the spot level. However, gene expression patterns are closely correlated to the biological functions and are more critical for understanding biological processes. Therefore, a tool focusing on inferring spatial gene expression patterns would be desirable."

      1. Body Pattern Genes Analysis: For both Drosophila and Danio rerio, it would be valuable to examine body pattern genes in Stereo-Seq and apply Palette to determine if the resolution of the segments improves or merges. The resolution of the A-P axis is convincing, but further evidence for other segments would be beneficial.

      Response: We thank the reviewer for the suggestions. For the Drosophila data, we only used two adjacent slices for Palette performance assessment, and thus were only able to evaluate the expression patterns within the slice.

      For the zebrafish data, although we have constructed zSTEP as a 3D transcriptomic atlas, we acknowledge that the left-right (LR) and dorsal-ventral (DV) patterning is not yet satisfactory. Here we show a section from the dorsal part of the 16 hpf zSTEP that displays a relatively well-defined left-right pattern (Fig. 2). Along the left-right axis, the notochord cells are centrally located, flanked by somite cells on either side, with the pronephros cells outermost.

      One reason for the limited LR and DV patterning is that the original annotation of the ST data does not clearly distinguish all the cell types. Another likely reason is that cell positions become disordered when the ST slices are stacked. Thus, zSTEP is most suitable for investigating AP patterns, while its performance on LR and DV patterns may not reach the same level of accuracy.

      See response letter for the figure.

      1. Figure 2d: Include the A-P line for which the intensity profile was plotted in the main figure, rather than just in the supplementary material. Additionally, consider simplifying the plot by not combining three lines into one, as it complicates the interpretation of observations.

      Response: We thank the reviewer for the helpful suggestions. We have updated Figure 2d and Figure S1b by adding an A-P line to each subfigure (Fig. 3). Additionally, as the reviewer suggested, we have separated the intensity plots so that each subfigure now includes a dedicated intensity plot along the A-P axis.

      See response letter for the figure.

      1. Drosophila Data Analysis: While the alignment and validation of Danio rerio sections are clearly explained, the analysis and validation of Drosophila data are insufficiently detailed. Provide a more thorough explanation of how the intensity profiles between BDGP in situ data and Stereo-Seq data are adjusted.

      Response: We thank the reviewer for raising this issue. To make the analysis procedure clearer, we have updated Figure 2a (Fig. 4) and added explanatory text to the figure legend describing how the Drosophila ST data were processed.

      See response letter for the figure.

      Additionally, the following sentences have been added into the Methods section to describe the generation of the intensity profiles.

      "The intensity plot profiles along AP axis were generated through the following steps: The expression pattern plot images or in situ hybridization images were imported into ImageJ and converted to grayscale. The colour was then inverted, and a line of a certain width (here set as 10) was drawn across from the anterior part to the posterior part (Fig. S1a). The signal intensities along the width of the line were measured and imported into R for generating intensity plots."
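      The steps in the quoted Methods text (grayscale, invert, wide line, profile) can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' ImageJ workflow: the function name is hypothetical, the image is assumed to be an 8-bit grayscale array given as a list of rows, the line is assumed to run horizontally from anterior (left) to posterior (right), and each profile value is taken as the mean across the line width, which is how a wide-line profile is commonly computed.

```python
def ap_intensity_profile(gray, center_row, line_width=10):
    """Inverted mean intensity at each position along a horizontal wide line.

    gray        -- 8-bit grayscale image as a list of rows (values 0..255)
    center_row  -- row index around which the band of `line_width` rows is taken
    line_width  -- band thickness in pixels (10 in the quoted Methods text)
    """
    height = len(gray)
    top = max(0, center_row - line_width // 2)
    bottom = min(height, top + line_width)
    band = gray[top:bottom]
    n_cols = len(gray[0])
    # Invert so that dark staining gives high values, then average across the band.
    return [sum(255 - row[x] for row in band) / len(band) for x in range(n_cols)]


if __name__ == "__main__":
    # Toy image: white background with one dark vertical stripe in column 2.
    toy = [[255, 255, 55, 255] for _ in range(12)]
    print(ap_intensity_profile(toy, center_row=6))
```

      The resulting list corresponds to the values that would be exported and plotted against position along the axis.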

      1. Figure 3d: Present a plot with the expected expression profiles of the three genes if the embryo is aligned as anticipated.

      Response: We thank the reviewer for this helpful suggestion, which improves the clarity of our manuscript. We have added the following subfigure as Figure 3d (Fig. 5) to show the expected expression profiles of the three midline genes along the left-right axis.

      See response letter for the figure.

      1. Analysis Without Palette: Between lines 277-438, the outcome of using Palette with bulk RNA-seq and Stereo-Seq is convincing. However, consider the following:

      o What would be the observations if the analysis were conducted solely with Stereo-Seq data, without incorporating bulk RNA-seq data and employing Palette?

      Response: We thank the reviewer for raising this important question. Here we compare ST expression on the stacked Stereo-seq slices, ST expression projected onto 3D live images, and the Palette-inferred expression (Fig. 6). The stacked ST slices do not fully reflect zebrafish morphology, and the gene expression is sparse, which makes the pattern look messy (first row). After projecting ST expression onto the live image, the expression patterns can be observed on the zebrafish morphology, but the expression is still sparsely distributed in spots (second row). In contrast, the expression patterns captured by Palette in zSTEP are more continuous (third row) and more similar to those observed in in situ hybridization images (fourth row). We are considering adding these analyses as a supplementary figure.

      See response letter for the figure.

      o This study uses only Stereo-Seq as the spatial transcriptomics reference. It would strengthen the argument to use at least one other spatial transcriptomics method, such as Visium or MERFISH, in conjunction with bulk RNA-seq and Palette, to demonstrate whether Palette consistently improves gene expression resolution.

      Response: We thank the reviewer for raising this insightful question. To demonstrate that Palette is broadly applicable, its performance needs to be tested with different types of ST references. We plan to perform additional analyses to evaluate Palette using Visium and MERFISH data as ST references, respectively. Note that the Palette pipeline uses only the overlapped genes for inference. As only hundreds of genes can be detected by MERFISH, Palette can infer the expression patterns of only these genes. As reported by Liu et al. (2023), MERFISH can independently resolve distinct cell types and spatial structures, and we therefore expect Palette to perform well with MERFISH as the ST reference. We have already started these analyses and expect to complete them within the next month. We will add them as separate tutorials in the GitHub repository.

      Reference:

      Liu, J. et al. Concordance of MERFISH spatial transcriptomics with bulk and single-cell RNA sequencing. Life Sci Alliance 6 (2023).

      1. PDAC Data Analysis: Provide a more detailed explanation of the PDAC data analysis and use appropriate colors in the tissue images to clearly distinguish cell types.

      Response: We thank the reviewer for the suggestions. We have updated the colours used in the tissue images to be consistent with the colours in the tissue clustering analysis. We have also added a subfigure to the supplementary figure (Fig. 7), with more explanatory text in the figure legend, to explain the analysis more thoroughly.

      See response letter for the figure.

      1. Comparison with Other Methods: State the limitations of not using STitch3D and Spateo for alignment and explain why these methods were not employed.

      Response: We thank the reviewer for this constructive comment. We fully agree that introducing published alignment algorithms would be helpful in our analysis. Currently, the slice alignment is adjusted manually, so the main limitation of not using these tools is that manual operation may introduce bias compared with alignment generated by a computational algorithm. STitch3D and Spateo were not included in this study for two reasons. First, these two tools were posted only recently, after our analyses were largely completed; we therefore only mentioned them in the Discussion section. Second, we do not want to embed too many external tools in our analysis, which would make the pipeline harder for researchers to operate. Specifically, STitch3D and Spateo are configured to run in a Python environment, while Palette is based on R packages. Moreover, our current manual alignment achieves the desired performance without these tools. Nevertheless, we value this suggestion and plan to compare the performance of manual alignment with that of the two alignment tools. At present, we have a preliminary comparison scheme and have collected the relevant datasets. We hope to complete this analysis within the next 1 to 2 weeks.

      Minor Comments:

      1. References: Add references to the statements in lines 51-53.

      Response: We thank the reviewer for reminding us of the missing references. We have added the works of Junker et al. (2014), Liu et al. (2022), Chen et al. (2022), Wang et al. (2022), Shi et al. (2023) and Satija et al. (2015) as references in line 53 as follows.

      "Thus, great efforts are ongoing to construct gene expression maps of these models with higher resolution, depth, and comprehensiveness [1-6]."

      References:

      1. Junker, J.P. et al. Genome-wide RNA Tomography in the zebrafish embryo. Cell 159, 662-675 (2014).
      2. Liu, C. et al. Spatiotemporal mapping of gene expression landscapes and developmental trajectories during zebrafish embryogenesis. Dev Cell 57, 1284-1298 e1285 (2022).
      3. Chen, A. et al. Spatiotemporal transcriptomic atlas of mouse organogenesis using DNA nanoball-patterned arrays. Cell 185, 1777-1792 e1721 (2022).
      4. Wang, M. et al. High-resolution 3D spatiotemporal transcriptomic maps of developing Drosophila embryos and larvae. Dev Cell 57, 1271-1283 e1274 (2022).
      5. Shi, H. et al. Spatial atlas of the mouse central nervous system at molecular resolution. Nature 622, 552-561 (2023).
      6. Satija, R. et al. Spatial reconstruction of single-cell gene expression data. Nat Biotechnol 33, 495-502 (2015).

      1. Scientific Name Consistency: Ensure consistency in using either "Danio rerio" or "zebrafish" throughout the manuscript.

      Response: We thank the reviewer for this suggestion. We have changed "Danio rerio" to "zebrafish" to make "zebrafish" consistent throughout the manuscript.

      1. Related References: Include the following relevant references:

      o https://academic.oup.com/bib/article/25/4/bbae316/7705532

      o https://www.life-science-alliance.org/content/6/1/e202201701

      Response: We thank the reviewer for bringing these two relevant works to our attention. Baul et al. (2024) presented STGAT, which leverages Graph Attention Networks to integrate spatial transcriptomics and bulk RNA-seq, and Liu et al. (2023) demonstrated the concordance of MERFISH ST with bulk and single-cell RNA-seq. Both are excellent studies and relevant to our work. We have added these two references in line 61 and line 68, respectively.

      References:

      Baul, S. et al. Integrating spatial transcriptomics and bulk RNA-seq: predicting gene expression with enhanced resolution through graph attention networks. Brief Bioinform 25 (2024).

      Liu, J. et al. Concordance of MERFISH spatial transcriptomics with bulk and single-cell RNA sequencing. Life Sci Alliance 6 (2023).

      1. Figure 1a: In the Venn diagram, include the number of genes in the bulk and Stereo-Seq datasets, as well as the number of overlapping genes.

      Response: We thank the reviewer for reminding us to include these important numbers. We have added the following sentences to the Methods section to provide the gene numbers (Fig. 8). Because the Venn diagram in Figure 1a is a schematic representation, we did not include the gene numbers there, as these vary depending on the actual data.

      "Palette was performed on the aligned slices using the overlapped genes. For the 10 hpf embryo, there were 24,658 genes in the bulk data, 18,698 genes in the Stereo-seq data, and 16,601 overlapped genes. For the 12 hpf embryo, there were 23,018 genes in the bulk data, 18,948 genes in the Stereo-seq data, and 16,401 overlapped genes. For the 16 hpf embryo, there were 24,357 genes in the bulk data, 23,110 genes in the Stereo-seq data, and 19,539 overlapped genes."
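      The idea of restricting inference to overlapped genes amounts to a set intersection of the two gene lists. The sketch below illustrates this and also computes, from the counts quoted above, what fraction of Stereo-seq genes is retained at each stage. It is a hypothetical illustration: the function names and the toy gene lists are assumptions, and this is not Palette's actual code.

```python
def overlapped_genes(bulk_genes, st_genes):
    """Genes present in both the bulk and the spatial dataset, sorted by name."""
    return sorted(set(bulk_genes) & set(st_genes))


def retained_fraction(n_overlap, n_st):
    """Fraction of spatial-dataset genes that survive the overlap filter."""
    return n_overlap / n_st


# Counts quoted in the Methods text: (bulk, Stereo-seq, overlapped).
REPORTED_COUNTS = {
    "10 hpf": (24658, 18698, 16601),
    "12 hpf": (23018, 18948, 16401),
    "16 hpf": (24357, 23110, 19539),
}

if __name__ == "__main__":
    # Toy gene lists, chosen arbitrarily for illustration.
    print(overlapped_genes(["tbxta", "sox2", "myod1"], ["sox2", "tbxta", "gata1a"]))
    for stage, (n_bulk, n_st, n_overlap) in REPORTED_COUNTS.items():
        print(stage, f"{retained_fraction(n_overlap, n_st):.1%} of Stereo-seq genes retained")
```

      In each case the overlap is necessarily no larger than the smaller of the two gene lists, which gives a quick consistency check on the reported numbers.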

      See response letter for the figure.

      1. Figure 1 Improvement: Enlarge Figure 1 and reduce repetitive elements, such as parts of the deconvolution and Figure 1b.

      Response: We thank the reviewer for the helpful suggestion. We agree with the reviewer that the deconvolution sections appear repetitive. We have updated Figure 1 (Fig. 9) by replacing these repetitive elements with a clearer and simpler diagram.

      See response letter for the figure.

      1. Figure 3f: Explain the black discontinuous line in the plot.

      Response: We thank the reviewer for the reminder, and we apologize for the missing explanation. We have added an explanation of the black discontinuous line to the legend of Figure 3 (Fig. 10).

      See response letter for the figure.

      1. Line 610: State the percentage of unpaired imaging spots.

      Response: We thank the reviewer for the reminder. We apologize for not including the numbers of paired and unpaired spots. We have added the number of paired spots, with its percentage of the total spots, to the Methods section as follows.

      "The numbers of mapped spots for the 10 hpf, 12 hpf and 16 hpf embryos are 15,379 (69.4% of the total spots), 14,697 (70.5% of the total spots) and 21,605 (77.2% of the total spots), respectively."

      1. Lines 616-618: Specify the unit for the spot diameter.

      Response: We thank the reviewer for the reminder. Again, we apologize for not including the spot diameter in the previous version of the manuscript. We have added the spot diameter to the Methods section as follows.

      "In the Stereo-seq data, each spot contained 15 × 15 DNA nanoball (DNB) spots (The diameter of each spot is near 10 μm)."

      Reviewer #1 (Significance):

      This algorithm will be useful not only for the field of developmental biology but also for wider applications in spatial omics. Although I have expertise in spatial omics technology development, my understanding of computational biology is limited, which restricts my ability to fully evaluate the Palette algorithm presented in this paper.

      Response: We thank the reviewer for recognizing our work, and we greatly appreciate the constructive suggestions. Although the reviewer acknowledged limited expertise in computational biology, the comments are highly professional and valuable. Following the reviewer's suggestions, we have not only included more explanatory text and figures to make the analysis procedures clearer, but also supplied the important parameters that were missing from our previous manuscript. We have also provided an extra figure to demonstrate the improvements of zSTEP on gene expression patterns. We believe that our work is now more rigorous and easier to understand, and we will continue working to resolve the remaining issues as planned. We thank the reviewer again.

      Reviewer #2 (Evidence, reproducibility and clarity):

      The authors of the study introduce the Palette method, a novel approach designed to infer spatial gene expression patterns from bulk RNA-sequencing (RNA-seq) data. This method is complemented by the development of the DreSTEP 3D spatial gene expression atlas of zebrafish embryos, establishing a comprehensive resource for visualizing gene expression and investigating spatial cell-cell interactions in developmental biology.

      Response: We sincerely appreciate the reviewer's positive feedback on our Palette pipeline and the zSTEP 3D spatial expression atlas of zebrafish embryos. We also thank the reviewer for the professional comments and constructive suggestions. The reviewer raised concerns about algorithm design and computational biology that we did not address well in our previous manuscript. We agree that we did not clarify the selection criteria for the parameters in detail, and we are now working on additional analyses to address this issue.

      We also agree with the reviewer that we did not provide enough discussion of the strategies used in the pipeline, the features of Palette, and the application scenarios of Palette and zSTEP. Stating these aspects clearly is important for the wide use of our tools. In this revised version, we have added more paragraphs to the Discussion section to address this issue. Additionally, we acknowledge that we did not adequately demonstrate the computational efficiency and computational requirements, which are important for researchers. We are working on additional analyses to address this issue as well.

      Finally, we thank the reviewer again for the professional and constructive suggestions. These suggestions are addressable, and by following them, we believe our manuscript will see a significant improvement, especially in the Palette pipeline part, making the pipeline more rigorous and easier to access. We are confident that we can complete the planned additional tasks within the next 1-2 months.

      1. The efficacy of the Palette method may be compromised by its dependency on the quality of the reference spatial transcriptomics data. As highlighted in the study, variations in data quality can lead to significant challenges in reconstructing accurate spatial expression patterns from bulk data. This underscores the necessity of evaluating quality parameters, such as the number of gene detections and spatial resolution, to ensure reliable outcomes. Additional studies should rigorously assess how these quality factors influence the accuracy and efficiency of the algorithm in various data contexts, particularly under diverse conditions of gene detection.

      Response: We thank the reviewer for this valuable suggestion. We agree that the quality of the reference ST data may greatly influence the performance and efficacy of Palette, and we have added paragraphs to the Discussion section that further discuss the impact of ST data quality on Palette performance. As the reviewer mentioned, gene detection and spatial resolution are two important parameters that can influence Palette performance. Low gene detection may impair the clustering process, so that the cell types of spots are not well distinguished. To evaluate the performance of Palette when the ST data show low gene detection, we plan to apply Palette using MERFISH data, which capture only hundreds of genes, as the ST reference. Moreover, we will investigate the impact of spatial resolution on Palette performance by merging ST spots to simulate lower-resolution scenarios, as well as the impact of gene detection by randomly reducing the number of detected genes. By comparing the expression patterns inferred from ST data of different spatial resolutions or different numbers of detected genes, we can better assess the performance of Palette and provide guidance to researchers on the ST data requirements for optimal performance. Owing to the limited response time, these analyses will take another month to complete after this round of revision.
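
      Illustrative sketch (not part of the Palette implementation): the two planned perturbations, merging ST spots to simulate lower resolution and randomly reducing the detected genes, could look roughly like the following. The data layout (a dict from integer spot coordinates to per-gene counts) and the function names `merge_spots` and `drop_genes` are assumptions made for this example only.

```python
import random
from collections import defaultdict


def merge_spots(spots, bin_size):
    """Sum the counts of all spots that fall into the same square bin.

    `spots` maps (x, y) integer coordinates to {gene: count} dicts
    (a hypothetical layout chosen for this sketch).
    """
    merged = defaultdict(lambda: defaultdict(int))
    for (x, y), counts in spots.items():
        key = (x // bin_size, y // bin_size)  # index of the coarser bin
        for gene, n in counts.items():
            merged[key][gene] += n
    return {key: dict(counts) for key, counts in merged.items()}


def drop_genes(spots, keep_fraction, seed=0):
    """Keep a random subset of genes to mimic lower gene detection."""
    genes = sorted({g for counts in spots.values() for g in counts})
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    n_keep = max(1, round(len(genes) * keep_fraction))
    kept = set(rng.sample(genes, n_keep))
    return {
        xy: {g: n for g, n in counts.items() if g in kept}
        for xy, counts in spots.items()
    }
```

      Running the inference on the outputs of `merge_spots` for several bin sizes, or of `drop_genes` for several fractions, would give a series of degraded references to compare against the original.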

      1. The methodology raises pertinent questions regarding how the clustering results from different algorithms may affect the reconstructions by the Palette method. The authors would better provide a detailed discussion/comparison of clustering processes that optimize the reconstruction of spatial patterns, ensuring precision in the downstream analyses.

      Response: We thank the reviewer for the constructive comments. We agree that differences in clustering results would affect Palette's inference. In our Palette pipeline, rather than developing a new clustering methodology, we employ BayesSpace for spot clustering, which considers both the transcriptional similarity of spots and their neighbouring structure. Researchers may therefore adjust the parameters of the BayesSpace package to achieve optimal clustering results. In most cases, however, spot identities are obtained through UMAP analysis, which considers only transcriptional differences and ignores spatial information. This kind of clustering strategy can lead to an intricate arrangement of spots belonging to different clusters and may result in sparse gene expression in the Palette outcome, which differs from the patterns in bona fide tissues. A suitable clustering strategy will therefore help capture the local patterns.

      Moreover, our Palette pipeline can also use clustering results derived from tissue histomorphology. Using tissue histomorphology for clustering would be a good choice, as it is closer to the real case. The following figure (Fig. 11) displays the Palette performance on PDAC datasets using both the spatial clustering and the histomorphology clustering strategies. The result using histomorphology clustering captures the weak pattern (indicated by the red circle) that was missed when using spatial clustering (Fig. 11d).

      See response letter for the figure.

      1. The choice to utilize only highly expressed genes in the initial stages of the Palette algorithm also warrants further exploration. Addressing the criteria for determining which genes qualify as "highly expressed" and outlining robust cutoff will enhance the algorithm's rigor and applicability. Similarly, in the iterative estimation of gene expression across spatial spots, establishing optimal iteration conditions is crucial. Implementing a loss function may offer a systematic method for concluding iterations, thus refining computational efficiency.

      Response: We thank the reviewer for the professional suggestions. As the reviewer pointed out, the selection of highly expressed genes and the number of iterations are two important parameters in our pipeline. Both the definition and the number of highly expressed genes are important for achieving satisfactory clustering performance. We tested the impact of different numbers of highly expressed genes on clustering performance in our preliminary analyses, but we did not summarize these tests or specify the parameters. We therefore plan to include a supplementary figure showing the clustering performance under different definitions and different numbers of highly expressed genes. For the iteration conditions, we tested different iteration numbers to find one at which the expression in each spot becomes stable. The following figure (Fig. 1) shows the results of running Palette with different numbers of iterations. We randomly selected 20 cells and compared their expression across tests with varying numbers of iterations. The results indicate that, for an ST dataset with 819 spots, the expression in each spot becomes nearly stable after 5000 iterations. We previously did not consider computational efficiency, and the reviewer's suggestion to implement a loss function to determine the optimal number of iterations is valuable. We greatly appreciate this suggestion and plan to apply a loss function to determine the optimal number of iterations for ST datasets of different sizes. This will guide researchers in selecting the number of iterations and improve computational efficiency.

      See response letter for the figure.
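
      Illustrative sketch (not part of the Palette implementation): a loss-based stopping rule of the kind the reviewer suggests can be expressed as "stop when the mean absolute change between successive estimates falls below a tolerance". The update step below, `toy_update`, is a hypothetical stand-in and does not reproduce Palette's actual iteration; the tolerance and iteration cap are likewise assumptions.

```python
def iterate_until_stable(update, values, tol=1e-4, max_iter=5000):
    """Apply `update` repeatedly until the estimates stop changing.

    The loss is the mean absolute difference between successive
    estimates. Returns the final estimates and the number of
    iterations that were run.
    """
    for i in range(1, max_iter + 1):
        new_values = update(values)
        loss = sum(abs(a - b) for a, b in zip(new_values, values)) / len(values)
        values = new_values
        if loss < tol:
            return values, i  # converged before reaching the cap
    return values, max_iter


def toy_update(values):
    """Hypothetical update: move every value halfway towards the mean."""
    mean = sum(values) / len(values)
    return [v + 0.5 * (mean - v) for v in values]
```

      With such a rule, the number of iterations adapts to the dataset instead of being fixed in advance, and the iteration cap only acts as a safeguard.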

      1. Performance metrics relating to processing speed and computational demands remain inadequately addressed in the current framework. Understanding how the Palette method scales across varying gene counts and bulk RNA-seq datasets will be essential for potential applications in larger biological contexts. Notably, the quantitative demands of analyzing 20,000 genes when processing 10, 100, or 1,000 bulk RNA profiles must be articulated to guide researchers in planning accordingly.

      Response: We thank the reviewer for this valuable and professional suggestion. In our previous analyses, we did not consider computational efficiency, processing speed or computational demands, which are important information for potential users. To address this issue, we will first list our computer configuration. Under this configuration, we plan to run Palette on datasets with different numbers of overlapped genes or on ST references with varying spot numbers, and then summarize the running times in a table. This will help researchers estimate the running time for their datasets and plan their analyses. We will begin the analyses soon and expect to complete them within the next 1 to 2 months.
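
      Illustrative sketch (not part of the Palette implementation): a running-time table over gene and spot counts could be collected with a small timing harness such as the one below. The workload `toy_run` is a hypothetical placeholder for an actual inference run, and the names `benchmark` and `toy_run` are assumptions made for this example.

```python
import time


def benchmark(run, gene_counts, spot_counts, repeats=3):
    """Time `run(n_genes, n_spots)` for every combination of sizes.

    Returns one row per combination with the fastest of `repeats` runs,
    which is less sensitive to background load than the mean.
    """
    rows = []
    for n_genes in gene_counts:
        for n_spots in spot_counts:
            best = float("inf")
            for _ in range(repeats):
                start = time.perf_counter()
                run(n_genes, n_spots)
                best = min(best, time.perf_counter() - start)
            rows.append({"genes": n_genes, "spots": n_spots, "seconds": best})
    return rows


def toy_run(n_genes, n_spots):
    """Hypothetical workload whose cost grows with genes x spots."""
    total = 0
    for g in range(n_genes):
        for s in range(n_spots):
            total += g * s
    return total
```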

      Minor opinions:

      1. Despite the promising advances offered by the zebrafish 3D reconstruction, there is a lack of details regarding numbers of the spatial transcriptomics (ST) data utilized, and the number of bulk RNA-seq data employed in the analyses. These parameters need to be clarified.

      Response: We thank the reviewer for reminding us of these parameters. We apologize for not including them in our previous manuscript. We have now included the numbers of bulk, ST and overlapped genes in the Methods section as follows (Fig. 12).

      "Palette was performed on the aligned slices using the overlapped genes. For the 10 hpf embryo, there were 24,658 genes in the bulk data, 18,698 genes in the Stereo-seq data, and 16,601 overlapped genes. For the 12 hpf embryo, there were 23,018 genes in the bulk data, 18,948 genes in the Stereo-seq data, and 16,401 overlapped genes. For the 16 hpf embryo, there were 24,357 genes in the bulk data, 23,110 genes in the Stereo-seq data, and 19,539 overlapped genes."

      See response letter for the figure.

      1. Issues regarding spatial cell-cell communication, especially concerning interactions over longer distances, necessitate careful consideration. Introducing spatial distance constraints could help formulate more realistic models of cellular interactions, a vital aspect of embryonic development.

      Response: We thank the reviewer for this essential comment. We agree that spatial distance is an essential factor when investigating in vivo cell-cell communication during embryonic development. In our analyses, we therefore employed CellChat for spatial cell-cell communication analysis; it can infer and visualize spatial cell-cell communication networks for ST datasets, using spatial distance as a constraint on the computed communication probability. However, as the reviewer mentioned, we did observe interactions between cell types over longer distances, and we investigated how these arose. As an example, we show the FGF interaction between tail bud and neural crest cells from our spatial cell-cell analysis, where the distance between the two cell types appears large (Fig. 13). We labelled tail bud cells and neural crest cells on the selected midline section and observed that, although most neural crest cells are distributed anteriorly, a small number are located in the tail, close to the tail bud cells. The observed interaction between tail bud and neural crest cells is therefore likely due to their adjacent distribution in the tail region, whereas the anterior placement of the neural crest spot in the spatial cell-cell communication plot reflects the anterior position of most neural crest cells. As a result, the distances shown in the spatial cell-cell communication plot are not the real distances between the two cell types.

      In most cases in our spatial cell-cell communication analyses, the observed interactions over longer distances are likely influenced by this visualization strategy. Additionally, pre-processing the dataset may enhance the performance of the analyses. Here we performed systematic analyses of the entire embryo, which can make the interactions between cell types appear massive. To investigate specific biological questions, researchers can subset cell types of interest or categorize them into different subtypes based on their positions.

      See response letter for the figure.

      1. Evaluation metrics such as the Adjusted Rand Index (ARI) and Root Mean Square Error (RMSE) represent critical tools for systematically measuring the similarity of inferred spatial patterns, yet their specific application within this context should be elaborated.

      Response: We thank the reviewer for recommending these two metrics. We have applied them to evaluate the similarity between the expression patterns (Fig. 14). Including these statistical values makes our comparisons of expression patterns more quantitative and convincing. We have also added the following text to the Methods section to describe how the two values are calculated.

      "The Adjusted Rand Index (ARI) and Root Mean Square Error (RMSE) were used to evaluate the similarity of the expression patterns. The expression patterns of in situ hybridization images were considered as the expected values, and the expression patterns of ST data and inferred expression patterns were compared to the expected values. Common positions along the AP axis within all three expression profiles were used, and the RMSE were calculated based on the scaled intensity of these positions. Values greater than the threshold were set to 1; otherwise, they were set to 0, and the ARI was then calculated based on the intensity category. Higher ARI and lower RMSE indicate greater similarity."

      See response letter for the figure.
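
      Illustrative sketch (not the code used in the manuscript): the RMSE on scaled intensities and the ARI on thresholded intensities described in the quoted Methods text can be computed as below. The threshold value is not given in the text, so the `threshold` argument is an assumption, and the function names are chosen for this example only.

```python
import math
from collections import Counter


def rmse(expected, observed):
    """Root mean square error between two equally long profiles."""
    squared = [(e - o) ** 2 for e, o in zip(expected, observed)]
    return math.sqrt(sum(squared) / len(squared))


def binarize(values, threshold):
    """Set values above the threshold to 1 and all others to 0."""
    return [1 if v > threshold else 0 for v in values]


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index computed from the pair-counting formula."""
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs to compare
    sum_ij = sum(math.comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(math.comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(math.comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / math.comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # both labelings are trivial (one cluster or all singletons)
    return (sum_ij - expected) / (max_index - expected)
```

      For a pair of profiles sampled at the same positions along the AP axis, one would call `rmse(ish, inferred)` on the scaled intensities and `adjusted_rand_index(binarize(ish, t), binarize(inferred, t))` for a chosen threshold `t`.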

      1. The study's limitations surrounding ST data quality cannot be overstated. Discussing scenarios where only limited or poor-quality ST data are available will be crucial for guiding future studies. Furthermore, a clear explanation of how enhanced specificity and accuracy translate into tangible biological insights is essential for demystifying the underlying mechanisms driving developmental processes.

      Response: We thank the reviewer for raising this essential suggestion. We have realized that in our previous manuscript, our discussion on the advantages and limitations of Palette and zSTEP was neither broad nor detailed enough.

      Therefore, in our revised manuscript, we have added the following paragraphs to further discuss the advantages and limitations of Palette and zSTEP, as well as the potential application of zSTEP in developmental biology.

      In this section, we have again emphasized the impact of ST data quality on the performance of Palette and zSTEP, and then compared Palette with the strategy that uses well-established marker genes to infer spatial information. We demonstrated that, although Palette cannot achieve single-cell resolution, it captures the major expression patterns, which are closely correlated with biological functions and critical for embryonic development. We also discussed that zSTEP is not only a valuable tool for investigating gene expression patterns but also has potential for evaluating the reaction-diffusion model to investigate the complicated and well-choreographed pattern formation during embryonic development.

      As here we have provided a more comprehensive discussion about Palette and zSTEP, we think that the potential researchers will better understand the application scenarios of our inference pipeline and our datasets. We hope our study can assist and inspire further research in the field of spatial transcriptomics and developmental biology.

      "Thirdly, the performance of Palette and zSTEP heavily relied on the quality of ST data. If the quality of ST data is not of sufficient quality, the low-expression genes may not be detected or only appear in very few scattered spots, and the performance of spot clustering could also be affected. Moreover, in this study, for example, the Stereo-seq data of 12 hpf zebrafish embryo had fewer slices on the right side (Fig. S3b), resulting in more blank spots in the right part of zSTEP for the 12 hpf embryo. However, with the ongoing advancements in spatial resolution and data quality, the performance of Palette is expected to be enhanced and demonstrate even greater potential for analysing spatiotemporal gene expression.

      On the other hand, compared to the brilliant strategy that infers spatial information of scRNA-seq data from well-established genes, our Palette pipeline cannot achieve single cell resolution. However, our Palette pipeline is based on the ST reference, and thus preserves the real positional relationships between spots. Furthermore, the focus of our pipeline is to infer the gene expression patterns, which are closely correlated to biological functions and critical for embryonic development, rather than the sparse expression within individual spots. In this regard, our Palette pipeline can be advantageous, as it allows for reconstruction of the major expression profiles, which are often more relevant for understanding developmental processes. Additionally, our Palette can be applied to serial sections, enabling the construction of 3D ST atlas.

      Finally, while the current analyses demonstrated that zSTEP can serve as a valuable tool for identifying genes having specific patterns at certain developmental stages, the exploration of zSTEP is still limited. During animal development, pattern formation is always one of the most important developmental issues. As demonstrated by the reaction-diffusion (RD) model, morphogen molecules are produced at specific regions of the embryo, forming morphogen gradients to guide cell specification, while interactions between different morphogens instruct more complicated and well-choreographed pattern formation. Our Palette constructed zSTEP, as a comprehensive transcriptomic expression pattern during development, could be leveraged to evaluate and prove the RD model during development, including AP patterning. Moreover, the investigation of gene expression patterns should not be limited to morphogens and TFs, and further investigation of their roles in AP patterning is desirable. Additionally, here a random forest model may be sufficient for investigating the most essential morphogens and TFs for AP axis refinement, while more sophisticated machine learning models may be required for addressing more specific biological questions."

      Reviewer #2 (Significance):

      The Palette pipeline demonstrates a marked improvement in specificity and accuracy when predicting spatial gene expression patterns. Evaluative studies on Drosophila and zebrafish datasets affirm its enhanced performance compared to existing methodologies. By effectively reconstructing spatial information from bulk transcriptomic data, the Palette method innovatively merges the philosophy of leveraging single-cell transcriptomic data for deconvolution analyses. This integration is pivotal, advancing traditional bulk RNA-seq approaches while laying the groundwork for future research.

      One of the notable achievements in this work is the construction of the DreSTEP atlas, which integrates serial bulk RNA-seq data with advanced 3D imaging techniques. This resource grants researchers unprecedented access to the visualization of gene expression patterns across the zebrafish embryo, facilitating the investigation of spatial relationships and cell-cell interactions critical for developmental processes. Such capabilities are invaluable for understanding the intricate dynamics of embryogenesis and the distinct roles of individual cell types.

      Response: We thank the reviewer for the positive evaluation of both the Palette pipeline and zSTEP. The reviewer has strong expertise in algorithm development and computational biology, and the concerns and suggestions raised are highly valuable to us. We do not have extensive experience in bioinformatics tool development, and thus we did not thoroughly address the selection criteria or clarify the parameters used in the pipeline, which may hinder its use by other researchers. We therefore sincerely appreciate the reviewer's professional suggestions, which we can follow to address these issues, improve our manuscript and make our work more impactful. Additionally, we did not consider computational efficiency, processing speed or computational demands, which are important factors for other researchers using Palette. We will add extra analyses to address these aspects.

      Currently, based on the suggestions from the reviewer, we have added extra text discussing the clustering strategy in the Palette pipeline, the advantages and limitations of Palette, and the potential application of zSTEP in developmental biology. We believe that readers will now have a clearer understanding of the performance of Palette and the application scenarios of both Palette and zSTEP. We have not yet fully addressed the comments raised by the reviewer, but we are working on the planned additional analyses and expect to complete them within the next 1-2 months. We sincerely thank the reviewer for the professional and valuable suggestions, which have improved our work and will make it accessible to a wide range of researchers.

      Finally, through this review process, we have learned a lot about the important considerations and requirements when designing bioinformatics tools, and we benefit a lot from the thoughtful guidance. We express our thanks to the reviewer again for the guidance, and we will try our best to address the remaining issues to further improve our manuscript.

      Reviewer #3 (Evidence, reproducibility and clarity):

      Evidence, reproducibility and clarity

      In this study, Dong and colleagues developed a computational pipeline to use spatial transcriptomics (ST) datasets as a reference to infer the spatial patterns of gene expression from bulk RNA sequencing data. This approach aims to overcome the low read depth and limited gene detection capabilities in current ST datasets, while exploiting its ability to provide highly resolved spatial information. By combining bulk RNA-seq datasets from 3 developmental stages during early zebrafish development with previously available ST and imaging datasets, the authors build DreSTEP (Danio rerio spatiotemporal expression profiles). Using this approach, they go on to identify the morphogens and transcription factors involved in anteroposterior patterning.

      The paper is well written, and the pipeline presented in this study is likely to be useful beyond the case studies included in this study. There are a few questions that, in my view, would be important to clarify to increase the impact of this work:

      Response: We sincerely appreciate the positive feedback from the reviewer on the Palette pipeline and zebrafish spatiotemporal expression profiles zSTEP. We thank the reviewer for the constructive suggestions, which have inspired us to think deeply about application and advantages of Palette and zSTEP for future studies.

      We fully agree with the reviewer that we did not sufficiently clarify the advantages and limitations of our inference pipeline in the original manuscript. The questions raised by the reviewer are very insightful. For example, while the inferred expression patterns may closely resemble the in situ hybridization observations, which we consider good performance, the reviewer pointed out that we should consider whether weak, yet real, expression may have been removed. These questions have motivated us to think more deeply about the underlying principles and assumptions of our inference pipeline. Following the reviewer's questions, we have expanded our discussion of the application of zSTEP in developmental biology and of the features of Palette compared to existing strategies.

      We believe that, after incorporating the revisions, our manuscript now presents the application scenarios of Palette more clearly and suggests how zSTEP can be applied to investigate questions in developmental biology. We are grateful for the reviewer's guidance, which helps us increase the impact of our work.

      1. The authors mention that they used a variable factor to adjust expression differences between the ST and bulk RNA-seq datasets. It would be important for the authors to comment on how much overlap in gene expression is necessary between the datasets for an accurate calculation of this variable factor? Can this be directly tested, for instance, by testing how their conclusions vary if expression is adjusted by a variable factor calculated from only a smaller set of genes?

      Response: We thank the reviewer for the professional questions. We apologize for not including the gene numbers in our previous manuscript. We have now provided the numbers of genes in the bulk and ST data and the numbers of overlapped genes (Fig. 15).

      "Palette was performed on the aligned slices using the overlapped genes. For the 10 hpf embryo, there were 24,658 genes in the bulk data, 18,698 genes in the Stereo-seq data, and 16,601 overlapped genes. For the 12 hpf embryo, there were 23,018 genes in the bulk data, 18,948 genes in the Stereo-seq data, and 16,401 overlapped genes. For the 16 hpf embryo, there were 24,357 genes in the bulk data, 23,110 genes in the Stereo-seq data, and 19,539 overlapped genes."

      See response letter for the figure.

      For Palette implementation, we took all the overlapped genes. To calculate the variable factor, we aggregated the expression of each gene in the ST data, and then used the expression of the bulk data to divide the aggregated expression for variable factor calculation. As a result, each overlapped gene was assigned a variable factor to adjust its expression, based on its difference between bulk and ST data. The rationale behind this approach is that by considering the ST data as a whole, we can effectively reduce the variations among individual spots. This allows the variable factors to provide reasonable adjustment to gene expression.

      In summary, the variable factors can be calculated directly. Currently, Palette can only infer the expression patterns of overlapped genes. This means that when the number of overlapped genes is small, as with MERFISH, which detects only hundreds of genes, Palette can only infer the expression patterns of those genes. However, if the MERFISH data are of good quality and can resolve distinct cell types, we believe Palette will also perform well with MERFISH as the ST reference. Additionally, we plan to run Palette using MERFISH as the ST reference to further demonstrate its broad applicability across different ST references.
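
      Illustrative sketch (not the Palette source code): the per-gene variable factor described above can be written as a ratio between the bulk expression of a gene and its expression aggregated over all ST spots. The direction of the ratio (bulk divided by aggregated ST) is an assumption here, as are the data layout and the name `variable_factors`.

```python
def variable_factors(bulk, st_spots):
    """Compute one adjustment factor per overlapped gene.

    `bulk` maps gene -> bulk expression; `st_spots` maps spot id ->
    {gene: expression}. Both layouts are hypothetical. The factor is
    assumed to be bulk / (sum over spots), so genes with no ST signal
    are skipped to avoid dividing by zero.
    """
    st_genes = {g for counts in st_spots.values() for g in counts}
    factors = {}
    for gene in set(bulk) & st_genes:  # overlapped genes only
        aggregated = sum(counts.get(gene, 0) for counts in st_spots.values())
        if aggregated > 0:
            factors[gene] = bulk[gene] / aggregated
    return factors
```

      Because the ST signal is summed over the whole slice before the ratio is taken, spot-to-spot noise largely averages out, which is the rationale given in the response.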

      1. Palette gives rise to highly spatially precise patterns, which closely match those found in ISH. However, the smoothening of the expression can also remove weak, yet real, local expression patterns, as shown for idgf6 in Fig. 2a. Can the authors test this more extensively for other genes?

      Response: We thank the reviewer for this essential question. We agree that weak, yet real, expression might be removed by our Palette inference pipeline. Weak, sparse expression may be due to the ST technique itself or to variation between samples. However, such sparse gene expression may not have biological meaning, and the focus of our pipeline is to capture the expression patterns that are closely correlated with function and crucial for embryonic development. Our algorithm therefore considers spot characteristics and emphasizes cluster-specific expression, resulting in spatially specific expression patterns. In most cases, the main gene expression patterns can be captured, which helps in understanding gene functions and roles in embryonic development. We have updated Supplementary Figure S1a (Fig. 16) to include more gene patterns to demonstrate this point.

      See response letter for the figure.

      3. Using adjacent slices for ST and "bulk RNA-seq" may provide better results than those obtained when comparing two independent datasets. Could the authors also extend the analysis of Palette's functionalities by using separate, previously available but independent datasets, for ST and bulk RNA-seq in Drosophila as well?

      Response: We thank the reviewer for the valuable question. We agree with the reviewer that using adjacent slices may provide better results. The idea here is that the spatial expression patterns inferred from pseudo-bulk RNA-seq can be compared with the real ST expression to evaluate Palette's performance. We have updated Figure 2a (Fig. 17) to illustrate the analysis more clearly.

      See response letter for the figure.

      To demonstrate Palette's functionalities, we have used Palette to infer a zebrafish bulk RNA-seq slice (Junker et al., 2014) using a Stereo-seq slice (Liu et al., 2022) as the ST reference, and these two datasets are separate and independent. We agree with the reviewer that it would be good to test separate datasets in Drosophila to further demonstrate Palette's functionalities. Unfortunately, we did not find Drosophila serial bulk RNA-seq data along the left-right axis for the corresponding stages, and thus we may be unable to perform the extra analyses using independent Drosophila datasets.

      References:

      Junker, J.P. et al. Genome-wide RNA Tomography in the zebrafish embryo. Cell 159, 662-675 (2014).

      Liu, C. et al. Spatiotemporal mapping of gene expression landscapes and developmental trajectories during zebrafish embryogenesis. Dev Cell 57, 1284-1298 e1285 (2022).

      4. The DreSTEP analysis in zebrafish embryos is interesting and validates well-established observations in the field. Can the authors also discuss whether and how their dataset allows them to refine our understanding of the spatial or temporal pattern of the morphogens and TFs involved in AP patterning? This would further validate their approach.

      Response: We thank the reviewer for recognizing zSTEP and for raising this valuable question, which has inspired us to think more deeply about the potential applications of zSTEP in developmental biology. As the reviewer noted, our zSTEP analyses have validated well-established observations in the field. Rather than focusing on the sparse expression detected in ST data, zSTEP emphasizes the gene expression patterns that are closely correlated with biological functions and critical for embryonic development. Therefore, zSTEP can serve as a valuable tool for identifying genes with specific patterns at certain developmental stages.

      Pattern formation is one of the most important developmental issues for all animals. The reaction-diffusion (RD) model is a widely recognized theoretical framework used to explain self-regulated pattern formation in developing animal embryos (Kondo & Miura, 2010). Morphogen molecules are produced at specific regions of the embryo, forming morphogen gradients to guide cell specification. Most importantly, interactions between different morphogens instruct more complicated and well-choreographed pattern formation. Our Palette-constructed zSTEP provides a comprehensive transcriptomic expression pattern, including all morphogens and TFs, across the whole embryo during development. These valuable resources, in our opinion, could be leveraged to evaluate and prove the RD model during development, including AP patterning. In our current zSTEP analyses, we have already identified genes that exhibit specific expression patterns along AP axis, some of which have not been fully characterized. These genes could be potential targets for further investigation into their roles in AP patterning, although they are not the primary focus of this study. Additionally, our analyses only focused on morphogens and TFs, but zSTEP can be used to investigate the expression patterns of other genes as well. Moreover, we employed a random forest model to investigate the most essential morphogens and TFs for AP axis refinement, which is one of the basic applications of zSTEP. To investigate specific biological questions of interest, it would be worth exploring the use of more sophisticated machine learning models.

      We have added the following paragraph in the Discussion section to discuss the potential application of zSTEP in future studies.

      "Finally, while the current analyses demonstrated that zSTEP can serve as a valuable tool for identifying genes having specific patterns at certain developmental stages, the exploration of zSTEP is still limited. During animal development, pattern formation is always one of the most important developmental issues. As demonstrated by the reaction-diffusion (RD) model, morphogen molecules are produced at specific regions of the embryo, forming morphogen gradients to guide cell specification, while interactions between different morphogens instruct more complicated and well-choreographed pattern formation. Our Palette constructed zSTEP, as a comprehensive transcriptomic expression pattern during development, could be leveraged to evaluate and prove the RD model during development, including AP patterning. Moreover, the investigation of gene expression patterns should not be limited to morphogens and TFs, and further investigation of their roles in AP patterning is desirable. Additionally, here a random forest model may be sufficient for investigating the most essential morphogens and TFs for AP axis refinement, while more sophisticated machine learning models may be required for addressing more specific biological questions."

      Reference

      Kondo, S. & Miura, T. Reaction-Diffusion model as a framework for understanding biological pattern formation. Science 329, 1616-1620 (2010).

      5. Can the authors comment on the limits of this inference pipeline? And how it performs as compared to single-cell RNA sequencing datasets where spatial information is inferred from well-established marker genes?

      Response: We thank the reviewer for this insightful question, which has inspired us to further explore the advantages and limitations of the Palette pipeline in comparison with other inference strategies. As mentioned in the Discussion section, a key limitation of the inference pipeline is its heavy reliance on the quality of the ST data. If the ST data are not of sufficient quality, low-expression genes may not be detected or may appear in only a few scattered spots. We think this is a common issue for any inference tool that uses ST data as the reference. However, with ongoing advances in spatial resolution and data quality, the performance of Palette is expected to improve.

      By comparison, single-cell RNA sequencing datasets in which spatial information is inferred from well-established marker genes do not face this limitation. The ground-breaking work by Satija et al. (2015) combined scRNA-seq with in situ hybridization of well-established marker genes to infer spatial location, enabling single-cell resolution while maintaining high read depth and gene detection. One advantage of this scRNA-seq-based strategy is that it provides the transcriptomes of individual cells, rather than of a combination of cells within an ST spot, although the positional relationships between cells are not real.

      However, compared to inference from ST data, the positional relationships between cells are not directly captured. In addition, as embryonic development progresses, more cell types are specified and body patterning becomes more complex. In this scenario, using well-established marker genes to infer spatial information would be much more challenging. Additionally, there are not many scRNA-seq datasets of serial sections, and thus this strategy may not be usable for constructing a 3D ST atlas.

      In contrast, our Palette inference pipeline is based on ST data, which preserve the real positional relationships between spots. Although our inference pipeline cannot achieve single-cell resolution, it focuses on gene expression patterns rather than the sparse expression within individual spots. By applying Palette to paired serial sections, we were able to generate a 3D spatial expression atlas of zebrafish embryos, which has shown promising performance for investigating gene expression patterns and their involvement in AP patterning.

      Reference

      Satija, R. et al. Spatial reconstruction of single-cell gene expression data. Nature biotechnology 33, 495-502 (2015)

      We have updated the following paragraphs in the Discussion section to describe the limitations of the inference pipeline in more detail.

      "Thirdly, the performance of Palette and zSTEP heavily relied on the quality of ST data. If the quality of ST data is not of sufficient quality, the low-expression genes may not be detected or only appear in very few scattered spots, and the performance of spot clustering could also be affected. Moreover, in this study, for example, the Stereo-seq data of 12 hpf zebrafish embryo had fewer slices on the right side (Fig. S3b), resulting in more blank spots in the right part of zSTEP for the 12 hpf embryo. However, with the ongoing advancements in spatial resolution and data quality, the performance of Palette is expected to be enhanced and demonstrate even greater potential for analysing spatiotemporal gene expression.

      On the other hand, compared to the brilliant strategy that infers spatial information of scRNA-seq data from well-established genes, our Palette pipeline cannot achieve single cell resolution. However, our Palette pipeline is based on the ST reference, and thus preserves the real positional relationships between spots. Furthermore, the focus of our pipeline is to infer the gene expression patterns, which are closely correlated to biological functions and critical for embryonic development, rather than the sparse expression within individual spots. In this regard, our Palette pipeline can be advantageous, as it allows for reconstruction of the major expression profiles, which are often more relevant for understanding developmental processes. Additionally, our Palette can be applied to serial sections, enabling the construction of 3D ST atlas."

      Reviewer #3 (Significance):

      This study tackles an important challenge in biology - the difficulty of resolving gene expression patterns with high spatial precision and in a high-throughput manner. By integrating sequencing datasets from previously published studies, as well as newly-generated datasets, the authors provide evidence that their novel inference pipeline enables them to obtain high-quality spatial information simply from bulk RNA-seq datasets, using ST as a reference. The development of this pipeline - Palette - is a major part of this manuscript and its applicability is validated using datasets from Drosophila and zebrafish embryos. This is an important advance for the field, but it would be nice for the authors to further comment on i) the validity of some of their approaches and how they may influence the quality of their inference, as well as, ii) potential pitfalls/limitations of this approach as compared to others available in the field. This would synthesize both previous and current findings into a conceptual and technological framework that would have a strong impact well beyond cell and developmental biology.

      Audience: This study would be relevant for a broad audience of biologists, interested in morphogen signaling, gene regulatory networks and cell fate specification.

      Expertise in zebrafish development, gastrulation, morphogen signaling and morphogenesis.

      Response: We thank the reviewer for the positive feedback and for raising these valuable questions, which have motivated us to consider more deeply the design concept and further applications of Palette and zSTEP. Based on the reviewer's insightful questions, we have added two extra paragraphs to the Discussion section to further discuss the potential applications of zSTEP in developmental biology and the application scenarios of the Palette pipeline. Specifically, we have explained that the performance of the inference pipeline relies on the spatial resolution and data quality of the ST data. We have then compared the advantages and limitations of Palette with those of the existing spatial inference strategy, which infers spatial information for scRNA-seq data from well-established marker genes. Although our inference pipeline cannot achieve single-cell resolution, it can capture the major expression patterns, which are closely correlated with function and critical for embryonic development. We believe this will help readers gain a clearer understanding of the advantages and limitations of our pipeline compared to other tools, as well as the tasks for which Palette and our constructed zSTEP can be used. We thank the reviewer again for the valuable comments.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      In this study, Dong and colleagues developed a computational pipeline to use spatial transcriptomics (ST) datasets as a reference to infer the spatial patterns of gene expression from bulk RNA sequencing data. This approach aims to overcome the low read depth and limited gene detection capabilities in current ST datasets, while exploiting its ability to provide highly resolved spatial information. By combining bulk RNAseq datasets from 3 developmental stages during early zebrafish development with previously available ST and imaging datasets, the authors build DreSTEP (Danio rerio spatiotemporal expression profiles). Using this approach, they go on to identify the morphogens and transcription factors involved in anteroposterior patterning.

      The paper is well written, and the pipeline presented in this study is likely to be useful beyond the case studies included in this study. There are a few questions that, in my view, would be important to clarify to increase the impact of this work:

      1. The authors mention that they used a variable factor to adjust expression differences between the ST and bulk RNAseq datasets. It would be important for the authors to comment on how much overlap in gene expression is necessary between the datasets for an accurate calculation of this variable factor. Can this be directly tested, for instance, by testing how their conclusions vary if expression is adjusted by a variable factor calculated from only a smaller set of genes?
      2. Palette gives rise to highly spatially precise patterns, which closely match those found in ISH. However, the smoothening of the expression can also remove weak, yet real, local expression patterns, as shown for idgf6 in Fig. 2a. Can the authors test this more extensively for other genes?
      3. Using adjacent slices for ST and "bulk RNAseq" may provide better results than those obtained when comparing two independent datasets. Could the authors also extend the analysis of Palette's functionalities by using separate, previously available but independent datasets, for ST and bulk RNAseq in Drosophila as well?
      4. The DreSTEP analysis in zebrafish embryos is interesting and validates well-established observations in the field. Can the authors also discuss whether and how their dataset allows them to refine our understanding of the spatial or temporal pattern of the morphogens and TFs involved in AP patterning? This would further validate their approach.
      5. Can the authors comment on the limits of this inference pipeline? And how it performs as compared to single-cell RNA sequencing datasets where spatial information is inferred from well-established marker genes?

      Significance

      This study tackles an important challenge in biology - the difficulty of resolving gene expression patterns with high spatial precision and in a high-throughput manner. By integrating sequencing datasets from previously published studies, as well as newly-generated datasets, the authors provide evidence that their novel inference pipeline enables them to obtain high-quality spatial information simply from bulk RNAseq datasets, using ST as a reference. The development of this pipeline - Palette - is a major part of this manuscript and its applicability is validated using datasets from Drosophila and zebrafish embryos. This is an important advance for the field, but it would be nice for the authors to further comment on i) the validity of some of their approaches and how they may influence the quality of their inference, as well as, ii) potential pitfalls/limitations of this approach as compared to others available in the field. This would synthesize both previous and current findings into a conceptual and technological framework that would have a strong impact well beyond cell and developmental biology.

      Audience: This study would be relevant for a broad audience of biologists, interested in morphogen signaling, gene regulatory networks and cell fate specification.

      Expertise in zebrafish development, gastrulation, morphogen signaling and morphogenesis.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      The authors of the study introduce the Palette method, a novel approach designed to infer spatial gene expression patterns from bulk RNA-sequencing (RNA-seq) data. This method is complemented by the development of the DreSTEP 3D spatial gene expression atlas of zebrafish embryos, establishing a comprehensive resource for visualizing gene expression and investigating spatial cell-cell interactions in developmental biology.

      Major concerns:

      1. The efficacy of the Palette method may be compromised by its dependency on the quality of the reference spatial transcriptomics data. As highlighted in the study, variations in data quality can lead to significant challenges in reconstructing accurate spatial expression patterns from bulk data. This underscores the necessity of evaluating quality parameters, such as the number of gene detections and spatial resolution, to ensure reliable outcomes. Additional studies should rigorously assess how these quality factors influence the accuracy and efficiency of the algorithm in various data contexts, particularly under diverse conditions of gene detection.
      2. The methodology raises pertinent questions regarding how the clustering results from different algorithms may affect the reconstructions by the Palette method. The authors should provide a detailed discussion/comparison of clustering processes that optimize the reconstruction of spatial patterns, ensuring precision in the downstream analyses.
      3. The choice to utilize only highly expressed genes in the initial stages of the Palette algorithm also warrants further exploration. Addressing the criteria for determining which genes qualify as "highly expressed" and outlining a robust cutoff will enhance the algorithm's rigor and applicability. Similarly, in the iterative estimation of gene expression across spatial spots, establishing optimal iteration conditions is crucial. Implementing a loss function may offer a systematic method for concluding iterations, thus refining computational efficiency.
      4. Performance metrics relating to processing speed and computational demands remain inadequately addressed in the current framework. Understanding how the Palette method scales across varying gene counts and bulk RNA-seq datasets will be essential for potential applications in larger biological contexts. Notably, the quantitative demands of analyzing 20,000 genes when processing 10, 100, or 1,000 bulk RNA profiles must be articulated to guide researchers in planning accordingly.

      Minor opinions:

      1. Despite the promising advances offered by the zebrafish 3D reconstruction, details are lacking on the number of spatial transcriptomics (ST) datasets utilized and the number of bulk RNA-seq datasets employed in the analyses. These parameters need to be clarified.
      2. Issues regarding spatial cell-cell communication, especially concerning interactions over longer distances, necessitate careful consideration. Introducing spatial distance constraints could help formulate more realistic models of cellular interactions, a vital aspect of embryonic development.
      3. Evaluation metrics such as the Adjusted Rand Index (ARI) and Root Mean Square Error (RMSE) represent critical tools for systematically measuring the similarity of inferred spatial patterns, yet their specific application within this context should be elaborated.
      4. The study's limitations surrounding ST data quality cannot be overstated. Discussing scenarios where only limited or poor-quality ST data are available will be crucial for guiding future studies. Furthermore, a clear explanation of how enhanced specificity and accuracy translate into tangible biological insights is essential for demystifying the underlying mechanisms driving developmental processes.
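For readers unfamiliar with the two metrics named in minor point 3, the sketch below shows how RMSE (between inferred and reference expression values per spot) and ARI (between two spot clusterings) could be computed. It is a plain-Python illustration, not the authors' evaluation code; the function names and the toy inputs are hypothetical, and a library routine would normally be preferred in practice.

```python
from collections import Counter
from math import comb, sqrt
from typing import Hashable, Sequence


def rmse(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Root mean square error between two equally long value vectors."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return sqrt(squared / len(predicted))


def adjusted_rand_index(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Adjusted Rand Index between two labelings of the same spots."""
    if len(a) != len(b):
        raise ValueError("labelings must have equal length")
    total_pairs = comb(len(a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two spots: labelings trivially agree
    # Pairs of spots grouped together in both labelings, in a only, in b only.
    both = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    in_a = sum(comb(c, 2) for c in Counter(a).values())
    in_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = in_a * in_b / total_pairs
    maximum = (in_a + in_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (both - expected) / (maximum - expected)


# Toy example: identical partitions under different label names give ARI = 1.
print(adjusted_rand_index([0, 0, 1, 1], ["x", "x", "y", "y"]))
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

ARI is invariant to how clusters are named, which is why it suits comparisons of clusterings, whereas RMSE compares expression values spot by spot and is sensitive to scale.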

      Significance

      The Palette pipeline demonstrates a marked improvement in specificity and accuracy when predicting spatial gene expression patterns. Evaluative studies on Drosophila and zebrafish datasets affirm its enhanced performance compared to existing methodologies. By effectively reconstructing spatial information from bulk transcriptomic data, the Palette method innovatively merges the philosophy of leveraging single-cell transcriptomic data for deconvolution analyses. This integration is pivotal, advancing traditional bulk RNA-seq approaches while laying the groundwork for future research.

      One of the notable achievements in this work is the construction of the DreSTEP atlas, which integrates serial bulk RNA-seq data with advanced 3D imaging techniques. This resource grants researchers unprecedented access to the visualization of gene expression patterns across the zebrafish embryo, facilitating the investigation of spatial relationships and cell-cell interactions critical for developmental processes. Such capabilities are invaluable for understanding the intricate dynamics of embryogenesis and the distinct roles of individual cell types.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The manuscript titled "Unravelling the Progression of the Zebrafish Primary Body Axis with Reconstructed Spatiotemporal Transcriptomics" presents a comprehensive analysis of the development of the primary body axis in zebrafish by integrating bulk RNA-seq, 3D images, and Stereo-Seq. The authors first clearly demonstrate the application of Palette for integrating RNA-seq and Stereo-Seq using published spatial transcriptomics data of Drosophila embryos. Subsequently, they produced serial bulk RNA-seq data for certain developmental stages of Danio rerio embryos and utilized published Stereo-Seq data. Through robust validation, the authors observe the molecular network involved in AP axis formation. While the authors show that integrating bulk RNA-seq data with Stereo-Seq improves spatial resolution, additional proof is required to demonstrate the extent of this improvement.

      Major Comments:

      1. Lines 66-68: Discuss the limitations of existing tools and explicitly state the advantages of using Palette.
      2. Body Pattern Genes Analysis: For both Drosophila and Danio rerio, it would be valuable to examine body pattern genes in Stereo-Seq and apply Palette to determine if the resolution of the segments improves or merges. The resolution of the A-P axis is convincing, but further evidence for other segments would be beneficial.
      3. Figure 2d: Include the A-P line for which the intensity profile was plotted in the main figure, rather than just in the supplementary material. Additionally, consider simplifying the plot by not combining three lines into one, as it complicates the interpretation of observations.
      4. Drosophila Data Analysis: While the alignment and validation of Danio rerio sections are clearly explained, the analysis and validation of Drosophila data are insufficiently detailed. Provide a more thorough explanation of how the intensity profiles between BDGP in situ data and Stereo-Seq data are adjusted.
      5. Figure 3d: Present a plot with the expected expression profiles of the three genes if the embryo is aligned as anticipated.
      6. Analysis Without Palette: Between lines 277-438, the outcome of using Palette with bulk RNA-seq and Stereo-Seq is convincing. However, consider the following:
         o What would be the observations if the analysis were conducted solely with Stereo-Seq data, without incorporating bulk RNA-seq data and employing Palette?
         o This study uses only Stereo-Seq as the spatial transcriptomics reference. It would strengthen the argument to use at least one other spatial transcriptomics method, such as Visium or MERFISH, in conjunction with bulk RNA-seq and Palette, to demonstrate whether Palette consistently improves gene expression resolution.
      7. PDAC Data Analysis: Provide a more detailed explanation of the PDAC data analysis and use appropriate colors in the tissue images to clearly distinguish cell types.
      8. Comparison with Other Methods: State the limitations of not using STitch3D and Spateo for alignment and explain why these methods were not employed.

      Minor Comments:

      1. References: Add references to the statements in lines 51-53.
      2. Scientific Name Consistency: Ensure consistency in using either "Danio rerio" or "zebrafish" throughout the manuscript.
      3. Related References: Include the following relevant references:
      4. https://academic.oup.com/bib/article/25/4/bbae316/7705532
      5. https://www.life-science-alliance.org/content/6/1/e202201701
      6. Figure 1a: In the Venn diagram, include the number of genes in the bulk and Stereo-Seq datasets, as well as the number of overlapping genes.
      7. Figure 1 Improvement: Enlarge Figure 1 and reduce repetitive elements, such as parts of the deconvolution and Figure 1b.
      8. Figure 3f: Explain the black discontinuous line in the plot.
      9. Line 610: State the percentage of unpaired imaging spots.
      10. Lines 616-618: Specify the unit for the spot diameter.

      Significance

      This algorithm will be useful not only for the field of developmental biology but also for wider applications in spatial omics. Although I have expertise in spatial omics technology development, my understanding of computational biology is limited, which restricts my ability to fully evaluate the Palette algorithm presented in this paper.

    1. Reviewer #1 (Public review):

      Summary:

      Adult (4mo) rats were tasked to either press one lever for an immediate reward or another for a delayed reward. The task had an adjusting amount structure in which (1) the number of pellets provided on the immediate reward lever changed as a function of the decisions made, (2) rats were prevented from pressing the same lever three times in a row.

      While the authors have been very responsive to the reviews, and I appreciate that, unfortunately, the new analyses reported in this revision actually lead me to deeper concerns about the adequacy of the data to support the conclusions. In this revision, it has become clear that the conclusions are forced and not supported by the data. Alternative theories are not considered or presented. This revision has revealed deep problems with the task, the analyses, and the modeling.

      Data Weaknesses

      Most importantly, the inclusion of the task behavior data has revealed a deep problem with the entire structure of the data. As is obvious in Figure 1D, there is a slow learning effect that is changing over the sessions as the animals learn to stop taking the delayed outcome. Unfortunately, the 8s delays came *after* the 4s. The first 20 sessions contain 19 4s delays and 1 8s delay, while the last 20 sessions contain 14 8s delays and 6 4s delays. Given the changes across sessions, it is likely that a large part of the difference is due to across-session learning (which is never addressed or considered).

      These data are not shown by subject and I suspect that individual subjects did all 4s then all 8s and some subjects switched tasks at different times. If my suspicion is true, then any comparisons between the 4s and 8s conditions (which are a major part of the authors' claims) may have nothing to do with the delays, but rather with increased experience on the task.

      Furthermore, the four "groups", which are still poorly defined, seem to have been assessed at a session-by-session level. So when did each animal fall into a given group? Why is Figure 1D not showing which session fell into which group and why are we not seeing each animal's progression? They also admit that animals used a mixture of strategies, which implies that the "group" assignment is an invalid analysis, as the groups do not accommodate strategy mixing.

      Figure 2 shows that none of the differences of the group behavior against random choice with a basic p(delay) are significant. They use a KS test to measure these differences. KS tests are notoriously sensitive as KS tests simply measure whether there are any statistical differences between two distributions. They do not report the full statistics for Figure 2, but only say that the 4HI group was not significant (KS p-value = 0.72) and the 8LO showed a p-value of 0.1 (which they interpret as significant). p=0.1 is not significant. They don't report the value of the 4LO or 8HI groups (why not?), but say they are in-between these two extremes. That means *none* of the differences are significant.

      They then test a model with additional parameters, and say that the model includes more than the minimal p_D parameter, but never report BIC or AIC model comparisons. In order to claim that the model is better than the bare p_D assumption, they should be reporting model-comparison statistics. But given that the p_D parameters are enough (q.v. Figure 2), this entire model seems unnecessary.
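As an illustration of the model-comparison statistics requested here, the sketch below computes AIC (2k - 2 ln L) and BIC (k ln n - 2 ln L) for a one-parameter baseline in which every choice is a Bernoulli draw with probability p_D of choosing the delayed lever. The choice data are made up, the function names are hypothetical, and the log-likelihood assigned to the richer model is a placeholder rather than a fitted value; none of this reproduces the authors' model.

```python
from math import log
from typing import Sequence


def bernoulli_log_likelihood(choices: Sequence[int], p_delay: float) -> float:
    """Log-likelihood of binary choices (1 = delayed lever, 0 = immediate)."""
    return sum(log(p_delay) if c == 1 else log(1.0 - p_delay) for c in choices)


def aic(log_lik: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_lik


def bic(log_lik: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * log(n_obs) - 2 * log_lik


choices = [1, 0, 1, 1, 0, 1, 0, 1]        # made-up trial outcomes
p_hat = sum(choices) / len(choices)       # maximum-likelihood estimate of p_D
ll_baseline = bernoulli_log_likelihood(choices, p_hat)

# Placeholder: a richer model with 3 parameters that fits slightly better.
ll_richer = ll_baseline + 1.0

print("baseline AIC/BIC:", aic(ll_baseline, 1), bic(ll_baseline, 1, len(choices)))
print("richer   AIC/BIC:", aic(ll_richer, 3), bic(ll_richer, 3, len(choices)))
```

With these placeholder numbers the richer model's gain in log-likelihood does not offset the penalty for two extra parameters, so both criteria favor the baseline; with real fits the comparison could go either way.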

      It took me a while to determine what was being shown in Figure 3, but I was eventually able to determine that 0 was the time after the animal made the choice to wait out the delay side, so the 4s in Figure 3A1 with high power in the low-frequency (<5 Hz) range is the waiting time. They don't show the full 8s time. Nor do they show the spectrograms separated by group (assuming that group is the analytical tool they are using). In B they show only theta power, but it is unclear how to interpret these changes over time.

      In Figure 4, panel A is mostly useless because it is just five sample sessions showing firing rate plotted on the same panels as the immediate reward amount. If they want to claim correlation, they should show and test it. But moreover, this is not how neural data should be presented - we need to know what the cells are doing, population-wise. We need to have an understanding of the neural ensemble. These data are clearly being picked and chosen, which is not OK.

      Figure 4, panels B and C show that the activity trivially reflects the reward that has been delivered to the animal, if I am understanding the graphs correctly. (The authors do not interpret it this way, but the data is, to my eyes, clear.) The "immediate" signal shows up immediately at choice and reflects the size of the immediate reward (which is varying). The "delay" signal shows up after the delay and does not, which makes sense as the animals get 6 pellets on the delayed side no matter what. In fact, the max delayed side activity = the max immediate side activity, which is 6 pellets. This is just reward-related firing.

      Figure 5 is poorly laid out, switching the order in 5C to be 2 1 3 in E and F. (Why?!) The statistics for Figure 5 on page 17 should be asking whether there are differences between neuron types, not whether there is a choice x time interaction in a given neuron type. When I look at Figure 5F1-3, all three types look effectively similar with different levels of noise. It is unclear why they are doing this complicated PC analysis or what we should be drawing from it.

      Figure 6 mis-states pie charts as "total number" rather than proportions.

      Interpretation Weaknesses

      The separation of cognitive effort into "resource-based" and "resistance-based" seems artificial to me. I still do not understand why the ability to resist a choice does not also depend on resource or why using resources is not a form of resistance. Doesn't every action in the end depend on the resources one has available? And doesn't every use of a resource resist one option by taking another? Even if one buys these two separate cognitive control processes (which at this point in reading the revision, I do not), the paper starts from the assumption that a baseline probability of waiting out the delays is a "resistance-based cognitive control" (why?) and a probability of choice that takes into account the size of the immediate value (confusingly abbreviated as ival) is a "resource-based cognitive control" (again, why?).

    2. Reviewer #2 (Public review):

      Summary:

      I appreciate the considerable work the authors have done on the revision. The manuscript is markedly improved.

      Strengths still include the strong theoretical basis, well-done experiments, and clear links to LFP / spectral analyses that have links to human data. The task is now more clearly explained, and the neural correlates better articulated.

      Weaknesses:

      I had remaining questions, many related to my previous questions.<br /> (1) The results have some complexity, but I still had questions about which is resource and which is resistance based. The authors say in the last sentence of the discussion: "Prominent pre-choice theta power was associated with a behavioral strategy characterized by a strong bias towards a resistance-based strategy, whereas the neural signature of ival-tracking was associated with a strong bias towards a resource-based strategy."<br /> I might suggest making this simpler and clearer in the abstract and the first paragraph of the discussion. A simple statement like "pre-choice theta was biased towards resistance whereas single neurons were biased towards resources" might make this idea come across?

      (2) I think most readers would like to see raw single trial LFP traces in Figure 3, single unit rasters in Figure 4, and spike-field records in Figure 5.

      (3) What limitations are there to this work? I wonder if readers might benefit from some contextualization: the sample size, heterogeneous behavior, lack of cell-type specificity, and using PC3 to define spectral relationships. I might suggest pointing these out.

      (4) I still wasn't sure what 4 Hz vs. theta 6-12 Hz meant - is it all based on PC3's pos/neg correlation? I wonder if showing a scatter plot with the y-axis being PC3 and the x-axis being theta 4 Hz power would help distinguish these? Is this the first time this sort of analysis has been done? If so, it requires clearer definitions.

    3. Reviewer #3 (Public review):

      Summary:

      The study investigated decision making in rats choosing between small immediate rewards and larger delayed rewards, in a task design where the size of the immediate rewards decreased when this option was chosen and increased when it was not chosen. The authors conceptualise this task as involving two different types of cognitive effort; 'resistance-based' effort putatively needed to resist the smaller immediate reward, and 'resource-based' effort needed to track the changing value of the immediate reward option. They argue based on analyses of the behaviour, and computational modelling, that rats use different strategies in different sessions, with one strategy in which they preferentially choose the delayed reward option irrespective of the current immediate reward size, and another strategy in which they preferentially choose the immediate reward option when the immediate reward size is large, and the delayed reward option when the immediate reward size is small. The authors recorded neural activity in anterior cingulate cortex. They propose that oscillatory activity in the 6-12Hz theta band occurs when subjects use a 'resistance-based' strategy of choosing the delayed option irrespective of the current value of the immediate reward option. They also examine neural representation of the current value of the immediate reward option, and suggest that this value is more strongly represented when subjects are using this value information to guide choice. They further argue that neurons whose activity is modulated by theta oscillations are less involved in tracking the value of the immediate reward option than neurons whose activity is not theta modulated. If solid, these findings will be of interest to researchers working on cognitive control and ACC's involvement in decision making. However, there are some issues with the modelling and analysis which preclude high confidence in the validity of the conclusions.

      Strengths:

      The behavioural task used is interesting and the recording methods used (64 channel silicon probes) should enable the collection of good quality single unit and LFP electrophysiology data. The authors recorded from a sizable sample of subjects for this type of study. The approach of splitting the data into sessions where subjects used different strategies and then examining the neural correlates of each is in principle interesting, though I have some reservations about the strength of evidence for the existence of multiple strategies.

      Limitations:

      The dataset is unbalanced in terms of both the number of sessions contributed by each subject, and their distribution across the different putative behavioural strategies (see Table 1), with some subjects contributing 7 sessions to a given strategy and others 0. Further, only 2 of 10 subjects contribute any sessions to one of the behavioural strategies (8LO), and a single subject contributes >50% of the sessions (7 of 13 sessions) to another strategy (8HI). Apparent differences in brain activity between the strategies could therefore in fact reflect differences between subjects, which could arise due to e.g. differences in electrode placement. To make firm conclusions that neural activity is different in sessions where different strategies are thought to be employed, it would be necessary to account for potential cross-subject variation in the data. The current statistical methods don't appear to do this as they use within subject measures (e.g. trials or neurons) as the experimental unit and ignore which subject the neuron/trial came from.

      The starting point for the analysis was the splitting of sessions into 4 groups based on the duration of the delay (4 vs 8 seconds) and then clustering within each delay category into two sub-groups. It was not clear why 2 clusters per delay category were used, nor whether the data did in fact have a clear split into two distinct clusters or continuous variation across the population of sessions. The simplified RL model used in the revised manuscript (which is an improvement from that used in the previous version) could in principle help to quantify variation across the populations of sessions, by using model fitting and comparison methods to evaluate variation in strategy across subjects. However, as far as I could tell no model-fitting or comparison was performed, and the only attempt to link the model to data was by simulating data using a fixed probability of choosing the delayed lever (i.e. with no learning across trials) and comparing the distribution of total rewards obtained per session with that of the subjects in each group (Figure 2). Total reward per session is a very coarse behavioural metric and using likelihood-based methods to fit model parameters to subjects' trial-by-trial choice data would provide a more sensitive way of using the modelling to assess behavioural strategy across sessions.
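
      To illustrate what a likelihood-based fit to trial-by-trial choices could look like, here is a minimal sketch under stated assumptions. The logistic choice rule, the parameter names `bias` and `slope`, the grid search, and the simulated data are all hypothetical and are not taken from the manuscript. The sketch compares a model in which choice depends on the immediate value (ival) against a fixed-probability model with no ival dependence.

```python
import math
import random

def p_delay(ival, bias, slope):
    """Assumed logistic rule: probability of a delayed-lever choice given ival."""
    return 1.0 / (1.0 + math.exp(-(bias - slope * ival)))

def neg_log_lik(trials, bias, slope):
    """Negative log-likelihood of (ival, choice) pairs; choice 1 = delayed lever."""
    nll = 0.0
    for ival, choice in trials:
        p = min(max(p_delay(ival, bias, slope), 1e-9), 1.0 - 1e-9)
        nll -= math.log(p if choice == 1 else 1.0 - p)
    return nll

def fit_grid(trials, biases, slopes):
    """Return (nll, bias, slope) with the lowest negative log-likelihood on a grid."""
    return min((neg_log_lik(trials, b, s), b, s) for b in biases for s in slopes)

# Simulate hypothetical choices from known parameters so the fit can be checked.
rng = random.Random(0)
true_bias, true_slope = 2.0, 0.8
trials = []
for _ in range(400):
    ival = rng.randint(0, 6)  # immediate reward size in pellets (assumed range)
    choice = 1 if rng.random() < p_delay(ival, true_bias, true_slope) else 0
    trials.append((ival, choice))

grid = [i * 0.2 for i in range(-15, 16)]  # -3.0 ... 3.0 in steps of 0.2

# Full model: choice probability depends on ival.
nll_full, bias_hat, slope_hat = fit_grid(trials, grid, grid)
# Reduced model: slope fixed at 0, i.e. a constant probability of choosing delay.
nll_fixed, bias_fixed, _ = fit_grid(trials, grid, [0.0])

print(f"full model:  bias={bias_hat:.1f} slope={slope_hat:.1f} nll={nll_full:.1f}")
print(f"fixed p_D:   bias={bias_fixed:.1f} nll={nll_fixed:.1f}")
```

      A grid search is used only to keep the example dependency-free. In practice a numerical optimiser would be the usual choice, and the two negative log-likelihoods would feed a model comparison that penalises the extra parameter.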

      Conceptually, it is not obvious that choices towards the delayed vs immediate lever reflect use of different strategies employing different types of cognitive effort. Rather these could reflect a single strategy which compares the estimated value of the two levers, with differences in behaviour between sessions accounted for either by differences in the task itself (between the 8s and 4s delay condition) or differences in the parameters of the strategy, such as the strength of temporal discounting.

      Even if one accepts the claim that the task recruits two distinct types of cognitive control, the argument that theta oscillations, which occur on delay choice trials in the 4s delay condition, are a correlate of a 'resistance-based' strategy (resisting the immediate reward), is hard to reconcile with the fact that theta oscillations do not occur on delay choice trials in the 8s delay condition (Figure 3). The authors note this discrepancy, but state that 'The reason was because these groups largely avoided the delayed lever (Figure 1) and thereby abandoned the need to implement resistance-based control altogether.' However, the data in Figure 1D show that even in the 8s condition the subjects choose the delayed lever on around 50% of trials. It is not obvious why choosing the delayed lever on 50% of trials in the 8s condition does not require 'resistance-based' cognitive effort, while choosing it in the 4s delay condition does.

      The other main claims regarding the neural data are that the neuronal representation of the value of the immediate reward lever (ival) is stronger in sessions where subjects are choosing that lever more often, particularly the 8LO group, and that neurons whose activity tracks ival are a different population from neurons whose activity is theta modulated. However, the analysis methods used to make these claims are rather convoluted and make it hard to assess the strength of the evidence for them.

      To evaluate the strength of ival representation in neural activity, the authors first fit a regression model predicting each neuron's activity at different timepoints as a function of behavioural variables including ival, which is a sensible first step. However, they then perform clustering on the regression coefficients and then plot neural activity only for the cluster which they state 'provided the clearest example of value tracking'. It is not clear how the clustering was done, whether there were in fact well defined clusters in the neural activity, how the clusters whose activity is plotted were chosen, nor the proportion of neurons in this cluster for each group of sessions. The analysis therefore provides only limited information about the strength of ival representation in different session groups. It would be useful to quantify the variance explained by ival in neural activity for each group of sessions using a simpler quantification of the regression analysis, such as cross-validated coefficient of partial determination.

      The analysis of how theta modulation related to representation of ival across neurons was also complicated and non-standard. To determine whether individual neurons were theta modulated, the authors did PCA on a matrix comprised of spike train autocorrelations for individual neurons, and then grouped neurons according to the projection of their autocorrelation function onto the 3rd Principal Component, on the basis that neurons with negative projection onto this component showed a peak roughly at theta frequency in the power spectrum of their autocorrelation. Even ignoring the fact that the peak in the power spectrum is broad and centred above the standard theta frequency (see figure 5B3), this is an arbitrary and unnecessarily complex way to determine if neurons are theta modulated. It would be much simpler and greatly preferable to either directly assess the modulation depth of individual neurons' spike train autocorrelation in the theta band, or to use a metric of spike-LFP coupling in the theta band instead. The authors do include some analysis of spike field coherence in Figure 6 and this is a much more sensible approach. However, it is worth noting that the only session group which shows a difference in coherence at theta frequency relative to the other groups is 8LO, to which only 2 of 8 animals contribute any data and 70% of sessions come from one animal. It is therefore unclear whether differences in this group are due to differences in behavioural strategy, or reflect other sources of cross-animal variation.
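
      One simple spike-LFP coupling metric of the kind suggested above is the mean resultant length of spike phases. The sketch below is illustrative only and makes several assumptions. It assumes each spike has already been assigned a theta phase (for example from a band-pass filtered LFP, a step not shown here), and the simulated phases and the function names are hypothetical.

```python
import cmath
import math
import random

def mean_resultant_length(phases):
    """Length of the mean unit vector of spike phases (0 = uniform, 1 = fully locked)."""
    return abs(sum(cmath.exp(1j * ph) for ph in phases)) / len(phases)

def rayleigh_z(phases):
    """Rayleigh statistic Z = n * R^2; larger values indicate stronger phase locking."""
    r = mean_resultant_length(phases)
    return len(phases) * r * r

# Hypothetical spike phases in radians for two simulated neurons.
rng = random.Random(1)
locked = [rng.vonmisesvariate(math.pi, 2.0) for _ in range(500)]  # prefers phase pi
unlocked = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(500)]  # no preferred phase

r_locked = mean_resultant_length(locked)
r_unlocked = mean_resultant_length(unlocked)
print(f"locked neuron:   R={r_locked:.2f}  Z={rayleigh_z(locked):.1f}")
print(f"unlocked neuron: R={r_unlocked:.2f}  Z={rayleigh_z(unlocked):.1f}")
```

      A per-neuron value of this kind could then be compared across session groups while treating animal as a grouping factor, which bears on the concern about cross-animal variation raised above.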

    4. Author response:

      The following is the authors’ response to the current reviews.

      We would like to thank the reviewers for their efforts and feedback on our preprint. We have elected to rework the manuscript for publication in a different journal. In this process we will alter many of the approaches and re-evaluate the conclusions. With this, many of the points raised by the reviewers will be no longer relevant and therefore do not require a response. Again, we thank the reviewers for their time and helpful feedback.


      The following is the authors’ response to the original reviews.

      eLife Assessment:

      The authors present a potentially useful approach of broad interest arguing that anterior cingulate cortex (ACC) tracks option values in decisions involving delayed rewards. The authors introduce the idea of a resource-based cognitive effort signal in ACC ensembles and link ACC theta oscillations to a resistance-based strategy. The evidence supporting these new ideas is incomplete and would benefit from additional detail and more rigorous analyses and computational methods.

      We are extremely grateful for the several excellent comments of the reviewers. To address these concerns, we have completely reworked the manuscript, adding more rigorous approaches in each phase of the analysis and computational model. We realize that this has taken some time to prepare the revision. However, given the comments of the reviewers, we felt it necessary to thoroughly rework the paper based on their input. Here is a (nonexhaustive) overview of the major changes we made:

      We have developed a way to more adequately capture the heterogeneity in the behavior

      We have completely reworked the RL model

      We have added additional approaches and rigor to the analysis of the value-tracking signal. 

      Reviewer #1 (Public Review):

      Summary:

      Young (2.5 mo [adolescent]) rats were tasked to either press one lever for immediate reward or another for delayed reward. 

      Please note that at the time of testing and training the rats were > 4 months old.

      The task had a complex structure in which (1) the number of pellets provided on the immediate reward lever changed as a function of the decisions made, (2) rats were prevented from pressing the same lever three times in a row. Importantly, this task is very different from most intertemporal choice tasks which adjust delay (to the delayed lever), whereas this task held the delay constant and adjusted the number of 20 mg sucrose pellets provided on the immediate value lever.

      Several studies parametrically vary the immediate lever (PMID: 39119916, 31654652, 28000083, 26779747, 12270518, 19389183). While most versions of the task will yield qualitatively similar estimates of discounting, the adjusting amount is preferred as it provides the most consistent estimates (PMID: 22445576). More specifically, this version of the task avoids contrast effects that result from changing the delay during the session (PMID: 23963529, 24780379, 19730365, 35661751), which complicate value estimates.

      Analyses are based on separating sessions into groups, but group membership includes arbitrary requirements and many sessions have been dropped from the analyses. 

      We have updated this approach and now provide a more comprehensive assessment of the behavior. The updated approach applies a hierarchical clustering model to the behavior in each session. This was applied at each delay to separate animals that prefer the immediate option more/less. This results in 4 statistically dissociable groups (4LO, 4HI, 8LO, 8HI) and includes all sessions. Please see Figure 1. 

      Computational modeling is based on an overly simple reinforcement learning model, as evidenced by fit parameters pegging to the extremes. 

      We have completely reworked the simulations in the revision. In the updated RL model we carefully add parameters to determine which are necessary to explain the experimental data. We feel that it is simplified yet more descriptive. Please see Figure 2 and associated text. 

      The neural analysis is overly complex and does not contain the necessary statistics to assess the validity of their claims.

      We have dramatically streamlined the spike train analysis approach and added several statistical tests to ensure the rigor of our results. Please see Figures 4,5,6 and associated text. 

      Strengths:

      The task is interesting.

      Thank you for the positive comment.

      Weaknesses:

      Behavior:

      The basic behavioral results from this task are not presented. For example, "each recording session consisted of 40 choice trials or 45 minutes". What was the distribution of choices over sessions? Did that change between rats? Did that change between delays? Were there any sequence effects? (I recommend looking at reaction times.) Were there any effects of pressing a lever twice vs after a forced trial? 

      Please see the updated statistics and panels in Figures 1 and 2. We believe these address this valid concern.  

      This task has a very complicated sequential structure that I think I would be hard pressed to follow if I were performing this task. 

      Human tasks implement a similar task structure (PMID: 26779747). Please note the response above that outlines the benefits of using this task.

      Before diving into the complex analyses assuming reinforcement learning paradigms or cognitive control, I would have liked to have understood the basic behaviors the rats were taking. For example, what was the typical rate of lever pressing? If the rats are pressing 40 times in 45 minutes, does waiting 8s make a large difference?

      Thank you for this suggestion. Our additions to Figure 1 are intended to better explain and quantify the behavior of the animals. Note that this task is designed to hold the rate of reinforcement constant no matter the choices of the animals. Our analysis supports the long-held view in the literature that rats do not like waiting for rewards, even at small delays. Going from the 4 sec to the 8 sec delay results in significantly more immediate choices, indicating that the rats will forgo waiting 8 sec for a larger reinforcer and take a smaller reinforcer at 4 sec.

      For that matter, the reaction time from lever appearance to lever pressing would be very interesting (and important). Are they making a choice as soon as the levers appear? Are they leaning towards the delay side, but then give in and choose the immediate lever? What are the reaction time hazard distributions?

      This is an excellent suggestion; we have added a brief analysis of reaction times (please see the section entitled “4 behavioral groups are observed across all sessions” in the Results). Please note that an analysis of the reaction times has been presented in a prior analysis of this data set (White et al., 2024). In addition, an analysis of reaction times in this task was performed in Linsenbardt et al. (2017). In short, animals tend to choose within 1 second of the lever appearing. In addition, our prior work shows that responses on the immediate lever tend to be slower, which we viewed as evidence of increased deliberation requirements (possibly required to integrate value signals).

      It is not clear that the animals on this task were actually using cognitive control strategies on this task. One cannot assume from the task that cognitive control is key. The authors only consider a very limited number of potential behaviors (an overly simple RL model). On this task, there are a lot of potential behavioral strategies: "win-stay/lose-shift", "perseveration", "alternation", even "random choices" should be considered.

      The strategies the Reviewer mentioned are descriptors of the actual choices the rats made. For example, perseveration means the rat is choosing one of the levers at an excessively high rate whereas alternation means it is choosing the two levers more or less equally, independent of payouts. But the question we are interested in is why? We are arguing that the type of cognitive control determines the choice behavior, but cognitive control is an internal variable that guides behavior, rather than simply a descriptor of the behavior. For example, the animal opts to perseverate on the delayed lever because the cognitive control required to track ival is too high. We then searched the neural data for signatures of the two types of cognitive control.

      The delay lever was assigned to the "non-preferred side". How did side bias affect the decisions made?

      The side bias clearly does not impact performance as the animals prefer the delay lever at shorter delays, which works against this bias.  

      The analyses based on "group" are unjustified. The authors compare the proportion of delayed to immediate lever press choices on the non-forced trials and then did k-means clustering on this distribution. But the distribution itself was not shown, so it is unclear whether the "groups" were actually different. They used k=3, but do not describe how this arbitrary number was chosen. (Is 3 the optimal number of clusters to describe this distribution?) Moreover, they removed three group 1 sessions with an 8s delay and two group 2 sessions with a 4s delay, making all the group 1 sessions 4s delay sessions and all group 2 sessions 8s delay sessions. They then ignore group 3 completely. These analyses seem arbitrary and unnecessarily complex. I think they need to analyze the data by delay. (How do rats handle 4s delay sessions? How do rats handle 6s delay sessions? How do rats handle 8s delay sessions?). If they decide to analyze the data by strategy, then they should identify specific strategies, model those strategies, and do model comparison to identify the best explanatory strategy. Importantly, the groups were session-based, not rat based, suggesting that rats used different strategies based on the delay to the delayed lever.

      We have completely reworked our approach for capturing the heterogeneity in behavior. We have taken care to show more of the behavioral statistics that have gone into identifying each of the groups. All sessions are included in this analysis. As the reviewer suggests, we used the statistics from each of the behavioral groups to inform the RL model that explores neural signals that underlie decisions in this task. We strongly disagree that groups should be rat and not session based as the behavior of the animal can, and does, change from day to day. This is important to consider when analyzing the neural data as rat-based groupings would ignore this potential source of variance.

      The reinforcement learning model used was overly simple. In particular, the RL model assumes that the subjects understand the task structure, but we know that even humans have trouble following complex task structures. Moreover, we know that rodent decision-making depends on much more complex strategies (model-based decisions, multi-state decisions, rate-based decisions, etc). There are lots of other ways to encode these decision variables, such as softmax with an inverse temperature rather than epsilon-greedy. The RL model was stated as a given and not justified. As one critical example, the RL model fit to the data assumed a constant exponential discounting function, but it is well-established that all animals, including rodents, use hyperbolic discounting in intertemporal choice tasks. Presumably this changes dramatically the effect of 4s and 8s. As evidence that the RL model is incomplete, the parameters found for the two groups were extreme. (Alpha=1 implies no history and only reacting to the most recent event. Epsilon=0.4 in an epsilon-greedy algorithm is a 40% chance of responding randomly.)

      While we agree that the approach was not fully justified, we do not agree that it was invalid. Simply stated, a softmax approach gives the best fit to the choice behavior, whereas our epsilon-greedy approach attempted to reproduce the choice behavior using a naïve agent that progressively learns the values of the two levers on a choice-by-choice basis. Nevertheless, we certainly appreciate that important insights can be gained by fitting a model to the data as suggested. We feel that the new modeling approach we have now implemented is optimal for the present purposes and it replaces the one used in the original manuscript.
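
      For readers less familiar with the terms in this exchange, the following minimal sketch contrasts exponential with hyperbolic discounting and a softmax with an epsilon-greedy choice rule. It uses the standard textbook forms. The parameter values (`gamma`, `k`, `beta`, `ival`) are hypothetical and are not those of either version of the manuscript's model.

```python
import math

def exponential_value(amount, delay, gamma):
    """Exponential discounting: V = A * gamma**D."""
    return amount * gamma ** delay

def hyperbolic_value(amount, delay, k):
    """Hyperbolic discounting: V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay)

def softmax_p_delay(v_delay, v_imm, beta):
    """Two-option softmax with inverse temperature beta."""
    return 1.0 / (1.0 + math.exp(-beta * (v_delay - v_imm)))

def epsilon_greedy_p_delay(v_delay, v_imm, epsilon):
    """Greedy choice with probability 1 - epsilon, uniform random choice otherwise."""
    if v_delay > v_imm:
        greedy = 1.0
    elif v_delay < v_imm:
        greedy = 0.0
    else:
        greedy = 0.5
    return (1.0 - epsilon) * greedy + epsilon * 0.5

# Delayed option: 6 pellets after 4 s or 8 s; immediate option: ival pellets now.
ival = 3.0  # hypothetical current immediate value
for delay in (4, 8):
    v_exp = exponential_value(6.0, delay, gamma=0.9)
    v_hyp = hyperbolic_value(6.0, delay, k=0.2)
    print(
        f"delay={delay}s  V_exp={v_exp:.2f}  V_hyp={v_hyp:.2f}  "
        f"softmax P(delay)={softmax_p_delay(v_hyp, ival, beta=1.0):.2f}  "
        f"eps-greedy P(delay)={epsilon_greedy_p_delay(v_hyp, ival, epsilon=0.4):.2f}"
    )
```

      With epsilon = 0.4, the epsilon-greedy rule can only return 0.2, 0.5 or 0.8 whatever the size of the value difference, whereas the softmax probability changes smoothly with that difference.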

      The authors do add a "dbias" (which is a preference for the delayed lever) term to the RL model, but note that it has to be maximal in the 4s condition to reproduce group 2 behavior, which means they are not doing reinforcement learning anymore, just choosing the delayed lever.

      The dbias term was dropped in the new model implementation.

      Neurophysiology:

      The neurophysiology figures are unclear and mostly uninterpretable; they do not show variability, statistics or conclusive results.

      While the reviewer is justified in criticizing the clarity of the figures, the statement that “they do not show variability, statistics or conclusive results” is not correct. Each of the figures presented in the first draft of the manuscript, except Figure 3, is accompanied by statistics and measures of variability. Nonetheless, we have updated each of the neurophysiology analyses. We hope that the reviewer will find our updates more rigorous and thorough.

      As with the behavior, I would have liked to have seen more traditional neurophysiological analyses first. What do the cells respond to? How do the manifolds change aligned to the lever presses? Are those different between lever presses?

      We have added several figures that plot the mean +/- SEM of the neural activity (see Figures 4 and 5). Hopefully this provides a more intuitive picture of the changes in neural activity throughout the task.  

      Are there changes in cellular information (both at the individual and ensemble level) over time in the session? 

      We provide several analyses of how firing rate changes in relation to ival over time and across trials in the session. In addition, we describe how these signals change in each of the behavioral groups.

      How do cellular responses differ during that delay while both levers are out, but the rats are not choosing the immediate lever?

      We were somewhat unclear about this suggestion as the delay follows the lever press. In addition, there is no delay after immediate presses.

      Figure 3, for example, claims that some of the principal components tracked the number of pellets on the immediate lever ("ival"), but they are just two curves. No statistics, controls, or justification for this is shown. BTW, on Figure 3, what is the event at 200s?

      This comment is no longer relevant based on the changes we’ve made to the manuscript. 

      I'm confused. On Figure 4, the number of trials seems to go up to 50, but in the methods, they say that rats received 40 trials or 45 minutes of experience.

      This comment is no longer relevant based on the changes we’ve made to the manuscript. 

      At the end of page 14, the authors state that the strength of the correlation did not differ by group and that this was "predicted" by the RL modeling, but this statement is nonsensical, given that the RL modeling did not fit the data well, depended on extreme values. Moreover, this claim is dependent on "not statistically detectable", which is, of course, not interpretable as "not different".

      This comment is no longer relevant based on the changes we’ve made to the manuscript. 

      There is an interesting result on page 16 that the increases in theta power were observed before a delayed lever press but not an immediate lever press, and then that the theta power declined after an immediate lever press. 

      Thank you for the positive comment. 

      These data are separated by session group (again group 1 is a subset of the 4s sessions, group 2 is a subset of the 8s sessions, and group 3 is ignored). I would much rather see these data analyzed by delay itself or by some sort of strategy fit across delays.

      Thank you for the excellent suggestion. Our new group assignments take delay into account. 

      That being said, I don't see how this description shows up in Figure 6. What does Figure 6 look like if you just separate the sessions by delay?

      We are unclear what the reviewer means by “this description”.  

      Discussion:

      Finally, it is unclear to what extent this task actually gets at the questions originally laid out in the goals and returned to in the discussion. The idea of cognitive effort is interesting, but there is no data presented that this task is cognitive at all. The idea of a resourced cognitive effort and a resistance cognitive effort is interesting, but presumably the way one overcomes resistance is through resource-limited components, so it is unclear that these two cognitive effort strategies are different.

      The basis for the reviewer's assertion that “the way one overcomes resistance is through resource-limited components” is not clear. In the revised version, we have taken greater care to outline how each type of effort signal facilitates performance of the task and articulate these possibilities in our stochastic and RL models. We view the strong evidence for ival tracking presented herein as a critical component of resource-based cognitive effort.

      The authors state that "ival-tracking" (neurons and ensembles that presumably track the number of pellets being delivered on the immediate lever - a fancy name for "expectations") "taps into a resourced-based form of cognitive effort", but no evidence is actually provided that keeping track of the expectation of reward on the immediate lever depends on attention or mnemonic resources. They also state that a "dLP-biased strategy" (waiting out the delay) is a "resistance-based form of cognitive effort" but no evidence is made that going to the delayed side takes effort.

      We challenge the reviewer's assertion that ival tracking is a “fancy name for expectations”. We make no claim about the prospective or retrospective nature of the signal. Clearly, expectations should be prospective and therefore different from ival tracking. Regarding the resistance signal: First, animals avoid the delay lever more often at the 8 sec delay (Figure 1). We have shown that increasing the delay systematically biases responses AWAY from the delay (Linsenbardt et al., 2017). This is consistent with a well-developed literature that rats and mice do not like waiting for delayed reinforcers. We contend that enduring something you don’t like takes effort.

      The authors talk about theta synchrony, but never actually measure theta synchrony, particularly across structures such as amygdala or ventral hippocampus. The authors try to connect this to "the unpleasantness of the delay", but provide no measures of pleasantness or unpleasantness. They have no evidence that waiting out an 8s delay is unpleasant.

      We have added spike-field coherence to better connect with the literature on synchrony. Note that we never refer to our results as “synchrony”. However, we would be remiss not to address the growing literature on theta synchrony in effort allocation. There is a well-developed literature showing that rats and mice do not like waiting for delayed reinforcers. If waiting out the delay were not unpleasant, then why would the animals forgo larger rewards to avoid it?

      The authors hypothesize that the "ival-tracking signal" (the expectation of number of pellets on the immediate lever) "could simply reflect the emotional or autonomic response". Aside from the fact that no evidence for this is provided, if this were to be true, then, in what sense would any of these signals be related to cognitive control?

      This is proposed in the discussion as an alternative explanation for the ival signal. It was added as our due diligence. Emotional state could provide feedback to the currently implemented control mechanism. If waiting for reinforcement is too unpleasant, this could drive the animals toward ival tracking and choosing the immediate option more frequently. We provide this option only as a possibility, not a conclusion. We have clarified this in the revised text. Nevertheless, based on our review of the literature, autonomic tracking, in some form, seems to be the most likely function of ACC (Seamans & Floresco 2022). While the reviewer may disagree with this, we feel it is at least as valid as all the complex, cognitively-based interpretations that commonly appear in the literature.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript explores the neuronal signals that underlie resistance vs resource-based models of cognitive effort. The authors use a delayed discounting task and computational models to explore these ideas. The authors find that the ACC strongly tracks value and time, which is consistent with prior work. Novel contributions include quantification of a resource-based control signal among ACC ensembles, and linking ACC theta oscillations to a resistance-based strategy.

      Strengths:

      The experiments and analyses are well done and have the potential to generate an elegant explanatory framework for ACC neuronal activity. The inclusion of local-field potential / spike-field analyses is particularly important because these can be measured in humans.

      Thank you for the endorsement of our work.

      Weaknesses:

      I had questions that might help me understand the task and details of neuronal analyses.

      (1) The abstract, discussion, and introduction set up an opposition between resource- and resistance-based forms of cognitive effort. It's clear that the authors find evidence for each (ACC ensembles = resource, theta=resistance?) but I'm not sure where the data fall on this dichotomy.

      (a) An overall very simple schematic early in the paper (prior to the MCML model? or even the behavior) may help illustrate the main point.

      (b) In the intro, results, and discussion, it may help to relate each point to this dichotomy.

      (c) What would resource-based signals look like? What would resistance based signals look like? Is the main point that resistance-based strategies dominate when delays are short, but resource-based strategies dominate when delays are long?

      (d) I wonder if these strategies can be illustrated? Could these two measures (dLP vs ival tracking) be plotted on separate axes or extremes, and behavior, neuronal data, LFP, and spectral relationships be shown on these axes? I think Figure 2 is working towards this. Could these be shown for each delay length? This way, as the evidence from behavior, model, single neurons, ensembles, and theta is presented, it can be related to this framework, and the reader can organize the findings.

      These are excellent suggestions, and we have implemented them, where possible. 

      (2) The task is not clear to me.

      (a) I wonder if a task schematic and a flow chart of training would help readers.

      Yes, excellent idea, we have now included this in Figure 1. 

      (b) This task appears to be relatively new. Has it been used before in rats (Oberlin and Grahame is a mouse study)? Some history / context might help orient readers.

      Indeed, this task has been used in rats in several prior studies. Please see the following references (PMID: 39119916, 31654652, 28000083, 26779747, 12270518, 19389183).

      (c) How many total sessions were completed with ascending delays? Was there criteria for surgeries? How many total recording sessions per animal (of the 54?)

      Please note that the delay does not change within a session. There were no criteria for surgery. 

      (d) How many trials completed per session (40 trials OR 45 minutes)? Where are there errors? These details are important for interpreting Figure 1.

      Every animal in this data set completed 40 trials, and we have updated the task description to clarify this issue. There are no errors in this task; rather, the task is designed to measure the tendency to make an impulsive choice (smaller reward now).

      (3) Figure 1 is unclear to me.

      (a) Delayed vs immediate lever presses are being plotted - but I am not sure what is red, and what is blue. I might suggest plotting each animal.

      We have updated Figure 1 considerably for clarity. 

      (b) How many animals and sessions go into each data point?

      We hope this is clarified now with our new group assignments as all sessions were included in the analysis. 

      (c) Table 1 (which might be better referenced in the paper) refers to rats by session. Is it true that some rats (2 and 8) were not analyzed for the bulk of the paper? Some rats appear to switch strategies, and some stay in one strategy. How many neurons come from each rat?

      We have updated Table 1 based on our new groupings. The rats that contribute the most sessions also tend to be represented across the behavioral groups; therefore, it is unlikely that effort allocation strategies across groupings are an idiosyncratic feature of an individual animal.

      (d) Task basics - RT, choice, accuracy, video stills - might help readers understand what is going into these plots

      (e) Does the animal move differently (i.e., RTs) in G1 vs. G2?

      Excellent suggestion. We have added more analysis of the task variables in the revision (e.g., RT, choice comparisons across delays, etc.).

      (4) I wasn't sure how clustered G1 vs. G2 vs G3 are. To make this argument, the raw data (or some axis of it) might help.

      (a) This is particularly important because G3 appears to be a mix of G1 and G2, although upon inspection, I'm not sure how different they really are

      (b) Was there some objective clustering criteria that defined the clusters?

      (c) Why discuss G3 at all? Can these sessions be removed from analysis?

      Based on our updates to the behavioral analysis these comments are no longer relevant. 

      (5) The same applies to neuronal analyses in Fig 3 and 4

      (a) What does a single neuron peri-event raster look like? I would include several of these.

      (b) What does PC1, 2 and 3 look like for G1, G2, and G3?

      (c) Certain PCs are selected, but I'm not sure how they were selected - was there a criteria used? How was the correlation between PCA and ival selected? What about PCs that don't correlate with ival?

      (d) If the authors are using PCA, then scree plots and PETHs might be useful, as well as comparisons to PCs from time-shuffled / randomized data.

      We hope that our reworking of the neural data analysis has clarified these issues. We now include several firing rate examples and aggregate data.   
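      The reviewer's request in (5d) for scree plots compared against time-shuffled data can be sketched as follows. This is a minimal illustration on synthetic firing rates, not the authors' analysis: the array shapes, the single shared signal, the number of shuffles, and the 95th-percentile threshold are all assumptions chosen for the example.

```python
import numpy as np

def explained_variance_ratio(x):
    """Fraction of variance captured by each principal component of x (time bins x neurons)."""
    centered = x - x.mean(axis=0, keepdims=True)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()

def shuffled_null(x, n_shuffles, rng):
    """Scree curves obtained after independently permuting each neuron's time bins."""
    null = np.empty((n_shuffles, min(x.shape)))
    for i in range(n_shuffles):
        shuffled = np.column_stack([rng.permutation(col) for col in x.T])
        null[i] = explained_variance_ratio(shuffled)
    return null

rng = np.random.default_rng(0)
n_bins, n_neurons = 400, 20

# Hypothetical data: one slow signal shared across neurons plus independent noise.
shared = np.sin(np.linspace(0, 4 * np.pi, n_bins))[:, None]
loadings = rng.normal(size=(1, n_neurons))
rates = shared @ loadings + rng.normal(size=(n_bins, n_neurons))

observed = explained_variance_ratio(rates)
null = shuffled_null(rates, n_shuffles=200, rng=rng)

# A component is kept here only if it explains more variance than 95% of shuffles.
threshold = np.percentile(null, 95, axis=0)
n_significant = int(np.sum(observed > threshold))
print("components above the shuffle threshold:", n_significant)
```

      Permuting each neuron's time bins independently preserves each neuron's rate distribution but destroys shared temporal structure, so components that survive the comparison reflect covariation across neurons rather than single-neuron variance.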

      (6) I had questions about the spectral analysis

      (a) Theta has many definitions - why did the authors use 6-12 Hz? Does it come from the hippocampal literature, and is this the best definition of theta? What about other bands: delta (1-4 Hz), theta (4-7 Hz), and beta (13-30 Hz)? These bands are of particular importance because they have been associated with errors, dopamine, and are abnormal in schizophrenia and Parkinson's disease.

      This designation comes mainly from the hippocampal and ACC literature in rodents. In addition, this range best captured the peak in the power spectrum in our data. Note that we focus our analysis on theta given the literature regarding theta in the ACC as a correlate of cognitive control (references in manuscript). We did interrogate other bands as a sanity check, and the results were mostly limited to theta. Given the scope of our manuscript and the concerns raised regarding complexity, we are concerned that adding frequency analyses beyond theta would obfuscate the take-home message.

      However, the spectrograms in Figure 3 show a range of frequencies and highlight the ones in the theta band as the most dynamic prior to the choice. 

      (b) Power spectra and time-frequency analyses may justify the authors' focus. I would show these (y-axis - frequency, x-axis - time, z-axis - power).

      Thank you for the suggestion. We have added this to Figure 3.    

      (7) PC3 as an autocorrelation doesn't seem to be the right way to infer theta entrainment or spike-field relationships, as PCA can be vulnerable to phantom oscillations, and coherence can be transient. It is also difficult to compare to traditional measures of phase-locking. Why not simply use spike-field coherence? This is particularly important with reference to the human literature, which the authors invoke.

      Excellent suggestion. Note that PCA provided a way to classify neurons that exhibited peaks in the autocorrelation at theta frequencies. We have added spike-field coherence, and this analysis confirms the differences in theta entrainment of the spike trains across the behavioral groups. Please see Figure 6D.   
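      As a point of reference for the "traditional measures of phase-locking" the reviewer mentions, the sketch below computes the mean resultant length of spike phases together with a first-order Rayleigh-test approximation. This is a simpler stand-in for spike-field coherence, not the measure reported in Figure 6D. The spike phases are synthetic; in a real analysis they would be read from the theta-band LFP phase at each spike time. The spike counts and concentration parameter are assumptions.

```python
import numpy as np

def mean_resultant_length(phases):
    """Length of the mean unit vector of spike phases (0 = no locking, 1 = perfect locking)."""
    return float(np.abs(np.mean(np.exp(1j * np.asarray(phases)))))

def rayleigh_p_value(phases):
    """First-order approximation to the Rayleigh test p-value, exp(-n * R^2)."""
    n = len(phases)
    z = n * mean_resultant_length(phases) ** 2
    return float(np.exp(-z))

rng = np.random.default_rng(0)
n_spikes = 500

# Hypothetical neuron whose spikes cluster around one theta phase (von Mises, kappa = 1).
locked_phases = rng.vonmises(mu=0.0, kappa=1.0, size=n_spikes)
# Hypothetical neuron whose spikes ignore theta phase entirely.
unlocked_phases = rng.uniform(-np.pi, np.pi, size=n_spikes)

locked_mrl = mean_resultant_length(locked_phases)
unlocked_mrl = mean_resultant_length(unlocked_phases)
locked_p = rayleigh_p_value(locked_phases)

print("locked MRL:", round(locked_mrl, 3), "unlocked MRL:", round(unlocked_mrl, 3))
```

      Because the mean resultant length is biased upward for small spike counts, comparisons across behavioral groups are usually made with matched spike counts or a bias-corrected variant.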

      Reviewer #3 (Public Review):

      Summary:

      The study investigated decision making in rats choosing between small immediate rewards and larger delayed rewards, in a task design where the size of the immediate rewards decreased when this option was chosen and increased when it was not chosen. The authors conceptualise this task as involving two different types of cognitive effort; 'resistance-based' effort putatively needed to resist the smaller immediate reward, and 'resource-based' effort needed to track the changing value of the immediate reward option. They argue based on analyses of the behaviour, and computational modelling, that rats use different strategies in different sessions, with one strategy in which they consistently choose the delayed reward option irrespective of the current immediate reward size, and another strategy in which they preferentially choose the immediate reward option when the immediate reward size is large, and the delayed reward option when the immediate reward size is small. The authors recorded neural activity in anterior cingulate cortex (ACC) and argue that ACC neurons track the value of the immediate reward option irrespective of the strategy the rats are using. They further argue that the strategy the rats are using modulates their estimated value of the immediate reward option, and that oscillatory activity in the 6-12 Hz theta band occurs when subjects use the 'resistance-based' strategy of choosing the delayed option irrespective of the current value of the immediate reward option. If solid, these findings will be of interest to researchers working on cognitive control and ACC's involvement in decision making. However, there are some issues with the experiment design, reporting, modelling and analysis which currently preclude high confidence in the validity of the conclusions.

      Strengths:

      The behavioural task used is interesting and the recording methods should enable the collection of good quality single unit and LFP electrophysiology data. The authors recorded from a sizable sample of subjects for this type of study. The approach of splitting the data into sessions where subjects used different strategies and then examining the neural correlates of each is in principle interesting, though I have some reservations about the strength of evidence for the existence of multiple strategies.

      Thank you for the positive comments. 

      Weaknesses:

      The dataset is very unbalanced in terms of both the number of sessions contributed by each subject, and their distribution across the different putative behavioural strategies (see table 1), with some subjects contributing 9 or 10 sessions and others only one session, and it is not clear from the text why this is the case. Further, only 3 subjects contribute any sessions to one of the behavioural strategies, while 7 contribute data to the other such that apparent differences in brain activity between the two strategies could in fact reflect differences between subjects, which could arise due to e.g. differences in electrode placement. To firm up the conclusion that neural activity is different in sessions where different strategies are thought to be employed, it would be important to account for potential cross-subject variation in the data. The current statistical methods don't do this as they all assume fixed effects (e.g. using trials or neurons as the experimental unit and ignoring which subject the neuron/trial came from).

      In the revised manuscript, we have updated the group assignments. We have also improved our description of the logic and methods for employing these groupings. With this new approach, all sessions are now included in the analysis. The group assignments are made purely on the behavioral statistics of an animal in each session. We feel this approach is preferable to eliminating neurons or sessions with the goal of balancing them, which may introduce bias. Further, the rats that contribute the most sessions also tend to be represented across the behavioral groups; therefore, it is unlikely that effort allocation strategies across groupings are an idiosyncratic feature of an individual animal. As neurons are randomly sampled from each animal on a given session, we feel that we are justified in treating these as fixed effects.

      It is not obvious that the differences in behaviour between the sessions characterised as using the 'G1' and 'G2' strategies actually imply the use of different strategies, because the behavioural task was different in these sessions, with a shorter wait (4 seconds vs 8 seconds) for the delayed reward in the G1 strategy sessions where the subjects consistently preferred the delayed reward irrespective of the current immediate reward size. Therefore the differences in behaviour could be driven by difference in the task (i.e. external world) rather than a difference in strategy (internal to the subject). It seems plausible that the higher value of the delayed reward option when the delay is shorter could account for the high probability of choosing this option irrespective of the current value of the immediate reward option, without appealing to the subjects using a different strategy.

      Further, even if the differences in behaviour do reflect different behavioural strategies, it is not obvious that these correspond to allocation of different types of cognitive effort. For example, subjects' failure to modify their choice probabilities to track the changing value of the immediate reward option might be due simply to valuing the delayed reward option higher, rather than not allocating cognitive effort to tracking immediate option value (indeed this is suggested by the neural data). Conversely, if the rats assign higher value to the delayed reward option in the G1 sessions, it is not obvious that choosing it requires overcoming 'resistance' through cognitive effort.

      The RL modelling used to characterise the subject's behavioural strategies made some unusual and arguably implausible assumptions:

      Thank you for the feedback. Based on these comments (and those above), we have completely reworked the RL model. In addition, we’ve taken care to separate out the variables that correspond to a resistance- versus a resource-based signal.

      There were also some issues with the analyses of neural data which preclude strong confidence in their conclusions:

      Figure 4I makes the striking claim that ACC neurons track the value of the immediately rewarding option equally accurately in sessions where two putative behavioural strategies were used, despite the behaviour being insensitive to this variable in the G1 strategy sessions. The analysis quantifies the strength of correlation between a component of the activity extracted using a decoding analysis and the value of the immediate reward option. However, as far as I could see this analysis was not done in a cross-validated manner (i.e. evaluating the correlation strength on test data that was not used for either training the MCML model or selecting which component to use for the correlation). As such, the chance level correlation will certainly be greater than 0, and it is not clear whether the observed correlations are greater than expected by chance.

      We have added more rigorous methods to assess the ival tracking signal (Figure 4 and 5). In addition, we’ve dropped the claim that ival tracking is the same across the behavioral groups. We suspect that this was an artifact of a suboptimal group assignment approach in the previous version. 
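      The reviewer's point about chance-level correlations can be illustrated with a small simulation. It is a sketch under stated assumptions, not a reanalysis of the data: the "components" and the target are pure noise, and the number of components (50), the half-and-half split, and the number of simulated sessions are hypothetical. The 40 trials per session match the task description above.

```python
import numpy as np

def abs_corr(a, b):
    """Absolute Pearson correlation between two 1-D arrays."""
    return abs(np.corrcoef(a, b)[0, 1])

def simulate_session(rng, n_trials=40, n_components=50):
    """Pick the best-correlated noise component on training trials, then score it on held-out trials."""
    components = rng.normal(size=(n_trials, n_components))
    target = rng.normal(size=n_trials)  # stands in for the tracked value; unrelated to components
    train = np.arange(n_trials) < n_trials // 2
    test = ~train
    train_scores = [abs_corr(components[train, k], target[train]) for k in range(n_components)]
    best = int(np.argmax(train_scores))
    return train_scores[best], abs_corr(components[test, best], target[test])

rng = np.random.default_rng(1)
results = np.array([simulate_session(rng) for _ in range(200)])
mean_in_sample = results[:, 0].mean()
mean_held_out = results[:, 1].mean()

print("mean |r| on the trials used for selection:", round(mean_in_sample, 2))
print("mean |r| on held-out trials:", round(mean_held_out, 2))
```

      Even with no true relationship, selecting the component on the same trials used to report the correlation yields a sizable |r|, whereas the held-out score falls back to what sampling noise alone produces. This is why the selection step needs to sit inside the cross-validation loop.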

      An additional caveat with the claim that ACC is tracking the value of the immediate reward option is that this value likely correlates with other behavioural variables, notably the current choice and recent choice history, that may be encoded in ACC. Encoding analyses (e.g. using linear regression to predict neural activity from behavioural variables) could allow quantification of the variance in ACC activity uniquely explained by option values after controlling for possible influence of other variables such as choice history (e.g. using a coefficient of partial determination).

      We agree that the ival tracking signal may be influenced by other variables, especially ones that are not cognitive but are instead generated by the autonomic system. We have included a discussion of this possibility in the Discussion section. Our previous work has explored the influence of choice history on neural activity; please see White et al. (2024).
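      The encoding analysis the reviewer proposes, using a coefficient of partial determination, can be sketched as below. The data are synthetic and the variable names (`previous_choice`, `value`, the two simulated cells) are hypothetical; the example only shows how the statistic separates a neuron driven by value from one driven by a correlated covariate.

```python
import numpy as np

def residual_sum_of_squares(design, y):
    """RSS of an ordinary least-squares fit of y on design, with an intercept added."""
    x = np.column_stack([np.ones(len(y)), design])
    beta, *_ = np.linalg.lstsq(x, y, rcond=None)
    return float(np.sum((y - x @ beta) ** 2))

def partial_r2(y, predictor_of_interest, covariates):
    """Share of the variance left unexplained by covariates that the predictor then explains."""
    rss_reduced = residual_sum_of_squares(covariates, y)
    rss_full = residual_sum_of_squares(np.column_stack([covariates, predictor_of_interest]), y)
    return (rss_reduced - rss_full) / rss_reduced

rng = np.random.default_rng(2)
n_trials = 2000

# Hypothetical behavior: option value is correlated with the previous choice.
previous_choice = rng.integers(0, 2, size=n_trials).astype(float)
value = 0.8 * previous_choice + rng.normal(scale=0.5, size=n_trials)

# One simulated cell encodes value; the other encodes only the previous choice.
rate_value_cell = 2.0 * value + rng.normal(size=n_trials)
rate_choice_cell = 2.0 * previous_choice + rng.normal(size=n_trials)

naive_r2_choice_cell = np.corrcoef(value, rate_choice_cell)[0, 1] ** 2
partial_value_cell = partial_r2(rate_value_cell, value, previous_choice)
partial_choice_cell = partial_r2(rate_choice_cell, value, previous_choice)

print("naive r^2, value vs choice cell:", round(naive_r2_choice_cell, 3))
print("partial R^2, value cell:", round(partial_value_cell, 3))
print("partial R^2, choice cell:", round(partial_choice_cell, 4))
```

      The choice-only cell shows a clear raw correlation with value simply because value and choice history covary, but its partial R² for value is close to zero once choice history is included as a covariate.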

      Figure 5 argues that there are systematic differences in how ACC neurons represent the value of the immediate option (ival) in the G1 and G2 strategy sessions. This is interesting if true, but it appears possible that the effect is an artefact of the different distribution of option values between the two session types. Specifically, due to the way that ival is updated based on the subjects' choices, in G1 sessions where the subjects are mostly choosing the delayed option, ival will on average be higher than in G2 sessions where they are choosing the immediate option more often. The relative number of high, medium and low ival trials in the G1 and G2 sessions will therefore be different, which could drive systematic differences in the regression fit in the absence of real differences in the activity-value relationship. I have created an ipython notebook illustrating this, available at: https://notebooksharing.space/view/a3c4504aebe7ad3f075aafaabaf93102f2a28f8c189ab9176d4807cf1565f4e3. To verify that this is not driving the effect it would be important to balance the number of trials at each ival level across sessions (e.g. by subsampling trials) before running the regression.

      This is an excellent point and led us to abandon the linear regression-based approach to quantifying differences in ival coding across behavioral groups.
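      The balancing step the reviewer suggests (subsampling so that each ival level contributes the same number of trials to both session types) can be sketched as follows. The ival range of 0 to 6, the trial counts, and the skewed distributions are hypothetical values chosen for illustration, and `balanced_indices` is a name introduced here, not a function from the manuscript.

```python
import numpy as np

def balanced_indices(levels_a, levels_b, rng):
    """Return sorted trial indices into a and b with equal trial counts at every shared level."""
    keep_a, keep_b = [], []
    for level in np.intersect1d(levels_a, levels_b):
        idx_a = np.flatnonzero(levels_a == level)
        idx_b = np.flatnonzero(levels_b == level)
        n = min(len(idx_a), len(idx_b))
        # Randomly subsample the larger set down to the size of the smaller one.
        keep_a.extend(rng.choice(idx_a, size=n, replace=False))
        keep_b.extend(rng.choice(idx_b, size=n, replace=False))
    return np.sort(np.array(keep_a, dtype=int)), np.sort(np.array(keep_b, dtype=int))

rng = np.random.default_rng(3)
levels = np.arange(7)  # hypothetical ival levels
p_high = np.array([1, 1, 2, 3, 5, 8, 10], dtype=float)
p_high /= p_high.sum()

# G1-like sessions skew toward high ival; G2-like sessions skew toward low ival.
ival_g1 = rng.choice(levels, size=300, p=p_high)
ival_g2 = rng.choice(levels, size=300, p=p_high[::-1])

idx_g1, idx_g2 = balanced_indices(ival_g1, ival_g2, rng)
counts_g1 = np.bincount(ival_g1[idx_g1], minlength=len(levels))
counts_g2 = np.bincount(ival_g2[idx_g2], minlength=len(levels))

print("trials per ival level after balancing:", counts_g1)
```

      Repeating the subsampling with different random seeds and averaging the downstream statistic reduces the dependence of the result on any one random draw.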

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      This paper was extremely hard to read. In addition to the issues raised in the public review (overly complex and incomplete analyses), one of the hardest things to deal with was the writing.

      Thank you for the feedback. Hopefully we have addressed this with our thorough rewrite. 

      The presentation was extremely hard to follow. I had to read through it several times to figure out what the task was. It wasn't until I got to the RL model Figure 2A that I realized what was really going on with the task. I strongly recommend having an initial figure that lays out the actual task (without any RL or modeling assumptions) and identifies the multiple different kinds of sessions. What is the actual data you have to start with? That was very unclear.

      Excellent idea. We have implemented this in Figure 1.  

      Labeling session by "group" is very confusing. I think most readers take "group" as the group of subjects, but that's not what you mean at all. You mean some sessions were one way and some were another. (And, as I noted in the public review, you ignore many of the sessions, which I think is not OK.) I think a major rewrite would help a lot. Also, I don't think the group analysis is necessary at all. In the public review, I recommend doing the analyses very differently and more classically.

      We have updated the group assignments in a manner that is more intuitive, reflects the delays, and includes all sessions.  

      The paper is full of arbitrary abbreviations that are completely unnecessary. Every time I came to "ival", I had to translate that into "number of pellets delivered on the immediate lever" and every time I came to dLP, I had to translate that into "delayed lever press". Making the text shorter does not make the text easier to read. In general, I was taught that unless the abbreviation is the common term (such as "DNA" not "deoxyribonucleic acid"), you should never use an abbreviation. While there are some edge cases (ACC probably over "anterior cingulate cortex"), dLP, iLP, dLPs, iLPs, ival, are definitely way over the "don't do that" line.

      We completely agree here and apologize for the excessive use of abbreviations. We have removed nearly all of them.

      The figures were incomplete, poorly labeled, and hard to read. A lot of figures were missing, for example

      Basic task structure

      Basic behavior on the task

      Scatter plot of the measures that you are clustering (lever press choice X number of pellets on the immediate lever, you can use color or multiple panels to indicate the delay to the delayed lever)

      Figure 3 is just a couple of examples. That isn't convincing at all.

      Figure 4 is missing labels. In Figure 4, I don't understand what you are trying to say.

      I don't see how the results on page 16 arise from Figure 6. I strongly recommend starting from the actual data and working your way to what it means rather than forcing this into this unreasonable "session group" analysis.

      We have completely reworked the Figures for clarity and content. 

      The statement that "no prior study has explored the cellular correlates of cognitive effort" is ludicrous and insulting. There are dozens of experiments looking at ACC in cognitive effort tasks, in humans, other primates, and rodents. There are many dozens of experiments looking at cellular correlates in intertemporal choice tasks, some with neural manipulations, some with ensemble recordings. There are many dozens of experiments looking at cellular relationships to waiting out a delay.

      We agree that our statement was extremely imprecise. We have updated this to say: “Further, a role for theta oscillations in allocating physical effort has been identified. However, the cellular mechanisms within the ACC that control and deploy types of cognitive effort have not been identified.”

      Reviewer #2 (Recommendations For The Authors):

      In Figure 2, the panels below E and F are referred to as 'right' - but they are below? I would give them letters.

      I would make sure that animal #s, neuron #s, and LFP#s are clearly presented in the results and in each figure legend. This is important to follow the results throughout the manuscript.

      Some additional proofreading ('Fronotmedial') might help with clarity.

      Based on our updates, this is no longer relevant.  

      Reviewer #3 (Recommendations For The Authors):

      In addition to the suggestions above to address specific issues, it would be useful to report some additional information about aspects of the experiments and analyses:

      Specify how spike sorting was performed and what metrics were used to select well isolated single units.

      Done.

      Provide histology showing the recording locations for each subject.

      Histological assessments of electrode placements are provided in White et al. (2024), but we provide an example placement. This has been added to the text.

      Indicate the sequence of recording sessions that occurred for each subject, including for each session what delay duration was used and which dataset the session contributed to, and indicate when the neural probes were advanced between sessions.

      We feel that this adds unnecessary complexity, as we make no claims about holding units across sessions or about differences in coding along the dorsoventral gradient of ACC.

      Indicate the experimental unit when reporting uncertainty measures in figure legends (e.g. mean +/- SEM across sessions).

      Done.

    1. Reviewer #1 (Public review):

      The paper from Hudait and Voth details a number of coarse-grained simulations as well as some experiments focused on the stability of HIV capsids in the presence of the drug lenacapavir. The authors find that LEN hyperstabilizes the capsid, making it fragile and prone to breaking inside the nuclear pore complex.

      I found the paper interesting. I have a few suggestions for clarification and/or improvement.

      (1) How directly comparable are the NPC-capsid and capsid-only simulations? A major result rests on the conclusion that the kinetics of rupture are faster inside the NPC, but are the numbers of LENs bound identical? Is the time really comparable, given that the simulations have different starting points? I'm not really doubting the result, but I think it could be made more rigorous/quantitative.

      (2) Related to the above, it is stated on page 12 that, based on the estimated free-energy barrier, pentamer dissociation should occur in ~10 us of CG time. But certainly, the simulations cover at least this length of time?

      (3) At first, I was surprised that even in a CG simulation, LEN would spontaneously bind to the correct site. But if I read the SI correctly, LEN was parameterized specifically to bind to hexamers and not pentamers. This is fine, but I think it's worth describing in the main text.

    2. Reviewer #2 (Public review):

      Here, Hudait et al. use CG modeling to investigate the mechanism by which lenacapavir (LEN) treats HIV capsids that dock to the nuclear pore complex (NPC). However, the manuscript fails to present meaningful findings that were previously unreported in the literature, and is thus of low impact. Many claims made in the manuscript are not substantiated by the presented data. Key mechanistic details that the work purports to reveal are artifacts of the parameterization choices or simulation/analysis design, with the simulations said to reveal details that they were specifically biased to reproduce. This makes the manuscript highly problematic, as its contributions to the literature would represent misconceptions based on oversights in modeling, and thus mislead future readers.

      (1) Considering the literature, it is unclear that the manuscript presents new scientific discoveries. The following are results from this paper that have been previously reported:

      (a) LEN-bound capsid can dock to the nuclear pore (Figure 2; see e.g. 10.1016/j.cell.2024.12.008 or 10.1128/mbio.03613-24).

      (b) NUP98 interacts with the docked capsid (Figure 2; see e.g. 10.1016/j.virol.2013.02.008 or 10.1038/s41586-023-06969-7 or 10.1016/j.cell.2024.12.008).

      (c) LEN and NUP98 compete for a binding interface (Figure 2; see e.g. 10.1126/science.abb4808 or 10.1371/journal.ppat.1004459).

      (d) LEN creates capsid defects (Figure 3 and 5, see e.g. 10.1073/pnas.2420497122).

      (e) RNP can emerge from a damaged capsid (Figure 3 and 5; see e.g. 10.1073/pnas.2117781119 or 10.7554/eLife.64776).

      (f) LEN hyperstabilizes/reduces the elasticity of the capsid lattice (Figure 6; see e.g. 10.1371/journal.ppat.1012537).

      (2) The mechanistic findings related to how these processes occur are problematic, either based on circular reasoning or unsubstantiated, based on the presented data. In some cases, features of parameterization and simulation/analysis design are erroneously interpreted as predictions by the CG models.

      (a) Claim: LEN-bound capsids remain associated with the NPC after rupture. CG simulations did not reach the timescale needed to demonstrate continued association or failure to translocate, leaving the claim unsubstantiated.

      (b) Claim: LEN contributes to loss of capsid elasticity. The authors do not measure elasticity here, only force constants of fluctuations between capsomers in freely diffusing capsids. Elasticity is defined as the ability of a material to undergo reversible deformation when subjected to stress. Other computational works that actually measure elasticity (e.g., 10.1371/journal.ppat.1012537) could represent a point of comparison, but are not cited. The changes in force constants in the presence of LEN are shown in Figure 6C, but the text of the scale bar legend and units of k are not legible, so one cannot discern the magnitude or significance of the change.

      (c) Claim: Capsid defects are formed along striated patterns of capsid disorder. Data is not presented that correlates defects/cracks with striations.

      (d) Claim: Typically 1-2 LEN, but rarely 3 bind per capsid hexamer. The authors state: "The magnitude of the attractive interactions was adjusted to capture the substoichiometric binding of LEN to CA hexamers (Faysal et al., 2024). ... We simulated LEN binding to the capsid cone (in the absence of NPC), which resulted in a substoichiometric binding (~1.5 LEN per CA hexamer), consistent with experimental data (Singh et al., 2024)." This means LEN was specifically parameterized to reproduce the 1-2 binding ratio per hexamer apparent from experiments, so this was a parameterization choice, not a prediction by CG simulations as the authors erroneously claim: "This indicates that the probability of binding a third LEN molecule to a CA hexamer is impeded, likely due to steric effects that prevent the approach of an incoming molecule to a CA hexamer where 2 LEN molecules are already associated. ... Approximately 20% of CA hexamers remain unoccupied despite the availability of a large excess of unbound LEN molecules. This suggests a heterogeneity in the molecular environment of the capsid lattice for LEN binding." These statements represent gross over-interpretation of a bias deliberately introduced during parameterization, and the "finding" represents circular reasoning. Also, if "steric effects" play any role, the authors could analyze the model to characterize and report them rather than simply speculate.

      (e) Claim: Competition between NUP98 and LEN regulates capsid docking. The authors state: "A fraction of LEN molecules bound at the narrow end dissociate to allow NUP98 binding to the capsid ... Therefore, LEN can inhibit the efficient binding of the viral cores to the NPC, resulting in an increased number of cores in the cytoplasm." Capsid docking occurs regardless of the presence of LEN, and appears to occur at the same rate as the LEN-free capsid presented in the authors' previous work (Hudait & Voth, 2024). The presented data simply show that there is a fluctuation of bound LEN, with about 10 fewer (<5%) bound at the end of the simulation than at the beginning, and the curve (Figure 2A) does not clearly correlate with increased NUP98 contact. In that case, no data is shown that connects LEN binding with the regulation of the docking process. Further, the two quoted statements contradict each other. The presented data appear to show that NUP outcompetes LEN binding, rather than LEN inhibiting NUP binding. The "Therefore" statement is an attempt to reconcile with experimental studies, but is not substantiated by the presented data.

      (f) Claim: LEN binding leads to spontaneous dissociation of pentamers. The CG simulation trajectories show pentamer dissociation. However, it is quite difficult to believe that a pentamer in the wide end of the capsid would dissociate and diffuse 100 nm away before a hexamer in the narrow end (previously between two pentamers and now only partially coordinated, also in a highly curved environment, and further under the force of the extruding RNA) would dissociate, as in Figure 2B. A more plausible explanation could be force balance between pent-hex versus hex-hex contacts, an aspect of CG parameterization. No further modeling is presented to explain the release of pentamers, and changes in pent-hex stiffness are not apparent in the force constant fluctuation analysis in Figure 6C.

      (g) Claim: WTMetaD simulations predict capsid rupture. The authors state: "In WTMetaD simulations, we used the mean coordination number (Figure S6) between CA proteins in pentamers and in hexamers as the reaction coordinate." This means that the coordination number, the number of pent-hex contacts, is the bias used to accelerate simulation sampling. Yet the authors then interpret a change in coordination number leading to capsid rupture as a discovery, representing a fundamental misuse of the WTMetaD method. Changes in coordination number cannot be claimed as an emergent property when they are in fact the applied bias, when the simulation forced them to sample such states. The bias must be orthogonal to the feature of interest for that feature to be discoverable. While the reported free energies are orthogonal to the reaction coordinate, the structural and stepwise-mechanism "findings" here represent circular reasoning.

      (3) Another major concern with this work is the excessive self-citation, and the conspicuous lack of engagement with similar computational modeling studies that investigate the HIV capsid and its interactions with LEN, capsid mechanical properties relevant to nuclear entry, and other capsid-NPC simulations (e.g., 10.1016/j.cell.2024.12.008 and 10.1371/journal.ppat.1012537). Other such studies available in the literature include examination of varying aspects of the system at both CG and all-atom levels of resolution, which could be highly complementary to the present work and, in many cases, lend support to the authors' claims rather than detract from them. The choice to omit relevant literature implies either a lack of perspective or a lack of collegiality, which the presentation of the work suffers from. Overall, it is essential to discuss findings in the context of competing studies to give readers an accurate view of the state of the field and how the present work fits into it. It is appropriate in a CG modeling study to discuss the potential weaknesses of the methodology, points of disagreement with alternative modeling studies, and any lack of correlation with a broader range of experimental work. Qualitative agreement with select experiments does not constitute model validation.

      (4) Other critiques, questions, concerns:

      (a) The first Results sub-heading presents "results", complete with several supplementary figures and a movie that are from a previous publication about the development of the HIV capsid-NPC model in the absence of LEN (Hudait & Voth, 2024). This information should be included as part of the introduction or an abbreviated main-text methods section rather than being included within Results as if it represents a newly reported advancement, as this could be misleading.

      (b) The authors say the unbiased simulations of capsid-NPC docking were run as two independent replicates, but results from only one trajectory are ever shown plotted over time. It is not mentioned if the time series data are averaged or smoothed, so what is the shadow in these plots (e.g., Figures 1, 2, and Supplementary Figure 5)?

      (c) Why do the insets showing LEN binding in Figure 2A look so different from the models they are apparently zoomed in on? Both instances really look like they are taken from different simulation frames, rather than being a zoomed-in view.

      (d) What are the sudden jerks apparent in the SI movies? Perhaps this is related to the rate at which trajectory frames are saved, but occasionally, during the relatively smooth motion of the capsid-NPC complex, something dramatic happens all of a sudden in a frame. For example, significant and apparently instantaneous reorientation of the cone far beyond what preceding motions suggest is possible (SI movie 2, at timestamp 0.22), RNP extrusion suddenly in a single frame (SI movie 2, at timestamp 0.27), and simultaneous opening of all pentamers all at once starting in a single frame (SI movie 2, at timestamp 0.33). This almost makes the movie look generated from separate trajectories or discontinuous portions of the same trajectory. If movies have been edited for visual clarity (e.g., to skip over time when "nothing" is happening and focus on the exciting aspects), then the authors should state so in the captions.

      (e) Figure 3c presents a time series of the degree of defects at pent-hex and hex-hex interfaces, but I do not understand the normalization. The authors state, "we represented the defects as the number of under-coordinated CA monomers of the hexamers at the pentamer-hexamer-pentamer and hexamer-hexamer interface as N_Pen-Hex and N_Hex-Hex ... Note that in N_Pen-Hex and N_Hex-Hex are calculated by normalizing by the total number of CA pentamer (12) and hexamer rings (209) respectively." Shouldn't the number of uncoordinated monomers be normalized by the number of that type of monomer, rather than the number of capsomers/rings? E.g., 12*5 and 209*6, rather than 12 and 209?

      (f) The authors state that "Although high computational cost precluded us from continuing these CG MD simulations, we expect these defects at the hexamer-hexamer interface to propagate towards the high curvature ends of the capsid." The defects being reported are apparently propagating from (not towards) the high curvature ends of the capsid.

      (g) The first half of the paper uses the color orange in figures to indicate LEN, but the second half uses orange to indicate defects, and this could be confusing for some readers. Both LEN and "defects" are simply a cluster of spheres, so highlighted defects appear to represent LEN without careful reading of captions.

      (h) The SI Figure S3 caption says "The CA monomers to which at least one LEN molecule is bound are shown in orange spheres. The CA monomers to which no LEN molecule is bound are shown in white spheres." In contradiction, the main-text Fig 2 caption says "The CA monomers to which at least one LEN molecule is bound are shown in white spheres. The CA monomers to which no LEN molecule is bound are shown in orange spheres." One of these must be a typo.

      (i) The authors state that: "CG MD simulations and live-cell imaging demonstrate that LEN-treated capsids dock at the NPC and rupture at the narrow end when bound to the central channel and then remain associated to the NPC after rupture." However, the live cell imaging data do not show where rupture occurs, such that this statement is at least partially false. It is also unclear that CG simulations show that cores remain bound following rupture, given that simulations were not extended to the timescale needed to observe this, again rendering the statement partially false.

      (j) The authors state: "We previously demonstrated that the RNP complex inside the capsid contributes to internal mechanical strain on the lattice driven by CACTD-RNP interactions and condensation state of RNP complex (Hudait & Voth, 2024)." In that case, why do the present CG models detect no difference in results for condensed versus uncondensed RNP?

      (k) The authors state: "The distribution demonstrates that the binding of LEN to the distorted lattice sites is energetically favorable. Since LEN localizes at the hydrophobic pocket between two adjoining CA monomers, it is sterically favorable to accommodate the incoming molecule at a distorted lattice site. This can be attributed to the higher available void volume at the distorted lattice relative to an ordered lattice, the latter being tightly packed. This also allows the drug molecule to avoid the multitude of unfavorable CA-LEN interactions and establish the energetically favorable interactions leading to a successful binding event." What multitude of unfavorable interactions are the authors referring to? Data is not presented to substantiate the claim of increased void volume between hexamers in the distorted lattice. Capsomer distortion is shown as a schematic in Figure 6A rather than in the context of the actual model.

      (l) The authors state that "These striated patterns also demonstrate deviations from ideal lattice packing." What does ideal lattice packing mean in this context, where hexamers are in numerous unique environments in terms of curvature? What is the structural reference point?

      (m) If pentamer-hexamer interactions are weakened in the presence of LEN, why are differences at these interfaces not apparent in the Figure 6C data that shows stiffening of the interactions between capsomer subunits?

      (n) The authors state: "Lattice defects arising from the loss of pentamers and cracks along the weak points of the hexameric lattice drive the uncoating of the capsid." The word rupture or failure should be used here rather than uncoating; it is unclear that the authors are studying the true process of uncoating and whether the defects induced by LEN binding relate in any way to uncoating.

      (o) The authors state: "LEN-treated broken cores are stabilized by the interaction with the disordered FG-NUP98 mesh at the NPC." But no data is presented to demonstrate that capsid stability is increased by NUP98 interaction. In fact, the presented data could suggest the opposite since capsids in contact with NUP98 in the NPC appeared to rupture faster than freely diffusing capsids.

      (p) The authors state: "LEN binding stimulates similar changes in free capsids, but they occur with lower frequency on similar time scales, suggesting that the cores docked at the NPC are under increased stress, resulting in more frequent weakening of the hexamer-pentamer and hexamer-hexamer interactions, as well as more nucleation of defects at the hexamer-hexamer interface. ... Our results suggest that in the presence of the LEN, capsid docking into the NPC central channel will increase stress, resulting in more frequent breaks in the capsid lattice compared to free capsids." The first is a run-on sentence. The results shown support that LEN stimulates changes in free capsids to happen faster, but not more frequently. The frequency with which an event occurs is separate from the speed with which the event occurs.

      (q) The authors state: "A possible mechanistic pathway of capsid disassembly can be that multiple pentamers are dissociated from the capsid sequentially, and the remaining hexameric lattice remains stabilized by bound LEN molecules for a time, before the structural integrity of the remaining lattice is compromised." This statement is inconsistent with experimental studies that say LEN does not lead to capsid disassembly, and may even prevent disassembly as part of its disruption of proper uncoating (e.g., 10.1073/pnas.2420497122 previously published by the authors).

      (r) Finally, it remains a concern with the authors' work that the bottom-up solvent-free CG modeling software used in this and supporting works is not open source or even available to other researchers like other commonly used molecular dynamics software packages, raising significant questions about transparency and reproducibility.
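
      The normalization question raised in point (e) above can be made concrete with a short sketch. It contrasts dividing a count of under-coordinated monomers by the number of rings (12 pentamers, 209 hexamers, as quoted from the manuscript) with dividing by the number of monomers (12*5 and 209*6). The function name and the example count are hypothetical and chosen only for illustration; they are not taken from the manuscript.

```python
N_PENTAMERS = 12   # pentamer rings, from the quoted text
N_HEXAMERS = 209   # hexamer rings, from the quoted text


def normalized_defects(n_undercoordinated, n_rings, monomers_per_ring=None):
    """Normalize a defect count per ring, or per monomer if a ring size is given."""
    if monomers_per_ring is None:
        denominator = n_rings
    else:
        denominator = n_rings * monomers_per_ring
    return n_undercoordinated / denominator


# Hypothetical number of under-coordinated hexamer monomers (illustration only).
count = 30

per_ring = normalized_defects(count, N_HEXAMERS)
per_monomer = normalized_defects(count, N_HEXAMERS, monomers_per_ring=6)

print(f"per hexamer ring: {per_ring:.4f}")
print(f"per hexamer monomer: {per_monomer:.4f}")
```

      Under these assumptions the two conventions differ by a constant factor equal to the ring size, and only the per-monomer value is bounded by 1, which is why the choice of denominator affects how the plotted values should be read.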

    3. Author response:

      Before providing a brief provisional response to the two reviews, it is important to reiterate a few key points about our work. First, our paper is largely a computational biophysics paper, augmented by experimental results. Generally speaking, computational biophysics work intends to achieve one of two things (or both). One is to provide more molecular level insight into various behaviors of biomolecular systems that have not been (or cannot be) provided by qualitative experimental results alone. The second general goal of computational biophysics is to formulate new hypotheses to be tested subsequently by experiment. In our paper, we have achieved both of these goals and then confirmed the key computational results by experiment.

      The first reviewer has some valuable points, which can be addressed as follows (and will be emphasized in the revised version of the paper): (1) Yes, the simulations of capsid rupture in the NPC and capsid-only are directly comparable as both have approximately the same number of bound LEN, as determined by following the LEN-capsid interaction protocol described in the main text (around Fig 6) and in the SI section S3; (2) While we have stressed this point in several places in the manuscript, here again we stress that coarse-grained (CG) MD time is not the same as real time. The point of CG simulations is to accelerate the timescale of the MD and the associated sampling, so the CG “time” from the MD integrator needs to be rescaled to associate a real time to it. As such, our CG simulation is not representing a microsecond of real time but rather something much longer. We will emphasize this again in the revised text. (3) Actually, we think that the parameterization of the LEN model and the LEN-capsid interactions is well described in the text associated with Fig 6 and in SI section S3. It is true that this one part of the CG model was parameterized “top-down” given the good experimental structures of bound LEN to capsid and other data, but the rest of the CG model is “bottom-up” (meaning developed from well-defined coarse-graining statistical mechanics as applied to molecular level structures and interactions, see also below).

      As for the second reviewer, this review is quite problematic in our view as the reviewer seems to think that quoting a number of qualitative experimental results is sufficient to undermine the impact of our paper (it is not) and, furthermore, the reviewer appears to have a very minimal understanding of “bottom-up” CG modeling, which we have utilized. This modeling does not in fact rely on the “assumptions” this reviewer alleges we have relied on. (As an aside, it could be helpful for this reviewer to study the review by Jin et al., https://doi.org/10.1021/acs.jctc.2c00643, in order to become more familiar with the field and our approach before criticizing it.) We also note that our main HIV capsid-NPC docking model is already published in PNAS (https://doi.org/10.1073/pnas.2313737121), where it underwent rigorous peer review. In our forthcoming full response to the reviews and in the revised paper we will attempt to address a number of this reviewer’s comments, but the number, extent, and tone of this collection of criticisms, for us, calls into question the objectivity of this reviewer, not to mention the reviewer’s rather weak understanding of what we have done and how we have done it.

      Finally, while we certainly appreciate the overall positive eLife assessment, we are disappointed by the statement “some mechanistic interpretations rely on assumptions embedded in the simulations, leaving parts of the evidence incomplete”. Of course, all simulations (and experiments) rely on certain assumptions, but we have gone to great lengths to provide a “bottom-up” approach to our modeling, based on underlying molecular level structures and interactions, and we have provided experimental validation of the main simulation predictions. It seems that the comments of the second reviewer may have influenced this point of view, but we do not feel it is justified.

    1. Reviewer #1 (Public review):

      Summary:

      Spinal projection neurons in the anterolateral tract transmit diverse somatosensory signals to the brain, including touch, temperature, itch, and pain. This group of spinal projection neurons is heterogeneous in their molecular identities, projection targets in the brain, and response properties. While most anterolateral tract projection neurons are multimodal (responding to more than one somatosensory modality), it has been shown that cold-selective projection neurons exist in lamina I of the spinal cord dorsal horn. Using a combination of anatomical and physiological approaches, the authors discovered that the cold-selective lamina I projection neurons are heavily innervated by Trpm8+ sensory neuron axons, with calb1+ spinal projection neurons primarily capturing these cold-selective lamina I projection neurons. These neurons project to specific brain targets, including the PBNrel and cPAG. This study adds to the ongoing effort in the field to identify and characterize spinal projection neuron subtypes, their physiology, and functions.

      Strengths:

      (1) The combination of anatomical and physiological analyses is powerful and offers a comprehensive understanding of the cold-selective lamina I projection neurons in the spinal cord dorsal horn. For example, the authors used detailed anatomical methods, including EM imaging of Trpm8+ axon terminals contacting the Phox2a+ lamina I projection neurons. Additionally, they recorded stimulus-evoked activity in Trpm8-recipient neurons, carefully selected by visual confirmation of tdTomato and GFP juxtaposition, which is technically challenging.

      (2) This study identifies, for the first time, a molecular marker (calb1) that labels cold-selective lamina I projection neurons. Although calb1+ projection neurons are not entirely specific to cold-selective neurons, using an intersectional strategy combined with other genes enriched in this ALS group or cold-induced FosTRAP may further enhance specificity in the future.

      (3) This study shows that cold-selective lamina I projection neurons specifically innervate certain brain targets of the anterolateral tract, including the NTS, PBNrel, and cPAG. This connectivity provides insights into the role of these neurons in cold sensation, which will be an exciting area for future research.

      Weaknesses:

      (1) The sample size for the ex vivo electrophysiology is small. Given the difficulty and complexity of the preparation, this is understandable. However, a larger sample size would have strengthened the authors' conclusions.

      (2) The authors used tdTomato expression to identify brain targets innervated by these cold-selective lamina I projection neurons. Since tdTomato is a soluble fluorescent protein that fills the entire cell, using synaptophysin reporters (e.g., synaptophysin-GFP) would have been more convincing in revealing the synaptic targets of these projection neurons.

      (3) The summary cartoon shown in Figure 7 can be misleading because this study did not determine whether these cold-selective lamina I projection neurons have collateral branches to multiple brain targets or if there are anatomical subtypes that may project exclusively to specific targets. For example, a recent study (Ding et al., Neuron, 2025) demonstrated that there are PBN-projecting spinal neurons that do not project to other rostral brain areas. Furthermore, based on the authors' bulk labeling experiments, the three main brain targets are NTS, PBNrel, and cPAG. The VPL projection is very sparse and almost negligible.

    1. Reviewer #2 (Public review):

      Lang et al. investigate the contribution of individual neuronal encoding of specific task features to population dynamics and behavior. Using a taste-based decision-making behavioral task with electrophysiology from the mouse gustatory cortex and computational modeling, the authors reveal that neurons encoding sensory, perceptual, and decision-related information with linear and categorical patterns are essential for driving neural population dynamics and behavioral performance. Their findings suggest that individual linear and categorical coding units have a significant role in cortical dynamics and perceptual decision-making behavior.

      Overall, the experimental and analytical work is of very high quality, and the findings are of great interest to the taste coding field, as well as to the broader systems neuroscience field.

      I have a couple of suggestions to further enhance the authors' important conclusions:

      My main comment is the distinction between constrained and unconstrained units. The authors train a small percentage of units to match the real neural data (constrained units), and then find some unconstrained units that are similar to the real neural data and some that are not. As far as I could tell, the relative fraction of constrained and unconstrained units in the trained RNN is not reported; I assume the constrained ones are a much smaller population, but this is unclear. The selection of different groups of neurons for the RNN ablation experiments appears to be based on their response profiles only. Therefore, if I understood correctly, both constrained and unconstrained units are ablated together for a given response category (e.g., linear or step-perception). It would be useful, therefore, to separately compare the effects of constrained vs. unconstrained RNN units.

      Specifically:

      (1) For the analyses in the initial version of the manuscript, the authors should specify how many units in each ablation category are constrained and unconstrained.

      (2) The authors should repeat Figure 6, but only for unconstrained units to test how much of the effects in the initial version of Figure 6 are driven by constrained vs. unconstrained RNN units.

      (3) The authors should repeat Figure 7, but performing ablations separately on the constrained and unconstrained units to examine how the network behaves in each case and the resulting "behavioral" effect.

    2. Reviewer #3 (Public review):

      Primary taste cortex neurons show a variety of dynamic response profiles during taste decision-making tasks, reflecting both sensory and decision variables. In the present study, Lang et al. set out to determine how neurons with distinct response profiles contribute to perceptual decisions about taste stimuli.

      The methods, with reference to the behavioral task and electrophysiological recordings/data analysis, are straightforward, solid, and appropriate. The computational model is presented in a clear and conceptually intuitive manner, although the details are outside of my area of expertise.

      The experimental design features a simple 2-alternative forced-choice design that yielded clear psychometric curves across a range of stimuli. In vivo recordings were performed using Neuropixels and yielded an appropriate sample of single neuron responses. The strength of the model lies in the fact that it consists of single neurons whose response profiles mimic those recorded in vivo, and allows neuron-selective manipulation.

      By virtually lesioning specific subsets of neurons in the network, the authors demonstrate that a relatively small population of neurons with specific tuning profiles was sufficient to produce the observed neural dynamics and behavioral responses. This effect was selective as lesioning other responsive neurons did not affect overall response dynamics or performance.

      These findings provide new insight into the relation between the response profiles of single neurons in sensory cortex, their population-level activity dynamics, and the perceptual decisions they inform.

      The approach is particularly innovative as it uses computational modeling to target functionally-defined "cell types", which cannot necessarily be targeted by more conventional genetic approaches.

    3. Author response:

      Reviewer #1 (Public review):

      This manuscript provides several important findings that advance our current knowledge about the function of the gustatory cortex (GC). The authors used high-density electrophysiology to record neural activity during a sucrose/NaCl mixture discrimination task. They observed population-based activity capable of representing different mixtures in a linear fashion during the initial stimulus sampling period, as well as representing the behavioral decision (i.e., lick left or right) at a later time point. Analyzing this data at the single neuron level, they observed functional subpopulations capable of encoding the specific mixture (e.g., 45/55), tastant (e.g., sucrose), and behavioral choice (e.g., lick left). To test the functional consequences of these subpopulations, they built a recurrent neural network model in order to "silence" specific functional subpopulations of GC neurons. The virtual ablation of these functional subpopulations altered virtual behavioral performance in a manner predicted by the subpopulation's presumed contribution.

      Strengths:

      Building a recurrent neural network model of the gustatory cortex allows the impact of the temporal sequence of functionally identifiable populations of neurons to be tested in a manner not otherwise possible. Specifically, the authors' model links neural activity at the single neuron and population level with perceptual ability. The electrophysiology methods and analyses used to shape the network model are appropriate. Overall, the conclusions of the manuscript are well supported.

      Weaknesses:

      One potential concern is the apparent mismatch between the neural and behavioral data. Neural analyses indicate a clear separation of the activity associated with each mixture that is independent of the animal's ultimate choice. This would seemingly indicate that the animals are making errors despite correctly encoding the stimulus. Based solely on the neural data, one would expect the psychometric curve to be more "step-like" with a significantly steeper slope. One potential explanation for this observation is the concentration of the stimuli utilized in the mixture discrimination task. The authors utilize equivalent concentrations, rather than intensity-matched concentrations. In this case, a single stimulus can (theoretically) dominate the perception of a mixture, resulting in a biased behavioral response despite accurate concentration coding at the single neuron level. Given the difficulty of isointensity matching concentrations, this concern is not paramount. However, the apparent mismatch between the neural and behavioral data should be acknowledged/addressed in the text.

      We thank the Reviewer for the insightful comments and thoughtful suggestions. Our electrophysiological recordings show that GC dynamically encodes stimulus concentration of mixture elements, dominant perceptual quality, and decisions of directional lick. With regard to the encoding of mixtures, the clear separation of activity associated with each mixture (Figure 3) is present at a trial-averaged pseudo-population level, and average activities associated with more similar, intermediate mixtures are closer to each other in this space. In fact, at a single trial level activity evoked by similar, intermediate mixtures can be hard to separate. This increased similarity can lead to behavioral errors resulting from either incorrect encoding of the stimulus or from the inability to interpret the stimuli to guide the correct decision.

      The psychometric function, which shows that more distinct stimuli (100/0 vs 0/100) lead to fewer mistakes than more ambiguous, intermediate mixtures (55/45 vs 45/55), is consistent with the increased ambiguity of responses to intermediate mixtures and with the possibility that, compared to pure stimuli, intermediate mixtures lead to more trials in which the binary choice component of neural activity is inverted, resulting in more directional errors.

      The Reviewer is correct that there could be a slight mismatch in the perceived intensity of the mixture components. This mismatch could be the reason for the slight asymmetry in our psychometric function (Figure 1B). However, it is not uncommon for mice in these 2AC tasks to also have a motor laterality bias in their responses that manifests itself for the more ambiguous stimuli. We chose not to model this bias given its subtlety and its unknown origin. Rather, we chose to model an ideal scenario in which stimuli have matched intensity and no motor bias exists. In the revised version we will discuss this issue.

      Reviewer #2 (Public review):

      Lang et al. investigate the contribution of individual neuronal encoding of specific task features to population dynamics and behavior. Using a taste-based decision-making behavioral task with electrophysiology from the mouse gustatory cortex and computational modeling, the authors reveal that neurons encoding sensory, perceptual, and decision-related information with linear and categorical patterns are essential for driving neural population dynamics and behavioral performance. Their findings suggest that individual linear and categorical coding units have a significant role in cortical dynamics and perceptual decision-making behavior.

      Overall, the experimental and analytical work is of very high quality, and the findings are of great interest to the taste coding field, as well as to the broader systems neuroscience field.

      I have a couple of suggestions to further enhance the authors' important conclusions:

      My main comment is the distinction between constrained and unconstrained units. The authors train a small percentage of units to match the real neural data (constrained units), and then find some unconstrained units that are similar to the real neural data and some that are not. As far as I could tell, the relative fraction of constrained and unconstrained units in the trained RNN is not reported; I assume the constrained ones are a much smaller population, but this is unclear. The selection of different groups of neurons for the RNN ablation experiments appears to be based on their response profiles only. Therefore, if I understood correctly, both constrained and unconstrained units are ablated together for a given response category (e.g., linear or step-perception). It would be useful, therefore, to separately compare the effects of constrained vs. unconstrained RNN units.

      We thank the Reviewer for the constructive feedback and are pleased that the work is considered of broad interest. The Reviewer is correct that ablations were carried out with respect to response categories only and included both constrained and unconstrained units.

      The ratio of total units to constrained units is fixed at 5.88, thus constrained units are ~17% of the network and unconstrained units are ~83%. This value is specified in the Methods (RNN: Components and dynamics), but we will report it in the Results of the revised manuscript as well for clarity.

      Specifically:

      (1) For the analyses in the initial version of the manuscript, the authors should specify how many units in each ablation category are constrained and unconstrained.

      In the revised manuscript, we will specify the fractions of constrained and unconstrained units within each response category. For convenience, they are reported here: Linear = 194 constrained and 691 unconstrained units; Step-perception = 147 constrained and 840 unconstrained units; Step-choice = 129 constrained and 814 unconstrained units; Other = 353 constrained and 1739 unconstrained units.

      (2) The authors should repeat Figure 6, but only for unconstrained units to test how much of the effects in the initial version of Figure 6 are driven by constrained vs. unconstrained RNN units.

      In the revised version we will add a Supplemental Figure in which the contribution of constrained vs unconstrained units is addressed.

      (3) The authors should repeat Figure 7, but performing ablations separately on the constrained and unconstrained units to examine how the network behaves in each case and the resulting "behavioral" effect.

      The revised version will include a Supplemental Figure with these simulations.
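
      The unit counts given in the response to point (1) can be turned into per-category fractions and compared with the stated ratio of 5.88. The following is a minimal sketch, assuming the four categories are mutually exclusive and that the counts are pooled over the networks analysed; the variable and function names are illustrative and do not come from the manuscript.

```python
# Unit counts per response category, copied from the authors' response.
# Each value is (constrained, unconstrained).
counts = {
    "Linear": (194, 691),
    "Step-perception": (147, 840),
    "Step-choice": (129, 814),
    "Other": (353, 1739),
}


def constrained_fraction(constrained, unconstrained):
    """Fraction of units in a group that are constrained."""
    return constrained / (constrained + unconstrained)


# Per-category fraction of constrained units.
fractions = {name: constrained_fraction(c, u) for name, (c, u) in counts.items()}

# Pooled fraction across all categories (assumes no unit is counted twice).
total_constrained = sum(c for c, _ in counts.values())
total_unconstrained = sum(u for _, u in counts.values())
overall = constrained_fraction(total_constrained, total_unconstrained)

# Fraction implied by the stated total-to-constrained ratio of 5.88.
implied_by_ratio = 1 / 5.88

for name, frac in fractions.items():
    print(f"{name}: {frac:.1%} constrained")
print(f"Pooled: {overall:.1%} constrained; implied by ratio: {implied_by_ratio:.1%}")
```

      Under these assumptions the pooled fraction is close to the ~17% quoted above, while the per-category fractions differ from one another, which is the information the Reviewer requested.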

      Reviewer #3 (Public review):

      Primary taste cortex neurons show a variety of dynamic response profiles during taste decision-making tasks, reflecting both sensory and decision variables. In the present study, Lang et al. set out to determine how neurons with distinct response profiles contribute to perceptual decisions about taste stimuli.

      The methods, with reference to the behavioral task and electrophysiological recordings/data analysis, are straightforward, solid, and appropriate. The computational model is presented in a clear and conceptually intuitive manner, although the details are outside of my area of expertise.

      The experimental design features a simple 2-alternative forced-choice design that yielded clear psychometric curves across a range of stimuli. In vivo recordings were performed using Neuropixels and yielded an appropriate sample of single neuron responses. The strength of the model lies in the fact that it consists of single neurons whose response profiles mimic those recorded in vivo, and allows neuron-selective manipulation. By virtually lesioning specific subsets of neurons in the network, the authors demonstrate that a relatively small population of neurons with specific tuning profiles was sufficient to produce the observed neural dynamics and behavioral responses. This effect was selective as lesioning other responsive neurons did not affect overall response dynamics or performance. These findings provide new insight into the relation between the response profiles of single neurons in sensory cortex, their population-level activity dynamics, and the perceptual decisions they inform.

      The approach is particularly innovative as it uses computational modeling to target functionally-defined "cell types", which cannot necessarily be targeted by more conventional genetic approaches.

      We thank the Reviewer for the positive assessment of our study.

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Green et al. attempt to use large-scale protein structure analysis to find signals of selection and clustering related to antibiotic resistance. This was applied to the whole proteome of Mycobacterium tuberculosis, with a specific focus on the smaller set of known antibiotic-resistance-related proteins.

      Strengths:

      The use of geospatial analysis to detect signals of selection and clustering on the structural level is really intriguing. This could have a wider use beyond the AMR-focussed work here and could be applied to a more general evolutionary analysis context. Much of the strength of this work lies in breaking ground into this structural evolution space, something rarely seen in such pathogen data. Additional further research can be done to build on this foundation, and the work presented here will be important for the field.

      The size of the dataset and use of protein structure prediction via AlphaFold, giving such a consistent signal within the dataset, is also of great interest and shows the power of these approaches to allow us to integrate protein structure more confidently into evolution and selection analyses.

      Weaknesses:

      There are several issues with the evolutionary analysis and assumptions made in the paper, which perhaps overstate the findings, or require refining to take into account other factors that may be at play.

      (1) The focus on antimicrobial resistance (AMR) throughout the paper confines the findings to that lens. This results in a few different weaknesses:

      (a) While the large size of the analysis is highlighted in the abstract and elsewhere, in reality, only a few proteins are studied in depth. These are proteins already associated with AMR by many other studies, somewhat retreading old ground and reducing the novelty.

      (b) Beyond the AMR-associated proteins, the proteome work is of great interest, but only casually interrogated and only in the context of AMR. There appears to be an assumption that all signals of positive selection detected are related to AMR, whereas something like cas10 is part of the CRISPR machinery, a set of proteins often under positive selection, and thus unlikely to be AMR-related.

      (2) The strength of the signal from the structural information and the novelty of the structural incorporation into prediction are perhaps overstated.

      (a) A drop of 13% in F1 for a gain of 2% in PPV is quite the trade-off. This does not indicate as strong or as usable a predictor as the abstract claims. While the approach is novel and this is a good finding for a first attempt at such complex analysis, it is perhaps not as significant as the authors claim.

      (b) In relation to this, there is a lack of situating these findings within the wider research landscape. For instance, the use of structure for predicting resistance has been done, for example, in PncA (https://academic.oup.com/jacamr/article/6/2/dlae037/7630603, https://www.sciencedirect.com/science/article/pii/S1476927125003664, https://www.nature.com/articles/s41598-020-58635-x) and in RpoB (https://www.nature.com/articles/s41598-020-74648-y). These, and other such works, should be acknowledged as the novelty of this work is perhaps not as stark as the authors present it to be.

      (3) The authors postulate that neutral AA substitutions would be randomly distributed in the protein structure and thus use random mutations as a negative control to simulate this neutral evolution. However, I am unsure if this is a true negative control for neutral evolution. The vast majority of residues would be under purifying selection, not neutral selection, especially in core proteins like rpoB and gyrA. Therefore, most of these residues would never be mutated in a real-world dataset. Therefore, you are not testing positive selection against neutral selection; you are testing positive against purifying, which will have a much stronger signal. This is likely to, in turn, overestimate the signal of positive selection. This would be better accounted for using a model of neutral evolution, although this is complex and perhaps outside the scope. Still, it needs to be made clear that these negative controls are not representative of neutral evolution.

      (4) In a similar vein, the use of 15 Å as a cut-off for stating co-localisation feels quite arbitrary. The average radius of a globular protein is about 20 Å, so this could be quite a large patch of a protein. I think it may be good to situate the cut-off for a 'single location' within a size estimator of the entire protein, as 15 Å could be a neighbourhood in a large protein, but be the whole protein for smaller ones.
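
      One way to make the point in (4) concrete is to express the fixed 15 Å cut-off relative to a size estimate of each protein. The sketch below uses the radius of gyration of a coordinate set as that estimate; this choice, the function names, and the two toy coordinate sets are assumptions for illustration only and are not taken from the manuscript.

```python
import math


def radius_of_gyration(coords):
    """Root-mean-square distance of points (x, y, z) from their centroid."""
    n = len(coords)
    cx = sum(x for x, _, _ in coords) / n
    cy = sum(y for _, y, _ in coords) / n
    cz = sum(z for _, _, z in coords) / n
    mean_sq = sum(
        (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 for x, y, z in coords
    ) / n
    return math.sqrt(mean_sq)


def relative_cutoff(cutoff_angstrom, coords):
    """Cut-off expressed as a multiple of the structure's radius of gyration."""
    return cutoff_angstrom / radius_of_gyration(coords)


# Toy coordinates in angstroms (hypothetical, not real structures).
small_protein = [(-5.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
large_protein = [(-30.0, 0.0, 0.0), (30.0, 0.0, 0.0)]

print(relative_cutoff(15.0, small_protein))  # cut-off exceeds the structure's size
print(relative_cutoff(15.0, large_protein))  # cut-off covers only a neighbourhood
```

      In a real analysis the coordinates would be, for example, the Cα positions of the predicted structure, and a relative value well above 1 would flag proteins for which a 15 Å sphere spans most of the fold.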

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript by Xie and colleagues presents an intriguing behavioral finding for the field of perceptual learning (PL): combining the reactivation-based training paradigm with anodal tDCS induces complete generalization of the learning effect. Notably, this generalization is achieved without compromising the magnitude of learning effects and with an 80% reduction in total training time. The experimental design is well-structured, and the observed complete generalization is robustly replicated across two stimulus dimensions (orientation and motion direction).

      However, while the empirical results are methodologically valid and scientifically surprising, the theoretical framework proposed to explain them appears underdeveloped and, in some cases, difficult to reconcile with the existing literature. Several arguments are insufficiently justified. In addition, the introduction of a non-standard metric (NGI: normalized learning gain index) raises concerns about the interpretability and comparability with existing PL literature.

      Strengths:

      (1) Rigorous experimental design

      In this study, Xie and colleagues employed a 2×2 factorial design (Training paradigm: Reactivation vs. Full-Practice × tDCS protocols: Anodal vs. Sham), which allowed clear dissociation of the main and interaction effects.

      (2) High statistical credibility

      Sample sizes were predetermined using G*Power, non-significant effects were evaluated using the Bayes factor, and the core behavioral findings were replicated in a second stimulus dimension. These strengthen the credibility of the findings.

      (3) Strong translational potential

      The observed complete generalization could have useful implications for sensory rehabilitation. The large reduction (80%) in total training time is particularly compelling.

      Weaknesses:

      (1) NGI (Normalized learning gain index) is a non-standard behavioral metric and may distort interpretability.

      NGI ((pre - post) / ((pre + post) / 2)) is rarely used in PL studies to measure learning effects. Almost all PL studies rely on raw thresholds and percent improvements ((pre - post) / pre), making it difficult to contextualize the current NGI-based results within the broader field. The current manuscript provides no justification for adopting NGI.

      A more critical issue is the NGI's nonlinearity: by normalizing to the mean of pre- and post-test thresholds, it disproportionately inflates learning effects for participants with lower post-test thresholds. Notably, the "complete generalization" claims are illustrated mainly with NGI plots. Although the authors also analyze thresholds directly and the results also support the core claim, the interpretation in the text relies heavily on NGI.

      The authors may consider rerunning key analyses using the standard percent improvement metric. If retaining NGI, the authors should provide explicit justification for why NGI is superior to standard measures.
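
      The nonlinearity described in Weakness (1) can be illustrated numerically. The sketch below assumes the two definitions as written in this review, NGI = (pre - post) / ((pre + post) / 2) and percent improvement = (pre - post) / pre; the threshold values are arbitrary placeholders, not data from the study.

```python
def percent_improvement(pre, post):
    """Standard percent improvement: (pre - post) / pre."""
    return (pre - post) / pre


def ngi(pre, post):
    """Normalized learning gain index: (pre - post) / mean(pre, post)."""
    return (pre - post) / ((pre + post) / 2)


def ngi_from_percent_improvement(pi):
    """Algebraic link between the two metrics: NGI = 2 * PI / (2 - PI)."""
    return 2 * pi / (2 - pi)


# Placeholder thresholds (arbitrary units): same pre-test, different post-tests.
pre = 10.0
for post in (8.0, 5.0, 2.0):
    pi = percent_improvement(pre, post)
    g = ngi(pre, post)
    # The NGI / PI ratio grows as the post-test threshold falls.
    print(f"post={post}: PI={pi:.3f}, NGI={g:.3f}, NGI/PI={g / pi:.3f}")
```

      Because NGI = 2·PI / (2 - PI), the two metrics agree for small improvements but NGI grows faster as the post-test threshold decreases, which is why the same data can look more favourable under NGI.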

      (2) The proposed theoretical framework is sometimes unclear and insufficiently supported.

      The authors propose the following mechanistic chain:

      (a) reactivation-based learning depends on offline consolidation mediated by GABA (page 4 line 73);

      (b) online a-tDCS reduces GABA (page 4, line 76), thereby disrupting offline consolidation (page 11, line 225);

      (c) disrupted offline consolidation reduces perceptual overfitting (page 4, line 77; page 11, line 225), thereby enabling generalization;

      (d) under full-practice training, a-tDCS increases specificity via a different mechanism (page 11 line 235).

      While this framework is plausible in broad terms, several components are speculative at best in the absence of neurochemical or neural measurements.

      (3) Several reasoning steps require further clarification.

      (a) Mechanisms of Reactivation-based Learning.

      The manuscript focuses on the neurochemical basis of reactivation-based learning. However, reactivation-induced neurochemical changes differ across brain regions. In the motor cortex, Eisenstein et al. (2023) reported that after reactivation, increased GABA and decreased E/I ratio were associated with offline gains. In contrast, Bang et al. (2018) demonstrated that, in the visual cortex, reactivation decreased GABA and increased E/I ratio. While both studies are consistent with GABA involvement, the direction of GABA modulation differs. The authors should clarify this discrepancy.

      More importantly, Bang et al. (2018) demonstrated that reactivation-based (3 blocks) and full-practice (16 blocks) training produced similar time courses of E/I ratio changes in V1: an initial increase followed by a decrease. Given this similarity, the manuscript would benefit from a more thorough discussion of how the two paradigms diverge mechanistically. For example, behaviorally, Song et al. (2021) reported greater generalization with reactivation-based training than with full-practice training, aligning with Kondat et al. (2025). Neurally, Kondat et al. (2024) showed that reactivation-based training increased activity in higher-order brain regions (e.g., IPS), whereas full practice training reduced connectivity between temporal and parietal regions.

      (b) tDCS Mechanisms and Protocols.

      The effect of a-tDCS on GABA is not consistent across brain regions. While a-tDCS reliably reduces GABA in the motor cortex, a more closely related recent study (Abuleli et al., 2025) reports no significant modulation of GABA or Glx in V1, challenging the authors' assumption of tDCS-induced GABA reduction in the visual cortex.

      The manuscript's proposal that online a-tDCS disrupts offline consolidation is somewhat difficult to interpret conceptually. Online tDCS typically modulates processes occurring during stimulation (e.g., encoding process, attentional state), whereas consolidation occurs afterward. Thus, the claim that online tDCS protocols disrupt only offline consolidation, without considering the possibility that they first modulate the encoding process, is hard to justify. Even if tDCS has prolonged effects, the link between online stimulation and disruption of offline consolidation remains unelucidated.

      (c) Missing links between GABA modulation and perceptual overfitting.

      The proposed chain ("tDCS disrupts consolidation → reduced overfitting → improved generalization") skips a critical step: how GABA modulation translates to changes in neural representational properties (e.g., tuning width, representational overlap between trained/untrained stimuli) that define "perceptual overfitting." The PL literature has not established a link between GABA levels and these representational changes, leaving a key component of the mechanistic explanation underspecified.

      (d) Insufficient explanation of the opposite effects.

      The manuscript does not fully explain why the same a-tDCS promotes generalization in reactivation-based training but increases specificity in full-practice training. Both paradigms engage offline consolidation, and, as mentioned above, the time courses of E/I ratio changes are similar for 3-block reactivation-based and 16-block full-practice training. Thus, if offline consolidation mechanisms (and their associated E/I changes) are comparable across paradigms, it is unclear why identical a-tDCS would produce opposite outcomes in the two paradigms.

    2. Reviewer #2 (Public review):

      Xie et al. combined transcranial direct current stimulation (tDCS) and a reactivation-based training protocol to investigate the generalization of learning. Using visual perceptual learning as a model, they found that a reactivation-based training protocol, when combined with anodal tDCS over the visual cortex, can induce learning transfer to untrained visual orientations and motion directions. Interestingly, extending reactivation-based training to a full-training protocol with more training trials did not induce generalization of learning. Furthermore, even when paired with tDCS, extending the training protocol did not provide benefits for generalization of learning. This study provides interesting insights into the mechanisms of brain plasticity and how future training protocols could be designed to achieve robust and generalizable learning outcomes.

      The authors supported their arguments with a series of well-constructed experiments. The conclusions are largely supported by the data, although some clarifications about their hypotheses and control analyses could strengthen the work:

      (1) The authors hypothesize that tDCS can reduce perceptual overfitting through reduced GABA concentrations in the visual cortex, which leads to learning transfer. However, without a clear description of the role of GABA in perceptual learning and perceptual overfitting, it is difficult for the reader to understand why reduced GABA concentrations would contribute to generalization. Do the authors imply that increased GABA can lead to specificity? Are there studies that can support this argument? The authors also did not describe clearly how reactivation-based visual perceptual learning can modify GABA levels in the visual cortex differently (compared to full-practice) during training and during the offline consolidation phase. In order for the reader to better understand their hypotheses and the motivation of the current study, it is beneficial for the authors to provide a concise but clearer description of the roles of GABA in perceptual learning with a focus on the roles of GABA in generalization and during off-line consolidation for different types of training protocols (see for instance Bang et al., 2018; Frangou et al., 2019; Frank et al., 2022; Jia et al., 2024; Shibata et al., 2011; Tamaki et al., 2020; Yamada et al., 2024).

      (2) Based on the results, an alternative explanation is that the amount of transfer to the untrained visual feature might be related to the amount of learning for the trained visual feature, which might be different depending on the training protocol and brain stimulation combination. Is it beneficial to compare the amount of learning gains across different training and stimulation protocols to rule out this possibility? Would more learning gains for the trained visual feature predict less transfer for the untrained visual feature? Are there correlations between learning gains and learning transfer?

      (3) The authors argued that a reactivation-based training protocol, rather than the amount of training, was critical for the generalization of learning. The control experiment in the study showed that full-practice training combined with tDCS did not lead to transfer, as in reactivation-based training. However, in order to rule out the confounding effects from the amount of training, it is crucial to examine whether a training protocol in which a similar number of trials as in the reactivation-based training but not separated across training sessions would lead to similar generalization of learning.

    3. Reviewer #3 (Public review):

      Summary:

      This research focuses on a long-lasting and interesting phenomenon in human plasticity. When humans learn basic perceptual skills such as judging the orientation of a simple line, the learned abilities are often limited to the trained condition but not generalizable to untrained conditions. The authors hypothesized that this learning specificity was related to GABA, an inhibitory neurotransmitter in the brain. Using a novel training method that combines reactivation and a brain stimulation method (tDCS) that hypothetically inactivates GABA, the authors hypothesized that learned visual perceptual skills would show greater transfer.

      Strengths:

      The authors conducted a series of well-conceived behavioral studies to demonstrate the effectiveness of their proposed method in enabling learning transfer in two different visual tasks, and carefully conducted comparison studies to address other possible explanations. The sample size was adequate to convey convincing results, and the analyses were thorough.

      Weaknesses:

      While the authors built their training paradigm on (1) the hypothetical role GABA plays in inhibiting learning transfer, and (2) the hypothetical impact tDCS may have on GABA, there was no direct evidence supporting these hypotheses in the current study.

      Further, learning specificity takes many forms, from features to locations to tasks; the scope of the transfer observed with the proposed method is not yet clear.

    1. Reviewer #2 (Public review):

      The authors describe the first deep neurological characterization of WAC mutation in two vertebrate species (zebrafish and mouse). They examine these at various levels, guided by the work in humans that has associated a heterozygous WAC mutation with DeSantos Shinawi Syndrome (DESSH). Therefore, they investigate the animals for a variety of phenotypes, following a template for what is seen when characterizing a new mouse/fish model of a developmental disability gene. Investigations include analysis of skull and jaw for abnormalities (both species), MRI of brain structure (in mice), electrophysiology (mice), assessment of signaling pathways (by Western blot, in mice), cell counts (both, more in mice), transcriptomics (mice), and behavior (both).

      Generally, this describes an important first characterization of the consequences of the mutation. Most of the studies appear well-conducted and reasonably powered, thus solid or convincing. However, there are a few places where the data presentation could be improved for clarity, and a few concerns about some choices in analytical approach for a couple of the experiments, where improved statistical approaches could improve their sensitivity and/or better rule out false positives, and thus the support of some of these claims is currently incomplete. There is also some lack of clarity about the rationale for some decisions regarding the fish genetics. Nonetheless, this is an important and useful first characterization of many phenotypes of these lines. Such experiments form a baseline for future mechanistic studies in the same lines and a platform to test approaches to reverse phenotypes.

      Individual claims and their strength & weaknesses:

      (1) The authors developed mouse and zebrafish models of WAC deletion

      They used the existing KOMP floxed WAC line to generate a null allele. For the mouse, there is a Western showing that it is indeed null for the protein. The fish data is less robustly validated: they don't confirm that the allele is null at the protein or RNA level, and fish have two paralogs (waca and wacb), of which this paper characterizes only one. So this evidence is less clear. The evaluated mice are heterozygous (Het), similar to patients, while the fish appear to be evaluated as homozygous mutants.

      (2) The authors show that both species show altered craniofacial features

      These data appear well powered, and the findings are robust.

      (3) Each model altered GABAergic neurons

      In mice, the authors stained with PV antibodies and saw a decrease in cells positive for this staining. A second marker, Lhx6, does not show a difference, suggesting this might be a change in PV expression rather than cell number. They could maybe look into the literature to see if this loss of just the protein also occurs in other models. Overall, the sample size here is a bit smaller than other parts of the paper (n=3), and the methods on the cell counts were less clear, so it is not as clear that this finding is as robust. The authors counted several other broad classes of cells, and those appear normal. Interestingly, there might also be some TBR1 mislocalization in layer 6 that might be significant with added power.

      The fish data is based on an in situ hybridization for GAD. The measure shown is the width of the positive area in the forebrain. This measure is not one I have seen much before, and has potential to be driven by something unrelated to GABA (e.g., if the whole forebrain were simply a bit smaller). So this analysis could use a couple of other approaches (density of signal?) and/or a control probe for some other brain gene showing the measure is normal, and thus it is not just a size issue.

      (4) Mice were more susceptible to the seizure-inducing agent PTZ

      These data appear well powered, and the findings are robust. The authors also did a fair amount of useful electrophysiology that was all normal, but appeared to be well executed.

      (5) Mice had changes in brain volume that interact with sex

      The authors conducted an MRI on a good number of mice and reported a slight increase in global volume just in males. Sample size is fair, but the statistical approach here may be better if it puts males and females in the same model (to boost power and explicitly test for sex by genotype interaction that they report), and there is some chance that the brain region level differences that they report could include some false positives. They tested many regions, and it is not clear whether or not they corrected for the number of tests. Often, an FDR correction would be used in such imaging studies. It may be that only the most robust regional findings will survive those corrections. It is interesting data either way, but the analysis could be improved.
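
      For readers unfamiliar with the FDR correction mentioned in (5), the following is a minimal sketch of the Benjamini-Hochberg step-up procedure, one common way to control the false discovery rate across many regional tests. It is offered as an illustration of the reviewer's suggestion, not as the analysis the authors ran; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking which tests are rejected at FDR alpha."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank

    # Reject every hypothesis whose rank is at or below that cut.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for five brain regions (placeholders, not study data).
example_p = [0.001, 0.008, 0.039, 0.041, 0.6]
uncorrected = [p < 0.05 for p in example_p]
corrected = benjamini_hochberg(example_p, alpha=0.05)
print(sum(uncorrected), "regions pass uncorrected;", sum(corrected), "survive FDR")
```

      In this toy example, four regions fall below p < 0.05 without correction but only two survive the procedure, which illustrates why only the most robust regional findings may remain after correction.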

      (6) Several behaviors are altered in the mice as well

      These studies were fairly well-powered (n=15,16), and they found several positive and negative results, including alterations in memory and sociability in both species. There is a minor statistical flaw in the three-chamber analysis (they don't actually compare the Hets directly to the wildtypes in their statistical testing, a common mistake in neuroscience that should be addressed). But the data look like they will probably still be significant when correctly analyzed. In the supplement, the authors could do a bit more with the data they have to look at hyperactivity (i.e., show total motion in open field, not just time in center vs. periphery), and adding sex to their model might improve sensitivity for genotype effects.

      (7) Some biochemical signaling pathways are altered in the brain

      These are n=4 immunoblots, and show altered phospho ERK, but no changes in other signaling events predicted from prior WAC literature like H2B ubiquitination. They appear well done, and the authors share the full blots in the supplement.

      (8) WAC deletion also alters gene expression in the brain

      These studies were well-powered for RNAseq, with 10 and 14 samples, using neonates (P2), just the forebrain. The sequencing quality metrics all looked good, and the approach to analysis was okay. It would be stronger to again include sex in the model, rather than separate by sex. There were some typos in this part of the paper that made part of the conclusions unclear, but the RNAseq nicely confirmed the mutation of the mice, and discovered many differentially expressed genes, consistent with the role of this gene as a regulator of transcription. The presentation could be expanded to make more use of the data. Overall, though, this is a useful first characterization of the transcriptome in the line.

    1. Reviewer #2 (Public review):

      Summary:

      Previous studies by some of the authors of the current manuscript showed that healthy human newborns memorize recently learned nonsense words. They exposed neonates to a familiarization period (several minutes) during which multiple repetitions of a bisyllabic word, uttered by the same speaker, were presented. They then exposed neonates to an "interference period" in which newborns listened to music or to the same speaker uttering a different pseudoword. Finally, neonates were exposed to a test period in which they heard the familiarized word again. Interestingly, when the interference was music, the recognition of the word remained. Word recognition was measured using the NIRS technique, which estimates regional brain oxygenation at the scalp level. Specifically, the brain response to the word in the test was reduced, unveiling a familiarity effect, while an increase in regional brain oxygenation corresponds to the detection of a "new word" due to a novelty effect. In previous studies, music does not erase the memory traces for a word (familiarity effect), while a different word uttered by the same speaker does.

      The current study aims at exploring whether and how word memory is interfered with by other speech properties, specifically the changes in the speaker, while young children can distinguish speakers by processing the speech. The authors' main hypothesis anticipates that new speaker recognition would produce less interference in the familiarized word because somehow neonates "separate" the processing of both words (familiarized uttered by one speaker, and interfering word, uttered by a different speaker), memorizing both words as different auditory events.

      From my point of view, this hypothesis is interesting, since the results would contribute to estimating the role of the speaker in word learning and speech processing early in life.

      Strengths:

      (1) New data from neonates. Exploring neonates' cognitive abilities is a big challenge, and we need more data to enrich the knowledge of the early steps of language acquisition.

      (2) The study contributes new data showing the role of speaker (recognition) on word learning (word memory), a quite unexplored factor. The idea that neonates include speakers in speech processing is not new, but its role in word memory has not been evaluated before. The possible interpretation is that neonates integrate the process of the linguistic and communicative aspects of speech at this early age.

      (3) The study proposes a quite novel analytic approach. The new mixed models allow exploring the brain response considering an unbalanced design. More than the loss of data, which is frequent in infants' studies, the familiarization, interference and learning processes may take place at different moments of the experiment (e.g. related to changes in behavioural states along the experiment) or expressed in different regions (e.g. related to individual variations in optodes' locations and brain anatomy).

      Weaknesses:

      I did not find major weaknesses. However, I would like to have more discussion or explanation on the following points.

      (1) It would be fine to report the contribution of each infant to the analysis, i.e. how many good blocks, 1 to 5 in sequence 1 and 2, were provided by each infant.

      (2) Why did the factor "blocknumber" range from 0 to 4? The authors should explain what block zero means and why not 1 to 5.

      (3) I would suggest attempting to integrate the changes in brain activity across the 3 phases, that is, to examine whether changes in familiarization relate to changes in the test and interference phases. For instance, in Figure 2, the brain response distinguishes between same and novel words over IFG and STG in both hemispheres. However, in the right STG there was no initial increase in the brain response, and the response for the same word was higher than that for novel words in the 5th block.

      (4) Similarly, it is quite amazing that the brain did not increase the activity with respect to the familiarization during the interference phase, mainly over the left hemisphere, even if both the word and speaker changed. Although the discussion considers these findings, an integrated discussion of the detection of novel words and the detection of a novel speaker over time may benefit from a greater integration of the results.

      Appraisal:

      The authors achieved their aims because the design and analytic approaches showed significant differences. The conclusions are based on these results. Specifically, the hypothesis that neonates would memorize words after interference, when interfered speech is pronounced by a different speaker, was supported by the data in blocks 2 and 5, and the potential mechanisms underlying these findings were discussed, such as separate processing for different speakers, likely related to the recognition of speaker identity.

      I think the discussion is well-structured, although I would suggest integrating the changes across the three phases of the study, perhaps by comparing with other regions not related to speech processing.

      Evaluating neonates is a challenge because their physiology is constantly changing. For instance, in 9 minutes, newborns may transition between different behavioral states and experience different physiological needs.

      This study offers the opportunity to inspire looking for commonalities and individual differences when investigating early memory capacities of newborns.

    1. After defeating the Xiongnu in battle in 119 BCE, Han Wu sent Zhang back to the West in 115 with a caravan of over three hundred men carrying silk textiles, gold, and lacquerware as gifts for Wusun chiefs.

      After winning against the Xiongnu, Emperor Wu sent Zhang Qian on another mission with 300 people. They brought valuable gifts like silk, gold, and lacquerware to give to the leaders of the Wusun, another nomadic group.

    1. These observations indicate that actin-dense patches are formed as a response to external stimuli (i.e. food, neighboring syncytia) which then generate new branchlets for the network to respond dynamically to its environment.

      Do you see evidence of these actin-dense patches staying at the base of newly formed branchlets? I was curious whether you have observations that help distinguish patches functioning as nucleation sites for new protrusions versus representing increased endocytic activity in the nutrient-enriched conditions. Also, do you think these puncta might correspond to Arp2/3-mediated branched actin networks, or is that still unclear?

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      This is a good manuscript, well performed and well presented. I have several suggestions/questions to enhance the clarity of the concept, as technically the work is rather well performed.

      1. I suggest that the authors better explain the mesenchymal-to-epithelial transition (MET) in reprogramming, perhaps by explaining that epithelial gene acquisition (e.g., CDH1) and epidermal cell fate are not exactly the same. This approach could also be used to further divide the genes they study in their analyses.
      2. KLF4 is both a repressor and an activator in different cell contexts, including reprogramming. Does HIC2 act only as a repressor? Is it possible that HIC2 is repressing KLF4-activated genes that are bad for reprogramming (including epidermal genes) and activating KLF4-suppressed genes necessary for reprogramming? This should not be too difficult to explore with their current dataset, and they could also look at available datasets for histone modifications in reprogramming.
      3. Does HIC2 bind to genes related to somatic cell identity that need to be suppressed in reprogramming before the MET phase takes place?
      4. Does HIC2 influence proliferation during reprogramming?

      Referee cross-commenting

      Comments by the other reviewers are sound and will help improve the manuscript.

      Significance

      In this manuscript, Kaji and colleagues perform a CRISPR/Cas9 screen to identify genes involved in mouse somatic cell reprogramming, identifying HIC2 as a target that they further validate. They conclude that HIC2 acts by repressing the epidermal/epithelial program induced by KLF4 during reprogramming. Studying the complex role of transcription factor interactions in the context of cell fate conversions (of any kind and not just somatic cell reprogramming) is highly relevant. This work helps clarify such complexity in a specific context but the work has wider conceptual implications.

    1. Reviewer #1 (Public review):

      Summary:

      In this paper, the authors develop a biologically plausible recurrent neural network model to explain how the hippocampus generates and uses barcode-like activity to support episodic memory. They address key questions raised by recent experimental findings: how barcodes are generated, how they interact with memory content (such as place and seed-related activity), and how the hippocampus balances memory specificity with flexible recall. The authors demonstrate that chaotic dynamics in a recurrent neural network can produce barcodes that reduce memory interference, complement place tuning, and enable context-dependent memory retrieval, while aligning their model with observed hippocampal activity during caching and retrieval in chickadees.

      Strengths:

      (1) The manuscript is well-written and structured.

      (2) The paper provides a detailed and biologically plausible mechanism for generating and utilizing barcode activity through chaotic dynamics in a recurrent neural network. This mechanism effectively explains how barcodes reduce memory interference, complement place tuning, and enable flexible, context-dependent recall.

      (3) The authors successfully reproduce key experimental findings on hippocampal barcode activity from chickadee studies, including the distinct correlations observed during caching, retrieval, and visits.

      (4) Overall, the study addresses a somewhat puzzling question about how memory indices and content signals coexist and interact in the same hippocampal population. By proposing a unified model, it provides significant conceptual clarity.

      Weaknesses:

      The recurrent neural network model incorporates assumptions and mechanisms, such as the modulation of recurrent input strength, whose biological underpinnings remain unclear. The authors acknowledge some of these limitations thoughtfully, offering plausible mechanisms and discussing their implications in depth. It may be worth exploring the robustness of the results to certain modeling assumptions. For instance, the choice to run the network for a fixed amount of time and then use the activity at the end for plasticity could be relaxed.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      In this paper, the authors develop a biologically plausible recurrent neural network model to explain how the hippocampus generates and uses barcode-like activity to support episodic memory. They address key questions raised by recent experimental findings: how barcodes are generated, how they interact with memory content (such as place and seed-related activity), and how the hippocampus balances memory specificity with flexible recall. The authors demonstrate that chaotic dynamics in a recurrent neural network can produce barcodes that reduce memory interference, complement place tuning, and enable context-dependent memory retrieval, while aligning their model with observed hippocampal activity during caching and retrieval in chickadees.

      Strengths:

      (1) The manuscript is well-written and structured.

      (2) The paper provides a detailed and biologically plausible mechanism for generating and utilizing barcode activity through chaotic dynamics in a recurrent neural network. This mechanism effectively explains how barcodes reduce memory interference, complement place tuning, and enable flexible, context-dependent recall.

      (3) The authors successfully reproduce key experimental findings on hippocampal barcode activity from chickadee studies, including the distinct correlations observed during caching, retrieval, and visits.

      (4) Overall, the study addresses a somewhat puzzling question about how memory indices and content signals coexist and interact in the same hippocampal population. By proposing a unified model, it provides significant conceptual clarity.

      Weaknesses:

      The recurrent neural network model incorporates assumptions and mechanisms, such as the modulation of recurrent input strength, whose biological underpinnings remain unclear. The authors acknowledge some of these limitations thoughtfully, offering plausible mechanisms and discussing their implications in depth.

      One thread of questions that authors may want to further explore is related to the chaotic nature of activity that generates barcodes when recurrence is strong. Chaos inherently implies sensitivity to initial conditions and noise, which raises questions about its reliability as a mechanism for producing robust and repeatable barcode signals. How sensitive are the results to noise in both the dynamics and the input signals? Does this sensitivity affect the stability of the generated barcodes and place fields, potentially disrupting their functional roles? Moreover, does the implemented plasticity mitigate some of this chaos, or might it amplify it under certain conditions? Clarifying these aspects could strengthen the argument for the robustness of the proposed mechanism.

      In our model, chaos is used to produce a random barcode when forming memories, but memory retrieval depends on attractor dynamics. Specifically, the plasticity update at the end of the cache creates an attractor state, and then afterwards for successful memory retrieval the network activity must settle into this attractor rather than remaining chaotic. This attractor state is a conjunction of memory content (place and seed activity) and memory index (barcode activity). Thus a barcode is ‘reactivated’ when network dynamics during retrieval settle into this cache attractor, or in other words chaotic dynamics do not need to generate the same barcode twice.

      The reviewer raises an important point, which is how sensitivity to initial conditions and noise would affect the reliability of our proposed mechanism. The key question here is how noise will affect the network’s dynamics during retrieval. Would adding noise to the dynamics make memory retrieval more difficult? We thank the reviewer for suggesting we investigate this further, and below describe our experiments and changes to the manuscript to better address this topic.

      We first experimented with adding independent Gaussian-distributed noise into each unit, drawn independently at each timestep. We analyzed recall accuracy using the same task and methods as Fig. 4F while varying the magnitude of noise. Memory recall was quite robust to this form of noise, even as the magnitude of noise approached half of the signal amplitude. This first experiment added noise into the temporal dynamics of the network. We subsequently examined adding static noise into the network inputs, which can also be thought of as introducing noise into initial conditions. Specifically, we added independent Gaussian-distributed noise into each unit, with the random value held constant for the extent of temporal dynamics. This perturbation decreased the likelihood of memory recall in a graded manner with noise magnitude, without dramatically changing the spatial profile. Examination of dynamics on individual trials revealed that the network failed to converge onto a cache attractor on some random fraction of trials, with other trials appearing nearly identical to noiseless results. We now include these results in the text and as a new supplementary figure, Figure S4A,B.
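      The distinction between the two noise types described above can be sketched as follows. This is a minimal illustration and not the authors' code: the leaky integrator `run`, the sizes, and the noise scale `sigma` are all hypothetical stand-ins chosen only to show how per-timestep noise differs from noise held constant over a trial.

```python
import numpy as np

rng = np.random.default_rng(0)
n_steps, n_units, sigma = 100, 50, 0.5  # hypothetical sizes and noise scale

# Dynamic noise: a fresh Gaussian sample for every unit at every timestep.
dynamic_noise = rng.normal(0.0, sigma, size=(n_steps, n_units))

# Static noise: one Gaussian sample per unit, held constant for the whole trial.
static_vector = rng.normal(0.0, sigma, size=n_units)
static_noise = np.tile(static_vector, (n_steps, 1))


def run(inputs, noise, leak=0.1):
    """Leaky integration of noisy input (a toy stand-in for the RNN dynamics)."""
    x = np.zeros(inputs.shape[0])
    for t in range(noise.shape[0]):
        x = (1 - leak) * x + leak * (inputs + noise[t])
    return x


inputs = rng.random(n_units)
x_dynamic = run(inputs, dynamic_noise)
x_static = run(inputs, static_noise)
```

      In this toy integrator, static noise acts like a persistent bias on the input (the final state approaches `inputs + static_vector`), whereas dynamic noise is partly averaged out over time. Whether the same intuition carries over to the full model is a question for the results reported in Figure S4.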

      To clarify the network dynamics and the purpose of chaos in our model, we make the following modifications in text:

      Section 2.3, paragraph 2 (starting at “To store memories…”):

      “…place inputs arrive into the RNN, recurrent dynamics generate an essentially random barcode, seed inputs are activated, and then Hebbian learning binds a particular pattern of barcode activity to place- and seed-related activity.”

      Section 2.3, paragraph 3 (starting at “Memory recall in our network…”): As an example, consider a scenario in which an animal has already formed a memory at some location $l$, resulting in the storage of an attractor $\vec{a}$ into the RNN. The attractor $\vec{a}$ can be thought of as a linear combination of place input-driven activity $p(l)$, seed input-driven activity $s$, and a recurrent-driven barcode component $b$. Later, the animal returns to the same location and attempts recall (i.e. sets $r = 1$, Figure 3B). Place inputs for location $l$ drive RNN activity towards $p(l)$, which is partially correlated with attractor $\vec{a}$, and the recurrent dynamics cause network activity to converge onto attractor $\vec{a}$. In this way, barcode activity $b$ is reactivated, along with the place and seed components stored in the attractor state, $p(l)$ and $s$. The seed input can also affect recall, as discussed in the following section.
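      The recall scenario in this paragraph can be illustrated with a deliberately simplified sketch. It is not the model from the manuscript: it assumes an equal-weight sum for the attractor, random Gaussian components, a single stored pattern, and one linear recall step with a Hebbian outer-product weight matrix. All names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000  # hypothetical number of units

p = rng.normal(size=n)  # place input-driven component p(l)
s = rng.normal(size=n)  # seed input-driven component s
b = rng.normal(size=n)  # recurrent-driven barcode component b

# Attractor as an equal-weight linear combination (an assumption of this sketch).
a = p + s + b

# Hebbian outer-product storage of the single pattern, normalized by its energy.
W = np.outer(a, a) / (a @ a)

# Recall: cue with the place component alone; one linear step projects onto a.
recalled = W @ p


def corr(x, y):
    """Pearson correlation across units."""
    return np.corrcoef(x, y)[0, 1]


cue_vs_barcode = corr(p, b)
recalled_vs_attractor = corr(recalled, a)
recalled_vs_barcode = corr(recalled, b)
```

      The place cue alone is essentially uncorrelated with the barcode, but because it overlaps with the stored pattern, the recalled state aligns with the full attractor and therefore carries the barcode component as well.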

      Section 2.4, final paragraph (starting “We further examined how model hyperparameters affected performance on these tasks”), added the following describing new results on adding noise: We found that adding noise to the network's temporal dynamics had little effect on memory recall performance (Figure S4A). However, large static noise vectors added to the network's input and initial state decreased the overall probability of memory recall, but not its spatial profile (Figure S4B).

      It may also be worth exploring the robustness of the results to certain modeling assumptions. For instance, the choice to run the network for a fixed amount of time and then use the activity at the end for plasticity could be relaxed.

      As described above, chaotic dynamics are necessary to generate a barcode during a cache, but not to reactivate that barcode during retrieval. During a successful memory retrieval, network activity settles into an attractor state and thus does not depend on the duration of simulated dynamics. The choice of duration to run dynamics during caching is important, but only insofar as activity significantly decorrelates from the initial state. We show in Figure S1B that decorrelation saturates ~t=25, and thus any random time point t > 25 would be similarly effective. We used a fixed duration runtime for caches only to avoid introducing unnecessary complication into our model.
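      The decorrelating effect of chaotic dynamics can be sketched with a generic random network. This is an assumption-laden illustration and not the model from the manuscript: it uses a discrete-time map $x_{t+1} = \tanh(W x_t)$ with zero-mean Gaussian weights and a hypothetical gain of 3, with no place inputs, shunting inhibition, or plasticity. Two initially similar states stand in for activity driven by nearby place inputs.

```python
import numpy as np

rng = np.random.default_rng(0)
n, gain, n_steps = 1000, 3.0, 100  # hypothetical size, gain, and duration

# Random recurrent weights; a gain above 1 puts this toy network in a chaotic regime.
W = rng.normal(0.0, gain / np.sqrt(n), size=(n, n))

# Two initial states with correlation of about 0.9.
z1, z2 = rng.normal(size=n), rng.normal(size=n)
x_a = z1
x_b = 0.9 * z1 + np.sqrt(1 - 0.9**2) * z2

corr_over_time = []
for t in range(n_steps + 1):
    corr_over_time.append(np.corrcoef(x_a, x_b)[0, 1])
    x_a = np.tanh(W @ x_a)
    x_b = np.tanh(W @ x_b)
```

      Plotting `corr_over_time` shows the correlation between the two trajectories falling from its initial value towards zero and then staying there, which is the sense in which any sufficiently late timepoint is similarly effective for generating a decorrelated pattern.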

      Reviewer #2 (Public review):

      Summary:

      Striking experimental results by Chettih et al 2024 have identified high-dimensional, sparse patterns of activity in the chickadee hippocampus when birds store or retrieve food at a given site. These barcode-like patterns were interpreted as "indexes" allowing the birds to retrieve from memory the locations of stored food.

      The present manuscript proposes a recurrent network model that generates such barcode activity and uses it to form attractor-like memories that bind information about location and food. The manuscript then examines the computational role of barcode activity in the model by simulating two behavioral tasks, and by comparing the model with an alternate model in which barcode activity is ablated.

      Strengths of the study:

      Proposes a potential neural implementation for the indexing theory of episodic memory

      Provides a mechanistic model of striking experimental findings: barcode-like, sparse patterns of activity when birds store a grain at a specific location

      A particularly interesting aspect of the model is that it proposes a mechanism for binding discrete events to a continuous spatial map, and demonstrates the computational advantages of this mechanism.

      Weaknesses:

      The relation between the model and experimentally recorded activity needs some clarification

      The relation with indexing theory could be made more clear

      The importance of different modeling ingredients and dynamical mechanisms could be made more clear

      The paper would be strengthened by focusing on the most essential aspects

      Comments:

      The model distinguishes between "barcode activity" and "attractors". Which of the two corresponds to experimentally-recorded barcodes? I would presume the attractors. A potential issue is that the attractors are, as explained in the text (l.137), conjunctions of place activity, barcode activity and "seed" inputs. The fact that the seed activity is shared across attractors seems to imply that they have a non-zero correlation independent of distance. Is that the case in the model? If I understand correctly, Fig 3D shows correlations between an attractor and barcodes at different locations, but correlations between attractors at different locations are not shown. Fig 1 F instead shows that correlations between recorded retrieval activities decay to zero with distance.

      More generally, the fact that the expression "barcode" is apparently used with different meanings in the model and in the experiments is potentially confusing (in the model they correspond to activity generated during caching, and this activity is distinct from the memories; my understanding is that in the experiments barcodes correspond to both caching and retrieval, but perhaps I am mistaken?).

      Our intent is to use the expression “barcode” as similarly as possible between model and experimental work. The reviewer points out that the connection between barcodes in experimental and modeling work is unclear, as well as the relation of “attractors” in our model to previous experimental results. The meaning of ‘barcode’ is absolutely critical—we clarify below our intended meaning, and then describe changes to the manuscript to highlight this.

      In experiments, we observed that activity during caching looked different than ordinary hippocampal activity (i.e. typical “place activity” observed during visits). Empirically there were two major differences. First, there was a pattern of neural activity which was present during every cache. This pattern was also present when birds visually inspected sites containing a cached seed, but not when visually inspecting an empty site. This is what we refer to as “seed activity”. Second, there was a pattern of neural activity which was unique to each cache. This pattern re-occurred during retrieval, and was orthogonal to place activity (see Fig. 1E-F). This is what we refer to as “barcode activity”. In summary, activity during a cache (or retrieval) contains a combination of three components: place activity, seed activity, and barcode activity.

      These experimental findings are recapitulated in our model, as activity during a cache contains a combination of three components: place activity driven by place inputs, seed activity driven by seed inputs, and barcode activity generated by recurrent dynamics. Cache activity in the model corresponds to cache activity in experiments, and barcodes in the model correspond to barcodes in experiments. Our model additionally has “attractors”, meaning that network connectivity changes so that the activity generated during a simulated cache becomes an attractor state of network dynamics. “Attractors” refers to a feature of network dynamics, not a distinct activity state, and we do not yet know if these attractors exist in experimental data.

      Figure 3D, as described in the figure legend, is a correlation of activity during cache and retrieval (in purple), for cache-retrieval pairs at the same or at different sites. We believe this is what the reviewer asks to see: the correlation between attractor states for different cache locations. The reviewer makes an important point: seed activity is shared across all attractors, so then why are correlations not high for all locations? This is because attractors also have a place component, which is anti-correlated for distant locations. This is evident in Fig. 3D by noticing that visit-visit correlations (black line, corresponding to place activity only) are negative for distant locations, and the correlation between attractors (purple line, cache-retrieval pairs) is subtly shifted up relative to the black line (place code only) for these distant locations. The size of this shift is due to the relative magnitude of place and seed inputs. For example, if we increase the strength of the seed input during caching (blue line), we can further increase the correlation between attractors even for quite distant sites:

      Author response image 1.
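      The argument above (place components are anti-correlated at distant sites, while a shared seed component shifts the correlation upward) can be reproduced in a toy setting. This sketch is not the authors' simulation: the bump-shaped place tuning on a ring, the Gaussian seed vector, and the seed strengths are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000  # hypothetical number of units
preferred = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)


def place_activity(loc, kappa=4.0):
    """Bump of activity on a ring, centered on the site at angle `loc`."""
    return np.exp(kappa * (np.cos(preferred - loc) - 1.0))


# Seed component shared by every cache (random pattern, an assumption).
seed = rng.normal(size=n)


def attractor_correlation(seed_strength):
    """Correlation between cache states at two opposite sites of the arena."""
    a_near = place_activity(0.0) + seed_strength * seed
    a_far = place_activity(np.pi) + seed_strength * seed
    return np.corrcoef(a_near, a_far)[0, 1]


strengths = [0.0, 0.2, 0.4]
correlations = [attractor_correlation(g) for g in strengths]
```

      With no seed component the correlation between opposite sites is negative, and it rises as the seed strength grows, mirroring the qualitative effect described for the purple and blue lines.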

      To clarify the manuscript, we made the following modifications:

      Section 2.2, first paragraph: We model the hippocampus as a recurrent neural network (RNN) (Alvarez and Squire, 1994; Tsodyks, 1999; Hopfield, 1982) and propose that recurrent dynamics can generate barcodes from place inputs. As in experiments, the model’s population activity during a cache should exhibit both place and barcode activity components.

      Section 2.3, paragraph 3 (starting at “Memory recall in our network…”): As an example, consider a scenario in which an animal has already formed a memory at some location $l$, resulting in the storage of an attractor $\vec{a}$ into the RNN. The attractor $\vec{a}$ can be thought of as a linear combination of place input-driven activity $p(l)$, seed input-driven activity $s$, and a recurrent-driven barcode component $b$. Later, the animal returns to the same location and attempts recall (i.e. sets $r = 1$, Figure 3B). Place inputs for $l$ drive RNN activity towards $p(l)$, which is partially correlated with attractor $\vec{a}$, and the recurrent dynamics cause network activity to converge onto attractor $\vec{a}$. In this way, barcode activity $b$ is reactivated as part of attractor $\vec{a}$, along with the place and seed components stored in the attractor state, $p(l)$ and $s$. The seed input can also affect recall, as discussed in the following section.

      The insights obtained from the network model for the computational role of barcode activity could be explained more clearly. The introduction starts by laying out the indexing theory, which proposes that the hippocampus links an index with each memory so that the memory is reactivated when the index is presented. The experimental paper suggests that the barcode activations play the role of indexes. Yet, in the model reactivations of memories are driven not by presenting barcode activity, but by presenting place activity (Cache Presence task) or seed activity (Cache Location task). So it seems that either place activity or seed activity plays the role of indexes. Section 2.5 nicely shows that ultimately the role of barcode activity is to decorrelate attractors, which seems different from playing the role of indexes. I feel it would be useful for the Discussion to reassess more critically the relationship between barcodes, indexing theory, and key-value architectures.

      The reviewer highlights a failure on our part to clearly identify the connection between our findings on barcodes, indexing theory, and key-value architectures. This is another major component of the paper, and below we propose changes to the manuscript to clarify these concepts and their relationships. First, we will summarize the key points that were unclear in our original manuscript.

      The reviewer equates the concept of an ‘index’ with that of a ‘query’: the signal that drives memory reactivation. This may be intuitive, but it is not how a memory index was defined in indexing theory (e.g. Teyler & DiScenna 1986). In indexing theory, the index is a pattern of hippocampal activity that is (a) generated during memory formation, (b) separate from the activity encoding memory content, and (c) linked to memory content via associative plasticity. After memory formation, a memory might be queried by activating a partial set of the memory contents, which would then drive reactivation of the hippocampal index, leading to pattern completion of memory contents. See, for example, figure 1 of Teyler and DiScenna 1986. The ‘index’ is thus not the same as the ‘query’ that drives recall.

      We propose in this work that barcode activity is such an index. Indexing theory originally posited that memory content was encoded by neocortex, and memory index was encoded by hippocampus. However the experiments of Chettih et al. 2024 revealed that the hippocampus contained both memory content and memory index signals, and furthermore there was no division of cells into ‘content’ and ‘index’ subtypes. Thus our model drops the assumption of earlier work that index and content signals correspond to different neurons in different brain areas—a significant advance of our work. Otherwise, the experimentally observed barcodes and the barcodes generated by our computational model play the role of indices as originally defined.

      Our original manuscript was unclear on the relationship of indexing theory and key-value systems. Our work connects diverse areas of memory models, including attractor dynamics, key-value memory systems, and memory indexing. A full account of these literatures and their relationships may be beyond the scope of this manuscript, and we note that a recent review article (Gershman, Fiete, and Irie, 2025) further clarifies the relationship between key-value memory, indexing theory, and the hippocampus. We will cite this work in our discussion as a source for the interested reader.

      Briefly, a key-value memory system distinguishes between the address where a memory is stored, the ‘key’, and the content of that memory, the ‘value’. An advantage of such systems is that keys can be optimized for purposes independent of the value of each memory. The use of barcodes in our model to decorrelate memories is related to this optimization of keys in key-value memory systems. By generating barcodes and adding this to the attractor state corresponding to a cache memory, the ‘address’ of the memory in population activity is differentiated from other memories. Our work is thus consistent with the idea that hippocampus generates keys and implements a key storage system. However it is not so straightforward to equate barcodes with keys, as they are defined in key-value memory. As the reviewer points out, memory recall can be driven by location and seed inputs, i.e. it is content-addressable. We think of the barcode as modifying the memory address to better separate similar memories, without changing memory content, and the resulting memory can be recalled by querying with either content or barcode. Given the complex and speculative nature of these relationships, we prefer to note the salient connection of our work with ongoing efforts applying the key-value framework to biological memory, and leave the precise details of this connection to future work.
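      The idea that barcodes differentiate the 'address' of memories with similar content can be illustrated numerically. The sketch below is a toy example under stated assumptions, not the manuscript's model: contents are random vectors built to share most of their variance, barcodes are independent random vectors of equal variance, and a stored memory is simply their sum.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000  # hypothetical number of units

# Two memories with highly overlapping content (correlation of about 0.8).
shared = rng.normal(size=n)
content_1 = 0.9 * shared + np.sqrt(1 - 0.9**2) * rng.normal(size=n)
content_2 = 0.9 * shared + np.sqrt(1 - 0.9**2) * rng.normal(size=n)

# Independent, content-free barcodes generated at storage time.
barcode_1 = rng.normal(size=n)
barcode_2 = rng.normal(size=n)

# Stored patterns without and with a barcode component.
overlap_without = np.corrcoef(content_1, content_2)[0, 1]
overlap_with = np.corrcoef(content_1 + barcode_1, content_2 + barcode_2)[0, 1]
```

      Adding the barcodes roughly halves the correlation between the two stored patterns while leaving the content components themselves unchanged, which is the sense in which the memory address is separated without altering memory content.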

      We make the following changes in the manuscript to clarify these ideas:

      Introduction, first paragraph: In this scheme, during memory formation the hippocampus generates an index of population activity, and the neurons representing this index are linked with the neurons representing memory content by associative plasticity. Later, re-experience of partial memory contents may reactivate the index, and reactivation of the index drives complete recall of the memory contents.

      Discussion, 4th paragraph on key-value: Interestingly, prior theoretical work has suggested neural implementations for both key-value memory and attention mechanisms, arguing for their usefulness in neural systems such as long term memory (Kanerva, 1988; Tyulmankov et al., 2021; Bricken and Pehlevan, 2021; Whittington et al., 2021; Kozachkov et al., 2023; Krotov and Hopfield, 2020; Gershman 2025). In this framework, the address where a memory is stored (the key) may be optimized independently of the value or content of the memory. In our model, barcodes improve memory performance by providing a content-independent scaffold that binds to memory content, preventing memories with overlapping content from blurring together. Thus barcodes can be considered as a change in memory address, and our model suggests important connections between recurrent neural activity and key generation mechanisms. However we note that barcodes should not be literally equated with keys in key-value systems as our model’s memory is ‘content-addressable’: it can be queried by place and seed inputs.

      The model includes a number of non-standard ingredients. It would be useful to explain which of these ingredients and which of the described mechanisms are essential for the studied phenomenon. In particular:

      - the dynamics in Eq.2 include a shunting inhibition term. Is it essential and why?

      The shunting inhibition is important as it acts to normalize the network activity to prevent runaway excitation. We hope to clarify this further by amending the following sentence in section 2.2: “g (·) is a leak rate that depends on the average activity of the full network, representing a form of global shunting inhibition that normalizes network activity to prevent runaway excitation from recurrent dynamics.”

      - same question for the global inhibition included in the random connectivity;

      The distribution from which connectivity strengths are drawn has a negative mean (global inhibition). This causes activity during caching (i.e. r = 1) to be sparser than activity during visits (i.e. r = 0), and was chosen to match experimental findings. In figures 2B and S2B we show that our model can transition between a mode with place code only, barcode only, or a mode containing both, by changing the variance of the weight distribution while holding the mean constant. We suggest clarifying this by editing the following in section 2.2, paragraph 2: “We initialize the recurrent weights from a random Gaussian distribution, . where $N_X$ is the number of RNN neurons and μ < 0, reflecting global subtractive inhibition that encourages sparse network activity to match experimental findings (Chettih et al. 2024).”

      - the model is fully rate-based, but for certain figures, spikes are randomly generated. This seems superfluous.

      Spikes are simulated for one analysis and one visualization, where it is important to consider noise or variability in neural responses across trials. First, for Fig. 2H,J, we generated spikes to allow a visual comparison to figures that can be easily generated from experimental data. Second, and more significantly, for the analysis underlying Fig. 3D, it is essential to simulate variability in neural responses. Because our rate-based models are noiseless, the RNN’s rate vector at site distance = 0 will always be the same and result in a correlation of 1 for both visit-visit and cache-retrieval. However, we show that, if one interprets the rate as a noisy Poisson spiking process, the correlation at site distance = 0 between a cache-retrieval pair is higher than that of two visits. This is because under a Poisson spiking model, the signal-to-noise ratio is higher for cache-retrieval activity, where rates are higher in magnitude. The greater correlation for a cache-retrieval pair at the same site, relative to visits at the same site, is an experimental finding that was critical for our model to reproduce. We detail clarifications to the manuscript below in response to the reviewer’s following and related question.

      How are the correlations determined in the model (e.g., Fig 2 B)? The methods explain that they are computed from Poisson-generated spikes, but over which time period? Presumably during steady-state responses, but are these responses time-averaged?

      The reviewer points out a lack of clarity in our original manuscript. Correlations for events (caches, retrievals and visits) at different sites are calculated in two sections of the paper (2B, 3D), for different purposes and with slight differences in methods:

      - For Figure 2B, no spikes are simulated. Note that the methods mentioning Poisson spike generation specify only Fig. 2H,J and Fig. 3D. We simply take the network’s rate vector at timestep t=100 (when the decorrelating effect of chaotic dynamics has saturated, S1A-B) and correlate this vector when generated at different locations. We now clarify this in the legend for Figure 2B: “We show correlation of place inputs (gray) and correlation of the RNN's rate vector at t = 100 (black).”

      - For Figure 3D, we want to compare the model to empirical results from Chettih et al. 2024, and reproduced in this paper in Fig. 1E-F. These empirical results are derived from correlating vectors of spiking activity on pairs of single trials, and are thus affected by noise or variability in neural responses as described in our response to the reviewer’s previous question. We thus took the RNN’s rate vector at t=100 and simulated spiking data by drawing samples from a Poisson distribution to get spike counts. Our original manuscript was unclear about this, and we suggest the following changes:

      - Legend for Figure 3D: D. Correlation of Poisson-generated spikes simulated from RNN rate vectors at two sites, plotted as a function of the distance between the two sites.

      - Section 2.3, last paragraph: Population activity during retrieval closely matches activity during caching, and is substantially decorrelated from activity during visits (Figure 3C). To compare our model with the empirical results reproduced in Figure 1E,F, we ran in silico experiments with caches and retrievals at varying sites in the circular arena. We simulated Poisson-generated spikes drawn from our network's underlying rates to match the intrinsic variability in empirical data (see Methods).

      - Methods, subsection Spatial correlation of RNN activity for cache-retrieval pairs at different sites: To calculate correlation values as in Figure \ref{fig3}D, we simulated experiments where 5 sites were randomly chosen for caching and retrieval. To compare model results to the empirical data in Fig. 1E,F, which includes intrinsic neural variability, we sampled Poisson-generated spike counts from the rates output by our model. Specifically, for RNN activity \vec{r_i} at location i, using the rates at t=100 as elsewhere, we first generate a sample vector of spikes…
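
      The signal-to-noise argument above can be illustrated with a minimal sketch. It is not the manuscript's code: the rate vector is a hypothetical random draw, and the assumption that "cache" rates are simply a scaled-up copy of "visit" rates is made only to isolate the effect of rate magnitude. It draws two independent Poisson spike-count vectors from the same rates and correlates them, which mimics a same-site pair of single trials.

```python
import numpy as np

def same_site_correlation(rates, rng, n_pairs=500):
    """Mean Pearson correlation between two independent Poisson
    spike-count vectors drawn from the same underlying rate vector."""
    corrs = []
    for _ in range(n_pairs):
        trial_a = rng.poisson(rates)  # one simulated single trial
        trial_b = rng.poisson(rates)  # a second, independent trial
        corrs.append(np.corrcoef(trial_a, trial_b)[0, 1])
    return float(np.mean(corrs))

rng = np.random.default_rng(0)

# Hypothetical rate vector for 200 units (assumption, not model output).
base_rates = rng.uniform(0.0, 2.0, size=200)
visit_rates = base_rates          # lower-magnitude rates
cache_rates = 10.0 * base_rates   # assumed higher-magnitude rates

# Without noise, a rate vector correlated with itself gives exactly 1.
noiseless_corr = float(np.corrcoef(visit_rates, visit_rates)[0, 1])

visit_corr = same_site_correlation(visit_rates, rng)
cache_corr = same_site_correlation(cache_rates, rng)

print(f"noiseless: {noiseless_corr:.2f}")
print(f"visit-visit (Poisson): {visit_corr:.2f}")
print(f"cache-retrieval (Poisson): {cache_corr:.2f}")
```

      Under these assumptions the higher-rate pair yields the larger same-site correlation, because Poisson variance grows only linearly with the rate while the variance of the rates across units grows quadratically with the scaling.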

      I was confused by early and late responses in Fig 2 C. The text says that the activity is initialized at zero, so the response at t=0 should be flat (and zero). More generally, I am not sure I understand why the dynamics matter for the phenomenon at all, presumably the decorrelation shown in Fig 2B depends only on steady state activity (cf previous question).

      Thanks for catching this mistake. The legend has been updated to indicate that the ‘early’ response is actually at t=1, when network activity reflects place inputs without the effects of dynamics. The reviewer is correct that we are primarily interested in the ‘late’ response of the network. All other results in the paper use this late response at t=100. As shown in Fig. S2A,B, this timepoint is not truly a steady state, as activity in the network continues to change, but the decorrelation of network activity with place-driven activity has saturated.

      We include the early response in Fig. 2C for visual comparison of the purely place-driven early activity with the eventual network response. It is also relevant since, as the reviewer points out above, there is a shunting inhibition term in the dynamics that is present during both low and high recurrent strength simulations.

      Related to the previous point, the discussion of decorrelation (l.79 - 97) is somewhat confusing. That paragraph focuses on chaotic activity, but chaos decorrelates responses across different time points. Here the main phenomenon is the decorrelation of responses across different spatial inputs (Fig 2B). This decorrelation is presumably due to the fact that different inputs lead to different non-trivial steady-state responses, but this requires some clarification. If that is correct, the temporal chaos adds fluctuations around these non-trivial steady-state responses, but that alone would not lead to the decorrelation shown in Fig 2B.

      We agree with the reviewer that chaotic activity produces a decorrelation across time points. Because of chaotic dynamics, network activity does not settle into a trivial steady-state, and instead evolves from the initial state in an unpredictable way. The network does not settle into a steady-state pattern, but both the decorrelation of network state with initial state and the rate of change in the network state saturate after ~t=25 timesteps, as shown in Fig. S2A-B.

      The initial activity for nearby states is similar, due to them receiving similar place inputs.

      Because network activity is chaotically decorrelated from this initial state by temporal dynamics, ‘late stage’ network activity between nearby spatial states is less correlated than ‘early stage’ activity. Thus the temporal decorrelation produces a spatial decorrelation. We believe that the changes we have introduced to the manuscript in revision will make this point clearer in our resubmission.
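
      As an illustration of how temporal decorrelation can produce spatial decorrelation, the following toy sketch may help. It is explicitly not the model in the manuscript: it assumes a discrete-time tanh network with random Gaussian weights, a gain of 3, no ongoing input and no inhibition terms, and Gaussian place inputs on a ring used only as the initial state. All function names and parameter values are hypothetical.

```python
import numpy as np

def place_input(location, n_units, width=0.1):
    """Gaussian bump of activity on a ring of unit length (assumed input format)."""
    preferred = np.arange(n_units) / n_units
    dist = np.abs(preferred - location)
    dist = np.minimum(dist, 1.0 - dist)  # circular distance
    return np.exp(-dist**2 / (2.0 * width**2))

def run_network(initial_state, weights, gain, n_steps):
    """Iterate x <- tanh(gain * W x), starting from the place-driven state."""
    x = initial_state.copy()
    for _ in range(n_steps):
        x = np.tanh(gain * weights @ x)
    return x

rng = np.random.default_rng(1)
n_units = 400
# Random connectivity with variance 1/N; a gain above 1 puts this map in a chaotic regime.
weights = rng.normal(0.0, 1.0 / np.sqrt(n_units), size=(n_units, n_units))

# Two nearby locations receive similar place inputs.
early_a = place_input(0.50, n_units)
early_b = place_input(0.52, n_units)
early_corr = float(np.corrcoef(early_a, early_b)[0, 1])

# After many timesteps, each state has moved far from its initial condition.
late_a = run_network(early_a, weights, gain=3.0, n_steps=100)
late_b = run_network(early_b, weights, gain=3.0, n_steps=100)
late_corr = float(np.corrcoef(late_a, late_b)[0, 1])

print(f"early correlation: {early_corr:.2f}")
print(f"late correlation:  {late_corr:.2f}")
```

      In this toy setting the two nearby states start out highly correlated and end up far less correlated, since small differences in the initial state are amplified over time.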

      A key ingredient of the model is that the recurrent interactions are switched on and off between "caching" and "visits". The discussion argues that a possible mechanism for this is recurrent inhibition (l.320), which would need to be added. However two forms of inhibition are already included in the model. The text also says that it is unclear how units in the model should be mapped onto E and I neurons. However the model makes explicit assumptions about this, in particular by generating spikes from individual neurons. Altogether, I did not find that part of the Discussion convincing.

      We agree with the reviewer that this section is a limitation of our current work, and in fact it is an ongoing area of future research. However, we think the advances in this current work warrant publication despite this topic requiring further research. We attempted to discuss this limitation explicitly, and note that the other reviewer pointed this section out as particularly helpful. We do not think it is problematic for a realistic model of the brain to ultimately include three or even more forms of inhibition. We do not think that Poisson-generated spikes commit us to interpreting network units as single neurons. Spikes are not a core part of our model’s mechanism, and were used only as a means of introducing variability on top of deterministic rates for specific analyses. Furthermore, one could still view network units as pools of both E and I spiking neurons. We would welcome further recommendations the reviewer believes are important to note in this section on our model’s limitations.

      On lines 117-120 the text briefly mentions an alternate feed-forward model and promptly discards it. The discussion instead says that a "separate possibility is that barcodes are generated in a circuit upstream of where memories are stored, and supplied as inputs to the hippocampal population", and that this possibility would lead to identical conclusions. The two statements seem a bit contradictory. It seems that the alternative possibility would replace the need for switching on and off recurrent interactions, with a mechanism where barcode inputs are switched on and off. This alternate scenario is perhaps more plausible, so it would be useful to discuss it more explicitly.

      We apologize for the confusion here, which seems to be due to our phrasing in the discussion section. We do reject the idea that a simple feed-forward model could generate the spatial correlation profile observed in data, as mentioned in the text and included as Fig. S2. Our statement in the discussion may have seemed contradictory because here we intended to discuss the possibility that an upstream area generates barcodes, for example by the chaotic recurrent dynamics proposed in our work, while a downstream network receives these barcodes as inputs and undergoes plasticity to store memories as attractors. We did not intend to suggest any connection to the feedforward model of barcode generation, and apologize for the confusion. Our claim that this ‘2 network’ solution would lead to similar conclusions is because the upstream network would need an efficient means of barcode generation, and the downstream network would need an efficient means of storing memory attractors, and separating these functions into different networks is not likely to affect, for example, the advantage of partially decorrelating memory attractors. Moreover, the downstream network would still require some form of recurrent gating, so that during visits it exhibits place activity without activating stored memory attractors.

      We thus chose a 1 network instead of a 2 network solution because it was simpler and, we believe, more interesting. It is challenging in the absence of more data to say which is more plausible, thus we wanted to mention the possibility of a 2 network solution. We suggest the following changes to the manuscript:

      - Discussion, 3rd paragraph: “Alternatively, other mechanisms may be involved in generating barcodes. We demonstrated that conventional feed-forward sparsification (Babadi and Sompolinsky, 2014; Xie et al., 2023) was highly inefficient, but more specialized computations may improve this (Földiak, 1990; Olshausen and Field, 1996; Sacouto and Wichert, 2023; Muscinelli et al., 2023). Another possibility is that barcodes are generated in a separate recurrent network upstream of the recurrent network where memories are stored. In this 2-network scenario, the downstream network receives both spatial tuning and barcodes as inputs. This would not obviate the need for modulating recurrent strength in the downstream network to switch between input-driven modes and attractor dynamics. We suspect separating barcode generation and memory storage in separate networks would not fundamentally affect our conclusions.”

      As a minor note, the beginning of the discussion states that the presented model is similar to previous recurrent network models of the hippocampus. It would be worth noting that several of the cited works assign a very different role to recurrent interactions: they generate place cell activity, while the present model assumes it is inherited from upstream inputs.

      We are not sure how best to modify the paper to address this suggestion. As far as we know, all of the cited models which deal with spatial encoding do assume that the hippocampus receives a spatially-modulated or spatially-tuned input. For example, the Tsodyks 1999 paper cited in this paragraph uses exponentially-decaying place inputs to each neuron, highly similar to our model. Furthermore, we explore how our model would perform if we change the format of spatial inputs in Fig. S4, and find key results are unchanged. It is unclear how hippocampal place fields could emerge without inputs that differentiate between spatial locations. We think it is appropriate to highlight the similarity of our model to well-known Hopfield-type recurrent models, where memories are stored as attractor states of the network dynamics.

      On the other hand, we agree that a common line of hippocampal modeling proposes that recurrent interactions reshape spatial inputs to produce place fields. This often arises in the context of hippocampus generating a predictive map, where inputs may be one-hot for a single spatial state, in a grid cell-like format, or a random projection of sensory features. We attempted to address this in section 2.6, using a model which superimposes the random connectivity needed for barcode generation with the structured connectivity needed for predictive map formation. We found that such a model was able to perform both predictive and barcode functions, suggesting a path forward to connecting different lines of hippocampal modeling in future work.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Xiong and colleagues investigate the mechanisms operating downstream to TRIM32 and controlling myogenic progression from proliferation to differentiation. Overall, the bulk of the data presented is robust. Although further investigation of specific aspects would make the conclusions more definitive (see below), it is an interesting contribution to the field of scientists studying the molecular basis of muscle diseases.

      We thank the Reviewer for appreciating our work and for their valuable suggestions to improve our manuscript. We have carefully addressed some of the concerns raised, as detailed here, while others, which require more experimental efforts, will be addressed as detailed in the Revision Plan.

      In my opinion, a few aspects would improve the manuscript. Firstly, the conclusion that Trim32 regulates c-Myc mRNA stability could be expanded and corroborated by further mechanistic studies:

      1. Studies investigating whether Trim32 binds directly to c-Myc RNA. Moreover, although possibly beyond the scope of this study, an unbiased screening of RNA species binding to Trim32 would be informative.

      Authors’ response. This point will be addressed as detailed in the Revision Plan

      If possible, studies in which the overexpression of different mutants presenting specific altered functional domains (NHL domain known to bind RNAs and Ring domain reportedly involved in protein ubiquitination) would be used to test if they are capable or incapable of rescuing the reported alteration of Trim32 KO cell lines in c-Myc expression and muscle maturation.

      Authors’ response. This point will be addressed as detailed in the Revision Plan

      An optional aspect that might be interesting to explore is whether the alterations in c-Myc expression observed in C2C12 might be replicated with primary myoblasts or satellite cells devoid of Trim32.

      Authors’ response. This point will be addressed as detailed in the Revision Plan

      I also have a few minor points to highlight:

        • It is unclear if the differences highlighted in graphs 5G, EV5D, and EV5E are statistically significant.

      Authors’ response. We thank the Reviewer for raising this point. We have now indicated the statistical analyses performed on the data presented in the mentioned figures (also in response to a point raised by Reviewer #3). Consistent with the conclusion that Trim32 is necessary for proper regulation of c-Myc transcript stability, a 2-way ANOVA of the data now reported in Figure 5G shows a statistically significant effect of genotype at 6h (right-hand graph) but not at D0 (left-hand graph). In the graphs of Fig. EV5D and E, no significant changes are observed at D0, whereas at 6h the data show a significant difference at the 40 min time point. We included this information in the graphs and in the corresponding legends.

      - On page 10, it is stated that c-Myc down-regulation cannot rescue KO myotube morphology fully nor increase the differentiation index significantly, but the corresponding data is not shown. Could the authors include those quantifications in the manuscript?

      Authors’ response. As suggested, we included the graph showing the differentiation index upon c-Myc silencing in the Trim32 KO clones and in the WT clones, as a novel panel in Figure 6 (Fig. 6D). As already reported in the text, a partial recovery of differentiation index is observed but the increase is not statistically significant. In contrast, no changes are observed applying the same silencing in the WT cells. Legend and text were modified accordingly.

      Reviewer #1 (Significance (Required)):

      The manuscript offers several strengths. It provides novel mechanistic insight by identifying a previously unrecognized role for Trim32 in regulating c-Myc mRNA stability during the onset of myogenic differentiation. The study is supported by a robust methodology that integrates CRISPR/Cas9 gene editing, transcriptomic profiling, flow cytometry, biochemical assays, and rescue experiments using siRNA knockdown. Furthermore, the work has a disease relevance, as it uncovers a mechanistic link between Trim32 deficiency and impaired myogenesis, with implications for the pathogenesis of LGMDR8.

      At the same time, the study has some limitations. The findings rely exclusively on the C2C12 myoblast cell line, which may not fully represent primary satellite cell or in vivo biology. The functional rescue achieved through c-Myc knockdown is only partial, restoring Myogenin expression but not the full differentiation index or morphology, indicating that additional mechanisms are likely involved. Although evidence supports a role for Trim32 in mRNA destabilization, the precise molecular partners-such as RNA-binding activity, microRNA involvement, or ligase function-remain undefined. Some discrepancies with previous studies, including Trim32-mediated protein degradation of c-Myc, are acknowledged but not experimentally resolved. Moreover, functional validation in animal models or patient-derived cells is currently lacking. Despite these limitations, the study represents an advancement for the field. It shifts the conceptual framework from Trim32's canonical role in protein ubiquitination to a novel function in RNA regulation during myogenesis. It also raises potential clinical implications by suggesting that targeting the Trim32-c-Myc axis, or modulating c-Myc stability, may represent a therapeutic strategy for LGMDR8.

      This work will be of particular interest to muscle biology researchers studying myogenesis and the molecular basis of muscle disease, RNA biology specialists investigating post-transcriptional regulation and mRNA stability, and neuromuscular disease researchers and clinicians seeking to identify new molecular targets for therapeutic intervention in LGMDR8.

      The Reviewer expressing this opinion is an expert in muscle stem cells, muscle regeneration, and muscle development.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary:

      In this study, the authors sought to investigate the molecular role of Trim32, a tripartite motif-containing E3 ubiquitin ligase often associated with its dysregulation in Limb-Girdle Muscular Dystrophy Recessive 8 (LGMDR8), and its role in the dynamics of skeletal muscle differentiation. Using a CRISPR-Cas9 model of Trim32 knockout in C2C12 murine myoblasts, the authors demonstrate that loss of Trim32 alters the myogenic process, particularly by impairing the transition from proliferation to differentiation. The authors provide evidence in the way of transcriptomic profiling that displays an alteration of myogenic signaling in the Trim32 KO cells, leading to a disruption of myotube formation in-vitro. Interestingly, while previous studies have focused on Trim32's role in protein ubiquitination and degradation of c-Myc, the authors provide evidence that Trim32-regulation of c-Myc occurs at the level of mRNA stability. The authors show that the sustained c-Myc expression in Trim32 knockout cells disrupts the timely expression of key myogenic factors and interferes with critical withdrawal of myoblasts from the cell cycle required for myotube formation. Overall, the study offers a new insight into how Trim32 regulates early myogenic progression and highlights a potential therapeutic target for addressing the defects in muscular regeneration observed in LGMDR8.

      We thank the Reviewer for valuing our work and for their appreciated suggestions to improve our manuscript. We have carefully addressed some of the concerns raised as detailed here, while others, which require more laborious experimental efforts, will be addressed as reported in the Revision Plan.

      Major Comments:

      The work is a bit incremental based on this:

      https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0030445

      And this:

      https://www.nature.com/articles/s41418-018-0129-0

      To their credit, the authors do cite the above papers.

      Authors’ response. We thank the Reviewer for this careful evaluation of our work against the current literature and for recognising the contribution of our findings to the understanding of the complex picture of myogenesis, in which the involvement of Trim32 and c-Myc, and of the Trim32-c-Myc axis, can occur at several stages and likely in narrow time windows along the process, thus possibly explaining some inconsistencies among reports.

      The authors do provide compelling evidence that Trim32 deficiency disrupts C2C12 myogenic differentiation and sustained c-Myc expression contributes to this defective process. However, while knockdown of c-Myc does restore Myogenin levels, it was not sufficient to normalize myotube morphology or differentiation index, suggesting an incomplete picture of the Trim32-dependent pathways involved. The authors should qualify their claim by emphasizing that c-Myc regulation is a major, but not exclusive, mechanism underlying the observed defects. This will prevent an overgeneralization and better align the conclusions with the author's data.

      Authors’ response. We agree with the Reviewer and have modified the phrasing that implied the Trim32-c-Myc axis is the exclusive mechanism, explicitly indicating in the Abstract and in the Discussion that other pathways contribute to guaranteeing proper myogenesis.

      The Abstract now reads: “… suggesting that the Trim32–c-Myc axis may represent an essential hub, although likely not the exclusive molecular mechanism, in muscle regeneration within LGMDR8 pathogenesis.”

      The Discussion now reads: “Functionally, we demonstrated that c-Myc contributes to the impaired myogenesis observed in Trim32 KO clones, although this is clearly not the only factor involved in the Trim32-mediated myogenic network; realistically other molecular mechanisms can participate in this process as also suggested by our transcriptomic results.”

      The authors provide a thorough and well-executed interrogation of cell cycle dynamics in Trim32 KO clones, combining phospho-histone H3 flow cytometry of DNA content, and CFSE proliferation assays. These complementary approaches convincingly show that, while proliferation states remain similar in WT and KO cells, Trim32-deficient myoblasts fail in their normal withdrawal from the cell cycle during exposure to differentiation-inducing conditions. This work adds clarity to a previously inconsistent literature and greatly strengthens the study.

      Authors’ response. We thank the Reviewer for appreciating our thorough analyses on cell cycle dynamics in proliferation conditions and at the onset of the differentiation process.

      The transcriptomic analysis (detailed in the "Transcriptomic analysis of Trim32 WT and KO clones along early differentiation" section of Results) is central to the manuscript and provides strong evidence that Trim32 deficiency disrupts normal differentiation processes. However, the description of the pathway enrichment results is highly detailed and somewhat compressed, which may make it challenging for readers to follow the key biological 'take-homes'. The narrative quickly moves across their multiple analyses like MDS, clustering, heatmaps, and bubble plots without pausing to guide the reader through what each analysis contributes to the overall biological interpretation. As a result, the key findings (reduced muscle development pathways in KO cells and enrichment of cell cycle-related pathways) can feel somewhat muted. The authors may consider reorganizing this section, so the primary biological insights are highlighted and supported by each of their analyses. This would allow the biological implications to be more accessible to a broader readership.

      Authors’ response. We thank the Reviewer for raising this point and apologise for being too brief in describing the data, indeed leaving some points excessively implicit. As suggested, we have now reorganised this section and added the lists of enriched canonical pathways relative to WT vs KO comparisons at D0 and D3 (Fig. EV3B) as well as those relative to the comparison between D0 and D3 for both WT and Trim32 KO samples (Fig. EV3C), with their relative scores. We changed the Results section “Transcriptomic analysis of Trim32 WT and Trim32 KO clones along early differentiation” as reported here below and modified the legends accordingly.

      The paragraph now reads: Based on our initial observations, the absence of Trim32 already exerts a significant impact by day 3 (D3) of C2C12 myogenic differentiation. To investigate how Trim32 influences early global transcriptional changes during the proliferative phase (D0) and early differentiation (D3), we performed an unbiased transcriptomic profiling of WT and Trim32 KO clones (Fig. 2A). Multidimensional Scaling (MDS) analysis revealed clear segregation of gene expression profiles based on both time of differentiation (Dim1, 44% variance) and Trim32 genotype (Dim2, 16% variance) (Fig. 2A). Likewise, hierarchical clustering grouped WT and Trim32 KO clones into distinct clusters at both timepoints, indicating consistent genotype-specific transcriptional differences (Fig. EV3A). Differentially Expressed Genes (DEGs) were detected in the Trim32 KO transcriptome relative to WT, at both D0 and D3. In proliferating conditions, 72 genes were upregulated and 189 were downregulated whereas at D3 of differentiation, 72 genes were upregulated and 212 were downregulated. Ingenuity Pathway Analysis of the DEGs revealed the top 10 Canonical Pathways displayed in Fig. EV3B as enriched at either D0 or D3 (Fig. EV3B). Several of these pathways can underscore relevant Trim32-mediated functions though most of them represent generic functions not immediately attributable to the observed myogenesis defects.

      Notably, the transcriptional divergence between WT and Trim32 KO cells is more pronounced at D3, as evidenced by a greater separation along the MDS Dim2 axis, suggesting that Trim32-dependent transcriptional regulation intensifies during early differentiation (Fig. 2A). Given our interest in the differentiation process, we therefore focused our analyses on comparing the changes occurring from D0 to D3 in WT (WT D3 vs. D0) and in Trim32 KO (KO D3 vs. D0) RNAseq data.

      Pathway enrichment analysis of D3 vs. D0 DEGs allowed the selection of the top-scored pathways for both WT and Trim32 KO data. We obtained 18 top-scored pathways enriched in each genotype (-log(p-value) ≥ 9 cut-off): 14 are shared while 4 are top-ranked only in WT and 4 only in Trim32 KO (Fig. EV3C). For the following analyses, we thus employed a total of 22 distinct pathways. To better mine those that are relevant in the passage from the proliferation stage to the early differentiation stage and that are affected by the lack of Trim32, we built a bubble plot comparing side-by-side the scores and enrichment of the 22 selected top-scored pathways in WT and Trim32 KO (Fig. 2B). A heatmap of DEGs included within these selected pathways confirms the clustering of the samples considering both the genotypes and the timepoints, highlighting gene expression differences (Fig. 2C). These pathways are mainly related to muscle development, cell cycle regulation, genome stability maintenance and a few other metabolic cascades.

      As expected given the results related to Figure 1, moving from D0 to D3 WT clones showed robust upregulation of key transcripts associated with the Inactive Sarcomere Protein Complex, a category encompassing most genes in the “Striated Muscle Contraction” pathway, while in Trim32 KO clones this pathway was not among those enriched in the transition from D0 to D3 (Fig. EV3C). Detailed analyses of transcripts enclosed within this pathway revealed that on the transition from proliferation to differentiation, WT clones show upregulation of several Myosin Heavy Chain isoforms (e.g., MYH3, MYH6, MYH8), α-Actin 1 (ACTA1), α-Actinin 2 (ACTN2), Desmin (DES), Tropomodulin 1 (TMOD1), and Titin (TTN), a pattern consistent with previous reports, while these same transcripts were either non-detected or only modestly upregulated in Trim32 KO clones at D3 (Fig. 2D). This genotype-specific disparity was further confirmed by gene set enrichment barcode plots, which demonstrated significant enrichment of these muscle-related transcripts in WT cells (FDR_UP = 0.0062), but not in Trim32 KO cells (FDR_UP = 0.24) (Fig. EV3D). These findings support an early transcriptional basis for the impaired myogenesis previously observed in Trim32 KO cells.

      In addition to differences in muscle-specific gene expression, we also observed that several pathways related to cell proliferation and cell cycle regulation were more enriched in Trim32 KO cells compared to WT. This suggests that altered cell proliferation may contribute to the distinct differentiation behavior observed in Trim32 KO versus WT (Fig. 2B). Given that cell cycle exit is a critical prerequisite for the onset of myogenic differentiation and considering that previous studies on the role of Trim32 in cell cycle regulation have reported inconsistent findings, we further examined cell cycle dynamics under our experimental conditions to clarify the contribution of Trim32 to this process.

      The work would be greatly strengthened by the inclusion of LGMDR8 primary cells, and rescue experiments of TRIM32 to explore myogenesis.

      Authors’ response. This point will be addressed as detailed in the Revision Plan

      Also, EU (5-ethynyl uridine) pulse-chase experiments to label nascent and stable RNA coupled with MYC pulldowns and qPCR (or RNA-sequencing of both pools) would further enhance the claim that MYC stability is being affected.

      Authors’ response. This point will be addressed as detailed in the Revision Plan

      "On one side, c-Myc may influence early stages of myogenesis, such as myoblast proliferation and initial myotube formation, but it may not contribute significantly to later events such as myotube hypertrophy or fusion between existing myotubes and myocytes. This hypothesis is supported by recent work showing that c-Myc is dispensable for muscle fiber hypertrophy but essential for normal MuSC function (Ham et al, 2025)." Also address and discuss the following, as what is currently written is not entirely accurate: https://www.embopress.org/doi/full/10.1038/s44319-024-00299-z and https://journals.physiology.org/doi/prev/20250724-aop/abs/10.1152/ajpcell.00528.2025

      Authors’ response. We thank the Reviewer for bringing these two publications to our attention; they indeed add important data that help recapitulate the in vivo complexity of the role of c-Myc in myogenesis. We included this point in our Discussion.

      The Discussion now reads: “On one side, c-Myc may influence early stages of myogenesis, such as myoblast proliferation and initial myotube formation, but it may not contribute significantly to later events such as myotube hypertrophy or fusion between existing myotubes and myocytes. This hypothesis is supported by recent work showing that c-Myc is dispensable for muscle fiber hypertrophy but essential for normal MuSC function (Ham et al, 2025). Other reports, instead, demonstrated the implication of c-Myc periodic pulses, mimicking resistance-exercise, in muscle growth, a role that cannot though be observed in our experimental model (Edman et al., 2024; Jones et al., 2025).”

      Minor Comments:

      Z-score scale used in the pathway bubble plot (Figure 2C) could benefit from alternative color choices. Current gradient is a bit muddy and clarity for the reader could be improved by more distinct color options, particularly in the transition from positive to negative Z-score.

      Authors’ response. As suggested, we modified the z-score-representing colors using a more distinct gradient especially in the positive to negative transition in Figure 2B.

      Clarification on the rationale for selecting the "top 18" pathways would be helpful, as it is not clear if this cutoff was chosen arbitrarily or reflects a specific statistical or biological threshold.

      Authors’ response. As now better explained (see comment regarding Major point: Transcriptomics), we used a cut-off of -log(p-value) above or equal to 9 for pathways enriched in DEGs of the D0 vs D3 comparison for both WT and Trim32 KO. The threshold is now included in the Results section and the pathways (shared between WT and Trim32 KO and unique) are listed as Fig. EV3C.
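
      The selection logic described above can be sketched as follows. This is an illustrative reconstruction, not the authors' pipeline: the pathway names and p-values are hypothetical placeholders, and the use of a base-10 logarithm is an assumption, since the response only states "-log(p-value)". The sketch keeps pathways at or above the cut-off in each genotype and then combines the two sets, which is how 14 shared, 4 WT-only and 4 Trim32 KO-only pathways give 22 distinct pathways.

```python
import math

def top_pathways(p_values, cutoff=9.0):
    """Return the pathways whose -log10(p-value) is at or above the cut-off."""
    return {name for name, p in p_values.items() if -math.log10(p) >= cutoff}

# Hypothetical enrichment p-values for the D3 vs. D0 comparison (placeholders).
wt_p_values = {"Pathway A": 1e-12, "Pathway B": 2e-10, "Pathway C": 5e-10, "Pathway D": 1e-5}
ko_p_values = {"Pathway A": 1e-11, "Pathway B": 1e-4, "Pathway C": 5e-10, "Pathway E": 1e-15}

wt_top = top_pathways(wt_p_values)
ko_top = top_pathways(ko_p_values)

shared = wt_top & ko_top      # top-scored in both genotypes
wt_only = wt_top - ko_top     # top-scored only in WT
ko_only = ko_top - wt_top     # top-scored only in Trim32 KO
selected = wt_top | ko_top    # distinct pathways carried into later analyses

print(sorted(shared), sorted(wt_only), sorted(ko_only), len(selected))
```

      With the counts reported in the response, the same union gives 14 + 4 + 4 = 22 distinct pathways.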

      The authors alternate between using "Trim 32 KO clones" and "KO clones" throughout the manuscript. Consistent terminology across figures and text would improve readability.

      Authors’ response. We thank the Reviewer for this remark, and we apologise for having overlooked it. We amended this throughout the manuscript by always using for clarity “Trim32 KO clones/cells”.

      Cell culture methodology does not specify passage number or culture duration (only "At confluence") before differentiation. This is important, as C2C12 differentiation potential can drift with extended passaging.

      Authors’ response. We agree with the Reviewer that passaging can reduce the differentiation potential of the C2C12 myoblast cell line; this is indeed the main reason why we used WT clones, which underwent the same editing process as the clones carrying Trim32 mutations, as reference controls throughout our study. We apologise for not indicating the passage numbers in the first version of the manuscript; the Methods section has now been amended as follows:

      The C2C12 parental cells used in this study were maintained within passages 3–8. All clonal cell lines (see below) were utilized within 10 passages following gene editing. In all experiments, WT and Trim32 KO clones of comparable passage numbers were used to ensure consistency and minimize passage-related variability.

      Reviewer #2 (Significance (Required)):

      General Assessment:

      This study provides a thorough investigation of Trim32's role in the processes related to skeletal muscle differentiation using a CRISPR-Cas9 knockout C2C12 model. The strengths of this study lie in the multi-layered experimental approach as the authors incorporated transcriptomics, cell cycle profiling, and stability assays which collectively build a strong case for their hypothesis that Trim32 is a key factor in the normal regulation of myogenesis. The work is also strengthened by the use of multiple biological and technical replicates, particularly the independent KO clones, which help address potential clonal variation issues that could occur. The largest limitation to this study is that, while the c-Myc mechanism is well explored, the other Trim32-dependent pathways associated with the disruption (implicated by the incomplete rescue by c-Myc knockdown) are not as well addressed. Overall, however, the study convincingly identifies a critical function for Trim32 during skeletal muscle differentiation.

      Advance:

      To my knowledge, this is the first study to demonstrate the mRNA stability level of c-Myc regulation by Trim32, rather than through the ubiquitin-mediated protein degradation. This work will advance the current understanding and provide a more complete understanding of Trim32's role in c-Myc regulation. Beyond c-Myc, this work highlights the idea that TRIM family proteins can influence RNA stability which could implicate a broader role in RNA biology and has potential for future therapeutic targeting.

      Audience:

      This research will be of interest to an audience that focuses on broad skeletal muscle biology but primarily to readers with more focused research such as myogenesis and neuromuscular disease (LGMDR8 in particular) where the defined Trim32 governance over early differentiation checkpoints will be of interest. It will also provide mechanistic insights to those outside of skeletal muscle that study TRIM family proteins, ubiquitin biology, and RNA regulation. For translational/clinical researchers, it identifies the Trim32/c-Myc axis as a potential therapeutic target for LGMDR8 and related muscular dystrophies.

      Expertise:

      My expertise lies in skeletal muscle biology, gene editing, transgenic mouse models, and bioinformatics. I feel confident evaluating the data and conclusions as presented.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      • In this paper, the authors examine the role of TRIM32, implicated in limb girdle muscular dystrophy recessive 8 (LGMDR8), in the differentiation of C2C12 mouse myoblasts. Using CRISPR, they generate mutant and wild-type clones and compare their differentiation capacity in vitro. They report that Trim32-deficient clones exhibit delayed and defective myogenic differentiation. RNA-seq analysis reveals widespread changes in gene expression, although few are validated by independent methods. Notably, Trim32 mutant cells maintain residual proliferation under differentiation conditions, apparently due to a failure to downregulate c-Myc. Translation inhibition experiments suggest that TRIM32 promotes c-Myc mRNA destabilization, but this conclusion is insufficiently substantiated. The authors also perform rescue experiments, showing that c-Myc knockdown in Trim32-deficient cells alleviates some differentiation defects. However, this rescue is not quantified, was conducted in only two of the three knockout lines, and is supported by inappropriate statistical analysis of gene expression. Overall, the manuscript in its current form has substantial weaknesses that preclude publication. Beyond statistical issues, the major concerns are: (1) exclusive reliance on the immortalized C2C12 line, with no validation in primary/satellite cells or in vivo, (2) insufficient mechanistic evidence that TRIM32 acts directly on c-Myc mRNA, and (3) overinterpretation of disease relevance in the absence of supporting patient or in vivo data. Please find more details below:

      We thank the Reviewer for the in-depth assessment of our work and the valuable suggestions to improve the manuscript. We have carefully addressed some of the concerns raised, as detailed here, while others, which require more experimental effort, will be addressed as detailed in the Revision Plan.

      - TRIM32 complementation / rescue experiments to exclude clonal or off-target CRISPR effects and show specificity are lacking.

      Authors’ response. This point will be addressed as detailed in the Revision Plan.

      - The authors link their in vitro findings to LGMDR8 pathogenesis and propose that the Trim32-c-Myc axis may serve as a central regulator of muscle regeneration in the disease. However, LGMDR8 is a complex disorder, and connecting muscle wasting in patients to differentiation assays in C2C12 cells is difficult to justify. No direct evidence is provided that the proposed mRNA mechanism operates in patient-derived samples or in mouse satellite cells. Moreover, the partial rescue achieved by c-Myc knockdown (which does not fully restore myotube morphology or differentiation index) further suggests that the disease connection is not straightforward. Validation of the TRIM32-c-Myc axis in a physiologically relevant system, such as LGMD patient myoblasts or Trim32 mutant mouse cells, would greatly strengthen the claim.

      Authors’ response. This point will be addressed as detailed in the Revision Plan.

      -Some gene expression changes from the RNA-seq study in Figure 2 should be validated by qPCR

      Authors’ response. We thank the reviewer for this suggestion. This point will be addressed as detailed in the Revision Plan. We have selected several transcripts that will be evaluated in independent samples in order to validate the RNAseq results.

      - The paper shows siRNA knockdown of c-Myc in KO restores Myogenin RNA/protein but does not fully rescue myotube morphology or differentiation index. This suggests that Trim32 controls additional effectors beyond c-Myc; yet the authors do not pursue other candidate mediators identified in the RNA-seq. The manuscript would be strengthened by systematically testing whether other deregulated transcripts contribute to the phenotype.

      Authors’ response. This point will be addressed as detailed in the Revision Plan.

      - There are concerns with experimental/statistical issues and insufficient replicate reporting. The authors use unpaired two-tailed Student's t-test across many comparisons; multiple testing corrections or ANOVA where appropriate should be used. In Figure EV5B and Figure 6B, the authors perform statistical analyses with control values set to 1. This method masks the inherent variability between experiments and artificially augments p values. Control sample values need to be normalized to one another to have reliable statistical analysis. Myotube morphology and differentiation index quantifications need clear description of fields counted, blind analysis, and number of biological replicates.

      Authors’ response. We thank the Reviewer for raising this point.

      Regarding the replicates, we clarified in the Methods and Legends that the Trim32 KO experiments have been performed on 3 biological replicates (independent clones) and the same for the reference control (3 independent WT clones), except for the Fig. 6 experiments that were performed on 2 Trim32 KO and 2 WT clones. All the Western Blots, immunofluorescence, qPCR data are representative of the results of at least 3 independent experiments unless otherwise stated. We reported the number and type of replicates as well as the microscope fields analyzed.

      We repeated the statistical analyses of the data in Figure 5G, EV5D, and EV5E using the more appropriate two-way ANOVA, as suggested, and now report this information in the graphs and legends.

      We thank the Reviewer for raising this point. We agree and have replaced the graphs in Fig. EV5B and 6B with versions showing the control values normalised as suggested. The statistical analyses now reflect this change.
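
      The statistical concern about control values set to 1 can be illustrated with a minimal sketch. The function names and the raw values below are hypothetical and are not taken from the manuscript; the sketch assumes that "normalized to one another" means dividing every value by the mean of the controls, which is one reasonable reading of the Reviewer's request.

```python
from statistics import mean, stdev


def normalise_per_experiment(control, treated):
    """Divide each experiment by its own control, forcing every control to 1."""
    norm_control = [c / c for c in control]
    norm_treated = [t / c for c, t in zip(control, treated)]
    return norm_control, norm_treated


def normalise_to_control_mean(control, treated):
    """Divide all values by the mean of the controls, preserving their spread."""
    ref = mean(control)
    return [c / ref for c in control], [t / ref for t in treated]


# Hypothetical raw expression values from three independent experiments.
control_raw = [0.8, 1.0, 1.2]
treated_raw = [1.6, 1.9, 2.5]

ctrl_a, _ = normalise_per_experiment(control_raw, treated_raw)
ctrl_b, _ = normalise_to_control_mean(control_raw, treated_raw)

print(stdev(ctrl_a))  # every control equals 1, so no variance is left
print(stdev(ctrl_b))  # between-experiment variability is retained
```

      With the first scheme the control group has zero variance, so any test that compares it with the treated group understates the experimental noise; the second scheme keeps the controls centred on 1 while retaining their spread.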

      -Some English mistakes require additional read-throughs. For example: "Indeed, Trim32 has no effect on the stability of c-Myc mRNA in proliferating conditions, but upon induction of differentiation the stability of c-Myc mRNA resulted enhanced in Trim32 KO clones (Fig. 5G, Fig. EV5D and 5E)."

      Authors’ response. We re-edited this revised version of the manuscript as suggested.

      -Results in Figure 5A should be quantified

      Authors’ response. We addressed this point by quantifying the results shown in Fig. 5A and added a graph of the quantification of 3 experimental replicates to the Figure. Quantification confirms that no statistically significant difference is observed. The Figure and its legend have been modified accordingly.

      -Based on the nuclear marker p84, the separation of cytoplasmic and nuclear fractions is not ideal in Figure 5D

      Authors’ response. We agree with the Reviewer that the presence of p84 in the cytoplasmic fraction is not ideal. Regrettably, we observed this faint p84 band in all the experiments performed. We think, however, that this does not affect the result, which clearly shows that c-Myc and Trim32 are never detected in the same compartment.

      -In Figure 6, it is not appropriate to perform statistical analyses on only two data points per condition.

      Authors’ response. We agree with the Reviewer and we now show the graph of the results of the 3 technical replicates for 2 biological replicates and do not indicate any statistics (Fig. 6B). The graph was also modified according to a previous point raised.

      -The nuclear MYOG phenotype is very interesting; could this be related to requirements of TRIM32 in fusion?

      Authors’ response. We agree with the Reviewer that Trim32 might also be necessary for myoblast fusion. This point is however beyond the scope of the present study and will be addressed in future work.

      - The hypothesis that TRIM32 destabilizes c-Myc mRNA is intriguing but requires stronger mechanistic support. This would be more convincing with RNA immunoprecipitation to test direct association with c-Myc mRNA, and/or co-immunoprecipitation to identify interactions between TRIM32 and proteins involved in mRNA stability. The study would also be strengthened by reporter assays, such as c-Myc 3′UTR luciferase constructs in WT and KO cells, to directly demonstrate 3′UTR-dependent regulation of mRNA stability.

      Authors’ response. This point will be addressed as detailed in the Revision Plan.

      Reviewer #3 (Significance (Required)):

      The manuscript presents a minor conceptual advance in understanding TRIM32 function in myogenic differentiation. Its main limitation is that all experiments were performed in C2C12 cells. While C2C12 are a classical system to study muscle differentiation, they are an immortalized, long-cultured, and genetically unstable line that represents a committed myoblast stage rather than bona fide satellite cells. They therefore do not fully model the biology of early regenerative responses. Several TRIM32 phenotypes reported in the literature differ between primary satellite cells and cell lines, and the authors themselves note such discrepancies. Extrapolating these findings to LGMDR8 pathogenesis without validation in primary human myoblasts, satellite cell assays, or in vivo regeneration models is therefore not justified. Previous work has already established clear roles for TRIM32 in mouse satellite cells in vivo and in patient myoblasts in vitro, whereas this study introduces a novel link to c-Myc regulation during differentiation. In addition, without mechanistic evidence, the central claim that TRIM32 regulates c-Myc mRNA stability remains descriptive and incomplete. Nevertheless, the results will be of interest to researchers studying LGMD and to those exploring TRIM32 biology in broader contexts. I review this manuscript as a muscle biologist with expertise in satellite cell biology and transcriptional regulation.

      Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Reply to the Reviewers

      I thank the Referees for their...

      Referee #1

      1. The authors should provide more information when...

      Responses + The typical domed appearance of a hydrocephalus-harboring skull is apparent as early as P4, as shown in a new side-by-side comparison of pups at that age (Fig. 1A). + Though this is not stated in the MS 2. Figure 6: Why has only...

      Response: We expanded the comparison

      Minor comments:

      1. The text contains several...

      Response: We added...

      Referee #2

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      In this paper, the authors examine the role of TRIM32, implicated in limb girdle muscular dystrophy recessive 8 (LGMDR8), in the differentiation of C2C12 mouse myoblasts. Using CRISPR, they generate mutant and wild-type clones and compare their differentiation capacity in vitro. They report that Trim32-deficient clones exhibit delayed and defective myogenic differentiation. RNA-seq analysis reveals widespread changes in gene expression, although few are validated by independent methods. Notably, Trim32 mutant cells maintain residual proliferation under differentiation conditions, apparently due to a failure to downregulate c-Myc. Translation inhibition experiments suggest that TRIM32 promotes c-Myc mRNA destabilization, but this conclusion is insufficiently substantiated. The authors also perform rescue experiments, showing that c-Myc knockdown in Trim32-deficient cells alleviates some differentiation defects. However, this rescue is not quantified, was conducted in only two of the three knockout lines, and is supported by inappropriate statistical analysis of gene expression. Overall, the manuscript in its current form has substantial weaknesses that preclude publication. Beyond statistical issues, the major concerns are: (1) exclusive reliance on the immortalized C2C12 line, with no validation in primary/satellite cells or in vivo, (2) insufficient mechanistic evidence that TRIM32 acts directly on c-Myc mRNA, and (3) overinterpretation of disease relevance in the absence of supporting patient or in vivo data. Please find more details below:

      • TRIM32 complementation / rescue experiments to exclude clonal or off-target CRISPR effects and show specificity are lacking.
      • The authors link their in vitro findings to LGMDR8 pathogenesis and propose that the Trim32-c-Myc axis may serve as a central regulator of muscle regeneration in the disease. However, LGMDR8 is a complex disorder, and connecting muscle wasting in patients to differentiation assays in C2C12 cells is difficult to justify. No direct evidence is provided that the proposed mRNA mechanism operates in patient-derived samples or in mouse satellite cells. Moreover, the partial rescue achieved by c-Myc knockdown (which does not fully restore myotube morphology or differentiation index) further suggests that the disease connection is not straightforward. Validation of the TRIM32-c-Myc axis in a physiologically relevant system, such as LGMD patient myoblasts or Trim32 mutant mouse cells, would greatly strengthen the claim.
      • Some gene expression changes from the RNA-seq study in Figure 2 should be validated by qPCR.
      • The paper shows siRNA knockdown of c-Myc in KO restores Myogenin RNA/protein but does not fully rescue myotube morphology or differentiation index. This suggests that Trim32 controls additional effectors beyond c-Myc; yet the authors do not pursue other candidate mediators identified in the RNA-seq. The manuscript would be strengthened by systematically testing whether other deregulated transcripts contribute to the phenotype.
      • There are concerns with experimental/statistical issues and insufficient replicate reporting. The authors use unpaired two-tailed Student's t-test across many comparisons; multiple testing corrections or ANOVA where appropriate should be used. In Figure EV5B and Figure 6B, the authors perform statistical analyses with control values set to 1. This method masks the inherent variability between experiments and artificially augments p values. Control sample values need to be normalized to one another to have reliable statistical analysis. Myotube morphology and differentiation index quantifications need clear description of fields counted, blind analysis, and number of biological replicates.
      • Some English mistakes require additional read-throughs. For example: "Indeed, Trim32 has no effect on the stability of c-Myc mRNA in proliferating conditions, but upon induction of differentiation the stability of c-Myc mRNA resulted enhanced in Trim32 KO clones (Fig. 5G, Fig. EV5D and 5E)."
      • Results in Figure 5A should be quantified.
      • Based on the nuclear marker p84, the separation of cytoplasmic and nuclear fractions is not ideal in Figure 5D.
      • In Figure 6, it is not appropriate to perform statistical analyses on only two data points per condition.
      • The nuclear MYOG phenotype is very interesting; could this be related to requirements of TRIM32 in fusion?
      • The hypothesis that TRIM32 destabilizes c-Myc mRNA is intriguing but requires stronger mechanistic support. This would be more convincing with RNA immunoprecipitation to test direct association with c-Myc mRNA, and/or co-immunoprecipitation to identify interactions between TRIM32 and proteins involved in mRNA stability. The study would also be strengthened by reporter assays, such as c-Myc 3′UTR luciferase constructs in WT and KO cells, to directly demonstrate 3′UTR-dependent regulation of mRNA stability.

      Significance

      The manuscript presents a minor conceptual advance in understanding TRIM32 function in myogenic differentiation. Its main limitation is that all experiments were performed in C2C12 cells. While C2C12 are a classical system to study muscle differentiation, they are an immortalized, long-cultured, and genetically unstable line that represents a committed myoblast stage rather than bona fide satellite cells. They therefore do not fully model the biology of early regenerative responses. Several TRIM32 phenotypes reported in the literature differ between primary satellite cells and cell lines, and the authors themselves note such discrepancies. Extrapolating these findings to LGMDR8 pathogenesis without validation in primary human myoblasts, satellite cell assays, or in vivo regeneration models is therefore not justified. Previous work has already established clear roles for TRIM32 in mouse satellite cells in vivo and in patient myoblasts in vitro, whereas this study introduces a novel link to c-Myc regulation during differentiation. In addition, without mechanistic evidence, the central claim that TRIM32 regulates c-Myc mRNA stability remains descriptive and incomplete. Nevertheless, the results will be of interest to researchers studying LGMD and to those exploring TRIM32 biology in broader contexts. I review this manuscript as a muscle biologist with expertise in satellite cell biology and transcriptional regulation.

    1. Reviewer #2 (Public review):

      A long-standing debate in the field of Pavlovian learning relates to the phenomenon of timescale invariance in learning, i.e. that the rate at which an animal learns about a Pavlovian CS is driven by the relative rate of reinforcement of the cue (CS) to the background rate of reinforcement. In practice, if a CS is reinforced on every trial, then the rate of acquisition is determined by the relative duration of the CS (T) and the ITI (C = inter-US-interval = duration of CS + ITI), specifically the ratio of C/T. Therefore, the point of acquisition should be the same with a 10s CS and a 90s ITI (T = 10; C = 90 + 10 = 100, C/T = 100/10 = 10) and with a 100s CS and a 900s ITI (T = 100; C = 900 + 100 = 1000, C/T = 1000/100 = 10). That is to say, the rate of acquisition is invariant to the absolute timescale as long as this ratio is the same. This idea has many other consequences, but is also notably different from more popular prediction-error based associative learning models such as the Rescorla-Wagner model. The initial demonstrations that the ratio C/T predicts the point of acquisition across a wide range of parameters (both within and across multiple studies) were conducted in pigeons using a Pavlovian autoshaping procedure. What has remained under contention is whether or not this relationship holds across species, particularly in the standard appetitive Pavlovian conditioning paradigms used in rodents. The results from rodent studies aimed at testing this have been mixed, and often the debate around the source of these inconsistent results focuses on the different statistical methods used to identify the point of acquisition for the highly variable trial-by-trial responses at the level of individual animals.
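
      The arithmetic in the two schedules above can be checked with a short sketch. The function name `informativeness` and its parameter names are illustrative assumptions rather than anything taken from the manuscript; the sketch follows the definition given above, C = CS duration + ITI, which holds when every CS is reinforced.

```python
def informativeness(cs_duration: float, iti_duration: float) -> float:
    """Return C/T, where T is the CS duration and C = T + ITI.

    Assumes every CS is reinforced, so the inter-US interval C is
    simply the CS duration plus the inter-trial interval.
    """
    if cs_duration <= 0:
        raise ValueError("cs_duration must be positive")
    cycle = cs_duration + iti_duration  # C, the inter-US interval
    return cycle / cs_duration


# The two schedules from the text differ tenfold in absolute timescale
# but share the same C/T ratio.
short_schedule = informativeness(cs_duration=10, iti_duration=90)
long_schedule = informativeness(cs_duration=100, iti_duration=900)
print(short_schedule, long_schedule)
```

      Under timescale invariance, two groups trained on these schedules would be expected to reach the acquisition criterion after a similar number of trials, because only the ratio enters the prediction.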

      The authors successfully replicate the same effect found in pigeon autoshaping paradigms decades ago (with almost identical model parameters) in a standard Pavlovian appetitive paradigm in rats. They achieve this through a clever change to the experimental design, using a convincingly wide range of parameters across 14 groups of rats, and by a thorough and meticulous analysis of these data. It is also interesting to note that the two authors have published on opposing sides of this debate for many years, and as a result have developed and refined many of the ideas in this manuscript through this process.

      Main findings

      (1) The present findings demonstrate that the point of initial acquisition of responding is predicted by the C/T ratio.

      (2) The terminal rates of responding to the CS appear to be related to the reinforcement rate of the CS (T; specifically, 1/T) but not its relation to the reinforcement rate of the context (i.e. C or C/T). In the present experiment, all CS trials were reinforced so it is also the case that the terminal rate of responding was related to the duration of the CS.

      (3) An unexpected finding was that responding during the ITI was similarly related to the rate of contextual reinforcement (1/C). This novel finding suggests that the terminal rate of responding during the ITI and the CS are related to their corresponding rates of reinforcement. This finding is surprising as it suggests that responding during the ITI is not being driven by the probability of reinforcement during the ITI.

      (4) Finally, the authors characterised the nature of increased responding from the point of initial acquisition until responding peaks at a maximum. Their analyses suggest that the nature of this increase was best described as linear in the majority of rats, as opposed to the non-linear increase that might be predicted by prediction error learning models (e.g. Rescorla-Wagner). However, more detailed analyses revealed that these changes can be quite variable across rats, and more variable when the CS had lower informativeness (defined as C/T).

      Strengths and Weaknesses:

      There is an inherent paradox regarding the consistency of the acquisition data from Gibbon & Balsam's (1981) meta-analysis of autoshaping in pigeons, and the present results in magazine response frequency in rats. This consistency is remarkable and impressive, and is suggestive of a relatively conserved or similar underlying learning principle. However, the consistency is also surprising given some significant differences in how these experiments were run. Some of these differences might reasonably be expected to lead to differences in how these different species respond. For example:

      The pigeons in the autoshaping procedures that generated these data were pretrained to retrieve rewards from a grain hopper, with an instrumental contingency between head entry into the hopper and grain availability. During Pavlovian training, pecking the key light also elicited an auditory click feedback stimulus, and when the grain hopper was made available, the hopper was also illuminated.

      In the present experimental procedure, the rats were not given contextual exposure to the pellet reinforcers in the magazine (e.g. a magazine training session is typically found in similar rodent procedures). The Pavlovian CS was a cue light within the magazine itself.

      These design features in the present rodent experiment are clearly intentional. Pretraining with the reinforcer in the testing chambers would reasonably alter the background rate of reinforcement (parameter), so it makes sense not to include this, although it differs from the paradigm used in pigeons. Having the CS inside the magazine where pellets are delivered provides an effective way to reduce any potential response competition between CS and US directed responding and combines these all into the same physical response. This makes the magazine approach response more like the pecking of the light stimulus in the pigeon autoshaping paradigm. However, the location of the CS and US is separated in pigeon autoshaping, raising questions about why the findings across species are consistent despite these differences.

      Intriguingly, when the insertion of a lever is used as a Pavlovian cue in rodent studies, CS directed responding (sign-tracking) often develops over training such that eventually all animals bias their responding towards the lever rather than towards the US (goal-tracking at the magazine). However, the nature of this shift highlights the important point that these CS and US directed responses can be quite distinct physically as well as psychologically. Therefore, by conflating the development of these different forms of responding, it is not clear whether the relationship between C/T and the acquisition of responding describes the sum of all Pavlovian responding or predominantly CS or US directed responding.

      Another interesting aspect of these findings is that there is a large amount of variability that scales inversely with C/T. A potential account of the source of this variability is related to the absence of preexposure to the reward pellets. This is normally done within the animals' homecage as a form of preexposure to reduce neophobia. If some rats take longer to notice and then approach and finally consume the reward pellets in the magazine, the impact of this would systematically differ depending on the length of the ITI. For animals presented with relatively short CSs and ITIs, they may essentially miss the first couple of trials and/or attribute uneaten pellets accumulating in the magazine to the background/contextual rate of reinforcement. What is not currently clear is whether this was accounted for in some way by confirming when the rats first started retrieving and consuming the rewards from the magazine.

      While the generality of these findings across species is impressive, the very specific set of parameters employed to generate these data raises questions about the generality of these findings across other standard Pavlovian conditioning parameters. While this is obviously beyond the scope of the present experiment, it is important to consider that the present study explored a situation with 100% reinforcement on every trial, with a variable duration CS (drawn from a uniform distribution), with a single relatively brief CS (maximum of 122s) and a single US. Again, the choice of these parameters in the present experiment is appropriate and very deliberately based on refinements from many previous studies from the authors. This includes a number of criteria used to define magazine response frequency which includes discarding specific responses (discussed and reasonably justified clearly in the methods section). Similarly, the finding that terminal rates of responding are reliably related to 1/T is surprising, and it is not clear whether this might be a property specific to this form of variable duration CS, the use of a uniform sampling distribution, or the use of only a single CS. However, it is important to keep these limitations in mind when considering some of the claims made in the discussion section of this manuscript that go beyond what these data can support.

    2. Author response:

      The following is the authors’ response to the previous reviews.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Conceptually, I feel that the authors addressed many concerns. However, I am still not convinced that their data support the strength of their claims. Additionally, I spent considerable time investigating the now freely available code and data and found several inconsistencies that would be critical to rectify. My comments are split into two parts, reflecting concerns related to the responses/methods and concerns resulting from investigation of the provided code/data. The former is described in the public review above. Because I show several figures to illustrate some key points for the latter part, an attached file will provide the second part: https://elife-rp.msubmit.net/elife-rp_files/2025/02/24/00136468/01/136468_1_attach_15_2451_convrt.pdf

      (1) This point is discussed in more detail in the attached file, but there are some important details regarding the identification of the learned trial that require more clarification. For instance, isn’t the original criterion by Gibbon et al. (1977) the first “sequence of three out of four trials in a row with at least one response”? The authors’ provided code for the Wilcoxon signed rank test and nDkl thresholds looks for a permanent exceeding of the threshold. So, I am not yet convinced that the approaches used here and in prior papers are directly comparable.

      We agree that there remain unresolved issues with our two attempts to create criteria that match that used by Gibbon and Balsam for trials to criterion. Therefore, we have decided to remove those analyses and return to our original approach showing trials to acquisition using several different criteria so as to demonstrate that the essential feature of the results—the scaling between learning rate and information—is robust. Figure 2A shows the results for a criterion that identifies the trial after which the cumulative response rate during the CS (=cumulative CS response count from Trial 1 divided by cumulative CS time from Trial 1) is consistently above the cumulative overall response rate across the trial (i.e., including both the CS and ITI). These data compare the CS response rate with the overall response rate, rather than with ITI rate as done in the previous version (in Figure 3A of that submission), to be consistent with the subsequent comparisons that are made using the nDkl. (The nDkl relies on the comparison between the CS rate and the overall rate, rather than between the CS and ITI rates.) Figures 2B and 2C show trials to acquisition when two statistical criteria, based on the nDkl, are applied to the difference between CS and overall response rates (the criteria are for odds >= 4:1 and p<.05). As we now explain in the text, a statistical threshold is useful inasmuch as it provides some confidence to the claim that the animals had learned by a given trial. However, this trial is very likely to be after the point when they had learned because accumulating statistical evidence of a difference necessarily adds trials.
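      The first criterion described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' MATLAB code: the function name, the argument names, and the per-trial numbers are all hypothetical, and the sketch assumes that every trial contributes a positive CS duration and a positive total duration.

```python
from itertools import accumulate

def first_permanent_exceedance(cs_counts, cs_times, all_counts, all_times):
    """Return the 1-based trial from which the cumulative CS rate stays above
    the cumulative overall rate through the final trial, or None if it never does."""
    cum_cs_n = list(accumulate(cs_counts))
    cum_cs_t = list(accumulate(cs_times))
    cum_all_n = list(accumulate(all_counts))
    cum_all_t = list(accumulate(all_times))

    # True on each trial where cumulative CS rate > cumulative overall rate.
    above = [
        (n_cs / t_cs) > (n_all / t_all)
        for n_cs, t_cs, n_all, t_all in zip(cum_cs_n, cum_cs_t, cum_all_n, cum_all_t)
    ]

    # Walk backwards from the last trial to find where the final unbroken
    # run of "above" trials begins.
    trial = None
    for i in range(len(above) - 1, -1, -1):
        if not above[i]:
            break
        trial = i + 1
    return trial

# Hypothetical data: responses and seconds per trial (CS only, and CS + ITI).
cs_counts = [0, 0, 1, 0, 3, 4, 5]
cs_times = [10] * 7
all_counts = [2, 3, 3, 4, 5, 6, 7]
all_times = [100] * 7

acquisition_trial = first_permanent_exceedance(cs_counts, cs_times, all_counts, all_times)
```

      In this made-up series the cumulative CS rate briefly exceeds the overall rate on Trial 3, falls back on Trial 4, and exceeds it for good from Trial 5, so the sketch returns 5 rather than 3.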

      Also, there’s still no regression line fitted to their data (Fig 3’s black line is from Fig 1, according to the legends). Accordingly, I think the claim in the second paragraph of the Discussion that the old data and their data are explained by a model with “essentially the same parameter value” is not yet convincing without actually reporting the parameters of the regression. Related to this, the regression for their data based on my analysis appears to have a slope closer to -0.6, which does not support strict timescale invariance. I think that this point should be discussed as a caveat in the manuscript.

      We now include regression lines fitted to our data in Figures 2A-C, and their slopes are reported in the figure note. We also note on page 14 of the revision that these regressions fitted to our data diverge from the black regression line (slope -1) as the informativeness increases. On pages 14-15, we offer an explanation for this divergence; that, in groups with high informativeness, the effective informativeness is likely to be lower than the assigned value because the rats had not been magazine trained which means they would not have discovered the food pellet as soon as it was released on the first few trials. On pages 15-16, we go on to note that evidence for a change in response rate during the CS in those very first few trials may have been missed because the initial response rates were very low in rats trained with very long inter-reinforcement intervals (and thus high informativeness). We also propose a solution to this problem of comparing between very low response rates, one that uses the nDkl to parse response rates into segments (clusters of trials with equivalent response rates). This analysis with parsed response rates provides evidence that differential responding to the CS may have been acquired earlier than is revealed using trial-by-trial comparisons.
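      For readers who want to see what a slope of -1 versus -0.6 means here, the following is a small Python sketch of an ordinary least-squares fit on log-log axes. It is an illustration only: the informativeness values, the constant 300, and the function name are invented for the example and are not taken from the authors' data or code.

```python
import math

def loglog_slope(x, y):
    """Least-squares slope and intercept of log10(y) regressed on log10(x)."""
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    sxx = sum((a - mean_x) ** 2 for a in lx)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic, noise-free values (assumptions for illustration only).
informativeness = [2, 5, 10, 50, 100]
trials_invariant = [300 / v for v in informativeness]          # trials proportional to 1/x
trials_shallow = [300 * v ** -0.6 for v in informativeness]    # shallower power law

slope_invariant, _ = loglog_slope(informativeness, trials_invariant)
slope_shallow, _ = loglog_slope(informativeness, trials_shallow)
```

      With these synthetic inputs the first fit recovers a slope of -1 (trials to acquisition inversely proportional to informativeness) and the second a slope of -0.6, which is the kind of difference at issue between the black regression line and the fits to the new data.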

      (2) The authors report in the response that the basis for the apparent gradual/multiple step-like increases after initial learning remains unclear within their framework. This would be important to point out in the actual manuscript. Further, the responses indicating that there are some phenomena that are not captured by the current model would be important to state in the manuscript itself.

      We have included a paragraph (on page 26) that discusses the interpretation of the steady/multi-step increase in responding across continued training.

      (3) There are several mismatches between results shown in figures and those produced by the authors’ code, or other supplementary files. As one example, rat 3 results in Fig 11 and Supplementary Materials don’t match and neither version is reproduced by the authors’ code. There are more concerns like this, which are detailed in the attached review file.

      Addressed next….

      The following is the response to the points raised in Part 2 of Reviewer 1’s pdf.

      (1a) I plotted the calculated nDkl with the provided code for rat 3 (Fig 11), but it looks different, and the trials to acquisition also didn’t match with the table provided (average of ~20 trial difference). The authors should revise the provided code and plots. Further, even in their provided figures, if one compares rat 3 in Supplementary Materials to data from the same rat in Fig 11, the curves are different. It is critical to have reproducible results in the manuscript, including the ability to reproduce with the provided code.

      We apologise for those inconsistencies. We have checked the code and the data in the figures to ensure they are all now consistent and match the full data in the nHT.mat file in OSF. Figures 11 and 12 from the previous version are now replaced with Figure 6 in the revised manuscript (still showing data from Rats 3 and 176). The data plotted in Fig 6 match what is plotted in the supplementary figures for those 2 rats (but with slightly different cropping of the x-axes) and all plots draw directly from nHT.mat.

      (1b) I tried to replicate also Fig 3C with the results from the provided code, but I failed especially for nDkl > 2.2. Fig 3A and B look to be OK.

      There was an error in the previous Fig 3C, which was plotting the data from the wrong column of the Trials2Acquisition Table. We suspect this arose because some changes to the file were not updated in Dropbox. However, that figure has changed (now Figure 2) as already mentioned, and no longer plots data obtained with that specific nDkl criterion. The figure now shows criteria that do not attempt to match the Gibbon and Balsam criterion.

      (1c) The trials to learn from the code do match with those in the Trials2Acquisition Table, but the authors’ code doesn’t reproduce the reported trials to learn values in the nDkl Acquisition Table. The trials to learn from the code are ~20 trials different on average from the table’s ones, for 1:20, 1:100, and 1:1000 nDkl.

      We agree that discrepancies between those different files were a source of potential confusion because they were using different criteria or different ways of measuring response rate (i.e., the “conventional” calculation of rate as number of responses/time, vs our adjusted calculation in which the first response in the CS was excluded as well as the time spent in the magazine, vs parsed response rates based on inter-response intervals). To avoid this, there is now a single table called Acquisition_Table.xlsx in OSF that includes Trials to acquisition for each rat based on a range of criteria or estimates of response rate in labelled columns. The data shown in Figure 2 are all based on the conventional calculation of response rate (provided in Columns E to H of Acquisition_Table.xlsx). To make the source of these data explicit, we have provided in OSF the MATLAB code that draws the data from the nHT.mat file to obtain these values for trials-to-acquisition.

      (1d) The nDkl Acquisition Table has columns with the value of the nDkl statistics at various acquisition landmarks, but the value does not look to be true, especially for rat 19. The nDkl curve provided by the authors (Supplementary Materials) doesn’t match the values in the table. The curve is below 10 until at least 300 trials, while the table reports a value higher than 20 (24.86) at the earliest evidence of learning (~120 trials?).

      We are very grateful to the reviewer for finding this discrepancy in our previous files. The individual plots in the Supplementary Materials now contain a plot of the nDkl computed using the conventional calculation of response rate (plot 3 in each 6-panel figure) and a plot of the nDkl computed using the new adjusted calculation of response rate (plot 4). These correspond to the signed nDkl columns for each rat in the full data file nHT.mat. The nDkl values at different acquisition landmarks included in Acquisition_Table.xlsx (Cols AB to AF) correspond to the second of these nDkl formulations. We point out that, of the acquisition landmarks based on the conventional calculation of response rate (Cols E to J of Acquisition_Table.xlsx), only the first two landmarks (CSrate>Contextrate and min_nDkl) match the permanently positive and minimum values of the plotted nDkl values. This is because the subsequent acquisition landmarks are based on a recalculation of the nDkl starting from the trial when CSrate>ContextRate, whereas the plotted nDkl starts from Trial 1.

      (2) The cumulative number of responses during the trial (Total) in the raw data table is not measured directly, but indirectly estimated from the pre-CS period, as (cumNR_Pre*[cumITI/cumT_Pre]) + cumNR_CS (cumNR_Pre: cumulative nose-poke response number during pre-CS period; cumITI: cumulative sum of ITI duration; cumT_Pre: cumulative pre-CS duration; cumNR_CS: cumulative response number during CS), according to ‘Explanation of TbyTdataTable (MATLAB).docx’. Why not use the actual cumulative responses during the whole trial instead of using a noisier measure during a smaller time window and then scaling it for the total period?

      Unfortunately, the bespoke software used to control the experimental events and record the magazine activity did not record data continuously throughout the experiment. The ITI responses were only sampled during a specified time-window (the “pre-CS” period) immediately before each CS onset. Therefore, response counts across the whole ITI had to be extrapolated.
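      The extrapolation quoted by the reviewer, (cumNR_Pre*[cumITI/cumT_Pre]) + cumNR_CS, can be written out as a short Python sketch. The variable names mirror those in the quoted formula; the function name and the example numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def estimated_total_responses(cum_nr_pre, cum_iti, cum_t_pre, cum_nr_cs):
    """Estimate cumulative responses over the whole trial (ITI + CS).

    The pre-CS response count is scaled up by the ratio of total ITI time to
    sampled pre-CS time, then the directly recorded CS responses are added.
    """
    scaled_iti_responses = cum_nr_pre * (cum_iti / cum_t_pre)
    return scaled_iti_responses + cum_nr_cs

# Hypothetical example: 12 responses in 200 s of sampled pre-CS time,
# 2000 s of cumulative ITI, and 30 responses recorded during the CS.
total = estimated_total_responses(cum_nr_pre=12, cum_iti=2000, cum_t_pre=200, cum_nr_cs=30)
```

      Here the pre-CS window covers one tenth of the ITI, so the 12 sampled responses are scaled to 120 and the estimated total is 150. The sketch also makes the reviewer's concern concrete: any sampling noise in the pre-CS count is multiplied by the same scaling factor.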

      (3) Regarding the “Matlab code for Find Trials to Criterion.docx”:

      (a) What’s the rationale for not using all the trials to calculate nDkl but starting the cumulative summation from the earliest evidence trial (truncated)? Also, this procedure is not described in the manuscript, and this should be mentioned.

      The procedure was perhaps not described clearly enough in the previous manuscript. We have expanded that text to make it clearer (page 12) which includes the text…

      “We started from this trial, rather than from Trial 1, because response rate data from trials prior to the point of acquisition would dilute the evidence for a statistically significant difference in responding once it had emerged, and thereby increase the number of trials required to observe significant responding to the CS. The data from Rat 1 illustrates this point. The CS response rate of Rat 1 permanently exceeded its overall response rate on Trial 52 (when the nD<sub>KL</sub> also became permanently positive). The nD<sub>KL</sub>, calculated from that trial onwards, surpassed 0.82 (odds 4:1) after a further 11 trials (on Trial 63) and reached 1.92 (p < .05) on Trial 81. By contrast, the nD<sub>KL</sub> for this rat, calculated from Trial 1, did not permanently exceed 0.82 until Trial 83 and did not exceed 1.92 until Trial 93, adding 10 or 20 trials to the point of acquisition.”
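      The two thresholds quoted above (0.82 for odds of 4:1 and 1.92 for p < .05) are numerically consistent with treating twice the nDkl as approximately chi-square distributed with one degree of freedom. The following Python sketch checks that correspondence. The distributional assumption and the function name are ours, introduced for illustration; the text above does not state how the thresholds were derived.

```python
import math

def tail_probability(ndkl):
    """Tail probability for an nDkl value, ASSUMING 2 * nDkl follows a
    chi-square distribution with 1 degree of freedom.

    For chi-square(1), P(X > x) = erfc(sqrt(x / 2)); substituting x = 2 * nDkl
    gives erfc(sqrt(nDkl)).
    """
    return math.erfc(math.sqrt(ndkl))

p_odds_4_to_1 = tail_probability(0.82)   # close to 0.2, i.e. odds of 4:1
p_significant = tail_probability(1.92)   # close to 0.05
```

      Under that assumption a value of 0.82 corresponds to a tail probability of about 0.2 (odds of 4:1 against a chance difference) and 1.92 to about 0.05, matching the labels attached to the two criteria.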

      (3b) The authors' threshold is the trial when the nDkl value exceeds the threshold permanently. What about using just the first pass after the minimum?

      Rat 19 provides one example where the nDkl was initially positive, and even exceeded the thresholds for odds 4:1 and p<.05, but was followed by an extended period when the nDkl was negative because the CS response rate was less than the overall response rate. It illustrates why the first trial on which the nDkl passes a threshold cannot be used as a reliable index of acquisition.
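      The difference between a "first pass" criterion and a "permanent" criterion can be illustrated with a short Python sketch. The series of signed values below is invented to mimic the pattern described for Rat 19 (an early excursion above threshold followed by a long negative stretch); it is not that rat's data, and the function names are hypothetical.

```python
def first_crossing(values, threshold):
    """1-based index of the first value above threshold, or None."""
    for trial, value in enumerate(values, start=1):
        if value > threshold:
            return trial
    return None

def permanent_crossing(values, threshold):
    """1-based index from which every remaining value is above threshold, or None."""
    trial = None
    for i in range(len(values), 0, -1):
        if values[i - 1] <= threshold:
            break
        trial = i
    return trial

# Made-up signed statistic across 11 trials.
signed_stat = [0.5, 1.0, 2.1, 0.3, -0.8, -1.5, -0.2, 0.9, 2.0, 2.6, 3.1]

early = first_crossing(signed_stat, threshold=1.92)
stable = permanent_crossing(signed_stat, threshold=1.92)
```

      For this series the first-pass rule reports Trial 3, even though the statistic then turns negative for several trials, whereas the permanent rule reports Trial 9.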

      (3c) Can the authors explain why a value of 0.5 is added to the cumulative response number before dividing it by the cumulative time?

      This was done to provide an “unbiased” estimate of the response count because responses are integers. For example, if a rat has made 10 responses over 100 s of cumulative CS time, the estimated rate should be at least 10/100 but could be anything up to, but not including, 11/100. A rate of 10.5/100 is the unbiased estimate. However, we have now removed this step when calculating the nDkl to identify trials to acquisition because we recognise that it would represent a larger correction to the rate calculated across short intervals than across long intervals and therefore bias comparison between CS and overall response rates that involve very different time durations. As such, the correction would artefactually inflate evidence that the CS response rate was higher than the contextual response rate. However, as noted earlier in this reply, we have now instituted a similar correction when calculating the pre-CS response rate over the final 5 sessions for rats that did not register a single response (hence we set their response count to 0.5).
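      The asymmetry described above can be seen in a small worked example in Python. The counts and durations are hypothetical and chosen so that the uncorrected CS and overall rates are identical; the function name is ours.

```python
def rate(count, seconds, correction=0.0):
    """Responses per second, with an optional additive correction to the count."""
    return (count + correction) / seconds

# Hypothetical totals with the same underlying rate (0.1 responses/s):
# a short cumulative CS interval and a long cumulative overall interval.
cs_plain = rate(10, 100)
overall_plain = rate(100, 1000)

# Adding 0.5 to both counts inflates the short-interval rate far more.
cs_corrected = rate(10, 100, correction=0.5)            # 10.5 / 100
overall_corrected = rate(100, 1000, correction=0.5)     # 100.5 / 1000

apparent_difference = cs_corrected - overall_corrected
```

      Without the correction the two rates are equal. With it, the CS rate is raised by 5% and the overall rate by only 0.5%, producing an apparent CS advantage that comes entirely from the correction.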

      (3d) Although the authors explain that nDkl was set to negative if pre-CS rate is higher than CS rate, this is not included in the code because the code calculates the nDkl using the truncated version, starting to accumulate the poke numbers and time from the earliest evidence, thus cumulative CS rate is always higher than cumulative contextual rate. I expect then that the cumulative CS rate will be always higher than the cumulative pre-CS rate.

      Yes, that is correct. The negative sign is added to the nDkl when it is computed starting from Trial 1. But when it is computed starting from the trial when the CS rate is permanently > the overall rate, there is no need to add a sign because the divergence is always in the positive direction.

      (3e) Regarding the Wilcoxon signed rank test, please clarify in the manuscript that the input ‘rate’ is not the cumulative rate as used for the earliest evidence. Please also clarify if the rates being compared for the signed nDkl are just the instantaneous rates or the cumulative ones. I believe that these are the ‘cumulative’ ones (not as for Wilcoxon signed rank test), because if not, the signed nDkl curve of rat 3 would fluctuate a lot across the x-axis.

      The reviewer is correct in both cases. However, as already mentioned, we have removed the analysis involving the Wilcoxon test. The description of the nDkl already specifies that this was done using the cumulative rates.

      (4) Supplemental table ‘nDkl Acquisition Table.xlsx’ 3rd column (“Earliest”) descriptions are unclear.

      (a) It is described in the supplemental ‘Explanation of Excel Tables.docx’ as the ‘earliest estimate of the onset of a poke rate during the CSs higher than the contextual poke rate’, while the last paragraph of the manuscript’s method section says ‘Columns 4, 5 and 6 of the table give the trial after which conditioned responding appeared as estimated in the above described three different ways— by the location of the minimum in the nDkl, the last upward 0 crossings, and the CS parse consistently greater than the ITI parse, respectively. Column 3 in that table gives the minimum of the three estimates.’ I plotted the data from column 3 (right) and comparing them with Fig 3A (left) makes it clear that there’s an issue in this column. If the description in the ‘Explanation of Excel Tables.docx’ is incorrect, please update it.

      We agree that the naming of these criteria can cause confusion, hence we have changed them. On page 9 we have replaced “earliest” with “first” in describing the criterion plotted in Figure 2A showing the trial starting from which the cumulative CS response rate permanently exceeded the cumulative overall rate. What is labelled as “Earliest” in “Acquisition_Table.xlsx” is, as the explanation says, the minimum value across the 3 estimates in that table.

      (b) Also, the term ‘contextual poke rate’ in the 3rd column’s description is confusing as in the nDkl calculation it represents the poke rate during all the training time, while in the first paragraph of the ‘Data analysis’ part, the earliest evidence is calculated by comparing the ITI (pre-CS baseline) poke rate.

      Yes, we have kept the term “contextual” response rate to refer to responding across the whole training interval (the ITI and the CS duration). This is used in calculation of the nDkl. For consistency with this comparison, we now take the first estimate of acquisition (in Fig 2A) based on a comparison between the CS rate and the overall (context) rate (not the pre-CS rate).

      Reviewer #2 (Recommendations for the authors):

      In response to the Rebuttal comments:

      Analytical (1) relating to Figure 3C/D

      This is a reasonable set of alternative analyses, but it is not clear that it answers the original comment regarding why the fit was worse when using a theoretically derived measure. Indeed, Figure 3C now looks distinctly different to the original Gibbon and Balsam data in terms of the shape of the relationship: specifically, the Group Medians (filled orange circles) diverge from the black regression line.

      As mentioned in response to Reviewer 1, there was a mistake in Figure 3C of the revised manuscript. The figure was actually plotting data using a more stringent criterion of nDkl > 5.4, corresponding to p<0.001. The figure was referencing the data in column J of the public Trials2Acquisition Table. The data previously plotted in Figure 3C are no longer plotted because we no longer attempt to identify a criterion exactly matching that used by Gibbon and Balsam.

      We agree that the data shown in the first 3 panels of Figure 2 do diverge somewhat from the black regression line at the highest levels of informativeness (C/T ratios > 70), and the regression lines fitted to the data have slopes greater than -1. We acknowledge this on page 14 of the revised manuscript. Since Gibbon and Balsam did not report data from groups with such high ratios, we can’t know whether their data too would have diverged from the regression line at this point. We now report in the text a regression fitted to the first 10 groups in our experiment, which have C/T ratios that coincide with those of Gibbon and Balsam, and those regression lines do have slopes much closer to -1 (and include -1 in the 95% confidence intervals). We believe the divergence in our data at the high C/T ratios may be due to the fact that our rats were not given magazine training before commencing training with the CS and food. Because of this, it is quite likely that many rats did not find the food immediately after delivery on the first few trials. Indeed, in subsequent experiments, when we have continued to record magazine entries after CS-offset, we have found that rats can take 90 s or more to enter the magazine after the first pellet delivery. This delay would substantially increase the effective CS-US interval, measured from CS onset to discovery of the food pellet by the rat, making the CS much less informative over those trials. We now make this point on pages 14-15 of the revised manuscript.

      Analytical (2)

      We may have very different views on the statistical and scientific approaches here.

      This scalar relationship may only be uniquely applicable to the specific parameters of an experiment where CS and US responding are measured with the same behavioral response (magazine entry). As such, statements regarding the simplicity of the number of parameters in the model may simply reflect the niche experimental conditions required to generate data to fit the original hypotheses.

      The extent to which our data are consistent with the data reported decades ago by Gibbon and Balsam indicates that the scalar relationship they identified is not unique to certain niche conditions, since those special conditions would have to be true of both the acquisition of sign-tracking responses in pigeons and magazine entry responses in rats. How broadly it applies will require further experimental work using different paradigms and different species to assess how the rate of acquisition is affected across a wide range of informativeness, just as we have done here.

    1. Reviewer #3 (Public review):

      Summary:

      The authors aimed to overcome the challenges associated with complex, conventional prokaryotic cell-free protein synthesis (CFPS) systems, which require up to thirty-five components, by developing a streamlined and efficient E. coli CFPS platform to encourage broader adoption. The main objective was to reduce the number of reaction components from thirty-five to seven, while also developing an accessible 'fast lysate' preparation protocol that eliminates time-consuming runoff and dialysis steps. The authors also sought to demonstrate the robustness and translational quality of this streamlined system by efficiently synthesising challenging functional proteins, including the cytotoxic restriction endonuclease BsaI and the self-assembling intermediate filament protein vimentin.

      Strengths:

      This study presents several key strengths of the optimised E. coli cell-free protein synthesis system in terms of its design, performance and accessibility.

      (1) The reaction mixture has been dramatically simplified, with the number of essential core components successfully reduced from up to thirty-five in conventional systems to just seven.

      (2) The "fast lysate" protocol is a significant advance in terms of procedure.

      (3) The system's ability to synthesise challenging, functional proteins is evidence of its robustness.

      Weaknesses:

      (1) Title: "A simplified and highly efficient cell-free protein synthesis system for prokaryotes".

      (a) This title is misleading since one would expect a simplified and highly efficient cell-free protein synthesis system to yield similar protein levels compared to current cell-free protein synthesis systems. What this study shows is that the composition of cell-free protein synthesis systems can be simplified while maintaining a certain level of protein synthesis. Here, optimisation does not involve maintaining protein synthesis yield while simplifying the cell-free protein synthesis system; rather, it involves developing a simplified cell-free protein synthesis system. As mentioned in my comments below, this study lacks a comparison of protein levels with a typical cell-free protein synthesis system.

      (b) What do the authors mean by "highly efficient"? Highly efficient compared to what experimental conditions? If one is interested in the yield of protein synthesis, is this simplified system highly efficient compared to current systems?

      (2) Figures 1, 3-5 :

      (a) What do relative luciferase units represent? How are these units calculated?

      (b) In this system, the level of expression depends mainly on the level of NLuc transcripts and the efficiency of NLuc translation. How did the authors ensure that the chemical composition of the different eCFPS buffers only affected protein translation and not transcript levels? In other words, are luciferase units solely an indicator of protein synthesis efficiency, or do they also depend on transcription efficiency, which could vary depending on the experimental conditions?

      (c) How long were the eCFPS reactions allowed to proceed before performing the luciferase activity measurement? Depending on the reaction time, the absence or presence of certain compounds may or may not impact NLuc expression. For example, it can be assumed that tRNA does not significantly affect NLuc levels over a short period of time, and that endogenous tRNA in the lysate is present at sufficient concentrations. However, over a longer period of time, the addition of tRNA could be essential to achieve optimal NLuc levels.

      (d) The authors show that tRNA and amino acids are not strictly essential for the expression of NLuc, likely due to residual amounts within the cell lysate. However, are the protein levels achieved without added amino acids and tRNA sufficient for biochemical assays that require a certain amount of protein? It is important to note that the focus here is on optimising the simplicity of the buffer rather than the level of protein expression. In fact, the simplicity of the buffer is prioritised over the amount of protein produced. This should be made clear.

      (e) How would the NLuc level compare if all the components were optimised individually and present in an optimised buffer, compared to a buffer optimised for simplicity as described by the authors?

      (3) Line 71, Streamlining eCFPS: removal of dispensable components. This title is misleading because it creates the false impression that proteins can be produced in vitro without the addition of certain compounds. While this is true, the level of protein produced may not be sufficient for subsequent biochemical analyses. This should be made clear.

      (4) Figure 2: In the legend, "(A) Protein expression levels of the eCFPS system measured at varying concentrations of KGlu and MgGlu2" would be more accurate if changed to "(A) Protein expression levels of the eCFPS system using a Nanoluciferase (NLuc) reporter DNA measured at varying concentrations of KGlu and MgGlu2".

      (5) Lines 302-303: "The thorough optimization of the seven core components was a critical step in achieving high protein expression levels". What are "high expression levels"? Compared to what?

    2. Author response:

      Thank you for overseeing the review of our manuscript and for providing the eLife Assessment and Public Reviews. We are highly appreciative of the detailed, constructive feedback from the editors and reviewers.

      We acknowledge the core issues raised and we are committed to undertaking the necessary experiments and textual revisions to address every critique.

      Here is a summary of the key revisions we plan to undertake to address the major points raised:

      (1) Absolute yield comparison and efficiency clarification (eLife Assessment, R#3)

      We will perform new quantitative experiments to provide the absolute protein yield of our optimized eCFPS system and benchmark it against a published, widely recognized high-yield CFPS protocol. This will directly address the central requirement for industry comparison and strengthen the claim of "high efficiency." Furthermore, we will revise the manuscript's terminology, especially in the title and abstract, to accurately reflect the system's success in "streamlining" and "robustness" in addition to performance.

      (2) Mechanistic rationale for simplification (eLife Assessment, R#1)

      We will substantially expand the Discussion to provide a mechanistic explanation for why activity is maintained after removing up to 28 components. This analysis will focus on the retention of endogenous metabolic enzymes and residual factors within the "Fast Lysate," citing relevant literature (e.g., Yokoyama et al., 2010, as suggested by R#1) to support the role of metabolic pathways in compensating for the lack of exogenous tRNA, CTP/UTP, and specific amino acids.

      (3) Transcription-translation coupling (R#3)

      To address the concern that expression changes might be due to transcription rather than translation efficiency, we will perform control experiments to monitor mRNA levels under key optimized conditions. This will help confirm that the observed efficiency changes are primarily attributable to translation.

      (4) Data presentation and completeness (R#2)

      We will revise the presentation of data in figures (e.g., Figure 2) to use appropriate graph types for discrete data and ensure all units, incubation times, and conditions are clearly and consistently specified. Furthermore, we will add a paragraph to the Discussion addressing the study's limitations, specifically the potential implications of DTT removal for certain protein types.

      We are confident that these planned revisions will address the reviewers' recommendations and result in a stronger manuscript.

    1. Reviewer #1 (Public review):

      Summary:

      The authors have created a new model of KCNC1-related DEE in which a pathogenic patient variant (A421V) is knocked into mouse in order to better understand the mechanisms through which KCNC1 variants lead to DEE.

      Strengths:

      (1) The creation of a new DEE model of KCNC1 dysfunction.

      (2) In vivo phenotyping demonstrates key features of the model such as early lethality and several types of electrographic seizures.

      (3) The ex vivo cellular electrophysiology is very strong and comprehensive including isolated patches to accurately measure K+ currents, paired recording to measure evoked synaptic transmission, and the measurement of membrane excitability at different time points and in two cell types.

      (4) 2P imaging relates the cellular dysfunction in PV neurons to epilepsy.

    2. Reviewer #3 (Public review):

      Summary:

      Here Wengert et al. establish a rodent model of KCNC1 (Kv3.1) epilepsy by introducing the A421V mutation. The authors perform video-EEG, slice electrophysiology, and in vivo 2P imaging of calcium activity to establish a disease mechanism involving impairment in the excitability of fast spiking parvalbumin (PV) interneurons in the cortex and thalamic PV cells.

      Outside-out nucleated patch recordings were used to evaluate the biophysical consequence of the A421V mutation on potassium currents and showed a clear reduction in potassium currents. Similarly, action potential generation in cortical PV interneurons was severely reduced. Given that both potassium currents and action potential generation were found to be unaffected in excitatory pyramidal cells in the cortex, the authors propose that loss of inhibition leads to hyperexcitability and seizure susceptibility in a mechanism similar to that of Dravet Syndrome.

      Strengths:

      This manuscript establishes a new rodent model of KCNC1-developmental and epileptic encephalopathy. The manuscript provides strong evidence that parvalbumin interneurons are impaired by the Kcnc1-A421V mutation and that cortical excitatory neurons are not impaired. Together, these findings support the conclusion that seizure phenotypes associated with Kcnc1-A421V are caused by impaired cortical inhibition.

      Weaknesses:

      The manuscript identifies a partial mechanism of disease that leaves several aspects unresolved, including the possible role of subcortical regions in the seizure mechanism. Similarly, while the authors identify a reduction in potassium currents and a reduction in PV cell surface expression of Kv3.1, why the A421V missense mutation leads to a more severe phenotype than previously reported loss-of-function mutations in Kv3.1 is not clear.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):           

      Summary:

      The authors have created a new model of KCNC1-related DEE in which a pathogenic patient variant (A421V) is knocked into a mouse in order to better understand the mechanisms through which KCNC1 variants lead to DEE.  

      Strengths:

      (1)  The creation of a new DEE model of KCNC1 dysfunction. 

      (2) In vivo phenotyping demonstrates key features of the model such as early lethality and several types of electrographic seizures.

      (3)  The ex vivo cellular electrophysiology is very strong and comprehensive including isolated patches to accurately measure K+ currents, paired recording to measure evoked synaptic transmission, and the measurement of membrane excitability at different time points and in two cell types.

      We thank Reviewer 1 for these positive comments related to strengths of the study.   

      Weaknesses:

      (1) The assertion that membrane trafficking is impaired by this variant could be bolstered by additional data.

      We agree with this comment. However, given the technical challenges of standard biochemical experiments for investigating voltage-gated potassium channels (e.g., antibody quality), the lack of a Kv3.1-A421V specific antibody, and the fact that Kv3.1 is expressed in only a small subset of cells, we did not undertake this approach. Instead, we performed additional experiments and analysis to improve the rigor of the experiments supporting our conclusion that membrane trafficking is impaired in the Kcnc1-A421V/+ mouse. 

      Such experiments support a highly significant and robust difference in our (albeit imperfect) measurement of the membrane:cytosol ratio of Kv3.1 immunofluorescence between WT and Kcnc1-A421V/+ mice, which is consistent with lack of membrane trafficking (Figure 3). In the revised manuscript, we have added additional data points to this plot and updated the representative example images using improved imaging techniques to better showcase how Kcnc1-A421V/+ PV-INs differ from age-matched WT littermate controls. We think the result is quite clear. Future biochemical experiments perhaps best performed in a culture system in vitro could provide additional support for this conclusion.

      (2) In some experiments details such as the age of the mice or cortical layer are emphasized, but in others, these details are omitted.

      We apologize for this omission. We have now clarified the age of the mice and cortical layer for each experiment in the Methods and Results sections as well as figure legends.   

      (3) The impairments in PV neuron AP firing are quite large. This could be expected to lead to changes in PV neuron activity outside of the hypersynchronous discharges that could be detected in the 2-photon imaging experiments, however, a lack of an effect on PV neuron activity is only loosely alluded to in the text. A more formal analysis is lacking. An important question in trying to understand mechanisms underlying channelopathies like KCNC1 is how changes in membrane excitability recorded at the whole cell level manifest during ongoing activity in vivo. Thus, the significance of this work would be greatly improved if it could address this question.

      Yes, the impairments in the neocortical PV-IN excitability are notably severe relative to other PV interneuronopathies that we and others have directly investigated (e.g., Kv3.1-/- or Kv3.2-/- knockout mice; Scn1a+/- mice). In the revised version of the manuscript, we have now added a more thorough investigation and analysis of our in vivo 2P calcium imaging data of PV-IN (and presumptive excitatory cell) neural activity (Figure 8 and Supplementary Figure 9; Methods, lines 230-271; Results, lines 630-657; and Discussion, lines 795-814). 

      Because of the prominent recruitment of neuropil during presumptive myoclonic seizures, further investigation of individual neuronal excitability in vivo required a slightly different labeling strategy now using a soma-tagged GCaMP8m as well as a separate AAV containing tdTomato driven by the PV-IN-specific S5E2 enhancer. Our new results reveal an increase in the baseline calcium transient frequency in non-PV-INs, and reduced mean transient amplitudes in both non-PV cells and PV-INs. These interesting findings, which are consistent with attenuated PV-IN-mediated perisomatic inhibition leading to disinhibited excitatory cells in the Kcnc1-A421V/+ mice, link our in vivo results to the slice electrophysiology experiments. Of course, there are residual issues with the application of this technique to interneurons and the ability to resolve individual or small numbers of spikes, which likely explains the lack of genotype difference in calcium transient frequency in PV-INs.

      (4) Myoclonic jerks and other types of more subtle epileptiform activity have been observed in control mice, but there is no mention of littermate controls analyzed by EEG. 

      We performed additional experiments as requested and did not observe myoclonic jerks or any other epileptic activity in WT control mice. We have included this data in the revised manuscript (Figure 9C).   

      Reviewer #2 (Public review):           

      Summary:

      Wengert et al. generated and thoroughly characterized the developmental epileptic encephalopathy phenotype of Kcnc1-A421V/+ knock-in mice. The Kcnc1 gene encodes the Kv3.1 channel subunit. Analogous to the role of BK channels in excitatory neurons, Kv3 channels are important for the recurrent high-frequency discharge in interneurons by accelerating the downward hyperpolarization of the individual action potential. Various Kcnc1 mutations are associated with developmental epileptic encephalopathy, but the effect of a recurrent A421V mutation was somewhat controversial and its influence on neuronal excitability has not been fully established. In order to determine the neurological deficits and underlying disease mechanisms, the authors generated Cre-dependent KI mice and characterized them using neonatal neurological examination, high-quality in vitro electrophysiology, and in vivo imaging/electrophysiology analyses. These analyses revealed excitability defects in the PV+ inhibitory neurons associated with the emergence of epilepsy and premature death. Overall, the experimental data convincingly support the conclusion.

      Strengths:

      The study is well-designed and conducted at high quality. The use of the Cre-dependent KI mouse is effective for maintaining the mutant mouse line with premature death phenotype, and may also minimize the drift of phenotypes which can occur due to the use of mutant mice with minor phenotype for breeding. The neonatal behavior analysis is thoroughly conducted, and the in vitro electrophysiology studies are of high quality.

      We appreciate these positive comments from Reviewer 2. 

      Weaknesses:

      While not critically influencing the conclusion of the study, there are several concerns.

      In some experiments, the age of the animal is not clearly stated. For example, the experiments in Figure 2 demonstrate impaired K+ conductance and membrane localization, but it is not clear whether they correlated with the excitability and synaptic defects shown in subsequent figures. Similarly, it is unclear at what age the authors conducted EEG recordings, and whether non-epileptic mice are younger than those with seizures. 

      We have now updated the manuscript to include a clear report of age for all experiments including the impaired K<sup>+</sup> conductance (now Figure 3) and EEG (now Figure 9). There was no intention to omit this information. The recordings of K<sup>+</sup> conductance impairments in PV-INs from Kcnc1-A421V/+ mice were completed at P16-21. Thus, we interpret the loss of potassium current density to be causally linked with the impairments in intrinsic physiological function at that same time-period in neocortical layer II-IV PV-INs and more subtly in PV-positive cells in the RTN and neocortical layer V PV-INs.

      Mice used in the EEG experiments were P24-48, an age range which roughly corresponded with the midpoint on the survival curve for Kcnc1-A421V/+ mice. Although we saw significant mouse-to-mouse variability in seizure phenotype, no Kcnc1-A421V/+ mice completely lacked epilepsy or marked epileptiform abnormalities, neither of which were seen in WT mice. We did not detect a clear relationship between seizure frequency/type and mouse age. 

      The trafficking defect of mutant Kv3.1 proposed in this study is based only on the fluorescence density analysis which showed a minor change in membrane/cytosol ratio. It is not very clear how the membrane component was determined (any control staining?). In addition to fluorescence imaging, an addition of biochemical analysis will make the conclusion more convincing (while it might be challenging if the Kv3.1 is expressed only in PV+ cells).

      This relates to comment 3 of Reviewer 1. We agree that, in the initial submission of the manuscript, the evidence from IHC for Kv3.1 trafficking deficits was somewhat subtle. In the revised version of the paper, we have gathered additional replicates of this original experiment with improved imaging quality and clarify how the membrane component was specified, to now show a robust and highly significant (***P<0.001) decrease in membrane:cytosol Kv3.1 ratio. We have also now provided new example images better showcasing the deficits observed in the Kcnc1-A421V/+ mice (Figure 3). The membrane compartment was defined as the outermost 1 micron of the parvalbumin-defined cell soma (drawn blind to the Kv3.1b signal), and, importantly, all analysis was conducted blinded to mouse genotype. These measures help to ensure that the result is robust and unbiased. Nonetheless, we have added a paragraph in the Discussion section highlighting the limitations of our IHC evidence for trafficking impairment (Lines 868-883). 

      While the study focused on the superficial layer because Kv3.1 is the major channel subunit, the PV+ cells in the deeper cortical layer also express Kv3.1 (Chow et al., 1999) and they may also contribute to the hyperexcitable phenotype via negative effect on Kv3.2; the mutant Kv3.1 may also block membrane trafficking of Kv3.1/Kv3.2 heteromers in the deeper layer PV cells and reduce their excitability. Such an additional effect on Kv3.2, if present, may explain why the heterozygous A421V KI mouse shows a more severe phenotype than the Kv3.1 KO mouse (and why they are more similar to Kv3.2 KO). Analyzing the membrane excitability differences in the deep-layer PV cells may address this possibility.

      We appreciate this thoughtful suggestion. We have now provided data from neocortical layer V PV interneurons in the revised manuscript (Supplementary Figure 5). Abnormalities in intrinsic excitability from neocortical layer V PV-INs in Kcnc1-A421V/+ mice were present, but less pronounced than in PV-INs from more superficial cortical layers. These results are consistent with the view that greater relative expression of Kv3.2 “dilutes” the impact of the Kv3.1 A421V/+ variant. Whether the A421V/+ variant impairs membrane trafficking and/or gating of Kv3.2 remains unclear. 

      We attempted to assess how the mutant Kv3.1 affects Kv3.2 localization, but were unsuccessful due to the lack of reliable antibodies. After immunostaining mouse brain sections with two different anti-Kv3.2 antibodies, only one produced a somewhat promising signal (see below). However, even in this case, Kv3.2 staining was successful only once (out of five independent staining experiments) and the signal varied across cortical regions, showing widespread cellular Kv3.2 signal in some areas (b, top panel), and barely detectable signal in others, regardless of Kv3.1 expression. In the remaining four attempts, we detected only ‘fiber-like’ immunostaining signal, further diminishing our confidence in the anti-Kv3.2 antibody, although results could be improved with still further testing and refinement, which we will attempt. Consequently, this important question remains unresolved in this study. 

      Author response image 1.

      Immunostaining of Kv3.1 and Kv3.2 in sagittal mouse brain sections. a) An example of intracellular Kv3.2 immunostaining signal, variable across the cortex of a WT mouse independent of Kv3.1 expression. b) Kv3.2 is detectable intracellularly in most of the cells in the top panel but barely detectable in the lowest panel. c) Representative image of Kv3.2 immunostaining signal in other sagittal mouse brain sections.

      We have discussed these important implications and limitations of our results in the Discussion (Lines 868-883). We agree with the Reviewer’s interpretation that an impact on Kv3.1/Kv3.2 heteromultimers across the neocortex may explain why the Kcnc1-A421V/+ mouse exhibits a more severe phenotype than Kv3.1-/- or Kv3.2-/- mice (see below), a view which we have attempted to further clarify in the Conclusion.    

      In Table 1, the A421V PV+ cells show a resting membrane potential that is depolarized relative to WT by ~5 mV, which seems a robust change and would influence the circuit excitability. The authors measured firing frequency after adjusting the membrane voltage to -65 mV, but are the excitability differences less significant if the resting potential is not adjusted? It is also interesting that such a membrane potential difference is not detected in young adult mice (Table 2). This loss of potential compensation may be important for developmental changes in the circuit excitability. These issues can be more explicitly discussed.

      We do not entirely understand this finding and its apparent developmental component. It could be compensatory, as suggested by the Reviewer; however, it is transient and seems to be an isolated finding (i.e., it is not accompanied by compensation in other properties). It is also possible that this change in Kcnc1-A421V/+ PV-INs may reflect impaired/delayed development. We cannot test excitability at a meaningfully later time point as the mice are deceased.

      The revised version of the manuscript contains additional data (Supplementary Figure 4) showing that major deficits in intrinsic excitability are still observed even when the resting membrane potential is left unadjusted. These results are further discussed in the Results section (lines 522-523) and the Discussion section (lines 727-731).   

      Reviewer #3 (Public review):           

      Summary:

      Here, Wengert et al. establish a rodent model of KCNC1 (Kv3.1) epilepsy by introducing the A421V mutation. The authors perform video-EEG, slice electrophysiology, and in vivo 2P imaging of calcium activity to establish disease mechanisms involving impairment in the excitability of fast-spiking parvalbumin (PV) interneurons in the cortex and thalamic PV cells.

      Outside-out nucleated patch recordings were used to evaluate the biophysical consequence of the A421V mutation on potassium currents and showed a clear reduction in potassium currents. Similarly, action potential generation in cortical PV interneurons was severely reduced. Given that both potassium currents and action potential generation were found to be unaffected in excitatory pyramidal cells in the cortex, the authors propose that loss of inhibition leads to hyperexcitability and seizure susceptibility in a mechanism similar to that of Dravet Syndrome.  

      Strengths: 

      This manuscript establishes a new rodent model of KCNC1-developmental and epileptic encephalopathy. The manuscript provides strong evidence that parvalbumin-type interneurons are impaired by the A421V Kv3.1 mutation and that cortical excitatory neurons are not impaired. Together, these findings support the conclusion that seizure phenotypes are caused by reduced cortical inhibition.

      We thank Reviewer 3 for their view of the strengths of the study.

      Weaknesses:

      The manuscript identifies a partial mechanism of disease that leaves several aspects unresolved, including the possible role of the observed impairments in thalamic neurons in the seizure mechanism. Similarly, while the authors identify a reduction in potassium currents and a reduction in PV cell surface expression of Kv3.1, it is not clear why these impairments would lead to a more severe disease phenotype than other loss-of-function mutations which have been characterized previously. Lastly, additional analysis of video-EEG data would be helpful for interpreting the extent of the seizure burden and the nature of the seizure types caused by the mutation.

      We agree with these comments from Reviewer 3. We studied neurons in the reticular thalamus and layer V neocortical PV-INs since they are also linked to epilepsy pathogenesis and are known to express Kv3.1. However, for most of the study, we focused on neocortical layer II-IV PV-INs, because these cells exhibited the most robust impairments in intrinsic excitability. Crossing our novel Kcnc1-Flox(A421V)/+ mice to a cerebral cortex interneuron-specific driver that would avoid recombination in the thalamus, such as Ppp1r2-Cre (RRID:IMSR_JAX:012686), could assist in determining the relative contribution of thalamic reticular nucleus dysfunction to the overall phenotype, as done by Makinson et al. (2017) to address a similar question; however, we have been unable to obtain this mouse despite extensive effort. There are of course other Kv3.1-expressing neurons in the brain, including in the hippocampus, amygdala, and cerebellum, and we have provided additional discussion (Lines 731-736) of this issue.

      We further agree with the Reviewer that a major question in the field of KCNC1-related neurological disorders is the mechanistic underpinning of why the KCNC1-A421V variant leads to a more severe disease phenotype than other loss-of-function KCNC1 variants, and, further, why the mouse phenotype is more severe than the Kcnc1 knockout. Previous results and our own recordings in heterologous systems suggest that the A421V variant is more profoundly loss of function than the R320H variant (Oliver et al., 2017; Cameron et al., 2019; Park et al., 2019), which is consistent with A421V having a more severe disease phenotype. Relative to knockout of Kv3.1, our results are consistent with the view that the A421V variant exhibits dominant negative activity by reducing surface expression of Kv3.1 and/or Kv3.2 (an effect that would not occur in knockout mice), with a possible additional contribution of impaired gating of Kv3.1/Kv3.2 heteromultimers that incorporate A421V subunits into the heterotetramer. Our finding that the magnitude of total potassium current was reduced in PV-INs by ~50% is consistent with a combination of these various mechanisms but does not distinguish between them.

      In the revised version of the manuscript, we have provided a more complete discussion of these important remaining questions regarding our interpretation of how the severity of KCNC1 disorders relates to the biophysical features of the ion channel variant (lines 868-883).

      Recommendations for the authors

      Reviewer #1 (Recommendations for the authors):          

      Major

      (1) The authors suggest that the reduced K+ current density in Kcnc1-A421V/+ neurons is due in part to impaired trafficking and cell surface expression of Kv3.1 in these neurons. The data supporting this claim aren't completely convincing. First, it's difficult to visualize a difference in Kv3.1 localization in the images shown in panel H, and importantly, it seems problematic that the method to assess Kv3.1 levels in membrane vs. cytosol relied on using PV co-staining to define the membrane compartment as the outermost 1 um of the PV-defined cell soma. This doesn't seem to be the best method to define the membrane compartment, as the PV signal should be largely cytosolic.

      As noted above, we have completed additional data collection to confirm our results, and have performed additional imaging and updated our example images to be more representative of the observed deficits in membrane Kv3.1 expression in the Kcnc1-A421V/+ mice. We attempted to identify a marker to more clearly label the membrane to combine with PV immunocytochemistry but were unable to do so despite some effort. 

      Is it possible that in control neurons, the cytosolic PV signal localizes within the membrane-bound Kv3.1 signal, with less colocalization, whereas in Kcnc1-A421V/+ neurons, there would be more colocalization of the cytosolic PV and improperly trafficked Kv3.1? Could the data be presented in this way showing altered colocalization of Kv3.1 with PV?

      We do not entirely understand the nature of this concern. In our experiments, we utilized the PV signal to determine the cell membrane and cytosolic compartments in an unbiased manner using a 1-micron shell traced around/outside the edge of the PV signal to define the membrane compartment, with the remainder of the area (minus the nuclear signal defined by DAPI) defined as the cytosol (see Methods 176-186). Because we did not identify any alterations in PV signal or correlation between PV immunohistochemistry and tdTomato expression in Cre reporter strains between WT and Kcnc1-A421V/+ mice, we believe that our strategy for determining membrane:cytosol ratio of Kv3.1 in an unbiased manner is acceptable (albeit of course imperfect). 
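
      The compartment definition described in this response can be illustrated with a minimal sketch. It is not the authors' analysis code. It assumes that the soma and nucleus are already available as boolean masks (from PV and DAPI, respectively), that the membrane shell is the outermost 1 micron inside the soma mask (the response also describes the shell as traced around/outside the PV edge, which would shift the shell outward), and that the ratio is taken between mean intensities. The names `erode` and `membrane_cytosol_ratio` and all example values are hypothetical.

```python
def erode(mask):
    """Shrink a boolean mask by one pixel (4-neighbour erosion)."""
    h, w = len(mask), len(mask[0])
    return [[bool(mask[y][x] and 0 < y < h - 1 and 0 < x < w - 1
                  and mask[y - 1][x] and mask[y + 1][x]
                  and mask[y][x - 1] and mask[y][x + 1])
             for x in range(w)] for y in range(h)]


def membrane_cytosol_ratio(kv, soma, nucleus, pixel_um, shell_um=1.0):
    """Mean Kv3.1 intensity in the outer shell divided by mean in the cytosol.

    Assumption: shell = soma pixels removed by eroding the soma mask by
    shell_um; cytosol = remaining soma core minus the nuclear mask.
    """
    steps = max(1, round(shell_um / pixel_um))
    core = soma
    for _ in range(steps):
        core = erode(core)
    shell_vals, cyto_vals = [], []
    for y, row in enumerate(kv):
        for x, value in enumerate(row):
            if soma[y][x] and not core[y][x]:
                shell_vals.append(value)      # membrane compartment
            elif core[y][x] and not nucleus[y][x]:
                cyto_vals.append(value)       # cytosolic compartment
    if not shell_vals or not cyto_vals:
        raise ValueError("empty membrane or cytosol compartment")
    return (sum(shell_vals) / len(shell_vals)) / (sum(cyto_vals) / len(cyto_vals))


# Synthetic example: 7x7 soma, 3x3 nucleus, 1 um pixels, brighter outer ring.
size = 9
soma = [[1 <= y <= 7 and 1 <= x <= 7 for x in range(size)] for y in range(size)]
nucleus = [[3 <= y <= 5 and 3 <= x <= 5 for x in range(size)] for y in range(size)]
kv = [[4.0 if soma[y][x] and (y in (1, 7) or x in (1, 7)) else 2.0
       for x in range(size)] for y in range(size)]
ratio = membrane_cytosol_ratio(kv, soma, nucleus, pixel_um=1.0)
```

      In this synthetic cell the shell is twice as bright as the cytosol, so the ratio is 2.0; a trafficking deficit of the kind discussed here would appear as a lower ratio. Because the shell is derived from the PV mask alone, the compartments are fixed before the Kv3.1 channel is examined.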

      Alternatively, membrane fractionation could be performed on WT vs Kcnc1-A421V/+ neurons, followed by Western blotting with a Kv3.1 antibody to show altered proportions in the cytosolic vs. membrane protein fractions. It's important that these results are convincing, as the findings are mentioned in the Abstract, the Results section, and multiple times in the Discussion, although it is still unclear how much the potential altered trafficking contributes to the decrease in K+ currents versus changes in channel gating.

      Multiple technical barriers made it difficult for us to gain direct biochemical evidence for altered trafficking of the A421V/+ Kv3.1 variant (see above). It is not clear how membrane fractionation techniques could be easily applied in this case (at least by us) when PV-INs constitute 3-5% of all neocortical neurons. We further agree (as noted above) that it is difficult to properly disentangle the relative roles of impaired membrane trafficking vs. gating deficits to the observed effect; however, we think that both phenomena are likely occurring. In the revised version of the manuscript, we have more explicitly discussed these limitations in the Discussion section (Lines 868-883).   

      (2) More information is needed regarding the age of mice used for experiments for the following results (added to the Results section as well as figure legends):

      PV density (Supplementary Figure 1) 

      K+ current data (Figure 2A-G)       

      Kv3.1 localization (Figure 2H and I)        

      RTN electrophysiology (Supplementary Figure 3)

      Excitatory neuron electrophysiology (Figure 4)             

      In vivo 2P calcium imaging (Figure 7) 

      Video-EEG (Figure 8)

      We apologize for omitting this critical information. In the revised manuscript, we have provided the age of mice for each of our experiments in the results section, in the figure legend, and in the methods section.   

      (3) It's unclear why developmental milestones/behavioral assessments were only done at P5-P10. In the previous publication of another Kcnc1 LOF variant (Feng et al. 2024), no differences were found at P5-P10, and it was suggested in the discussion that this finding was "consistent with the known developmental expression pattern of Kv3.1 in mouse, where Kv3.1 protein does not appear until P10 or later". In that paper, they did find behavioral deficits at 2-4 months. Even though this model is more severe than the previous model, it would be interesting to determine if there are any behavioral deficits at a later time point (especially as they find more neurophysiological impairments at P32-P42).

      As in our previous study, the lack of clear behavioral deficits in developmental milestones from P5-15 is potentially expected considering the developmental expression of Kv3.1, and we performed these experiments primarily to showcase that the Kcnc1-A421V/+ mice exhibit otherwise normal overall early development (although this could be an artifact of the sensitivity of our testing methods).

      For the revised manuscript, we have conducted additional experiments to investigate behavioral deficits in adult Kcnc1-A421V/+ mice. We found cognitive/learning deficits in Kcnc1-A421V/+ mice relative to WT in both the Barnes maze (Figure 2A-C) and Y-maze (Figure 2D-F). Other aspects of animal behavior including cerebellar-related motor function are likely also impaired at post-weaning timepoints, and will be included in a forthcoming research study focusing on the motor function in these mice.  

      (4) In the Results section, it should be more clearly stated which cortical layer/layers are being studied. In some cases, it mentions layers 2-4, and in some, only layer 4, and in others, it doesn't mention layers at all. Toward the beginning of the Results section, the rationale for focusing on layers 2-4 to assess the effects of this variant should be well described and then, for each experiment, it should be stated which cortical layers were assessed. Related to this point, it seems electrophysiology was only done in layer 4; the rationale for this should also be included.

      We have now clarified which neocortical layers were under investigation in the study. All PV-INs were targeted in somatosensory layers II-IV, while excitatory neurons were either cortical layer IV spiny stellate cells or pyramidal cells. Paired recordings were also completed in layer IV. We have also more explicitly articulated our rationale for looking at PV-INs in layers II-IV to examine the cellular/circuit-level impact of Kv3.1 in a model of developmental and epileptic encephalopathy (Lines 487-491). 

      (5) Kcnc1-A421V/+ PV neurons showed more robust impairments in AP shape and firing at P32-42 than at P16-21 (Figure 3), and only showed synaptic neurotransmission alterations at P32-42 (Figure 6). Thus, it's unclear why Kcnc1-A421V/+ excitatory neurons were only assessed at P16-21 (Figure 4 and Supplementary Figure 4 related to Figure 5), particularly if only secondary or indirect effects on this population would be expected.

      We appreciate this excellent point raised by the Reviewer and we have taken the suggestion to examine excitatory neurons at P32-42 in addition to the earlier juvenile timepoint. Our new results from the later timepoint are similar to our results at P16-21: Excitatory neurons show no statistically significant impairments in intrinsic excitability at either of the two timepoints examined (Supplementary Figure 7). This adds support to our original conclusion that PV-INs represent the major driver of disease pathology across development.   

      (6) The 2P calcium imaging experiments are potentially interesting, however, a relationship between these results and the electrophysiology results for PV neurons is lacking. Was there an attempt to assess the frequency and/or amplitude of calcium events specifically in PV neurons, outside of the hypersynchronous discharges, to determine whether there are differences between WT and Kcnc1-A421V/+, as was seen in the electrophysiological analyses? It does seem there are some key differences between the two experiments (age: later timepoint for 2P vs. P16-21 and P32-42, layer: 2/3 vs. 4, and PV marking method: virus vs. mouse line), but the electrophysiological differences reported were quite strong. Thus, it would be surprising if there were no alterations in calcium activity among the Kcnc1-A421V/+ PV neurons.

      In our initial experiments, the prominent neuropil GCaMP signal in Kcnc1-A421V/+ mice rendered it difficult to distinguish and accurately describe baseline neuronal excitability in PV-INs and non-PV cells. In our revised manuscript, we utilized a soma-tagged GCaMP8m and separately labeled PV-INs through S5E2-tdTomato. This strategy made it possible to assess the amplitude and frequency of calcium transients in both PV-positive and PV-negative cells in vivo. We have updated the description of our methods (lines 230-271) and our results (lines 630-657) in the revised manuscript.

      As noted above, our more detailed analysis of somatic calcium transients in PV-IN and non-PV cells during quiet rest (Figure 8 and Supplementary Figure 9) shows that PV-INs from Kcnc1-A421V/+ mice are abnormally excitable, having reduced transient amplitude relative to WT controls. Interestingly, non-PV cells also exhibited an increased calcium transient frequency and reduced amplitude, which is potentially consistent with reduced perisomatic inhibition causing disinhibition in cortical microcircuits. We again highlight that the slow kinetics of GCaMP combined with the calcium buffering and brief spikes of PV-INs render quantification of action potential frequency and comparisons between groups difficult.  

      (7) As mentioned above, it would be helpful to state the time points or age ranges of these experiments to better understand the results and relate them to each other. For example, the 2P imaging showed apparent myoclonic seizures in 7/7 Kcnc1-A421V/+ mice (recorded for a total of 30-50 minutes/mouse), but the video-EEG showed myoclonic seizures in only 3/11 Kcnc1-A421V/+ mice (recorded for 48-72 hours/mouse). Were these experiments done at very different age ranges, so this difference could be due to some sort of progression of seizure types and events as the mice age? Is it possible these are not the same seizure types (even though they are similarly described)? This discrepancy should be discussed.

      Mice in the EEG experiments were between the ages of P24 and 48, slightly younger than the age in which we carried out the in vivo calcium imaging experiments (>P50). Therefore, an age-related exacerbation in myoclonic jerks is possible. 

      As is highlighted by the Reviewer, it is interesting that the myoclonic seizures were only detected in a portion of the Kcnc1-A421V/+ mice during EEG monitoring (4/12). We believe that the difference is most likely driven by more sensitive detection of the myoclonic jerk activity and behavior in the 2P imaging of neuropil cellular activity compared to our video-EEG monitoring and 2P imaging of soma-tagged GCaMP. We have occasionally observed repetitive myoclonic jerking in mice that appears highly localized (i.e., one forepaw only), suggesting that the myoclonic seizures exist on a spectrum of severity from focal to diffuse. It is therefore possible that myoclonic events and electrographic activity may be slightly underestimated in our video-EEG experiments. 

      We have now added a few lines discussing this discrepancy in the Discussion (lines 809-814).   

      (8) Myoclonic jerks and other types of more subtle epileptiform activity have been observed in control mice. Was video-EEG performed on control mice? These data should be added to Figure 8.

      We have added recordings in control WT mice (N=4). We did not detect myoclonic jerks or other epileptiform activity in the control mice (Figure 9).  

      Minor

      (1) In the first Results section, Line 365, the P value (P<0.001) is different from that in the legend for Figure 1, line 743 (P<0.0001).

      We have fixed this discrepancy. 

      (2) For Supplementary Figure 1, it would be helpful to show images that span the cortical layers (1-6), as PV and Kv3.1 are both expressed across the cortical layers.

      We have updated Supplementary Figure 1 with better example images that span the cortical layers.    

      (3) Error bars should be added to the line graphs in Supplementary Figure 2, particularly panels B and C. Some of the differences appear small considering the highly significant p-values (i.e. body weight at P7 and brain weight at P21).

      The values shown in Supplementary Figure 2D-E are percentages of mice displaying a particular characteristic, so there is no variance for the data.

      Supplementary Figure 2B-C do contain error bars plotted as SEM; however, because of the large N and small degree of variance in the measurements, the error bars are not apparent in the graphs. This has been noted in the Supplementary Figure 2 legend for clarity. 

      (4) In Figure 3, although the Kcnc1-A421V/+ neurons have elevated AP amplitudes relative to WT, the representative traces for P16-21 and P32-42 groups appear strikingly opposite (traces in B and G appear to have much higher amplitudes than those in C and H). As this is one of the three AP phenotypes described, it would be nice to have it reflected in the traces.

      We have updated our example traces to better represent our main findings including AP amplitude for both P16-21 and P32-42 timepoints.  

      (5) Were any effects on the AHP assessed in the electrophysiology experiments? As other studies have reported the effects of altered Kv3 channel activity on AHP, this parameter could be interesting to report as well.

      We have now provided data on the afterhyperpolarization for each condition displayed in the Supplementary data tables. Interestingly, we failed to detect significant differences in AHP between WT and Kcnc1-A421V/+ PV-INs, RTN neurons, or pyramidal cells, although we did identify differences in the dV/dt of the repolarization phase of the AP.   

      (6) The figure legend for Figure 7 has errors in the panel labeling (D instead of C, and two Fs).

      This error has been corrected in the revised manuscript.

      Reviewer #3 (Recommendations for the authors):

      Specific comments and questions for the authors:         

      (1) Do the authors provide a reason for why the juvenile animals are unaffected by the A421V mutation? Is it that PV cells have not fully integrated at this early time point or that Kv3.1 expression is low? Is the developmental expression profile of Kv3.1 in PV cells known and if so could the authors update the discussion with this information?

      We interpret the normal early developmental milestones (P5-P15) to reflect that Kcnc1-A421V/+ mice exhibit the onset of their neurological impairment at the same time that PV-INs upregulate Kv3.1, develop a fast-spiking physiological phenotype, and integrate into functional circuits in the third and fourth postnatal weeks. We have updated the discussion (Line 780-782) with this information and more clearly describe our interpretation of these early-life behavioral experiments.   

      (2) I would like to see a more complete analysis of the Video-EEG data that is included in Figure 8. What was the seizure duration and frequency? Were there spike-wave seizure types observed? Were EEG events that involve thalamocortical circuitry affected such as spindles? Was sleep architecture impaired in the model? Were littermate control animals recorded?

      Although classical convulsive seizures represent only part of the overall epilepsy phenotype that this mouse exhibits, we agree that reporting seizure duration and frequency is important. We have now included this in our revised manuscript (line 624-626). We have also now added WT control mice to our dataset, and, as expected, we failed to observe any epileptic features in our WT recordings.

      In our EEG experiments, we did not record EMG activity in the mouse to allow for unambiguous determination of sleep vs. quiet wakefulness. For that reason, and because we believe it beyond the scope of this particular study, we did not examine sleep-related EEG phenomena such as spindles or sleep architecture. We have, however, added a line in the discussion (line 771-774) suggesting that future studies focus on a more thorough investigation of the EEG activity in these animals. 

      (3) The in vivo calcium imaging data shows synchronous bursts in A421V animals which is in agreement with the synchronous bursts observed in the EEG. Overall the analysis of the in vivo calcium imaging data appears to be rudimentary and perhaps this is a missed opportunity. What additional insights were gained from this technically demanding experiment that were not obtained from the EEG recordings?

      As noted above, in the revised version of the manuscript, we have conducted additional experiments which allowed us to separately examine PV-IN and non-PV neuron excitability via 2P in vivo calcium imaging. This required an alternative strategy to label individual neuronal somata without contamination by the robust neuropil signal that we observed in the approach undertaken in the original submission. We’ve described the details of this new approach in methods (Lines 230-271) and results section (lines 630-657).

      Our new results (Figure 8 and Supplementary Figure 9) reveal that neocortical PV-INs from Kcnc1-A421V/+ mice exhibit a reduction in calcium transient amplitude during quiet wakefulness and that non-PV cells exhibit altered transient frequency and amplitude. Overall, we believe that these results are consistent with the view that PV-IN-mediated perisomatic inhibition is compromised in Kcnc1-A421V/+ mice, which leads to downstream hyperexcitability in excitatory neurons within cortical microcircuits.

      (4) The increased severity of seizure phenotypes observed in the A421V model relative to knockout mice is interesting but also confusing given what is known about this mutation. As the authors point out, a possible explanation is that the mutation is acting in a dominant negative manner, where mutant Kv3.1 channels compete with other Kvs that would otherwise be able to partially compensate for the loss of Kv function. Alternatively, the A421V mutation might act by affecting the trafficking of heterotetrameric Kv3 channels to the membrane. Can the authors clarify why a trafficking deficit would produce a different effect than a loss of function mutation? Are the authors proposing that a hypomorphic mutation involving both a partial trafficking deficit and a dominant negative effect of those channels that are properly localized is more severe than a "clean" loss of function? The roughly 50% loss of potassium current absent a change in gating would be expected to behave like a loss-of-function mutation. This might be addressed by comparing the surface expression of the other Kv channels and/or through the use of Kv3.1-selective pharmacology.

      These are excellent points raised by the Reviewer. As noted above, we have endeavored to clarify our hypothesis as to the basis of this phenomenon, although the mechanistic basis for the more severe phenotype in the Kcnc1-A421V/+ mouse relative to the Kv3.1 knockout is not entirely clear. Our physiology results, together with the evidence supporting a trafficking impairment, are consistent with dominant negative action of the Kv3.1 A421V variant at the level of channel gating and/or trafficking. To restate, we think the Kcnc1-A421V/+ heterozygous variant is more severe than a Kv3.1 knockout for (at least) three reasons: variant Kv3.1 is incorporated into Kv3.1/Kv3.2 heterotetramers to (1) impair trafficking to the membrane as well as (2) alter the electrophysiological function of those channels that do successfully traffic to the membrane (while Kv3.1 knockout affects Kv3.1 only), and (3) the heterozygous variant may escape the compensatory upregulation of Kv3.2 that is known to occur in Kv3.1 knockout mice.

      For example, our data are consistent with the view that heterotetramers of WT Kv3.1 and Kv3.2 potentially come together with the A421V Kv3.1 subunit in the endoplasmic reticulum and then fail to traffic to the membrane due to the presence of one or more A421V subunit(s), as evidenced by increased Kv3.1 staining in the cytosol in the Kcnc1-A421V/+ mouse relative to WT. This is in contrast to what would occur in the Kv3.1 knockout mice, as there is no subunit produced from the null allele to prevent WT Kv3.2 subunits from forming fully functional Kv3.2 homotetramers that reach the cell surface and function properly. This is one specific possible mechanism for dominant negative activity.

      A non-mutually-exclusive mechanism is that inclusion of one or more Kv3.1 A421V subunits into Kv3 heterotetramers impairs gating and prevents potassium flux such that, even if the tetramer does reach the membrane, that entire tetramer fails to contribute to the total potassium current. This is another possible mechanism for dominant negative function of the A421V subunit.

      Experimental elucidation of the precise mechanism of the dominant negative activity of the A421V Kcnc1 variant is beyond the scope of this study; yet, our lab is continuing to work on this. It will likely require dose-response experiments in which various ratios of WT and Kv3.1 A421V subunits are co-expressed in heterologous cells and then recorded for an overall effect on potassium current, similar to the approach of Clatot et al. (2017).

      In the revised manuscript, we have updated our discussion of these mechanistic considerations for KCNC1-related epilepsy syndromes in lines 868-883 in the Discussion. 

      References

      Cameron JM et al. (2019) Encephalopathies with KCNC1 variants: genotype-phenotype-functional correlations. Annals of Clinical and Translational Neurology 6:1263–1272.

      Clatot J, Hoshi M, Wan X, Liu H, Jain A, Shinlapawittayatorn K, Marionneau C, Ficker E, Ha T, Deschênes I (2017) Voltage-gated sodium channels assemble and gate as dimers. Nature Communications 8.

      Makinson CD, Tanaka BS, Sorokin JM, Wong JC, Christian CA, Goldin AL, Escayg A, Huguenard JR (2017) Regulation of Thalamic and Cortical Network Synchrony by Scn8a. Neuron 93:1165-1179.e6.

      Oliver KL et al. (2017) Myoclonus epilepsy and ataxia due to KCNC1 mutation: Analysis of 20 cases and K+ channel properties. Annals of Neurology 81.

      Park J et al. (2019) KCNC1-related disorders: new de novo variants expand the phenotypic spectrum. Annals of Clinical and Translational Neurology 6:1319–1326.

    1. In active systems, a specific electromagnetic radiation signal is transmitted from the instrument and the sensor detects the component of this signal that is reflected or back-scattered by the surface or atmosphere. Active systems include synthetic aperture radar (SAR) and LiDAR (light detection and ranging).

      In active systems, a specific electromagnetic radiation signal is transmitted from the instrument and the sensor detects the component of the signal that is reflected back. Examples: radar (SAR), LiDAR

    2. Passive systems detect the short-wave electromagnetic radiation that is reflected or long-wave radiation that is emitted back to space from the Earth’s surface and atmosphere. That is, natural radiation is the measurement source. The MODIS (Moderate Resolution Imaging Spectroradiometer) on NASA’s Terra and Aqua satellites and the MSI (MultiSpectral Instrument) on the European Space Agency’s (ESA) Sentinel-2 satellite (Figure 2.1.24(a)) are examples of passive instruments.

      Passive systems detect SW that is reflected or LW which is emitted back

      Natural radiation is the measurement source

    3. The classification of the electromagnetic spectrum into different regions is based on what property?

      classification of electromagnetic spectrum is based on WL

    4. Earth observation data are usually in the form of digital imagery. This may be an image similar to a photo, depicting a view or scene familiar to what we see. A digital image, however, is any image composed of several picture elements, or pixels, that have numeric values assigned to them representing the intensity of some measured quantity.

      Digital images have numeric values assigned to the pixels representing the intensity of measured quantities
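
      To make the pixel idea concrete, here is a minimal sketch of a digital image as a grid of numbers. The 3x4 grid, its values, and the 0 to 255 brightness scale are all illustrative assumptions, not data from the source text.

```python
# Hypothetical single-band image: 3 rows x 4 columns of pixels.
# Each number is the measured intensity for one pixel
# (illustrative values on an assumed 0-255 brightness scale).
image = [
    [12, 40, 41, 200],
    [15, 42, 180, 210],
    [14, 160, 190, 220],
]

rows = len(image)
cols = len(image[0])
pixel_count = rows * cols

# Intensity of the pixel at row 1, column 2 (zero-based indices).
value = image[1][2]

# Mean intensity over the whole scene.
mean_intensity = sum(sum(row) for row in image) / pixel_count

print(rows, cols, pixel_count, value, round(mean_intensity, 2))
```

      A multispectral image would simply hold one such grid per wavelength band.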

    1. Reviewer #1 (Public review):

      Sandkuhler et al. re-evaluated the biological functions of TANGO2 homologs in C. elegans, yeast, and zebrafish. Compared to the previously reported role of TANGO2 homologs in transporting heme, Sandkuhler et al. expressed a different opinion on the biological functions of TANGO2 homologs. With the support of some results from their tests, they conclude that 'there is insufficient evidence to support heme transport as the primary function of TANGO2', in addition to the evidence that C. elegans TANGO2 helps counteract oxidative stress. While the differences are reported in this study, more work is needed to elucidate the intuitive biological function of TANGO2.

      Strengths:

      (1) This work revisits a set of key experiments, including the toxic heme analog GaPP survival assay, the fluorescent ZnMP accumulation assay, and the multi-organismal investigations documented by Sun et al. in Nature (2022), which are critical for comparing the two works. Meanwhile, the authors also highlight the differences in reagents and methods between the two studies, demonstrating significant academic merit.

      (2) This work reported additional phenotypes for the C. elegans mutant of the TANGO2 homologs, including lawn avoidance, reduced pharyngeal pumping, smaller brood size, faster exhaustion under swimming test, and a shorter lifespan. These phenotypes are important for understanding the biological function of TANGO2 homologs, while they were missing from the report by Sun et al.

      (3) Investigating the 'reduced GaPP consumption' as a cause of increased resistance against the toxic GaPP for the TANGO2 homologs, hrg-9 hrg-10 double null mutant provides a valuable perspective for studying the biological function of TANGO2 homologs.

      (4) The induction of hrg-9 gene expression by paraquat indicates a strong link between TANGO2 and mitochondrial function.

      (5) This work thoroughly evaluated the role of TANGO2 homologs in supporting yeast growth using multiple yeast strains and also pointed out the mitochondrial genome instability feature of the yeast strain used by Sun et al.

      Weakness:

      It is always a challenge to replicate someone else's work, but it is worthwhile to take on the challenge, provide evidence, and raise concerns about it. These authors attempted to replicate the experiment using the same biological material as that used by Sun et al. in Nature (2022), despite some experimental differences between the two studies. This study does not have many technical weaknesses, but it can become a much better project by focusing on the new phenotypes discovered here.

    2. Reviewer #3 (Public review):

      In this paper, Sandkuhler et al. reassessed the role of TANGO2 as a heme chaperone proposed by Sun et al. in a recently published paper (https://doi.org/10.1038/s41586-022-05347-z). Overall, Sandkuhler et al. conclude that the heme-related roles of TANGO2 had been overemphasized by Sun et al., especially because the hrg-9 gene does not exclusively respond to different regimens of heme synthesis/uptake but is susceptible to a greater extent to, for example, oxidative stress. Impaired heme trafficking is then interpreted as due to general mitochondrial dysfunction. In recent years, the discussion around the heme-related roles of TANGO2 has been tantalizing but is still far from a definitive consensus. Discrepancies between results and their interpretation are testament to how ambitious the understanding of TANGO2 and the phenotypes associated with TANGO2 defects are.

      The work presented by Sandkuhler et al. is methodologically sound, and the authors have appropriately addressed my concerns in the first round of review. Overall, this paper challenges the recent developments in the field in relation to heme trafficking and provides a wider perspective on the biological roles of TANGO2.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      (1) A detailed comparison between this work and the work of Sun et al. on experimental protocols and reagents in the main text will be beneficial for readers to assess critically.

      We have added a Key Reagents Table outlining the key reagents used in our study. In terms of experimental protocols, we replicated those described by Sun et al. in most instances and described any differences when present. With this resubmission, we included additional ZnMP accumulation experiments in liquid media (see point 3 below).

      (2) The GaPP used by Sun et al. (purchased from Frontier Scientific) is more effective in killing the worm than the one used in this study (purchased from Santa Cruz). Is the different outcome due to the differences in reagents? Moreover, Sun et al. examined the lethality after 3-4 days, while this work examined the lethality after 72 hours. Would the extra 24 hours make any difference in the result?

      We now cite product vendor differences as a possible reason for the observed difference in worm death, as the reviewer suggests, on page 8 (see text below) and include these differences in the Key Reagents Table. We also now stress the fact that our experiments included different doses of GaPP and the use of eat-2 mutants as an additional control, which we believe adds rigor and demonstrates the potency of GaPP in our experiments. We decided on assessment at 72 hours, as we deemed it a less nebulous time point as compared to 3-4 days. Most of the observed worm death occurred earlier in this interval, so we believe it is unlikely that large group differences would emerge after an additional 24 hours.

      “Exposing worms to GaPP, a toxic heme analog, we observed that nematodes deficient in HRG-9 and HRG-10 displayed increased survival compared to WT worms, consistent with prior work,[13] though the between-group difference was markedly smaller in our study. We required higher GaPP concentrations to induce lethality, potentially due to product vendor differences, but did observe a clear dose-dependent effect across strains. Although it was previously proposed that the survival benefit seen in worms lacking HRG-9 and HRG-10 resulted from reduced transfer from intestinal cells after GaPP ingestion, our data suggest the reduced lethality is more likely due to decreased environmental GaPP uptake. Supporting this notion, DKO worms exhibited lawn avoidance, reduced pharyngeal pumping, and modestly lower intestinal ZnMP accumulation when exposed to this fluorescent heme analog on agar plates. In liquid media, DKO worms demonstrated higher fluorescence, but only in ZnMP-free conditions, suggesting the presence of gut granule autofluorescence. Furthermore, survival following exposure to GaPP was highest in eat-2 mutants, despite heme trafficking being unaffected in this strain.”

      (3) This work reported the opposite result of Sun et al. for the fluorescent ZnMP accumulation assay. However, the experimental protocols used by the two studies are massively different. Sun et al. did the ZnMP staining by incubating the L4-stage worms in an axenic mCeHR2 medium containing 40 μM ZnMP (purchased from Frontier Scientific) and 4 μM heme at 20 ℃ for 16 h, while this work placed the L4-stage worms on the OP50 E. coli seeded NGM plates treated with 40 μM ZnMP (purchased from Santa Cruz) for 16 h. The liquid axenic mCeHR2 medium is bacteria-free, heme-free, and consistent for ZnMP uptake by worms. This work has mentioned that the hrg-9 hrg-10 double null mutant has bacterial lawn avoidance and reduced pharyngeal pumping phenotypes. Therefore, the ZnMP staining protocol used in this work faces challenges in the environmental control for the wild type vs. the mutant. The authors should adopt the ZnMP staining protocol used by Sun et al. for a proper evaluation of fluorescent ZnMP accumulation.

      We agree with this comment. As such, we performed the ZnMP assay in liquid media conditions, as now described on page 13:

      “For liquid media experiments, three generations of worms were cultured in regular heme (20 µM) axenic media, with the first two generations receiving antibiotic-supplemented media (10 mg/ml tetracycline) and the 3rd generation cultivated without antibiotic. L4 worms from the 3rd generation were placed in media containing 40 µM ZnMP for 16 hours before being prepared and mounted for imaging as above. Worms were imaged on Zeiss Axio Imager 2 at 40x magnification, with image settings kept uniform across all images. Fluorescent intensity was measured within the proximal region of the intestine using ImageJ.”

      In heme-free media, both WT and DKO worms invariably entered L1 arrest, thus we were not able to replicate the results reported by Sun et al. Using media containing heme, we did see an increase in fluorescence, but this was only in the ZnMP-free condition, indicating that the increased signal was attributable to autofluorescence. This is a known phenomenon associated with gut granules in C. elegans in the setting of oxidative stress. The results of these experiments are now summarized on page 6:

      “DKO nematodes at the L4 larval stage were previously shown to accumulate the fluorescent heme analog zinc mesoporphyrin IX (ZnMP) in intestinal cells in low-heme (4 µM) liquid media. While attempting to replicate this experiment, we observed that both wildtype and DKO nematodes entered L1 arrest under these conditions. Therefore, to allow for developmental progression, we grew worms on standard OP50 E. coli plates and in media containing physiological levels of heme (20 µM). We then examined whether differences in ZnMP uptake persisted under these basal conditions. DKO worms grown on ZnMP-treated E. coli plates displayed significantly reduced intestinal ZnMP fluorescence compared to N2 (Figure 1B and C). Using basal heme media with ZnMP, there was no significant difference in ZnMP fluorescence between DKO and wildtype nematodes, although DKO worms grown in media without ZnMP exhibited significantly higher autofluorescence (Figure 1D and E). To test whether autofluorescence may have contributed to the higher fluorescent intensities previously reported in heme-deficient DKO worms, we repeated this experiment on agar plates under starved conditions but did not observe a difference between groups (Figure 1B).”

      (4) A striking difference between the two studies is that Sun et al. emphasize the biochemical function of TANGO2 homologs in heme transporting with evidence from some biochemical tests. In contrast, this work emphasizes the physiological function of TANGO2 homologs with evidence from multiple phenotypical observations. In the discussion part, the authors should address whether these observed phenotypes in this study can be due to the loss of heme transporting activities upon eliminating TANGO2 homologs. This action can improve the merit of academic debate and collaboration.

      Thank you for this suggestion. The following text has been added to the Discussion section (page 9):

      “In addition to altered pharyngeal pumping, DKO worms displayed multiple previously unreported phenotypic features, suggesting a broader metabolic impairment and reminiscent of some clinical manifestations observed in patients with TDD. Elucidating the mechanisms underlying this phenotype, and whether they reflect a core bioenergetic defect, is an active area of investigation in our lab. Several C. elegans heme-responsive genes have been characterized, revealing relatively specific defects in heme uptake or utilization rather than broad organismal dysfunction. For example, hrg-1 and hrg-4 mutants exhibit impaired growth only under heme-limited conditions,[23] and hrg-3 loss affects brood size and embryonic viability specifically when maternal heme is scarce.[24] By contrast, hrg-9 and hrg-10 mutants exhibit the most severe organismal phenotypes of the hrg family, to date, including reduced pharyngeal pumping, decreased motility, shortened lifespan, and smaller broods, even when fed a heme-replete diet.”

      Reviewer #2 (Public review):

      (1) The manuscript is written mainly as a criticism of a previously published paper. Although reproducibility in science is an issue that needs to be acknowledged, a manuscript should focus on the new data and the experiments that can better prove and strengthen the new claims.

      Thank you for this suggestion. While the primary intent of this study was to replicate key findings from the 2022 publication by Sun et al., the revised manuscript now emphasizes underlying mechanisms more broadly rather than focusing narrowly on that prior publication.

      (2) The current presentation of the logic of the study and its results does not help the authors deliver their message, although they possess great potential.

      We have attempted to rectify this through substantial revision of the Discussion section and other places throughout the manuscript.

      (3) The study is missing experiments to link hrg-9 and hrg-10 more directly to bioenergetic and oxidative stress pathways.

      The reviewer is correct in this assertion, but it was not our intent to definitively prove this link or, indeed, the primary mechanism of TANGO2 in the present manuscript. This said, we are actively engaged in this endeavor in our lab and anticipate these data will be published in a separate, forthcoming publication.

      We have added additional references pertaining to hrg-9 enrichment as part of the mitochondrial unfolded protein response (page 10) and a comparison of the phenotype observed in hrg-9 and hrg-10 deficient worms versus those lacking other proteins in the hrg family (page 9).

      Reviewer #3 (Public review):

      (1) The authors stress - with evidence provided in this paper or indicated in the literature - that the primary role of TANGO2 and its homologues is unlikely to be related to heme trafficking, arguing that observed effects on heme transport are instead downstream consequences of aberrant cellular metabolism. But in light of a mounting body of evidence (referenced by the authors) connecting more or less directly TANGO2 to heme trafficking and mobilization, it is recommended that the authors comment on how they think TANGO2 could relate to and be essential for heme trafficking, albeit in a secondary, moonlighting capacity. This would highlight a seemingly common theme in emerging key players in intracellular heme trafficking, as it appears to be the case for GAPDH - with accumulating evidence of this glycolytic enzyme being critical for heme delivery to several downstream proteins.

      TANGO2 is essential for mitochondrial health, albeit in a yet unknown capacity. In the absence of TANGO2, defects in heme trafficking may be secondary sequelae of mitochondrial dysfunction. We would point out that prior studies that attempted to show that TANGO2 and its homologs are involved in heme trafficking proposed very different mechanisms (direct binding vs. membrane protein interaction) and relied on artificially low or high heme conditions to produce these effects. We have attempted to address these more clearly in the Discussion section and have added a fifth figure to summarize our current unifying theory for how heme levels and mitochondrial stress may be linked.

      (2) The observation - using eat-2 mutants and lawn avoidance behaviour - that survival patterns can be partially explained by reduced consumption, is fascinating. It would be interesting to quantify the two relative contributions.

      We have completed additional ZnMP experiments in liquid media at the reviewers’ request. This experimental condition eliminates lawn avoidance as a factor in consumption. Fluorescent intensity was significantly higher in the DKO worms in media lacking ZnMP, indicating increased autofluorescence in DKO worms, while signal was not significantly different in media with ZnMP.

      (3) In the legend to Figure 1A it's a bit unclear what the differently coloured dots represent for each condition. Repeated measurements, worms, independent experiments? The authors should clarify this.

      The following sentence has been added to the legend for Figure 1:

      “Each dot represents the number of offspring laid by one adult worm on one GaPP-treated plate after 24 hours.”

      (4) It would help if the entire fluorescence images (raw and processed) for the ZnMP treatments were provided. Fluorescence images would also benefit Figure 1B.

      Fluorescent intensity values pertaining to the ZnMP experiments are included in our Extended Data supplement, and we have added representative images to Figure 1, per the reviewer’s request. We thank the reviewer for this helpful suggestion. We would be happy to upload raw images to an open-access repository if deemed necessary by the editorial team.

      (5) Increasingly, the understanding of heme-dependent roles relies on transient or indirect binding to unsuspected partners, not necessarily relying on a tight affinity and outdating the notion of heme as a static cofactor. Despite impressive recent advancements in the detection of these interactions (for example https://doi.org/10.1021/jacs.2c06104; cited by the authors), a full characterisation of the hemome is still elusive. Sandkuhler et al. deemed it possible but seem to question that heme binding to TANGO2 occurs. However, Sun et al. convincingly showed and characterised TANGO2 binding to heme. It is recommended that the authors comment on this.

      We believe it is plausible that TANGO2 binds heme (as do hundreds of other proteins), especially as it has been shown to bind other hydrophobic molecules. However, we also note that a separate paper examining the role of TANGO2 in heme transport posited that GAPDH is the sole heme binding partner for cytoplasmic transport (https://doi.org/10.1038/s41467-025-62819-2), contradicting the originally posited theory of how TANGO2 functions. This is described in the Discussion section and, as noted above, we have added an additional figure to demonstrate our unifying hypothesis for why TANGO2 may be important in the low-heme state, irrespective of any direct effect on heme trafficking.

      Additional comments and revisions:

      (1) It was suggested that a triple mutant (eat-2; hrg-9; hrg-10) be tested to determine the primary driver of GaPP toxicity. We appreciate this suggestion, but we offer the following rationale for why these experiments were not pursued. The eat-2 mutant, which lacks a nicotinic acetylcholine receptor subunit in pharyngeal muscles, was included solely as a dietary restriction control to illustrate that reduced GaPP toxicity in the hrg-9/10 double mutant could arise from poor feeding rather than defective heme transport. Both eat-2 and hrg-9/10 mutants exhibit markedly reduced feeding but via different mechanisms. In our assays, GaPP survival was inversely correlated with ingestion rate: eat-2 animals, which feed the least, showed the highest survival, while hrg-9/10 mutants showed intermediate feeding and intermediate survival. Consistent with this, eat-2 worms also displayed the lowest ZnMP accumulation.

      (2) GaPP solution was added to NGM plates after seeding with OP50. This is now expressly stated in the Methods section (page 15). We would note that Sun et al. mixed GaPP in with NGM in the liquid phase. We would expect that if there were a difference in GaPP exposure due to these different protocols, worms in our experiment would have received higher GaPP concentrations.

      “Standard NGM plates were treated with 1, 2, 5, or 10 µM gallium protoporphyrin IX (GaPP; Santa Cruz) after seeding with OP50. Plates were swirled to ensure an even distribution of GaPP and allowed to dry completely.”

      (3) The manuscript has been reworked to read as more of an independent study rather than a rebuttal of prior work, though the primary objective of validating prior work remains unchanged.

      (4) Several technical details of experiments have been moved from the main text to the materials and methods section.

      (5) One reviewer noted that the figure numbering should be adjusted. Numbering does not progress sequentially (i.e., 1A…1B…2A…2B) early in the text, because we have opted to consolidate data pertaining to heme analog experiments in Figure 1 and behavioral data in Figure 2.

      (6) “Kingdoms” has been changed to “domains” (page 4).

      (7) Example images are now included for Figure 1B, as noted above.

    1. Reviewer #1 (Public review):

      Summary:

      The authors set out on the ambitious task of establishing the reproducibility of claims from the Drosophila immunity literature. Starting out from a corpus of 400 articles published between 1959 and 2011, the authors sought to determine whether their claims were confirmed or contradicted by previous or subsequent publications. Additionally, they actively sought to replicate a subset of the claims for which no previous replications were available (although this set was not representative of the whole sample, as the authors focused on suspicious and/or easily testable claims). The focus of the article is on inferential reproducibility; thus, methods don't necessarily map exactly to the original ones.

      The authors present a large-scale analysis of the individual replication findings, which are presented in a companion article (Westlake et al., 2025. DOI 10.1101/2025.07.07.663442). In their retrospective analysis of reproducibility, the authors find that 61% of the original claims were verified by the literature, 7.5% were partially verified, and only 6.8% were challenged, with 23.8% having no replication available. This is in stark contrast with the result of their prospective replications, in which only 16% of claims were successfully reproduced.

      The authors proceed to investigate correlates of replicability, with the most consistent finding being that findings stemming from higher-ranked universities (and possibly from very high impact journals) were more likely to be challenged.

      Strengths:

      (1) The work presents a large-scale, in-depth analysis of a particular field of science that includes authors with deep domain expertise of the field. This is a rare endeavour to establish the reproducibility of a particular subfield of science, and I'd argue that we need many more of these in different areas.

      (2) The project was built on a collaborative basis (https://ReproSci.epfl.ch/), using an online database (https://ReproSci.epfl.ch/), which was used to organize the annotations and comments of the community about the claims. The website remains online and can be a valuable resource to the Drosophila immunity community.

      (3) Data and code are shared in the authors' GitHub repository, with a Jupyter notebook available to reproduce the results.

      Main concerns:

      (1) Although the authors claim that "Drosophila immunity claims are mostly replicable", this conclusion is strictly based on the retrospective analysis, in which around 84% of the claims for which a published verification attempt was found were verified. This is in very stark contrast with the findings that the authors replicate prospectively, of which only 16% are verified.

      Although this large discrepancy may be explained by the fact that the authors focused on unchallenged and suspicious claims (which seems to be their preferred explanation), an alternative hypothesis is that there is a large amount of confirmation bias in the Drosophila immunity literature, either because attempts to replicate previous findings tend to reach similar results due to researcher bias, or because results that validate previous findings are more likely to be published.

      Both explanations are plausible (and, not being an expert in the field, I'd have a hard time estimating their relative probability), and in the absence of prospective replication of a systematic sample of claims (which could determine whether the replication rate for a random sample of claims is as high as that observed in the literature), both should be considered in the manuscript.

      (2) The fact that the analysis of factors correlating with reproducibility includes both prospective and retrospective replications also leads to the possibility of confounding bias in this analysis. If most of the challenged claims come from the authors' prospective replications, while most of the verified ones come from those that were replicated by the literature, it becomes unclear whether the identified factors are correlated with actual reproducibility of the claims or with the likelihood that a given claim will be tested by other authors and that this replication will be published.

      (3) The methods are very brief for a project of this size, and many of the aspects in determining whether claims were conceptually replicated and how replications were set up are missing.

      Some of these - such as the PubMed search string for the publications and a better description of the annotation process - are described in the companion article, but this could be more explicitly stated. Others, however, remain obscure. Statements such as "Claims were cross-checked with evidence from previous, contemporary and subsequent publications and assigned a verification category" summarize a very complex process for which more detail should be given - in particular because what constitutes inferential reproducibility is not a self-evident concept. And although I appreciate that what constitutes a replication is ultimately a case-by-case decision, a general description of the guidelines used by the authors to determine this should be provided. As these processes were done by one author and reviewed by another, it would also be useful to know the agreement rates between them to have a general sense of how reproducible the annotation process might be.

      The same gap in methods descriptions holds for the prospective replications. How were labs selected, how were experimental protocols developed, and how was the validity of the experiments as a conceptual replication assessed? I understand that providing the methods for each individual replication is beyond the scope of the article, but a general description of how they were developed would be important.

      (4) As far as I could tell, the large-scale analysis of the replication results was not preregistered, and many decisions seem somewhat ad hoc. In particular, the categorization of journals (e.g. low impact, high impact, "trophy") and universities (e.g. top 50, 51-100, 101+) relies on arbitrary thresholds, and it is unclear how much the results are dependent on these decisions, as no sensitivity analyses are provided.

      Particularly, for analyses that correlate reproducibility with continuous variables (such as year of publication, impact factor or university ranking), I'd strongly favor using these variables as continuous variables in the analysis (e.g. using logistic regression) rather than performing pairwise comparisons between categories determined by arbitrary cutoffs. This would not only reduce the impact of arbitrary thresholds in the analysis, but would also increase statistical power in the univariate analyses (as the whole sample can be used at once) and reduce the number of parameters in the multivariate model (as they will be included as a single variable rather than multiple dummy variables when there are more than two categories).

      (5) The multivariate model used to investigate predictors of replicability includes unchallenged claims along with verified ones in the outcome, which seems like an odd decision. If the intention is to analyze which factors are correlated with reproducibility, it would make more sense to remove the unchallenged findings, as these are likely uninformative in this sense. In fact, based on the authors' own replications of unchallenged findings, they may be more likely to belong to the "challenged" category than to the "unchallenged" one if they were to be verified.
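
      The suggestion in concern (4), to model a predictor as a continuous variable rather than binning it at arbitrary cutoffs, can be sketched as follows. This is only an illustration on synthetic data, not the study's data or the authors' analysis: the predictor `ranks` (a standardized university rank), the outcome `challenged`, and the true coefficients are all assumptions made up for the example, and the fit uses plain gradient ascent written out by hand rather than any particular statistics package.

```python
import math
import random


def sigmoid(z):
    """Logistic function mapping a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, steps=2000):
    """Fit P(y=1 | x) = sigmoid(b0 + b1*x) by batch gradient ascent
    on the mean log-likelihood. Returns (intercept, slope)."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(b0 + b1 * x)  # residual on the probability scale
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


# Synthetic data (assumed, for illustration only): one continuous predictor
# and a binary outcome whose probability rises with the predictor.
random.seed(0)
TRUE_B0, TRUE_B1 = -1.0, 0.8
ranks = [random.uniform(-2.0, 2.0) for _ in range(300)]
challenged = [
    1 if random.random() < sigmoid(TRUE_B0 + TRUE_B1 * x) else 0
    for x in ranks
]

b0, b1 = fit_logistic(ranks, challenged)
odds_ratio = math.exp(b1)  # change in odds per one-unit increase in the predictor
print(f"intercept={b0:.2f} slope={b1:.2f} odds ratio per unit={odds_ratio:.2f}")
```

      A single slope is estimated from the whole sample, so no cutoff has to be chosen and only one parameter enters the model for this predictor, which is the point the reviewer makes about power and model size.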

    2. Reviewer #3 (Public review):

      Summary:

      The authors of this paper were trying to identify how reproducible, or not, their subfield (Drosophila immunity) was since its inception over 50 years ago. This required identifying not only the papers, but the specific claims made in the paper, assessing if these claims were followed up in the literature, and if so whether the subsequent papers supported or refuted the original claim. In addition to this large manually curated effort, the authors further investigated some claims that were left unchallenged in the literature by conducting replications themselves. This provided a rich corpus of the subfield that could be investigated into what characteristics influence reproducibility.

      Strengths:

      A major strength of this study is the focus on a subfield, the detailing of identifying the main, major, and minor claims - which is a very challenging manual task - and then cataloging not only their assessment of if these claims were followed up in the literature, but also what characteristics might be contributing to reproducibility, which also included more manual effort to supplement the data that they were able to extract from the published papers. While this provides a rich dataset for analysis, there is a major weakness with this approach, which is not unique to this study.

      Weaknesses:

      The main weakness is relying heavily on the published literature as the source for if a claim was determined to be verified or not. There are many documented issues with this stemming from every field of research - such as publication bias, selective reporting, all the way to fraud. It's understandable why the authors took this approach - it is the only way to get at a breadth of the literature - however the flaw with this approach is it takes the literature as a solid ground truth, which it is not. At the same time, it is not reasonable to expect the authors to have conducted independent replications for all of the 400 papers they identified. However, there is a big difference trying to assess the reproducibility of the literature by using the literature as the 'ground truth' vs doing this independently like other large-scale replication projects have attempted to do. This means the interpretation of the data is a bit challenging.

      Below are suggestions for the authors and readers to consider:

      (1) I understand why the authors prefer to mention claims as their primary means of reporting what they found, but it is nested within paper, and that makes it very hard to understand how to interpret these results at times. I also cannot understand at a high level the relationship between claims and papers. The methods suggest there are 3-4 major claims per paper, but at 400 papers and 1,006 claims, this averages to ~2.5 claims per paper. Can the authors consider describing this relationship better (e.g., distribution of claims and papers) and/or consider presenting the data two ways (primary figures as claims and complementary supplementary figures with papers as the unit)? This will help the reader interpret the data both ways without confusion. I am also curious how the results look when presented both ways (e.g., does shifting to the paper as the unit of analysis shift the figures and interpretation?). This is especially true since the first and last author analysis shows there is varying distribution of papers and claims by authors (and thus the relationship between these is important for the reader).

      (2) As mentioned above, I think the biggest weakness is that the authors are taking the literature at face value when assigning if a claim was validated or challenged vs gathering new independent evidence. This means the paper leans more on papers, making it more like a citation analysis vs an independent effort like other large-scale replication projects. I highly recommend the authors state this in their limitations section.

      On top of that, I have questions that I could not figure out (though I acknowledge I did not dig super deep into the data to try). The main comment I have is: how was "verified" (and "challenged") determined? It seems from the methods it was determined by "Claims were cross-checked with evidence from previous, contemporary and subsequent publications and assigned a verification category". If this is true, and all claims were done this way, are verified claims double counted then? (e.g., an original claim is found by a future claim to be verified, and thus that future claim is also considered to be verified because of the original claim).

      Related, did the authors look at the strength of validation or challenged claims? That is, if there is a relationship mapping the authors did for original claims and follow-up claims, I would imagine some claims have deeper (i.e., more) claims that followed up on them vs others. This might be interesting to look at as well.

      (3) I recommend the authors add sample sizes when not present (e.g., Fig 4C). I also find that the sample sizes are a bit confusing, and I recommend the authors check them and add more explanation when not complete, like they did for Fig 4A. For example, Fig 7B equals to 178 labs (how did more than 156 labs get determined here?), and yet the total number of claims is 996 (as opposed to 1,006). Another example is why does Fig 8B not have all 156 labs accounted for? (Related to Fig 8B, I caution on reporting a p value and drawing strong conclusions from this very small sample size of 22 authors.) As a last example, Fig 8C has all 156 labs and 1,006 claims. Is that expected? I guess it means authors who published before 1995 (as shown in Figure 8A) continued to publish after 1995? In that case, it's all authors? But the text says when they 'set up their lab' after 1995, but how can that be?

      (4) Finally, I think it would help if the authors expanded on the limitations generally and potential alternative explanations and/or driving factors. For example, where the line "though likely underestimated" is indicated in the discussion about the low rate of challenged claims, it might be useful to call out how publication bias is likely the driver here and thus it needs to be carefully considered in the interpretation of this. Related, I caution the authors on overinterpreting their suggestive evidence. The abstract, for example, states claims of what was found in their analysis, when these are suggestive at best, which the authors acknowledge in the paper. But since most people start with the abstract, I worry this is indicating stronger evidence than what the authors actually have.

      The authors should be applauded for the monumental effort they put into this project, which does a wonderful job of having experts within a subfield engage their community to understand the connectedness of the literature and attempt to understand how reliable specific results are and what factors might contribute to them. This project provides a nice blueprint for others to build from as well as leverage the data generated from this subfield, and thus should have an impact in the broader discussion on reproducibility and reliability of research evidence.

    1. Reviewer #3 (Public review):

      Summary:

      The submission from Cronshagen and colleagues describes the application of a previously described method (selection linked integration) to the systematic study of PfEMP1 trafficking in the human malaria parasite Plasmodium falciparum. PfEMP1 is the primary virulence factor and surface antigen of infected red blood cells and is therefore a major focus of research into malaria pathogenesis. Since the discovery of the var gene family that encodes PfEMP1 in the late 1990s, there have been multiple hypotheses for how the protein is trafficked to the infected cell surface, crossing multiple membranes along the way. One difficulty in studying this process is the large size of the var gene family and the propensity of the parasites to switch which var gene is expressed, thus preventing straightforward gene modification-based strategies for tagging the expressed PfEMP1. Here the authors solve this problem by forcing expression of a targeted var gene by fusing the PfEMP1 coding region with a drug selectable marker separated by a skip peptide. This enabled them to generate relatively homogenous populations of parasites all expressing tagged (or otherwise modified) forms of PfEMP1 suitable for study. They then applied this method to study various aspects of PfEMP1 trafficking.

      Strengths:

      The study is very thorough, and the data are well presented. The authors used SLI to target multiple var genes, thus demonstrating the robustness of their strategy. They then perform experiments to investigate possible trafficking through PTEX, they knock out proteins thought to be involved in PfEMP1 trafficking and observe defects in cytoadherence, and they perform proximity labeling to further identify proteins potentially involved in PfEMP1 export. These are independent and complementary approaches that together tell a very compelling story.

      Weaknesses:

      (1) When the authors targeted IT4var19, they were successful in transcriptionally activating the gene, however they did not initially obtain cytoadherent parasites. To observe binding to ICAM-1 and EPCR, they had to perform selection using panning. This is an interesting observation and potentially provides insights into PfEMP1 surface display, folding, etc. However, it also raises questions about other instances in which cytoadherence was not observed. Would panning of these other lines have successfully selected for cytoadherent infected cells? Did the authors attempt panning of their 3D7 lines? Given that these parasites do export PfEMP1 to the infected cell surface (Figure 1D), it is possible that panning would similarly rescue binding. Likewise, the authors knocked out PTP1, TryThrA and EMPIC3 and detected a loss of cytoadhesion, but they did not attempt panning to see if this could rescue binding. The strong selection that panning exerts on parasite populations could result in selection of compensatory changes that enable cytoadherence, which could be very informative, although the analysis could potentially be quite complicated and beyond the scope of the current paper. Nonetheless, these are important concepts to consider when assessing these phenotypes.

      (2) The authors perform a series of trafficking experiments to help discern whether PfEMP1 is trafficked through PTEX. While the results were not entirely definitive, they make a strong case for PTEX in PfEMP1 export. The authors then used BioID to obtain a proxiome for PfEMP1 and identified proteins they suggest are involved in PfEMP1 trafficking. However, it seemed that components of PTEX were missing from the list of interacting proteins. Is this surprising and does this observation shed any additional light on the possibility of PfEMP1 trafficking through PTEX? This warrants a comment or discussion.

      Comments on revisions:

      The authors have responded thoroughly and constructively to suggestions and comments in the initial review. I have no additional comments. This is a great contribution to the literature.

    2. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment:

      This study introduces an important approach using selection linked integration (SLI) to generate Plasmodium falciparum lines expressing single, specific variants of the surface adhesin PfEMP1, enabling precise study of PfEMP1 trafficking, receptor binding, and cytoadhesion. By moving the system to different parasite strains and introducing an advanced SLI2 system for additional genomic edits, this work provides compelling evidence for an innovative and rigorous platform to explore PfEMP1 biology and identify novel proteins essential for malaria pathogenesis including immune evasion.

      Reviewer #1 (Public review):

      One of the roadblocks in PfEMP1 research has been the challenge of manipulating var genes to incorporate markers that allow the transport of this protein to be tracked and the interactions taking place within the infected erythrocyte to be investigated. In addition, the ability of Plasmodium falciparum to switch to different PfEMP1 variants during in vitro culture has complicated studies due to parasite populations drifting from the original (manipulated) var gene expression. Cronshagen et al. have provided a useful system with which they demonstrate the ability to integrate a selectable drug marker into several different var genes that allows the PfEMP1 variant expression to be 'fixed'. This on its own represents a useful addition to the molecular toolbox and the range of var genes that have been modified suggests that the system will have broad application. As well as incorporating a selectable marker, the authors have also used selection-linked integration (SLI) to introduce markers to track the transport of PfEMP1, investigate the route of transport, and probe interactions with PfEMP1 proteins in the infected host cell.

      What I particularly like about this paper is that the authors have not only put together what appears to be a largely robust system for further functional studies, but they have used it to produce a range of interesting findings including:

      Co-activation of rif and var genes when in a head-to-head orientation.

      The reduced control of expression of var genes in the 3D7-MEED parasite line.

      More support for the PTEX transport route for PfEMP1.

      Identification of new proteins involved in PfEMP1 interactions in the infected erythrocyte, including some required for cytoadherence.

      In most cases the experimental evidence is straightforward, and the data support the conclusions strongly. The authors have been very careful in the depth of their investigation, and where unexpected results have been obtained, they have looked carefully at why these have occurred.

      We thank the reviewer for the kind assessment and the comments to improve the paper.

      (1) In terms of incorporating a drug marker to drive mono-variant expression, the authors show that they can manipulate a range of var genes in two parasite lines (3D7 and IT4), producing around 90% expression of the targeted PfEMP1. Removal of drug selection produces the expected 'drift' in variant types being expressed. The exceptions to this are the 3D7-MEED line, which looks to be an interesting starting point to understand why this variant appears to have impaired mutually exclusive var gene expression, and the EPCR-binding IT4var19 line. This latter finding was unexpected and the modified construct required several rounds of panning to produce parasites that expressed the targeted PfEMP1 and bound to EPCR. The authors identified a PTP3 deficiency as the cause of the lack of PfEMP1 expression, which is an interesting finding in itself but potentially worrying for future studies. What was not clear was whether the selected IT4var19 line retained specific PfEMP1 expression once receptor panning was removed.

      We do not have systematic long-term data for the Var19 line but do have medium-term data. After panning the Var19 line, the binding assays were done within 3 months without additional panning. The first binding assay was 2 months after the panning and the last binding assays three weeks later, totaling about 3 months without panning. While there is inherent variation in these assays that precludes detection of smaller changes, the last assay showed the highest level of binding, giving no indication for rapid loss of the binding phenotype. Hence, we can say that the binding phenotype appears to be stable for many weeks without panning the cells again and there was no indication for a rapid loss of binding in these parasites.

      Systematic long-term experiments to assess how long the Var19 parasites retain binding would be interesting, but given that the binding-phenotype appears to remain stable over many weeks or even months, this would only make sense if done over a much longer time frame. Such data might arise if the line is used over extended times for a specific project in which case it might be advisable to monitor continued binding. We included a statement in the discussion that the binding phenotype was stable over many weeks but that if long-term work with this line is planned, monitoring the binding phenotype might be advisable: “In the course of this work the binding phenotype of the IT4var19 expressor line remained stable over many weeks without further panning. However, given that initial panning had been needed for this particular line, it might be advisable for future studies to monitor the binding phenotype if the line is used for experiments requiring extended periods of cultivation.”

      (2) The transport studies using the mDHFR constructs were quite complicated to understand but were explained very clearly in the text with good logical reasoning.

      We are aware of this being a complex issue and are glad this was nevertheless understandable.

      (3) By introducing a second SLI system, the authors have been able to alter other genes thought to be involved in PfEMP1 biology, particularly transport. An example of this is the inactivation of PTP1, which causes a loss of binding to CD36 and ICAM-1. It would have been helpful to have more insight into the interpretation of the IFAs as the anti-SBP1 staining in Figure 5D (PTP-TGD) looks similar to that shown in Figure 1C, which has PTP intact. The anti-EXP2 results are clearly different.

      We realize the description of the PTP1-TGD IFA data and that of the other TGDs (see also response to Recommendation to authors point 4 and reviewer 2, major points 6 and 7) was rather cursory. The previously reported PTP1 phenotype is a fragmentation of the Maurer’s clefts into what in IFA appear to be many smaller pieces (Rug et al 2014, referenced in the manuscript). The control in Fig. 5D has 13 Maurer’s cleft spots (previous work indicates an average of ~15 MC per parasite, see e.g. the originally co-submitted eLife preprint doi.org/10.7554/eLife.103633.1 and references therein). The control mentioned by the reviewer in Fig. 1C has about 22 Maurer’s clefts foci, at the upper end of the typical range, but not unusual. In contrast, the PTP1-TGD in Fig. 5D, has more than 30 foci with an additional cytoplasmic pool and additional smaller, difficult to count foci. This is consistent with the published phenotype in Rug et al 2014. The EXP1 stained cell has more than 40 Maurer’s cleft foci, again beyond what typically is observed in controls. Therefore, these cells show a difference to the control in Fig. 5 but also to Fig. 1C. Please note that we are looking at two different strains, in Fig. 1 it is 3D7 and in Fig. 5 IT4. While we did not systematically assess this, the Maurer’s clefts number per cell seemed to be largely comparable between these strains (Fig. 10C and D in the other eLife preprint doi.org/10.7554/eLife.103633.1). 

      Overall, as the PTP1 loss phenotype has already been reported, we did not go into more experimental detail. However, we now modified the text to more clearly describe how the phenotype in the PTP1-TGD parasites was different to control: “IFAs showed that in the PTP1-TGD parasites, SBP1 and PfEMP1 were found in many small foci in the host cell that exceeded the average number of ~ 15 Maurer’s clefts typically found per infected RBC [66] (Fig. 5D). This phenotype resembled the previously reported Maurer’s clefts phenotype of the PTP1 knock out in CS2 parasites [39].”

      (4) It is good to see the validation of PfEMP1 expression includes binding to several relevant receptors. The data presented use CHO-GFP as a negative control, which is relevant, but it would have been good to also see the use of receptor mAbs to indicate specific adhesion patterns. The CHO system is fine for expression validation studies, but due to the high levels of receptor expression on these cells, moving to the use of microvascular endothelial cells would be advisable. This may explain the unexpected ICAM-1 binding seen with the panned IT4var19 line.

      We agree with the reviewer that it is desirable to have better binding systems for studying individual binding interactions. As the main purpose of this paper was to introduce the system and provide proof of principle that the cells show binding, we did not move to more complicated binding systems. However, we would like to point out that the CSA binding was done on receptor alone in addition to the CSA-expressing HBEC-5i cells and was competed successfully with soluble CSA. In addition, apart from the additional ICAM1-binding of the Var19 line, all binding phenotypes conformed to expectations. We therefore hope the tools used for binding studies are acceptable at this stage of introducing the system while future work interested in specific PfEMP1 receptor interactions may use better systems, tailored to the specific question (e.g. endothelial organoid models and engineered human capillaries and inhibitory antibodies or relevant recombinant domains for competition).

      (5) The proxiome work is very interesting and has identified new leads for proteins interacting with PfEMP1, as well as suggesting that KAHRP is not one of these. The reduced expression seen with BirA* in position 3 is a little concerning but there appears to be sufficient expression to allow interactions to be identified with this construct. The quantitative impact of reduced expression for proxiome experiments will clearly require further work to define it.

      This is a valid point. Clearly there seems to be some impact on binding when BirA* is placed in the extracellular domain (either through reduced presentation or direct reduction of binding efficiency of the modified PfEMP1; please see also minor comment 10 reviewer 2). The exact quantitative impact on the proxiome is difficult to assess but we note that the relative enrichment of hits to each other is rather similar to the other two positions (Fig. 6H-J). We therefore believe the BioIDs with the 3 PfEMP1-BirA* constructs are sufficient to provide a general coverage of proteins proximal to PfEMP1 and hope this will aid in the identification of further proteins involved in PfEMP1 transport and surface display as illustrated with two of the hits targeted here.

      The impact of placing a domain on the extracellular region of PfEMP1 will have to be further evaluated if needed in other studies. But the finding that a large folded domain can be placed into this part at all, even if binding was reduced, in our opinion is a success (it was not foreseeable whether any such change would be tolerated at all).

      (6) The reduced receptor binding results from the TryThrA and EMPIC3 knockouts were very interesting, particularly as both still display PfEMP1 on the surface of the infected erythrocyte. While care needs to be taken in cross-referencing adhesion work in P. berghei and whether the machinery truly is functionally orthologous, it is a fair point to make in the discussion. The suggestion that interacting proteins may influence the "correct presentation of PfEMP1" is intriguing and I look forward to further work on this.

      We hope future work will be able to shed light on this.

      Overall, the authors have produced a useful and reasonably robust system to support functional studies on PfEMP1, which may provide a platform for future studies manipulating the domain content in the exon 1 portion of var genes. They have used this system to produce a range of interesting findings and to support its use by the research community. Finally, a small concern. Being able to select specific var gene switches using drug markers could provide some useful starting points to understand how switching happens in P. falciparum. However, our trypanosome colleagues might remind us that forcing switches may show us some mechanisms but perhaps not all.

      Point noted! From non-systematic data with the Var01 line that has been cultured for extended periods of time (several years), it seems other non-targeted vars remain silent in our SLI “activation” lines but how much SLI-based var-expression “fixing” tampers with the integrity of natural switching mechanisms is indeed very difficult to gauge at this stage. We now added a statement to the discussion that even if mutually exclusive expression is maintained, it is not certain the mechanisms controlling var expression all remain intact: “However, it should be noted that it is not known whether all mechanisms controlling mutually exclusive expression and switching remain intact in parasites with SLI-activated var genes.”

      Reviewer #2 (Public review):

      Summary

      Cronshagen et al. develop a range of tools based on selection-linked integration (SLI) to study PfEMP1 function in P. falciparum. PfEMP1 is encoded by a family of ~60 var genes subject to mutually exclusive expression. Switching expression between different family members can modify the binding properties of the infected erythrocyte while avoiding the adaptive immune response. Although critical to parasite survival and malaria disease pathology, PfEMP1 proteins are difficult to study owing to their large size and variable expression between parasites within the same population. The SLI approach previously developed by this group for genetic modification of P. falciparum is employed here to selectively and stably activate the expression of target var genes at the population level. Using this strategy, the binding properties of specific PfEMP1 variants were measured for several distinct var genes with a novel semi-automated pipeline to increase throughput and reduce bias. Activation of similar var genes in both the common lab strain 3D7 and the cytoadhesion competent FCR3/IT4 strain revealed higher binding for several PfEMP1 IT4 variants with distinct receptors, indicating this strain provides a superior background for studying PfEMP1 binding. SLI also enables modifications to target var gene products to study PfEMP1 trafficking and identify interacting partners by proximity-labeling proteomics, revealing two novel exported proteins required for cytoadherence. Overall, the data demonstrate a range of SLI-based approaches for studying PfEMP1 that will be broadly useful for understanding the basis for cytoadhesion and parasite virulence.

      We thank the reviewer for the kind assessment and the comments to improve the paper.

      Comments

      (1) While the capability of SLI to actively select var gene expression was initially reported by Omelianczyk et al., the present study greatly expands the utility of this approach. Several distinct var genes are activated in two different P. falciparum strains and shown to modify the binding properties of infected RBCs to distinct endothelial receptors; development of SLI2 enables multiple SLI modifications in the same parasite line; SLI is used to modify target var genes to study PfEMP1 trafficking and determine PfEMP1 interactomes with BioID. Curiously, Omelianczyk et al. activated a single var (Pf3D7_0421300) and observed elevated expression of an adjacent var arranged in a head-to-tail manner, possibly resulting from local chromatin modifications enabling expression of the neighboring gene. In contrast, the present study observed activation of neighboring genes with head-to-head but not head-to-tail arrangement, which may be the result of shared promoter regions. The reason for these differing results is unclear, although it should be noted that the two studies examined different var loci.

      The point that we are looking at different loci is very valid and we realize this was not mentioned in the discussion. We have now added to the discussion that it is unclear whether our results and those cited can be generalized and that different var gene loci may respond differently:

      “However, it is unclear if this can be generalized and it is possible that different var loci respond differently.”

      (2) The IT4var19 panned line that became binding-competent showed increased expression of both paralogs of ptp3 (as well as a phista and gbp), suggesting that overexpression of PTP3 may improve PfEMP1 display and binding. Interestingly, IT4 appears to be the only known P. falciparum strain (only available in PlasmoDB) that encodes more than one ptp3 gene (PfIT_140083100 and PfIT_140084700). PfIT_140084700 is almost identical to the 3D7 PTP3 (except for a ~120 residue insertion in 3D7 beginning at residue 400). In contrast, while the C-terminal region of PfIT_140083100 shows near-perfect conservation with 3D7 PTP3 beginning at residue 450, the N-terminal regions between the PEXEL and residue 450 are quite different. This may indicate the generally stronger receptor binding observed in IT4 relative to 3D7 results from increased PTP3 activity due to multiple isoforms or that specialized trafficking machinery exists for some PfEMP1 proteins.

      We thank the reviewer for pointing this out. The exact differences between the two PTP3s of IT4 and those of other strains should definitely be closely examined if the function of these proteins in PfEMP1 binding is analysed in more detail.

      It is an interesting idea that the PTP3 duplication could be a reason for the superior binding of IT4. We always assumed that IT4 had better binding because it was less culture adapted, but this does not preclude that PTP3(s) is (are) a reason for this. However, at least in our 3D7, PTP3 can’t be the reason for the poor binding, as our 3D7 still has PfEMP1 on the surface, while in the unpanned IT4-Var19 line and in the ptp3 KO of Maier et al., Cell 2008 (PMID: 18614010), PfEMP1 is not on the surface anymore.

      Testing the impact of having two PTP3s would be interesting, but given the “mosaic” similarity of the two PTP3 isoforms, a simple add-on experiment might not be informative. Nevertheless, it will be interesting in future work to explore this in more detail.

      Reviewer #3 (Public review):

      Summary:

      The submission from Cronshagen and colleagues describes the application of a previously described method (selection linked integration) to the systematic study of PfEMP1 trafficking in the human malaria parasite Plasmodium falciparum. PfEMP1 is the primary virulence factor and surface antigen of infected red blood cells and is therefore a major focus of research into malaria pathogenesis. Since the discovery of the var gene family that encodes PfEMP1 in the late 1990s, there have been multiple hypotheses for how the protein is trafficked to the infected cell surface, crossing multiple membranes along the way. One difficulty in studying this process is the large size of the var gene family and the propensity of the parasites to switch which var gene is expressed, thus preventing straightforward gene modification-based strategies for tagging the expressed PfEMP1. Here the authors solve this problem by forcing the expression of a targeted var gene by fusing the PfEMP1 coding region with a drug-selectable marker separated by a skip peptide. This enabled them to generate relatively homogenous populations of parasites all expressing tagged (or otherwise modified) forms of PfEMP1 suitable for study. They then applied this method to study various aspects of PfEMP1 trafficking.

      Strengths:

      The study is very thorough, and the data are well presented. The authors used SLI to target multiple var genes, thus demonstrating the robustness of their strategy. They then perform experiments to investigate possible trafficking through PTEX, they knock out proteins thought to be involved in PfEMP1 trafficking and observe defects in cytoadherence, and they perform proximity labeling to further identify proteins potentially involved in PfEMP1 export. These are independent and complementary approaches that together tell a very compelling story.

      We thank the reviewer for the kind assessment and the comments to improve the paper.

      Weaknesses:

      (1)  When the authors targeted IT4var19, they were successful in transcriptionally activating the gene, however, they did not initially obtain cytoadherent parasites. To observe binding to ICAM-1 and EPCR, they had to perform selection using panning. This is an interesting observation and potentially provides insights into PfEMP1 surface display, folding, etc. However, it also raises questions about other instances in which cytoadherence was not observed. Would panning of these other lines have been successfully selected for cytoadherent infected cells? Did the authors attempt panning of their 3D7 lines? Given that these parasites do export PfEMP1 to the infected cell surface (Figure 1D), it is possible that panning would similarly rescue binding. Likewise, the authors knocked out PTP1, TryThrA, and EMPIC3 and detected a loss of cytoadhesion, but they did not attempt panning to see if this could rescue binding. To ensure that the lack of cytoadhesion in these cases is not serendipitous (as it was when they activated IT4var19), they should demonstrate that panning cannot rescue binding.

      These are very important considerations. Indeed, we had repeatedly attempted to pan 3D7 when we failed to get the SLI-generated 3D7 PfEMP1 expressor lines to bind, but this had not been successful. The lack of binding had been a major obstacle that had held up the project and was only solved when we moved to IT4 which readily bound (apart from Var19 which was created later in the project). After that we made no further efforts to understand why 3D7 does not bind but the fact that PfEMP1 is on the surface indicates this is not a PTP3 issue because loss of PTP3 also leads to loss of PfEMP1 surface display. Also, as the parent 3D7 could not be panned, we assumed this issue is not easily fixed in the SLI var lines we made in 3D7.

      Panning the TGD lines: we see the reasoning for conducting panning experiments with the TGD lines. However, on second thought, we are unsure this should be attempted. The outcome might not be easily interpretable as at least two forces will contribute to the selection in panning experiments with TGD lines that do not bind anymore:

      Firstly, panning would work against the SLI of the TGD, resulting in a tug of war between the TGD-SLI and binding. This is because a small number of parasites will loop out the TGD plasmid (revert) and would normally be eliminated during standard culturing due to the SLI drug used for the TGD. These revertant cells would bind and the panning would enrich them. Hence, panning and SLI are opposed forces in the case of a TGD abolishing binding. It is unclear how strong this effect would be, but this would for sure lead to mixed populations that complicate interpretations. 

      The second selecting force is possible compensatory changes that restore binding. These can have different causes: (i) reversal of potential independent changes that may have occurred in the TGD parasites and that are in reality causing the binding loss (e.g. ptp3 loss or similar, the concern of the reviewer) or (ii) new changes that compensate for the loss of the TGD target (in this case the TGD is the cause of the binding loss but a different change ameliorates it, for instance by increasing PfEMP1 expression or surface display). As both TGDs show some residual binding and have VAR01 on the surface to at least some extent, it is possible that new compensatory changes might indeed occur that indirectly increase binding again.

      In summary, even if more binding occurs after panning of the lines, it is not clear whether this is due to a compensatory change ameliorating the TGD, reversal of an unrelated change, or counter-selection against the SLI. To determine the cause, the panned TGD lines would need to be subjected to a complex and time-consuming analysis (WGS, RNASeq, possibly Maurer’s clefts phenotype) to find out whether they were SLI-revertants, had an unrelated change that was reverted, or had a new compensatory change that helps binding. This might be further muddled if a mix of cells with different changes from the options indicated above comes out of the selection. In that case, it might even require scRNASeq to make sense of the panning experiment. Due to the envisaged difficulty in interpreting the outcome, we did not attempt this panning.

      To exclude loss of ptp3 expression as the reason for binding loss (something we would not have seen in the WGS if it is only due to a transcriptional change), we now carried out RNASeq with the TGD lines that have a binding phenotype. While we did not generate replicates to obtain quantitative data, the results show that both ptp3 copies were expressed in these TGDs at levels comparable to other parasite lines that do bind with the same SLI-activated var gene, indicating that the effect is not due to ptp3 (see response to point 4 on PTP3 expression in the Recommendations for the authors). While we can’t fully exclude other changes in the TGDs that might affect binding, the WGS did not show any obvious alterations that could be responsible for this.

      (2) The authors perform a series of trafficking experiments to help discern whether PfEMP1 is trafficked through PTEX. While the results were not entirely definitive, they make a strong case for PTEX in PfEMP1 export. The authors then used BioID to obtain a proxiome for PfEMP1 and identified proteins they suggest are involved in PfEMP1 trafficking. However, it seemed that components of PTEX were missing from the list of interacting proteins. Is this surprising and does this observation shed any additional light on the possibility of PfEMP1 trafficking through PTEX? This warrants a comment or discussion.

      This is an interesting point and we agree that this warrants discussion. A likely reason why PTEX components are not picked up as interactors is that BirA* is expected to be unfolded when it passes through the channel and in that state can’t biotinylate. Labelling would likely only be possible if PfEMP1 lingered at the PTEX translocation step before BirA* became unfolded to go through the channel, which we would not expect under physiological conditions. We added the following sentences to the discussion: “While our data indicates PfEMP1 uses PTEX to reach the host cell, this could be expected to have resulted in the identification of PTEX components in the PfEMP1 proxiomes, which was not the case. However, as BirA* must be unfolded to pass through PTEX, it likely is unable to biotinylate translocon components unless PfEMP1 is stalled during translocation. For this reason, a lack of PTEX components in the PfEMP1 proxiomes does not necessarily exclude passage through PTEX.”

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Most of my comments are in the public section. I would just highlight a few things:

      (1) In the binding studies section you talk about "human brain endothelial cells (HBEC-5i)". These cells do indeed express CSA but this is a property of their immortalisation rather than of being brain endothelium, which does not express CSA. I think this could be confusing to readers, so you might want to reword this sentence to focus on the cell line expressing CSA rather than its other features.

      We thank the reviewer for pointing this out. We have now modified the sentence to focus on the fact that these are CSA-expressing cells and provided a reference for it.

      (2) As I said in the public section, CHO cells are great for proof of concept studies, but they are not endothelium. Not a problem for this paper.

      Noted! Please also see our response to the public review.

      (3) I wonder whether your comment about how well tolerated the BirA* insertion is may be a bit too strong. I might say "Nonetheless, overall the BirA* modified PfEMP1 were functional."

      Changed as requested.

      (4) I'm not sure how you explain the IFA staining patterns to the uninitiated, but perhaps you could explain some of the key features you are looking for.

      We apologise for not giving an explanation of the IFA staining patterns in the first place. Please see detailed response to public review of this reviewer (point 3 on PTP1-TGD phenotype) and to reviewer 2 (Recommendations to the authors, points 6 and 7 on better explaining and quantifying the Maurer’s clefts phenotypes). For this we now also generated parasites that episomally express mCherry tagged SBP1 in the TGD parasites with the reduced binding phenotype. This resulted in amendments to Fig. S7, addition of a Fig. S8 and updated results to better explain the phenotypes. 

      This is a great paper - I just wish I'd had this system before.

      Thank you!

      Reviewer #2 (Recommendations for the authors):

      Major Comments

      (1) Does the RNAseq analysis of 3D7var0425800 and 3D7MEEDvar0425800 (Figure 1G, H) reveal any differential gene expression that might suggest a basis for loss of mutually exclusive var expression in the MEED line?

      We now carried out a thorough analysis of these RNASeq experiments to look for an underlying cause for the phenotype. This was added as new Figure 1J and new Table S3. This analysis again illustrated the increased transcript levels of var genes. In addition, it showed that transcripts of a number of other exported proteins, including members of other gene families, were up in the MEED line. 

      One hit that might be causal of the phenotype was sip2, which was down by close to 8-fold (pAdj 0.025). While recent work in P. berghei found this ApiAP2 to be involved in the expression of merozoite genes (Nishi et al., Sci Advances 2025 (PMID: 40117352)), previous work in P. falciparum showed that it binds heterochromatic telomere regions and certain var upstream regions (Flück et al., PlosPath 2010 (PMID: 20195509), now cited in the manuscript). The other notable change was an upregulation of the non-coding RNA ruf6, which had been linked with impaired mono-allelic var expression (Guizetti et al., NAR 2016 (PMID: 27466391), now also cited in the manuscript). While it would go beyond this manuscript to follow this up, it is conceivable that alterations in chromosome end biology due to sip2 downregulation or upregulation of ruf6 are causes of the observed phenotype.

      We now added a paragraph on the more comprehensive analysis of the RNA Seq data of the MEED vs non-MEED lines at the end of the second results section.

      (2) Could the inability of the PfEMP1-mDHFR fusion to block translocation (Fig 2A) reflect unique features of PfEMP1 trafficking, such as the existence of a soluble, chaperoned trafficking state that is not fully folded? Was a PfEMP1-BPTI fusion ever tested as an alternative to mDHFR?

      This is an interesting suggestion. The PfEMP1-BPTI was never tested. However, a chaperoned trafficking state would likely also affect BPTI. Given that both domains (mDHFR and BPTI) in principle do the same when folded and would block when the construct is in the PV, it is not so likely that using a different blocking domain would make a difference. Therefore, the scenario where BPTI would block when mDHFR does not, is not that probable. The opposite would be possible (mDHFR blocking while BPTI does not, because only the latter depends on the redox state). However, this would only happen if the block  occurred before the construct reaches the PV.

      At present, we believe the lacking block to be due to the organization of the domains in the construct. In the PfEMP1-mDHFR construct in this manuscript the position of the blocking domain is further away from the TMD compared to all other previously tested mDHFR fusions. Increased distance to the TMD has previously been found to be a factor impairing the blocking function of mDHFR (Mesen-Ramirez et al., PlosPath 2016 (PMID: 27168322)). Hence, our suspicion that this is the reason for the lacking block with the PfEMP1-mDHFR rather than the type of blocking domain. However, the latter option can’t be fully excluded and we might test BPTI in future work.

      (3) The late promoter SBP1-mDHFR is 2A fused with the KAHRP reporter. Since 2A skipping efficiency varies between fusion contexts and significant amounts of unskipped protein can be present, it would be helpful to include a WB to determine the efficiency of skipping and provide confidence that the co-blocked KAHRP in the +WR condition (Fig 2D) is not actually fused to the C-terminus of SBP1-mDHFR-GFP.

      Fortunately, this T2A fusion (crt_SBP1-mDHFR-GFP-2A-KAHRP-mScarlet<sup>epi</sup>) was used before in work that included a Western blot showing its efficient skipping (S3 A Fig in Mesen-Ramirez et al., PlosPath 2016). In agreement with this Western blot result, fluorescence microscopy showed very limited overlap of SBP1-mDHFR-GFP and KAHRP-mCherry in the absence of WR (Fig. 3B in Mesen-Ramirez et al., PlosPath 2016 and Fig. 2 in this manuscript), which would not be the case if these two constructs were fused together. Please note that KAHRP is known to transiently localize to the Maurer’s clefts before reaching the knobs (Wickham et al., EMBOJ 2001, PMID: 11598007), and therefore occasional overlap with SBP1 at the Maurer’s clefts is expected. However, we would expect much more overlap if a substantial proportion of the construct population were not skipped, and therefore the co-blocked KAHRP-mCherry in the +WR sample is unlikely to be due to inefficient skipping and attachment to SBP1-mDHFR-GFP.

      (4) Does comparison of RNAseq from the various 3D7 and IT4 lines in the study provide any insight into PTP3 expression levels between strains with different binding capacities? Was the expression level of ptp3a/b in the IT4var19 panned line similar to the expression in the parent or other activated IT4 lines? Could the expanded ptp3 gene number in IT4 indicate that specialized trafficking machinery exists for some PfEMP1 proteins (ie, IT4var19 requires the divergent PTP3 paralog for efficient trafficking)?

      PTP3 in the different IT4 lines that bind:

      In those parasite lines that did bind, the intrinsic variation in the binding assays, the different binding properties of different PfEMP1 variants and the variation in RNA Seq experiments to compare different parasite lines precludes a correlation of binding level vs ptp3 expression. For instance, if a PfEMP1 variant has lower binding capacity, ptp3 may still be higher but binding would be lower than if comparing to a parasite line with a better binding PfEMP1 variant. Studying the effect of PTP3 levels on binding could probably be done by overexpressing PTP3 in the same PfEMP1 SLI expressor line and assessing how this affects binding, but this would go beyond this manuscript.

      PTP3 in panned vs unpanned Var19:

      We did some comparisons between the IT4 parent and the panned and unpanned IT4-Var19 lines (see Author response table 1). This did not reveal any clear associations. While the parent had somewhat lower ptp3 transcript levels, they were still clearly higher than in the unpanned Var19 line, and other lines also had ptp3 levels comparable to the panned IT4-Var19 (see Author response table 2).

      PTP3 in the TGDs and possible reason for binding phenotype:

      A key point is whether PTP3 could have influenced the lack of binding in the TGD lines (see also weakness section and point 1 of public review of reviewer 3: ptp3 may be an indirect cause resulting in lacking binding in TGD parasites). We now did RNA Seq to check for ptp3 expression in the relevant TGD lines although we did not do a systematic quantitative comparison (which would require 3 replicates of RNASeq), but we reasoned that loss of expression would also be evident in one replicate. There was no indication that the TGD lines had lost PTP3 expression (see Author response table 2) and this is unlikely to explain the binding loss in a similar fashion to the Var19 parasites. Generally, the IT4 lines showed expression of both ptp3 genes and only in the Var19 parasites before panning were the transcript levels considerably lower:

      Author response table 1.

      Parent vs IT4-Var19 panned and unpanned

      Author response table 2.

      TGD lines with binding phenotype vs parent

      The absence of an influence of PTP3 on the binding phenotype in the cell lines in this manuscript (besides Var19) is further supported by its role in PfEMP1 surface display. Previous work has shown that KO of ptp3 leads to a loss of VAR2CSA surface display (Maier et al., Cell 2008). The unpanned Var19 parasite also lacked PfEMP1 surface display, and panning and the resulting appearance of the binding phenotype were accompanied by surface display of PfEMP1. As both the EMPIC3- and TryThrA-TGD lines still had at least some PfEMP1 on the surface, this also (in addition to the RNA Seq above) speaks against PTP3 being the cause of the binding phenotype. The same applies to 3D7, which despite the poor binding displays PfEMP1 on the host cell surface (Figure 1D). This indicates that the binding phenotype in 3D7 is also not due to PTP3 expression loss, as this would have abolished PfEMP1 surface display.

      The idea about PTP3 paralogs for specific PfEMP1s is intriguing. In the future it might be interesting to test the frequency of parasites with two PTP3 paralogs in endemic settings and correlate it with the PfEMP1 repertoire, variant expression and potentially disease severity. 

      (5) The IT4var01 line shows substantially lower binding in Figure 5F compared with the data shown in Figure 4E and 6F. Does this reflect changes in the binding capacity of the line over time or is this variability inherent to the assay?

      There is some inherent variability in these assays. While we did not systematically assess this, we had no indication that this was due to the parasite line changing. The Var01 line was cultured for months and was frozen down and thawed more than once without a clear gradual trend for more or less binding. While we can’t exclude some variation from the parasite side, we suspect it is more a factor of the expression of the receptor on the CHO cells the iRBCs bind to. 

      Specifically, the assays in Fig. 6F and 4E mentioned by the reviewer both had an average binding to CD36 of around 1000 iE/mm2, only the experiments in Fig. 5F are different (~ 500 iE/mm2) but these were done with a different batch of CHO cells at a different time to the experiments in Fig. 6F and 4E. 

      (6) In Figure S7A, TryThrA and EMPIC3 show distinct localization as circles around the PfEMP1 signal while PeMP2 appears to co-localize with PfEMP1 or as immediately adjacent spots (strong colocalization is less apparent than SBP1, and the various PfEMP1 IFAs throughout the study). Does this indicate that TryThrA and EMPIC3 are peripheral MC proteins? Does this have any implications for their function in PfEMP1 binding? Some discussion would help as these differences are not mentioned in the text. For the EMPIC3 TGD IFAs, localization of SBP1 and PfEMP1 is noted to be normal but REX1 is not mentioned (although this also appears normal).

      We apologise for the lacking description of the candidate localisations and cursory description of the Maurer’s clefts phenotypes (next point). Our original intent was to not distract too much from the main flow of the manuscript as almost every part of the manuscript could be followed up with more details. However, we fully agree that this is unsatisfactory and now provided more description (this point) and more data (next point).

      Localisation of TryThrA and EMPIC3 compared to PfEMP1 at the Maurer’s clefts: the circular pattern is reminiscent of the results with Maurer’s clefts proteins reported by McMillan et al using 3D-SIM in 3D7 parasites (McMillan et al., Cell Microbiology 2014 (PMID: 23421990)). In that work SBP1 and MAHRP1 (both integral TMD proteins) were found in foci but REX1 (no TMD) in circular structures around these foci similar to what we observed here for TryThrA and EMPIC3 which both also lack a TMD. The SIM data in McMillan et al indicated that also PfEMP1 is “more peripheral”, although it did only partially overlap with REX1. The conclusion from that work was that there are sub-compartments at the Maurer’s clefts. In our IFAs (Fig. S7A) PfEMP1 is also only partially overlapping with the TryThrA and EMPIC3 circles, potentially indicating similar subcompartments to those observed by 3D-SIM. We agree with the reviewer that this might be indicative of peripheral MC proteins, fitting with a lack of TMD in these candidates, but we did not further speculate on this in the manuscript.

      We now added enlargements of the ring-like structures to better illustrate this observation in Fig. S7A. In addition, we now specifically mention the localization data and the ring like signal with TryThrA and EMPIC3 in the results and state that this may be similar to the observations by McMillan et al., Cell Microbiology 2014.

      We also thank the reviewer for pointing out that we had forgotten to mention REX1 in the EMPIC3-TGD, this was amended.  

      (7) The atypical localization in TryThrA TGD line claimed for PfEMP1 and SBP1 in Fig S7B is not obvious. While most REX1 is clustered into a few spots in the IFA staining for SBP1 and REX1, SBP1 is only partially located in these spots and appears normal in the above IFA staining for SBP1 and HA. The atypical localization of PfEMP1-HA is also not obvious to me. The authors should clarify what is meant by "atypical" localization and provide support with quantification given the difference between the two SBP1 images shown.

      We apologise for the inadequate description of these IFA phenotypes. The abnormal signal for SBP1, REX1 and PfEMP1 in the TryThrA-TGD included two phenotypes found with all 3 proteins: 

      (1) a dispersed signal for these proteins in the host cell in addition to foci (the control and the other TGD parasites have only dots in the host cell with no or very little detectable dispersed signal). 

      (2) foci of disproportionally high intensity and size, that we assumed might be aggregation or enlargement of the Maurer’s clefts or of the detected proteins.

      The reason for the difference between the REX1 (aggregation) phenotype and the PfEMP1 and SBP1 (dispersed signal, more smaller foci) phenotypes in the images in Fig. S7B is that both phenotypes were seen with all 3 proteins but we chose a REX1 stained cell to illustrate the aggregation phenotype (the SBP1 signal in the same cell is similar to the REX1 signal, illustrating that this phenotype is not REX1 specific; please note that this cell also has a dispersed pool of REX1 and SBP1). 

      Based on the IFAs 66% (n = 106 cells) of the cells in the TryThrA-TGD parasites had one or both of the observed phenotypes. We did not include this into the previous version of the manuscript because a description would have required detouring from the main focus of this results section. In addition, IFAs have some limitations for accurate quantifications, particularly for soluble pools (depending on fixing efficiency and agent, more or less of a soluble pool in the host cell can leak out). 

      To answer the request to better explain and quantify the phenotype and given the limitations of IFA, we now transfected the TryThrA-TGD parasites with a plasmid mediating episomal expression of SBP1-mCherry, permitting live cell imaging and a better classification of the Maurer’s clefts phenotype. Due to the two SLI modifications in these parasites (using up 4 resistance markers) we had to use a new selection marker (mutated lactate transporter PfFNT, providing resistance to BH267.meta (Walloch et al., J. Med. Chem. 2020 (PMID: 32816478))) to transfect these parasites with an additional plasmid. 

      These results are now provided as Fig. S8 and detailed in the last results section. The new data shows that the majority of the TryThrA-TGD parasites contain a dispersed pool of SBP1 in the host cell. About a third of the parasites also showed disproportionally strong SBP1 foci that may be aggregates of the Maurer’s clefts. We also transfected the EMPIC3-TGD parasites with the FNT plasmid mediating episomal SBP1-mCherry expression and observed only few cells with a cytoplasmic pool or aggregates (Fig. S8). Overall these findings agree with the previous IFA results. As the IFA suggests similar results also for REX1 and PfEMP1, this defect is likely not SBP1 specific but more general (Maurer’s clefts morphology; association or transport of multiple proteins to the Maurer’s clefts). This gives a likely explanation for the cytoadherence phenotype in the TryThrA-TGD parasites. The reason for the EMPIC3-TGD phenotype remains to be determined as we did not detect obvious changes of the Maurer’s clefts morphology or in the transport of proteins to these structures in these experiments. 

      Minor comments

      (1) Italicized numbers in parentheses are present in several places in the manuscript but it is not clear what these refer to (perhaps differently formatted citations from a previous version of the manuscript). Figure 1 legend: (121); Figure S3 legend: (110), (111); Figure S6 legend: (66); etc.

      We thank the reviewer for pointing out this issue with the references; this was amended.

      (2) Figure 5A and legend: "BSD-R: BSD-resistance gene". Blasticidin-S (BS) is the drug while Blasticidin-S deaminase (BSD) is the resistance gene.

      We thank the reviewer for pointing this out; the legend and figure were changed.

      (3) Figure 5E legend: µ-SBP1-N should be α-SBP1-N.

      This was amended.

      (4) Figure S5 legend: "(Full data in Table S1)" should be Table S3.

      This was amended.

      (5) Figure S1G: The pie chart shows PF3D7_0425700 accounts for 43% of rif expression in 3D7var0425800 but the text indicates 62%.

      We apologize for this mistake; the text was corrected. We also improved the citations to Fig. S1G and H in this section.

      (6) "most PfEMP1-trafficking proteins show a similar early expression..." The authors might consider including a table of proteins known to be required for EMP1 trafficking and a graph showing their expression timing. Are any with later expressions known?

      Most exported proteins are expressed early, which is nicely shown in Marti et al 2004 (cited for the statement) in a graph of the expression timing of all PEXEL proteins (Fig. 4B in that paper). PNEPs also have a similar profile (Grüring et al 2011, also cited for that statement), further illustrated by using early expression as a criterion to find more PNEPs (Heiber et al., 2013 (PMID: 23950716)). Together this includes most if not all of the known PfEMP1 trafficking proteins. The originally co-submitted paper (Blancke-Soares & Stäcker et al., eLife preprint doi.org/10.7554/eLife.103633.1) analysed several later expressed exported proteins (Pf332, MSRP6) but their disruption, while influencing Maurer’s clefts morphology and anchoring, did not influence PfEMP1 transport. However, there are some conflicting results for Pf332 (referenced in Blancke-Soares & Stäcker et al). This illustrates that it may not be so easy to decide which proteins are bona fide PfEMP1 trafficking proteins. We therefore did not add a table and hope it is acceptable for the reader to rely on the provided 3 references to back this statement.

      (7)  Figure S1J: The predominate var in the IT4 WT parent is var66 (which appears to be syntenic with Pf3D7_0809100, the predominate var in the 3D7 WT parent). Is there something about this locus or parasite culture conditions that selects for these vars in culture? Is this observed in other labs as well?

      This is a very interesting point (although we are not certain these vars are indeed syntenic, they are on different chromosomes). As far as we know at least Pf3D7_0809100 is commonly a dominant var transcribed in other labs and was found expressed also in sporozoites (Zanghì et al. Cell Rep. 2018). However, it is unclear how uniform this really is. For IT4 we do not know in full but have here also commonly observed centromeric var genes to be dominating transcripts in unselected parasite cultures. It is possible that transcription drifts to centromeric var genes in cultured parasites. However, given the anecdotal evidence, it is unknown to which extent this is related to an inherent switching and regulation regime or a consequence of faulty regulation following prolonged culturing.

      (8) Figure 4B, C: Presumably the asterisks on the DNA gels indicate non-specific bands but this is not described in the legend. Why are non-specific bands not consistent between parent and integrated lanes?

      We apologize for not mentioning this in the legend; this was amended.

      It is not clear why the non-specific bands differ between the lines but in part this might be due to different concentrations and quality of DNA preps. A PCR can also behave differently depending on whether the correct primer target is present or not. If present, the PCR will run efficiently and other spurious products will be outcompeted, but in absence of the correct target, they might become detectable.  

      Overall, we do not think the non-specific bands are indications of anything untoward with the lines. For instance, in Fig. 4B the high band in the 5’ integration in the IT4 line (that does not occur anywhere else) can’t be due to a genomic change as this is the parental line and does not contain the plasmid for integration. In the same gel, the ori locus band of incorrect size (likely due to crossreaction of the primers to another var gene, which due to the high similarity of the ATS region is not always fully avoidable) is present in both the parent IT4 and the integrant line and is therefore also not of concern. In C there are a couple of bands of incorrect size in the integration line. One of these is very faint and both are too large, so they are again likely other vars that are inefficiently picked up by these primers. The reason they are not seen in the parent line is that the correct primer binding site is present there, which efficiently yields a product that outcompetes products derived from non-optimally matching primers; these therefore appear only in the integrant line, where the correct match is no longer present. For these reasons we believe these bands are not of any concern.

      (9) Figure 4C: Is there a reason KAHRP was used as a co-marker for the IFA detecting IT4var19 expression instead of SBP1 which was used throughout the rest of the study?

      This is a coincidence as this line was tested when other lines were tested for KAHRP. As there were foci in the host cell we were satisfied that the HA-tagged PfEMP1 is produced and the localization deemed plausible. 

      (10) Figure 6: Streptavidin labeling for the IT4var01-BirA position 3 line is substantially less than the other two lines in both IFA and WB. Does the position 3 fusion reduce PfEMP1 protein levels or is this a result of the context or surface display of the fusion? Interestingly, the position 3 trypsin cleavage product appears consistently more robust compared with the other two configurations. Does this indicate that positioning BirA upstream of the TM increases RBC membrane insertion and/or makes the surface localized protein more accessible to trypsin?

      It is possible that RBC membrane insertion or trypsin accessibility is increased for the position 3 construct. But there could also be other explanations:

      The reason for the more robustly detected protected fragment for the position 3 construct in the WB might also be its smaller size (in contrast to the other two versions, it does not contain BirA*) which might permit more efficient transfer to the WB membrane. In that case the more robust band might not (only) be due to better membrane insertion or better trypsin accessibility.

      The lower biotinylation signal with the position 3 construct might also be explained by the farther distance of BirA* to the ATS (compared to position 1 and 2), the region where interactors are expected to bind. The position 1 and 2 constructs may therefore generally be more efficient (as closer) to biotinylate ATS proximal proteins. Further, in the final destination (PfEMP1 inserted into the RBC membrane) BirA* would be on the other side of the membrane in the position 3 construct while in the position 1 and 2 constructs BirA* would be on the side of the membrane where the ATS anchors PfEMP1 in the knob structure. In that case, labelling with position 3 would come from interactions/proximities during transport or at the Maurer’s clefts (if there indeed PfEMP1 is not membrane embedded) and might therefore be less.

      Hence, while alterations in trypsin accessibility and RBC membrane insertion are possible explanations, other explanations exist. At present, we do not know which of these explanations apply and therefore did not mention any of them in the manuscript. 

      Reviewer #3 (Recommendations for the authors):

      (1) In the abstract and on page 8, the authors mention that they generate cell lines binding to "all major endothelial receptors" and "all known major receptors". This is a pretty all-encompassing statement that might not be fully accepted by others who have reported binding to other receptors not considered in this paper (e.g. VCAM, TSP, hyaluronic acid, etc). It would be better to change this statement to something like "the most common endothelial receptors" or "the dominant endothelial receptors", or something similar.

      We agree with the reviewer that these statements are too all-encompassing and changed them to “the most common endothelial receptors” (introduction) and “the most common receptors” (results).

      (2) The authors targeted two rif genes for activation and in each case the gene became the most highly expressed member of the family. However, unlike var genes, there were other rif genes also expressed in these lines and the activated copy did not always make up the majority of rif mRNAs. The authors might wish to highlight that this is inconsistent with mutually exclusive expression of this gene family, something that has been discussed in the past but not definitively shown.

      We thank the reviewer for highlighting this; we have now added the following statement to this section: “While SLI-activation of rif genes also led to the dominant expression of the targeted rif gene, other rif genes still took up a substantial proportion of all detected rif transcripts, speaking against a mutually exclusive expression in the manner seen with var genes.”

      (3) In Figure 6, H-J, the authors display volcano plots showing proteins that are thought to interact with PfEMP1. These are labeled with names from the literature, however, several are named simply "1, 2, 3, 4, 5, or 6". What do these numbers stand for?

      We apologize for not clarifying this and thank the reviewer for pointing this out. There is a legend for the numbered proteins in what is now Table S4 (previously Table S3). We have now amended the legend of Figure 6 to explain the numbers and point the reader to Table S4 for the accessions.

    1. Reviewer #2 (Public review):

      Summary:

      In this manuscript, "Cryo-EM structure of the bicarbonate receptor GPR30," the authors aimed to enrich our understanding of the role of GPR30 in pH homeostasis by combining structural analysis with a receptor function assay. This work is a natural development and extension of their previous work on Nature Communications (PMID: 38413581). In the current body of work, they solved the cryo-EM structure of the human GPR30-G-protein (mini-Gsqi) complex in the presence of bicarbonate ions at 3.15 Å resolution. From the atomic model built based on this map, they observed the overall canonical architecture of class A GPCR and also identified 3 extracellular pockets created by ECLs (Pockets A-C). Based on the polarity, location, size, and charge of each pocket, the authors hypothesized that pocket A is a good candidate for the bicarbonate binding site. To identify the bicarbonate binding site, the authors performed an exhaustive mutant analysis of the hydrophilic residues in Pocket A and analyzed receptor reactivity via calcium assay. In addition, the human GPR30-G-protein complex model also enabled the authors to elucidate the G-protein coupling mechanism of this special class A GPCR, which plays a crucial role in pH homeostasis.

      Strengths:

      As a continuation of their recent Nature Communications publication, the authors used cryo-EM coupled with mutagenesis and functional studies to elucidate bicarbonate-GPR30 interaction. This work provided atomic-resolution structural observations for the receptor in complex with G-protein, allowing us to explore its mechanism of action, and will further facilitate drug development targeting GPR30. There were 3 extracellular pockets created by ECLs (Pockets A-C). The authors were able to filter out 2 of them and hypothesized that pocket A was a good candidate for the bicarbonate binding site based on the polarity, location, and charge of each pocket. From there, the authors identified the key residues on GPR30 for its interaction with the substrate, bicarbonate. Together with their previous work, they mapped out amino acids that are critical for receptor reactivity.

      Weaknesses:

      When we see a reduction of a GPCR-mediated downstream signaling, several factors could potentially contribute to this observation: 1) a reduced total expression of this receptor due to the mutation (transcription and translation issue); 2) a reduced surface expression of this receptor due to the mutation (trafficking issue); and 3) a dysfunctional receptor that doesn't signal due to the mutation. In the current revision, based on the gating strategy, the surface expression of the HA-positive WT GPR30-expressing cells is only 10.6% of the total population, while the surface expression levels of the mutants range from 1.89% (P71A) to 64.4% (D111A). Combining this information with the functional readout in Figure 3F and G, as well as their previous work, the authors concluded that mutations at P71, E115, D125, Q138, C207, D210, and H307 would decrease bicarbonate responses. Among those sites, E115, Q138, and H307 were from their previous Nature Comm paper.

      Authors claim P71 and C207 make a structural-stability contribution, as their mutations result in a significant reduction in surface expression: P71A (1.89%) and C207A (2.71%). However, compared to 10.6% of the total population in the WT, (P71A is 17.8% of the WT, and C207A is 25.6% of the WT), this doesn't rule out the possibility that the mutated receptor is also dysfunctional: at 10 mM NaHCO3, RFU of WT is ~500, RFU of P71 and C207 are ~0.

      The authors also interpret "The D125ECL1A mutant has lost its activity but is located on the surface" and only mention "D125 is unlikely to be a bicarbonate binding site, and the mutational effect could be explained due to the decreased surface expression". Again, compared to 10.6% of the total population in the WT, D125A (3.94%) is 37.2% of the WT. At 10 mM NaHCO3, the RFU of the WT is ~500, the RFU of D125 is ~0. This doesn't rule out the possibility that the mutated receptor is also dysfunctional. It is not clear why D125A didn't make it to the surface.

      Other mutants that the authors didn't mention much in their text: D111A (64.4%, 607.5% of WT surface expression), E121A (50.4%, 475.5% of WT surface expression), R122 (41.0%, 386.8% of WT surface expression), N276A (38.9%, 367.0% of WT surface expression) and E218A (24.6%, 232.1% of WT surface expression) all have similar RFU as WT, although the surface expression is about 2-6 times more. On the other hand, Q215A (3.18%, 30% of WT surface expression) has similar RFU as WT, with only a third of the receptor on the surface.
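      The normalization behind the percentages quoted above can be reproduced with a short sketch. It assumes the quoted values are the percentage of HA-positive cells from the gating; the names `WT_SURFACE_PCT` and `relative_to_wt` are illustrative and are not taken from the authors' analysis.

```python
# Surface expression (% HA-positive cells) as quoted in the review text.
# WT_SURFACE_PCT and relative_to_wt are hypothetical names used only here.
WT_SURFACE_PCT = 10.6

mutant_surface_pct = {
    "P71A": 1.89,
    "C207A": 2.71,
    "Q215A": 3.18,
    "D125A": 3.94,
    "E218A": 24.6,
    "N276A": 38.9,
    "R122": 41.0,
    "E121A": 50.4,
    "D111A": 64.4,
}


def relative_to_wt(mutant_pct, wt_pct=WT_SURFACE_PCT):
    """Express a mutant's surface expression as a percentage of the WT value."""
    return 100.0 * mutant_pct / wt_pct


# Rounded to one decimal place, matching the precision used in the review.
relative = {
    name: round(relative_to_wt(pct), 1)
    for name, pct in mutant_surface_pct.items()
}

for name, value in relative.items():
    print(f"{name}: {value}% of WT surface expression")
```

      Dividing each mutant's value by the same WT value makes the spread explicit: the mutants range from well under one fifth of WT surface expression to roughly six times WT, which is why the functional readouts are hard to compare directly.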

      Altogether, the wide range of surface expression across the different cell lines, combined with the different receptor function readouts, makes the cell functional data only partially support their structural observations.

    2. Reviewer #3 (Public review):

      Summary

      GPR30 responds to bicarbonate and plays a role in regulating cellular pH and ion homeostasis. However, the molecular basis of bicarbonate recognition by GPR30 remains unresolved. This study reports the cryo-EM structure of GPR30 bound to a chimeric mini-Gq in the presence of bicarbonate, revealing mechanistic insights into its G-protein coupling. Nonetheless, the study does not identify the bicarbonate-binding site within GPR30.

      Strengths

      The work provides strong structural evidence clarifying how GPR30 engages and couples with Gq.

      Weaknesses

      Several GPR30 mutants exhibited diminished responses to bicarbonate, but their expression levels were also reduced. As a result, the mechanism by which GPR30 recognizes bicarbonate remains uncertain, leaving this aspect of the study incomplete.

    3. Author response:

      The following is the authors’ response to the original reviews.

      The parts of the text that have been changed. The major changes are as follows:

      We re-analyzed the dataset and improved the local resolution of the extracellular region (Author response image 1).

      We re-modeled based on the improved density and canceled the bicarbonate model based on comments from all reviewers.

      We performed calcium assay using cell lines stably expressing the mutants, whose surface expression levels were analyzed by fluorescence-activated cell sorting (FACS) (Figure 3F, G and Figure 3–figure supplement 1-3).

      Thus, we significantly revised our discussion of the extracellular binding pocket and the result of the mutational study. In the revised manuscript, we speculate that H307 is a candidate for the bicarbonate binding site.

      Author response image 1.

      Comparison of local resolution between re-analyzed and previous maps. (A) Side and top view of the re-analyzed receptor-focused map of GPR30 colored by local resolution. (B) Side and top view of the previous receptor-focused map of GPR30 colored by local resolution.

      Reviewer #1 (Public Review):

      Summary:

      This study resolves a cryo-EM structure of the GPCR, GPR30, which was recently identified as a bicarbonate receptor by the authors' lab. Understanding the ligand and the mechanism of activation is of fundamental importance to the field of receptor signaling. However, the main claim of the paper, the identification of the bicarbonate binding site, is only partly supported by the structural and functional data, leaving the study incomplete.

      Strengths:

      The overall structure, and proposed mechanism of G-protein coupling seem solid. The authors perform fairly extensive unbiased mutagenesis to identify a host of positions that are important to G-protein signaling. To my knowledge, bicarbonate is the only physiological ligand that has been identified for GPR30, making this study a particularly important contribution to the field.

      Weaknesses:

      Without higher resolution structures and/or additional experimental assessment of the binding pocket, the assignment of the bicarbonate remains highly speculative. The local resolution is especially poor in the ECL loop region where the ligand is proposed to bind (4.3-4.8 Å range). Of course, sometimes it is difficult to achieve high structural resolution, but in these cases, the assignment of ligands should be backed up by even more rigorous experimental validation. The functional assay monitors activation of GPR30, and thus reports on not only bicarbonate binding, but also the integrity of the allosteric network that transduces the binding signal across the membrane. Thus, disruption of bicarbonate signaling by mutagenesis of the putative coordinating residues does not necessarily mean that bicarbonate binding has been disrupted. Moreover, the mutagenesis was apparently done prior to structure determination, meaning that residues proposed to directly surround bicarbonate binding, such as E218, were not experimentally validated. Targeted mutagenesis based on the structure would strengthen the story.

      Moreover, the proposed bicarbonate binding site is surprising in a chemical sense, as it is located within an acidic pocket. The authors cite several other structural studies to support the surprising observation of anionic bicarbonate surrounded by glutamate residues in an acidic pocket (references 31-34). However, it should be noted that in general, these other structures also possess a metal ion (sodium or calcium) and/or a basic sidechain (arginine or lysine) in the coordination sphere, forming a tight ion pair. Thus, the assigned bicarbonate binding site in GPR30 remains an anomaly in terms of the chemical properties of the proposed binding site.

      Thank you for your insightful comments. Based on the weaknesses you pointed out, we reconstructed the receptor based on the improved density and removed the bicarbonate model. We performed calcium assays using cell lines stably expressing the variant based on the structure.

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, "Cryo-EM structure of the bicarbonate receptor GPR30," the authors aimed to enrich our understanding of the role of GPR30 in pH homeostasis by combining structural analysis with a receptor function assay. This work is a natural development and extension of their previous work (PMID: 38413581). In the current body of work, they solved the first cryo-EM structure of the human GPR30-G-protein (mini-Gsqi) complex in the presence of bicarbonate ions at 3.21 Å resolution. From the atomic model built based on this map, they observed the overall canonical architecture of class A GPCR and also identified 4 extracellular pockets created by extracellular loops (ECLs) (Pockets A-D). Based on the polarity, location, and charge of each pocket, the authors hypothesized that pocket D is a good candidate for the bicarbonate binding site. To verify their structural observation, on top of the 10 mutations they generated in the previous work, the authors introduced another 11 mutations to map out the essential residues for the bicarbonate response on hGPR30. In addition, the human GPR30-G-protein complex model also allowed the authors to untangle the G-protein coupling mechanism of this special class A GPCR that plays an important role in pH homeostasis.

      Strengths:

      As a continuation of their recent Nature Communication publication (PMID: 38413581), this study was carefully designed, and the authors used mutagenesis and functional studies to confirm their structural observations. This work provided high-resolution structural observations for the receptor in complex with G-protein, allowing us to explore its mechanism of action, and will further facilitate drug development targeting GPR30. There were 4 extracellular pockets created by ECLs (Pockets A-D). The authors were able to filter out 3 of them and identified that pocket D was a good candidate for the bicarbonate binding site based on the polarity, location, and charge of each pocket. From there, the authors identified the key residues on GPR30 for its interaction with the substrate, bicarbonate. Together with their previous work, they carefully mapped out nine amino acids that are critical for receptor reactivity.

      Weaknesses:

      It is unclear how novel the aspects presented in the new paper are compared to the most recent Nature Communications publication (PMID: 38413581). Some areas of the manuscript appear to be mixed with the previous publication. The work is still impactful to the field. The new and novel aspects of this manuscript could be better highlighted.

      I also have some concerns about the TGFα shedding assay the authors used to verify their structural observation. I understand that this assay was also used in the authors' previous work published in Nature Communications. However, there are still several things in the current data that raised concerns:

      Thank you for your insightful comments. Based on the weaknesses you pointed out, we highlighted the new and novel aspects of this manuscript. We performed calcium assays using cell lines stably expressing the variant based on the structure.

      (1) The authors confirmed the "similar expression levels of HA-tagged hGPR30" mutants by WB in Supplemental Figure 1A and B. However, compared to the hGPR30-HA (~6.5 when normalized to the housekeeping gene, Na-K-ATPase), several mutants of the key amino acids had much lower surface expression: S134A, D210A, C207A had ~50% reduction, D125A had ~30% reduction, and Q215A and P71A had ~20% reduction. This weakens the receptor reactivity measured by the TGFα shedding assay.

      Since the calcium assay data are included in the main figure, the TGFα shedding assay and WB expression quantification data are now shown in Figure 3–figure supplement 1-4, and we included an explanation of the expression levels in the figure caption.

      (2) In the previous work, the authors demonstrated that hGPR30 signals through the Gq signaling pathway and can trigger calcium mobilization. Given that calcium mobilization is a more direct measurement for the downstream signaling of hGPR30 than the TGFα shedding assay, pairing the mutagenesis study with the calcium assay will be a better functional validation to confirm the disruption of bicarbonate signaling.

      According to the suggestion, we performed calcium assay using cell lines stably expressing the mutants (Figure 3F, G and Figure 3–figure supplement 1-3).

      (3) It was quite confusing for Figure 4B that all statistical analyses were done by comparing to the mock group. It would be clearer to compare the activity of the mutants to the wild-type cell line.

      Thank you for your comment. As you mentioned, the comparisons are made between wild-type GPR30 and mutants in the revised manuscript (Figure 3G, Figure 3–figure supplement 4B).

      Additional concerns about the structural data include

      (1) E218 was in close contact with bicarbonate in Figure 4D. However, there is no functional validation for this observation. Including the mutagenesis study of this site in the cell-based functional assay will strengthen this structural observation.

      We cancelled the bicarbonate model, and we performed mutation analysis targeting all residues facing the binding pocket using cell lines that stably express variants including E218A.

      (2) For the flow chart of the cryo-EM data processing in Supplemental data 2, the authors started with 10,148,422 particles after template picking, then had 441,348 Particles left after 2D classification/heterogenous refinement, and finally ended with 148,600 particles for the local refinement for the final map. There seems to be a lot of heterogeneity in this purified sample. GPCRs usually have flexible and dynamic loop regions, which explains the poor resolution of the ECLs in this case. Thus, a solid cell-based functional validation is a must to assign the bicarbonate binding pocket to support their hypothesis.

      We re-analyzed the dataset and improved the local resolution of the extracellular region (Author response image 1) and cancelled the bicarbonate model. Yet, as suggested by the reviewer, solid cell-based functional validation is an effective way to analyze the functional response of the receptor to bicarbonate. Thus, we performed mutation analysis targeting all residues facing the binding pocket using cell lines stably expressing the mutants, whose surface expression levels were analyzed by FACS (Figure 3F, G and Figure 3–figure supplement 1-3).

      Reviewer #3 (Public Review):

      Summary:

      GPR30 responds to bicarbonate and regulates cellular responses to pH and ion homeostasis. However, it remains unclear how GPR30 recognizes bicarbonate ions. This paper presents the cryo-EM structure of GPR30 bound to a chimeric mini-Gq in the presence of bicarbonate. The structure together with functional studies aims to provide mechanistic insights into bicarbonate recognition and G protein coupling.

      Strengths:

      The authors performed comprehensive mutagenesis studies to map the possible binding site of bicarbonate.

      Weaknesses:

      Owing to the poor resolution of the structure, some structural findings may be overclaimed.

      Based on EM maps shown in Figure 1a and Figure Supplement 2, densities for side chains in the receptor particularly in ECLs (around 4 Å) are poorly defined. At this resolution, it is unlikely to observe a disulfide bond (C130ECL1-C207ECL2) and bicarbonate ions. Moreover, the disulfide between ECL1 and ECL2 has not been observed in other GPCRs and the published structure of GPR30 (PMID: 38744981). The density of this disulfide bond could be noise.

      The authors observed a weak density in pocket D, which is accounted for by the bicarbonate ions. This ion is mainly coordinated by Q215 and Q138. However, the Q215A mutation only reduced but not completely abolished bicarbonate response, and the author did not present the data of Q138A mutation. Therefore, Q215 and Q138 could not be bicarbonate binding sites. While H307A completely abolished bicarbonate response, the authors proposed that this residue plays a structural role. Nevertheless, based on the structure, H307 is exposed and may be involved in binding bicarbonate. The assignment of bicarbonate in the structure is not supported by the data.

      Thank you for your insightful comments. Based on the weaknesses you pointed out, we reconstructed the receptor based on the improved density and removed the bicarbonate model. We performed calcium assays using cell lines stably expressing the variant based on the structure.

      Reviewer #1 (Recommendations For The Authors):

      (1) The experimental validation of the bicarbonate binding could be strengthened by developing an assay that directly monitors bicarbonate binding (rather than GPCR signaling)

      We agree that a direct binding assay for bicarbonate would be highly attractive (e.g., a filter binding assay using 14C-HCO₃⁻). However, the weak affinity of bicarbonate ions (in the mM range) would make reliable radioisotope-based detection impossible owing to minimal specific receptor occupancy and high non-specific background. Such an assay is therefore highly challenging, and there are limits to what can be done in this structural paper.

      and determining a structure at comparable resolution in the absence of bicarbonate. In addition, all residues that are proposed to be located adjacent to the bicarbonate should be mutated and functionally validated.

      We re-modeled the receptor based on the improved density and canceled the bicarbonate model. We performed calcium assay using cell lines stably expressing the mutants (Figure 3F, G and Figure 3–figure supplement 1-3).

      (2) What are the maps contoured in Figure 4D? The legend should describe this. Is 218 within the map region shown, or is there no density for its sidechain?

      We removed the corresponding figure and cancelled the bicarbonate model.

      (3) The contour level of the maps in Figure 1 - Figure Supplement 2 should also be indicated. Are these all contoured at the same level?

      Thank you for your comment. We re-analyzed the same data set and obtained new density maps and models. We reworked Figure 1 and Figure 1–figure supplement 2; the contour level of the map for Figure 1 and of the composite map for Figure 1–figure supplement 2 is the same, 7.65.

      (4) Regarding the cited structures of bicarbonate-binding proteins, for three of the four cited structures, the bicarbonate is actually coordinated by positive ligands, with the Asp/Glu playing a more peripheral role:

      Capper et al: Overall basic cavity with tight bidentate coordination by Arg. The Glu is 5-6 Å away.

      Koropatkin et al: Two structures. The first, solved at pH 5, is proposed to have carbonic acid bound. The second, solved at pH 8, shows carbonate in a complex with calcium, with the calcium coordinated by carboxylates.

      Wang et al: The bicarbonate is coordinated by a lysine and a sodium ion. The sodium is coordinated by carboxylates.

      The authors should more thoughtfully discuss the unusual properties of this binding site with regard to the previous literature. Is it possible that bicarbonate binds in complex with a metal ion? Could this possibility be experimentally tested?

      We cancelled the bicarbonate model.

      (5) As a structure of GPR30 has been recently published by another group (PMID: 38744981), it would be valuable to discuss structural similarities and differences and discuss how bicarbonate activation and activation by the chloroquine ligand identified by the other group might both be accommodated by this structure.

      Thank you for your valuable comment. We compared our structure with those reported by the other group and added the following discussion: “During the revision of this manuscript, the structures of apo-GPR30-G<sub>q</sub> (PDB 8XOG) and the exogenous ligand Lys05-bound GPR30-G<sub>q</sub> (PDB 8XOF) were reported [42]. We compared our structure of GPR30 in the presence of bicarbonate with these structures. In the extracellular region, the position of TM5 in GPR30 in the presence of bicarbonate is similar to that in apo-GPR30. In contrast, the position of TM6 is shifted outward relative to that of apo-GPR30, resembling the conformation observed in Lys05-bound GPR30 (Figure 6A, B). Additionally, the position of ECL1 is also shifted outward compared to that of apo-GPR30 (Figure 6B). In the GPR30 structure in the presence of bicarbonate, ECL2 was modeled, suggesting differences in structural flexibility. These findings indicate that the structure of GPR30 in the presence of bicarbonate is different from both the apo structure and the Lys05-bound structure, demonstrating that the structure and the flexibility of the extracellular domain of GPR30 change depending on the type of ligand. Furthermore, focusing on the interaction with G<sub>q</sub>, the αN helix of G<sub>q</sub> is not rotated in the structure bound to Lys05, in contrast to the characteristic bending of the αN helix in our structure (Figure 6C, D). Although it is necessary to consider variations in experimental conditions, such as salt concentration, the differences in the G<sub>q</sub> binding modes suggest that the downstream signals may change in a ligand-dependent manner.” (lines 249-266).

      Reviewer #2 (Recommendations For The Authors):

      (1) It is highly recommended that the authors carefully go through the "insights into bicarbonate binding" section. The results of the new findings in this paper were blended in with the results from the previous work: the importance of E115, Q138, and H307 in the receptor-bicarbonate interaction was shown in the Nature Communication paper but the authors didn't make it clear, which added a little confusion.

      We emphasized this fact in the main text (lines 130-132).

      (2) It would be nice for the authors to add some content about the physiological concentration of HCO3 or refer more to their previous work about the rationale for selecting the bicarbonate dose in their functional assay.

      Thank you for your comment. The physiological concentration of bicarbonate is 22-26 mM in the extracellular fluid, including interstitial fluid and blood, and 10-12 mM in the intracellular fluid. The bicarbonate concentration changes under various physiological and pathological conditions: metabolic acidosis in chronic kidney disease causes a drop to 2-3 mM, and metabolic alkalosis induced by severe vomiting raises HCO<sub>3</sub><sup>-</sup> concentrations to more than 30 mM. Thus, our present and previous work clearly shows that GPR30 is activated by physiological concentrations of bicarbonate, whether it is localized intracellularly or on the membrane, and that GPR30 can be deactivated or reactivated in various pathophysiological conditions. We added this to the discussion section (lines 267-278).

      (3) In Figure 3A, in the legend, the authors mentioned: "black dashed lines indicate hydrogen bonds". No hydrogen bond was noted in the figure.

      We thoroughly revised Figure 3.

      (4) Figure 3B, it would be helpful for the authors to denote the meaning of the blue-white-red color coding in the legend.

      We removed the figure.

      (5) Supplemental Figure 3: since AF3 was released on May 3rd, it would be awesome in the revision version if the authors would update this to the AF3 model.

      The AF2 model has been replaced with the AF3 model (Figure 2–figure supplement 2A-C). The AF2 and AF3 models are almost identical, and both form incorrect disulfide bonds. This confirms the usefulness of the experimental structure determination in this study.

      (6) Supplemental Figure 4: it wasn't clear to me if the expression experiments were repeated multiple times or if there was any statistical analysis for the expression level was done in this study.

      We performed the western blot expression experiment once and did not perform statistical analyses. During revision, we performed repeated FACS analyses of HEK cells stably expressing N-terminally HA-tagged wild-type or mutant GPR30 to analyze membrane and whole-cell expression (Figure 3–figure supplement 1-3). We then performed calcium assays using these stable cell lines (Figure 3F, G and Figure 3–figure supplement 1-3).

      (7) Supplemental Figure 4: Also, is there a reason for the authors to compare the expression level of hGPR30 to the housekeeping gene Na-K-ATPase rather than the total loaded protein? Traditionally, housekeeping genes have been used as loading controls to semiquantitatively compare the expression of target proteins in western blots. However, numerous recent studies show that housekeeping proteins can be altered due to experimental conditions, biological variability across tissues, or pathologies. A consensus has developed for using total protein as the internal control for loading. An editorial from the Journal of Biological Chemistry reporting on "Principles and Guidelines for Reporting Preclinical Research" from the workshop held in June 2014 by the NIH Director's Office, Nature Publishing Group, and Science stated, "It is typically better to normalize Western blots using total protein loading as the denominator".

      Thank you for your instructive comment. We performed western blotting with the same amount of total protein loaded in each lane: 20 µg for whole-cell lysate and 1.5 µg for cell surface protein (Figure 3–figure supplement 3C-F).

      Reviewer #3 (Recommendations For The Authors):

      The claim about this disulfide should be removed unless the authors can provide mass spec evidence.

      Thank you for your crucial comments. First, C130 is a residue of TM3, not ECL1, so we have corrected our misprint to C130<sup>3.25</sup>. C207<sup>ECL2</sup>, located at position 45.50, is the most conserved residue in ECL2, and it forms a disulfide bond with the cysteine at position 3.25 (PMID: 35113559). We additionally cited this paper regarding the conservation of the C130<sup>3.25</sup>-C207<sup>ECL2</sup> bond (line 103). Indeed, disruption of this disulfide bond by the C207<sup>ECL2</sup>A mutation resulted in a marked reduction in receptor activity. In addition, we re-analyzed the data set to improve the local resolution of the extracellular region, which showed that the density of ECL2 is not noise (Figure 2–figure supplement 2). Based on the structural analysis data and the conservation, we are confident that the disulfide bond is present.

      The highly flexible extracellular region is greatly affected by experimental conditions and ligands, so we speculate that this is why ECL2 and the disulfide bond were not observed in other reported structures of GPR30. We have therefore added the following sentence to the discussion: “In the GPR30 structure in the presence of bicarbonate, ECL2 was modeled, suggesting differences in structural flexibility.” (lines 256-257).

      The authors should remove the assignment of bicarbonate in the structure, and tone down the binding site of bicarbonate.

      We removed the bicarbonate model.

      Minor:

      (1) The potency of bicarbonate for GPR30 is in the mM range. Although the concentration of bicarbonate in the serum can reach mM range, how about its concentration in the tissues? Given its low potency, it may be not appropriate to claim GPR30 is a bicarbonate receptor at this point, but the authors can claim that GPR30 can be activated by or responds to bicarbonate.

      The physiological concentration of bicarbonate is 22-26 mM in the extracellular fluid, including interstitial fluid and blood, and 10-12 mM in the intracellular fluid. Therefore, GPR30 is activated by physiological concentrations of bicarbonate in the tissues. The bicarbonate concentration also changes under various physiological and pathological conditions: metabolic acidosis in chronic kidney disease causes a drop to 2-3 mM, and metabolic alkalosis induced by severe vomiting raises HCO<sub>3</sub><sup>-</sup> concentrations to more than 30 mM. Thus, our work clearly shows that GPR30 is activated by physiological concentrations of bicarbonate, whether it is localized intracellularly or on the membrane, and that GPR30 can be deactivated or reactivated in various pathophysiological conditions. For these reasons, we claim that GPR30 is a bicarbonate receptor (lines 267-278).

      (2) The description that there is no consensus on a drug that targets GPR30 is not accurate, since lys05 has been reported as an agonist of GPR30 and their structure is published (PMID: 38744981). The published structures of GPR30 should be introduced in the paper.

      We added a discussion of the structural comparison with the Lys05-bound structure (Figure 6, lines 249-266).

      (3) BW numbers in Figure 4A should be shown.

      We added BW numbers in the figures of the mutational studies.

    1. boardrooms and parliaments, it's somewhere between 3 to 21%. Now, again, numbers are very disputed

      for - stats - psychopathy - 3 to 21% in boardrooms and parliaments - more likely to find psychopath in boardroom and parliament than grocery store - SRG comment - stats- shadow side of leadership - high percentage of leaders have dark triad

    1. R0:

      Reviewer #1:

      This sub study was nested in a factorial randomized controlled trial (RCT) in women aged 18–30 years. Participants included in this study were randomized to receive either a preconception intervention package or routine care until early childhood. The design strategy involved a reasonable sample size justification to show superiority. The sample needed for the study objectives was well justified with power considerations. However, the investigators do note that the sample size, while adequate for detecting moderate effect sizes, may have been insufficient to identify smaller but clinically meaningful differences. The descriptives are informative as seen in Tables 1 and 2.

      1. Please define IQR in the footnote of Table 2 or put a descriptive section in the ‘Analysis Plan’ paragraph.

      Generalized linear models (GLMs) with a Gaussian family and identity link function were used to estimate mean differences in CRP, AGP, IGF-1, and IGFBP3 concentrations. To estimate risk ratios for inflammatory status between infants in the intervention and routine care groups, GLMs with a binomial family and log link function were employed. Final models were adjusted for place of birth. There are several considerations needing clarification.

      There are four endpoints. Therefore,

      1. Some consideration of multiple comparison p-value adjustment should have been discussed.
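      As an illustration of what such an adjustment could look like, the following is a minimal sketch of the Holm step-down correction applied across four endpoints. The p-values are hypothetical placeholders, not results from the manuscript, and `holm_adjust` is an illustrative helper name; Holm is only one of several reasonable choices (Bonferroni or a false discovery rate procedure would be alternatives).

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1); rank is 0-based.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical p-values for the four endpoints (illustrative only).
example = {"CRP": 0.01, "AGP": 0.04, "IGF-1": 0.03, "IGFBP3": 0.20}
for name, p_adj in zip(example, holm_adjust(list(example.values()))):
    print(f"{name}: adjusted p = {p_adj:.3f}")
```

      An endpoint would then be declared significant only if its adjusted p-value falls below the chosen alpha level.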

      Also, with respect to model content,

      1. Exactly how was adjustment by birthplace incorporated into the models?
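      For context, a binomial GLM with a log link that contains only the group indicator reproduces the crude risk ratio from the 2x2 table; any covariate such as place of birth then enters as an additional term on the log scale. The sketch below computes that crude ratio with a Wald-type interval on the log scale. The counts and the function name `risk_ratio` are hypothetical and are not taken from the manuscript.

```python
import math


def risk_ratio(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Crude risk ratio with a log-scale Wald confidence interval."""
    risk_trt = events_trt / n_trt
    risk_ctl = events_ctl / n_ctl
    rr = risk_trt / risk_ctl
    # Standard error of log(RR) from the usual large-sample approximation.
    se_log = math.sqrt(1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Hypothetical counts: 10/100 infants with elevated inflammation in the
# intervention group versus 20/100 in routine care (illustrative only).
rr, lower, upper = risk_ratio(10, 100, 20, 100)
print(f"RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

      Stating whether place of birth was entered as a categorical covariate, a stratification factor, or a random effect would make the adjusted estimates reproducible.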

      The overall conclusions follow from the analyses performed and results seen in Table 3. The strengths and limitations are reasonably described in the ‘Discussion’ section. As an added point, however,

      4. There is a gap between the manuscript text and the supplement supporting information proposal Version 2.0. Was there any attempt to explore the mediation analysis discussed in that proposal?

      Reviewer #2:

      1. Overall Assessment: This study reports a well-designed randomized controlled trial. It investigates the impact of an integrated intervention on infant biomarkers related to inflammation and growth, such as CRP, AGP, IGF-1, and IGFBP3. The research addresses a significant question in maternal and child health. However, the discussion section could be improved with a more detailed explanation of biological plausibility. The implications of the study could also be elaborated more broadly.
      2. Originality and Relevance: The research topic appears to be original and highly relevant. The novelty of this study lies in integrating interventions across stages, from preconception to 2 years of early child development. The intervention is policy-relevant and aligns well with Goal-2 and Goal-4 of SDG-2030. The concept is innovative, and similar integrated frameworks are reported in the literature. The specific, distinct approach of this study needs to be articulated.
      3. Scientific Rigor and Methodology: This randomized controlled design follows standard protocols, and the manuscript is well aligned with CONSORT guidelines. Please elaborate on the randomization process, blinding, and control of confounders. The sample size calculations appear to be powered for anthropometric assessments. For biomarker outcomes, the sample size calculations need to be refined or justified.
      4. Results and Interpretation: The results of this study show no significant differences in biomarkers between the intervention and control groups. The null findings could be discussed with possible biological explanations such as timing of assessment, nutritional variability, and breastfeeding. Subgroup analysis by maternal or infant characteristics could be helpful.
      5. Discussion and Implications: There is scope to elaborate the discussion section by linking the pathways of maternal interventions with infant biomarker responses. The implications of this study for public health, including integration into maternal and child health programs, could be discussed, highlighting the need for long-term follow-up.
      6. Presentation and Clarity: The manuscript is well written and well organized as per the required guidelines. However, most of the references are quite old, and references from 2022 onwards are missing. More recent citations from 2023-2025 could be included.
      7. Ethical and Data Considerations: All the ethical procedures are described clearly, including IEC and CTRI. Data availability through Open Access links is provided.
      8. Conclusion and Recommendation: This well-executed trial can provide good evidence for understanding the biological outcomes of integrated maternal-child interventions.

      Recommendation: Minor Revision.

      Reviewer #3:

      This study is a secondary analysis of the WINGS factorial randomized controlled trial evaluating the effects of a multidomain, integrated intervention delivered from preconception through early childhood on infant biomarkers of inflammation and growth (CRP, AGP, IGF-1, IGFBP3) at 3 and 6 months of age. This study links the integrated intervention to specific changes in inflammatory and growth-related biomarkers like CRP, AGP, IGF-1 and IGFBP3. The study addresses a biologically relevant and policy-important question related to early-life interventions in low-resource settings. The findings indicate no significant differences in these biomarkers between the intervention and control groups, except for a transient decrease in IGFBP3 at 3 months, which was not sustained at 6 months. The authors conclude that while the intervention improved growth outcomes in the parent trial, it did not significantly influence early-life inflammation or IGF axis biomarkers. The manuscript is well written, clearly articulated and follows the required CONSORT guidelines.

      Major Comments

      1. Rationale and Framing

      • The biological rationale connecting integrated maternal–child interventions (nutrition, WASH, psychosocial care) with the specific biomarkers studied (CRP, AGP, IGF-1, IGFBP3) needs clarity.

      • Clarify why these markers and the 3- and 6-month time points were selected, especially since primary growth outcomes were reported at 24 months in the main WINGS paper.

      • A concise conceptual model or figure showing hypothesized pathways could help readers follow the mechanistic logic.

      2. Study Power and Sample

      • The power calculation is based on CRP only. Please justify the adequacy of the sample size for detecting meaningful differences in IGF-1 and IGFBP3, given their biological variability in infancy.

      • Power calculations are based on LAZ outcomes from the primary WINGS study rather than biomarker data. This needs justification.

      3. Statistical Analysis and Results

      • Tables 2 and 3 could be simplified to highlight group comparisons more effectively.

      • Adjustment only for the place of delivery seems limited.

      • The authors may consider other covariates in the analysis, such as mothers’ BMI, socioeconomic indicators, or exposure to infections. If these were intentionally excluded, please explain why.

      • It would be useful to include effect size interpretation (e.g., percentage change or standardized mean difference) to better convey the biological relevance of null findings.

      4. Interpretation of Findings

      • Cautious interpretation of the null findings is needed. Aspects such as biological plausibility, contextual limitations, and future implications for longitudinal research require further elaboration.

      • The discussion acknowledges the absence of significant effects, but it could be deepened if the authors address the following issues:

      o Address low baseline inflammation as a potential ceiling effect.

      o Note that intervention effects might appear later in life (after 6 months).

      o Acknowledge that non-inflammatory mechanisms (caregiving, infection prevention, psychosocial stimulation) might explain the positive growth outcomes in the primary trial.

      • Expand the comparison with similar trials, such as SHINE (Zimbabwe), ELICIT (Tanzania), and MAL-ED studies, that examined inflammation and growth factor pathways.

      • The trial was conducted in a single urban Indian setting, which limits extrapolation to rural or diverse socioeconomic contexts. The discussion should acknowledge this limitation more explicitly and suggest strategies for replication in varied environments.

      5. Policy and Program Implications

      • The conclusion is based on the non-significant biomarker findings, whereas the short duration of biomarker assessment may oversimplify complex biological processes. A more elaborate discussion is needed of possible confounders such as infections and the duration and type of breastfeeding.

      Minor Comments

      1. Abstract: Conclude with a stronger statement about contribution: e.g., “These findings add to the understanding of biological mechanisms underlying integrated early-life interventions in LMICs.”

      2. Tables: Present only adjusted results in the main text; unadjusted data may be submitted as supplementary files. Ensure all tables include units (mg/L, ng/mL) and consistent decimal formatting.

      3. CONSORT Diagram: Please include the number of exclusions, losses to follow-up, and reasons for non-participation in Figure 1 for transparency.

      4. Discussion: Add a short note acknowledging that biomarker variability in early infancy is high and may obscure subtle intervention effects.

      5. References: Consider citing more recent literature (published within the last 3 years) that links microbiome–inflammation–growth relationships in infants.

      6. Language and Formatting: Ensure consistency in abbreviations (e.g., IGFBP3 vs IGF-BP3). Use consistent phrasing for “preconception, pregnancy, and early childhood interventions, growth-related biomarkers, and growth factor profiles” throughout.

      Overall Recommendations: Minor-to-Moderate Revision.

      This is a robust, well-implemented study addressing an important mechanistic question within global child health. Although the results are null, they offer valuable insights into early-life biology and integrated program evaluation. Strengthening the biological framing, contextual discussion, and presentation of adjusted analyses will substantially enhance the manuscript’s impact and readability.

    1. Synthesis of the Plenary Session of the Economic, Social and Environmental Council

      Summary

      The plenary session of the Economic, Social and Environmental Council (CESE) was organized around two main themes:

      the examination and unanimous adoption of a crucial opinion on the rights and fundamental needs of the child,

      and a series of statements on current issues reflecting the concerns of civil society.

      The opinion entitled "Meeting children's fundamental needs and guaranteeing their rights in all the times and spaces of their daily lives" was adopted unanimously (130 votes in favor).

      Designed to complement the work of the Citizens' Convention on the same subject, this opinion draws a severe picture of the situation of children in France, marked by growing inequalities (social, territorial, economic) and a persistent gap between proclaimed rights and their actual application. The document highlights a society designed "by and for adults" that struggles to place the child at the center of its concerns.

      The key recommendations include introducing a "childhood impact clause" in every piece of legislation, an ambitious reform of school schedules, guaranteed equitable access to leisure and vacations, and the creation of a "public service for educational continuity" to coordinate all stakeholders.

      The statement by Claire Hédon, Defender of Rights, reinforced this diagnosis with alarming figures on violations of children's rights, particularly for the most vulnerable.

      Ahead of this debate, the open-expression session addressed a variety of issues:

      • challenges to the legitimacy of citizen participation,

      • drastic cuts in official development assistance,

      • threats to the health system,

      • environmental deregulation at the European level,

      • the dangers of new GMOs,

      • the rise in workplace accidents,

      • the pressure placed on job seekers,

      • and calls for concrete food sovereignty.

      Finally, the presentation of the CESE budget revealed a strained financial situation, marked by a decline in State funding and threatened by potential new cuts voted by the Senate, jeopardizing the institution's capacity to carry out its missions, in particular the organization of future citizens' conventions.

      I. Open-Expression Session: An Overview of Societal Concerns

      Before the opinion on childhood was examined, several speakers voiced the concerns of their respective groups on current issues.

      Defense of Citizen Participation (Agatha Mel):

      On behalf of the student organizations, the speaker defended the Citizens' Convention on children's time, denouncing the "accusations of illegitimacy, incompetence and manipulation" and calling for a serious debate on the substance of the report, without caricaturing the citizens' work.

      Official Development Assistance (Jean-Marc Boivin):

      The associations group warned about the "drastic and disproportionate" cuts (-60% in 2 years) in the official development assistance budget, leading to the closure of 1,300 projects and the loss of 10,000 jobs, and affecting more than 15 million people.

      Impact on Health (Dominique Joseph):

      The Mutualité Française called the increase in the tax on supplementary health insurance irresponsible, describing it as a "VAT on health", and stressed the need for fundamental reform of the social protection system.

      Environmental Deregulation (Florent Compnibus):

      The environment group denounced the European "Omnibus" legislative proposal as "massive deregulation" and an "outright abandonment of the precautionary principle", introducing unlimited authorizations for pesticides and biocides and weakening companies' duty of vigilance.

      Opposition to New GMOs (Éric Meer):

      The social and ecological alternative group criticized the European agreement on new genomic techniques (NGT), seeing it as a "technological headlong rush" that favors patenting, increases farmers' dependence, and deprives consumers of traceability.

      Workplace Accidents (Ingrid Clément):

      The CFDT described 2024 as a "dark year", with 774 workplace deaths (two per day), a 26% increase in accidents among women, and a rise in musculoskeletal disorders and psychological conditions, and called for stronger primary prevention.

      Pressure on Job Seekers (Isabelle Dor):

      The associations group relayed testimonies from people supported by France Travail describing "infantilization", "insane pressure" and threats of being struck off the rolls, illustrating situations described as absurd for RSA recipients and the working poor.

      Support for Union Solidarity (Alain le corps):

      The CGT denounced the formal investigation opened against its general secretary, Sophie Binet, for having used the expression "the rats are leaving the ship", asserting that it is "not an insult, but the bitter observation of irresponsible behavior".

      Food Sovereignty (Henriespéré):

      The agriculture group relayed the minister's remarks on the looming "agricultural war", calling for a move "from words to action" to revive French agricultural sectors through innovation and reciprocity of standards.

      II. The CESE Opinion on the Fundamental Needs and Rights of the Child

      The core of the session was devoted to the opinion "Meeting children's fundamental needs and guaranteeing their rights", prepared by the education, culture and communication committee.

      This opinion is the contribution of organized civil society, produced in parallel with the Citizens' Convention on children's time, which was commissioned by the Prime Minister.

      A. The Speech by the Defender of Rights (Claire Hédon)

      By way of introduction, Claire Hédon, Defender of Rights and of children, delivered a dense statement, stressing the gap between the "proclaimed right and its effectiveness".

      Volume of Complaints: The institution received 3,073 complaints concerning violations of children's rights in 2024. 30% of these complaints concern the schooling of pupils with disabilities.

      Consultation of Children: To prepare its 2025 report, more than 1,600 children and young people were heard, underlining the importance of their voice, which is "too often absent from public debate".

      Access to Leisure: One striking figure illustrates the massive inequalities: 71% of children from low-income families take part in no sporting or cultural activity, compared with only 38% of children from well-off families.

      The situation is even more critical in the overseas territories: in Mayotte, facilities are four times scarcer than in mainland France.

      Screen Time: Time spent in front of screens is rising sharply, reaching an average of 4 hours 48 minutes per day among 11-14 year olds (outside school) and up to 5 hours 10 minutes among 16 year olds, with serious consequences for sleep and mental health.

      Right to Education: The Defender warned about lost teaching hours, citing the case of first-grade (CP) pupils in Marseille who went without lessons for a month, and the figure of 27,000 young people nationwide without a high school placement at the start of 2024.

      Climate Impact: Global warming threatens the continuity of the public education service.

      By 2030, nearly 7,000 preschools will be exposed to heat waves above 35°C.

      B. Presentation of the Draft Opinion by the Committee

      The rapporteurs presented a draft opinion structured around one fundamental principle: the child is a person in their own right.

      The common thread of the analysis is a triptych: the rights of the child, the satisfaction of the child's needs, and the fight against inequalities.

      Major Findings and Issues

      Rights That Are Poorly Enforced: Despite the ratification of the International Convention on the Rights of the Child, daily reality is marked by rights that are not respected, as underlined by the reports of the UN and the Defender of Rights.

      Growing Inequalities: Social, economic, territorial and environmental inequalities hit children's lives head-on.

      34.3% of single-parent families live in poverty.

      On the eve of the start of the 2025 school year, at least 2,159 children were left without any accommodation.

      An "Adult-Centered" Society: Social organization, in particular working patterns and school schedules, is designed for adults, leaving little room for children's biological and psychological needs.

      The "Indoor" Child: In 20 years, the range within which children move around on their own has fallen from several kilometers to less than 300 meters.

      Four in ten children (aged 3-10) never play outside during the week.

      Key Recommendations

      The opinion sets out 19 recommendations to address these issues. The most structuring are:

      | Theme | Key Recommendation | Description |
      | --- | --- | --- |
      | Governance and Legislation | Create a "childhood impact" clause | Include in the assessment of every bill or draft regulation an analysis of its consequences for children's rights and well-being. |
      | School Time | Affirm that the status quo is no longer tenable | Call for a review of the organization of school days and weeks, recommending an alternation of 7 weeks of classes and 2 weeks of vacation, while keeping 8 weeks in the summer. |
      | Right to Vacations and Leisure | Guarantee equitable access for all | Develop targeted information, introduce income-based pricing, and provide financial support to group childcare and activity facilities to combat inequalities of access. |
      | Connection to Nature | Promote and support "outdoor" education | Roll out measures such as greening schoolyards, educational areas, and local nature education plans to reconnect children with their environment. |
      | Coordination of Stakeholders | Create a public service for educational continuity | Link up existing tools (PEDT, CTG) to guarantee every child access to varied, coherent, high-quality educational time, mobilizing all stakeholders (school, families, associations, local authorities). |
      | Parenthood and Work | Create a right attached to parental obligations | Transpose the European directive on work-life balance to allow parents to make use of flexible working arrangements. |
      | Funding | Ensure a substantial and lasting budgetary effort | Recognize education as an investment in the future and not as a mere expense, by guaranteeing the State, the social security system, and local authorities the resources needed to carry out ambitious public policies. |

      C. Reception and Adoption of the Opinion

      All the political and civil society groups present at the CESE praised the quality and ambition of the opinion.

      The statements converged on the diagnosis of growing inequalities and the need for strong political action.

      The draft opinion was adopted unanimously by the 130 voters.

      In addition, member of parliament Florence Erroin-Léoté announced her intention to introduce a bill on children's right to leisure, drawing on the work of the Citizens' Convention and the CESE to make free time a "place of education, social mixing, emancipation and living democracy".

      III. The CESE Budget: Issues and Vulnerabilities

      The session concluded with the presentation of the CESE budget, which brought to light a worrying financial situation.

      Context of Budgetary Pressure: The president recalled that, at that very moment, the Senate was voting a 5 million euro cut in the CESE budget, against the advice of its own finance committee and of the government.

      Falling Revenue: The budget presented shows a continuous erosion of revenue, notably the end of the specific 4 million euro allocation for organizing citizens' conventions.

      In addition, the renovation work on the Palais d'Iéna will deprive the CESE of about 1.6 million euros in revenue from venue hire in 2026.

      A Fragile Balanced Budget for 2026: The 2026 budget is presented as balanced, but this balance is achieved by not including funding for a new citizens' convention and by reducing certain items such as communication.

      Inability to Fund New Missions: The quaestor was clear: "as things stand, [...] tomorrow we would be unable to run another citizens' convention costing 4 million euros".

      The organization of such missions will now depend on the CESE's ability to obtain ad hoc funding from the government for each commission.

      Massive Real Estate Investment: The presentation stressed that the accumulated cash reserves are now committed to a multi-year investment plan that is essential for renovating the building, making up for decades of underinvestment.

    1. Briefing: The Impact of Smartphones and AI on Adolescence

      Summary

      This summary examines anthropologist David Le Breton's analysis of the profound transformations brought about by the omnipresence of smartphones and artificial intelligence (AI) in adolescents' lives.

      The central finding is a major anthropological rupture, marked by the replacement of "conversation" (an embodied, empathic, reciprocal exchange) by digital "communication", a disembodied, utilitarian interaction that breeds isolation.

      The key points are:

      The End of Conversation: Face-to-face interaction is constantly interrupted by notifications, devaluing physical presence in favor of a virtual world.

      This fragmentation of direct social bonds leads to a documented erosion of empathy among younger generations.

      The Rise of the AI Companion: To fill an emotional and social void, adolescents turn to chatbots, virtual "secret companions" that offer constant, non-judgmental attention.

      This relationship, although narcissistically reassuring, deepens isolation and turns the user into a product whose data are captured and monetized.

      Severe Cognitive and Physical Consequences: Heavy screen exposure is correlated with weakened capacities for concentration, deep reading, and critical thinking.

      It encourages a more sedentary lifestyle, leading to health problems (neck pain, myopia) and a drastic drop in physical activity compared with previous generations.

      A Global Mental Health Crisis: Drawing on multiple studies, David Le Breton draws a direct link between the surge in anxiety, depression, suicide attempts, and self-harm by cutting among adolescents since 2010 and the widespread adoption of the Internet-connected smartphone.

      Societal and Ethical Stakes: Beyond the individual, the analysis points to global cultural homogenization ("MacWorld"), increased vulnerability to fake news, and the serious ethical and environmental implications of the technology (child labor, mining of rare metals, data-center pollution).

      In conclusion, far from being a mere tool, the AI-boosted smartphone is shaping a new anthropology in which the simulation of connection supplants real experience, with harmful consequences for individual development and social cohesion.

      --------------------------------------------------------------------------------

      1. Context of the Analysis

      This analysis is based on remarks by David Le Breton, professor emeritus of anthropology at the University of Strasbourg, known for his work on risk-taking behavior, the body, and more recently on slowing down and walking.

      His talk is part of a broader reflection on young people's mental health and the impact of artificial intelligence (AI) on society.

      2. The Anthropological Rupture: Before and After the Smartphone

      David Le Breton posits that a fundamental anthropological rupture occurred around 2008-2009 with the arrival of high-speed Internet on smartphones.

      This change radically transformed public space and human interactions.

      A "Spectral Society": Cities are now "haunted by kinds of ghosts who are hypnotized by their mobile phones and no longer see anything at all around them".

      Loss of Attention to Surroundings: This hypnotic state creates physical dangers (inattentive pedestrians and cyclists) and social ones, because attention is no longer paid to the immediate surroundings or to the other people present.

      The World Before: Some twenty years ago, the world was radically different.

      Even with the first mobile phones, attention to the surrounding world was not abolished as it is today by the hypnosis of the smartphone screen.

      3. Fundamental Distinction: Conversation versus Communication

      At the heart of Le Breton's analysis is an essential anthropological distinction between two modes of interaction.

      | Characteristic | Conversation | (Digital) Communication |
      | --- | --- | --- |
      | Setting | Face to face, physical presence. | At a distance, often anonymous. |
      | Body | Central (facial expressions, gestures). | Absent, disembodied. |
      | Temporality | Unpredictable; includes time for silence and reflection. | Urgency, efficiency, utilitarianism. Silence is perceived as a "breakdown". |
      | Quality of the bond | Listening, attention, empathy, reciprocity. | Self-centered, instrumental. |

      David Le Breton quotes his own book to underline this point:

      Conversation implies empathy, that is, an ability to put oneself in the other's place and not be a stranger to what they feel.

      This quality disappears in communication at a distance [...] the other then turns into a fiction without substance.

      4. Key Data on Screen Time

      Axel's opening remarks provide figures that put the scale of the phenomenon in context, based in particular on an ARCOM report from April 2025.

      | Age Group | Screen Time in 2011 | Screen Time in 2022/recent |
      | --- | --- | --- |
      | 1-6 years | 1h 47min | 2h 03min |
      | 7-12 years | 2h 51min | 4h 12min |
      | 13-19 years | 4h 20min | 5h 10min |
      | 15-24 years | (not specified) | 5h 48min (exceeds the 50-64 group) |
      | 50-64 years | (not specified) | 5h 27min (mainly live TV) |

      These data show a sharp increase in screen time over roughly a decade, with 15-24-year-olds now the heaviest users, mainly via the smartphone. For some adolescents, screen time can exceed ten hours a day.

      5. The Adolescent and the Virtual Companion (AI)

      Faced with fraying social bonds and emotional abandonment by those close to them, adolescents find in AI, via chatbots, a substitute that is becoming a central phenomenon of contemporary adolescence.

      The "Substitute Comfort Object": AI makes it possible to create a "fictional secret companion" to fill an emotional gap.

      The young person programs this virtual character (name, voice, personality) to make it an ideal conversation partner.

      A Shield of Meaning: The chatbot is always available, kind, and non-judgmental, and gives a sense of control and recognition.

      It becomes a "shield of meaning to ward off distress and suffering".

      The Illusion of Reciprocity: The adolescent interacts with the chatbot as with a real person, forgetting that it is a program designed to capture their data and keep them connected as long as possible.

      The Violence of Indifference: This quest for virtual attention often stems from a lack of real attention, illustrated by the poignant anecdote of a little girl saying to her father, who was hypnotized by his phone:

      Daddy, I want you to listen to me with your eyes.

      6. Consequences for Social Bonds and the Erosion of Empathy

      Paradoxically, hyper-connection generates deep isolation and a deterioration of social skills.

      The Liquidation of the Interlocutor: The physical presence of a friend or parent is immediately "liquidated" as soon as a notification appears.

      The real interlocutor has "less ontological substance than the virtual others".

      The Simulation of Connection: The "hundreds of friends" on social networks are not worth one or two real friends capable of a physical gesture of comfort.

      Digital communication simulates social connection but creates neither intimacy nor reasons to live.

      The Decline of Empathy: A study conducted by sociologist Sherry Turkle of 14,000 students over 30 years shows that since the 2000s, "young people show less interest in others".

      The study's authors draw a direct link between this decline in empathy and growing access to online games and social networks.

      7. Cognitive, Physical, and Behavioral Impacts

      Overexposure to screens and the delegation of thinking to AI have direct, measurable effects on young people's development.

      7.1. Cognitive Impacts

      Reading Difficulty: "Syncopated, simple, constant, ultra-fast" communication makes it hard to read long, elaborate texts, including text messages of more than a few sentences.

      Weak General Knowledge: The belief that all information is one click away discourages in-depth learning.

      Students "struggle to read even a few pages of an article or a book".

      Learning Passivity: Systematic reliance on AI for immediate answers (e.g., ChatGPT for homework) hinders the development of independent research, nuance, and critical thinking.

      Outsourcing Memory: Typing and the ability to find everything online weaken memorization, which is an affective, contextual process and not mere information storage.

      7.2. Physical and Behavioral Impacts

      Extreme Sedentariness: Research by physician William Bird shows that within a few decades, the distance an 8-year-old child roams around home has fallen from 9 km to 300 meters.

      Declining Physical Performance: Adolescents in the 1970s were "twice as active". An 800-meter run that used to take 3 minutes now takes 4.

      Health Problems: The worldwide rise in neck and back pain, as well as myopia, is directly linked to the posture of hunching over a screen.

      8. The Adolescent Mental Health Crisis

      David Le Breton concludes his analysis with an alarming human toll, establishing a strong temporal correlation between the spread of the smartphone and the surge in mental disorders among young people from 2010 onward.

      Referring to the work of psychologist Jonathan Haidt ("The Anxious Generation"), he states that adolescent suffering has never in history reached such a scale:

      Anxiety and Depression

      Feelings of Isolation

      Suicide Attempts and Suicides

      Self-harm by cutting (particularly among girls)

      The crisis is also visible among very young children, with language delays in children who are overexposed to screens and deprived of the parental interactions crucial to their development.

      9. Ethical, Cultural, and Environmental Stakes

      The impact of the smartphone and AI extends beyond the individual to affect society as a whole.

      Manipulation and Harassment: AI makes it easy to create "deepfakes" or "deepnudes" to humiliate, discredit, or blackmail individuals, with teenage girls frequent victims.

      Cultural Homogenization ("MacWorld"): Technologies create a global culture unified by the same films, music, series, and consumption patterns, liquidating local cultures and traditional know-how.

      Silicon Valley Hypocrisy: The leaders of the tech giants, aware of the dangers, shield their own children from the technologies they promote by enrolling them in schools (e.g., Waldorf) where digital devices are banned.

      Environmental and Geopolitical Impacts: Digital technology has a massive ecological footprint (data centers, energy consumption) and relies on the extraction of rare metals, fueling geopolitical conflicts and child labor in some countries.

      These aspects are often overlooked in climate debates.

      10. Conclusion and the Analyst's Stance

      David Le Breton insists that his analysis is not that of a "moralist" but that of a sociologist and anthropologist who observes and documents a reality.

      His work aims to point to observable facts documented by numerous studies, stressing that never in history has the social bond been so "damaged".

      The hyper-connected world has coincided with the onset of the "hyper-individualization of our societies", leading to the current social and psychological landscape.

    1. hyperspectral Raman imaging (600-1800 cm⁻¹, 873 dimensions with a pixel size of 3 µm)

      How did you account for spatial mixing that may occur with the given analyzed spot size? It's possible that a neighboring cell signal could be contributing to the target cell.

    2. we manually selected corresponding cellular keypoints across both imaging modalities. This selection tool then generated a 3×3 transformation matrix to adjust the STARmap images to align with the Raman regions. The manual alignment process utilized a least-squares method, employing a modified two-dimensional version of Horn’s (1987) algorithm to account for differences in translation, scale, rotation, and reflection. For each Raman-STARmap paired sample, hundreds of keypoints were manually selected, and the fitgeotrans function in MATLAB was used to transform the STARmap image to match the Raman region. The imshowpair function was employed iteratively after every 20 keypoints to ensure satisfactory alignment.

      I appreciate that this is an important and sometimes tricky problem, but well worth doing! Did you consider using an accuracy metric for the registration?
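
      One way to put a number on the alignment, sketched here in Python rather than the MATLAB workflow quoted above: fit the transform on a subset of the manually selected keypoints and report the root-mean-square error on the held-out ones. The affine least-squares fit is a stand-in for the modified Horn method, the function names are hypothetical, and the keypoints below are synthetic.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2D affine fit; returns a 3x3 homogeneous matrix M
    such that [x', y', 1] = M @ [x, y, 1] maps src keypoints onto dst."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    A = np.hstack([src, np.ones((len(src), 1))])   # (n, 3) design matrix
    P, *_ = np.linalg.lstsq(A, dst, rcond=None)    # (3, 2) coefficients
    M = np.eye(3)
    M[:2, :] = P.T
    return M

def apply_transform(M, pts):
    """Apply a 3x3 homogeneous transform to an (n, 2) array of points."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    return (homog @ M.T)[:, :2]

def registration_rmse(M, src, dst):
    """Root-mean-square distance between transformed src and dst keypoints."""
    residuals = apply_transform(M, src) - np.asarray(dst, dtype=float)
    return float(np.sqrt(np.mean(np.sum(residuals ** 2, axis=1))))

# Synthetic keypoints: rotation + scale + shift, plus 1-pixel placement noise.
rng = np.random.default_rng(0)
theta, scale, shift = np.deg2rad(12.0), 1.8, np.array([40.0, -15.0])
R = scale * np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]])
src = rng.uniform(0, 500, size=(200, 2))
dst = src @ R.T + shift + rng.normal(0, 1.0, size=(200, 2))

# Fit on 150 keypoints, evaluate on the 50 that were held out.
M = fit_affine(src[:150], dst[:150])
print("held-out RMSE (pixels):", registration_rmse(M, src[150:], dst[150:]))
```

      Reporting the error on keypoints that were not used for the fit avoids the optimistic bias of the in-sample residual, and the value can be compared directly with the 3 µm pixel size.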

    3. The backscattered Raman light from the sample passes through two dichroic mirrors (DM1: Semrock LPD01 785RU 25, DM2: Semrock LPD01 785RU 25×36×1.1) and was collected by a multi-mode fiber (Thorlabs M14L 01). The collected signal was delivered to the imaging spectrograph (Holospec f/1.8i, Kaiser Optical Systems) and detected by a thermoelectric cooled, back illuminated and deep depleted CCD (PIXIS: 100BR_eXcelon, Princeton Instruments).

      What is the spectral resolution and sampling rate of this system? The datasheet for this spectrograph lists a resolution of 3-6 cm^-1, and if the spectra have 873 dimensions and cover 600-1800 cm^-1, then the spectral sampling rate is around 1.4 cm^-1. Assuming these numbers are roughly correct, this makes it hard to interpret figures that highlight differences between Raman intensities at wavenumbers closer than either of these values (e.g. 1643, 1644, and 1645 cm^-1 in Extended data Fig 11).
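
      The arithmetic behind this question, as a small sketch. It assumes the 873 channels are evenly spaced across 600-1800 cm^-1 and takes the 3 cm^-1 lower bound of the datasheet resolution quoted above; the helper names are made up for illustration.

```python
def sampling_interval(lo_cm1, hi_cm1, n_channels):
    """Spacing between adjacent channels, assuming an evenly sampled axis."""
    return (hi_cm1 - lo_cm1) / (n_channels - 1)

def separable(wavenumbers, resolution_cm1):
    """True only if every adjacent pair of wavenumbers is at least
    one resolution element apart."""
    w = sorted(wavenumbers)
    return all(b - a >= resolution_cm1 for a, b in zip(w, w[1:]))

step = sampling_interval(600.0, 1800.0, 873)
print(f"sampling interval: {step:.2f} cm^-1")

# Bands highlighted 1 cm^-1 apart fall below both the sampling interval
# and the assumed 3 cm^-1 optical resolution.
print(separable([1643, 1644, 1645], resolution_cm1=3.0))
```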

    1. Larocco writes: “[...] empathy is an orientation to the other, one that attunes to some aspect of the other’s feelings or emotions or thoughts [...] yet which may not engage with the other’s otherness at all. [...] To put the point succinctly: feeling-with is not the same as feeling-for. [...] Empathy, for ethical behavior, is a crucial intersubjective vocalizer, but by itself as an orientation it may not direct the better angels of our nature to direct action.” (Larocco 2018, 3). Larocco here underscores the uncertainty around the potential of this empathic positioning, as there are many possibilities along a spectrum, all the way from authentic identification with another to selective empathy that seeks to misconstrue the other as similar to the self, or identifies only with aspects of the other perceived as similar to the self.

      @RealDidacticus

    Annotators

    1. Reviewer #3 (Public review):

      This work provides a novel statistical model to identify imported malaria cases, which are an important challenge for elimination, particularly in low-transmission areas. This tool was applied in Plasmodium falciparum populations in Mozambique and determined differences in importation rates in 2 low-transmission districts in the South.

      Strengths:

      The study has several strengths, mainly the development of a novel Bayesian model that integrates genomic, epidemiological, and travel data to estimate importation probabilities. The results showed insights into malaria transmission dynamics, particularly identifying importation sources and differences in importation rates in Mozambique. Finally, the relevance of the findings is to suggest interventions focusing on the traveler population to support efforts for malaria elimination.

      Weaknesses:

      The study also has some limitations, although the authors have plans to address them. The sample collection was not representative of some provinces, and not all samples had sufficient metadata for the risk factor analysis. Additionally, the authors used a proxy for transmission intensity and assumed some other conditions to calculate the importation probability for specific scenarios. They plan to conduct a new sample collection and include monthly malaria incidence estimates in the future.

      Comments on revisions:

      - Delete "We added this text to the discussion" in line 302 (Discussion)<br /> - I recommend adding the plans to address limitations indicated in the Response to Reviewers document in the Discussion. This would really strengthen the limitation section.

    2. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This study presents a new Bayesian approach to estimate importation probabilities of malaria, combining epidemiological data, travel history, and genetic data through pairwise IBD estimates. Importation is an important factor challenging malaria elimination, especially in low-transmission settings. This paper focuses on Magude and Matutuine, two districts in southern Mozambique with very low malaria transmission. The results show isolation-by-distance in Mozambique, with genetic relatedness decreasing with distances larger than 100 km, and no spatial correlation for distances between 10 and 100 km. But again, strong spatial correlation in distances smaller than 10 km. They report high genetic relatedness between Matutuine and Inhambane, higher than between Matutuine and Magude. Inhambane is the main source of importation in Matutuine, accounting for 63.5% of imported cases. Magude, on the other hand, shows smaller importation and travel rates than Matutuine, as it is a rural area with less mobility. Additionally, they report higher levels of importation and travel in the dry season, when transmission is lower. Also, no association with importation was found for occupation, sex, and other factors. These data have practical implications for public health strategies aiming for malaria elimination, for example, testing and treating travelers from Matutuine in the dry season.

      Strengths:

      The strength of this study lies in the combination of different sources of data - epidemiological, travel, and genetic data - to estimate importation probabilities, and the statistical analyses.

      Weaknesses:

      The authors recognize the limitations related to sample size and the biases of travel reports.

      We appreciate the review and comment about the manuscript.

      Reviewer #2 (Public review):

      Summary:

      Based on a detailed dataset, the authors present a novel Bayesian approach to classify malaria cases as either imported or locally acquired.

      Strengths:

      The proposed Bayesian approach for case classification is simple, well justified, and allows the integration of parasite genomics, travel history, and epidemiological data. The work is well-written, very organized, and brings important contributions both to malaria control efforts in Mozambique and to the scientific community. Understanding the origin of cases is essential for designing more effective control measures and elimination strategies.

      Weakness:

      While the authors aim to classify cases as imported or locally acquired, the work lacks a quantification of the contribution of each case type to overall transmission.

      The method presented here allows for classifying individual cases according to whether the infection occurred locally or was imported during a trip. By definition, it does not look to secondary infections after an importation event. Our next step is to conduct outbreak investigation to quantify the impact of importation events on the overall transmission, but this activity goes beyond the scope of this manuscript. We clarify this in the discussion section.

      The Bayesian rationale is sound and well justified; however, the formulation appears to present an inconsistency that is replicated in both the main text and the Supplementary Material.

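      Thank you for pointing out the inconsistency in the final formula. In fact, the final formula corresponds to P(I_A│G), instead of P(I_A), so:

      P(I_A│G) = (T_A ∙ PR_A ∙ R'_A) / (T_A ∙ PR_A ∙ R'_A + T_B ∙ PR_B ∙ R'_B)

      instead of

      P(I_A) = (T_A ∙ PR_A ∙ R'_A) / (T_A ∙ PR_A ∙ R'_A + T_B ∙ PR_B ∙ R'_B)

      We have now corrected this error in the new version of the manuscript.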

      Reviewer #3 (Public review):

      The authors present an important approach to identify imported P. falciparum malaria cases, combining genetic and epidemiological/travel data. This tool has the potential to be expanded to other contexts. The data was analyzed using convincing methods, including a novel statistical model; although some recognized limitations can be improved. This study will be of interest to researchers in public health and infectious diseases.

      Strengths:

      The study has several strengths, mainly the development of a novel Bayesian model that integrates genomic, epidemiological, and travel data to estimate importation probabilities. The results showed insights into malaria transmission dynamics, particularly identifying importation sources and differences in importation rates in Mozambique. Finally, the relevance of the findings is to suggest interventions focusing on the traveler population to help efforts for malaria elimination.

      Weaknesses:

      The study also has some limitations. The sample collection was not representative of some provinces, and not all samples had sufficient metadata for risk factor analysis, which can also be affected by travel recall bias. Additionally, the authors used a proxy for transmission intensity and assumed some conditions for the genetic variable when calculating the importation probability for specific scenarios. The weaknesses were assessed by the authors.

      We acknowledge the limitations noted by the reviewer and have the following plans to address them. We will repeat the study with our data collected in 2023, which this time include a good representation of all the provinces of Mozambique; completeness of the metadata collection was ensured by implementing a new protocol in January 2023. Regarding the proxy for transmission intensity, we will refine the model by integrating monthly estimates of malaria incidence (previously calibrated to address testing and reporting rates) from the DHIS2 data, also taking into account the date of the reported cases in the analysis.

      Reviewing Editor Comments:

      The reviewers have made specific suggestions that could improve the clarity and accuracy of this report.

      Reviewer #1 (Recommendations for the authors):

      (1) Abstract, lines 36, 37 and 38: "Spatial genetic structure and connectivity were assessed using microhaplotype-based genetic relatedness (identity-by-descent) from 1605 P. falciparum samples collected (...)", but only 540 samples were successfully sequenced, therefore used in spatial genetic structure and connectivity analysis.

      The 540 samples refer to those from Maputo province and are described in Fig. 1. The spatial and connectivity analyses also included the samples from the rest of the provinces from the multi-cluster sampling scheme. Sample sizes from these provinces are described in Suppl. Table 2; together with the 540 samples from Maputo, they make up the 1605 samples mentioned in the abstract. We specify this number in the caption of Sup. Fig. 4, and have now added it to Fig. 3.

      (2) In the Introduction, some epidemiological context about Magude and Matutuine could be added. It is only mentioned in the Discussion section (lines 265-269).

      We have added some context about both districts in the introduction now.

      (3) In the Discussion, lines 241-244, could the lack of structure mean no barriers for gene flow due to high mobility in short distances? Maybe it could only be resolved with a large number of samples.

      This could be an explanation (we mention it in the new version), although it is not something we can prove, at least not in this study.

      Reviewer #2 (Recommendations for the authors):

      The work is well written, very organized, and brings important contributions both to malaria control efforts in Mozambique and to the scientific community. Based on detailed datasets from Mozambique, the authors present a novel Bayesian approach to classify malaria cases as either imported or locally acquired. Understanding the origin of cases is essential for designing more effective control measures and elimination strategies. My review focuses on the Bayesian approach as well as on a few aspects of the presentation of results.

      The authors combine travel history, parasite genetic relatedness, and transmission intensity from different areas to compute the probability of infection occurring in the study area, given the P. falciparum genome. The Bayesian rationale is sound and well justified; however, the formulation appears to present an inconsistency that is replicated in both the main text and the Supplementary Material. According to Bayes' Rule:

      P(I_A |G) = (P(I_A) ∙ P(G|I_A)) / (P(G)),

      with

      P(I_A) = K ∙ T_A ∙ PR_A,

      P(G│I_A) = R'_A,

      and assuming

      P(I_A│G) + P(I_B│G) = 1,

      the expression,

      (T_A ∙ PR_A ∙ R'_A) / (T_A ∙ PR_A ∙ R'_A + T_B ∙ PR_B ∙ R'_B)

      appears to refer to P(I_A│G), not to P(I_A) (as indicated in the main text and Supplementary Material).

      P(I_A│G) + P(I_B│G) = (P(I_A) ∙ P(G|I_A) + P(I_B) ∙ P(G|I_B)) / P(G) = 1

      ⇒P(G) = P(I_A) ∙ P(G|I_A) + P(I_B) ∙ P(G|I_B)

      ⇒P(G) = K ∙ T_A ∙ PR_A ∙ R'_A + K ∙ T_B ∙ PR_B ∙ R'_B

      ⇒P(I_A│G) = (T_A ∙ PR_A ∙ R'_A) / (T_A ∙ PR_A ∙ R'_A + T_B ∙ PR_B ∙ R'_B)

      Please clarify this.

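      As mentioned in a previous comment, we acknowledge this point from the reviewer. In fact, the final formula corresponds to P(I_A│G), instead of P(I_A), so:

      P(I_A│G) = (T_A ∙ PR_A ∙ R'_A) / (T_A ∙ PR_A ∙ R'_A + T_B ∙ PR_B ∙ R'_B)

      instead of

      P(I_A) = (T_A ∙ PR_A ∙ R'_A) / (T_A ∙ PR_A ∙ R'_A + T_B ∙ PR_B ∙ R'_B)

      We have now corrected this error in the new version of the manuscript and in the supplementary information.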
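
      The expression derived by the reviewer can be checked numerically. The sketch below assumes, from the surrounding text, that T is a travel term, PR a parasite-rate proxy for transmission intensity, and R' the genetic relatedness term; the function name and all input values are hypothetical, and the constant K cancels in the normalization as in the derivation above.

```python
def importation_posterior(origins):
    """Posterior P(I_x | G) for each candidate origin x.

    `origins` maps a label to a (T, PR, R') tuple. Each unnormalized
    weight is T * PR * R'; dividing by their sum plays the role of P(G).
    """
    weights = {name: t * pr * r for name, (t, pr, r) in origins.items()}
    total = sum(weights.values())
    if total == 0:
        raise ValueError("all weights are zero; the posterior is undefined")
    return {name: w / total for name, w in weights.items()}

# Hypothetical two-origin case: A = study area, B = travel destination.
post = importation_posterior({
    "A": (0.9, 0.02, 0.05),   # T_A, PR_A, R'_A (illustrative values)
    "B": (0.1, 0.30, 0.10),   # T_B, PR_B, R'_B (illustrative values)
})
print(post)

# Lowering PR_B lowers P(I_B | G), all else being equal.
lower = importation_posterior({"A": (0.9, 0.02, 0.05), "B": (0.1, 0.04, 0.10)})
print(lower["B"] < post["B"])
```

      With two origins the output for "A" is exactly the ratio in the formula above, and the two posteriors sum to 1 by construction.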

      Additional comments:

      (1) Figure 3A has a scale that includes negative values, which is not reasonable for R.

      We agree that R estimates are not compatible with negative values. The intention of this scale was to show the overall mean R in the centre, in white, so that blue colours represented values below the average and red values above the average. However, we proceeded to update the figures according to your recommendations.

      (2) I suggest using a common scale from 0 to 0.12 (maximum values among panels) across panels A, C, and D, as well as in Sup Fig 3, to facilitate comparison.

      We updated the figures according to the recommendations.

      (3) The x-axis labels in Figure 3A and Supplementary Figure 2A are not aligned with the x-axis ticks.

      We updated the figures so that the alignment in the x-axis is clear.

      (4) Supplementary Figure 5 would be better presented if the data were divided into four separate panels.

      We have divided the figure into four separate panels.

      (6) Figure 5D is not referenced in the main text.

      We missed the mention, which is now fixed in the new version.

      (7) The authors state: "No significant differences in R were found comparing parasite samples from Magude and the rest of the districts." However, Supplementary Figure 3 shows statistically significant relatedness between parasites from Magude and Matutuine. Please clarify this.

      Answer: we added clarity to this sentence which was indeed confusing.

      Reviewer #3 (Recommendations for the authors):

      (1) Introduction: More background info about malaria in Mozambique would be appreciated.

      We included some contextualisation about malaria in Mozambique and our study districts.

      (2) Why were most of the samples collected from children? Is malaria most prevalent in that group? Information could be added in the introduction.

      Children are usually considered an appropriate sentinel group for malaria surveillance for several reasons. First, most malaria cases reported from symptomatic outpatient visits are children, especially in areas with moderate to high burden. Second (and probably the cause of the first reason), their lower immunity levels, due to shorter exposure time, and their immature immune system provide a cleaner picture of the effects of malaria, since the body's response is less shaped by past exposures. Finally, as a vulnerable population, they deserve a stronger focus in surveillance systems. We added a comment in the introduction referring to them as a common sentinel group for surveillance.

      (3) Minor: Check spaces in the text (for example, line 333 and the start of the Discussion).

      Thank you for noticing; we fixed it in the new version.

      (4) Minor: In my case, the micro (u) symbol can be observed in Word, but not in PDF.

      One of the symbols produced an error, we hope that the new version is correct now.

      (5) Were COI calculations with MOIRE performed across provinces and regions, or taking all samples as one population?

      We took all samples as one population. However, we validated that the same results (reaching equivalent numbers and the same conclusions) were obtained when run across different populations (regions or provinces). We mention this in the manuscript now.

      (6) Have you tested lower values than 0.04 for PR in Maputo?

      This would not have had any impact on the classification. Only two individuals reported a trip to Maputo city (where we assumed PR=0.04), and neither of them was classified as imported. If lower values of PR were assumed, their probabilities of importation would have been lower, so we would still obtain no imported cases.

      (7) Map (Supplementary Figure 1): Please, improve the resolution (like in the zoom in) and add a scale and a compass rose.

      We improved the resolution of the map. We did not add a scale and a compass rose, but labelled the coordinates as longitude and latitude to clarify the scale and orientation of the map. We added this in the rest of the maps of the manuscript as well.

      (8) In this work, Pimp values were bimodal to 0 or 1, making the classification easy. I wonder in other scenarios, where Pimp values are more intermediate (0.4-0.6), is the threshold at 0.5 still useful? Is there another way, like having a confidence interval of Pimp, to ensure the final classification? A discussion on this topic may be appreciated.

      In this case, we would recommend doing probabilistic analyses, keeping the probability of being imported as the final outcome, and quantifying the importation rates from the weighted sum of probabilities across individuals. We added this clarification in the Methods section: “In case of obtaining a higher fraction of intermediate values (0.4-0.6), weighted sums of individual probabilities would be more appropriate to better quantify importation rates.”
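
      The contrast between a hard 0.5 cut-off and the probability-based estimate can be illustrated with a minimal sketch. The function names, the equal weighting of individuals, the inclusive threshold, and the example probabilities are all assumptions made for illustration; they are not taken from the study.

```python
def importation_rate_threshold(p_imp, threshold=0.5):
    """Fraction of cases classified as imported with a hard cut-off.

    Assumption: a case counts as imported when its probability is >= threshold.
    """
    return sum(p >= threshold for p in p_imp) / len(p_imp)


def importation_rate_weighted(p_imp):
    """Expected fraction of imported cases: the sum of individual
    probabilities divided by the number of individuals (equal weights assumed)."""
    return sum(p_imp) / len(p_imp)


# Hypothetical probabilities of importation, not data from the study.
bimodal = [0.0, 0.0, 1.0, 0.0, 1.0]
intermediate = [0.45, 0.48, 0.41, 0.44, 0.55]

for name, values in (("bimodal", bimodal), ("intermediate", intermediate)):
    print(name,
          importation_rate_threshold(values),
          round(importation_rate_weighted(values), 3))
```

      With bimodal probabilities the two estimates coincide. With intermediate probabilities the hard cut-off counts only one of five cases, while the summed probabilities suggest that close to half of them are expected to be imported.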

      (9) Results: More details per panel, not as the whole figure (Figure 2B, Figure 3A, etc) in the manuscript would be appreciated.

      We appreciate the comment and have added more details.

      (10) Figure 3: Please, add a color legend in panel B (not only in the caption, but in the panel, such as in A, C, D).

      We added a color legend in panel B.

      (11) Do the authors recommend routine surveillance to detect importation in Mozambique, or are these results solid enough to propose strategies? How possible is it that importation rates vary in the future in the south? If so, how feasible is it to implement all this process (including the amplicon sequencing) routinely?

      We added the following text in the discussion: “While these results propose programmatic strategies for the two study districts, routine surveillance to detect importation in Mozambique would allow for identifying new strategies in other districts aiming for elimination, as well as monitoring changes in importation rates in Magude and Matutuine in the future. If scaling molecular surveillance is not feasible, travel reports could be integrated in the routine surveillance to extrapolate the case classification based on the results of this study.”

      (12) Which other proxies of transmission intensity could have been used?

      Better proxies of transmission intensity could be malaria incidence at the monthly level from national surveillance systems, or estimates of force of infection, for example from the use of molecular longitudinal data if available. We added this text in the discussion.

      (13) Can this strategy be applied to P. vivax-endemic areas outside Africa?

      This new method can also be applied to P. vivax-endemic areas outside Africa. Symptomatic P. vivax cases do not necessarily reflect recent infections, so travel reports might need to cover longer time periods; this does not require any essential adaptation of the method. We added this text to the discussion.

    1. By 15 May 2014 and every six years thereafter the Commission shall present to the European Parliament and to the Council a report on the implementation of this Directive based, inter alia, on reports from Member States in accordance with Article 21(2) and (3). Where necessary, the report shall be accompanied by proposals for Community action

      replaced, by a one time(!) eval after 6 years.

    2. By way of derogation from Article 11(1), Member States may limit public access to spatial data sets and services through the services referred to in points (b) to (e) of Article 11(1), or to the e-commerce services referred to in Article 14(3), where such access would adversely affect any of the following

      changed to Member States may limit public access to spatial data sets and services where such access could adversely affect any of the following:

    3. The description of the existing data themes referred to in Annexes I, II and III may be adapted in accordance with the regulatory procedure with scrutiny referred to in Article 22(3), in order to take into account the evolving needs for spatial data in support of Community policies that affect the environment. 25.4.2007 EN Official Journal of the European Union L 108/5

      Replaced by The Commission is empowered to adopt delegated acts in accordance with Article 22a in order to amend Annexes I, II and III by adapting the description of the existing data themes in the light of technological and economic developments

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      Colorectal cancer (CRC) is the third most common cancer globally and the second leading cause of cancer-related deaths. Colonoscopy and fecal immunochemical testing are among the early diagnostic tools that have significantly enhanced patient survival rates in CRC. Methylation dysregulation has been identified in the earliest stages of CRC, offering a promising avenue for screening, prediction, and diagnosis. The manuscript entitled "Early Diagnosis and Prognostic Prediction of Colorectal Cancer through Plasma Methylation Regions" by Zhu et al. presents a panel of 27 differentially methylated regions (DMRs) derived from cfDNA as a noninvasive detection method for early diagnosis and prognosis of CRC.

      Strengths:

      The authors provided evidence that the 27 DMRs pattern worked well in predicting CRC distant metastasis, and the methylation score remarkably increased in stage III-IV.

      Weaknesses:

      The major concerns are the design of DMR screening, the relatively low sensitivity of this DMR pattern in detecting early-stage CRC, the limited size of the cohorts, and the lack of comparison with the traditional diagnosis test.

      We sincerely thank the reviewer for their thorough evaluation and constructive feedback on our manuscript. We are encouraged that the reviewer found our 27-DMR panel promising for predicting distant metastasis and for its performance in late-stage CRC. We have carefully considered the weaknesses pointed out and have made revisions to address these concerns, which we believe have significantly strengthened our paper.

      We agree with the reviewer that achieving high sensitivity for early-stage disease is the ultimate goal for any noninvasive screening test. Detecting the minute quantities of cfDNA shed from early-stage tumors is a well-recognized challenge in the field. Although the sensitivity of our current panel for early-stage CRC is modest, its core strengths lie in its capability to also detect advanced adenomas and its excellent performance in assessing CRC metastasis and prognosis. Furthermore, we have now added a direct comparative analysis of our 27-DMR panel against the most widely used clinical serum biomarker for CRC, carcinoembryonic antigen (CEA), using samples from the same patient cohorts. Our results demonstrate that the 27-DMR methylation score significantly outperforms CEA in diagnostic accuracy for early-stage CRC (64% vs. 18%) (Table s7). In the Discussion section, we have also acknowledged our limitations and suggested that future studies are warranted to combine the cfDNA methylation model with commonly used clinical markers, such as CEA and CA19-9, with the aim of improving the sensitivity for early diagnosis.

      We acknowledge the reviewer's concern regarding the cohort size and agree that validation in larger, prospective, multi-center cohorts is essential before this panel can be considered for clinical application. We have explicitly stated this as a limitation of our study in the Discussion section and have highlighted the need for future large-scale validation studies (Page 18, Lines 367-373). We once again thank the reviewer for their insightful comments, which have allowed us to substantially improve our manuscript. We hope that the revised version is now suitable for publication.

      Reviewer #2 (Public review):

      This work presents a 27-region DMR model for early diagnosis and prognostic prediction of colorectal cancer using plasma methylation markers. While this non-invasive diagnostic and prognostic tool could interest a broad readership, several critical issues require attention.

      Major Concerns:

      (1) Inconsistencies and clarity issues in data presentation

      (a) Sample size discrepancies

      The abstract mentions screening 119 CRC tissue samples, while Figure 1 shows 136 tissues. Please clarify if this represents 119 CRC and 17 normal samples.

      We sincerely thank the reviewer for this careful observation and for pointing out the inconsistency. We apologize for the error and the confusion it caused. Regarding Figure 1: The reviewer is correct. The number 136 in the original Figure 1 was an error. This was due to an inadvertent double-counting of the tumor samples that were used in the differential analysis against adjacent normal tissues. The actual number of tissue samples used in this analysis is 89. We have now corrected this value in the revised Figure 1.

      Regarding the Abstract: The 119 CRC tissue samples mentioned in the abstract represent the total number of unique tumor samples analyzed across all stages of our study. This number is composed of two cohorts: the initial 15 pairs of tissues (30 samples) used for preliminary screening, and the subsequent 89 tissue samples used for validation, totaling 119 samples. We have ensured all sample numbers are now consistent throughout the revised manuscript.

      The plasma sample numbers vary across sections: the abstract cites 161 samples, Figure 1 shows 116 samples, and the Supplementary Methods mentions 77 samples (13 Normal, 15 NAA, 12 AA, 37 CRC).

      We sincerely thank the reviewer for their meticulous review and for identifying these inconsistencies in the plasma sample numbers. We apologize for this oversight and the lack of clarity.

      Figure 1 & Supplementary Methods (77 samples): The number 116 in the original Figure 1 was a clerical error. The correct number is 77, which is the cohort used for our differential methylation analysis. This number is now consistent with the Supplementary Methods. This cohort is composed of 13 Normal, 15 NAA, 12 AA, and 37 CRC samples. The figure has been revised accordingly.

      Abstract (161 samples): The total of 161 plasma samples mentioned in the abstract is the sum of two distinct sample sets used for different stages of our analysis: (1) the 77 samples (13 Normal, 15 NAA, 12 AA, 37 CRC) used for the differential analysis, and (2) an additional 84 samples (33 Normal, 51 CRC), which served as the training set for the LASSO regression model. We have now clarified these distinctions in the text and ensured consistency across the abstract, figures, and methods sections.
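
      The plasma sample counts quoted in this response can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated above; the variable names are illustrative.

```python
# Counts quoted in the authors' response for the two plasma sample sets.
differential_set = {"Normal": 13, "NAA": 15, "AA": 12, "CRC": 37}
training_set = {"Normal": 33, "CRC": 51}

differential_total = sum(differential_set.values())
training_total = sum(training_set.values())
plasma_total = differential_total + training_total

print(differential_total, training_total, plasma_total)
```

      The two subtotals and their sum agree with the 77, 84, and 161 samples reported for the differential analysis, the training set, and the abstract, respectively.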

      (b) Methodological inconsistencies

      The Supplementary Material reports 477 hypermethylated sites from TCGA data analysis (Δβ>0.20, FDR<0.05), but Figure 1 indicates 499 sites.

      The manuscript states that analyzing TCGA data across six cancer types identified 499 CRC-specific methylation sites, yet Figure 1 shows 477. Please also explain the rationale for selecting these specific cancer types from TCGA.

      We sincerely thank the reviewer for their sharp observation and for highlighting these inconsistencies. We apologize for this clerical error, which occurred when labeling the figure. The numbers 477 and 499 in Figure 1 were inadvertently swapped and the text in Supplementary Material is correct. We have now corrected this error throughout the manuscript to ensure clarity and consistency. We deeply regret the confusion this has caused.

      Regarding the rationale for selecting the cancer types:

      The selection of colorectal, esophageal, gastric, lung, liver, and breast cancers was based on the following strategic criteria to ensure the stringent identification of CRC-specific markers. Firstly, esophageal, gastric, liver, and colorectal cancers all originate from the gastrointestinal tract and share developmental and functional similarities. Comparing CRC against these closely related cancers allowed us to filter out general GI-tract-related methylation patterns and isolate those that are truly unique to colorectal tissue. Secondly, we included lung and breast cancer as they are two of the most common non-GI malignancies worldwide with distinct tissue origins. This helps ensure our identified markers are not just pan-cancer methylation events but are specific to CRC, even when compared against highly prevalent cancers from different lineages. Finally, these six cancer types have some of the largest and most complete datasets available in the TCGA database, including high-quality methylation data. This provided a robust statistical foundation for a reliable cross-cancer comparison. We hope this explanation clarifies our methodology. Thank you again for your valuable feedback.

      "404 CRC-specific DMRs" mentioned in the main text while "404 MCBs" in Figure 1, the authors need to clarify if these terms are interchangeable or how MCBs are defined.

      We sincerely thank the reviewer for pointing out this important inconsistency in terminology. We apologize for the confusion this has caused and for the error in Figure 1. The two terms are closely related in our study. The final 404 markers are technically DMRs that were identified through an analysis of MCBs. To avoid confusion, we have decided to unify the terminology. The manuscript has now been revised to consistently use "DMRs", which is the most accurate final descriptor. The label in Figure 1 has been corrected accordingly.

      (2) Methodological documentation

      The Results section requires a more detailed description of marker identification procedures and justification of methodological choices.

      Figure 3 panels need reordering for sequential citation.

      We thank the reviewer for this valuable suggestion. We agree that the original Results section lacked sufficient detail regarding the marker identification procedures and the justification for our methodological choices. To address this, we have substantially rewritten the "Methylation markers selection" subsection. The revised section provides a clear, step-by-step narrative of our marker discovery and integrates the specific methodological details and statistical criteria. For instance, we now explicitly describe the three-pronged approach for the initial TCGA data mining and the specific criteria (Δβ, FDR, log2FC) for each, as well as the analysis methods, such as the Wilcoxon test and LASSO regression analysis. We believe this detailed narrative now provides the necessary description and justification for our methodological choices directly within the Results, significantly improving the clarity and logical flow of our manuscript. This revision can be found on Pages 9-11 (lines 180-195 and 202-213). We hope these changes fully address the reviewer's concerns.

      We thank the reviewer for pointing out the citation order of the panels in Figure 3. This was a helpful suggestion for improving the clarity of our manuscript. We have now reordered the panels in Figure 3 to ensure they are cited sequentially within the text. These adjustments have been made in the "Development and validation of the CRC diagnosis model" subsection of the Results (Page 11, lines 224-230). We appreciate the reviewer's attention to detail.

      (3) Quality control and data transparency

      No quality control metrics are presented for the in-house sequencing data (e.g., sequencing quality, alignment rate, BS conversion rate, coverage, PCA plots for each cohort).

      The analysis code should be publicly available through GitHub or Zenodo.

      At a minimum, processed data should be made publicly accessible to ensure reproducibility.

      We sincerely thank the reviewer for their valuable and constructive feedback regarding quality control and data transparency. We fully agree that these elements are crucial for ensuring the robustness and reproducibility of our research. As the reviewer suggested, we have made all processed data and the key quality control metrics for each sample including sequencing quality scores, bisulfite (BS) conversion rates, and sequencing coverage publicly available to ensure the reproducibility of our findings. The analysis was performed using standard algorithms as detailed in the Methods section. While we are unable to host the code in a public repository at this time, all analysis scripts are available from the corresponding author upon reasonable request. The data has been deposited in the National Genomics Data Center (NGDC) and is accessible under the accession number OMIX009128. This information is now clearly stated in the "Data and Code Availability" section of the manuscript. We thank the reviewer again for pushing us to improve our manuscript in this critical aspect.

      Reviewer #3 (Public review):

      Summary:

      This article provides a model for early diagnosis and prognostic prediction of Colorectal Cancer and demonstrates its accuracy and usability. However, there are still some minor issues that need to be revised and paid attention to.

      Strengths:

      A large number of external datasets were used for verification, thus demonstrating robustness and accuracy. Meanwhile, various influencing factors of multiple samples were taken into account, providing usability.

      Weaknesses:

      There are notable language issues that hinder readability, and some key conclusions are missing.

      We are very grateful to the reviewer for their positive assessment of our study and for the constructive feedback provided. We are particularly encouraged that the reviewer recognized the strengths of our work, especially the robustness demonstrated through extensive external validation and the practical usability of our model. Regarding the weaknesses, we have taken the comments very seriously and have thoroughly revised the manuscript. We sincerely apologize for the language issues that hindered readability in our initial submission. To address this, the entire manuscript has undergone a comprehensive round of professional language polishing and editing. We have carefully reviewed and revised the text to improve clarity, flow, and grammatical accuracy. Besides, we agree that the conclusions could be stated more explicitly. To rectify this, we have substantially revised the final paragraph of the Discussion and the Conclusion section (Page 14-18, lines 279-305, 319-334, 346-348, 358-360, 367-379). We now more clearly summarize the main findings of our study, emphasize the clinical significance and potential applications of our model, and provide clear take-home messages. We thank you again for your time and insightful comments, which have been invaluable in improving the quality of our paper. We hope the revised manuscript now meets the standards for publication.

      Reviewer #1 (Recommendations for the authors):

      Detailed comments are outlined below:

      (1) In this study, the authors have highlighted methylated cfDNA as a noninvasive approach for CRC early diagnosis. However, the small size of the cohorts for plasma screening, particularly the number of NAA and AA samples, may cause bias in the selection of DMRs. This bias may lead to inappropriate DMRs for early diagnosis. Furthermore, the training set has a similar issue: it contains a high percentage of late-stage CRC, and no AA or NAA samples were included. This absence may be the key factor in screening for methylated cfDNA changes that can predict the early stages of CRC.

      We are very grateful to the reviewer for this insightful methodological critique. We agree that cohort composition and sample size are critical factors in the development of robust biomarkers, and we appreciate the opportunity to clarify our study design and the interpretation of our results.

      We agree with the reviewer that the number of precancerous lesion samples (NAA and AA) in our initial plasma screening cohort was limited. This is a valid point. However, it is important to contextualize the role of this step within our overall multi-stage marker selection funnel. The markers evaluated in this plasma cohort were not discovered from this small sample set alone. They were the result of a rigorous pre-selection process based on large-scale public TCGA data and our own tissue-level sequencing. This robust, tissue-based validation ensured that only the most promising CRC-specific markers were advanced for plasma testing. Therefore, while the plasma cohort was modest in size, its purpose was to confirm the circulatory detectability of markers already known to have a strong tissue-of-origin signal, thereby mitigating the potential bias from a smaller discovery set.

      Our primary aim was to first build a model that could robustly and accurately identify a definitive cancer-specific methylation signal. By training the model on clear-cut invasive cancer cases versus healthy controls, we could isolate the most powerful and specific markers for established malignancy. Our working hypothesis was that these strong cancer-specific methylation patterns are initiated during the precursor stages and would therefore be detectable, albeit at lower levels, in precancerous lesions.  Unfortunately, the panel could only identify a limited proportion of precancerous lesions (48.4% in the NAA group and 52.2% in the AA group). We fully agree with the reviewer's sentiment that including a larger and more balanced set of precancerous lesions in future training cohorts could potentially optimize a model specifically for adenoma detection. We have now explicitly added this point to our Discussion section, highlighting it as an important direction for future research (Page 18, lines 367-373).

      (2) The sensitivities of the 27 DMRs in the external validation set (48.4%, 52.2%, and 66.7% for NAA, AA, and CRC stage 0-II, respectively) were much lower than those reported in previously published studies, such as the ColonES assay (DOI: 10.1016/j.eclinm.2022.101717) and the ColonSecure test (DOI: 10.1186/s12943-023-01866-z). The 27 DMRs from the layered screening process did not show superior performance in a small population of an external validation cohort. Therefore, it is unlikely that this DMR pattern will be applicable to the general population in the future.

      We sincerely thank the reviewer for their insightful comments and for providing a thorough comparison with the highly relevant ColonES and ColonSecure assays. This has given us an important opportunity to clarify the unique contributions and specific clinical applications of our 27-DMR panel.

      We acknowledge the reviewer's point that the sensitivities of our panel for precancerous lesions (NAA: 48.4%, AA: 52.2%), while substantial, are numerically lower than those reported by the excellent ColonES assay (AA: 79.0%). However, it is important to clarify that while the ColonES and ColonSecure tests are outstanding benchmarks designed primarily for early detection and screening, the primary objective and contribution of our study were slightly different. Our model demonstrated an exceptional ability to predict distant metastasis with an AUC of 0.955 and a strong capacity for predicting overall prognosis with an AUC of 0.867. Our goal was to develop a multi-functional, biologically-rooted biomarker panel that not only contributes to early detection but, more importantly, provides crucial information for post-diagnosis patient management, including staging, risk stratification, and prognostication, from a single preoperative sample. We believe this ability to preoperatively identify high-risk patients who may require more aggressive treatment or intensive surveillance is the key contribution of our work. It provides a distinct clinical utility that complements, rather than directly competes with, pure screening assays.

      We agree with the reviewer that our external validation was performed on a limited cohort, and we have acknowledged this as a limitation in our Discussion section. However, the purpose of this validation was to provide a proof-of-concept for the panel's performance across its multiple functions. The promising and exceptionally high-performing results in the prognostic domain strongly warrant further validation in larger, prospective, multi-center cohorts.

      (3) The 27-DMR pattern worked well in predicting CRC distant metastasis, and the methylation score remarkably increased in stage III-IV. In contrast, the increase in the AA and stage 0-II groups was very mild in the validation cohort. This observation raises concerns regarding the study design, particularly in the context of the layered screening process and sample assignment.

      We sincerely thank the reviewer for this insightful and critical comment. We agree with the reviewer's observation that the methylation score increased more remarkably in late-stage (III-IV) CRC compared to the milder increase in adenoma (AA) and early-stage (0-II) CRC in the validation cohort. However, the observed pattern is biologically plausible and consistent with the nature of colorectal cancer progression. Carcinogenesis is a multi-step process involving the gradual accumulation of genetic and epigenetic alterations. The methylation changes we identified are likely associated with tumor progression and metastasis. Therefore, it is expected that advanced, metastatic cancers (Stage III-IV), which have undergone significant biological changes, would exhibit a much stronger and more robust methylation signal compared to pre-cancerous lesions (adenomas) or early-stage, non-metastatic cancers (Stage 0-II). The "mild" increase in early stages reflects the initial, more subtle epigenetic alterations, while the "remarkable" increase in late stages reflects the extensive changes required for invasion and metastasis. We believe this graduated increase actually strengthens the validity of our methylation signature, as it mirrors the underlying biological progression of the disease. We hope this response and the corresponding revisions address the reviewer's comments.

      (4) The authors did not compare the predictive efficacy of the 27 DMRs with that of other noninvasive CRC assays, such as CEA and FIT.

      Thank you for this valuable suggestion. We agree that comparing our model with established non-invasive assays is crucial for demonstrating its clinical potential. Following your advice, we have now included a direct comparison of the diagnostic performance between our model and the traditional tumor marker, carcinoembryonic antigen (CEA), using the external validation cohort. The results show that our model has a significantly higher sensitivity for detecting early-stage colorectal cancer and adenomas compared to CEA. This detailed comparison has been added as Table s7 in the supplementary materials, and the corresponding description has been incorporated into the Results section of our manuscript (Page 12, lines 234-236). Regarding the Fecal Immunochemical Test (FIT), we unfortunately could not perform a direct statistical comparison because very few individuals in our cohort had undergone FIT. A comparison based on such a small sample size would lack statistical power and might not yield meaningful conclusions. We have acknowledged this as a limitation of our study in the Discussion section. We believe these additions and clarifications have substantially strengthened our manuscript. Thank you again for your constructive feedback.

      (5) The authors did not explicitly describe how they assigned the plasma samples to the distinct sets, nor did they specify the criteria for the plasma screen set, training set, and validation set. The detailed information for the patient grouping should be listed.

      Response: Thank you for this essential feedback. We agree that a transparent and detailed description of the sample allocation process is crucial for the manuscript. We apologize for the previous lack of clarity and have now revised the Methods section to address this. Our patient cohorts were assigned to the screening, training, and validation sets based on a chronological splitting strategy. Specifically, samples were allocated based on the date of collection in a consecutive manner. This approach was chosen to minimize selection bias and to provide a more realistic, forward-looking assessment of the model's performance, simulating a prospective validation scenario. The screening set comprised 89 tissue samples and 77 plasma samples collected between June and December 2020. The primary purpose of this set was the initial discovery and screening of potential methylation markers. The training set and validation set included 165 plasma samples collected from December 2020 to July 2022. The external validation cohort comprised 166 plasma samples collected from July 2022 to December 2022. The subsection titled "Study design and samples" within the Methods section of the revised manuscript now contains all of this detailed information (Page 6, lines 116-133). We believe this detailed explanation now makes our study design clear and transparent. Thank you again for helping us improve our manuscript.
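
      The chronological splitting strategy described in this response can be sketched as a simple date-based assignment. The response gives only months, so the exact cut-off days, the function name, and the sample identifiers below are hypothetical and serve only to illustrate the idea.

```python
from datetime import date

# Hypothetical cut-off dates: the response states only months (December 2020
# and July 2022), so the specific days chosen here are assumptions.
SCREENING_END = date(2020, 12, 1)
TRAINING_END = date(2022, 7, 1)


def assign_cohort(collected_on):
    """Assign a sample to a cohort purely by its collection date."""
    if collected_on < SCREENING_END:
        return "screening"
    if collected_on < TRAINING_END:
        return "training_validation"
    return "external_validation"


# Hypothetical sample IDs and collection dates, not data from the study.
samples = {
    "S001": date(2020, 8, 14),
    "S002": date(2021, 3, 2),
    "S003": date(2022, 10, 20),
}
cohorts = {sample_id: assign_cohort(d) for sample_id, d in samples.items()}
print(cohorts)
```

      Because the assignment depends only on the collection date, later samples never inform the selection of markers or the fitting of the model, which is what lets the latest cohort approximate a prospective validation.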

      Reviewer #2 (Recommendations for the authors):

      The manuscript requires significant language editing to improve clarity and readability. We recommend that the authors seek professional editing services for revision.

      Thank you for your constructive comments on the language of our manuscript. We apologize for any lack of clarity in the previous version. To address this, we have performed a thorough revision of the manuscript. The text has been carefully reviewed and edited by a native English-speaking colleague who is an expert in our research field. We have focused on correcting all grammatical errors, improving sentence structure, and refining the phrasing throughout the document to enhance readability. We are confident that these extensive revisions have significantly improved the clarity of the manuscript. We hope you will find the current version much easier to read and understand.

      Reviewer #3 (Recommendations for the authors):

      (1) However, I think the abstract part of the article is too detailed and should be more concise and shortened. It is not necessary to show detailed values but to summarize the results.

      Thank you for this valuable suggestion. We agree that the previous version of the abstract was overly detailed and that a more concise summary would be more effective for the reader. Following your advice, we have substantially revised the abstract. We have removed the specific numerical values (such as detailed statistics) and have instead focused on summarizing the key findings and their broader implications (Page 3, lines 54-60, 64-66, 70-72). The revised abstract is now shorter and provides a clearer, high-level overview of our study's background, methods, main results, and conclusions. We believe these changes have significantly improved its readability and impact. We hope you will find the current version more appropriate.

      (2) Figure 4, the color in the legend and plot are not the same, and should be revised.

      Thank you for your careful attention to detail and for pointing out the color inconsistency in Figure 4. We apologize for this oversight. We have now corrected the figure as you suggested, ensuring that the colors in the legend perfectly match those in the plot. The revised Figure 4 has been updated in the manuscript. We appreciate your help in improving the quality of our figures.

      (3) Please pay attention to the article format, such as the consistency of fonts and punctuation marks. (For example, Lines 75 and Line 230).

      Thank you for your meticulous review and for pointing out the inconsistencies in our manuscript's formatting. We sincerely apologize for these oversights and any inconvenience they may have caused. Following your feedback, we have carefully corrected the specific issues you highlighted. Furthermore, we have conducted a thorough proofread of the entire manuscript to ensure consistency in all fonts, punctuation marks, and overall adherence to the journal's formatting guidelines. We appreciate your help in improving the presentation and professionalism of our paper.

  2. milenio-nudos.github.io milenio-nudos.github.io
    1. However, due to software limitations, it was not possible to correct standard errors for the complex survey design (via the Jackknife method for ICILS, Fay’s method for PISA, or cluster-robust estimation). At present, the lavaan package does not support: 1) the use of robust estimators for categorical variables in conjunction with clustering; 2) the simultaneous use of sampling weights combined with clustering; and 3) the implementation of replicate variance estimation methods.

      There is no need to go into this level of detail. But if it is going to be stated, citations are needed to support it.

    1. Reviewer #1 (Public review):

      Summary:

      Taylar Hammond and colleagues identified new regulators of the G1/S transition of the cell cycle. They did so by screening publicly available data from the Cancer Dependency Map and identified FAM53C as a positive regulator of the G1/S transition. Using biochemical assays, they then show that FAM53C interacts with the DYRK1A kinase to inhibit its function. They show in RPE1 cells that loss of FAM53C leads to a DYRK1A + P53-dependent cell cycle arrest. Combined inactivation of FAM53C and DYRK1A in a TP53-null background caused S-phase entry with subsequent apoptosis. Finally, the authors assess the effect of FAM53C deletion in a cortical organoid model, and in Fam53c knockout mice. Whereas proliferation of the organoids is indeed inhibited, mice show virtually no phenotype.

      The authors have revised the manuscript, and I respond here point-by-point to indicate which parts of the revision I found compelling, and which parts were less convincing. So the numbering is consistent with the numbering in my first review report.

      (1) The p21 knockdowns are a valuable addition, and the claim that p53 targets other than p21 are involved in the FAM53C RNAi-mediated arrest is now much more solid. Minor detail: if S4D is a quantification of S4C, it is hard to believe that the quantification was done properly (at least the DYRK1Ai conditions). Perhaps S4C is not the best representative example, or some error was made?

      (2a) I appreciate the decision to remove the cyclin D1 phosphorylation data. A more nuanced model now emerges. It is not clear to me, however, why the Protein Simple immunoassay was used for experiments with RPE cells, and not the cortical organoids. Even though no direct claims are made based on the phospho-cyclin D data in Figure 5E+G, showing these data suggests that FAM53C deletion increases DYRK1A-mediated cyclin D1 phosphorylation. I find it tricky to show these data, while knowing now that this effect could not be shown in the RPE1 cells.<br /> (2b) The quantifications of the immunoassays are not convincing. In multiple experiments, the HSP90 levels vary wildly, which indicates big differences in protein loading if HSP90 is a proper loading control. This is, for example, problematic for the interpretation of Figures 3F and S3I. The cyclin D1 "bands" look extremely similar between siCtrl and siFAM53C (Fig S3I); in fact, the two series of 6 samples with different dosages of DYRK1Ai look like an identical repetition of each other. I did not have the option to overlay them, but it would be important to check if a mistake was made here. The cyclin D1 signals aside, the change in cycD1/HSP90 ratios seems to be entirely caused by differences in HSP90 levels. Careful re-analysis of the raw data and more equal loading seem necessary. The same goes (to a lesser extent) for S3J+K.<br /> (2c) The new model in Fig S4L: what do the arrows on the right, from FAM53C and p53, that merge and point straight towards S-phase mean? They suggest that p53 (and FAM53C) directly promote S-phase progression, but most likely this is not what the authors intended.

      (3) Clear; nicely addressed.

      (4) Thank you for correcting.

      (5) I appreciate that the authors are now more careful to call the IMPC analysis data preliminary. This is acceptable to me, but nevertheless, I suggest that the authors seriously consider taking this part out entirely. The risk of a chance finding and the extremely skewed group sizes (as reviewer #2 had pointed out) hamper the credibility of this statistical analysis.
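
      The concern in point (2b) above, that a ratio to a loading control can change even when the target signal does not, can be illustrated with a minimal sketch. All numbers below are hypothetical and chosen only for illustration; they are not taken from the manuscript, and the function name is an assumption.

```python
def normalized_ratio(target_signal, loading_signal):
    """Return the target signal divided by the loading-control signal."""
    if loading_signal <= 0:
        raise ValueError("loading-control signal must be positive")
    return target_signal / loading_signal


# Hypothetical signals (arbitrary units): the cyclin D1 signal is identical
# in both conditions; only the HSP90 loading control differs.
cycd1 = {"siCtrl": 100.0, "siFAM53C": 100.0}
hsp90 = {"siCtrl": 200.0, "siFAM53C": 100.0}

ratios = {cond: normalized_ratio(cycd1[cond], hsp90[cond]) for cond in cycd1}

# The apparent fold change comes entirely from the denominator.
fold_change = ratios["siFAM53C"] / ratios["siCtrl"]
print(ratios, fold_change)
```

      In this toy case the normalized ratio doubles although the target signal is unchanged, which is why uneven loading-control levels make such ratios hard to interpret.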

    2. Reviewer #3 (Public review):

      Summary:

      In this study, Hammond et al. investigated the role of Dual-specificity Tyrosine Phosphorylation regulated Kinase 1A (DYRK1) in the G1/S transition. By exploiting the Dependency Map portal, they identified a previously unexplored protein, FAM53C, as a potential regulator of the G1/S transition. Using RNAi, they confirmed that depletion of FAM53C suppressed proliferation of human RPE1 cells and that this phenotype was dependent on the presence of the RB protein. In addition, they noted increased levels of CDKN1A transcript and p21 protein that could explain the G1 arrest of FAM53C-depleted cells but, surprisingly, they did not observe activation of other p53 target genes. Proteomic analysis identified DYRK1 as one of the main interactors of FAM53C, and the interaction was confirmed in vitro. Further, they showed that purified FAM53C blocked the ability of DYRK1 to phosphorylate cyclin D in vitro, although the activity of DYRK1 was likely not inhibited (judging from the modification of FAM53C itself). Instead, it seems more likely that FAM53C competes with cyclin D in this assay. The authors claim that the G1 arrest caused by depletion of FAM53C was rescued by inhibition of DYRK1, but this was true only in cells lacking functional p53. This is quite confusing, as DYRK1 inhibition reduced the fraction of G1 cells in p53 wild-type cells as well as in p53 knock-outs, suggesting that FAM53C may not be required for regulation of DYRK1 function. Instead of focusing on the impact of FAM53C on cell cycle progression, the authors moved towards investigating its potential (and perhaps more complex) roles in differentiation of iPSCs into cortical organoids and in mice. They observed a lower level of proliferating cells in the organoids, but whether that reflects an increased activity of DYRK1 or is just an off-target effect of the genetic manipulation remains unclear. Even less clear is the phenotype in FAM53C knock-out mice. The authors did not observe any significant changes in survival or in organ development, but they noted some behavioral differences. Whether and how these are connected to the rate of cellular proliferation was not explored. In summary, the study identified a previously unknown role of FAM53C in proliferation but failed to explain the mechanism and its physiological relevance at the level of tissues and organism. Although some of the data might be of interest, in its current form the data is too preliminary to justify publication.

      Major comments:

      (1) The whole study is based on one siRNA to Fam53C, and its specificity was not validated. The level of the knock-down was shown only in the first figure and not in the other experiments. The observed phenotypes in cell cycle progression may be affected by variable knock-down efficiency and/or potential off-target effects.

      (2) Experiments focusing on the cell cycle progression were done in a single cell line RPE1 that showed a strong sensitivity to FAM53C depletion. In contrast, phenotypes in IPSCs and in mice were only mild suggesting that there might be large differences across various cell types in the expression and function of FAM53C. Therefore, it is important to reproduce the observations in other cell types.

      (3) The authors state that FAM53C is a direct inhibitor of DYRK1A kinase activity (Line 203); however, this model is not supported by the data in Fig 4A. FAM53C seems to be a good substrate of DYRK1 even at high concentrations, when phosphorylation of cyclin D is reduced. This rather suggests that DYRK1 is not inhibited by FAM53C but that FAM53C perhaps competes with cyclin D. Further, the authors should address whether the phosphorylation of cyclin D is responsible for the observed cell cycle phenotype. Is this Cyclin D-Thr286 phosphorylation, or are there other sites involved?

      (4) In many places, information on statistical tests is missing and SDs are not shown in the plots. For instance, what statistical test was used in Fig 4C? The impact of FAM53C on cyclin D phosphorylation does not seem to be significant. In the same experiment, does the DYRK1 inhibitor prevent modification of cyclin D?

      (5) Validation of SM13797 compound in terms of specificity to DYRK1 was not performed.

      (6) A fraction of cells in G1 is a very easy readout but it does not measure progression through the G1 phase. Extension of the S phase or G2 delay would indirectly also result in reduction of the G1 fraction. Instead, authors could measure the dynamics of entry to S phase in cells released from a G1 block or from mitotic shake off.
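
      The argument in point (6) can be sketched numerically. The example below assumes, as a deliberate simplification, that the fraction of cells in each phase is proportional to the time spent in that phase (it ignores the age distribution of an exponentially growing population). The phase durations are hypothetical and are not taken from the manuscript.

```python
def phase_fractions(durations):
    """Approximate the fraction of cells per phase as duration / total cycle time."""
    total = sum(durations.values())
    return {phase: hours / total for phase, hours in durations.items()}


# Hypothetical phase durations in hours. G1 length is the same in both cases;
# only S phase is extended in the second one.
baseline = {"G1": 10.0, "S": 8.0, "G2/M": 4.0}
longer_s = {"G1": 10.0, "S": 12.0, "G2/M": 4.0}

g1_baseline = phase_fractions(baseline)["G1"]
g1_longer_s = phase_fractions(longer_s)["G1"]
print(round(g1_baseline, 3), round(g1_longer_s, 3))
```

      Under this assumption the G1 fraction drops when S phase is lengthened even though the time spent in G1 is unchanged, which is why the G1 fraction alone does not measure progression through G1.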

      Comments to the revised manuscript:

      In the revised version of the manuscript, the authors addressed most of the critical points. They now include new data with depletion of FAM53C using single siRNAs that show a small but significant enrichment of the G1 cell population. This G1 arrest is likely caused by the combined effects of induction of p21 expression and decreased levels of cyclin D1. The authors observed that inhibition of DYRK1 rescued cyclin D1 levels in FAM53C-depleted cells, suggesting that FAM53C may inhibit DYRK1. This possibility is also supported by in vitro experiments. On the other hand, inhibition of DYRK1 did not rescue the G1 arrest upon depletion of FAM53C, suggesting that FAM53C may also have a DYRK1-independent role in G1. Functional rescue experiments with cyclin D1 mutants and detection of DYRK1 activity in cells would be necessary to conclusively explain the function of FAM53C in progression through G1 phase, but unfortunately these experiments were technically not possible. Knockout of FAM53C in iPSCs and in mice suggests that FAM53C may have additional functions besides cell cycle control and/or that adaptation may have occurred in these model systems. Overall, the study implicated FAM53C in fine-tuning DYRK1 activity in cells, which may to some extent influence the progression through G1 phase. In addition, FAM53C may also have DYRK1- and cell cycle-independent functions that remain to be addressed by future studies.

    3. Author response:

      (1) General Statements

      We thank the Reviewers for a fair review of our work and helpful suggestions. We have significantly revised the manuscript in response to these suggestions. We provide a point-by-point response to the Reviewers below but wanted to highlight in our response a recurring concern related to the strong cell cycle arrest observed upon the acute FAM53C knock-down being different than the limited phenotypes in other contexts, including the knockout mice and DepMap data.

      First, we now show that we can recapitulate the strong G1 arrest resulting from the FAM53C knock-down using two independent siRNAs in RPE-1 cells, supporting the specificity of the effects.

      Second, the G1 arrest that results from the FAM53C knock-down is also observed in cells with inactive p53, suggesting it is not due to a non-specific stress response due to “toxic” siRNAs. In addition, the arrest is dependent on RB, which fits with the genetic and biochemical data placing FAM53C upstream of RB, further supporting a specific phenotype.

      Third, we have performed experiments in other human cells, including cancer cell lines. As would be expected for cancer cells, the G1 arrest is less pronounced but is still significant, indicating that the G1 arrest is not unique to RPE-1 cells.

      Fourth, it is not unexpected that compensatory mechanisms would be activated upon loss of FAM53C during development or in cancer – which may explain the lack of phenotypes in vivo or upon long-term knockout. This has been true for many cell cycle regulators, either because of compensation by other family members that have overlapping functions, or by a larger scale rewiring of signaling pathways. 

      (2) Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity): 

      Summary: 

      Taylar Hammond and colleagues identified new regulators of the G1/S transition of the cell cycle.

      They did so by screening publicly available data from the Cancer Dependency Map, and identified FAM53C as a positive regulator of the G1/S transition. Using biochemical assays, they then show that FAM53C interacts with the DYRK1A kinase to inhibit its function. DYRK1A in turn is known to induce degradation of cyclin D, leading the authors to propose a model in which DYRK1A-dependent cyclin D degradation is inhibited by FAM53C to permit S-phase entry. Finally, the authors assess the effect of FAM53C deletion in a cortical organoid model, and in Fam53c knockout mice. Whereas proliferation of the organoids is indeed inhibited, mice show virtually no phenotype.

      Major comments: 

      The authors show convincing evidence that FAM53C loss can reduce S-phase entry in cell cultures, and that it can bind to DYRK1A. However, FAM53C has multiple other binding partners and I am not entirely convinced that negative regulation of DYRK1A is the predominant mechanism to explain its effects on S-phase entry. Some of the claims that are made based on the biochemical assays, and on the physiological effects of FAM53C, are overstated. In addition, some choices made in methodology and data representation need further attention.

      (1) The authors do note that P21 levels increase upon FAM53C knockdown. They show convincing evidence that this is not a P53-dependent response. But the claim that "p21 upregulation alone cannot explain the G1 arrest in FAM53C-deficient cells" (lines 138-139) is misleading. A p53-independent p21 response could still be highly relevant. The authors could test if FAM53C knockdown inhibits proliferation after p21 knockdown or p21 deletion in RPE1 cells.

      The Reviewer raises a great point. Our initial statement needed to be clarified and also needed more experimental support. We have performed experiments where we knocked down FAM53C and p21 individually, as well as in combination, in RPE-1 cells. These experiments show that p21 knock-down is not sufficient to negate the cell cycle arrest resulting from the FAM53C knockdown in RPE-1 cells (Figure 4B,C and Figure S4C,D).

      We now extended these experiments to conditions where we inhibited DYRK1A, and we also compared these data to experiments in p53-null RPE-1 cells. Altogether, these experiments point to activation of p53 downstream of DYRK1A activation upon FAM53C knock-down, and indicate that p21 is not the only critical p53 target in the cell cycle arrest observed in FAM53C knock-down cells (Figure 4 and Figure S4).

      (2) The authors do not convincingly show that FAM53C acts as a DYRK1A inhibitor in cells. Figures 4B+C and S4B+C show extremely faint P-CycD1 bands, and tiny differences in ratios. The P values are hovering around 0.05, so n=3 is clearly underpowered here. Total CycD1 levels also correlate with FAM53C levels, which seems to affect the ratios more than the tiny pCycD1 bands. Why is there still a pCycD1 band visible in 4B in the GFP + BTZ + DYRK1Ai condition? And if I look at the data points, I honestly don't understand how the authors can conclude from S4C that knockdown of FAM53C causes (DYRK1A-dependent) increases in pCycD1 (relative to total CycD1). In figure 5C, no blot scans are even shown, and again the differences look tiny. So the authors should either find a way to make these assays more robust, or alter their claims appropriately.

      We appreciate these comments from the Reviewer and have significantly revised the manuscript to address them.

      The analysis of Cyclin D phosphorylation and stability is complicated by the upregulation of p21 upon FAM53C knock-down, in particular because p21 can be part of Cyclin D complexes, which may affect its protein levels in cells (as was nicely shown in a previous study from the lab of Tobias Meyer – Chen et al., Mol Cell, 2013). Instead of focusing on Cyclin D levels and stability, we refocused the manuscript on RB and p53 downstream of FAM53C loss.

      We removed the previous panel 4B from the revised manuscript. For panels 4E and S4B (now panels S3J and S3K), we used a true “immunoassay” (as indicated in the legend – not an immunoblot), which is much more quantitative and avoids error-prone steps in standard immunoblots (“Western blots”). Briefly, this system was developed by ProteinSimple. It uses capillary transfer of proteins and ELISA-like quantification with up to 6 logs of dynamic range (see their web site https://www.proteinsimple.com/wes.html). The “bands” we show are just a representation of the luminescence signals in capillaries. We made sure to further clarify the figure legends in the revised manuscript.

      The representative Western blot images for 5C-D (now 5F-G) in the original submission are shown in Figure 5E; we apologize if this was not clear. The differences are small, which we acknowledge in the revised manuscript. Note that several factors can affect Cyclin D levels in cells, including the growth rate and the stage of the cell cycle. Our FACS analysis shows that normal organoids have ~63% of cells in G1 and ~13% in S phase; the overall lower proportion of S-phase cells in organoids may make the immunoblot difference appear smaller, with fewer cycling cells resulting in decreased Cyclin D phosphorylation.

      Nevertheless, the Reviewer brings up a good point and comments from this Reviewer and the others made us re-think how to best interpret our results. As discussed above, we re-read carefully the Meyer paper and think that FAM53C’s role and DYRK1A activity in cells may be understood when considering levels of both CycD and p21 at the same time in a continuum. While our genetic and biochemical data support a role for FAM53C in DYRK1A inhibition, it is likely that the regulation of cell cycle progression by FAM53C is not exclusively due to this inhibition. As discussed above and below, we noted an upregulation of p21 upon FAM53C knock-down, and activation of p53 and its targets likely contributes significantly to the phenotypes observed. We added new experiments to support this more complex model (Figure 4 and Figure S4, with new model in S4L).

      (3) The experiments to test if DYRK1A inhibition could rescue the G1 arrest observed upon FAM53C knockdown are not entirely convincing either. It would be much more convincing if they also perform cell counting experiments as they have done in Figures 1F and 1G, to complement the flow cytometry assays. I suggest that the authors do these cell counting experiments in RPE1 +/- P53 cells as well as HCT116 cells. In addition, did the authors test if P21 is induced by DYRK1Ai in HCT116 cells? 

      We repeated the experiments with the DYRK1A inhibitor and counted the cells. In p53-null RPE1 cells, we found that cell numbers do not increase in these conditions where we had observed a cell cycle re-entry (Fig. 4E), which was accompanied by apoptotic cell death (Fig. S4I). Thus, cells re-enter the cell cycle but die as they progress through S-phase and G2/M. We note that inhibition of DYRK1A has been shown to decrease expression of G2/M regulators (PMID: 38839871), which may contribute to the inability of cells treated with DYRK1Ai to divide. Because our data in RPE-1 cells showed that p21 knock-down was not sufficient to allow the FAM53C knock-down cells to re-enter the cell cycle, we did not further analyze p21 in HCT-116 cells.

      (4) The data in Figure 5C and 5D are identical, although they are supposed to represent either pCycD1 ratios or p21 levels. This is a problem because at least one of the two cannot be true. Please provide the proper data and show (representative) images of both data types.

      We apologize for these duplicated panels in the original submission. We now replaced the wrong panel with the correct data (Fig. 5F,G). 

      (5) Line 246: "Fam53c knockout mice display developmental and behavioral defects." I don't agree with this claim. The mutant mice are born at almost the expected Mendelian ratios, and the body weight development is not consistently altered. But more importantly, no differences in adult survival or microscopic pathology were seen. The authors put strong emphasis on the IMPC behavioral analysis, but they should be more cautious. The IMPC mouse cohorts are tested for many other phenotypes related to behavior and neurological symptoms, and apparently none of these other traits were changed in the IMPC Fam53c-/- cohort. Thus, the decreased exploration in a new environment could very well be a chance finding. The authors need to remove claims about developmental and behavioral defects from the abstract, results and discussion sections; the data are just too weak to justify this.

      We agree with the Reviewer that, although we observed significant p-values, this original statement may not be appropriate in the biological sense. We made sure in the revised manuscript to carefully present these data.

      Minor comments: 

      (6) Can the authors provide a rationale for each of the proteins they chose to generate the list of the 38 proteins in the DepMap analysis? I looked at the list and it seems to me that they do not all have described functions in the G1/S transition. The analysis may thus be biased. 

      To address this point, we updated Table S1 (2nd tab) to provide a better rationale for the 38 factors chosen. Our focus was on the canonical RB pathway and we included RB binding proteins whose function had suggested they may also be playing a role in the G1/S transition. We do agree that there is some bias in this selection (e.g., there are more RB binding factors described) but we hope the Reviewer will agree with us that this list and the subsequent analysis identified expected factors, including FAM53C. Future studies using this approach and others will certainly identify new regulators of cell cycle progression.

      (7) Figure 1B is confusing to me. Are these just some (arbitrarily) chosen examples? Consider leaving this heatmap out altogether, or explain in more detail.

      We agree with the Reviewer that this panel was not necessarily useful and possibly in the wrong place, and we removed it from the manuscript. We replaced it with a cartoon of top hits in the screen.

      (8) The y-axes in Figures 2C, 2D, 2E, and 4D are misleading because they do not start at 0. Please let the axis start at 0, or make axis breaks. 

      We re-graphed these panels.

      (9) Line 229: " Consequences ... brain development." This subheader is misleading, because the in vitro cortical organoid system is a rather simplistic model for brain development, and far away from physiological brain development. Please alter the header. 

      We changed the header to “Consequences of FAM53C inactivation in human cortical organoids in culture”.

      (10) Figure S5F: the gating strategy is not clear to me. In particular, how do the authors know the difference between subG1 and G1 DAPI signals? Do they interpret the subG1 as apoptotic cells? If yes, why are there so many? Are the culturing or harvesting conditions of these organoids suboptimal? Perhaps the authors could consider doing IF stainings on EdU or BrdU on paraffin sections of organoids to obtain cleaner data?

      Thank you for your feedback. The subG1 population in the original Figure S5F represents cells that died during the dissociation step of the organoids for FACS analysis. To address this point, we performed live/dead staining to exclude dead cells and provide clearer data. We refined the gating strategy for better clarity in the new S5F panel.

      (11) Figure S6A; the labeling seems incorrect. I would think that red is heterozygous here, and grey mutant. 

      We fixed this mistake, thank you. 

      Reviewer #1 (Significance): 

      The finding that the poorly studied gene FAM53C controls the G1/S transition in cell lines is novel and interesting for the cell cycle field. However, the lack of phenotypes in Fam53c-/- mice makes this finding less interesting for a broader audience. Furthermore, the mechanisms are incompletely dissected. The importance of a p53-independent induction of p21 is not ruled out. And while the direct inhibitory interaction between FAM53C and DYRK1A is convincing (and also reported by others; PMID: 37802655), the authors do not (yet) convincingly show that DYRK1A inhibition can rescue a cell proliferation defect in FAM53C-deficient cells.

      Altogether, this study can be of interest to basic researchers in the cell cycle field. 

      I am a cell biologist studying cell cycle fate decisions, and adaptation of cancer cells & stem cells to (drug-induced) stress. My technical expertise aligns well with the work presented throughout this paper, although I am not familiar with biolayer interferometry. 

      Reviewer #2 (Evidence, reproducibility and clarity): 

      Summary 

      In this study, Hammond et al. investigated the role of Dual-specificity Tyrosine Phosphorylation regulated Kinase 1A (DYRK1) in the G1/S transition. By exploiting the Dependency Map portal, they identified a previously unexplored protein, FAM53C, as a potential regulator of the G1/S transition. Using RNAi, they confirmed that depletion of FAM53C suppressed proliferation of human RPE1 cells and that this phenotype was dependent on the presence of the RB protein. In addition, they noted increased levels of CDKN1A transcript and p21 protein that could explain the G1 arrest of FAM53C-depleted cells but, surprisingly, they did not observe activation of other p53 target genes. Proteomic analysis identified DYRK1 as one of the main interactors of FAM53C, and the interaction was confirmed in vitro. Further, they showed that purified FAM53C blocked the ability of DYRK1 to phosphorylate cyclin D in vitro, although the activity of DYRK1 was likely not inhibited (judging from the modification of FAM53C itself). Instead, it seems more likely that FAM53C competes with cyclin D in this assay. The authors claim that the G1 arrest caused by depletion of FAM53C was rescued by inhibition of DYRK1, but this was true only in cells lacking functional p53. This is quite confusing, as DYRK1 inhibition reduced the fraction of G1 cells in p53 wild-type cells as well as in p53 knock-outs, suggesting that FAM53C may not be required for regulation of DYRK1 function. Instead of focusing on the impact of FAM53C on cell cycle progression, the authors moved towards investigating its potential (and perhaps more complex) roles in differentiation of iPSCs into cortical organoids and in mice. They observed a lower level of proliferating cells in the organoids, but whether that reflects an increased activity of DYRK1 or is just an off-target effect of the genetic manipulation remains unclear. Even less clear is the phenotype in FAM53C knock-out mice. The authors did not observe any significant changes in survival or in organ development, but they noted some behavioral differences. Whether and how these are connected to the rate of cellular proliferation was not explored. In summary, the study identified a previously unknown role of FAM53C in proliferation but failed to explain the mechanism and its physiological relevance at the level of tissues and organism. Although some of the data might be of interest, in its current form the data is too preliminary to justify publication.

      Major points 

      (1) The whole study is based on one siRNA to Fam53C, and its specificity was not validated. The level of the knock-down was shown only in the first figure and not in the other experiments. The observed phenotypes in cell cycle progression may be affected by variable knock-down efficiency and/or potential off-target effects.

      We thank the Reviewer for raising this important point. First, we need to clarify that our experiments were performed with a pool of siRNAs (not one siRNA). Second, commercial antibodies against FAM53C are not of the best quality and it has been challenging to detect FAM53C using these antibodies in our hands – the results are often variable. In addition, to better address the Reviewer’s point and control for the phenotypes we have observed, we performed two additional series of experiments: first, we have confirmed G1 arrest in RPE-1 cells with individual siRNAs, providing more confidence for the specificity of this arrest (Fig. S1B); second, we have new data indicating that other cell lines arrest in G1 upon FAM53C knock-down (Fig. S1E,F and Fig. 4F).

      (2) Experiments focusing on the cell cycle progression were done in a single cell line RPE1 that showed a strong sensitivity to FAM53C depletion. In contrast, phenotypes in IPSCs and in mice were only mild suggesting that there might be large differences across various cell types in the expression and function of FAM53C. Therefore, it is important to reproduce the observations in other cell types. 

      As mentioned above, we have new data indicating that other cell lines arrest in G1 upon FAM53C knock-down (three cancer cell lines) (Fig. S1E,F and Fig. 4F).

      (3) The authors state that FAM53C is a direct inhibitor of DYRK1A kinase activity (Line 203); however, this model is not supported by the data in Fig 4A. FAM53C seems to be a good substrate of DYRK1 even at high concentrations, when phosphorylation of cyclin D is reduced. This rather suggests that DYRK1 is not inhibited by FAM53C but that FAM53C perhaps competes with cyclin D. Further, the authors should address whether the phosphorylation of cyclin D is responsible for the observed cell cycle phenotype. Is this Cyclin D-Thr286 phosphorylation, or are there other sites involved?

      We revised the text of the manuscript to include the possibility that FAM53C could act as a competitive substrate and/or an inhibitor.

      We removed most of the Cyclin D phosphorylation/stability data from the revised manuscript. As the Reviewers pointed out, some of these data were statistically significant but the biological effects were small. As discussed above in our response to Reviewer #1, the analysis of Cyclin D phosphorylation and stability is complicated by the upregulation of p21 upon FAM53C knockdown, in particular because p21 can be part of Cyclin D complexes, which may affect its protein levels in cells (as was nicely shown in a previous study from the lab of Tobias Meyer – Chen et al., Mol Cell, 2013). Instead of focusing on Cyclin D levels and stability, we refocused the manuscript on RB and p53 downstream of FAM53C loss.

      We note, however, that we used specific Thr286 phospho-antibodies, which have been used extensively in the field. Our data in Figure 1 with palbociclib place FAM53C upstream of Cyclin D/CDK4,6. We performed Cyclin D overexpression experiments but RPE-1 cells did not tolerate high expression of Cyclin D1 (T286A mutant) and we have not been able to conduct more ‘genetic’ studies. 

      (4) At many places, information on statistical tests is missing and SDs are not shown in the plots. For instance, what statistics was used in Fig 4C? Impact of FAM53C on cyclin D phosphorylation does not seem to be significant. In the same experiment, does DYRK1 inhibitor prevent modification of cyclin D? 

      As discussed above, we removed some of these data and re-focused the manuscript on p53-p21 as a second pathway activated by loss of FAM53C.

      (5) Validation of SM13797 compound in terms of specificity to DYRK1 was not performed. 

      This is an important point. We had cited an abstract from the company (Biosplice) but we agree that providing data is critical. We have now revised the manuscript with a new analysis of the compound’s specificity using kinase assays. These data are shown in Fig. S3F-H.

      (6) A fraction of cells in G1 is a very easy readout but it does not measure progression through the G1 phase. Extension of the S phase or G2 delay would indirectly also result in reduction of the G1 fraction. Instead, authors could measure the dynamics of entry to S phase in cells released from a G1 block or from mitotic shake off. 

      The Reviewer made a good point. As discussed in our response to Reviewer #1, with p53-null RPE-1 cells, we found that cell numbers do not increase in these conditions where we had observed a cell cycle re-entry (Fig. 4E), which was accompanied by apoptotic cell death (Fig. S4I). Thus, cells re-enter the cell cycle but die as they progress through S-phase and G2/M. We note that inhibition of DYRK1A has been shown to decrease expression of G2/M regulators (PMID: 38839871), which may contribute to the inability of cells treated with DYRK1Ai to divide.

      Because our data in RPE-1 cells showed that p21 knock-down was not sufficient to allow the FAM53C knock-down cells to re-enter the cell cycle, we did not further analyze p21 in HCT-116 cells. These data indicate that G1 entry by flow cytometry will not always translate into proliferation.

      Other points:

      (7) Fig. 2C, 2D, 2E graphs should begin with 0 

      We remade these graphs.

      (8) Fig. 5D shows that the difference in p21 levels is not significant in FAM53C-KO cells but difference is mentioned in the text. 

      We replaced the panel with the correct one; we apologize for this error.

      (9) Fig. 6D comparison of datasets of extremely different sizes does not seem to be appropriate

      We agree and revised the text. We hope that the Reviewer will agree with us that it is worth showing these data, which are clearly preliminary but provide evidence of a possible role for FAM53C in the brain.

      (10) Could there be alternative splicing in mice generating a partially functional protein without exon 4? Did authors confirm that the animal model does not express FAM53C? 

      We performed RNA sequencing of mouse embryonic fibroblasts derived from control and mutant mice. We clearly identified fewer reads in exon 4 in the knockout cells, and no other obvious change in the transcript (data not shown). However, immunoblot with mouse cells for FAM53C never worked well in our hands. We made sure to add this caveat to the revised manuscript.

      Reviewer #2 (Significance): 

      Main problem of this study is that the advanced experimental models in IPSCs and mice did not confirm the observations in the cell lines and thus the whole manuscript does not hold together. Although I acknowledge the effort the authors invested in these experiments, the data do not contribute to the main conclusion of the paper that FAM53C/DYRK1 regulates G1/S transition. 

      Reviewer #3 (Evidence, reproducibility and clarity): 

      This paper identifies FAM53C as a novel regulator of cell cycle progression, particularly at the G1/S transition, by inhibiting DYRK1A. Using data from the Cancer Dependency Map, the authors suggest that FAM53C acts upstream of the Cyclin D-CDK4/6-RB axis by inhibiting DYRK1A.  Specifically, their experiments suggest that FAM53C Knockdown induces G1 arrest in cells, reducing proliferation without triggering apoptosis. DYRK1A Inhibition rescues G1 arrest in P53KO cells, suggesting FAM53C normally suppresses DYRK1A activity. Mass Spectrometry and biochemical assays confirm that FAM53C directly interacts with and inhibits DYRK1A. FAM53C Knockout in Human Cortical Organoids and Mice leads to cell cycle defects, growth impairments, and behavioral changes, reinforcing its biological importance. 

      Strength of the paper: 

      The study introduces a novel cell cycle control signalling module upstream of CDK4/6 in G1/S regulation which could have significant impact. The identification of FAM53C using a depmap correlation analysis is a nice example of the power of this dataset. The experiments are carried out mostly in a convincing manner and support the conclusions of the manuscript. 

      Critique: 

      (1) The experiments rely heavily on siRNA transfections without the appropriate controls. There are so many cases of off-target effects of siRNA in the literature, and specifically for a strong phenotype on S-phase as described here, I would expect to see solid results by additional experiments. This is especially important since the ko mice do not show any significant developmental cell cycle phenotypes. Moreover, FAM53C does not show a strong fitness effect in the depmap dataset, suggesting that it is largely non-essential in most cancer cell lines. For this paper to reach publication in a high-standard journal, I would expect that the authors show a rescue of the S-phase phenotype using an siRNA-resistant cDNA, and show similar S-phase defects using an acute knock out approach with lentiviral gRNA/Cas9 delivery. 

      We thank the Reviewer for this comment. Please refer to the initial response to the three Reviewers, where we discuss our use of single siRNAs and our results in multiple cell lines. Briefly, we can recapitulate the G1 arrest upon FAM53C knock-down using two independent siRNAs in RPE-1 cells. We also observe the same G1 arrest in p53 knockout cells, suggesting it is not due to a non-specific stress response. In addition, the arrest is dependent on RB, which fits with the genetic and biochemical data placing FAM53C upstream of RB, further supporting a specific phenotype. Human cancer cell lines also arrest in G1 upon FAM53C knock-down, not just RPE-1 cells. Finally, we hope the Reviewer will agree with us that compensatory mechanisms are very common in the cell cycle – which may explain the lack of phenotypes in vivo or upon long-term knockout of FAM53C.

      (2) The S-phase phenotype following FAM53C should be demonstrated in a larger variety of TP53WT and mutant cell lines. Given that this paper introduces a new G1/S control element, I think this is important for credibility. Ideally, this should be done with acute gRNA/Cas9 gene deletion using a lentiviral delivery system; but if the siRNA rescue experiments work and validate an on-target effect, siRNA would be an appropriate alternative. 

      We now show data with three cancer cell lines (U2OS, A549, and HCT-116 – Fig. S1E,F and Fig. 4F), in addition to our results in RPE-1 cells and in human cortical organoids. We note that the knock-down experiments are complemented by overexpression data (Fig. 1G-I), by genetic data (our original DepMap screen), and our biochemical data (showing direct binding of FAM53C to DYRK1A).

      (3) The western blot images shown in the MS appear heavily over-processed and saturated (See for example S4B, 4A, B, and E). Perhaps the authors should provide the original un-processed data of the entire gels? 

      For several of our panels (e.g., 4E and S4B, now panels S3J and S3K), we used a true “immunoassay” (as indicated in the legend – not an immunoblot), which is much more quantitative and avoids error-prone steps in standard immunoblots (“Western blots”). Briefly, this system was developed by ProteinSimple. It uses capillary transfer of proteins and ELISA-like quantification with up to 6 logs of dynamic range (see their website https://www.proteinsimple.com/wes.html). The “bands” we show are just a representation of the luminescence signals in capillaries. We made sure to further clarify the figure legends in the revised manuscript.

      Data in 4A are also not a western blot but a radiograph.

      For immunoblots, we will provide all the source data with uncropped blots with the final submission.

      (4) A critical experiment for the proposed mechanism is the rescue of the FAM53C S-phase reduction using DYRK1A inhibition shown in Figure 4. The legend here states that the data were extracted from BrdU incorporation assays, but in Figure S4D only the PI histograms are shown, and the S-phase population is not quantified. The authors should show the BrdU scatterplot and quantify the phenotype using the S-phase population in these plots. G1 measurements from PI histograms are not precise enough to allow for conclusions. Also, why are the intensities of the PI peaks so variable in these plots? Compare, for example, the HCT116 upper and lower panels where the siRNA appears to have caused an increase in ploidy. 

      We apologize for the confusion and have fixed these errors. For most of the analyses, we used PI to measure G1 and S-phase entry. We added relevant flow cytometry plots to supplemental figures (Fig. S1G, H, I, as well as Fig. S4E and S4K, and Fig. S5F).

      (5) There's an apparent contradiction in how RB deletion rescues the G1 arrest (Figure 2) while p21 seems to maintain the arrest even when DYRK1A is inhibited. Is p21 not induced when FAM53C is depleted in RB ko cells? This should be measured and discussed. 

      This comment and comments from the two other Reviewers made us reconsider our model. We carefully re-read the Meyer paper and think that DYRK1A activity may be understood when considering levels of both CycD and p21 at the same time in a continuum (as was nicely shown in a previous study from the lab of Tobias Meyer – Chen et al., Mol Cell, 2013). While our genetic and biochemical data support a role for FAM53C in DYRK1A inhibition, it is obvious that the regulation of cell cycle progression by FAM53C is not exclusively due to this inhibition. As discussed above and below, we noted an upregulation of p21 upon FAM53C knock-down, and activation of p53 and its targets likely contributes significantly to the phenotypes observed. We added new experiments to support this more complex model (Figure 4 and Figure S4, with new model in S4L).

      Reviewer #3 (Significance): 

      In conclusion, I believe that this MS could potentially be important for the cell cycle field and also provide a new target pathway that could be relevant for cancer therapy. However, the paper has quite a few gaps and inconsistencies that need to be addressed with further experiments. My main worry is that the acute depletion phenotypes appear so strong, while the gene is nonessential in mice and shows only a minor fitness effect in the depmap screens. More convincing controls are necessary to rule out experimental artefacts that misguide the interpretation of the results.

      We appreciate this comment and hope that the Reviewer will agree it is still important to share our data with the field, even if the phenotypes in mice are modest.

    1. As a consequence of the amendments set out above in relation to network services, interoperability and data sharing, it is furthermore proposed to repeal the following related implementing acts, by way of the applicable procedure, and to delete the corresponding empowerments: (1) Commission Regulation (EC) No 976/2009 as regards Network Services (2) Commission Regulation (EU) No 1089/2010 on interoperability of spatial data sets and services, and (3) Commission Regulation (EU) No 268/2010 on data and service sharing. (4) Commission Implementing Decision (EU) 2019/1372 implementing Directive 2007/2/EC as regards monitoring and reporting.

      I read this as taking out all INSPIRE obligations, whereas the HVD reg builds on these pre-existing obligations. (Stating that sharing data / services must be open)
      - [ ] Crosscheck if HVD states an explicit independent mandate, without reference to INSPIRE mandates. #geonovumtb #10mins #belangrijkeerst

    1. Reviewer #3 (Public review):

      Summary:

      In the manuscript of Cotten et al., the authors study the 2-thiolation of tRNA in bacterial antibiotic resistance. The wildtype organism, Yersinia pseudotuberculosis, downregulates 2-thiolation as a response to antibiotics targeting the ribosome. In this manuscript, the authors show that a knockout of tusB causes slower translation. They provide evidence on the mechanisms of the slowing by determining transcription and translation, ribosome profiling and performing codon-usage analysis. They successfully determined that 2 codons are drivers of the translation slowdown, and the data is highly conclusive. Technically, I have nothing to criticize.

      Strengths:

      All in all, the study is very well made, and the writing is clear and concise. It covers a wide array of state-of-the-art analyses to unravel the interplay of tRNA modifications in translation.

      Weaknesses:

      The only question that remains to be asked is why the slowed translation leads to a better survival of the bacteria under antibiotic stress. In my opinion, the mechanism itself remains unclear. Thus, the statement that "We expect that this reduction in ribosomal proteins is globally reducing the translational capacity of the cell and is responsible for inducing tolerance to ribosome and RNA polymerase-targeting antibiotics" does not truly emphasize the remaining open question of why slowed translation favors survival. Therefore, I would recommend a minor text revision.

    2. Author response:

      Reviewer #1 (Public review): 

      Summary: 

      Cotton et al. investigated the role of tusB in antibiotic tolerance in Yersinia pseudotuberculosis. They used the IP2226 strain and introduced appropriate mutations and complementation constructs. Assays were performed to measure growth rates, antibiotic tolerance, tRNA modification, gene expression and proteomic profiles. In addition, experiments to measure ribosome pausing and bioinformatic analysis of codon usage in ribosomal proteins provided in-depth mechanistic support for the conclusions. 

      Strengths: 

      The findings are consistent with the authors having uncovered new mechanistic insights into bacterial antibiotic tolerance mediated by reducing ribosomal protein abundance. 

      Weaknesses: 

      Since the WT strain grows faster than the tusB mutant, there is a question of how growth rate, per se, impacts some of the analysis done. The authors should address this issue. In addition, it may not be essential, but would analysis of another slow-growing mutant (in some other antibiotic tolerance pathway if available) serve as a good control in this context? 

      We would like to thank the reviewer for their time spent reviewing our manuscript and for their positive review. We plan to address their comment as to how growth rate impacts the analyses and plan to incorporate another slow-growing mutant in the revised version of the manuscript.

      Reviewer #2 (Public review): 

      Summary: 

      This study addresses a critical clinical challenge, bacterial antibiotic tolerance (a key driver of treatment failure distinct from genetic resistance), by uncovering a novel regulatory role of the conserved s2U tRNA modification in Yersinia pseudotuberculosis. Its strengths are notable and lay a solid foundation for understanding phenotypic drug tolerance. The study is the first to link s2U tRNA modification loss to antibiotic tolerance, specifically targeting translation/transcription-inhibiting antibiotics (doxycycline, gentamicin, rifampicin). By establishing a causal chain (s2U deficiency → codon-specific ribosome pausing (at AAA/CAA/GAA) → reduced ribosomal protein translation → global translational suppression → tolerance), it expands the functional landscape of tRNA modifications beyond canonical translation fidelity, filling a gap in how RNA epigenetics shapes bacterial stress adaptation. 

      Strengths: 

      This study makes a valuable contribution to understanding tRNA modification-mediated antibiotic tolerance. 

      Weaknesses: 

      There are several limitations that weaken the robustness of the study's mechanistic conclusions. Addressing these gaps would significantly enhance its impact and translational potential. 

      We would like to thank the reviewer for their time spent reviewing our manuscript, and for both their positive comments about the significance and novelty of this work as well as their critiques. We plan to address their specific recommendations in the revised manuscript by focusing on the contribution of specific ribosomal proteins (i.e. the 30S subunit protein, S13) through overexpression, codon replacement, and stability experiments. We also plan to design experiments to assess in vivo relevance and assess possible impacts on other pathways involved in antibiotic tolerance.

      Reviewer #3 (Public review): 

      Summary: 

      In the manuscript of Cotten et al., the authors study the 2-thiolation of tRNA in bacterial antibiotic resistance. The wildtype organism, Yersinia pseudotuberculosis, downregulates 2-thiolation as a response to antibiotics targeting the ribosome. In this manuscript, the authors show that a knockout of tusB causes slower translation. They provide evidence on the mechanisms of the slowing by determining transcription and translation, ribosome profiling and performing codon-usage analysis. They successfully determined that 2 codons are drivers of the translation slowdown, and the data is highly conclusive. Technically, I have nothing to criticize. 

      Strengths: 

      All in all, the study is very well made, and the writing is clear and concise. It covers a wide array of state-of-the-art analyses to unravel the interplay of tRNA modifications in translation. 

      Weaknesses: 

      The only question that remains to be asked is why the slowed translation leads to a better survival of the bacteria under antibiotic stress. In my opinion, the mechanism itself remains unclear. Thus, the statement that "We expect that this reduction in ribosomal proteins is globally reducing the translational capacity of the cell and is responsible for inducing tolerance to ribosome and RNA polymerase-targeting antibiotics" does not truly emphasize the remaining open question of why slowed translation favors survival. Therefore, I would recommend a minor text revision. 

      We would like to thank the reviewer for their time spent reviewing our manuscript and for their positive review of the technical aspects, experimental design, and writing. We will incorporate their suggested text revision into the revised manuscript, and will add to this statement if additional planned experiments shed light on this remaining question.

    1. Today's simplification package is composed of six legislative proposals.

      6 legislative proposals (but press release lists 5)

      1. Environmental assessments wrt permits
      2. industrial emissions directive
      3. SCIP database (substances of concern, in the Waste Framework directive) to be replaced with DPP ( #openvraag DPP is not in effect yet, so repeal of SCIP early / protection erosion?)
      4. Extended Producer Responsibility req changed for EU producers.
      5. INSPIRE