  1. Oct 2025

    Annotators

    1. Reviewer #3 (Public review):

      Rovira et al. aim to characterize immune cells in the brain parenchyma and identify a novel macrophage population referred to as "dendritic-like cells". They use a combination of single-cell transcriptomics, immunohistochemistry, and genetic mutants to demonstrate the presence of this "dendritic-like cell" population in the brain. The strength of this manuscript is the identification of dendritic cells in the brain, which are typically found in the meningeal layers and choroid plexus. In addition, the findings of Rovira et al. are supported by those of the Wen lab and a recent Cell Reports paper. Congratulations on the nice work!

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Weaknesses:

      While scRNA-seq data clearly revealed different subsets of microglia, macrophages, and DCs in the brain, it remains somewhat challenging to distinguish DC-like cells from P2ry12- macrophages by immunohistochemistry or flow cytometry.

      Indeed, in flow cytometry analyses of adult brain samples, the p2ry12<sup>-</sup>; mpeg1<sup>+</sup> fraction could, in theory, encompass not only DC-like cells but also other macrophage subsets, as well as B cells, since B cells have been reported to express mpeg1 in zebrafish (Ferrero et al., 2020; Moyse et al., 2020). Nevertheless, our data strongly indicate that within the brain parenchyma, DC-like cells represent the predominant component of this population. This conclusion is supported by the pronounced reduction of p2ry12<sup>-</sup>; mpeg1<sup>+</sup> cells in brain sections from batf3 mutants, in which DC development is impaired. Currently, further phenotypic resolution is constrained by the limited availability of zebrafish-specific antibodies and the restricted palette of fluorescent reporter lines capable of distinguishing MNP subsets. We anticipate that future efforts, including the generation of novel transgenic lines informed by our dataset (initiatives already underway in our group), will enable more precise discrimination among these distinct subsets.

      Reviewer #2 (Public Review):

      A weakness of this study is that it is mainly based on FACS sorting, which might modify the proportion of different subtypes.

      We agree that reliance solely on FACS could potentially introduce biases in the proportions of different subtypes. To minimize this concern, we complemented our flow cytometry data with quantification performed directly on brain sections using immunohistochemistry. This approach allowed us to validate cell population distributions in situ, thereby confirming that the trends observed by FACS accurately reflect the cellular composition of microglia and DC-like cells within the brain parenchyma.

      Reviewer #3 (Public Review):

      A weakness is the lack of specific reporters or labeling of this dendritic cell population using specific genes found in their single-cell dataset. Additionally, it is difficult to remove the meningeal layers from the brain samples and thus can lead to confounding conclusions. Overall, I believe this study should be accepted contingent on sufficient labeling of this population and addressing comments.

      While the generation of DC-like specific transgenic lines is indeed a promising direction (and such efforts are currently underway in our group), creating and validating these lines is time-consuming. Importantly, although these additional tools will be valuable for future functional investigations, we believe they would not impact the main conclusions or core message of our current work, where we already provide detailed spatial information on DC-like cells, and we demonstrated their lineage identity through the use of our newly generated batf3 mutant line. 

      Recommendations for the authors:

      Major Comments: 

      The authors should discuss another recent report demonstrating DCs in the zebrafish brain, which also developed independently of Csf1ra, and compare the two datasets (Zhou et al. Cell reports, 2023).

      Thank you for highlighting the study by Zhou et al., which offers complementary insight into the dendritic cell population in the zebrafish brain. We note that in this work, the authors reclassify ccl34b.1<sup>-</sup> mpeg1<sup>+</sup> brain-resident cells as conventional DCs, thus revising their earlier interpretation of these cells as microglia (Wu et al., 2020). This shift in interpretation is based on their transcriptional comparison between the previously characterized ccl34b.1<sup>-</sup> mpeg1<sup>+</sup> population and a new dataset of brain mpeg1<sup>+</sup> cells. This updated classification aligns closely with our findings. Given that our data already demonstrate the equivalence between the DC-like cells described in our study and the ccl34b.1<sup>-</sup> mpeg1<sup>+</sup> population, repeating a direct transcriptional comparison would be redundant. We have now included a discussion of this work in the revised manuscript. Specifically, we have added the following sentences in the discussion: “Importantly, since the submission of our manuscript, the Wen lab published an independent study in which they now reclassify the ccl34b.1<sup>-</sup> mpeg1<sup>+</sup> cells in the zebrafish brain as cDCs, revising their earlier interpretation of these cells as microglia (Zhou et al., 2023)”. 

      Data reported in Figure 5 should be quantified (cell numbers, how many brains analyzed). 

      Thank you for this comment. We would like to clarify that the primary purpose of Figure 5 (and Figure 5 supplement 1) is to provide an initial qualitative overview of the different MNP subsets present in the adult brain, using the currently available transgenic and immunohistochemical tools. These descriptive analyses were instrumental in identifying the most reliable combination, namely the Tg(p2ry12:p2ry12GFP; mpeg1.1:mCherry) double transgenic line in conjunction with L-plastin immunostaining, to distinguish microglia from other parenchymal MNPs. Quantitative analyses using this optimized strategy are presented in Figure 7 (Figure 7 supplement 1), where we systematically enumerate the different MNPs. We therefore believe that performing additional quantification in Figure 5 would be redundant with the more robust data already shown in Figure 7. As requested, we have now included in the Figure 5 legend that images are representative of brain tissue sections from 2-3 fish. 

      The title mentions an "atlas", but there is no searchable database or website associated with the paper. Please provide one.

      We agree and fully support the importance of data accessibility. To facilitate use of our dataset by the scientific community, we have developed a user-friendly, searchable web interface that allows users to explore gene expression patterns within our dataset. This website is available at https://scrna-analysis zebrafish.shinyapps.io/scatlas/

      This information has now been included in the “Data availability statement” section of the manuscript.  

      Reviewer #1 (Recommendations For The Authors): 

      Specific comments: 

      The authors should discuss another recent report demonstrating DCs in the zebrafish brain, which also developed independently of Csf1ra, and compare the two datasets (Zhou et al. Cell reports, 2023). 

      Thank you for this suggestion. Please refer to our response in the major comments section, where we address this point in detail.

      Within macrophages, the authors identified 5 clusters including 4 microglia clusters and 1 MF cluster (Figure 4). Does the latter relate to 'BAMs' and express markers previously described in murine BAMs, including Lyve1, CD206, etc.? Or to monocytes? By flow cytometry, monocytes were detected (Figure 1B), but not by scRNA-seq.  

      You have raised an important point here. As described in lines 197-202 (“results” section), the cells in the MF cluster exhibit a macrophage identity, based on their expression of classical macrophage markers such as marco, mfap4 or csf1ra. However, we were unable to confidently annotate this cluster more specifically. We also considered whether this population might resemble mammalian BAMs or monocytes, cell types that, to our knowledge, have not yet been clearly identified in zebrafish. However, orthologous markers typically associated with murine BAMs were not detected (lyve1) or not specifically enriched (mrc1a/mrc1b) in the MF cluster (see below). Based on these findings, we can only cautiously propose that this cluster may represent blood-derived macrophages and / or monocytes.

      To further address your suggestion, we performed a cell type enrichment analysis using the marker genes of the MF cluster, following the same strategy as for the microglia and DC-like clusters presented in Figure 4 supplement 2 C,D. This analysis revealed significant enrichment for “monocytes” and “macrophages”, further supporting a general monocytic/macrophage identity (see below). At present, further characterization of this cluster is limited by the lack of zebrafish-specific antibodies and the restricted palette of fluorescent reporter lines that distinguish among MNP subsets. We anticipate that future studies, including the development of new transgenic lines guided by our dataset, will allow for a more precise analysis of this distinct population. 

      Author response image 1.

      Do all 4 DC clusters identified by scRNA-seq represent cDC1s? or are there also cDC2s, and cDC3s present?  

      In our analyses, the four dendritic cell clusters identified by scRNA-seq (DC1-DC4) exhibit transcriptional profiles consistent with a conventional type 1 dendritic cell (cDC1) identity. These clusters uniformly express hallmark cDC1-associated genes, while lacking expression of markers typically associated with mammalian cDC2 or plasmacytoid dendritic cells (pDCs). For instance, irf4, a key transcription factor required for cDC2 development, is not detected in our dataset. Similarly, we do not observe expression of genes characteristic of pDCs. 

      That said, the absence of cDC2 or pDC-like signatures in our dataset does not rule out the presence of these populations in zebrafish.  

      While they show that DC-like cells did not express Csf1rb (Figure 4D) or other macrophage/microglia genes, DC-like cells were affected in the Csf1rb mutants and in double mutants, demonstrating that their development depends on Csf1rb signaling, as known for macrophages but not DCs. Can the authors discuss this in more detail with regard to DC differentiation/precursors? 

      Thank you for pointing this out. As previously demonstrated, CSF1R signaling in zebrafish is more complex than in mammals, due to the presence of two paralogs, csf1ra and csf1rb, which exhibit partially non-overlapping functions (Ferrero et al., 2021). We and others have shown that csf1rb signaling is implicated in the regulation of definitive hematopoiesis, particularly in the regulation of hematopoietic stem cell (HSC)-derived myelopoiesis. Although the developmental origin of zebrafish brain DC-like cells remains uncharacterized, their reduced numbers in the csf1rb mutant, despite their lack of csf1rb expression, support the current model in which csf1rb acts at the progenitor level, promoting myeloid lineage commitment. According to this model, csf1rb disruption affects the differentiation of multiple myeloid subsets, which likely include DC-like cells. We have developed this point in the discussion section (lines 502-506).  

      Do the DCs express Csf1ra? 

      Csf1ra transcripts are not found in DCs in our dataset. As shown below, csf1ra expression is restricted to the microglia and macrophage clusters. These observations are in line with those made by Zhou et al., 2023.

      Author response image 2.

      Fig. 5, the number of brains analyzed should be added, and also quantifications of cell numbers included. It is mentioned (line 260) that P2ry12GFP+mpeg1mCherry+ microglia are abundant across brain regions while P2ry12GFP- mpeg1mCherry+ cells particularly localize in the ventral part of the posterior brain parenchyma. It would be nice if images of the different brain regions were provided. 

      Regarding the quantification, we refer to our response in the major comments section, where we explain that detailed quantification of microglia and other MNP subsets is provided in Figure 7, using a more refined strategy for distinguishing cell types.

      As requested, we have now included representative sections from the forebrain, midbrain and hindbrain of adult Tg(mhc2dab:GFP; cd45:DsRed) fish. These images illustrate the spatial distribution of DC-like cells across brain regions. Notably, DC-like cells are most abundant in the ventral areas of the midbrain and hindbrain, and are also present in the posterior telencephalon, particularly concentrated in the region of the commissura anterior. This regional annotation is based on the zebrafish brain atlas by Wullimann et al., 1996 (Neuroanatomy of the zebrafish brain, https://doi.org/10.1007/978-3-0348-8979-7).

      These additional images have been included in Figure 5 Supplement 1 (A-E).

      It is sometimes not evident whether the P2ry12- cells included DC-like cells and macrophages, which should be discussed. 

      Thank you for bringing this to our attention. Upon review, we agree this point required clearer explanation throughout the text, particularly beginning with the description of putative DC-like cells in Figure 5. We have now revised the manuscript to improve clarity and better guide readers through the phenotypic identification of DC-like cells using the Tg(p2ry12:p2ry12-GFP;mpeg1:mCherry) line. Specifically, we have modified the titles in the results section from page 5 to page 9, so that readers can more easily follow the step-by-step approach we used to distinguish DC-like cells from microglia. 

      To directly address your comment: the p2ry12<sup>-</sup>; mpeg1<sup>+</sup> fraction may, in theory, include not only DC-like cells but also other macrophage subsets and B cells, as B cells have been shown to express mpeg1 in zebrafish (Ferrero et al., 2020; Moyse et al., 2020). Nevertheless, our data strongly indicate that within the brain parenchyma, DC-like cells represent the predominant component of this population. This conclusion is supported by the pronounced reduction of p2ry12<sup>-</sup>; mpeg1<sup>+</sup> cells in brain sections from batf3 mutants, in which DC development is impaired. 

      We have revised the text accordingly to clarify this point in the results section of the manuscript (line 355).

      For example, the DC-like cell population in Figure 6C appears to include two populations of cells. Thus, it is unclear whether the sorted mhc2dab:GFP+;CD45:DsRedhi population for bulk-seq also contains the MF population identified in Fig. 2. 

      Thank you for this thoughtful observation. During the course of this study, we indeed considered how best to isolate non-microglial macrophages in order to specifically recover the MF population identified in our scRNA-seq analysis. However, with the current repertoire of fluorescent transgenic zebrafish lines, it remains technically challenging to selectively isolate non-microglial macrophages from the adult brain. As a result, the mhc2dab:GFP<sup>+</sup>; cd45:DsRed<sup>hi</sup> sorted population used for bulk RNA-seq may indeed include a mixture of DC-like cells and other mononuclear phagocytes, potentially the MF population. In contrast, our data demonstrate that the Tg(p2ry12:p2ry12-GFP) line provides a more selective tool for isolating microglia, minimizing contamination from other mononuclear phagocyte subsets.

      In Figure 7, a reduction of GFP-mpeg+ cells can be seen in batf3 mutants. Could the remaining cells be the (non-microglia) macrophages? Or in Figure 8, could the remaining P2ry12GFP-Lcp1+ cells in Irf8 mutants be macrophages? 

      Indeed, we believe it is likely that the remaining mpeg1<sup>+</sup> cells observed in batf3 mutants include non-microglial macrophages and/or B cells, as we and others previously showed that zebrafish B cells express mpeg1.1 transcripts and are labeled in the mpeg1.1 reporters (Ferrero et al., 2020). This interpretation is further supported by the observation that the reduction in mpeg1<sup>+</sup> cells is more pronounced in brain sections than in flow cytometry samples, where non-parenchymal mpeg1<sup>+</sup> cells, such as peripheral macrophages or B cells, are likely enriched. To explore this possibility, we attempted to assess the expression of MF- and B cell-specific markers in the remaining mpeg1<sup>+</sup> population isolated from batf3 mutants. However, due to the very low numbers of cells recovered per animal, we were limited to analyzing only a few markers. Despite multiple attempts, qPCR analyses proved inconclusive, likely due to low transcript abundance. We thank you for your understanding of the technical limitations that currently prevent a more definitive characterization of these remaining cells.  

      Regarding the irf8 mutants (Figure 8), irf8 is a well-established master regulator of mononuclear phagocyte development. In mice, irf8 deficiency results in developmental defects and functional impairments across multiple myeloid lineages, including microglia, which exhibit reduced density (Kierdorf et al., 2013) and an immature phenotype (Vanhove et al., 2019). Similarly, in zebrafish, irf8 mutants show abnormal macrophage development, with an accumulation of immature and apoptotic cells during embryonic and larval stages (Shiau et al., 2014). Based on these findings, it is plausible that the residual p2ry12:GFP<sup>-</sup> Lcp1<sup>+</sup> cells observed in the irf8 mutant brains represent immature or arrested mononuclear phagocytes, possibly including both microglia and DC-like cells. This is supported by their distinct morphology and specific localization along the ventricle borders. However, as previously noted, our current tools do not permit conclusive identification of these cells.

      Reviewer #2 (Recommendations For The Authors): 

      A few sentences are not easy to understand for a "non zebrafish specialist". 

      (1) Page 3 line 111 The sentence "Interestingly, analyses of brain cell suspensions from double transgenics showed p2ry12:GFP+ microglia accounted for half of cd45:DsRed+ cells (50.9 % ± 2.9; n=4) (Figure 1D,E). Considering that mpeg1:GFP+ cells comprised ~75% of all leukocytes, these results indicated that approximately 25% of brain mononuclear phagocytes do not express the microglial p2ry12:GFP+ transgene." is not clear. This point is significant and deserves a more detailed explanation. 

      We apologize for the lack of clarity in this section. The quantification presented in Figure 1 refers specifically to cd45:DsRed<sup>+</sup> leukocytes, meaning that the reported percentages of p2ry12:GFP<sup>+</sup> and mpeg1:GFP<sup>+</sup> cells are calculated relative to the total cd45<sup>+</sup> population (defined as 100%). Specifically, we observed that approximately 51% of all cd45<sup>+</sup> cells were p2ry12:GFP<sup>+</sup> microglia, while around 75% were mpeg1:GFP<sup>+</sup>. From these values, we infer that about 25% of mpeg1:GFP<sup>+</sup> leukocytes do not express the p2ry12:GFP transgene and therefore likely represent non-microglial mononuclear phagocytes. We agree that this distinction is important and have revised the text accordingly to clarify the interpretation for readers who may be less familiar with zebrafish transgenic lines or gating strategies. See page 3, lines 107-117.
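
      The arithmetic behind this inference can be sketched as follows. This is only an illustration under stated assumptions: the two percentages are taken from the response above (with "~75%" treated as exactly 75), every p2ry12:GFP<sup>+</sup> cell is assumed to also be mpeg1:GFP<sup>+</sup>, and the two reporter measurements are assumed to be directly comparable. The function name is hypothetical. The sketch reports the non-microglial share against both possible denominators, since the choice of denominator changes the number.

```python
# Percentages of cd45:DsRed+ leukocytes quoted in the response above.
P2RY12_POS_PCT = 50.9  # p2ry12:GFP+ microglia
MPEG1_POS_PCT = 75.0   # mpeg1:GFP+ cells ("~75%" in the text, treated as exact here)


def non_microglial_share(mpeg1_pct: float, p2ry12_pct: float) -> tuple[float, float]:
    """Return the mpeg1+ p2ry12- share relative to (all cd45+ cells, mpeg1+ cells).

    Assumption (not stated in the source): all p2ry12:GFP+ cells are mpeg1:GFP+.
    """
    of_cd45 = mpeg1_pct - p2ry12_pct           # percentage points of all cd45+ cells
    of_mpeg1 = 100.0 * of_cd45 / mpeg1_pct     # same cells, as a share of mpeg1+ cells
    return of_cd45, of_mpeg1


of_cd45, of_mpeg1 = non_microglial_share(MPEG1_POS_PCT, P2RY12_POS_PCT)
print(f"{of_cd45:.1f}% of cd45+ leukocytes")
print(f"{of_mpeg1:.1f}% of mpeg1+ cells")
```

      Under these assumptions, the difference of roughly 24 percentage points is expressed relative to all cd45<sup>+</sup> leukocytes, which matches the "approximately 25%" quoted from the manuscript.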

      (2) Line 522; Like human and mouse ILC2s, "these cells do not express the T cell receptor cd4-1" is confusing (T cell receptor should be reserved to the ag specific TCR). Also, was TCR isotypes expression analyzed (and how was genome annotation used in this case ?) 

      Thank you for this insightful comment. We agree that the term “T cell receptor” should be used specifically to refer to antigen-specific TCRs, and we have revised the discussion accordingly to avoid any confusion. Regarding your question on the analysis of TCR isotype expression and the use of genome annotation: due to technical limitations, we did not pursue TCR isotype-level analysis in this study. Instead, we relied on established markers such as cd4-1 and cd8a to distinguish T cell populations, acknowledging that cd4-1 is not expressed by ILC2-like cells in our dataset. We have clarified these points in the relevant sections of the manuscript (see lines 168 and 535).

      The analysis of single-cell data might be more detailed, with more explanation about possible doublet identification and normalization procedures. 

      Thank you for highlighting the need for additional clarity regarding our scRNA-seq analysis.

      As noted in the Seurat tutorial, “cell doublets or multiplets often exhibit abnormally high gene count” (https://satijalab.org/seurat/archive/v3.0/pbmc3k_tutorial). To evaluate this, we performed a dedicated doublet detection analysis using the scDblFinder R package (https://rdrr.io/bioc/scDblFinder/f/vignettes/2_scDblFinder.Rmd). Our results indicated that the proportion of predicted doublets is low (see Figure below), and when present, these doublets are distributed among the different clusters. This contrasts with the typical clustering of doublets into discrete groups and indicates that our single-cell sequencing workflow was sufficiently robust to predominantly capture singlets.
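
      The check described above (whether predicted doublets are spread across clusters or concentrated in one) amounts to a per-cluster tabulation. The sketch below is a minimal illustration, not the authors' analysis, which was run in R. The cluster labels and singlet/doublet calls are made-up toy values, and the function name is hypothetical.

```python
from collections import Counter


def doublet_fraction_by_cluster(clusters, calls):
    """Map each cluster label to the fraction of its cells called 'doublet'."""
    totals = Counter(clusters)
    doublets = Counter(c for c, call in zip(clusters, calls) if call == "doublet")
    return {c: doublets[c] / totals[c] for c in totals}


# Toy, made-up labels for illustration only (not data from the study).
clusters = ["MG1", "MG1", "MG1", "MG1", "DC1", "DC1", "DC1", "DC1", "MF", "MF"]
calls = ["singlet", "singlet", "singlet", "doublet",
         "singlet", "singlet", "singlet", "doublet",
         "singlet", "singlet"]

fractions = doublet_fraction_by_cluster(clusters, calls)
for cluster, frac in sorted(fractions.items()):
    print(f"{cluster}: {frac:.2f}")
```

      A cluster whose doublet fraction is far above the others would be a candidate doublet-driven cluster. Similar fractions across clusters would be consistent with the pattern the response describes.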

      Regarding normalization, we have clarified this in the manuscript. Briefly, single-cell data were normalized using Seurat’s SCTransform method with the following custom parameters: “variable.features.n=4000 and return.only.var.genes=F”. These settings are now clearly described to ensure reproducibility.

      Author response image 3.

      Reviewer #3 (Recommendations For The Authors):

      Major issues

      Though batf3 mutants were generated, the manuscript would greatly benefit from in situ labeling by RNAscope or the generation of transgenic reporters to conclusively localize this dendritic cell population and address any potential contamination issues. 

      We thank you for this constructive suggestion. We agree that in situ labeling approaches such as RNAscope would offer valuable complementary insights. In our current study, however, we already provide detailed spatial information on DC-like cells, and we demonstrated their lineage identity through the use of our newly generated batf3 mutant line. 

      To address concerns regarding potential contamination, we have carefully analyzed more than two dozen adult brains to date and consistently observed abundant DC-like cells within the brain parenchyma, exhibiting a reproducible and specific spatial distribution, as described in the manuscript. This consistent localization across multiple samples strongly supports the genuine presence of these cells in the brain rather than artifactual contamination.

      While the generation of DC-like specific transgenic lines is indeed a promising direction (and such efforts are currently underway in our group) we note that creating and validating these lines is time-consuming and falls beyond the scope of the present study. Importantly, although these additional tools will be valuable for future functional investigations, we believe they would not impact the main conclusions or core message of our current work. 

      The morphological characterization of CD45:DsRed+ macrophages stained with May-Grunwald-Giemsa has been previously reported in the paper "Characterization of the mononuclear phagocyte system in the zebrafish" (Wittamer et al., 2011): "Morphologic analyses revealed that the majority of cells exhibited the characteristics of monocytes/macrophages namely low nuclear to cytoplasm ratios and a high number of cytoplasmic vacuoles (Figure 3B)." 

      We thank you for pointing out the reference to Wittamer et al., 2011. In that study, we indeed provided the first morphological characterization of mononuclear phagocytes (MNPs) in various adult zebrafish organs using the cd45:DsRed line in combination with the mhc2dab:GFP reporter. The focus was primarily on MNPs across peripheral tissues. In the current study, our aim is broader: we investigate the full diversity of brain immune cells, using cd45 as a general marker for leukocytes. As part of this comprehensive characterization, we applied MGG staining, a widely accepted cytological technique, to gain morphological insight into the sorted CD45:DsRed+ population. This method remains a valuable and rapid approach to visually assess cell type heterogeneity, especially when evaluating samples where multiple immune cell lineages may be present. 

      While there is some overlap with the methodology used in Wittamer et al., the context, scope, and tissue examined differ substantially. Thus, the inclusion of MGG staining in this study serves to complement our broader transcriptomic analyses by providing supporting morphological evidence specific to brain-resident immune cells.

      We have now clarified this distinction in the revised manuscript to better differentiate the current work from our previous findings (see line 85).

      Figure 5 data should be quantified.

      Please refer to our response in the major comments section, where we address this question in detail.

      Figure 7- Figure Supplement 1. J, K has no CD45:DsRed positive cells in batf3 mutants, which is counterintuitive because CD45:DsRed should capture all hematopoietic cells and is not specific to dendritic cells. 

      It is correct that cd45 is a general leukocyte marker, labeling all immune cells, including dendritic cells. In this Figure, we used the Tg(cd45:DsRed) transgenic line to visualize the phenotype because it offers an alternative to IHC, with the advantage of strong endogenous fluorescence and easier screening of vibratome sections. However, this technique has limitations: due to fixation, only cells with high fluorescence (e.g. cd45<sup>high</sup> dendritic cells) are captured, while those with medium/low expression (e.g. cd45<sup>low</sup> microglia) are often not visible. This explains why fewer cells are observed in both wild-type and batf3 mutant brains (Figure 5 KN, Figure 7 – supplement 1 JK). While this approach is quicker and allows for thicker sections, IHC remains the preferred method for the rest of the analyses, including the use of additional markers to identify all relevant cell populations. 

      Thank you for bringing this point of confusion to our attention. To improve clarity, we have amended the text in the relevant sections (see lines 704-706, and legend of Figure 7 Supplement 1).

      Minor issues: 

      The terms in the title, "A single-cell transcriptomic atlas..." are used. What is meant by "atlas"? A searchable database or website is not provided.

      Please refer to our response in the major comments section, where we explain that we have made our dataset accessible through a searchable web interface (https://scrna-analysiszebrafish.shinyapps.io/scatlas/) which is now referenced in the Data Availability Statement.

      This reviewer considers that it is offensive to use terminology such as "poorly characterized" in reference to others' work. 

      Thank you for pointing this out. We understand the concern and have revised the wording to ensure it remains respectful and neutral when referring to previous work. The changes are reflected in lines 20 and 49.

      The introduction of this manuscript should consider restructuring and editing. Example: Lines 51-57 introduce the importance of immune cells in zebrafish regeneration studies. However, this study does not investigate such processes. Additionally, the authors focus on the concept of immune heterogeneity in the brain throughout the text however, these studies have been conducted previously by others (Silva et al., 2021) at single-cell level.

      The novelty of this manuscript is the identification of "dendritic-like cells" and yet the introduction and text are limited to 68-71 lines. The introduction would benefit by introducing this cell type "dendritic-like cells" and differences between vertebrates. 

      Thank you for these valuable comments. In response, we have revised the introduction to better align with the focus of the study (see edited text in page 2). We now emphasize that, while macrophages have been extensively studied in zebrafish, dendritic cells remain much less well characterized in this model.  Also, while we acknowledge that Silva et al. addressed aspects of immune heterogeneity in the zebrafish brain, their study primarily focused on mononuclear phagocytes. In contrast, our work provides a broader and more detailed characterization of the brain immune landscape, integrating transcriptomic data with multiple fluorescent reporter lines and hematopoietic mutants to strengthen cell identity assignments. Importantly, we note that Silva et al. classified DC-like cells within the microglial compartment, whereas our findings support that these cells represent a distinct population. While our data challenge this specific aspect of their conclusions, we believe both studies offer complementary insights that collectively advance our understanding of zebrafish brain immunity. 

      Though Figure 6 is a great confirmation of scRNA sequencing, it seems redundant and should be supplemental data.

      We respectfully disagree with the reviewer’s suggestion. We believe that presenting the data in Figure 6 as the main figure enhances its visibility and impact, particularly highlighting the distinction between microglia and DC-like cells, an aspect we consider highly valuable information for the zebrafish research community. This is especially important given that our conclusions challenge two previous independent reports, further underscoring the relevance of these findings to the field.

    1. Reviewer #2 (Public review):

      Summary:

      The authors show that in E. coli the initiator protein DnaA oscillates post-translationally: its activity rises and peaks exactly when DNA replication begins, even if dnaA transcription is held constant. To explain this, they propose an "extrusion" mechanism in which nucleoid-associated proteins such as H-NS, whose amount grows with cell volume, dislodge DnaA from chromosomal binding sites; modelling and H-NS perturbations reproduce the observed drop in initiation mass and extra initiations seen after dnaA shut-down. Together, the data and model link biomass growth to replication timing through chromosome-driven, post-translational control of DnaA, filling gaps left by classic titration and ATP/ADP-switch models.

      Strengths:

      (1) Introduces an "extrusion" model that adds a new post-translational layer to replication control and explains data unexplained by classic titration or ATP/ADP-switch frameworks.

      (2) A major asset of the study is that it bridges the longstanding gap between DnaA oscillations and DNA-replication initiation, providing direct single-cell evidence that pulses of DnaA activity peak exactly at the moment of initiation across multiple growth conditions and genetic perturbations.

      (3) A tunable dnaA strain and targeted H-NS manipulations shift initiation mass exactly as the model predicts, giving model-driven validation across growth conditions.

      (4) A purpose-built Psyn66 reporter combined with mRNA-FISH captures DnaA-activity pulses with cell-cycle resolution, providing direct, compelling data.

      Weaknesses:

      (1) What happens to the (C+D) period and initiation time as the dnaA mRNA level changes? This is not discussed in the text or figure and should be addressed.

      (2) It is unclear what is meant by "relative dnaA mRNA level." Relative to what? Wild-type expression? Maximum expression? This should be explicitly defined.

      (3) It would be helpful to provide some intuition for why an increase in dnaA mRNA level leads to a decrease in initiation mass per ori and an increase in oriC copy number.

      (4) The titration and switch models do not explicitly include dnaA mRNA in the dynamics of DnaA protein. Yet, in Figure 2G, initiation mass is shown to decrease linearly with dnaA mRNA level in these models. How was dnaA mRNA level represented or approximated in these simulations?

      (5) Is Schaechter's law (i.e., exponential scaling of average cell size with growth rate) still valid under the different dnaA mRNA expression conditions tested?

      (6) The manuscript should explain more explicitly how the extrusion model implements post-translational control of DnaA and, in particular, how this yields the nonlinear drop in relative initiation mass versus dnaA mRNA seen in Fig. 6E. Please provide the governing equation that links total DnaA, the volume-dependent "extruder" pool, and the threshold of free DnaA at initiation, and show, briefly but quantitatively, how this equation produces the observed concave curve.

      (7) Does this extrusion model give the well-known adder per origin, i.e., is initiation-to-initiation an adder?

      (8) DnaA protein or activity is never measured; mRNA is treated as a linear proxy. Yet the authors' own narrative stresses post-translational (not transcriptional) control of DnaA. Without parallel immunoblots or activity readouts, it is impossible to know whether a six-fold mRNA increase truly yields a proportional rise in active DnaA.

      (9) Figure 2 infers both initiation mass and oriC copy number from bulk measurements (OD₆₀₀ per cell and rifampicin-cephalexin run-out) instead of measuring them directly in single cells. Any DnaA-dependent changes in cell size, shape, or antibiotic permeability could skew these bulk proxies, so the plotted relationships may not accurately reflect true initiation events.

      Comments on revisions:

      The authors have addressed all of my previous concerns, questions, and suggestions sufficiently.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The study by Li and coworkers addresses the important and fundamental question of replication initiation in Escherichia coli, which remains open, despite many classic and recent works. It leverages single-cell mRNA-FISH experiments in strains with titratable DnaA and novel DnaA activity reporters to monitor DnaA activity peaks versus size. The authors find oscillations in DnaA activity and show that their peaks correlate well with the estimated population-average replication initiation volume across conditions and imposed dnaA transcription levels. The study also proposes a novel extrusion model where DNA-binding proteins regulate free DnaA availability in response to biomass-DNA imbalance. Experimental perturbations of H-NS support the model validity, addressing key gaps in current replication control frameworks.

      Strengths:

      I find the study interesting and well conducted, and I think its main strong points are:

      (1) the novel reporters obtained with systematic synthetic biology methods, and combined with a titratable dnaA strain.

      (2) the interesting perturbations (titration, production arrest, and H-NS).

      (3) the use of single-cell mRNA FISH to monitor transcripts directly.

      The proposed extrusion model is also interesting, though not fully validated, and I think it will contribute positively to the future debate.

      We thank the reviewer for acknowledging the strengths of our study.

      Weaknesses and Limitations:

      (1) A relevant limitation in novelty is that DnaA activity and concentration oscillations have been reported by the cited Iuliani and coworkers previously by dynamic microscopy, and to a smaller extent by the other cited study by Pountain and coworkers using mRNA FISH.

      (2) An important limitation is that the study is not dynamic. While monitoring mRNA is interesting and relevant, the current study is based on concentrations and not time variations (or nascent mRNA). Conversely, the study by Iuliani and coworkers, while having the drawback of monitoring proteins, can directly assess production rates. It would be interesting for future studies or revisions to monitor the strains and reporters dynamically, as well as using (as a control) the technique of this study on the chromosomal reporters used by Iuliani et al.

      We acknowledge the value of dynamic measurements and clarify our methodological rationale.

      While Iuliani et al. provided valuable temporal resolution through protein dynamics, our mRNA FISH approach achieves direct decoupling of transcriptional vs. post-translational regulation (Fig 4F-H), and condition flexibility across 7 growth rates (30-66 min doubling times). This trade-off sacrifices temporal resolution for enhanced population-scale resolution and perturbation flexibility. To directly address temporal coupling, future work will implement dual-color live imaging of DnaA activity concurrent with replication initiation events.

      (3) Regarding the mathematical models, a lot of details are missing regarding the definitions and the use of such models, which are only presented briefly in the Methods section. The reader is not given any tools to understand the predictions of different models, and no analytical estimates are used. The falsification procedures are not clear. More transparency and depth in the analysis are needed, unless the models are just used as a heuristic tool for qualitative arguments (but this would weaken the claims). The Berger model, for example, has many parameters and many regimes and behaviors. When models are compared to data (e.g., in Figure 2G), it is not clear which parameters were used, how they were fixed, and whether and how the model prediction depends on parameters.

      We agree that model transparency is essential for quantitative validation. To address this, all model parameters (DnaA synthesis rate, activation/deactivation rates, etc.) are explicitly tabulated in Supplementary Information Table S6. For the titration (Hansen et al. 1991) and extrusion models, we derive analytical expressions for the sensitivity of initiation mass (IM) to DnaA expression in Supplementary Note 1. For Figure 2G/S6, we used published parameters (Berger & Wolde 2022 SI Table 2) with the experimental growth condition (μ = 1.54 h<sup>-1</sup>).

      The extrusion model's validation relies primarily on its ability to resolve paradoxical initiation events under dnaA shutdown (Fig 6C), a test where other models fail categorically. While the Berger titration-switch hybrid can fit steady-state IM trends (Fig S6A), it cannot reproduce post-shutdown dynamics without ad hoc modifications (Fig S6B). We acknowledge that comprehensive analysis of all model regimes exceeds this study's scope but provide full simulation code for independent verification: https://github.com/BaiYangBqdq/dynamics_of_biomass_DNA_coordination

      (4) Importantly, the main statement about tight correlations of peak volumes and average estimated initiation volume does not establish coincidence, and some of the claims by the authors are unclear in these respects (e.g., when they say "we resolve a 1:1 coupling between DnaA activity thresholds and replication initiation", the statement could be correct but is ambiguous). Crucially, the data rely on average initiation volumes (on which there seems to be an eternally open debate, also involving the authors), and the estimate procedure relies on assumptions that could lead to biases and uncertainties added to the population variability (in any case, error bars are not provided).

      We acknowledge the limitations of population-level inference and have refined our claims: "Replication initiation volume scales proportionally with peak DnaA activity volume with a slope of 1.0 (R<sup>2</sup>=0.98, Fig 7G), indicating predictive correspondence rather than absolute coincidence. While population-level 𝑉<sub>𝑖</sub> estimation cannot resolve single-cell stochasticity, the consistent 𝑉*: 𝑉<sub>𝑖</sub> relationship across 20 conditions suggests DnaA activity thresholds predict initiation timing within physiological error margins". Future work will simultaneously track DnaA activity and replication forks using microfluidic single-cell tracking.

      (5) The delays observed by the authors (in both directions) between the peaks of DnaA activity conditional averages with respect to volume and the average estimated initiation volumes are not incompatible with those observed dynamically by Iuliani and coworkers. The direct experiment to prove the authors' point would be to use a direct proxy of replication initiation, such as SeqA or DnaN, and monitor initiations and quantify DnaA activity peaks jointly, with dynamic measurements.

      We acknowledge the observed temporal deviations between DnaA activity peaks (𝑉*) and population-derived volumes at initiation (𝑉<sub>𝑖</sub>) in certain conditions, in line with the findings of Iuliani et al. These deviations might be mechanistically consistent with the time required for orisome assembly or oriC sequestration. They do not contradict our core finding that initiation occurs at a defined DnaA activity threshold (slope=1.0, R<sup>2</sup>=0.98 in 𝑉*: 𝑉<sub>𝑖</sub> correlation).

      (6) While not being an expert, I had some doubt that the fact that the reporters are on plasmid (despite a normalization control that seems very sensible) might affect the measurements. Also, I did not understand how the authors validated the assumptions that the reporters are sensitive to DnaA-ATP specifically. It seems this assumption is validated by previous studies only.

      We employed a plasmid-based reporter system to circumvent the significant confounding effects of chromosomal position on promoter activity, as extensively documented by Pountain et al., where local genomic context (e.g., nucleoid occlusion, supercoiling gradients, and neighboring operons) introduces uncontrolled variability. By housing the P<sub>syn66</sub> test promoter and P<sub>con</sub> normalization control in identical low-copy pSC101 vectors (<8 copies/cell, Peterson & Phillips, Plasmid 2008), we ensured they experience equivalent physical and biochemical environments. This ratiometric design, where DnaA activity is calculated from the ratio of the two reporter signals, actively corrects for global fluctuations in RNA polymerase availability, nucleotide pools, and plasmid copy number. Critically, P<sub>syn66</sub>’s architecture emulates natural DnaA-responsive elements: its strong DnaA boxes report free DnaA concentration, while its weak box is preferentially bound by DnaA-ATP (Speck et al., EMBO Journal 1999), mirroring the nucleotide-state sensitivity of oriC and the native dnaA promoter. This system was indispensable for our central finding, as it uniquely enabled the decoupling of DnaA activity oscillations from transcriptional feedback (Fig. 4F-H), an experiment fundamentally impossible with chromosomally integrated reporters due to autoregulatory interference.

      Overall Appraisal:

      In summary, this appears as a very interesting study, providing valuable data and a novel hypothesis, the extrusion model, open to future explorations. However, given several limitations, some of the claims appear overstated. Finally, the text contains some self-evaluations, such as "our findings redefine the paradigm for replication control", etc., that appear exaggerated.

      We thank the reviewer for highlighting the need for precise language in framing our conclusions. We have implemented the following substantive revisions throughout the manuscript to ensure claims align strictly with empirical evidence:

      (1) Changed "redefine the paradigm for replication control" into "advance the paradigm for replication control" (Introduction)

      (2) Changed "redefine bacterial cell cycle control" into "refine bacterial cell cycle control as a dynamic interplay..." (Discussion)

      (3) Removed the term "spatial" from the Discussion's description of DnaA-chromosome interactions (Discussion, first paragraph).

      (4) Changed "provides a blueprint" into "provides a valuable tool for dissecting spatial regulation..." (Discussion, final paragraph)

      (5) Scrutinized all superlatives (e.g., "critical feat" into "important capability"; "fundamental principle of cellular organization" into "potential organizational strategy")

      (6) Replaced the instances of "robust" with evidence-backed descriptors (e.g., "sensitive," "consistent")

      (7) We agree that the extrusion model requires further validation and have emphasized this in Discussion: "While H-NS perturbation supports extrusion mechanism, future work should identify the full extruder interactome and elucidate how metabolic signals modulate their activity" (final paragraph)

      This calibrated language more accurately represents our study as a conceptual advance with testable mechanisms, not a complete paradigm shift.

      Reviewer #2 (Public review):

      Summary:

      The authors show that in E. coli, the initiator protein DnaA oscillates post-translationally: its activity rises and peaks exactly when DNA replication begins, even if dnaA transcription is held constant. To explain this, they propose an "extrusion" mechanism in which nucleoid-associated proteins such as H-NS, whose amount grows with cell volume, dislodge DnaA from chromosomal binding sites; modelling and H-NS perturbations reproduce the observed drop in initiation mass and extra initiations seen after dnaA shut-down. Together, the data and model link biomass growth to replication timing through chromosome-driven, post-translational control of DnaA, filling gaps left by classic titration and ATP/ADP-switch models.

      Strengths:

      (1) Introduces an "extrusion" model that adds a new post-translational layer to replication control and explains data unexplained by classic titration or ATP/ADP-switch frameworks.

      (2) A major asset of the study is that it bridges the longstanding gap between DnaA oscillations and DNA-replication initiation, providing direct single-cell evidence that pulses of DnaA activity peak exactly at the moment of initiation across multiple growth conditions and genetic perturbations.

      (3) A tunable dnaA strain and targeted H-NS manipulations shift initiation mass exactly as the model predicts, giving model-driven validation across growth conditions.

      (4) A purpose-built Psyn66 reporter combined with mRNA-FISH captures DnaA-activity pulses with cell-cycle resolution, providing direct, compelling data.

      We thank the reviewer for acknowledging the strengths of our study.

      Weaknesses:

      (1) What happens to the (C+D) period and initiation time as the dnaA mRNA level changes? This is not discussed in the text or figure and should be addressed.

      We thank the reviewer for this important observation. Our data demonstrate that increased dnaA mRNA levels induce two compensatory changes in cell cycle progression:

      (1) Earlier replication initiation, manifested as a reduced initiation mass: the initiation mass decreased from 5.6 to 2.6 (OD<sub>600</sub>·ml per 10<sup>10</sup> cells) as the relative dnaA mRNA level increased from 0.2 to 7.2 (normalized to the wild-type level) (Fig. 2F, red).

      (2) Prolonged C+D period: Increased by approximately 60% (from 1.05 to 1.66 hours, Fig. 2F blue).

      The complete quantitative relationship is now explicitly described in the Results section: “Concurrently, the initiation mass was reduced by 50%, and the period from initiation to division (C+D) was increased by ~60% (Fig. 2F)”
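      The percentages quoted above can be checked directly from the endpoint values reported for Fig. 2F. The following is a minimal sketch of that arithmetic; the helper name `percent_change` is illustrative and not part of the authors' analysis code.

```python
def percent_change(before: float, after: float) -> float:
    """Signed percent change going from `before` to `after`."""
    return (after - before) / before * 100.0


# Endpoint values quoted in the response (Fig. 2F).
initiation_mass = (5.6, 2.6)    # OD600·ml per 10^10 cells
c_plus_d_hours = (1.05, 1.66)   # hours

mass_change = percent_change(*initiation_mass)
cd_change = percent_change(*c_plus_d_hours)

print(f"initiation mass: {mass_change:+.1f}%")
print(f"C+D period:      {cd_change:+.1f}%")
```

      On these endpoints the initiation mass falls by a little more than half and the C+D period rises by a little under 60%, consistent with the rounded figures in the revised Results text.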

      (2) It is unclear what is meant by "relative dnaA mRNA level." Relative to what? Wild-type expression? Maximum expression? This should be explicitly defined.

      The relative dnaA mRNA level was obtained by normalizing to that in wild-type MG1655 cells grown in the same medium. To clarify this point, we have now marked the wild-type level in Fig. 1B, and a clear description of this has also been included in the figure caption.

      (3) It would be helpful to provide some intuition for why an increase in dnaA mRNA level leads to a decrease in initiation mass per ori and an increase in oriC copy number.

      Thank you for your valuable suggestion. Increased dnaA mRNA accelerates DnaA accumulation, causing cells to reach the initiation threshold at a smaller cell size (reducing initiation mass, Fig. 2F red). This earlier initiation increases oriC copies per cell at the population level (Fig. 2E). This mechanistic interpretation now appears in the Results: “As the DnaA expression level increases, DnaA activity reaches the initiation threshold earlier. Given that cell mass remained nearly unchanged, this earlier initiation led to an increase in population-averaged cellular oriC numbers (Fig. 2E).”

      (4) The titration and switch models do not explicitly include dnaA mRNA in the dynamics of DnaA protein. Yet, in Figure 2G, initiation mass is shown to decrease linearly with dnaA mRNA level in these models. How was dnaA mRNA level represented or approximated in these simulations?

      All models presented in this article omit explicit modeling of dnaA mRNA dynamics for simplicity. However, at steady state, the relative level of dnaA mRNA can be approximated by the relative expression rate of DnaA protein, as both reflect the expression level of DnaA. This detail is now clarified in the caption of Figure 2G.

      (5) Is Schaechter's law (i.e., exponential scaling of average cell size with growth rate) still valid under the different dnaA mRNA expression conditions tested?

      Schaechter's law describes the exponential scaling of average cell size with growth rate in bacteria. In our prior work (Zheng et al., Nature Microbiology 2020), we demonstrated that Schaechter's law fails in slow-growth regimes. However, in the current study, growth rate remained constant across different dnaA expression levels (Fig. 2C), and cell mass showed no significant change (Fig. 2D). Since Schaechter's law specifically addresses how cell size scales with growth rate, it does not apply here: growth rate was invariant in our perturbations, which selectively alter replication initiation dynamics, not growth rate or size scaling.

      (6) The manuscript should explain more explicitly how the extrusion model implements post-translational control of DnaA and, in particular, how this yields the nonlinear drop in relative initiation mass versus dnaA mRNA seen in Figure 6E. Please provide the governing equation that links total DnaA, the volume-dependent "extruder" pool, and the threshold of free DnaA at initiation, and show - briefly but quantitatively - how this equation produces the observed concave curve.

      The governing equations linking initiation mass and DnaA expression level are now provided in Supplementary Note S1 for both the titration and the extrusion models. In general, the dependence of initiation mass (𝑉<sub>𝐼</sub>) on dnaA expression level (𝛼<sub>𝐴</sub>) takes an inverse proportionality form: 𝑉<sub>𝐼</sub> ∝ 1/𝛼<sub>𝐴</sub>. In the extrusion model, the incorporated extruder protein is assumed to have synthesis dynamics similar to those of DnaA and can release DnaA from DnaA boxes. After denoting the synthesis rate of the extruder as 𝛼<sub>𝐻</sub>, the combined effect of DnaA and the extruder on replication initiation can be briefly described as: 𝑉<sub>𝐼</sub> ∝ 1/(𝛼<sub>𝐴</sub> + 𝛼<sub>𝐻</sub>). The additive contribution of 𝛼<sub>𝐻</sub> then dampens the sensitivity of initiation mass to changes in 𝛼<sub>𝐴</sub>, resulting in a significantly flattened curve. As a result, the predicted 𝑉<sub>𝐼</sub> − 𝛼<sub>𝐴</sub> relationship has a concave shape in the semi-log plots.
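      To make the dampening argument concrete, the sketch below compares the log-log sensitivity of initiation mass to the DnaA expression rate under two assumed forms: pure inverse proportionality for titration, and an additive extruder term in the denominator for extrusion. Both functional forms, the function names, and the value of `ALPHA_H` are assumptions made for illustration; the authors' actual expressions are those given in their Supplementary Note S1.

```python
import math


def titration_mass(alpha_a: float) -> float:
    """Relative initiation mass, assuming V_I is proportional to 1/alpha_A."""
    return 1.0 / alpha_a


def extrusion_mass(alpha_a: float, alpha_h: float) -> float:
    """Relative initiation mass, assuming V_I is proportional to 1/(alpha_A + alpha_H)."""
    return 1.0 / (alpha_a + alpha_h)


def log_sensitivity(f, alpha_a: float, eps: float = 1e-6) -> float:
    """Numerical d ln(V_I) / d ln(alpha_A) by central difference."""
    up = math.log(f(alpha_a * (1 + eps)))
    down = math.log(f(alpha_a * (1 - eps)))
    return (up - down) / (math.log(1 + eps) - math.log(1 - eps))


ALPHA_H = 5.0  # hypothetical extruder synthesis rate (arbitrary units)

for alpha_a in (0.2, 1.0, 7.2):
    s_tit = log_sensitivity(titration_mass, alpha_a)
    s_ext = log_sensitivity(lambda a: extrusion_mass(a, ALPHA_H), alpha_a)
    print(f"alpha_A={alpha_a:>4}: titration {s_tit:+.3f}, extrusion {s_ext:+.3f}")
```

      Under these assumptions the titration sensitivity is always -1, whereas the extrusion sensitivity is -𝛼<sub>𝐴</sub>/(𝛼<sub>𝐴</sub> + 𝛼<sub>𝐻</sub>), which stays close to zero while 𝛼<sub>𝐴</sub> is small relative to 𝛼<sub>𝐻</sub>. That is the flattening described in the text.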

      (7) Does this extrusion model give the well-known adder per origin, i.e., is initiation-to-initiation an adder?

      Yes, the extrusion model reproduces the initiation-to-initiation adder phenomenon; this is shown in Fig. S3C.

      (8) DnaA protein or activity is never measured; mRNA is treated as a linear proxy. Yet the authors' own narrative stresses post-translational (not transcriptional) control of DnaA. Without parallel immunoblots or activity readouts, it is impossible to know whether a sixfold mRNA increase truly yields a proportional rise in active DnaA.

      We acknowledge the reviewer's valid concern regarding the indirect nature of our DnaA activity measurements. While mRNA levels alone cannot resolve active DnaA dynamics, our approach integrates functional replication outcomes with a validated synthetic reporter to infer activity. Crucially, elevated dnaA mRNA causes demonstrable biological effects: earlier replication initiation (Fig. 2F) and increased oriC copies (Fig. 2E), directly confirming enhanced functional DnaA activity at the oriC locus. The P<sub>syn66</sub> reporter, engineered with DnaA boxes mirroring oriC's architecture, provides orthogonal validation, showing progressive repression in response to dnaA induction (Fig. 3C). Our operational metric, based on P<sub>syn66</sub>, responds sensitively to DnaA-chromosome interactions within its characterized 8-fold dynamic range (Fig. 3C). Immunoblots would be inadequate here, as they cannot distinguish functionally critical pools: free versus chromosome-bound DnaA, or DnaA-ATP versus DnaA-ADP, precisely the post-translational states our study implicates in regulation. We therefore prioritize functional readouts (initiation timing) and the P<sub>syn66</sub> reporter, which probes the biologically active fraction relevant to replication control.

      (9) Figure 2 infers both initiation mass and oriC copy number from bulk measurements (OD<sub>600</sub> per cell and rifampicin-cephalexin run-out) instead of measuring them directly in single cells. Any DnaA-dependent changes in cell size, shape, or antibiotic permeability could skew these bulk proxies, so the plotted relationships may not accurately reflect true initiation events.

      We acknowledge the reviewer's valid methodological concern and clarify that while bulk measurements carry inherent limitations, our approach is grounded in established techniques with demonstrated reliability. Cell mass was inferred from OD<sub>600</sub>/cell, which correlates strongly with direct dry weight measurements and microscopic cell volumes across diverse growth conditions, as validated in our prior work (Zheng et al., Nature Microbiology 2020). Crucially, cell mass remained invariant across dnaA expression levels (Fig. 2D).

      Regarding oriC quantification, the rifampicin-cephalexin run-out assay is widely applied in replication initiation studies. Our data show the expected 2<sup>n</sup> oriC distributions without abnormal ploidy (as shown below). While single-cell methods offer superior resolution, our bulk approach provides accurate population-level trends.

      Author response image 1.

      Recommendations for the authors:

      Reviewing Editor Comments:

      The reviewers felt that the mathematical modeling was not adequately explained in the paper, and that this affected the readability of the manuscript. The authors are encouraged to elaborate on this aspect of the paper (in addition to strengthening other claims, if possible, per the reviewers' comments).

      We thank the editor and reviewers for their constructive feedback. We have comprehensively strengthened the mathematical modeling framework to enhance clarity and rigor.

      Reviewer #1 (Recommendations for the authors):

      The only revision I would do is a recalibration of the claims and a major effort to clarify the modeling part (including a detailed SI appendix), without necessarily performing additional work.

      To enhance the transparency of the mathematical modeling, we have completed the model description in the Methods section and added a parameter table with literature-sourced values in Supplementary Information Table S6. Moreover, analytical derivations of the initiation mass dependencies are presented in Supplementary Information Note S1.

      Of course, there are extra experiments (mentioned in the public review) that would help support some of the big claims, but that can be considered a different project.

      Thank you for your suggestion. This will be addressed in our future work.

      Minor suggestion: please put signposts or plot jointly to compare the maxima/minima in Figures 4D, E, G, and H.

      We added dashed lines in Figures 4D and E to align DnaA activity peaks and transcriptional minima across panels, facilitating direct comparison.

      Reviewer #2 (Recommendations for the authors):

      (1) Should define what DnaA activity is.

      We have explicitly defined DnaA activity in the Introduction as “the capacity to initiate replication…” and noted that it is “governed by free DnaA concentration, DnaA-ATP/-ADP ratio, and orisome assembly competence”.

      (2) Word repetition - “...grown in in Luria-Bertani (LB) medium...”.

      Corrected.

      (3) Typographical error - “FISH ... was preformed" should be "performed”.

      Corrected.

      (4) The manuscript alternates between “ng ml<sup>-1</sup>” and “ng·ml<sup>-1</sup>”; choose one style and apply it uniformly.

      Standardized the units to ng·ml<sup>-1</sup> throughout.

      (5) Reference duplicates - Some citations appear twice in the bibliography (e.g., "Bintu et al., 2005a/b" and "Bintu et al., 2005b" listed again later).

      The studies by Bintu et al. (2005a, 2005b) represent separate works: 2005a details applications, and 2005b develops models.

    1. Reviewer #2 (Public review):

      Summary:

      This paper introduces a new function within the Fam3Pro package that addresses the problem of breaking loops in family structures. When a loop is present, standard genotype peeling algorithms fail, as they cannot update genotypes correctly. The solution is to break these loops, but until now, this could not be done automatically and optimally.

      The manuscript provides useful background on constructing graphs and trees from family data, detecting loops, and determining how to break them optimally for the case of no loops with multiple matings. For this situation, the algorithm switches between Prim's algorithm and a simple greedy approach and provides a solution. However, here, an optimal solution is not guaranteed.

      The theoretical foundations, such as the representation of families as graphs or trees and the identification of loops, are clearly explained and well-illustrated with example pedigrees. The practical utility of the new function is demonstrated by applying it to a dataset containing families with loops.
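      As an illustration of what loop detection in a pedigree graph involves, the sketch below counts independent loops with a union-find structure. It is a generic graph technique written for this note, not Fam3Pro's implementation; the function name and the encoding of matings as extra nodes (`M0`, `M1`, ...) are assumptions.

```python
def count_independent_loops(edges):
    """Count independent cycles in an undirected graph given as (u, v) pairs.

    Every edge that joins two nodes already in the same component closes a loop.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    loops = 0
    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            loops += 1
        else:
            parent[root_u] = root_v
    return loops


# Nuclear family: two parents, one mating node, two children (no loop).
nuclear_edges = [("F", "M0"), ("M", "M0"), ("M0", "C1"), ("M0", "C2")]

# First-cousin marriage: siblings A and B each have a child (C, D), and C and D mate.
cousin_edges = [
    ("G1", "M0"), ("G2", "M0"), ("M0", "A"), ("M0", "B"),
    ("A", "M1"), ("X", "M1"), ("M1", "C"),
    ("B", "M2"), ("Y", "M2"), ("M2", "D"),
    ("C", "M3"), ("D", "M3"), ("M3", "E"),
]

print(count_independent_loops(nuclear_edges))
print(count_independent_loops(cousin_edges))
```

      A pedigree whose graph has no such loop is a tree, which is the case standard peeling handles; any nonzero count means at least one individual or link has to be broken first.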

      This work has the potential for considerable impact, especially for medical researchers and individuals from families with loops. These families could previously not be analysed automatically and optimally. The new function changes that, enabling risk assessments and genetic calculations that were previously infeasible.

      Strengths:

      (1) The theoretical explanation of graphs, trees, and loop detection is clear and well-structured.

      (2) The idea of switching between algorithms is original and appears effective.

      (3) The function is well implemented, with minimal additional computational cost.

      Weaknesses:

      (1) In cases with multiple matings, the notion of a "close-to-optimal" solution is not clearly defined. It would be helpful to explain what this means: whether it refers to empirical performance, theoretical bounds, or something else.

      (2) In the example pedigree discussed, multiple options exist for breaking loops, but it is unclear which is optimal.

      (3) No example is provided where the optimal solution is demonstrably not reached.

      (4) It is also unclear whether the software provides a warning when the solution might not be optimal.

    2. Author response:

      Response to Reviewer #1:

      We plan to extend the discussion section to discuss the clinical implications of this new function. We will note the algorithm's applicability to broader genetic counseling contexts beyond cancer risk assessment.

      Response to Reviewer #2:

      We will clarify the four points raised:

      (1) "Close-to-optimal" definition: We will explain that in multiple-mating cases, finding the global optimum is NP-hard (equivalent to the Weighted Feedback Vertex Set problem). We will clarify that our greedy algorithm provides practically efficient solutions suitable for clinical use, though without theoretical optimality guarantees.

      (2) Example clarity: We will improve Figure 1's caption to explain the cost calculations and note that with equal weights, both shown solutions are equivalent.

      (3) Non-optimal examples: We will describe scenarios where the greedy algorithm may not achieve the global optimum, particularly in multiple-mating cases with heterogeneous weights.

      (4) Warning message: The current version does not provide a warning when the solution might be non-optimal. This may be added to the function in the future.
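      To illustrate why a greedy approach to the Weighted Feedback Vertex Set problem carries no optimality guarantee, the sketch below implements one common greedy heuristic (repeatedly remove the vertex with the lowest weight-to-degree ratio) and applies it to a small graph with heterogeneous weights. This is a generic heuristic chosen for illustration, not the algorithm used in Fam3Pro; the function name, graph, and weights are all hypothetical.

```python
def greedy_feedback_vertex_set(adjacency, weight):
    """Greedy heuristic: return vertices whose removal leaves the graph acyclic.

    `adjacency` maps each vertex to its neighbours (simple undirected graph,
    symmetric, no self-loops); `weight` maps each vertex to a positive cost.
    """
    graph = {v: set(nbrs) for v, nbrs in adjacency.items()}
    removed = set()

    def prune():
        # Vertices of degree <= 1 cannot lie on a cycle; strip them repeatedly.
        stack = [v for v in graph if len(graph[v]) <= 1]
        while stack:
            v = stack.pop()
            if v not in graph:
                continue
            for n in graph.pop(v):
                graph[n].discard(v)
                if len(graph[n]) <= 1:
                    stack.append(n)

    prune()
    while graph:
        # All remaining vertices have degree >= 2, so at least one cycle remains.
        pick = min(graph, key=lambda v: weight[v] / len(graph[v]))
        removed.add(pick)
        for n in graph.pop(pick):
            graph[n].discard(pick)
        prune()
    return removed


# Four-cycle a-b-c-d with a chord a-c; removing "a" (or "c") alone breaks every cycle.
adjacency = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c"},
}
weight = {"a": 3.0, "b": 1.9, "c": 3.0, "d": 1.9}

chosen = greedy_feedback_vertex_set(adjacency, weight)
greedy_cost = sum(weight[v] for v in chosen)
print(sorted(chosen), greedy_cost)
```

      In this example the heuristic removes `b` and `d` at a total cost of 3.8, while removing `a` alone would cost 3.0 and also leave no cycle. This is the kind of heterogeneous-weight scenario in which a greedy choice can miss the global optimum.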

      We appreciate your feedback and suggestions to help improve the manuscript.

    1. In the public mind Asian Americans are often synonymous with academic excellence, in part because their group scores on standardized tests and their college enrollment levels often exceed those of other groups, often including whites.

      This tends to come with a lot of pressure on students because of stereotypes like the model minority. As mentioned in the text, stereotypes about Asians include having high standardized test scores, which can leave them feeling forced to fit into this category. This leads to unhealthy habits that can affect their mental health and well-being. Many are oblivious to the dangers that come with imposing stereotypes, especially when people are held to such high standards.

    2. Some earlier studies during the legal segregation era indicated that many African Americans were encouraged, from a young age, to rigidly control their anger and rage over discriminatory incidents affecting them.10 Historically, it was very dangerous for African Americans to unleash their anger about racist attacks. In earlier decades, black parents taught their children to remain even tempered in the face of extreme Jim Crow oppression, which silence demonstrably had severe effects on self-esteem and mental health, as it likely does in the case of African Americans and Asian Americans today

      This reveals how systemic racism not only inflicts external harm but also demands internalized emotional discipline. The legacy of emotional containment reveals how racism operates through the regulation of affect shaping how marginalized communities are allowed to feel or express themselves. Seeing how both African Americans and Asian American communities have similar constraints shows how institutional racism continues to police emotional expression which tends to have negative effects on mental health and identity formation.

    3. Althoughshe was rarely recognized for her significant involvement in important extra-curricular activities, people did associate her with academic excellence. Whileperforming well in school made her feel like an outsider, she worked hard foracademic success as a defensive mechanism

      This really comes to show that academic success is no longer being a source of empowerment and joy but a shield against exclusion. Her excellence is only acknowledged narrowly and deeply confined to academics while her broader contributions are overlooked. This shows a systemic bias in regards to how merit is recognized. I think her relationship with school reveals how institutional cultures can distort the meaning of success, turning it to a coping mechanism for navigating environments.

    4. Most school systems seem to allow much racist teasing. Respondents who protested to teachers were usually told not to take racial taunting seriously. Young Asian Americans are told to thicken their skin, while white and other non-Asian children are often allowed to continue. The parents of tormented students are frequently fearful about complaining of racial taunting and teasing and do not want to “cause trouble” or generate white retaliation. In this era of school multiculturalism, many administrators encourage teachers to celebrate diversity in classrooms, and this superficial “be happy” multiculturalism may sometimes reduce their ability to see the impact of such racist treatment on students of color, as well as the underlying reality of institutionalized racism in their educational institutions

      This exposes how school systems tolerate racist bullying and how institutions tend to mask harm with performative multiculturalism. Telling Asian American students to toughen up while excusing their white peers' behavior reflects not only bias but also the system's refusal to confront racism. It is really upsetting to see that parents are afraid to speak out and hold back for fear of being further discriminated against.

    5. More Discrimination: The High School Asian Experience

      Many Chinese students also face similar pressures within the exam-oriented education system: “excellence” becomes a moral label, with no room for failure or emotional expression. This is especially true when studying abroad. Chinese students are often defined as “quiet, strong in science, and lacking creativity”—a narrative that closely mirrors Ann's own experience.

    6. As children attend child-care facilities and elementary school, they are gradually introduced to racial socialization in peer groups. Young children’s racist behavior is often excused by adults on the grounds that children are naïve innocents and often slip and fall in the realm of social behavior, yet the assumption that children’s racist comments and actions are innocuous is incorrect. Based on extensive field research in a large child-care center, Debra Van Ausdale and Joe Feagin concluded that the “strongest evidence of white adults’ conceptual bias is seen in the assumption that children experience life events in some naïve or guileless way.”5 Children mimic adults’ racist views and behavior, but that does not mean they do not understand and know numerous elements of the dominant racial frame and use its stereotypes and interpretations to enhance their status among other children.

      Children are not born racists; they learn racial hierarchies by imitating the behaviors of adults and peers. White children reproduce the social order of “white supremacy” through language and mockery, while Asian children are marginalized from a young age, subjected to ridicule about their appearance and food, thereby learning their subordinate place within the social racial structure. Schools are not neutral learning environments but spaces that reproduce social hierarchies. Through seemingly “playful” interactions, children learn who belongs to the “mainstream” and who is the “other,” while teachers' silence effectively endorses this structure. Similar social stratification exists within China's educational environment, manifesting in stereotypes targeting non-local students, ethnic minorities, those with distinct accents, or students deemed “unsociable.” We frequently hear excuses like “He's just a child, he doesn't understand” to justify discriminatory remarks.

    1. , far from a solipsistic politics of consciousness oblivious to material context, many sought to build lives “not on stoned indifference but on active social engagement and community-oriented hard work” (p. 3) that would create new environments, public spaces (Silos, 2003), “right livelihoods,” and alternative social “games” in line with their values (notably by moving “back-to-the-land” and setting up farms and communes

      Rejection of the American norm

    Annotators

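    1. If you did not win the womb lottery, choosing and pursuing a career is a question you cannot avoid in life. Today, “career” seems to have become the root of all evil, because “we work in order not to work” and “we strive in order not to strive.” Young people today share a strikingly uniform life goal: financial freedom. I hope this episode can complete one argument: financial freedom is a terrible goal that makes you lose at the starting line. Taking it as your goal may well leave you severely financially unfree. The usual understanding is that to pursue a career is to treat yourself as a commodity, selling your time and energy in exchange for pay. More abstractly, a “career” is a mode of survival in which you solve your own problems by solving other people's problems; this is an inevitable result of the commercial division of labor. Problem A and problem B trade at different exchange rates: a patient's life-threatening problem may be only a routine problem for a doctor, and the huge knowledge gap between them makes the patient willing to spend everything they own in exchange. Problems of life and death matter more than problems of cleanliness, so doctors are paid better than sanitation workers.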

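      This examines an important topic that the vast majority of people cannot avoid: career. How can we grasp the essence of a career and understand it at the level of its underlying logic, so that we can deeply understand it, recognize it, master it, and apply it? A career is a mode of survival, one whose aim is to solve one's own problem of survival. But this mode requires connection: you must link your abilities and goals to other people in order to form a stable relationship. For example, a doctor maps their skills onto patients' needs in a one-to-many relationship, while an entrepreneur connects their combined abilities to the needs of a much wider public; the more successful the entrepreneur, the broader the audience they serve, for example Cook, Musk, and Jack Ma. Conclusion: a career is a mode of survival in which you connect yourself with others and thereby solve your own problem of survival. The value of a career and the strength of one's ability are measured by the number and range of people one's work serves and connects with, and by the strength of those connections.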

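    2. When it comes to making money, “beliefs,” “willingness,” and “ability” are three different things. They are closely related, but they must not be conflated. Different beliefs lead to vastly different degrees of willingness, and different degrees of willingness lead to very different abilities being developed. Many people are rich simply because they are greedy enough, and at the same time foolish enough to deceive themselves perfectly, identifying completely with the mainstream ideology without a trace of self-doubt. An aggressive appetite for risk combined with ordinary intelligence can, in a particular historical period, earn a fortune; this is where the so-called “dumb rich” come from. I call this “fool's money”: not money made by cheating fools, but money made through “unexamined beliefs and willingness” by a person who, in the sense of loving wisdom, is simply a “fool.”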

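      What exactly are the differences and connections among beliefs, willingness, and ability? What role does each play in achieving the core goal of making money? And in pursuing that goal, are there underlying logics or principles that apply universally, or relatively stable models and rules that apply across the board? Are there concrete measures or cases that bear this out? Answering these questions would make the matter of making money look more transparent and closer to its essence.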

    1. Cognitive artefacts may be seen in terms of functioning in a similar fashion to the equivalent human cognitive process. This is the basis for seeing computer reasoning as a model of the human mind

      Producers of these programs had to adjust their approach when introducing AI to society: instead of a cold, robotic, non-existent relationship, they created a false impression of a real "entity" with reason and logic, when in reality it is a program with instructions and training.

      In the Gemini vs. ChatGPT exercise, I asked the programs for their opinions on my topic and whether they thought the hypothesis was correct according to the findings, as if they had thoughts of their own, to see what both programs would say.

    2. This introduces an essentially asymmetric relationship between human agent and thing rather than the broadly symmetric interaction implicit in the parity principle. In some respects, this might appear to be akin to the distinction between 'primary agency' and 'secondary agency' (for example, Gell 1998, 21) in which, unlike humans, things do not have agency in themselves but have agency given or ascribed to them. However, the increasing assignment of intelligence in digital devices that enables them to act independent of human agents could suggest that some digital cognitive artefacts possess primary agency as they autonomously act on others – both human and non-human/inanimate things. Arguably this agency is still in some senses secondary in that it is ultimately provided via the human programmer even if this is subsequently subsumed within a neural network generated by the thing itself, for example. This is not the place to develop the discussion of thing agency further (for example, see the debate between Lindstrøm (2015), Olsen and Witmore (2015), and Sørensen (2016)); however, the least controversial position to adopt here is to propose that for the most part the agency of digital cognitive artefacts employed by archaeologists complements rather than duplicates through extending and supporting archaeological cognition. They do this, for example, through providing the capability of seeing beneath the ground or characterising the chemical constituents of objects, neither of which are specifically human abilities. So there is considerable scope for considering the nature of the relationship between ourselves as archaeologists and our cognitive artefacts – how do we interact and in what ways is archaeological cognition extended or complemented by these artefacts?

      While controversial, adding intelligence to digital cognitive artifacts so that they operate independently of humans remains a completely necessary step in advancement. Some tasks would simply require too much time, or would actually be impossible for humans to complete, if the process relied on human intelligence alone. The ability of an artifact to work on its own allows for an incredible increase in efficiency, turning things that were deemed impossible 20 years ago into reality.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      We thank the reviewers for their positive assessments overall and for many helpful suggestions for clarification to make the manuscript more accessible to a broader audience. We made minor text changes and added more labels to the figures to address these comments.

      Referee #1

      Summary: In this study, the authors show a genetic interaction of the lipid receptors Lpr-1, Lpr-3 and Scav-2 in C. elegans. They show that Lpr-1 loss-of-function specifically affects aECM localization of Lpr-3 and attribute the lethality of Lpr-1 mutants to this phenotype. The authors performed a mutagenesis screen and identified a third lipid receptor, Scav-2, as a modulating factor: loss of scav-2 partially rescues the Lpr-1 phenotype. The authors created a variety of tools for this study, notably Crispr-Cas9-mediated knock-ins for endogenous tagging of the receptors.

      Major comments:

      1. while the authors provide a nice diagram showing the potential roles and interplay of lpr-1, lpr-3 and scav-2, it remains unclear what their respective cargo is. The nature of interaction between the proteins remains unclear from the data.

      Response

      • We agree that identifying the relevant cargo(s) will be key to understanding the detailed mechanisms involved and that the lack of such information is a limitation of our study. However, the impact of our study is to show that these lipid transporters functionally interact to affect aECM organization, a role that could be relevant to many systems, including humans.

      As an optional (since time-consuming) experiment I would suggest trying more tissue-specific lipidomics.

      Response

      • This would be an interesting future experiment but is outside our current technical capabilities.

      The lipidomics data should be presented in the figures, even if there were no significant changes. Importantly, show the lipid abundance at least of total lipids, better of individual classes, normalized to the material input (e.g. number of embryos, protein).

      Response

      • The reviewer is right to point out that lipid variations could occur at different levels, and that we should exercise caution. However, the unsupervised lipidomics analysis would have detected not only individual lipid variations, but also variations in the total or subgroup lipid content. Indeed, the eggs were weighed prior to extraction and each sample was extracted with the same precise volume of solvent before analysis. Furthermore, the LC-MS/MS injection sequence included blanks and quality control (QC) samples. The blanks were the extraction solvent, which allowed us to control for features unrelated to the biological samples. The QC sample was a mixture of all the samples included in the injection sequence, reflecting the central values of the model. If a subclass of samples, such as the lpr-1 mutant, had been characterized by a decrease in one lipid, a subgroup of lipids, or all lipids, it would have clustered separately. Instead, our PCA showed that the variation between samples of the same genotype (wild type, lpr-1 mutant, or lpr-1; scav-2) was similar to the variation between samples from two different genotypes. This means that we did not detect modifications to lipid quantity specifically or in total. A figure illustrating the lipid contents would show no difference between groups.

      Figure 1g: I do not understand what the lpr3:gfp signal is: the punctae in the overview image? And where are they in the zoom image showing annuli and alae? Also, how were the annuli and alae structures labeled? Please provide more information.

      Response

      • All of the fluorescent signal shown in this figure panel corresponds to the indicated LPR fusion - no other labelling method was used. SfGFP::LPR-3 labels the matrix structures (alae and annuli) as well as some puncta – the ratio of matrix to puncta changes over developmental stages. We edited the figure legend to make this more clear.

      One point that is not sufficiently addressed is that the authors deduce from the inability of the scav-2 gfp knock-in to suppress lpr1 lethality that scav2 function is not impaired. This is quite indirect. Can the authors provide more convincing evidence that scav-2 ki has normal function?

      Response

      • Suppression of lpr-1 (or other aECM mutant) lethality is the only known phenotype caused by loss of scav-2. Therefore, this is the only phenotype for which we can do a rescue experiment to test functionality of the knock-in. The data presented do indicate that the knock-in fusion retains significant function.

      In general, the data is clearly presented and the statistical analyses look sound.

      Response

      • Thank you

      Minor comments:

      Please provide page and line numbers!

      Response:

      • done

      Avoid contractions like "don't" in both text and figure legends

      Response:

      • changed one instance of “don’t” to “do not”

      Page 12: I do not understand the meaning of the sentence "This transgene also caused more modest lethality in a wild-type background"

      Response:

      • Wording changed to “This transgene caused very little lethality in a wild-type background (Fig. 6C), indicating it is not generally toxic.”

      Figure 7: what is meant with "Dodt"?

      Response:

      • Dodt gradient contrast imaging is a method for transmitted light imaging similar to DIC and is used on some confocal microscopes. It is now explained in the Methods section. We removed the Dodt label from Figure 7 since it seems to be confusing and it is not really important whether the brightfield image is DIC or Dodt.

        Reviewer #1 (Significance (Required)):

        The study is experimentally sound and uses numerous novel tools, such as endogenously tagged lipid receptors. It is an interesting study for researchers in basic research studying lipid receptors and ECM biology. It provides insights on the genetic interaction of lipid receptors. My expertise is in lipid biochemistry, inter-organ lipid trafficking and imaging. I am not very familiar with C. elegans genetics.

      Referee #2

      1. The manuscript is very well written; the documentation is fine, but some more details are needed for better following the subject for readers not familiar with nematode anatomy.

      For instance, while alae are somehow explained, annuli are not - structures that look abnormal in lpr1 and lpr1-scav2 mutants (Fig. 5B).

      Response

      • Apologies for this oversight. We added annuli labels to Figure 1 and Figure 5 panels and added descriptions of annuli to the Figure 1 legend and the Results text.

      Moreover, the authors show in Fig. 1 the punctae etc. in the epidermis, whereas in Fig. 2 they show Lpr3 accumulation or not in the duct and the pore (lpr1). How do they localize in the cells of these structures at high magnification? It is also important to see the Lpr3 localisation in lpr1 mutants shown in Fig. 2A with the quality of the images shown in Fig. 1F. This applies also to Figs. 4 and 5.

      Responses:

      • The embryonic duct and pore cells are very small and we have not reliably seen puncta within them. In Figs 2 and 5, we supplemented the duct and pore images with those from the epidermis, which is a much larger tissue, allowing us to resolve puncta and matrix structures with better resolution.
      • The laser settings in Figs 2,4,5 (as opposed to Fig. 1) were chosen to avoid saturation of the matrix signal so that we could do accurate quantifications as shown. The images are unmodified with respect to brightness and therefore appear relatively dim – but we think they convey the observations very accurately.

      I would like to see punctae in lpr1-scav2 doubles.

      Response:

      • Puncta in this genotype are shown for the epidermis in Figure 5. It has not been possible to see puncta specifically within the embryonic duct and pore.

      Regarding the central mechanism, one possibility is - what the authors describe - that Lpr1 is needed for Lpr3 accumulation in ducts and tubes. Alternatively, Lpr1 is needed for duct and tube expansion, in lack of which Lpr3 is unable to reach its destination that is the lumina. Scav2, in this scenario, might be antagonist of tube and duct expansion, and thereby rescue the Lpr1 mutant phenotype independently. Admittedly, the non-accumulation of Lpr3 in scav2 mutants argues against a lpr1-independent function of scav2.

      Responses:

      • LPR-1 is indeed needed to maintain duct and pore tube integrity as the tubes grow, but in mutants the tubes appear to collapse at a later stage than we imaged here (Stone et al 2009). The ~normal accumulation of LET-4 and LET-653 further argues that the duct and pore tubes are still intact at the 1.5-to-2-fold stages. Therefore, we conclude that the defect in LPR-3 accumulation precedes duct and pore collapse.
      • The changes we document in the epidermis also show that the lpr-1 mutant affects LPR-3 accumulation in another (non-tube) tissue.

      In any case, to underline the aspect of Lpr1-Scav2 dosage relationship, the authors may also have a look at Lpr3 distribution in lpr1 heterozygous, and lpr1-scav2 double heterozygous worms. In this spirit, it would be interesting to see the semi-dominant effects of scav2 on Lpr3 localisation in lpr1 mutants by microscopy.

      Response:

      • Because of the hermaphroditism of C. elegans, it would be technically challenging to confidently identify heterozygous (vs. homozygous) embryos for confocal imaging. We do not think that the results would be informative enough to warrant the effort, given that we’ve already shown that scav-2 heterozygosity can partly suppress lpr-1. The expectation is that LPR-3 levels would be partially restored in the scav-2 het, but it might take a very large sample size to confidently assess that partial effect.

      One word to the overexpression studies: it is surprising that the amounts of Scav2 delivered by the expression through the grl-2 promoter in the lpr1, scav2 background are almost matching those by the opposite effect of scav2 mutations on lpr1 dysfunction.

      Response:

      • The reviewer refers to the transgenic rescue experiment with the grl-2pro::SCAV-2 transgene. Because the scav-2 mutant phenotype being tested is suppression of lpr-1 lethality, the expected result from scav-2 rescue is to restore the lpr-1 lethal phenotype to the strain. This is exactly the result we see. We have revised the text to more clearly explain the logic.

      One issue concerns the localization of scav2-gfp "rarely" in vesicles: what are these vesicles?

      Response

      • Only a handful of vesicles were seen across all the images we collected, and we have not yet identified them. They could be associated with either SCAV-2 delivery or removal from the plasma membrane, as now stated in the text. SCAV-2 trafficking would be an interesting area for further study but is beyond the scope of this paper.

      One comment to the Let653 transgenes/knock-ins: the localization of transgenic Let653-gfp may be normal in lpr1 mutants because there are wild-type copies in the background.

      Response

      • There are wild type copies of LET-653 in the background, but no wild type copies of LPR-1. Even if the untagged LET-653 would be recruiting the tagged LET-653 as the reviewer suggests, we can still conclude that lpr-1 loss does not prevent the untagged LET-653 (and thus also the tagged LET-653) from accumulating in the duct lumen matrix.

      One thought to the model: if Scav2 has a function in a lpr1 background, this means that yet another transporter X delivers the substrate for Scav2, isn't it?

      Response

      • Yes, we completely agree with this interpretation and have revised the discussion and Figure 8 legend to more explicitly make this point.

      A word on the term haploinsufficient that is used in this study: scav2 mutants would be haploinsufficient if the heterozygous worms died in an otherwise wild-type background.

      Response

      • We disagree with this comment. The term “haploinsufficient” simply means that heterozygosity for a deletion or other loss of function allele can cause a mutant phenotype – the term is not restricted to lethal phenotypes.

        Reviewer #2 (Significance (Required)):

        Alexandra C. Belfi and colleagues wrote the manuscript entitled "Opposing roles for lipocalins and a CD36 family scavenger receptor in apical extracellular matrix-dependent protection of narrow tube integrity" in which they report on their findings on the genetic and cell-biological interaction between the lipid transporters Lpr1 and scav2 in the nematode C. elegans. In principle, these two proteins are involved in shaping the apical extracellular matrix (aECM) of ducts by regulating the amounts of Lpr3 in the extracellular space. While Scav2 seems to act cell autonomously, Lpr1 has a non-cell autonomous effect on Lpr3.


      Referee #3

      Summary: Using a powerful combination of genetic and quantitative imaging approaches, Belfi et al. describe novel findings on the roles of several lipocalins (secreted lipid carrier proteins) in the production and organization of the apical extracellular matrix (aECM) required for small diameter tube formation in C. elegans. The work comprises a substantial extension of previous studies carried out by the Sundaram lab, which has pioneered studies into the roles of aECM and accessory proteins in creating the duct-pore excretion tube and which also plays a role in patterning of the epidermal cuticle. One core finding is that the lipocalin LPR-1 does not stably associate with the aECM but is instead required for the incorporation of another lipocalin, LPR-3. A second major finding is that reduction of function in SCAV-2, a SCARB family membrane lipid transporter, suppresses lpr-1 mutant lethality along with associated duct-pore defects and mislocalization of LPR-3. Likewise, loss of scav-2 partially suppresses defects in two other aECM proteins and restores defects in LPR-3 localization in one of them (let-653). Additional genetic and protein localization studies lead to the model that LPR-1 and SCAV-2 may antagonistically regulate one or more lipid or lipoprotein factors necessary for LPR-3 localization and duct-pore formation. A role for LPR-1 and LPR-3 at lysosomes is clearly implicated based on co-localization studies, although a specific role for lysosomes (or related organelles) is not defined. Finally, MS data suggest that neither LPR-1 nor SCAV-2 grossly affects lipid composition in embryos, consistent with dietary interventions failing to affect mutant phenotypes. Ultimately, a plausible schematic model is presented to explain much of the data.

      Major comments:

      1. The studies are very thorough, convincing, and generally well described. Conclusions are logical and well grounded. Additional experiments are not required to support the authors' major conclusions, and the data and methods are described in sufficient detail to allow replication. As such my comments are minor and should be addressable at the authors' discretion in writing.

      Response

      • Thank you for these positive comments

      Minor comments:

        2) In the abstract, "tissue-specific suppression" made me think that there was going to be a tissue-specific knockdown experiment, which was not the case. Rather scav-2 suppression is specific to the duct-pore, which corresponds to where scav-2 is expressed. Consider rewording this.

      Response

      • Wording was changed to “duct/pore-specific suppression”

        3) Page 5. Suggest wording change to, "Whereas LPR-3 incorporates stably into the precuticle, suggesting a structural role in matrix organization, LPR-1..."

      Response

      • Done

        4) LIMP-2 versus LIMP2. Both are used. Uniprot lists LIMP2, but some papers use LIMP-2. Choose one and be consistent.

      Response

      • Everything changed to LIMP2.

        5) Some of the data for S6 Fig wasn't referred to directly in the text. Namely results regarding pcyt-1 and pld-1. I'd suggest incorporating this into the results section possibly using, "As a control for our lipid supplementation experiments..."

      Response

      • These experiments are now described on page 11.

        6) Page 12 bottom. I understand the use of "oppose", but another way to put it is that SCAV-2 and LPR-1 (antagonistically or collectively) modulate aECM composition. Other terms that might confuse some readers are "upstream" and "downstream", although I am OK with their use in the context of this work.

      Response

      • The genetics indicate that lpr-1 and scav-2 have opposite effects on tube shaping and LPR-3 localization, so they do function antagonistically rather than collectively/cooperatively; we decided to keep this terminology.

        7) Page 16. I understand the logic that SCAV-2 is unlikely to directly modulate LPR-3 given its presumed molecular function. But is it possible that LPR-3 levels are already maxed out in the aECM so that loss of SCAV-2 doesn't lead to any increase? Conversely, one could argue that even if acting indirectly, SCAV-2 could have led to increased LPR-3 levels, unless they were already maxed.

      Response

      • This is a good point and the possibility is now mentioned in the Results page 9. We also changed our wording in the Abstract and Discussion to acknowledge the possibility that LPR-3 could be the SCAV-2 cargo, though we still don’t favor this model.

        8) Figure legend 1. I did not see an asterisk in figure 1B.

      Response

      • thanks for catching this error, text removed

        9) Figure 1C. Might want to define the "degree" term in the legend for people outside the field.

      Response

      • We added an explanation to the figure legend.

        10) Fig 1 G. I was just wondering if cuticle autofluorescence was an issue for taking these images.

      Response

      • Cuticle autofluorescence is generally quite dim in L4s with our settings, and it was not an issue at this mid/late L4 stage, which corresponds to when both LPR fusions are at their brightest. Note that both large panels are MAX projections and yet you can’t see any cuticle autofluorescence in the LPR-1 panel.

        11) Fig 2 and others. Please define error bars.

      Response

      • These correspond to the standard deviation; this information is now added to the Methods.

        12) Fig 5. From the images, it looks like lpr-1; scav-2 doubles might have a worse (pre)cuticle defect in LPR-3 localization than lpr-1 singles. If so that would be interesting and would suggest that their relationship with respect to the modulation of LPR-3 is context dependent. Admittedly, the lack of obvious scav-2 expression in the epidermis would not be consistent with an effect (positive or negative).

      Response

      • The lpr-1 scav-2 strain is certainly not improved over lpr-1 but we have not noted any consistent worsening of the phenotype either.

        13) Consider defining Dodt in the first figure legend where it appears.

      Response

      • Dodt gradient contrast imaging is a method of transmitted light imaging similar to DIC and is used on some confocal microscopes. It is now explained in the Methods section. We removed the term from Figure 7 since it seems to be confusing.

        14) For Mander's, is there a reason to report just one of the two findings (M1 or M2) versus both?

      Response

      • We now include the 2nd Manders value in the figure legend and note that value is much lower (0.25) because much of the red signal is lysosomes (where green would be quenched by acidity).

        15) Consider referring to specific panels (A, B...) within references to the supplemental files.

      Response

      • done

        16) Fig S6E. Neither "increasing nor increasing" to "increasing nor decreasing".

      Response

      • fixed

        **Referees cross-commenting**

        I thought that Reviewers 1 and 2 brought up some good points. My sense is that Belfi and colleagues can address most of these in writing, but are of course welcome to add new data as they see fit. I get that it's not a "perfect" paper where everything is explained fully or comes together, but I don't see that as a flaw that needs to be fixed. I think that the manuscript represents a good deal of work (as it is) and provides a sufficient advance while also suggesting an interesting link to disease. It will be up to individual journals to decide if the findings meet their criteria.

        Reviewer #3 (Significance (Required)):

        Significance: The work carried out in this paper, and more generally by the Sundaram lab, always has a ground-breaking element because very few labs in the field have studied in detail the developmental roles and regulation of the aECM, in large part because it can be challenging to dissect. The core findings in this study are rather novel and unexpected, namely the opposing roles of the paralogous LPR-1 and LPR-3 lipocalins and their functional interactions with SCAV-2. The study does stop short of finding specific molecules (lipid or lipoprotein) that would mediate the effects they report, and it wasn't yet clear how the lysosomal co-loc plays a role, but this is not a criticism of the work presented or the forward progress. I was particularly intrigued by the idea, presented in the discussion, that disruption of vascular aECM could potentially account for some of the (complex) observations regarding the role of lipocalins and SCARB proteins in human disease. This would represent a new avenue for researchers to consider and underscores the power of using non-biased approaches in model systems.

        As for all my reviews, this is signed by David Fay.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      Using a powerful combination of genetic and quantitative imaging approaches, Belfi et al. describe novel findings on the roles of several lipocalins (secreted lipid carrier proteins) in the production and organization of the apical extracellular matrix (aECM) required for small diameter tube formation in C. elegans. The work comprises a substantial extension of previous studies carried out by the Sundaram lab, which has pioneered studies into the roles of aECM and accessory proteins in creating the duct-pore excretion tube and which also plays a role in patterning of the epidermal cuticle. One core finding is that the lipocalin LPR-1 does not stably associate with the aECM but is instead required for the incorporation of another lipocalin, LPR-3. A second major finding is that reduction of function in SCAV-2, a SCARB family membrane lipid transporter, suppresses lpr-1 mutant lethality along with associated duct-pore defects and mislocalization of LPR-3. Likewise, loss of scav-2 partially suppresses defects in two other aECM proteins and restores defects in LPR-3 localization in one of them (let-653). Additional genetic and protein localization studies lead to the model that LPR-1 and SCAV-2 may antagonistically regulate one or more lipid or lipoprotein factors necessary for LPR-3 localization and duct-pore formation. A role for LPR-1 and LPR-3 at lysosomes is clearly implicated based on co-localization studies, although a specific role for lysosomes (or related organelles) is not defined. Finally, MS data suggest that neither LPR-1 nor SCAV-2 grossly affects lipid composition in embryos, consistent with dietary interventions failing to affect mutant phenotypes. Ultimately, a plausible schematic model is presented to explain much of the data.

      Major comments:

      The studies are very thorough, convincing, and generally well described. Conclusions are logical and well grounded. Additional experiments are not required to support the authors' major conclusions, and the data and methods are described in sufficient detail to allow replication. As such, my comments are minor and should be addressable at the authors' discretion in writing.

      Minor comments:

      1) In the abstract, "tissue-specific suppression" made me think that there was going to be a tissue-specific knockdown experiment, which was not the case. Rather scav-2 suppression is specific to the duct-pore, which corresponds to where scav-2 is expressed. Consider rewording this.

      2) Page 5. Suggest wording change to, "Whereas LPR-3 incorporates stably into the precuticle, suggesting a structural role in matrix organization, LPR-1..."

      3) LIMP-2 versus LIMP2. Both are used. Uniprot lists LIMP2, but some papers use LIMP-2. Choose one and be consistent.

      4) Some of the data for S6 Fig wasn't referred to directly in the text. Namely results regarding pcyt-1 and pld-1. I'd suggest incorporating this into the results section possibly using, "As a control for our lipid supplementation experiments..."

      5) Page 12 bottom. I understand the use of "oppose", but another way to put it is that SCAV-2 and LPR-1 (antagonistically or collectively) modulate aECM composition. Other terms that might confuse some readers are "upstream" and "downstream", although I am OK with their use in the context of this work.

      6) Page 16. I understand the logic that SCAV-2 is unlikely to directly modulate LPR-3 given its presumed molecular function. But is it possible that LPR-3 levels are already maxed out in the aECM so that loss of SCAV-2 doesn't lead to any increase? Conversely, one could argue that even if acting indirectly, SCAV-2 could have led to increased LPR-3 levels, unless they were already maxed.

      7) Figure legend 1. I did not see an asterisk in figure 1B.

      8) Figure 1C. Might want to define the "degree" term in the legend for people outside the field.

      9) Fig 1 G. I was just wondering if cuticle autofluorescence was an issue for taking these images.

      10) Fig 2 and others. Please define error bars.

      11) Fig 5. From the images, it looks like lpr-1; scav-2 doubles might have a worse (pre)cuticle defect in LPR-3 localization than lpr-1 singles. If so that would be interesting and would suggest that their relationship with respect to the modulation of LPR-3 is context dependent. Admittedly, the lack of obvious scav-2 expression in the epidermis would not be consistent with an effect (positive or negative).

      12) Consider defining Dodt in the first figure legend where it appears.

      13) For Mander's, is there a reason to report just one of the two findings (M1 or M2) versus both?

      14) Consider referring to specific panels (A, B...) within references to the supplemental files.

      15) Fig S6E. Neither "increasing nor increasing" to "increasing nor decreasing".

      As for all my reviews, this is signed by David Fay.

      Referees cross-commenting

      I thought that Reviewers 1 and 2 brought up some good points. My sense is that Belfi and colleagues can address most of these in writing, but are of course welcome to add new data as they see fit. I get that it's not a "perfect" paper where everything is explained fully or comes together, but I don't see that as a flaw that needs to be fixed. I think that the manuscript represents a good deal of work (as it is) and provides a sufficient advance while also suggesting an interesting link to disease. It will be up to individual journals to decide if the findings meet their criteria.

      Significance

      Significance:

      The work carried out in this paper, and more generally by the Sundaram lab, always has a ground-breaking element because very few labs in the field have studied in detail the developmental roles and regulation of the aECM, in large part because it can be challenging to dissect. The core findings in this study are rather novel and unexpected, namely the opposing roles of the paralogous LPR-1 and LPR-3 lipocalins and their functional interactions with SCAV-2. The study does stop short of finding specific molecules (lipid or lipoprotein) that would mediate the effects they report, and it wasn't yet clear how the lysosomal co-loc plays a role, but this is not a criticism of the work presented or the forward progress. I was particularly intrigued by the idea, presented in the discussion, that disruption of vascular aECM could potentially account for some of the (complex) observations regarding the role of lipocalins and SCARB proteins in human disease. This would represent a new avenue for researchers to consider and underscores the power of using non-biased approaches in model systems.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Summary: In this study, the authors show a genetic interaction of the lipid receptors Lpr-1, Lpr-3 and Scav-2 in C. elegans. They show that Lpr-1 loss-of-function specifically affects aECM localization of Lpr-3 and attribute the lethality of Lpr-1 mutants to this phenotype. The authors performed a mutagenesis screen and identified a third lipid receptor, Scav-2, as a modulating factor: loss of scav-2 partially rescues the Lpr-1 phenotype. The authors created a variety of tools for this study, notably CRISPR-Cas9-mediated knock-ins for endogenous tagging of the receptors.

      Major comments: While the authors provide a nice diagram showing the potential roles and interplay of lpr-1, lpr-3 and scav-2, it remains unclear what their respective cargo is. The nature of the interaction between the proteins remains unclear from the data. As an optional (since time-consuming) experiment, I would suggest trying more tissue-specific lipidomics. The lipidomics data should be presented in the figures, even if there were no significant changes. Importantly, show the lipid abundance at least of total lipids, better of individual classes, normalized to the material input (e.g. number of embryos, protein). Figure 1g: I do not understand what the lpr3:gfp signal is: the puncta in the overview image? And where are they in the zoom image showing annuli and alae? Also, how were the annuli and alae structures labeled? Please provide more information. One point that is not sufficiently addressed is that the authors deduce from the inability of the scav-2 gfp knock-in to suppress lpr1 lethality that scav2 function is not impaired. This is quite indirect. Can the authors provide more convincing evidence that the scav-2 knock-in has normal function? In general, the data are clearly presented and the statistical analyses look sound.

      Minor comments: Please provide page and line numbers! Avoid contractions like "don't" in both text and figure legends. Page 12: I do not understand the meaning of the sentence "This transgene also caused more modest lethality in a wild-type background". Figure 7: What is meant by "Dodt"?

      Significance

      The study is experimentally sound and uses numerous novel tools, such as endogenously tagged lipid receptors. It is an interesting study for researchers in basic research studying lipid receptors and ECM biology. It provides insights on the genetic interaction of lipid receptors.

      My expertise is in lipid biochemistry, inter-organ lipid trafficking and imaging. I am not very familiar with C. elegans genetics.

  2. bafybeid7gjtxre33jbpmnzs6avjyvludaufoeubczurkrkrncisabad7w4.ipfs.localhost:8080 bafybeid7gjtxre33jbpmnzs6avjyvludaufoeubczurkrkrncisabad7w4.ipfs.localhost:8080
    1. So, the ocean acidification planetary boundary relates to the saturation state of aragonite in the surface waters. The aragonite saturation state refers to the concentration of dissolved carbonate ions in relation to the solubility of aragonite. It is referred to by the symbol $\Omega_{\mathrm{arag}}$ (where $\Omega$ is the Greek letter capital omega). It is calculated using the formula: $\Omega_{\mathrm{arag}} = \dfrac{[\mathrm{Ca}^{2+}]\,[\mathrm{CO}_3^{2-}]}{K'_{sp}}$

      The ocean acidification planetary boundary relates to the saturation state of aragonite in the surface waters. The aragonite saturation state refers to the concentration of dissolved carbonate ions in relation to the solubility of aragonite, and is referred to by the horseshoe-shaped omega symbol with the subscript "arag".

    2. where $[\mathrm{Ca}^{2+}]$ and $[\mathrm{CO}_3^{2-}]$ are the concentration of their respective ions and $K'_{sp}$ is the ‘apparent solubility product’ – the equilibrium constant for the dissolution of the compound, in this case aragonite. The important thing to take away from this is that, other things being equal, the saturation state is dependent on the concentration of calcium and carbonate ions, which, as you learned in Study session 1.3.1, vary with changing CO2 concentration and pH. Furthermore, $K'_{sp}$ increases with temperature, so in warmer seas (as expected with climate change), if calcium and carbonate ion concentrations stay the same, $\Omega_{\mathrm{arag}}$ would decrease. Overall, however, changes in ion concentrations are expected to be the main influence on $\Omega_{\mathrm{arag}}$ as our climate changes.

      The saturation state is calculated as the product of the ion concentrations divided by the apparent solubility product, which is the equilibrium constant for the dissolution of the compound (here, aragonite). Other things being equal, the saturation state depends on the concentrations of calcium and carbonate ions, which vary with changing CO2 concentration and pH. Furthermore, K'sp increases with temperature, so in warmer seas, if calcium and carbonate ion concentrations stay the same, the aragonite saturation state will decrease. Overall, changes in ion concentrations are expected to be the main influence.
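
The relationship described above (saturation state equals the product of the calcium and carbonate ion concentrations divided by the apparent solubility product) can be sketched in a few lines of Python. This is only an illustration: the function name `saturation_state` and the round-number concentrations are assumptions chosen to make the arithmetic easy to follow, not measurements taken from the study material.

```python
def saturation_state(ca: float, co3: float, ksp_prime: float) -> float:
    """Return Omega = [Ca2+] * [CO3 2-] / K'sp.

    Concentrations are in mol/kg and K'sp is in mol^2/kg^2,
    so the result is dimensionless.
    """
    if ksp_prime <= 0:
        raise ValueError("K'sp must be positive")
    return ca * co3 / ksp_prime


# Illustrative round numbers only (assumed, not taken from the source).
CA = 0.010    # calcium ion concentration, mol/kg
CO3 = 2.0e-4  # carbonate ion concentration, mol/kg
KSP = 5.0e-7  # apparent solubility product, mol^2/kg^2

omega = saturation_state(CA, CO3, KSP)
# Halving the carbonate concentration halves the saturation state.
omega_low_co3 = saturation_state(CA, CO3 / 2, KSP)
# A larger K'sp with unchanged ion concentrations lowers the saturation state.
omega_high_ksp = saturation_state(CA, CO3, KSP * 1.25)

print(f"baseline:        {omega:.2f}")
print(f"half carbonate:  {omega_low_co3:.2f}")
print(f"K'sp up by 25%:  {omega_high_ksp:.2f}")
```

A value above 1 means the water is supersaturated with respect to the mineral, and a value below 1 means it is undersaturated, so dissolution is favoured.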

    3. The crystal structure of the two minerals differs. Calcite forms blocky crystals while aragonite forms needle-like crystals. Calcite is the more stable form of CaCO3 in most conditions and is by far the most abundant form in rocks. It is the major component of most limestone. However, the presence of magnesium ions in solution in seawater alongside calcium ions favours the formation of aragonite. Although many organisms can form both calcite and aragonite in their shells and exoskeletons, going against the energetically favoured form in any environment requires greater energy expenditure by the organism. Marine conditions in Earth’s oceans have favoured organisms that use aragonite predominantly over calcite in their hard structures. This is important in understanding the effects of ocean acidification because aragonite is less stable and more prone to dissolution than calcite. Over geological time and under certain conditions, aragonite can convert to (or dissolve and re-precipitate as) calcite, which is one reason why limestone rocks, made from the bodies of marine organisms, predominantly contain calcite.

      The crystal structures of the two minerals differ: calcite forms blocky crystals whilst aragonite forms needle-like crystals. Calcite is more stable and the most abundant form in rocks; it is a major component of limestone. The presence of magnesium ions in solution in seawater alongside calcium ions favours the formation of aragonite. Many organisms can form either calcite or aragonite, but going against the energetically favoured form requires greater energy expenditure, so it isn't common. Marine conditions favour organisms that use aragonite, which is less stable and more prone to dissolution than calcite. Aragonite can convert to calcite, which is one reason why limestone rocks made from the bodies of marine organisms contain calcite.

    1. Reviewer #2 (Public review):

      Summary:

      Whole-brain network modeling is a common type of dynamical systems-based method to create individualized models of brain activity incorporating subject-specific structural connectome inferred from diffusion imaging data. This type of model has often been used to infer biophysical parameters of the individual brain that cannot be directly measured using neuroimaging but may be relevant to specific cognitive functions or diseases. Here, Ziaeemehr et al introduce a new toolkit, named "Virtual Brain Inference" (VBI), offering a new computational approach for estimating these parameters using Bayesian inference powered by artificial neural networks. The basic idea is to use simulated data, given known parameters, to train artificial neural networks to solve the inverse problem, namely, to infer the posterior distribution over the parameter space given data-derived features. The authors have demonstrated the utility of the toolkit using simulated data from several commonly used whole-brain network models in case studies.

      Strength:

      Model inversion is an important problem in whole-brain network modeling. The toolkit presents a significant methodological step up from common practices, with the potential to broadly impact how the community infers model parameters.

      Notably, the method allows the estimation of the posterior distribution of parameters instead of a point estimation, which provides information about the uncertainty of the estimation, which is generally lacking in existing methods.

      The case studies were able to demonstrate the detection of degeneracy in the parameters, which is important. Degeneracy is quite common in this type of model. If not handled mindfully, it may lead to spurious or unstable parameter estimation. Thus, the toolkit can potentially be used to improve feature selection or to simply indicate the uncertainty.

      In principle, the posterior distribution can be directly computed given new data without doing any additional simulation, which could improve the efficiency of parameter inference once the artificial neural network is well-trained.

      Weaknesses:

      The z-scores used to measure prediction error are generally between 1 and 3, which seems quite large to me. It would give readers a better sense of the utility of the method if comparisons to simpler methods, such as k-nearest neighbor methods, were provided in terms of accuracy.

      A lot of simulations are required to train the posterior estimator, which is computationally more expensive than existing approaches. Inferring from Figure S1, at the required orders of magnitude of the number of simulations, the simulation time could range from days to years, depending on the hardware. The payoff is that once the estimator is well-trained, the parameter inversion will be very fast given new data. However, it is not clear to me how often such use cases would be encountered. It would be very helpful if the authors could provide a few more concrete examples of using trained models for hypothesis testing, e.g., in various disease conditions.
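
To make the suggested baseline concrete, here is a minimal sketch of a k-nearest-neighbor approximation to the posterior, built from a table of simulated parameter and feature pairs. Everything in it is an assumption for illustration: the one-parameter toy `simulate` function, the uniform prior, and the helper `knn_posterior_samples` are hypothetical and are not part of the VBI toolkit or of any whole-brain model discussed in the manuscript.

```python
import numpy as np


def simulate(theta, rng):
    """Toy simulator (hypothetical): one feature equal to theta plus Gaussian noise."""
    return theta + rng.normal(0.0, 0.1, size=np.shape(theta))


def knn_posterior_samples(theta_sim, x_sim, x_obs, k=50):
    """Return the k simulated parameters whose features lie closest to x_obs."""
    distances = np.abs(x_sim - x_obs)
    nearest = np.argsort(distances)[:k]
    return theta_sim[nearest]


rng = np.random.default_rng(0)

# Build the simulation table once: prior draws and their simulated features.
theta_sim = rng.uniform(0.0, 10.0, size=5000)
x_sim = simulate(theta_sim, rng)

# Reuse the same table for any new observation without re-simulating.
posterior = knn_posterior_samples(theta_sim, x_sim, x_obs=5.0, k=50)
print("posterior mean:", posterior.mean())
print("posterior sd:  ", posterior.std())
```

Like the amortized neural estimator, this baseline pays the simulation cost up front and then answers new queries from the stored table. Unlike a trained density estimator, it scales poorly as the number of features grows, which is one reason a direct accuracy comparison would be informative.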

    2. Author response:

      The following is the authors’ response to the previous reviews

      Reviewer #1 (Public review):

      This work provides a new Python toolkit for combining generative modeling of neural dynamics and inversion methods to infer likely model parameters that explain empirical neuroimaging data. The authors provided tests to show the toolkit's broad applicability, accuracy, and robustness; hence, it will be very useful for people interested in using computational approaches to better understand the brain.

      Strengths:

      The work's primary strength is the tool's integrative nature, which seamlessly combines forward modelling with backward inference. This is important as available tools in the literature can only do one and not the other, which limits their accessibility to neuroscientists with limited computational expertise. Another strength of the paper is the demonstration of how the tool can be applied to a broad range of computational models popularly used in the field to interrogate diverse neuroimaging data, ensuring that the methodology is not optimal to only one model. Moreover, through extensive in-silico testing, the work provided evidence that the tool can accurately infer ground-truth parameters even in the presence of noise, which is important to ensure results from future hypothesis testing are meaningful.

      We appreciate the positive feedback on our open-source tool that delivers rapid forward simulations and flexible Bayesian model inversion for a broad range of whole-brain models, with extensive in-silico validation, including scenarios with dynamical/additive noise.

      Weaknesses

      The paper still lacks appropriate quantitative benchmarking relative to non-Bayesian-based inference tools, especially with respect to performance accuracy and computational complexity and efficiency. Without this benchmarking, it is difficult to fully comprehend the power of the software or its ability to be extended to contexts beyond large-scale computational brain modelling.

      Non-Bayesian inference methods were beyond the scope of this study, as we focused on full posterior estimation to enable uncertainty quantification and detection of degeneracy. Their advantages and disadvantages are briefly discussed in the Introduction and Discussion sections.

      Reviewer #2 (Public review):

      Whole-brain network modeling is a common type of dynamical systems-based method to create individualized models of brain activity incorporating subject-specific structural connectome inferred from diffusion imaging data. This type of model has often been used to infer biophysical parameters of the individual brain that cannot be directly measured using neuroimaging but may be relevant to specific cognitive functions or diseases. Here, Ziaeemehr et al introduce a new toolkit, named "Virtual Brain Inference" (VBI), offering a new computational approach for estimating these parameters using Bayesian inference powered by artificial neural networks. The basic idea is to use simulated data, given known parameters, to train artificial neural networks to solve the inverse problem, namely, to infer the posterior distribution over the parameter space given data-derived features. The authors have demonstrated the utility of the toolkit using simulated data from several commonly used whole-brain network models in case studies.

      Strength:

      Model inversion is an important problem in whole-brain network modeling. The toolkit presents a significant methodological step up from common practices, with the potential to broadly impact how the community infers model parameters.

      Notably, the method allows the estimation of the posterior distribution of parameters instead of a point estimation, which provides information about the uncertainty of the estimation, which is generally lacking in existing methods.

      The case studies were able to demonstrate the detection of degeneracy in the parameters, which is important. Degeneracy is quite common in this type of model. If not handled mindfully, it may lead to spurious or unstable parameter estimation. Thus, the toolkit can potentially be used to improve feature selection or to simply indicate the uncertainty.

      In principle, the posterior distribution can be directly computed given new data without doing any additional simulation, which could improve the efficiency of parameter inference once the artificial neural network is well-trained.

      We thank the reviewer for the careful consideration of important aspects of the VBI tool, such as uncertainty quantification rather than point estimation, degeneracy detection, feature selection, parallelization, and amortization strategy.

      Weaknesses:

      The z-scores used to measure prediction error are generally between 1 and 3, which seems quite large to me. It would give readers a better sense of the utility of the method if comparisons to simpler methods, such as k-nearest neighbor methods, were provided in terms of accuracy.

      A lot of simulations are required to train the posterior estimator, which is computationally more expensive than existing approaches. Inferring from Figure S1, at the required orders of magnitude of the number of simulations, the simulation time could range from days to years, depending on the hardware. The payoff is that once the estimator is well-trained, the parameter inversion will be very fast given new data. However, it is not clear to me how often such use cases would be encountered. It would be very helpful if the authors could provide a few more concrete examples of using trained models for hypothesis testing, e.g., in various disease conditions.

      We agree with the reviewer that for some parameters the z-score is large, which could be due to the limited number of simulations, the informativeness of the data features, or non-identifiability, and we do address these possible limitations in the Discussion. In line with our previous study, we stick to Bayesian metrics such as posterior z-scores and shrinkage. The application of an amortized strategy needs to be demonstrated in future work, for example in anonymized personalization of virtual brain twins (Baldy et al., 2025).

      Ref: Baldy N, Woodman MM, Jirsa VK. Amortizing personalization in virtual brain twins. arXiv preprint arXiv:2506.21155.
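
For readers unfamiliar with the two Bayesian metrics mentioned in this response, the sketch below uses one common definition of each: the posterior z-score as the distance between the posterior mean and the ground-truth value in units of posterior standard deviation, and the posterior shrinkage as one minus the ratio of posterior to prior variance. These definitions are assumed here, since the response does not spell out the formulas, and the function names and sample values are hypothetical.

```python
import numpy as np


def posterior_zscore(posterior_samples, theta_true):
    """Distance of the posterior mean from the true value, in posterior SD units."""
    mean = np.mean(posterior_samples)
    sd = np.std(posterior_samples)
    return abs(mean - theta_true) / sd


def posterior_shrinkage(posterior_samples, prior_variance):
    """1 - var_posterior / var_prior; values near 1 mean the data were informative."""
    return 1.0 - np.var(posterior_samples) / prior_variance


# Hypothetical posterior samples for a single parameter.
samples = np.array([1.8, 2.0, 2.2, 2.4, 2.6])

# Assumed uniform prior on [0, 6], whose variance is (6 - 0)^2 / 12 = 3.
prior_variance = (6.0 - 0.0) ** 2 / 12.0

z = posterior_zscore(samples, theta_true=2.0)
s = posterior_shrinkage(samples, prior_variance)
print(f"z-score:   {z:.3f}")
print(f"shrinkage: {s:.3f}")
```

Under these definitions, a well-behaved estimate combines a small z-score with shrinkage close to 1. A large z-score alongside high shrinkage indicates a posterior that is concentrated away from the true value.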

      Reviewer #1 (Recommendations for the authors):

      (1) The authors want to keep the term "spatio-temporal" data features to make it consistent with the language they use in their code, even though they only refer to statistical and temporal features of the time series. I stand by my previous comment that this is misleading and should be avoided as much as possible because it doesn't take into account the actual spatial characteristics of the data. At the very least, the authors should recognize this in the text.

      We have now recognized this point.

      (2) There are still some things that need further clarification and/or explanation:

      (a) It remains unclear why PCA needs to be applied to the FC/FCD matrices. It was also unclear how many PCs were kept as data features.

      We aim to use as many features as possible as a battery of metrics to reduce the number of simulations. The role of each feature can be investigated in future studies.  For instance, PCA is used in the LEiDA approach (Cabral et al., 2017) to enhance robustness to high-frequency noise, thereby overcoming a limitation common to all quasi-instantaneous measures of FC. In this work, the default setting was two PCA components. 

      Ref:  Cabral J, Vidaurre D, Marques P, Magalhães R, Silva Moreira P, Miguel Soares J, Deco G, Sousa N, Kringelbach ML. Cognitive performance in healthy older adults relates to spontaneous switching between states of functional connectivity during rest. Scientific reports. 2017 Jul 11;7(1):5135.

      (b) It was also unclear which features were used for each model. This is important for reproducibility and to make the users of the software aware of which features are most likely to work best for each model.

      We have done our best to indicate the class of features used in each case. This is illustrated more clearly in the notebook examples provided in the repository.

      Reviewer #2 (Recommendations for the authors):

      Thanks for responding to my suggestions. Here is only one remaining point:

      Section 2.1: Please mention the atlas used to parcellate the brain; without this information, readers won't know what area 88 is in Figure 1, for example. 

      We have now mentioned this point. In this study we used the AAL atlas.

  3. inst-fs-iad-prod.inscloudgate.net inst-fs-iad-prod.inscloudgate.net
    1. "Why are all the Black kids sitting together in the cafeteria?" WALK INTO ANY RACIALLY MIXED HIGH SCHOOL CAFETERIA AT LUNCHTIME and you will instantly notice that in the sea of adolescent faces, there is an identifiable group of Black students sitting together. Conversely, it could be pointed out that there are many groups of White students sitting together as well, though people rarely comment about that. The question on the tip of everyone's tongue is, "Why are the Black kids sitting together?"

      I have noticed this and it’s strange how far back this goes. I told my dad about this when I was younger and he told me about how it was even worse when he was in high school because there were often fights between the students over race, with people often getting fatally hurt. I think at some point it became a thing of sitting with people who have faced the same hardships that you have.

    2. WALK INTO ANY RACIALLY MIXED HIGH SCHOOL CAFETERIA AT LUNCHTIME and you will instantly notice that in the sea of adolescent faces, there is an identifiable group of Black students sitting together. Conversely, it could be pointed out that there are many groups of White students sitting together as well, though people rarely comment about that.

      This is true for a lot of minority students who move away from home for the first time to pursue higher education. I think it is a subconscious action, done to make one feel more welcome and at home. Minority students like myself are likely to gravitate towards others who remind us of home because it makes up for the fact that we miss our homes. Although this causes a divide, it also forms relationships between people who otherwise would feel like imposters.

    3. WALK INTO ANY RACIALLY MIXED HIGH SCHOOL CAFETERIA AT LUNCHTIME and you will instantly notice that in the sea of adolescent faces, there is an identifiable group of Black students sitting together. Conversely, it could be pointed out that there are many groups of White students sitting together as well, though people rarely comment about that. The question on the tip of everyone's tongue is, "Why are the Black kids sitting together?" Principals want to know, teachers want to know, White students want to know, the Black students who aren't sitting at the table want to know.

      This passage is not merely describing a dining scene; rather, it uses a mundane detail to bring up a deeper social issue: race.

    1. Oxyacids are compounds of the general formula $\mathrm{H}_n\mathrm{XO}_m$, where X is a nonmetal and the acidic hydrogens are attached to oxygen

      Reminder to Self: put examples to the definition for better understanding

    1. I strongly endorse the main theme of most of the reviews, which is that the progression and underlying justifications for this article’s arguments need a great deal of work. In my view, this article’s main contribution seems to be the evaluation of the three peer review models against the functions of scientific communication. I say ‘seems to be’ because the article is not very clear on that and I hope you will consider clarifying what your manuscript seeks to add to the existing work in this field.

      In any case, if that assessment of the three models is your main contribution, that part is somewhat underdeveloped. Moreover, I never got the sense that there is clear agreement in the literature about what the tenets of scientific communication are. Note that scientific communication is a field in its own right.

      I also agree that the paper is too strongly worded at times, with limitations and assumptions in the analysis minimised or not stated. For example, all of the typologies and categories drawn could easily be reorganised and there is a high degree of subjectivity in this entire exercise. Subjective choices should be highlighted and made salient for the reader.

      Note that greater clarity, rigour, and humility may also help with any alleged or actual bias.

      Some more minor points are:

      1. I agree with Reviewer 3 that the ‘we’ perspective is distracting.

      2. The paragraph starting with ‘Nevertheless’ on page 2 is very long.

      3. There are many points where language could be shortened for readability, for example:

        • Page 3: ‘decision on publication’ could be ‘publication decision’.

        • Page 5: ‘efficiency of its utilization’ could be ‘its efficiency’.

        • Page 7: ‘It should be noted…’ could be ‘Note that…’.

      4. Page 7: ‘It should be noted that..’ – this needs a reference.

      5. I’m not sure that registered reports reflect a hypothetico-deductive approach (page 6). For instance, systematic reviews (even non-quantitative ones) are often published as registered reports and Cochrane has required this even before the move towards registered reports in quantitative psychology.

      6. I agree that modular publishing sits uneasily as its own chapter.

      7. Page 14: ‘The "Publish-Review-Curate" model is universal that we expect to be the future of scientific publishing. The transition will not happen today or tomorrow, but in the next 5-10 years, the number of projects such as eLife, F1000Research, Peer Community in, or MetaROR will rapidly increase’. This seems overly strong (an example of my larger critique and that of the reviewers).

    2. In "Evolution of Peer Review in Scientific Communication", Kochetkov provides a point-of-view discussion of the current state of play of peer review for scientific literature, focussing on the major models in contemporary use and recent innovations in reform. In particular, they present a typology of three main forms of peer review: traditional pre-publication review; registered reports; and post-publication review, their preferred model. The main contribution it could make would be to help consolidate typologies and terminologies, to consolidate major lines of argument and to present some useful visualisations of these. On the other hand, the overall discussion is not strongly original in character.

      The major strength of this article is that the discussion is well-informed by contemporary developments in peer-review reform. The typology presented is modest and, for that, readily comprehensible and intuitive. This is to some extent a weakness as well as a strength; a typology that is too straightforward may not be useful enough. As suggested at the end it might be worth considering how to complexify the typology at least at subordinate levels without sacrificing this strength. The diagrams of workflows are particularly clear.

      The primary weakness of this article is that it presents itself as an 'analysis' from which they 'conclude' certain results such as their typology, when this appears clearly to be an opinion piece. In my view, this results in a false claim of objectivity which detracts from what would otherwise be an interesting and informative, albeit subjective, discussion, and thus fails to discuss the limitations of this approach. A secondary weakness is that the discussion is not well structured and there are some imprecisions of expression that have the potential to confuse, at least at first.

      This primary weakness is manifested in several ways. The evidence and reasoning for claims made is patchy or absent. One instance of the former is the discussion of bias in peer review. There are a multitude of studies of such bias and indeed quite a few meta-analyses of these studies. A systematic search could have been done here but there is no attempt to discuss the totality of this literature. Instead, only a few specific studies are cited. Why are these ones chosen? We have no idea. To this extent I am not convinced that the references used here are the most appropriate. Instances of the latter are the claim that "The most well-known initiatives at the moment are ResearchEquals and Octopus" for which no evidence is provided, the claim that "we believe that journal-independent peer review is a special case of Model 3" for which no further argument is provided, and the claim that "the function of being the "supreme judge" in deciding what is "good" and "bad" science is taken on by peer review" for which neither is provided.

      A particular example of this weakness, which is perhaps of marginal importance to the overall paper but of strong interest to this reviewer is the rather odd engagement with history within the paper. It is titled "Evolution of Peer Review" but is really focussed on the contemporary state-of-play. Section 2 starts with a short history of peer review in scientific publishing, but that seems intended only to establish what is described as the 'traditional' model of peer review. Given that that short history had just shown how peer review had been continually changing in character over centuries - and indeed Kochetkov goes on to describe further changes - it is a little difficult to work out what 'traditional' might mean here; what was 'traditional' in 2010 was not the same as what was 'traditional' in 1970. It is not clear how seriously this history is being taken. Kochetkov has earlier written that "as early as the beginning of the 21st century, it was argued that the system of peer review is 'broken'" but of course criticisms - including fundamental criticisms - of peer review are much older than this. Overall, this use of history seems designed to privilege the experience of a particular moment in time, that coincides with the start of the metascience reform movement.

      Section 2 also demonstrates some of the second weakness described, a rather loose structure. Having moved from a discussion of the history of peer review to detail the first model, 'traditional' peer review, it then also goes on to describe the problems of this model. This part of the paper is one of the best - and best-evidenced. Given the importance of it to the main thrust of the discussion it should probably have been given more space as a Section all on its own.

      Another example is Section 4 on Modular Publishing, in which Kochetkov notes "Strictly speaking, modular publishing is primarily an innovative approach for the publishing workflow in general rather than specifically for peer review." Kochetkov says "This is why we have placed this innovation in a separate category" but if it is not an innovation in peer review, the bigger question is 'Why was it included in this article at all?'.

      One example of the imprecisions of language is as follows. The author also shifts between the terms 'scientific communication' and 'science communication' but, at least in many contexts familiar to this reviewer, these are not the same things, the former denoting science-internal dissemination of results through publication (which the author considers), conferences and the like (which the author specifically excludes) while the latter denotes the science-external public dissemination of scientific findings to non-technical audiences, which is entirely out of scope for this article.

      A final note is that Section 3, while an interesting discussion, seems largely derivative from a typology of Waltman, with the addition of a consideration of whether a reform is 'radical' or 'incremental', based on how 'disruptive' the reform is. Given that this is inherently a subjective decision, I wonder if it might not have been more informative to consider 'disruptiveness' on a scale and plot it accordingly. This would allow for some range to be imagined for each reform as well; surely reforms might be more or less disruptive depending on how they are implemented. Given that each reform is considered against each model, it is somewhat surprising that this is not presented in a tabular or graphical form.

      Beyond the specific suggestions in the preceding paragraphs, my suggestions to improve this article are as follows:

      1. Reconceptualize this as an opinion piece. Where systematic evidence can be drawn upon to make points, use that, but don't be afraid to just present a discussion from what is clearly a well-informed author.

      2. Reconsider the focus on history and 'evolution' if the point is about the current state of play and evaluation of reforms (much as I would always want to see more studies on the history and evolution of peer review).

      3. Consider ways in which the typology might be expanded, even if at subordinate level.

    3. The work ‘Evolution of Peer Review in Scientific Communication’ provides a concise and readable summary of the historical role of peer review in modern science. The paper categorises the peer review practices into three models: (1) traditional pre-publication peer review; (2) registered reports; (3) post-publication peer review. The author compares the three models and draws the conclusion that the “third model offers the best way to implement the main function of scientific communication”.

      I would contest this conclusion. In my eyes the three models serve different aims - with more or fewer drawbacks. For example, although Model 3 is less chance to insert bias to the readers, it also weakens the filtering function of the review system. Let’s just think about the dangers of machine-generated articles, paper-mills, p-hacked research reports and so on. Although the editors do some pre-screening for the submissions, in a world with only Model 3 peer review the literature could easily get loaded with even more ‘garbage’ than in a model where additional peers help the screening.

      Compared to registered reports other aspects can come to focus that Model 3 cannot cover. It’s the efficiency of researchers’ work. In the case of registered reports, Stage 1 review can still help researchers to modify or improve their research design or data collection method. Empirical work can be costly and time-consuming and post-publication review can only say that “you should have done it differently then it would make sense”.

      Finally, the author puts openness as a strength of Model 3. In my eyes, openness is a separate question. All models can work very openly and transparently in the right circumstances. This dimension is not an inherent part of the models.

      In conclusion, I would not make a verdict over the models, instead emphasise the different functions they can play in scientific communication.

      A minor comment: I found that a number of statements lack references in the Introduction. I would have found them useful for statements such as “There is a point of view that peer review is included in the implicit contract of the researcher.”

    4. In this manuscript, the author provides a historical review of the place of peer review in the scientific ecosystem, including a discussion of the so-called current crisis and a presentation of three important peer review models. I believe this is a non-comprehensive yet useful overview. My main contention is that the structure of the paper could be improved. More specifically, the author could expand on the different goals of peer review and discuss these goals earlier in the paper. This would allow readers to better interpret the different issues plaguing peer review and help put the costs and benefits of the three models into context. Other than that, I found some claims made in the paper a little too strong. Presenting some empirical evidence or downplaying these claims would improve the manuscript in my opinion. Below, you can find my comments:

      1. In my view, the biggest issue with the current peer review system is the low quality of reviews, but the manuscript only mentions this fleetingly. The current system facilitates publication bias, confirmation bias, and is generally very inconsistent. I think this is partly due to reviewers’ lack of accountability in such a closed peer review system, but I would be curious to hear the author’s ideas about this, more elaborately than they provide them as part of issue 2.

      2. I’m missing a section in the introduction on what the goals of peer review are or should be. You mention issues with peer review, and these are mostly fair, but their importance is only made salient if you link them to the goals of peer review. The author does mention some functions of peer review later in the paper, but I think it would be good to expand that discussion and move it to a place earlier in the manuscript.

      3. Table 1 is intuitive but some background on how the author arrived at these categorizations would be welcome. When is something incremental and when is something radical? Why are some innovations included but not others (e.g., collaborative peer review, see https://content.prereview.org/how-collaborative-peer-review-can-transform-scientific-research/)?

      4. “Training of reviewers through seminars and online courses is part of the strategies of many publishers. At the same time, we have not been able to find statistical data or research to assess the effectiveness of such training.” (p. 5) There is some literature on this, although not recent. See, for example, the work of Sara Schroter (Schroter et al., 2004; Schroter et al., 2008).

      5. “It should be noted that most initiatives aimed at improving the quality of peer review simultaneously increase the costs.” (p. 7)  This claim needs some support. Please explicate why this typically is the case and how it should impact our evaluations of these initiatives.

      6. I would rephrase “Idea of the study” in Figure 2 since the other models start with a tangible output (the manuscript). This is the same for registered reports where they submit a tangible report including hypotheses, study design, and analysis plan. In the same vein, I think study design in the rest of the figure might also not be the best phrasing. Maybe the author could use the terminology used by COS (Stage 1 manuscript, and Stage 2 manuscript, see Details & Workflow tab of https://www.cos.io/initiatives/registered-reports). Relatedly, “Author submits the first version of the manuscript” in the first box after the ‘Manuscript (report)’ node may be a confusing phrase because I think many researchers see the first version of the manuscript as the stage 1 report sent out for stage 1 review.

      7. One pathway that is not included in Figure 2 is that authors can decide to not conduct the study when improvements are required. Relatedly, in the publish-review-curate model, is revising the manuscripts based on the reviews not optional as well? Especially in the case of 3a, authors can hardly be forced to make changes even though the reviews are posted on the platform.

      8. I think the author should discuss the importance of ‘open identities’ more. This factor is now not explicitly included in any of the models, while it has been found to be one of the main characteristics of peer review systems (Ross-Hellauer, 2017). More generally, I was wondering why the author chose these three models and not others. What were the criteria for inclusion in the manuscript? Some information on the underlying process would be welcome, especially when claims like “However, we believe that journal-independent peer review is a special case of Model 3 (“Publish-Review-Curate”).” are made without substantiation.

      9. Maybe it helps to outline the goals of the paper a bit more clearly in the introduction. This helps the reader to know what to expect.

      10. The Modular Publishing section is not inherently related to peer review models, as you mention in the first sentence of that paragraph. As such, I think it would be best to omit this section entirely to maintain the flow of the paper. Alternatively, you could shortly discuss it in the discussion section but a separate paragraph seems too much from my point of view.

      11. Labeling model 3 as post-publication review might be confusing to some readers. I believe many researchers see post-publication review as researchers making comments on preprints, or submitting commentaries to journals. Those activities are substantially different from the publish-review-curate model so I think it is important to distinguish between these types.

      12. I do not think the conclusions drawn below Table 3 logically follow from the earlier text. For example, why are “all functions of scientific communication implemented most quickly and transparently in Model 3”? It could be that the entire process takes longer in Model 3 (e.g. because reviewers need more time), so that Model 1 and Model 2 lead to outputs quicker. The same holds for the following claim: “The additional costs arising from the independent assessment of information based on open reviews are more than compensated by the emerging opportunities for scientific pluralism.” What is the empirical evidence for this? While I personally do think that Model 3 improves on Model 1, emphatic statements like this require empirical evidence. Maybe the author could provide some suggestions on how we can attain this evidence. Model 2 does have some empirical evidence underpinning its validity (see Scheel, Schijen, & Lakens, 2021; Soderberg et al., 2021; Sarafoglou et al., 2022) but more meta-research inquiries into the effectiveness and cost-benefit ratio of registered reports would still be welcome in general.

      13. What is the underlying source for the claim that openness requires three conditions?

      14. “If we do not change our approach, science will either stagnate or transition into other forms of communication.” (p. 2) I don’t think this claim is supported sufficiently strongly. While I agree there are important problems in peer review, I think there would need to be a more in-depth and evidence-based analysis before claims like this can be made.

      15. On some occasions, the author uses “we” while the study is single authored.

      16. Figure 1: The top-left arrow from revision to (re-)submission is hidden.

      17. “The low level of peer review also contributes to the crisis of reproducibility in scientific research (Stoddart, 2016).” (p. 4)  I assume the author means the low quality of peer review.

      18. “Although this crisis is due to a multitude of factors, the peer review system bears a significant responsibility for it.” (p. 4) This is also a big claim that is not substantiated.

      19. “Software for automatic evaluation of scientific papers based on artificial intelligence (AI) has emerged relatively recently” (p. 5) The author could add RegCheck (https://regcheck.app/) here, even though it is still in development. This tool is especially salient in light of the finding that preregistration-paper checks are rarely done as part of reviews (see Syed, 2023).

      20. There is a typo in the last box of Figure 1 (“decicion” instead of “decision”). I also found typos in the second box of Figure 2, where “screns” should be “screens”, and in the author decision box, where “desicion” should be “decision”.

      21. Maybe it would be good to mention results blinded review in the first paragraph of 3.2. This is a form of peer review where the study is already carried out but reviewers are blinded to the results. See work by Locascio (2017), Grand et al. (2018), and Woznyj et al. (2018).

      22. Is “Not considered for peer review” in Figure 3b not the same as rejected? I feel that it is rejected in the sense that neither the manuscript nor the reviews will be posted on the platform.

      23. “In addition to the projects mentioned, there are other platforms, for example, PREreview12, which departs even more radically from the traditional review format due to the decentralized structure of work.” (p. 11)  For completeness, I think it would be helpful to add some more information here, for example why exactly decentralization is a radical departure from the traditional model.

      24. “However, anonymity is very conditional - there are still many “keys” left in the manuscript, by which one can determine, if not the identity of the author, then his country, research group, or affiliated organization.” (p.11)  I would opt for the neutral “their” here instead of “his”, especially given that this is a paragraph about equity and inclusion.

      25. “Thus, “closeness” is not a good way to address biases.” (p. 11)  This might be a straw man argument because I don’t believe researchers have argued that it is a good method to combat biases. If they did, it would be good to cite them here. Alternatively, the sentence could be omitted entirely.

      26. I would start the Modular Publishing section with the definition as that allows readers to interpret the other statements better.

      27. It would be helpful if the Models were labeled (instead of using Model 1, Model 2, and Model 3) so that readers don’t have to think back what each model involved.

      28. Table 2: “Decision making” for the editor’s role is quite broad. I recommend specifying what kind of decisions need to be made.

      29. Table 2: “Aim of review” – I believe the aim of peer review differs also within these models (see the “schools of thought” the author mentions earlier), so maybe a statement on what the review entails would be a better way to phrase this.

      30. Table 2: One could argue that the ‘object of the review’ in Registered Reports is also the manuscript as a whole, just in different stages. As such, I would phrase this differently.

      Good luck with any revision!

      Olmo van den Akker (ovdakker@gmail.com)

      References

      Grand, J. A., Rogelberg, S. G., Banks, G. C., Landis, R. S., & Tonidandel, S. (2018). From outcome to process focus: Fostering a more robust psychological science through registered reports and results-blind reviewing. Perspectives on Psychological Science, 13(4), 448-456.

      Ross-Hellauer, T. (2017). What is open peer review? A systematic review. F1000Research, 6.

      Sarafoglou, A., Kovacs, M., Bakos, B., Wagenmakers, E. J., & Aczel, B. (2022). A survey on how preregistration affects the research workflow: Better science but more work. Royal Society Open Science, 9(7), 211997.

      Scheel, A. M., Schijen, M. R., & Lakens, D. (2021). An excess of positive results: Comparing the standard psychology literature with registered reports. Advances in Methods and Practices in Psychological Science, 4(2), 25152459211007467.

      Schroter, S., Black, N., Evans, S., Carpenter, J., Godlee, F., & Smith, R. (2004). Effects of training on quality of peer review: randomised controlled trial. BMJ, 328(7441), 673.

      Schroter, S., Black, N., Evans, S., Godlee, F., Osorio, L., & Smith, R. (2008). What errors do peer reviewers detect, and does training improve their ability to detect them? Journal of the Royal Society of Medicine, 101(10), 507-514.

      Soderberg, C. K., Errington, T. M., Schiavone, S. R., Bottesini, J., Thorn, F. S., Vazire, S., ... & Nosek, B. A. (2021). Initial evidence of research quality of registered reports compared with the standard publishing model. Nature Human Behaviour, 5(8), 990-997.

      Syed, M. (2023). Some data indicating that editors and reviewers do not check preregistrations during the review process. PsyArXiv Preprints.

      Locascio, J. J. (2017). Results blind science publishing. Basic and Applied Social Psychology, 39(5), 239-246.

      Woznyj, H. M., Grenier, K., Ross, R., Banks, G. C., & Rogelberg, S. G. (2018). Results-blind review: A masked crusader for science. European Journal of Work and Organizational Psychology, 27(5), 561-576.

    5. Overall thoughts: This is an interesting history piece regarding peer review and the development of review over time. Given the author’s conflict of interest and association with the Centre developing MetaROR, I think that this paper might be a better fit for an information page or introduction to the journal and rationale for the creation of MetaROR, rather than being billed as an independent article. Alternatively, more thorough information about advantages to pre-publication review or more downsides/challenges to post-publication review might make the article seem less affiliated. I appreciate seeing the history and current efforts to change peer review, though I am not comfortable broadly encouraging use of these new approaches based on this article alone.

      Page 3: It’s hard to get a feel for the timeline given the dates that are described. We have peer review becoming standard after WWII (after 1945), definitively established by the second half of the century, an example of obligatory peer review starting in 1976, and in crisis by the end of the 20th century. I would consider adding examples that better support this timeline – did it become more common in specific journals before 1976? Was the crisis by the end of the 20th century something that happened over time or something that was already intrinsic to the institution? It doesn’t seem like enough time to get established and then enter crisis, but more details/examples could help make the timeline clear. 

      Consider discussing the benefits of the traditional model of peer review.

      Table 1 – Most of these are self-explanatory to me as a reader, but not all. I don’t know what a registered report refers to, and it stands to reason that not all of these innovations are familiar to all readers. You do go through each of these sections, but that’s not clear when I initially look at the table. Consider having a more informative caption. Additionally, the left column is “Course of changes” here but “Directions” in text. I’d pick one and go with it for consistency.

      3.2: Consider mentioning your conflict of interest here where MetaROR is mentioned.

      With some of these methods, there’s the ability to also submit to a regular journal. Going to a regular journal presumably would instigate a whole new round of review, which may or may not contradict the previous round of post-publication review and would increase the length of time to publication by going through both types. If someone has a goal to publish in a journal, what benefit would they get by going through the post-publication review first, given this extra time?

      There’s a section talking about institutional change (page 14). It mentions that openness requires three conditions – people taking responsibility for scientific communication, authors and reviewers, and infrastructure. I would consider adding some discussion of readers and evaluators. Readers have to be willing to accept these papers as reliable, trustworthy, and respectable to read and use the information in them. Evaluators such as tenure committees and potential employers would need to consider papers submitted through these approaches as evidence of scientific scholarship for the effort to be worthwhile for scientists.

      Based on this overview, which seems somewhat skewed towards the merits of these methods (conflict of interest, limited perspective on downsides to new methods/upsides to old methods), I am not quite ready to accept this effort as the equivalent of a regular journal and pre-publication peer review process. I look forward to learning more about the approach and seeing this review method in action and as it develops.

    6. Response to the Editors and the Reviewers

      I am sincerely grateful to the editors and peer reviewers at MetaROR for their detailed feedback and valuable comments and suggestions. I have addressed each point below.

      Handling editor

      1. “However, the article’s progression and arguments, along with what it seeks to contribute to the literature need refinement and clarification. The argument for PRC is under-developed due to a lack of clarity about what the article means by scientific communication. Clarity here might make the endorsement of PRC seem like less of a foregone conclusion.”

      The structure of the paper (and discussion) has changed significantly to address the feedback.

      2. “I strongly endorse the main theme of most of the reviews, which is that the progression and underlying justifications for this article’s arguments needs a great deal of work. In my view, this article’s main contribution seems to be the evaluation of the three peer review models against the functions of scientific communication. I say ‘seems to be’ because the article is not very clear on that and I hope you will consider clarifying what your manuscript seeks to add to the existing work in this field. In any case, if that assessment of the three models is your main contribution, that part is somewhat underdeveloped. Moreover, I never got the sense that there is clear agreement in the literature about what the tenets of scientific communication are. Note that scientific communication is a field in its own right.”

      I have implemented a more rigorous approach to argumentation in response. “Scientific communication” was replaced by “scholarly communication.”

      3. “I also agree that paper is too strongly worded at times, with limitations and assumptions in the analysis minimised or not stated. For example, all of the typologies and categories drawn could easily be reorganised and there is a high degree of subjectivity in this entire exercise. Subjective choices should be highlighted and made salient for the reader. Note that greater clarity, rigour, and humility may also help with any alleged or actual bias.”

      I have incorporated the conceptual framework and description of the research methodology. However, the Discussion section reflects my personal perspective at some points, which I have explicitly highlighted to ensure clarity.

      4. “I agree with Reviewer 3 that the ‘we’ perspective is distracting.”

      This has been fixed.

      5. “The paragraph starting with ‘Nevertheless’ on page 2 is very long.”

      The text was restructured.

      6. “There are many points where language could be shortened for readability, for example:

      Page 3: ‘decision on publication’ could be ‘publication decision’.

      Page 5: ‘efficiency of its utilization’ could be ‘its efficiency’.

      Page 7: ‘It should be noted…’ could be ‘Note that…’.”

      I have proofread the text.

      7. “Page 7: ‘It should be noted that..’ – this needs a reference.”

      This statement has been moved to the Discussion section, paraphrased, and a reference has been added:

      “It should be also noted that peer review innovations pull in opposing directions, with some aiming to increase efficiency and reduce costs, while others aim to promote rigor and increase costs (Kaltenbrunner et al., 2022).”

      8. “I’m not sure that registered reports reflect a hypothetico-deductive approach (page 6). For instance, systematic reviews (even non-quantitative ones) are often published as registered reports and Cochrane has required this even before the move towards registered reports in quantitative psychology.”

      I have added this clarification.

      9. “I agree that modular publishing sits uneasily as its own chapter.”

      Modular publishing has been combined with registered reports into the deconstructed publication group of models, now Section 5.1.

      10. “Page 14: ‘The "Publish-Review-Curate" model is universal that we expect to be the future of scientific publishing. The transition will not happen today or tomorrow, but in the next 5-10 years, the number of projects such as eLife, F1000Research, Peer Community in, or MetaROR will rapidly increase’. This seems overly strong (an example of my larger critique and that of the reviewers).”

      This part of the text has been rewritten.

      Reviewer 1

      11. “For example, although Model 3 is less chance to insert bias to the readers, it also weakens the filtering function of the review system. Let’s just think about the dangers of machine-generated articles, paper-mills, p-hacked research reports and so on. Although the editors do some pre-screening for the submissions, in a world with only Model 3 peer review the literature could easily get loaded with even more ‘garbage’ than in a model where additional peers help the screening.”

      I think that generated text is better detected by software tools. At the same time, I have tried to describe the pros and cons of the different models in a more balanced way in the concluding section.

      12. “Compared to registered reports other aspects can come to focus that Model 3 cannot cover. It’s the efficiency of researchers’ work. In the case of registered reports, Stage 1 review can still help researchers to modify or improve their research design or data collection method. Empirical work can be costly and time-consuming and post-publication review can only say that ‘you should have done it differently then it would make sense’.”

      Thank you very much for this valuable contribution. I have added this statement on p. 11.

      13. “Finally, the author puts openness as a strength of Model 3. In my eyes, openness is a separate question. All models can work very openly and transparently in the right circumstances. This dimension is not an inherent part of the models.”

      I think that a model that provides peer reviews for all submissions ensures maximum transparency. However, I have made an effort to make the wording more balanced and to distinguish my personal perspective from the literature.

      14. “In conclusion, I would not make a verdict over the models, instead emphasize the different functions they can play in scientific communication.”

      This idea is now reflected in the concluding section.

      15. “A minor comment: I found that a number of statements lack references in the Introduction. I would have found them useful for statements such as ‘There is a point of view that peer review is included in the implicit contract of the researcher.’”

      Thank you for your feedback. I have implemented a more rigorous approach to argumentation in response.

      Reviewer 2

      16. “The primary weakness of this article is that it presents itself as an 'analysis' from which they 'conclude' certain results such as their typology, when this appears clearly to be an opinion piece. In my view, this results in a false claim of objectivity which detracts from what would otherwise be an interesting and informative, albeit subjective, discussion, and thus fails to discuss the limitations of this approach.”

      I have incorporated the conceptual framework and description of the research methodology. However, the Discussion section reflects my personal perspective at some points, which I have explicitly highlighted to ensure clarity.

      17. “A secondary weakness is that the discussion is not well structured and there are some imprecisions of expression that have the potential to confuse, at least at first.”

      The structure of the paper (and discussion) has changed significantly.

      18. “The evidence and reasoning for claims made is patchy or absent. One instance of the former is the discussion of bias in peer review. There are a multitude of studies of such bias and indeed quite a few meta-analyses of these studies. A systematic search could have been done here but there is no attempt to discuss the totality of this literature. Instead, only a few specific studies are cited. Why are these ones chosen? We have no idea. To this extent I am not convinced that the references used here are the most appropriate.”

      I have reviewed the existing references and incorporated additional sources. However, the study does not claim to conduct a systematic literature review; rather, it adopts an interpretative approach to literature analysis.

      19. “Instances of the latter are the claim that ‘The most well-known initiatives at the moment are ResearchEquals and Octopus’ for which no evidence is provided, the claim that ‘we believe that journal-independent peer review is a special case of Model 3’ for which no further argument is provided, and the claim that ‘the function of being the "supreme judge" in deciding what is "good" and "bad" science is taken on by peer review’ for which neither is provided.”

      Thank you for your feedback. I have implemented a more rigorous approach to argumentation in response.

      20. “A particular example of this weakness, which is perhaps of marginal importance to the overall paper but of strong interest to this reviewer is the rather odd engagement with history within the paper. It is titled "Evolution of Peer Review" but is really focussed on the contemporary state-of-play. Section 2 starts with a short history of peer review in scientific publishing, but that seems intended only to establish what is described as the 'traditional' model of peer review. Given that that short history had just shown how peer review had been continually changing in character over centuries - and indeed Kochetkov goes on to describe further changes - it is a little difficult to work out what 'traditional' might mean here; what was 'traditional' in 2010 was not the same as what was 'traditional' in 1970. It is not clear how seriously this history is being taken. Kochetkov has earlier written that "as early as the beginning of the 21st century, it was argued that the system of peer review is 'broken'" but of course criticisms - including fundamental criticisms - of peer review are much older than this. Overall, this use of history seems designed to privilege the experience of a particular moment in time, that coincides with the start of the metascience reform movement.”

      While the paper addresses some aspects of peer review history, it does not provide a comprehensive examination of this topic. A clarifying statement to this effect has been included in the methodology section.

      “… this section incorporates elements of historical analysis, it does not fully qualify as such because primary sources were not directly utilized. Instead, it functions as an interpretative literature review, and one that is intentionally concise, as a comprehensive history of peer review falls outside the scope of this research”.

      21. “Section 2 also demonstrates some of the second weakness described, a rather loose structure. Having moved from a discussion of the history of peer review to detail the first model, 'traditional' peer review, it then also goes on to describe the problems of this model. This part of the paper is one of the best - and best - evidenced. Given the importance of it to the main thrust of the discussion it should probably have been given more space as a Section all on its own.”

      This section (now Section 4) has been extended, see also previous comment.

      22. “Another example is Section 4 on Modular Publishing, in which Kochetkov notes "Strictly speaking, modular publishing is primarily an innovative approach for the publishing workflow in general rather than specifically for peer review." Kochetkov says "This is why we have placed this innovation in a separate category" but if it is not an innovation in peer review, the bigger question is 'Why was it included in this article at all?'.”

      Modular publishing has been combined with registered reports into the deconstructed publication group of models, now Section 5.1.

      23. “One example of the imprecisions of language is as follows. The author also shifts between the terms 'scientific communication' and 'science communication' but, at least in many contexts familiar to this reviewer, these are not the same things, the former denoting science-internal dissemination of results through publication (which the author considers), conferences and the like (which the author specifically excludes) while the latter denotes the science-external public dissemination of scientific findings to non-technical audiences, which is entirely out of scope for this article.”

      Thank you for your remark. As a non-native speaker, I initially did not grasp the distinction between the terms. However, I believe the phrase ‘scholarly communication’ is the most universally applicable term. This adjustment has now been incorporated into the text.

      24. “A final note is that Section 3, while an interesting discussion, seems largely derivative from a typology of Waltman, with the addition of a consideration of whether a reform is 'radical' or 'incremental', based on how 'disruptive' the reform is. Given that this is inherently a subjective decision, I wonder if it might not have been more informative to consider 'disruptiveness' on a scale and plot it accordingly. This would allow for some range to be imagined for each reform as well; surely reforms might be more or less disruptive depending on how they are implemented. Given that each reform is considered against each model, it is somewhat surprising that this is not presented in a tabular or graphical form.”

      Ultimately, I excluded this metric due to its current reliance on purely subjective judgment. Measuring 'disruptiveness', e.g., through surveys or interviews remains a task for future research.

      25. “Reconceptualize this as an opinion piece. Where systematic evidence can be drawn upon to make points, use that, but don't be afraid to just present a discussion from what is clearly a well-informed author.”

      I cannot definitively classify this work as an opinion piece. In fact, this manuscript synthesizes elements of a literature review, research article, and opinion essay. My idea was to integrate the strengths of all three genres.

      26. “Reconsider the focus on history and 'evolution' if the point is about the current state of play and evaluation of reforms (much as I would always want to see more studies on the history and evolution of peer review).”

      I have revised the title to better reflect the study’s scope and explicitly emphasize its focus on contemporary developments in the field.

      “Peer Review at the Crossroads”

      27. “Consider ways in which the typology might be expanded, even if at subordinate level.”

      I have updated the typology and introduced the third tier, where it is applicable (see Fig.2).

      Reviewer 3

      28. “In my view, the biggest issue with the current peer review system is the low quality of reviews, but the manuscript only mentions this fleetingly. The current system facilitates publication bias, confirmation bias, and is generally very inconsistent. I think this is partly due to reviewers’ lack of accountability in such a closed peer review system, but I would be curious to hear the author’s ideas about this, more elaborately than they provide them as part of issue 2.”

      I have elaborated on this issue in the footnote.

      29. “I’m missing a section in the introduction on what the goals of peer review are or should be. You mention issues with peer review, and these are mostly fair, but their importance is only made salient if you link them to the goals of peer review. The author does mention some functions of peer review later in the paper, but I think it would be good to expand that discussion and move it to a place earlier in the manuscript.”

      The functions of peer review are summarized in the first paragraph of Introduction.

      30. “Table 1 is intuitive but some background on how the author arrived at these categorizations would be welcome. When is something incremental and when is something radical? Why are some innovations included but not others (e.g., collaborative peer review, see https://content.prereview.org/how-collaborative-peer-review-can-transform-scientific-research/)?”

      Collaborative peer review, namely PREreview, was mentioned in the context of Model 3 (Publish-Review-Curate). However, I have extended this part of the paper.

      31. “‘Training of reviewers through seminars and online courses is part of the strategies of many publishers. At the same time, we have not been able to find statistical data or research to assess the effectiveness of such training.’ (p. 5) There is some literature on this, although not recent. See work by Sara Schroter, for example (Schroter et al., 2004; Schroter et al., 2008).”

      Thank you very much, I have added these studies and a few more recent ones.

      32. “‘It should be noted that most initiatives aimed at improving the quality of peer review simultaneously increase the costs.’ (p. 7) This claim needs some support. Please explicate why this typically is the case and how it should impact our evaluations of these initiatives.”

      I have moved this part to the Discussion section.

      33. “I would rephrase “Idea of the study” in Figure 2 since the other models start with a tangible output (the manuscript). This is the same for registered reports where they submit a tangible report including hypotheses, study design, and analysis plan. In the same vein, I think study design in the rest of the figure might also not be the best phrasing. Maybe the author could use the terminology used by COS (Stage 1 manuscript, and Stage 2 manuscript, see Details & Workflow tab of https://www.cos.io/initiatives/registered-reports). Relatedly, “Author submits the first version of the manuscript” in the first box after the ‘Manuscript (report)’ node may be a confusing phrase because I think many researchers see the first version of the manuscript as the stage 1 report sent out for stage 1 review.”

      Thank you very much. Stage 1 and Stage 2 manuscripts look like a suitable labelling solution.

      34. “One pathway that is not included in Figure 2 is that authors can decide to not conduct the study when improvements are required. Relatedly, in the publish-review-curate model, is revising the manuscripts based on the reviews not optional as well? Especially in the case of 3a, authors can hardly be forced to make changes even though the reviews are posted on the platform.”

      All four models imply a certain level of generalization; thus, I tried to avoid redundant details. However, I have added this choice to the PRC model (now Model 4).

      35. “I think the author should discuss the importance of ‘open identities’ more. This factor is now not explicitly included in any of the models, while it has been found to be one of the main characteristics of peer review systems (Ross-Hellauer, 2017).”

      This part has been extended.

      36. “More generally, I was wondering why the author chose these three models and not others. What were the inclusion criteria for inclusion in the manuscript? Some information on the underlying process would be welcome, especially when claims like ‘However, we believe that journal-independent peer review is a special case of Model 3 (‘Publish-Review-Curate’).’ are made without substantiation.”

      The study included four generalized models of peer review that involved some level of abstraction.

      37. “Maybe it helps to outline the goals of the paper a bit more clearly in the introduction. This helps the reader to know what to expect.”

      The Introduction has been revised including the goal and objectives.

      38. “The Modular Publishing section is not inherently related to peer review models, as you mention in the first sentence of that paragraph. As such, I think it would be best to omit this section entirely to maintain the flow of the paper. Alternatively, you could shortly discuss it in the discussion section but a separate paragraph seems too much from my point of view.”

      Modular publishing has been combined with registered reports into the fragmented publishing group of models, now in Section 5.

      39. “Labeling model 3 as post-publication review might be confusing to some readers. I believe many researchers see post-publication review as researchers making comments on preprints, or submitting commentaries to journals. Those activities are substantially different from the publish-review-curate model so I think it is important to distinguish between these types.”

      The label was changed to the Publish-Review-Curate model.

      40. “I do not think the conclusions drawn below Table 3 logically follow from the earlier text. For example, why are “all functions of scientific communication implemented most quickly and transparently in Model 3”? It could be that the entire process takes longer in Model 3 (e.g. because reviewers need more time), so that Model 1 and Model 2 lead to outputs quicker. The same holds for the following claim: ‘The additional costs arising from the independent assessment of information based on open reviews are more than compensated by the emerging opportunities for scientific pluralism.’ What is the empirical evidence for this? While I personally do think that Model 3 improves on Model 1, emphatic statements like this require empirical evidence. Maybe the author could provide some suggestions on how we can attain this evidence. Model 2 does have some empirical evidence underpinning its validity (see Scheel, Schijen, Lakens, 2021; Soderberg et al., 2021; Sarafoglou et al. 2022) but more meta-research inquiries into the effectiveness and cost-benefits ratio of registered reports would still be welcome in general.”

      The Discussion section has been substantially revised to address this point. While I acknowledge the current scarcity of empirical studies on innovative peer review models, I have incorporated a critical discussion of this methodological gap. I am grateful for the suggested literature on RRs, which I have now integrated into the relevant subsection.

      41. “What is the underlying source for the claim that openness requires three conditions?”

      I have made an effort to clarify within the text that this reflects my personal stance.

      42. “‘If we do not change our approach, science will either stagnate or transition into other forms of communication.’ (p. 2) I don’t think this claim is supported sufficiently strongly. While I agree there are important problems in peer review, I think there would need to be a more in-depth and evidence-based analysis before claims like this can be made.”

      The sentence has been rephrased.

      43. “On some occasions, the author uses ‘we’ while the study is single authored.”

      This has been fixed.

      44. “Figure 1: The top-left arrow from revision to (re-)submission is hidden”

      I have updated Figure 1.

      45. “‘The low level of peer review also contributes to the crisis of reproducibility in scientific research (Stoddart, 2016).’ (p. 4) I assume the author means the low quality of peer review.”

      This has been fixed.

      46. “‘Although this crisis is due to a multitude of factors, the peer review system bears a significant responsibility for it.’ (p. 4) This is also a big claim that is not substantiated”

      I have paraphrased this sentence as “While multiple factors drive this crisis, deficiencies in the peer review process remain a significant contributor.” and added a footnote.

      47. “‘Software for automatic evaluation of scientific papers based on artificial intelligence (AI) has emerged relatively recently” (p. 5) The author could add RegCheck (https://regcheck.app/) here, even though it is still in development. This tool is especially salient in light of the finding that preregistration-paper checks are rarely done as part of reviews (see Syed, 2023)”

      Thank you very much, I have added this information.

      48. “There is a typo in last box of Figure 1 (‘decicion’ instead of ‘decision’). I also found typos in the second box of Figure 2, where ‘screns’ should be ‘screens’, and the author decision box where ‘desicion’ should be ‘decision’”

      This has been fixed.

      49. “Maybe it would be good to mention results blinded review in the first paragraph of 3.2. This is a form of peer review where the study is already carried out but reviewers are blinded to the results. See work by Locascio (2017), Grand et al. (2018), and Woznyj et al. (2018).”

      Thanks, I have added this (now section 5.2)

      50. “Is ‘Not considered for peer review’ in figure 3b not the same as rejected? I feel that it is rejected in the sense that neither the manuscript nor the reviews will be posted on the platform.”

      Changed into “Rejected”

      51. “‘In addition to the projects mentioned, there are other platforms, for example, PREreview12, which departs even more radically from the traditional review format due to the decentralized structure of work.’ (p. 11) For completeness, I think it would be helpful to add some more information here, for example why exactly decentralization is a radical departure from the traditional model.”

      I have extended this passage.

      52. “‘However, anonymity is very conditional - there are still many “keys” left in the manuscript, by which one can determine, if not the identity of the author, then his country, research group, or affiliated organization.’ (p.11) I would opt for the neutral ‘their’ here instead of ‘his’, especially given that this is a paragraph about equity and inclusion.”

      This has been fixed.

      53. “‘Thus, “closeness” is not a good way to address biases.’ (p. 11) This might be a straw man argument because I don’t believe researchers have argued that it is a good method to combat biases. If they did, it would be good to cite them here. Alternatively, the sentence could be omitted entirely.”

      I have omitted the sentence.

      54. “I would start the Modular Publishing section with the definition as that allows readers to interpret the other statements better.”

      Modular publishing has been combined with registered reports into the deconstructed publication group of models, now in Section 5, general definition added.

      55. “It would be helpful if the Models were labeled (instead of using Model 1, Model 2, and Model 3) so that readers don’t have to think back what each model involved.”

      All the models represent a kind of generalization, which is why non-detailed labels are used. The text labels may vary depending on the context.

      56. “Table 2: ‘Decision making’ for the editor’s role is quite broad, I recommend to specify and include what kind of decisions need to be made.”

      Changed into “Making accept/reject decisions”

      57. “Table 2: ‘Aim of review’ – I believe the aim of peer review differs also within these models (see the ‘schools of thought’ the author mentions earlier), so maybe a statement on what the review entails would be a better way to phrase this.”

      Changed into “What does peer review entail?”

      58. “Table 2: One could argue that the ‘object of the review’ in Registered Reports is also the manuscript as a whole, just in different stages. As such, I would phrase this differently.”

      Current wording fits your remark: “Manuscript in terms of study design and execution”

      Reviewer 4

      59. “Page 3: It’s hard to get a feel for the timeline given the dates that are described. We have peer review becoming standard after WWII (after 1945), definitively established by the second half of the century, an example of obligatory peer review starting in 1976, and in crisis by the end of the 20th century. I would consider adding examples that better support this timeline – did it become more common in specific journals before 1976? Was the crisis by the end of the 20th century something that happened over time or something that was already intrinsic to the institution? It doesn’t seem like enough time to get established and then enter crisis, but more details/examples could help make the timeline clear. Consider discussing the benefits of the traditional model of peer review.”

      This section has been extended.

      60. “Table 1 – Most of these are self-explanatory to me as a reader, but not all. I don’t know what a registered report refers to, and it stands to reason that not all of these innovations are familiar to all readers. You do go through each of these sections, but that’s not clear when I initially look at the table. Consider having a more informative caption. Additionally, the left column is “Course of changes” here but “Directions” in text. I’d pick one and go with it for consistency.”

      Table 1 has been replaced by Figure 2. I have also extended text descriptions, added definitions.

      61. “With some of these methods, there’s the ability to also submit to a regular journal. Going to a regular journal presumably would instigate a whole new round of review, which may or may not contradict the previous round of post-publication review and would increase the length of time to publication by going through both types. If someone has a goal to publish in a journal, what benefit would they get by going through the post-publication review first, given this extra time?”

      Some of these platforms, e.g., F1000 and Lifecycle Journal, replace conventional journal publishing. Modular publishing allows for step-by-step feedback from peers. An important advantage of RRs over other peer review models lies in their capacity to enhance research efficiency. By conducting peer review at Stage 1, researchers gain the opportunity to refine their study design or data collection protocols before empirical work begins. Other models of review can offer critiques such as "the study should have been conducted differently" without an actionable opportunity for improvement. The key motivation for having my paper reviewed in MetaROR is the quality of peer review – I have never received so many comments, frankly! Moreover, platforms such as MetaROR usually have partnering journals.

      62. “There’s a section talking about institutional change (page 14). It mentions that openness requires three conditions – people taking responsibility for scientific communication, authors and reviewers, and infrastructure. I would consider adding some discussion of readers and evaluators. Readers have to be willing to accept these papers as reliable, trustworthy, and respectable to read and use the information in them. Evaluators such as tenure committees and potential employers would need to consider papers submitted through these approaches as evidence of scientific scholarship for the effort to be worthwhile for scientists.”

      I have omitted these conditions and employed Moore’s Technology Adoption Life Cycle. Thank you very much for your comment!

      63. “Based on this overview, which seems somewhat skewed towards the merits of these methods (conflict of interest, limited perspective on downsides to new methods/upsides to old methods), I am not quite ready to accept this effort as equivalent of a regular journal and pre-publication peer review process. I look forward to learning more about the approach and seeing this review method in action and as it develops.”

      The Discussion section has been substantially revised to address this point. While I acknowledge the current scarcity of empirical studies on innovative peer review models, I have incorporated a critical discussion of this methodological gap.

    1. Reviewer #3 (Public review):

      In this study, the authors investigate the requirements for the formation of CPSF6 puncta induced by HIV-1 under a high multiplicity of infection conditions. Not surprisingly, they observe that mutation of the Phe-Gly (FG) repeat responsible for CPSF6 binding to the incoming HIV-1 capsid abrogates CPSF6 punctum formation. Perhaps more interestingly, they show that the removal of other domains of CPSF6, including the mixed-charge domain (MCD), does not affect the formation of HIV-1-induced CPSF6 puncta. The authors also present data suggesting that CPSF6 puncta form individually before fusing with nuclear speckles (NSs) and that the fusion of CPSF6 puncta to NSs requires the intrinsically disordered region (IDR) of the NS component SRRM2. While the study presents some interesting findings, there are some technical issues that need to be addressed and the amount of new information is somewhat limited. Also, the authors' finding that deletion of the CPSF6 MCD does not affect the formation of HIV-1-induced CPSF6 puncta contradicts recent findings of Jang et al. (https://doi.org/10.1093/nar/gkae769).

      Comments on revisions:

      The authors have generally addressed my comments.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review): 

      In recent years, our understanding of the nuclear steps of the HIV-1 life cycle has made significant advances. It has emerged that HIV-1 completes reverse transcription in the nucleus and that the host factor CPSF6 forms condensates around the viral capsid. The precise function of these CPSF6 condensates is under investigation, but it is clear that the HIV-1 capsid protein is required for their formation. This study by Tomasini et al. investigates the genesis of the CPSF6 condensates induced by HIV-1 capsid, what other co-factors may be required, and their relationship with nuclear speckles (NS). The authors show that disruption of the condensates by the drug PF74, added post-nuclear entry, blocks HIV-1 infection, which supports their functional role. They generated CPSF6 KO THP-1 cell lines, in which they expressed exogenous CPSF6 constructs to map by microscopy and pull down assays of the regions critical for the formation of condensates. This approach revealed that the LCR region of CPSF6 is required for capsid binding but not for condensates whereas the FG region is essential for both. Using SON and SRRM2 as markers of NS, the authors show that CPSF6 condensates precede their merging with NS but that depletion of SRRM2, or SRRM2 lacking the IDR domain, delays the genesis of condensates, which are also smaller.

      The study is interesting and well conducted and defines some characteristics of the CPSF6-HIV-1 condensates. Their results on the NS are valuable. The data presented are convincing. 

      I have two main concerns. Firstly, the functional outcome of the various protein mutants and KOs is not evaluated. Although Figure 1 shows that disruption of the CPSF6 puncta by PF74 impairs HIV-1 infection, it is not clear if HIV-1 infection is at all affected by expression of the mutant CPSF6 forms (and SRRM2 mutants) or KO/KD of the various host factors. The cell lines are available, so it should be possible to measure HIV-1 infection and reverse transcription. Secondly, the authors have not assessed if the effects observed on the NS impact HIV-1 gene expression, which would be interesting to know given that NS are sites of highly active gene transcription. With the reagents at hand, it should be possible to investigate this too. 

      We thank the reviewer for her/his valuable feedback on our manuscript. We are pleased to see her/his appreciation of our results, and we did our utmost to address the highlighted points to further improve our work.

      To correctly perform the infectivity assay, we generated stable cell clones—a process that required considerable time, particularly during the selection of clones expressing protein levels comparable to wild-type (WT) cells. To accurately measure infectivity, it was essential to use stable clones expressing the most important deletion mutant, ∆FG CPSF6, at levels similar to those of CPSF6 in WT cells (new Fig.5 A-B). Importantly, we assessed the reproducibility of our experiments by freezing and thawing these clones.

      Regarding SRRM2, in THP-1 cells we were only able to achieve a knockdown, which still retains residual SRRM2 protein, albeit at much lower levels. Due to the essential role of SRRM2 in cell survival, obtaining a complete knockout in this cell line is not feasible, making it difficult to draw definitive conclusions from these experiments.

      In contrast, 293T cells carrying the endogenous SRRM2 deletion mutant (ΔIDR) cannot be infected with replication-competent HIV-1, as they lack expression of CD4 and either CXCR4 or CCR5. These cells were instead used to monitor the dynamics of CPSF6 puncta assembly within nuclear speckles. However, they are not a suitable model for studying the impact of SRRM2 depletion on viral infection.

      Thus, we performed infectivity assays in a more relevant cell line for HIV-1 infection, THP-1 macrophage-like cells, using both a single-round virus and a replication-competent virus. The new results, shown in Figure 5 C-D, indicate that complete depletion of CPSF6 reduces infectivity, as measured by luciferase expression in a single-round infection (KO: ~65%; ΔFG: ~74%; compared to WT: 100% on average). Notably, a more pronounced defect in viral particle production was observed when WT virus was used for infection (KO: ~21%; ΔFG: ~16%; compared to WT: 100% on average). These findings support the referee’s insightful suggestion that the absence of CPSF6 could also impair HIV-1 gene expression. 

      Reviewer #2 (Public review): 

      Summary: 

      HIV-1 infection induces CPSF6 aggregates in the nucleus that contain the viral protein CA. The study of the functions and composition of these nuclear aggregates has raised considerable interest in the field, and they have emerged as sites in which reverse transcription is completed and in the proximity of which viral DNA becomes integrated. In this work, the authors have mutated several regions of the CPSF6 protein to identify the domains important for nuclear aggregation, in addition to the already known FG region; they have characterized the kinetics of fusion between CPSF6 aggregates and SC35 nuclear speckles and have determined the role of two nuclear speckle components in this process (SRRM2, SUN2).

      Strengths: 

      The work examines systematically the domains of CPSF6 of importance for nuclear aggregate formation in an elegant manner in which these mutants complement an otherwise CPSF6-KO cell line. In addition, this work evidences a novel role for the protein SRRM2 in HIV-induced aggregate formation, overall advancing our comprehension of the components required for their formation and regulation. 

      Weaknesses: 

      Some of the results presented in this manuscript, in particular the kinetics of fusion between CPSF6 aggregates and SC35 speckles, have been published before (PMID: 32665593; 32997983).

      The observations of the different effects of CPSF6 mutants, as well as SRRM2/SUN2 silencing experiments are not complemented by infection data which would have linked morphological changes in nuclear aggregates to function during viral infection. More importantly, these functional data could have helped stratify otherwise similar morphological appearances in CPSF6 aggregates. 

      Overall, the results could be presented in a more concise and ordered manner to help focus the attention of the reader on the most important issues. Most of the figures extend to 3-4 different pages and some information could be clearly either aggregated or moved to supplementary data. 

      First, we thank the reviewer for her/his appreciation of our study and for giving us the opportunity to better explain our results and improve our manuscript. We will do our best to address her/his concerns. In the meantime, we would like to clarify the focus of our study. Our research does not aim to demonstrate an association between CPSF6 condensates and nuclear speckles. (We use the term "condensates" rather than "aggregates," as aggregates are generally non-dynamic (Alberti & Hyman, 2021; Banani et al., 2017; Scoca et al., JMCB 2022), and our work specifically examines the dynamic behavior of CPSF6 puncta formed during infection.) That association has already been established in previous studies, as noted in the manuscript (PMID: 32665593; 32997983), which showed that CPSF6 puncta colocalize with SC35 upon HIV infection; in the submitted study we examine their kinetics.

      About the point highlighted by the reviewer: "Kinetics of fusion between CPSF6-aggregates and SC35 speckles have been published before."  

      Our study differs from prior work (PMID: 32665593) because we use a full-length HIV genome and did not follow integrase (IN) fluorescence in trans and its association with CPSF6; instead, we specifically assess whether CPSF6 clusters form in the nucleus independently of NS factors and subsequently fuse with them. In the current study we evaluated the dynamics of formation of CPSF6/NS puncta, which has not been explored before. Given this focus, we believe that our work offers a novel perspective on the molecular interactions that facilitate HIV / CPSF6-NS fusion.

      We calculated that 27% of CPSF6 clusters were independent from NS at 6 h post-infection, compared to only 9% at 30 h. This likely reflects a reduction in individual clusters as more become fused with nuclear speckles over time. At the same time, these data suggest that the fusion process can begin even earlier. Indeed, it has been reported that in macrophages, the peak of viral nuclear import occurs before 6 h post-infection (doi: 10.1038/s41564-020-0735-8).

      In addition, we have incorporated new experiments assessing viral infectivity in the absence of CPSF6, or in CPSF6-knockout cells expressing either a CPSF6 mutant lacking the FG peptide or the WT protein. As shown in our new Figure 5, these results demonstrate that the FG peptide is critical for viral replication in THP-1 cells.

      For better clarity, we would like to specify that our study focuses on the role of SON, a scaffold factor of nuclear speckles, rather than SUN2 (SUN domain-containing protein 2), which is a component of the LINC (Linker of Nucleoskeleton and Cytoskeleton) complex.

      As suggested by the reviewer, we have revised the text and combined figures to improve clarity and facilitate reader comprehension. We appreciate the constructive comment of the reviewer.

      Reviewer #3 (Public review): 

      In this study, the authors investigate the requirements for the formation of CPSF6 puncta induced by HIV-1 under a high multiplicity of infection conditions. Not surprisingly, they observe that mutation of the Phe-Gly (FG) repeat responsible for CPSF6 binding to the incoming HIV-1 capsid abrogates CPSF6 punctum formation. Perhaps more interestingly, they show that the removal of other domains of CPSF6, including the mixed-charge domain (MCD), does not affect the formation of HIV-1-induced CPSF6 puncta. The authors also present data suggesting that CPSF6 puncta form individually before fusing with nuclear speckles (NSs) and that the fusion of CPSF6 puncta to NSs requires the intrinsically disordered region (IDR) of the NS component SRRM2. While the study presents some interesting findings, there are some technical issues that need to be addressed and the amount of new information is somewhat limited. Also, the authors' finding that deletion of the CPSF6 MCD does not affect the formation of HIV-1-induced CPSF6 puncta contradicts recent findings of Jang et al. (doi.org/10.1093/nar/gkae769).

      We thank the reviewer for her/his thoughtful feedback and the opportunity to elaborate on why our findings provide a distinct perspective compared to those of Jang et al. (doi.org/10.1093/nar/gkae769).

      One potential reason for the differences between our findings and those of Jang et al. could be the choice of experimental systems. Jang et al. conducted their study in HEK293T cells with CPSF6 knockouts, as described in Sowd et al., 2016 (doi.org/10.1073/pnas.1524213113). In contrast, our work focused on macrophage-like THP-1 cells, which share closer characteristics with HIV-1’s natural target cells. 

      Our approach utilized a complete CPSF6 knockout in THP-1 cells, enabling us to reintroduce untagged versions of CPSF6, such as wild-type and deletion mutants, to avoid potential artifacts from tagging. Jang et al. employed HA-tagged CPSF6 constructs, which may lead to subtle differences in experimental outcomes due to the presence of the tag.

      Finally, our investigation into the IDR of SRRM2 relied on CRISPR-PAINT to generate targeted deletions directly in the endogenous gene (Lester et al., 2021, DOI: 10.1016/j.neuron.2021.03.026). This approach provided a native context for studying SRRM2’s role.

      We will incorporate these clarifications into the discussion section of the revised manuscript.  

      Reviewer #1 (Recommendations for the authors): 

      (1) Figure 2E: The statistical analysis should be extended to the comparison between the "+HIV" samples. 

      We now show the statistics for the comparison between the HIV+ samples only (new Fig. 2D).

      (2) Figure 4A top panel is out of focus. 

      We modified the figure, which is now Figure 6A.

      Reviewer #2 (Recommendations for the authors): 

      (1) Some of the sentences could be rewritten for the sake of simplicity, also taking care to avoid overstatement. 

      We modified the sentences as best we could.

      (2) For instance: There is no evidence that "viral genomes in nuclear niches may be contributing to the formation of viral reservoirs" (lines 33-35). 

      We changed the sentence as follows: “Despite antiretroviral treatment, viral genomes can persist in these nuclear niches and reactivate upon treatment interruption, raising the possibility that they could play a role in the establishment of viral reservoirs.”

      (3) Line 53: unclear sentence. "The initial stages of the viral life cycle have been understood....." The authors certainly mean reverse transcription, but as formulated this is not clear. The authors should also bear in mind that reverse transcription starts already in budding/just released virions. 

      We clarified the concept as follows: “the initial stages of the viral life cycle, such as the reverse transcription (the conversion of the viral RNA in DNA) and the uncoating (loss of the capsid), have been understood to mainly occur within the host cytoplasm.”

      (4) Line 124: the results in Figure 1 are not at all explained in the text. PF74 does not act on CPSF6, it acts on CA and this in turn leads to CPSF6 puncta disappearance.

      PF74 binds the same hydrophobic pocket of the viral core as CPSF6. However, when viral cores are located within CPSF6 puncta, treatment with a high dose of PF74 leads to a rapid disassembly of these puncta, while viral cores remain detectable up to 2 hours post-treatment (Ay et al., EMBO J. 2024). Here, we simply describe what we observed by confocal microscopy. That said, HIV-induced CPSF6 puncta include both CPSF6 proteins and viral cores, as we have now specified.

      (5) Line 130; 'hinges into two key ...' should be 'hinges on'. 

      Thanks, we modified it.

      (6) Supplementary Figures are not cited sequentially in the text. 

      We have now modified the numbers of the supplementary figures according to their appearance in the text.

      (7) Line 44: define FG. 

      We defined it.

      Reviewer #3 (Recommendations for the authors): 

      Specific comments that the authors should address are outlined below. 

      (1) As mentioned in the summary above, the authors' findings seem to be in direct contradiction with recent work published by Alan Engelman's lab in NAR. The authors should address the possible reason(s) for this discrepancy. 

      We mention the potential reasons for the differences in the results between our study and Engelman’s lab study in the discussion.

      (2) The major finding here that deletion of the CPSF6 FG repeat prevents the formation of CPSF6 puncta is unsurprising, as the FG repeat is responsible for capsid binding. This has been reported previously and such mutants have been used as controls in other studies.

      Our study demonstrates that the FG domain, rather than the LCR or MCD domains, is the sole region responsible for the formation of CPSF6 puncta. That the FG domain of CPSF6 promotes puncta formation during viral infection without the help of the other IDRs is a particularly novel finding, as it has not yet been reported in the literature.

      (3) Line 339, the authors state: "incoming viral RNA has been observed to be sequestered in nuclear niches in cells treated with the reversible reverse transcriptase inhibitor, NEV. When macrophage-like cells are infected in the presence of NEV, the incoming viral RNA is held within the nucleus (Rensen et al., 2021; Scoca et al., 2023). This scenario is comparable to what is observed in patients undergoing antiretroviral therapy". In what way is this comparable to what is observed in individuals on ART? I see no basis for this statement. Sequestration of viral RNA in the nucleus is not the basis for maintaining the viral reservoir in individuals on therapy. 

      Thanks, we rephrased the sentence.

      (4) General comment: analyzing single-cell-derived KO clones is very risky because of random clonal variability between individual cells in the population. If single-cell-derived clones are used, phenotypes could be confirmed with multiple, independent clones. 

      We used a clone with a complete CPSF6 knockout mainly to investigate the role of a specific domain in condensate formation, and it is unlikely that clone selection introduced artifacts in this context. Other available clones retain residual endogenous protein, which prevents us from accurately assessing CPSF6 cluster formation in the various deletion mutants. A complete CPSF6 knockout is essential for studying puncta formation, as it eliminates potential artifacts arising from protein tags that could alter the phase separation properties of the protein under investigation.

      (5) Line 214. "It is predicted to form two short α helices and a ß strand, arranged as: α helix - FG - ß strand - α helix". What is this based on? No citation is provided and no data are shown. 

      The statement "It is predicted to form two short α helices and a ß strand, arranged as: α helix - FG - ß strand - α helix" is based on the PSIPRED prediction shown in Figure 4E.

      (6) Figure 1B. "Luciferase values were normalized by total proteins revealed with the Bradford kit". What does this mean? I couldn't find anything explaining how the viral inputs were normalized. 

      The amount of virus used was the same for all samples: we used an MOI of 10, as described in the legend of Figure 1. Normalizing the RLU (luciferase assay) to the total amount of protein ensures that we are comparing similar numbers of cells. The cells were plated at the same density in each well, so in our case the normalization is simply an additional control.
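
      The normalization described above can be sketched as follows. This is a minimal illustration, assuming normalization means dividing each well's raw luciferase signal by its total protein amount; the function name, well labels, and numbers are hypothetical and are not data from the study.

```python
def normalize_rlu(rlu, protein_ug):
    """Return luciferase signal (RLU) per microgram of total protein."""
    if protein_ug <= 0:
        raise ValueError("protein amount must be positive")
    return rlu / protein_ug


# Hypothetical wells: (raw RLU, total protein in micrograms from a Bradford assay).
# These values are invented for illustration only.
wells = {"sample_A": (120000.0, 24.0), "sample_B": (6000.0, 20.0)}

normalized = {name: normalize_rlu(rlu, protein) for name, (rlu, protein) in wells.items()}
for name, value in normalized.items():
    print(f"{name}: {value:.1f} RLU per microgram protein")
```

      Dividing by protein corrects for small differences in cell number between wells; it does not correct for differences in viral input, which is why the MOI is kept constant.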

      (7) I can't interpret what is being shown in the movies. 

      We updated Movie 1B, rephrased the movie legends, and added a new Suppl. Fig. 4B.

      (8) Figure 5B. The differences seen are very small and of questionable significance. The data suggest that by 6 hpi, around 75% of HIV-induced CPSF6 puncta are already fused with NSs. 

      We calculated that 27% of CPSF6 clusters were independent from NS at 6 h post-infection, compared to only 9% at 30 h. This likely reflects a reduction in individual clusters as more become fused with nuclear speckles over time. At the same time, these data suggest that the fusion process can begin even earlier. Indeed, it has been reported that in macrophages, the peak of viral nuclear import occurs before 6 h post-infection (doi: 10.1038/s41564-020-0735-8).

      (9) Figure 6. Immunofluorescence is not a good method for quantifying KD efficiency. The authors should perform western blotting to measure KD efficiency. This is an important point, because the effect sizes are small, quite likely due to incomplete KD. 

      We performed WB and quantified the results, which correlated with the IF data and their imaging analysis. These new findings have been incorporated into Figure 8A. Of note, deletion of the IDR of SRRM2 does not affect the number of SON puncta (Fig.8C), but significantly reduces the number of CPSF6 puncta in infected cells compared to those expressing full-length SRRM2 (Fig.8D).

      (10) There are a variety of issues with the text that should be corrected. 

      The authors use "RT" to mean both the enzyme (reverse transcriptase) and the process (reverse transcription). This is incorrect and will confuse the reader. RT refers to the enzyme (noun, not verb). 

      The commonly used abbreviation for nevirapine is NVP, not NEV. 

      In line 60, it is stated that the capsid contains 250 hexamers. This number is variable, depending on the size and shape of the capsid. By contrast, the capsid has exactly 12 pentamers. 

      Line 75. Typo: "nuclear niches containing, such as like". 

      Line 82. Typo: "the mechanism behinds". 

      Line 102. Typo: "we aim to elucidate how these HIV-induced CPSF6 form". 

      Line 107. Typo: "CPSF6 is responsible for tracking the viral core" ("trafficking the viral core"?).

      Thanks, we corrected all of them.

    1. Reviewer #1 (Public review):

      Zhu and colleagues used high-density Neuropixel probes to perform laminar recordings in V1 while presenting either small stimuli that stimulated the classical receptive field (CRF) or large stimuli whose border straddled the RF to provide nonclassical RF (nCRF) stimulation. Their main question was to understand the relative contribution of feedforward (FF), feedback (FB), and horizontal circuits to border ownership (B<sub>own</sub>), which they addressed by measuring cross-correlation across layers. They found differences in cross-correlation between feedback/horizontal (FH) and input layers during CRF and nCRF stimulation.

      Comments on revisions:

      In the revision, the authors have added a paragraph in the Discussion to address the question of layers 2/3 neurons leading layer 4 neurons, and have provided answers to the questions in the public review without making substantial changes in the paper. However, there were several other recommendations, and I am not sure why they were not considered. I am adding those again below.

      * For CRF stimulation, the zero lag between 4C and 4A/B with layer 5/6 (Figure 3D last two columns on the right) was surprising to me. I just felt that this could be because layer 6 may also be getting FF inputs. Perhaps better not to club layer 5 with 6, as mentioned earlier also.

      * Interpreting the nCRF delays, with often negative delays, was very challenging for me. For example, 4C -> 5/6 (third column in Figure 3) has a significantly negative peak (although that does not show up in statistical analysis because it seems to be a signed test to just test if the median was greater than zero, not if the median was different from zero; line 285). What is the interpretation here? Are spikes in 5/6 causing spikes in 4C (which, as mentioned earlier, would require anatomical projections from 5/6 to 4C)? On the other hand, if FB inputs arrive in 5/6 but there are no inputs going to 4C, then why should there even be a significant cross-correlation?

      The only explanation I could think of is somehow an alignment of inputs in these two layers such that FH inputs come in Layer 5/6 just before FF inputs arrive in 4C, each causing a spike in a neuron in each layer which are otherwise not anatomically interconnected. But this would require both a very precise temporal coupling between FF and FH inputs arriving in these areas AND neurons in layer 5/6 which very strongly respond to FH stimulation (I thought that FH inputs are mainly modulatory and not as strong). Anyway, it would be good to see some cross correlation functions which have a negative lag (all examples in Fig 3B have positive or zero lag).

      * I think cross-correlation analysis would have been useful if there was data from a feedback area (say V2). In its absence, perhaps latency analysis (by just comparing the PSTH) could have revealed something interesting, given that the hypothesis is about differences in the timings in FH versus FF inputs. Do PSTHs across layers show the type of differences that are being claimed (e.g. in line 295-297)?

      * Line 262-63: "Notably, the rates were nearly identical under the two stimulus conditions" - I would have thought CRF stimulation would produce higher rates. Can the authors explain this?

      * Line 174-175: Isn't the proportion of border ownership cells in layer 4C higher than one would expect under the assumption that nCRF effects are mediated by horizontal and feedback connections which layer 4C does not receive? Can authors explain?

      * Figure 3D: it would also be good to show the heatmaps stacked up in the increasing order of the interelectrode distance of the pairs so that it will be easy to see how the peak lag changes with distance as well.

      * It will be good to show the shift in peak lag and CCG asymmetry between CRF and nCRF conditions for the same pairs, using a violin or bar plot with lines connecting each pair in Figure 3.

      * Line 594, 603, 628 and 630: What procedure was used to determine the size, location of the CRF, and optimal orientation manually online?

      * Line 733-734: Although a reference is cited, please explicitly mention the rationale for keeping the peak lag cutoff at 10 ms.

      * It is unclear why a grating was used for the CRF condition, instead of just having the portion of the stimulus within the RF for the nCRF condition, as the comparisons for FHi with FF are with different FF drives in each case.

      * Figure 5 - the scatter is enormous, can you please provide the R2 values?
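
      The point above about a one-sided test missing a significantly negative peak lag can be illustrated with a short sketch. It assumes an exact sign test of the median against zero; the test actually used in the manuscript may differ, and the lag values below are invented for illustration.

```python
from math import comb


def sign_test(values, alternative="two-sided"):
    """Exact sign test of the median against zero; zero values are dropped."""
    pos = sum(v > 0 for v in values)
    neg = sum(v < 0 for v in values)
    n = pos + neg
    if n == 0:
        raise ValueError("no non-zero values")

    def upper_tail(k):
        # P(X >= k) for X ~ Binomial(n, 0.5)
        return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

    if alternative == "greater":   # H1: median > 0
        return upper_tail(pos)
    if alternative == "less":      # H1: median < 0
        return upper_tail(neg)
    if alternative == "two-sided":  # H1: median != 0
        return min(1.0, 2 * upper_tail(max(pos, neg)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")


# Hypothetical peak lags (ms) for one layer pair: nine negative, one positive.
lags = [-2.0, -1.5, -1.0, -3.0, -0.5, -2.5, -1.0, -4.0, 0.5, -2.0]
p_greater = sign_test(lags, "greater")
p_two_sided = sign_test(lags, "two-sided")
print(f"one-sided (median > 0): p = {p_greater:.4f}")
print(f"two-sided (median != 0): p = {p_two_sided:.4f}")
```

      With mostly negative lags, the one-sided test for a positive median returns a p-value close to 1, whereas the two-sided test flags the shift away from zero.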

    2. Reviewer #2 (Public review):

      Summary:

      The authors present a study of how modulatory activity from outside the classical receptive field (cRF) differs from cRF stimulation. They study neural activity across the different layers of V1 in two anesthetized monkeys using Neuropixels probes. The monkeys are presented with drifting gratings and border-ownership tuning stimuli. They find that border-ownership tuning is organized into columns within V1, which is unexpected and exciting, and that the flow of activity from cell-to-cell (as judged by cross-correlograms between single units) is influenced by the type of visual stimulus: border-ownership tuning stimuli vs. drifting-grating stimuli.

      Strengths:

      The questions addressed by the study are of high interest, and the use of Neuropixels probes yields extremely high numbers of single-units and cross-correlation histograms (CCHs) which makes the results robust. The study is well-described.

      Comments on revisions:

      The results are interesting and seem robust. However, several of my main points were not addressed. The authors do not analyze or discuss the problem that the border ownership stimuli do not uniquely isolate feedback from feedforward influences. Here are my remaining points/recommendations:

      (1) In my previous review I indicated that the border-ownership signal also provides a strong feedforward drive, a black-white edge, in addition to the border ownership signal. Calling this a "nCRF stimulus" is a misnomer. Please correct this terminology and replace it by something that is appropriate, e.g. changing it into "grating stimulation" (instead of CRF stimulation) and BO-stimulation (instead of nCRF stimulation).

      (2) In my previous review I asked if the initial response for the border ownership stimulus shows the feedforward signature. It is unclear to me why this suggestion did not lead to an analysis of the feedforward response. I repeat the text from my previous review: "The authors state that they did not look at cross-correlations during the initial response, but if they do, do they see the feedforward-dominated pattern? The jitter CCH analysis might suffice in correcting for the response transient." Can the authors address this point?

      (3) In my previous review I asked the authors to show the average time course of the response elicited by preferred and nonpreferred border ownership stimuli across all significant neurons. It remains unclear why this plot was not provided.

    3. Reviewer #3 (Public review):

      Summary:

      The paper by Zhu et al is on an important topic in visual neuroscience, the emergence in the visual cortex of signals about figure and ground. This topic also goes by the name border ownership. The paper utilizes modern recording techniques very skillfully to extend what is known about border ownership. It offers new evidence about the prevalence of border ownership signals across different cortical layers in V1 cortex. Also, it uses pairwise cross correlation to study signal flow under different conditions of visual stimulation that include the border ownership paradigm.

      Strengths: The paper's strengths are results of its use of multi-electrode probes to study border ownership in many neurons simultaneously across the cortical layers in V1. Also it provides new useful data about the dynamics of interaction of signals from the non-classical receptive field (NCRF) and the Classical receptive field (CRF).

      Weaknesses:

      The paper's weakness is that it does not challenge consensus beliefs about mechanisms. Also, the paper combines data about border ownership with data about the NCRF without making it clear how they are similar or different.

      Critique:

      The border ownership data on V1 offered in the paper replicate experimental results obtained by Zhou and von der Heydt (2000) and confirm the earlier results. The incremental addition is that the authors found border ownership in all cortical layers of V1, extending Zhou and von der Heydt's results that were only about layer 2/3 in V2 cortex. This is an interesting new result using the same stimuli but new measurement techniques.

      The cross-correlation results show that the pattern of the cross correlogram (CCG) is influenced by the visual pattern being presented. However, in the initial submitted ms. the results were not analyzed mechanistically, and the interpretation was unclear. For instance, the authors show in Figure 3 (and in Figure S2) that the peak of the CCG can indicate layer 2/3 excites layer 4C when the visual stimulus is the border ownership test pattern, a large square 8 deg on a side. More than one reviewer asked, "how can layer 2/3 excite layer 4C?" In the revised ms. the authors added a paragraph to the Discussion to respond to the reviewers about this point. The authors could provide an even better response to the reviewers by emphasizing that layer 5/6 neurons consistently lead neurons in layer 4, both for the CRF pattern and, even more so, when the NCRF patterns are used.

      The problems in understanding the CCG data are indirectly caused by the lack of a critical analysis of what is happening in the responses that reveal the border ownership signals, as in Fig.2. Let's put it bluntly: are border ownership signals excitatory or inhibitory? As the authors pointed out in their rebuttal, Zhang and von der Heydt (2010, JNS) did experiments to answer this question, but I do not agree with the authors' rebuttal letter about what Zhang and von der Heydt (2010) reported. If you examine Zhang and von der Heydt's Figure 6, you see that the major effect of stimulating border ownership neurons is suppression from the non-preferred side. That result is consistent with many papers on the NCRF (many cited by the authors) that indicate that it is mostly suppressive. That experimental fact about border ownership should be mentioned in the present paper.

      What I should have pointed out in the first round, but didn't understand then, is that there is a disconnect between the border ownership laminar analysis (Figure 2) and the laminar correlations with CCGs (Figures 3-5) because the CCGs are not limited to border ownership neurons (or at least we are not told they were limited to them). So the CCG results are not mostly about border ownership; they are about the difference between signal flow in responses to small drifting Gabor patterns vs big flashed squares. Since only 21% of all recorded neurons were border ownership neurons, it is likely that most of the CCG statistics are based on neurons that do not show border ownership. Nevertheless, Figures 3 and 4 are very useful for the study of signal flow in the NCRF. It wasn't clear to me, and I think the authors could make it clearer, what those figures are about. I also wonder if it might be possible to make a stronger link with border ownership by restricting the CCG analysis to pairs of neurons in which one neuron is a border ownership neuron. Are there enough data?

      My critique of the CCG analysis applies to Figure 5 also. That figure shows a weak correlation of CCG asymmetry with Border Ownership Index. Perhaps a stronger correlation might be present if the population were restricted to the much smaller population of neuron pairs that had at least one border ownership neuron.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review): 

      Zhu and colleagues used high-density Neuropixel probes to perform laminar recordings in V1 while presenting either small stimuli that stimulated the classical receptive field (CRF) or large stimuli whose border straddled the RF to provide nonclassical RF (nCRF) stimulation. Their main question was to understand the relative contribution of feedforward (FF), feedback (FB), and horizontal circuits to border ownership (Bown), which they addressed by measuring cross-correlation across layers. They found differences in cross-correlation between feedback/horizontal (FH) and input layers during CRF and nCRF stimulation.

      Although the data looks high quality and analyses look mostly fine, I had a lot of difficulty understanding the logic in many places. Examples of my concerns are written below. 

      (1) What is the main question? The authors refer to nCRF stimulation emerging from either feedback from higher areas or horizontal connections from within the same area (e.g. lines 136 to 138 and again lines 223-232). I initially thought that the study would aim to distinguish between the two. However, the way the authors have clubbed the layers in 3D, the main question seems to be whether Bown is FF or FH (i.e., feedback and horizontal are clubbed). Is this correct? If so, I don't see the logic, since I can't imagine Bown to be purely FF. Thus, just showing differences between CRF stimulation (which is mainly expected to be FF) and nCRF stimulation is not surprising to me. 

      We thank the reviewer for their thoughtful comments. As explained in the discussion, we grouped cortical layers to reduce uncertainty in precisely assigning laminar boundaries and to increase statistical power. Consequently, this limits our ability to distinguish the relative contributions of feedback inputs, primarily targeting layers 1 and 6, and horizontal connections, mainly within layers 2/3 and 5. Nevertheless, previous findings, especially regarding the rapid emergence of B<sub>own</sub> signals, suggest that feedback is more biologically plausible than horizontal-based mechanisms.

      Importantly, the emergence of B<sub>own</sub> signals in the primate brain should not be taken for granted. Direct physiological evidence that distinguishes feedforward from feedback/horizontal mechanisms has been lacking. While we agree it is unlikely that B<sub>own</sub> is mediated solely by feedforward processing, we felt it was necessary to test this empirically, particularly using high-resolution laminar recordings.

      As discussed, feedforward models of B<sub>own</sub> have been proposed (e.g., Super, Romeo, and Keil, 2010; Sakai and Nishimura, 2006). These could, in theory, be supported by more general nCRF modulations arising through early feedforward inhibition, such as that observed in the retinogeniculate pathway (e.g., Webb, Tinsley, Vincent and Derrington, 2005; Blitz and Regehr, 2005; Alitto and Usrey, 2008). However, most B<sub>own</sub> models rely heavily on response latency, yet very few studies have recorded across layers or areas simultaneously to address this directly. Notably, recent findings in area V4 show that B<sub>own</sub> signals emerge earlier in deep layers than in granular (input) layers, suggesting a non-feedforward origin (Franken and Reynolds, 2021).

      Furthermore, although previous studies have shown that the nCRF can modulate firing rates and the timing of neuronal firing across layers, our findings go beyond these effects. We provide clear evidence that nCRF modulation also alters precise spike timing relationships and interlaminar coordination, and that the magnitude of nCRF modulation depends on these interlaminar interactions. This supports the idea that B<sub>own</sub>, or more general nCRF modulation, involves more than local rate changes, reflecting layer-specific network dynamics consistent with feedback or lateral integration.

      (2) Choice of layers for cross-correlation analysis: In the Introduction, and also in Figure 3C, it is mentioned that FF inputs arrive in 4C and 6, while FB/Horizontal inputs arrive at "superficial" and "deep", which I take as layer 2/3 and 5. So it is not clear to me why (i) layer 4A/B is chosen for analysis for Figure 3D (I would have thought layer 6 should have been chosen instead) and (ii) why Layers 5 and 6 are clubbed. 

      We thank the reviewer for raising this important point. The confusion likely stems from our use of the terms “superficial” and “deep” layers when describing the targets of feedback/horizontal inputs. To clarify, by “superficial” and “deep,” we specifically refer to layers 1–3 and layers 5–6, respectively, as illustrated in Figure 3C. Feedback and horizontal inputs largely avoid the whole of layer 4, including both 4C and 4A/B.

      We also emphasize that the classification of layers as feedforward or feedback/horizontal recipients is relative rather than absolute. For example, although layer 6 receives both feedforward and feedback/horizontal inputs, it contains a higher proportion of feedback/horizontal inputs compared to layers 4C and 4A/B. 

      We had addressed this rationale in the Discussion, but recognize it may not have been sufficiently emphasized. We have revised the main text accordingly to clarify this point for readers in the final manuscript version.

      (3) Addressing the main question using cross-correlation analysis: I think the nice peaks observed in Figure 3B for some pairs show how spiking in one neuron affects the spiking in another one, with the delay in cross-correlation function arising from the conduction delay. This is shown nicely during CRF stimulation in Figure 3D between 4C -> 2/3, for example. However, the delay (positive or negative) is constrained by anatomical connectivity. For example, unless there are projections from 2/3 back to 4C which causes firing in a 2/3 layer neuron to cause a spike in a layer 4 neuron, we cannot expect to get a negative delay no matter what kind of stimulation (CRF versus nCRF) is used. 

      We thank the reviewer for the insightful comment. The observation that neurons within FH<sub>i</sub> laminar compartments (layers 2/3, 5/6) can lead those in layer 4 (4C, 4A/B) during nCRF stimulation may indeed seem unexpected. However, several anatomical pathways could mediate the propagation of B<sub>own</sub> signals from FH<sub>i</sub> compartments to layer 4. We have revised the Discussion section in the final version of the manuscript to address this point explicitly.

      In Macaque V1, projections from layers 2/3 to 4A/B have been documented (Blasdel et al., 1985; Callaway and Wiser, 1996), and neurons in 4A/B often extend apical dendrites into layers 2/3 (Lund, 1988; Yoshioka et al., 1994). Although direct projections from layers 2/3 to 4C are generally sparse (Callaway, 1998), a subset of neurons in the lower part of layer 3 can give off collateral axons to 4C (Lund and Yoshioka, 1991). Additionally, some 4C neurons extend dendrites into 4B, enabling potential dendritic integration of inputs from more superficial layers (Somogyi and Cowey, 1981; Mates and Lund, 1983; Yabuta and Callaway, 1998). Sparse connections from 2/3 to layer 4 have also been reported in cat V1 (Binzegger, Douglas and Martin, 2004). Moreover, layers 2/3 may influence 4C neurons disynaptically, without requiring dense monosynaptic connections. 

      Importantly, while CCGs can suggest possible circuit arrangements, functional connectivity may arise through mechanisms not fully captured by traditional anatomical tracing. Indeed, the apparent discrepancy between anatomical and functional data is not uncommon. For example, although 4B is known to receive anatomical input primarily from 4Cα, but not 4Cβ, photostimulation experiments have shown that 4B neurons can also be functionally driven by 4Cβ (Sawatari and Callaway, 1996). Our observation of functional inputs from layers 2/3 to layer 4 is also consistent with prior findings in rodent V1, where CCG analysis (e.g., Figure 7 in Senzai, Fernandez-Ruiz and Buzsaki, 2019) or photostimulation (Xu et al., 2016) revealed similar pathways. 

      Layers 5/6 provide dense projections to layers 4A/B (Lund, 1988; Callaway, 1998). In particular, layer 6 pyramidal neurons, especially the subset classified as Type 1 cells, project substantially to layer 4C (Wiser and Callaway, 1996; Fitzpatrick et al., 1985). 

      Reviewer #2 (Public review): 

      Summary: 

      The authors present a study of how modulatory activity from outside the classical receptive field (cRF) differs from cRF stimulation. They study neural activity across the different layers of V1 in two anesthetized monkeys using Neuropixels probes. The monkeys are presented with drifting gratings and border-ownership tuning stimuli. They find that border-ownership tuning is organized into columns within V1, which is unexpected and exciting, and that the flow of activity from cell-to-cell (as judged by cross-correlograms between single units) is influenced by the type of visual stimulus: border-ownership tuning stimuli vs. drifting-grating stimuli.

      Strengths: 

      The questions addressed by the study are of high interest, and the use of Neuropixels probes yields extremely high numbers of single-units and cross-correlation histograms (CCHs) which makes the results robust. The study is well-described. 

      Weaknesses: 

      The weaknesses of the study are (a) the use of anesthetized animals, which raises questions about the nature of the modulatory signal being measured and the underlying logic of why a change in visual stimulus would produce a reversal in information flow through the cortical microcircuit and (b) the choice of visual stimuli, which do not uniquely isolate feedforward from feedback influences. 

      (1) The modulation latency seems quite short in Figure 2C. Have the authors measured the latency of the effect in the manuscript and how it compares to the onset of the visually driven response? It would be surprising if the latency was much shorter than 70ms given previous measurements of BO and figure-ground modulation latency in V2 and V1. On the same note, it might be revealing to make laminar profiles of the modulation (i.e. preferred - non-preferred border orientation) as it develops over time. Does the modulation start in feedback recipient layers? 

      (2) Can the authors show the average time course of the response elicited by preferred and non-preferred border ownership stimuli across all significant neurons?

      We thank the reviewer for the insightful comment—this is indeed an important and often overlooked point. As noted in the Discussion, B<sub>own</sub> modulation differs from other forms of figure-ground modulation (e.g., Lamme et al., 1998) in that it can emerge very rapidly in early visual cortex—within ~10–35 ms after response onset (Zhou et al., 2000; Sugihara et al., 2011). This rapid emergence has been interpreted as evidence for the involvement of fast feedback inputs, which can propagate up to ten times faster than horizontal connections (Girard et al., 2001). Moreover, interlaminar interactions via monosynaptic or disynaptic connections can occur on very short timescales (a few milliseconds), further complicating efforts to disentangle feedback influences based solely on latency.

      Thus, while the early onset of modulation in our data may appear surprising, it is consistent with prior B<sub>own</sub> findings, and likely reflects a combination of fast feedback and rapid interlaminar processing. This makes it challenging to use conventional latency measurements to resolve laminar differences in B<sub>own</sub> modulation. Latency comparisons are well known to be susceptible to confounds such as variability in response onset, luminance, contrast, stimulus size, and other sensory parameters. 

      Although we did not explicitly quantify the latency of B<sub>own</sub> modulation in this manuscript, our cross-correlation analysis provides a more sensitive and temporally resolved measure of interlaminar information flow. We therefore focused on this approach rather than laminar modulation profiles, as it more directly addresses our primary research question.

      (3) The logic of assuming that cRF stimulation should produce the opposite signal flow to border-ownership tuning stimuli is worth discussing. I suspect the key difference between the stimuli is that, with drifting gratings as the cRF stimulus, the movement of the stimulus continually refreshes the retinal image, leading to continuous feedforward dominance of the signals in V1. Had they used a static grating, the spiking during the sustained portion of the response might also show more influence of feedback/horizontal connections. Do the initial spikes fired in response to the border-ownership tuning stimuli show the feedforward pattern of responses? The authors state that they did not look at cross-correlations during the initial response, but if they do, do they see the feedforward-dominated pattern? The jitter CCH analysis might suffice in correcting for the response transient.

      We thank the reviewer for the insightful comment. As noted in the final Results section, our CRF and nCRF stimulation paradigms differ in respects beyond the presence or absence of nonclassical modulation, including stimulus properties within the CRF.

      We agree with the reviewer’s speculation that drifting gratings may continually refresh the retinal image, promoting sustained feedforward dominance in V1, whereas static gratings might allow greater influence from feedback/horizontal inputs during the sustained response. Likewise, the initial response to the B<sub>own</sub> stimulus could be dominated by feedforward activity before feedback/horizontal influences arrive. 

      This contrast was a central motivation for our experimental design: we deliberately used two stimulus conditions — drifting gratings to emphasize feedforward processing, and B<sub>own</sub> stimuli, which are known to engage feedback modulation — to test whether these two conditions yield different patterns of interlaminar information flow. Our results confirm that they do. While we did not separately analyze the very initial spike period, our focus is on interlaminar information flow during the sustained response, which serves as the primary measure of feedback/horizontal engagement in this study.

      Finally, beyond this direct comparison, we show in Figure 5 that under nCRF stimulation alone, the direction and strength of interlaminar information flow correlate with the magnitude of B<sub>own</sub> modulation, further supporting the idea that our cross-correlation approach reveals functionally meaningful differences in cortical processing.

      (4) The term "nCRF stimulation" is not appropriate because the CRF is stimulated by the light/dark edge. 

      We thank the reviewer for the comment. As noted in the Introduction, nCRF effects described in the literature invariably involve stimulation both inside and outside the CRF. Our use of the term “nCRF stimulation” refers to this experimental paradigm, rather than suggesting that the CRF itself is unstimulated. We hope this clarifies our use of the term.

      Reviewer #3 (Public review): 

      Summary: 

      The paper by Zhu et al is on an important topic in visual neuroscience, the emergence in the visual cortex of signals about figures and ground. This topic also goes by the name border ownership. The paper utilizes modern recording techniques very skillfully to extend what is known about border ownership. It offers new evidence about the prevalence of border ownership signals across different cortical layers in V1 cortex. Also, it uses pairwise cross-correlation to study signal flow under different conditions of visual stimulation that include the border ownership paradigm. 

      Strengths: 

      The paper's strengths are its use of multi-electrode probes to study border ownership in many neurons simultaneously across the cortical layers in V1, and its innovation of using cross-correlation between cortical neurons -- when they are viewing border-ownership patterns or instead are viewing grating patterns restricted to the classical receptive field (CRF).

      Weaknesses: 

      The paper's weaknesses are its largely incremental approach to the study of border ownership and the lack of a critical analysis of the cross-correlation data. The paper as it is now does not advance our understanding of border ownership; it mainly confirms prior work, and it does not challenge or revise consensus beliefs about mechanisms. However, it is possible that, in the rich dataset the authors have obtained, they do possess data that could be added to the paper to make it much stronger. 

      Critique: 

      The border ownership data on V1 offered in the paper replicates experimental results obtained by Zhou and von der Heydt (2000) and confirms the earlier results using the same analysis methods as Zhou. The incremental addition is that the authors found border ownership in all cortical layers extending Zhou's results that were only about layer 2/3. 

      The cross-correlation results show that the pattern of the cross-correlogram (CCG) is influenced by the visual pattern being presented. However, the results are not analyzed mechanistically, and the interpretation is unclear. For instance, the authors show in Figure 3 (and in Figure S2) that the peak of the CCG can indicate layer 2/3 excites layer 4C when the visual stimulus is the border ownership test pattern, a large square 8 deg on a side. But how can layer 2/3 excite layer 4C? The authors do not raise or offer an answer to this question. Similar questions arise when considering the CCG of layer 4A/B with layer 2/3. What is the proposed pathway for layer 2/3 to excite 4A/B? Other similar questions arise for all the interlaminar CCG data that are presented. What known functional connections would account for the measured CCGs? 

      We thank the reviewer for raising this important point. As noted in our response to a previous comment, several anatomical pathways could mediate apparent functional inputs from layers 2/3 to 4C and 4A/B. In macaque V1, projections from layers 2/3 to 4A/B have been documented (Blasdel et al., 1985; Callaway and Wiser, 1996), and neurons in 4A/B often extend apical dendrites into layers 2/3 (Lund, 1988; Yoshioka et al., 1994). Although direct projections from layers 2/3 to 4C are generally sparse (Callaway, 1998), a subset of lower layer 3 neurons can give off collateral axons to 4C (Lund and Yoshioka, 1991). Some 4C neurons also extend dendrites into 4B, potentially allowing dendritic integration of inputs from more superficial layers (Somogyi and Cowey, 1981; Mates and Lund, 1983; Yabuta and Callaway, 1998). Sparse connections from 2/3 to layer 4 have also been reported in cat V1 (Binzegger et al., 2004).

      Moreover, layers 2/3 may influence 4C neurons disynaptically, without requiring dense monosynaptic connections. While CCGs suggest possible circuit arrangements, functional connectivity may arise through mechanisms not fully captured by anatomical tracing, and apparent discrepancies between anatomical and functional data are not uncommon. For example, although 4B is known to receive anatomical input primarily from 4Cα, 4B neurons can also be functionally driven by 4Cβ using photostimulation (Sawatari and Callaway, 1996). Our observation of functional inputs from layers 2/3 to layer 4 is also consistent with prior findings in rodent V1, where CCG analysis (e.g., Figure 7 in Senzai, Fernandez-Ruiz and Buzsaki, 2019) or photostimulation (Xu et al., 2016) revealed similar pathways. 

      Layers 5/6 also provide dense projections to layers 4A/B (Lund, 1988; Callaway, 1998). In particular, layer 6 pyramidal neurons, especially the subset classified as Type 1 cells, project substantially to layer 4C (Wiser and Callaway, 1996; Fitzpatrick et al., 1985). 

      We have revised the Discussion section to explicitly address these points and clarify the potential anatomical and functional pathways underlying the measured interlaminar CCGs, highlighting how inputs from layers 2/3 and 5/6 to layer 4 can be mediated via both direct and indirect connections.

      The problems in understanding the CCG data are indirectly caused by the lack of a critical analysis of what is happening in the responses that reveal the border ownership signals, as in Figure 2. Let's put it bluntly - are border ownership signals excitatory or inhibitory? The reason I raise this question is that the present authors insightfully place border ownership as examples of the action of the non-classical receptive field (nCRF) of cortical cells. Most previous work on the nCRF (many papers cited by the authors) reveal the nCRF to be inhibitory or suppressive. In order to know whether nCRF signals are excitatory or inhibitory, one needs a baseline response from the CRF, so that when you introduce nCRF signals you can tell whether the change with respect to the CRF is up or down. As far as I know, prior work on border ownership has not addressed this question, and the present paper doesn't either. This is where the rich dataset that the present authors possess might be used to establish a fundamental property of border ownership. 

      Then we must go back to consider what knowing the sign of the border ownership signal would mean for interpreting the CCG data. If the border ownership signals from extrastriate feedback or, alternatively, from horizontal intrinsic connections, are excitatory, they might provide a shared excitatory input to pairs of cells that would show up in the CCG as a peak at 0 delay. However, if the border ownership signals are inhibitory, they might work by exciting only inhibitory neurons in V1. This could have complicated consequences for the CCG. The interpretation of the CCG data in the present version of the manuscript is unclear (see above). Perhaps a clearer interpretation could be developed once the authors know better what the border ownership signals are.

      We thank the reviewer for raising this fundamental and thought-provoking question. As noted, B<sub>own</sub> signals arise from nCRF, which has often been associated with suppressive effects. However, Zhang and von der Heydt (2010) provided important insight into this issue by systematically varying the placement of figure fragments outside the CRF while keeping an edge centered within the CRF. They found that contextual fragments on the preferred side of B<sub>own</sub> produce facilitation, while those on the non-preferred side produce suppression. Thus, the nCRF contribution to B<sub>own</sub> reflects both excitatory and inhibitory modulation, depending on the spatial configuration of the figure.

      These effects were well explained by their model in which feedback from grouping cells in higher areas selectively enhances or suppresses V1/V2 neuron responses, depending on their B<sub>own</sub> preference. In this framework, the B<sub>own</sub> signal itself is not inherently excitatory or inhibitory; rather, it results from the net effect of feedback, which can be either facilitative or suppressive. Importantly, it is the input that is modulated — not that the receiving neurons are necessarily inhibitory themselves.

      In the current study, our analysis focused on CCGs showing excessive coincident spiking, i.e., positive peaks, which are typically interpreted as evidence for shared excitatory input or excitatory connections. Due to the limited number of connections, we did not analyze inhibitory interactions, such as anti-correlations or delayed suppression in the CCGs, which would be expected if the reference neuron were inhibitory. Therefore, the CCGs we report here likely reflect the excitatory component of the B<sub>own</sub> signal, and possibly its upstream drive via feedback. While a full separation of excitatory and inhibitory components remains an important goal for future work, our data suggest that B<sub>own</sub> modulation is at least partially mediated through excitatory feedback input.

      My critique of the CCG analysis applies to Figure 5 also. I cannot comprehend the point of showing a very weak correlation of CCG asymmetry with Border Ownership Index, especially when what CCG asymmetry means is unclear mechanistically. Figure 5 does not make the paper stronger in my opinion. 

      We thank the reviewer for this comment. As described in the Results section for Figure 5, the observation that interlaminar information flow correlates with B<sub>own</sub> modulation is important because it demonstrates that these flow patterns are specifically related to the magnitude of B<sub>own</sub> signals, independent of the comparisons between CRF and nCRF stimulation. 

      In Figure 3, the authors show two CCGs that involve 4C--4C pairs. It would be nice to know more about such pairs. If there are any 6--6 pairs, what they look like also would be interesting. The authors also in Figure 3 show CCGs of two 4C--4A/B pairs and it would be quite interesting to know how such CCGs behave when CRF and nCRF stimuli are compared. In other words, the authors have shown us they have many data but have chosen not to analyze them further or to explain why they chose not to analyze them. It might help the paper if the authors would present all the CCG types they have. This suggestion would be helpful when the authors know more about the sign of border ownership signals, as discussed at length above.

      We thank the reviewer for the insightful comment. The rationale for selecting specific laminar pairs is described in the Results section after Figure 3C and further discussed in the Discussion. In brief, we focused on CCGs computed from pairs in which one neuron resided in laminar compartments receiving feedback/horizontal inputs (layers 2/3 and 5/6) and the other within compartments relatively devoid of these inputs (layers 4C and 4A/B).

      To mitigate uncertainty in defining exact laminar boundaries and to maximize statistical power, we combined some anatomical layers into distinct laminar compartments. This approach allowed us to compare the relative spike timing between neuronal pairs during CRF and nCRF stimulation. If feedback/horizontal inputs contribute more during nCRF than CRF stimulation, we expect this to be reflected in the lead-lag relationships of the CCGs. While other pairs (e.g., 5/6–5/6 or 4C–4A/B) could in principle be analyzed, the hypothesized patterns for these pairs are less clear, and thus they were not the focus of our study. Nonetheless, these additional pairs represent interesting directions for future work.
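
      The lead-lag reading of a CCG described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the 1 ms bins, the ±10 ms window, and the simple asymmetry index are all assumptions made for illustration, and no jitter correction is applied. A positive lag here means the target neuron fires after the reference neuron.

```python
import numpy as np

def cross_correlogram(ref_spikes, target_spikes, max_lag_ms=10.0, bin_ms=1.0):
    """Count (target - reference) spike-time differences in bins centered on each lag."""
    ref = np.asarray(ref_spikes, dtype=float)
    tgt = np.sort(np.asarray(target_spikes, dtype=float))
    n_bins = int(round(2 * max_lag_ms / bin_ms)) + 1
    half = max_lag_ms + bin_ms / 2.0
    edges = np.linspace(-half, half, n_bins + 1)
    counts = np.zeros(n_bins, dtype=np.int64)
    for t in ref:
        # Restrict to target spikes that can fall inside the lag window.
        lo = np.searchsorted(tgt, t - half, side="left")
        hi = np.searchsorted(tgt, t + half, side="right")
        counts += np.histogram(tgt[lo:hi] - t, bins=edges)[0]
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, counts

def peak_lag(centers, counts):
    """Lag (ms) of the largest CCG bin; > 0 means the reference neuron leads."""
    return float(centers[np.argmax(counts)])

def lead_lag_asymmetry(centers, counts):
    """Hypothetical index in [-1, 1]: (positive-lag - negative-lag) / their sum."""
    pos = counts[centers > 0].sum()
    neg = counts[centers < 0].sum()
    total = pos + neg
    return 0.0 if total == 0 else float(pos - neg) / float(total)

# Synthetic example: the target neuron fires 2 ms after every reference spike.
ref_train = np.arange(0.0, 5000.0, 50.0)
target_train = ref_train + 2.0
lags, ccg = cross_correlogram(ref_train, target_train)
print(peak_lag(lags, ccg), lead_lag_asymmetry(lags, ccg))
```

      In this toy case the peak sits at a positive lag, so the reference neuron leads. In real data the raw CCG would also need correction for stimulus-locked covariation (for example by jittering) before the peak is interpreted.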

    1. Scaling

      The set of questions on a survey cannot be considered a scale unless a scaling process was followed to identify the questions and determine how the responses would be combined. So, just because a set of questions on a survey looks like a scale, it collects data using the same response scale, and it is even analyzed like a scale, it isn’t a real scale unless some type of scaling process was used to create it.

    1. Reviewer #3 (Public review):

      Summary:

      Nucleotide modifications are important regulators of biological function; however, until recently, their study has been limited by the availability of appropriate analytical methods. Oxford Nanopore direct RNA sequencing preserves nucleotide modifications, permitting their study, but many nucleotide modifications lack an available base-caller that can identify them accurately. Furthermore, existing tools are computationally intensive, and their results can be difficult to interpret.

      Cheng et al. present SegPore, a method designed to improve the segmentation of direct RNA sequencing data and boost the accuracy of modified base detection.

      Strengths:

      This method is well described and has been benchmarked against a range of publicly available base callers that have been designed to detect modified nucleotides.

      Weaknesses:

      However, the manuscript has a significant drawback in its current version. The most recent nanopore RNA base callers can distinguish between different ribonucleotide modifications, but SegPore has not been benchmarked against these models.

      The manuscript would be strengthened by benchmarking against the rna004_130bps_hac@v5.1.0 and rna004_130bps_sup@v5.1.0 dorado models, which are reported to detect m5C, m6A_DRACH, inosine_m6A and PseU.

      A clear demonstration that SegPore also outperforms the newer RNA base caller models will confirm the utility of this method.

    2. Author response:

      The following is the authors’ response to the original reviews.

      We thank all the reviewers for their constructive comments. We have carefully considered your feedback and revised the manuscript accordingly. The major concern raised was the applicability of SegPore to the RNA004 dataset. To address this, we compared SegPore with f5c and Uncalled4 on RNA004, and found that SegPore demonstrated improved performance, as shown in Table 2 of the revised manuscript.

      Following the reviewers’ recommendations, we updated Figures 3 and 4. Additionally, we added one table and three supplementary figures to the revised manuscript:

      · Table 2: Segmentation benchmark on RNA004 data

      · Supplementary Figure S4: RNA translocation hypothesis illustrated on RNA004 data

      · Supplementary Figure S5: Illustration of Nanopolish raw signal segmentation with eventalign results

      · Supplementary Figure S6: Running time of SegPore on datasets of varying sizes

      Below, we provide a point-by-point response to your comments.

      Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors describe a new computational method (SegPore), which segments the raw signal from nanopore-direct RNA-Seq data to improve the identification of RNA modifications. In addition to signal segmentation, SegPore includes a Gaussian Mixture Model approach to differentiate modified and unmodified bases. SegPore uses Nanopolish to define a first segmentation, which is then refined into base and transition blocks. SegPore also includes a modification prediction model that is included in the output. The authors evaluate the segmentation in comparison to Nanopolish and Tombo, and they evaluate the impact on m6A RNA modification detection using data with known m6A sites. In comparison to existing methods, SegPore appears to improve the ability to detect m6A, suggesting that this approach could be used to improve the analysis of direct RNA-Seq data.

      Strengths:

      SegPore addresses an important problem (signal data segmentation). By refining the signal into transition and base blocks, noise appears to be reduced, leading to improved m6A identification at the site level as well as for single-read predictions. The authors provide a fully documented implementation, including a GPU version that reduces run time. The authors provide a detailed methods description, and the approach to refine segments appears to be new.

      Weaknesses:

      In addition to Nanopolish and Tombo, f5c and Uncalled4 can also be used for segmentation; however, the comparison to these methods is not shown.

      The method was only applied to data from the RNA002 direct RNA-Sequencing version, which is no longer available; it currently remains unclear whether the method still works on RNA004.

      Thank you for your comments.

      To clarify the background, there are two kits for Nanopore direct RNA sequencing: RNA002 (the older version) and RNA004 (the newer version). Oxford Nanopore Technologies (ONT) introduced the RNA004 kit in early 2024 and has since discontinued RNA002. Consequently, most public datasets are based on RNA002, with relatively few available for RNA004 (as of 30 June 2025).

      Nanopolish and Tombo were developed for raw signal segmentation and alignment using RNA002 data, whereas f5c and Uncalled4 are the only two tools that support RNA004 data. Since the development of SegPore began in January 2022, we initially focused on RNA002 due to its data availability. Accordingly, our original comparisons were made against Nanopolish and Tombo using RNA002 data.

      We have now updated SegPore to support RNA004 and compared its performance against f5c and Uncalled4 on three public RNA004 datasets.

      As shown in Table 2 of the revised manuscript, SegPore outperforms both f5c and Uncalled4 in raw signal segmentation. Moreover, the jiggling translocation hypothesis underlying SegPore is further supported, as shown in Supplementary Figure S4.

      The overall improvement in accuracy appears to be relatively small.

      Thank you for the comment.

      We understand that the improvements shown in Tables 1 and 2 may appear modest at first glance due to the small differences in the reported standard deviation (std) values. However, even small absolute changes in std can correspond to substantial relative reductions in noise, especially when the total variance is low.

      To better quantify the improvement, we assume that approximately 20% of the std for Nanopolish, Tombo, f5c, and Uncalled4 arises from noise. Using this assumption, we calculate the relative noise reduction rate of SegPore as follows:

      Noise reduction rate = (baseline std − SegPore std) / (0.2 × baseline std)

      Based on this formula, the average noise reduction rates across all datasets are:

      - SegPore vs Nanopolish: 49.52%

      - SegPore vs Tombo: 167.80%

      - SegPore vs f5c: 9.44%

      - SegPore vs Uncalled4: 136.70%

      These results demonstrate that SegPore can reduce the noise level by at least 9% given a noise level of 20%, which we consider a meaningful improvement for downstream tasks, such as base modification detection and signal interpretation. The high noise reduction rates observed in Tombo and Uncalled4 (over 100%) suggest that their actual noise proportion may be higher than our 20% assumption.

      We acknowledge that this 20% noise level assumption is an approximation. Our intention is to illustrate that SegPore provides measurable improvements in relative terms, even when absolute differences appear small.
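
      The formula above can be written as a short function. This is a sketch only: the function name is hypothetical, the std values in the example are made up for illustration and are not taken from Tables 1 or 2, and the 20% noise fraction is the assumption stated above.

```python
def noise_reduction_rate(baseline_std, segpore_std, noise_fraction=0.2):
    """Relative noise reduction, assuming `noise_fraction` of the baseline std is noise."""
    if baseline_std <= 0:
        raise ValueError("baseline_std must be positive")
    if not 0 < noise_fraction <= 1:
        raise ValueError("noise_fraction must be in (0, 1]")
    return (baseline_std - segpore_std) / (noise_fraction * baseline_std)

# Illustrative values only (not from the manuscript).
modest = noise_reduction_rate(3.0, 2.7)   # about 0.5, i.e. roughly 50%
large = noise_reduction_rate(3.0, 2.0)    # above 1.0, i.e. more than 100%
print(f"{modest:.2%}", f"{large:.2%}")
```

      A rate above 100% means the observed reduction in std exceeds the assumed noise budget, which is why such values suggest that the true noise fraction for that baseline is larger than 20%.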

      The run time and resources that are required to run SegPore are not shown; however, it appears that the GPU version is essential, which could limit the application of this method in practice.

      Thank you for your comment.

      Detailed instructions for running SegPore are provided on GitHub (https://github.com/guangzhaocs/SegPore). Regarding computational resources, SegPore currently requires one CPU core and one NVIDIA GPU to perform the segmentation task efficiently.

      We present SegPore’s runtime for typical datasets in Supplementary Figure S6 of the revised manuscript. For a typical 1 GB fast5 file, segmentation takes approximately 9.4 hours using a single NVIDIA DGX‑1 V100 GPU and one CPU core.

      Currently, GPU acceleration is essential to achieve practical runtimes with SegPore. We acknowledge that this requirement may limit accessibility in some environments. To address this, we are actively working on a full C++ implementation of SegPore that will support CPU-only execution. While development is ongoing, we aim to release this version in a future update.

      Reviewer #2 (Public review):

      Summary:

      The work seeks to improve the detection of RNA m6A modifications using Nanopore sequencing through improvements in raw data analysis. These improvements are said to be in the segmentation of the raw data, although the work appears to position the alignment of raw data to the reference sequence and some further processing as part of the segmentation, and result statistics are mostly shown on the 'data-assigned-to-kmer' level.

      As such, the title, abstract, and introduction stating the improvement of just the 'segmentation' does not seem to match the work the manuscript actually presents, as the wording seems a bit too limited for the work involved.

      The work itself shows minor improvements in m6Anet when replacing Nanopolish eventalign with this new approach, but clear improvements in the distributions of data assigned per kmer. However, these assignments were improved well enough to enable m6A calling from them directly, both at site-level and at read-level.

      Strengths:

      A large part of the improvements shown appear to stem from the addition of extra, non-base/kmer specific, states in the segmentation/assignment of the raw data, removing a significant portion of what can be considered technical noise for further analysis. Previous methods enforced the assignment of all raw data, forcing a technically optimal alignment that may lead to suboptimal results in downstream processing as data points could be assigned to neighbouring kmers instead, while random noise that is assigned to the correct kmer may also lead to errors in modification detection.

      For an optimal alignment between the raw signal and the reference sequence, this approach may yield improvements for downstream processing using other tools.

      Additionally, the GMM used for calling the m6A modifications provides a useful, simple, and understandable logic to explain the reason a modification was called, as opposed to the black-box models that are nowadays often employed for these types of tasks.
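
      To illustrate why a Gaussian mixture gives an explainable call, here is a minimal two-component, one-dimensional EM sketch. It is not SegPore's implementation: the function names, the initialization, and the synthetic signal means (100 and 110, in arbitrary current units) are assumptions chosen for the example.

```python
import numpy as np

def component_posteriors(x, weight, mu, sigma):
    """Posterior probability of each component for every value in x."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    dens = weight * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2)
    dens = dens / (sigma * np.sqrt(2.0 * np.pi))
    return dens / (dens.sum(axis=1, keepdims=True) + 1e-300)

def fit_two_gaussians(x, n_iter=200):
    """Fit a 2-component 1D Gaussian mixture by expectation-maximization."""
    x = np.asarray(x, dtype=float)
    mu = np.array([np.percentile(x, 25), np.percentile(x, 75)])
    sigma = np.array([x.std(), x.std()])
    weight = np.array([0.5, 0.5])
    for _ in range(n_iter):
        resp = component_posteriors(x, weight, mu, sigma)   # E-step
        nk = resp.sum(axis=0)                               # M-step
        weight = nk / x.size
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        sigma = np.maximum(np.sqrt(var), 1e-6)
    return weight, mu, sigma

# Synthetic per-read signal means at one site: two overlapping populations.
rng = np.random.default_rng(0)
signal = np.concatenate([rng.normal(100.0, 2.0, 600), rng.normal(110.0, 2.0, 400)])
weight, mu, sigma = fit_two_gaussians(signal)

# The call for a single read is just its posterior under each component.
print(component_posteriors([98.0, 112.0], weight, mu, sigma))
```

      The posterior for each read is a direct function of its distance from two fitted means, so the reason for a call can be read off the fitted parameters, which is the interpretability advantage noted above.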

      Weaknesses:

      The work seems limited in applicability largely due to the focus on the R9's 5mer models. The R9 flow cells are phased out and not available to buy anymore. Instead, the R10 flow cells with larger kmer models are the new standard, and the applicability of this tool on such data is not shown. We may expect similar behaviour from the raw sequencing data where the noise and transition states are still helpful, but the increased kmer size introduces a large amount of extra computing required to process data and without knowledge of how SegPore scales, it is difficult to tell how useful it will really be. The discussion suggests possible accuracy improvements moving to 7mers or 9mers, but no reason why this was not attempted.

      Thank you for pointing out this important limitation. Please refer to our response to Point 1 of Reviewer 1 for SegPore’s performance on RNA004 data. Notably, the jiggling behavior is also observed in RNA004 data, and SegPore achieves better performance than both f5c and Uncalled4.

      The increased k-mer size in RNA004 affects only the training phase of SegPore (refer to Supplementary Note 1, Figure 5 for details on the training and testing phases). Once the baseline means and standard deviations for each k-mer are established, applying SegPore to RNA004 data proceeds similarly to RNA002. This is because each k-mer in the reference sequence has, at most, two states (modified and unmodified). While the larger k-mer size increases the size of the parameter table, it does not increase the computational complexity during segmentation. Although estimating the initial k-mer parameter table requires significant time and effort on our part, it does not affect the runtime for end users applying SegPore to RNA004 data.
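
      The scaling argument above can be sketched as follows. The table layout is an assumption for illustration (one placeholder mean and std per k-mer, without the separate modified and unmodified states), and the helper names are hypothetical. The point is that the table grows as 4^k, while each position in a read still costs a single lookup.

```python
from itertools import product

ALPHABET = "ACGU"

def build_parameter_table(k):
    """Placeholder (mean, std) for every k-mer; real values would come from training."""
    return {"".join(p): (0.0, 1.0) for p in product(ALPHABET, repeat=k)}

def kmers(sequence, k):
    """All overlapping k-mers of a reference sequence."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]

def lookup_all(sequence, table, k):
    """One dictionary lookup per k-mer position, independent of the table size."""
    return [table[kmer] for kmer in kmers(sequence, k)]

for k in (5, 7, 9):
    print(k, len(ALPHABET) ** k)   # number of table entries for each k

table5 = build_parameter_table(5)
print(len(lookup_all("GGACUAC", table5, 5)))
```

      Memory for the table grows quickly with k, but the per-position work during segmentation stays a constant-time lookup in this simplified picture.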

      Extending SegPore from 5-mers to 7-mers or 9-mers for RNA002 data would require substantial effort to retrain the model and generate sufficient training data. Additionally, such an extension would make SegPore’s output incompatible with widely used upstream and downstream tools such as Nanopolish and m6Anet, complicating integration and comparison. For these reasons, we leave this extension for future work.

      The manuscript suggests the eventalign results are improved compared to Nanopolish. While this is believably shown to be true (Table 1), the effect on the use case presented, downstream differentiation between modified and unmodified status on a base/kmer, is likely limited as during actual modification calling the noisy distributions are usually 'good enough', and not skewed significantly in one direction to really affect the results too terribly.

      Thank you for your comment. While current state-of-the-art (SOTA) methods perform well on benchmark datasets, there remains significant room for improvement. Most SOTA evaluations are based on limited datasets, primarily covering DRACH motifs in human and mouse transcriptomes. However, m6A modifications can also occur in non-DRACH motifs, where current models may underperform. Additionally, other RNA modifications—such as pseudouridine, inosine, and m5C—are less studied, and their detection may benefit from improved signal modeling.

      We would also like to emphasize that raw signal segmentation and RNA modification detection are distinct tasks. SegPore focuses on the former, providing a cleaner, more interpretable signal that can serve as a foundation for downstream tasks. Improved segmentation may facilitate the development of more accurate RNA modification detection algorithms by the community.

      Scientific progress often builds incrementally through targeted improvements to foundational components. We believe that enhancing signal segmentation, as SegPore does, contributes meaningfully to the broader field—the full impact will become clearer as the tool is adopted into more complex workflows.

      Furthermore, looking at alternative approaches where this kind of segmentation could be applied, Nanopolish uses the main segmentation+alignment for a first alignment and follows up with a form of targeted local realignment/HMM test for modification calling (and for training too), decreasing the need for the near-perfect segmentation+alignment this work attempts to provide. Any tool applying a similar strategy probably largely negates the problems this manuscript aims to improve upon.

      We thank the reviewer for this insightful comment.

      To clarify, Nanopolish provides three independent commands: polya, eventalign, and call-methylation.

      - The polya command identifies the adapter, poly(A) tail, and transcript region in the raw signal.

      - The eventalign command aligns the raw signal to a reference sequence, assigning a signal segment to individual k-mers in the reference.

      - The call-methylation command detects methylated bases from DNA sequencing data.

      The eventalign command corresponds to “the main segmentation+alignment for a first alignment,” while call-methylation corresponds to “a form of targeted local realignment/HMM test for modification calling,” as mentioned in the reviewer’s comment. SegPore’s segmentation is similar in purpose to Nanopolish’s eventalign, while its RNA modification estimation component is similar in concept to Nanopolish’s call-methylation.

      We agree the general idea may appear similar, but the implementations are entirely different. Importantly, Nanopolish’s call-methylation is designed for DNA sequencing data, and its models are not trained to recognize RNA modifications. This means they address distinct research questions and cannot be directly compared on the same RNA modification estimation task. However, it is valid to compare them on the segmentation task, where SegPore exhibits better performance (Table 1).

      We infer the reviewer may suggest that because m6Anet is a deep neural network capable of learning from noisy input, the benefit of more accurate segmentation (such as that provided by SegPore) might be limited. This concern may arise from the limited improvement of SegPore+m6Anet over Nanopolish+m6Anet in bulk analysis (Figure 3). Several factors may contribute to this observation:

      (i) For reads aligned to the same gene in the in vivo data, alignment may be inaccurate due to pseudogenes or transcript isoforms.

      (ii) The in vivo benchmark data are inherently more complex than in vitro datasets and may contain additional modifications (e.g., m5C, m7G), which can confound m6A calling by altering the signal baselines of k-mers.

      (iii) m6Anet is trained on events produced by Nanopolish and may not be optimal for SegPore-derived events.

      (iv) The benchmark dataset lacks a modification-free (IVT) control sample, making it difficult to establish a true baseline for each k-mer.

      In the IVT data (Figure 4), SegPore shows a clear improvement in single-molecule m6A identification, with a 3~4% gain in both ROC-AUC and PR-AUC. This demonstrates SegPore’s practical benefit for applications requiring higher sensitivity at the molecule level.

      As noted earlier, SegPore’s contribution lies in denoising and improving the accuracy of raw signal segmentation, which is a foundational step in many downstream analyses. While it may not yet lead to a dramatic improvement in all applications, it already provides valuable insights into the sequencing process (e.g., cleaner signal profiles in Figure 4) and enables measurable gains in modification detection at the single-read level. We believe SegPore lays the groundwork for developing more accurate and generalizable RNA modification detection tools beyond m6A.

      We have also added the following sentence in the discussion to highlight SegPore’s limited performance in bulk analysis:

      “The limited improvement of SegPore combined with m6Anet over Nanopolish+m6Anet in bulk in vivo analysis (Figure 3) may be explained by several factors: potential alignment inaccuracies due to pseudogenes or transcript isoforms, the complexity of in vivo datasets containing additional RNA modifications (e.g., m5C, m7G) affecting signal baselines, and the fact that m6Anet is specifically trained on events produced by Nanopolish rather than SegPore. Additionally, the lack of a modification-free control (in vitro transcribed) sample in the benchmark dataset makes it difficult to establish true baselines for each k-mer. Despite these limitations, SegPore demonstrates clear improvement in single-molecule m6A identification in IVT data (Figure 4), suggesting it is particularly well suited for in vitro transcription data analysis.”

      Finally, in the segmentation/alignment comparison to Nanopolish, the latter was not fitted(/trained) on the same data but appears to use the pre-trained model it comes with. For the sake of comparing segmentation/alignment quality directly, fitting Nanopolish on the same data used for SegPore could remove the influences of using different training datasets and focus on differences stemming from the algorithm itself.

      In the segmentation benchmark (Table 1), SegPore uses the fixed 5-mer parameter table provided by ONT. The hyperparameters of the HHMM are also fixed and not estimated from the raw signal data being segmented. Only in the m6A modification task does SegPore re-estimate the baselines for the modified and unmodified states of k-mers. Therefore, the comparison with Nanopolish is fair, as both tools rely on pre-defined models during segmentation.

      Appraisal:

      The authors have shown their method's ability to identify noise in the raw signal and remove it from the segmentation and alignment, reducing its influence on further analyses. Figures directly comparing the values per kmer do show a visibly improved assignment of raw data per kmer. As a replacement for Nanopolish eventalign, it seems to have a rather limited, but positive, effect on m6Anet results. For single-read-level modification calling, this work does appear to improve upon CHEUI.

      Impact:

      With the current developments for Nanopore-based modification largely focusing on Artificial Intelligence, Neural Networks, and the like, improvements made in interpretable approaches provide an important alternative that enables a deeper understanding of the data rather than providing a tool that plainly answers the question of whether a base is modified or not, without further explanation. The work presented is best viewed in the context of a workflow where one aims to get an optimal alignment between raw signal data and the reference base sequence for further processing. For example, as presented, as a possible replacement for Nanopolish eventalign. Here it might enable data exploration and downstream modification calling without the need for local realignments or other approaches that re-consider the distribution of raw data around the target motif, such as a 'local' Hidden Markov Model or Neural Networks. These possibilities are useful for a deeper understanding of the data and further tool development for modification detection works beyond m6A calling.

      Reviewer #3 (Public review):

      Summary:

      Nucleotide modifications are important regulators of biological function; however, until recently, their study has been limited by the availability of appropriate analytical methods. Oxford Nanopore direct RNA sequencing preserves nucleotide modifications, permitting their study; however, many different nucleotide modifications lack an available base-caller to accurately identify them. Furthermore, existing tools are computationally intensive, and their results can be difficult to interpret.

      Cheng et al. present SegPore, a method designed to improve the segmentation of direct RNA sequencing data and boost the accuracy of modified base detection.

      Strengths:

      This method is well-described and has been benchmarked against a range of publicly available base callers that have been designed to detect modified nucleotides.

      Weaknesses:

      However, the manuscript has a significant drawback in its current version. The most recent nanopore RNA base callers can distinguish between different ribonucleotide modifications; however, SegPore has not been benchmarked against these models.

      I recommend re-submission of the manuscript with benchmarking against the rna004_130bps_hac@v5.1.0 and rna004_130bps_sup@v5.1.0 dorado models, which are reported to detect m5C, m6A_DRACH, inosine_m6A and PseU. A clear demonstration that SegPore also outperforms the newer RNA base caller models will confirm the utility of this method.

      Thank you for highlighting this important limitation. While Dorado, the new ONT basecaller, is publicly available and supports modification-aware basecalling, suitable public datasets for benchmarking m5C, inosine, m6A, and PseU detection on RNA004 are currently lacking. Dorado’s modification-aware models are trained on ONT’s internal data, which is not publicly released. Therefore, it is not currently feasible to evaluate or directly compare SegPore’s performance against Dorado for m5C, inosine, m6A, and PseU detection.

      We would also like to emphasize that SegPore’s main contribution lies in raw signal segmentation, which is an upstream task in the RNA modification detection pipeline. To assess its performance in this context, we benchmarked SegPore against f5c and Uncalled4 on public RNA004 datasets for segmentation quality. Please refer to our response to Point 1 of Reviewer 1 for details.

      Our results show that the characteristic “jiggling” behavior is also observed in RNA004 data (Supplementary Figure S4), and SegPore achieves better segmentation performance than both f5c and Uncalled4 (Table 2).

      Recommendations for the authors:

      Reviewing Editor:

      Please note that we also received the following comments on the submission, which we encourage you to take into account:

      I took a look at the work, and from what I saw it only mentions/uses RNA002 chemistry, which is deprecated, effectively making this software unusable by anyone any more, as RNA002 is not commercially available. While the results seem promising, the authors need to show that it would work for RNA004. Notably, there is alternative software for resquiggling for RNA004 (not Tombo or Nanopolish, but the GPU-accelerated version of Nanopolish, f5c), which does support RNA004. Therefore, they need to show that SegPore works for RNA004, because otherwise it is pointless to see that this method works better than others if it does not support current sequencing chemistries and only works for deprecated chemistries, and people will keep using f5c because it's the only one that currently works for RNA004. Alternatively, if there were biological insights to be gained from the method, one could justify not implementing it for RNA004, but in this case, RNA002 has been deprecated since March 2024, and the paper is purely methodological.

      Thank you for the comment. We agree that support for current sequencing chemistries is essential for practical utility. While SegPore was initially developed and benchmarked on RNA002 due to the availability of public data, we have now extended SegPore to support RNA004 chemistry.

      To address this concern, we performed a benchmark comparison using public RNA004 datasets against tools specifically designed for RNA004, including f5c and Uncalled4. Please refer to our response to Point 1 of Reviewer 1 for details. The results show that SegPore consistently outperforms f5c and Uncalled4 in segmentation accuracy on RNA004 data.

      Reviewer #2 (Recommendations for the authors):

      Various statements are made throughout the text that require further explanation, which might actually be defined in more detail elsewhere sometimes but are simply hard to find in the current form.

      (1) Page 2, “In this technique, five nucleotides (5mers) reside in the nanopore at a time, and each 5mer generates a characteristic current signal based on its unique sequence and chemical properties (16).”

      5mer? Still on R9 or just ignoring longer range influences, relevant? It is indeed a R9.4 model from ONT.

      Thank you for the observation. We apologize for the confusion and have clarified the relevant paragraph to indicate that the method is developed for RNA002 data by default. Specifically, we have added the following sentence:

      “Two versions of the direct RNA sequencing (DRS) kits are available: RNA002 and RNA004. Unless otherwise specified, this study focuses on RNA002 data.”

      (2) Page 3, “Employ models like Hidden Markov Models (HMM) to segment the signal, but they are prone to noise and inaccuracies.”

      That's the alignment/calling part, not the segmentation?

      Thank you for the comment. We apologize for the confusion. To clarify the distinction between segmentation and alignment, we added a new paragraph before the one in question to explain the general workflow of Nanopore DRS data analysis and to clearly define the task of segmentation. The added text reads:

      “The general workflow of Nanopore direct RNA sequencing (DRS) data analysis is as follows. First, the raw electrical signal from a read is basecalled using tools such as Guppy or Dorado, which produce the nucleotide sequence of the RNA molecule. However, these basecalled sequences do not include the precise start and end positions of each ribonucleotide (or k-mer) in the signal. Because basecalling errors are common, the sequences are typically mapped to a reference genome or transcriptome using minimap2 to recover the correct reference sequence. Next, tools such as Nanopolish and Tombo align the raw signal to the reference sequence to determine which portion of the signal corresponds to each k-mer. We define this process as the segmentation task, referred to as "eventalign" in Nanopolish. Based on this alignment, Nanopolish extracts various features—such as the start and end positions, mean, and standard deviation of the signal segment corresponding to a k-mer. This signal segment or its derived features is referred to as an "event" in Nanopolish.”

      We also revised the following paragraph describing SegPore to more clearly contrast its approach:

      “In SegPore, we first segment the raw signal into small fragments using a Hierarchical Hidden Markov Model (HHMM), where each fragment corresponds to a sub-state of a k-mer. Unlike Nanopolish and Tombo, which directly align the raw signal to the reference sequence, SegPore aligns the mean values of these small fragments to the reference. After alignment, we concatenate all fragments that map to the same k-mer into a larger segment, analogous to the "eventalign" output in Nanopolish. For RNA modification estimation, we use only the mean signal value of each reconstructed event.”
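
      The concatenation step described in this paragraph can be sketched as follows. This is a simplified illustration rather than SegPore's implementation: the `Fragment` class, its `kmer_pos` field, and `merge_fragments` are hypothetical names, and the sketch assumes the alignment has already assigned each fragment to a reference k-mer position.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List, Tuple


@dataclass
class Fragment:
    start: int     # index of the first signal point (inclusive)
    end: int       # index past the last signal point (exclusive)
    kmer_pos: int  # hypothetical: reference k-mer index assigned by the alignment


def merge_fragments(fragments: List[Fragment]) -> List[Tuple[int, int, int]]:
    """Concatenate consecutive fragments that map to the same k-mer position.

    Returns one (kmer_pos, start, end) tuple per reconstructed event.
    """
    events = []
    for pos, group in groupby(fragments, key=lambda f: f.kmer_pos):
        members = list(group)
        events.append((pos, members[0].start, members[-1].end))
    return events


# Two fragments map to k-mer 0 and one fragment maps to k-mer 1.
frags = [Fragment(0, 12, 0), Fragment(12, 30, 0), Fragment(30, 41, 1)]
print(merge_fragments(frags))
```

      The sketch spans each event from the first to the last fragment of a k-mer; any signal points that the actual method masks as noise would need to be excluded separately.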

      We hope this revision clarifies the difference between segmentation and alignment in the context of our method and resolves the reviewer’s concern.

      (3) Page 4, Figure 1, “These segments are then aligned with the 5mer list of the reference sequence fragment using a full/partial alignment algorithm, based on a 5mer parameter table. For example, 𝐴𝑗 denotes the base "A" at the j-th position on the reference.”

      I think I do understand the meaning, but I do not understand the relevance of the Aj bit in the last sentence. What is it used for?

      When aligning the segments (output from Step 2) to the reference sequence in Step 3, it is possible for multiple segments to align to the same k-mer. This can occur particularly when the reference contains consecutive identical bases, such as multiple adenines (A). For example, as shown in Fig. 1A, Step 3, the first two segments (μ₁ and μ₂) are aligned to the first 'A' in the reference sequence, while the third segment is aligned to the second 'A'. In this case, the reference sequence is AACTGGTTTC...GTC, which contains exactly two consecutive 'A's at the start. The position subscript j helps to disambiguate segment alignment in regions with repeated bases.
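
      A small sketch may make the role of the position subscript concrete. It uses only the visible prefix of the example reference, and the segment-to-position mapping is a hypothetical alignment result written out by hand to mirror the description of Fig. 1A.

```python
from typing import List


def label_bases(reference: str) -> List[str]:
    """Pair every base with its 1-based position so repeated bases stay distinct."""
    return [f"{base}{j}" for j, base in enumerate(reference, start=1)]


# Only the visible prefix of the example reference is used here.
prefix = "AACTGGTTTC"
labels = label_bases(prefix)

# Hypothetical alignment: segment index -> reference position (1-based).
segment_to_position = {1: 1, 2: 1, 3: 2}
for seg, pos in segment_to_position.items():
    print(f"mu_{seg} -> {labels[pos - 1]}")
```

      Without the position in the label, the two leading adenines would be indistinguishable, and it would be unclear that the third segment belongs to a different base than the first two.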

      Additionally, this figure and its subscript include mapping with Guppy and Minimap2 but do not mention Nanopolish at all, while that seems an equally important step in the preprocessing (pg5). As such it is difficult to understand the role Nanopolish exactly plays. It's also not mentioned explicitly in the SegPore Workflow on pg15, perhaps it's part of step 1 there?

      We thank the reviewer for pointing this out. We apologize for the confusion. As mentioned in the public response to point 3 of Reviewer 2, SegPore uses Nanopolish to identify the poly(A) tail and transcript regions from the raw signal. SegPore then performs segmentation and alignment on the transcript portion only. This step is indeed part of Step 1 in the preprocessing workflow, as described in Supplementary Note 1, Section 3.

      To clarify this in the main text, we have updated the preprocessing paragraph on page 6 to explicitly describe the role of Nanopolish:

      “We begin by performing basecalling on the input fast5 file using Guppy, which converts the raw signal data into ribonucleotide sequences. Next, we align the basecalled sequences to the reference genome using Minimap2, generating a mapping between the reads and the reference sequences. Nanopolish provides two independent commands: "polya" and "eventalign". The "polya" command identifies the adapter, poly(A) tail, and transcript region in the raw signal, which we refer to as the poly(A) detection results. The raw signal segment corresponding to the poly(A) tail is used to standardize the raw signal for each read. The "eventalign" command aligns the raw signal to a reference sequence, assigning a signal segment to individual k-mers in the reference. It also computes summary statistics (e.g., mean, standard deviation) from the signal segment for each k-mer. Each k-mer together with its corresponding signal features is termed an event. These event features are then passed into downstream tools such as m6Anet and CHEUI for RNA modification detection. For full transcriptome analysis (Figure 3), we extract the aligned raw signal segment and reference sequence segment from Nanopolish's events for each read by using the first and last events as start and end points. For in vitro transcription (IVT) data with a known reference sequence (Figure 4), we extract the raw signal segment corresponding to the transcript region for each input read based on Nanopolish’s poly(A) detection results.”

      Additionally, we revised the legend of Figure 1A to explicitly include Nanopolish in step 1 as follows:

      “The raw current signal fragments are paired with the corresponding reference RNA sequence fragments using Nanopolish.”

      (4) Page 5, “The output of Step 3 is the "eventalign," which is analogous to the output generated by the Nanopolish "eventalign" command.”

      Naming the function of Nanopolish, the output file, and later on (pg9) the alignment of the newly introduced methods the exact same "eventalign" is very confusing.

      Thank you for the helpful comment. We acknowledge the potential confusion caused by using the term “eventalign” in multiple contexts. To improve clarity, we now consistently use the term “events” to refer to the output of both Nanopolish and SegPore, rather than using "eventalign" as a noun. We also added the following sentence to Step 3 (page 6) to clearly define what an “event” refers to in our manuscript:

      “An "event" refers to a segment of the raw signal that is aligned to a specific k-mer on a read, along with its associated features such as start and end positions, mean current, standard deviation, and other relevant statistics.”
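
      As an illustration of this definition, an event can be represented as a small record built from a slice of the raw signal. The class and function names below are hypothetical, and the use of the population standard deviation is an assumption; neither Nanopolish's nor SegPore's actual output format is reproduced here.

```python
import statistics
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Event:
    kmer: str
    start: int   # first signal index (inclusive)
    end: int     # last signal index (exclusive)
    mean: float  # mean current of the segment
    std: float   # population standard deviation of the segment


def make_event(signal: Sequence[float], start: int, end: int, kmer: str) -> Event:
    """Build an event from the signal points in [start, end)."""
    segment = signal[start:end]
    if len(segment) == 0:
        raise ValueError("an event must cover at least one signal point")
    return Event(kmer, start, end, statistics.fmean(segment), statistics.pstdev(segment))


# Toy signal with two current levels; the first three points form one event.
signal = [100.0, 102.0, 98.0, 120.0, 121.0, 119.0]
print(make_event(signal, 0, 3, "AACTG"))
```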

      We have revised the text throughout the manuscript accordingly to reduce ambiguity and ensure consistent terminology.

      (5) Page 5, “Once aligned, we use Nanopolish's eventalign to obtain paired raw current signal segments and the corresponding fragments of the reference sequence, providing a precise association between the raw signals and the nucleotide sequence.”

      I thought the new method's HHMM was supposed to output an 'eventalign' formatted file. As this is not clearly mentioned elsewhere, is this a mistake in writing? Is this workflow dependent on Nanopolish 'eventalign' function and output or not?

      We apologize for the confusion. To clarify, SegPore is not dependent on Nanopolish’s eventalign function for generating the final segmentation results. As described in our response to your comment point 2 and elaborated in the revised text on page 4, SegPore uses its own HHMM-based segmentation model to divide the raw signal into small fragments, each corresponding to a sub-state of a k-mer. These fragments are then aligned to the reference sequence based on their mean current values.

      As explained in the revised manuscript:

      “In SegPore, we first segment the raw signal into small fragments using a Hierarchical Hidden Markov Model (HHMM), where each fragment corresponds to a sub-state of a k-mer. Unlike Nanopolish and Tombo, which directly align the raw signal to the reference sequence, SegPore aligns the mean values of these small fragments to the reference. After alignment, we concatenate all fragments that map to the same k-mer into a larger segment, analogous to the "eventalign" output in Nanopolish. For RNA modification estimation, we use only the mean signal value of each reconstructed event.”

      To avoid ambiguity, we have also revised the sentence on page 5 to more clearly distinguish the roles of Nanopolish and SegPore in the workflow. The updated sentence now reads:

      “Nanopolish provides two independent commands: "polya" and "eventalign". The "polya" command identifies the adapter, poly(A) tail, and transcript region in the raw signal, which we refer to as the poly(A) detection results. The raw signal segment corresponding to the poly(A) tail is used to standardize the raw signal for each read. The "eventalign" command aligns the raw signal to a reference sequence, assigning a signal segment to individual k-mers in the reference. It also computes summary statistics (e.g., mean, standard deviation) from the signal segment for each k-mer. Each k-mer together with its corresponding signal features is termed an event. These event features are then passed into downstream tools such as m6Anet and CHEUI for RNA modification detection. For full transcriptome analysis (Figure 3), we extract the aligned raw signal segment and reference sequence segment from Nanopolish's events for each read by using the first and last events as start and end points. For in vitro transcription (IVT) data with a known reference sequence (Figure 4), we extract the raw signal segment corresponding to the transcript region for each input read based on Nanopolish’s poly(A) detection results.”

      (6) Page 5, “Since the polyA tail provides a stable reference, we normalize the raw current signals across reads, ensuring that the mean and standard deviation of the polyA tail are consistent across all reads.”

      Perhaps I misread this statement: I interpret it as using the PolyA tail to do the normalization, rather than using the rest of the signal to do the normalization, and that results in consistent PolyA tails across all reads.

      If it's the latter, this should be clarified, and a little detail on how the normalization is done should be added, but if my first interpretation is correct:

      I'm not sure if its standard deviation is consistent across reads. The (true) value spread in this section of a read should be fairly limited compared to the rest of the signal in the read, so the noise would influence the scale quite quickly, and such noise might be introduced by pores wearing down and other technical influences. Is this really better than using the non-PolyA tail part of the read's signal, using Median Absolute Deviation to scale for a first alignment round, then re-fitting the signal scaling using Theil-Sen on the resulting alignments (assigned read signal vs reference expected signal), as Tombo/Nanopolish (can) do?

      Additionally, this kind of normalization should have been part of the Nanopolish eventalign already, can this not be re-used? If it's done differently it may result in different distributions than the ONT kmer table obtained for the next step.

      Thank you for this detailed and thoughtful comment. We apologize for the confusion. The poly(A) tail–based normalization is indeed explained in Supplementary Note 1, Section 3, but we agree that the motivation needed to be clarified in the main text.

      We have now added the following sentence in the revised manuscript (before the original statement on page 5) to provide clearer context:

      “Due to inherent variability between nanopores in the sequencing device, the baseline levels and standard deviations of k-mer signals can differ across reads, even for the same transcript. To standardize the signal for downstream analyses, we extract the raw current signal segments corresponding to the poly(A) tail of each read. Since the poly(A) tail provides a stable reference, we normalize the raw current signals across reads, ensuring that the mean and standard deviation of the poly(A) tail are consistent across all reads. This step is crucial for reducing…..”

      We chose to use the poly(A) tail for normalization because it is sequence-invariant—i.e., all poly(A) tails consist of identical k-mers, unlike transcript sequences which vary in composition. In contrast, using the transcript region for normalization can introduce biases: for instance, reads with more diverse k-mers (having inherently broader signal distributions) would be forced to match the variance of reads with more uniform k-mers, potentially distorting the baseline across k-mers.
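
      The normalization idea can be sketched as a linear rescaling of each read so that its poly(A) segment reaches a common mean and standard deviation. This is a minimal sketch under stated assumptions: the function name, the target values, and the use of the population standard deviation are hypothetical, and the exact procedure is the one given in Supplementary Note 1, Section 3.

```python
import statistics
from typing import List, Sequence


def normalize_by_polya(
    signal: Sequence[float],
    polya_start: int,
    polya_end: int,
    target_mean: float,
    target_std: float,
) -> List[float]:
    """Rescale a read so its poly(A) segment has the target mean and std."""
    polya = signal[polya_start:polya_end]
    mu = statistics.fmean(polya)
    sigma = statistics.pstdev(polya)
    if sigma == 0:
        raise ValueError("poly(A) segment has zero variance")
    scale = target_std / sigma
    # The same shift and scale are applied to the whole read, not only the tail.
    return [(x - mu) * scale + target_mean for x in signal]


# Toy read: the first three points stand in for the poly(A) tail.
read = [108.0, 110.0, 112.0, 90.0, 95.0]
normalized = normalize_by_polya(read, 0, 3, target_mean=100.0, target_std=2.0)
print(normalized)
```

      Because every poly(A) tail consists of the same k-mers, matching its statistics across reads does not depend on transcript composition, which is the motivation given above.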

      In our newly added RNA004 benchmark experiment, we used the default normalization provided by f5c, which does not include poly(A) tail normalization. Despite this, SegPore was still able to mask out noise and outperform both f5c and Uncalled4, demonstrating that our segmentation method is robust to different normalization strategies.

      (7) Page 7, “The initialization of the 5mer parameter table is a critical step in SegPore's workflow. By leveraging ONT's established kmer models, we ensure that the initial estimates for unmodified 5mers are grounded in empirical data.”

      It looks like the method uses Nanopolish for a first alignment, then improves the segmentation matching the reference sequence/expected 5mer values. I thought the Nanopolish model/tables are based on the same data, or similarly obtained. If they are different, then why the switch of kmer model? Now the original alignment may have been based on other values, and thus the alignment may seem off with the expected kmer values of this table.

      Thank you for this insightful question. To clarify, SegPore uses Nanopolish only to identify the poly(A) tail and transcript regions from the raw signal. In the bulk in vivo data analysis, we use Nanopolish’s first event as the start and the last event as the end to extract the aligned raw signal chunk and its corresponding reference sequence. Since SegPore relies on Nanopolish solely to delineate the transcript region for each read, it independently aligns the raw signals to the reference sequence without refining or adjusting Nanopolish’s segmentation results.

      While SegPore's 5-mer parameter table is initially seeded using ONT’s published unmodified k-mer models, we acknowledge that empirical signal values may deviate from these reference models due to run-specific technical variation and the presence of RNA modifications. For this reason, SegPore includes a parameter re-estimation step to refine the mean and standard deviation values of each k-mer based on the current dataset.

      The re-estimation process consists of two layers. In the outer layer, we select a set of 5mers that exhibit both modified and unmodified states based on the GMM results (Section 6 of Supplementary Note 1), while the remaining 5mers are assumed to have only unmodified states. In the inner layer, we align the raw signals to the reference sequences using the 5mer parameter table estimated in the outer layer (Section 5 of Supplementary Note 1). Based on the alignment results, we update the 5mer parameter table in the outer layer. This two-layer process is generally repeated for 3 to 5 iterations until the 5mer parameter table converges. This re-estimation ensures that:

      (1) The adjusted 5mer signal baselines remain close to the ONT reference (for consistency);

      (2) The alignment score between the observed signal and the reference sequence is optimized (as detailed in Equation 11, Section 5 of Supplementary Note 1);

      (3) Only 5mers that show a clear difference between the modified and unmodified components in the GMM are considered subject to modification.

      By doing so, SegPore achieves more accurate signal alignment independent of Nanopolish’s models, and the alignment is directly tuned to the data under analysis.
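
      The GMM-based selection in the outer layer can be illustrated with a generic two-component, one-dimensional mixture fitted by expectation-maximization to the event means of a single 5mer. This is not the authors' implementation (see Sections 5 to 7 of Supplementary Note 1 for that); the function names, the initial values, and the separation rule with its threshold are all hypothetical, and the constraint that baselines stay close to the ONT reference is not modeled.

```python
import math
from typing import List, Sequence, Tuple


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_two_component_gmm(
    xs: Sequence[float],
    mu_init: Tuple[float, float],
    sigma_init: Tuple[float, float],
    n_iter: int = 50,
    min_sigma: float = 1e-3,
) -> Tuple[List[float], List[float], List[float]]:
    """Plain EM for a 1-D Gaussian mixture with two components."""
    mus, sigmas, weights = list(mu_init), list(sigma_init), [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas)]
            total = sum(p) or 1e-300
            resp.append([pi / total for pi in p])
        # M-step: update weights, means, and standard deviations.
        for c in range(2):
            nc = sum(r[c] for r in resp)
            weights[c] = nc / len(xs)
            if nc < 1e-12:
                continue  # empty component: keep its previous mean and std
            mus[c] = sum(r[c] * x for r, x in zip(resp, xs)) / nc
            var = sum(r[c] * (x - mus[c]) ** 2 for r, x in zip(resp, xs)) / nc
            sigmas[c] = max(math.sqrt(var), min_sigma)
    return mus, sigmas, weights


def components_are_distinct(
    mus: Sequence[float], sigmas: Sequence[float], min_gap_in_sigmas: float = 2.0
) -> bool:
    """Hypothetical rule: the means differ by more than a multiple of the pooled std."""
    pooled = math.sqrt((sigmas[0] ** 2 + sigmas[1] ** 2) / 2.0)
    return abs(mus[0] - mus[1]) > min_gap_in_sigmas * pooled


# Toy event means for one 5mer: one cluster near 100 and one near 110.
xs = [100.0, 101.0, 99.0, 100.5, 99.5, 110.0, 111.0, 109.0, 110.5, 109.5]
mus, sigmas, weights = fit_two_component_gmm(xs, (100.0, 108.0), (2.0, 2.0))
print(mus, sigmas, weights, components_are_distinct(mus, sigmas))
```

      In such a scheme, only 5mers whose two fitted components are clearly separated would be treated as having both a modified and an unmodified state; all others would keep a single unmodified baseline.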

      (8) Page 9, “The output of the alignment algorithm is an eventalign, which pairs the base blocks with the 5mers from the reference sequence for each read (Fig. 1C).”

      “Modification prediction

      After obtaining the eventalign results, we estimate the modification state of each motif using the 5mer parameter table.”

      This wording seems to have been introduced on page 5 but (also there) reads a bit confusingly as the name of the output format, file, and function are now named the exact same "eventalign". I assume the obtained eventalign results now refer to the output of your HHMM, and not the original Nanopolish eventalign results, based on context only, but I'd rather have a clear naming that enables more differentiation.

      We apologize for the confusion. We have revised the sentence as follows for clarity:

      “A detailed description of both alignment algorithms is provided in Supplementary Note 1. The output of the alignment algorithm is an alignment that pairs the base blocks with the 5mers from the reference sequence for each read (Fig. 1C). Base blocks aligned to the same 5-mer are concatenated into a single raw signal segment (referred to as an “event”), from which various features—such as start and end positions, mean current, and standard deviation—are extracted. Detailed derivation of the mean and standard deviation is provided in Section 5.3 in Supplementary Note 1. In the remainder of this paper, we refer to these resulting events as the output of eventalign analysis or the segmentation task. ”

      (9) Page 9, “Since a single 5mer can be aligned with multiple base blocks, we merge all aligned base blocks by calculating a weighted mean. This weighted mean represents the single base block mean aligned with the given 5mer, allowing us to estimate the modification state for each site of a read.”

      I assume the weights depend on the length of the segment but I don't think it is explicitly stated while it should be.

      Thank you for the helpful observation. To improve clarity, we have moved this explanation to the last paragraph of the previous section (see response to point 8), where we describe the segmentation process in more detail.

      Additionally, a complete explanation of how the weighted mean is computed is provided in Section 5.3 of Supplementary Note 1. It is derived from signal points that are assigned to a given 5mer.
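
      As a rough illustration of merging base blocks, the sketch below weights each block mean by its number of signal points, which is equivalent to taking the mean over all signal points assigned to the 5mer. The function name `merge_base_blocks` and the `(mean, n_points)` input format are assumptions for this example; the exact derivation is the one given in Section 5.3 of Supplementary Note 1.

```python
def merge_base_blocks(blocks):
    """Merge base blocks aligned to one 5mer into a single mean.

    blocks: list of (mean_pA, n_points) tuples, one per base block.
    Each block mean is weighted by its number of signal points (assumed weighting).
    """
    total_points = sum(n for _, n in blocks)
    if total_points == 0:
        raise ValueError("no signal points assigned to this 5mer")
    return sum(mean * n for mean, n in blocks) / total_points
```

      For example, two blocks with means of 120 pA (30 points) and 124 pA (10 points) merge to 121 pA.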

      (10) Page 10, “Afterward, we manually adjust the 5mer parameter table using heuristics to ensure that the modified 5mer distribution is significantly distinct from the unmodified distribution.”

      Using what heuristics? If this is explained in the supplementary notes then please refer to the exact section.

      Thank you for pointing this out. The heuristics used to manually adjust the 5mer parameter table are indeed explained in detail in Section 7 of Supplementary Note 1.

      To clarify this in the manuscript, we have revised the sentence as follows:

      “Afterward, we manually adjust the 5mer parameter table using heuristics to ensure that the modified 5mer distribution is significantly distinct from the unmodified distribution (see details in Section 7 of Supplementary Note 1).”

      (11) Page 10, “Once the table is fixed, it is used for RNA modification estimation in the test data without further updates.”

      By what tool/algorithm? Perhaps it is your own implementation, but with the next section going into segmentation benchmarking and using Nanopolish before this seems undefined.

      Thank you for pointing this out. We use our own implementation. See Algorithm 3 in Section 6 of Supplementary Note 1.

      We have revised the sentence for clarity:

      “Once a stabilized 5mer parameter table is estimated from the training data, it is used for RNA modification estimation in the test data without further updates. A more detailed description of the GMM re-estimation process is provided in Section 6 of Supplementary Note 1.”

      (12) Page 11, “A 5mer was considered significantly modified if its read coverage exceeded 1,500 and the distance between the means of the two Gaussian components in the GMM was greater than 5.”

      Considering the scaling done before also not being very detailed in what range to expect, this cutoff doesn't provide any useful information. Is this a pA value?

      Thank you for the observation. Yes, the value refers to the current difference measured in picoamperes (pA). To clarify this, we have revised the sentence in the manuscript to include the unit explicitly:

      “A 5mer was considered significantly modified if its read coverage exceeded 1,500 and the distance between the means of the two Gaussian components in the GMM was greater than 5 picoamperes (pA).”
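
      The criterion quoted above can be written as a small predicate. The function and parameter names are hypothetical; only the two thresholds (coverage above 1,500 and a distance of more than 5 pA between component means) come from the text.

```python
def is_significantly_modified(coverage, mean_unmod_pa, mean_mod_pa,
                              min_coverage=1500, min_distance_pa=5.0):
    """Return True if a 5mer passes both the coverage and the mean-distance cutoffs."""
    # Both comparisons are strict, following "exceeded" and "greater than" in the text.
    enough_reads = coverage > min_coverage
    well_separated = abs(mean_unmod_pa - mean_mod_pa) > min_distance_pa
    return enough_reads and well_separated
```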

      (13) Page 13, “The raw current signals, as shown in Figure 1B.”

      Wrong figure? Figure 2B seems logical.

      Thank you for catching this. You are correct—the reference should be to Figure 2B, not Figure 1B. We have corrected this in the revised manuscript.

      (14) Page 14, Figure 2A, these figures supposedly support the jiggle hypothesis but the examples seem to match only half the explanation. Any of these jiggles seem to be followed shortly by another in the opposite direction, and the amplitude seems to match better within each such pair than the next or previous segments. Perhaps there is a better explanation still, and this behaviour can be modelled as such instead.

      Thank you for your comment. We acknowledge that the observed signal patterns may appear ambiguous and could potentially suggest alternative explanations. However, as shown in Figure 2A, the red dots tend to align closely with the baseline of the previous state, while the blue dots align more closely with the baseline of the next state. We interpret this as evidence for the "jiggling" hypothesis, where the k-mer temporarily oscillates between adjacent states during translocation.

      That said, we agree that more sophisticated models could be explored to better capture this behavior, and we welcome suggestions or references to alternative models. We will consider this direction in future work.

      (15) Page 15, “This occurs because subtle transitions within a base block may be mistaken for transitions between blocks, leading to inflated transition counts.”

      Is it really a "subtle transition" if it happens within a base block? It seems this is not a transition and thus shouldn't be named as such.

      Thank you for pointing this out. We agree that the term “subtle transition” may be misleading in this context. We revised the sentence to clarify the potential underlying cause of the inflated transition counts:

      “This may be due to a base block actually corresponding to a sub-state of a single 5mer, rather than each base block corresponding to a full 5mer, leading to inflated transition counts. To address this issue, SegPore’s alignment algorithm was refined to merge multiple base blocks (which may represent sub-states of the same 5mer) into a single 5mer, thereby facilitating further analysis.”

      (16) Page 15, “The SegPore "eventalign" output is similar to Nanopolish's "eventalign" command.”

      To the output of that command, I presume, not to the command itself.

      Thank you for pointing out the ambiguity. We have revised the sentence for clarity:

      “The final outputs of SegPore are the events and modification state predictions. SegPore’s events are similar to the outputs of Nanopolish’s "eventalign" command, in that they pair raw current signal segments with the corresponding RNA reference 5-mers. Each 5-mer is associated with various features — such as start and end positions, mean current, and standard deviation — derived from the paired signal segment.”

      (17) Page 15, “For selected 5mers, SegPore also provides the modification rate for each site and the modification state of that site on individual reads.”

      What selection? Just all kmers with a possible modified base or a more specific subset?

      We revised the sentence to clarify the selection criteria:

      “For selected 5mers that exhibit both a clearly unmodified and a clearly modified signal component, SegPore reports the modification rate at each site, as well as the modification state of that site on individual reads.”

      (18) Page 16, “A key component of SegPore is the 5mer parameter table, which specifies the mean and standard deviation for each 5mer in both modified and unmodified states (Figure 2A).”

      Wrong figure?

      Thank you for pointing this out. You are correct—it should be Figure 1A, not Figure 2A. We intended to visually illustrate the structure of the 5mer parameter table in Figure 1A, and we have corrected this reference in the revised manuscript.

      (19) Page 16, Table 1, I can't quite tell but I assume this is based on all kmers in the table, not just a m6A modified subset. A short added statement to make this clearer would help.

      Yes, you are right: it is averaged over all 5mers. We have revised the sentence for clarity as follows:

      “As shown in Table 1, SegPore consistently achieved the best performance averaged on all 5mers across all datasets…”

      (20) Page 16, “Since the peaks (representing modified and unmodified states) are separable for only a subset of 5mers, SegPore can provide modification parameters for these specific 5mers. For other 5mers, modification state predictions are unavailable.”

      Can this be improved using some heuristics rather than the 'distance of 5' cutoff as described before? How small or big is this subset, compared to how many there should be to cover all cases?

      We agree that more sophisticated strategies could potentially improve performance. In this study, we adopted a relatively conservative approach to minimize false positives by using a heuristic cutoff of 5 picoamperes. This value was selected empirically and we did not explore alternative cutoffs. Future work could investigate more refined or data-driven thresholding strategies.

      (21) Page 16, “Tombo used the "resquiggle" method to segment the raw signals, and we standardized the segments using the polyA tail to ensure a fair comparison.”

      I don't know what or how something is "standardized" here.

      ‘Standardized’ refers to the poly(A) tail–based signal normalization described in our response to point 6. We applied this normalization to Tombo’s output to ensure a fair comparison across methods. Without this standardization, Tombo’s performance was notably worse. We revised the sentence as follows:

      “Tombo used the "resquiggle" method to segment the raw signals, and we standardized the segments using the poly(A) tail to ensure a fair comparison (See preprocessing section in Materials and Methods).”

      (22) Page 16, “To benchmark segmentation performance, we used two key metrics: (1) the log-likelihood of the segment mean, which measures how closely the segment matches ONT's 5mer parameter table (used as ground truth), and (2) the standard deviation (std) of the segment, where a lower std indicates reduced noise and better segmentation quality. If the raw signal segment aligns correctly with the corresponding 5mer, its mean should closely match ONT's reference, yielding a high log-likelihood. A lower std of the segment reflects less noise and better performance overall.”

      Here the segmentation part becomes a bit odd:

      A: Low std can be/is achieved by dropping any noisy bits, making segments really small (partly what happens here with the transition segments). This may be 'true' here, in the sense that the transition is not really part of the segment, but the comparison table is a bit meaningless as the other tools forcibly assign all data to kmers, instead of ignoring parts as transition states. In other words, it is a benchmark that is easy to cheat by assigning more data to noise/transition states.

      B: The values shown are influenced by the alignment made between the read and expected reference signal. Especially Tombo tends to forcibly assign data to whatever looks the most similar nearby rather than providing the correct alignment. So the "benchmark of the segmentation performance" is more of an "overall benchmark of the raw signal alignment". Which is still a good, useful thing, but the text seems to suggest something else.

      Thank you for raising these important concerns regarding the segmentation benchmarking.

      Regarding point A, the base blocks aligned to the same 5mer are concatenated into a single segment, including the short transition blocks between them. These transition blocks are typically very short (4~10 signal points, average 6 points), while a typical 5mer segment contains around 20~60 signal points. To assess whether SegPore’s performance is inflated by excluding transition segments, we conducted an additional comparison: we removed 6 boundary signal points (3 from the start and 3 from the end) from each 5mer segment in Nanopolish and Tombo’s results to reduce potential noise. The new comparison table is shown in the following:

      SegPore consistently demonstrates superior performance. Its key contribution lies in its ability to recognize structured noise in the raw signal and to derive more accurate mean and standard deviation values that more faithfully represent the true state of the k-mer in the pore. The improved mean estimates are evidenced by the clearly separated peaks of modified and unmodified 5mers in Figures 3A and 4B, while the improved standard deviation is reflected in the segmentation benchmark experiments.
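
      The boundary-trimming comparison and the two benchmark metrics described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the population standard deviation is used, and the log-likelihood is taken to be a plain Gaussian log-density of the segment mean under the reference mean and standard deviation for the 5mer.

```python
import math
import statistics


def trimmed_segment_stats(signal, trim=3):
    """Drop `trim` points from each end of a segment, then return (mean, std)."""
    core = signal[trim:len(signal) - trim]
    if len(core) < 2:
        raise ValueError("segment too short to trim")
    return statistics.fmean(core), statistics.pstdev(core)


def gaussian_log_likelihood(x, ref_mean, ref_std):
    """Log-density of x under a Gaussian with the reference mean and std."""
    variance = ref_std ** 2
    return -0.5 * math.log(2 * math.pi * variance) - (x - ref_mean) ** 2 / (2 * variance)
```

      With `trim=3`, six boundary points are removed per segment, matching the comparison described above.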

      Regarding point B, we apologize for the confusion. We have added a new paragraph to the introduction to clarify that the segmentation task indeed includes the alignment step.

      “The general workflow of Nanopore direct RNA sequencing (DRS) data analysis is as follows. First, the raw electrical signal from a read is basecalled using tools such as Guppy or Dorado, which produce the nucleotide sequence of the RNA molecule. However, these basecalled sequences do not include the precise start and end positions of each ribonucleotide (or k-mer) in the signal. Because basecalling errors are common, the sequences are typically mapped to a reference genome or transcriptome using minimap2 to recover the correct reference sequence. Next, tools such as Nanopolish and Tombo align the raw signal to the reference sequence to determine which portion of the signal corresponds to each k-mer. We define this process as the segmentation task, referred to as "eventalign" in Nanopolish. Based on this alignment, Nanopolish extracts various features—such as the start and end positions, mean, and standard deviation of the signal segment corresponding to a k-mer. This signal segment or its derived features is referred to as an "event" in Nanopolish. The resulting events serve as input for downstream RNA modification detection tools such as m6Anet and CHEUI.”

      (23) Page 17 “Given the comparable methods and input data requirements, we benchmarked SegPore against several baseline tools, including Tombo, MINES (26), Nanom6A (27), m6Anet, Epinano (28), and CHEUI (29).”

      It seems m6Anet is actually Nanopolish+m6Anet in Figure 3C, this needs a minor clarification here.

      m6Anet uses Nanopolish’s estimated events as input by default.

      (24) Page 18, Figure 3, A and B are figures without any indication of what is on the axis and from the text I believe the position next to each other on the x-axis rather than overlapping is meaningless, while their spread is relevant, as we're looking at the distribution of raw values for this 5mer. The figure as is is rather confusing.

      Thanks for pointing out the confusion. We have added concrete values to the axes in Figures 3A and 3B and revised the figure legend as follows in the manuscript:

      “(A) Histogram of the estimated mean from current signals mapped to an example m6A-modified genomic location (chr10:128548315, GGACT) across all reads in the training data, comparing Nanopolish (left) and SegPore (right). The x-axis represents current in picoamperes (pA).

      (B) Histogram of the estimated mean from current signals mapped to the GGACT motif at all annotated m6A-modified genomic locations in the training data, again comparing Nanopolish (left) and SegPore (right). The x-axis represents current in picoamperes (pA).”

      (25) Page 18 “SegPore's results show a more pronounced bimodal distribution in the raw signal segment mean, indicating clearer separation of modified and unmodified signals.”

      Without knowing the correct values around the target kmer (like Figure 4B), just the more defined bimodal distribution could also indicate the (wrongful) assignment of neighbouring kmer values to this kmer instead, hence this statement lacks some needed support, this is just one interpretation of the possible reasons.

      Thank you for the comment. We have added concrete values to Figures 3A and 3B to support this point. Both peaks fall within a reasonable range: the unmodified peak (125 pA) is approximately 1.17 pA away from its reference value of 123.83 pA, and the modified peak (118 pA) is around 7 pA away from the unmodified peak. This shift is consistent with expected signal changes due to RNA modifications (usually less than 10 pA), and the magnitude of the difference suggests that the observed bimodality is more likely caused by true modification events rather than misalignment.

      (26) Page 18 “Furthermore, when pooling all reads mapped to m6A-modified locations at the GGACT motif, SegPore showed prominent peaks (Fig. 3B), suggesting reduced noise and improved modification detection.”

      I don't think the prominent peaks directly suggest improved detection, this statement is a tad overreaching.

      We revised the sentence to the following:

      “SegPore exhibited more distinct peaks (Fig. 3B), indicating reduced noise and potentially enabling more reliable modification detection”.

      (27) Page18 “(2) direct m6A predictions from SegPore's Gaussian Mixture Model (GMM), which is limited to the six selected 5mers.”

      The 'six selected' refers to what exactly? Also, 'why' this is limited to them is also unclear as it is, and it probably would become clearer if it is clearly defined what this refers to.

      This is explained on page 16 of the original manuscript, in the description of SegPore’s workflow, as follows:

      “A key component of SegPore is the 5mer parameter table, which specifies the mean and standard deviation for each 5mer in both modified and unmodified states (Fig. 1A). Since the peaks (representing modified and unmodified states) are separable for only a subset of 5mers, SegPore can provide modification parameters for these specific 5mers. For other 5mers, modification state predictions are unavailable.”

      We select a small set of 5mers that show clear peaks (modified and unmodified 5mers) in the GMM in the m6A site-level data analysis. These 5mers are provided in Supplementary Fig. S2C, as explained in the section “m6A site level benchmark” in the Materials and Methods (page 12 in the original manuscript).

      “…transcript locations into genomic coordinates. It is important to note that the 5mer parameter table was not re-estimated for the test data. Instead, modification states for each read were directly estimated using the fixed 5mer parameter table. Due to the differences between human (Supplementary Fig. S2A) and mouse (Supplementary Fig. S2B), only six 5mers were found to have m6A annotations in the test data’s ground truth (Supplementary Fig. S2C). For a genomic location to be identified as a true m6A modification site, it had to correspond to one of these six common 5mers and have a read coverage of greater than 20. SegPore derived the ROC and PR curves for benchmarking based on the modification rate at each genomic location….”

      We have updated the sentence as follows to increase clarity:

      “which is limited to the six selected 5mers that exhibit clearly separable modified and unmodified components in the GMM (see Materials and Methods for details).”

      (28) Page 19, Figure 4C, the blue 'Unmapped' needs further explanation. If this means the segmentation+alignment resulted in simply not assigning any segment to a kmer, this would indicate issues in the resulting mapping between raw data and kmers as the data that probably belonged to this kmer is likely mapped to a neighbouring kmer, possibly introducing a bimodal distribution there.

      This is due to a deletion event in the full alignment algorithm. See page 8 of Supplementary Note 1:

      During the traceback step of the dynamic programming matrix, not every 5mer in the reference sequence is assigned a corresponding raw signal fragment—particularly when the signal’s mean deviates substantially from the expected mean of that 5mer. In such cases, the algorithm considers the segment to be generated by an unknown 5mer, and the corresponding reference 5mer is marked as unmapped.

      (29) Page 19, “For six selected m6A motifs, SegPore achieved an ROC AUC of 82.7% and a PR AUC of 38.7%, earning the third-best performance compared with deep leaning methods m6Anet and CHEUI (Fig. 3D).”

      How was this selection of motifs made, are these related to the six 5mers in the middle of Supplementary Figure S2? Are these the same six as on page 18? This is not clear to me.

      It is the same, see the response to point 27.

      (30) Page 21 “Biclustering reveals that modifications at the 6th, 7th, and 8th genomic locations are specific to certain clusters of reads (clusters 4, 5, and 6), while the first five genomic locations show similar modification patterns across all reads.”

      This reads rather confusingly. Both the '6th, 7th, and 8th genomic locations' and 'clusters 4,5,6' should be referred to in clearer terms. Either mark them in the figure as such or name them in the text by something that directly matches the text in the figure.

      We have added labels to the clusters and genomic locations in Figure 4C, and revised the sentence as follows:

      “Biclustering reveals that modifications at g6 are specific to cluster C4, g7 to cluster C5, and g8 to cluster C6, while the first five genomic locations (g1 to g5) show similar modification patterns across all reads.”

      (31) Page 21, “We developed a segmentation algorithm that leverages the jiggling property in the physical process of DRS, resulting in cleaner current signals for m6A identification at both the site and single-molecule levels.”

      Leverages, or just 'takes into account'?

      We designed our HHMM specifically based on the jiggling hypothesis, so we believe that using the term “leverage” is appropriate.

      (32) Page 21, “Our results show that m6Anet achieves superior performance, driven by SegPore's enhanced segmentation.”

      Superior in what way? It barely improves over Nanopolish in Figure 3C and is outperformed by other methods in Figure 3D. The segmentation may have improved but this statement says something is 'superior' driven by that 'enhanced segmentation', so that cannot refer to the segmentation itself.

      We have revised it as follows:

      “Our results demonstrate that SegPore’s segmentation enables clear differentiation between m6A-modified and unmodified adenosines.”

      (33) Page 21, “In SegPore, we assume a drastic change between two consecutive 5mers, which may hold for 5mers with large difference in their current baselines but may not hold for those with small difference.”

      The implications of this assumption don't seem highlighted enough in the work itself and may be cause for falsely discovering bi-modal distributions. What happens if such a 5mer isn't properly split, is there no recovery algorithm later on to resolve these cases?

      We agree that there is a risk of misalignment, which can result in a falsely observed bimodal distribution. This is a known and largely unavoidable issue across all methods, including deep neural network–based methods. For example, many of these models rely on a CTC (Connectionist Temporal Classification) layer, which implicitly performs alignment and may also suffer from similar issues.

      Misalignment is more likely when the current baselines of neighboring k-mers are close. In such cases, the model may struggle to confidently distinguish between adjacent k-mers, increasing the chance that signals from neighboring k-mers are incorrectly assigned. Accurate baseline estimation for each k-mer is therefore critical—when baselines are accurate, the correct alignment typically corresponds to the maximum likelihood.

      We have added the following sentence to the discussion to acknowledge this limitation:

      “As with other RNA modification estimation methods, SegPore can be affected by misalignment errors, particularly when the baseline signals of adjacent k-mers are similar. These cases may lead to spurious bimodal signal distributions and require careful interpretation.”

      (34) Page 21, “Currently, SegPore models only the modification state of the central nucleotide within the 5mer. However, modifications at other positions may also affect the signal, as shown in Figure 4B. Therefore, introducing multiple states to the 5mer could help to improve the performance of the model.”

      The meaning of this statement is unclear to me. Is SegPore unable to combine the information of overlapping kmers around a possibly modified base (central nucleotide), or is this referring to having multiple possible modifications in a single kmer (multiple states)?

      We mean there can be modifications at multiple positions of a single 5mer, e.g. C m5C m6A m7G T. We have revised the sentence to:

      “Therefore, introducing multiple states for a 5mer to account for modifications at multiple positions within the same 5mer could help to improve the performance of the model.”

      (35) Page 22, “This causes a problem when apply DNN-based methods to new dataset without short read sequencing-based ground truth. Human could not confidently judge if a predicted m6A modification is a real m6A modification.”

      Grammatical errors in both these sentences. For the 'Human could not' part, is this referring to a single person's attempt or more extensively tested?

      Thanks for the comment. We have revised the sentence as follows:

      “This poses a challenge when applying DNN-based methods to new datasets without short-read sequencing-based ground truth. In such cases, it is difficult for researchers to confidently determine whether a predicted m6A modification is genuine (see Supplementary Figure S5).”

      (36) Page 22, “…which is easier for human to interpret if a predicted m6A site is real.”

      "a" human, but also this probably meant to say 'whether' instead of 'if', or 'makes it easier'.

      Thanks for the advice. We have revised the sentence as follows:

      “One can generally observe a clear difference in the intensity levels between 5mers with an m6A and those with a normal adenosine, which makes it easier for a researcher to interpret whether a predicted m6A site is genuine.”

      (37) Page 22, “…and noise reduction through its GMM-based approach…”

      Is the GMM providing noise reduction or segmentation?

      Yes, we agree that it is not relevant. We have removed the sentence in the revised manuscript as follows:

      “Although SegPore provides clear interpretability and noise reduction through its GMM-based approach, there is potential to explore DNN-based models that can directly leverage SegPore's segmentation results.”

      (38) Page 23, “SegPore effectively reduces noise in the raw signal, leading to improved m6A identification at both site and single-molecule levels…”

      Without further explanation in what sense this is meant, 'reduces noise' seems to overreach the abilities, and looks more like 'masking out'.

      Following the reviewer’s suggestion, we have changed it to ‘mask out’ in the revised manuscript.

      “SegPore effectively masks out noise in the raw signal, leading to improved m6A identification at both site and single-molecule levels.”

      Reviewer #3 (Recommendations for the authors):

      I recommend the publication of this manuscript, provided that the following comments (and the comments above) are addressed.

      In general, the authors state that SegPore represents an improvement on existing software. These statements are largely unquantified, which erodes their credibility. I have specified several of these in the Minor comments section.

      Page 5, Preprocessing: The authors comment that the poly(A) tail provides a stable reference that is crucial for the normalisation of all reads. How would this step handle reads that have variable poly(A) tail lengths? Or have interrupted poly(A) tails (e.g. in the case of mRNA vaccines that employ a linker sequence)?

      We apologize for the confusion. The poly(A) tail–based normalization is explained in Supplementary Note 1, Section 3.

      As shown in Author response image 1 below, the poly(A) tail produces a characteristic signal pattern—a relatively flat, squiggly horizontal line. Due to variability between nanopores, raw current signals often exhibit baseline shifts and scaling of standard deviations. This means that the signal may be shifted up or down along the y-axis and stretched or compressed in scale.

      Author response image 1.

      The normalization remains robust with variable poly(A) tail lengths, as long as the poly(A) region is sufficiently long. The linker sequence will be assigned to the adapter part rather than the poly(A) part.

      To improve clarity in the revised manuscript, we have added the following explanation:

      “Due to inherent variability between nanopores in the sequencing device, the baseline levels and standard deviations of k-mer signals can differ across reads, even for the same transcript. To standardize the signal for downstream analyses, we extract the raw current signal segments corresponding to the poly(A) tail of each read. Since the poly(A) tail provides a stable reference, we normalize the raw current signals across reads, ensuring that the mean and standard deviation of the poly(A) tail are consistent across all reads. This step is crucial for reducing…”

      We chose to use the poly(A) tail for normalization because it is sequence-invariant—i.e., all poly(A) tails consist of identical k-mers, unlike transcript sequences which vary in composition. In contrast, using the transcript region for normalization can introduce biases: for instance, reads with more diverse k-mers (having inherently broader signal distributions) would be forced to match the variance of reads with more uniform k-mers, potentially distorting the baseline across k-mers.
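
      The shift-and-scale idea behind the poly(A) normalization can be sketched as follows. This is an illustration only, not the procedure in Supplementary Note 1, Section 3: the function name, the index-based poly(A) boundaries, and the target mean and standard deviation are assumptions supplied by the caller.

```python
import statistics


def normalize_by_polya(signal, polya_start, polya_end, target_mean, target_std):
    """Shift and scale a read so its poly(A) segment has the target mean and std."""
    polya = signal[polya_start:polya_end]
    mu = statistics.fmean(polya)
    sigma = statistics.pstdev(polya)
    if sigma == 0:
        raise ValueError("poly(A) segment has zero variance")
    scale = target_std / sigma
    # The same linear transform is applied to every point in the read.
    return [(x - mu) * scale + target_mean for x in signal]
```

      Because every read is mapped onto the same poly(A) mean and standard deviation, baseline shifts and scale differences between nanopores are removed by a single linear transform per read.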

      Page 7, 5mer parameter table: r9.4_180mv_70bps_5mer_RNA is an older kmer model (>2 years). How does your method perform with the newer RNA kmer models that do permit the detection of multiple ribonucleotide modifications? Addressing this comment is crucial because it is feasible that SegPore will underperform in comparison to the newer RNA base caller models (requiring the use of RNA004 datasets).

      Thank you for highlighting this important point. For RNA004, we have updated SegPore to ensure compatibility with the latest kit. In our revised manuscript, we demonstrate that the translocation-based segmentation hypothesis remains valid for RNA004, as supported by new analyses presented in the supplementary Figure S4.

      Additionally, we performed a new benchmark against f5c and Uncalled4 on RNA004 data in the revised manuscript (Table 2), in which SegPore exhibits better performance than both tools.

      We agree that benchmarking against the latest Dorado models—specifically rna004_130bps_hac@v5.1.0 and rna004_130bps_sup@v5.1.0, which include built-in modification detection capabilities—would provide valuable context for evaluating the utility of SegPore. However, generating a comprehensive k-mer parameter table for RNA004 requires a large, well-characterized dataset. At present, such data are limited in the public domain. Additionally, Dorado is developed by ONT and its internal training data have not been released, making direct comparisons difficult.

      Our current focus is on improving raw signal segmentation quality, an upstream task critical to many downstream analyses, including RNA modification detection. Future work may include benchmarking SegPore against models like Dorado once appropriate data become available.

      The Methods and Results sections contain redundant information - please streamline the information in these sections and reduce the redundancy. For example, the benchmarking section may be better situated in the Results section.

      Following your advice, we have removed redundant texts about the Segmentation benchmark from Materials and Methods in the revised manuscript.

      Minor comments

      (1) Introduction

      Page 3: "By incorporating these dynamics into its segmentation algorithm...". Please provide an example of how motor protein dynamics can impact RNA translocation. In particular, please elaborate on why motor protein dynamics would impact the translocation of modified ribonucleotides differently to canonical ribonucleotides. This is provided in the results, but please also include details in the Introduction.

      Following your advice, we added one sentence to explain how the motor protein affects the translocation of the DNA/RNA molecule in the revised manuscript.

      “This observation is also supported by previous reports, in which the helicase (the motor protein) translocates the DNA strand through the nanopore in a back-and-forth manner. Depending on ATP or ADP binding, the motor protein may translocate the DNA/RNA forward or backward by 0.5-1 nucleotides.”

      As far as we understand, this translocation mechanism is not specific to modified or unmodified nucleotides. For further details, we refer the reviewer to the original studies cited.

      Page 3: "This lack of interpretability can be problematic when applying these methods to new datasets, as researchers may struggle to trust the predictions without a clear understanding of how the results were generated." Please provide details and citations as to why researchers would struggle to trust the predictions of m6Anet. Is it due to a lack of understanding of how the method works, or an empirically demonstrated lack of reliability?

      Thank you for pointing this out. The lack of interpretability in deep learning models such as m6Anet stems primarily from their “black-box” nature—they provide binary predictions (modified or unmodified) without offering clear reasoning or evidence for each call.

      When we examined the corresponding raw signals, we found it difficult to visually distinguish whether a signal segment originated from a modified or unmodified ribonucleotide. The difference is often too subtle to be judged reliably by a human observer. This is illustrated in the newly added Supplementary Figure S5, which shows Nanopolish-aligned raw signals for the central 5mer GGACT in Figure 4B, displayed both uncolored and colored by modification state (according to the ground truth).

      Although deep neural networks can learn subtle, high-dimensional patterns in the signal that may not be readily interpretable, this opacity makes it difficult for researchers to trust the predictions—especially in new datasets where no ground truth is available. The issue is not necessarily an empirically demonstrated lack of reliability, but rather a lack of transparency and interpretability.

      We have updated the manuscript accordingly and included Supplementary Figure S5 to illustrate the difficulty in interpreting signal differences between modified and unmodified states.

      Page 3: "Instead of relying on complex, opaque features...". Please provide evidence that the research community finds the figures generated by m6Anet to be difficult to interpret, or delete the sections relating to its perceived lack of usability.

      See the figure provided in the response to the previous point. We added a reference to this figure in the revised manuscript.

      “Instead of relying on complex, opaque features (see Supplementary Figure S5), SegPore leverages baseline current levels to distinguish between…..”

      (2) Materials and Methods

      Page 5, Preprocessing: "We begin by performing basecalling on the input fast5 file using Guppy, which converts the raw signal data into base sequences.". Please change "base" to ribonucleotide.

      Revised as requested.

      Page 5 and throughout, please refer to poly(A) tail, rather than polyA tail throughout.

      Revised as requested.

      Page 5, Signal segmentation via hierarchical Hidden Markov model: "...providing more precise estimates of the mean and variance for each base block, which are crucial for downstream analyses such as RNA modification prediction." Please specify which method your HHMM method improves upon.

      Thank you for the suggestion. Since this section does not include a direct comparison, we revised the sentence to avoid unsupported claims. The updated sentence now reads:

      "...providing more precise estimates of the mean and variance for each base block, which are crucial for downstream analyses such as RNA modification prediction."

      Page 10, GMM for 5mer parameter table re-estimation: "Typically, the process is repeated three to five times until the 5mer parameter table stabilizes." How is the stabilisation of the 5mer parameter table quantified? What is a reasonable cut-off that would demonstrate adequate stabilisation of the 5mer parameter table?

      Thank you for the comment. We assess the stabilization of the 5mer parameter table by monitoring the change in baseline values across iterations. If the absolute change in baseline values for all 5mers is less than 1e-5 between two consecutive iterations, we consider the estimation to have stabilized.
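
      The stopping rule described above can be written as a short check. This is a sketch under stated assumptions: the parameter table is represented as a plain dictionary mapping each 5mer to its baseline value, and the function name is hypothetical; only the 1e-5 threshold and the "all 5mers" condition come from the response.

```python
def table_has_stabilized(prev_baselines, new_baselines, tol=1e-5):
    """Return True if every 5mer baseline changed by less than `tol`
    between two consecutive re-estimation iterations."""
    return all(
        abs(new_baselines[kmer] - prev_baselines[kmer]) < tol
        for kmer in prev_baselines
    )

# Hypothetical baseline values for two 5mers across iterations.
iteration_1 = {"GGACT": 123.40, "AAACA": 101.20}
iteration_2 = {"GGACT": 123.10, "AAACA": 101.20}
iteration_3 = {"GGACT": 123.100001, "AAACA": 101.20}

still_moving = table_has_stabilized(iteration_1, iteration_2)
converged = table_has_stabilized(iteration_2, iteration_3)
```

      Requiring the condition for every 5mer, rather than on average, means a single slowly converging 5mer is enough to trigger another iteration.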

      Page 11, M6A site level benchmark: why were these datasets selected? Specifically, why compare human and mouse ribonucleotide modification profiles? Please provide a justification and a brief description of the experiments that these data were derived from, and why they are appropriate for benchmarking SegPore.

      Thank you for the comment. The datasets used were taken from a previous benchmark study on m6A estimation using RNA002 data (https://doi.org/10.1038/s41467-023-37596-5). These datasets include human and mouse transcriptomes and have been widely used to evaluate the performance of RNA modification detection tools. We selected them because (i) they are based on RNA002 chemistry, which matches the primary focus of our study, and (ii) they provide a well-characterized and consistent benchmark for assessing m6A detection performance. Therefore, we believe they are appropriate for validating SegPore.

      (3) Results

      Page 13, RNA translocation hypothesis: "The raw current signals, as shown in Fig. 1B...". Please check/correct figure reference - Figure 1B does not show raw current signals.

      Thank you for pointing this out. The correct reference should be Figure 2B. We have updated the figure citation accordingly in the revised manuscript.

      Page 19, m6A identification at the site level: "For six selected m6A motifs, SegPore achieved an ROC AUC of 82.7% and a PR AUC of 38.7%, earning the third best performance compared with deep leaning methods m6Anet and CHEUI (Fig. 3D)." SegPore performs third best of all deep learning methods. Do the authors recommend its use in conjunction with m6Anet for m6A detection? Please clarify in the text.

      This sentence aims to convey that SegPore alone can already achieve good performance. If interpretability is the primary goal, we recommend using SegPore on its own. However, if the objective is to identify more potential m6A sites, we suggest using the combined approach of SegPore and m6Anet. That said, we have chosen not to make explicit recommendations in the main text to avoid oversimplifying the decision or potentially misleading readers.

      Page 19, m6A identification at the single molecule level: "one transcribed with m6A and the other with normal adenosine". I assume that this should be adenine? Please replace adenosine with adenine throughout.

      Thank you for pointing this out. We have revised the sentence to use "adenine" where appropriate. In other instances, we retain "adenosine" when referring specifically to adenine bound to a ribose sugar, which we believe is suitable in those contexts.

      Page 19, m6A identification at the single molecule level: "We used 60% of the data for training and 40% for testing". How many reads were used for training and how many for testing? Please comment on why these are appropriate sizes for training and testing datasets.

      In total, there are 1.9 million reads, with 1.14 million used for training and 0.76 million for testing (60% and 40%, respectively). We chose this split to ensure that the training set is sufficiently large to reliably estimate model parameters, while the test set remains substantial enough to robustly evaluate model performance. Although the ratio was selected somewhat arbitrarily, it balances the need for effective training with rigorous validation.

      (4) Discussion

      Page 21: "We believe that the de-noised current signals will be beneficial for other downstream tasks." Which tasks? Please list an example.

      We have revised the text for clarity as follows:

      “We believe that the de-noised current signals will be beneficial for other downstream tasks, such as the estimation of m5C, pseudouridine, and other RNA modifications.”

      Page 22: "One can generally observe a clear difference in the intensity levels between 5mers with a m6A and normal adenosine, which is easier for human to interpret if a predicted m6A site is real." This statement is vague and requires qualification. Please reference a study that demonstrates the human ability to interpret two similar graphs, and demonstrate how it relates to the differences observed in your data.

      We apologize for the confusion. We have revised the sentence as follows:

      “One can generally observe a clear difference in the intensity levels between 5mers with an m6A and those with a normal adenosine, which makes it easier for a researcher to interpret whether a predicted m6A site is genuine.”

      We believe that Figures 3A, 3B, and 4B effectively illustrate this concept.

      Page 23: How long does SegPore take for its analyses compared to other similar tools? How long would it take to analyse a typical dataset?

      We have added run-time statistics for datasets of varying sizes in the revised manuscript (see Supplementary Figure S6). This figure illustrates SegPore’s performance across different data volumes to help estimate typical processing times.

      (5) Figures

      Figure 4C. Please number the hierarchical clusters and genomic locations in this figure. They are referenced in the text.

      Following your suggestion, we have labeled the hierarchical clusters and genomic locations in Figure 4C in the revised manuscript.

      In addition, we revised the corresponding sentence in the main text as follows: “Biclustering reveals that modifications at g6 are specific to cluster C4, g7 to cluster C5, and g8 to cluster C6, while the first five genomic locations (g1 to g5) show similar modification patterns across all reads.”

    1. Reviewer #1 (Public review):

      Summary:

      The authors attempted to clarify the impact of N protein mutations on ribonucleoprotein (RNP) assembly and stability using analytical ultracentrifugation (AUC) and mass photometry (MP). These complementary approaches provide a more comprehensive understanding of the underlying processes. Both SV-AUC and MP results consistently showed enhanced RNP assembly and stability due to N protein mutations.

      The overall research design appears well planned, and the experiments were carefully executed.

      Strengths:

      SV-AUC, performed at higher concentrations (3 µM), captured the hydrodynamic properties of bulk assembled complexes, while MP provided crucial information on dissociation rates and complex lifetimes at nanomolar concentrations. Together, the methods offered detailed insights into association states and dissociation kinetics across a broad concentration range. This represents a thorough application of solution physicochemistry.

      Weaknesses:

      Unlike AUC, MP observes only a part of the solution. In MP, bound molecules are accumulated on the glass surface (not dissociated), thus the concentration in solution should change as time develops. How does such concentration change impact the result shown here?

    2. Reviewer #3 (Public review):

      Summary:

      This manuscript investigates how mutations in the SARS-CoV-2 nucleocapsid protein (N) alter ribonucleoprotein (RNP) assembly, stability, and viral fitness. The authors focus on mutations such as P13L, G214C, and G215C, combining biophysical assays (SV-AUC, mass photometry, CD spectroscopy, EM), VLP formation, and reverse genetics. They propose that SARS-CoV-2 exploits "fuzzy complex" principles, where distributed weak interfaces in disordered regions allow both stability and plasticity, with measurable consequences for viral replication.

      Strengths:

      (1) The paper demonstrates a comprehensive integration of structural biophysics, peptide/protein assays, VLP systems, and reverse genetics.

      (2) Identification of both de novo (P13L) and stabilizing (G214C/G215C) interfaces provides a mechanistic insight into RNP formation.

      (3) Strong application of the "fuzzy complex" framework to viral assembly, showing how weak/disordered interactions support evolvability, is a significant conceptual advance in viral capsid assembly.

      (4) Overall, the study provides a mechanistic context for mutations that have arisen in major SARS-CoV-2 variants (Omicron, Delta, Lambda) and a mechanistic basis for how mutations influence phenotype via altered biomolecular interactions.

      Weaknesses:

      (1) The arrangement of N dimers around LRS helices is presented in Figure 1C, but the text concedes that "the arrangement sketched in Figure 1C is not unique" (lines 144-146) and that AF3 modeling attempts yielded "only inconsistent results" (line 149). The authors should therefore present the models more cautiously as hypotheses instead. Additional alternative arrangements should be included in the Supplementary Information, so the readers do not over-interpret a single schematic model.

      (2) Negative-stained EM fibrils (Figure 2A) and CD spectra (Figure 2B) are presented to argue that P13L promotes β-sheet self-association. However, the claim could benefit from more orthogonal validation of β-sheet self-association. Additional confirmation via FTIR spectra or ThT fluorescence could be used to further distinguish structured β-sheets from amorphous aggregation.

      (3) In the main text, the authors alternate between emphasizing non-covalent effects ("a major effect of the cysteines already arises in reduced conditions without any covalent bonds," line 576) and highlighting "oxidized tetrameric N-proteins of N:G214C and N:G215C can be incorporated into RNPs". Therefore, the biological relevance of disulfide redox chemistry in viral assembly in vivo remains unclear. Discussing cellular redox plausibility and whether the authors' oxidizing conditions are meant as a mechanistic stress test rather than physiological mimicry could improve the interpretation of these results.

      The paper could benefit if the authors provide a summary figure or table contrasting reduced vs. oxidized conditions for G214C/G215C mutants (self-association, oligomerization state, RNP stability). Explicitly discuss whether disulfides are likely to form in infected cells.

      (4) VLP assays (Figure 7) show little enhancement for P13L or G215C alone, whereas Figure 8 shows that P13L provides clear fitness advantages. This discrepancy is acknowledged but not reconciled with any mechanistic or systematic rationale. The authors should consider emphasizing the limitations of VLP assays and the sources of the discrepancy with respect to Figure 8.

      (5) Figures 5 and 6 are dense, and the several overlays make them hard to read. The authors should consider picking the most extreme results to make a point in the main Figure 5 and move the other overlays to the Supplementary. Additionally, annotating MP peaks directly with "2×, 4×, 6× subunits" can help non-experts.

      (6) The paper has several names and shorthand notations for the mutants, making it hard to keep up. The authors could include a table that contains mutation keys, with each shorthand (Ancestral, Nο/No, Nλ, etc.) mapped onto exact N mutations (P13L, Δ31-33, R203K/G204R, G214C/G215C, etc.). They could then use the same glyphs (Latin vs Greek) consistently in text and figure labels.

      (7) The EM fibrils (Figure 2A) and CD spectra (Figure 2B) were collected at mM peptide concentrations. These are far above physiological levels and may encourage non-specific aggregation. Similarly, the authors mention "ultra-weak binding energies that require mM concentrations to significantly populate oligomers". On the other hand, the experiments with full-length protein were performed at concentrations closer to biologically relevant concentrations in the micromolar range. While I appreciate the need to work at high concentrations to detect weak interactions, this raises questions about physiological relevance. Specifically:

      a) Could some of the fibril/β-sheet features attributed to P13L (Figure 2A-C) reflect non-specific aggregation at high concentrations rather than bona fide self-association motifs that could play out in biologically relevant scenarios?

      b) How do the authors justify extrapolating from the mM-range peptide behaviors to the crowded but far lower effective concentrations in cells?

      The authors should consider adding a dedicated section (either in Methods or Discussion) justifying the use of high concentrations, with estimation of local concentrations in RNPs and how they compare to the in vitro ranges used here. For concentration-dependent phenomena discussed here, it is vital to ensure that the findings are not artefacts of non-physiological peptide aggregation.

    3. Author response:

      We thank the Reviewers and Editors for their time and insightful comments. We are encouraged by their positive assessment and we look forward to addressing the points raised. Areas of primary concern include (1) the use of high concentrations in peptide experiments; (2) improvement of the presentation and discussion of the results; and (3) clarification of the impact of surface adsorption on the mass photometry analyses.

      Regarding (1), we will better explain why some experiments with the isolated disordered N-terminal extension were necessarily carried out at high concentrations, in order to demonstrate the potential for these peptides to weakly self-associate. While much lower nucleocapsid protein concentrations are present in the cytosol on average, and are used in our ribonucleoprotein assembly experiments, there are two important physiologically relevant cases where high local concentrations do occur: First, high effective concentrations of tethered disordered N-terminal extensions exist locally in the volume sampled by individual ribonucleoprotein complexes, and, second, high nucleocapsid concentrations are prevalent in its macromolecular condensates. Thus, weak interactions of N-terminal extensions can play a critical role in strengthening fuzzy ribonucleoprotein complexes and also in altering condensate properties, both of which were confirmed in our experiments. Nonetheless, we do not expect the observed fibrillar state of the concentrated isolated N-terminal peptide to be physiologically relevant, since physiologically the extensions will always remain tethered to the full-length protein, which impedes fibrillar superstructures.

      (2) We are grateful for the Reviewers’ suggestions to enhance the clarity and accessibility of our findings and to streamline the presentation. We intend to tighten up the text and improve figures throughout, and add discussion points, as proposed.

      (3) We plan to add an analysis of the extent to which irreversible surface adsorption decreases solute concentration in mass photometry, and discuss why this has negligible impact on the conclusions drawn under our experimental conditions.

      In summary, we agree these points all provide opportunities to strengthen the manuscript further and we are glad to revise our manuscript accordingly.

    1. Author response:

      The following is the authors’ response to the original reviews

      Recommendations for the Authors:

      Reviewer #1:

      We think that this manuscript brings an important contribution that will be of interest in the areas of statistical physics, (microbiota) ecology, and (biological) data science. The evidence of their results is solid and the work improves the state-of-the-art in terms of methods. We have a few concerns that, in our opinion, the authors should address.

      Major concerns:

      (1) While the paper could be of interest for the broad audience of e-Life, the way it is written is accessible mainly to physicists. We encourage the authors to take the broad audience into account by i) explaining better the essence of what is being done at each step, ii) highlighting the relevance of the method compared to other methods, iii) discussing the ecological implications of the results.

      Examples on how to approach i) include: Modify or expand Figure 1 so that non-familiar readers can understand the summary of the work (e.g. with cartoons representing communities, diseased states and bacterial interactions and their relationship with the inference method); in each section, summarize at the beginning the purpose of what is going to be addressed in this section, and summarize at the end what the section has achieved; in Figure 2, replace symbols by their meaning as much as possible, and do the same for Figure 1, at the very least in the figure caption.

      Example on how to approach ii): Since the authors aim to establish a bridge between disordered systems and microbiome ecology, it could be useful to expand a bit the introduction on disordered systems for biologists/biophysicists. This could be done with an additional text box, which could also highlight the advantages of this approach in comparison to other techniques (e.g. model-free approaches can also classify healthy and diseased states).

      Example on how to approach iii): The authors could discuss with more depth the ecological implications of their results. For example, do they have a hypothesis on why demographic and neutral effects could dominate in healthy patients?

      We thank the reviewer for the observations. Following the suggestion, each section of the revised version now opens by outlining its goal and closes by summarizing what it has achieved. We also updated Figure 1 and Figure 2.

      (i) For Figure 1, we expanded the figure and hopefully made clearer how we conceptualize the problem, use the data, and establish our method. In Figure 2, we enriched the y labels of each panel with the name associated with the order parameter.

      (ii) We thank the reviewer for helping us improve the readability of the introductory part, thus providing more insights into disordered systems techniques for a broader audience. We have added a few explanations at the end of page 2 to explain the advantages of this methodology compared to other strategies and models.

      (iii) We thank the reviewer for raising the need for a more in-depth ecological discussion of our results. A simple way to understand why neutral effects may dominate in healthy patients is the following. Neutrality implies that species differences are mainly shaped by stochastic processes such as demographic noise, with species treated as different realizations of the same underlying stochastic ecological dynamics. In our analysis, we observe that healthy individuals tend to exhibit highly similar microbial communities, suggesting that the compositional variability among their microbiomes is compatible, at least in part, with the fluctuations expected from demographic stochasticity alone. In contrast, patients with the disease display significantly more heterogeneous microbial compositions. The diversity and structure of their gut communities cannot be satisfactorily explained by neutral demographic fluctuations alone.

      This discrepancy implies that additional deterministic forces—such as altered ecological interactions—are driving the divergence observed in dysbiotic states. In diseased individuals, the breakdown of such interactions leads to a structurally distinct regime that may correspond to a phase of marginal stability, as indicated by our theoretical modeling. This shift marks a transition from a community governed by neutrality and demographic noise to one dominated by non-neutral ecological forces (as depicted in Figure 4). We added these comments in the discussion section of the revised manuscript.

      (2) Taking into account the broader audience, we invite the authors to edit the abstract, as it seems to jump from one ecological concept to another without explicitly communicating what is the link between these concepts. From the first two sentences, the motivation seems to be species diversity, but no mention of diversity comes after the second sentence. There is no proper introduction/definition of what macroecological states are. After that, the authors switch to healthy and unhealthy states, without previously introducing any link between gut microbiota states and the host’s health (which perhaps could be good in the first or second sentence, although other framings can be as valid). After that, interactions appear in the text and are related to instability, but the reader might not know whether this is surprising or if healthy/unhealthy states are generally related to stability.

      We pointed out a few examples, but the authors could extend their revision on i), ii) and iii) beyond such specific comments. In our opinion, this would really benefit the paper.

      In response to the reviewer’s concern about conceptual clarity and structure, we substantially revised the abstract to improve its accessibility and logical flow. In the revised abstract, we now clearly link species diversity to microbiome structure and function from the outset, addressing initial confusion. We provide a concise definition of “macroecological states,” framing them as reproducible statistical patterns reflecting community-level properties. Additionally, the revised version explicitly connects gut microbiome states to host health earlier, resolving the previous abrupt shift in focus. Finally, we conclude by highlighting how disordered systems theory advances our understanding of microbiome stability and functioning, reinforcing the novelty and broader significance of our approach. Overall, the revised abstract better serves a broad interdisciplinary audience, including readers unfamiliar with the technicalities of disordered systems or microbial ecology, while preserving the scientific depth and accuracy of our work.

      (3) The connection with consumer-resource (CR) models is quite unusual. In Equation (12), why do the authors assume that the consumption term does not depend on R? This should be addressed, since this term is usually dependent on R in microbial ecology models.

      In case this is helpful, it is known that the symmetric Lotka-Volterra model emerges from time-scale separation in the MacArthur model, where resources reproduce logistically and are consumed by other species (e.g., plants eaten by herbivores). Consumer-resource models form a broad category, while the MacArthur model is a specific case featuring logistic resource growth. For microbes, a more meaningful justification of the generalized Lotka-Volterra (GLV) model from a consumer-resource perspective involves the consumer-resource dynamics in a chemostat, where time-scale separation is assumed and higher-order interactions are neglected. See, for example: a) The classic paper by MacArthur: R. MacArthur. Species packing and competitive equilibrium for many species. Theoretical Population Biology, 1(1):1-11, 1970. b) Recent works on time-scale separation in chemostat consumer-resource models: Anna Posfai et al., PRL, 2017 Sireci et al., PNAS, 2023 Akshit Goyal et al., PRX-Life, 2025

      We thank the reviewer for the observation. We apologize for the typo that appeared in the main text, which we have now corrected. The consumer-resource model we had in mind is the classical case proposed by MacArthur, where resources are self-regulated according to a logistic growth mechanism, which leads to the generalized Lotka-Volterra model we employ in our work.
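
      For readers less familiar with this reduction, the standard time-scale separation argument can be sketched as follows. The notation is an assumption introduced here for illustration and need not match the manuscript's Equation (12): N_i are consumer abundances, R_α resource abundances, c_{iα} consumption rates, w_α resource values, m_i maintenance costs, and r_α, K_α the logistic growth rate and carrying capacity of resource α.

```latex
% MacArthur consumer-resource model (notation assumed for illustration)
\begin{align}
\frac{dN_i}{dt} &= N_i \Big( \sum_{\alpha} w_\alpha c_{i\alpha} R_\alpha - m_i \Big), \\
\frac{dR_\alpha}{dt} &= R_\alpha \Big[ r_\alpha \Big( 1 - \frac{R_\alpha}{K_\alpha} \Big)
                        - \sum_{j} c_{j\alpha} N_j \Big].
\end{align}
% Fast resources: set dR_\alpha/dt = 0 with R_\alpha > 0
\begin{equation}
R_\alpha^{*} = K_\alpha \Big( 1 - \frac{1}{r_\alpha} \sum_{j} c_{j\alpha} N_j \Big).
\end{equation}
% Substituting R_\alpha^{*} into the consumer equation gives a Lotka-Volterra form
\begin{equation}
\frac{dN_i}{dt} = N_i \Big( g_i - \sum_{j} A_{ij} N_j \Big), \qquad
g_i = \sum_{\alpha} w_\alpha c_{i\alpha} K_\alpha - m_i, \qquad
A_{ij} = \sum_{\alpha} \frac{w_\alpha K_\alpha}{r_\alpha}\, c_{i\alpha} c_{j\alpha}.
\end{equation}
```

      The resulting interaction matrix satisfies A_{ij} = A_{ji}, which is the symmetric Lotka-Volterra structure the reviewer refers to. The reduction holds only while every R_α^{*} remains positive.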

      Minor concerns:

      (1) The title has a nice pun for statistical physicists, but we wonder if it can be a bit confusing for the broader audience of e-Life. Although we leave this to the authors’ decision, we’d recommend considering changing the title, making it more explicit in communicating the main contribution/result of the work.

      Following the reviewer’s suggestion, we have introduced an explanatory subtitle: “Linking Species Interactions to Dysbiosis through a Disordered Lotka-Volterra Framework”.

      (2) Review the references - some preprints might have already been published: Pasqualini J. 2023, Sireci 2022, Wu 2021.

      We thank the reviewer for pointing our attention to this inaccuracy. We updated the references to Pasqualini and Sireci papers. To our knowledge, Wu’s paper has appeared as an arXiv preprint only.

      (3) Species do not generally exhibit identical carrying capacities (see Grilli, Nat. Commun., 2020); some taxa are generally more abundant than others. The authors could discuss whether the model, with the inferred parameters, can accurately reproduce the distribution of species’ mean abundances.

      We thank the reviewer for this insightful comment. As discussed in the revised manuscript (lines 294–299), our current model does not accurately reproduce the empirical species abundance distribution (SAD). This limitation stems from the assumption of constant carrying capacities across species. Empirical observations (e.g., Grilli et al., Nat. Commun., 2020 [1]) show heterogeneous mean abundances, often following power-law or log-normal distributions, whereas our model assumes a constant carrying capacity and therefore yields SADs devoid of fat tails, which diverge from empirical data.

      This simplification is implemented to maintain the analytical tractability of the disordered generalized Lotka-Volterra (dGLV) framework, a common approach also found in prior works such as Bunin (2017) and Barbier et al. (2018) [2, 3]. Introducing heterogeneity in carrying capacities, such as drawing them from a log-normal distribution, or switching to multiplicative (rather than demographic) noise, could indeed produce SADs that better align with empirical data. Nevertheless, implementing these changes would significantly complicate the analytical treatment.

      We acknowledge these directions as promising avenues for future research. They could help enhance the empirical realism of the model and its capacity to capture observed macroecological patterns while posing new theoretical challenges for disordered systems analysis.

      (4) A substantial number of cited works (Grilli, Nat. Commun., 2020; Zaoli & Grilli, Science Advances, 2021; Sireci et al., PNAS, 2023; Po-Yi Ho et al., eLife, 2022) suggest that environmental fluctuations play a crucial role in shaping microbiome composition and dynamics. Is the authors’ analysis consistent with this perspective? Do they expect their conclusions to remain robust if environmental fluctuations are introduced?

      We thank the reviewer for stressing this point. The introduction of environmental fluctuations in the model formally violates detailed balance, thereby preventing the definition of an energy function. To date, no study has integrated random interactions together with both demographic and environmental noise within a unified analytical framework. This is certainly a highly promising direction that some of the authors are already exploring. However, given the inherently out-of-equilibrium nature of the system and the absence of a free energy, we would need to adopt a Dynamical Mean-Field Theory formalism and eventually analyze the corresponding stationary equations to be solved self-consistently. We added, however, a brief note in the Discussion section.

      (5) The term “order parameters“ may not be intuitive for a biological audience. In any case, the authors should explicitly define each order parameter when first introduced.

      We thank the reviewer for the comment. We now name each order parameter as soon as it is introduced, along with a brief explanation of its meaning that should be accessible to an audience with a biological background.

      (6) Line 242: Should ψU be ψD?

      We thank the reviewer for the observation. We corrected the typo.

      (7) Given that the authors are discussing healthy and diseased states and to avoid confusion, the authors could perhaps use another word for ’pathological’ when they refer to dynamical regimes (e.g., in Appendix 2: ’letting the system enter the pathological regime of unbounded growth’).

      We thank the reviewer for the helpful comment. As suggested, we used the term “unphysical” instead of “pathological” where needed.

      Reviewer #2:

      (1) A technical point that I could not understand is how the authors deal with compositional data. One reason for my confusion is that the order parameters h and q0 are fixed in data to 1/S and 1/S<sup>2</sup>, and thus I do not see how they can be informative. Same for carrying capacity, why is it not 1 if considering relative abundance?

      We thank the reviewer for raising this point. We acknowledge that the treatment of compositional data and the interpretation of order parameters h and q0 were not sufficiently clarified in the manuscript. Additionally, there was an imprecision in the text regarding the interpretation of these parameters.

      As defined in revised Eq. (4) of the manuscript, h and q0 are to be averaged over the entire dataset, summing across samples α. Specifically, and , where S<sub>α</sub> is the number of species present in sample α and is the average over samples. These parameters are therefore informative, as they encapsulate sample-level ecological diversity, and their variation reflects biological differences between healthy and diseased states. For instance, Pasqualini et al., 2024 [4] reported significant differences in these metrics between health conditions, thereby supporting their ecological relevance.

      Regarding carrying capacities, we clarify that although we work with relative abundance data (i.e., compositional data), we do not fix the carrying capacity K to 1. Instead, we set K to the maximum value of xi (relative abundance) within each sample, to preserve compatibility with empirical data and allow for coexistence. While this remains a modeling assumption, it ensures better ecological realism within the constraints of the disordered GLV framework.
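
      The following sketch illustrates how such sample-level quantities can vary across samples even for compositional data. It computes, per sample, the number of species present, the mean relative abundance over present species (which equals 1/S<sub>α</sub> by construction), and the maximum relative abundance used as K. The second-moment expression used for q0 is an assumption made for illustration, since revised Eq. (4) is not reproduced here; all function names and the toy counts are hypothetical.

```python
def relative_abundances(counts):
    """Normalize raw counts so that they sum to 1 within a sample."""
    total = sum(counts)
    return [c / total for c in counts]


def sample_summaries(counts):
    """Return (S_alpha, mean abundance, second moment, K) for one sample."""
    present = [x for x in relative_abundances(counts) if x > 0]
    s_alpha = len(present)
    mean_x = sum(present) / s_alpha                       # equals 1 / S_alpha
    second_moment = sum(x * x for x in present) / s_alpha  # assumed form for q0
    carrying_capacity = max(present)                      # K set to max x_i in the sample
    return s_alpha, mean_x, second_moment, carrying_capacity


def dataset_order_parameters(samples):
    """Average the per-sample quantities over all samples alpha."""
    rows = [sample_summaries(counts) for counts in samples]
    h = sum(row[1] for row in rows) / len(rows)
    q0 = sum(row[2] for row in rows) / len(rows)
    return h, q0


# Toy data: two samples with different numbers of species present.
samples = [[10, 30, 60, 0], [5, 5, 0, 0]]
h, q0 = dataset_order_parameters(samples)
print(h, q0)
print([sample_summaries(counts)[3] for counts in samples])
```

      Because S<sub>α</sub> differs between samples, the dataset-level averages are not fixed by normalization alone, which is the point made in the response.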

      (2) Obviously I’m missing something, so it would be nice to clarify in simple terms the logic of the argument. I understand that Lagrange multipliers are going to be used in the model analysis, and there are a lot of technical arguments presented in the paper, but I would like a much more intuitive explanation about the way the data can be used to infer order parameters if those are fixed by definition in compositional data.

      We thank the reviewer for the observation. The order parameters can be measured directly from the data, even in the presence of compositionality, as explained above. We can connect those parameters with the theory even for compositional data, because the only effect of adding the compositionality constraint is to shift the linear coefficient in the Hamiltonian, which corresponds to shifting the average interaction µ. However, the resulting phase diagram is mostly affected by the variance of the interactions σ<sup>2</sup> (as µ is such that we are in the bounded phase).

      (3) Another point that I did not understand comes from the fact that the authors claim that interaction variance is smaller in unhealthy microbiomes. Yet they also find that those are closer to instability, and are more driven by niche processes. I would have expected the opposite to be true, more variance in the interactions leading to instability (as in May’s original paper for instance). Is this apparent paradox explained by covariations in demographic stochasticity (T) and immigration rate (lambda)? If so, I think it would be very useful to comment on that.

      As Altieri and coworkers showed in their PRL (2021) [5], the phase diagram of our model differs fundamentally from that of Biroli et al. (2018) [6]. In the latter, the intuitive rule – greater interaction variance yields greater instability – indeed holds. For the sake of clarity, we have attached below the resulting phase diagram obtained by Altieri et al.

      The apparent paradox arises because the two phase diagrams are tuned by different parameters. Consequently, even at low temperature and with weak interaction variance, our system may sit nearer to the replica-symmetry-breaking (RSB) line.

      Fig. 3 in the main text is not a (σ,T) phase diagram in which all other parameters are kept constant. Rather, it is a plot of the σ and T parameters inferred from the data (without showing the corresponding µ).

      To capture the full, non-trivial influence of all parameters on stability, we studied the so-called “replicon eigenvalue” in the RS (i.e. single equilibrium) approximation. This leading eigenvalue measures how close a given set of inferred parameters – and hence a microbiome – is to the RSB threshold. For a visual representation of these findings, refer to Figure 4.

      Author response image 1.

      (4) What do the empirical SAD look like? It would be nice to see the actual data and how the theoretical SADs compare.

      The empirical species abundance distributions (SADs) analyzed in our study are presented and discussed in detail in Pasqualini et al., 2024 [4]. Given the overlap in content, we chose not to reproduce these figures in the current manuscript to avoid redundancy.

      As we also clarify in the revised text, the theoretical SADs derived from the disordered generalized Lotka-Volterra (dGLV) model in the unique-fixed-point phase typically exhibit exponential tails. These distributions do not match the heavier-tailed patterns (e.g., log-normal or power-law-like) observed in empirical microbiome data. This discrepancy stems from the simplifying assumptions of the dGLV framework, including the use of constant carrying capacities and demographic noise.

      In the revised manuscript, we have added a brief discussion to explicitly acknowledge this limitation and emphasize it as a direction for future refinement of the model, such as incorporating heterogeneous carrying capacities or exploring alternative noise structures.

      (5) Some typos: often “niche” is written “nice”.

      We thank the reviewer for this suggestion. After inspecting the text, we corrected the reported typos.

      Reviewer #3:

      Major comments:

      (1) In the S3 text, the authors say that filtered metagenomic reads were processed using the software Kaiju. The description of the pipeline does not mention how core genes were selected, which is often a crucial step in determining the abundance of a species in a metagenomic sample. In addition, the senior author of this manuscript has published a version of Kaiju that leverages marker genes classification methods (deemed Core-Kaiju), but it was not used for either this manuscript or Pasqualini et al. (2024; Tovo et al., 2020). I am not suggesting that the data necessarily needs to be reprocessed, but it would be useful to know how core genes were chosen in Pasqualini et al. and why Core-Kaiju was not used (2024).

      Prior to the current manuscript and the PLOS Computational Biology paper by Pasqualini et al. [4], we applied the Core-Kaiju protocol to the same dataset used in both studies. However, this tool was originally developed and validated using general catalogs of culturable organisms and was not specifically tuned for gut microbiomes. As a result, we found that in many samples Core-Kaiju would retain only very few species (in some samples, the number of identified species was as low as 5–10), undermining the reliability of the analysis. Due to these limitations, we opted to use the standard Kaiju version in our work. We are actively developing an improved version of the Core-Kaiju protocol that will overcome these limitations, and preliminary results (not shown here) indicate that the obtained patterns are robust in this case as well.

      (2) My understanding of Pasqualini et al. was that diseased patients experienced larger fluctuations in abundance, while in this study, they had smaller fluctuations (Figure 3a; 2024). Is this a discrepancy between the two models or is there a more nuanced interpretation?

      We thank the reviewer for the observation. This is only an apparent discrepancy, as the term “fluctuation” has different meanings in the two contexts. The fluctuations referred to by the reviewer correspond to a parameter of our theory, namely noise in the interactions. Conversely, in Pasqualini et al., σ indicates environmental fluctuations. Nevertheless, there is no conceptual discrepancy in our results: in both studies, unhealthy microbiomes were found to be less stable. Indeed, Fig. 4 of this study shows that unhealthy microbiomes lie closer to the RSB line, a phenomenon that is also associated with enhanced fluctuations.

      (3) Line 38-41: It would be helpful to explicitly state what “interaction patterns” are being referenced here. The final sentence could also be clarified. Do microbiomes “host“ interactions or are they better described as a property (“have”, “harbor”). The word “host” may confuse some readers since it is often used to refer to the human host. I am also not sure what point is being made by “expected to govern natural ones”. There are interactions between members of a microbiome; experimental studies have characterized some of these interactions, which we expect to relate in some way to interactions in nature. Is this what the authors are saying?

      Thanks. We agree that this sentence was not clear. Indeed, we are referring to pairwise species interactions and not to host-microbiome interactions. We have rewritten this part in the following way: In fact, recent work shows that the network-level properties of species-species interactions —for example, the sign balance, average strength, and connectivity of the inferred interaction matrix— shift systematically between healthy and dysbiotic gut communities (see for instance, [7, 8]). Pairwise species interactions have been quantified in simplified in-vitro consortia [9, 10]; we assume that the same classes of interactions also operate—albeit in a more complex form—in the native gut microbiome.

      (4) Line 43: I appreciate that the authors separated neutral vs. logistic models here.

      (5) Lines 51-75: The framing here is well-written and convincing. Network inference is an ongoing, active subject in ecology, and there is an unfortunate focus on inferring every individual interaction because ecologists with biology backgrounds are not trained to think about the problem in the language of statistical physics.

      We thank the reviewer for these positive comments.

      (6) Line 87: Perhaps I’m missing something obvious, but I don’t see how ρi sets the intrinsic timescale of the dynamics when its units are 1/(time*individuals), assuming the dimensions of ri are inverse time.

      We thank the reviewer for the observation. We corrected this phrase in the main text.

      (7) Lines 189-190: “as close as possible to the data” it would aid the reader if you specified the criteria meant by this statement.

      We thank the reviewer for the observation. We removed the sentence, as it introduced some redundancy in our argument. In the subsequent text, the proposed method is explained in detail.

      (8) Line 198: It would aid the reader if you provided some context for what the T - σ plane represents.

      We thank the referee for the helpful suggestion. We have clarified the mutual role of the demographic noise amplitude and the strength of the random interaction matrix, as theoretically predicted in the PRL (2021) by Altieri and coworkers [5]. Please find an additional paragraph on page 6 of the resubmitted version.

      (9) Line 217: Specifying what is meant by “internal modes“ would aid the typical life science reader.

      We thank the reviewer for the suggestion. Recognizing that referring to “internal modes” to describe the SAD shape in that context might cause confusion, we replaced “internal modes“ with “peaks”.

      (10) Line 219: Some additional justification and clarification are needed here, as some may think of “m“ as being biomass.

      We added a sentence to better explain this concept. “In classical and quantum field theory, the particle-particle interaction embedded in the quadratic term is typically referred to as a mass source. In the context of this study, m captures quadratic fluctuations of species abundances, as also appearing in the expression of the leading eigenvalue of the stability matrix.”

      Minor comments:

      (1) I commend the authors for removing metagenomic reads that mapped to the human genome in the preprocessing stage of their pipeline. This may seem like an obvious pre-processing step, but it is unfortunately not always implemented.

      We thank the referee for pointing out this potential issue. The data used in this work, as well as the bioinformatic workflow used to generate them, have been described in detail in Pasqualini et al., 2024 [4]. As one of the main preprocessing steps, we remove reads mapping to the human genome.

      (2) Line 13: “Bacterial“ excludes archaea, and while you may not have many high-abundance archaea in your human gut data, this sentence does not specify the human gut. Usually, this exclusion is averted via the term “microbial“, though sometimes researchers raise objections to the term when the data does not include fungal members (e.g., all 16S studies).

      We thank the reviewer for this suggestion. To include archaeal organisms, we adopted the term “microbial” instead of “bacterial”.

      (3) Line 18: This manuscript is being submitted under the “Physics of Living Systems“ tract, but it may be useful to explicitly state in the Abstract that disordered systems are a useful approach for understanding large, complex communities for the benefit of life science researchers coming from a biology background.

      Thank you. We have modified the abstract following this suggestion.

      (4) Line 68: Consider using “adapted“ or something similar instead of “mutated“ if there is no specific reason for that word choice.

      We thank the reviewer for this suggestion, which was implemented in the text.

      (5) Line 111: It would be useful to define annealed and quenched for a general life science audience.

      We thank the reviewer for this suggestion. In the “Results” section, we have opted for “time-dependent disordered interactions” to reach a broader audience and avoid any jargon. Moreover, in the Discussion we added a detailed footnote: “In contrast to the quenched approximation, the annealed version assumes that the random couplings are not fixed but instead fluctuate over time, with their covariance governed by independent Ornstein–Uhlenbeck processes.”
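
      As a rough illustration of what a time-dependent coupling means in practice, the sketch below simulates a single Ornstein–Uhlenbeck process using its exact discrete-time update. The parameter names `theta` (relaxation rate), `mu` (long-run mean) and `sigma` (noise amplitude) and their values are assumptions chosen for illustration; the manuscript's own parametrization of the fluctuating couplings may differ.

```python
import math
import random


def simulate_ou(theta, mu, sigma, x0, dt, n_steps, seed=0):
    """Simulate dx = theta * (mu - x) dt + sigma dW with the exact update rule."""
    rng = random.Random(seed)
    decay = math.exp(-theta * dt)
    # Standard deviation of the Gaussian increment over one step of length dt.
    noise_sd = math.sqrt(sigma ** 2 / (2.0 * theta) * (1.0 - decay ** 2))
    path = [x0]
    for _ in range(n_steps):
        path.append(mu + (path[-1] - mu) * decay + noise_sd * rng.gauss(0.0, 1.0))
    return path


# One hypothetical coupling that fluctuates in time around zero.
path = simulate_ou(theta=1.0, mu=0.0, sigma=1.0, x0=0.0, dt=0.1, n_steps=50_000)

# The stationary variance of the process is sigma^2 / (2 * theta).
stationary_var = 1.0 ** 2 / (2.0 * 1.0)
sample_var = sum(x * x for x in path) / len(path)
print(stationary_var, sample_var)
```

      In the quenched case the coupling would simply stay at its initial draw for all times, whereas here it decorrelates on a timescale of about 1/`theta`.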

      (6) Line 124: Likewise for the replicon sector.

      We thank the reviewer for the suggestion. We added a footnote on page 4, after the formula, to highlight the physical intuition behind the introduction of the replicon mode.

      “The replicon eigenvalue refers to a particular type of fluctuation around the saddle-point (mean-field) solution within the replica framework. When the Hessian matrix of the replicated free energy is diagonalized, fluctuations are divided into three sectors: longitudinal, anomalous, and replicon. The replicon mode is the most sensitive to criticality signaling – by its vanishing trend – the emergence of many nearly-degenerate states. It essentially describes how ‘soft’ the system is to microscopic rearrangements in configuration space.”

      (7) Figure 2: It would be helpful to include y-axis labels for each order parameter alongside the mathematical notation.

      We thank the reviewer for this suggestion. The y-axis of Figure 2 now includes, alongside the mathematical symbol, the label of the represented quantity.

      (8) Line 242: Subscript “U” is used to denote “Unhealthy” microbiomes, but “D” is used to denote “Diseased” in Figs. 2 and 3 (perhaps elsewhere as well).

      We thank the reviewer for this observation. After checking the various subscripts in the text, and consistently with Figures 2 and 3, we harmonized our notation, adopting the subscript “D” for symbols related to the diseased/unhealthy condition.

      (9) Line 283: “not to“ should be “not due to“

      We thank the reviewer for this suggestion. After inspecting the text, we corrected the reported error.

      (10) Equations 23, 34: Extra “=“ on the RHS of the first line.

      We consistently follow the same formatting across all the line breaks in the equations throughout the text.

      We are thus resubmitting our paper, hoping to have satisfactorily addressed all referees’ concerns.

      References

      (1) Jacopo Grilli. Macroecological laws describe variation and diversity in microbial communities. Nature communications, 11(1):4743, 2020.

      (2) Guy Bunin. Ecological communities with Lotka-Volterra dynamics. Physical Review E, 95(4):042414, 2017.

      (3) Matthieu Barbier, Jean-François Arnoldi, Guy Bunin, and Michel Loreau. Generic assembly patterns in complex ecological communities. Proceedings of the National Academy of Sciences, 115(9):2156–2161, 2018.

      (4) Jacopo Pasqualini, Sonia Facchin, Andrea Rinaldo, Amos Maritan, Edoardo Savarino, and Samir Suweis. Emergent ecological patterns and modelling of gut microbiomes in health and in disease. PLOS Computational Biology, 20(9):e1012482, 2024.

      (5) Ada Altieri, Felix Roy, Chiara Cammarota, and Giulio Biroli. Properties of equilibria and glassy phases of the random Lotka-Volterra model with demographic noise. Physical Review Letters, 126(25):258301, 2021.

      (6) Giulio Biroli, Guy Bunin, and Chiara Cammarota. Marginally stable equilibria in critical ecosystems. New Journal of Physics, 20(8):083051, 2018.

      (7) Amir Bashan, Travis E Gibson, Jonathan Friedman, Vincent J Carey, Scott T Weiss, Elizabeth L Hohmann, and Yang-Yu Liu. Universality of human microbial dynamics. Nature, 534(7606):259–262, 2016.

      (8) Marcello Seppi, Jacopo Pasqualini, Sonia Facchin, Edoardo Vincenzo Savarino, and Samir Suweis. Emergent functional organization of gut microbiomes in health and diseases. Biomolecules, 14(1):5, 2023.

      (9) Jared Kehe, Anthony Ortiz, Anthony Kulesa, Jeff Gore, Paul C Blainey, and Jonathan Friedman. Positive interactions are common among culturable bacteria. Science advances, 7(45):eabi7159, 2021.

      (10) Ophelia S Venturelli, Alex V Carr, Garth Fisher, Ryan H Hsu, Rebecca Lau, Benjamin P Bowen, Susan Hromada, Trent Northen, and Adam P Arkin. Deciphering microbial interactions in synthetic human gut microbiome communities. Molecular systems biology, 14(6):e8157, 2018.

    2. Reviewer #3 (Public review):

      Summary:

      I found the manuscript to be well-written. I have a few questions regarding the model, though the bulk of my comments are requests to provide definitions and additional clarity. There are concepts and approaches used in this manuscript that are clear boons for understanding the ecology of microbiomes but are rarely considered by researchers approaching the manuscript from a traditional biology background. The authors have clearly considered this in their writing of S1 and S2, so addressing these comments should be straightforward. The methods section is particularly informative and well-written, with sufficient explanations of each step of the derivation that should be informative to researchers in the microbial life sciences that are not well-versed with physics-inspired approaches to ecology dynamics.

      Strengths:

      The modeling efforts of this study primarily rely on a disordered form of the generalized Lotka-Volterra (gLV) model. This model can be appropriate for investigating certain systems and the authors are clear about when and how more mechanistic models (i.e., consumer-resource) can lead to gLV. Phenomenological models such as this have been found to be highly useful for investigating the ecology of microbiomes, so this modeling choice seems justified, and the limitations are laid out.

      Weaknesses:

      The authors use metagenomic data of diseased and healthy patients that was first processed in Pasqualini et al. (2024). The use of metagenomic data leads me into a question regarding the role of sampling effort (i.e., read counts) in shaping model parameters such as $h$. This parameter is equal to the average of 1/# species across samples because the data are compositional in nature. My understanding is that $h$ was calculated using total abundances (i.e., read counts). The number of observed species is strongly influenced by sampling effort and the authors addressed this point in their revised manuscript.

      However, the role of sampling effort can depend on the type of data and my instinct about the role that sampling effort plays in species detection is primarily based on 16S data. The dependency between these two variables may be less severe for the authors' metagenomic pipeline. This potential discrepancy raises a broader issue regarding the investigation of microbial macroecological patterns and the inference of ecological parameters. Often microbial macroecology researchers rely on 16S rRNA amplicon data because that type of data is abundant and comparatively low-cost. Some in microbiology and bioinformatics are increasingly pushing researchers to choose metagenomics over 16S. Sometimes this choice is valid (discovery of new MAGs, investigate allele frequency changes within species, etc.), sometimes it is driven by the false equivalence "more data = better". The outcome though is that we have a body of more-or-less established microbial macroecological patterns which rest on 16S data and are now slowly incorporating results from metagenomics. To my knowledge there has not been a systematic evaluation of the macroecological patterns that do and do not vary by one's choice in 16S vs. metagenomics. Several of the authors in this manuscript have previously compared the MAD shape for 16S and metagenomic datasets in Pasqualini et al. (2024), but moving forward a more comprehensive study seems necessary. These points were addressed by the authors in their revised manuscript.
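
      The dependence of observed richness, and hence of 1/S, on sampling effort can be sketched with a simple rarefaction: reads are subsampled without replacement to a fixed depth and the number of detected species is recounted. The toy count vector, the depths and the function names are hypothetical and are not taken from the authors' pipeline.

```python
import random


def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement from a vector of read counts."""
    rng = random.Random(seed)
    # Expand counts into one species label per read, then draw reads at random.
    reads = [species for species, c in enumerate(counts) for _ in range(c)]
    rarefied = [0] * len(counts)
    for species in rng.sample(reads, depth):
        rarefied[species] += 1
    return rarefied


def observed_richness(counts):
    """Number of species detected at least once."""
    return sum(1 for c in counts if c > 0)


# Toy community with a few dominant and several rare species.
counts = [5000, 2000, 500, 100, 20, 5, 2, 1]
for depth in (100, 1000, sum(counts)):
    sub = rarefy(counts, depth)
    s_obs = observed_richness(sub)
    print(depth, s_obs, 1.0 / s_obs)
```

      Rare species tend to drop out at shallow depth, so the per-sample value of 1/S can only stay the same or increase as sampling effort decreases.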

      Final review: The authors addressed all comments and I have no additional comments.

      References

      Pasqualini, Jacopo, et al. "Emergent ecological patterns and modelling of gut microbiomes in health and in disease." PLOS Computational Biology 20.9 (2024): e1012482.

    1. Article 3 of the Convention Against Torture
      1. No State Party shall expel, return ("refouler") or extradite a person to another State where there are substantial grounds for believing that he would be in danger of being subjected to torture.

      2. For the purpose of determining whether there are such grounds, the competent authorities shall take into account all relevant considerations including, where applicable, the existence in the State concerned of a consistent pattern of gross, flagrant or mass violations of human rights.

    2. subsection 99(3)

      (3) A claim for refugee protection made by a person inside Canada must be made in person to an officer, must not be made by a person who is subject to a removal order, and is governed by this Part.

    3. A designated foreign national on whom refugee protection is conferred under paragraph 95(1)(b) or (c) must report to an officer in accordance with the regulations.

      95 (1) Refugee protection is conferred on a person when

      (a) the person has been determined to be a Convention refugee or a person in similar circumstances under a visa application and becomes a permanent resident under the visa or a temporary resident under a temporary resident permit for protection reasons;

      **(b) the Board determines the person to be a Convention refugee or a person in need of protection; or

      (c) except in the case of a person described in subsection 112(3), the Minister allows an application for protection.**

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The study explored the biomechanics of kangaroo hopping across both speed and animal size to try and explain the unique and remarkable energetics of kangaroo locomotion.

      Strengths:

      The study brings kangaroo locomotion biomechanics into the 21st century. It is a remarkably difficult project to accomplish. There is excellent attention to detail, supported by clear writing and figures.

      Weaknesses:

      The authors oversell their findings, but the mystery still persists. 

      The manuscript lacks a big-picture summary with pointers to how one might resolve the big question.

      General Comments

      This is a very impressive tour de force by an all-star collaborative team of researchers. The study represents a tremendous leap forward (pun intended) in terms of our understanding of kangaroo locomotion. Some might wonder why such an unusual species is of much interest. But, in my opinion, the classic study by Dawson and Taylor in 1973 of kangaroos launched the modern era of running biomechanics/energetics and applies to varying degrees to all animals that use bouncing gaits (running, trotting, galloping and of course hopping). The puzzling metabolic energetics findings of Dawson & Taylor (little if any increase in metabolic power despite increasing forward speed) remain a giant unsolved problem in comparative locomotor biomechanics and energetics. It is our "dark matter problem".

      Thank you for the kind words.

      This study is certainly a hop towards solving the problem. But, the title of the paper overpromises and the authors present little attempt to provide an overview of the remaining big issues. 

      We have modified the title to reflect this comment: “Postural adaptations may contribute to the unique locomotor energetics seen in hopping kangaroos”.

      The study clearly shows that the ankle and to a lesser extent the mtp joint are where the action is. They clearly show in great detail by how much and by what means the ankle joint tendons experience increased stress at faster forward speeds.

      Since these were zoo animals, direct measures were not feasible, but the conclusion that the tendons are storing and returning more elastic energy per hop at faster speeds is solid. The conclusion that net muscle work per hop changes little from slow to fast forward speeds is also solid. 

      Doing less muscle work can only be good if one is trying to minimize metabolic energy consumption. However, to achieve greater tendon stresses, there must be greater muscle forces. Unless one is willing to reject the premise of the cost of generating force hypothesis, that is an important issue to confront. Further, the present data support the Kram & Dawson finding of decreased contact times at faster forward speeds. Kram & Taylor and subsequent applications of (and challenges to) their approach supports the idea that shorter contact times (tc) require recruiting more expensive muscle fibers and hence greater metabolic costs. Therefore, I think that it is incumbent on the present authors to clarify that this study has still not tied up the metabolic energetics across speed problems and placed a bow atop the package. 

      Fortunately, I am confident that the impressive collective brain power that comprises this author list can craft a paragraph or two that summarizes these ideas and points out how the group is now uniquely and enviably poised to explore the problem more using a dynamic SIMM model that incorporates muscle energetics (perhaps ala' Umberger et al.). Or perhaps they have other ideas about how they can really solve the problem.

      You have raised important points; thank you for this feedback. We have added a “Considerations and limitations” section to the Discussion, which highlights that there are still unanswered questions (Lines 311-328).

      Considerations and limitations

      “First, we believe it is more likely that the changes in moment arms and EMA can be attributed to speed rather than body mass, given the marked changes in joint angles and ankle height observed at faster hopping speeds. However, our sample included a relatively narrow range of body masses (13.7 to 26.6 kg) compared to the potential range (up to 80 kg), limiting our ability to entirely isolate the effects of speed from those of mass. Future work should examine a broader range of body sizes. Second, kangaroos studied here only hopped at relatively slow speeds, which bounds our estimates of EMA and tendon stress to a less critical region. As such, we were unable to assess tendon stress at fast speeds, where increased forces would reduce tendon safety factors closer to failure. A different experimental or modelling approach may be needed, as kangaroos in enclosures seem unwilling to hop faster over force plates. Finally, we did not determine whether the EMA of proximal hindlimb joints (which are more difficult to track via surface motion capture markers) remained constant with speed. Although the hip and knee contribute substantially less work than the ankle joint (Fig. 4), the majority of kangaroo skeletal muscle is located around these proximal joints. A change in EMA at the hip or knee could influence a larger muscle mass than at the ankle, potentially counteracting or enhancing energy savings in the ankle extensor muscle-tendon units. Further research is needed to understand how posture and muscles throughout the whole body contribute to kangaroo energetics.”

      Additionally, we added a line “Peak GRF also naturally increased with speed together with shorter ground contact durations (Fig. 2b, Suppl. Fig 1b)” (line 238) to highlight that we are not proposing that changes in EMA alone explain the full increase in tendon stress. Both GRF and EMA contribute substantially (almost equally) to stress, and we now give more equal discussion to both. For instance, we now also evaluate how much each contributes: “If peak GRF were constant but EMA changed from the average value of a slow hop to a fast hop, then stress would increase 18%, whereas if EMA remained constant and GRF varied by the same principles, then stress would only increase by 12%. Thus, changing posture and decreasing ground contact duration both appear to influence tendon stress for kangaroos, at least for the range of speeds we examined” (Line 245-249)
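
      The two counterfactuals quoted above can be reproduced with a simplified static lever balance, in which tendon force equals GRF divided by EMA and stress equals tendon force divided by tendon cross-sectional area. This is a sketch under that assumption, not the authors' musculoskeletal model. The baseline GRF, EMA and tendon area below are hypothetical, and the fast-hop values are chosen only so that the single-factor changes match the 18% and 12% figures quoted in the response.

```python
def tendon_stress(grf_newtons, ema, tendon_area_m2):
    """Tendon stress (Pa) from a static lever balance: force = GRF / EMA."""
    tendon_force = grf_newtons / ema
    return tendon_force / tendon_area_m2


def relative_change(new, old):
    """Fractional change from `old` to `new`."""
    return (new - old) / old


# Hypothetical slow-hop baseline values (not measured data).
grf_slow, ema_slow, area = 500.0, 0.30, 5.0e-5
# Fast-hop values chosen to reproduce the quoted single-factor changes.
ema_fast = ema_slow / 1.18   # lower EMA alone raises stress by 18%
grf_fast = grf_slow * 1.12   # higher GRF alone raises stress by 12%

baseline = tendon_stress(grf_slow, ema_slow, area)
ema_only = relative_change(tendon_stress(grf_slow, ema_fast, area), baseline)
grf_only = relative_change(tendon_stress(grf_fast, ema_slow, area), baseline)
both = relative_change(tendon_stress(grf_fast, ema_fast, area), baseline)
print(round(ema_only, 4), round(grf_only, 4), round(both, 4))
```

      Because the two factors multiply in this simplified picture, their combined effect (about 32% here) is slightly larger than the sum of the individual contributions.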

      We have added a paragraph in the discussion acknowledging that the cost of generating force problem is not resolved by our work, concluding that “This mechanism may help explain why hopping macropods do not follow the energetic trends observed in other species (Dawson and Taylor 1973, Baudinette et al. 1992, Kram and Dawson 1998), but it does not fully resolve the cost of generating force conundrum” Line 274-276.

      I have a few issues with the other half of this study (i.e. animal size effects). I would enjoy reading a new paragraph by these authors in the Discussion that considers the evolutionary origins and implications of such small safety factors. Surely, it would need to be speculative, but that's OK.

      We appreciate this comment from the reviewer; however, we could not extend the study to discuss animal size effects because, as we now note in the results: “The range of body masses may not be sufficient to detect an effect of mass on ankle moment in addition to the effect of speed.” (Line 193)

      Reviewer #2 (Public Review):

      Summary

      This is a fascinating topic that has intrigued scientists for decades. I applaud the authors for trying to tackle this enigma. In this manuscript, the authors primarily measured hopping biomechanics data from kangaroos and performed inverse dynamics. 

      While these biomechanical analyses were thorough and impressively incorporated collected anatomical data and an OpenSim model, I'm afraid that they did not satisfactorily address how kangaroos, unlike other animals, can hop faster without consuming more metabolic energy. Notably, the authors did not collect metabolic data, nor did they model metabolic rates using their modelling framework. Instead, they performed a somewhat traditional inverse dynamics analysis from multiple animals hopping at a self-selected speed.

      In the current study, we aimed to provide a joint-level explanation for the increases of tendon stress that are likely linked to metabolic energy consumption.

      We have now included a limitations section in the manuscript (See response to Rev 1). We plan to expand upon muscle level energetics in the future with a more detailed musculoskeletal model.

      Within these analyses, the authors largely focused on ankle EMA, discussing its potential importance (because it affects tendon stress, which affects tendon strain energy, which affects muscle mechanics) on the metabolic cost of hopping. However, EMA was roughly estimated (CoP was fixed to the foot, not measured) and did not detectably associate with hopping speed (see results). Yet, the authors interpret their EMA findings as though it systematically related with speed to explain their theory on how metabolic cost is unique in kangaroos vs. other animals.

      As noted in our methods, EMA was not calculated from a fixed centre of pressure (CoP). We did fix the medial-lateral position, because both feet contacted the force plate together, but the anterior-posterior movement of the CoP was recorded by the force plate and thus allowed to move. We report the movement (or lack of movement) in our results. The anterior-posterior axis is the most relevant to lengthening or shortening the distance of the ‘out-lever’ R, and thereby EMA. It is necessary to assume a fixed medial-lateral position because a single force trace and CoP are recorded when two feet land on the force plate. The medial-lateral forces on each foot cancel out, so there is no overall medial-lateral movement if the forces are symmetrical (e.g. if the kangaroo is hopping in a straight path and one foot is not in front of the other). We only used symmetrical trials so that the anterior-posterior movement of the CoP would be reliable. We have now added additional details to the text to clarify this.

      Indeed, the relationship between R and speed (and therefore EMA and speed) was not significant. However, the significant change in ankle height with speed, combined with no systematic change in CoP at midstance, demonstrates that R would be greater at faster speeds. If we consider the nonsignificant relationship between R and speed to indicate that there is no change in R, then these two results conflict. We could not find a flaw in our methods, so instead concluded that the nonsignificant relationship between R and speed may be due to a small change in R being undetectable in our data. Taking both results into account, we believe it is more likely that there is a non-detectable change in R, rather than no change in R with speed, but we presented both results for transparency. We have added an additional section to the results to make this clearer (Line 177-185): “If we consider the nonsignificant relationship between R (and EMA) and speed to indicate that there is no change in R, then it conflicts with the ankle height and CoP result. Taking both into account, we think it is more likely that there is a small, but important, change in R, rather than no change in R with speed. It may be undetectable because we expect small effect sizes compared to the measurement range and measurement error (Suppl. Fig. 3h), or be obscured by a similar change in R with body mass. R is highly dependent on the length of the metatarsal segment, which is longer in larger kangaroos (1 kg BM corresponded to ~1% longer segment, P<0.001, R<sup>2</sup>=0.449). If R does indeed increase with speed, both R and r will tend to decrease EMA at faster speeds.”

      These speed vs. biomechanics relationships were limited by comparisons across different animals hopping at different speeds and could have been strengthened using a repeated-measures design.

      There is significant variation in speed within individuals, not just between individuals. The preferred speed of kangaroos is 2-4.5 m/s, but most individuals showed a wide speed range within this. Eight of our 16 kangaroos had a maximum speed that was 1-2 m/s faster than their slowest trial. Repeated measures of these eight individuals comprise 78 out of the 100 trials. It would be ideal to collect data across the full range of speeds for all individuals, but it is not feasible in this type of experimental setting. Interference with animals, such as chasing, is dangerous to kangaroos as they are prone to adverse reactions to stress. We have now added additional information about the chosen hopping speeds to the results and methods sections to clarify this: “The kangaroos elected to hop between 1.99 and 4.48 m s<sup>-1</sup>, with a range of speeds and number of trials for each individual (Suppl. Fig. 9).” (Line 381-382)

      There are also multiple inconsistencies between the authors' theory on how mechanics affect energetics and the cited literature, which leaves me somewhat confused and wanting more clarification and information on how mechanics and energetics relate.

      We thank the reviewer for this comment. Upon rereading, we now understand the reviewer's position, and have made substantial revisions to the introduction and discussion (see comments below).

      My apologies for the less-than-favorable review. I think that this is a neat biomechanics study, but I am unsure if it adds much to the literature on the topic of kangaroo hopping energetics in its current form.

      Again, we thank the reviewer for their time and appreciate their efforts to strengthen our manuscript.

      Reviewer #3 (Public Review):

      Summary:

      The goal of this study is to understand how, unlike other mammals, kangaroos are able to increase hopping speed without a concomitant increase in metabolic cost. They use a biomechanical analysis of kangaroo hopping data across a range of speeds to investigate how posture, effective mechanical advantage, and tendon stress vary with speed and mass. The main finding is that a change in posture leads to increasing effective mechanical advantage with speed, which ultimately increases tendon elastic energy storage and returns via greater tendon strain. Thus kangaroos may be able to conserve energy with increasing speed by flexing more, which increases tendon strain.

      Strengths:

      The approach and effort invested into collecting this valuable dataset of kangaroo locomotion is impressive. The dataset alone is a valuable contribution.

      Thank you!

      Weaknesses:

      Despite these strengths, I have concerns regarding the strength of the results and the overall clarity of the paper and methods used (which likely influences how convincingly the main results come across).

      (1) The paper seems to hinge on the finding that EMA decreases with increasing speed and that this contributes significantly to greater tendon strain estimated with increasing speed. It is very difficult to be convinced by this result for a number of reasons:

      It appears that kangaroos hopped at their preferred speed. Thus the variability observed is across individuals not within. Is this large enough of a range (either within or across subjects) to make conclusions about the effect of speed, without results being susceptible to differences between subjects? 

      Apologies, this was not clear in the manuscript. Kangaroos hopping at their preferred speed means we did not chase or startle them into high speeds, in order to comply with ethics and enclosure limitations. Thus we did not record a wide range of speeds within the bounds of what kangaroos are capable of in the wild (up to 12 m/s), but for the range we did measure (~2-4.5 m/s), there is a large amount of variation in hopping speed within each individual kangaroo. Out of 16 individuals, eight individuals had a difference of 1-2 m/s between their slowest and fastest trials, and these kangaroos accounted for 78 out of 100 trials. Of the remainder, six individuals had three or fewer trials each, and two individuals had highly repeatable speeds (3 out of 4, and 6 out of 7 trials were within 0.5 m/s). We have now removed the terminology “preferred speed”, e.g. line 115. We have added additional information about the chosen hopping speeds to the results and methods, including an appendix figure: “The kangaroos elected to hop between 1.99 and 4.48 m s<sup>-1</sup>, with a range of speeds and number of trials for each individual (Suppl. Fig. 9).” (Line 381-382)

      In the literature cited, what was the range of speeds measured, and was it within or between subjects?

      For other literature, to our knowledge the highest speed measured is ~9.5 m/s (see Suppl. Fig. 1b) and there were multiple measures for several individuals (see methods of Kram & Dawson 1998).

      Assuming that there is a compelling relationship between EMA and velocity, how reasonable is it to extrapolate to the conclusion that this increases tendon strain and ultimately saves metabolic cost?  They correlate EMA with tendon strain, but this would still not suggest a causal relationship (incidentally the p-value for the correlation is not reported). 

      The functions that underpin these results (e.g. moment = GRF*R) come from physical mechanics and geometry, rather than statistical correlations. Additionally, a p-value is not appropriate in the relationship between EMA and stress (rather than strain) because the relationship does not appear to be linear. We have made it clearer in the discussion that we are not proposing that the entire change in stress is caused by changes in EMA, but that the increase in GRF that naturally occurs with speed will also explain some of the increase in stress, along with other potential mechanisms. The discussion has been extensively revised to reflect this.

      Tendon strain could be increasing with ground reaction force, independent of EMA. Even if there is a correlation between strain and EMA, is it not a mathematical necessity in their model that, all else being equal, tendon stress will increase as EMA decreases? I may be missing something, but nonetheless, it would be helpful for the authors to clarify the strength of the evidence supporting their conclusions.

      Yes, GRF also contributes to the increase in tendon stress in the mechanism we propose (Suppl. Fig. 8), see the formulas in Fig 6, and we have made this clearer in the revised discussion (see above comment).  You are correct that mathematically stress is inversely proportional to EMA, which can be observed in Fig. 7a, and we did find that EMA decreases. 
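      The inverse proportionality mentioned above follows from a simple moment balance about the ankle. The sketch below is a quasi-static simplification that ignores the segment inertia and gravity terms a full inverse dynamics analysis would include. Here r is the muscle-tendon moment arm and R the GRF moment arm, as used in this response; the symbols F_t (tendon force), A (tendon cross-sectional area) and \sigma (tendon stress) are introduced here for illustration and may differ from the notation in the manuscript's Fig. 6.

```latex
\begin{align}
  \mathrm{EMA} &= \frac{r}{R}, \\
  F_{t}\, r &= \mathrm{GRF}\cdot R
    \;\;\Longrightarrow\;\;
    F_{t} = \frac{\mathrm{GRF}\cdot R}{r} = \frac{\mathrm{GRF}}{\mathrm{EMA}}, \\
  \sigma &= \frac{F_{t}}{A} = \frac{\mathrm{GRF}}{\mathrm{EMA}\cdot A}.
\end{align}
```

      Written this way, stress rises either when GRF increases or when EMA decreases, which is why both contributions are discussed together.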

      The statistical approach is not well-described. It is not clear what the form of the statistical model used was and whether the analysis treated each trial individually or grouped trials by the kangaroo. There is also no mention of how many trials per kangaroo, or the range of speeds (or masses) tested. 

      The methods include the statistical model with the variables that we used, as well as the kangaroo masses (13.7 to 26.6 kg, mean: 20.9 ± 3.4 kg). We did not have a sufficient within-individual sample size to use a linear mixed-effects model including subject as a random factor, thus all trials were treated individually. We have included this information in the results section.

      We have now moved the range of speeds from the supplementary material to the results and figure captions. We have added information on the number of trials per kangaroo to the methods, and added Suppl. Fig. 9 showing the distribution of speeds per kangaroo.

      We did not group the data e.g. by using an average speed per individual for all their trials, or by comparing fast to slow groups for statistical analysis (the latter was only for display purposes in our figures, which we have now made clearer in the methods statistics section). 

      Related to this, there is no mention of how different speeds were obtained. It seems that kangaroos hopped at a self-selected pace, thus it appears that not much variation was observed. I appreciate the difficulty of conducting these experiments in a controlled manner, but this doesn’t exempt the authors from providing the details of their approach.

      Apologies, this was not clear in the manuscript. Kangaroos hopping at their preferred speed means we did not chase or startle them into high speeds to comply with ethics and enclosure limitations. Thus we did not record a wide range of speeds within the bounds of what kangaroos are capable of in the wild (up to 12 m/s). We have now removed the terminology “preferred speed” e.g. line 115. We have added additional information about the chosen hopping speeds into the results and methods, including an appendix figure (see above comment). (Line 381-382)

      Some figures (Figure 2 for example) present means for one of three speeds, yet the speeds are not reported (except in the legend) nor how these bins were determined, nor how many trials or kangaroos fit in each bin. A similar comment applies to the mass categories. It would be more convincing if the authors plotted the main metrics vs. speed to illustrate the significant trends they are reporting.

      Thank you for this comment. The bins are used only for display purposes and not within the statistical analysis. We have clarified this in the revised manuscript: “The data was grouped into body mass (small 17.6±2.96 kg, medium 21.5±0.74 kg, large 24.0±1.46 kg) and speed (slow 2.52±0.25 m s<sup>-1</sup>, medium 3.11±0.16 m s<sup>-1</sup>, fast 3.79±0.27 m s<sup>-1</sup>) subsets for display purposes only”. (Line 495-497)

      (2) The significance of the effects of mass is not clear. The introduction and abstract suggest that the paper is focused on the effect of speed, yet the effects of mass are reported throughout as well, without a clear understanding of the significance. This weakness is further exaggerated by the fact that the details of the subject masses are not reported.

      Indeed, the primary aim of our study was to explore the influence of speed, given the uncoupling of energy from hopping speed in kangaroos. We included mass to ensure that the effects of speed were not driven by body mass (i.e. that larger kangaroos hopped faster). Subject masses were reported in the first paragraph of the methods, although some were estimated as outlined in the same paragraph.

      (3) The paper needs to be significantly re-written to better incorporate the methods into the results section. Since the results come before the methods, some of the methods must necessarily be described such that the study can be understood at some level without turning to the dedicated methods section. As written, it is very difficult to understand the basis of the approach, analysis, and metrics without turning to the methods.

      Placing the methods after the discussion is a requirement of the journal. We have incorporated some methods into the results where necessary, without being too repetitive or disruptive, e.g. in the Fig. 1 caption and by specifying that we are only analysing EMA for the ankle joint.

      Reviewing Editor (Recommendations For The Authors):

      Below is a list of specific recommendations that the authors could address to improve the eLife assessment:

      (1) Based on the data presented and the fact that metabolic energy was not measured, the authors should temper their conclusions and statements throughout the manuscript regarding the link between speed and metabolic energy savings. We recommend adding text to the discussion summarizing the strengths and limitations of the evidence provided and suggesting future steps to more conclusively answer this mystery.

      There is a significant body of work linking metabolic energy savings to measured increases in tendon stress in macropods. However, the purpose of this paper was to address the unanswered questions about why tendon stress increases. We found that stress did not only increase due to GRF increasing with speed as expected, but also due to novel postural changes which decreased EMA. In the revised manuscript, we have tempered our conclusions to make it clearer that it is not just EMA affecting stress, and added limitations throughout the manuscript (see response to Rev 1). 

      (2) To provide stronger evidence of a link between speed, mechanics, and metabolic savings, the authors can consider estimating metabolic energy expenditure from their OpenSim model. This is one suggestion, but the authors likely have other, possibly better ideas. Such a model should also be able to explain why the metabolic rate increases with speed during uphill hopping.

      Extending the model to provide direct metabolic cost estimates will be the goal of a future paper; however, the model does not have the detailed muscle characteristics needed to do this in the formulation presented here. It would be a very large undertaking which is beyond the scope of the current manuscript. As per the comment above, the results of this paper are not reliant on metabolic performance.

      (3) The authors attempt to relate the newly quantified hopping biomechanics to previously published metabolic data. However, all reviewers agree that the logic in many instances is not clear or contradictory. Could one potential explanation be that at slow speeds, forces and tendon strain are small, and thus muscle fascicle work is high? Then, with faster speeds, even though the cost of generating isometric force increases, this is offset by the reduction in the metabolic cost of muscular work. The paper could provide stronger support for their hypotheses with a much clearer explanation of how the kinematics relate to the mechanics and ultimately energy savings.

      In response to the reviewers' comments, we have substantially modified the discussion to provide a clearer rationale.

      (4) The methods and the effort expended to collect these data are impressive, but there are a number of underlying assumptions made that undermine the conclusions. This is due partly to the methods used, but also the paper's incomplete description of their methods. We provide a few examples below:

      It would be helpful if the authors could speak to the effect of the limited speeds tested and between-animal comparisons on the ability to draw strong conclusions from the present dataset.

      Throughout the discussion, the authors highlight the relationship between EMA and speed. However, this is misleading since there was no significant effect of speed on EMA. Speed only affected the muscle moment arm, r. At minimum, this should be clarified and the effect on EMA not be overstated. Additionally, the resulting implications on their ability to confidently say something about the effect of speed on muscle stress should be discussed. 

      We have now provided additional details (see responses above) addressing these concerns. For instance, we added a supplementary figure showing the speed distribution per individual. The primary reviewer concern (that each kangaroo travelled at a single speed) was due to a miscommunication around the terminology “preferred”, which has now been corrected.

      We now elaborate in the results why we are not very concerned that EMA is insignificant. The statistical insignificance of EMA is ultimately due to the insignificance of the direct measurement of R, however, we now better explain in the results why we believe that this statistical insignificance is due to error/noise of the measurement which is relatively large compared to the effect size. Indirect indications of how R may increase with speed (via ankle height from the ground) are statistically significant. Lines 177-185. 

      We consider this worth reporting because, for instance, an 18% change in EMA will be undetectable by measurement, but corresponds to an 18% change in tendon stress which is measurable and physiologically significant (safety factor would decrease from 2 to 1.67).  We presented both significant and insignificant results for transparency. 

      We have also discussed this within a revised limitations section of the manuscript (Line 311-328).
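      The safety-factor arithmetic quoted above can be checked with a short sketch. It assumes the usual definition of safety factor as failure stress divided by operating stress, and that stress is inversely proportional to EMA. The function names are hypothetical, and the result depends on whether the 18% is read as a rise in stress or as a fall in EMA.

```python
def safety_factor_after_stress_increase(sf0: float, stress_increase: float) -> float:
    """Safety factor after operating stress rises by a given fraction.

    ASSUMES safety factor = failure stress / operating stress.
    """
    return sf0 / (1.0 + stress_increase)


def safety_factor_after_ema_decrease(sf0: float, ema_decrease: float) -> float:
    """Safety factor after EMA falls by a given fraction.

    ASSUMES stress is inversely proportional to EMA, so the safety
    factor scales directly with EMA.
    """
    return sf0 * (1.0 - ema_decrease)


baseline = 2.0  # starting safety factor used in the response

# Reading 1: tendon stress itself increases by 18%.
sf_stress = safety_factor_after_stress_increase(baseline, 0.18)  # about 1.69

# Reading 2: EMA decreases by 18% (stress then rises by about 22%).
sf_ema = safety_factor_after_ema_decrease(baseline, 0.18)  # 1.64

print(f"stress +18%: {sf_stress:.2f}, EMA -18%: {sf_ema:.2f}")
```

      The two readings bracket the value of roughly 1.67 given in the response, so the conclusion that a change of this size is physiologically meaningful does not depend on which reading is used.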

      Reviewer #1 (Recommendations For The Authors):

      Title: I would cut the first half of the title. At least hedge it a bit. "Clues" instead of "Unlocking the secrets".

      We have revised the title to: “Postural adaptations may contribute to the unique locomotor energetics seen in hopping kangaroos”

      In my comments, ... typically indicates a stylistic change suggested to the text.

      Overall, the paper covers speed and size. Unfortunately, the authors were not 100% consistent in the order of presenting size then speed, or speed then size. Just choose one and stick with it.

      We have attempted to keep the order of presenting size and speed consistent; however, there are several cases where this would reduce the readability of the manuscript, so in some cases the order may vary.

      One must admit that there is a lot of vertical scatter in almost all of the plots. I understand that these animals were not in a lab on a treadmill at a controlled speed and the animals wear fur coats so marker placements vary/move etc. But the spread is quite striking, e.g. Figure 5a the span at one speed is almost 10x. Can the authors address this somewhere? Limitations section?

      The variation seen likely results from attempting to display data in a 2D format, when it is in fact the result of multiple variables, including speed, mass, stride frequency and subject-specific lengths. Slight variations in these would be expected to produce some noise around the mean, and we think it is important to consider this while showing the more dominant effects.

      In many locations in the manuscript, the term "work" is used, but rarely if ever specified that this is the work "per hop". The big question revolves around the rate of metabolic energy consumption (i.e. energy per time or average metabolic power), one must not forget that hop frequency changes somewhat across speed, so work per hop is not the final calculation.

      Thank you for this comment. We have now explicitly stated work per hop in figure captions and in the results (line 208). The change in stride frequency at this range of speeds is very small, particularly compared to the variance in stride frequency (Suppl. Fig. 1d), which is consistent with other researchers who found that stride frequency was constant or near constant in macropods at analogous speeds (e.g. Dawson and Taylor 1973, Baudinette et al. 1987). 

      Line 61 ....is likely related.

      Added “likely” (line 59)

      Line 86 I think the Allen reference is incomplete. Wasn't it in J Exp Biology?

      Thank you. Changed. 

      Line 122 ... at faster speeds and in larger individuals.

      Changed: “We hypothesised that (i) the hindlimb would be more crouched at faster speeds, primarily due to the distal hindlimb joints (ankle and metatarsophalangeal), independent of changes with body mass” (Line 121-122).

      Line 124 I found this confusing. Try to re-word so that you explain you mean more work done by the tendons and less by the ankle musculature.

      Amended: “changes in moment arms resulting from the change in posture would contribute to the increase in tendon stress with speed, and may thereby contribute to energetic savings by increasing the amount of positive and negative work done by the ankle without requiring additional muscle work” (Line 123)

      Line 129 hopefully "braking" not "breaking"!

      Thank you. Fixed. (Line 130)

      Line 129 specify fore-aft horizontal force.

      Added "fore-aft" to "negative fore-aft horizontal component" (Line 130-131)

      Line 130 add something like "of course" or "naturally" since if there is zero fore-aft force, the GRF vector of course must be vertical. 

      Added "naturally" (Line 132)

      Line 138 clarify that this section is all stance phase. I don't recall reading any swing phase data.

      Changed to: "Kangaroo hindlimb stance phase kinematics varied…" (Line 141)

      Line 143 and elsewhere. I found the use of dorsiflexion and plantarflexion confusing. In Figure 3, I see the ankle never flexing more than 90 degrees. So, the ankle joint is always in something of a flexed position, though of course it flexes and extends during contact. I urge the authors to simplify to flexion/extension and drop the plantar/dorsi.

      We have edited this section to describe both movements as greater extension (plantarflexion). (Line 147). We have further clarified this in the figure caption for figure 3.  

      Line 147 ...changes were…

      Fixed, line 150

      Line 155 I'm a bit confused here. Are the authors calculating some sort of overall EMA or are they saying all of the individual joint EMAs all decreased?

      Thank you, we clarified that it is at the ankle. Line 158

      Line 158 since kangaroos hop and are thus positioned high and low throughout the stance phase, try to avoid using "high" and "low" for describing variables, e.g. GRF or other variables. Just use "greater/greatest" etc.

      Thanks for this suggestion. We have changed "higher" into "greater" where appropriate throughout the manuscript e.g. line 161

      Lines 162 and 168 same comment here about "r" and "R". Do you mean ankle or all joints?

      Clarified that it is the gastrocnemius and plantaris r, and the R to the ankle. (Lines 164-165)

      Line 173 really, ankle height?

      Added: ankle height is "vertical distance from the ground". Line 177

      Line 177 is this just the ankle r?

      Added "of the ankle" line 158 and “Achilles” line 187 

      Line 183 same idea, which tendon/tendons are you talking about here?

      Added "Achilles" to be more clear (Line 187)

      Line 195 substitute "converted" for "transferred".

      Done (Line 210)

      Line 223 why so vague? i.e. why use "may"? Believe in your data. ...stress was also modulated by changes....

      Changed "may" to "is"

      Line 229 smaller ankle EMA (especially since you earlier talked about ankle "height").

      Changed “lower” to “smaller” Line 254

      Line 236 ...and return elastic energy…

      Added "elastic" line 262

      Line 244 IMPORTANT: Need to explain this better! I think you are saying that the net work at the ankle is staying the same across speed, BUT it is the tendons that are storing and returning that work, it's not that the muscles are doing a lot of negative/positive work.

      Changed: “The consistent net work observed among all speeds suggests the ankle extensor muscle-tendon units are performing similar amounts of ankle work independent of speed, which would predominantly be done by the tendon.” (Line 270-272)

      Line 258-261 I think here is where you are over-selling the data/story. Although you do say "a" mechanism (and not "the" mechanism), you still need to deal with the cost of generating more force and generating that force faster.

      We removed this sentence and replaced it with a discussion of the cost of generating force hypothesis, and alternative scenarios for how force and metabolics could be uncoupled.

      Line 278 "the" tendon? Which tendon?

      Added "Achilles"

      Line 289. I don't think one can project into the past.

      Changed “projected” to "estimated"

      Line 303 no problem, but I've never seen a paper in biology where the authors admit they don't know what species they were studying!

      Can’t be helped unfortunately. It is an old dataset and there aren’t photos of every kangaroo. Fortunately, from the grey and red kangaroos we can distinguish between, we know there are no discernible species effects on the data. 

      Lines 304-306 I'm not clear here. Did you use vertical impulse (and aerial time) to calculate body weight? Or did you somehow use the braking/propulsive impulse to calculate mass? I would have just put some apples on the force plate and waited for them to stop for a snack.

      Stationary weights were recorded for some kangaroos which did stand on the force plate long enough, but unfortunately not all of them were willing to do so. In those cases, yes, we used impulse from steady-speed trials to estimate mass. We cross-checked by estimating mass from segment lengths (as size and mass are correlated). This is outlined in the first paragraph of the methods.
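      The principle behind estimating mass from impulse can be sketched as follows. In steady-speed hopping the net vertical impulse over a full stride is zero, so the vertical GRF impulse during stance equals body weight multiplied by the stride period. This is a minimal sketch of that general principle and not the authors' actual procedure; the function names, the half-sine force trace, the sampling interval and all numbers are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def trapezoid(samples, dt):
    """Integrate evenly spaced samples with the trapezoidal rule."""
    return dt * (sum(samples) - 0.5 * (samples[0] + samples[-1]))


def mass_from_vertical_impulse(fz, dt, stride_period):
    """Estimate body mass (kg) from the stance-phase vertical GRF of one steady hop.

    ASSUMES steady-speed hopping, so the vertical impulse over stance
    equals body weight times the full stride period (stance + aerial).
    """
    impulse = trapezoid(fz, dt)  # N*s
    return impulse / (G * stride_period)


# Synthetic example: a half-sine vertical force for a hypothetical 20 kg animal.
true_mass = 20.0      # kg
stride_period = 0.5   # s, stance plus aerial phase
contact_time = 0.2    # s
dt = 0.001            # s, i.e. 1000 Hz sampling
n = int(round(contact_time / dt))

# Peak force chosen so the half-sine impulse equals m * g * stride_period.
peak = true_mass * G * stride_period * math.pi / (2.0 * contact_time)
fz = [peak * math.sin(math.pi * i / n) for i in range(n + 1)]

estimated_mass = mass_from_vertical_impulse(fz, dt, stride_period)
print(f"estimated mass: {estimated_mass:.2f} kg")
```

      Because the force plate records both feet together, the total vertical force trace is what enters the integral. Any acceleration or deceleration during the trial would bias the estimate, which is presumably why only steady-speed trials were used.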

      Lines 367 & 401 When you use the word "scaled" do you mean you assumed geometric similarity?

      No, rather than geometric scaling, we allowed scaling to individual dimensions by using the markers at midstance for measurements. We have amended the paragraph to clarify that the shape of the kangaroo changes and that mass distribution was preserved during the shape change (line 441-446) 

      Lines 381-82 specify "joint work"

      Added "joint work"  (Line 457)

      Figure 1 is gorgeous. Why not add the CF equation to the left panel of the caption?

      We decided to keep the information in the figure caption. “Total leg length was calculated as the sum of the segment lengths (solid black lines) in the hindlimb and compared to the pelvis-to-toe distance (dashed line) to calculate the crouch factor”
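      The caption's description of the crouch factor can be sketched in a few lines. The caption only says the two lengths are "compared", so the ratio direction used here (pelvis-to-toe distance divided by total leg length, giving 1 for a fully straight limb and smaller values for a more crouched one) is an assumption, as are the function name and the marker coordinates.

```python
import math


def crouch_factor(markers):
    """Ratio of straight-line pelvis-to-toe distance to summed segment length.

    `markers` is an ordered list of 2D points from the proximal end of the
    limb to the toe. ASSUMED convention: 1.0 means a fully extended limb,
    and smaller values mean a more crouched posture.
    """
    total_leg_length = sum(math.dist(a, b) for a, b in zip(markers, markers[1:]))
    pelvis_to_toe = math.dist(markers[0], markers[-1])
    return pelvis_to_toe / total_leg_length


# Hypothetical marker positions (metres) for illustration only.
straight_limb = [(0.0, 0.0), (0.0, -0.3), (0.0, -0.8), (0.0, -1.0)]
bent_limb = [(0.0, 0.0), (3.0, 4.0), (6.0, 0.0)]

print(crouch_factor(straight_limb))
print(crouch_factor(bent_limb))
```

      With this convention, a decrease in the ratio at faster speeds would correspond to the more crouched posture described in the hypotheses.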

      Figure 2 specify Horizontal fore-aft.

      Done

      Figure 3g I'd prefer the same Min. Max Flexion vertical axis labels as you use for hip & knee.

      While we appreciate the reviewer trying to increase the clarity of this figure, we have left it as plantar/dorsi flexion since these are recognised biomechanical terms. To avoid confusion, we have further defined these in the figure caption “For (f-g), increased plantarflexion represents a decrease in joint flexion, while increased dorsiflexion represents increased flexion of the joint.”

      Figure 4. I like it and I think that you scaled all panels the same, i.e. 400 W is represented by the same vertical distance in all panels. But if that's true, please state so in the Caption. It's remarkable how little work occurs at the hip and knee despite the relatively huge muscles there.

      It is true that the y axes are all at the same scale. We have added this to the caption.

      Figure 5 Caption should specify "work per hop".

      Added

      Figure 7 is another beauty.

      Thank you!

      Supplementary Figure 3 is this all ANKLE? Please specify.

      Clarified that it is the gastrocnemius and plantaris r, and the R to the ankle.

      Reviewer #2 (Recommendations For The Authors):

      To 'unlock the secrets of kangaroo locomotor energetics' I expected the authors to measure the secretive outcome variable, metabolic rate using laboratory measures. Rather, the authors relied on reviewing historic metabolic data and collecting biomechanics data across different animals, which limits the conclusions of this manuscript.

      We have revised the title to make it clearer that we are investigating a subset of the energetics problem, specifically posture. “Postural adaptations may contribute to the unique locomotor energetics seen in hopping kangaroos.” We have also substantially modified the discussion to temper the conclusions from the paper.

      After reading the hypothesis, why do the authors hypothesize about joint flexion and not EMA? Because the following hypothesis discusses the implications of moment arms on tendon stress, EMA predictions are more relevant (and much more discussed throughout the manuscript).

      Ankle and MTP angles are the primary drivers of changes in r, R & thus, EMA. We used a two-part hypothesis to capture this. We have rephrased the hypotheses: “We hypothesised that (i) the hindlimb would be more crouched at faster speeds, primarily due to the distal hindlimb joints (ankle and metatarsophalangeal), independent of changes with body mass, and (ii) changes in moment arms resulting from the change in posture would contribute to the increase in tendon stress with speed, and may thereby contribute to energetic savings by increasing the amount of positive and negative work done by the ankle without requiring additional muscle work.”

      If there were no detectable effects of speed on EMA, are kangaroos mechanically like other animals (Biewener Science 89 & JAP 04) who don't vary EMA across speeds? Despite no detectible effects, the authors state [lines 228-229] "we found larger and faster kangaroos were more crouched, leading to lower ankle EMA". Can the authors explain this inconsistency? Lines 236 "Kangaroos appear to use changes in posture and EMA". I interpret the paper as EMA does not change across speed.

      Apologies, we did not sufficiently explain this originally. We now explain in the results our reasoning behind our belief that EMA and R may change with speed. “If we consider the nonsignificant relationship between R (and EMA) and speed to indicate that there is no change in R, then it conflicts with the ankle height and CoP result. Taking both into account, we think it is more likely that there is a small, but important, change in R, rather than no change in R with speed. It may be undetectable because we expect small effect sizes compared to the measurement range and measurement error (Suppl. Fig. 3h), or be obscured by a similar change in R with body mass. R is highly dependent on the length of the metatarsal segment, which is longer in larger kangaroos (1 kg BM corresponded to ~1% longer segment, P<0.001, R<sup>2</sup>=0.449). If R does indeed increase with speed, both R and r will tend to decrease EMA at faster speeds.” (Line 177-185)

      Lines 335-339: "We assumed the force was applied along phalanx IV and that there was no medial or lateral movement of the centre of pressure (CoP)". I'm confused, did the authors not measure CoP location with respect to the kangaroo limb? If not, this simple estimation undermines primary results (EMA analyses).

      We have changed "The anterior or posterior movement of the CoP was recorded by the force plate" to read: "The fore-aft movement of the CoP was recorded by the force plate within the motion capture coordinate system" (Line 406-407) and added more justification for fixing the CoP movement in the other axis: “It was necessary to assume the CoP was fixed in the medial-lateral axis because when two feet land on the force plate, the lateral forces on each foot are not recorded, and indeed cancel if the forces are symmetrical (i.e. if the kangaroo is hopping in a straight path and one foot is not in front of the other). We only used symmetrical trials to ensure reliable measures of the anterior-posterior movement of the CoP.” (Line 408-413)

      The introduction makes many assertions about the generalities of locomotion and the relationship between mechanics and energetics. I'm afraid that the authors are selectively choosing references without thoroughly evaluating alternative theories. For example, Taylor, Kram, & others have multiple papers suggesting that decreasing EMA and increasing muscle force (and active muscle volume) increase metabolic costs during terrestrial locomotion. Rather, the authors suggest that decreasing EMA and increasingly high muscle force at faster speeds don't affect energetics unless muscle work increases substantially (paragraph 2)? If I am following correctly, does this theory conflict with active muscle volume ideas that are peppered throughout this manuscript?

      Yes, as you point out, the same mechanism does lead to different results in kangaroos vs humans, for instance, but this is not a contradiction. In all species, decreasing EMA will result in an increase in muscle force due to less efficient leverage (i.e. lower EMA) of the muscles, and the muscle-tendon unit will be required to produce more force to balance the joint moment. As a consequence, human muscles activate a greater volume in order for the muscle-tendon unit to increase muscle work and produce enough force. We are proposing that in kangaroos, the increase in work is done by the Achilles tendon rather than the muscles. Previous research suggests that macropod ankle muscles contract isometrically or that the fibres do not shorten more at faster speeds i.e. muscle work does not increase with speed. Instead, the additional force seems to come from the tendon storing and subsequently returning more strain energy (indicated by higher stress). We found that the increase in tendon stress comes from higher ground force at faster speeds, and from the kangaroo adopting a more crouched posture, which increases the tendon’s stress compared to an upright posture for a given speed (think of this as increasing the tendon’s stress capacity). We have substantially revised the discussion to highlight this.

      Similarly, does increased gross or net tendon mechanical energy storage & return improve hopping energetics? Would more tendon stress and strain energy storage with a given hysteresis value also dissipate more mechanical energy, requiring leg muscles to produce more net work? Does net or gross muscle work drive metabolic energy consumption?

      Based on the cost of generating force hypothesis, we think that gross muscle work would be linked to driving metabolic energy consumption. Our idea here is that the total body work is a product of the work done by the tendon and the muscle combined. If the tendon has the potential to do more work, then the total work can increase without muscle work needing to increase.

      The results interpret speed effects on biomechanics, but each kangaroo was only collected at 1 speed. Are inter-animal comparisons enough to satisfy this investigation?

      We have added a figure (Suppl Fig 9) to demonstrate the distribution of speed and number of trials per kangaroo. We have also removed "preferred" from the manuscript as this seems to cause confusion. Most kangaroos travelled at a range of “casual” speeds.

      Abstract: Can the authors more fully connect the concept of tendon stress and low metabolic rates during hopping across speeds? Surely, tendon mechanics don't directly drive the metabolic cost of hopping, but they affect muscle mechanics to affect energetics.

      Amended to: "This phenomenon may be related to greater elastic energy savings due to increasing tendon stress; however, the mechanisms which enable the rise in stress, without additional muscle work remain poorly understood." (Lines 25-27).

      The topic sentence in lines 61-63 may be misleading. The ensuing paragraph does not substantiate the topic sentence stating that ankle MTUs decouple speeds and energetics.

      We added "likely" to soften the statement. (Line 59)

      Lines 84-86: In humans, does more limb flexion and worse EMA necessitate greater active muscle volume? What about muscle contractile dynamics - See recent papers by Sawicki & colleagues that include Hill-type muscle mechanics in active muscle volume estimates.

      Added: “Smaller EMA requires greater muscle force to produce a given force on the ground, thereby demanding a greater volume of active muscle, and presumably greater metabolic rates than larger EMA for the same physiology”. (Line 80-82)

      Lines 106: can you give the context of what normal tendon safety factors are?

      Good idea. Added: "far lower than the typical safety factor of four to eight for mammalian tendons (Ker et al. 1988)." Line 106-107

      I thought EMA was relatively stable across speeds as per Biewener [Science & JAP '04]. However the authors gave an example of an elephant to suggest that it is typically inversely related to speed. Can the authors please explain the disconnect and the most appropriate explanation in this paragraph?

      Knee EMA in particular changed with speed in Biewener 2004. What is “typical” probably depends on the group of animals studied; e.g., cursorial quadrupedal mammals generally seem to maintain constant EMA, but other groups do not.

      These cases are presented to show a range of consequences for changing EMA (usually with mass, but sometimes with speed). We have made several adjustments to the paragraph to make this clearer. Lines 85-93.

      The results depend on the modeled internal moment arm (r). How confident are the authors in their little r prediction? Considering complications of joint mechanics in vivo including muscle bulging. Holzer et al. '20 Sci Rep demonstrated that different models of the human Achilles tendon moment arm predict vastly different relationships between the moment arm and joint angle.

      Our values for r and EMA closely align with previous papers which measured/calculated these values in kangaroos, such as Kram 1998, and thus we are confident in our interpretation.

      This is a misleading results sentence: Small decreases in EMA correspond to a nontrivial increase in tendon stress, for instance, reducing EMA from 0.242 (mean minimum EMA of the slow group) to 0.206 (mean minimum EMA of the fast group) was associated with an ~18% increase in tendon stress. The authors could alternatively say that a ~15% decrease in EMA was associated with an ~18% increase in tendon stress, which seems pretty comparable.

      Thank you for pointing this out, it is important that it is made clearer. Although the change in relative magnitude is approximately the same (as it should be), this does not detract from the importance. The "small decrease in EMA" is referring to the absolute values, particularly in respect to the measurement error/noise. The difference is small enough to have been undetectable with other methods used in previous studies. We have amended the sentence to clarify this.

      It now reads: “Subtle decreases in EMA which may have been undetected in previous studies correspond to discernible increases in tendon stress. For instance, reducing EMA from 0.242 (mean minimum EMA of the slow group) to 0.206 (mean minimum EMA of the fast group) was associated with an increase in tendon stress from ~50 MPa to ~60 MPa, decreasing safety factor from 2 to 1.67 (where 1 indicates failure), which is both measurable and physiologically significant.” (Line 195-200)
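
      The arithmetic behind this exchange can be checked with a short sketch. It uses only the figures quoted above (EMA 0.242 and 0.206, stress of roughly 50 and 60 MPa, safety factors 2 and 1.67). Two points are assumptions for illustration: that tendon stress scales inversely with EMA for a fixed ground reaction force and tendon cross-section, and that the implied tendon strength is the value obtained by multiplying the quoted safety factor by the quoted stress. Variable names are hypothetical.

```python
# Values quoted in the revised sentence
ema_slow = 0.242          # mean minimum EMA, slow group
ema_fast = 0.206          # mean minimum EMA, fast group
stress_slow_mpa = 50.0    # approximate tendon stress, slow group
stress_fast_mpa = 60.0    # approximate tendon stress, fast group
safety_factor_slow = 2.0  # quoted safety factor, slow group

# Relative decrease in EMA (about 0.149, the reviewer's "~15%")
ema_decrease = (ema_slow - ema_fast) / ema_slow

# Assumption: stress is proportional to 1/EMA at a fixed ground force,
# giving a predicted ratio of about 1.175 (the reviewer's "~18%")
predicted_stress_ratio = ema_slow / ema_fast

# Ratio of the rounded stresses quoted in the response (1.2)
reported_stress_ratio = stress_fast_mpa / stress_slow_mpa

# Tendon strength implied by the quoted safety factor (derived, not quoted)
implied_strength_mpa = safety_factor_slow * stress_slow_mpa

# Safety factor at the higher stress (about 1.67, matching the response)
safety_factor_fast = implied_strength_mpa / stress_fast_mpa
```

      Under these assumptions the relative changes in EMA and stress are of similar size, as the reviewer notes, while the safety factor falls from 2 towards the failure threshold of 1.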

      Lines 243-245: "The consistent net work observed among all speeds suggests the ankle extensors are performing similar amounts of ankle work independent of speed." If this is true, and presumably there is greater limb work performed on the center of mass at faster speeds (Donelan, Kram, Kuo), do more proximal leg joints increase work and energy consumption at faster speeds?

      The skin over the proximal leg joints (knee and hip) moves too much to get reliable measures of EMA from the ratio of moment arms. This will be pursued in future work when all muscles are incorporated in the model so knee and hip EMA can be determined from muscle force.

      We have added limitations and considerations paragraph to the manuscript: “Finally, we did not determine whether the EMA of proximal hindlimb joints (which are more difficult to track via surface motion capture markers) remained constant with speed. Although the hip and knee contribute substantially less work than the ankle joint (Fig. 4), the majority of kangaroo skeletal muscle is located around these proximal joints. A change in EMA at the hip or knee could influence a larger muscle mass than at the ankle, potentially counteracting or enhancing energy savings in the ankle extensor muscle-tendon units. Further research is needed to understand how posture and muscles throughout the whole body contribute to kangaroo energetics.” (Line 321-328)

      Lines 245-246: "Previous studies using sonomicrometry have shown that the muscles of tammar wallabies do not shorten considerably during hops, but rather act near-isometrically as a strut" Which muscles? All muscles? Extensors at a single joint?

      Added "gastrocnemius and plantaris" Line 164-165

      Lines 249-254: "The cost of generating force hypothesis suggests that faster movement speeds require greater rates of muscle force development, and in turn greater cross-bridge cycling rates, driving up metabolic costs (Taylor et al. 1980, Kram and Taylor 1990). The ability for the ankle extensor muscle fibres to remain isometric and produce similar amounts of work at all speeds may help explain why hopping macropods do not follow the energetic trends observed in quadrupedal species." These sentences confuse me. Kram & Taylor's cost of force-generating hypothesis assumes that producing the same average force over shorter contact times increases metabolic rate. How does 'similar muscle work' across all speeds explain the ability of macropods to use unique energetic trends in the cost of force-generating hypothesis context?

      Thank you for highlighting this confusion. We have substantially revised the discussion to clarify where the mechanisms presented deviate from the cost of generating force hypothesis. Lines 270-309

      Reviewer #3 (Recommendations For The Authors):

      In addition to the points described in the public review, I have additional, related, specific comments:

      (1) Results: Please refer to the hypotheses in the results, and relate the findings back to the hypotheses.

      We now relate the findings back to the hypotheses 

      Line 142 “In partial support of hypothesis (i), greater masses and faster speeds were associated with more crouched hindlimb postures (Fig. 3a,c).”.

      Lines 205-206: “The increase in tendon stress with speed, facilitated in part by the change in moment arms by the shift in posture, may explain changes in ankle work (c.f. Hypothesis (ii)).” 

      (2) Results: please provide the main statistical results either in-line or in a table in the main text.

      We (the co-authors) have discussed this at length, and have agreed that the manuscript is far more readable in the format whereby most statistics lie within the supplementary tables, otherwise a reader is met with a wall of statistics. We only include values in the main text when the magnitude is relevant to the arguments presented in the results and discussion.

      (3) Line 140: Describe how 'crouched' was defined.

      We have now added a brief definition of ‘Crouch factor’ after the figure caption. (Line 143) (Fig. 3a,c; where crouch factor is the ratio of total limb length to pelvis to toe distance).
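
      The definition can be made concrete with a minimal sketch, assuming planar marker coordinates ordered from pelvis to toe. The function name and the coordinates are hypothetical and are not the authors' processing code.

```python
import math


def crouch_factor(markers):
    """Ratio of summed segment lengths to the straight pelvis-to-toe distance.

    markers: ordered list of (x, y) positions from pelvis to toe.
    """
    # Total limb length: sum of distances between consecutive markers
    total_length = sum(
        math.dist(a, b) for a, b in zip(markers[:-1], markers[1:])
    )
    # Straight-line distance between the first (pelvis) and last (toe) marker
    pelvis_to_toe = math.dist(markers[0], markers[-1])
    return total_length / pelvis_to_toe


# Hypothetical limbs made of two 0.5 m segments
straight_limb = [(0.0, 1.0), (0.0, 0.5), (0.0, 0.0)]
flexed_limb = [(0.0, 0.8), (0.3, 0.4), (0.0, 0.0)]

cf_straight = crouch_factor(straight_limb)
cf_flexed = crouch_factor(flexed_limb)
```

      Under this reading of the definition, a fully extended limb gives a value of 1 and larger values indicate a more crouched posture.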

      (4) Line 162: This seems to be a main finding and should be a figure in the main text not supplemental. Additionally, Supplementary Figures 3a and b do not show this finding convincingly There should be a figure plotting r vs speed and r vs mass.

      The combination of r and R is represented in the EMA plot in the main text. The r and R plots are relegated to the supplementary because the main text is already very crowded. Thank you for the suggestion for the figure plotting r and R versus speed; this is now included as Suppl. Fig. 3h.

      (5) Line 166: Supplementary Figure 3g does not show the range of dorsiflexion angles as a function of speed. It shows r vs dorsiflexion angle. Please correct.

      Thanks for noticing this, it was supposed to reference Fig 3g rather than Suppl Fig 3g in the sentence regarding speed. We have fixed this, Line 170. 

      We have added a reference to Suppl Fig 3 on Line 169 as this shows where the peak in r with ankle angle occurs (114.4 degrees).

      (6) Line 184: Where are the statistical results for this statement?

      The relationship between stress and EMA does not appear to be linear, thus we only present R<sup>2</sup> for the power relationship rather than a p-value.

      (7) Line 192: The authors should explain how joint work and power relate/support the overall hypotheses. This section also refers to Figures 4 and 5 even though Figures 6 and 7 have already been described. Please reorganize.

      We have added a sentence at the end of the work and power section to mention hypothesis (ii) and lead into the discussion where it is elaborated upon. 

      “The increase in positive and negative ankle work may be due to the increase in tendon stress rather than additional muscle work.” Line 219-220 We have rearranged the figure order.

      (8) The statistics are not reported in the main text, but in the supplementary tables. If a result is reported in the main text, please report either in-line or with a table in the main text.

      We leave most statistics in the supplementary tables to preserve the readability of the manuscript. We only include values in the main text when the magnitude is relevant to the arguments raised in the results and discussion.

    1. Reviewer #3 (Public review):

      Summary:

      This paper demonstrates that membrane depolarization induces a small increase in cell entry into mitosis. Based on previous work from another lab, the authors propose that ERK activation might be involved. They show convincingly using a combination of assays that ERK is activated by membrane depolarization. They show this is Ca2+ independent and is a result of activation of the whole K-Ras/ERK cascade which results from changed dynamics of phosphatidylserine in the plasma membrane that activates K-Ras. Although the activation of the Ras/ERK pathway by membrane depolarization is not new, linking it to an increase in cell proliferation is novel.

      Strengths:

      A major strength of the study is the use of different techniques - live imaging with ERK reporters, as well as Western blotting to demonstrate ERK activation as well as different methods for inducing membrane depolarization. They also use a number of different cell lines. Via Western blotting the authors are also able to show that the whole MAPK cascade is activated.

      Weaknesses:

      In the previous round of revisions, the authors addressed the issues with Figure 1, and the data presented are much clearer. The authors did also attempt to pinpoint when in the cell cycle ERK is having its activity, but unfortunately, this was not conclusive.

    2. Author response:

      The following is the authors’ response to the previous reviews

      Reviewer #1 (Public review):

      Summary:

      This is a contribution to the field of developmental bioelectricity. How do changes of resting potential at the cell membrane affect downstream processes? Zhou et al. reported in 2015 that phosphatidylserine and K-Ras cluster upon plasma membrane depolarization and that voltage-dependent ERK activation occurs when constitutively active K-RasG12V mutants are overexpressed. In this paper, the authors advance the knowledge of this phenomenon by showing that membrane depolarization up-regulates mitosis and that this process is dependent on voltage-dependent activation of ERK. ERK activity's voltage-dependence is derived from changes in the dynamics of phosphatidylserine in the plasma membrane and not by extracellular calcium dynamics. This paper reports an interesting and important finding. It is somewhat derivative of Zhou et al., 2015. (https://www.science.org/doi/full/10.1126/science.aaa5619). The main novelty seems to be that they find quantitatively different conclusions upon conducting similar experiments, albeit with a different cell line (U2OS) than those used by Zhou et al. Sasaki et al. do show that increased K+ levels increase proliferation, which Zhou et al. did not look at. The data presented in this paper are a useful contribution to a field often lacking such data.

      Strengths:

      Bioelectricity is an important field for areas of cell, developmental, and evolutionary biology, as well as for biomedicine. Confirmation of ERK as a transduction mechanism and a characterization of the molecular details involved in the control of cell proliferation are interesting and impactful.

      Weaknesses:

      The authors lean heavily on the assumption that the Nernst equation is an accurate predictor of membrane potential based on K+ level. This is a large oversimplification that undermines the authors' conclusions, most glaringly in Figure 2C. The authors' conclusions should be weakened to reflect that the activity of voltage gated ion channels and homeostatic compensation are unaccounted for.

      We appreciate the reviewer’s thoughtful comment regarding our reliance on the Nernst equation to estimate membrane potential. We agree that the Nernst equation is a simplification and does not account for the activity of other ions, voltage-gated channels, or homeostatic compensation mechanisms. To address this concern, we conducted electrophysiological experiments in which the membrane potential was directly controlled using the perforated patch-clamp technique (Fig. 3). Under these conditions, we also monitored the membrane potential and confirmed that there was negligible drift within 20 minutes of perfusion with 145 mM K<sup>⁺</sup> (only a 1–5 mV change). These results suggest that the influence of voltage-gated channels and homeostatic compensation is minimal in our experimental setup. We revised the manuscript to clarify these limitations and to present our conclusions more cautiously in light of this point.

      “A potential limitation of extracellular K<sup>⁺</sup>-based approaches is their reliance on the Nernst equation to estimate membrane potential, which oversimplifies the actual situation by neglecting voltage-gated ion channel activity and compensatory mechanisms. To directly address this concern, we measured membrane potential using the perforated patch-clamp technique and confirmed that the potential was stable during perfusion with 145 mM K<sup>⁺</sup> (only a 1–5 mV drift within 20 min). Moreover, we used a voltage clamp to precisely control the membrane potential and demonstrated that ERK activity was directly regulated by the voltage itself, excluding the influence of other secondary factors. An additional strength of electrophysiology is its ability to examine the effects of repolarization, which is difficult to assess with conventional perfusion-based methods owing to slow solution exchange.”
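
      For reference, the Nernst estimate under discussion can be sketched as follows. The intracellular K<sup>⁺</sup> concentration (140 mM) and temperature (37 °C) are assumed values chosen for illustration and are not taken from the manuscript; the function name is hypothetical. The calculation treats the membrane as permeable to K<sup>⁺</sup> only, which is exactly the simplification the reviewer questions.

```python
import math

R_GAS = 8.314462618      # gas constant, J/(mol*K)
FARADAY = 96485.33212    # Faraday constant, C/mol


def nernst_potential_mv(conc_out_mm, conc_in_mm, temp_c=37.0, valence=1):
    """Equilibrium potential (mV) for a single ion species."""
    temp_k = temp_c + 273.15
    volts = (R_GAS * temp_k) / (valence * FARADAY) * math.log(
        conc_out_mm / conc_in_mm
    )
    return volts * 1000.0


K_IN_MM = 140.0  # assumed intracellular K+ (illustrative, not from the paper)

# Roughly -89 mV for 5 mM extracellular K+
e_k_low = nernst_potential_mv(5.0, K_IN_MM)

# Roughly +1 mV for 145 mM extracellular K+
e_k_high = nernst_potential_mv(145.0, K_IN_MM)
```

      A real resting potential also depends on other ion permeabilities and on channel activity, so direct measurement by patch clamp, as the authors describe, remains the more reliable check.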

      Grammatical errors are made throughout the paper (e.g., line 99: "This kinetics" should be "these kinetics").

      We thank the reviewer for pointing out the grammatical errors. We carefully revised the entire manuscript.

      Line 71: Zhou et al. use BHK, N2A, PSA-3 cells, this paper uses U2OS (osteosarcoma) cells. Could that explain the differences in bioelectric properties that they describe? In general, there should be more discussion of the choice of cell line. Why were U2OS cells chosen? What are the implications of the fact that these are cancer cells, and bone cancer cells in particular? Does this paper provide specific insights for bone cancers? And crucially, how applicable are findings from these cells to other contexts?

      We thank the reviewer for this valuable comment regarding the choice of cell line. We selected U2OS cells primarily because they are well suited for live-cell FRET imaging. We did not use BHK, N2A, or PSA-3 cells, and therefore it is difficult for us to provide a clear comparison with the specific bioelectric properties reported in Zhou et al. Nevertheless, we agree that cancer cell lines, including U2OS, may exhibit bioelectric properties that differ from those of non-cancerous cells. While this could be a potential limitation, we are inclined to consider voltage-dependent ERK activation to be a fundamental and generalizable phenomenon, not restricted to osteosarcoma cells. The key components of this pathway—phosphatidylserine, Ras, MAPK (including ERK)—are expressed in essentially all mammalian cells. In support of our view, we observed voltage-dependent ERK activation not only in U2OS cells but also in HeLa, HEK293, and A431 cells. These results strongly suggest that the mechanism we describe is not cell-type specific but rather a universal feature of mammalian cells. In the revised Discussion, we expanded our rationale to choose U2OS cells, while addressing the potential implications of using a cancer-derived cell line. 

      “In this study, we primarily used U2OS cells because their flat morphology makes them suitable for live-cell FRET imaging. Although cancer cell lines, including U2OS, may display bioelectric properties that differ from those of noncancerous cells, our findings raise the possibility that voltage-dependent ERK activation is a fundamental and broadly applicable phenomenon rather than a feature specific to osteosarcoma cells. This conclusion is supported by the fact that essential components of this pathway, namely phosphatidylserine, Ras, and MAPK (including ERK), are ubiquitously expressed in mammalian cells. Consistent with this finding, we observed voltage-dependent ERK activation across multiple cell lines: U2OS, HeLa, HEK293, and A431 cells (Fig.S2). These observations indicate that the mechanism we describe is not cell-type-restricted, but rather a universal property of mammalian cells.”

      Line 115: The authors use EGF to calibrate 'maximal' ERK stimulation. Is this level near saturation? Either way is fine, but it would be useful to clarify.

      We thank the reviewer for raising this important point. The YFP/CFP ratio obtained after EGF stimulation is generally considered to represent saturation levels detectable by EKAREV imaging. However, we acknowledge that it remains uncertain whether 10 ng/mL EGF induces the absolute maximal ERK activity in all contexts. To clarify this point, we revised the manuscript (result) text as follows:

      “To normalize variation among cells, cells were stimulated with EGF (10 ng/mL) at the end of the experiment, which presumably yielded a near-saturated YFP/CFP value (ERK activity). This value was used to determine the maximum ERK activity in each cell”
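
      One way such a per-cell normalization could be computed is sketched below. The exact formula is not given in the response, so the baseline subtraction, the function name, and the example values are all assumptions for illustration only.

```python
def normalize_erk(ratios, baseline, egf_value):
    """Scale YFP/CFP ratios so baseline maps to 0 and the EGF response to 1.

    Assumed scheme: (ratio - baseline) / (egf_value - baseline).
    """
    span = egf_value - baseline
    if span <= 0:
        raise ValueError("EGF response must exceed the baseline ratio")
    return [(r - baseline) / span for r in ratios]


# Hypothetical YFP/CFP trace for one cell, ending with EGF stimulation
trace = [1.00, 1.10, 1.25, 1.50]
normalized = normalize_erk(trace, baseline=1.00, egf_value=1.50)
```

      If the EGF response is only near saturation rather than truly maximal, values normalized this way would slightly overstate the fraction of maximal ERK activity.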

      Line 121: Starting line 121 the authors say "Of note, U2OS cells expressed wild-type K-Ras but not an active mutant of K-Ras, which means voltage dependent ERK activation occurs not only in tumor cells but also in normal cells". Given that U2OS cells are bone sarcoma cells, is it appropriate to refer to these as 'normal' cells in contrast to 'tumor' cells?

      We thank the reviewer for pointing this out. We agree that it is not appropriate to contrast U2OS cells with “normal” cells, since they are sarcoma-derived. To address this point, we revised the sentence to weaken the claim and avoid the misleading terminology.

      “Importantly, as U2OS cells express wild-type K-Ras rather than an oncogenic mutant (16), our results raise the possibility that voltage-dependent ERK activation may also occur in non-transformed cells.”

      Line 101: These normalizations seem reasonable, the conclusions sufficiently supported and the requisite assumptions clearly presented. Because the dish-to-dish and cell-to-cell variation may reflect biologically relevant phenomena it would be ideal if non-normalized data could be added in supplemental data where feasible.

      We thank the reviewer for this helpful suggestion. As recommended, we added representative non-normalized data in the Supplemental Figure S1, which illustrates the non-normalized variation across cells and dishes.

      Figure 2C is listed as Figure 2D in the text

      There is no Figure 2F (Referenced in line 148)

      We thank the reviewer for pointing out these errors. The incorrect figure citations were corrected.

      Reviewer #2 (Public review):

      Sasaki et al. use a combination of live-cell biosensors and patch-clamp electrophysiology to investigate the effect of membrane potential on the ERK MAPK signaling pathway, and probe associated effects on proliferation. This is an effect that has long been proposed, but a convincing demonstration has remained elusive, because it is difficult to perturb membrane potential without disturbing other aspects of cell physiology in complex ways. The time-resolved measurements here are a nice contribution to this question, and the perforated patch clamp experiments with an ERK biosensor are fantastic - they come closer to addressing the above difficulty of perturbing voltage than any prior work. It would have been difficult to obtain these observations with any other combination of tools.

      However, there are still some concerns as detailed in specific comments below:

      Specific comments:

      (1) All the observations of ERK activation, by both high extracellular K+ and voltage clamp, could be explained by cell volume increase (more discussion in subsequent comments). There is a substantial literature on ERK activation by hypotonic cell swelling (e.g. https://doi.org/10.1042/bj3090013, https://doi.org/10.1002/j.1460-2075.1996.tb00938.x, among others). Here are some possible observations that could demonstrate that ERK activation by volume change is distinct from the effects reported here:

      (i) Does hypotonic shock activate ERK in U2OS cells?

      (ii) Can hypotonic shock activate ERK even after PS depletion, whereas extracellular K+ cannot?

      (iii) Does high extracellular K+ change cell volume in U2OS cells, measured via an accurate method such as fluorescence exclusion microscopy?

      (iv) It would be helpful to check the osmolality of all the extracellular solutions, even though they were nominally targeted to be iso-osmotic.

      (2) Some more details about the experimental design and the results are needed from Figure 1:

      (i) For how long are the cells serum-starved? From the Methods section, it seems like the G1 release in different K+ concentration is done without serum, is this correct? Is the prior thymidine treatment also performed in the absence of serum?

      (ii) There is a question of whether depolarization constitutes a physiologically relevant mechanism to regulate proliferation, and how depolarization interacts with other extracellular signals that might be present in an in vivo context. Does depolarization only promote proliferation after extended serum starvation (in what is presumably a stressed cell state)? What fraction of total cells are observed to be mitotic (without normalization), and how does this compare to the proliferation of these cells growing in serum-supplemented media? Can K+ concentration tune proliferation rate even in serum-supplemented media?

      (3) In Figure 2, there are some possible concerns with the perfusion experiment:

      (i) Is the buffer static in the period before perfusion with high K+, or is it perfused? This is not clear from the Methods. If it is static, how does the ERK activity change when perfused with 5 mM K+? In other words, how much of the response is due to flow/media exchange versus change in K+ concentration?

      (ii) Why do there appear to be population-average decreases in ERK activity in the period before perfusion with high K+ (especially in contrast to Fig. 3)? The imaging period does not seem frequent enough for photobleaching to be significant.

      (4) Figure 3 contains important results on couplings between membrane potential and MAPK signaling. However, there are a few concerns:

      (i) Does cell volume change upon voltage clamping? Previous authors have shown that depolarizing voltage clamp can cause cells to swell, at least in the whole-cell configuration: https://www.cell.com/biophysj/fulltext/S0006-3495(18)30441-7 . Could it be possible that the clamping protocol induces changes in ERK signaling due to changes in cell volume, and not by an independent mechanism?

      (ii) Does the -80 mV clamp begin at time 0 minutes? If so, one might expect a transient decrease in sensor FRET ratio, depending on the original resting potential of the cells. Typical estimates for resting potential in HEK293 cells range from -40 mV to -15 mV, which would reach the range that induces an ERK response by depolarizing clamp in Fig. 3B. What are the resting potentials of the cells before they are clamped to -80 mV, and why do we not see this downward transient?

      (5) The activation of ERK by perforated voltage clamp and by high extracellular K+ are each convincing, but it is unclear whether they need to act purely through the same mechanism - while additional extracellular K+ does depolarize the cell, it could also be affecting function of voltage-independent transporters and cell volume regulatory mechanisms on the timescales studied. To more strongly show this, the following should be done with the HEK cells where there is already voltage clamp data:

      (i) Measure resting potential using the perforated patch in zero-current configuration in the high K+ medium. Ideally this should be done in the time window after high K+ addition where ERK activation is observed (10-20 minutes) to minimize the possibility of drift due to changes in transporter and channel activity due to post-translational regulation.

      (ii) Measure YFP/CFP ratio of the HEK cells in the high K+ medium (in contrast to the U2OS cells from Fig. 2 where there is no patch data).

      (iii) The assertion that high K+ is equivalent to changes in Vmem for ERK signaling would be supported if the YFP/CFP change from K+ addition is comparable to that induced by voltage clamp to the same potential. This would be particularly convincing if the experiment could be done with each of the 15 mM, 30 mM, and 145 mM conditions.

      (6) Line 170: "ERK activity was reduced with a fast time course (within 1 minute) after repolarization to -80 mV." I don't see this in the data: in Fig. 3C, it looks like ERK remains elevated for > 10 min after the electrical stimulus has returned to -80 mV.

      Comments on revisions:

      The authors have done a good job addressing the comments on the previous submission.

      Reviewer #3 (Public review):

      Summary:

      This paper demonstrates that membrane depolarization induces a small increase in cell entry into mitosis. Based on previous work from another lab, the authors propose that ERK activation might be involved. They show convincingly using a combination of assays that ERK is activated by membrane depolarization. They show this is Ca2+ independent and is a result of activation of the whole K-Ras/ERK cascade which results from changed dynamics of phosphatidylserine in the plasma membrane that activates K-Ras. Although the activation of the Ras/ERK pathway by membrane depolarization is not new, linking it to an increase in cell proliferation is novel.

      Strengths

      A major strength of the study is the use of different techniques - live imaging with ERK reporters, as well as Western blotting to demonstrate ERK activation as well as different methods for inducing membrane depolarization. They also use a number of different cell lines. Via Western blotting the authors are also able to show that the whole MAPK cascade is activated.

      Weaknesses

      A weakness of the study is the data in Figure 1 showing that membrane depolarization results in an increase of cells entering mitosis. There are very few cells entering mitosis in their sample in any condition. This should be done with many more cells to increase the confidence in the results. The study also lacks a mechanistic link between ERK activation by membrane depolarization and increased cell proliferation.

      The authors did achieve their aims with the caveat that the cell proliferation results could be strengthened. The results, for the most part, support the conclusions.

      This work suggests that alterations in membrane potential may have more physiological functions than action potential in the neural system as it has an effect on intracellular signalling and potentially cell proliferation.

      In the revised manuscript, the authors have now addressed the issues with Figure 1, and the data presented are much clearer. They did also attempt to pinpoint when in the cell cycle ERK is having its activity, but unfortunately, this was not conclusive.

      Reviewer #2 (Recommendations for the authors):

      Small issues:

      Fig. 1A. Please add a mark on the timeline showing when the K+ concentration is changed. Also, please add a time axis that matches the time axis in (C), so readers can know when in C the medium was changed.

      1B caption: unclear what "the images were 20 min before and after cytokinesis" means, given that the images go from -30 min to +20 min. Maybe the authors mean, "the indicated times are measured relative to cytokinesis."

      Thank you for bringing these points to our attention that can confuse readers. We revised the figure legend.

      Line 214: nonoclusters --> nanoclusters

      Line 475: 10 mm -> 10 µm

      Corrected.

    1. The history of early christianity taught within the religion often focuses on the sacrifices of martyrs who refused to renounce their faith even when threatened with torture and violent death.

      I think this is interesting and I never knew that this was a religious thing.

    1. Supplementary Figure 3

      I would make this a main figure — it seems almost like a graphical abstract of the entire paper and shows readers at a glance how different programs they might be familiar with can all work harmoniously. It might also be good to show this high-level view of the system before showing the more nitty-gritty views in Figure 2

    2. custom automations

      You might want to mention to readers that Airtable also enables no-code automations (and you can use Zapier for some more complicated but still no-code automation) since some labs lack even basic scripting ability. You describe "no-code implementation" early in the manuscript, so it seems important to clarify exactly what can be done without code and what strictly requires it.

      Separately, you could also mention that Airtable can communicate with Slack (I see Telegram in Supp Fig 3, but I believe Slack is probably more common among scientists), Google Drive, etc.

    1. nearly three-quarters (72.3%) of Osogbo residents frequently experience loss of sleep (insomnia) during the night due to high levels of nocturnal noise pollution from sources, such as nightclubs, generating sets, parties, traffic, and noise from religious activities.

      This is a shocking statistic. It directly links high nighttime noise levels to a major health problem (insomnia) for a huge majority of residents. This clearly shows the convergence between the measured high noise levels and the perceived negative health impacts. This would be a great quote for Project 2 or 3.

    1. Por que escolh

      In this panel, instead of keeping it as a slider, lay it out as 3 columns (1 with the negative and the other 2 with the positive). On mobile, they appear one below the other (positive on top).

    1. Reviewer #3 (Public review):

      Summary:

      The finding of rhythmic activity in the brain has for a long time engendered the theory of rhythmic modes of perception, that humans might oscillate between improved and worse perception depending on states of our internal systems. However, experiments looking for such modes have resulted in conflicting findings, particularly in those where the stimulus itself is not rhythmic. This paper seeks to take a comprehensive look at the effect and various experimental parameters which might generate these competing findings: in particular, the presentation of the stimulus to one ear or the other, the relevance of motor involvement, attentional demands, and memory: each of which is revealed to affect the consistency of this rhythmicity.

      The need the paper attempts to resolve is a critical one for the field. However, as presented, I remain unconvinced that the data would not be better interpreted as showing no consistent rhythmic mode effect.

      Strengths:

      The paper is strong in its experimental protocol and its comprehensive analysis, which seeks to compare effects across several analysis types and slight experiment changes to investigate which parameters could affect the presence or absence of an effect of rhythmicity. The prescribed nature of its hypotheses and its manner of setting out to test them is very clear, which allows for a straightforward assessment of its results.

      Weaknesses:

      The papers cited to justify a rhythmic mode are largely based on the processing of rhythmic stimuli. The authors assume the rhythmic mode to be the general default, but it's not so clear to me why this would be so. The task design seems better suited to a continuous vigilance mode task.

      Secondly, the analysis to detect a "rhythmic mode" assumes a total phase reset at noise onset, which is highly implausible given standard nonlinear dynamical analysis of oscillator performance. It's not clear that a rhythmic mode (should it be applied in this task) would indeed generate a consistent phase as the analysis searches for.

      Thirdly, the number of statistical tests used here makes trusting any single effect quite difficult, and very few of the effects replicate more than once. I think the data would be better interpreted as not confirming evidence for rhythmic mode processing in the ears.

      Comments on revised version:

      No further comments. The paper has many of the same issues that I expressed in the initial review, but I don't think they can be addressed without a replication study, which I appreciate is not always feasible.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      This paper presents results from four independent experiments, each of which tests for rhythmicity in auditory perception. The authors report rhythmic fluctuations in discrimination performance at frequencies between 2 and 6 Hz. The exact frequency depends on the ear and experimental paradigm, although some frequencies seem to be more common than others.

      Strengths:

      The first sentence in the abstract describes the state of the art perfectly: "Numerous studies advocate for a rhythmic mode of perception; however, the evidence in the context of auditory perception remains inconsistent". This is precisely why the data from the present study is so valuable. This is probably the study with the highest sample size (total of > 100 in 4 experiments) in the field. The analysis is very thorough and transparent, due to the comparison of several statistical approaches and simulations of their sensitivity. Each of the experiments differs from the others in a clearly defined experimental parameter, and the authors test how this impacts auditory rhythmicity, measured in pitch discrimination performance (accuracy, sensitivity, bias) of a target presented at various delays after noise onset.

      Weaknesses:

      (1) The authors find that the frequency of auditory perception changes between experiments. I think they could exploit differences between experiments better to interpret and understand the obtained results. These differences are very well described in the Introduction, but don't seem to be used for the interpretation of results. For instance, what does it mean if perceptual frequency changes from between- to within-trial pitch discrimination? Why did the authors choose this experimental manipulation? Based on differences between experiments, is there any systematic pattern in the results that allows conclusions about the roles of different frequencies? I think the Discussion would benefit from an extension to cover this aspect.

      We believe that interpreting these differences remains difficult and a precise, detailed (and possibly mechanistic) interpretation is beyond the goal of the present study. The main goal of this study was to explore the consistency and variability of effects across variations of the experimental design and samples of participants. Interpreting specific effects, e.g. at particular frequencies, would make sense mostly if differences between experiments have been confirmed in a separate reproduction. Still, we do provide specific arguments for why differences in the outcome between different experiments, e.g. with and without explicit trial initialization by the participants, could be expected. See lines 91ff in the introduction and 786ff in the discussion.

      (2) The Results give the impression of clear-cut differences in relevant frequencies between experiments (e.g., 2 Hz in Experiment 1, 6 Hz in Exp 2, etc), but they might not be so different. For instance, a 6 Hz effect is also visible in Experiment 1, but it just does not reach conventional significance. The average across the three experiments is therefore very useful, and also seems to suggest that differences between experiments are not very pronounced (otherwise the average would not produce clear peaks in the spectrum). I suggest making this point clearer in the text.

      We have revised the conclusions to note that the present data do not support clear-cut differences between experiments. For this reason, we also refrain from detailed interpretations of specific effects, as suggested by this reviewer in point 1 above.

      (3) I struggle to understand the hypothesis that rhythmic sampling differs between ears. In most everyday scenarios, the same sounds arrive at both ears, and the time difference between the two is too small to play a role for the frequencies tested. If both ears operate at different frequencies, the effects of the rhythm on overall perception would then often cancel out. But if this is the case, why would the two ears have different rhythms to begin with? This could be described in more detail.

      This hypothesis was not invented by us, but in essence put forward in previous work. The study by Ho et al. CurrBiol 2017 has reported rhythmic effects at different frequencies in the left and right ears, and we here tried to reproduce these effects. One could speculate about an ear-difference based on studies reporting a right-ear advantage in specific listening tasks, and the idea that different time scales of rhythmic brain activity may specifically prevail in the left and right cortical hemispheres; hence it does not seem improbable that there could be rhythmic effects in both ears at different frequencies. We note this in the introduction, l. 65ff.

      Reviewer #2 (Public review):

      Summary:

      The current study aims to shed light on why previous work on perceptual rhythmicity has led to inconsistent results. They propose that the differences may stem from conceptual and methodological issues. In a series of experiments, the current study reports perceptual rhythmicity in different frequency bands that differ between different ear stimulations and behavioral measures.

      The study suggests challenges regarding the idea of universal perceptual rhythmicity in hearing.

      Strengths:

      The study aims to address differences observed in previous studies about perceptual rhythmicity. This is important and timely because the existing literature provides quite inconsistent findings. Several experiments were conducted to assess perceptual rhythmicity in hearing from different angles. The authors use sophisticated approaches to address the research questions.

      Weaknesses:

      (1) Conceptual concerns:

      The authors place their research in the context of a rhythmic mode of perception. They also discuss continuous vs rhythmic mode processing. Their study further follows a design that seems to be based on paradigms that assume a recent phase in neural oscillations that subsequently influence perception (e.g., Fiebelkorn et al.; Landau & Fries). In my view, these are different facets in the neural oscillation research space that require a bit more nuanced separation. Continuous mode processing is associated with vigilance tasks (work by Schroeder and Lakatos; reduction of low frequency oscillations and sustained gamma activity), whereas the authors of this study seem to link it to hearing tasks specifically (e.g., line 694). Rhythmic mode processing is associated with rhythmic stimulation by which neural oscillations entrain and influence perception (also, Schroeder and Lakatos; greater low-frequency fluctuations and more rhythmic gamma activity). The current study mirrors the continuous rather than the rhythmic mode (i.e., there was no rhythmic stimulation), but even the former seems not fully fitting, because trials are 1.8 s short and do not really reflect a vigilance task. Finally, previous paradigms on phase-resetting reflect more closely the design of the current study (i.e., different times of a target stimulus relative to the reset of an oscillation). This is the work by Fiebelkorn et al., Landau & Fries, and others, which do not seem to be cited here, which I find surprising. Moreover, the authors would want to discuss the role of the background noise in resetting the phase of an oscillation, and the role of the fixation cross also possibly resetting the phase of an oscillation. Regardless, the conceptional mixture of all these facets makes interpretations really challenging. The phase-reset nature of the paradigm is not (or not well) explained, and the discussion mixes the different concepts and approaches. 
I recommend that the authors frame their work more clearly in the context of these different concepts (affecting large portions of the manuscript).

      Indeed, the paradigms used here and in many similar previous studies incorporate an aspect of phase-resetting, as the presentation of a background noise may effectively reset ongoing auditory cortical processes. Studies trying to probe for rhythmicity in auditory perception in the absence of any background noise have not shown any effect (Zoefel and Heil, 2013), perhaps because the necessary rhythmic processes along auditory pathways are only engaged when some sound is present. We now discuss these points, and also acknowledge the mentioned studies in the visual system; l. 57.

      (2) Methodological concerns:

      The authors use a relatively unorthodox approach to statistical testing. I understand that they try to capture and characterize the sensitivity of the different analysis approaches to rhythmic behavioral effects. However, it is a bit unclear what meaningful effects are in the study. For example, the bootstrapping approach that identifies the percentage of significant variations of sample selections is rather descriptive (Figures 5-7). The authors seem to suggest that 50% of the samples are meaningful (given the dashed line in the figure), even though this is rarely reached in any of the analyses. Perhaps >80% of samples should show a significant effect to be meaningful (at least to my subjective mind). To me, the low percentage rather suggests that there is not too much meaningful rhythmicity present. 

      We note that there is no clear consensus on what fraction of experiments should be expected or how this way of quantifying effects should be precisely valued (l. 441ff). However, we now also clearly acknowledge in the discussion that the effective prevalence is not very high (l. 663).

      I suggest that the authors also present more traditional, perhaps multi-level, analyses: Calculation of spectra, binning, or single-trial analysis for each participant and condition, and the respective calculation of the surrogate data analysis, and then comparison of the surrogate data to the original data on the second (participant) level using t-tests. I also thought the statistical approach undertaken here could have been a bit more clearly/didactically described as well.

      We realize that our description of the methods was possibly not fully clear. We do follow the strategy suggested by this reviewer, but rather than comparing actual and surrogate data based on a parametric t-test, we compare these based on a non-parametric percentile-based approach. This has the advantage of not making specific (and possibly unwarranted) assumptions about the distribution of the data. We have revised the methods to clarify this, l. 332ff.

      The authors used an adaptive procedure during the experimental blocks such that the stimulus intensity was adjusted throughout. In practice, this can be a disadvantage relative to keeping the intensity constant throughout, because, on average, correct trials will be associated with a higher intensity than incorrect trials, potentially making observations of perceptual rhythmicity more challenging. The authors would want to discuss this potential issue. Intensity adjustments could perhaps contribute to the observed rhythmicity effects. Perhaps the rhythmicity of the stimulus intensity could be analyzed as well. In any case, the adaptive procedure may add variance to the data.

      We have added an analysis of task difficulty to the results (new section “Effects of adaptive task difficulty“) to address this. Overall, we do not find systematic changes in task difficulty across participants for most of the experiments, but one certainly cannot rule out that this aspect of the design also affects the outcomes. Importantly, we relied on an adaptive task difficulty to actually (or hopefully) reduce variance in the data, by keeping the task difficulty around a certain level. Given the large number of trials collected, not using such an adaptive procedure may result in performance levels around chance or near ceiling, which would make it impossible to detect rhythmic variations in behavior.

      Additional methodological concerns relate to Figure 8. Figures 8A and C seem to indicate that a baseline correction for a very short time window was calculated (I could not find anything about this in the methods section). The data seem very variable and artificially constrained in the baseline time window. It was unclear what the reader might take from Figure 8.

      This figure was intended mostly for illustration of the eye tracking data, but we agree that there is no specific key insight to be taken from this. We removed this. 

      Motivation and discussion of eye-movement/pupillometry and motor activity: The dual task paradigm of Experiment 4 and the reasons for assessing eye metrics in the current study could have been better motivated. The experiment somehow does not fit in very well. There is recent evidence that eye movements decrease during effortful tasks (e.g., Contadini-Wright et al. 2023 J Neurosci; Herrmann & Ryan 2024 J Cog Neurosci), which appears to contradict the results presented in the current study. Moreover, by appealing to active sensing frameworks, the authors suggest that active movements can facilitate listening outcomes (line 677; they should provide a reference for this claim), but it is unclear how this would relate to eye movements. Certainly, a person may move their head closer to a sound source in the presence of competing sound to increase the signal-to-noise ratio, but this is not really the active movements that are measured here. A more detailed discussion may be important. The authors further frame the difference between Experiments 1 and 2 as being related to participants' motor activity. However, there are other factors that could explain differences between experiments. Self-paced trials give participants the opportunity to rest more (inter-trial durations were likely longer in Experiment 2), perhaps affecting attentional engagement. I think a more nuanced discussion may be warranted.

      We expanded the motivation of why self-pacing trials may effectively alter how rhythmic processes affect perception, and now also allude to attention and expectation related effects (l. 786ff). Regarding eye movements we now discuss the results in the light of the previously mentioned studies, but again refrain from a very detailed and mechanistic interpretation (l. 782).

      Discussion:

      The main data in Figure 3 showed little rhythmicity. The authors seem to gloss over this fact by simply stating that the same phase is not necessary for their statistical analysis. Previous work, however, showed rhythmicity in the across-participant average (e.g., Fiebelkorn's and similar work). Moreover, one would expect that some of the effects in the low-frequency band (e.g., 2-4 Hz) are somewhat similar across participants. Conduction delays in the auditory system are much smaller than the 0.25-0.5 s associated with 2-4 Hz. The authors would want to discuss why different participants would express so vastly different phases that the across-participant average does not show any rhythmicity, and what this would mean neurophysiologically.

      We now discuss the assumptions and implications of similar or distinct phases of rhythmic processes within and between participants (l. 695ff). In particular, we note that different origins of the underlying neurophysiological processes eventually may suggest that such assumptions are or are not warranted.

      An additional point that may require more nuanced discussion is related to the rhythmicity of response bias versus sensitivity. The authors could discuss what the rhythmicity of these different measures in different frequency bands means, with respect to underlying neural oscillations.

      We expanded the discussion to interpret what rhythmic changes in each of the behavioral metrics could imply (l. 706ff).

      Figures:

      Much of the text in the figures seems really small. Perhaps the authors would want to ensure it is readable even for those with low vision abilities. Moreover, Figure 1A is not as intuitive as it could be and may perhaps be made clearer. I also suggest the authors discuss a bit more the potential monaural vs binaural issues, because the perceptual rhythmicity is much slower than any conduction delays in the auditory system that could lead to interference.

      We tried to improve the font sizes where possible, and discuss the potential monaural origins as suggested by other reviewers. 

      Reviewer #3 (Public review):

      Summary:

      The finding of rhythmic activity in the brain has, for a long time, engendered the theory of rhythmic modes of perception, that humans might oscillate between improved and worse perception depending on states of our internal systems. However, experiments looking for such modes have resulted in conflicting findings, particularly in those where the stimulus itself is not rhythmic. This paper seeks to take a comprehensive look at the effect and various experimental parameters which might generate these competing findings: in particular, the presentation of the stimulus to one ear or the other, the relevance of motor involvement, attentional demands, and memory: each of which is revealed to affect the consistency of this rhythmicity.

      The need the paper attempts to resolve is a critical one for the field. However, as presented, I remain unconvinced that the data would not be better interpreted as showing no consistent rhythmic mode effect. It lacks a conceptual framework to understand why effects might be consistent in each ear but at different frequencies and only for some tasks with slight variants, some affecting sensitivity and some affecting bias.

      Strengths:

      The paper is strong in its experimental protocol and its comprehensive analysis, which seeks to compare effects across several analysis types and slight experiment changes to investigate which parameters could affect the presence or absence of an effect of rhythmicity. The prescribed nature of its hypotheses and its manner of setting out to test them is very clear, which allows for a straightforward assessment of its results.

      Weaknesses:

      There is a weakness throughout the paper in terms of establishing a conceptual framework both for the source of "rhythmic modes" and for the interpretation of the results. Before understanding the data on this matter, it would be useful to discuss why one would posit such a theory to begin with. From a perceptual side, rhythmic modes of processing in the absence of rhythmic stimuli would not appear to provide any benefit to processing. From a biological or homeostatic argument, it's unclear why we would expect such fluctuations to occur in such a narrow-band way when neither the stimulus nor the neurobiological circuits require it.

      We believe that the framework for why there may be rhythmic activity along auditory pathways that shapes behavioral outcomes has been laid out in many previous studies, prominently here (Schroeder et al., 2008; Schroeder and Lakatos, 2009; Obleser and Kayser, 2019). Many of the relevant studies are cited in the introduction, which is already rather long given the many points covered in this study. 

      Secondly, for the analysis to detect a "rhythmic mode", it must assume that the phase of fluctuations across an experiment (i.e., whether fluctuations are in an up-state or down-state at onset) is constant at stimulus onset, whereas most oscillations do not have such a total phase-reset as a result of input. Therefore, some theoretical positing of what kind of mechanism could generate this fluctuation is critical toward understanding whether the analysis is well-suited to the studied mechanism.

      In line with this and previous comments (by reviewer 2) we have expanded the discussion to consider the issue of phase alignment (l. 695ff). 

      Thirdly, an interpretation of why we should expect left and right ears to have distinct frequency ranges of fluctuations is required. There are a large number of statistical tests in this paper, and it's not clear how multiple comparisons are controlled for, apart from experiment 4 (which specifies B&H false discovery rate). As such, one critical method to identify whether the results are not the result of noise or sample-specific biases is the plausibility of the finding. On its face, maintaining distinct frequencies of perception in each ear does not fit an obvious conceptual framework.

      Again this point was also noted by another reviewer and we expanded the introduction and discussion in this regard (l. 65ff).

      Reviewer #1 (Recommendations for the authors):

      (1) An update of the AR-surrogate method has recently been published (https://doi.org/10.1101/2024.08.22.609278). I appreciate that this is a lot of work, and it is of course up to the authors, but given the higher sensitivity of this method, it might be worth applying it to the four datasets described here.

      Reading this article, we note that our implementation of the AR-surrogate method was essentially as suggested there, and not as implemented by Brookshire. In fact, we had not realized that Brookshire had apparently computed the spectrum based on the group-average data. As explained in the Methods section, and now clarified further, we compute for each participant the actual spectrum of this participant’s data and a set of surrogate spectra. We then average both across participants and compute the p-value of the actual group average from its percentile within the distribution of surrogate group averages. This second step differs from Harris & Beale, who used a one-sided t-test. The latter is most likely not appropriate in a strict statistical sense, but is possibly more powerful for detecting true results than the percentile-based approach that we used (see l. 332ff).
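
      The percentile-based group test described above can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes an AR(1) surrogate model, a plain unwindowed DFT, equal-length time courses, and hypothetical function names (`power_spectrum`, `fit_ar1`, `ar1_surrogate`, `group_pvalues`); the actual analysis may differ in all of these choices.

```python
import math
import random

def power_spectrum(x):
    """Power at DFT bins 1..n//2 of a mean-centred series (no windowing)."""
    n = len(x)
    mean = sum(x) / n
    xc = [v - mean for v in x]
    spec = []
    for k in range(1, n // 2 + 1):
        re = sum(v * math.cos(2 * math.pi * k * t / n) for t, v in enumerate(xc))
        im = sum(v * math.sin(2 * math.pi * k * t / n) for t, v in enumerate(xc))
        spec.append((re * re + im * im) / n)
    return spec

def fit_ar1(x):
    """Return (phi, sigma): lag-1 coefficient and residual SD of an AR(1) fit."""
    mean = sum(x) / len(x)
    xc = [v - mean for v in x]
    denom = sum(v * v for v in xc[:-1])
    phi = sum(a * b for a, b in zip(xc[:-1], xc[1:])) / denom if denom else 0.0
    resid = [b - phi * a for a, b in zip(xc[:-1], xc[1:])]
    sigma = math.sqrt(sum(r * r for r in resid) / len(resid))
    return phi, sigma

def ar1_surrogate(phi, sigma, n, rng):
    """Simulate one AR(1) series of length n (assumed null model: no rhythm)."""
    x = [rng.gauss(0.0, sigma)]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, sigma))
    return x

def group_pvalues(time_courses, n_surr=200, seed=0):
    """Per-frequency p-values of the group-average spectrum.

    Spectra are computed per participant, then averaged across participants;
    the p-value is the share of surrogate group averages that reach or exceed
    the actual group average (percentile approach, no t-test).
    """
    rng = random.Random(seed)
    n_sub = len(time_courses)
    n_freq = len(time_courses[0]) // 2
    actual = [power_spectrum(tc) for tc in time_courses]
    group_actual = [sum(s[f] for s in actual) / n_sub for f in range(n_freq)]
    params = [fit_ar1(tc) for tc in time_courses]
    group_surr = []
    for _ in range(n_surr):
        spectra = [power_spectrum(ar1_surrogate(phi, sig, len(tc), rng))
                   for (phi, sig), tc in zip(params, time_courses)]
        group_surr.append([sum(s[f] for s in spectra) / n_sub for f in range(n_freq)])
    pvals = []
    for f in range(n_freq):
        exceed = sum(1 for g in group_surr if g[f] >= group_actual[f])
        pvals.append((exceed + 1) / (n_surr + 1))  # add-one correction avoids p = 0
    return pvals
```

      Calling `group_pvalues` on a list of per-participant behavioral time courses returns one p-value per frequency bin. Computing the spectrum of the group-average time course instead, as attributed to Brookshire above, would be a different procedure.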

      (2) When results for the four experiments are reported, a reminder for the reader of how these experiments differ from each other would be useful.

      We have added this in the Results section.

      "considerable prevalence of differences around 4Hz, with dual‐task requirements leading to stronger rhythmicity in perceptual sensitivity". There is a striking similarity to recently published data (https://doi.org/10.1101/2024.08.10.607439 ) demonstrating a 4-Hz rhythm in auditory divided attention (rather than between modalities as in the present case). This could be a useful addition to the paragraph.

      We have added a reference to this preprint, and to additional previous work mentioned therein that points in the same direction.

      (3) There are two typos in the Introduction: "related by different from the question", and below, there is one "presented" too much.

      These have been fixed.

      Reviewer #3 (Recommendations for the authors):

      My major suggestion is that these results must be replicated in a new sample. I understand this is not simple to do and not always possible, but at this point, no effect is replicated from one experiment to the next, despite very small changes in protocol (especially experiment 1 vs 2). It's therefore very difficult to justify explaining the different effects as real as opposed to random effects of this particular sample. While the bootstrapping effects show the level of consistency of the effect within the sample studied, it can not be a substitute for a true replication of the results in a new sample.

      We agree that only an independent replication can demonstrate the robustness of the results. We do consider experiment 1 a replication test of Ho et al. CurrBiol 2017, which yields different results than reported there. More importantly, we consider the analysis of ‘reproducibility’ by simulating participant samples a key novelty of the present work, and want to emphasize this over the within-study replication of the same experiment. In fact, in light of the present interpretation of the data, even a within-study replication would most likely not offer a clear-cut answer.

      As I said in the public review, the interpretation of the results, and of why perceptual cycles in arhythmic stimuli could be a plausible theory to begin with, is lacking. A conceptual framework would vastly improve the impact and understanding of the results.

      We tried to strengthen the conceptual framework in the introduction. We believe that this is in large part provided by previous work, and the aim of the present study was to explore the robustness of effects and not to suggest and discover novel effects.

      Minor comments:

      (1) The authors adapt the difficulty as a function of performance, which seems to me a strange choice for an experiment that is analyzing the differences in performance across the experiment. Could you add a sentence to discuss the motivation for this choice?

      We now mention the rationale in the Methods section and in a new section of the Results. There we also provide additional analyses on this parameter.

      (2) The choice to plot the p-values as opposed to the values of the actual analysis feels ill-advised to me. It invites comparison across analyses that isn't necessarily fair. It would be more informative to plot the respective analysis outputs (spectral power, regression, or delta R2) and highlight the windows of significance and their overlap across analyses. In my opinion, this would be a fairer and more accurate depiction of the analyses as they are meant to be used.

      We do disagree. As explained in the Methods (l. 374ff): “(Showing p-values) … allows presenting the results on a scale that can be directly compared between analysis approaches, metrics, frequencies and analyses focusing on individual ears or the combined data. Each approach has a different statistical sensitivity, and the underlying effect sizes (e.g. spectral power) vary with frequency for both the actual data and null distribution. As a result, the effect size reaching statistical significance varies with frequency, metrics and analyses.” 

      The fact that the level of power (or R2 or whatever metric we consider) required to reach significance differs between analyses (one ear, both ears), metrics (d-prime, bias, RT) and analysis approaches makes showing the results difficult, as we would need a separate panel for each of those. This would multiply the number of panels required, e.g., for Figure 4 by 3, making it a figure with 81 axes. Also, neither the original quantities of each analysis (e.g. spectral power) nor the p-values that we show constitute a proper measure of effect size in a statistical sense. In that sense, neither of these is truly ideal for comparing between analyses, metrics etc.

      We do agree, though, that many readers may want to see the original quantification and thresholds for statistical significance. We now show these in an exemplary manner for the Binned analysis of Experiment 1, which provides a positive result and also is an attempt to replicate the findings by Ho et al 2017. This is shown in new Figure 5.

      (3) Typo in line 555 (+ should be plus minus).

      (4) Typo in line 572: "Comparison of 572 blocks with minus dual task those without"

      (5) Typo in line 616: remove "one".

      (6) Line 666 refers to effects in alpha band activity, but it's unclear what the relationship is to the authors' findings, which peak around 6 Hz, lower than alpha (~10 Hz).

      (7) Line 688 typo, remove "amount of".

      These points have been addressed.  

      (8) Oculomotor effect that drives greater rhythmicity at 3-4 Hz. Did the authors analyze the eye movements to see if saccades were also occurring at this rate? It would be useful to know if the 3-4 Hz effect is driven by "internal circuitry" in the auditory system or by the typical rate of eye movement.

      A preliminary analysis of eye movement data was shown in the previous Figure 8, which was removed on the recommendation of another reviewer. This showed that the average saccade rate is about 0.01 saccades per trial per time bin, amounting to on average less than one detected saccade per trial. Hence rhythmicity in saccades is unlikely to explain rhythmicity in behavioral data at the scale of 3-4 Hz. We now note this in the Results.

      Obleser J, Kayser C (2019) Neural Entrainment and Attentional Selection in the Listening Brain. Trends Cogn Sci 23:913-926.

      Schroeder CE, Lakatos P (2009) Low-frequency neuronal oscillations as instruments of sensory selection. Trends Neurosci 32:9-18.

      Schroeder CE, Lakatos P, Kajikawa Y, Partan S, Puce A (2008) Neuronal oscillations and visual amplification of speech. Trends Cogn Sci 12:106-113.

      Zoefel B, Heil P (2013) Detection of Near-Threshold Sounds is Independent of EEG Phase in Common Frequency Bands. Front Psychol 4:262.

    1. Entitlements

      Classification of entitlements: 1. claims - a specific entitled person X may demand that another party render a performance to X; 2. formative rights - the entitled person X has the power to modify or terminate a legal relationship through a unilateral legal act; 3. defenses - the right to refuse to satisfy a claim: a) peremptory (permanent) - effect: enforcement of the claim is defeated at any time (e.g., limitation of claims); b) dilatory (temporary) - effect: enforcement of the claim is restricted, but only for a specified period.

    1. Reviewer #2 (Public review):

      In this manuscript, Meier et al. engineer a new class of light-regulated two-component systems. These systems are built using bathy-bacteriophytochromes that respond to near-infrared (NIR) light. Through a combination of genetic engineering and systematic linker optimization, the authors generate bacterial strains capable of selective and tunable gene expression in response to NIR stimulation. Overall, these results are an interesting expansion of the optogenetic toolkit into the NIR range. The cross-species functionality of the system, modularity, and orthogonality have the potential to make these tools useful for a range of applications.

      Strengths:

      (1) The authors introduce a novel class of near-infrared light-responsive two-component systems in bacteria, expanding the optogenetic toolbox into this spectral range.

      (2) Through engineering and linker optimization, the authors achieve specific and tunable gene expression, with minimal cross-activation from red light in some cases.

      (3) The authors show that the engineered systems function robustly in multiple bacterial strains, including laboratory E. coli, the probiotic E. coli Nissle 1917, and Agrobacterium tumefaciens.

      (4) The combination of orthogonal two-component systems can allow for simultaneous and independent control of multiple gene expression pathways using different wavelengths of light.

      (5) The authors explore the photophysical properties of the photosensors, investigating how environmental factors such as pH influence light sensitivity.

      Comments on revisions:

      The authors have addressed all my prior concerns.

    2. Reviewer #3 (Public review):

      Summary:

      This paper by Meier et al introduces a new optogenetic module for regulation of bacterial gene expression based on "bathy-BphP" proteins. Their paper begins with a careful characterization of kinetics and pH dependence of a few family members, followed by extensive engineering to produce infrared-regulated transcriptional systems based on the authors' previous design of the pDusk and pDERusk systems, and closing with characterization of the systems in bacterial species relevant for biotechnology.

      Strengths:

      The paper is important from the perspective of fundamental protein characterization, since bathy-BphPs are relatively poorly characterized compared to their phytochrome and cyanobacteriochrome cousins. It is also important from a technology development perspective: the optogenetic toolbox currently lacks infrared-stimulated transcriptional systems. Infrared light offers two major advantages: it can be multiplexed with additional tools, and it can penetrate into deep tissues with ease relative to the more widely used blue light activated systems. The experiments are performed carefully and the manuscript is well written.

      Weaknesses:

      Some of the light-inducible responses described in this compelling paper are complex and difficult to rationalize, such as the dependence of light responses on linker length and differences in responses observed from the bathy-BphPs in isolation versus strains in which they are multiplexed. Nevertheless, the authors should be commended for carrying out rigorous experiments and reporting these results accurately. These are minor weaknesses in an overall very strong paper.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      This is an interesting study characterizing and engineering so-called bathy phytochromes, i.e., those that respond to near infrared (NIR) light in the ground state, for optogenetic control of bacterial gene expression. Previously, the authors have developed a structure-guided approach to functionally link several light-responsive protein domains to the signaling domain of the histidine kinase FixL, which ultimately controls gene expression. Here, the authors use the same strategy to link bathy phytochrome light-responsive domains to FixL, resulting in sensors of NIR light. Interestingly, they also link these bathy phytochrome light-sensing domains to signaling domains from the tetrathionate-sensing SHK TtrS and the toluene-sensing SHK TodS, demonstrating the generality of their protein engineering approach more broadly across bacterial two-component systems.

      This is an exciting result that should inspire future bacterial sensor design. They go on to leverage this result to develop what is, to my knowledge, the first system for orthogonally controlling the expression of two separate genes in the same cell with NIR and Red light, a valuable contribution to the field.

      Finally, the authors reveal new details of the pH-dependent photocycle of bathy phytochromes and demonstrate that their sensors work in the gut- and plant-relevant strains E. coli Nissle 1917 and A. tumefaciens.

      Strengths:

      (1) The experiments are well-founded, well-executed, and rigorous.

      (2) The manuscript is clearly written.

      (3) The sensors developed exhibit large responses to light, making them valuable tools for optogenetic applications.

      (4) This study is a valuable contribution to photobiology and optogenetics.

      We thank the reviewer for the positive verdict on our manuscript.

      Weaknesses:

      (1) As the authors note, the sensors are relatively insensitive to NIR light due to the rapid dark reversion process in bathy phytochromes. Though NIR light is generally non-phototoxic, one would expect this characteristic to be a limitation in some downstream applications where light intensities are not high (e.g., in vivo).

      We principally concur with this reviewer’s assessment that delivery of light (of any color) into living tissue can be severely limited by absorption, reflection, and scattering. That notwithstanding, at least two considerations suggest that in-vivo deployment of the pNIRusk setups we presently advance may be feasible.

      First, while the pNIRusk setups are indeed less light-sensitive than, e.g., our earlier red-light-responsive pREDusk and pDERusk setups (see Meier et al. Nat Commun 2024), we note that the overall light fluences required for triggering them are in the range of tens of µW per cm<sup>2</sup>. By contrast, optogenetic experiments in vivo, in particular in the neurosciences, often employ light area intensities on the order of mW per cm<sup>2</sup> and above. Put another way, compared to the optogenetic tools used in these experiments, the pNIRusk setups are actually quite sensitive to light.

      Second, sensitivity to NIR light brings the advantage of superior tissue penetration; see the data reported by Weissleder Nat Biotech 2001 and Ash et al. Lasers Med Sci 2017 (both papers are cited in our manuscript). Based on these data, the intensity of blue light (450 nm) falls off 5-10 times more strongly with penetration depth than that of NIR light (800 nm).

      We have added a brief treatment of these aspects in the Discussion section.
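
      To give a feel for what a 5- to 10-fold stronger fall-off could mean in practice, the sketch below assumes simple exponential (Beer-Lambert-type) attenuation and reads "5-10 times more strongly" as a 5- to 10-fold larger effective attenuation coefficient for blue light. Both the reading and the coefficient `MU_NIR` are illustrative assumptions, not values taken from the cited papers; real tissue is heterogeneous and scattering-dominated.

```python
import math

def transmitted_fraction(mu_per_mm: float, depth_mm: float) -> float:
    """Fraction of incident intensity left at a given depth (exponential decay)."""
    return math.exp(-mu_per_mm * depth_mm)

# Hypothetical effective attenuation coefficient for ~800 nm light, in 1/mm.
MU_NIR = 0.5

def blue_to_nir_ratio(depth_mm: float, fold: float, mu_nir: float = MU_NIR) -> float:
    """Blue intensity relative to NIR intensity at the same depth,
    assuming blue attenuates `fold` times more strongly than NIR."""
    mu_blue = fold * mu_nir
    return transmitted_fraction(mu_blue, depth_mm) / transmitted_fraction(mu_nir, depth_mm)

if __name__ == "__main__":
    for fold in (5, 10):
        for depth in (0.5, 1.0, 2.0):
            ratio = blue_to_nir_ratio(depth, fold)
            print(f"fold={fold:>2}, depth={depth} mm: blue/NIR = {ratio:.2e}")
```

      Under these assumptions the blue-to-NIR intensity ratio shrinks exponentially with depth, so a difference of less than one order of magnitude in attenuation can translate into several orders of magnitude in delivered light.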

      (2) Though they can be multiplexed with Red light sensors, these bathy phytochrome NIR sensors are more difficult to multiplex with other commonly used light sensors (e.g., blue) due to the broad light responsivity of the Pfr state. This challenge may be overcome by careful dosing of blue light, as the authors discuss, but other bacterial NIR sensing systems with less cross-talk may be preferred in some applications.

      The reviewer is correct in noting that, at least to a certain extent, the pNIRusk systems also respond to blue light owing to their Soret absorbance bands (see Fig. 1). That said, we note two points:

      First, a given photoreceptor that preferentially responds to certain wavelengths, e.g., 700 nm in the case of conventional bacterial phytochromes (BphP), generally absorbs shorter wavelengths to some degree as well. Absorption of these shorter wavelengths suffices for driving electronic and/or vibronic transitions of the chromophore to higher energy levels which often give rise to productive photochemistry and downstream signal transduction. Put another way, a certain response of sensory photoreceptors to shorter wavelengths is hence fully expected and indeed experimentally borne out, as for instance shown by Ochoa-Fernandez et al. in the so-called PULSE setup (Nat Meth 2020, doi: 10.1038/s41592-020-0868-y).

      Second, known BphPs share similar Pr and Pfr absorbance spectra. We therefore expect other BphP-based optogenetic setups to also respond to blue light to some degree. Currently, there are insufficient data to gauge whether individual BphPs systematically differ in their relative sensitivity to blue compared to red or NIR light. Arguably, pertinent experiments may be an interesting subject for future study.

      Reviewer #2 (Public review):

      Summary:

      In this manuscript, Meier et al. engineer a new class of light-regulated two-component systems. These systems are built using bathy-bacteriophytochromes that respond to near-infrared (NIR) light. Through a combination of genetic engineering and systematic linker optimization, the authors generate bacterial strains capable of selective and tunable gene expression in response to NIR stimulation. Overall, these results are an interesting expansion of the optogenetic toolkit into the NIR range. The cross-species functionality of the system, modularity, and orthogonality have the potential to make these tools useful for a range of applications.

      Strengths:

      (1) The authors introduce a novel class of near-infrared light-responsive two-component systems in bacteria, expanding the optogenetic toolbox into this spectral range.

      (2) Through engineering and linker optimization, the authors achieve specific and tunable gene expression, with minimal cross-activation from red light in some cases.

      (3) The authors show that the engineered systems function robustly in multiple bacterial strains, including laboratory E. coli, the probiotic E. coli Nissle 1917, and Agrobacterium tumefaciens.

      (4) The combination of orthogonal two-component systems can allow for simultaneous and independent control of multiple gene expression pathways using different wavelengths of light.

      (5) The authors explore the photophysical properties of the photosensors, investigating how environmental factors such as pH influence light sensitivity.

      Weaknesses:

      (1) The expression of multi-gene operons and fluorescent reporters could impose a metabolic burden. The authors should present data comparing optical density for growth curves of engineered strains versus the corresponding empty-vector control to provide insight into the burden and overall impact of the system on host viability and growth.

      In response to this comment, we have recorded growth kinetics of bacteria harboring the pNIRusk-DsRed plasmids or empty vectors under both inducing (i.e., under NIR light) and noninducing conditions (i.e., darkness). We did not observe systematic differences in the growth kinetics between the different cultures, thus suggesting that under the conditions tested there is no adverse effect on cell viability.

      We include the new data in Suppl. Fig. 5c-d and refer to them in the main text.

      (2) The manuscript consistently presents normalized fluorescence values, but the method of normalization is not clear (Figure 2 caption describes normalizing to the maximal fluorescence, but the maximum fluorescence of what?). The authors should provide a more detailed explanation of how the raw fluorescence data were processed. In addition, or potentially in exchange for the current presentation, the authors should include the raw fluorescence values in supplementary materials to help readers assess the actual magnitude of the reported responses.

      We appreciate this valid comment and have altered the representation of the fluorescence data. All values for a given fluorescent protein (i.e., either DsRed or YPet) across all systems are now normalized to a single reference value, thus enabling direct comparison between experiments.

      (3) Related to the prior point, it would be useful to have a positive control for fluorescence that could be used to compare results across different figure panels.

      As all data are now normalized to the same reference value, direct comparison across all figures is enabled.

      (4) Real-time gene expression data are not presented in the current manuscript, but it would be helpful to include a time-course for some of the key designs to help readers assess the speed of response to NIR light.

      In response to this comment, we include in the revised manuscript induction kinetics of bacterial cultures bearing pNIRusk upon transfer to inducing NIR-light conditions. To this end, aliquots were taken at discrete timepoints, transcriptionally and translationally arrested, and analyzed for optical density and DsRed reporter fluorescence after allowing for chromophore maturation.

      We include the new data in Suppl. Fig. 5e and refer to them in the manuscript.

      Moreover, we note that the experiments in Agrobacterium tumefaciens used a luciferase reporter thus enabling the continuous monitoring of the light-induced expression kinetics. These data (unchanged in revision) are to be found in Suppl. Fig. 9.

      Reviewer #3 (Public review):

      Summary:

      This paper by Meier et al introduces a new optogenetic module for the regulation of bacterial gene expression based on "bathy-BphP" proteins. Their paper begins with a careful characterization of kinetics and pH dependence of a few family members, followed by extensive engineering to produce infrared-regulated transcriptional systems based on the authors' previous design of the pDusk and pDERusk systems, and closing with characterization of the systems in bacterial species relevant for biotechnology.

      Strengths:

      The paper is important from the perspective of fundamental protein characterization, since bathy-BphPs are relatively poorly characterized compared to their phytochrome and cyanobacteriochrome cousins. It is also important from a technology development perspective: the optogenetic toolbox currently lacks infrared-stimulated transcriptional systems. Infrared light offers two major advantages: it can be multiplexed with additional tools, and it can penetrate into deep tissues with ease relative to the more widely used blue light-activated systems. The experiments are performed carefully, and the manuscript is well written.

      Weaknesses:

      My major criticism is that some information is difficult to obtain, and some data is presented with limited interpretation, making it difficult to obtain intuition for why certain responses are observed. For example, the changes in red/infrared responses across different figures and cellular contexts are reported but not rationalized. Extensive experiments with variable linker sequences were performed, but the rationale for linker choices was not clearly explained. These are minor weaknesses in an overall very strong paper.

      We are grateful for the positive take on our manuscript.

      Reviewer #1 (Recommendations for the authors):

      (1) As eLife is a broad audience journal, please define the Soret and Q-bands (line 125).

      We concur and have added labels in fig. 1a that designate the Soret and Q bands.

      (2) The initial (0) Ac design in Figure 2b is activated by NIR and Red light, albeit modestly. The authors state that this construct shows "constant reporter fluorescence, largely independent of illumination" (line 167). This language should be changed to reflect the fact that this Ac construct responds to both of these wavelengths.

      Agreed. We have amended the text accordingly.

      (3) pNIRusk Ac 0 appears to show a greater light response than pNIRusk Av -5. However, the authors claim that the former is not light-responsive and the latter is. This conclusion should be explained or changed.

      The assignment of pNIRusk Av-5 as light-responsive is based on the relative difference in reporter fluorescence between darkness and illumination with either red or NIR light. Although the overall fluorescence is much lower in Av-5 than for Av-0, the relative change upon illumination is much more pronounced. We add a statement to this effect to the text.
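
      The distinction between absolute and relative change drawn in this response can be made concrete with a short calculation. The readings below are invented for illustration only and are not data from the manuscript; `fold_induction` is a hypothetical helper name.

```python
def fold_induction(dark: float, lit: float) -> float:
    """Relative change in reporter fluorescence: illuminated over dark."""
    if dark <= 0:
        raise ValueError("dark-state fluorescence must be positive")
    return lit / dark

# Hypothetical normalized reporter fluorescence as (dark, illuminated).
designs = {
    "bright, weakly switching design": (0.60, 0.90),
    "dim, strongly switching design": (0.01, 0.08),
}

if __name__ == "__main__":
    for name, (dark, lit) in designs.items():
        print(f"{name}: absolute change = {lit - dark:.2f}, "
              f"fold induction = {fold_induction(dark, lit):.1f}")
```

      In this made-up example the dim design shows the smaller absolute change but the larger fold induction, which is the sense in which a low-fluorescence variant can still be classified as light-responsive.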

      (4) The authors state that "when combining DmDERusk-Str-YPet with AvTod+21-DsRed expression rose under red and NIR light, respectively, whereas the joint application of both light colors induced both reporter genes" (lines 258-261). In contrast, Figure 3c shows that application of both wavelengths of light results in exclusive activation of YPet expression. It appears the description of the data is wrong and must be corrected. That said, this error does not impact their conclusion that two separate target genes can be independently activated by NIR and red light.

      We thank the reviewer for catching this error which we have corrected in the revised manuscript.

      (5) Line 278: I don't agree with the authors' blanket statement that the use of upconversion nanoparticles is a "grave" limitation for NIR-light mediated activation of bacterial gene expression in vivo. The authors should either expound on the severity of the limitation or use more moderate language.

      We have replaced the word ‘grave’ by ‘potential’ and thereby toned down our wording.

      Reviewer #2 (Recommendations for the authors):

      (1) Please include a discussion on the expected depth penetration of different light wavelengths. This is most relevant in the context of the discussion about how these NIR systems could be used with living therapeutics.

      Given the heterogeneity of biological tissue, it is challenging to state precise penetration depths for different wavelengths of light. That said, blue light for instance is typically attenuated by biological tissue around 5 to 10 times as strongly as near-infrared light is.

      We have expanded the Discussion chapter to cover these aspects.

      (2) It would be helpful for Figure 2C (or supplementary) to also include the response to blue light stimulation.

      We agree and have acquired pertinent data for the blue-light response. The new data are included in an updated Fig. 2c. Data acquired at varying NIR-light intensities, originally included in Fig. 2c, have been moved to Suppl. Fig. 5a-b.

      (3) In Figure 4A, data on the response of E. coli Nissle to blue and red light are missing. Including this would help identify whether the reduced sensitivity to non-NIR wavelengths observed in the E. coli lab strain is preserved in the probiotic background.

      In response to this comment, we have acquired pertinent data on E. coli Nissle. While the results were overall similar to those in the laboratory strain, the response to blue and NIR light was yet lower in the Nissle bacteria which stands to benefit optogenetic applications.

      We have updated Fig. 4a accordingly. For clarity, we only show the data for AvNIRusk in the main paper but have relegated the data on AcNIRusk to Suppl. Fig. 8. (Note that this has necessitated a renumbering of the subsequent Suppl. Figs.)

      (4) On many of the figures, there are thin gray lines that appear between the panels that it would be nice to eliminate because, in some cases, they cut through words and numbers.

      The grey lines likely arose from embedding the figures into the text document. In the typeset manuscript, which has become available on the eLife webpage in the meantime, there are no such lines. That said, we will carefully check throughout the submission/publishing/proofing process lest these lines reappear.

      (5) Page 7, line 155: "As not least seen" typo or awkward phrasing.

      We have restructured the sentence and thereby hopefully clarified the unclear phrasing.

      (6) Page 7, line 167: It does not appear to be the case that the initial pNIRusk designs show constant fluorescence that is largely independent of illumination. AcNIRusk shows an almost twofold change from dark to NIR. Reword this to avoid confusion.

      We concur with this comment, similar to reviewer #1’s remark, and have adjusted the text accordingly.

      (7) Page 8, line 174: Related to the previous point, AvNIRusk has one design that is very minimally light switchable (-5), so stating that six light switchable designs have been identified is also confusing.

      As stated in our response to reviewer #1 above, the assignment of AvNIRusk-5 as light-switchable is based on the relative fluorescence change upon illumination. We have added an explanation to the text.

      (8) Page 10, line 228-229: I was not able to find the data showing that expression levels were higher for the DmTtr systems than the pREDusk and pNIRusk setups. This may be an issue related to the normalization point. It was not clear to me how to compare these values.

      We apologize for the initially unclear representation of the data. In response to this reviewer’s general comments above, we have now normalized all fluorescence values to a single reference value, thus allowing their direct comparison.

      (9) Page 12, line 264: "finer-grained expression control can be exerted..." Either show data or adjust the language so that it is clear this is a prediction.

      True, we have replaced ‘can’ by ‘could’.

      (10) Page 25, line 590: CmpX13 cells have a reference that is given later, but it should be added where it first appears.

      Agreed, we have added the reference in the indicated place.

      (11) Page 25, line 592: define LB/Kan.

      We had already defined this abbreviation further up but, for clarity, we have added it again in the indicated position.

      (12) Page 40, line 946: "normalized by" rather than "to".

      We have implemented the requested change in the indicated and several other positions of the manuscript.

      (13) Figures 2C, 3C, and similar plots in the supplementary material would benefit from having a legend for the colors.

      We agree and have added pertinent legends to the corresponding main and supplementary figures.

      (14) As a reader, I had some trouble following all the acronyms. This is at the authors' discretion, but I would eliminate ones that are not strictly essential (e.g. MTP for microtiter plate; I was unable to identify what "MCS" meant; look for other opportunities to remove acronyms).

      In the revised manuscript, we have defined the abbreviation ‘MCS’ (for ‘multiple-cloning site’) upon first occurrence. We have decided to retain the abbreviation ‘MTP’ in the text.

      (15) Could the authors briefly speculate on why A. tumefaciens activation with red light might occur?

      While we can but speculate as to the underlying reasons for the divergent red-light response in A. tumefaciens, we discuss possible scenarios below.

      Commonly, two-component systems (TCS) exhibit highly cooperative and steep responses to signal. As a consequence, even small differences in the intracellular amounts of phosphorylated and unphosphorylated response regulator (RR) can give rise to significantly changed gene-expression output. Put another way, the gene-expression output need not scale linearly with the extent of RR phosphorylation but, rather, is expected to show nonlinear dependence with pronounced thresholding effects.

      Differences in the pertinent RR levels can for instance arise from variations in the expression levels of the pNIRusk system components between E. coli and A. tumefaciens. Moreover, the two bacteria greatly differ in their two-component-system (TCS) repertoire. Although TCSs are commonly well insulated from each other, cross-talk with endogenous TCSs, even if limited, may cause changes in the levels of phosphorylated RR and hence gene-expression output. In a similar vein, the RR can also be phosphorylated and dephosphorylated non-enzymatically, e.g., by reaction with high-energy anhydrides (such as acetyl phosphate) and hydrolysis, respectively. Other potential origins for the divergent red-light response include differences in the strength of the promoters driving expression of the pNIRusk system components and the fluorescent/luminescent reporters, respectively.
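
      The thresholding argument can be illustrated with a minimal sketch. It assumes, purely for illustration, that gene-expression output follows a Hill-type function of the phosphorylated RR fraction; the function name `hill_output` and the values of the half-maximal point `k` and the Hill coefficient `n` are hypothetical and were not measured for the pNIRusk system.

```python
def hill_output(rr_p: float, k: float = 0.5, n: float = 4.0) -> float:
    """Relative gene-expression output for a phosphorylated-RR fraction rr_p.

    Assumed Hill-type dose response; k and n are illustrative values only.
    """
    return rr_p**n / (k**n + rr_p**n)


# A modest shift in the phosphorylated fraction around the threshold k ...
low, high = 0.4, 0.6
ratio_input = high / low
# ... produces a disproportionately larger shift in the output.
ratio_output = hill_output(high) / hill_output(low)

print(f"input changes {ratio_input:.2f}-fold, output changes {ratio_output:.2f}-fold")
```

      Under these assumed parameters, any host-dependent factor that nudges the phosphorylated RR fraction across the threshold would change the output more than proportionally, which is the qualitative point made above.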

      (16) It would be helpful for the authors to briefly explain why they needed to switch to luminescence from fluorescence for the A. tumefaciens studies.

      While there was no strict necessity to switch from the fluorescence-based system used in E. coli to a luminescence-based system in A. tumefaciens, we opted for luminescence based on prior experience with other Alphaproteobacteria (e.g., 10.1128/mSystems.00893-21), where luminescence offered significant advantages. Specifically, it provides essentially background-free signal detection and greater sensitivity for monitoring gene expression. In addition, as demonstrated in Suppl. Fig. 9c and d, the luminescence system enables real-time tracking of gene expression dynamics, which further supported its use in our experimental setup (see our response to reviewer #2’s general comments).

      (17) This is a very minor comment that the authors can take or leave, but I got hung up on the word "implement" when it appeared a few times in the manuscript because I tended to read it as "put a plan into place" rather than its other meaning.

      In the abstract, we have replaced one instance of the word ‘implement’ by ‘instrument’.

      (18) The authors should include the relevant constructs on AddGene or another public strain-sharing service.

      We whole-heartedly subscribe to the idea of freely sharing research materials with fellow scientists. Therefore, we had already deposited the most relevant AvNIRusk in Addgene, even prior to the initial submission of the manuscript (accession number 235084). In the meantime, we have released the deposition, and the plasmid has been available from Addgene since May 15th of this year.

      Reviewer #3 (Recommendations for the authors):

      Suggestion for improvement:

      This paper relies heavily on variations in linker sequences to shift responses. I am familiar with prior work from the Moglich lab in which helical linkers were employed to shift responses in synthetic two-component systems, with interesting periodicity in responses with every 7 residues (as expected for an alpha helix) and inversion of responses at smaller linker shifts. There is no mention in this paper whether their current engineering follows a similar rationale, what types of linkers are employed (e.g. flexible vs helical), and whether there is an interpretation for how linker lengths alter responses. Can you explain what classes of linker sequences are used throughout Figures 2 and 3, and whether length or periodicity affects the outcome? This would be very helpful for readers who are new to this approach, or if the rationale here differs from the authors' prior work.

      The PATCHY approach employed at present followed a closely similar rationale as in our previous studies. That is, linkers were extended/shortened and varied in their sequence by recombining different fragments of the natural linkers of the parental receptors, i.e., the bacteriophytochrome and the FixL sensor histidine kinase, respectively. We have added a statement to this effect in the text and a reference to Suppl. Fig. 3 which illustrates the principal approach.

      Compared to our earlier studies, we isolated fewer receptor variants supporting light-regulated responses, despite covering a larger sequence space. Owing to the sparsity of the light-regulated variants, an interpretation of the linker properties and their correlation with light-regulated activity is challenging. Although doubtless unsatisfying from a mechanistic viewpoint, we therefore refrain from a pertinent discussion which would be premature and speculative at this point. As the reviewer raises a valid and important point, we have expanded the text by referring to our earlier studies and the observed dependence of functional properties on linker composition.

      It is sometimes difficult to intuit or rationalize the differences in red/IR sensitivity across closely related variants. An important example appears in Figure 3C vs 3B. I think the AvTod+21 in 3B should be the equivalent to the DsRed response in the second column of 3C (AvTod+21 + DmDERusk), except, of course, that the bacteria in 3C carry an additional plasmid for the DERusk system. However, in 3B, the response to red light is substantial - ~50% as strong as that for IR, whereas in 3C, red light elicits no response at all. What is the difference? The reason this is important is that the AvTod+21 and DmDERusk represent the best "orthogonal" red and infrared light responses, but this is not at all obvious from 3B, where AvTod+21 still causes a substantial (and for orthogonality, undesirable) response under red light. Perhaps subtle differences in expression level due to plasmid changes cause these differences in light responses? Could the authors test how the expression level affects these responses? The paper would be greatly improved if observations of the diverse red/IR responses could be rationalized by some design criteria.

      As noted above in our response to reviewer #2, we have now normalized all fluorescence readings to joint reference values, thus allowing a better comparison across experiments.

      The reviewer is correct in noting that upon multiplexing, the individual plasmid systems support lower fluorescence levels than when used in isolation. We speculate that the combination of two plasmids may affect their copy numbers (despite the use of different resistance markers and origins of replication) and hence their performance. Likewise, the cellular metabolism may be affected when multiple plasmids are combined. These aspects may well account for the absent red-light response in AvTod+21 in the multiplexing experiments which is – indeed – unexpected. As, at present, we cannot provide a clear rationalization for this effect, we recommend verifying the performance of the plasmid setups when multiplexing.

      The paper uses "red" and "infrared" to refer to ~624 nm and ~800 nm light, respectively. I wonder whether it might be possible to shift these peak wavelengths to obtain even better separation for the multiplexing experiments. Perhaps shifting the specific red wavelength could result in better separation between DERusk and AvTod systems, for example? Could the authors comment on this (maybe based on action spectra of their previously developed tools) or perhaps test a few additional stimulation wavelengths?

      The choice of illumination wavelengths used in these experiments is dictated by the LED setups available for illumination of microtiter plates. On the one hand, we are using an SMD (surface-mount device) three-color LED with a fixed wavelength of the red channel around 624 nm (see Hennemann et al., 2018). On the other hand, we are deploying a custom-built device with LEDs emitting at around 800 nm (see Stüven et al., 2019 and this work). Adjusting these wavelengths is therefore challenging, although without doubt potentially interesting.

      To address this reviewer comment, we have added a statement to the text that the excitation wavelengths may be varied to improve multiplexed applications.

      Additional minor comments:

      (1) Figure 2C: It would be very helpful to place a legend on the figure panel for what the colors indicate, since they are unique to this panel and non-intuitive.

      This comment coincides with one by reviewer #2, and we have added pertinent legends to this and related supplementary figures.

      (2) Figure 3C: it is not obvious which system uses DsRed and which uses YPet in each combination, since the text indicates that all combinations were cloned, and this is not clearly described in the legend. Is it always the first construct in the figure legend listed for DsRed and the second for YPet?

      For clarification, we have revised the x-axis labels in Fig. 3C. (And yes, it is as this reviewer surmises: the first of the two constructs harbored DsRed and the second one YPet.)

    1. Reviewer #1 (Public review):

      This is an interesting study on the nature of representations across the visual field. The question of how peripheral vision differs from foveal vision is a fascinating and important one. The majority of our visual field is extra-foveal yet our sensory and perceptual capabilities decline in pronounced and well-documented ways away from the fovea. Part of the decline is thought to be due to spatial averaging ('pooling') of features. Here, the authors contrast two models of such feature pooling with human judgments of image content. They use much larger visual stimuli than in most previous studies, and some sophisticated image synthesis methods to tease apart the prediction of the distinct models.

      More importantly, in so doing, the researchers thoroughly explore the general approach of probing visual representations through metamers-stimuli that are physically distinct but perceptually indistinguishable. The work is embedded within a rigorous and general mathematical framework for expressing equivalence classes of images and how visual representations influence these. They describe how image-computable models can be used to make predictions about metamers, which can then be compared to make inferences about the underlying sensory representations. The main merit of the work lies in providing a formal framework for reasoning about metamers and their implications, for comparing models of sensory processing in terms of the metamers that they predict, and for mapping such models onto physiology. Importantly, they also consider the limits of what can be inferred about sensory processing from metamers derived from different models.

      Overall, the work is of a very high standard and represents a significant advance over our current understanding of perceptual representations of image structure at different locations across the visual field. The authors do a good job of capturing the limits of their approach and I particularly appreciated the detailed and thoughtful Discussion section and the suggestion to extend the metamer-based approach described in the MS with observer models. The work will have an impact on researchers studying many different aspects of visual function including texture perception, crowding, natural image statistics and the physiology of low- and mid-level vision.

      The main weaknesses of the original submission relate to the writing. A clearer motivation could have been provided for the specific models that they consider, and the text could have been written in a more didactic and easy to follow manner. The authors could also have been more explicit about the assumptions that they make.

      Comments following re-submission:

      Overall, I think the authors have done a satisfactory job of addressing most of the points I raised.

      There's one final issue which I think still needs better discussion.

      I think reviewer 2 articulated better than I have the point I was concerned about: the relationship between JNDs and metamers as depicted in the schematics and indeed in the whole conceptualization.

      I think the issue here is that there seems to be a conflating of two concepts- 'subthreshold' and 'metamer'-and I'm not convinced it is entirely unproblematic. It's true that two stimuli that cannot be discriminated from one another due to the physical differences being too small to detect reliably by the visual system are a form of metamer in the strict definition 'physically different, but perceptually the same'. However, I don't think this is the scientifically substantial notion of metamer that enabled insights into trichromacy. That form of metamerism is due to the principle of univariance in feature encoding, and involves conditions in which physically very different stimuli are mapped to one and the same point in sensory encoding space whether or not there is any noise in the system. When I say 'physically very different' I mean different by a large enough amount that they would be far above threshold, potentially orders of magnitude larger than a JND if the system's noise properties were identical but the system used a different sensory basis set to measure them. This seems to be a very different kind of 'physically different, but perceptually the same'.

      I do think the notion of metamerism can obviously be very usefully extended beyond photoreceptors and photon absorptions. In the interesting case of texture metamers, what I think is meant is that stimuli would be discriminable if scrutinised in the fovea, but because they have the same statistics they are treated as equivalent. I think the discussion of this could still be more clearly articulated in the manuscript. It would benefit from a more thorough discussion of the difference between metamerism and subthreshold, especially in the context of the Voronoi diagrams at the beginning.

      It needs to be made clear to the reader why it is that two stimuli that are physically similar (e.g., just spanning one of the edges in the diagram) can be discriminable, while at the same time, two stimuli that are very different (e.g., at opposite ends of a cell) can't.
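
      A toy illustration of this distinction, offered only as a sketch: it assumes a hypothetical encoder that sums adjacent stimulus values and an arbitrary response-space threshold `jnd`. Neither is taken from the manuscript, and the names `encode`, `distance` and `discriminable` are made up for this example.

```python
import math


def encode(stimulus):
    """Hypothetical pooling encoder: sums adjacent pairs of stimulus values."""
    return [stimulus[i] + stimulus[i + 1] for i in range(0, len(stimulus), 2)]


def distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def discriminable(s1, s2, jnd=0.5):
    """Noise-limited readout: stimuli differ only if responses differ by more than jnd."""
    return distance(encode(s1), encode(s2)) > jnd


a = [5.0, 5.0, 5.0, 5.0]
b = [10.0, 0.0, 5.0, 5.0]  # far from a, but differs only along the encoder's null space
c = [5.5, 5.5, 5.0, 5.0]   # close to a, but shifts the pooled response

for name, other in (("b", b), ("c", c)):
    print(name,
          "physical distance:", round(distance(a, other), 3),
          "discriminable from a:", discriminable(a, other))
```

      In this sketch, `a` and `b` are indistinguishable however small the noise is, because the encoder maps them to the same point, whereas `a` and `c` are physically closer yet discriminable once their response difference exceeds the assumed threshold.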

      Do the cells include BOTH those sets of stimuli that cannot be discriminated just because of internal noise AND those that can't be discriminated because they are projected to literally the same point in the sensory encoding space? What are the strengths and limits of models that involve the strict binarization of sensory representations, and how can they be integrated with models dealing with continuous differences? These seem like important background concepts that ought to be included in either the introduction or discussion sections. In this context it might also be helpful to refer to the notion of 'visual equivalence' as described by:

      Ramanarayanan, G., Ferwerda, J., Walter, B., & Bala, K. (2007). Visual equivalence: towards a new standard for image fidelity. ACM Transactions on Graphics (TOG), 26(3), 76-es.

      Other than that, I congratulate the authors on a very interesting study, and look forward to reading the final version.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      This is an interesting study of the nature of representations across the visual field. The question of how peripheral vision differs from foveal vision is a fascinating and important one. The majority of our visual field is extra-foveal yet our sensory and perceptual capabilities decline in pronounced and well-documented ways away from the fovea. Part of the decline is thought to be due to spatial averaging (’pooling’) of features. Here, the authors contrast two models of such feature pooling with human judgments of image content. They use much larger visual stimuli than in most previous studies, and some sophisticated image synthesis methods to tease apart the prediction of the distinct models.

      More importantly, in so doing, the researchers thoroughly explore the general approach of probing visual representations through metamers-stimuli that are physically distinct but perceptually indistinguishable. The work is embedded within a rigorous and general mathematical framework for expressing equivalence classes of images and how visual representations influence these. They describe how image-computable models can be used to make predictions about metamers, which can then be compared to make inferences about the underlying sensory representations. The main merit of the work lies in providing a formal framework for reasoning about metamers and their implications, for comparing models of sensory processing in terms of the metamers that they predict, and for mapping such models onto physiology. Importantly, they also consider the limits of what can be inferred about sensory processing from metamers derived from different models.

      Overall, the work is of a very high standard and represents a significant advance over our current understanding of perceptual representations of image structure at different locations across the visual field. The authors do a good job of capturing the limits of their approach and I particularly appreciated the detailed and thoughtful Discussion section and the suggestion to extend the metamer-based approach described in the MS with observer models. The work will have an impact on researchers studying many different aspects of visual function including texture perception, crowding, natural image statistics, and the physiology of low- and mid-level vision.

      The main weaknesses of the original submission relate to the writing. A clearer motivation could have been provided for the specific models that they consider, and the text could have been written in a more didactic and easy-to-follow manner. The authors could also have been more explicit about the assumptions that they make.

      Thank you for the summary. We appreciate the positives noted above. We address the weaknesses point by point below.

      Reviewer #2 (Public Review):

      Summary

      This paper expands on the literature on spatial metamers, evaluating different aspects of spatial metamers including the effect of different models and initialization conditions, as well as the relationship between metamers of the human visual system and metamers for a model. The authors conduct psychophysics experiments testing variations of metamer synthesis parameters including type of target image, scaling factor, and initialization parameters, and also compare two different metamer models (luminance vs energy). An additional contribution is doing this for a field of view larger than has been explored previously.

      General Comments

      Overall, this paper addresses some important outstanding questions regarding comparing original to synthesized images in metamer experiments and begins to explore the effect of noise vs image seed on the resulting syntheses. While the paper tests some model classes that could be better motivated, and the results are not particularly groundbreaking, the contributions are convincing and undoubtedly important to the field. The paper includes an interesting Voronoi-like schematic of how to think about perceptual metamers, which I found helpful, but for which I do have some questions and suggestions. I also have some major concerns regarding incomplete psychophysical methodology including lack of eye-tracking, results inferred from a single subject, and a huge number of trials. I have only minor typographical criticisms and suggestions to improve clarity. The authors also use very good data reproducibility practices.

      Thank you for the summary. We appreciate the positives noted above. We address the weaknesses point by point below.

      Specific Comments

      Experimental Setup

      Firstly, the experiments do not appear to utilize an eye tracker to monitor fixation. Without eye tracking or another manipulation to ensure fixation, we cannot ensure the subjects were fixating the center of the image, and viewing the metamer as intended. While the short stimulus time (200ms) can help minimize eye movements, this does not guarantee that subjects began the trial with correct fixation, especially in such a long experiment. While Covid-19 did at one point limit in-person eye-tracked experiments, the paper reports no such restrictions that would have made the addition of eye-tracking impossible. While such a large-scale experiment may be difficult to repeat with the addition of eye tracking, the paper would be greatly improved with, at a minimum, an explanation as to why eye tracking was not included.

      Addressed on pg. 25, starting on line 658.

      Secondly, many of the comparisons later in the paper (Figures 9, 10) are made from a single subject. N=1 is not typically accepted as sufficient to draw conclusions in such a psychophysics experiment. Again, if there were restrictions limiting this it should be discussed. Also (P11), is subject sub-00 an author? Another expert? A naive subject? The subject’s expertise in viewing metamers will likely affect their performance.

      Addressed on pg. 14, starting on line 308.

      Finally, the number of trials per subject is quite large. 13,000 over 9 sessions is much larger than most human experiments in this area. The reason for this should be justified.

      In general, we needed a large number of trials to fit full psychometric functions for stimuli derived for both models, with both types of comparison, both initializations, and over many target images. We could have eliminated some of these, but feel that having a consistent dataset across all these conditions is a strength of the paper.

      In addition to the sentence on pg. 14, line 318, a full enumeration of trials is now described on pg. 23, starting on line 580.

      Model

      For the main experiment, the authors compare the results of two models: a ’luminance model’ that spatially pools mean luminance values, and an ’energy model’ that spatially pools energy calculated from a multi-scale pyramid decomposition. They show that these models create metamers that result in different thresholds for human performance, and therefore different critical scaling parameters, with the basic luminance pooling model producing a scaling factor 1/4 that of the energy model. While this is certain to be true, due to the luminance model being so much simpler, the motivation for the simple luminance-based model as a comparison is unclear.

      The use of simple models is now addressed on pg. 3, starting on line 98, as well as the sentence starting on pg. 4 line 148: the luminance model is intended as the simplest possible pooling model.
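
      As a rough sketch of what a scaling parameter controls in pooling models of this kind, the following assumes a one-dimensional layout in which each window's width equals `scaling` times the eccentricity of its inner edge. The function `window_edges`, the eccentricity range, and the two example scaling values are hypothetical and are not the values or the implementation used in the paper.

```python
def window_edges(scaling, ecc_min=1.0, ecc_max=20.0):
    """Edges of 1-D pooling windows whose width is scaling * inner-edge eccentricity."""
    edges = [ecc_min]
    while edges[-1] < ecc_max:
        # Each window spans from e to e * (1 + scaling), capped at ecc_max.
        edges.append(min(edges[-1] * (1.0 + scaling), ecc_max))
    return edges


# Illustrative values only: one scaling four times smaller than the other.
for scaling in (0.2, 0.05):
    n_windows = len(window_edges(scaling)) - 1
    print(f"scaling={scaling}: {n_windows} windows between 1 and 20 deg")
```

      Under this assumption, a smaller scaling value tiles the same range of eccentricities with more and narrower windows, so the model places more constraints on the synthesized stimulus.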

      The authors claim that this luminance model captures the response of retinal ganglion cells, often modeled as a center-surround operation (Rodieck, 1964). I am unclear in what aspect(s) the authors claim these center-surround neurons mimic a simple mean luminance, especially in the context of evidence supporting a much more complex role of RGCs in vision (Atick & Redlich, 1992). Why do the authors not compare the energy model to a model that captures center-surround responses instead? Do the authors mean to claim that the luminance model captures only the pooling aspects of an RGC model? This is particularly confusing as Figures 6 and 9 show the luminance and energy models for original vs synth aligning with the scaling of Midget and Parasol RGCs, respectively. These claims should be more clearly stated, and citations included to motivate this. Similarly, with the energy model, the physiological evidence is very loosely connected to the model discussed.

      We have removed the bars showing potential scaling values measured by electrophysiology in the primate visual system and attempted to clarify our language around the relationship between these models and physiology. Our metamer models are only loosely connected to the physiology, and we’ve decided in revision not to imply any direct connection between the model parameters and physiological measurements. The models should instead be understood as loosely inspired by physiology, but not as a tool to localize the representation (as was done in the Freeman paper).

      The physiological scaling values are still used as the mean of the priors on the critical scaling value for model fitting, as described on pg. 27, starting on line 698.

      Prior Work:

      While the explorations in this paper clearly have value, it does not present any particularly groundbreaking results, and those reported are consistent with previous literature. The explorations around critical eccentricity measurement have been done for texture models (Figure 11) in multiple papers (Freeman 2011, Wallis 2019, Balas 2009). In particular, Freeman 2011 demonstrated that simpler models, representing measurements presumed to occur earlier in visual processing need smaller pooling regions to achieve metamerism. This work’s measurements for the simpler models tested here are consistent with those results, though the model details are different. In addition, Brown 2023 (which is miscited) also used an extended field of view (though not as large as in this work). Both Brown 2023 and Wallis 2019 performed an exploration of the effect of the target image. Also, much of the more recent previous work uses color images, while the authors’ exploration is only done for greyscale.

      We were pleased to find consistency of our results with previous studies, given the (many) differences in stimuli and experimental conditions (especially viewing angle), while also extending to new results with the luminance model, and the effects of initialization. Note that only one of the previous studies (Freeman and Simoncelli, 2011) used a pooled spectral energy model. Moreover, of the previous studies, only one (Brown et al., 2023) used color images (we have corrected that citation - thanks for catching the error).

      Discussion of Prior Work:

      The prior work on testing metamerism between original vs. synthesized and synthesized vs. synthesized images is presented in a misleading way. Wallis et al.’s prior work on this should not be a minor remark in the post-experiment discussion. Rather, it was surely a motivation for the experiment. The text should make this clear; a discussion of Wallis et al. should appear at the start of that section. The authors similarly cite much of the most relevant literature in this area as a minor remark at the end of the introduction (P3L72).

      The large differences we observed between comparison types (original vs synthesized, compared to synthesized vs synthesized) surprised us. Understanding such differences was not a primary motivation for the work, but it is certainly an important component of our results. In the introduction, we thought it best to lay out the basic logic of the metamer paradigm for foveated vision before mentioning the complications that are introduced in both the Wallis and Brown papers (paragraph beginning p. 3, line 109). Our results confirm and bolster the results of both of those earlier works, which are now discussed more fully in the Introduction (lines 109 and following).

      White Noise: The authors make an analogy to the inability of humans to distinguish samples of white noise. It is unclear, however, that human difficulty distinguishing samples of white noise is a perceptual issue; it could instead be due to cognitive/memory limitations. If one concentrates on an individual patch one can usually tell apart two samples. Support for these difficulties emerging from perceptual limitations should be provided, the possibility that these limitations are more cognitive should be discussed, or a different analogy should be employed.

      We now note the possibility of cognitive limits on pg. 8, starting on line 243, as well as pg. 22, line 571. The ability of observers to distinguish samples of white noise is highly dependent on display conditions. A small patch of noise (i.e., large pixels, not too many) can be distinguished, but a larger patch cannot, especially when presented in the periphery. This is more generally true for textures (as shown in Ziemba and Simoncelli (2021)). Samples of white noise at the resolution used in our study are indistinguishable.

      Relatedly, in Figure 14, the authors do not explain why the white noise seeds would be more likely to produce syntheses that end up in different human equivalence classes.

      In figure 14, we claim that white noise seeds are more likely to end up in the same human equivalence classes than natural image seeds. The explanation as to why we think this may be the case is now addressed on pg. 19, starting on line 423.

      It would be nice to see the effect of pink noise seeds, which mirror the power spectrum of natural images, but do not contain the same structure as natural images - this may address the artifacts noted in Figure 9b.

      The lack of pink noise seeds is now addressed on pg. 19, starting on line 429.

      Finally, the authors note high-frequency artifacts in Figure 4 & P5L135, that remain after syntheses from the luminance model. They hypothesize that this is due to a lack of constraints on frequencies above that defined by the pooling region size. Could these be addressed with a white noise image seed that is pre-blurred with a low pass filter removing the frequencies above the spatial frequency constrained at the given eccentricity?

      The explanation for this is similar to the lack of pink noise seeds in the previous point: the goal of metamer synthesis is model testing, and so for a given model, we want to find model metamers that result in the smallest possible critical scaling value. Taking white noise seed images and blurring them will almost certainly remove the high frequencies visible in luminance metamers in figure 4 and thus result in a larger critical scaling value, as the reviewer points out. However, the logic of the experiments requires finding the smallest critical scaling value, and so these model metamers would be uninformative. In an early stage of the project, we did indeed synthesize model metamers using pink noise seeds, and observed that the high frequency artifacts were less prominent.

      Schematic of metamerism: Figures 1,2,12, and 13 show a visual schematic of the state space of images, and their relationship to both model and human metamers. This is depicted as a Voronoi diagram, with individual images near the center of each shape, and other images that fall at different locations within the same cell producing the same human visual system response. I felt this conceptualization was helpful. However, implicitly it seems to make a distinction between metamerism and JND (just noticeable difference). I felt this would be better made explicit. In the case of JND, neighboring points, despite having different visual system responses, might not be distinguishable to a human observer.

      Thanks for noting this – in general, metamers are subthreshold, and for the purpose of the diagram, we had to discretize the space showing metameric regions (Voronoi regions) around a set of stimuli. We’ve rewritten the captions to explain this better. We address the binary subthreshold nature of the metamer paradigm in the discussion section (pg. 19, line 438).

      In these diagrams and throughout the paper, the phrase ’visual stimulus’ rather than ’image’ would improve clarity, because the location of the stimulus in relation to the fovea matters whereas the image can be interpreted as the pixels displayed on the computer.

      We agree and have tried to make this change, describing this choice on pg. 3 line 73.

      Other

      The authors show good reproducibility practices with links to relevant code, datasets, and figures.

      Reviewer #1 (Recommendations For The Authors):

      In its current form, I found the introduction to be too cursory. I felt that the article would benefit from a clearer motivation for the two models that are considered as the reader is left unclear why these particular models are of special scientific significance. The luminance model is intended to capture some aspects of retinal ganglion cells response characteristics and the spectral energy model is intended to capture some aspects of the primary visual cortex. However, one can easily imagine models that include the pooling of other kinds of features, and it would be helpful to get an idea of why these are not considered. Which aspects of processing in the retina and V1 are being considered and which are being left out, and why? Why not consider representations that capture even higher-order statistical structure than those covered by the spectral energy model (or even semantics)? I think a bit of rewriting with this in mind could improve the introduction.

      Along similar lines, I would have appreciated having the logic of the study explained more explicitly and didactically: which overarching research question is being asked, how it is operationalised in the models and experiments, and what are the predictions of the different models. Figures 2 and 3 are certainly helpful, but I felt further explanations would have made it easier for the reader to follow. Throughout, the writing could be improved by a careful re-reading with a view to making it easier to understand. For example, where results are presented, a sentence or two expanding on the implications would be helpful.

      I think the authors could also be more explicit about the assumptions they make. While these are obviously (tacitly) included in the description of the models themselves, it would be helpful to state them more openly. To give one example, when introducing the notion of critical scaling, on p.6 the authors state as if it is a self-evident fact that "metamers can be achieved with windows whose size is matched to that of the underlying visual neurons". This presumably is true only under particular conditions, or when specific assumptions about readout from populations of neurons are invoked. It would be good to identify and state such assumptions more directly (this is partly covered in the Discussion section ’The linking proposition underlying the metamer paradigm’, but this should be anticipated or moved earlier in the text).

      We agree that our introduction was too cursory and have reworked it. We have also backed off of the direct comparison to physiology and clarified that we chose these two as the simplest possible pooling models. We have also added sentences at the end of each result section attempting to summarize the implication (before discussing them fully in the discussion). Hopefully the logic and assumptions are now clearer.

      There are also some findings that warrant a more extensive discussion. For example, what is the broader implication of the finding that original vs. synthesised and synthesised vs. synthesised comparisons exhibit very different scaling values? Does this tell us something about internal visual representations, or is it simply capturing something about the stimuli?

      We believe this difference is a result of the stimuli that are used in the experiment and thus the synthesis procedure itself, which interacts with the model’s pooled image feature. We have attempted to update the relevant figures and discussions to clarify this, in the sections starting on pg. 17 line 396 and pg. 19 line 417.

      At some points in the paper, a third model (’texture model’) creeps into the discussion, without much explanation. I assume that this refers to models that consider joint (rather than marginal) statistics of wavelet responses, as in the famous Portilla & Simoncelli texture model. However, it would be helpful to the reader if the authors could explain this.

      Addressed on pg. 3, starting on line 94.

      Minor corrections.

      Caption of Figure 3: ’top’ and ’bottom’ should be ’left’ and ’right’

      Line 177: ’smallest tested scaling values tested’. Remove one instance of ’tested’

      Line 212: ’the images-specific psychometric functions’ -> ’image-specific’

      Line 215: ’cloud-like pink noise’. It’s not literally pink noise, so I would drop this.

      Line 236: ’Importantly, these results cannot be predicted from the model, which gives no specific insight as to why some pairs are more discriminable than others’. The authors should specify what we do learn from the model if it fails to provide insight into why some image pairs are more discriminable than others.

      Figure 9: it might be helpful to include small insets with the ’highway’ and ’tiles’ source images to aid the reader in understanding how the images in 9B were generated.

      Table 1 placement should be after it is first referred to on line 258.

      In the Discussion section "Why does critical scaling depend on the comparison being performed", it would be helpful to consider the case where the two model metamers *are* distinguishable from each other even though each is indistinguishable from the target image. I would assume that this is possible (e.g., if the target image is at the midpoint between the two model images in image space and each of the stimuli is just below 1 JND away from the target). Or is this not possible for some reason?

      Regarding line 236: this specific line has been removed, and the discussion about this issue has all been consolidated in the final section of the discussion, starting on pg. 19 line 438.

      Regarding the final comment: this is addressed in the paragraph starting on pg. 16 line 386. To expand upon that: the situation laid out by the reviewer is not possible in our conceptualization, in which metamerism is transitive and image discriminability is binary. In order to investigate situations like the one laid out by the reviewer, one needs models whose representations have metric properties, i.e., which allow you to measure and reason about perceptual distance, which we refer to in the paragraph starting on pg. 20 line 460. We also note that this situation has not been observed in this or any other pooling model metamer study that we are aware of. All other minor changes have been addressed.

      Reviewer #2 (Recommendations For The Authors):

      Original image T should be marked in the Voronoi diagrams.

      Brown et al. is miscited as 2021; it should be ACM Transactions on Applied Perception 2023.

      Figure 3 caption: models are left and right, not top and bottom.

      Thanks, all of the above have been addressed.

      References

      Brown R, Dutell V, Walter B, Rosenholtz R, Shirley P, McGuire M, Luebke D. Efficient Dataflow Modeling of Peripheral Encoding in the Human Visual System. ACM Transactions on Applied Perception. 2023 Jan; 20(1):1–22. doi: 10.1145/3564605.

      Freeman J, Simoncelli EP. Metamers of the ventral stream. Nature Neuroscience. 2011 Aug; 14(9):1195–1201. doi: 10.1038/nn.2889.

      Ziemba CM, Simoncelli EP. Opposing Effects of Selectivity and Invariance in Peripheral Vision. Nature Communications. 2021 Jul; 12(1). doi: 10.1038/s41467-021-24880-5.

    1. Reviewer #2 (Public review):

      Summary:

      Neurons in motor-related areas have increasingly been shown to also carry other, non-motoric signals. This creates a problem of avoiding interference between the motor and non-motor-related signals. This is a significant problem that likely affects many brain areas. The specific example studied here is interference between saccade-related activity and slow-changing arousal signals in the superior colliculus. The authors identify neuronal activity related to saccades and arousal. Identifying saccade-related activity is straightforward, but arousal-related activity is harder to identify. The authors first identify a potential neuronal correlate of arousal using PCA to identify a component in the population activity corresponding to slow drift over the recording session. Next, they link this component to arousal by showing that the component is present across different brain areas (SC and PFC), and that it is correlated with pupil size, an external marker of arousal. Having identified an arousal-related component in SC, the authors show next that SC neurons with strong motor-related activity are less strongly affected by this arousal component (both SC and PFC). Lastly, they show that SC population activity patterns related to saccades and pupil size form orthogonal subspaces in the SC population.

      Strengths:

      A great strength of this research is the clear description of the problem, its relationship with the performed analysis and the interpretation of the results. The paper is very well written and easy to follow. An additional strength is the use of fairly sophisticated analysis using population activity.

      Weaknesses:

      (1) The greatest weakness in the present research is the fact that arousal is a functionally less important non-motoric variable. The authors themselves introduce the problem with a discussion of attention, which is without any doubt the most important cognitive process that needs to be functionally isolated from oculomotor processes. Given this introduction, one cannot help but wonder why the authors did not design an experiment in which spatial attention and oculomotor control are differentiated. Absent such an experiment, the authors should spend more time on explaining the importance of arousal and how it could interfere with oculomotor behavior.

      (2) In this context, it is particularly puzzling that one actually would expect effects of arousal on oculomotor behavior. Specifically, saccade reaction time, accuracy, and speed could be influenced by arousal. The authors should include an analysis of such effects. They should also discuss the absence or presence of such effects and how they affect their other results.

      (3) The authors use the analysis shown in Figure 6D to argue that across recording sessions the activity components capturing variance in pupil size and saccade tuning are uncorrelated. However, the distribution (green) seems to be non-uniform with a peak at very low and very high correlation specifically. The authors should test if such an interpretation is correct. If yes, where are the low and high correlations respectively? Are there potentially two functional areas in SC?

      Comments on revised manuscript:

      I remain somewhat concerned that the authors jump immediately into an analysis of the 'arousal-related' effects on SC activity. Before that, I would like to see a more detailed discussion justifying the use of pupil size alone (i.e., w/o other indicators such as RT) as indicative of fluctuations in general arousal that are causal to concomitant changes in SC activity. Instead, in its current form, the authors find changes in SC activity and describe them immediately as 'arousal-related'.

      Other than this conceptual issue, I do not have major problems with the analysis per se.

    2. Reviewer #3 (Public review):

      Summary:

      This study looked at slow changes in neuronal activity (on the order of minutes to hours) in the superior colliculus (SC) and prefrontal cortex (PFC) of two monkeys. They found that SC activity shows slow drift in neuronal activity like in the cortex. They then computed a motor index in SC neurons. By definition, this index is low if the neuron has stronger visual responses than motor responses, and it is high if the neuron has weaker visual responses and stronger motor responses. The authors found that the slow drift in neuronal activity was more prevalent in the low motor index SC neurons and less prevalent in the high motor index neurons. In addition, the authors measured pupil diameter and found it to correlate with slow drifts in neuronal activity, but only in the neurons with lower motor index of the SC. They concluded that arousal signals affecting slow drifts in neuronal modulations are brain-wide. They also concluded that these signals are not present in the deepest SC layers, and they interpreted this to mean that this minimizes the impact of arousal on unwanted eye movements.

      Strengths:

      The paper is clear and well-written.

      Showing slow drifts in the SC activity is important to demonstrate that cortical slow drifts could be brain-wide.

      Weaknesses:

      The authors find that the SC cells with the low motor index are modulated by pupil diameter. However, this could be completely independent of an "arousal signal". These cells have substantial visual sensitivity. If the pupil diameter changes, then their activity should be influenced since the monkey is watching a luminous display. So, in this regard, the fact that they do not see "an arousal signal" in the most motor neurons (through the pupil diameter analyses) is not evidence that the arousal signal is filtered out from the motor neurons. It could simply be that these neurons are not affected by the pupil diameter because they do not have visual sensitivity. So, even with the pupil data, it is still a bit tricky for me to interpret that arousal signals are excluded from the "output layers" of the SC.

      Of course, the general conclusion is that the motor neurons will not have the arousal signal. It's just the interpretation that is different in the sense that the lack of the arousal signal is due to a lack of visual sensitivity in the motor neurons.

      I think that it is important to consider the alternative caveat of different amounts of light entering the system. Changes in light level caused by pupil diameter variations can be quite large. Please also note that I do not mean the luminance transient associated with the target onset. I mean the luminance of the gray display. It is a source of light. If the pupil diameter changes, then the amount of light reaching the visually sensitive neurons also changes.

      Comments on revised manuscript:

      The authors have addressed my first primary comment. For the light comment, I'm still not sure they addressed it. At the very least, they should explicitly state the possibility that the amount of light entering from the gray background can matter greatly, and it is not resolved by simply changing the analysis interval to the baseline pre-stimulus epoch. I provide more clear details below:

      In line 194 of the redlined version of the article (in the Introduction), the citation to Baumann et al., PNAS, 2023 is missing near the citation of Jagadisan and Gandhi, 2022. Besides replicating Jagadisan and Gandhi, 2022, this other study actually showed that the subspaces for the visual and motor epochs are orthogonal to each other.

      Line 683 (and around) of the redlined version of the article (in the Results): I'm very confused here. When I mentioned visual modulation by changed pupil diameter, I did not mean the transient changes associated with the brief onset of the cue in the memory-guided saccade task. I meant the gray background of the display itself. This is a strong source of light. If the pupil diameter changes across trials, then the amount of light entering the eye from the gray background also changes. Thus, visually-responsive neurons will have different amounts of light driving them. This will also happen in the baseline interval containing only a fixation spot. The arguments made by the authors here do not address this point at all. So, please modify the text to explicitly state the possibility that the global luminance of the display (as filtered by the pupil diameter) alters the amount of light driving the visually-responsive neurons and could contribute to the higher effects seen in the more visual neurons.
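
      To give a sense of scale for the luminance argument above, the sketch below uses the standard definition of retinal illuminance in trolands (display luminance in cd/m² multiplied by pupil area in mm²). The background luminance and the two pupil diameters are hypothetical values chosen for illustration; they are not taken from the manuscript.

```python
import math

def retinal_illuminance_trolands(luminance_cd_m2, pupil_diameter_mm):
    """Retinal illuminance (trolands) = luminance (cd/m^2) x pupil area (mm^2)."""
    pupil_area_mm2 = math.pi * (pupil_diameter_mm / 2) ** 2
    return luminance_cd_m2 * pupil_area_mm2

# Hypothetical gray background luminance and pupil diameters (assumptions).
background_luminance = 30.0
small_pupil = retinal_illuminance_trolands(background_luminance, 3.0)
large_pupil = retinal_illuminance_trolands(background_luminance, 4.0)

# The ratio depends only on the squared ratio of diameters, (4/3)^2,
# and not on the luminance value assumed above.
ratio = large_pupil / small_pupil
print(f"Light reaching the retina changes by a factor of {ratio:.2f}")
```

      Because the light reaching the retina scales with the square of pupil diameter, even modest diameter changes alter the drive to visually responsive neurons while the display itself stays constant.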

      The figures (everywhere, including the responses to reviewers) are very low resolution and all equations in methods are missing.

      I'm very confused by Fig. 2 - supplement 2. Panel B shows a firing rate burst aligned to *microsaccade* onset. Does that mean you were in the foveal SC? i.e. how can neurons have a motor burst to the target of the memory-guided saccade and also for microsaccades? And which microsaccade directions caused such a burst? And what does it mean to compute the motor index and spike count for microsaccades in panel C? If you were in the proper SC location for the saccade target, then shouldn't you *not* get any microsaccade-related burst at all? This is very confusing to me and needs to be clarified.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      (1) The authors make fairly strong claims that "arousal-related fluctuations are isolated from neurons in the deep layers of the SC" (emphasis added). This conclusion is based on comparisons between a "slow drift axis", a low-dimensional representation of neuronal drift, and other measures of arousal (Figures 2C, 3) and motor output sensitivity (Figures 2B, 3B). However, the metrics used to compare the slow-drift axis and motor activity were computed during separate task epochs: the delay period (600-1100 ms) and a perisaccade epoch (25 ms before and after saccade initiation), respectively. As the authors reference, deep-layer SC neurons are typically active only around the time of a saccade. Therefore, it is not clear if the lack of arousal-related modulations reported for deep-layer SC neurons is because those neurons are truly insensitive to those modulations, or if the modulations were not apparent because they were assessed in an epoch in which the neurons were not active. A potentially more valuable comparison would be to calculate a slow-drift axis aligned to saccade onset. 

      The reviewer makes an important point that the calculation of an axis can depend critically on the time window of neuronal response. We find when considering this that the slow drift axis is less sensitive to this issue because it is calculated on time-averaged activity over multiple trials. In previous work we found that slow drift calculated on the stimulus evoked response in V4 was very well aligned to slow drift calculated on pre-stimulus spontaneous activity (Cowley et al, Neuron, 2020, Supplemental Figure 3A and 3B). To address this issue in the present data, we compared the axis computed for an example session for neural activity during the delay period and neural activity aligned to saccade onset. As shown in the new Figure 2 – figure supplement 1 in the revised manuscript, we found a similar lack of arousal-related modulations for deep-layer SC neurons when slow drift was computed using the saccade epoch (25ms before to 25ms after the onset of the saccade). Figure 2 – figure supplement 1A shows loadings for the SC slow drift axis when it was computed using spiking responses during the delay period (as in the main manuscript analysis). In contrast, Figure 2 – figure supplement 1B shows loadings from the same session when the SC slow drift axis was computed using spiking responses during the saccade epoch. The plots are highly similar and in both cases the loadings were weaker for neurons recorded from channels at the bottom of the probe which have a higher motor index. Finally, we found that projections onto the SC slow drift axis for this session were strongly correlated when the slow drift axis was computed using spiking responses during the delay period and the saccade epoch (r = 0.66, p < 0.001, Figure 1C). Taken together, these results suggest that arousal-related modulations are less evident in deep-layer SC neurons irrespective of whether slow drift was computed during the delay or saccade epoch (see also Public Reviews, Reviewer 1, Point 2).
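
      The epoch comparison described in this response can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' pipeline: it assumes the slow drift axis is the first principal component of spike counts averaged over consecutive bins of trials, and the function name `slow_drift_axis`, the bin size, and the simulated drift are all hypothetical.

```python
import numpy as np

def slow_drift_axis(counts, bin_size=50):
    """First PC of spike counts averaged over consecutive bins of trials.

    counts: (n_trials, n_neurons) spike counts from a single task epoch.
    bin_size: trials per bin (hypothetical value, not from the manuscript).
    Returns the unit loading vector and the projection of each bin onto it.
    """
    n_bins = counts.shape[0] // bin_size
    trimmed = counts[: n_bins * bin_size]
    binned = trimmed.reshape(n_bins, bin_size, -1).mean(axis=1)
    centered = binned - binned.mean(axis=0)
    # Rows of vt are principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    axis = vt[0]
    return axis, centered @ axis

# Synthetic session: one shared slow fluctuation, weighted per neuron.
rng = np.random.default_rng(0)
n_trials, n_neurons = 2000, 30
drift = np.sin(np.linspace(0, 3 * np.pi, n_trials))
weights = np.linspace(1.0, 0.1, n_neurons)  # weaker drift on later channels
delay = 5 + np.outer(drift, weights) + rng.normal(0, 1, (n_trials, n_neurons))
saccade = 8 + np.outer(drift, weights) + rng.normal(0, 1, (n_trials, n_neurons))

axis_delay, proj_delay = slow_drift_axis(delay)
axis_sac, proj_sac = slow_drift_axis(saccade)

# The sign of a principal component is arbitrary, so align it before comparing.
sign = 1.0 if axis_sac @ axis_delay >= 0 else -1.0
axis_sac, proj_sac = sign * axis_sac, sign * proj_sac

axis_similarity = float(axis_delay @ axis_sac)             # cosine between axes
proj_corr = float(np.corrcoef(proj_delay, proj_sac)[0, 1])  # drift time courses
print(f"axis cosine = {axis_similarity:.2f}, projection r = {proj_corr:.2f}")
```

      In this toy example the two epochs share the same underlying drift by construction, so the axes and projections agree closely; with real recordings the degree of agreement is an empirical question.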

      (2) More generally, arousal-related signals may persist throughout multiple different epochs of the task. It would be worthwhile to determine whether similar "slow-drift" dynamics are observed for baseline, sensory-evoked, and saccade-related activity. Although it may not be possible to examine pupil responses during a saccade, there may be systematic relationships between baseline and evoked responses. 

      Similar to the point above, slow drift dynamics tend to be similar across different response epochs because they are averaged across many trials and seem to tap into responsivity trends that are robust across epochs. As shown in Author response image 1 below, and in Figure 2 – figure supplement 1 in the revised manuscript, similar dynamics were observed when the SC slow drift axis was computed using spiking responses during the baseline, delay, visual and saccade epochs. We did not investigate differences between baseline and evoked pupil responses in the current paper. However, these effects were characterized in one of our previous papers that focused exclusively on the relationship between slow drift and eye-related metrics (Johnston et al., 2022, Cereb. Cortex, Figure 6). In this previous work, we found a negative correlation between baseline and evoked pupil size. Both variables were significantly correlated with slow drift, the only difference being the sign of the correlation.

      Author response image 1.

      (A-C) Dynamics of slow drift for three example sessions when the SC slow drift axis was computed using spiking responses during the baseline, delay, visual and saccade epochs. Baseline = 100ms before the onset of the target stimulus; Delay = 600 to 1100ms after the offset of the target stimulus; Stim = 25ms to 125ms after the onset of the target stimulus; Sac = 25ms before to 25ms after the onset of the saccade.

      Johnston R, Snyder AC, Khanna SB, Issar D, Smith MA (2022) The eyes reflect an internal cognitive state hidden in the population activity of cortical neurons. Cereb Cortex 32:3331–3346.

      (3) The relationships between changes in SC activity and pupil size are quite small (Figures 2C & 5C). Although the distribution across sessions (Figure 2C) is greater than chance, they are nearly 1/4 of the size compared to the PFC-SC axis comparisons. Likewise, the distribution of r<sup>2</sup> values relating pupil size and spiking activity directly (Figure 5) is quite low. We remain skeptical that these drifts are truly due to arousal and cannot be accounted for by other factors. For example, does the relationship persist if accounting for a very simple, monotonic (e.g., linear) drift in pupil size and overall firing rate over the course of an individual session? 

      Firstly, it is important to note that the strength of the relationship between projections onto the SC slow drift axis and pupil size (r<sup>2</sup> = 0.06) is within the range reported by Joshi et al. (2016, Neuron, Figure 3). They investigated the median variance explained between the spiking responses of individual SC neurons and pupil size and found it to be approximately 0.02 across sessions. Secondly, our statistical approach of testing the actual distribution of r<sup>2</sup> values against a shuffled distribution was specifically designed to rule out the possibility that the relationship between SC spiking responses and pupil size occurred due to linear drifts. The shuffled distribution in Figure 2C of the main manuscript represents the variance that can be explained by one session’s slow drift correlated with another session’s pupil, which would contain effects that occurred due to linear drifts alone. That the actual proportion of variance explained was significantly greater than this distribution suggests that the relationship between projections onto the SC slow drift axis and pupil size reflects changes in arousal rather than other factors related to linear drifts.

      Joshi S, Li Y, Kalwani RM, Gold JI (2016) Relationships between Pupil Diameter and Neuronal Activity in the Locus Coeruleus, Colliculi, and Cingulate Cortex. Neuron 89:221–234.
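
      The logic of the across-session shuffle control described above can be sketched as follows. This is an illustrative toy on synthetic data, not the authors' analysis code: every session is given the same monotonic trend plus a session-specific fluctuation, and the helper names, the truncation of traces to a common length, and the simulation parameters are assumptions.

```python
import numpy as np

def r_squared(x, y):
    """Squared Pearson correlation, truncating both traces to a common length."""
    n = min(len(x), len(y))
    r = np.corrcoef(x[:n], y[:n])[0, 1]
    return r ** 2

def matched_and_shuffled_r2(drifts, pupils):
    """r^2 for same-session pairs and for all cross-session (shuffled) pairs."""
    matched = [r_squared(d, p) for d, p in zip(drifts, pupils)]
    shuffled = [r_squared(drifts[i], pupils[j])
                for i in range(len(drifts))
                for j in range(len(pupils)) if i != j]
    return np.array(matched), np.array(shuffled)

rng = np.random.default_rng(1)
n_sessions, n_bins = 20, 60
t = np.linspace(0, 1, n_bins)
drifts, pupils = [], []
for _ in range(n_sessions):
    trend = -1.0 * t                                  # same linear trend in every session
    arousal = np.cumsum(rng.normal(0, 0.15, n_bins))  # session-specific fluctuation
    drifts.append(trend + arousal + rng.normal(0, 0.2, n_bins))
    pupils.append(trend + arousal + rng.normal(0, 0.2, n_bins))

matched, shuffled = matched_and_shuffled_r2(drifts, pupils)
# Cross-session pairs share only the common trend, so they estimate how much
# variance a generic within-session trend could explain on its own.
print(f"median matched r^2 = {np.median(matched):.2f}, "
      f"median shuffled r^2 = {np.median(shuffled):.2f}")
```

      If same-session pairs explain reliably more variance than cross-session pairs, the relationship cannot be attributed to a trend that is common to all sessions. Note that this control addresses trends shared across sessions; it does not by itself rule out slow covariation that is unique to each session.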

      (4) It is not clear how the final analysis (Figure 6) contributes to the authors' conclusions. The authors perform PCA on: (i) residual spiking responses during the delay period binned according to pupil size, and (ii) spiking responses in the saccade epoch binned according to target location (i.e., the saccade tuning curve). The corresponding PCs are the spike-pupil axis and the saccade tuning axis, respectively. Unsurprisingly, the spike-pupil axis that captures variance associated with arousal (and removes variance associated with saccade direction) was not correlated with a saccade-tuning axis that captures variance associated with saccade direction and omits arousal. Had these measures been related it would imply a unique association between a neuron's preferred saccade direction and pupil control, which seems unlikely. The separation of these axes thus seems trivial and does not provide evidence of a "mechanism...in the SC to prevent arousal-related signals interfering with the motor output." It remains unknown whether, for example, arousal-related signals may impact trial-by-trial changes in neuronal gain near the time of a saccade, or alter saccade dynamics such as acceleration, precision, and reaction time. 

      The reviewer makes a good point, and we agree that more evidence is needed to determine if the separation of the pupil size axis and saccade tuning axis is the mechanism through which cognitive and arousal-related signals can be intermixed in the SC. In the revised manuscript (lines 679-682), we have raised this as a possible explanation that necessitates further study rather than stating definitively that it is the exact mechanism through which these signals are kept separate. Our analysis here is similar to the one from Smoulder et al (2024, Neuron, Fig. 2F), in which the interactions between reward signals and target tuning in M1 were examined (and found to be orthogonal). While we agree with the reviewer that it may seem “trivial” for these axes to be orthogonal, it does not have to be so. If, for example, neural tuning curves shifted with changes in pupil size through gain changes that revealed tuning or affected tuning curve shape, there could be projections of the pupil axis onto the target tuning axis. Thus, while we agree with the reviewer that it appears sensible for these two axes to be orthogonal, our result is nonetheless a novel finding. We have edited the text in our revised manuscript, however, to make sure the nuance of this point is conveyed to the reader.

      Smoulder AL, Marino PJ, Oby ER, Snyder SE, Miyata H, Pavlovsky NP, Bishop WE, Yu BM, Chase SM, Batista AP. A neural basis of choking under pressure. Neuron. 2024 Oct 23;112(20):3424-33.
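
      One simple way to quantify how separate two population axes are is the squared cosine between them, compared against random directions in the same neural space. The sketch below is a toy illustration under stated assumptions: the cosine tuning curves, the pupil-linked weights, and the "mixed" case (where pupil-linked changes are deliberately given a component along the tuning axis) are all hypothetical, and the squared cosine is a stand-in for, not a reproduction of, the variance-explained measure in Figure 6D.

```python
import numpy as np

def first_pc(data):
    """First principal axis of a (conditions, neurons) matrix."""
    centered = data - data.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]

def squared_cosine(a, b):
    """Overlap between two axes: 0 = orthogonal, 1 = identical direction."""
    return float((a @ b) ** 2 / ((a @ a) * (b @ b)))

rng = np.random.default_rng(2)
n_neurons, n_targets, n_pupil_bins = 40, 8, 10

# Saccade tuning axis from hypothetical cosine tuning curves (targets x neurons).
angles = np.arange(n_targets) * 2 * np.pi / n_targets
preferred = rng.uniform(0, 2 * np.pi, n_neurons)
tuning = 10 + 5 * np.cos(angles[:, None] - preferred[None, :])
saccade_axis = first_pc(tuning)

pupil = np.linspace(-1, 1, n_pupil_bins)
untuned_weights = rng.normal(0, 1, n_neurons)

# Case 1: pupil-linked changes unrelated to saccade tuning.
pupil_axis_untuned = first_pc(np.outer(pupil, untuned_weights))
# Case 2 (hypothetical): pupil-linked changes partly along the tuning axis.
pupil_axis_mixed = first_pc(np.outer(pupil, untuned_weights + 8 * saccade_axis))

overlap_untuned = squared_cosine(pupil_axis_untuned, saccade_axis)
overlap_mixed = squared_cosine(pupil_axis_mixed, saccade_axis)

# Chance level: overlap of random directions with the saccade axis (mean ~ 1/n).
chance = np.array([squared_cosine(rng.normal(size=n_neurons), saccade_axis)
                   for _ in range(1000)])
print(f"untuned = {overlap_untuned:.3f}, mixed = {overlap_mixed:.3f}, "
      f"chance mean = {chance.mean():.3f}")
```

      The toy example only shows that the overlap measure distinguishes the two cases; whether real SC populations behave like the first or the second case is what the analysis in the manuscript tests.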

      Reviewer #2 (Public Review):

      (1) The greatest weakness in the present research is the fact that arousal is a functionally less important non-motoric variable. The authors themselves introduce the problem with a discussion of attention, which is without any doubt the most important cognitive process that needs to be functionally isolated from oculomotor processes. Given this introduction, one cannot help but wonder, why the authors did not design an experiment, in which spatial attention and oculomotor control are differentiated. Absent such an experiment, the authors should spend more time explaining the importance of arousal and how it could interfere with oculomotor behavior. 

      Although attention does represent an important cognitive process, we did not design an experiment in which attention and oculomotor control are differentiated because attention does not appear to be related to slow drift. In our first paper that reported on this phenomenon, we investigated the effects of spatial attention on slow fluctuations in neural activity by cueing the monkeys to attend to a stimulus in the left or right visual field in a block-wise manner. Each block lasted ~20 minutes and we found that slow drift did not covary with the timing of cued blocks (see Figure 4A, Cowley et al., 2020, Neuron). Furthermore, there is a large body of work showing that arousal also impacts motor behavior leading to changes in a range of eye-related metrics (e.g., pupil size, microsaccade rate and saccadic reaction time - for review, see Di Stasi et al. 2013, Neurosci. Biobehav. Rev.). We also note that the terms attention and arousal are often used in nonspecific and overlapping ways in the literature, adding to some potential confusion here. Nonetheless, pupil-linked arousal is an important variable that impacts motor performance. This has now been stated clearly in the Introduction of the revised manuscript (lines 108-114) to address the reviewer’s concerns and highlight the importance of studying how precise fixation and eye movements are maintained even in the presence of signals related to ongoing changes in brain state. 

      Cowley BR, Snyder AC, Acar K, Williamson RC, Yu BM, Smith MA (2020) Slow Drift of Neural Activity as a Signature of Impulsivity in Macaque Visual and Prefrontal Cortex. Neuron 108:551-567.e8.

      (2) In this context, it is particularly puzzling that one actually would expect effects of arousal on oculomotor behavior. Specifically, saccade reaction time, accuracy, and speed could be influenced by arousal. The authors should include an analysis of such effects. They should also discuss the absence or presence of such effects and how they affect their other results. 

      As described above, several studies across species have demonstrated that arousal impacts motor behavior, e.g., saccade reaction time, saccade velocity and microsaccade rate (for review, see Di Stasi et al. 2013, Neurosci. Biobehav. Rev.). This has been clarified in the Introduction of the revised manuscript to address the reviewer's concerns (lines 108-114). Our prior work (Johnston et al, Cerebral Cortex, 2022) shows that slow drift impacts several types of oculomotor behavior. Overall, these studies highlight the impact of arousal on eye movements as a robust effect, and support the present investigation into arousal and oculomotor control signals. While we agree reaction time, accuracy, and speed all can be influenced by arousal depending on task demands, the present study is focused on the connection between slow fluctuations in neural activity, linked to arousal, and different subpopulations of SC neurons. 

      Di Stasi LL, Catena A, Cañas JJ, Macknik SL, Martinez-Conde S (2013) Saccadic velocity as an arousal index in naturalistic tasks. Neurosci Biobehav Rev 37:968–975.

      Johnston R, Snyder AC, Khanna SB, Issar D, Smith MA (2022) The eyes reflect an internal cognitive state hidden in the population activity of cortical neurons. Cereb Cortex 32:3331–3346.

      (3) The authors use the analysis shown in Figure 6D to argue that across recording sessions the activity components capturing variance in pupil size and saccade tuning are uncorrelated. However, the distribution (green) seems to be non-uniform with a peak at very low and very high correlation specifically. The authors should test if such an interpretation is correct. If yes, where are the low and high correlations respectively? Are there potentially two functional areas in SC? 

      We agree with the reviewer that our actual data distribution was non-uniform. We examined individual sessions with high and low variance explained and did not find notable differences. One source of this variation has to do with session length. Longer sessions in principle should have a chance distribution of variance explained closer to zero because they contained more time bins. Given that we had no specific hypothesis for a non-uniform distribution, we have simply displayed the full distribution of values in our figure and the statistical result of a comparison to a shuffled distribution.

      Reviewer #3 (Public Review):

      (1) However, I am concerned about two main points: First, the authors repeatedly say that the "output" layers of the SC are the ones with the highest motor indices. This might not necessarily be accurate. For example, current thresholds for evoking saccades are lowest in the intermediate layers, and Mohler & Wurtz 1972 suggested that the output of the SC might be in the intermediate layers. Also, even if it were true that the high motor index neurons are the output, they are very few in the authors' data (this is also true in a lot of other labs, where it is less likely to see purely motor neurons in the SC). So, this makes one wonder if the electrode channels were simply too deep and already out of the SC? In other words, it seems important to show distributions of encountered neurons (regardless of the motor index) across depth, in order to better know how to interpret the tails of the distributions in the motor index histogram and in the other panels of Figure Supplement 1. I elaborate more on these points in the detailed comments below. 

      The reviewer makes a good point about the efferent signals from SC. It is true that electrical thresholds are often lowest in intermediate layers, though deep layers do project to the oculomotor nuclei (Sparks, 1986; Sparks & Hartwich-Young, 1989) and often intermediate and deep layers are considered to function together to control eye movements (Wurtz & Albano, 1980). As suggested by the reviewer, we have edited the text throughout the manuscript to say that slow drift was less evident in SC neurons with a higher motor index, as well as included the above references and points about the intermediate and deep layers (Lines 73-81). Aside from the question of which layers of the SC function as the “motor output”, the reviewer raises a separate and important question: are our deep recordings still in SC? Here, we can say definitively that they are. We removed neurons if they did not exhibit elevated (above baseline) firing rates during the visual or saccade epochs of the MGS task (see Methods section on “Exclusion criteria”). All included neurons possessed a visual, visuomotor or motor response, consistent with the response properties of neurons in the SC. In addition, we found a number of neurons well above the bottom of the probe with strong motor responses and minimal loadings onto the slow drift axis (see Figure 2 – figure supplement 1A), consistent with the reviewer’s comment that intermediate layer neurons are tuned for movement and play a role in saccade production.

      Mohler CW, Wurtz RH. Organization of monkey superior colliculus: intermediate layer cells discharging before eye movements. Journal of neurophysiology. 1976 Jul 1;39(4):722-44.

      Sparks DL. Translation of sensory signals into commands for control of saccadic eye movements: role of primate superior colliculus. Physiol Rev. 1986 Jan;66(1):118-71. doi: 10.1152/physrev.1986.66.1.118. PMID: 3511480.

      Sparks DL, Hartwich-Young R. The deep layers of the superior colliculus. Reviews of oculomotor research. 1989 Jan 1;3:213-55.

      Wurtz RH, Albano JE. Visual-motor function of the primate superior colliculus. Annu Rev Neurosci. 1980;3:189-226. doi: 10.1146/annurev.ne.03.030180.001201. PMID: 6774653.

      (2) Second, the authors find that the SC cells with a low motor index are modulated by pupil diameter. However, this could be completely independent of an "arousal signal". These cells have substantial visual responses. If the pupil diameter changes, then their activity should be influenced since the monkey is watching a luminous display. So, in this regard, the fact that they do not see "an arousal signal" in most motor neurons (through the pupil diameter analyses) is not evidence that the arousal signal is filtered out from the motor neurons. It could simply be that these neurons simply do not get affected by the pupil diameter because they do not have visual sensitivity. So, even with the pupil data, it is still a bit tricky for me to interpret that arousal signals are excluded from the "output layers" of the SC. 

      The reviewer makes an important point about the SC’s visual responses. Neurons with a low motor index are, conversely, likely to have a stronger visual response index. However, we do not believe that changes in luminance can explain why the correlation between SC spiking response and pupil size is weaker for neurons with a higher motor index. Firstly, the changes in pupil size observed in the current paper and our previous work are slow and occur on a timescale of minutes (Cowley et al., 2020, Neuron) and are correlated with eye movement measures such as reaction time and microsaccade rate (Johnston et al., 2022, Cerebral Cortex). This is in stark contrast to luminance-evoked changes in pupil size that occur on a timescale of less than a second. Secondly, as shown in the new Figure 5 – figure supplement 1 in the revised manuscript, very similar results were found when SC spiking responses were correlated with pupil size during the baseline period, when only the fixation point was on the screen. Although the luminance of the small peripheral target stimulus can result in small luminance-evoked changes in pupil size, no changes in luminance occurred during the baseline period, which was defined as 100ms before the onset of the target stimulus. In Figure 2 – figure supplement 1 and Author response image 1 above, we show that slow drift is the same whether calculated on the baseline response, delay period, or peri-saccadic epoch. Thus, the measurement of slow drift is insensitive to the precise timing of the selection of both the window for the spiking response and the window for the pupil measurement. If luminance were the explanation for the slow changes in firing observed in visually responsive SC neurons, it would require those neurons to exhibit robust, sustained tuned responses to the small changes in retinal illuminance induced by the relatively small fluctuations in pupil size we observed from minute to minute. We are aware of no reports of such behavior in visually-responsive neurons in SC. We have included these analyses and this reasoning in the revised manuscript on lines 478-495.

      Reviewer #1 (Recommendations for the author):

      (1) It would be useful to provide line numbers in subsequent manuscripts for reviewers.

      Line numbers have been added in the revised version of the manuscript.

      (2) Page #6; last sentence: "...even impact processing at the early to mid stages of the visuomotor transformation, without leading to unwanted changes in motor output." I do not believe the authors have provided evidence that arousal levels were not associated with changes in motor output.

      As suggested by Reviewer 3 (see Public Reviews, Reviewer 3, Point 2), we have edited the text throughout the manuscript to say that slow drift was less evident in SC neurons with a higher motor index. This sentence in the revised manuscript now reads:

      “This provides a potential mechanism through which signals related to cognition and arousal can exist in the SC, and even impact processing at the early to mid stages of the visuomotor transformation, without leading to unwanted changes in SC neurons that are linked to saccade execution.”

      (3) Page #8; last paragraph: Although deep-layer SC neurons may not have been obtained during every recording session, a summary of the motor index scores observed along the probe across sessions would be useful to confirm their assumptions. 

      See Author response image 2 below, which shows the motor index of each recorded SC neuron on the x-axis and session number on the y-axis. The points are colored by the squared factor loading, which represents the variance explained between the response of a neuron and the slow drift axis (see Figure 3B of the main manuscript). You can see from this plot that neurons with a stronger component loading (shown in teal to yellow) typically have a lower motor index, whereas the opposite is true for neurons with a weaker component loading (shown in dark blue).

      Author response image 2.

      Scatter plot showing the motor index of each recorded neuron along with the session number in which it was recorded. The points are colored by the squared factor loading for each neuron along the slow drift axis. Note that loadings above 0.5 (33 data points in total) have been thresholded at 0.5 so that we could effectively use the color range to show all of the slow drift axis loadings.
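
      The thresholding described in the caption can be sketched as follows. This is an illustration only: the function name `cap_for_color_scale` and the example values are hypothetical, and it assumes the cap is applied to the squared loadings before they are mapped to color.

```python
def cap_for_color_scale(squared_loadings, cap=0.5):
    """Clip values above `cap` so a few large loadings do not
    compress the color range used for the remaining points."""
    capped = [min(value, cap) for value in squared_loadings]
    n_capped = sum(1 for value in squared_loadings if value > cap)
    return capped, n_capped


# Hypothetical squared loadings for five neurons.
example = [0.02, 0.31, 0.5, 0.74, 0.9]
capped, n_capped = cap_for_color_scale(example)
print(capped, n_capped)
```

      Capping changes only how the largest values are displayed; the ordering of the remaining points along the color scale is untouched.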

      (4) Page #10; first paragraph: The authors should state the time window of the delay period used, since it may be distinct from the pupil analysis (first 200ms of delay). 

      This has been stated in the revised version of the manuscript. The sentence now reads:

      “We first asked if arousal-related fluctuations are present in the SC. As in previous studies that recorded from neurons in the cortex (Cowley et al., 2020), we found that the mean spiking responses of individual SC neurons during the delay period (chosen at random on each trial from a uniform distribution spanning 600-1100ms, see Methods) fluctuated over the course of a session while the monkeys performed the MGS task (Figure 2A, left).”

      (5) Page #10; second paragraph: Extra period at the end of a sentence: " most variance in the data..". 

      Fixed in the revised version of the manuscript.

      (6) Page #12: "between projections onto the SC slow drift axis and mean pupil size during the first 200ms of the delay period when a task-related pupil response could be observed." What criteria was used to determine whether a task-related pupil response was observed? 

      This was chosen based on the results of a previous study in our lab that used the same memory-guided saccade task to investigate the relationship between slow drift and changes in baseline and evoked pupil size (see Johnston et al., 2022, Cereb. Cortex, Figure 6B). The period was chosen based on plotting the average pupil size aligned on different trial epochs. As we show in Figure 5-figure supplement 3 above, the pupil interactions with slow drift did not depend on the particular time window of the pupil we chose.

      (7) Page #14; Figure 2A: The axes for the individual channels are strangely floating and quite different from all other figures. Please label the channel in the figure legend that was used as an example of the projected values onto the slow drift axis.

      The figure has been changed in the revised version of the manuscript so that the tick mark denoting zero residual spikes per second is on the top layer of each plot. A scale bar was chosen instead of individual axes to reduce clutter in the figure as it was used to demonstrate how slow drift was computed. Residual spiking responses from all neurons were projected on the slow drift axis to generate the scatter plot in the bottom right-hand corner of Figure 2A. There is no single neuron to label.

      (8) Page #16: "These results demonstrate that even though arousal-related fluctuations are present in the SC, they are isolated from deep-layer neurons that elicit a strong saccadic response and presumably reside closer to the motor output." In line with our major comments, lack of arousal-related activity during the delay period is meaningless for deep-layer SC neurons that are generally inactive during this time. It does not imply that there is no arousal signal! 

      Addressed in Public Reviews, Reviewer 1, Point 1 & 2. We found a similar lack of arousal-related modulations reported for deep-layer SC neurons when slow drift was computed using the saccade epoch (Figure 1 above). In addition, similar dynamics were observed when the SC slow drift axis was computed using spiking responses during the baseline, delay, visual and saccade period (Figure 2).

      (9) Page #18: "These findings provide additional support for the hypothesis that arousal-related fluctuations are isolated from neurons in the deep layers of the SC." The same criticism from above applies.

      Addressed in Public Reviews, Reviewer 1, Point 1 & 2.

      (10) Page #20; paragraph 3: "Taken together, the findings outlined above..." Would be useful to be more specific when referring to "activity" ; e.g., "...these neurons did not exhibit large fluctuations in delay-period activity over time".

      This sentence has been changed in the revised manuscript in light of the reviewer’s comments. It now reads:

      “In addition to being more weakly correlated with pupil size, the spiking responses of these neurons did not exhibit large fluctuations over time (Figure 2), and when considering the neuronal population as a whole, explained less variance in the slow drift axis when it was computed using population activity in the SC (Figure 3) and PFC (Figure 4).”

      Reviewer #3 (Recommendations for the author):

      The paper is clear and well-written. However, I am concerned about two main points: 

      (1) First, the authors repeatedly say that the "output" layers of the SC are the ones with the highest motor indices. This might not necessarily be accurate. For example, current thresholds for evoking saccades are lowest in the intermediate layers, and Mohler & Wurtz 1972 suggested that the output of the SC might be in the intermediate layers. Also, even if it were true that the high motor index neurons are the output, they are very few in the authors' data (this is also true in a lot of other labs, where it is less likely to see purely motor neurons in the SC). So, this makes one wonder if the electrode channels were simply too deep and already out of the SC. In other words, it seems important to show distributions of encountered neurons (regardless of motor index) across depth, in order to better know how to interpret the tails of the distributions in the motor index histogram and in the other panels of the figure supplement 1. I elaborate more on these points in the detailed comments below. 

      Addressed in Public Reviews, Reviewer 3, Point 1.

      (2) Second, the authors find that the SC cells with a low motor index are modulated by pupil diameter. However, this could be completely independent of an "arousal signal". These cells have substantial visual responses. If the pupil diameter changes, then their activity should be influenced since the monkey is watching a luminous display. So, in this regard, the fact that they do not see "an arousal signal" in most motor neurons (through the pupil diameter analyses) is not evidence that the arousal signal is filtered out from the motor neurons. It could simply be that these neurons simply do not get affected by the pupil diameter because they do not have visual sensitivity. So, even with the pupil data, it is still a bit tricky for me to interpret that arousal signals are excluded from the "output layers" of the SC. 

      Addressed in Public Reviews, Reviewer 3, Point 2.

      (3) I think that a remedy to the first point above is to change the text to make it a bit more descriptive and less interpretive. For example, just say that the slow drifts were less evident among the neurons with high motor index. 

      We thank the reviewer for this suggestion (see Public Reviews, Reviewer 3, Point 1).

      (4) For the second point, I think that it is important to consider the alternative caveat of different amounts of light entering the system. Changes in light level caused by pupil diameter variations can be quite large. 

      We thank the reviewer for this suggestion (see Public Reviews, Reviewer 3, Point 2).

      (5) Line 31: I'm a bit underwhelmed by this kind of statement. i.e. we already know that cognitive processes and brain states do alter eye movements, so why is it "critical" that high precision fixation and eye movements are maintained? And, isn't the next sentence already nulling this idea of criticality because it does show that the brain state alters the SC neurons? In fact, cognitive processes are already known to be most prevalent in the intermediate and deep layers of the SC. 

      It seems clear that while cognitive state does affect eye movements, it is desirable to have some separation between cognitive state and eye movement control. Covert attention, for instance, is precisely a situation where eye movement control is maintained to avoid overt saccades to the attended stimulus, and yet there are clear indications of attention’s impact on microsaccades and fixation. We stand by our statement that an important goal of vision is to have precise fixation and movements of the eye, and yet at the same time the eyes are subject to numerous influences by cognitive state.

      (6) Line 65: it is better to clarify that these are "functional layers" because there are actually more anatomical layers. 

      We have edited this sentence in the revised version of the manuscript so that it now reads:

      “The role of these projections in the visuomotor transformation depends on the functional layer of the SC in which they terminate”.

      (7) Line 73: this makes it sound like only the deepest layers are topographically organized, which is not true. Also, as early as Mohler & Wurtz, 1972, it was suggested that the intermediate layers have the biggest impacts downstream of the SC. This is also consistent with electrical microstimulation current thresholds for evoking saccades from the SC. 

      We have addressed the reviewers’ comments about the intermediate layers having the biggest impact downstream of the SC in Public Reviews, Reviewer 3, Point 1. Furthermore, line 73 has been changed in the revised manuscript so that it now reads:

      “As is the case for neurons in the superficial and intermediate layers, they [SC motor neurons] form a topographically organized map of visual space (White et al. 2017; Robinson 1972; Katnani and Gandhi 2011)”.  

      (8) Line 100: there is an analogous literature regarding the question of why unwanted muscle contractions do not happen. Specifically, in the context of why SC visual bursts do not automatically cause saccades (which is a similar problem to the ones you mention about cognitive signals interfering by generating unwanted eye movements), both Jagadisan & Gandhi, Curr Bio, 2022 and Baumann et al, PNAS, 2023 also showed that SC population activity not only has different temporal structure (Jagadisan & Gandhi) but also occupy different subspaces (Baumann et al) under these two different conditions (visual burst versus saccade burst). This is conceptually similar to the idea that you are mentioning here with respect to arousal. So, it is worth it to mention these studies here and again in the discussion. 

      We are grateful to the reviewer for these suggestions and have included text in the Introduction (Lines 125-128) and Discussion (Lines 678-682) of the revised manuscript along with the references cited above.

      (9) Line 147: as mentioned above, it is now generally accepted that there are quite few "pure" motor neurons in the SC. This is consistent with what you find. E.g. Baumann et al., 2023. And, again see Mohler and Wurtz in the 1970's. So, I wonder how useful it is to go too much into this idea of the deeper motor neurons (e.g. the correlations in the other panels of the Figure 1 supplement).

      This is related to the reviewer’s comment that the output of the SC might be in the intermediate layers. This concern has been addressed in Public Reviews, Reviewer 3, Point 1.

      (10) Figure 1 should say where the RF was for the shown spike rasters. i.e. were these the same saccade target across trials? And where was that location relative to the RF? It would help also in the text to say whether the saccade was always to the RF center or whether you were randomizing the target location. 

      We centered the array of saccade targets using the microstimulation-evoked eye movement for SC (see Methods section “Memory-guided saccade task”) to find the evoked eccentricity, and then used saccade targets with equal spacing of 45 degrees starting at zero (rightward saccade target). We did not do extensive RF mapping beyond this microstimulation centering. In Figure 1, the spike rasters are shown for a target that was visually identified to be within the neuron’s RF based on assessing responses to all 8 target angles. We have added information about this to the figure caption.
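
      The target layout described above can be sketched as a short calculation. The eccentricity value and the function name `saccade_targets` are hypothetical, and the counterclockwise angle convention is an assumption; only the 8 targets, the 45-degree spacing, and the start at zero (rightward) come from the response.

```python
import math


def saccade_targets(eccentricity_deg, n_targets=8):
    """Return (angle, x, y) for targets equally spaced on a circle.

    Angle 0 is the rightward target; angles are assumed to increase
    counterclockwise. x and y are in degrees of visual angle.
    """
    step = 360 / n_targets  # 45 degrees for 8 targets
    targets = []
    for k in range(n_targets):
        angle = k * step
        x = eccentricity_deg * math.cos(math.radians(angle))
        y = eccentricity_deg * math.sin(math.radians(angle))
        targets.append((angle, round(x, 3), round(y, 3)))
    return targets


# Hypothetical eccentricity of 10 degrees, standing in for the
# microstimulation-evoked value.
targets = saccade_targets(10.0)
for angle, x, y in targets:
    print(angle, x, y)
```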

      (11) Line 218: but were there changes in the eye movement statistics? For example, the slow drift eye movements during fixation? Or even the microsaccades? 

      Addressed in Public Reviews, Reviewer 2, Point 2.  

      (12) Line 248: shuffling what exactly? I think that more explanation would be needed here. 

      Addressed in Public Reviews, Reviewer 1, Point 3.  

      (13) Line 263: but isn't this reflecting a sensory transient in the pupil diameter, since the target just disappeared? 

      Addressed in Public Reviews, Reviewer 3, Point 2.  

      (14) Line 271: I suspect that slow drift eye movements (in between microsaccades) would show higher correlations. Not sure how well you can analyze those with a video-based eye tracker. 

      We agree that fixational drift would be a worthwhile metric, but it is not one we have focused on here and to our knowledge does require higher precision tracking. 

      (15) Line 286: again, see above about similar demonstrations with respect to the visual and motor burst intervals, which clearly cause the same problem (even stronger) as the one studied here. 

      See reply, including Figure 2.

      (16) Line 330: again, I'm not sure deeper necessarily automatically means closer to the output. For example, current thresholds for evoked saccades grow higher as you go deeper. Maybe the authors can ask their colleague Neeraj Gandhi about this point specifically, just to be safe. Maybe the safest would be to remain descriptive about the data, and just say something like: arousal-related fluctuations were absent in our deepest recorded sites. 

      Addressed in Public Reviews, Reviewer 3, Point 1.

      (17) Line 332: likewise, statements like this one here would be qualified if the output was the intermediate layers......anyway if I understand what I read so far in the paper, the signal will be anyway orthogonal to the motor burst population subspace. So, maybe there's no need to emphasize that it goes away in the very deepest layers. 

      See reply above, Public Reviews, Reviewer 1, Point 4.

      (18) Figure 3A: related to the above, I think one issue could be that the deeper contacts might already be out of the SC. Maybe some cell count distribution from each channel should help in this regard. i.e. were you finding way fewer saccade-related neurons in the deepest channels (even though the few that you found were with high motor index)? If so, then wouldn't this just mean that the channel was too deep? I think there needs to be an analysis like this, to convince readers that the channels were still in the SC. Ideally, electrical stimulation current thresholds for evoking saccades at different depths would be tested, but I understand that this can be difficult at this stage. 

      Addressed in Public Reviews, Reviewer 3, Point 1.

      (19) I keep repeating this because in general, cognitive effects are stronger in the intermediate/deeper layers than in the superficial layers. If these interfere with eye movements like arousal, then why should arousal be different?

      Few studies have investigated the effects of attention on “pure” movement SC neurons that only discharge during a saccade. One study, which we cited in the Introduction (Ignashchenkova et al., 2004, Nat. Neurosci.), found significant differences in spiking responses between trials with and without attentional cueing for visual and visuomotor neurons. No significant difference was found for motor neurons, consistent with our hypothesis that signals related to cognition and arousal are kept separate from saccade-related signals in the SC.

      (20) The problem with Figure 5 and its related text is that the neurons with low motor index are additionally visual. So, of course, they can be modulated if the pupil diameter changes!

      Addressed in Public Reviews, Reviewer 3, Point 2.  

      (21) I had a hard time understanding Figure 6. 

      See reply above, Public Reviews, Reviewer 1, Point 4.

      (22) Line 586: these cells have more visual responses and will be affected by the amount of light entering the eye. 

      Addressed in Public Reviews, Reviewer 3, Point 2.

    1. Overview of the Situation of Homeless Children Housed in Schools in France

      Summary

      Child homelessness is rising alarmingly in France, up 133% since 2020, a trend worsened by inflation and the housing crisis.

      In response to what the report describes as the "failings of the State", citizen collectives, notably "Jamais sans toi" in Lyon, are organizing the occupation of school buildings to give families living on the street a place to sleep at night.

      This overview examines the phenomenon through the testimony of a family of Angolan origin (a mother and her children) sheltered in a Lyon school.

      Their story highlights extreme precarity, the trauma of an aborted deportation attempt, and the deep psychological impact on the children.

      The situation reveals a critical tension between citizen solidarity, embodied by teachers and pupils' parents, and the inaction of public authorities, which not only fail to offer lasting housing solutions but also put administrative pressure on those who provide that solidarity.

      1. Child Homelessness and the Citizen Response

      The report highlights a major social crisis: the sharp rise in the number of homeless children in France.

      Scale and Causes:

      ◦ Child homelessness has increased by 133% since 2020.

      ◦ The factors identified are inflation, the growing number of evictions, and the shortage of social housing.

      ◦ Emergency solutions, designed to be temporary, "drag on".

      In 2023, families housed in schools stayed there for more than six months on average.

      School Occupation as a Stopgap:

      ◦ Faced with this situation, citizen collectives such as "Jamais sans toi" in Lyon organize school occupations to shelter families.

      ◦ Scale of the phenomenon in Lyon:

      ▪ Currently, 17 schools in the Lyon metropolitan area house 25 families.

      ▪ Since 2014, some sixty schools have sheltered more than 1,000 children.

      ◦ The movement is not limited to Lyon; similar initiatives exist in Strasbourg, Rennes, and Paris.

      ◦ This support relies on "citizen generosity" (pupils' parents, teachers, local residents), which compensates for the State's shortcomings.

      2. Case Study: The Journey of an Angolan Family

      The report focuses on the moving testimony of Lucy (16), Lina (12), and their mother, which illustrates the human reality behind the statistics.

      From Angola to Precarity in France:

      ◦ They arrived in France when Lucy was 10 and Lina was 5 or 6.

      ◦ First experiences of precarious housing: a shared room through the 115 emergency shelter line in Dijon, then a hostel in Digoin.

      ◦ During the day, the family had to leave the 115 shelter and take refuge with charities (Secours Populaire, churches) to eat.

      ◦ Lina describes her disappointment with the reality of France, far from the idealized image she had from cartoons:

      "A really great country, where everything went well, where you had a normal life."

      ◦ She was also mocked and subjected to racism at school because of her language and her hair.

      The Trauma of the Failed Deportation (OQTF):

      ◦ Two years ago, the family was issued an Obligation de Quitter le Territoire Français (OQTF), an order to leave French territory.

      ◦ The police came to their apartment in the middle of the night. Lucy, then 14, describes a scene of panic and violence:

      her parents shouting, her father handcuffed, and the children shut in a bedroom with police officers.

      ◦ The family was driven to Paris, a 5-hour journey, and held in a detention center for 4 hours.

      ◦ At the airport, their flight to Angola was canceled. The authorities then "abandoned [them] at the airport", simply ordering them "not to go back to where [they] had been".

      Family Breakdown and Displacement:

      ◦ After this episode, the family returned to Lyon.

      As the parents' marriage was not recognized in France, their separation followed. The mother found herself alone with her children.

      ◦ They went through a succession of temporary housing arrangements:

      a campsite in Trévoux, an apartment in Bellecour, then a charity that housed them with other women, before finding refuge in the school.

      3. Daily Life in a Classroom

      The school, while providing a roof, imposes extremely restrictive and precarious living conditions.

      Aspect | Description

      Housing | The family sleeps on air mattresses in a classroom. Clothes are stored in the classroom cupboards and in suitcases.

      Routine | They must get up between 6:30 and 6:50 a.m. The family must leave the premises before 8:30 a.m. and may return only after 6:00 p.m., once all the pupils have left.

      Discretion | At night, turning on the lights is forbidden so as not to attract attention. The family uses phone flashlights for light.

      Insecurity | Youths playing in the schoolyard have already come up and rummaged through their belongings, taking advantage of a door left open.

      Disruptions | The family's life is punctuated by the school bell, which rings "every hour".

      The mother's struggle | She is actively looking for work (cleaning, restaurant work) and free training courses, but her situation makes these efforts very difficult.

      4. Psychological and Social Impacts on the Children

      Precarity and instability have profound consequences for the children's well-being and development.

      The Weight of Secrecy and Shame:

      ◦ Lucy hides her situation from most of her friends for fear of being judged:

      "I get a bit anxious, knowing that a lot of young people my age [...] just allow themselves to judge."

      ◦ She expresses a deep desire for normality: "Sometimes I think I'd just like to have a normal life like lots of teenagers my age."

      ◦ Lina also expresses a fear of being shunned by her classmates because she lives in a school.

      Aspirations and Resilience:

      ◦ Despite these hardships, Lucy is a good student and hopes to become a lawyer.

      Her ambition is directly tied to her experience: "I want to be a lawyer, to defend people, because I think everyone has the right to a second chance."

      ◦ In the face of distress, she has developed a strategy of emotional control: "When it's hard, well, I grit my teeth and tell myself it's going to be okay."

      ◦ Her greatest fear remains material and existential: "I'm afraid of ending up on the street. That scares me."

      5. Solidarity in the Face of Institutional Inaction

      The report contrasts active solidarity on the ground with the passive, even repressive, response of institutions.

      Support from the Teaching Staff:

      ◦ One teacher at the school became heavily involved, sleeping on site the first night to reassure the after-school staff.

      ◦ She hosted the family at her home during the Christmas holidays, a particularly symbolic time because the family had spent the previous Christmas outdoors.

      ◦ A collection organized by her colleagues paid for gifts and a holiday meal for the family.

      Pressure from the Hierarchy:

      ◦ Following the occupation, the teacher and her colleagues were summoned by the academy inspector (inspectrice d'académie).

      ◦ The meeting is described as "a good dressing-down", in which they "got yelled at".

      The inspector called them "reckless", placing "all the responsibility" on them without acknowledging the family's vulnerability.

      The Absence of Lasting Solutions:

      ◦ Nearly a year after the occupation began, "there is no proposal from the city hall, from the metropolitan authority, no prospect, nothing."

      ◦ The occupation of the school therefore had to continue beyond the school year, but under stricter rules:

      the family is no longer allowed in the building during school hours.

    1. Roots are in capitals,

      Roots are in capitals, and are not words in use at all, but serve as an elucidation of the words grouped together and a connection between them.

      J.R.R. Tolkien's note in the Qenya Lexicon[1]

    1. Sauron knew

      Sauron knew that he had been wrong - not everyone would want to use the Ring for their own power and glory

      yes, Frodo succumbed at the very, very end, but he and Sam made it that far, and fate or Providence or the intervention of Eru himself did the rest

      some people are capable of selfless and purely good acts

      it wasn't just Sauron who fell - it was his entire worldview

      hope and love and care and friendship can triumph over evil, however powerful it may seem at the time

      Description

      1. What is the number one cause of stress in your life? The number one cause of stress in my life is college. I didn't know that college was going to be this difficult for me.

      2. What else causes you stress? Keeping my house clean.

      3. What effect does stress have on your studies and academic performance? I can't focus as much on college.

      1 4 4 3 0 0 4 4 2 2 0 3 2 2 3 3

      1. List common causes of stress for college students.

      Everyday stressors
      * Time pressure: juggling classes, work, and social life
      * Academic anxiety: grades, exams, assignments
      * Financial concerns: tuition, rent, food
      * Relationship conflicts: roommates, partners, family
      * Health issues: frequent illness, poor sleep, allergies
      * Body image and eating habits
      * Loneliness or poor social connection
      * Daily hassles: broken-down car, housing issues

      2. Describe the physical, mental, and emotional effects of persistent stress.

      Physical
      * Weakened immune system (more frequent illness)
      * Digestive issues (ulcers, constipation, indigestion)
      * High blood pressure and increased risk of heart disease
      * Muscle tension, headaches, fatigue
      * Sleep disturbances (insomnia or oversleeping)

      Mental
      * Difficulty concentrating or thinking clearly
      * Poor memory and reduced academic performance
      * Negative thought patterns and pessimism

      Emotional
      * Anxiety, depression, irritability
      * Feelings of helplessness or frustration
      * Withdrawal from others or increased conflict

      3. List healthy ways college students can manage or cope with stress.
      * Exercise regularly (aerobic activity boosts mood and focus)
      * Get enough sleep (7-9 hours improves resilience)
      * Time management: focus on growth, not perfection

      4. Develop your personal plan for managing stress in your life.

      Long-term vision
      * Feel more in control of my schedule
      * Build emotional resilience and confidence
      * Maintain academic success while enjoying college life

    1. reply to u/EdmundDante718 at https://reddit.com/r/typewriters/comments/1o5x527/missing_carriage_releases_on_scm/

      It's incredibly common for these 6 series Smith-Coronas to have broken plastic carriage release levers (a major design flaw). You can call around to shops with parts machines for original replacements. https://site.xavier.edu/polt/typewriters/tw-repair.html

      There are numerous YouTube repair videos and ideas, including these few I've bookmarked before, though there are surely others:
      - https://www.youtube.com/watch?v=zNcQvfUk23s
      - https://www.youtube.com/watch?v=Gb9VlrKcXcM

      I've not seen anyone 3-D print a version (yet), but designs for one might be floating around out there.

      I've also seen people jury rig all sorts of plastic replacements which is an option as well.

      In practice, you generally only need one working one for your dominant hand.

    1. Video summary [00:00:14] - [00:28:20]:

      This video explores the managerialization of associations and its impacts.

      It addresses the challenges and proposes solutions for strengthening the nonprofit sector in the face of this trend.

      Highlights:

      • [00:00:14] Introduction and context
        • Welcoming the participants
        • Presentation of the webinar
        • Objectives of the series
      • [00:03:27] What is at stake in managerialization
        • Definition and history
        • Impact on associations
        • Comparison with other models
      • [00:07:03] Consequences and criticisms
        • Loss of the democratic dimension
        • Erosion of human relationships
        • Concrete examples and testimonies
      • [00:15:01] Solutions and alternatives
        • Importance of participation
        • Reclaiming the vocabulary
        • Examples of good practice
      • [00:22:00] Conclusion and outlook
        • Invitation to collective action
        • Importance of internal consistency
        • Call for reflection and innovation

      Video summary [00:28:22] - [00:54:06]:

      This video explores the management and governance of associations in the face of managerialization.

      It highlights the importance of information flow, collective intelligence, and deliberation for democratic and effective governance.

      Highlights:

      • [00:28:22] Circulation of information
        • Importance of disseminating information
        • Pooling knowledge
        • Legacy of the learned societies
      • [00:29:57] Collective intelligence
        • Facilitation and maieutics
        • Creating collaborative workspaces
        • Quality of facilitation
      • [00:31:02] Deliberation and decision-making
        • Importance of deliberation for good decisions
        • Paul Ricœur's definition of democracy
        • Working through contradictions
      • [00:35:02] Tensions and successes
        • Identifying tensions in governance
        • Conditions for success
        • Creating a learning community
      • [00:39:02] Practical example
        • Transformation of governance within the Réseau d'Échange et de Services aux Associations du Pays de Morlaix
        • Shift to a system of thematic circles
        • Participation and involvement of employees and volunteers

      These highlights cover the main topics addressed in the video, giving an overview of the challenges and solutions for effective associative governance.

      Video summary [00:54:11] - [01:19:33]:

      This part of the webinar deals with the management and organization of associations, with an emphasis on co-presidency and collective participation.

      Highlights:

      • [00:54:11] Introduction of co-presidency
        • Amendment of the bylaws in 2020
        • Importance of collective participation
        • Operating through thematic committees
      • [00:57:02] Training and participation
        • Annual training on collective management
        • Opening working groups to members
        • Importance of transparency and clarity
      • [01:00:00] Travel and cohesion
        • Budget for group travel
        • Strengthening ties among members
        • Importance of conviviality and enjoyment
      • [01:03:09] Integrating new members
        • Increase in the number of board members
        • Integration and support process
        • Maintaining transparency and trust
      • [01:09:09] Reflection on time and governance
        • Importance of time management
        • Opposition to neoliberalism
        • Practical tools for associative governance

      Video summary [01:19:36] - [01:46:07]:

      This video deals with the managerialization of associations and the challenges of collective management and ongoing training of members.

      Highlights:

      • [01:19:36] Sharing experiences
        • Importance of sharing failures
        • Encouraging collective discussion
        • Drawing on lessons learned
      • [01:22:01] Ongoing training
        • Training salaried teams
        • Importance of cooperation
        • Need to re-explain things to new members
      • [01:27:03] Employee follow-up
        • Organizing mediation meetings
        • Importance of well-being at work
        • Managing internal conflicts
      • [01:33:00] Role of trade unionism
        • Working conditions and working hours
        • Complementarity between associative and union engagement
        • Importance of internal democracy
      • [01:38:00] Size of associations
        • Impact of size on management
        • Importance of political will
        • Reflection on geography and scale of action

      Video summary [01:46:09] - [01:58:34]:

      This part of the webinar covers various aspects of the management and organization of associations, with an emphasis on the challenges and possible solutions.

      Highlights:

      • [01:46:09] Questions about the 3DS law
        • Impact of quality certifications
        • Sharing resources and expertise
        • Importance of the law for associations
      • [01:49:01] Reorganization of the GD
        • Inclusion of employees and beneficiaries
        • Partnership with funders
        • Protection of sole employees
      • [01:50:24] Participation of funders
        • Explaining projects to funders
        • Importance of including them on the board
        • Delegating responsibilities within the team
      • [01:53:06] Preventing conflicts of interest
        • Withdrawal of elected officials from association bodies
        • Importance of maintaining a strong link with funders
        • Anticipating legislative changes
      • [01:55:00] Conclusion and outlook
        • Collecting experiences and failures
        • Building a learning community
        • Invitation to share resources and continue the exchange

    1. Preventing Conflicts of Interest: Local Authorities and Associations

      Summary

      This briefing analyzes the legal and practical issues involved in preventing conflicts of interest in relations between local authorities and associations.

      Drawing on presentations by legal experts and trainers of elected officials, it highlights the criminal risks incurred and offers concrete recommendations.

      The critical points to remember are:

        1. A conflict of interest is not an offense but a warning sign. The situation becomes criminal when an elected official or public employee, aware of the conflict, does not recuse themselves and takes part in a decision. They then fall under the illegal taking of interest (prise illégale d'intérêt), a severely punished criminal offense (up to 5 years' imprisonment and a €500,000 fine).
        2. The notion of interest is extremely broad. It covers material interests but also moral or family interests. The official need not have enriched themselves personally, nor the local authority have suffered any harm; the mere appearance of compromised impartiality can be enough to establish the offense.
        3. The rule for elected officials involved in an association is "general recusal" (déport général). Whether they sit on the association's executive board in a personal capacity or as representatives of the municipality, they must refrain from any participation in a deliberation concerning that association. This recusal must be total:
           ◦ No participation in preparing the file.
           ◦ No participation in the debates.
           ◦ No participation in the vote.
           ◦ Physically leaving the council chamber during the debates and the vote.
        4. Local elected officials massively underestimate this risk. Field training sessions show that officials are mainly concerned with the technical aspects of grants, while the risk of a conflict of interest is often ignored, particularly in small municipalities, where the overlap between elected office and associative life is nonetheless greatest.
        5. Tools and good practices exist to secure the process. Primary responsibility lies with each elected official, who must continually assess their own situation. To secure decisions, it is recommended to vote on grants one by one, to systematically declare conflicts at the start of each session, and to rely on external resources such as the Haute Autorité pour la Transparence de la Vie Publique (HATVP) and the ethics officer (référent déontologue), now mandatory for all municipalities.

      1. The Legal Framework and Criminal Risks

      The legal analysis, presented by Luc Brunet of the Observatoire SMACL, stresses the need to distinguish two fundamental notions that are often confused.

      Basic Definitions: Conflict of Interest vs. Illegal Taking of Interest

      A conflict of interest is a situation, whereas the illegal taking of interest is a criminal offense that results from mishandling that situation.

      | Characteristic | Conflict of Interest | Illegal Taking of Interest |
      | --- | --- | --- |
      | Nature | A situation of interference between a public interest and other interests (public or private) that is likely to influence, or appear to influence, the exercise of a function. | A criminal offense: taking, receiving, or keeping, directly or indirectly, an interest likely to compromise one's impartiality. |
      | Legal source | Law of 11 October 2013 | Article 432-12 of the Penal Code |
      | Penalty | None (it is not an offense). The situation must be prevented or resolved. | Up to 5 years' imprisonment and a €500,000 fine. |

      "Conflicts of interest are part of life. We all have conflicts of interest. [...] Where it stops being normal [...] is when you tell yourself, 'I am certainly not going to say that I am in a conflict of interest.' That is when you cross the line and move [...] over to the Penal Code, with the offense of illegal taking of interest." - Luc Brunet

      The Broad Scope of the Illegal Taking of Interest

      The illegal taking of interest is the number one offense for which local elected officials are prosecuted. Its scope is particularly wide:

      • All areas: Unlike the offense of favoritism (limited to public procurement), it applies to every decision of a local authority: urban planning, recruitment, sale of property, and notably grants to associations.

      • Moral or family interest: The interest need not be material or financial.

      • No harm required: The offense is established even if the local authority suffered no harm, or even benefited from the transaction.

      • Indirect interests: The offense covers interests taken through an intermediary (spouse, ascendants, descendants, but also close friends).

      Case law takes a very broad view: "the offense stops where suspicion stops".

      • The notion of appearance: It is not enough to avoid being in a conflict of interest; one must also avoid appearing to be in one.

      The Doctrine of the Haute Autorité pour la Transparence de la Vie Publique (HATVP)

      The HATVP has established a doctrine to clarify the levels of risk. For relations with associations, the risk is considered broad.

      • Red Zone (Broad Risk): Covers an elected official's participation in a private-law body, such as an association, whether in a personal capacity or as a representative of the municipality.

      • Rule Applied: General recusal. The official concerned must refrain from taking part in any deliberation relating to that body, even when no direct financial stake is involved.

      Member or Officer: A Crucial Distinction?

      The question arises whether an ordinary member is subject to the same rules as a member of the executive board (president, treasurer, etc.).

      • HATVP Position (Opinion of 3 May 2022): Simply being a member does not justify systematic recusal.

      However, a case-by-case analysis must be carried out, based on the nature of the association, its number of members, and the subject of the deliberation.

      • Advice of Caution: Given the uncertainty of a case-by-case analysis, ordinary members are advised, as a safety measure, to recuse themselves systematically when a grant is voted on.

      2. Practical Rules and Recommendations

      Prevention rests on a rigorous and transparent approach.

      The Four Steps of Prevention

      • 1. Identify at-risk situations: Elected officials must ask themselves the right questions about their personal, family, or associative ties to the local authority's files.

      • 2. Declare the conflict of interest: In accordance with the Charter of the Local Elected Official (Charte de l'élu local), officials must disclose their personal interests before the debate and the vote.

      • 3. Recuse oneself completely: Recusal is not limited to not voting. The official must take part neither in preparing the file nor in the debates that precede the vote.

      • 4. Do not exert influence: The official must refrain from any intervention, even informal ("pulling strings behind the scenes").

      Case Law: Concrete and Striking Examples

      Two cases illustrate how severely the courts treat this offense:

      The mayor of Plougastel-Daoulas: Elected officials who sat on the executive board of an ad hoc association did not take part in the vote on the grant, but remained in the room.

      That fact alone was judged sufficient to establish influence and led to their conviction for illegal taking of interest.

      A rural municipality of 250 inhabitants: Elected officials who sat on the executive board of an association organizing the village festival took part in the vote on a €250 grant.

      They were convicted of illegal taking of interest following a complaint by a political opponent.

      These examples show that neither good faith, nor the pursuit of the general interest, nor the small size of the grant protects against a conviction.

      Recommendations for Securing Deliberations

      • No block vote: Grants to associations must be voted on one by one, never as a package.

      • Leave the room: The official concerned must physically leave the council chamber before the debates begin and return only once the agenda item has been dealt with. This exit must be recorded in the minutes.

      • Introduce an ethics "round table": At the start of each council meeting, the mayor can ask each official to report any conflicts of interest relating to the agenda.

      3. Testimony from the Field: Between Lack of Awareness and Difficulties of Application

      The testimony of Sophie Van migom, director of a training center for elected officials, reveals a significant gap between the legal requirements and how officials perceive them on the ground.

      Limited Awareness among Elected Officials

      During training sessions, officials' concerns are mostly technical (agreements, loans of equipment, financial oversight).

      The risk of a conflict of interest is very rarely raised spontaneously, particularly by officials from small municipalities.

      "Out of 90 participants, only two elected officials mentioned conflicts of interest to me. [...] Officials in small municipalities do not ask themselves the question, even though there is bound to be overlap between their elected office, their family life, and their associative life." - Sophie Van migom

      Practical Consequences and Operational Challenges

      Strict application of the recusal rules can create operating difficulties:

      • Quorum problems: In a municipality of 620 inhabitants, introducing strict recusal rules led half of the municipal council to leave the room, so that the quorum could not be reached. The only solution is to reconvene the council, which delays the decision.

      • Paralysis of officials' action: An official elected for their associative expertise (e.g., the president of the parents' association who becomes deputy mayor for schools) may find themselves unable to act on the very files for which they were elected.
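
The quorum problem described above comes down to simple arithmetic. The sketch below is illustrative only: it assumes the quorum is a strict majority of the members in office, and the council size of 15 is a hypothetical figure, since the source gives neither the rule nor the number of councillors. The function names are likewise invented for this example.

```python
def quorum_required(members_in_office: int) -> int:
    """Smallest attendance that forms a strict majority (assumed quorum rule)."""
    return members_in_office // 2 + 1


def quorum_met(members_in_office: int, recused: int, absent: int = 0) -> bool:
    """Return True if enough members remain in the room to deliberate."""
    if recused < 0 or absent < 0 or recused + absent > members_in_office:
        raise ValueError("recused and absent must fit within members_in_office")
    present = members_in_office - recused - absent
    return present >= quorum_required(members_in_office)


if __name__ == "__main__":
    council_size = 15  # hypothetical council size, not stated in the source
    for recused in (7, 8):
        print(f"{recused} recused -> quorum met: {quorum_met(council_size, recused)}")
```

Running the check before the meeting, with the list of declared conflicts in hand, would show in advance whether a vote is likely to be blocked and need to be rescheduled.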

      Double Sanctions: Criminal and Administrative

      Failure to follow the recusal rules exposes the official and the local authority to a double risk:

      1. The criminal risk: The official is prosecuted for illegal taking of interest and the mayor for complicity.

      2. The administrative risk: The deliberation itself is unlawful.

      It can be annulled by the administrative court following an appeal by an opponent, a taxpayer, or the prefect. The association could then be required to repay the grant it received.

      4. Tools and Good Practices

      The Personal Responsibility of the Elected Official

      It is up to each elected official to assess their own situation, inform the mayor and the council, and decide to recuse themselves.

      This reflection should take place at the very start of the term of office, to clarify the limits of their role.

      Decision Aids

      Elected officials do not face these complex questions alone. They can turn to:

      • The Haute Autorité pour la Transparence de la Vie Publique (HATVP): An official can refer a personal situation to the HATVP for a quick, confidential opinion.

      • The ethics officer (référent déontologue): Appointing one is mandatory for all local authorities. The officer gives an opinion that goes beyond strict law and addresses questions of probity and exemplary conduct.

      Specific Cases Discussed

      • Local authority employees: They are also covered by the offense.

      If they have a conflict of interest on a file (e.g., processing a public contract for a relative's company), they must report it to their superiors so that the file can be taken off their hands.

      • In-kind grants: Providing premises, equipment, or staff counts as a benefit and follows exactly the same recusal rules as financial grants.

      • "Transparent" associations: An association that is in reality only an extension of the local authority (e.g., all decisions are made by the municipality) raises major legal problems.

      All the rules that apply to the local authority (public accounting, public procurement) then apply to it, creating a high legal risk.

    1. Reviewer #2 (Public review):

      Summary:

      In this article, the authors investigate enhancing the therapeutic and regenerative properties of mesenchymal stem cells (MSCs) through genetic modification, specifically by overexpressing genes involved in the glycogen synthesis pathway. By creating a non-phosphorylatable mutant form of glycogen synthase (GYSmut), the authors successfully increased glycogen accumulation in MSCs, leading to significantly improved cell survival under starvation conditions. The study highlights the potential of glycogen engineering to improve MSC function, especially in inflammatory or energy-deficient environments. However, critical gaps in the study's design, including the lack of validation of key findings, limited differentiation assessments, and missing data on MSC-GYSmut resistance to reactive oxygen species (ROS), necessitate further exploration.

      Strengths:

      (1) Novel Approach: The study introduces an innovative method of enhancing MSC function by manipulating glycogen metabolism.

      (2) Increased Glycogen Storage: The genetic modification of GYS1, resulting in GYSmut, significantly increased glycogen accumulation, leading to improved MSC survival under starvation, which has strong implications for enhancing MSC therapeutic properties in energy-deficient environments.

      (3) Potential Therapeutic Impact: The findings suggest significant therapeutic potential for MSCs in conditions that require improved survival, persistence, and immunomodulation, especially in inflammatory or energy-limited settings.

      (4) In Vivo Validation: The in vivo murine model of pulmonary fibrosis demonstrated the improved survival and persistence of MSC-GYSmut, supporting the translational potential of the approach.

      Weaknesses:

      (1) Lack of Differentiation Assessments: The study did not evaluate key MSC differentiation pathways, including chondrogenic and osteogenic differentiation. The absence of analysis of classical MSC surface markers and multipotency limits the understanding of the full potential of MSC-GYSmut.

      (2) Missing Validation of RNA Sequencing Data: Although RNA sequencing data revealed promising transcriptomic changes in chondrogenesis and metabolic pathways, these findings were not experimentally validated, limiting confidence.

      (3) Lack of ROS Resistance Analysis: Resistance to reactive oxygen species (ROS), an important feature for MSCs under regenerative conditions, was not assessed, leaving out a critical aspect of MSC function.

      (4) Limited Exploration of Immunosuppressive Properties: The study did not address the immunosuppressive functions of MSC-GYSmut, which are critical for MSC-based therapies in clinical settings.

      Conclusion:

      The study presents an exciting new direction for enhancing MSC function through glycogen metabolism engineering. While the results show promise, key experiments and validations are missing, and several areas, such as differentiation capacity, ROS resistance, and immunosuppressive properties, require further investigation. Addressing these gaps would solidify the conclusions and strengthen the potential clinical applications of MSC-GYSmut in regenerative medicine.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review)

      (1) Glycogen biosynthesis typically involves several enzymes. In this context, could the authors comment on the effect of overexpressing a single enzyme - especially a mutant version - on the structure or quality of the glycogen synthesized?

      While quantitative molecular weight analysis of synthesized glycogen was not performed, we documented changes in glycogen particle morphology. GYSmut overexpression resulted in significantly enlarged singular glycogen granules, suggesting potential high molecular mass, while GYS-GYG co-overexpression in MSCs (GYG being the essential enzyme for glycogen synthesis initiation) produced a diffuse glycogen distribution pattern rather than particulate structures. We have incorporated this result as new Figure S2C.

      These results suggest that overexpression of specific glycogen-metabolizing enzymes significantly influences glycogen structure. Consequently, targeted modulation of glycogen architecture and properties through key enzymes represents a potential avenue for future investigation.

      (2) Regarding the in vitro starvation experiments (Figure 2C), what oxygen conditions (pO₂) were used? Are these conditions physiologically relevant and representative of the in vivo lung microenvironment?

      Our in vitro starvation experiments (Figure 3C) were conducted under normoxic conditions (21% O₂). The oxygen concentration in human lungs is physiologically lower than atmospheric levels, with healthy individuals exhaling air containing approximately 16% oxygen (Thalakkotur Lazar Mathew, Diagnostics 2015). To our knowledge, direct measurements of alveolar oxygen concentration in pulmonary fibrosis are rare. Therefore, to evaluate the performance of GYSmut under hypoxic conditions, Figure S2 in the revised manuscript has been expanded to include an assessment of cell performance under combined hypoxia (oxygen concentration < 5%) and nutrient deprivation stress, which further corroborates the superiority of the GYSmut group over the control under different oxygen concentrations.

      (3) In the in vitro model, how many hours does it take for the intracellular glycogen reserve to be completely depleted under starvation conditions?

      While quantitative cell viability data were recorded only up to 72 hours post-implantation (Fig 3C), we still observed viable cells at approximately 96 hours. The presence of glycogen particles correlated with sustained cell viability. However, reliable quantitative assessment of glycogen became increasingly challenging once viable cells were significantly depleted, which limited our measurements at later time points.

      (4) For the in vivo model, is there a quantitative analysis of the survival kinetics of the transplanted cells over time for each group? This would help to better assess the role and duration of glycogen stores as an energy buffer after implantation.

      We tracked the in vivo distribution and persistence of implanted MSCs using enzymatic activity quantification assays (using Gluc luciferase assay) and live animal imaging (using Akaluc luciferase). The revised manuscript includes quantitative analysis of the in vivo fluorescence imaging data, which has been supplemented as Figure S4. Glycogen-engineered MSCs and control cells were quantitatively assessed at three discrete time points post-implantation. This quantification revealed a transient divergence in cell viability between the experimental and control groups around day 7. However, fluorescence in both cohorts subsequently declined to similar levels over the extended observation period.

      (5) Finally, the study was performed in male mice only. Could sex differences exist in the efficacy or metabolism of the engineered MSCs? It would be helpful to discuss whether the approach could be expected to be similarly effective in female subjects.

      We appreciate the reviewer’s important question regarding potential sex differences. Our study used male mice based on three key considerations: 1) Clinical Relevance: Idiopathic pulmonary fibrosis (IPF) shows significant male predominance, with diagnosis rates 3.5-fold higher in men (37.8% vs 10.6%, p<0.0001) and greater diagnostic confidence (Assayag et al., Thorax 2020). 2) Model Consistency: The bleomycin model (our chosen method) demonstrates more consistent fibrotic responses in male mice (Gul et al., BMC Pulm Med 2023). 3) Biological Rationale: Estrogen’s protective effects in females may confound therapeutic assessments (cited in Assayag et al.).

      We fully acknowledge this limitation and will include female subjects in subsequent translational studies. The therapeutic principle should theoretically apply to both sexes, but we agree this requires experimental validation.

      (6) The number of mice for each group and time point should be specified.

      The manuscript text has been revised to enhance clarity, and the number of mice for each group and time point has been specified (line 170 to 182).

      Reviewer #2 (Public Review):

      (4) Inconsistencies in In Vivo Data: There is a discrepancy between the number of animals shown in the figures and the graph (three individuals vs. five animals), as well as missing details on how luciferase signal intensity was quantified, requiring further clarification.

      To assess MSC survival in vivo, we employed two strategies utilizing distinct luciferases optimized for specific detection modalities. MSC viability was quantified ex vivo through Gaussia luciferase (Gluc) activity, leveraging its high sensitivity and established commercial assay kits (n = 3 mice per group per time point). For non-invasive longitudinal tracking within living animals, MSC distribution and viability were monitored via in vivo bioluminescence imaging using Akaluc luciferase, selected for its superior tissue penetration and sensitivity in situ (n = 5 mice per group). The manuscript text has been revised to enhance clarity, and the experimental protocols for luciferase signal detection and quantification have been added to the Methods.

      (1) (2) (3) (5):

      We fully agree that further investigation into the functional consequences of glycogen engineering in MSCs – encompassing core cellular functions, immunomodulatory properties, and associated signaling pathways – is important to fully elucidate the underlying mechanisms. Cellular metabolism is intrinsically intertwined with diverse physiological processes. Consequently, we believe that glycogen engineering exerts multifaceted effects on MSCs, likely extending beyond the modulation of any single specific pathway. Studying the metabolic perturbation induced by such engineering approaches in mammalian cells represents an interesting field. The exploration of these aspects remains a long-term research objective within our group.

      Reviewer #2 (Recommendations for the authors):

      (6) Clarification of Data in the Murine Model:

      In Figure 4B, there is a discrepancy between the number of animals shown in the image (five) and those represented in the graph (three). This discrepancy needs clarification. Additionally, the study lacks information regarding the intensity of the signal in the luciferase assays. It is unclear how luciferase expression in the mice was quantified, and providing this detail would enhance the understanding of the data presented.

      We sincerely appreciate these valuable suggestions. We have revised the relevant text for greater clarity. Figure 4B and Figure 4C present results from two distinct experimental approaches, each employing a different luciferase reporter and measurement methodology, and different numbers of mice were used in the two experiments.

      Quantitative data derived from the in vivo bioluminescence imaging have been added as Figure S4. The experimental protocols for luciferase signal detection and quantification have been added to the Methods.

      Regarding the other recommendations of Reviewer 2:

      We sincerely appreciate your valuable insights, which demonstrate your deep expertise. We fully agree that beyond nutrient availability, factors such as reactive oxygen species (ROS) and the immune microenvironment are also critical limitations affecting the survival and therapeutic efficacy of implanted MSCs.

      We propose that glycogen engineering exerts broad effects on MSCs. These effects manifest as changes in multiple cellular characteristics, including proliferation, differentiation, surface marker expression, antioxidant capacity, and immunomodulatory activity – all crucial factors for the therapeutic purpose of MSCs.

      We believe these changes likely involve complex networks of interconnected regulatory factors. The underlying mechanisms might be clarified through proteomic and metabolomic profiling.

      However, comprehensively investigating these interconnected aspects requires significant time and resources. Some components of this research extend beyond the current scope of our project. Nevertheless, exploring these mechanisms remains an important objective, and we will actively work to investigate them further in our ongoing studies.

    1. The same goes for the features of digital media, such as action buttons, clickable areas, and social features for rating, sharing, etc. They "enable" us (in the English sense of the word) as much as they constrain and program us. It is through these affordances that we have become, without knowing it, "media agents"3.

      The importance of the interface in the digital realm

    1. Ask yourself and others in your program the following:
         1. Is the policy practical?
         2. Is the policy age-appropriate for all the children you care for and for your environment?
         3. Will center based staff (or family child care assistant if program is family child care) be able to incorporate the policy and procedures into the daily operations of the program? What training may they need?
         4. Is the information in the policy accessible and easy to use?
         5. Does the policy do what it’s intended to do regarding the children’s health and safety?

      I feel these are great questions for getting feedback from your colleagues and making sure we have covered everything for the kids' safety. I believe the last question is crucial because our job is to teach, but more than that, it is to provide an environment that keeps the kids safe and healthy.

    1. Adler, Mortimer J. 1940. “How to Mark a Book.” Saturday Review of Literature 6: 250–52. https://www.unz.com/print/SaturdayRev-1940jul06-00011/ (January 11, 2023).

      Annotations: https://via.hypothes.is/https://docdrop.org/download_annotation_doc/Adler---1940---How-to-Mark-a-Book-fehef.pdf

      Annotations alternate: https://jonudell.info/h/facet/?user=chrisaldrich&max=100&exactTagSearch=true&expanded=true&url=https%3A%2F%2Fdocdrop.org%2Fdownload_annotation_doc%2FAdler---1940---How-to-Mark-a-Book-fehef.pdf

      Prior [.pdf copy](https://stevenson.ucsc.edu/academics/stevenson-college-core-courses/how-to-mark-a-book-1.pdf):
      - Annotations: https://hypothes.is/users/chrisaldrich?q=url%3Ahttps%3A%2F%2Fstevenson.ucsc.edu%2Facademics%2Fstevenson-college-core-courses%2Fhow-to-mark-a-book-1.pdf
      - Alternate annotation link: https://jonudell.info/h/facet/?user=chrisaldrich&max=100&exactTagSearch=true&expanded=true&url=https%3A%2F%2Fstevenson.ucsc.edu%2Facademics%2Fstevenson-college-core-courses%2Fhow-to-mark-a-book-1.pdf

      Summary

      • Marking a book helps in increasing "the most efficient kind of reading."
      • The marked (pun intended) difference between physical vs. intellectual ownership of books
      • 3 types of book owners:
          1. collector of wood pulp and ink
          1. one who's read most and dipped into some
          1. one who's annotated and sucked the marrow out of them
      • Active reading (annotating and staying awake) and engaging deeply, arguing with, and questioning the author is the point of reading.
      • A historical record of your active reading allows you to continue the conversation you've had with the author and yourself. (p12)
      • Adler's method of reading and marking:
        1. Underlining major points of importance
        2. Vertical lines for emphasis
        3. Marginal marks (stars, asterisks, etc.) (10-20 per book) to indicate the most important statements in conjunction with dogearing these pages for making it easier to find them subsequently
        4. Numbers in the margin to sequence arguments
        5. Page numbers in the margin for linking ideas across pages, ostensibly for juxtaposing them later
        6. Circling key words or phrases (unsaid here, but this is helpful for indexing as well as helping one to come to terms with the author)
        7. Marginal writing for synopsis of sections as well as questions raised by the text; use of endpapers for a personal index of ideas presented chronologically throughout the book
      • Objections to marking books:
        • Using scratch pad (or index cards, which he doesn't mention specifically, but which could be implied) so as not to destroy a precious or rare physical copy (this is a repetition from earlier in the article)
        • Marking slows you down. This is part of the point! Slowing down makes you engage with the author and get more out of the text.
        • You can't loan books because they contain your important thoughts which you don't want to give away (and lose the historical record of your thinking). Solution: Simply require friends to buy their own copy.
    1. Reviewer #2 (Public review):

      Summary:

      The authors test how sample size and demographic balance of reference cohorts affect the reliability of normative models in ageing and Alzheimer's disease. Using OASIS-3 and replicating in AIBL, they change age and sex distributions and number of samples and show that age alignment is more important than overall sample size. They also demonstrate that models adapted from a large dataset (UK Biobank) can achieve stable performance with fewer samples. The results suggest that moderately sized but demographically well-balanced cohorts can provide robust performance.

      Strengths:

      The study is thorough and systematic, varying sample size, age, and sex distributions in a controlled way. Results are replicated in two independent datasets with relatively large sample sizes, thereby strengthening confidence in the findings. The analyses are clearly presented and use widely applied evaluation metrics. Clinical validation (outlier detection, classification) adds relevance beyond technical benchmarks. The comparison between within-cohort training and adaptation from a large dataset is valuable for real-world applications.

      The work convincingly shows that age alignment is crucial and that adapted models can reach good performance with fewer samples. However, some dataset-specific patterns (noted above) should be acknowledged more directly, and the practical guidance could be sharper.

      Weaknesses:

      The paper uses a simple regression framework, which is understandable for scalability, but limits generalization to multi-site settings where a hierarchical approach could better account for site differences. This limitation is acknowledged; a brief sensitivity analysis (or a clearer discussion) would help readers weigh trade-offs. Other than that, there are some points that are not fully explained in the paper:

      (1) The replication in AIBL does not fully match the OASIS results. In AIBL, left-skewed age sampling converges with other strategies as sample size grows, unlike in OASIS. This suggests that skew effects depend on where variability lies across the age span.

      (2) Sex imbalance effects are difficult to interpret, since sex is included only as a fixed effect, and residual age differences may drive some errors.

      (3) In Figure 3, performance drops around n≈300 across conditions. This consistent pattern raises the question of sensitivity to individual samples or sub-sampling strategy.

      (4) The total outlier count (tOC) analysis is interesting but hard to generalize. For example, in AIBL, left-skew sometimes performs slightly better despite a weaker model fit. Clearer guidance on how to weigh model fit versus outlier detection would strengthen the practical message.

      (5) The suggested plateau at n≈200 seems context-dependent. It may be better to frame sample size targets in relation to coverage across age bins rather than as an absolute number.

    2. Author response

      We would like to thank the editors and two reviewers for the assessment and the constructive feedback on our manuscript, “Toward Robust Neuroanatomical Normative Models: Influence of Sample Size and Covariates Distributions”. We appreciate the thorough reviews and believe the constructive suggestions will substantially strengthen the clarity and quality of our work. We plan to submit a revised version of the manuscript and a full point-by-point response addressing both the public reviews and the recommendations to the authors. 

      Reviewer 1. 

      In revision, we plan to address the reviewer’s comments by: (i) strengthening the interpretation of model fit through reporting the proportion of healthy controls within and outside the extreme percentile bounds; (ii) adding age-resolved overlays of model-derived percentile curves compared to those from the full reference cohort for key sample sizes and regions; (iii) quantifying age-distribution alignment between train and test set; and (iv) summarizing model performance as a joint function of age-distribution alignment and sample size.

      Reviewer 2. 

      In the revised manuscript, we will (i) expand the Discussion to more clearly outline the trade-offs between simple regression frameworks and hierarchical models for normative modeling (e.g., scalability, handling of multi-site variation, computational considerations), and discuss alternative approaches and harmonization as important directions for multi-site settings; (ii) contextualize OASIS-3 vs AIBL differences by quantifying train–test age-alignment across sampling strategies and emphasize that skewness should be interpreted relative to the target cohort’s alignment rather than absolute numbers; (iii) reassess sex-imbalance effects by reporting expected age distributions per condition and re-evaluate sex effects while controlling for age; (iv) investigate the apparent dip at n≈300 by increasing sub-sampling seeds, testing neighboring sample sizes, and using an alternative age-binning scheme to clarify the observed artifact; (v) clarify potential divergence between tOC separation and global fit under discrepancies in demographic distributions and relate tOC to age-alignment distance; (vi) reframe the sample-size guidance in terms of distributional alignment rather than an absolute n.

    1. Reviewer #2 (Public review):

      Summary:

      Sereesongsaeng et al. aimed to develop degraders for LMO2, an intrinsically disordered transcription factor activated by chromosomal translocation in T-ALL. The authors first focused on developing biodegraders, which are fusions of an anti-LMO2 intracellular domain antibody (iDAb) with cereblon. Following demonstrations of degradation and collateral degradation of associated proteins with biodegraders, the authors proceeded to develop PROTACs using antibody paratopes (Abd) that recruit VHL (Abd-VHL) or cereblon (Abd-CRBN). The authors show dose-dependent degradation of LMO2 in LMO2+ T-ALL cell lines, as well as concomitant dose-dependent degradation of associated bHLH proteins in the DNA-binding complex. LMO2 degradation via Abd-VHL was also determined to inhibit proliferation and induce apoptosis in LMO2+ T-ALL cell lines.

      Strengths:

      The topic of degrader development for intrinsically disordered proteins is of high interest and the authors aimed to tackle a difficult drug target. The authors evaluated methods including the development of biodegraders, as well as PROTACs that recruit two different E3 ligases. The study includes important chemical control experiments, as well as proteomic profiling to evaluate selectivity.

      Weaknesses:

      Several weaknesses remain in this study:

      (1) The overall degradation achieved is not highly potent (although important proof-of-concept);

      (2) The mechanism of collateral degradation is not completely addressed. The authors acknowledge possible explanations, which would require mutagenesis and structural studies to further dissect;

      (3) The proteomics experiments do not detect LMO2, which the authors attribute to its size, making it difficult to interpret.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review): 

      Summary:

      The authors describe the degradation of an intrinsically disordered transcription factor (LMO2) via PROTACs (VHL and CRBN) in T-ALL cells. Given the challenges of drugging transcription factors, I find the work solid and a significant scientific contribution to the field. 

      Strengths: 

      (1) Validation of LMO2 degradation by starting with biodegraders, then progressing to chemical degraders. 

      (2) Interrogation of the biology and downstream pathways upon LMO2 degradation (collateral degradation).

      (3) Cell line models that depend on or overexpress LMO2 vs LMO2 null cell lines. 

      (4) CRBN and VHL-derived PROTACs were synthesized and evaluated. 

      Weaknesses: 

      (1) The conventional method used to characterize PROTACs in the literature is to calculate the DC50 and Dmax of the degraders; I did not find this information in the manuscript. 

      As noted in the reply to the referee’s point 4 below, our first-generation compounds are not highly potent. The DC<sub>50</sub> values were computed from the Western blot data shown in Fig. 2. Supplementary Fig. S3 of the revised version shows the quantified Western blot data from a time course of KOPT-K1 cells treated with either Abd-CRBN or Abd-VHL; the 24-hour blots are shown in Figure 2, G and E, and the corresponding 24-hour quantification is shown in Supplementary Fig. S3. From these data, the DC<sub>50</sub> values (9 μM for Abd-CRBN and 15 μM for Abd-VHL) are included in the main text and in the legend of Supplementary Fig. S3.

      In addition, the loss of signal of the LMO2-Rluc reporter protein from PROTAC-treated cells shown in Fig. 2M has been used to calculate a half-point of degradation; although strictly not a DC<sub>50</sub>, as it measures a reporter protein, this yielded values of 10 μM for Abd-CRBN and 9 μM for Abd-VHL. 
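
      The response does not describe how the half-point was computed from the quantified signal. As a rough sketch under that caveat, one simple approach is to interpolate on log concentration between the two doses that bracket 50% remaining signal, and to read Dmax as the largest observed loss. The function names and the dose-response values below are hypothetical and are not data from the study.

```python
import math


def half_point(concs_um, remaining):
    """Concentration (μM) at which the remaining signal first reaches 50% of control.

    concs_um: ascending, positive concentrations.
    remaining: signal at each concentration as a fraction of the vehicle control.
    Interpolates linearly on log10(concentration) between bracketing doses
    (an illustrative choice). Returns None if 50% is never crossed.
    """
    if len(concs_um) != len(remaining):
        raise ValueError("concentrations and signals must have equal length")
    points = list(zip(concs_um, remaining))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        if r_lo > 0.5 >= r_hi:
            frac = (r_lo - 0.5) / (r_lo - r_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None


def d_max(remaining):
    """Maximum observed degradation as a fraction of the vehicle control."""
    return 1.0 - min(remaining)


# Hypothetical dose-response values, for illustration only.
doses = [1.0, 3.0, 10.0, 30.0]
signal = [0.9, 0.7, 0.5, 0.3]
dc50_estimate = half_point(doses, signal)
dmax_estimate = d_max(signal)
```

      With few doses, this interpolation is only as precise as the spacing of the bracketing concentrations; fitting a four-parameter logistic curve is the more common choice when enough points are available.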

      (2) The proteomics data is not very convincing, and it is not clear why LMO2 does not show in the volcano plot (were higher concentrations of the PROTAC tested? and why only VHL was tested and not CRBN-based PROTAC?).

      Due to the relatively small size of the LMO2 protein, it is challenging to produce enough unique peptides for reliable identification, especially to distinguish some proteins in the LMO2 complex.  

      (3) The correlation between degradation potency and cell growth is not well-established (compare Figure 4C: P12-Ichikawa blots show great degradation at 24 and 48 hrs, but it is unclear if the cell growth in this cell line is any better than in PF-382 or MOLT-16) - Can the authors comment on the correlation between degradation and cell growth?  

      In this study (Fig. 4) we did not aim to compare the effect of LMO2 loss on cell growth among LMO2-positive cells. Rather, we aimed to evaluate the importance of LMO2 for cell growth in LMO2-expressing T-ALL cells compared to non-expressing cells and to correlate the loss of the protein with this effect on cell growth. In addition, treatment with the LMO2 compounds did not show an effect on LMO2-negative cells until at least 48 hours of treatment, indicating low toxicity of our PROTAC compounds and providing a correlation between LMO2 loss and cell growth. 

      (4) The PROTACs are not very potent (double-digit micromolar range?) - can the authors elaborate on any challenges in the optimization of the degradation potency? 

      The Abd methodology, which uses intracellular domain antibodies to screen for compounds that bind to intrinsically disordered proteins such as the LMO2 transcription factor, offers a tractable approach to hard drug targets but, in so doing, creates challenges for improving potency that differ from those for targets with available structural data. LMO2 is an intrinsically disordered protein, for which soluble recombinant protein is not readily available to identify the binding pocket of compounds. The potency has so far been optimized solely on the basis of the different moieties substituted in cell-based SAR studies (http://advances.sciencemag.org/cgi/content/full/7/15/eabg1950/DC1), and all new compounds were tested with BRET assays. Thus, optimization of the degradation potency (including properties such as improved solubility) for the LMO2-binding compounds currently relies on chemical modification of the three areas of the compounds indicated in Fig. 2 B,C.  

      (5) The authors mentioned trying six iDAb-E3 ligase proteins; I would recommend listing the E3 ligases tried and commenting on the results in the main text. 

      The six chimaeric iDAb-E3 ligase proteins comprised one anti-LMO2 iDAb and three different E3 ligases, each fused at either the N- or the C-terminus of the VH (giving six protein formats). These six fusion proteins were described in the text referring to the degrader studies described in Supplementary Fig. 1. 

      Reviewer #2 (Public review): 

      Summary: 

      Sereesongsaeng et al. aimed to develop degraders for LMO2, an intrinsically disordered transcription factor activated by chromosomal translocation in T-ALL. The authors first focused on developing biodegraders, which are fusions of an anti-LMO2 intracellular domain antibody (iDAb) with cereblon. Following demonstrations of degradation and collateral degradation of associated proteins with biodegraders, the authors proceeded to develop PROTACs using antibody paratopes (Abd) that recruit VHL (Abd-VHL) or cereblon (Abd-CRBN). The authors show dose-dependent degradation of LMO2 in LMO2+ T-ALL cell lines, as well as concomitant dose-dependent degradation of associated bHLH proteins in the DNA-binding complex. LMO2 degradation via Abd-VHL was also determined to inhibit proliferation and induce apoptosis in LMO2+ T-ALL cell lines. 

      Strengths: 

      The topic of degrader development for intrinsically disordered proteins is of high interest, and the authors aimed to tackle a difficult drug target. The authors evaluated methods, including the development of biodegraders, as well as PROTACs that recruit two different E3 ligases. The study includes important chemical control experiments, as well as proteomic profiling to evaluate selectivity. 

      Weaknesses: 

      The overall degradation is relatively weak, and the mechanism of potential collateral degradation is not thoroughly evaluated

      The purpose of the study was to evaluate the effects of LMO2 degraders. The mechanism of the observed collateral degradation could not be investigated directly within the scope of our study. In the main text, we discussed two possible, not mutually exclusive, explanations. One is that our work (and previously published, cited work) indicates that the DNA-binding bHLH proteins have relatively short half-lives (Supplementary Fig. S12) and may therefore be subject to normal turnover when the LMO2 in the complex turns over. Further, the known structure of the LMO2-bHLH interactions (from Omari et al, doi: 10.1016/j.celrep.2013.06.008) was also examined for the location of lysines in the TAL1 & E47 partners (Supplementary Fig. S11). It is possible that their local association with the LMO2-E3-ligase complex created by the PROTAC interaction could cause their concurrent degradation. Mutagenesis and structural analysis would be needed to establish this point.

      In addition, experiments comparing the authors' prior work with their anti-LMO2 iDAb or Abl-L are lacking, which would improve our understanding of the potential advantages of a degrader strategy for LMO2.  

      A major motivation behind developing the Antibody-derived (Abd) method to select compounds, which are surrogates of the antibody paratope, is that using iDAbs directly as inhibitors requires the development of delivery technologies for these macromolecules, either as protein directly or as vectors or mRNA for their expression. Ultimately, high-affinity anti-LMO2 iDAbs should be used directly as tractable inhibitors when delivery methods are developed. In the meantime, Abd compounds were envisaged as surrogates suitable for development into reagents, and potentially drugs, by medicinal chemistry. We evaluated selected first-generation LMO2-binding Abd compounds previously, finding that they interfere with the LMO2-iDAb BRET signal with an EC<sub>max</sub> of about 50%, but these compounds lack the potency to affect the interaction of LMO2 with a non-mutated iDAb (nM affinity). These data indicated that efficacy improvement for the PROTACs was needed. In addition, in the current study, we observed viability effects in T-ALL lines at high concentrations (20 μM) irrespective of LMO2 expression (Supplementary Fig. S2A, B). These data also indicated that efficacy improvement was needed and that converting the compounds to degraders (PROTACs) could potentially add to in-cell potency. By adding the E3 ligase ligands, we found that the toxicity in non-LMO2-expressing Jurkat cells was significantly reduced (Supplementary Fig. S2E, F). 

      Reviewer #2 (Recommendations for the authors): 

      Suggestions for additional experiments: 

      (1) The data presented is primarily focused on demonstrating targeted degradation of LMO2, with a focus on phenotypes such as proliferation and apoptosis. In this manuscript, there are limited comparative evaluations of anti-LMO2 iDAb or Abl-L to show the potential benefits of a degrader approach over their previously described work, as well as why targeted degradation is, in fact, advantageous. For example, the authors' previous work has shown that anti-LMO2 iDAb inhibits tumor growth in a mouse transplantation model. Comparisons in vitro would be supportive of the importance of continued degrader optimization/development.  

      We have previously shown that an anti-LMO2 scFv inhibits tumour growth in a mouse model, but this work used an expressed scFv antibody that binds to LMO2 in the nM range. The Abd compounds have much lower potency than the antibody and, because recombinant LMO2 is difficult to work with, we could only evaluate interactions of compounds with LMO2 in cell-based assays such as BRET (LMO2-iDAb BRET). In this cell-based assay, the first-generation Abd compounds do not have sufficient potency to block the LMO2-iDAb interaction unless the affinity of the iDAb is reduced to sub-μM. The justification for proceeding with the degrader approach rather than relying on protein-protein interaction (PPI) inhibition alone was based largely on the low potency of the first-generation PPI compounds in cell assays and on the expectation that combining protein degradation with PPI inhibition would enhance efficacy.

      In addition, the viability experiments are also very short-term; is there a reason why the authors did not carry out these experiments for 3-5 days to fully understand the impacts on proliferation? 

      In Supplementary Fig. S5, we did show assays up to 3 days. In KOPT-K1 (LMO2+), the LMO2 levels were reduced during the time course of this assay (from a single compound dose at time zero) (Supplementary Fig. S5 A, B). We also show CellTiter-Glo assays up to 3 days and, with these second-generation compounds, we observed sustained effects on KOPT-K1 (LMO2+) but low non-DMSO toxicity in Jurkat (LMO2-) (revised version Supplementary Fig. S5 C, D).

      (2) The potential mechanism of collateral degradation is interesting and important in evaluating the on-target responses and consequences of degrading LMO2. At this time, the data supporting collateral degradation is limited and would be strengthened by showing that it is not due to a change in mRNA levels and not due to complex dissociation. Overall, the kinetics and depth of loss of complex members such as E47 in Figure 3 appear more substantial than LMO2 itself, and as presented, collateral degradation is not effectively demonstrated. In addition, to aid in the readers' assessments, additional background and references around the roles of TAL1 and E47 would be helpful. For example, structurally, where do they (and other associated proteins that are not degraded) fit in the complex? 

      We have responded above in relation to the Public Review Comments and note that a structure of the complex was in submitted version (now revised version Supplementary Fig. S11). 

      (3) In Figure 1A, the blots show decreased levels of endogenous CRBN with iDAB-CRBN. Is this a known consequence of this approach in these cell lines? Does the partial recovery of endogenous CRBN in KOPTK1 cells have any indication of iDAB-CRBN levels? 

      We cannot be sure why the endogenous level of CRBN decreases in doxycycline-treated cells. It has been shown (DOI:10.1371/journal.pone.0064561) that doxycycline (and its derivatives), used in inducible expression systems such as the lentivirus we used, affects gene expression patterns and can either increase or decrease expression. Although the published study did not examine CRBN expression, this effect might explain why CRBN expression decreases on doxycycline addition and remains at the same level thereafter. 

      (4) In Figure S7, the authors do not fully explain the results and why there is minimal rescue with epoxomicin (S7A) or MLN4924 (S7J). This could indicate an alternative mechanism of degradation and loss at play, given the lack of rescue. Can the authors comment on this discrepancy, and have they looked at autophagy inhibitors or other agents to achieve the chemical rescue? 

      In experiments such as those in revised version Supplementary Fig. S6, we used KOPT-K1 cells with a single concentration of the inhibitors, and the cells may be less susceptible to epoxomicin (0.8 μM), but lenalidomide and free thalidomide restored the LMO2 levels fully. In the main text Fig. 3D, we also showed that including epoxomicin and thalidomide with the Abd-CRBN in KOPT-K1 and CCRF-CEM restores LMO2 levels, supporting the conclusion that the main mechanism of degradation is through the ubiquitin-proteasome route.

      (5) For the proteomics data, it would be helpful to have the proteins in yellow highlighted to have them noted in 5D and 5E. In addition, can the authors comment on why LMO2 or their collateral targets are not confirmed in the table? Furthermore, 5C is difficult to interpret; if there are no significantly changing proteins in the Jurkat cells, why are there pathways that are identified? 

      As mentioned in reply to referee 1, due to the relatively small size of the LMO2 protein, it is challenging to produce enough unique peptides for reliable identification, especially to distinguish some proteins in the LMO2 complex where expression levels are low.

    1. Reviewer #3 (Public review):

      Summary:

      This short paper aims to provide an independent validation of the transgenerational inheritance of learned behaviour (avoidance) that has been published by the Murphy lab. The robustness of the phenotype has been questioned by the Hunter lab. In this paper, the authors present one figure showing that transgenerational inheritance can be replicated in their hands. Overall, it helps to shed some light on a controversial topic.

      Strengths:

      The authors clearly outline their methods, particularly regarding the choice of assay, so that attempting to reproduce the results should be straightforward. It is nice to see these results repeated in an independent laboratory.

      Comments on revised version:

      I'm happy with the response to reviewers.

    2. Author response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Confirmation of daf-7::GFP data and inheritance beyond F2

      Reviewer suggested confirming daf-7::GFP molecular marker data and testing inheritance beyond the F2 generation to further strengthen the findings.

      We agree these experiments would provide valuable mechanistic insights into the molecular basis of transgenerational inheritance. However, our study was specifically designed as a reproducibility study focusing on the central controversy regarding F2 inheritance (Gainey et al. vs. Murphy lab findings). The daf-7::GFP molecular marker experiments, while important for understanding mechanisms, represent a different research question requiring extensive additional resources and expertise beyond the scope of this validation study. Our primary goal was to provide independent confirmation of the disputed F2 inheritance using standardized behavioral assays. It is our hope that future work will pursue these important mechanistic validations.

      "Exhaustive attempts" language

      Reviewer disagreed with characterizing Gainey et al.'s efforts as "exhaustive attempts" since they modified the original protocol.

      We revised this statement in the Results and Discussion to more accurately reflect the experimental situation: "In contrast, Gainey et al. (2025), representing the Hunter group, reported that while parental and F1 avoidance behaviors were evident, transgenerational inheritance was not reliably observed beyond the F1 generation under their experimental conditions."

      Importance of sodium azide

      Reviewer suggested including more discussion about the recent findings on the importance of sodium azide in the assay, referencing the Murphy group's response paper.

      We have prominently highlighted the critical role of sodium azide in our Introduction with strengthened language that emphasizes its importance for resolving the scientific controversy: "Critically, Kaletsky et al. (2025) demonstrated that omission of sodium azide during scoring can completely abolish detection of inherited avoidance, revealing that this key methodological difference may explain the conflicting results between laboratories. The use of sodium azide to immobilize worms at the moment of initial bacterial choice appears essential for capturing the inherited behavioral response. These findings highlight how seemingly minor methodological variations can dramatically impact detection of transgenerational inheritance and underscore the need for independent replication using standardized protocols."

      Protocol fidelity statement

      Reviewer requested a more direct statement clarifying that we followed the Murphy group protocol, noting that we made some modifications.

      We followed the core Murphy lab protocol with two evidence-based optimizations that preserve the essential experimental elements: 1) We used 400 mM sodium azide instead of 1 M based on preliminary data showing the higher concentration caused premature paralysis before worms could make behavioral choices, and 2) We used liquid NGM buffer instead of M9 to maintain chemical consistency with the solid NGM plates used for worm culture, minimizing potential osmotic stress. These modifications improved experimental reliability while maintaining the critical components: sodium azide immobilization, bacterial lawn density standardization (OD<sub>600</sub> = 1.0), and synchronized scoring conditions that are essential for detecting inherited avoidance.

      Overstated dilution claim

      Reviewer noted that the statement about "gradual decrease" in avoidance strength was overstated and didn't reflect the actual data presented in the manuscript.

      We removed this statement.

      Environmental variables phrasing

      Reviewer found the sentence about environmental variables unclear, noting that Gainey et al. didn't actually acknowledge variability but saw it as indicating error or stochastic processes.

      We refined this statement for greater precision and clarity: "This underscores the assay's sensitivity to environmental variables, such as synchronization method and bacterial lawn density. This highlights the importance of consistency across experimental setups and supports the view that context-dependent variation may underlie previously reported discrepancies."

      Reviewer #2 (Public Review):

      Reagent sourcing

      Reviewer suggested listing the sources of media ingredients with company names and catalog numbers, as this might be important for reproducibility.

      To ensure complete reproducibility, we created a comprehensive Table S3 listing all reagents, suppliers, and catalog numbers used in our experiments. This detailed information enables exact replication of our experimental conditions and addresses potential variability that might arise from different reagent sources between laboratories.

      Reviewer #3 (Public Review):

      Raw data transparency

      Reviewer noted that while a spreadsheet with choice assay results was provided, the individual raw data from assays was not included, which would be helpful for assessing sample sizes.

      We now provide complete experimental transparency through Table S2, which contains individual choice indices from all 138 assays conducted across four independent trials. This comprehensive dataset allows full assessment of our experimental outcomes, statistical robustness, and reproducibility while enabling other researchers to perform independent statistical analyses.
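
      As an illustration of what a per-assay choice index in a table such as Table S2 represents, here is a minimal sketch. It assumes the common convention for this kind of two-lawn assay, (worms on OP50 - worms on PA14) / (total worms on both lawns); the response above does not spell out the formula, so the definition, the function name, and the counts are all assumptions for illustration, not data from the study.

```python
def choice_index(n_op50: int, n_pa14: int) -> float:
    """Assumed convention: (OP50 - PA14) / (OP50 + PA14).

    Positive values indicate avoidance of PA14, negative values attraction,
    and 0 indicates no preference.
    """
    total = n_op50 + n_pa14
    if total == 0:
        raise ValueError("no worms were scored on either lawn")
    return (n_op50 - n_pa14) / total


# Hypothetical counts for one assay (not taken from Table S2).
print(round(choice_index(n_op50=45, n_pa14=17), 3))
```

      Because each assay yields a single index, the number of indices per condition, rather than the number of worms, is the sample size that enters the statistical comparison.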

      F1/F2 assay disparity

      Reviewer questioned whether the higher number of F2 assays compared to F1 represented truly independent assays, asking if multiple F2 assays were performed from offspring of one F1 plate (which would not represent independent assays).

      We clarified this important statistical consideration in Methods (Transgenerational Testing): "Each behavioral assay was conducted using animals from a biologically independent growth plate. While F2 plates were derived from pooled embryos from multiple F1 parents, each assay represents an independent biological replicate with no reuse of animals across assays. F2 assays (n=45) exceeded F1 assays (n=20) due to PA14-induced fecundity reduction in trained worms, limiting the number of viable F1 progeny. The higher number of F2 assays reflects the greater reproductive success of healthy F1 animals and provides additional statistical power for population-level behavioral comparisons." We also enhanced our Controls section to clarify that "Our experimental design employed population-level comparisons across generations using unpaired statistical analyses, with no attempt to track individual lineages across generations."

      Methodological variations overstatement

      Reviewer felt the Introduction overstated the findings by suggesting the authors "address potential methodological variations," when they only used one assay setup throughout.

      We have corrected the Introduction to accurately reflect our study design and scope: "Here, we adapted the protocol established by the Murphy group, maintaining the critical use of sodium azide to paralyze worms at the time of choice, to test whether parental exposure to PA14 elicits consistent avoidance in subsequent generations. Our study specifically focuses on the transmission of learned avoidance through the F2 generation, beyond the intergenerational (F1) effect, because this is where divergence between published studies begins."

      Reviewer #1 (Recommendations for the authors):

      Worm numbers

      Reviewer noted that information about the number of worms used should be included in the training and choice assay methods section rather than separated.

      We clarified worm numbers and sample sizes in the Methods (Controls and Additional Considerations): "Each individual assay averaged 62 ± 43 animals (range: 15-150 worms per assay), with a total of 138 assays conducted across four independent experimental trials. The variation in worm numbers per assay reflects natural variation in worm recovery and immobilization efficiency during choice assays. We conducted an average of 8.5 assays per condition during each of the four replicates."

      Figure 1 legend and consistency

      Reviewer identified several issues: inconsistent terminology ("treated" vs "trained"), incorrect statistical test naming, missing p-value annotations, and need for consistency between figure and legend.

      We have systematically addressed all figure consistency and statistical annotation issues:

      Replaced inconsistent "treated" terminology with "trained" throughout

      Corrected the statistical test description to accurately reflect our analysis: "Kruskal-Wallis one-way ANOVA followed by Dunn's post hoc" which properly corresponds to the statistical tests detailed in Table S1

      Added explicit p-value annotations in the figure legend: "*p<0.05, **p<0.01 means and SEM shown (see Table S1 for statistics and Table S2 for raw data)"

      Ensured consistent terminology between figure and legend
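
      For readers unfamiliar with the omnibus test named in the corrected legend, the following is a minimal sketch of the Kruskal-Wallis H statistic in plain Python, with the standard tie correction. The function names and the example values are hypothetical, and Dunn's post hoc comparisons are not implemented here. In practice a statistics package would be used, and the p-value would be obtained by comparing H to a chi-square distribution with (number of groups - 1) degrees of freedom.

```python
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H for a list of groups of numbers."""
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)

    # Sum of (rank sum squared / group size) over groups.
    acc, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        acc += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * acc - 3.0 * (n_total + 1)

    # Tie correction: 1 - sum(t^3 - t) / (N^3 - N) over tied value counts t.
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1.0 - tie_term / (n_total ** 3 - n_total)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction


# Hypothetical choice indices for three conditions (illustration only).
print(kruskal_wallis_h([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]))
```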

      NGM vs. M9 buffer

      Reviewer questioned whether we used NGM buffer or M9 buffer for washing steps, noting that NGM isn't usually referred to as "buffer."

      We have prominently featured and thoroughly clarified our rationale for using liquid NGM buffer in the Methods (Synchronization of Worms section). The explanation now appears upfront in the methods: "We used liquid NGM buffer instead of M9 buffer (as specified in the original Murphy protocol) to maintain chemical consistency with the solid NGM culture plates. This modification minimizes potential osmotic stress since liquid NGM matches the pH (6.0) and ionic composition of the growth medium, whereas M9 buffer has a different pH (7.0) and ionic profile." We provide detailed chemical differences and explain that this modification maintains consistency with culture conditions while preserving essential experimental procedures.

      Grammar/typos

      Reviewer noted that the manuscript needed thorough proofreading to address grammatical errors and typographical mistakes.

      We have conducted comprehensive proofreading and editing throughout the manuscript to resolve grammatical and typographical errors. Specific improvements include: clarified sentence structure in the Introduction and Results sections, corrected technical terminology consistency, improved figure legend clarity, and enhanced overall readability while maintaining scientific precision.

      Sodium azide concentration

      Reviewer noted that our sodium azide concentration differed from the Moore paper and requested comment on this difference.

      We have included explicit justification for our sodium azide concentration choice in the Methods (Training and Choice Assay): "We used 400 mM sodium azide rather than the 1 M concentration reported by Moore et al. (2019) because preliminary trials showed that higher concentrations caused premature paralysis before worms could reach either bacterial spot, potentially biasing choice measurements. The 400 mM concentration provided sufficient immobilization while preserving the behavioral choice window."

      Reviewer #2 (Recommendations for the authors):

      Comparative reagent analysis

      Reviewer suggested creating a supplemental table comparing reagent sources between our study, Gainey et al., and Murphy et al., proposing that media ingredient differences might explain the discrepancies.

      While direct reagent comparison between laboratories was beyond the scope of this validation study, we recognize this as an important consideration for understanding experimental variability. Our comprehensive reagent sourcing information (Table S3) provides the foundation for future comparative studies. We encourage collaborative efforts to systematically compare reagent sources across laboratories, as media component differences could contribute to the experimental variability observed between research groups. Such analyses would be valuable for establishing standardized protocols across the field.

      Conclusion

      We hope that these revisions satisfactorily address the reviewers’ concerns. We believe these improvements significantly strengthened the manuscript's contribution to resolving this important scientific controversy.

      We thank the reviewers again for their invaluable insights and constructive feedback, which have substantially improved the quality and impact of our work.

    1. Reviewer #1 (Public review):

      Summary:

      The authors build a network model of the olfactory bulb and the piriform cortex and use it to run simulations and test their hypotheses. Given the model's settings, the authors observe drift across days in the responses to the same odors of both the mitral/tufted cells, as well as of piriform cortex neurons. When representing the M/T and PCx responses within a lower-dimensional space, the apparent drift is more prominent in the PCx, while the M/T responses appear in comparison more stable. The authors further note that introducing spike-time dependent plasticity (STDP) at bulb synapses involving abGCs slows down the drift in the PCx representations, and further link this to the observation that repeated exposure to the same odorant slows down drift in the piriform cortex.

      The model is clearly explained and relies on several assumptions and observations:

      (1) Random projections of MTC from the olfactory bulb to the piriform cortex, random intra-piriform connectivity, and random piriform to bulb connectivity.

      (2) Higher dimensionality of piriform cortex representations compared to M/T responses, which enables superior decoding of odor identity in the piriform cortex.

      (3) Spike time-dependent plasticity (STDP) at synapses involving the abGCs.

      The authors address an open topical problem, and the model is elegant in its simplicity. I have, however, several major concerns with the hypotheses underlying the model and with its biological plausibility.

      Concerns:

      (1) In their model, the authors propose that MTC remain stable at the population level, despite changes in individual MTC responses.

      The authors cite several experimental studies to support their claims that individual MTC responses to the same odors change (some increase, some decrease) across days. Interpreting the results of these studies must, however, take into account the variability of M/T responses across odor presentation repeats within the same session vs. across sessions. In the Shani-Narkiss et al., Frontiers in Neural Circuits, 2023 study referenced, a large fraction of the variability across days in M/T responses is also observed across repeats of the same odorant in the same session (Shani-Narkiss et al., Figure 4), whereas in the authors' model, M/T responses within a session are highly reproducible. This is an important point to consider and address, since it constrains how much of the variability in M/T responses can be attributed to adult neurogenesis in the olfactory bulb versus other inhibitory network mechanisms that do not rely on neurogenesis. In the authors' model, the variability in M/T responses observed across days emerges as a result of adult neurogenesis, which need not be the main source of variability observed in imaging experiments (Shani-Narkiss et al., Figure 4).

      Another study (Kato et al., Neuron, 2012, Figure 4) reported that mitral cell responses to odors experienced repeatedly across 7 days tend to sparsen and decrease in amplitude systematically, while mitral cell responses to the same odor on day 1 vs. day 7 when the odor is not presented repeatedly in between seem less affected (although the authors also reported a decrease in the CI for this condition). As such, Kato et al. mostly report decreases in mitral cell odor responses with repeated odor exposure at both the individual and population level, and not so much increases and decreases in the individual mitral cell responses, and stability at the population level.

      (2) In Figure 1, a set of GCs is killed off, and new GCs are integrated in the network as abGC. Following the elimination of 10% of GCs in the network, new cells are added and randomly assigned synaptic weights between these abGCs and MTC, GCs, SACs, and top-down projections from PCx. This is done for 11 days, during which time all GCs have gone through adult neurogenesis.

      Is the authors' assumption here that across the 11 days, all GCs are being replaced? This seems to depart from the known biology of the olfactory bulb granule cells, i.e., GCs survive for a large fraction of the animal's life.
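
      A quick calculation illustrates why the sampling scheme matters for this question. The sketch below assumes two hypothetical schemes that the text above does not distinguish: (a) each day 10% of all GCs are picked at random, regardless of whether they were replaced before, and (b) each day 10% are picked only from cells not yet replaced. Function names and the population size are illustrative assumptions, not details of the authors' model.

```python
import math
import random


def analytic_never_replaced(daily_fraction: float, days: int) -> float:
    """Scheme (a): expected fraction of original cells never picked."""
    return (1.0 - daily_fraction) ** days


def simulate_never_replaced(n_cells=10000, daily_fraction=0.10, days=11, seed=0):
    """Scheme (a), simulated: pick a random 10% of all cells each day."""
    rng = random.Random(seed)
    replaced = [False] * n_cells
    per_day = int(round(n_cells * daily_fraction))
    for _ in range(days):
        for idx in rng.sample(range(n_cells), per_day):
            replaced[idx] = True
    return replaced.count(False) / n_cells


def days_to_full_turnover(daily_fraction: float) -> int:
    """Scheme (b): days until every original cell has been replaced once."""
    return math.ceil(1.0 / daily_fraction)


print(analytic_never_replaced(0.10, 11))   # roughly 0.31
print(simulate_never_replaced())           # close to the analytic value
print(days_to_full_turnover(0.10))         # 10
```

      Under scheme (a), roughly 31% of the original GCs would still be present after 11 days, whereas under scheme (b) the whole population turns over within 10 days. Stating which scheme the model uses would make the comparison with known GC survival times more direct.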

      (3) The authors' model relies on several key assumptions: random projections of MTC from the olfactory bulb to the piriform cortex, random intra-piriform connectivity, and random piriform to bulb connectivity. These assumptions are not necessarily accurate, as recent work revealed structure in the projections from the olfactory bulb to the piriform cortex and structure within the piriform cortex connectivity itself (Fink et al., bioRxiv, 2025; Chae et al., Cell, 2022; Zeppilli et al., eLife, 2021).

      How do the results of the model relating adult neurogenesis in the bulb to drift in the piriform cortex representations change when considering an alternative scenario in which the olfactory bulb to piriform and intra-piriform connectivity is not fully distributed and indistinguishable from random, but rather is structured?

      (4) I didn't understand the logic of the low-dimensional space analysis for M/T cells and piriform cortex neurons (Figures 2 & 3). In the authors' model, the full-ensemble M/T responses are reorganized over time, presumably due to the adult-born neurogenesis. Analyzing a lower-dimensional projection of the ensemble trajectories reveals a lower degree of re-organization. This is the same for the piriform cortex, but relatively, the piriform ensembles displayed in a low-dimensional embedding appear to drift more compared to the M/T ensembles.

      This analysis raises a few questions: which representation is relevant for brain function, the high- or the low-dimensional projection? What fraction of response variance is included in the low-dimensional space analysis? How did the authors decide the low-dimensional cut-off? Why does STDP cause more drift in piriform cortex ensembles vs. M/T ensembles? Is this because of the assumed higher dimensionality of the piriform cortex representations compared to the mitral cells?

      (5) Could the authors comment on whether STDP at abGC synapses and its impact on decreasing drift represent a new insight, and also put it into context? Several studies (e.g., Lledo, Murthy, Komiyama groups) reported that abGCs integrate into the network in an activity-dependent manner, not randomly, and thereby stabilize active neuronal responses, which is consistent with the authors' report.

      Relatedly, I could not find in the manuscript which synapses involving abGCs the authors focus on, or what the relative contribution is of the various plastic synapses shown in the cartoon in Figure 4 A1 (circles and triangles).

      (6) The study would be strengthened, in my opinion, by including specific testable predictions that the authors' models make, which can be further food for thought for experimentalists. How does suppression of adult-born neurogenesis in the OB impact the stability of mitral cell odor responses? How about piriform cortex ensembles?

    2. Reviewer #2 (Public review):

      Summary:

      The authors address a critical problem in olfactory coding. It has long been known that adult neurogenesis, specifically in the form of adult-born granule cells that embed into the existing inhibitory networks in the olfactory bulb, can potentially alter the responses of Mitral/Tufted neurons that project activity to the Piriform Cortex and to other areas of the brain. Fundamentally, it would seem that these granule cells could alter the stability of neural codes in the OB over time. The authors develop a spiking network model to explore how stability can be achieved both in the OB over time and in the PC, which receives inputs. The model recapitulates published activity recordings of M/T cells and shows how activity in different M/T cells from the same glomerulus shifts over time in ways that, in spite of the shift, preserve population/glomerular level codes. However, these different M/T cells fan out onto different pyramidal cells of the PC, which gives rise to instability at that level. STDP, then, is necessary to maintain stability at the PC level as long as odor environments remain constant. These results may also apply to a similar neurogenesis-based change in the Dentate Gyrus, which generates instability in CA1/3 regions of the hippocampus.

      Strengths:

      A robust network model that untangles important, seemingly contradictory mechanisms that underlie olfactory coding.

      Weaknesses:

      The work is a significant contribution to understanding olfactory coding. But the manuscript would benefit from a brief discussion of why neurogenesis occurs in the first place - e.g., injury, ongoing needs for plasticity, and adapting to turnover of ORNs. There is literature on this topic. It seems counterintuitive to have a process in the MOB (and for that matter in the DG) that potentially disrupts the ability to generate stable codes both in the MOB and PC, and in particular a disruption that requires two different mechanisms - multiple M/T cells per glomerulus in the MOB and STDP in the PC - to counteract.

      Given that neurogenesis has an important function, and a mechanism is in place to compensate for it in the MOB, why would it then be disrupted in fan-out projections to the PC? The answer may lie in the need for fan-out projections so that pyramidal neurons in the PC can combinatorially represent many different inputs from the MOB. So something like STDP would be needed to maintain stability in the face of the need for this coding strategy.

      This kind of discussion, or something like it, would help readers understand why these mechanisms occur in the first place. It is interesting that PC stability requires that odor environments be stable, and that this stability drives PC representational stability. This result suggests experimental work to test this hypothesis. As such, it is a novel outcome of the research.

    3. Reviewer #3 (Public review):

      Summary

      The authors set out to explore the potential relationship between adult neurogenesis of inhibitory granule cells in the olfactory bulb and cumulative changes over days in odor-evoked spiking activity (representational drift) in the olfactory stream. They developed a richly detailed spiking neuronal network model based on Izhikevich (2003), allowing them to capture the diversity of spiking behaviors of multiple neuron types within the olfactory system. This model recapitulates the circuit organization of both the main olfactory bulb (MOB) and the piriform cortex (PCx), including connections between the two (both feedforward and corticofugal). Adult neurogenesis was captured by shuffling the weights of the model's granule cells, preserving the distribution of synaptic weights. Shuffling of granule cell connectivity resulted in cumulative changes in stimulus-evoked spiking of the model's M/T cells. Individual M/T cell tuning changed with time, and ensemble correlations dropped sharply over the temporal interval examined (long enough that almost all granule cells in the model had shuffled their weights). Interestingly, these changes in responsiveness did not disrupt low-dimensional stability of olfactory representations: when projected into a low-dimensional subspace, population vector correlations in this subspace remained elevated across the temporal interval examined. Importantly, in the model's downstream piriform layer, this was not the case. There, shuffled GC connectivity in the bulb resulted in a complete shift in piriform odor coding, including for low-dimensional projections. This is in contrast to what the model exhibited in the M/T input layer. Interestingly, these changes in PCx extended to the geometrical structure of the odor representations themselves. Finally, the authors examined the effect of experience on representational drift. Using an STDP rule, they allowed the inputs to and outputs from adult-born granule cells to change during repeated presentations of the same odor. This stabilized stimulus-evoked activity in the model's piriform layer.

      Strengths

      This paper suggests a link between adult neurogenesis in the olfactory bulb and representational drift in the piriform cortex. Using an elegant spiking network that faithfully recapitulates the basic physiological properties of the olfactory stream, the authors tackle a question of longstanding interest in a creative and interesting manner. As a purely theoretical study of drift, this paper presents important insights: synaptic turnover of recurrent inhibitory input can destabilize stimulus-evoked activity, but only to a degree, as representations in the bulb (the model's recurrent input layer) retain their basic geometrical form. However, this destabilized input results in profound drift in the model's second (piriform) layer, where both the tuning of individual neurons and the layer's overall functional geometry are restructured. This is a useful and important idea in the drift field, and to my knowledge, it is novel. The bulb is not the only setting where inhibitory synapses exhibit turnover (whether through neurogenesis or synaptic dynamics), and so this exploration of the consequences of such plasticity on drift is valuable. The authors also elegantly explore a potential mechanism to stabilize representations through experience, using an STDP rule specific to the inhibitory neurons in the input layer. This has an interesting parallel with other recent theoretical work on drift in the piriform (Morales et al., 2025 PNAS), in which STDP in the piriform layer was also shown to stabilize stimulus representations there. It is fascinating to see that this same rule also stabilizes piriform representations when implemented in the bulb's granule cells.

      The authors also provide a thoughtful discussion regarding the differential roles of mitral and tufted cells in drift in piriform and AON and the potential roles of neurogenesis in archicortex.

      In general, this paper puts an important and much-needed spotlight on the role of neurogenesis and inhibitory plasticity in drift. In this light, it is a valuable and exciting contribution to the drift conversation.

      Weaknesses

      I have one major, general concern that I think must be addressed to permit proper interpretation of the results.

      I worry that the authors' model may confuse thinking on drift in the olfactory system, because of differences in the behavior of their model from known features of the olfactory bulb. In their model, the tuning of individual bulbar neurons drifts over time. This is inconsistent with the experimental literature on the stability of odor-evoked activity in the olfactory bulb.

      In a foundational paper, Bhalla & Bower (1997) recorded from mitral and tufted cells in the olfactory bulb of freely moving rats and measured the odor tuning of well-isolated single units across a five-day interval. They found that the tuning of a single cell was quite variable within a day, across trials, but that this variability did not increase with time. Indeed, their measure of response similarity was equivalent within and across days. In what now reads as a prescient anticipation of the drift phenomenon, Bhalla and Bower concluded: "it is clear, at least over five days, that the cell is bounded in how it can respond. If this were not the case, we would expect a continual increase in relative response variability over multiple days (the equivalent of response drift). Instead, the degree of variability in the responses of single cells is stable over the length of time we have recorded." Thus, even at the level of single cells, this early paper argues that the bulb is stable.

      This basic result has since been replicated by several groups. Kato et al. (2012) used chronic two-photon calcium imaging of mitral cells in awake, head-fixed mice and likewise found that, while odor responses could be modulated by recent experience (odor exposure leading to transient adaptation), the underlying tuning of individual cells remained stable. While experience altered mitral cell odor responses, those responses recovered to their original form at the level of the single neuron, maintaining tuning over extended periods (two months). More recently, the Mizrahi lab (Shani-Narkiss et al., 2023) extended chronic imaging to six months, reporting that single-cell odor tuning curves remained highly similar over this period. These studies reinforce Bhalla and Bower's original conclusion: despite trial-to-trial variability, olfactory bulb neurons maintain stable odor tuning across extended timescales, with plasticity emerging primarily in response to experience. (The Yamada et al., 2017 paper, which the authors here cite, is not an appropriate comparison. In Yamada, mice were exposed daily to odor. Therefore, the changes observed in Yamada are a function of odor experience, not of time alone. Yamada does not include data in which the tuning of bulb neurons is measured in the absence of intervening experience.)

      Therefore, a model that relies on instability in the tuning of bulbar neurons risks giving the incorrect impression that the bulb drifts over time. This difference should be explicitly addressed by the authors to avoid any potential confusion. Perhaps the best course of action would be to fit their model to Mizrahi's data, should this data be available, and see if, when constrained by empirical observation, the model still produces drift in piriform. If so, this would dramatically strengthen the paper. If this is not feasible, then I suggest being very explicit about this difference between the behavior of the model and what has been shown empirically. I appreciate that in the data there is modest drift (e.g., Shani-Narkiss' Figure 8C), but the changes reported there really are modest compared to what is exhibited by the model. A compromise would be to simply apply these metrics to the model and match the model's similarity to the Shani-Narkiss data. Then the authors could ask what effect this has on drift in piriform.

      The risk here is that people will conclude from this paper that drift in piriform may simply be inherited from instability in the bulb. This view is inconsistent with what has been documented empirically, and so great care is warranted to avoid conveying that impression to the community.

      Major comments (all related to the above point)

      (1) Lines 146-168: The authors find in their model that "individual M/T cells changed their responses to the same odor across days due to adult-neurogenesis, with some cells decreasing the firing rate responses (Fig.2A1 top) while other cells increased the magnitude of their responses (Fig. 2A2 bottom, Fig. S2)". They also report a significant decrease in the "full ensemble correlation" in their model over time. They claim that these changes in individual cell tuning are "similar to what has been observed by others using calcium imaging of M/T cell activity (Kato et al., 2012 and Yamada et al., 2017)" and that the decrease in full ensemble correlation is "consistent with experimental observations (Yamada et al., 2017)." However, the conditions of the Kato and Yamada experiments that demonstrate response change are not comparable here, as odors were presented daily to the animals in these experiments. Therefore, the changes in odor tuning found in the Kato and Yamada papers (Kato Figure 4D; Yamada Figure 3E) are a function of accumulated experience with odor. This distinction is crucial because experience-induced changes reflect an underlying learning process, whereas changes that simply accumulate over time are more consistent with drift. The conditions of their model are more similar to those employed in other experiments described in Kato et al. 2012 (Figure 6C) as well as Shani-Narkiss et al. (2023), in which bulb tuning is measured not as a function of intervening experience, but rather as a function of time (Kato's "recovery" experiment). What is found in Kato is that even across two months, the tuning of individual mitral cells is stable. What alters tuning is experience with odor, the core finding of both the Kato et al., 2012 paper and also Yamada et al., 2017. It is crucial that this is clarified in the text.

      (2) The authors show that in a reduced-space correlation metric, the correlation of low-dimensional trajectories "remained high across all days"..."consistent with a recent experimental study" (Shani-Narkiss et al., 2023). It is true that in the Shani-Narkiss paper, a consistent low-dimensional response is found across days (t-SNE analysis in Shani-Narkiss Figure 7B). However, the key difference between the Shani-Narkiss data and the results reported here is that Shani-Narkiss also observed relative stability in the native space (Shani-Narkiss Figure 8). They conclude that they "find a relatively stable response of single neurons to odors in either awake or anesthetized states and a relatively stable representation of odors by the MC population as a whole (Figures 6-8; Bhalla and Bower, 1997)." This should be better clarified in the text.

      (3) In the discussion, the authors state that "In the MOB, individual M/T cells exhibited variable odor responses akin to gain control, altering their firing rate magnitudes over time. This is consistent with earlier experimental studies using calcium-imaging." (L314-6). Again, I disagree that these data are consistent with what has been published thus far. Changes in gain would have resulted in increased variability across days in the Bhalla data. Moreover, changes in gain would be captured by Kato's change index ("To quantify the changes in mitral cell responses, we calculated the change index (CI) for each responsive mitral cell-odor pair on each trial (trial X) of a given day as (response on trial X - the initial response on day 1)/(response on trial X + the initial response on day 1). Thus, CI ranges from −1 to 1, where a value of −1 represents a complete loss of response, 1 represents the emergence of a new response, and 0 represents no change." Kato et al.). This index will capture changes in gain. However, as shown in Figure 4D (red traces), Figure 6C (Recovery and Odor set B during odor set A experience and vice versa), the change index is either zero or near zero. If the authors wish to claim that their model is consistent with these data, they should also compute Kato's change index for M/T odor-cell pairs in their model and show that it also remains at 0 over time, absent experience.
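
      The check requested in point (3) could be sketched as follows. This is a minimal illustration in Python, assuming non-negative response magnitudes; the function names, the data layout, and the example values are hypothetical and are not taken from Kato et al. or from the authors' model code.

```python
from statistics import fmean


def change_index(response_trial_x: float, response_day1: float) -> float:
    """Change index as quoted above: (R_x - R_1) / (R_x + R_1).

    Assumes non-negative responses, so the value lies in [-1, 1].
    """
    total = response_trial_x + response_day1
    if total == 0:
        raise ValueError("CI is undefined when both responses are zero")
    return (response_trial_x - response_day1) / total


def mean_change_index_per_day(responses_by_day: list[list[float]]) -> list[float]:
    """Mean CI across cell-odor pairs for each day, relative to day 1.

    responses_by_day[d][i] is the response of pair i on day d
    (a hypothetical layout chosen for this sketch).
    """
    day1 = responses_by_day[0]
    return [
        fmean(change_index(r_x, r_1) for r_x, r_1 in zip(day, day1))
        for day in responses_by_day
    ]
```

      Under this definition, a pure doubling of gain gives CI = (2 - 1)/(2 + 1) = 1/3, so gain drift in the model would show up as a departure of the mean CI from zero.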

    1. Reviewer #3 (Public review):

      Summary:

      Through micro-electroencephalography, Hight and colleagues studied how the auditory cortex as an ensemble responds to cochlear implant stimulation compared to classic pure tones. Taking advantage of a double-implanted rat model (Micro-ECoG and Cochlear Implant), they tracked and analyzed changes in the temporal and spatial aspects of the cortical evoked responses in both normal-hearing and cochlear-implanted animals. After establishing that single-trial responses were sufficient to encode the stimuli's properties, the authors then explored several decoder architectures to study whether the cortex encodes each stimulus modality in a similar or different manner. They conclude that a) intracranial EEG evoked responses can be accurately recorded and did not differ between normal-hearing and cochlear-implanted rats; b) although coarsely spatially organized, CI-evoked responses had higher trial-by-trial variability than pure tones; c) stimulus identity is independently represented by temporal and spatial aspects of cortical representations and can be accurately decoded by various means from single trials; and d) a decoder trained on pure tones cannot accurately decode CI-stimulus identity.

      Strength:

      The model combining micro-ECoG and cochlear implantation and the methodology to extract both the Event Related Potentials (ERPs) and High-Gammas (HGs) are very well designed and appropriately analyzed. Likewise, the PCA-LDA and TCA-LDA are powerful tools that take full advantage of the information provided by the cortical ensembles.

      The overall structure of the paper, with a paced and exhaustive progression through each step and evolution of the decoder, is much appreciated and easy to follow. The exploration of single-trial encoding and stimulus identity through temporal and spatial domains provides new avenues to characterize the cortical responses to CI stimulation and their central representation. The fact that single trials suffice to decode the stimulus identity regardless of their modality is of great interest and noteworthy. Although the authors confirm that iEEG remains difficult to transpose to the clinic, the insights provided by the study confirm the potential benefit of using central decoders to help in clinical settings.

      Weaknesses:

      The conclusion of the paper, especially the concept of distinct cortical encoding for each modality, is unfortunately only partially supported by the results, as the authors did not adequately consider fundamental limitations of CI-related stimulation.

      First, the reviewer assumed that the authors stimulated in a Monopolar mode, which, albeit being clinically relevant, notoriously generates a high current spread in rodent models. Second, comparing the averaged BF maps for iEEG (Figure 2A, C), BFs ranged from 4 to 16kHz with a predominance of 4kHz BFs. The lack of BFs at higher frequencies hints at a potential location mismatch between the frequency range sampled at the level of the cortex (low to medium frequencies) and the frequency range covered by the CI inserted mostly in the first turn-and-a-half of the cochlea (high to medium frequencies). Looking at Figure 2F (and to some extent 2A), most of the CI electrodes elicited responses around the 4kHz regions, and averaged maps show a predominance of CI-3-4 across the cortex (Figure 2C, H) from areas with 4kHz BF to areas with 16kHz BF. It is doubtful that CI-3-4 are located near the 4kHz region based on Müller's work (1991) on the frequency representation in the rat cochlea.

      Taken together with the flat Pearson correlations, the decoder examples showing a strong ability to identify CI-4 and CI-3, and Figure 8D, E showing a strong prediction of 4kHz and 8kHz for all CI electrodes when using a pure-tone-trained decoder, it is possible that current spread ended up stimulating higher turns of the cochlea, or even the modiolus, in a non-specific manner. This would greatly reduce (or smear) the place-coding/frequency resolution of each electrode, which in turn could explain the coarse topographic (or coarsely tonotopic, according to the manuscript) organization of the cortical responses. Thus, the conclusion that there are distinct encodings for each modality is biased, as it might not account for monopolar smearing. Since this is the study's main message and title, it would have benefited from a subgroup of animals stimulated in bipolar mode (or with any focused strategy, since these reduce current spread) to compare the spatial organization of iEEG responses and the performance of the different decoders, in order to rule out current spread and strengthen the conclusion.

      Nevertheless, the reviewer wants to reiterate that the study proposed by Hight et al. is well constructed and relevant to the field, and that the overall proposal of improving patient performance and helping adaptation in the first months of CI use by studying central responses should be pursued, as it might help establish new guidelines or create new clinical tools.

    1. Reviewer #1 (Public review):

      In this manuscript, Clausner and colleagues use simultaneous EEG and fMRI recordings to clarify how visual brain rhythms emerge across layers of early visual cortex. They report that gamma activity correlates positively with feature-specific fMRI signals in superficial and deep layers. By contrast, alpha activity generally correlated negatively with fMRI signals, with two higher frequencies within the alpha reflecting feature-specific fMRI signals. This feature-specific alpha code indicates an active role of alpha oscillations in visual feature coding, providing compelling evidence that the functions of alpha oscillations go beyond cortical idling or feature-unspecific suppression.

      The study is very interesting and timely. Methodologically, it is state-of-the-art. The findings on a more active role of alpha activity that goes beyond the classical idling or suppression accounts are in line with recent findings and theories. In sum, this paper makes a very nice contribution. I still have a few comments that I outline below, regarding the data visualization, some methodological aspects, and a couple of theoretical points.

      (1) The authors put a lot of effort into the figure design. For instance, I really like Figure 1, which conveys a lot of information in a nice way. Figures 3 and 4, however, seem overengineered, and it takes a lot of time to distill the contents from them. The fact that they have a supplementary figure explaining the composition of these figures already indicates that the authors realized this is not particularly intuitive. First of all, the ordering of the conditions is not really intuitive. Second, the indication of significance through saturation does not really work; I have a hard time discerning the more and less saturated colors. And finally, the white dots do not really help either. I don't fully understand why they are placed where they are placed (e.g., in Figure 3). My suggestion would be to get rid of one of the factors (I think the voxel selection threshold could go: the authors could run with one of the stricter ones, and the rest could go into the supplement?) and then turn this into a few line plots. That would be so much easier to digest.

      (2) The division between high- and low-frequency alpha in the feature-specific signal correspondence is very interesting. I am wondering whether there is an opposite effect in the feature-unspecific signal correspondence. Would the high-frequency alpha show less of a feature-unspecific correlation with the BOLD?

      (3) In the discussion (line 330 onwards), the authors mention that low-frequency alpha is predominantly related to superficial layers, referencing Figure 4A. I have a hard time appreciating this pattern there. Can the authors provide some more information on where to look?

      (4) How did the authors deal with the signal-to-noise ratio (SNR) across layers, where the presence of larger draining veins typically increases BOLD (and thereby SNR) in superficial layers? This may explain the pattern of feature-unspecific effects in the alpha (Figure 3). Can the authors perform some type of SNR estimate (e.g., split-half reliability of voxel activations or similar) across layers to check whether SNR plays a role in this general pattern?

      (5) The GLM used for modelling the fMRI data included lots of regressors, and the scanning was intermittent. How much data was available in the end for sensibly estimating the baseline? This was not really clear to me from the methods (or I might have missed it). This seems relevant here, as the sign of the beta estimates plays a major role in interpreting the results here.

      (6) Some recent research suggests that gamma activity, much in contrast to the prevailing view of the mechanism for feedforward information propagation, relates to the feedback process (e.g., Vinck et al., 2025, TiCS). This view kind of fits with the localization of gamma to the deep layer here?

      (7) Another recent review (Stecher et al., 2025, TiNS) discusses feature-specific codes in visual alpha rhythms quite a bit, and it might be worth discussing how your results align with the results reported there.
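
      The SNR estimate suggested in comment (4) could take roughly the following form. This is only a sketch under stated assumptions: the per-layer data are assumed to be an array of activation estimates with shape (runs, voxels), the odd/even split and the Spearman-Brown correction are one common choice among several, and the function name is hypothetical.

```python
import numpy as np


def split_half_reliability(betas: np.ndarray) -> float:
    """Split-half reliability of a voxel activation pattern for one layer.

    betas: array of shape (n_runs, n_voxels) with one activation estimate
    per run and voxel (assumed layout for this sketch).
    """
    # Average odd-numbered and even-numbered runs separately.
    half_a = betas[0::2].mean(axis=0)
    half_b = betas[1::2].mean(axis=0)
    # Correlate the two half-patterns across voxels.
    r = np.corrcoef(half_a, half_b)[0, 1]
    # Spearman-Brown correction to estimate full-data reliability.
    return float(2 * r / (1 + r))
```

      Computing this value separately for deep, middle, and superficial layers would indicate whether reliability differs systematically with depth.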

    2. Reviewer #3 (Public review):

      Summary:

      Clausner et al. investigate the relationship between cortical oscillations in the alpha and gamma bands and the feature-specific and feature-unspecific BOLD signals across cortical layers. Using a well-designed stimulus and GLM, they show a method by which different BOLD signals can be differentiated and investigated alongside multiple cortical oscillatory frequencies. In addition to the previously reported positive relationship between gamma and BOLD signals in superficial layers, they show a relationship between gamma and feature-specific BOLD in the deeper layers. Alpha-band power is shown to have a negative relationship with the negative BOLD response for both feature-specific and feature-unspecific contrasts. When separated into lower (8-10Hz) and upper (11-13Hz) alpha oscillations, they show that higher frequency alpha showed a significantly stronger negative relationship with congruency, and can therefore be interpreted as more feature-specific than lower frequency alpha.

      Strengths:

      The use of interleaved EEG-fMRI has provided a rich dataset that can be used to evaluate the relationship of cortical layer BOLD signals with multiple EEG frequencies. The EEG data were of sufficient quality to see the modulation of both alpha-band and gamma-band oscillations in the group mean VE-channel TFS. The good EEG data quality is backed up with a highly technical analysis pipeline that ultimately enables the interpretation of the cortical layer relationship of the BOLD signal with a range of frequencies in the alpha and gamma bands. The stimulus design allowed for the generation of multiple contrasts for the BOLD signal and the alpha/gamma oscillations in the GLM analysis. Feature-specific and unspecific BOLD contrasts are used with congruently or incongruently selected EEG power regressors to delineate between local and global alpha modulations. A transparent approach is used for the selection of voxels contributing to the final layer profiles, for which statistical analysis is comprehensive but uses an alternative statistical test, which I have not seen in previous layer-fMRI literature.

      A significant negative relationship between alpha-band power and the BOLD signal was seen in congruently (EEGco) selected voxels (predominantly in superficial layers) and in feature-contrast (EEGco-inco) selected (superficial and deep layers). When separated into lower (8-10Hz) and upper (11-13Hz) alpha oscillations, they show that higher frequency alpha showed a significantly stronger negative relationship with congruency than lower frequency alpha. This is interpreted as a frequency dissociation in the alpha-BOLD relationship, with upper frequency alpha being feature-specific and lower frequency alpha corresponding to general modulation. These results are a valuable addition to the current literature and improve our current understanding of the role of cortical alpha oscillations.

      There is not much work in the literature on the relationship between alpha power and the negative BOLD response (NBR), so the data provided here are particularly valuable. The negative relationship between the NBR and alpha power shown here suggests that there is a reduction in alpha power, linked to locally reduced BOLD activity, which is in line with the previously hypothesized inhibitory nature of alpha.

      Weaknesses:

      It is not entirely clear how the draining vein effect seen in GE-BOLD layer-fMRI data has been accounted for in the analysis. For the contrast of congruent-incongruent, it is assumed that the underlying draining effect will be the same for both conditions, and so should be cancelled out. However, for the other contrasts, it is unclear how the final layer profiles aren't confounded by the bias in BOLD signal towards the superficial layers. Many of the profiles in Figure 3 and Figure 4A show an increased negative correlation between alpha power and the BOLD signal towards the superficial layers.

      When investigating whether low alpha (8-10 Hz) and high alpha (11-13 Hz) are two different sources of alpha, it would be beneficial to show whether this effect is only seen at the group level or can also be seen in single subjects. Inter-subject variability in peak alpha power could result in some subjects having a single low alpha peak and some a single high alpha peak rather than two peaks from different sources.

      The figure layout used to present the main findings throughout is an innovative way to present so much information, but it is difficult to decipher the main findings described in the text. The readability would be improved if the example (Appendix 0 - Figure 1) in the supplementary material is included as a second panel inside Figure 3, or, if this is not possible, the example (Appendix 0 - Figure 1) should be clearly referred to in the figure caption.

    1. Reviewer #2 (Public review):

      Summary:

      This paper addresses an interesting issue: how is the search for a visual target affected by its orientation (and the viewer's) relative to other items in the scene and gravity? The paper describes a series of visual search tasks, using recognizable targets (e.g., a cat) positioned within a natural scene. Reaction times and accuracy at determining whether the target was present or absent, trial-to-trial, were measured as the target's orientation, that of the context, and of the viewer themselves (via rotation in a flight simulator) were manipulated. The paper concludes that search is substantially affected by these manipulations, primarily by the reference frame of gravity, then visual context, followed by the egocentric reference frame.

      Strengths:

      This work is on an interesting topic, and benefits from using natural stimuli in VR / flight simulator to change participants' POV and body position.

      Weaknesses:

      There are several areas of weakness that I feel should be addressed.

      (1) The literature review/introduction seems to be lacking in some areas. The authors, when contemplating the behavioral consequences of searching for a 'rotated' target, immediately frame the problem as one of rotation, per se (i.e., contrasting only rotation-based explanations; "what rotates and in which 'reference frame[s]' in order to allow for successful search?"). For a reader not already committed to this framing, many natural questions arise that are worth addressing.

      1a) Why do we need to appeal to rotation at all as opposed to, say, familiarity? A rotated cat is less familiar than a typically oriented one. This is a long-standing literature (e.g., Wang, Cavanagh, and Green (1994)), of course, with a lot to unpack.

      1b) What are the triggers for the 'corrective' rotation that presumably brings reference frames back into alignment? What if the rotation had not been so obvious (i.e., for a target that may not have a typical orientation, like a hand, or a ball, or a learned, nonsense object) or the background had not had such a clear orientation (like a cluttered non-naturalistic background, or a naturalistic backdrop viewed from an unfamiliar POV (e.g., from above), or a naturalistic background in which not all of the elements were rotated)? What, ultimately, is rotated? The entire visual field? Does that mean that searching for multiple targets at different angles of rotation would interfere with one another?

      1c) Relatedly, what is the process by which the visual system comes to know the 'correct' rotation? (Or, alternatively, is 'triggered to realize' that there is a rotation in play?) Is this something that needs to be learned? Is it only learned developmentally, through exposure to gravity? Could it be learned in the context of an experiment that starts with unfamiliar stimuli?

      1d) Why the appeal to natural images? I appreciate any time a study can be moved from potentially too stripped-down laboratory conditions to more naturalistic ones, but is this necessary in the present case? Would the pattern of results have been different if these were typical laboratory 'visual search' displays of disconnected object arrays?

      1e) How should we reconcile rotation-based theories of 'rotated-object' search with visual search results from zero gravity environments (e.g., for a review, see Leone (1998))?

      1f) How should we reconcile the current manipulations with other viewpoint-perspective manipulations (e.g., Zhang & Pan (2022))?

      (2) The presentation/interpretation of results would benefit from more elaboration and justification.

      2a) All of the current interpretations rely on just the RT data. First, the RT results should also be presented in natural units (i.e., seconds/ms), not normalized. In addition, results should be shown as violin plots or something similar that captures the distribution; a lot of important information is lost when just presenting one 'average' dot across participants. More fundamentally, I think we need a better accounting of performance (percent correct or d') to help contextualize the RT results. We should at least be offered some visualization (Heitz, 2014) of the speed-accuracy trade-off (SAT) for each of the conditions. Following this, the authors should more critically evaluate how any substantial SAT trends could affect the interpretation of results.

      2b) Unless I am missing something, the interpretation of the pattern of results (both qualitatively and quantitatively in their 'relative weight' analysis) relies on how they draw their contrasts. For instance, the authors contrast the two 'gravitational' conditions (target 0 deg versus target 90 deg) as if this were a change in a single variable/factor. But there are other ways to understand these manipulations that would affect contrasts. For instance, if one considers whether the target was 'consistent' (i.e., typically oriented) with respect to the context, egocentric, and gravitational frames, then the 'gravitational 0 deg' condition is consistent with context, egocentric view, but inconsistent with gravity. And, the 'gravitational 90 deg' condition, then, is inconsistent with context, egocentric view, but consistent with gravity. Seen this way, this is not a change in one variable, but three. The same is true of the baseline 0 deg versus baseline 90 deg condition, where again we have a change in all three target-consistency variables. The 'one variable' manipulations then would be: 1) baseline 0 versus visual context 0 (i.e., a change only in the context variable); 2) baseline 0 versus egocentric 0 (a change only in the egocentric variable); and 3) baseline 0 versus gravitational 0 (a change only in the gravitational variable). Other contrasts (e.g., gravitational 90 versus context 90) would showcase a change in two variables (in this case, a change in both context and gravity). My larger point is, again, unless I am really missing something, that the choice of how to contrast the manipulations will affect the 'pattern' of results and thereby the interpretation. If the authors agree, this needs to be acknowledged, plausible alternative schemes discussed, and the ultimate choice of scheme defended as the most valid.

      2c) Even with this 'relative weight' interpretation, there are still some patterns of results that seem hard to account for. Primarily, the egocentric condition seems hard to account for under any scheme, and the authors need to spend more time discussing/reconciling those results.

      2d) Some results are just deeply counterintuitive, and so the reader will crave further discussion. Most saliently for me, based on the results of Experiment 2 (specifically, the fact that gravitational 90 had better performance than gravitational 0), designers of cockpits should have all gauges/displays rotate counter to the airplane so that they are always consistent with gravity, not the pilot. Is this indeed a fair implication of the results?

      2e) I really craved some 'control conditions' here to help frame the current results. In keeping with the rhetorical questions posed above in 1a/b/c/d, if/when the authors engage with revisions to this paper, I would encourage the inclusion of at least some new empirical results. For me, the most critical would be to repeat some core conditions, but with a symmetric target (e.g., a ball), since that would seem to be the only way (given the current design) to tease out nuisance confounding factors such as, say, the general effect of performing search while sideways. Put another way, the authors would otherwise have to assume that search for a ball target (non-normalized RTs and search performance) in the baseline condition would be identical to that in the gravitational condition.
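
      The sensitivity measure mentioned in point 2a could be computed per condition along the following lines, so that it can be reported next to the raw RTs. This is a minimal sketch that assumes present/absent responses are tallied as hits, misses, false alarms, and correct rejections; the log-linear correction for extreme rates is one common convention and is an assumption here, not something stated in the manuscript.

```python
from statistics import NormalDist


def d_prime(hits: int, misses: int, false_alarms: int, correct_rejections: int) -> float:
    """Sensitivity d' = z(hit rate) - z(false-alarm rate).

    A log-linear correction (add 0.5 to each count, 1 to each total) keeps
    the rates away from 0 and 1, where the z-transform is undefined.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)
```

      Plotting d' against mean RT for each condition would give a simple view of whether faster conditions were also less accurate.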

    2. Reviewer #3 (Public review):

      The study tested how people search for objects in natural scenes using virtual reality. Participants had to find targets among other objects, shown upright or tilted. The main results showed that upright objects were found faster and more accurately. When the scene or body was rotated, performance changed, showing that people use cues from the environment and gravity to guide search.

      The manuscript is clearly written and well designed, but there are some aspects related to methods and analyses that would benefit from stronger support.

      First, the sample size is not justified with a power analysis, nor is it explained how it was determined. This is an important point to ensure robustness and replicability.

      Second, the reaction time data were processed using different procedures, such as the use of the median to exclude outliers and an ad hoc cut-off of 50 ms. These choices are not sufficiently supported by a theoretical rationale, and could appear as post-hoc decisions.

      Third, the mixed-model analyses are overall well-conducted; however, the specification of the random structure deserves further consideration. The authors included random intercepts for participants and object categories, which is appropriate. However, they did not include random slopes (e.g., for orientation or set size), meaning that variability in these effects across participants was not modelled. This simplification can make the models more stable, but it departs from the maximal random structure recommended by Barr et al. (2013). The authors do not explicitly justify this choice, and a reviewer may question why participant-specific variability in orientation effects, for example, was not allowed. Given the modest sample sizes (20 in Experiment 1 and 10 in Experiment 2), convergence problems with more complex models are likely. Nonetheless, ignoring random slopes can, in principle, inflate Type I error rates, so this issue should at least be acknowledged and discussed.

    1. Reviewer #1 (Public review):

      The authors conducted a series of experiments using two established decision-making tasks to clarify the relationship between internalizing psychopathology (anxiety and depression) and adaptive learning in uncertain and volatile environments. While prior literature has reported links between internalizing symptoms - particularly trait anxiety - and maladaptive increases in learning rates or impaired adjustment of learning rates, findings have been inconsistent. To address this, the authors designed a comprehensive set of eight experiments that systematically varied task conditions. They also employed a bifactor analysis approach to more precisely capture the variance associated with internalizing symptoms across anxiety and depression. Across these experiments, they found no consistent relationship between internalizing symptoms and learning rates or task performance, concluding that this purported hallmark feature may be more subtle than previously assumed.

      Strengths:

      (1) A major strength of the paper lies in its impressive collection of eight experiments, which systematically manipulated task conditions such as outcome type, variability, volatility, and training. These were conducted both online and in laboratory settings. Given that trial conditions can drive or obscure observed effects, this careful, systematic approach enables a robust assessment of behavior. The consistency of findings across online and lab samples further strengthens the conclusions.

      (2) The analyses are impressively thorough, combining model-agnostic measures, extensive computational modeling (e.g., Bayesian, Rescorla-Wagner, Volatile Kalman Filter), and assessments of reliability. This rigor contributes meaningfully to broader methodological discussions in computational psychiatry, particularly concerning measurement reliability.

      (3) The study also employed two well-established, validated computational tasks: a game-based predictive inference task and a binary probabilistic reversal learning task. This choice ensures comparability with prior work and provides a valuable cross-paradigm perspective for examining learning processes.

      (4) I also appreciate the open availability of the analysis code that will contribute substantially to the field using similar tasks.

      Weaknesses:

      (1) While the overall sample size (N = 820 across eight experiments) is commendable, the number of participants per experiment is relatively modest, especially in light of the inherent variability in online testing and the typically small effect sizes in correlations with mental health traits (e.g., r = 0.1-0.2). The authors briefly acknowledge that any true effects are likely small; however, the rationale behind the sample sizes selected for each experiment is unclear. This is especially important given that previous studies using the predictive inference task (e.g., Seow & Gillan, 2020, N > 400; Loosen et al., 2024, N > 200) have reported non-significant associations between trait anxiety symptoms and learning rates.

      (2) The motivation for focusing on the predictive inference task is also somewhat puzzling, given that no cited study has reported associations between trait anxiety and parameters of this task. While this is mitigated by the inclusion of a probabilistic reversal learning task, which has a stronger track record in detecting such effects, the study misses an opportunity to examine whether individual differences in learning-related measures correlate across the two tasks, which could clarify whether they tap into shared constructs.

      (3) The parameterization of the tasks, particularly the use of high standard deviations (SDs) of 20 and 30 for outcome distributions and hazard rates of 0.1 and 0.16, warrants further justification. Are these hazard rates sufficiently distinct? Might the wide SDs reduce sensitivity to volatility changes? Prior studies of the circle version of this predictive inference task (e.g., Vaghi et al., 2019; Seow & Gillan, 2020; Marzuki et al., 2022; Loosen et al., 2024; Hoven et al., 2024) typically used SDs around 12. Indeed, the Supplementary Materials suggest that variability manipulations did not substantially affect learning rates (Figure S5), calling into question whether the task manipulations achieved their intended cognitive effects.

      (4) Relatedly, while the predictive inference task showed good reliability, the reversal learning task exhibited only "poor-to-moderate" reliability in its learning-rate estimates. Given that previous findings linking anxiety to learning rates have often relied on this task, these reliability issues raise concerns about the robustness and generalizability of conclusions drawn from it.

      (5) As the authors note, the study relies on a subclinical sample. This limits the generalizability of the findings to individuals with diagnosed disorders. A growing body of research suggests that relationships between cognition and symptomatology can differ meaningfully between general population samples and clinical groups. For example, Hoven et al. (2024) found differing results in the predictive inference task when comparing OCD patients, healthy controls, and high- vs. low-symptom subgroups.

      (6) Finally, the operationalization of internalizing symptoms in this study appears to focus on anxiety and depression. However, obsessive-compulsive disorder is also generally considered an internalizing disorder, which leaves a gap in the literature cited in the paper, particularly given the numerous studies using the predictive inference task to examine OCD/compulsivity (e.g., Vaghi et al., 2019; Seow & Gillan, 2020; Marzuki et al., 2022; Loosen et al., 2024; Hoven et al., 2024), rather than trait anxiety per se.
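
      To make the sample-size concern in weakness (1) concrete, the number of participants needed to detect a correlation of a given size can be approximated with the Fisher z-transformation. This is a rough sketch, assuming a two-sided test at alpha = 0.05 and 80% power; these defaults and the function name are illustrative choices, not values taken from the manuscript.

```python
import math
from statistics import NormalDist


def n_for_correlation(r: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate N needed to detect a true correlation r (two-sided test).

    Uses the Fisher z approximation:
    N = ((z_{1 - alpha/2} + z_{power}) / atanh(r))^2 + 3.
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)
    z_power = z(power)
    return math.ceil(((z_alpha + z_power) / math.atanh(r)) ** 2 + 3)
```

      Under these assumptions, the approximation gives about 783 participants for r = 0.1 and about 194 for r = 0.2, which can be compared against the per-experiment sample sizes.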

      Overall:

      Despite the named limitations, the authors have done very impressive work in rigorously examining the relationship between anxiety/internalizing symptoms and learning rates in commonly used decision-making tasks under uncertainty. Their conclusion is well supported by the consistency of their null findings across diverse task conditions, though its generalizability may be limited by some features of the task design and its sample. This study provides strong evidence that will guide future research, whether by shifting the focus to dysfunctions with larger effect sizes or by extending investigations to clinical populations.

    2. Reviewer #2 (Public review):

      Summary:

      In this work, the authors recruited a large sample of participants to complete two well-established paradigms: the predictive inference task and the volatile reversal learning task. With this dataset, they not only replicated several classical findings on uncertainty-based learning from previous research but also demonstrated that individual differences in learning behavior are not systematically associated with internalizing psychopathology. These results provide valuable large-scale evidence for this line of research.

      Strengths:

      (1) Use of two different tasks.

      (2) Recruitment of a large sample of participants.

      (3) Inclusion of multiple experiments with different conditions, demonstrating strong scientific rigor.

      Weaknesses:

      Below are questions rather than 'weaknesses':

      (1) This study uses a large human sample, which is a clear strength. However, was the study preregistered? It would also be useful to report a power analysis to justify the sample size.

      (2) Previous studies have tested two core hypotheses: (a) that internalizing psychopathology is associated with overall higher learning rates, and (b) that it is associated with learning rate adaptation. In the first experiment, the findings seem to disconfirm only the first hypothesis. I found it unclear how, in the predator task, participants were expected to adjust their learning rate to adapt to volatility. Could the authors clarify this point?

      (3) According to the Supplementary Information, Model 13 showed the best fit, yet the authors selected Model 12 due to the larger parameter variance in Model 13. What would the results of Model 13 look like? Furthermore, do Models 12 and 13 correspond to the optimal models identified by Gagne et al. (2020)? Please clarify.

      (4) In the Discussion, the authors addressed both task reliability and parameter reliability. However, the term reliability seems to be used differently in these two contexts. For example, good parameter recovery indicates strong reliability in one sense, but can we then directly equate this with parameter reliability? It would be helpful to define more precisely what is meant by reliability in each case.

      (5) The Discussion also raises the possibility that limited reliability may represent a broader challenge facing the interdisciplinary field of computational psychiatry. What, in the authors' view, are the key future directions for the field to mitigate this issue?

    1. A comprehensive moral vision and a moral self-image
      • Preconventional
      • Avoiding punishment
      • Individual advantage
      • Conventional
      • Pleasing others
      • Authorities determine what is good and bad
      • Post-conventional
      • Protecting society and the individual
      • Universal principles
    1. Reviewer #1 (Public review):

      Summary:

      The authors present MerQuaCo, a computational tool that fills a critical gap in the field of spatial transcriptomics: the absence of standardized quality control (QC) tools for image-based datasets. Spatial transcriptomics is an emerging field where datasets are often imperfect, and current practices lack systematic methods to quantify and address these imperfections. MerQuaCo offers an objective and reproducible framework to evaluate issues like data loss, transcript detection variability, and efficiency differences across imaging planes.

      Strengths:

      (1) The study draws on an impressive dataset comprising 641 mouse brain sections collected on the Vizgen MERSCOPE platform over two years. This scale ensures that the documented imperfections are not isolated or anecdotal but represent systemic challenges in spatial transcriptomics. The variability observed across this large dataset underscores the importance of using sufficiently large sample sizes when benchmarking different image-based spatial technologies. Smaller datasets risk producing misleading results by over-representing unusually successful or unsuccessful experiments. This comprehensive dataset not only highlights systemic challenges in spatial transcriptomics but also provides a robust foundation for evaluating MerQuaCo's metrics. The study sets a valuable precedent for future quality assessment and benchmarking efforts as the field continues to evolve.

      (2) MerQuaCo introduces thoughtful metrics and filters that address a wide range of quality control needs. These include pixel classification, transcript density, and detection efficiency across both x-y axes (periodicity) and z-planes (p6/p0 ratio). The tool also effectively quantifies data loss due to dropped images, providing tangible metrics for researchers to evaluate and standardize their data. Additionally, the authors' decision to include examples of imperfections detectable by visual inspection but not flagged by MerQuaCo reflects a transparent and balanced assessment of the tool's current capabilities.
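
      As a rough illustration of the z-plane metric mentioned in point (2), the p6/p0 ratio can be read as the number of transcripts detected in one imaging plane divided by the number detected in another. The sketch below is not MerQuaCo's implementation: it assumes seven planes indexed 0 to 6, and the function name `plane_ratio` and the toy data are hypothetical.

```python
from collections import Counter


def plane_ratio(z_planes, top=6, bottom=0):
    """Return count(top plane) / count(bottom plane).

    z_planes holds one z-plane index per detected transcript
    (an assumed input format, chosen for illustration only).
    """
    counts = Counter(z_planes)
    if counts[bottom] == 0:
        raise ValueError("no transcripts detected in the bottom plane")
    return counts[top] / counts[bottom]


# Hypothetical toy data: 100 transcripts in plane 0, 90 in plane 3, 80 in plane 6.
toy_z = [0] * 100 + [3] * 90 + [6] * 80
ratio = plane_ratio(toy_z)
print(f"p6/p0 = {ratio:.2f}")
```

      A ratio well below 1 would indicate that fewer transcripts are detected in plane 6 than in plane 0; what counts as an acceptable value is a study-specific decision.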

      Weaknesses:

      (1) The study focuses on cell-type label changes as the main downstream impact of imperfections. Broadening the scope to explore expression response changes of downstream analyses would offer a more complete picture of the biological consequences of these imperfections and enhance the utility of the tool.

      (2) While the manuscript identifies and quantifies imperfections effectively, it does not propose post-imaging data processing solutions to correct these issues, aside from the exclusion of problematic sections or transcript species. While this is understandable given the study is aimed at the highest quality atlas effort, many researchers don't need that level of quality to compare groups. It would be important to include discussion points as to how those cut-offs should be decided for a specific study.

      (3) Although the authors demonstrate the applicability of MerQuaCo on a large MERFISH dataset, and the limited number of sections from other platforms, it would be helpful to describe its limitations in its generalizability.

    2. Reviewer #3 (Public review):

      Summary:

      MerQuaCo is an open-source computational tool developed for quality control in image-based spatial transcriptomics data, with a primary focus on data generated by the Vizgen MERSCOPE platform. The authors analyzed a substantial dataset of 641 fresh-frozen adult mouse brain sections to identify and quantify common imperfections, aiming to replace manual quality assessment with an automated, objective approach, providing standardized data integrity measures for spatial transcriptomics experiments.

      Strengths:

      The manuscript's strengths lie in its timely utility, rigorous empirical validation, and practical contributions to methodology and biological discovery in spatial transcriptomics.

      Weaknesses:

      While MerQuaCo demonstrates utility in large datasets and cross-platform potential, its generalizability and validation require expansion, particularly for non-MERSCOPE platforms and real-world biological impact.

    1. Reviewer #3 (Public review):

      Summary:

      MerQuaCo is an open-source computational tool developed for quality control in image-based spatial transcriptomics data, with a primary focus on data generated by the Vizgen MERSCOPE platform. The authors analyzed a substantial dataset of 641 fresh-frozen adult mouse brain sections to identify and quantify common imperfections, aiming to replace manual quality assessment with an automated, objective approach, providing standardized data integrity measures for spatial transcriptomics experiments.

      Strengths:

      The manuscript's strengths lie in its timely utility, rigorous empirical validation, and practical contributions to methodology and biological discovery in spatial transcriptomics.

      Weaknesses:

      While MerQuaCo demonstrates utility in large datasets and cross-platform potential, its generalizability and validation are currently limited by the availability of sufficient datasets from non-MERSCOPE platforms and non-brain tissues. The evaluation of data imperfections' impact on downstream analyses beyond cell typing (e.g., differential expression, spatial statistics, and cell-cell interactions) is also constrained by space and scope. However, these represent valuable directions for future work as more datasets become available.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      The authors present MerQuaCo, a computational tool that fills a critical gap in the field of spatial transcriptomics: the absence of standardized quality control (QC) tools for image-based datasets. Spatial transcriptomics is an emerging field where datasets are often imperfect, and current practices lack systematic methods to quantify and address these imperfections. MerQuaCo offers an objective and reproducible framework to evaluate issues like data loss, transcript detection variability, and efficiency differences across imaging planes.

      Strengths:

      (1) The study draws on an impressive dataset comprising 641 mouse brain sections collected on the Vizgen MERSCOPE platform over two years. This scale ensures that the documented imperfections are not isolated or anecdotal but represent systemic challenges in spatial transcriptomics. The variability observed across this large dataset underscores the importance of using sufficiently large sample sizes when benchmarking different image-based spatial technologies. Smaller datasets risk producing misleading results by over-representing unusually successful or unsuccessful experiments. This comprehensive dataset not only highlights systemic challenges in spatial transcriptomics but also provides a robust foundation for evaluating MerQuaCo's metrics. The study sets a valuable precedent for future quality assessment and benchmarking efforts as the field continues to evolve.

      (2) MerQuaCo introduces thoughtful metrics and filters that address a wide range of quality control needs. These include pixel classification, transcript density, and detection efficiency across both x-y axes (periodicity) and z-planes (p6/p0 ratio). The tool also effectively quantifies data loss due to dropped images, providing tangible metrics for researchers to evaluate and standardize their data. Additionally, the authors' decision to include examples of imperfections detectable by visual inspection but not flagged by MerQuaCo reflects a transparent and balanced assessment of the tool's current capabilities.

      Weaknesses:

      (1) The study focuses on cell-type label changes as the main downstream impact of imperfections. Broadening the scope to explore expression response changes of downstream analyses would offer a more complete picture of the biological consequences of these imperfections and enhance the utility of the tool.

      Here, we focused on the consequences of imperfections on cell-type labels, one common use for spatial transcriptomics datasets. Spatial datasets are used for so many other purposes that there are almost endless ways in which imperfections could impact downstream analyses. It is difficult to see how we might broaden the scope to include more downstream effects, while providing enough analysis to derive meaningful conclusions, all within the scope of a single paper. Existing studies bring some insight into the impact of imperfections and we expect future studies will extend our understanding of consequences in other biological contexts.

      (2) While the manuscript identifies and quantifies imperfections effectively, it does not propose post-imaging data processing solutions to correct these issues, aside from the exclusion of problematic sections or transcript species. While this is understandable given the study is aimed at the highest quality atlas effort, many researchers don't need that level of quality to compare groups. It would be important to include discussion points as to how those cut-offs should be decided for a specific study.

      Studies differ greatly in their aims and, as a result, the impact of imperfections in the underlying data will differ also, preventing us from offering meaningful guidance on how cut-offs might best be identified. Rather, our aim with MerQuaCo was to provide researchers with tools to generate information on their spatial datasets, to facilitate downstream decisions on data inclusion and cut-offs.

      (3) Although the authors demonstrate the applicability of MerQuaCo on a large MERFISH dataset, and the limited number of sections from other platforms, it would be helpful to describe its limitations in its generalizability.

      In figure 9, we addressed the limitations and generalizability of MerQuaCo as best we could with the available datasets. Gaining deep insight into the limitations and generalizability of MerQuaCo would require application to multiple large datasets and, to the best of our knowledge, these datasets are not available.

      Reviewer #2 (Public review):

      The authors present MerQuaCo, a computational tool for quality control in image-based spatial transcriptomics, especially MERSCOPE. They assessed MerQuaCo on 641 slides produced in their institute in terms of the ratio of imperfection, transcript density, and variations of quality by different planes (x-axis).

      Strengths:

      This looks to be a valuable work that can be a good guideline of quality control in future spatial transcriptomics. A well-controlled spatial transcriptomics dataset is also important for the downstream analysis.

      Weaknesses:

      The results section needs to be more structured.

      We have split the ‘Transcript density’ subsection of the results into 3 new subsections.

      Reviewer #3 (Public review):

      MerQuaCo is an open-source computational tool developed for quality control in image-based spatial transcriptomics data, with a primary focus on data generated by the Vizgen MERSCOPE platform. The authors analyzed a substantial dataset of 641 fresh-frozen adult mouse brain sections to identify and quantify common imperfections, aiming to replace manual quality assessment with an automated, objective approach, providing standardized data integrity measures for spatial transcriptomics experiments.

      Strengths:

      The manuscript's strengths lie in its timely utility, rigorous empirical validation, and practical contributions to methodology and biological discovery in spatial transcriptomics.

      Weaknesses:

      While MerQuaCo demonstrates utility in large datasets and cross-platform potential, its generalizability and validation require expansion, particularly for non-MERSCOPE platforms and real-world biological impact.

      We agree that there is value in expanding our analyses to non-MERSCOPE platforms, to tissues other than brain, and to analyses other than cell typing. The limiting factor in all these directions is the availability of large enough datasets to probe the limits of MerQuaCo. We look forward to a future in which more datasets are available and it’s possible to extend our analyses.

      Reviewer #1 (Recommendation for the Author):

      (1) To better capture the downstream impacts of imperfections, consider extending the analysis to additional metrics, such as specificity variation across cell types, gene coexpression, or spatial gene patterning. This would deepen insights into how these imperfections shape biological interpretations and further demonstrate the versatility of MerQuaCo.

      These are compelling ideas, but we are unable to study so many possible downstream impacts in sufficient depth in a single study. Insights into these topics will likely come from future studies.

      (2) In Figure 7 legend, panel label (D) is repeated thus panels E-F are mislabelled. 

      We have corrected this error.

      (3) Ensure that the image quality is high for the figures. 

      We will upload Illustrator files, ensuring that images are at full resolution.

      Reviewer #2 (Recommendation for the Author):

      (1) A result subsection "Transcript density" looks too long. Please provide a subsection heading for each figure. 

      We have split this section into 3 with new subheadings.

      (2) The result subsection title "Transcript density" sounds ambiguous. Please provide a detailed title describing what information this subsection contains. 

      We have renamed this section ‘Differences in transcript density between MERSCOPE experiments’.

      Minor: 

      (1) There is no explanation of the black and grey bars in Figure 2A.

      We have added information to the figure legend, identifying the datasets underlying the grey and black bars.

      (2) In the abstract, the phrase "High-dimension" should be "High-dimensional". 

      We have changed ‘high-dimension’ to ‘high-dimensional’.

      (3) In the abstract, "Spatial results" is an unclear expression. What does it stand for? 

      We have replaced the term ‘spatial results’ with ‘the outputs of spatial transcriptomics platforms’.

      Reviewer #3 (Recommendation for the Author):

      (1) While the tool claims broad applicability, validation is heavily centered on MERSCOPE data, with limited testing on other platforms. The authors should expand validation to include more diverse platforms and add a small analysis of non-brain tissue. If broader validation isn't feasible, modify the title and abstract to reflect the focus on the mouse brain explicitly.

      We agree that expansion to other platforms is desirable, but to the best of our knowledge sufficient datasets from other platforms are not available. In the abstract, we state that ‘… we describe imperfections in a dataset of 641 fresh-frozen adult mouse brain sections collected using the Vizgen MERSCOPE.’

      (2) The impact of data imperfections on downstream analysis needs a more comprehensive evaluation. The authors should expand beyond cluster label changes to include a) differential expression analysis with simulated imperfections, b) impact on spatial statistics and pattern detection, and c) effects on cell-cell interactions. 

      Each of these ideas could support a substantial study. We are unable to do them justice in the limited space available as an addition to the current study.

      (3) The pixel classification workflow and validation process need more detailed documentation. 

      The methods and results together describe the workflow and validation in depth. We are unclear what details are missing.

      (4) The manuscript lacks comparison to existing QC pipelines such as Squidpy and Giotto. The authors should benchmark MerQuaCo against them and provide integration options with popular spatial analysis tools with clear documentation.

      To the best of our knowledge, Squidpy and Giotto lack QC benchmarks, certainly of the parameters characterized by MerQuaCo. Direct comparison isn’t possible.

    1. If we don’t know whether George or GPT-3 wrote that essay or term paper, we’ll have to figure out how to assign meaningful written work.

      main idea/thesis

    1. By the 1980s desktop computing was becoming sufficiently widespread that the use of Geographic Information Systems (GIS) was feasible for greater numbers of archaeologists. The other ‘killer app’ of the time was computer-aided design, which allowed metric 3-dimensional reconstructions from the plans drawn on site by excavators.

      The information these reconstructions provide allows questions that could previously be answered only by conjecture to be addressed realistically. It supports more accurate research, especially into past events or those to come, which helps build a true understanding of history.

    1. “The writing process ensures that you stay organized and focused while allowing you to break up a larger assignment into several distinct tasks.”

      I like this part because it reminds me that writing doesn’t have to feel overwhelming if I take it step by step. I usually try to finish everything in one sitting, so learning to slow down and follow a process could really help me improve my writing.

    1. Synthesis: The Role of Alcohol in Society

      Summary

      This briefing document analyzes the complex and paradoxical role of alcohol in society, drawing on historical, sociocultural, scientific, and political perspectives.

      Alcohol is presented as a double-edged substance: on the one hand, a powerful social lubricant and a pillar of cultural rituals and moments of conviviality, deeply rooted in human history for millennia.

      On the other hand, it is a major destructive force, responsible for 2,200 deaths per day in Europe according to the WHO, linked to more than 200 diseases, and generating colossal societal costs, estimated at 57 billion euros per year in Germany alone.

      The document highlights society's fundamental ambivalence toward alcohol, which swings between celebrating it in rituals and stigmatizing individual dependence.

      Historical and modern attempts at regulation have often met strong popular resistance, illustrating how difficult it is to manage a substance so closely tied to pleasure, identity, and social cohesion.

      Ultimately, the most effective policies for reducing alcohol-related harm, namely raising prices and limiting access, run up against this deeply rooted cultural acceptance.

      1. The Fundamental Paradox of Alcohol: Pleasure and Destruction

      Alcohol occupies a central and ambivalent place in society, embodying both pleasure and danger.

      This duality lies at the heart of our relationship with the substance.

      The Positive Side: Alcohol is associated with pleasant sensations, such as a "gentle feeling of warmth in the belly", and with enjoyable settings.

      It is seen as a facilitator of conviviality that can lead to "interesting conversations" and foster a sense of belonging.

      One quotation sums up the paradox well:

      "I always say that I have spent some of the best nights of my life with alcohol, and also some of the worst."

      The Dark Side: Its destructive power is immense.

      Mortality: The WHO estimates that about 2,200 people die every day in Europe because of alcohol.

      Diseases: Recent studies link regular alcohol consumption to more than 200 diseases.

      Dependence: Alcohol is the third most addictive substance in Germany, after tobacco and medication.

      In France, one person in ten has a problem with alcohol.

      Social Consequences: It leads to loneliness, anxiety, depression, and dependence.

      Although overall consumption is falling in Europe, it remains significant.

      In Germany, it has dropped from 141 L to 115 L of alcoholic beverages per person per year since 2008, which still amounts to "one beer a day".
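
      The "one beer a day" figure can be checked with simple arithmetic. The sketch below divides the annual volumes quoted above by 365 days. The helper name `litres_per_day` is hypothetical, and the comparison with a 0.33 L bottle is an assumption, since the source does not say what serving size it means by "one beer".

```python
DAYS_PER_YEAR = 365


def litres_per_day(litres_per_year):
    """Convert an annual per-capita volume (litres) to a daily average."""
    return litres_per_year / DAYS_PER_YEAR


# Annual per-capita volumes quoted in the text (litres of alcoholic beverage).
for year_total in (141, 115):
    print(f"{year_total} L/year = {litres_per_day(year_total):.2f} L/day")
```

      Under these assumptions, 115 L per year comes to roughly a third of a litre per day, which is close to one small bottle of beer.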

      2. A Historical Perspective: A Companion of Humanity

      Humanity's relationship with alcohol goes back millennia, suggesting that it may have played a role in our evolution and in the development of our civilizations.

      Ancestral Origins: Evidence suggests that alcohol is "as old as humanity".

      ◦ Archaeologists in China have found vessels containing 9,000-year-old wine residues.

      ◦ In Georgia, alcohol consumption dates back at least 8,000 years.

      ◦ The discovery was probably accidental, arising from naturally fermented fruit.

      Historical Advantages:

      Energy Source: 1 gram of alcohol contains 7 calories (kcal), nearly twice as many as protein or carbohydrates.

      Sanitary Safety: Alcohol dissolves the membranes of germs, making fermented drinks (beer, wine) safer to consume than potentially contaminated water.

      Means of Payment: Beer was used as a quasi-currency.

      A 5,000-year-old clay pay slip from Mesopotamia records units of beer.

      In Egypt, the pyramid workers were paid in beer.

      Mass Consumption: In medieval Europe, researchers estimate consumption at 3 litres of alcoholic beverages per person per day, children included.

      3. The Sociocultural Role: The Cement of Human Relationships

      Alcohol is omnipresent in social structures, acting as a "social lubricant" and a marker of important moments.

      Social Cohesion:

      ◦ It fosters a "sense of belonging" by creating a collective experience.

      ◦ One experiment showed that a group drinking a little vodka "interacted more, laughed much more, and overall had a more pleasant time".

      ◦ Studies indicate that people who regularly go to bars in moderation are better integrated socially.

      Rituals and Celebrations: Alcohol serves to mark the boundary between "everyday normality and the exceptional".

      ◦ It is present at every stage of life: births ("wetting the baby's head"), weddings (champagne), funerals.

      ◦ Even in a religious context, wine is used to represent the blood of Christ.

      ◦ Choosing a more expensive, exceptional drink such as champagne for a birthday is a way to "mark a solemn moment".

      Influence on Societal Development:

      Sedentarization: One theory holds that beer production at sites such as Göbekli Tepe (12,000 years ago) may have strengthened social cohesion and encouraged human groups to settle.

      Infrastructure: Alcohol production influenced the development of means of transport (barrels), storage spaces, and buildings (breweries).

      Cultural Variations: Drinking customs vary:

      Norway: Sobriety during the week, heavy drinking at the weekend.

      France/Italy: A glass of wine with lunch.

      4. Health Impacts and Mechanisms of Action

      From a chemical and biological standpoint, alcohol's effects on the body explain both its appeal and its danger.

      The Ethanol Molecule: A small molecule (two carbon atoms, six hydrogen atoms, one oxygen atom), it easily crosses the blood-brain barrier to act on the brain.

      Action on Neurotransmitters: Alcohol influences three main systems:

      | System | Main Effect | Consequence |
      | :--- | :--- | :--- |
      | GABA | Anxiolytic | Feeling of relaxation, reduced anxiety |
      | Glutamate | Increases alertness | Stimulated presence and attention |
      | Dopamine | Induces happiness | Feeling of pleasure, even euphoria |

      Metabolic Toxicity:

      ◦ The liver converts alcohol into acetaldehyde, which is a "poison".

      ◦ This substance circulates in the blood and reaches every organ (brain, skin, etc.).

      Specific Damage: Alcohol can cause gastritis (an attack on the stomach lining), damage the liver, lead to atrophy of the cerebellum, and be toxic to the pancreas.

      Cancer Risk: Regular alcohol consumption increases the risk of tumours and cancer.

      5. Dependence, Costs, and Societal Ambivalence

      Society has a contradictory relationship with alcohol, celebrating it while leaving individuals to deal alone with its most serious consequences.

      Dependence:

      ◦ The greatest difficulty is denial: "the more dependent people are, the less they realize that they are."

      ◦ Dependence isolates the individual, producing the opposite of the sense of belonging originally sought.

      Economic Costs:

      ◦ According to the addiction yearbook, alcohol costs 57 billion euros per year in Germany.

      ◦ These costs include crime, violence, drunk driving, sick leave, and treatment.

      Social Hypocrisy:

      ◦ Society sells alcohol as "something positive associated with parties", but "those who cannot manage their consumption are left to fend for themselves".

      Responsibility is individualized.

      ◦ This ambivalence is reflected in public policy: in 2024, the German Nutrition Society recommended "zero alcohol", while 30% of the addiction prevention budget was cut.

      ◦ Alcohol advertising remains lightly regulated, and "accompanied drinking" (from age 14) is permitted in Germany.

      6. Attempts at Regulation and Popular Resistance

      History shows that attempts by the authorities to control alcohol consumption have often ended in failure in the face of social pressure.

      The Case of Bavaria (1844): King Ludwig I tried to raise the price of beer.

      The measure caused such "unrest among the population" that it was withdrawn after only four days.

      Alcohol is perceived as a "last bastion that allows us to distinguish ourselves as human beings".

      Gorbachev's Campaign (1980s): Mikhail Gorbachev launched an anti-alcohol campaign in the USSR to improve public health.

      Health outcomes: Mortality fell considerably during this period.

      Political failure: The campaign was a "disaster" for Gorbachev and contributed to his downfall. Ironically, he ceded power to Boris Yeltsin, "notoriously alcoholic".

      Prohibition in the United States: Although it created a black market, Prohibition brought a considerable drop in alcohol consumption and in related diseases and deaths.

      The Ambivalence of the Church: The Christian Church preached moderation ("the Christian ideal of the right measure") while incorporating wine into its most sacred rites (the Last Supper, the wedding at Cana), illustrating a "generalized hypocrisy toward alcohol".

      7. Toward Effective Policies?

      The document suggests that current awareness campaigns are largely ineffective and that more structural measures are needed to reduce alcohol-related harm.

      Ineffectiveness of Campaigns: Awareness campaigns are judged to have little effect; they serve mainly to "ease the conscience".

      The Two Effective Levers: To reduce consumption, two measures are considered essential:

      1. Limit access to alcohol.

      2. Raise its price.

      The Example of Tobacco: The United Kingdom is cited as an example.

      With a pack of cigarettes costing €16, the smoking rate there is 11.9%, compared with 24.5% in France and 20.1% in Germany, where prices are lower.

      The Question of Taxation: The document notes that alcohol is "very cheap" in many parts of Europe. For example, the minimum tax on wine set within the EU is €0.

      8. Conclusion: Accepting a Human and Complex Reality

      The attraction of alcohol, despite its known dangers, seems to be a deeply human trait, linked to a "self-destructive dimension" or to a "desire to escape the reality of life".

      People often react angrily to warnings, perceiving them as a form of infantilization.

      The conclusion suggests that it may be impossible to enjoy alcohol "without the double standard that accompanies it".

      The first step would be to fully acknowledge the paradox of alcohol, its advantages and its drawbacks, in order to learn to live with this complex substance, which shows no sign of disappearing from our societies.

    1. Education as an Instrument of Power: A Historical Analysis

      https://www.youtube.com/watch?v=JCKbqhfFKy8

      Summary

      This briefing document analyzes the historical role of education, showing that, beyond its ideal of personal fulfillment and service to the common good, it has chiefly been a strategic instrument used by elites to establish and maintain their power.

      The analysis, which runs from ancient Sparta to the present day, reveals a recurring pattern: public school systems are often set up in direct response to social unrest, with the aim of training obedient citizens, consolidating empires, and imposing cultural norms.

      Case studies illustrate this thesis, from Prussia, which pioneered compulsory schooling in order to quell peasant revolts, to imperial China, which used meritocratic examinations to break the power of the nobility.

      The tragic example of Canada's residential schools for Indigenous children exposes the most extreme form of this instrumentalization, in which education becomes a weapon of cultural domination and eradication.

      In conclusion, history reveals a fundamental tension between an education aimed at autonomy and critical thinking and a training focused on performance, obedience, and consolidation of the status quo.

      1. Introduction: The Right to Education and Its Hidden Purposes

      The modern ideal of education, conceived by Plato as a way out of "the cave of our own ignorance" and enshrined in Article 26 of the 1948 Universal Declaration of Human Rights, holds that education is a fundamental right serving the general interest.

      American studies even correlate a university degree with a life expectancy nearly nine years longer.

      However, a close historical examination raises an essential question: has education always pursued this goal of emancipation?

      History suggests that public education has often been a tool serving well-defined political interests and strategies of power.

      2. The Origins of Social Control Through Schooling

      Far from being an invention of modern democracies, compulsory public education has its roots in autocratic regimes that saw it as an effective way to guarantee social order and the stability of their power.

      Sparta: Training the Obedient Warrior-Citizen

      The first example of a structured public education system is found not in democratic Athens but in Sparta, a slave-holding military dictatorship.

      Context of Domination: Spartan society consisted of a minority of free citizens (the Spartiates) ruling over a very large population of helots, indigenous serfs.

      The ratio was estimated at seven helots to one Spartiate.

      The Agoge, a Tool of Control: To keep control over this enslaved and numerically superior population, Sparta set up the agoge.

      It was a public education system, compulsory for Spartan boys from the age of 7 and designed as a military training camp intended to produce "superhuman warriors".

      Educational Goals: The emphasis was on endurance, obedience, and the suppression of all weakness, as shown by the "flogging contest".

      Strategic Literacy: Although helots were excluded from it, the program included literacy.

      The goal was not intellectual fulfillment but a military skill:

      "If a Spartan is sent on a spying mission and intercepts a written message, he must be able to read it."

      Conclusion: Spartan education aimed not at personal development but at training obedient citizen-soldiers, an instrument essential to the survival of the ruling power.

      The Carolingian Empire: Unifying in Order to Rule

      After the fall of the Western Roman Empire, Charlemagne launched the first great expansion of education in Europe.

      His project, far from being purely altruistic, was a calculated move to consolidate his vast empire.

      Administrative Need: To control his territory, Charlemagne needed a solid and unified administration.

      The court school served as a "breeding ground for future senior officials".

      Religious and Cultural Unification: Because the emperor's power rested on God, it was crucial to spread a standardized Christianity.

      The educational reform aimed to raise the level of the clergy and to standardize the liturgy throughout the empire.

      Harmonization of Writing: An efficient administration required a common script.

      "Carolingian minuscule" was developed for this purpose, unifying written communication.

      This script is the direct ancestor of the Times New Roman typeface.

      Conclusion: For Charlemagne, education was not an end in itself but a "necessary instrument for maintaining the cohesion of the empire".

      Prussia: Compulsory Schooling as a Bulwark Against Revolts

      It was in Prussia, in 1763, that Frederick II enacted the law creating the world's first compulsory primary education system.

      The analysis of political scientist Agustina Paglayan shows that this initiative, far from being a democratic advance, was a strategy of social control.

      The Paradox of Autocracies: Paglayan points out that "it was not democracies that led to the creation of primary education in the Western world. It mostly developed and expanded before countries became democratic."

      Education as a Response to Crises: A recurring pattern has been identified: most compulsory schooling laws were adopted just after popular revolts.

      Prussia (mid-18th century): The law was enacted following peasant rebellions.

      Massachusetts (1780s): The first American compulsory schooling law was a response to Shays' Rebellion.

      France (1833): The law followed the July Revolution.

      Peru (2000): Schooling was imposed in former rebel areas after a 20-year civil war.

      Goal: Indoctrination: Fearing the masses, political elites used primary school to "teach children that the status quo is acceptable and that there is no reason to rebel".

      Childhood is targeted because it is the period when "moral values and political behavior were most easily shaped".

      School as a "Daytime Prison": Drawing on the ideas of Michel Foucault, the document describes school as a disciplinary institution.

      Teachers act as guards, instilling punctuality, stillness, good behavior, and submission.

      The aim is to "create a well-oiled social machine".

      The Humboldtian Model: An alternative vision was proposed by the Prussian Wilhelm von Humboldt, for whom education should aim at the personal fulfillment of every individual, "whatever their social origin".

      However, after Napoleon's defeat, his ideas were deemed "dangerous" and set aside in favor of a return to "blind obedience".

      3. Education as a Tool of Selection and Power

      Beyond instilling obedience, education has also served to structure hierarchies of power, as the example of imperial China shows.

      Imperial China and the Examination System (Keju)

      For more than 1,000 years, China used an examination system (the Keju, instituted in the 7th century) to allocate civil service posts.

      A Façade of Meritocracy: On the surface, the system was based on merit.

      Candidates, sometimes nearly a million for about 400 finalist places, had to memorize Confucian classics running to as many as 400,000 characters.

      A Political Goal: The emperor's real aim was to limit the hold of the noble families that traditionally controlled the administration.

      By instituting an examination-based system, he extended his own power by creating a bureaucracy directly beholden to him.

      Global Influence: This model, which relied on merit to counter nepotism, inspired similar reforms as far away as England in the mid-19th century.

      4. Education as a Weapon of Cultural Domination

      The case of Canada's residential schools for Indigenous children represents the most sinister use of education, in which it is turned into a tool of cultural eradication.

      The Testimony of Gary Godfriitson (Sir Weepom People)

      Gary Godfriitson, a knowledge keeper of the Sir Weepom community, describes the traditional Indigenous education system as being based on "a careful study of children" in order to discover their individual talents and assign them expert mentors.

      This system, deemed "backward" by European settlers, was systematically dismantled.

      The Residential Schools: Special schools, or residential schools, were created with the aim of "destroying the Indigenous cultures of Canada".

      Gary Godfriitson, who entered at the age of 5, recalls: "We learned to keep quiet. We learned that we had no voice in those residential schools."

      A System of Abuse: Children were subjected to a regime of strict discipline, constant prayer, and forced labor ("a labor camp for children").

      They suffered "all kinds of abuse (...) sexual, physical, emotional".

      Tragic Toll: About 150,000 Indigenous children passed through these institutions.

      A Canadian research institute estimates that at least 4,100 of them died there from disease, neglect, or mistreatment, or while trying to escape.

      A Global Colonial Project: These residential schools were not an exception but "one of the major tools for cultural domination and for subjugating the other".

      5. Conclusion: What Purpose for the Education of Tomorrow?

      History shows that education has too rarely been "devoted solely to the common good".

      More often it has served to "secure power, steer careers, and impose norms".

      Today, a tension persists between two models:

      1. Education as Training: A model focused on performance, functionalization, and the monetization of knowledge, which produces individuals suited to a "well-oiled social machine".

      2. Education as Fulfillment: Humboldt's model, which favors personal development and the search for knowledge and meaning, and which promotes critical thinking and creativity as fundamental skills.

      The final question remains: "Which formula do we want for the future? An education that dictates what we must know, or an education that helps us discover who we really want to be?"

  4. documenta.jesuits.global documenta.jesuits.global
    1. WEARY: 1. Lighting a luxuriously-scented candle 2. Lying down to sleep on the sofa 3. Cuddling a soft toy 4. Enjoying a slice of cake and a soft, comforting kids' movie 5. Miss Spider's Sunny Patch Kids 6. My Little Pony: A Very Minty Christmas 7. Care Bears: The Nutcracker 8. Brambly Hedge 9. Snuggling up by a roaring fire 10. Winter cosiness

    2. DRAINED BUT COMFORTED: 1. Lying on a fluffy duvet with Alan and Tin Tin to protect you 2. Feeling safe and comforted 3. Jeff Tracy comforting my baby self when it's thundery and loud outside 4. Alan carrying me to his bed when I feel tired 5. All of International Rescue being like family to me 6. Lady Penelope kissing me all over my cheeks 7. The smell of vanilla filling the mansion lounge on Christmas 8. Slow, soft Winter songs playing on Lady Penelope's gramophone 9. Watching the Winter wind outside with Tin Tin 10. Snuggling under a cashmere blanket with Brains, Alan and Tin Tin to keep warm.

    1. WHAT MAKES SNOW DRAINED BUT COMFORTED: 1. Snow angels 2. Building snowmen 3. Hot cocoa and chocolate bars 4. Hot tubs 5. Swimming pools 6. A nice warm hug when it gets too cold 7. A glowing red nose 8. Drawing pictures of his goals 9. The scent of vanilla and festive spice 10. Bedtimes during Winter.

    1. A linear combination that describes an appropriately antisymmetrized multi-electron wavefunction for any desired orbital configuration is easy to construct for a two-electron system. However, interesting chemical systems usually contain more than two electrons. For these multi-electron systems, a relatively simple scheme for constructing an antisymmetric wavefunction from a product of one-electron functions is to write the wavefunction in the form of a determinant. John Slater introduced this idea, so the determinant is called a Slater determinant. The Slater determinant for the two-electron wavefunction of helium is

       $$|\psi(r_1, r_2)\rangle = \frac{1}{\sqrt{2}} \begin{vmatrix} \phi_{1s}(1)\alpha(1) & \phi_{1s}(1)\beta(1) \\ \phi_{1s}(2)\alpha(2) & \phi_{1s}(2)\beta(2) \end{vmatrix} \tag{3.9.27}$$

       We can introduce a shorthand notation for the arbitrary spin-orbital

       $$\phi_{i\alpha}(r) = \phi_i \alpha \tag{3.9.28}$$

       or

       $$\phi_{i\beta}(r) = \phi_i \beta \tag{3.9.29}$$

       as determined by the $m_s$ quantum number. A shorthand notation for the determinant in Equation 3.9.27 is then

       $$|\psi(r_1, r_2)\rangle = 2^{-\frac{1}{2}} \operatorname{Det} \left| \phi_{1s\alpha}(r_1)\, \phi_{1s\beta}(r_2) \right| \tag{3.9.30}$$

       The determinant is written so the electron coordinate changes in going from one row to the next, and the spin orbital changes in going from one column to the next. The advantage of having this recipe is clear if you try to construct an antisymmetric wavefunction that describes the orbital configuration for uranium! Note that the normalization constant is $(N!)^{-\frac{1}{2}}$ for $N$ electrons. The generalized Slater determinant for a multi-electron atom with N electrons is then

      Should probably add a discussion of how this does not give the wavefunctions above for excited-state He unless linear combinations are used to make sure that all spin-spatial combinations are included.
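
      As a rough numerical illustration of the recipe quoted above (rows follow the electron coordinate, columns follow the spin orbital, prefactor (N!)^(-1/2)), the sketch below evaluates a two-electron Slater determinant and checks that it changes sign when the electrons are exchanged. The function names, the unnormalized toy 1s orbital exp(-2r), and the sample coordinates are all assumptions made for this example; none of them come from the annotated text.

```python
import math
import numpy as np

# Hypothetical toy spin orbitals: an electron is a pair (r, ms), with r a radial
# distance and ms = +0.5 (alpha) or -0.5 (beta). exp(-2r) is an unnormalized
# hydrogen-like 1s function with Z = 2, chosen only for illustration.
def phi_1s_alpha(electron):
    r, ms = electron
    return math.exp(-2.0 * r) if ms == 0.5 else 0.0

def phi_1s_beta(electron):
    r, ms = electron
    return math.exp(-2.0 * r) if ms == -0.5 else 0.0

def slater_wavefunction(spin_orbitals, electrons):
    """Value of the Slater determinant at the given electron coordinates."""
    n = len(electrons)
    # Rows change with the electron, columns change with the spin orbital.
    matrix = np.array([[orbital(e) for orbital in spin_orbitals] for e in electrons])
    return np.linalg.det(matrix) / math.sqrt(math.factorial(n))

orbitals = [phi_1s_alpha, phi_1s_beta]
e1 = (0.5, 0.5)    # electron with spin alpha
e2 = (1.2, -0.5)   # electron with spin beta

psi_12 = slater_wavefunction(orbitals, [e1, e2])
psi_21 = slater_wavefunction(orbitals, [e2, e1])  # electrons exchanged
psi_11 = slater_wavefunction(orbitals, [e1, e1])  # same position and same spin

print(psi_12, psi_21, psi_11)
```

      Exchanging two electrons swaps two rows of the matrix, which flips the sign of the determinant, and placing two electrons at the same position with the same spin makes two rows identical, so the value vanishes. This sketch covers only a single determinant; it does not address the linear combinations needed for excited-state helium that the note above raises.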

    1. lei

      Cases of fixed-term hiring must be provided for in <u>ordinary law</u>. There is no reason to require a complementary law to regulate fixed-term hiring. To that effect:

      • Informativo 1162
      • ADI 7057 / CE
      • Adjudicating body: Full Court (Tribunal Pleno)
      • Rapporteur: Justice DIAS TOFFOLI
      • Judgment: 06/12/2024 (virtual session)
      • Branch of law: Administrative
      • Subject matter: Public Agent; Temporary Hiring; Requirements; Socio-Educational Agent; Regulation; State Complementary Law

      Temporary hiring at the state level and its regulation by complementary law

      Summary - A state constitutional provision that <u>requires the enactment of a complementary law</u> to regulate cases of fixed-term hiring to meet a temporary need of exceptional public interest is unconstitutional, because it violates the principle of symmetry and the democratic principle.

      • Ceará Complementary Laws No. 163/2016, No. 169/2016 and No. 228/2020, which authorize, for a fixed term and to meet a temporary need of exceptional public interest, the hiring of professionals to carry out specialized technical activities within the State Socio-Educational Care System, are unconstitutional because they comply neither with the principle of competitive public examination (CF/1988, art. 37, II) nor with the requirements for temporary hiring (CF/1988, art. 37, IX).

      • In addressing temporary hiring, the Federal Constitution did not require that it be regulated by complementary law (1).

      • According to this Court's case law (2), requiring a complementary law in situations for which the Federal Constitution did not prescribe one restricts the democratic-representative arrangement that the Constitution established.

      • For temporary hiring to be considered valid, it is required that: i) the exceptional cases be provided for by law; ii) the hiring period be predetermined; iii) the need be temporary; iv) the public interest be exceptional; v) the need for hiring be indispensable, hiring being prohibited for the State's <u>ordinary permanent services</u> <u>that fall within the range of the Administration's normal contingencies</u> (3).

      • In the case at hand, Ceará Complementary Laws No. 163/2016 and No. 169/2016, although they set a predetermined hiring period and pursue a public objective of great importance, do not address an exceptional situation, since seeking to improve services in order to better serve society is inherent to Public Administration.

      • Moreover, the annexes to these laws show that they concern various functions within the administrative structure of the State Socio-Educational Care System that should have been filled by holders of public positions, given the ordinary and permanent nature of the activities.

      • In turn, Complementary Law No. 228/2020, enacted in the context of the Covid-19 pandemic, stated that the temporary need for hiring would cover the period required to hold a competitive public examination to fill permanent positions. However, the examination was only launched in April 2024, almost eight years after State Law No. 16.040/2016 created the Superintendence of the State Socio-Educational Care System. The perpetuation of these supposedly temporary hirings shows administrative inertia in regularizing that superintendence's staffing structure.

      • On the basis of these and other considerations, the Full Court, by majority, partially granted the action in order to: (i) declare unconstitutional the word “complementary” in art. 154, item XIV, of the Constitution of the State of Ceará, with ex nunc effect, so that the decision on this point takes effect from the publication of the minutes of this judgment; and (ii) declare unconstitutional State Complementary Laws No. 163/2016, No. 169/2016 and No. 228/2020, while preserving the temporary contracts entered into under those laws until their terms expire, after which the State of Ceará must staff its State Socio-Educational Care System with civil servants approved in a competitive public examination.

    2. desde que aceito por ambas as partes

      Debts may be offset against judgments that have become final only with the express consent of the entities involved. To that effect:

      • RE 657686
      • Adjudicating body: Full Court (Tribunal Pleno)
      • Rapporteur: Justice LUIZ FUX
      • Judgment: 23/10/2014
      • Publication: 05/12/2014

      EXTRAORDINARY APPEAL WITH GENERAL REPERCUSSION. CONSTITUTIONAL LAW. REGIME FOR ENFORCING MONETARY JUDGMENTS AGAINST THE PUBLIC TREASURY. OFFSETTING OF DEBTS OWED TO THE PUBLIC TREASURY AGAINST CREDITS SUBJECT TO A SMALL-VALUE REQUISITION. IMPOSSIBILITY. JUDGMENT OF ADIs 4357 AND 4425 BY THE FULL BENCH OF THE SUPREMO TRIBUNAL FEDERAL. CONSTITUTIONAL AMENDMENT No. 62/2009. UNCONSTITUTIONALITY OF THE OFFSETTING SYSTEM FOR THE EXCLUSIVE BENEFIT OF THE PUBLIC TREASURY. OBSTRUCTION OF EFFECTIVE JURISDICTION (CRFB, ART. 5, XXXV), DISREGARD FOR SUBSTANTIVE RES JUDICATA (CRFB, ART. 5, XXXVI), OFFENSE TO THE SEPARATION OF POWERS (CRFB, ART. 2) AND AFFRONT TO EQUALITY BETWEEN THE STATE AND THE PRIVATE PARTY (CRFB, ART. 1, CAPUT, IN CONJUNCTION WITH ART. 5, CAPUT). HOLDING THAT APPLIES TO THE SAME EXTENT TO SMALL-VALUE REQUISITIONS. EXTRAORDINARY APPEAL DENIED.

      • 1. Offsetting taxes owed to the Public Treasury against credits arising from a court decision is a claim grounded in a provision held to be unconstitutional (art. 100, §§ 9 and 10, of the Constitution of the Republic, as worded by EC No. 62/2009).

      • 2. The Full Bench of the Supremo Tribunal Federal, in deciding ADIs No. 4.357 and 4.425, held §§ 9 and 10 of art. 100 of the Constitution of the Republic, as worded by EC No. 62/2009, to be unconstitutional, on the ground that offsetting the Public Treasury's debts registered as precatórios obstructs effective jurisdiction (CRFB, art. 5, XXXV), disregards substantive res judicata (CRFB, art. 5, XXXVI), undermines the Separation of Powers (CRFB, art. 2), and offends equality between the Government and the private party (CRFB, art. 5, caput), an essential canon of the Democratic Rule of Law (CRFB, art. 1, caput).

      • 3. Accordingly, unilateral offsetting of debts for the exclusive benefit of the Public Treasury is not constitutionally possible, even when the amounts involved are subject to the small-value requisition (RPV) payment regime.

      • 4. Extraordinary appeal denied.

      Theme 511

      • Offsetting of tax debts against small-value requisitions (RPV).

      Thesis - Unilateral offsetting of debts for the exclusive benefit of the Public Treasury is constitutionally prohibited even when the amounts involved are not subject to the precatório regime <u>but only to the small-value requisition system</u>.

    1. mathematician

      DRAINED BUT COMFORTED 1. Being cuddled by Jeff Tracy when I start crying as a baby 2. Having a sleep when I have a bad headache 3. Parker tickling me when I sleep on his lap as a baby 4. Lying in a boundaryless meadow with Brains, Alan and Tin Tin and gazing up at the sky in silence. 5. A bath with warm water 6. Lady Penelope singing me a lullaby 7. Snuggles in bed on a cold Winter night 8. Brains being comforted by Alan when things go wrong 9. Giving my working brain a rest 10. Stretching my toes on Alan's bed

    1. DEMERARA'S FAVOURITE THINGS 1. Cotton Candy (her favourite!) 2. The Larntown sweet shop 3. Trying Libby's latest goods 4. Ice cream and popcorn 5. The smell of marshmallows 6. Her song 'Sweet Treats Are Neat!' 7. Her home in the Candy Mountains 8. Her pet poodle Fluffcake 9. Filling rainclouds with candy and turning them into rainbow colours 10. Playing with Pibby and her friends

  5. www.planalto.gov.br www.planalto.gov.br
    1. preclusão

      General Rule: Not Subject to Preclusion - The main rule drawn from the case law is that matters of public order are, in principle, not subject to preclusion and may be examined at any time in the ordinary instances.

      • The ineptness of the initial pleading (inépcia da petição inicial), being a matter of public order, "may be raised and examined at any time in the <u>ordinary instances</u>, and is not subject to preclusion" (AgInt no AREsp 2.270.272). Likewise, whether the amount being enforced conforms to the judicial title is a matter of public order, and "there is no preclusion pro judicato in evidentiary activity for the judge" who seeks to verify the accuracy of the calculations (AgInt no AREsp 2.405.050).

      Recognized Exceptions: When a Matter of Public Order Is Precluded

      Despite the general rule, STJ case law establishes clear situations in which preclusion does reach matters of public order.

      1. Preclusion pro judicato (when the matter has already been decided in the proceedings) - The most frequent exception prevents the judge or court from re-examining a question of public order that was already decided earlier in the same proceedings without a timely appeal by the party.

      • Matters of public order "are subject to preclusion pro judicato in cases [...] in which the question [...] has already been examined and decided, without the opposing party having challenged the conclusion set out in the respective decision" (AgInt no REsp 1.535.655). This preclusion "prevents the review of matters decided in the proceedings, including those of public order, that were not challenged by the appropriate appeal at the proper time" (EDcl no REsp 1.708.238).

      2. Preclusion for the Parties (Consumptive and Temporal)

      • If the interested party does not raise the nullity or the question of public order at the first opportunity to speak in the case after it arises, or if the question has already been decided, preclusion operates against that party.

      • The nullity "was not timely raised in the motions for clarification [...], the appellant did not raise the nullity at the <u>first opportunity</u> after the defect occurred, so that preclusion of the matter is established, under art. 278 of the CPC/2015" (REsp 1.809.204). Moreover, "matters not challenged at the proper time, <u>including those of public order</u>, are subject to preclusion" (EDcl no AgInt no REsp 2.129.882). "Consumptive preclusion prevents the rediscussion of questions already decided, including those of public order" (AgInt no AREsp 2.302.911).

      3. Preclusion in the Higher Courts (Lack of Prequestionamento) - For a matter of public order to be examined in a special appeal (STJ), it must have been previously debated and decided by the court of origin. The absence of that debate precludes its examination in the higher court.

      • "Access to the extraordinary route depends on the indispensable prequestionamento of the matter before the court a quo, a constitutional requirement that applies even to matters of public order" (REsp 1.809.204 and Informações Complementares à Ementa do REsp 1.809.209).

      4. Maximum Preclusion (Res Judicata) - Once a decision becomes final, the questions decided, even those of public order, become settled and cannot be reargued, out of respect for legal certainty.

      • "Preclusion of the matter and respect for res judicata prevent the examination of a request to reclassify the conduct after the decision has become final, in keeping with the principle of legal certainty" (AgRg no RHC 953.536).
    2. embargos de declaração

      JURISPRUDÊNCIA EM TESES - EDITION 189 - MOTIONS FOR CLARIFICATION (EMBARGOS DE DECLARAÇÃO) I

      • 1) Motions for clarification may not be used to align the decision with the movant's view, to grant claims that reflect mere dissatisfaction, or to reargue matters already decided.

      • 2) The contradiction that justifies filing motions for clarification is internal contradiction, characterized by propositions that are irreconcilable with one another.

      • 3) It is not necessary to ratify a special appeal filed while motions for clarification were pending judgment when the earlier outcome remains unchanged. (Súmula n. 579/STJ)

      • 4) It is not for the Superior Tribunal de Justiça (STJ), even for purposes of prequestionamento, to examine constitutional provisions in motions for clarification, on pain of usurping the jurisdiction of the Supremo Tribunal Federal (STF).

      • 5) Filing motions for clarification with the evident purpose of prequestionamento is not dilatory; accordingly, the fine provided for in art. 1.026, § 2, of the Code of Civil Procedure must not be applied, pursuant to Súmula n. 98/STJ.

      • 6) Motions for clarification must be heard by the body that issued the challenged decision, regardless of any change in its composition, which neither offends the principle of the natural judge nor creates an exception to the principle of the physical identity of the judge.

      • 7) Exceptionally, motions for clarification are admitted to obtain the addition of stenographic notes to the case record when these are indispensable to understanding the appellate decision or to exercising the right to a full defense.

      • 8) A fine for dilatory motions for clarification may be imposed cumulatively with a fine for bad-faith litigation, since the two are of different natures.

      • 9) In keeping with the principles of fungibility of appeals and instrumentality of forms, motions for clarification may be converted into an internal appeal (agravo interno) when the request for clarification clearly seeks to modify the decision.

      • 10) Motions for clarification may not be received as a request for reconsideration, nor the latter as the former.


      JURISPRUDÊNCIA EM TESES - EDITION 190 - MOTIONS FOR CLARIFICATION (EMBARGOS DE DECLARAÇÃO) II

      • 1) Where modifying effect is granted to motions for clarification, the opposing party must first be notified to file a response, on pain of nullity of the judgment and violation of the principles of adversarial proceedings and full defense.

      • 2) Motions for clarification filed against a lower-court decision denying admission of a special appeal do not interrupt the time limit for filing the appeal (agravo) provided for in art. 1.042 of the CPC, the only appeal available, except when the decision is so generic that the appellant cannot ascertain why the appeal was denied, making it impossible to file that appeal.

      • 3) The expanded-panel judgment technique provided for in art. 942 of the CPC must be applied to motions for clarification when the dissenting vote could change the unanimous outcome of the appellate decision.

      • 4) Second motions for clarification are restricted to alleging defects in the decision on the first motions since, by virtue of consumptive preclusion, discussion of the previously challenged decision is not permitted.

      • 5) It is not possible, in motions for clarification, to adapt the reasoning of the challenged decision in light of a later change in case law.

      • 6) Motions for clarification are admissible, exceptionally, to conform the challenged decision to the position established in a general-repercussion case recognized by the Supremo Tribunal Federal and in an appeal decided under the repetitive-appeals procedure.

      • 7) Motions for clarification that seek to reargue matters already considered and decided by the court of origin in accordance with a súmula of the STJ or STF, or with a precedent decided under the repetitive-appeals procedure, are considered dilatory.

      • 8) A panel judgment of motions for clarification filed against a rapporteur's single-judge decision, without an internal appeal having been filed, does not exhaust the lower-court instance for purposes of filing a special appeal.

      • 9) A single-judge judgment of motions for clarification filed against the appellate decision of the court of origin, without an internal appeal having been filed, does not exhaust the lower-court instance for purposes of filing a special appeal.

      • 10) A rapporteur may rule alone on motions for clarification filed against a panel decision.


      JURISPRUDÊNCIA EM TESES - EDITION 191 - MOTIONS FOR CLARIFICATION (EMBARGOS DE DECLARAÇÃO) III

      • 1) Appellate attorney's fees may not be increased in the judgment of motions for clarification.

      • 2) Motions for clarification are not admissible against an order directing that a party be notified to cure the appeal filing fees (preparo), since such an act is not decisional in nature.

      • 3) Failure to address the merits of an appeal that did not pass the admissibility stage is not an omission that justifies motions for clarification.

      • 4) The notification to supplement the grounds of appeal referred to in art. 1.024, § 3, of the CPC is unnecessary when the motions for clarification received as an agravo regimental specifically challenge the grounds of the single-judge decision.

      • 5) The judgment of motions for clarification does not require inclusion on the docket or notice of the date of the session through publication in the official gazette, since the case is presented directly at the session and oral argument is not permitted.

      • 6) In the face of repeated, merely dilatory motions for clarification, the case record must be returned to the court of origin, regardless of publication of the challenged decision and certification that it has become final.

      • 7) Where suspensive effect is granted to motions for clarification for purposes of filing other appeals, it suspends the time limit only with respect to the challenged decision itself; it therefore has no effect beyond the proceedings to suspend time limits relating to decisions in other procedural incidents.

      • 8) Motions for clarification filed by one party do not interrupt or suspend the time the other party has to file motions against the same decision, since the time limit to appeal is common to both.


      JURISPRUDÊNCIA EM TESES - EDITION 192 - MOTIONS FOR CLARIFICATION (EMBARGOS DE DECLARAÇÃO) IV

      • 1) In a motion for clarification, it is forbidden to broaden the issues raised in the appeal to include arguments not previously raised, even on matters of public policy (ordem pública), because this constitutes new matter on appeal (inovação recursal) and shows a lack of prior raising of the issue (prequestionamento); this type of appeal is restricted to cases in which the decision contains a defect.

      • 2) Failure to indicate, in the grounds of the motion for clarification, any of the defects that make the motion admissible results in the motion not being entertained for deficient reasoning. (Súmula n. 284 of the STF).

      • 3) A clerical error (erro material) curable by a motion for clarification is one that is evident, recognizable at first sight, that does not require analysis of the merits, or that concerns internal inaccuracies of the decision itself.

      • 4) The filing of an untimely motion for clarification neither interrupts nor suspends the time limit for filing further appeals.

      • 5) Once the agravo has been held untimely, a subsequently filed motion for clarification that does not challenge that procedural bar will not be entertained.

      • 6) Where the collegiate body decides a matter subject to the general repercussion (repercussão geral) regime, a motion for clarification is exceptionally admitted so that it may be given modifying effects, the challenged decision set aside, and the case file returned to the court of origin to conform its judgment once the paradigm case has been decided.

      • 7) A second motion for clarification filed by the same party against the same decision is inadmissible, by reason of consummative preclusion and the principle that each decision admits a single appeal (unirrecorribilidade).

      • 8) A motion for clarification may be entertained regardless of prior deposit of the fine provided for in art. 1.021, § 4º, of the CPC, when the motion challenges the very imposition of the penalty, as to its calculation basis.


    1. Summary of the Impacts of Parental Separation on Children

      Summary (11 sources)

      Parental separation is a major societal phenomenon with deep, multidimensional repercussions for children.

      The impacts vary considerably with the child's age at the time of the break-up, the level of conflict between the parents, the socio-economic context and the custody arrangement adopted.

      This summary, based on an analysis of several studies and reports, highlights the psychological, educational, occupational and economic consequences of separation for children and young adults.

      The most significant impacts are:

      1. Economic and Material Consequences: Separation causes a marked and lasting drop in children's standard of living, estimated at 19% on average in the year of the break-up and still 12% five years later. The poverty rate of the children concerned doubles, reaching 29% in the year of separation. This precariousness is particularly notable for children living mainly with their mother and for those from households with an intermediate standard of living before the break-up. Separation also leads to frequent moves (six children in ten within three years) and reduced access to home ownership for the custodial parent.

      2. Repercussions for Educational and Occupational Success: Studies converge in showing that parental separation before age 18 is associated with lower educational attainment. This shows up as fewer years of schooling, a lower probability of obtaining a qualification and poorer academic performance. Boys appear to be particularly affected in terms of educational returns. In addition, young people from blended families show an earlier desire for independence, which pushes them towards casual jobs ("petits boulots") or short courses at the expense of long studies, often to avoid being a financial burden on a family structure perceived as fragile.

      3. Psychological and Behavioural Impacts: The child's age is a determining factor in their understanding, emotions and reactions. The youngest (under 5) may experience developmental delays and develop a strong sense of insecurity. School-age children (6-12) face loyalty conflicts and may develop complex coping strategies. Adolescents, in the midst of building their identity, may see their self-esteem fall and question their ability to form future relationships. Parental conflict is a major aggravating factor, increasing the risks of anxiety and depression.

      4. The Question of Alternating Residence: Although the law often favours equal-time residence, its application and benefits are debated. Concerns remain about its suitability for very young children (under 3), based on the theory of a primary attachment figure. However, a broad international scientific consensus, drawing on decades of research, holds that alternating residence benefits children of all ages, including the youngest, because it helps maintain multiple strong attachment bonds with both parents, which is crucial for their psychological well-being and development, even where there is parental conflict.

      In conclusion, although separation is an undeniable shock, its negative effects can be mitigated by key protective factors: good-quality co-parenting, open communication suited to the child, reduced parental conflict, stable routines and adequate socio-economic support, including regular payment of child support and effective public policies.

      --------------------------------------------------------------------------------

      1. Context and Scale of the Phenomenon

      Parental separation has become a common societal reality. The statistics confirm the scale of the phenomenon:

      • In Belgium, 23,059 divorces were granted in 2017.

      • In France, in 2020, nearly four million minor children had separated parents. Each year, about 380,000 children are affected by their parents' separation.

      The share of individuals whose parents separated has risen considerably across generations, from 3% for the generation born in 1946 to 15% for the one born in 1988. This structural change has short-, medium- and long-term impacts on children, which ripple through society as a whole.

      2. Psychological, Emotional and Behavioural Impacts by Age

      The child's age at the time of separation is a determining factor in how they experience, understand and react to the event. The UFAPEC analysis, corroborated by other studies, gives a detailed picture of the impacts by age group.

      2.1. Understanding the Separation

      The child's ability to understand the situation evolves with their cognitive development.

      Level of understanding by age group:

      • Under 2: Does not understand the concept of divorce but perceives the change, the parents' emotional state and their absence, which can result in a feeling of abandonment and insecurity.

      • 2 to 5: Begins to understand that something has changed, but the situation remains complex and confusing. Asks many questions for reassurance.

      • 6 to 12: Understands the divorce, its reasons and each parent's point of view. Shows empathy, but often nurtures the hope of a reconciliation.

      • Over 12: Grasps the complexity of relationships and understands the divorce as an incompatibility within the couple.

      Testimony of Clotilde, 37: "My parents divorced when I was 2 and waged a merciless war on each other for twenty years, through one lawsuit after another. I am left with a memory of total incomprehension, of abandonment. Of shame, too, in front of other children. [...] I resent my parents for never having explained anything to me."

      2.2. The Child's Emotions

      A range of emotions may be felt, with different ones dominating at different ages.

      Predominant emotions and feelings by age group:

      • Under 5: Insecurity, fear of abandonment, possessiveness towards the attachment figure. Lies or a lack of clarity heighten the feeling that the world has become an unsafe place.

      • 6 to 12: Sadness, mourning for the united family, loyalty conflict. May feel personally rejected or, conversely, develop strong empathy and try to console the parents.

      • Over 12: Anger, sadness, withdrawal. May experience lower self-esteem and question their own future ability to form lasting relationships.

      2.3. Behavioural Reactions

      Observable reactions also vary, ranging from regression to early independence.

      Typical reactions by age group:

      • Preschool age (<5): Regressive behaviour (e.g. toilet training), language disorders, anxiety, sadness, feelings of guilt. May show a delay in acquiring psychomotor skills.

      • School age (6-12): Insecurity, fear of abandonment, loyalty conflicts. Difficulties at school or in relationships. May develop coping strategies ("stratégies d'affrontement") such as repressing emotions.

      • Preadolescence: Anger towards the parents, feelings of shame, psychosomatic disorders.

      • Adolescence (>12): "Parentified" behaviour (taking on excessive responsibility), tendency towards early independence, running away, deviant behaviour (delinquency, addictions), early sexual activity. May over-invest or disengage from school.

      3. Impacts on Educational and Occupational Success

      Parental separation is statistically correlated with poorer school performance and a less favourable educational path.

      3.1. Lower Educational Success

      Several French quantitative studies show a negative effect of separation on the educational path:

      • Lower educational success: Individuals who experienced parental separation before reaching adulthood have lower educational success overall.

      • Length of studies: The average length of studies is shortened by six months to more than a year.

      • Qualifications: The probability of obtaining a qualification, in particular the baccalauréat, is lower. The advantage of coming from a privileged social background is greatly reduced by separation.

      • Gender: Boys appear to be more affected than girls, notably in terms of educational returns, when the separation occurs on the threshold of adolescence.

      • Age at separation: The negative effect is particularly pronounced for young children (0-6) and at pivotal ages such as entry into CP (first year of primary school) or sixième (first year of lower secondary school).

      3.2. Desire for Independence and Choice of Studies

      A qualitative study by Sylvie Cadolle highlights how the post-separation family situation influences young adults' educational choices:

      • Awareness of cost: Young people from blended families are very aware of the "cost" they represent, which can generate tension.

      • Search for financial autonomy: To stop being a financial "stake" and to escape conflict, many seek autonomy as early as possible by taking casual jobs ("petits boulots") during their studies.

      • Impact on studies: This desire for independence can push them to choose shorter, paid courses (such as work-study BTS programmes) at the expense of long and potentially more qualifying studies.

      • Conflict with step-parents: Difficult relations with a step-parent are a major factor pushing young people to leave home early. Financial support from the father's side is often perceived as reduced, particularly when the stepmother is reluctant.

      4. Economic Impacts and Living Conditions

      The France Stratégie study (2024) details the severe economic consequences of separation for children.

      4.1. Lower Standard of Living and Higher Poverty

      • Drop in standard of living: In the year of separation, children's standard of living falls by 19% on average. The drop is still significant five years later, at -12%.

        ◦ The initial drop is larger for children living mainly with their mother (-25%) than with their father (-11%).

        ◦ Children in alternating residence experience a drop of 12%.

      • Surge in poverty: The poverty rate of children of separated parents rises from 13.5% before the break-up to 29% in the year it occurs, and remains at 21% five years later.

      • Cushioning factors: The drop is partly cushioned by:

        ◦ Social and tax transfers, which play a crucial role for the most modest households.

        ◦ Child support payments, which matter more for well-off households. However, two years after the divorce, 20% of child support payments are not made regularly.

        ◦ Mothers returning to work.

      • Repartnering: When the custodial parent forms a new couple, the drop in standard of living disappears, but this concerns only 30% of children six years after the separation.

      4.2. Impact on Housing

      • Moves: Six children in ten move within three years of the separation, 38% of them in the very year of the break-up.

      • Tenure: After separation, the share of children living in a home owned by a parent falls from 59% to 38%.

      • Social housing: The share of children living in social housing rises considerably, especially for those living with their mother (from 15% to 34% in the year of the break-up).

      5. The Question of Alternating Residence

      The custody arrangement is a central issue. Although legislation tends to favour alternating residence, its implementation and effects are debated, particularly in France where, when the parents disagree, judges grant it in only 12% of cases.

      5.1. Arguments and Controversy

      • Traditional view (concern for the very young): The UFAPEC analysis (2018) relays an opinion that alternating residence before age 3 would amount to "maltreatment", because the baby has not yet acquired person permanence and needs a stable primary attachment figure.

      • International scientific consensus: Many recent studies and meta-analyses strongly contradict this view. An international consensus, endorsed by hundreds of specialists, points to the benefits of alternating residence for children of all ages.

      5.2. Summary of International Studies in Favour of Alternating Residence

      Researcher or study (year, country): main conclusion

      • Consensus of 70 specialists (2021, worldwide): Children develop several attachment relationships. Prioritizing a single parent can compromise this beneficial network and undermine the child's trust.

      • Michel Grangeat (2018, France): Drawing on the work of Michael Lamb, he argues that children are predisposed to multiple attachment bonds. The quality of the relationship depends on the time spent together, which argues for alternating residence even for babies.

      • Malin Bergström (2018, Sweden): A study of 3,662 children (aged 2-9) shows that those in alternating residence have fewer psychological problems than those in sole custody.

      • William Fabricius (2017, United States): Children under 2 who spend equal time with each parent develop healthier and stronger relationships with them in adolescence and adulthood.

      • Linda Nielsen (2014, United States): A review of 40 studies concludes that children in alternating residence do better at school, are less depressed and are more psychologically balanced, even where there is parental conflict.

      • Richard Warshak (2014, United States): A meta-analysis of 40 years of research, endorsed by 110 experts, recommends alternating custody as the norm for all ages, stressing its benefits even when one parent initially opposes it.

      These studies suggest that the arguments against alternating residence for young children are not supported by the most recent and most robust scientific data.

      6. Protective Factors and Recommendations

      Although the risks are real, not all children of separated parents suffer negative long-term consequences. Several factors can protect the child and support their adjustment.

      • Quality of co-parenting: The most important factor is the parents' ability to cooperate, communicate and keep conflict low.

      • Communication with the child: It is crucial to talk to the child about the separation clearly, honestly and in an age-appropriate way, reassuring them that they are not responsible and that both parents still love them.

      • Maintaining relationships: Maintaining a good-quality relationship with both parents is a major protective factor.

      • Stability: Ensuring continuity in the child's life (home, school, friends, activities) helps them adjust.

      • External support: School, friends and the extended family can play an important supporting role. Parents should not hesitate to seek help from professionals (mediators, psychologists).

      • Public support: Public policies must do more to support families, notably by ensuring that child support is actually paid and by providing enough assistance to cushion the economic shock of separation.

    2. The Effects of Parental Separation on Educational and Occupational Success

      Summary

      This briefing note analyses the conclusions of a study on the impact of parental separation on children's long-term success in France.

      The study, based on data from Insee's "Formation et qualification professionnelle" surveys, shows that parental separation before age 18 has a significant negative effect on individuals' educational success and social position.

      This impact is measured with three indicators: the number of years of schooling, the educational return (average income associated with a qualification) and social position (average income for a given occupation and qualification).

      The main results indicate that the child's age at the time of separation is a determining factor.

      No notable effect is observed when the separation occurs at age 19 or later.

      By contrast, an early separation, particularly between ages 0 and 6, is correlated with the sharpest falls in success.

      The analysis also reveals gender disparities: boys are more affected than girls in terms of educational return, especially when the separation takes place at the start of adolescence (ages 7-12).

      The study uses a rigorous comparative methodology, notably a within-sibship difference model, to isolate the effect of separation from other pre-existing family factors (such as parental conflict).

      The results of this model confirm a causal effect of separation, although smaller than the simple correlations, suggesting that a "selection bias" explains part of the observed impact.

      In conclusion, current support mechanisms, such as child support payments and allowances, appear insufficient to offset the economic and social shock of separation.

      High rates of unpaid child support (20% of payments irregular two years after the divorce) exacerbate the problem. The study calls for stronger support for separated families, notably through better enforcement of court decisions.

      1. Context and Research Question

      1.1. A Rapidly Growing Social Phenomenon

      Parental separation has become a major issue in the analysis of the determinants of individual success. Its prevalence has risen considerably across generations in France:

      • Generation born in 1946: 3% of individuals had separated parents.

      • Generation born in 1988: This proportion jumped to 15%.

      In 2020, nearly four million minor children in France had separated parents, making this situation a central feature of the family environment to be taken into account.

      1.2. Changes in the Sociodemographic Profile

      The social composition of families that separate has changed.

      • Historical trend: For the generations born between 1946 and 1950, separation was more frequent when the mother was highly qualified.

      • Current trend: For more recent generations, the rise in separations is more pronounced among children whose parents have a low level of education. Separation now affects all social backgrounds, but with a higher incidence in disadvantaged ones.

      The age of children at the time of separation has also changed:

      • The proportion of very young children (0-3) at the time of separation has fallen across generations.

      • The proportion of older children (16 and over) has risen.

      1.3. Mechanisms and Theoretical Debates

      The effect of separation on the child can operate through several mechanisms, and no scientific consensus has emerged.

      • Potential negative effects:

        ◦ Fewer monetary and time resources: The loss of the gains from living as a couple (complementarities in production and consumption) and reduced access to the resources of the non-custodial parent.

        ◦ Psychological shock: Particularly if the level of conflict before separation was low and the break-up unexpected.

      • Potential positive effects (unconfirmed hypothesis):

        ◦ Separation could end a period of intense parental conflict, thereby benefiting the child.

      The selection effect: Studies (notably Piketty, 2003) suggest that the observed negative correlation might not be caused by the separation itself but by pre-existing factors, such as parental conflict, that lead both to separation and to lower success for the child.

      2. Methodology and Data

      2.1. Sources and Sample

      The study draws on the 2003 and 2014 waves of Insee's "Formation et qualification professionnelle" (FQP) surveys.

      • The final sample comprises 52,602 individuals from 26,301 families, born between 1946 and 1989.

      • The methodology focuses on sibships so that comparisons can be made with the family environment held constant.

      2.2. Indicators of Success

      Three complementary measures are used to assess educational and occupational success:

      1. Years of schooling: The median number of years of schooling associated with the highest qualification obtained.

      2. Educational return: The average income associated with each qualification, estimated separately for each gender. This indicator gives more weight to qualifications that lead to high salaries (e.g. grandes écoles).

      3. Social position: The average income associated with an occupation for a given level of education.

      2.3. Empirical Approach

      Two econometric models are used to estimate the effect of age at separation:

      • Model 1 (random effects): Estimates the correlation between separation and success while controlling for a wide range of observed characteristics (sex, year of birth, parents' social background, etc.).

      • Model 2 (within-sibship difference): Compares the outcomes of brothers and sisters within the same family. This approach neutralizes the effect of all shared family variables, whether observed or not (genetic endowment, family culture, chronic parental conflict), giving an estimate closer to a causal effect.
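
      To illustrate why a within-sibship comparison can differ from a pooled correlation, here is a minimal simulation sketch in Python. It is not the study's code or data: the two-sibling design, the `conflict` variable, the effect size of -0.15 and all function names are assumptions chosen for illustration, and the pooled slope is only a simplified stand-in for Model 1 (no control variables, no random effects).

```python
import numpy as np


def simulate_siblings(n_families=20_000, true_effect=-0.15, seed=0):
    """Simulate two siblings per family (hypothetical data, not the FQP survey)."""
    rng = np.random.default_rng(seed)
    # Unobserved family climate: raises the separation risk AND lowers outcomes.
    conflict = rng.normal(size=n_families)
    p_sep = 1.0 / (1.0 + np.exp(-(conflict - 1.0)))
    separates = rng.random(n_families) < p_sep
    # Assumption: the younger sibling is always exposed before 18,
    # the older one only half of the time (already an adult otherwise).
    younger = separates
    older = separates & (rng.random(n_families) < 0.5)
    exposed = np.column_stack([younger, older]).astype(float)
    noise = rng.normal(size=(n_families, 2))
    outcome = true_effect * exposed - 0.3 * conflict[:, None] + noise
    return exposed, outcome


def pooled_slope(x, y):
    """OLS slope ignoring family membership (simple correlation-style estimate)."""
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc * yc).sum() / (xc ** 2).sum())


def within_family_slope(x, y):
    """OLS slope after demeaning within each family (sibling-difference estimate)."""
    xd = x - x.mean(axis=1, keepdims=True)
    yd = y - y.mean(axis=1, keepdims=True)
    return float((xd * yd).sum() / (xd ** 2).sum())


exposed, outcome = simulate_siblings()
naive = pooled_slope(exposed, outcome)
within = within_family_slope(exposed, outcome)
print(f"pooled: {naive:.3f}  within-family: {within:.3f}")
```

      In this toy setup the pooled slope also absorbs the effect of the unobserved family climate, whereas demeaning within each family removes everything the siblings share, which is the intuition behind Model 2.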

      3. Main Results: The Impact of Age at Separation

      3.1. General Effects on Success

      The results, summarized in the table below, show a significant negative impact of separation before age 18, whose intensity varies with age.

      Table: Effect of parental separation on success (in standard-deviation points). Estimates from Model 2 (within-sibship difference), which controls for unobserved family factors.

      Age at separation | Years of schooling | Educational return | Social position
      0-3 years         | -0.20**            | -0.19              | -0.07
      4-6 years         | -0.20              | -0.19              | -0.16
      7-9 years         | -0.13              | -0.15              | -0.05
      10-12 years       | -0.21*             | -0.13              | -0.16
      13-15 years       | -0.20*             | -0.15              | -0.10
      16-18 years       | -0.13**            | -0.09              | -0.07
      19 and over       | Reference group (zero effect by definition)

      Significance: * at 10%; ** at 5%; *** at 1%.

      • Educational success: All age groups below 19 show a significant fall in the number of years of schooling. The effect is particularly pronounced for separations occurring before age 6 and between ages 10 and 15.

      • Social position: Social position is less affected, with a significant negative effect only for separations between ages 10 and 12.

      3.2. The Importance of Selection Bias

      Comparing the two models is instructive:

      • Model 1 (simple correlations) shows much larger negative effects than Model 2.

      • The difference is particularly marked for very early separations (0-3 years).

      This suggests that a large part of the negative effect attributed to separation is in fact due to selection factors, such as an already degraded family climate.

      Parents who separate when their child is very young are probably those experiencing the most intense conflicts, which affects the child independently of the separation itself.

      4. Analysis of Heterogeneous Effects

      4.1. Gender Disparities

      The study confirms that boys are more vulnerable to the impact of separation.

      • Educational return: Boys are significantly more affected than girls, especially when the separation occurs between ages 7 and 12.

      • Years of schooling: Gender differences are smaller and not significant.

      • Social position: The effects are similar for boys and girls.

      These results, although exploratory, are consistent with a literature showing that boys are more sensitive to the family environment.

      4.2. Influence of the Mother's Level of Education

      The analysis asks whether the impact of separation differs according to whether or not the mother holds a qualification.

      • Model 1 suggests that children of qualified mothers are more affected, which could be explained by their having "more to lose" in terms of resources.

      • However, Model 2 (more robust) considerably reduces these differences, which become non-significant.

      The study concludes that the hypothesis of an equal effect of separation, whatever the mother's level of education, cannot be rejected once unobserved family factors are taken into account.

      5. Conclusion and Policy Implications

      5.1. Summary of Conclusions

      The study establishes a negative link between parental separation before age 18 and the child's future success.

      Age at the time of the event is a key factor, and boys appear more vulnerable in terms of educational return.

      Part of this effect is attributable to pre-existing family conditions, but a causal effect of separation remains.

      5.2. Potential Explanatory Mechanisms

      The negative effects may be explained by a combination of factors:

      • Shock to monetary resources: Separation leads to a fall in the standard of living.

      • Shock to time resources: A study by Le Forner (2020b) shows that a child living alone with their mother spends on average 0.18 standard deviations less time with at least one parent than children living with both parents.

      • Socio-emotional development: The psychological impact of separation is an important avenue for research.

      5.3. Public Policy Recommendations

      The results have direct implications for public action:

      • Insufficient current support: Child support payments or the family support allowance (allocation de soutien familial, about €115 per child) do not appear sufficient to cushion the impact of separation.

      • Unpaid support: The fact that 20% of child support payments are made irregularly two years after the divorce is a major aggravating factor.

      • Need for comprehensive support: Support for families should be reviewed, starting with ensuring that court decisions are respected and that financial support schemes are effective.

    1. trauma, the traumatized society
       pets as a distraction from loneliness
       "we are consumers" = slaves -> slave morality
       "upbringing" as drill for workers and soldiers
       normopathy = the majority is sick
       dark psychology
       educational pedagogy
       Rockefeller Foundation
       materialism
       humans are merely stimulus-response automata (reptilian brain)
       manipulating brain metabolism
       transhumanism
       education system
       pharmaceutical industry
       psychological warfare
       worldview
       falsification of history
       world wars
       division into religions, political parties
       social Darwinism
       new type of human
       new world order
       alienation, progress away from nature
       re-education
       under-fathering
       "the devil steals fatherhood"
       satanic system of inversion
       "men to the front"
       orphans
       children cannot learn from their father
       under-fathered society
       mothers are overburdened, overwhelmed, single parents
       everyone is stressed, everyone loses
       hidden curriculum (John Taylor Gatto)
       7 lessons
       behavioural conditioning
       educators act subconsciously, are followers
       subconscious, soft power
       a child needs attachment, security
       not a permanent struggle for survival (burnout)
       emotionally immature
       restless
       fights desperately for attention
       hidden curriculum (John Taylor Gatto)
       7 lessons:
       1. destroying coherence of meaning, narrow-mindedness, pigeonhole thinking
       2. classes of same-age but differently mature children
       3. break bell after 45 minutes, loss of control
       4. emotional dependence, success is rewarded
       5. intellectual dependence, must always ask permission, conformity is rewarded, learned helplessness; error culture: learning from mistakes, or mistakes are punished; blind faith in authorities
       6. fragile self-confidence, weakness, dependence
       7. constant surveillance through homework, no free time, privacy is taken away, dependence, no free time for self-directed learning
       children are to blame
       book: Raik Garve - vom schöpfer zum sklaven
       parents have more influence than teachers
       the importance of attachment is underestimated

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      _Below we address all the comments by the reviewers. However, the figures that were used in our response are unfortunately not displayed in this format._

      Reviewer #1

      Evidence, reproducibility and clarity

      Thanks to the development of Ribo-Seq, translational buffering has been reported in various works. However, the systematic investigation has remained challenging. Employing the database of published Ribo-Seq and matched RNA-Seq, Rao et al. attempt to understand the mechanism underlying translational buffering of mRNA variation across diverse materials. Although the authors' report provides a step forward in our understanding of translational buffering, this reviewer found a series of concerns in this paper. These points could be tackled to improve the reliability of their findings, the strength of their main message, and the global understandability of the paper.

      Major comments: 1. This paper heavily relies on the reference 18. However, this paper was not properly stated (no page or journal number); the study in Bioinformatics is nowhere to be found on the website, despite being out in 2024 apparently. Either title is wrong (yet a biorxiv can be found). This reviewer guessed that the reference 18 may be accepted. However, without a proper reference, this paper could not be judged since nearly all the parts of this work have been based on the reference 18. Also, the Ribobase data used in this manuscript comes from this reference, so it had better be well defined, especially when another Ribobase data set seems to be available online: http://www.bioinf.uni-freiburg.de/~ribobase/index.html

      We apologize for the citation issue. The citation Liu et al., 2024 (18) referred to a bioRxiv preprint. That manuscript is now published in Nature Biotechnology, and the reference has been updated in the revised manuscript, where it appears as Liu et al., 2025 (23).

      In the Discussion, the authors mentioned "TE is based on a compositional regression model (18) rather than the commonly applied approach of using a logarithmic ratio of ribosome occupancy to mRNA abundance." This important information should be mentioned in the early section of the manuscript. Related to this, there are other published methods for exploring change in translation efficiency (e.g., 10.1093/bioinformatics/btw585; 10.1093/nar/gkz223) that could also be suitable in this context. It is not entirely clear if their approach is better than before. Again, the improper reference to 18 made our assessment of this work difficult.

      We apologize and acknowledge the impact of the citation issue on this point. In Liu et al (2025), we have provided a comparison between our approach and the log-ratio strategy. We also agree that additional context was needed within the current study. Hence, we have now included more detailed information about the TE calculations in the initial results section (line 94).

      As noted by the reviewer, several other methods have been developed previously for measuring changes in translation efficiency. These methods are designed to be used in cases of paired designs where there is a treatment or manipulation that is assayed along with controls. While these methods are highly valuable in assessing differential TE, they are unable to accommodate the type of meta-analyses described in our study. In particular, we do not report changes/differential TE with respect to a control sample but instead focus on the coordinated patterns of TE across experiments. We now note this important distinction in the manuscript in the discussion section (line 494).

      The paper mainly relies on detecting a set of buffered genes using mRNA-TE correlation and MAD ratios (Ribo-Seq/RNA-Seq). While the concept seems sound, the authors should ensure that this method is reliable. Several controls could be used to confirm this. First, if any studies in humans or mice have described a set of genes as buffered, it would be worth checking for overlap between the authors' set of 'TB high' genes and the previously established list. Furthermore, the authors could use packages explicitly developed for translational buffering detection, such as anota2seq (https://academic.oup.com/nar/article/47/12/e70/5423604?login=true). Not all of the data used by the authors may be suitable for such packages, but the authors could at least partially use them on some of their datasets and see whether the buffered genes reported by these packages match their predictions.

      We thank the reviewer for this constructive suggestion. To the best of our knowledge, no prior study in humans or mice has systematically analyzed translational buffering across a wide range of conditions. As a result, defining a gold-standard set for benchmarking is currently not feasible.

      While packages such as anota2seq have proven highly valuable for identifying buffering effects in controlled experimental designs (e.g., comparing a treatment to a matched control), they are not readily applicable to the type of large-scale meta-analysis we present here. Our study integrates ribosome profiling and RNA-seq data across diverse datasets and conditions, which lies outside the design scope of such tools.

      The most relevant point of comparison to our work is Wang et al. 2020 Nature, which examined a related but distinct form of translational buffering across species for a given tissue. We now present the overlap of genes identified as buffered in our study vs Wang et al. 2020. The details are presented in the reviewer's comment 5-2.

      The threshold of 'TB high' or 'TB low' (top and bottom 250) is somewhat arbitrary. Why not top 100 or 500? The authors should provide a rationale for this choice. Also, they could include a numeric measure of buffering (the sum of the two rankings is probably suitable for this purpose). Several of the authors' explorations are suitable for numerical quantification (GO enrichment can be turned into GSEA, and the boxplot can be shown as correlations)

      Thanks for these suggestions. We agree that the thresholds used to define TB high and TB low are somewhat subjective. We have ensured that changing this cutoff as suggested is easily achievable with the provided R script, which can be used to reproduce all of the reported analyses of translational buffering with different cutoffs.

      To further assess whether our conclusions are robust to the selection of these thresholds, we tested several different values to define the TB high and TB low groups. As an example, we show here that the effect on protein variation, and the association of intrinsic features such as UTR length with the buffering potential of genes, remain similar to those at the current cutoff of 250 when other thresholds are used (i.e., TB high = top 100 or TB high = top 200). However, if we increase the cutoff of TB high to the top 2000 and TB low to the top 2000-4000, the difference between the various features is diminished (Figure A & B). Further, protein variation (human cancer cell lines and tissues) also becomes more similar across the three categories, possibly indicating a reduced regulatory potential of genes as their rank increases (Figure C & D). Our analyses reveal that highly ranked genes show associations with particular features, indicating an underlying hierarchy in translational buffering potential. This point is now discussed in the manuscript (line 177).

      Legend: Effect of different thresholds on A. length features, B. median RNA expression, C. protein variation in human cancer cell lines, and D. protein variation in primary human tissues.
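
      To make the role of the cutoff concrete, here is a minimal Python sketch (not the authors' R script) of a rank-sum classification with an adjustable threshold. The function names, the tie handling, and the choice to label the band directly after TB high as TB low are assumptions made for illustration. The two inputs stand for the per-gene mRNA-TE correlation and the ribosome occupancy to RNA MAD ratio described in the manuscript.

```python
def rank(values):
    """Return 1-based ascending ranks (ties broken by input order)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def classify_buffering(genes, rna_te_corr, mad_ratio, cutoff=250):
    """Label each gene 'TB high', 'TB low' or 'other'.

    Both metrics are ranked ascending: a more negative mRNA-TE correlation
    and a smaller ribosome-occupancy/RNA MAD ratio both point to stronger
    buffering. The two ranks are summed, and a lower sum gives a higher
    final rank. Treating ranks cutoff+1 .. 2*cutoff as 'TB low' is an
    assumption of this sketch.
    """
    corr_rank = rank(rna_te_corr)
    ratio_rank = rank(mad_ratio)
    rank_sum = [c + r for c, r in zip(corr_rank, ratio_rank)]
    final_rank = rank(rank_sum)

    labels = {}
    for gene, position in zip(genes, final_rank):
        if position <= cutoff:
            labels[gene] = "TB high"
        elif position <= 2 * cutoff:
            labels[gene] = "TB low"
        else:
            labels[gene] = "other"
    return labels
```

      Re-running such a function with `cutoff=100`, `200`, or `2000` is the kind of sensitivity check described above.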

      In response to the reviewer's suggestion of presenting data using numerical quantitation, we incorporated several additional inclusions in the manuscript.

      i. We now report the association of CDS/UTR length with translational buffering as a function of translational buffering rank, with highly ranked genes showing associations with particular features, indicating an underlying hierarchy in translational buffering potential (Sup Fig 3 A-B).

      ii. We now include scatter plots which show that highly ranked genes have lower variation at the protein level in both cancer cell lines and primary tissues (Sup Fig. 6 A-C).

      iii. We have now carried out modified GO enrichment analyses. Specifically, Gene Ontology enrichment analysis was performed for the TB high genes in human and mouse using the clusterProfiler R package. Lists of TB high genes in human or mouse were analyzed against the Gene Ontology (GO) database using the enrichGO() function, with the organism-specific annotation database (org.Hs.eg.db for human or org.Mm.eg.db for mouse) as reference. Gene identifiers were supplied as gene symbols, and all genes in the current study were used as the background universe. Enrichment was carried out for the Biological Process (BP) ontology, with significance assessed by the hypergeometric test. P-values were adjusted for multiple testing using the Benjamini–Hochberg method, and terms with an adjusted p-value …

      Legend: Gene Ontology (GO) enrichment analysis of the TB high gene set, performed with the clusterProfiler R package. Enriched GO Biological Process terms are shown after redundancy reduction using clusterProfiler::simplify. Each dot represents a GO term, with dot size indicating the number of genes associated with the term and color reflecting the adjusted p-value (Benjamini–Hochberg correction). Only the top non-redundant terms are displayed.
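
      For readers unfamiliar with the statistics behind this kind of over-representation analysis, the sketch below illustrates in plain Python the two ingredients named above: a hypergeometric upper-tail p-value for one GO term and the Benjamini–Hochberg adjustment across terms. It is an illustration of the general procedure under assumed inputs, not a reimplementation of enrichGO(), and the function and argument names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, term_size, list_size, overlap):
    """Probability of seeing at least `overlap` term genes in the gene list.

    universe_size: number of background genes
    term_size:     background genes annotated to the GO term
    list_size:     genes in the tested list (e.g. a TB high set)
    overlap:       genes in the list that are annotated to the term
    """
    total = comb(universe_size, list_size)
    upper = min(term_size, list_size)
    tail = sum(
        comb(term_size, k) * comb(universe_size - term_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for offset, index in enumerate(reversed(order)):
        position = m - offset  # 1-based rank of this p-value
        running_min = min(running_min, p_values[index] * m / position)
        adjusted[index] = running_min
    return adjusted
```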


      Additionally, we performed Gene set enrichment analysis using the list of genes ordered according to their RNA-TE correlation. Hence lower ranks have lower RNA-TE correlations. The GSEA plots show significantly enriched Gene Ontology Biological Process (GO:BP) terms at the lower ranks of the ordered gene list. Together, these analyses further emphasize the observation that genes involved in macromolecular complexes are translationally buffered.


      Legend: Curves represent the enrichment score (ES) across the ranked gene list, with vertical bars indicating the positions of pathway-associated genes. The enrichment was identified using the gseGO() function from clusterProfiler.
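
      As a rough illustration of what such an enrichment curve measures, the following Python sketch computes an unweighted running-sum enrichment score over a ranked gene list. This is a simplification chosen for clarity: it does not weight hits by the ranking metric, it does not compute significance, and it is not the algorithm configuration used by gseGO(). The function name and inputs are hypothetical.

```python
def running_enrichment_score(ranked_genes, gene_set):
    """Return (enrichment_score, running_curve) for an ordered gene list.

    ranked_genes: genes ordered by the ranking metric, for example by
                  ascending mRNA-TE correlation
    gene_set:     genes annotated to one pathway or GO term
    Each hit raises the running sum by 1/n_hits, and each miss lowers it by
    1/n_misses, so the curve starts near zero and ends at zero.
    """
    members = set(gene_set)
    n_hits = sum(1 for gene in ranked_genes if gene in members)
    n_misses = len(ranked_genes) - n_hits
    if n_hits == 0 or n_misses == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    running = 0.0
    curve = []
    for gene in ranked_genes:
        running += 1.0 / n_hits if gene in members else -1.0 / n_misses
        curve.append(running)

    # The enrichment score is the largest deviation from zero, sign kept.
    score = max(curve, key=abs)
    return score, curve
```

      A gene set concentrated at the start of the list gives a score close to 1, and one concentrated at the end gives a score close to -1.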

      Several of the statements of the authors in the Introduction or Discussion sections are not entirely true regarding the literature on the topics, or lack major papers on the topic, and therefore, they are a bit misleading. Among others, here are some:

      We thank the reviewer for these suggestions, which have now been incorporated into the revised manuscript.

      5-1 "In addition, genetic differences arising from aneuploidy, cell type differences or variability observed in the natural population can further determine the amplitude of variation (4-7). The effect of mRNA variation under these conditions is mostly reflected at the protein levels (2, 4-8).". Several recent or more ancient papers suggest that mRNA variation coming from aneuploidy, natural genetic variation, or CNV is buffered or not well reflected at the protein level: DOI: 10.1038/s41586-024-07442-9 DOI: 10.1073/pnas.2319211121 DOI: 10.1016/j.cels.2017.08.013 DOI: 10.15252/msb.20177548

      We agree that mRNA variation coming from aneuploidy, natural genetic variation, or CNV is buffered or not well reflected at the protein level for some genes. This point has now been revised in the introduction. We have incorporated all the suggested literature into the revised manuscript (line 38).

      5-2: The authors should also consider mentioning these studies and softening their initial statement. "Similarly, translational buffering of certain genes have been reported in mammalian cells, specifically under estrogen receptor alpha (ERα) depletion conditions (16).". Translational buffering has been deeply explored in mammalian tissues and even across several mammalian species in this study (DOI: 10.1038/s41586-020-2899-z). In this, the authors also provide a nice exploration of the gene characteristics that are associated with translational buffering. The authors should mention it and compare the study's findings to theirs ultimately.

      We thank the reviewer for this suggestion. We have now cited the recommended study in the revised manuscript (line 65). Here, we provide a comparison of its findings with ours. While this related work offers important insights into translational buffering, its focus is on buffering across species within a given tissue, whereas our study emphasizes buffering across conditions, cell types, and treatments within a species. Despite this difference in focus, the comparison is highly informative, and we now highlight both the similarities and distinctions between the two studies in the relevant section of the revised manuscript.

      Wang et al. calculate the variation at the transcriptome level versus the translatome level, represented as a delta (∆) value for each gene. A lower value represents lower variation at the ribosome occupancy level than at the mRNA level across species. We classified the genes in the Wang et al. study as TB high, TB low, or others, as identified in the current study, while indicating the delta (∆) calculated by Wang et al. Many of the genes with a lower delta value (are delta ∆ …

      Legend: Dot plot highlighting the delta value of all genes in the Wang et al. study (also present in RiboBase), grouped as TB high, TB low, or others, in (A) brain and (B) liver.

      5-3: "Differences in species evaluated and statistical methods have resulted in conflicting interpretations (13, 28).". These conflicting results have been previously discussed in reviews on the topic that would be worth mentioning: DOI: 10.1016/j.cell.2016.03.014 DOI: 10.1038/s41576-020-0258-4

      We have added these reviews at the appropriate location of the manuscript.

      1. In addition to the p-values stated in the main text, the authors should annotate their plots when they find significant differences between groups to greatly facilitate the visual interpretation of the graphs.

      We have now annotated many of the relevant graphs with p-values to facilitate visual interpretation, adding them where space and figure design allow.

      Based on the data of Figure 4D, apparently, ribosome occupancy was not buffered even in high TB sets. The authors may argue that translational buffering may not cope with such a strong mRNA reduction. In that case, how big a difference in mRNA level does the buffering system adjust in protein synthesis? The authors should test gradual gene knockdown and/or overexpression and conduct Ribo-Seq/RNA-Seq to survey the buffering range.

      We appreciate the reviewer’s suggestion regarding the experiment to determine the buffering range. To explore this for multiple genes, we attempted a series of knockdowns with a CRISPR/gRNA strategy based on a MultiCas12a approach. We targeted 8 buffered and 2 non-buffered genes using a 10-plex crRNA, along with a 10-plex gRNA serving as a negative control (Figure below). The fold change at the mRNA level of the targeted genes was within the range of variation observed between replicates for other, non-targeted genes. The challenge in performing a gradual knockdown is that subtle changes in RNA expression fall within the margin of error of estimation, making it difficult to determine the implications of mRNA levels for buffering. Hence, precise experimental manipulation of mRNA expression levels in a range conducive to translational buffering remains technically very challenging. As noted in our manuscript (Figure 4D), conventional approaches for manipulating transcript abundance lead to larger changes than are typically observed as a result of natural variation.

      *Legend: Validation of translational buffering by targeted knockdown of genes. A. The scatter plot shows the coefficient of variation of mRNA and ribosome occupancy between HEK293T cells targeted with sgRNA of different efficiencies. The genes indicated in blue are buffered and those in green are non-buffered genes. B. The plot shows the fold change in mRNA abundance and ribosome occupancy as compared to cells that were infected with non-targeting crRNA array control (ratio of cpm in test vs control). Each color represents a gene and each point of a gene represents cells targeted by one of the four CRISPR arrays.*

      "differential transcript accessibility model" could not be functional if mRNA is reduced beyond the accessible pool (i.e., less than the threshold, all the mRNAs are translated without buffering). The authors should carefully reconsider this model and the effective range of mRNAs.

      We agree with the reviewer that, according to the 'differential transcript accessibility model', transcripts with abundances below a certain threshold should be completely accessible to the translational pool. This could also be true for the other model, in which the initiation rate cannot increase beyond a particular threshold for transcripts of very low abundance. However, our haploinsufficiency analysis (Figure 4 B & C) and the siRNA knockdown analysis from RiboBase (Figure 4 D) suggest that buffering might be possible within a given range of transcript abundance. Testing the buffering range by serial knockdowns might help determine the threshold at which transcripts exhibit buffering. However, the challenges of serial knockdown discussed above make this analysis difficult with ribosome profiling and matched RNA-seq. An alternative approach could involve imaging translating and non-translating mRNA of buffered genes in different cells, which may help distinguish the two models. However, this falls outside the scope of the manuscript.

      Minor comments:

      1. Some figures are of poor quality as they seem to have points outside of the panel representations... Like Figure 3C, one point is out of the square, same for Figure 4E. Similarly, on figure 5F, some outliers seem to be clearly cut from the figure (maybe not, but then the author should put a larger space between the end of the figure and the max y points). Same for panel S2D and S6D, this does not sound so rigorous.

      We agree and apologize for this issue. The axes of the figures have been annotated appropriately to indicate the presence of outliers in the figures.

      1. There are several typos or weird sentences. Here are some (but maybe not all): 2-1: [...]with lower sums corresponding to higher final ranks. "two rankings". Based on these final ranks[...] 2-2: For each dataset, median absolute deviation (MAD) "i" protein abundance was calculated across samples 2-3: [...]neighbor method implemented in the MatchIT package (38) Differences in protein[...] a point is missing here. 2-4: Additionally a second dataset providing predictions of haploinsufficiency (pHaplo score) and triplosensitivity (pTriplo score) for all autosomal genes (25) was used to asses the distribution of these score"S" across buffered and non-buffered gene sets . There is a missing "s" at "score" and there is a space between the last word and the final point.

      The necessary corrections have been incorporated in the revised version of the manuscript.

      1. In the "Lymphoblastoid cell line data analysis:" section, this reviewer wonders why the authors used a different method to calculate buffering compared to before.

      The main reason is the limited sample size of the lymphoblastoid cell line dataset. To evaluate the potential for translational buffering of genes in RiboBase, we used two metrics: the negative correlation between translation efficiency and RNA abundance across samples, and the ratio of variation in ribosome occupancy to variation in RNA levels. In those larger analyses, the median absolute deviation (MAD) served as a robust metric of dispersion across heterogeneous samples. For the smaller lymphoblastoid cell line dataset, we decided that the coefficient of variation (CV) would be a better indicator of dispersion, also because the data in that study were normalized as counts per million (CPM) rather than with the centered log-ratio (clr) normalization used in RiboBase. This CV ratio allowed us to assess the effect of natural variation in RNA abundance on ribosome occupancy.
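
      The difference between the two dispersion ratios can be sketched in a few lines of Python. The helper names below are hypothetical and the example values are invented. The CV divides by the mean, so it is only meaningful for positive-scale data such as CPM, whereas the MAD can be applied to centered values such as clr-transformed data.

```python
from statistics import mean, median, stdev


def mad(values):
    """Median absolute deviation around the median."""
    center = median(values)
    return median([abs(v - center) for v in values])


def cv(values):
    """Coefficient of variation (sample standard deviation / mean)."""
    return stdev(values) / mean(values)


def dispersion_ratio(ribo, rna, dispersion):
    """Variation in ribosome occupancy relative to variation in RNA.

    A ratio well below 1 means ribosome occupancy varies less than RNA
    abundance across samples, which is the pattern expected for a
    translationally buffered gene.
    """
    return dispersion(ribo) / dispersion(rna)


# Invented per-sample values for a single gene.
rna_levels = [10.0, 20.0, 30.0]
ribo_levels = [19.0, 20.0, 21.0]
mad_ratio = dispersion_ratio(ribo_levels, rna_levels, mad)
cv_ratio = dispersion_ratio(ribo_levels, rna_levels, cv)
```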

      1. "Samples which had R2 less than 0.2 were removed as the residuals calculated for these samples could be unreliable". These samples for which the correspondence between RNA-Seq and Ribo-Seq is low wouldn't be the ones most impacted by translational buffering? Is it sure that the authors are not missing something here?

      We agree with the reviewer that genes that show translational buffering may not conform to linear relationships between the two parameters. However, the proportion of genes exhibiting this buffering effect is not expected to significantly influence the overall regression fit. Instead, we hypothesized that low quality samples or truly different relationships between the two parameters can make this relationship nonlinear, rendering it unsuitable for linear regression analysis for calculation of TE.

      To address these possibilities, we first analysed a commonly used proxy for data quality. Given the characteristic movement of ribosomes across mRNAs, periodicity of sequencing reads is a useful metric to assess whether reads are randomly fragmented, as in RNA-seq, or specifically represent ribosome-protected footprints. For this, we compared two groups: samples that were removed (~30) and those retained for analysis. We plotted the distribution of periodicity scores for all samples in both groups. For the calculation of periodicity scores, first the percentage of reads mapped to the dominant frame position across the dynamic ribosome footprint read length range was calculated for each sample. The periodicity score was calculated by taking the weighted sum of these dominant percentages, with weights based on the total read counts at each length.

      The results indicate that the removed samples did not have lower periodicity scores, suggesting that their quality in terms of periodicity was comparable to the retained samples.
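
      The periodicity score described above can be sketched as follows in Python. The input layout (a mapping from read length to read counts in the three reading frames) and the normalization of the weights so that they sum to one are assumptions of this sketch; the description in the response only states that the weights are based on the total read counts at each length.

```python
def periodicity_score(frame_counts_by_length):
    """Weighted dominant-frame percentage for one sample.

    frame_counts_by_length maps a footprint read length to a tuple of read
    counts in reading frames 0, 1 and 2, e.g. {28: (80, 10, 10)}.
    For each length, the percentage of reads in the dominant frame is
    computed; the score is the sum of these percentages weighted by the
    share of all reads that have that length.
    """
    total_reads = sum(sum(counts) for counts in frame_counts_by_length.values())
    if total_reads == 0:
        raise ValueError("sample has no reads in the selected length range")

    score = 0.0
    for counts in frame_counts_by_length.values():
        length_total = sum(counts)
        if length_total == 0:
            continue  # a length without reads contributes nothing
        dominant_pct = 100.0 * max(counts) / length_total
        score += dominant_pct * (length_total / total_reads)
    return score
```

      Under these assumptions, reads spread evenly over the three frames give a score near 33, while strongly periodic footprints approach 100.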

      To assess the second possibility, we checked whether the studies involved major perturbations, which may skew the relationship towards nonlinearity. The 30 samples that were removed came from 14 unique studies, and 18 of these samples involved a perturbation that possibly affected either of the two parameters. In addition to the genetic/pharmacological perturbations specific to each study, the overall condition of the cells during an experiment could influence this relationship. Another point to note is that many of the filtered-out samples are HeLa and HEK293T cells, which show a normal relationship between ribosome occupancy and RNA abundance in the majority of cases.

      These considerations suggest that removing these samples is most appropriate, as their inclusion could bias the TE calculations.
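
      The sample-level filter discussed in this response can be illustrated with a deliberately simplified Python sketch. It uses an ordinary least-squares fit of clr-transformed ribosome occupancy on clr-transformed RNA abundance and treats the residuals as a TE-like quantity. This is a stand-in for illustration only and is not the compositional regression model of Liu et al.; the function names and the pseudocount are assumptions, and only the R2 threshold of 0.2 comes from the manuscript text quoted by the reviewer.

```python
from math import log


def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform of one sample's gene counts."""
    logs = [log(c + pseudocount) for c in counts]
    center = sum(logs) / len(logs)
    return [value - center for value in logs]


def fit_residuals(x, y):
    """Least-squares fit y ~ a + b*x; return (residuals, r_squared)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or ss_tot == 0:
        raise ValueError("regression needs variation in both variables")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    ss_res = sum(r * r for r in residuals)
    return residuals, 1.0 - ss_res / ss_tot


def sample_te(rna_counts, ribo_counts, min_r2=0.2, pseudocount=1.0):
    """Return (per-gene residuals or None, r_squared) for one sample.

    Samples whose fit falls below min_r2 are flagged with None because
    their residuals may be unreliable.
    """
    residuals, r2 = fit_residuals(
        clr(rna_counts, pseudocount), clr(ribo_counts, pseudocount)
    )
    return (residuals if r2 >= min_r2 else None), r2
```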
      

      For Figure 4B and 4C, the authors should provide statistical tests and p-values to confirm the observed trends.

      The haploinsufficiency and triplosensitivity analyses are now supported by a chi-squared test. The details of the statistical test are now mentioned in the text and the p-values have been noted on the respective figures.
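
      As an illustration of the kind of test mentioned here, the Python sketch below computes a chi-squared statistic for a contingency table of gene categories against a binary annotation. The table layout and the counts are invented for the example and are not the authors' data. The closed-form p-value applies only to two degrees of freedom, which is what a 3 x 2 table gives.

```python
from math import exp


def chi_squared_statistic(table):
    """Return (statistic, degrees_of_freedom) for a contingency table.

    table is a list of rows, e.g. one row per gene category and one column
    per outcome (annotated as haploinsufficient or not).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


def p_value_two_dof(statistic):
    """Upper-tail chi-squared probability, valid for 2 degrees of freedom."""
    return exp(-statistic / 2.0)


# Invented counts: rows are TB high, TB low, other; columns are
# (haploinsufficient, not haploinsufficient).
example_table = [[30, 70], [10, 90], [20, 80]]
statistic, dof = chi_squared_statistic(example_table)
p_value = p_value_two_dof(statistic)
```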

      In Figure 2A, the "all genes" color doesn't correspond to the point color.

      The color in the figure has been modified in the revised version of the manuscript.

      1. "To understand if codon usage patterns are[...]". This comes slightly out of the blue. The authors could maybe explain why codon usage should be explored for translational buffering. The authors should cite recent key works in the fields: DOI: 10.1016/j.celrep.2023.113413 DOI: 10.1101/2023.11.27.568910

      We would like to thank the reviewer for their suggestion. The references have been incorporated in the revised version of the manuscript. We have now explained why codon usage could be a contributor in determining the translational buffering potential (line 190).

      "The change in each metric was calculated by subtracting the mean value in the control samples from that in the knockdown samples. This yielded the differential mRNA abundance and ribosome occupancy resulting from gene knockdown.". This looks statistically weak. The authors should consider using more robust methods like DESeq.

      We thank the reviewer for the suggestion. We reanalyzed the selected studies using edgeR, and the modified figure is included in the revised version of the manuscript (Figure 4D). The conclusion after this analysis remains essentially the same. In particular, translational buffering is ineffective when mRNA abundance is perturbed drastically. Additionally, the limited number of experiments with direct perturbation of buffered genes limits the generalizability of this observation. This limitation is included in the results section (line 342).

      Legend: Scatter plot represents log2 fold change in RNA abundance and ribosome occupancy. Each point represents a gene and the fold change in its RNA and ribosome occupancy with respect to their controls. The line represents the line of equivalence. Buffered genes do not show less change in ribosome occupancy upon reduction in their RNA levels than other genes.

      1. "Genes in the buffered gene set had a higher codon adaptation index than the non-buffered set, indicating that candidates in the buffered gene set are relatively well expressed due to the presence of a higher proportion of the codons observed in highly expressed genes". What do the authors mean by "relatively well expressed"? Abundantly expressed? This sentence and the causality under it is unclear and should be modified or better explained.

      We thank the reviewer for pointing out the lack of clarity in the sentence. We have now quantitatively measured the CAI in the three categories and modified the sentence to better explain the rationale in the revised version (line 183). “To understand if codon usage patterns are associated with translational buffering, we next analyzed codon properties across buffered and non-buffered human gene sets. The codon adaptation index quantifies how closely a gene’s codon usage aligns with that of highly expressed genes. Genes in the buffered gene set had a higher codon adaptation index than the non-buffered set. Specifically, 28.4% of TB high genes, 14% of TB low genes and 9.3% of genes in the other category fall within the top decile (>90th percentile) of codon adaptation index.”
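
      For context, the codon adaptation index is conventionally the geometric mean of per-codon relative adaptiveness weights, where each weight compares a codon's usage with that of the most used synonymous codon in a reference set of highly expressed genes. The Python sketch below shows that calculation together with a top-decile fraction of the kind reported above. The weight table in the usage example is hypothetical, and the percentile convention is an assumption of this sketch.

```python
from math import exp, log


def codon_adaptation_index(cds, weights):
    """Geometric mean of relative adaptiveness weights over a coding sequence.

    weights maps a codon to a value in (0, 1]. Codons missing from the
    table (for example stop codons) are skipped.
    """
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    scored = [weights[codon] for codon in codons if codon in weights]
    if not scored:
        raise ValueError("no codon of the sequence has a weight")
    return exp(sum(log(w) for w in scored) / len(scored))


def top_decile_fraction(cai_by_gene, gene_group):
    """Fraction of gene_group whose CAI lies above the 90th percentile.

    The percentile is taken over all genes in cai_by_gene using a simple
    nearest-rank rule.
    """
    values = sorted(cai_by_gene.values())
    threshold_index = (9 * len(values) + 9) // 10 - 1  # ceil(0.9 * n) - 1
    threshold = values[threshold_index]
    in_top = sum(1 for gene in gene_group if cai_by_gene[gene] > threshold)
    return in_top / len(gene_group)
```

      For example, with the hypothetical weights `{"GGU": 1.0, "GGC": 0.5}`, the sequence `"GGUGGC"` has an index equal to the square root of 0.5.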

      The panel 4D is unclear. Is one point associated with one gene? Or is it the average of several genes? If it's one point for one gene, it is important to clearly state it because the number of cases is therefore quite low, especially for the TB high and low.

      Each point and line are associated with a single gene. This is now clarified in the legend of the figure (line 364). The number of genes in this analysis is limited to the available ribosome profiling data with gene knockdown experiments.

      1. In Figure 2J, GGU (Gly), AAG (Lys), and ACU (Arg) provide negative effects on prediction, although these were enriched in the high TB set (Figure 2E). This contradiction should be explained.

      While this may appear to be a contradiction, it is in line with what we expected. The objective of Figure 2J is to illustrate the features that predict the mRNA–TE correlation of genes, as identified using an LGBM model. The Spearman correlation shown reflects the relationship between each feature and the mRNA–TE correlation values. A negative correlation for codons such as GGU (Gly), AAG (Lys), and ACU (Thr) suggests that enrichment of these codons is associated with lower mRNA–TE correlation. This agrees with our observation in Figure 2E, which suggests that high TB genes are enriched in these codons. In contrast, transcript size exhibits a positive correlation, indicating that shorter transcripts tend to have lower mRNA–TE correlation values.

      Given that the choice of colors is a potential source of confusion, we have revised the text (line 230) and the figure (& legend) to try to clarify this relationship.

      The subtitle of "Translationally buffered genes exhibit variable association kinetics with the translational machinery in response to mRNA variation" sounds unfair to this reviewer. Since the authors did not work on kinetics directly, the use of this word is misleading.

      We agree and have revised the subtitle to “The association of translationally buffered genes with the translational machinery varies in response to changes in mRNA abundance”.

      1. The explanation of Figure 5A "We next explored the potential mechanisms that may give rise to translational buffering. Specifically, we considered two non-mutually exclusive models by which mRNA abundance might be decoupled from ribosome occupancy. In the first, the "differential transcript accessibility model", mRNA abundance determines the fraction of transcripts that are accessible to the translational pool. In this scenario, an increase in mRNA abundance would be accompanied by a proportionally smaller increase in the fraction of transcripts entering the translating pool for buffered genes, compared to non-buffered genes. In the second, the "initiation rate model", the rate of translation initiation per transcript scales inversely with mRNA abundance. Under this model, the proportion of mRNA entering the translational pool would be comparable across buffered and non-buffered genes (Fig 5A)." is hard to understand. The authors should rewrite for a better understanding of the readers.

      This section has been rewritten in the revised version of the manuscript. The text now reads as follows:

      “We next explored the potential mechanisms that may give rise to translational buffering. Specifically, we considered two non-mutually exclusive models by which mRNA abundance might be decoupled from ribosome occupancy. In the first, the “differential transcript accessibility model”, mRNA abundance determines the fraction of transcripts that are accessible to the translational pool. In this scenario, an increase in mRNA abundance would be accompanied by a proportionally smaller increase in the fraction of transcripts entering the translating pool for buffered genes, compared to non-buffered genes. In the second, the “initiation rate model”, the rate of translation initiation per transcript scales inversely with mRNA abundance. Under this model, as mRNA abundance increases, translation initiation on each transcript is reduced, thereby lowering the number of ribosomes per transcript. However, this mechanism allows a proportional increase in transcripts entering the translational pool for buffered genes, similar to non-buffered genes.”

      Significance

      Thanks to the development of Ribo-Seq, translational buffering has been reported in various works. However, the systematic investigation has remained challenging. Employing the database of published Ribo-Seq and matched RNA-Seq, Rao et al. attempt to understand the mechanism underlying translational buffering of mRNA variation across diverse materials. A group of mRNAs whose expression variance is buffered at the translation level was comprehensively surveyed in humans and mice. The authors found a series of features in the translationally buffered genes, including high GC contents in the 5′ UTR, optimal codon usage, and mRNA length. The depletion or increase of one allele of the genes in the group may be particularly detrimental to cells. The authors' report provides a step forward in our understanding of translational buffering, appealing to the broad scientific community in basic and applied biology. However, this reviewer found a series of concerns in this paper, including clarity in the methods, experimental validation, referring the earlier works, etc. These points could be tackled to improve the reliability of their findings, the strength of their main message, and the global understandability of the paper.

      We thank the reviewer for noting the significance of the work and for their constructive feedback.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Rao and colleagues present a comprehensive analysis of translational buffering in human and mouse by mining 1515 matched ribosome profiling and RNAseq datasets from diverse tissues and cell lines. They define translational buffering as genes whose TE is negatively correlated with mRNA abundance across conditions, and further identify candidates by comparing median absolute deviations of ribosome occupancy versus mRNA levels. The authors find a conserved set of buffered genes enriched for components of multiprotein complexes, demonstrate that buffered genes exhibit lower protein variability and greater dosage sensitivity, and propose two non-mutually exclusive mechanistic models (differential accessibility and initiation rate modulation). Finally, they perform complementary fractionation experiments in HEK293T cells to support these models.

      These findings propose a novel, conserved mechanism of translational buffering that tunes gene expression in mouse and human, showing how intrinsic sequence features and cellular context cooperate to stabilize protein output across diverse conditions. However, further evidence is required to fully support the authors conclusions, particularly direct validation of the proposed models of buffering.

      We thank the reviewer for their positive assessment and thoughtful suggestions that we address below.

      Below are my main concerns:

      1. The choice of the top 250 genes by spearman correlation and MAD ratio as "TB high" seems arbitrary. The authors should justify these cut offs (via permutation analysis or FDR control) and show that conclusions are robust to different thresholds.

      We agree that the threshold used to define TB high and TB low is somewhat subjective, and we now clearly acknowledge this in the discussion section (line 485). We now provide an R script that reproduces all analyses of translational buffering, where changing this cutoff to higher or lower values is straightforward.

      To ensure the robustness of our conclusions, we evaluated several thresholds for defining TB high and TB low. We observed that the conclusions hold within a reasonable range of values (100–250). For example, the effects on protein variation and the association of intrinsic features such as UTR lengths with buffering potential remain consistent when TB high is defined as the top 100 or the top 200 genes, compared with the current cutoff of 250. In contrast, when we define TB high as the top 2000 and TB low as ranks 2000–4000, the differences between the groups in these features are diminished (Figure A & B). Further, protein variation (human cancer cell lines and tissues) also becomes more similar across the three categories, possibly indicating a reduced regulatory potential of genes as their rank increases (Figure C & D). Our results show that highly ranked genes consistently associate with specific features, suggesting an underlying hierarchy in translational buffering potential.

      Legend: Effect of different thresholds on: A. Length features; B. Median RNA expression; C. Protein variation in human cancer cell lines; D. Protein variation in primary human tissues.
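      The threshold comparison described above can be sketched as follows. This is a minimal illustration, not the authors' R script: it assumes that genes are ranked by the sum of two rankings (the Spearman correlation between mRNA abundance and TE, and the ratio of the MAD of ribosome occupancy to the MAD of mRNA abundance), with lower sums indicating stronger buffering. The function names and the toy input are hypothetical.

```python
from statistics import median


def ranks(values):
    """1-based ranks in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def spearman(x, y):
    """Spearman correlation as the Pearson correlation of ranks (inputs must vary)."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def mad(values):
    """Median absolute deviation from the median."""
    m = median(values)
    return median([abs(v - m) for v in values])


def buffering_ranking(rna, te, ribo):
    """Order genes from most to least buffered by the sum of two rankings.

    rna, te, ribo: dicts mapping gene -> per-sample values (same sample order).
    """
    genes = sorted(rna)
    rho = [spearman(rna[g], te[g]) for g in genes]           # more negative = more buffered
    mad_ratio = [mad(ribo[g]) / mad(rna[g]) for g in genes]  # smaller = more buffered
    score = [a + b for a, b in zip(ranks(rho), ranks(mad_ratio))]
    return [g for _, g in sorted(zip(score, genes))]


def top_sets(ranking, cutoffs=(100, 200, 250)):
    """Candidate 'TB high' sets for several cutoffs, for robustness checks."""
    return {n: set(ranking[:n]) for n in cutoffs}


# Toy data: gene A is buffered, gene B is not, gene C is intermediate.
rna = {"A": [1, 2, 3, 4], "B": [1, 2, 3, 4], "C": [1, 2, 3, 4]}
te = {"A": [4, 3, 2, 1], "B": [1, 2, 3, 4], "C": [2, 1, 4, 3]}
ribo = {"A": [2.0, 2.1, 2.2, 2.3], "B": [1, 4, 9, 16], "C": [1, 2, 3, 4]}
ranking = buffering_ranking(rna, te, ribo)
print(ranking, top_sets(ranking, cutoffs=(1, 2)))
```

      Any downstream comparison (protein variation, UTR length) can then be repeated for each set returned by `top_sets` to check whether a conclusion depends on the cutoff.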

      The modified compositional regression approach for TE and imputation of missing values are central to the study, but details are relegated to supplemental methods. The manuscript would benefit from a clear comparison of this method against standard log-ratio TE estimates, including sensitivity analyses to missing-data imputation strategies

      We thank the reviewer for the feedback. We have now added further description of the modified compositional regression and the imputation strategy in the results section (line 94). Comparison to standard log-ratio TE estimates and their limitations has already been detailed in Liu et al. 2025, Nature Biotechnology. Therefore, in the current manuscript we specifically focus on the effect of the imputation strategy.

      Specifically, the modified imputation slightly improved the concordance between the sets of genes identified as translationally buffered using the negative RNA-TE relationship and using the RNA-ribosome occupancy correlation (from 0.91 to 0.94). Further, we assessed the correlation between TE and protein abundance measured by mass spectrometry in seven human cell lines (A549, HEK293, HeLa, HepG2, K562, MCF7 and U2OS). The protein measurements were obtained from PaxDb. The new imputation strategy slightly increased the mean correlation between TE and protein abundance compared with the naive strategy, with the clearest improvements for the HepG2, A549 and HeLa cell lines. This analysis used the 3,507 genes common to PaxDb, Liu et al., 2025 and the current study.

      Legend: Proteomics vs TE correlation by cell type, without or with the imputation strategy. Spearman correlation between protein abundance and compositional TE, calculated either as in Liu et al., 2025 from 68 samples from 11 studies (HEK293), 86 samples from 10 studies (HeLa), 58 samples from four studies (U2OS), 29 samples from five studies (A549), five samples from two studies (MCF7), seven samples from two studies (K562) and 10 samples from two studies (HepG2), or in the current study from 57 samples from 10 studies (HEK293), 82 samples from nine studies (HeLa), 58 samples from four studies (U2OS), 29 samples from five studies (A549), five samples from two studies (MCF7), one sample from one study (K562) and nine samples from two studies (HepG2). The 3,507 genes common to PaxDb, Liu et al., 2025 and the current study were used for this analysis.

      Human data are derived mainly from immortalized cell lines, whereas mouse data are from primary tissues. Pooling these heterogeneous sources may conflate cell type-specific regulation with intrinsic buffering. The authors should either stratify analyses by context or demonstrate buffering signatures remain consistent within more homogeneous subsets

      We thank the reviewer for the suggestion and agree that heterogeneity could potentially mask cell type-specific buffering effects. The TB-high genes we report are those that show consistent and robust expression across diverse contexts. However, unlike RNA-seq datasets, the current number of ribosome profiling samples per cell type is still limited, and a more comprehensive assessment of context-specific buffering will require larger datasets that will accumulate over time.

      Nonetheless, we have stratified the analysis by cellular context. Specifically, we grouped samples of the same cell type and repeated the buffering analysis. We provide a new table on GitHub listing the TB ranks of genes for the five cell types with the largest sample sizes.

      https://github.com/CenikLab/Translational-buffering/blob/Translational-Buffering/combined_tables.xlsx

      As an additional control, we compared buffering patterns between related and unrelated cell lines. For example, the correlation of TB ranks between related cell lines HEK293T (n = 98) and HEK293 (n = 57) is higher (0.46) than between either and an unrelated cell line, HeLa (n = 82). Similarly, the correlation between two liver cell lines, Huh7 (n = 39) and HepG2 (n = 9), is higher (0.20) than between Huh7 and a similarly sampled but unrelated lymphoblastoid cell line (LCL, n = 9; correlation = 0.05). While these analyses suggest that cell type-specific patterns may exist, their exploration is currently limited by sample size, as detecting buffering requires substantial variability in mRNA expression. We now highlight this as a limitation in the Discussion section (line 573).

      Legend: Spearman correlation between TB ranks of different pairs of cell lines. The first set shows comparisons with HEK293T. The second set shows comparisons between liver cell lines (HepG2 and Huh7).

      The HEK293T fractionation experiments offer preliminary support for both the "accessibility" and "initiation" models, but only slope analyses are shown. To validate these models, the authors should perform targeted reporter assays (dual luciferase constructs with 5′UTR swaps) or manipulations of initiation factors (eIF4E knockdown) to directly test how transcript abundance alters initiation rates versus pool entry

      We thank the reviewer for suggesting experiments to validate the proposed models. In the luciferase reporter experiments, constructs bearing the endogenous UTRs of non-buffered genes would be expected to yield expression proportional to transcript abundance. In contrast, if buffering operates through the “initiation rate model” and depends on the 5′ UTR sequence of the transcript, swapping in a 5′ UTR from a buffered gene would be expected to weaken this proportionality. However, as outlined below, this experiment has important caveats:

      1. Role of coding sequence: Such assays primarily test the contribution of the 5′UTR and do not address potential cooperative effects between the 5′UTR and the coding sequence (CDS). Thus, if a swapped 5′UTR fails to recapitulate translational buffering, it would be unclear whether buffering requires coordinated action of the 5′UTR and CDS or whether the gene in question simply does not conform to the initiation-rate model.
      2. Sensitivity of measurements: Reporter-based measurements often rely on RT-qPCR to quantify expression changes. While suitable for large fold-changes, small shifts may fall within the assay’s technical margin of error, limiting the interpretability of the results.
      3. Gene-to-gene variability: Buffered and non-buffered transcripts likely span a wide range of intrinsic initiation rates. Selecting only a few “representative” transcripts for 5′UTR swapping could yield results that are not broadly generalizable.

      Similarly, knockdown of general initiation factors will likely affect both buffered and non-buffered genes, which could limit the ability to distinguish the effect of transcript abundance on translational buffering via either of the proposed models. We envision an alternative future approach involving single-molecule imaging of translating and non-translating mRNAs of buffered and non-buffered genes under varying abundance conditions in a physiological context. Such experiments are likely the most suitable for disentangling the contributions of accessibility versus initiation. While we find this an exciting direction for future work, it lies beyond the scope of the present manuscript.

      The conclusion that buffering reduces protein variability relies on mass-spec comparisons, but ribosome occupancy does not always reflect functional protein output (due to elongation stalling or co-translational degradation). Incorporating orthogonal measures, such as pulse-labeling or western blots for key buffered versus non-buffered genes, would strengthen the link between buffering and proteome stability

      We agree with the reviewer’s concern and have acknowledged it as a limitation in the discussion section. To address this with orthogonal approaches, we carried out several additional experiments. Specifically, using the RNA-seq data in RiboBase, we identified a study (GSE132703) in which the abundance of the FUS transcript (a translationally buffered gene) varied significantly across conditions, namely HEK293T wild type, LARP1A single knockout (SKO), and LARP1A/B double knockout (DKO). We reached out to the authors of the study and obtained these knockout cell lines. We reanalyzed RNA abundance under the different conditions by RT-qPCR and assessed protein levels by Western blot. Despite the differences in RNA abundance, FUS protein levels did not change correspondingly.

      We also selected a non-buffered gene, DNAJC6, that also showed RNA-level differences. However, the change in RNA expression was not consistently reflected at the protein level. Two caveats of Western blotting are its limited sensitivity, which may prevent detection of subtle changes, and its measurement of steady-state protein levels, which cannot resolve whether differences arise from altered synthesis or degradation.

      Legend: Validation of a buffered gene by Western blot. A. RNA abundance and ribosome occupancy of the buffered gene FUS and the non-buffered gene DNAJC6 across HEK293T wild type, LARP1A single knockout and LARP1A/B double knockout. B. Validation of the RNA-seq data by qPCR. C. Western blot showing FUS, DNAJC6 and Actin in wild type and the different mutants. D. Bar plot showing the quantification of the Western blot.

      In addition to this targeted analysis, we performed quantitative mass spectrometry to evaluate the effect of mRNA variation on protein levels at a global scale.

      LC-MS/MS analysis was performed on the above samples in triplicate at the Proteomics facility of the University of Texas. A total of 4,048 proteins were identified using a peptide confidence threshold of 95% and a protein confidence threshold of 99%, with a minimum of two peptides required for identification. Total precursor intensities for all peptides of a protein were summed and used for protein quantification with the DEP (Differential Enrichment analysis of Proteomics data) package in Bioconductor, R (https://rdrr.io/bioc/DEP/man/DEP.html). DEP was used for variance normalization and statistical testing of differentially expressed proteins. As expected, LARP1 protein was identified in the control cells but not in the single or double knockouts.

      We then plotted the fold change in RNA, as determined by edgeR analysis of the RNA-seq data from Philippe et al. (2020), against the fold change in protein abundance from our mass spectrometry data. Linear regression analysis showed that genes in the TB high group have reduced changes at the protein level compared with the TB low group or other genes, in both the single and double LARP1 KO mutants. This result is consistent with our finding that buffered genes show lower variation in protein abundance in response to changes in mRNA expression.
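      The per-group regression described above can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout, the group labels, and the function names are hypothetical, and ordinary least squares of protein log2 fold change on RNA log2 fold change stands in for whatever regression settings the authors used.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def slopes_by_group(records):
    """records: iterable of (group, rna_log2fc, protein_log2fc) tuples.

    Returns one slope per group; a slope near 1 means protein changes track
    RNA changes, and a slope near 0 means RNA changes are buffered.
    """
    grouped = {}
    for group, rna_fc, protein_fc in records:
        xs, ys = grouped.setdefault(group, ([], []))
        xs.append(rna_fc)
        ys.append(protein_fc)
    return {group: ols_slope(xs, ys) for group, (xs, ys) in grouped.items()}


# Toy fold changes (not real data): the buffered group responds weakly at the protein level.
records = [
    ("TB high", -1.0, -0.2), ("TB high", 0.0, 0.0), ("TB high", 1.0, 0.2),
    ("TB low", -1.0, -0.9), ("TB low", 0.0, 0.0), ("TB low", 1.0, 0.9),
]
slopes = slopes_by_group(records)
print(slopes)
```

      Under this reading, a smaller slope for the TB high group than for the TB low group is what reduced protein-level change in buffered genes would look like.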

      Legend: Scatter plot showing the log2 fold change in RNA and protein levels, as determined by RNA-seq from Philippe et al. (2020) or by mass spectrometry. Differential analysis of RNA was done using the edgeR package, and the DEP (Differential Enrichment analysis of Proteomics data) package was used for the mass spectrometry analysis. Only genes with an FDR […]

      We have not included these data in the manuscript given the deviation of the approach from our original analysis, but we are happy to reconsider including them to supplement our proteomic analysis.

      While the LGBM modeling shows modest predictive power of sequence features alone, the manuscript stops short of exploring what cellular factors might drive context dependence. Integrating public datasets on RNA-binding protein expression or mTOR pathway activity across samples could illuminate trans-acting determinants of buffering and move beyond correlative sequence analyses.

      We thank the reviewer for this suggestion. To investigate potential trans-acting determinants of buffering, we focused on 1,394 human RBPs as classified by Hentze et al. (2018), reasoning that some of these factors may facilitate translational buffering. Specifically, we examined correlations between the RNA expression of each RBP and the TE of all other genes across samples. P-values were corrected using the Bonferroni procedure. For each RBP, we then performed a Fisher’s exact test to assess whether the number of significant correlations was enriched among buffered versus non-buffered genes.
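      The enrichment step described above can be sketched for a single RBP as follows. This is an illustrative sketch only: the 2x2 table layout, the one-sided alternative, and the function names are assumptions, since the response does not state these details.

```python
from math import comb


def bonferroni(p_values):
    """Bonferroni adjustment: multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Assumed layout: rows are buffered / non-buffered genes, columns are
    significant / non-significant correlations with the RBP. Returns the
    hypergeometric probability of seeing at least `a` significant buffered genes.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    numerator = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return numerator / comb(n, col1)


def odds_ratio(a, b, c, d):
    """Sample odds ratio; undefined (division by zero) if b or c is 0."""
    return (a * d) / (b * c)


# Toy table: 3 of 4 buffered genes and 1 of 4 non-buffered genes correlate significantly.
p_value = fisher_exact_greater(3, 1, 1, 3)
print(p_value, odds_ratio(3, 1, 1, 3), bonferroni([0.01, 0.04, 0.5]))
```

      In a full analysis, `bonferroni` would first be applied to the correlation p-values to decide which gene-RBP pairs count as significant, and the resulting counts would fill the table for each RBP.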

      This analysis revealed that the expression levels of many RBPs are significantly enriched for either positive or negative correlations with the TE of buffered genes. In particular, we note that RNA expression of many buffered RBPs is enriched for negative correlations with the TE of other buffered transcripts. These results suggest that, rather than considering translational buffering in isolation for each transcript, buffering effects may be coordinated at the translational level and influenced by shared trans-acting factors such as RBPs. Network-based approaches have been valuable for RNA co-expression and are only now being applied to TE covariation. However, the correlative nature of these analyses limits causal inference. For example, although many ribosomal proteins appear to influence the buffering of other ribosomal proteins, they themselves may be regulated by a non-ribosomal RBP—so the apparent effects could reflect upstream regulatory influences. This analysis is now included as a supplementary figure (Sup. Fig. 5) of the revised manuscript.

      Legend: Scatter plot of the log odds ratio of the number of significant correlations (RNA abundance of RBPs vs. TE of genes) against the p-value from Fisher's exact test. The vertical dashed line represents the threshold odds ratio, above which RBPs exhibit a higher number of significant correlations with buffered genes. P-values were corrected using the Bonferroni procedure, and the horizontal dashed line represents the adjusted p-value cutoff.

      Reviewer #2 (Significance (Required)):

      Overall, this manuscript leverages an unprecedented compendium of matched ribosome profiling and RNAseq datasets across human cell lines and mouse tissues, combined with improved TE estimation, to robustly catalog genes exhibiting translational buffering, a clear methodological and conceptual strength. The main limitations stem from heterogeneous sample sources, largely correlative analyses, and a lack of targeted mechanistic validation. Compared to prior yeast focused studies, it fills a key gap by demonstrating conservation of buffering in mammals and linking it to dosage sensitivity and protein stability, representing a conceptual advance in understanding post-transcriptional homeostasis and a methodological step forward in TE analysis. This work will interest researchers in RNA biology, gene expression regulation, systems biology, and cancer proteomics, as well as those studying dosage-sensitive pathways and translational control. My expertise is on translational control in cancer.

      We thank the reviewer for noting the broader significance of the work and for their constructive feedback.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      In this manuscript, Rao and colleagues present a comprehensive analysis of translational buffering in human and mouse by mining 1515 matched ribosome profiling and RNAseq datasets from diverse tissues and cell lines. They define translational buffering as genes whose TE is negatively correlated with mRNA abundance across conditions, and further identify candidates by comparing median absolute deviations of ribosome occupancy versus mRNA levels. The authors find a conserved set of buffered genes enriched for components of multiprotein complexes, demonstrate that buffered genes exhibit lower protein variability and greater dosage sensitivity, and propose two non-mutually exclusive mechanistic models (differential accessibility and initiation rate modulation). Finally, they perform complementary fractionation experiments in HEK293T cells to support these models.

      These findings propose a novel, conserved mechanism of translational buffering that tunes gene expression in mouse and human, showing how intrinsic sequence features and cellular context cooperate to stabilize protein output across diverse conditions. However, further evidence is required to fully support the authors conclusions, particularly direct validation of the proposed models of buffering. Below are my main concerns:

      1. The choice of the top 250 genes by spearman correlation and MAD ratio as "TB high" seems arbitrary. The authors should justify these cut offs (via permutation analysis or FDR control) and show that conclusions are robust to different thresholds
      2. The modified compositional regression approach for TE and imputation of missing values are central to the study, but details are relegated to supplemental methods. The manuscript would benefit from a clear comparison of this method against standard log-ratio TE estimates, including sensitivity analyses to missing-data imputation strategies
      3. Human data are derived mainly from immortalized cell lines, whereas mouse data are from primary tissues. Pooling these heterogeneous sources may conflate cell type-specific regulation with intrinsic buffering. The authors should either stratify analyses by context or demonstrate buffering signatures remain consistent within more homogeneous subsets
      4. The HEK293T fractionation experiments offer preliminary support for both the "accessibility" and "initiation" models, but only slope analyses are shown. To validate these models, the authors should perform targeted reporter assays (dual luciferase constructs with 5′UTR swaps) or manipulations of initiation factors (eIF4E knockdown) to directly test how transcript abundance alters initiation rates versus pool entry
      5. The conclusion that buffering reduces protein variability relies on mass-spec comparisons, but ribosome occupancy does not always reflect functional protein output (due to elongation stalling or co-translational degradation). Incorporating orthogonal measures, such as pulse-labeling or western blots for key buffered versus non-buffered genes, would strengthen the link between buffering and proteome stability
      6. While the LGBM modeling shows modest predictive power of sequence features alone, the manuscript stops short of exploring what cellular factors might drive context dependence. Integrating public datasets on RNA-binding protein expression or mTOR pathway activity across samples could illuminate trans-acting determinants of buffering and move beyond correlative sequence analyses

      Significance

      Overall, this manuscript leverages an unprecedented compendium of matched ribosome profiling and RNAseq datasets across human cell lines and mouse tissues, combined with improved TE estimation, to robustly catalog genes exhibiting translational buffering, a clear methodological and conceptual strength. The main limitations stem from heterogeneous sample sources, largely correlative analyses, and a lack of targeted mechanistic validation. Compared to prior yeast focused studies, it fills a key gap by demonstrating conservation of buffering in mammals and linking it to dosage sensitivity and protein stability, representing a conceptual advance in understanding post-transcriptional homeostasis and a methodological step forward in TE analysis. This work will interest researchers in RNA biology, gene expression regulation, systems biology, and cancer proteomics, as well as those studying dosage-sensitive pathways and translational control. My expertise is on translational control in cancer.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      Thanks to the development of Ribo-Seq, translational buffering has been reported in various works. However, the systematic investigation has remained challenging. Employing the database of published Ribo-Seq and matched RNA-Seq, Rao et al. attempt to understand the mechanism underlying translational buffering of mRNA variation across diverse materials. Although the authors' report provides a step forward in our understanding of translational buffering, this reviewer found a series of concerns in this paper. These points could be tackled to improve the reliability of their findings, the strength of their main message, and the global understandability of the paper.

      Major comments:

      1. This paper heavily relies on the reference 18. However, this paper was not properly stated (no page or journal number); the study in Bioinformatics is nowhere to be found on the website, despite being out in 2024 apparently. Either title is wrong (yet a biorxiv can be found). This reviewer guessed that the reference 18 may be accepted. However, without a proper reference, this paper could not be judged since nearly all the parts of this work have been based on the reference 18. Also, the Ribobase data used in this manuscript comes from this reference, so it had better be well defined, especially when another Ribobase data set seems to be available online: http://www.bioinf.uni-freiburg.de/~ribobase/index.html
      2. In the Discussion, the authors mentioned "TE is based on a compositional regression model (18) rather than the commonly applied approach of using a logarithmic ratio of ribosome occupancy to mRNA abundance." This important information should be mentioned early section of the manuscript. Related to this, there are other published methods for exploring change in translation efficiency (e.g., 10.1093/bioinformatics/btw585; 10.1093/nar/gkz223) that could also be suitable in this context. It is not entirely clear if their approach is better than before. Again, the improper reference to 18 made our assessment of this work difficult.
      3. The paper mainly relies on detecting a set of buffered genes using mRNA-TE correlation and MAD ratios (Ribo-Seq/RNA-Seq). While the concept seems sound, the authors should ensure that this method is reliable. Several controls could be used to confirm this. First, if any studies in humans or mice have described a set of genes as buffered, it would be worth checking for overlap between the authors' set of 'TB high' genes and the previously established list. Furthermore, the authors could use packages explicitly developed for translational buffering detection, such as anota2seq (https://academic.oup.com/nar/article/47/12/e70/5423604?login=true). Not all of the data used by the authors may be suitable for such packages, but the authors could at least partially use them on some of their datasets and see whether the buffered genes reported by these packages match their predictions.
      4. The threshold of 'TB high' or 'TB low' (top and bottom 250) is somewhat arbitrary. Why not top 100 or 500? The authors should provide a rationale for this choice. Also, they could include a numeric measure of buffering (the sum of the two rankings is probably suitable for this purpose). Several of the authors' explorations are suitable for numerical quantification (GO enrichment can be turned into GSEA, and the boxplot can be shown as correlations)
      5. Several of the statements of the authors in the Introduction or Discussion sections are not entirely true regarding the literature on the topics, or lack major papers on the topic, and therefore, they are a bit misleading. Among others, here are some:

      5-1 "In addition, genetic differences arising from aneuploidy, cell type differences or variability observed in the natural population can further determine the amplitude of variation (4-7). The effect of mRNA variation under these conditions is mostly reflected at the protein levels (2, 4-8).". Several recent or more ancient papers suggest that mRNA variation coming from aneuploidy, natural genetic variation, or CNV is buffered or not well reflected at the protein level:

      DOI: 10.1038/s41586-024-07442-9 DOI: 10.1073/pnas.2319211121 DOI: 10.1016/j.cels.2017.08.013 DOI: 10.15252/msb.20177548

      5-2: The authors should also consider mentioning these studies and softening their initial statement. "Similarly, translational buffering of certain genes have been reported in mammalian cells, specifically under estrogen receptor alpha (ERα) depletion conditions (16).". Translational buffering has been deeply explored in mammalian tissues and even across several mammalian species in this study (DOI: 10.1038/s41586-020-2899-z). In this, the authors also provide a nice exploration of the gene characteristics that are associated with translational buffering. The authors should mention it and compare the study's findings to theirs ultimately.

      5-3: "Differences in species evaluated and statistical methods have resulted in conflicting interpretations (13, 28).". These conflicting results have been previously discussed in reviews on the topic that would be worth mentioning: DOI: 10.1016/j.cell.2016.03.014 DOI: 10.1038/s41576-020-0258-4

      6. In addition to the p-values stated in the main text, the authors should annotate their plots when they find significant differences between groups to greatly facilitate the visual interpretation of the graphs.
      7. Based on the data of Figure 4D, apparently, ribosome occupancy was not buffered even in high TB sets. The authors may argue that translational buffering may not cope with such a strong mRNA reduction. In that case, how big a difference in mRNA level does the buffering system adjust in protein synthesis? The authors should test gradual gene knockdown and/or overexpression and conduct Ribo-Seq/RNA-Seq to survey the buffering range.
      8. "differential transcript accessibility model" could not be functional if mRNA is reduced beyond the accessible pool (i.e., less than the threshold, all the mRNAs are translated without buffering). The authors should carefully reconsider this model and the effective range of mRNAs.

      Minor comments:

      1. Some figures are of poor quality as they seem to have points outside of the panel representations... Like Figure 3C, one point is out of the square, same for Figure 4E. Similarly, on figure 5F, some outliers seem to be clearly cut from the figure (maybe not, but then the author should put a larger space between the end of the figure and the max y points). Same for panel S2D and S6D, this does not sound so rigorous.
      2. There are several typos or weird sentences. Here are some (but maybe not all):

      2-1: [...]with lower sums corresponding to higher final ranks. "two rankings". Based on these final ranks[...]

      2-2: For each dataset, median absolute deviation (MAD) "i" protein abundance was calculated across samples

      2-3: [...]neighbor method implemented in the MatchIT package (38) Differences in protein[...] a point is missing here.

      2-4: Additionally a second dataset providing predictions of haploinsufficiency (pHaplo score) and triplosensitivity (pTriplo score) for all autosomal genes (25) was used to asses the distribution of these score"S" across buffered and non-buffered gene sets . There is a missing "s" at "score" and there is a space between the last word and the final point.

      3. In the "Lymphoblastoid cell line data analysis:" section, this reviewer wonders why the authors used a different method to calculate buffering compared to before.
      4. "Samples which had R2 less than 0.2 were removed as the residuals calculated for these samples could be unreliable". These samples for which the correspondence between RNA-Seq and Ribo-Seq is low wouldn't be the ones most impacted by translational buffering? Is it sure that the authors are not missing something here?
      5. For Figure 4B and 4C, the authors should provide statistical tests and p-values to confirm the observed trends.
      6. In Figure 2A, the "all genes" color doesn't correspond to the point color.
      7. "To understand if codon usage patterns are[...]". This comes slightly out of the blue. The authors could maybe explain why codon usage should be explored for translational buffering. The authors should cite recent key works in the fields: DOI: 10.1016/j.celrep.2023.113413 DOI: 10.1101/2023.11.27.568910
      8. "The change in each metric was calculated by subtracting the mean value in the control samples from that in the knockdown samples. This yielded the differential mRNA abundance and ribosome occupancy resulting from gene knockdown.". This looks statistically weak. The authors should consider using more robust methods like DESeq.
      9. "Genes in the buffered gene set had a higher codon adaptation index than the non-buffered set, indicating that candidates in the buffered gene set are relatively well expressed due to the presence of a higher proportion of the codons observed in highly expressed genes". What do the authors mean by "relatively well expressed"? Abundantly expressed? This sentence and the causality under it is unclear and should be modified or better explained.
      10. The panel 4D is unclear. Is one point associated with one gene? Or is it the average of several genes? If it's one point for one gene, it is important to clearly state it because the number of cases is therefore quite low, especially for the TB high and low.
      11. In Figure 2J, GGU (Gly), AAG (Lys), and ACU (Arg) provide negative effects on prediction, although these were enriched in the high TB set (Figure 2E). This contradiction should be explained.
      12. The subtitle of "Translationally buffered genes exhibit variable association kinetics with the translational machinery in response to mRNA variation" sounds unfair to this reviewer. Since the authors did not work on kinetics directly, the use of this word is misleading.
      13. The explanation of Figure 5A "We next explored the potential mechanisms that may give rise to translational buffering. Specifically, we considered two non-mutually exclusive models by which mRNA abundance might be decoupled from ribosome occupancy. In the first, the "differential transcript accessibility model", mRNA abundance determines the fraction of transcripts that are accessible to the translational pool. In this scenario, an increase in mRNA abundance would be accompanied by a proportionally smaller increase in the fraction of transcripts entering the translating pool for buffered genes, compared to non-buffered genes. In the second, the "initiation rate model", the rate of translation initiation per transcript scales inversely with mRNA abundance. Under this model, the proportion of mRNA entering the translational pool would be comparable across buffered and non-buffered genes (Fig 5A)." is hard to understand. The authors should rewrite for a better understanding of the readers.

      Significance

      Thanks to the development of Ribo-Seq, translational buffering has been reported in various works. However, systematic investigation has remained challenging. Using a database of published Ribo-Seq and matched RNA-Seq data, Rao et al. attempt to understand the mechanism underlying translational buffering of mRNA variation across diverse materials. A group of mRNAs whose expression variance is buffered at the translation level was comprehensively surveyed in humans and mice. The authors found a series of features in the translationally buffered genes, including high GC content in the 5′ UTR, optimal codon usage, and mRNA length. The depletion or increase of one allele of the genes in the group may be particularly detrimental to cells. The authors' report provides a step forward in our understanding of translational buffering, appealing to the broad scientific community in basic and applied biology. However, this reviewer has a series of concerns about this paper, including clarity of the methods, experimental validation, and citation of earlier works. Addressing these points would improve the reliability of the findings, the strength of the main message, and the overall clarity of the paper.

    1. Briefing Document: Mental Health in France

      Summary

      Mental health in France is at the heart of a critical paradox: although declared a "grande cause nationale" (major national cause) for 2025, it remains the "forgotten major cause" of public policy, suffering from chronic underfunding and a deep structural crisis.

      The psychiatric care system is on the verge of collapse, with a 47% vacancy rate for hospital psychiatrist posts, bed closures, and waiting times for consultations of up to two years in child and adolescent psychiatry.

      This situation has dramatic consequences, particularly for young people, the most vulnerable population, among whom one in four high-school students has already had suicidal thoughts.

      While people are gradually speaking more openly, thanks to testimonies from public figures and cultural works that help lift the taboo, the systemic challenges remain immense.

      Investment in prevention is almost nonexistent, leading to diagnostic delays of nearly a decade.

      New initiatives, such as mobile applications and mental health first aid training, offer complementary avenues but run up against a glaring shortage of professionals to whom people in distress can be referred.

      The crisis is compounded by concrete problems such as regional disparities in access to care and an unprecedented shortage of psychotropic medications, underscoring the urgency of an ambitious policy funded in proportion to the stakes.

      1. The Paradoxical Crisis of Mental Health in France

      "Grande Cause Nationale": A Declaration Without Resources

      Mental health was officially designated a "grande cause nationale" for 2025. However, this political announcement is struggling to translate into concrete, funded action.

      According to psychiatrist Stéphane Oriette, the declaration was made "without associated funding", which was "the condition stated from the outset".

      Successive public policies (Assises de la santé mentale, Conseil national de la refondation) identified the difficulties but did not deploy the necessary resources, leaving professionals and patients facing growing pressure.

      Angèle Malâtre-Lansac highlights the paradox: mental health is the largest item of expenditure for the Assurance Maladie (national health insurance), ahead of cancer and cardiovascular disease, and yet 50% of those affected receive no care.

      A Sector in Distress: Shortages and Lack of Appeal

      Psychiatry faces a serious human-resources crisis, symptomatic of a lack of recognition.

      Staff shortage: The share of vacant hospital psychiatrist posts has reached 47%, up from 43% a few years earlier.

      Lack of appeal: Psychiatry is viewed negatively by some future doctors. A survey found that 60% of medical residents consider psychiatry a "sub-specialty" and 30% are afraid of it. Yet 90% of psychiatrists surveyed say they would choose the specialty again.

      Lack of material resources: Stéphane Oriette stresses the need for "staff", "medications", and "suitable premises" in order to provide proper care.

      Direct Consequences for Patient Care

      This systemic crisis directly affects the quality of and access to care for the 13 million people affected in France. A nurse at the Tours hospital, where 80 psychiatric beds are slated for closure, testifies:

      "It takes on average about one to two months for a patient with schizophrenia to be stabilized. Now there won't be any room.

      So it will be 15 days of hospitalization, and these people will be put out on the street. So we'll see them again in the psychiatric emergency department. It's not possible."

      2. Young People: A Particularly Vulnerable Population

      The state of young people's mental health in France is particularly alarming, exacerbated by recent crises such as COVID-19.

      Suicide is one of the leading causes of death among young people.

      Alarming Statistics

      Indicator: Key Figure

      Suicidal thoughts (high-school students): 1 in 4 had at least one suicidal thought during the year.

      Suicidal thoughts (ages 18-24): One third of young people in this age group have had suicidal thoughts.

      Perceived mental health (girls): Only 49% of girls consider their mental health to be satisfactory.

      Suicide prevention line: The national number is 3114.

      The actress and director Isabelle Carré was moved to make her film "Les Rêveurs", inspired by her own hospitalization at age 14, by seeing "despair and psychological fragility rising among young people".

      The Specific Case of Girls

      Isabelle Carré points out that girls appear to suffer more, an issue that, in her view, is not sufficiently debated in public.

      The figure of nearly one girl in two not feeling in good mental health is described as "dramatic".

      Unacceptable Waiting Times in Child and Adolescent Psychiatry

      Access to care for children and adolescents is a major weak point of the system. It can take up to two years to get an appointment in child and adolescent psychiatry. Stéphane Oriette describes the caregivers' dilemma:

      "What does it mean for a caregiver to take on the responsibility of saying 'go home' to someone who is asking for help?"

      In addition, children in the care of child welfare services (aide sociale à l'enfance, ASE) occupy two thirds of child and adolescent psychiatry beds, underscoring the vulnerability of this population.

      3. Breaking the Taboo: The Emergence of a New Public Voice

      Despite the crisis, a cultural shift is slowly under way, with people speaking more openly in ways that help destigmatize mental illness.

      The Role of Testimonies and Cultural Works

      Public testimonies: The journalist Nicolas de Moran has spoken publicly about his bipolar disorder in strong terms: "Yes, I am mentally ill.

      It is raw, it is violent to say, perhaps to hear as well, but I no longer want to hide it and I no longer want to hide."

      Cultural productions: Series, broadcasts, and films, such as "Les Rêveurs" by Isabelle Carré, address the subject.

      The "Cinéma à la Folie" festival, of which she is the patron, is also a vehicle for raising awareness. The aim is to change how psychiatric illness is viewed, so that it is no longer seen "as weakness, madness, violence".

      This movement is compared to the one that took place in English-speaking countries, where significant work has been done on "destigmatization".

      4. Prevention and New Approaches: Between Hopes and Limits

      Faced with the failings of the traditional system, new strategies are emerging, focused on prevention, digital tools, and mutual support.

      The Crucial Issue of Prevention and Early Intervention

      According to Angèle Malâtre-Lansac, France invests "very, very little in prevention". This shortfall has serious consequences:

      • 75% of mental illnesses develop before the age of 25.

      • The delay between first symptoms and a diagnosis can reach 8 to 10 years, as illustrated by the case of Nicolas de Moran. Some people are not diagnosed until around age 50, or never.

      Digital Tools: Complement or Danger?

      Young people are turning en masse to social media and apps for information and help.

      Dedicated apps: The "Link" app, created by the influencer Miel (aged 18), has been downloaded 300,000 times.

      It offers an emotions calendar, a private journal, and an "emergency kit".

      Other apps such as "Jardin Mental" (free and state-supported) also exist.

      Risk of misinformation: A Guardian investigation found that half of the most popular TikTok videos about mental health spread false information (e.g., eating an orange to relieve anxiety).

      In France, nearly 90% of content on the subject is posted by people who are not health professionals.

      Artificial intelligence: Turning to AI tools such as ChatGPT to confide in is seen by Stéphane Oriette as a sign that young people "find no answers on the human side" and look for them elsewhere.

      Experts agree that these tools can be a useful complement for information or follow-up, but will "never replace the human factor".

      Mental Health First Aid and Peer Support

      Mental health first aid: Inspired by an Australian model, this program trains citizens to spot signs of psychological distress and refer people to professionals. The goal is to train 750,000 people in France.

      Peer support: Associations such as "La Maison Perchée" offer places to meet "peer supporters", people who have gone through similar experiences and can offer support.

      Stéphane Oriette points out the limit of these schemes: "The question is: to whom do we refer people, to what do we refer them?" if professional care facilities are saturated.

      5. Specific and Systemic Issues

      Mental Health in the Workplace

      It is a "major issue, too often neglected", according to Angèle Malâtre-Lansac.

      Mental health is the leading cause of long-term sick leave.

      A charter of commitment to mental health at work has been created to encourage companies to train their teams, particularly in psychological first aid.

      Regional Disparities

      Access to care is extremely unequal across the country. Isabelle Carré stresses this point, noting that there are "entire regions where there is nothing".

      The Shortage of Psychotropic Medications

      A phenomenon described as "fairly unprecedented" and particularly worrying has emerged since the spring: a shortage of psychotropic medications.

      Caused by problems on a production line in Greece, this situation prevents patients from accessing their treatments, with potentially serious consequences when these medications are stopped abruptly.

    1. Briefing Document: Mental Health Research

      Summary

      This document summarizes the perspectives and advances in mental health research as presented by leading experts from Université Paris Cité, the CNRS, and INSERM.

      Psychiatry is undergoing a fundamental transformation, moving away from its traditional image to become a cutting-edge medical discipline firmly grounded in biology, genetics, and pharmacology.

      The main challenge is to move from diagnosis based on clinical observation to an objective characterization of mental disorders through the identification of biomarkers.

      Current research focuses on the complex interaction between genetic vulnerability and environmental factors (stress, toxic substances, prenatal exposures), a link whose key mechanism is epigenetics.

      In the face of major challenges such as treatment failure and variability in treatment response, precision medicine is emerging as a way forward.

      The study of lithium in bipolar disorder illustrates this approach, combining blood tests, epigenetic markers, and advanced brain imaging to predict and optimize treatment efficacy.

      Drawing on the success of the "Plans Cancer" (national cancer plans), a call has been issued for a national, multi-year commitment to structure and fund research, with Université Paris Cité positioning itself as a central player in this effort.

      1. The New Era of Psychiatry: A Discipline in Transformation

      Twenty-first-century psychiatry has begun a profound transformation, drawing on scientific progress to refine its understanding and management of mental disorders.

      From the Couch to Biology

      Modern psychiatry is breaking away from the "somewhat dusty stereotyped image" associated with psychoanalysis and the couch.

      It is now a cutting-edge branch of medicine that integrates rigorous knowledge from a range of disciplines:

      Biology and Genetics: Study of predispositions and cellular mechanisms.

      Brain Imaging: Visualization of brain activity and structure.

      Pharmacology: Development and optimization of therapeutic molecules.

      Epigenetics: Analysis of the influence of the environment on gene expression.

      As Dr. Boris Chumet, psychiatrist and researcher, puts it, "psychiatry has entered a new era; it is in the midst of transformation".

      The Search for Biomarkers

      A central goal of current research is the discovery of biomarkers, that is, measurable "external validators" (blood-based, genetic, imaging-based) for psychiatric disorders.

      At present, diagnoses rest mainly on what the patient says and on the clinician's interpretation, an approach considered imperfect.

      Biomarkers would make it possible to:

      • Characterize patients better.

      • Break down current diagnostic categories, which are too broad.

      • Speed up care and access to appropriate treatment.

      Prevalence and Impact of Psychiatric Disorders

      Mental disorders are among the most common illnesses, which underscores the urgency of research advances.

      Disorder: Prevalence / Key Figures

      Schizophrenia: 1% of the population

      Bipolar disorder: 2 to 3% of the population

      Neurodevelopmental disorders: About 1 person in 6

      Depression: 15 to 20% of the population

      An "explosion" in cases of depression and anxiety is being observed, particularly among young people and women, notably since the COVID-19 crisis.

      2. Gene-Environment Interaction: The Heart of the New Challenges

      Research has established that psychiatric disorders result from a complex interaction between innate (genetic) and acquired (environmental) factors.

      Genetic Vulnerability

      This is not genetic determinism but vulnerability or predisposition.

      The influence of genetics is clearly demonstrated by twin studies in schizophrenia:

      Identical twins (100% of DNA in common): If one is affected, the other has a 50% risk of developing the illness (versus 1% in the general population).

      The fact that the risk is not 100% demonstrates the role of the environment.

      Fraternal twins (50% of DNA in common): The shared risk falls to about 10%.

      Specific chromosomal abnormalities have been identified in some patients, notably in 15% of schizophrenia cases and more than a third of early-onset forms.

      The Plan France Médecine Génomique now makes it possible to sequence patients' genomes to identify these rare genetic forms.

      The Crucial Impact of the Environment

      The environment can "trigger, accelerate, or worsen" the development of a disorder in a vulnerable person. Multiple factors have been identified, and they can act at different stages of life:

      Toxic Substances: Cannabis is cited as the number-one factor, because it disrupts the "proper wiring" of the brain, which continues maturing until age 25.

      Psychosocial Stress: Although it is hard to avoid, psychotherapy can help people manage it better.

      Prenatal and Perinatal Factors: The intrauterine environment is decisive. Insults suffered by the fetus can have lasting consequences:

      ◦ Pollutants and toxic substances: Alcohol is a major cause of fetal alcohol disorders, which affect 1% of births and are "entirely preventable".

      ◦ Viral or bacterial infections: Inflammation in the mother can spread to the fetal brain.

      ◦ Maternal stress: Precarity, violence, or adverse socioeconomic conditions can alter brain development.

      ◦ Complications at birth: A lack of oxygen, for example, can damage the brain at a very early stage.

      Epigenetics: The Bridge Between Innate and Acquired

      Epigenetics is the biological mechanism that forms the "bridge" between genetics and the environment.

      As Valérie Lallemand-Mesger, research director at the CNRS, explains, epigenetics does not alter the DNA sequence but rather access to the genetic information.

      Mechanism: The DNA molecule is wound around proteins. The environment (stress, toxic substances) can influence how tightly it is wound.

      A tightly compacted portion ("a knot") becomes inaccessible, and the corresponding gene cannot be expressed. Conversely, an unwound portion can be read.

      Consequence: Environmental signals can upset this balance at critical moments in brain development, leading to harmful over-activation of some genes or untimely silencing of others.

      3. Precision Medicine: Toward Personalized Treatments

      One of the main obstacles in psychiatry is the wide variability in response to medication, which leads to many treatment failures.

      The Challenge of Treatment Failure

      Current practice often proceeds by "trial and error". A treatment is tried; if it fails, another is offered.

      For the patient, this means "lost time, unnecessary side effects" and a source of discouragement.

      Example: In the prevention of manic episodes in bipolar disorder, only one third of patients respond adequately to treatment. The other two thirds respond partially or not at all.

      The Example of Lithium

      Lithium, a mood-stabilizing treatment for bipolar disorder, is "miraculous in some patients, and in others it has no effect".

      Research aims to identify biomarkers that predict response, in order to avoid ineffective prescriptions.

      Pharmacology: The first step is therapeutic drug monitoring, which consists of measuring the drug's concentration in the blood so as to keep it within the "therapeutic range", where it is effective but not toxic. This is crucial for drugs with a "narrow therapeutic margin".

      Epigenetic Markers: Studies show that response to lithium can be predicted from certain epigenetic marks.

      Brain Imaging (MRI): Advanced techniques make it possible to visualize the distribution of lithium directly in the brain. Major findings include:

      1. Lithium is distributed heterogeneously, not uniformly as previously thought.

      2. Distribution patterns vary considerably from one patient to another.

      3. The highest concentrations are observed in the hippocampus, a key region for emotion regulation.

      The aim is to combine these approaches to quickly predict a patient's chances of response, adjust doses, and confirm whether continuing a treatment is worthwhile.

      4. National Strategy and Research Momentum

      For these advances to materialize, resources must be mobilized and research structured at the national level.

      Oncology as a Model

      Anne-Paul Rockplot, geneticist and vice-president for research at Université Paris Cité, draws a direct parallel with oncology, which has made "absolutely considerable progress" thanks to the three successive state-backed Plans Cancer.

      These plans made it possible to create integrated research centers and to develop precision medicine (tailoring treatment to the tumor's genetic mutation).

      The ambition is to replicate this model for psychiatry.

      The Driving Role of Université Paris Cité

      Université Paris Cité is presented as the "flagship of health research for France".

      Scale: It has 113 research units, about fifteen of them dedicated to mental health, spread across its faculties of Health, Sciences, and Societies and Humanities.

      Strategic Projects: It leads major projects such as Metabobrain, which brings together more than 90 researchers working on body-brain links.

      Institutes of Excellence: It hosts prestigious Instituts Hospitalo-Universitaires (IHU), including the IHU ICE (Institut du Cerveau des Enfants), which fosters close collaboration among researchers, clinicians, and patients.

      Investment Programs: The university is at the heart of the major national "France 2030" programs for psychiatry, such as the Brain and Mind biocluster and the PEPR ProPsy (precision psychiatry).

      Toward a Multi-Year Commitment

      The designation of mental health as the "grande cause nationale 2025" is seen as an opportunity to launch a lasting commitment.

      Prof. Franck Bélivier, ministerial delegate for mental health, calls for 2025 to be "a year of planning for a multi-year commitment".

      Several prevention and early-care initiatives are already in place, such as the national suicide prevention number (3114), the Maisons des Adolescents, and the training of mental health first aiders.

      5. The Evolution of the Patient-Practitioner Relationship

      The transformation of psychiatry is accompanied by a crucial shift in the therapeutic relationship.

      End of Paternalism: The approach in which diagnoses were concealed from patients for fear of stigmatization is being abandoned.

      Toward Psychoeducation: Modern practice consists of explaining the illness to patients, giving them responsibility, and involving them fully in treatment decisions.

      Building Trust: This dialogue is fundamental to building a solid relationship of trust, which is essential for long-term follow-up, and it is made easier when science makes it possible to find "the right treatment very easily and very quickly".

    1. Briefing File: Young People's Mental Health

      Summary

      This summary document analyzes the state of young people's mental health in France, drawing on the expertise of psychiatrists, addiction specialists, and researchers.

      The main finding is a dramatic rise in anxiety and depressive disorders, particularly since the COVID-19 pandemic, described as a "second epidemic".

      Recent studies, such as that of the Institut Montaigne, show that one third of 15-29-year-olds report suffering from depression.

      The crisis is also marked by certain disorders appearing at younger ages, such as eating disorders and anxious school refusal, which now appear as early as primary school.

      At the same time, a paradox is emerging in the field of addiction: while overall use of psychoactive substances (tobacco, alcohol, cannabis) has been falling steadily among young people since 2010, use is concentrated among the most vulnerable populations, widening social inequalities and masking the greater severity of individual cases.

      The question of screens is complex: although a direct causal link with mental disorders is hard to establish, their impact on adolescents' sleep quality is well established.

      A gender dimension is fundamental to understanding these issues.

      Young women are twice as vulnerable to depression and anxiety, a difference that appears at puberty and is attributed to hormonal factors, greater exposure to trauma, and societal pressures.

      The symptoms themselves present differently by gender.

      Finally, the health system faces a "scissors effect": sharply rising demand for care set against resources that cannot be expanded.

      Nevertheless, a positive trend is emerging with the growing destigmatization of mental disorders, which encourages young people to seek help earlier.

      1. An Alarming Finding: The Explosion of Mental Disorders Among Young People

      Experts agree that young people's mental health has deteriorated significantly, a phenomenon that has intensified since the COVID-19 pandemic.

      Key Figures: A recent study conducted by the Institut Montaigne, the Mutualité Française, and the Institut Teram among 5,600 young people confirms this trend:

      one in three young people aged 15 to 29 reports having depression. Other data show that nearly 10% of children aged 6 to 11 already show signs of depression.

      Impact of COVID-19: The pandemic is described as having caused a "second epidemic" affecting mental health.

      Lockdown and the breakdown of social ties were major stressors for young people in search of bearings.

      Rising Prevalence: Dr. Boris Chumet notes that before the pandemic, it was estimated that about 10% of the general population would experience a depressive episode in their lifetime.

      That figure is now put at 20%.

      Pressure on the Care System: Emergency departments report a recurring influx of young people presenting with anxiety, a situation that used to be rare.

      This rise in need runs up against services that cannot be expanded and the long training period for psychiatrists, creating a "scissors effect".

      The response must involve a broader network of professionals, including psychologists and general practitioners.

      1.1. Destigmatization Under Way

      Despite this bleak picture, a positive change is being observed: people are speaking more openly about mental health.

      End of a Taboo: Younger generations are more inclined to talk about their mental health and to seek help, unlike previous generations, for whom the subject was taboo.

      This phenomenon is compared to the opening-up of discussion about cancer or AIDS.

      Early Consultation: This destigmatization promotes better recognition of disorders and shorter diagnostic delays.

      Dr. Chumet insists: "It is better to consult for nothing, just for reassurance, than to consult too late."

      Relieving Guilt: Recognition of the biological and genetic factors in psychiatric disorders helps relieve individuals of guilt, making it easier to seek a consultation.

      2. Specific Vulnerabilities and Earlier Onset of Disorders

      Adolescence and young adulthood are inherently a period of vulnerability, as the brain does not reach full maturity until around age 25.

      Worrying trends are being observed in how early disorders appear and in their nature.

      Early Onset of Conditions: Prof. Marie Rose Moro points out that the majority of psychiatric conditions (about 90%) appear before the age of 18 or 21.

      The most striking phenomenon is that certain disorders are appearing at younger ages:

      Eating Disorders: Previously typical of adolescence (ages 14-15), prepubertal eating disorders now appear in children aged 9-10.

      Anxious School Refusal: Once seen in high school (lycée), it now affects children as early as CM2 (the final year of primary school).

      The Role of Impulsivity: Brain development is uneven; the areas involved in impulse control are the last to mature (around age 25).

      This explains the frequency of impulsive acts, such as suicide attempts, which can occur minutes after a stable mood state.

      Symptoms of Depression in Adolescents: Depression in young people does not always present as classic sadness.

      Attention should be paid to signs such as irritability, oppositional behavior, somatic complaints, or an abrupt change in behavior.

      At-Risk Populations:

      Children placed in institutional care: Their rates of depression and anxiety are almost twice those of the general population.

      Children of migrants: One study found a delay of 1.5 to 2 years in the diagnosis of schizophrenia among these young people, which represents a considerable loss of opportunity for effective care.

      3. The Institutional Response: The Example of the Maison de Solen

      In response to this crisis, specialized facilities such as the Maison de Solen (the Maison des Adolescents at Cochin Hospital, AP-HP) play a central role.

      Characteristics of the Maison de Solen:

      Age and Volume: Has celebrated its 20th anniversary; welcomes 5,500 new adolescents each year.

      Team: 150 professionals, including 25 physicians and 30 researchers.

      Key Concept: Bring together, in a single reference location, all the resources needed for adolescent health.

      Approach: Multidisciplinary care (psychiatrists, pediatricians, psychologists, teachers, etc.) and accessibility (walk-in reception without an appointment from Monday 9 a.m. to Friday 7 p.m. for young people, parents and professionals).

      Schooling: Brings school into the facility so that young people do not suffer the double burden of illness and dropping out of school.

      4. The Paradox of Addictions: Overall Decline but Greater Severity

      Dr Guillaume Eragne, a psychiatrist and addiction specialist, presents a nuanced picture of addictive behaviors among young people, one that runs counter to common assumptions.

      General Downward Trend: Since the 2010s, there has been a continuous and dramatic decline in all use of psychoactive substances (legal and illegal) among young people.

      Example of Smoking: Daily smoking among high school students fell from about 30% in the 2010s to 6% in 2022.

      Causes: Effective prevention programs and programs that strengthen psychosocial skills (assertiveness, self-esteem), together with changes in how young people socialize (more interaction through screens).

      The Polarization Phenomenon: This overall decline masks widening social inequalities.

      Consumption is now concentrated among the most vulnerable young people and those who have left the school system, where rates can be 4 to 5 times higher.

      Persistent Stigma: Unlike other mental disorders, addictions remain extremely stigmatized.

      The "treatment gap" (the gap between the number of people affected and the number receiving care) is the highest for these disorders.

      Fewer than 20% of patients with an alcohol problem receive treatment in France.

      New Trends and Exceptions:

      Vaping (Puff): Experimentation with e-cigarettes now exceeds experimentation with tobacco.

      It is a new gateway to nicotine dependence for young non-smokers.

      Nitrous Oxide: Use of this product is increasing.

      It is often associated with polydrug-user profiles and can cause irreversible neurological damage.

      5. The Question of Screens: A Complex Factor

      The relationship with screens and social media is a central topic, but its link to mental health is less direct than it appears.

      A Weak Causal Link: According to Prof. Grégoire Borst, only 1% of adolescent well-being is thought to be directly linked to time spent on smartphones.

      Establishing direct causality is difficult.

      The Major Impact on Sleep: The strongest evidence concerns the negative effect of screens on sleep quality.

      Light from screens held close to the face disrupts the circadian rhythm, while adolescents already suffer from a structural sleep deficit.

      Concrete Recommendations:

      1. Avoid screens for at least one hour before bedtime.

      2. Adapt school schedules by starting classes one hour later in middle school (collège) and high school (lycée).

      Suffering without Addiction: Suffering linked to social media (feelings of loneliness, anxiety about waiting for validation) can exist independently of a diagnosis of addiction, which is defined by a loss of control.

      6. The Gender Perspective: Specific Features of Women's Mental Health

      Dr Sarah Tebeka, a psychiatrist, stresses the need for a gender-differentiated approach to mental health, because disorders do not present in the same way in men and women.

      Increased Vulnerability to Depression and Anxiety: Women are twice as likely to develop these disorders.

      This vulnerability appears at puberty (menarche) and fades at menopause, which suggests a strong role for hormonal factors.

      Multifactorial Causes:

      Biological: Fluctuations in sex hormones.

      Environmental: Greater exposure to trauma and sexual violence (90% of victims are girls).

      Sociocultural: Pressure regarding appearance, societal expectations, and less encouragement to take regular physical exercise.

      Differences in Symptoms (Example of Depression):

      In women: sadness, loss of pleasure (anhedonia); guilt, self-devaluation.

      In men: irritability, anger; withdrawal, isolation; substance use.

      6.1. The Mental Load of Young Caregivers

      One particular form of mental load disproportionately affects young women: the role of family caregiver.

      Prevalence: The "CampusCaire" study finds that about one student in six (16%) is a caregiver for a relative who is ill or has a disability.

      Gender Disparity: 80% of these young caregivers are young women.

      The Challenge of Recognition: Many of these students are unaware of their status as caregivers, regard the help they give as normal, and struggle to ask for help for themselves.

      Existing Support: Universities are putting support measures in place (adjustments to study arrangements, psychological support, peer discussion groups).

    1. Reviewer #1 (Public review):

      Summary:

      The study characterises an RNA polymerase (Pol) I mutant (RPA135-F301S) named SuperPol. This mutant was previously shown to increase yeast ribosomal RNA (rRNA) production by Transcription Run-On (TRO). In this work, the authors confirm this mutation increases rRNA transcription using a slight variation of the TRO method, Transcriptional Monitoring Assay (TMA), which also allows the analysis of partially degraded RNA molecules. The authors show a reduction of abortive rRNA transcription in cells expressing the SuperPol mutant and a modest occupancy decrease at the 5' region of the rRNA genes compared to WT Pol I. These results suggest that the SuperPol mutant displays a lower frequency of premature termination. Using in vitro assays, the authors found that the mutation induces an enhanced elongation speed and a lower cleavage activity on mismatched nucleotides at the 3' end of the RNA. Finally, SuperPol mutant was found to be less sensitive to BMH-21, a DNA intercalating agent that blocks Pol I transcription and triggers the degradation of the Pol I subunit, Rpa190. Compared to WT Pol I, short BMH-21 treatment has little effect on SuperPol transcription activity, and consequently, SuperPol mutation decreases cell sensitivity to BMH-21.

      Significance:

      The work further characterises a single amino acid mutation of one of the largest yeast Pol I subunits (RPA135-F301S). While this mutation was previously shown to increase rRNA synthesis, the current work expands the SuperPol mutant characterisation, providing details of how RPA135-F301S modifies the enzymatic properties of yeast Pol I. In addition, their findings suggest that yeast Pol I transcription can be subjected to premature termination in vivo. The molecular basis and potential regulatory functions of this phenomenon could be explored in additional studies.

      Our understanding of rRNA transcription is limited, and the findings of this work may be interesting to the transcription community. Moreover, targeting Pol I activity is an open strategy for cancer treatment. Thus, the resistance of SuperPol mutant to BMH-21 might also be of interest to a broader community, although these findings are yet to be confirmed in human Pol I and with more specific Pol I inhibitors in future.

      Comments on revision:

      The authors' response addressed all the points I raised adequately.

    2. Reviewer #2 (Public review):

      Summary:

      This article presents a study on a mutant form of RNA polymerase I (RNAPI) in yeast, referred to as SuperPol, which demonstrates increased rRNA production compared to the wild-type enzyme. While rRNA production levels are elevated in the mutant, RNAPI occupancy as detected by CRAC is reduced at the 5' end of rDNA transcription units. The authors interpret these findings by proposing that the wild-type RNAPI pauses in the external transcribed spacer (ETS), leading to premature transcription termination (PTT) and degradation of truncated rRNAs by the RNA exosome (Rrp6). They further show that SuperPol's enhanced activity is linked to a lower frequency of PTT events, likely due to altered elongation dynamics and reduced RNA cleavage activity, as supported by both in vivo and in vitro data.

      The study also examines the impact of BMH-21, a drug known to inhibit Pol I elongation, and shows that SuperPol is less sensitive to this drug, as demonstrated through genetic, biochemical, and in vivo approaches. The authors show that BMH-21 treatment induces premature termination in wild-type Pol I, but only to a lesser extent in SuperPol. They suggest that BMH-21 promotes termination by targeting paused Pol I complexes and propose that PTT is an important regulatory mechanism for rRNA production in yeast.

      The data presented are of high quality and support the notion that 1) premature transcription termination occurs at the 5' end of rDNA transcription units; 2) SuperPol has an increased elongation rate with reduced premature termination; and 3) BMH-21 promotes both pausing and termination. The authors employ several complementary methods, including in vitro transcription assays. These results are significant and of interest for a broad audience.

      Adding experiments in different growth conditions to support the claim of regulation by PTT (as the authors propose) will also be an important addition. The revisions further support the claim, in particular the notion that the increased elongation rate of SuperPol occurs at the expense of fidelity.

      Significance:

      These results are significant and of interest for a basic research audience.

    3. Reviewer #3 (Public review):

      In the manuscript "Ribosomal RNA synthesis by RNA polymerase I is regulated by premature termination of transcription", Azouzi and co-authors investigate the regulatory mechanisms of ribosomal RNA (rRNA) transcription by RNA Polymerase I (RNAPI) in the budding yeast S. cerevisiae. They follow up on exploring the molecular basis of a mutant allele of the second-largest subunit of RNAPI, RPA135-F301S, also dubbed SuperPol, that they had previously reported (Darrière et al, 2019), and which was shown to rescue Rpa49-linked growth defects, possibly by increasing rRNA production.

      Through a combination of genomic and in vitro approaches, the authors test the hypothesis that RNAPI activity could be subjected to a premature transcription termination (PTT) mechanism, akin to what is observed for RNA Polymerase II (RNAPII). The authors demonstrate that SuperPol increased processivity "desensitizes" RNAPI to abortive transcription cycles at the expense of decreased fidelity. In agreement, SuperPol is shown to be resistant to BMH-21, a drug previously shown to impair RNAPI elongation.

      Overall, this work expands the mechanistic understanding of the early dynamics of RNAPI transcription. The presented results are of interest for researchers studying transcription regulation, particularly those interested in RNAPI's transcription mechanisms and fidelity.

      Strengths:

      Overall, the experiments are performed with rigor and include the appropriate controls and statistical analyses. Conclusions are drawn from appropriate experiments. Both the figures and the text present the data clearly. The Materials and Methods section is detailed enough.

      Weaknesses:

      The biological significance of this phenomenon remains unaddressed and thus unclear. The lack of experiments to test a specific regulatory function (such as a UTP-A loading checkpoint or other mechanisms) limits these termination events to possibly abortive actions of unclear significance.

      Comments on revised version:

      I appreciated the additional experiments and the other changes made by the authors in the revised version.

    4. Author response:

      The following is the authors’ response to the original reviews

      General Statements:

      In our manuscript, we demonstrate for the first time that RNA Polymerase I (Pol I) can prematurely release nascent transcripts at the 5' end of ribosomal DNA transcription units in vivo. This was made possible by comparing wild-type Pol I with a mutant form of Pol I, hereafter called SuperPol, previously isolated in our lab (Darrière et al., 2019). By combining in vivo analysis of rRNA synthesis (using pulse-labelling of nascent transcripts and cross-linking of nascent transcripts, CRAC) with in vitro analysis, we could show that SuperPol reduces premature transcript release owing to altered elongation dynamics and reduced RNA cleavage activity. Such premature release could reflect regulatory mechanisms controlling rRNA synthesis. Importantly, this increased processivity of SuperPol is correlated with resistance to BMH-21, a novel anticancer drug that inhibits Pol I, showing the relevance of targeting Pol I during transcriptional pauses to kill cancer cells. This work offers critical insights into Pol I dynamics, rRNA transcription regulation, and implications for cancer therapeutics.

      We sincerely thank the three reviewers for their insightful comments and recognition of the strengths and weaknesses of our study. Their acknowledgment of our rigorous methodology, the relevance of our findings on rRNA transcription regulation, and the significant enzymatic properties of the SuperPol mutant is highly appreciated. We are particularly grateful for their appreciation of the potential scientific impact of this work. Additionally, we value the reviewer’s suggestion that this article could address a broad scientific community, including in transcription biology and cancer therapy research. These encouraging remarks motivate us to refine and expand upon our findings further.

      All three reviewers acknowledged the increased processivity of SuperPol compared to its wild-type counterpart. However, two of the three question our claim that premature termination of transcription can regulate ribosomal RNA transcription. This claim was based on the observation that the SuperPol mutant increases rRNA production. Proving that modulation of early transcription termination is used to regulate rRNA production under physiological conditions is beyond the scope of this study. Therefore, we propose to change the title of this manuscript to focus on what we have unambiguously demonstrated:

      “Ribosomal RNA synthesis by RNA polymerase I is subjected to premature termination of transcription”.

      Reviewer 1’s main criticism centers on the use of the CRAC technique in our study. While we address this point in detail below, we would like to emphasize that, although we agree with the reviewer’s comments regarding its application to Pol II studies, CRAC limits contamination with mature rRNA and therefore remains the only suitable method for studying Pol I elongation over the entire transcription unit. All other methods are massively contaminated with fragments of mature RNA, which prevents any quantitative analysis of read distribution within the rDNA. This perspective is widely accepted within the Pol I research community, as CRAC provides a robust approach to capturing transcriptional dynamics specific to Pol I activity.

      We hope that these findings will resonate with the readership of your journal and contribute significantly to advancing discussions in transcription biology and related fields.

      Description of the planned revisions:

      Despite numerous text modifications (see below), we agree that one major point of discussion is the consequence of increased processivity in the SuperPol mutant on the “quality” of the rRNA produced. Reviewer 3 suggested comparisons with other processive alleles, such as the rpb1-E1103G mutant of the RNAPII subunit (Malagon et al., 2006). This comparison has already been addressed by the Schneider lab (Viktorovskaya OV, Cell Rep., 2013 - PMID: 23994471), which explored Pol II (rpb1-E1103G) and Pol I (rpa190-E1224G). The rpa190-E1224G mutant revealed enhanced pausing in vitro, highlighting key differences between the catalytic rate-limiting steps of Pol I and Pol II (see David Schneider's review on this topic for further details).

      Reviewers 2 and 3 suggested that a decreased efficiency of cleavage upon backtracking might imply an increased error rate in SuperPol compared to the wild-type enzyme. Pol I mutants with decreased RNA cleavage have been characterized previously and showed an increased error rate. We have already started to address this point. Preliminary results from in vitro experiments suggest that SuperPol exhibits an elevated error rate during transcription. However, these findings remain preliminary and require further experimental validation to confirm their reproducibility and robustness. We propose to consolidate these data and incorporate them into the manuscript to address this question comprehensively. This could provide valuable insights into the mechanistic differences between SuperPol and the wild-type enzyme. SuperPol is the first Pol I mutant described with an increased processivity in vitro and in vivo, and we agree that this might come at the cost of decreased fidelity.

      Regulatory aspect of the process:

      To address the reviewer’s remarks, we propose to test our model by performing experiments that would evaluate PTT levels in Pol I mutants or under different growth conditions. These experiments would provide crucial data to support our model, which suggests that PTT is a regulatory element of Pol I transcription. By demonstrating how PTT varies with environmental factors, we aim to strengthen the hypothesis that premature termination plays an important role in regulating Pol I activity.

      We propose revising the title and conclusions of the manuscript. The updated version will better reflect the study's focus and temper claims regarding the regulatory aspects of termination events, while maintaining the value of our proposed model.

      Description of the revisions that have already been incorporated in the transferred manuscript:

      Some very important modifications have now been incorporated:

      Statistical Analyses and CRAC Replicates:

      Unlike reviewers 2 and 3, reviewer 1 suggests that we did not analyze the results statistically. In fact, the CRAC analyses were conducted in biological triplicate, ensuring robustness and reproducibility. The statistical analyses are presented in Figure 2C, which highlights significant findings supporting the conclusion that the WT Pol I and SuperPol distribution profiles are different. Our CRAC replicates exhibit a high correlation, and we confirmed a significant effect in each region of interest (5’ETS, 18S.2, 25S.1 and 3’ETS, Figure 1) to confirm consistency across experiments. Finally, we took care not to overinterpret the results, maintaining a rigorous and cautious approach in our analysis to ensure accurate conclusions.

      CRAC vs. Net-seq:

      Reviewer 1 asks us to comment on the differences between CRAC and Net-seq. The two methods complement each other but serve different purposes depending on the biological question and the context of the transcription analysis. Net-seq was originally designed for Pol II analysis. It captures nascent RNAs but does not eliminate mature ribosomal RNAs (rRNAs), leading to high levels of contamination. While this is manageable for Pol II analysis (in silico elimination of reads corresponding to rRNAs), it poses a significant problem for Pol I due to the dominance of rRNAs (60% of total RNAs in yeast), which share sequences with nascent Pol I transcripts. As a result, large Net-seq peaks are observed at mature rRNA extremities (Clarke 2018, Jacobs 2022). This limits the interpretation of the results to the short-lived pre-rRNA species. In contrast, CRAC has been specifically adapted by the laboratory of David Tollervey to map Pol I distribution while minimizing contamination from mature rRNAs (the CRAC protocol used exclusively recovers RNAs with 3′ hydroxyl groups, which represent endogenous 3′ ends of nascent transcripts, thus removing RNAs with a 3′ phosphate, found in mature rRNAs). This makes CRAC more suitable for studying Pol I transcription, including polymerase pausing and distribution along the rDNA, providing a quantitative dataset for the entire rDNA gene.

      CRAC vs. Other Methods:

      Reviewer 1 suggests using GRO-seq or TT-seq, but the experiments in Figure 2 aim to assess the distribution profile of Pol I along the rDNA, which requires a method optimized for this specific purpose. While GRO-seq and TT-seq are excellent for measuring RNA synthesis and co-transcriptional processing, they rely on Sarkosyl treatment to permeabilize cellular and nuclear membranes. Sarkosyl is known to artificially induce polymerase pausing and to inhibit RNase activities that are involved in the process. CRAC avoids these artifacts because it is a direct and fully in vivo approach. In a CRAC experiment, cells are grown exponentially in rich media and arrested via rapid cross-linking, providing precise and artifact-free data on Pol I activity and pausing.

      Pol I ChIP Signal Comparison:

      The ChIP experiments previously published in Darrière et al. lack the statistical depth and resolution offered by our CRAC analyses. The detailed results obtained through CRAC would have been impossible to detect using classical ChIP. The current study provides a more refined and precise understanding of Pol I distribution and dynamics, highlighting the advantages of CRAC over traditional methods in addressing these complex transcriptional processes.

      BMH-21 Effects:

      As highlighted by Reviewer 1, the effects of BMH-21 observed in our study differ slightly from those reported in earlier work (Ref Schneider 2022), likely due to variations in experimental conditions, such as methodologies (CRAC vs. Net-seq), as discussed earlier. We also identified variations in the response to BMH-21 treatment associated with differences in cell growth phases and/or cell density. These factors likely contribute to the observed discrepancies, offering a potential explanation for the variations between our findings and those reported in previous studies. In our approach, we prioritized reproducibility by carefully controlling BMH-21 experimental conditions to mitigate these factors. These variables can significantly influence results, potentially leading to subtle discrepancies. Nevertheless, the overall conclusions regarding BMH-21's effects on WT Pol I are largely consistent across studies, with differences primarily observed at the nucleotide resolution. This is a strength of our CRAC-based analysis, which provides precise insights into Pol I activity.

      We will address these nuances in the revised manuscript to clarify how such differences may impact results and provide context for interpreting our findings in light of previous studies.

      Minor points:

      Reviewer #1:

      In general, the writing style is not clear, and there are some word mistakes or poor descriptions of the results, for example: 

      On page 14: "SuperPol accumulation is decreased (compared to Pol I)". 

      On page 16: "Compared to WT Pol I, the cumulative distribution of SuperPol is indeed shifted on the right of the graph." 

      We clarified and improved the overall writing style in line with the reviewer's comment.

      There are also issues with the literature, for example: Turowski et al, 2020a and Turowski et al, 2020b are the same article (preprint and peer-reviewed). Is there any reason to include both references? Please, double-check the references.  

      This was corrected in this version of the manuscript.

      In the manuscript, 5S rRNA is mentioned as an internal control for TMA normalisation. Why are Figure 1C data normalised to 18S rRNA instead of 5S rRNA? 

      Data are in fact normalized relative to the 5S rRNA, but the value for the 18S rRNA is arbitrarily set to 100%.
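
      The two-step normalization described above can be illustrated with a minimal sketch. All signal values and probe names below are made up for illustration and are not data from the manuscript; the choice of the wild-type 18S ratio as the 100% reference is also an assumption, since the response does not say which sample supplies the reference.

```python
def normalize_to_5s(signals):
    """Divide each probe signal by the 5S signal of the same sample."""
    five_s = signals["5S"]
    return {probe: value / five_s
            for probe, value in signals.items() if probe != "5S"}


def rescale(ratios, reference):
    """Express 5S-normalized ratios as a percentage of a reference ratio."""
    return {probe: 100.0 * ratio / reference for probe, ratio in ratios.items()}


# Hypothetical raw signals (arbitrary units), for illustration only.
wt = {"5S": 200.0, "18S": 400.0, "25S": 500.0}
superpol = {"5S": 250.0, "18S": 750.0, "25S": 1000.0}

wt_ratios = normalize_to_5s(wt)
superpol_ratios = normalize_to_5s(superpol)

# Assumption: the wild-type 18S ratio is the value set to 100%.
reference = wt_ratios["18S"]
wt_percent = rescale(wt_ratios, reference)
superpol_percent = rescale(superpol_ratios, reference)

print(wt_percent)
print(superpol_percent)
```

      In this sketch the 5S signal still does the normalizing; setting 18S to 100% only changes the units in which the ratios are displayed.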

      Figure 4 should be a supplementary figure, and Figure 7D doesn't have a y-axis labelling. 

      The presence of all Pol I-specific subunits (Rpa12, Rpa34 and Rpa49) is crucial for the enzymatic assays we performed. In the absence of these subunits (which can vary depending on the purification batch), Pol I pausing, cleavage and elongation are known to be affected. To strengthen our conclusion, we wanted to show the subunit composition of the purified enzyme. This important control should be shown, but can indeed be moved to a supplementary figure if desired.

      The y-axis in Figure 7D is now correctly labelled.

      In Figure 7C, BMH-21 treatment causes the accumulation of ~140bp rRNA transcripts only in SuperPol-expressing cells that are Rrp6-sensitive (line 6 vs line 8), suggesting that BMH-21 treatment does affect SuperPol. Could the authors comment on the interpretation of this result?

      The 140 nt product is a degradation fragment resulting from trimming, which explains its lower accumulation in the absence of Rrp6. BMH-21 significantly affects WT Pol I transcription but also has a mild effect on SuperPol transcription. As a result, the 140 nt product accumulates under these conditions.

      Reviewer #2:

      pp. 14-15: The authors note local differences in peak detection in the 5'-ETS among replicates, preventing a nucleotide-resolution analysis of pausing sites. Still, they report consistent global differences between wild-type and SuperPol CRAC signals in the 5'ETS (and other regions of the rDNA). These global differences are clear in the quantification shown in Figures 2B-C. A simpler statement might be less confusing, avoiding references to a "first and second set of replicates" 

      As suggested by the reviewer, the statement has been simplified in this version of the manuscript.

      Figures 2A and 2C: Based on these data and quantification, it appears that SuperPol signals in the body and 3' end of the rDNA unit are higher than those in the wild type. This finding supports the conclusion that reduced pausing (and termination) in the 5'ETS leads to an increased Pol I signal downstream. Since the average increase in the SuperPol signal is distributed over a larger region, this might also explain why even a relatively modest decrease in 5'ETS pausing results in higher rRNA production. This point merits discussion by the authors. 

      We agree that this is a very important point for the discussion of our results. Transcription is a very dynamic process in which paused polymerases are easily detected using the CRAC assay. Elongating polymerases are distributed over a much larger gene body, and even a small amount of polymerase detected in the gene body can represent a very large amount of rRNA synthesis. This point is of paramount importance and, as suggested by the reviewer, is now discussed in detail.

      A decreased efficiency of cleavage upon backtracking might imply an increased error rate in SuperPol compared to the wild-type enzyme. Have the authors observed any evidence supporting this possibility? 

      The reviewer suggested that a decreased efficiency of cleavage upon backtracking might imply an increased error rate in SuperPol compared to the wild-type enzyme. We thank Reviewer #2 for raising this, as in our opinion it is an important point that should be added to the manuscript. We have now included new data (panels 5G, 5H and 5I) in the manuscript showing that SuperPol exhibits an increased error rate in vitro compared to the WT enzyme. From these in vitro results, we concluded that SuperPol shows reduced nascent transcript cleavage, associated with more efficient transcript elongation, but at the expense of transcriptional fidelity.

      pp. 15 and 22: Premature transcription termination as a regulator of gene expression is well-documented in yeast, with significant contributions from the Corden, Brow, Libri, and Tollervey labs. These studies should be referenced along with relevant bacterial and mammalian research.

      Following the reviewer's suggestion, we have referenced these studies.

      p. 23: "SuperPol and Rpa190-KR have a synergistic effect on BMH-21 resistance." A citation should be added for this statement. 

      These are unpublished data from our lab. KR and SuperPol are the only two known mutants resistant to BMH-21. We observed that resistance between both alleles is synergistic, with a much higher resistance to BMH-21 in the double mutant than in each single mutant (data not shown). Comparing their resistance mechanisms is a very important point, and we can provide these data upon request. This was added to the statement.

      p. 23: "The released of the premature transcript" - this phrase contains a typo 

      This is now corrected.

      Reviewer #3:

      Figure 1B: it would be opportune to separate the technique's schematic representation from the actual data. Concerning the data, would the authors consider adding an experiment with rrp6Δ cells? Some RNAs could be degraded even in such a short period of time, as even stated by the authors, so maybe an exosome-depleted background could provide a more complete picture. Could the authors also explain why the increase is only observed at the level of 18S and 25S? To further prove the robustness of the Pol I TMA method, it could be good to add already characterized mutations or other drugs to show that the technique can also readily detect well-known and expected changes.

      The precise objective of this experiment is to avoid the use of the Rrp6 mutant. Under these conditions, we prevent the accumulation of transcripts that would result from a maturation defect. While it is possible to conduct the experiment with the Rrp6 mutant, it would be impossible to draw reliable conclusions due to this artificial accumulation of transcripts.

      Figure 1C: the NTS1 probe signal is missing (it is referenced in Figure 1A but not listed in the Methods section or the oligo table). If this probe was unused, please correct Figure 1A accordingly. 

      We corrected Figure 1A.  

      Figure 2A: the RNAPI occupancy map by CRAC is hard to interpret. The red color (SuperPol) is stacked on top of the blue line, and we are not able to observe the signal of the WT for most of the position along the rDNA unit. It would be preferable to use some kind of opacity that allows to visualize both curves. Moreover, the analysis of the behavior of the polymerase is always restricted to the 5'ETS region in the rest of the manuscript. We are thus not able to observe whether termination events also occur in other regions of the rDNA unit. A Northern blot analysis displaying higher sizes would provide a more complete picture. 

      We addressed this point to make the figure more visually informative. In Northern Blot analysis, we use a TSS (Transcription Start Site) probe, which detects only transcripts containing the 5' extremity. Due to co-transcriptional processing, most of the rRNA undergoing transcription lacks its 5' extremity and is not detectable using this technique. We have the data, but it does not show any difference between Pol I and SuperPol. This information could be included in the supplementary data if asked.

      "Importantly, despite some local variations, we could reproducibly observe an increased occupancy of WT Pol I in 5'-ETS compared to SuperPol (Figure 1C)." should be Figure 2C. 

      Thanks for pointing out this mistake. It has been corrected.

      Figure 3D: most of the difference in the cumulative proportion of CRAC reads is observed in the region ~750 to 3000. In line with my previous point, I think it would be worth exploring also termination events beyond the 5'-ETS region. 

      We agree that such an analysis would have been interesting. However, with the exception of the pre-rRNA starting at the transcription start site (TSS) studied here, any cleaved rRNA at its 5' end could result from premature termination and/or abnormal processing events. Exploring the production of other abnormal rRNAs produced by premature termination is a project in itself, beyond this initial work aimed at demonstrating the existence of premature termination events in ribosomal RNA production.

      Figure 4: should probably be provided as supplementary material. 

      As mentioned earlier (see comments), the presence of all Pol I-specific subunits (Rpa12, Rpa34 and Rpa49) is crucial for the enzymatic activity assays we performed. This important control should be shown, but can indeed be moved to a supplementary figure if desired.

      "While the growth of cells expressing SuperPol appeared unaffected, the fitness of WT cells was severely reduced under the same conditions." I think the growth of cells expressing SuperPol is slightly affected. 

      We agree with this comment and we modified the text accordingly.

      Figure 7D: the legend of the y-axis is missing as well as the title of the plot. 

      Legend of the y-axis and title of the plot are now present.

      The statements concerning BMH-21, SuperPol and Rpa190-KR in the Discussion section should be removed, or data should be provided.

      This was discussed previously. See comment above.

      Some references are missing from the Bibliography, for example Merkl et al., 2020; Pilsl et al., 2016a, 2016b. 

      The bibliography is now fixed.

      Description of analyses that authors prefer not to carry out:

      Does the SuperPol mutant produce more functional rRNAs?

      As Reviewer 1 also noted, we agree that this point requires clarification. In cells expressing SuperPol, a higher steady-state level of (pre-)rRNAs is only observed in the absence of the degradation machinery, suggesting that overproduced rRNAs are rapidly eliminated. We know that (pre-)rRNAs are unable to accumulate in the absence of ribosomal proteins and/or assembly factors (AFs). Consequently, overproducing rRNAs would not be sufficient to increase ribosome content. This specific point is being further addressed in our lab but is beyond the scope of this article.

      Is premature termination coupled with rRNA processing?

      We appreciate the reviewer’s insightful comments. The suggested experiments regarding the UTP-A complex's regulatory potential are valuable and ongoing in our lab, but they extend beyond the scope of this study and are not suitable for inclusion in the current manuscript.

    1. By promoting ‘openness’ in terms akin to negative liberty, the OER movement has overemphasised the removal of barriers as the principal concern of open education. However, as a result of this focus, there is a distinct lack of consideration for how learning might take place once these obstacles are overcome.

      this tethers back to the findings that 3-10% of MOOCs are actually completed — negative liberty unlocks the door, but how do learners un-learn the pedagogical suppositions of how education and learning function to actually utilize these educational tools?

    Annotators

    1. Reviewer #3 (Public review):

      Summary:

      The authors have provided a thorough and constructive response to the comments. They effectively addressed concerns regarding the dependence on marker gene selection by detailing the incorporation of multiple feature selection strategies, such as highly variable genes and spatially informative markers (e.g., via Moran's I), which enhance glmSMA's robustness even when using gene-limited reference atlases.

      Furthermore, the authors thoughtfully acknowledged the assumption underlying glmSMA (that transcriptionally similar cells are spatially proximal) and discussed both its limitations and empirical robustness in heterogeneous tissues such as human PDAC. Their use of real-world, heterogeneous datasets to validate this assumption demonstrates the method's practical utility and adaptability.

      Overall, the response appropriately contextualizes the limitations while reinforcing the generalizability and performance of glmSMA. The authors' clarifications and experimental justifications strengthen the manuscript and address the reviewer's concerns in a scientifically sound and transparent manner.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      Liu et al. present glmSMA, a network-regularized linear model that integrates single-cell RNA-seq data with spatial transcriptomics, enabling high-resolution mapping of cellular locations across diverse datasets. Its dual regularization framework (L1 for sparsity and generalized L2 via a graph Laplacian for spatial smoothness) demonstrates robust performance and offers novel tools for spatial biology, despite some gaps in fully addressing spatial communication.

      Overall, the manuscript is commendable for its comprehensive benchmarking across different spatial omics platforms and its novel application of regularized linear models for cell mapping. I think this manuscript can be improved by addressing method assumptions, expanding the discussion on feature dependence and cell type-specific biases, and clarifying the mechanism of spatial communication.

      The conclusions of this paper are mostly well supported by data, but some aspects of model development and performance evaluation need to be clarified and extended.

      We are thankful for the positive comments and have made changes following the reviewer's advice, as detailed below.

      (1) What were the assumptions made behind the model? One of them could be the linear relationship between cellular gene expression and spatial location. In complex biological tissues, non-linear relationships could be present, and this would also vary across organ systems and species. Similarly, with regularization parameters, they can be tuned to balance sparsity and smoothness adequately but may not hold uniformly across different tissue types or data quality levels. The model also seems to assume independent errors with normal distribution and linear additive effects - a simplification that may overlook overdispersion or heteroscedasticity commonly observed in RNA-seq data.

      Thank you for this comment. We acknowledge that the non-linear relationships can be present in complex tissues and may not be fully captured by a linear model. 

      Our choice of a linear model was guided by an investigation of the relationship in the current datasets, which include intestinal villus, mouse brain, and fly embryo. There is a linear correlation between expression distance and physical distance [Nitzan et al]. Within a given anatomical structure, cells in closer proximity exhibit more similar expression patterns (Fig. 3c). In tissues where non-linear relationships are more prevalent, such as the human PDAC sample, our mapping results remain robust. We acknowledge that we have not yet tested our algorithm in highly heterogeneous regions like the liver, and we plan to include such analyses in future work if necessary.

      Regarding the regularization parameters, we agree that the balance between sparsity and smoothness is sensitive to tissue-specific variation and data quality. In our current implementation, we explored a range of values to find robust defaults. Supplementary Figure 7 illustrates the regularization path for cell assignment in the fly embryo.  

      The choice of L1 and L2 regularization parameters is crucial for balancing sparsity and smoothness in spatial mapping. 

      For Structured Tissues (brain):

      Moderate L1 to ensure cells are localized.

      Small to moderate L2 to maintain local smoothness without blurring distinct regions.

      For Less Structured (PDAC):

      Slightly lower L1 to allow cells to be associated with multiple regions if boundaries are ambiguous.

      Higher L2 to stabilize mappings in noisy or mixed regions.
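
      As an illustration only, the dual penalty discussed above (L1 for sparsity plus a graph-Laplacian L2 term for spatial smoothness) can be written as the objective 0.5*||y - Xw||^2 + lam1*||w||_1 + 0.5*lam2*w^T L w. The sketch below minimises it with proximal gradient descent (ISTA). The function names, the chain-graph Laplacian, the choice of solver, and the reading of w as one cell's weights over spatial spots are assumptions made for this example; they are not taken from the glmSMA implementation.

```python
import numpy as np


def chain_laplacian(n):
    """Graph Laplacian (D - A) of a path graph with n nodes (toy spot graph)."""
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return np.diag(A.sum(axis=1)) - A


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def laplacian_elastic_net(X, y, L, lam1, lam2, n_iter=500):
    """Minimise 0.5*||y - Xw||^2 + lam1*||w||_1 + 0.5*lam2*w^T L w with ISTA."""
    w = np.zeros(X.shape[1])
    # Hessian of the smooth part; its largest eigenvalue bounds the step size.
    H = X.T @ X + lam2 * L
    step = 1.0 / np.linalg.eigvalsh(H)[-1]
    Xty = X.T @ y
    for _ in range(n_iter):
        grad = H @ w - Xty                                # gradient of the smooth part
        w = soft_threshold(w - step * grad, step * lam1)  # L1 proximal step
    return w
```

      In this toy setting, raising lam1 drives more weights to exactly zero (tighter localization), while raising lam2 pulls the weights of neighbouring spots toward each other (smoother, more diffuse assignments), which mirrors the qualitative guidance given for structured and less structured tissues.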

      (2) The performance of glmSMA is likely sensitive to the number and quality of features used. With too few features, the model may struggle to anchor cells correctly due to insufficient discriminatory power, whereas too many features could lead to overfitting unless appropriately regularized. The manuscript briefly acknowledges this issue, but further systematic evaluation of how varying feature numbers affect mapping accuracy would strengthen the claims, particularly in settings where marker gene availability is limited. A simple way to show some of this would be testing on multiple spatial omics (imaging-based) platforms with varying panel sizes and organ systems. Related to this, based on the figures, it also seems like the performance varies by cell type. What are the factors that contribute to this? Variability in expression levels, RNA quantity/quality? Biases in the panel? Personally, I am also curious how this model can be used similarly/differently if we have a FISH-based, high-plex reference atlas. Additional explanation around these points would be helpful for the readers.

      Thank you for this thoughtful comment. The performance of our method is indeed sensitive to the number and quality of selected features. To optimize feature selection, we employed multiple strategies, including Moran’s I statistic, identification of highly variable genes, and the Seurat pipeline to detect anchor genes linking the spatial transcriptomics data with the reference atlas. The number of selected markers depends on the quality of the data. For high-quality datasets, fewer than 100 markers are typically sufficient for prediction. To select marker genes, we applied the following optional strategies:

      (1) Identifying highly variable genes (HVGs).

      (2) Calculating Moran’s I scores for all genes to assess spatial autocorrelation.

      (3) Generating anchor genes based on the integration of the reference atlas and scRNA-seq data using Seurat.
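
      For readers unfamiliar with strategy (2), the following minimal sketch computes the global Moran's I of one gene from its expression across spots and a spot-by-spot spatial weight matrix. The function name and the binary neighbour weights used in the note below are assumptions for illustration; the authors' actual pipeline and weighting scheme are not described here.

```python
import numpy as np


def morans_i(values, weights):
    """Global Moran's I for one gene.

    values  : expression of the gene at each of n spots
    weights : (n, n) spatial weight matrix, e.g. 1 for neighbouring spots
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()              # centre the expression values
    denom = (z ** 2).sum()
    total_w = w.sum()
    if denom == 0 or total_w == 0:
        return float("nan")       # undefined for constant genes or empty graphs
    return (len(x) / total_w) * (z @ w @ z) / denom
```

      On a four-spot chain with binary neighbour weights, the spatially clustered pattern [0, 0, 1, 1] gives I = 1/3, whereas the alternating pattern [0, 1, 0, 1] gives I = -1. Genes could then be ranked by this score and the top-scoring ones kept as spatially informative markers.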

      We evaluated our method across diverse tissue types and platforms—including Slide-seq, 10x Visium, and Virtual-FISH—which represent both sequencing-based and imaging-based spatial transcriptomics technologies. Our model consistently achieved strong performance across these settings. It's worth noting that the performance of other methods, such as CellTrek [Wei et al] and novoSpaRc [Nitzan et al], also depends heavily on feature selection. In particular, performance degrades substantially when fewer features are used. For fair comparison across different methods, the same set of marker genes was used. Under this condition, our method outperformed the others based on KL divergence (Fig. 2b, Fig. 5g). 

      To assess the effect of marker gene quantity, we randomly selected subsets of 2,000, 1,500, 1,000, 700, 500, and 200 markers from the original set. As the number of markers decreases, mapping performance declines, which is expected due to the reduction in available spatial information. This result underscores the general dependence of spatial mapping accuracy on both the number and quality of informative marker genes (Supplementary Fig. 10).

      We do not believe that the observed performance is directly influenced by cell type composition. Major cell types are typically well-defined, and rare cell types comprise only a small fraction of the dataset. For these rare populations, a single misclassification can disproportionately impact metrics like KL divergence due to small sample size. However, this does not necessarily indicate a systematic cell type–specific bias in the mapping. We incorporated a high-resolution Slide-seq dataset from the mouse hippocampus to evaluate the influence of cell type composition on the algorithm’s performance [Stickels et al., 2020]. Most cell types within the CA1, CA2, CA3, and DG regions were accurately mapped to their original anatomical locations (Fig. 5e, f, g).

      (3) Application 3 (spatial communication) in the graphical abstract appears relatively underdeveloped. While it is clear that the model infers spatial proximities, further explanation of how these mappings translate into insights into cell-cell communication networks would enhance the biological relevance of the findings.

      Thank you for this valuable feedback. We agree that further elaboration on the connection between spatial proximity and cell–cell communication would enhance the biological interpretation of our results. Our current model focuses on inferring spatial relationships; we may add cell–cell communication analyses in future work.

      (4) What is the final resolution of the model outputs? I am assuming this is dictated by the granularity of the reference atlas and the imposed sparsity via the L1 norm, but if there are clear examples that would be good. In figures (or maybe in practice too), cells seem to be assigned to small, contiguous patches rather than pinpoint single-cell locations, which is a pragmatic compromise given the inherent limitations of current spatial transcriptomics technologies. Clarification on the precise spatial scale (e.g., pixel or micrometer resolution) and any post-mapping refinement steps would be beneficial for the users to make informed decisions on the right bioinformatic tools to use.

      Thank you for the comment. For each cell, our algorithm generates a probability vector that indicates its likely spatial assignment along with coordinate information. In our framework, each cell is mapped to one or more spatial spots with associated probabilities. Depending on the amount of regularization through L1 and L2 norms, a cell may be localized to a small patch or distributed over a broader domain (Supplementary Fig. 5 & 7). For the 10x Visium data, we applied a repelling algorithm to enhance visualization [Wei et al]. If a cell’s original location is already occupied, it is reassigned to a nearby neighborhood to avoid overlap. The users can also see the entire regularization path by varying the penalty terms. 
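
      To make the form of this output concrete, the sketch below turns one cell's fitted weights over spots into a probability vector and lists the retained spots with their coordinates. The function name, the clipping of negative weights, and the `min_prob` cut-off are hypothetical choices for illustration and do not describe the glmSMA code or the repelling step.

```python
import numpy as np


def assign_spots(weights, coords, min_prob=0.05):
    """Return (spot index, probability, coordinates) for one cell, most likely first."""
    w = np.clip(np.asarray(weights, dtype=float), 0.0, None)  # assumption: drop negative weights
    total = w.sum()
    if total == 0:
        return []                                  # the cell could not be placed
    probs = w / total                              # normalise to a probability vector
    coords = np.asarray(coords, dtype=float)
    keep = np.flatnonzero(probs >= min_prob)       # discard negligible spots
    keep = keep[np.argsort(-probs[keep])]          # sort by decreasing probability
    return [(int(i), float(probs[i]), tuple(float(c) for c in coords[i])) for i in keep]
```

      A sparse weight vector (strong L1 penalty) yields one or a few spots per cell, while a smoother vector yields a longer list spread over a neighbourhood.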

      Nitzan M, Karaiskos N, Friedman N, Rajewsky N. Gene expression cartography. Nature. 2019;576(7785):132-137. doi:10.1038/s41586-019-1773-3

      Wei, R. et al. (2022) ‘Spatial charting of single-cell transcriptomes in tissues’, Nature Biotechnology, 40(8), pp. 1190–1199. doi:10.1038/s41587-022-01233-1.

      Stickels, R.R. et al. (2020) ‘Highly sensitive spatial transcriptomics at near-cellular resolution with Slide-SEQV2’, Nature Biotechnology, 39(3), pp. 313–319. doi:10.1038/s41587-020-0739-1. 

      Reviewer #2 (Public review):

      Summary:

      The author proposes a novel method for mapping single-cell data to specific locations with higher resolution than several existing tools.

      Strengths:

      The spatial mapping tests were conducted on various tissues, including the mouse cortex, human PDAC, and intestinal villus.

      Weakness:

      (1) Although the researchers claim that glmSMA seamlessly accommodates both sequencing-based and image-based spatial transcriptomics (ST) data, their testing primarily focused on sequencing-based ST data, such as Visium and Slide-seq. To demonstrate its versatility for spatial analysis, the authors should extend their evaluation to imaging-based spatial data.

      Thank you for the comment. We have tested our algorithm on the virtual FISH dataset from the fly embryo, which serves as an example of image-based spatial omics data (Fig. 4c). However, such datasets often contain a limited number of available genes. To address this, we will conduct additional testing on image-based data if needed. The Allen Brain Atlas provides high-quality ISH data, and we can select specific brain regions from this resource to further evaluate our algorithm if necessary [Lein et al]. Currently, we plan to focus more on the 10x Visium platform, as it supports whole-transcriptome profiling and offers a wide range of tissue samples for analysis.

      (2) The definition of "ground truth" for spatial distribution is unclear. A more detailed explanation is needed on how the "ground truth" was established for each spatial dataset and how it was utilized for comparison with the predicted distribution generated by various spatial mapping tools.

      Thank you for the comment. To clarify how ground truth is defined across different tissues, we provided the following details. Direct ground truth for cell locations is often unavailable in scRNA-seq data due to experimental constraints. To address this, we adopted alternative strategies for estimating ground truth in each dataset:

      10x Visium Data: We used the cell type distribution derived from spatial transcriptomics (ST) data as a proxy for ground truth. We then computed the KL divergence between this distribution and our model's predictions for performance assessment.

      Slide-seq Data: We validated predictions by comparing the expression of marker genes between the reconstructed and original spatial data.

      Fly Embryo Data: We used predicted cell locations from novoSpaRc as a reference for evaluating our algorithm.

      These strategies allowed us to evaluate model performance even in the absence of direct cell location data. In addition, we can apply multiple evaluation strategies within a single dataset.
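
      As a hedged illustration of the KL-divergence comparison used for the 10x Visium data, the sketch below compares two cell type count vectors over the same set of regions. The function name, the pseudocount `eps`, and the direction of the divergence (reference relative to prediction) are assumptions, since these details are not specified above.

```python
import numpy as np


def kl_divergence(ref_counts, pred_counts, eps=1e-12):
    """KL(ref || pred) between two count vectors defined over the same regions."""
    p = np.asarray(ref_counts, dtype=float) + eps   # pseudocount avoids log(0)
    q = np.asarray(pred_counts, dtype=float) + eps
    p = p / p.sum()                                 # normalise to distributions
    q = q / q.sum()
    return float(np.sum(p * np.log(p / q)))
```

      A value of zero means the predicted regional distribution of a cell type matches the reference exactly; larger values indicate a poorer match. For rare cell types with few cells, a single misplaced cell changes the normalised distribution substantially, so the score should be read with the sample size in mind.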

      (3) In the analysis of spatial mapping results using intestinal villus tissue, only Figure 3d supports their findings. The researchers should consider adding supplemental figures illustrating the spatial distribution of single cells in comparison to the ground truth distribution to enhance the clarity and robustness of their investigation.

      Thank you for the comment. In the intestinal dataset, only six large domains were defined. As a result, the task for this dataset is relatively simple—each cell only needs to be assigned to one of the six domains. As the intestinal villus is a relatively simple tissue, most existing algorithms performed well on it. For this reason, we did not initially provide extensive details in the main text.

      (4) The spatial mapping tests were conducted on various tissues, including the mouse cortex, human PDAC, and intestinal villus. However, the original anatomical regions are not displayed, making it difficult to directly compare them with the predicted mapping results. Providing ground truth distributions for each tested tissue would enhance clarity and facilitate interpretation. For instance, in Figure 2a and  Supplementary Figures 1 and 2, only the predicted mapping results are shown without the corresponding original spatial distribution of regions in the mouse cortex. Additionally, in Figure 3c, four anatomical regions are displayed, but it is unclear whether the figure represents the original spatial regions or those predicted by glmSMA. The authors are encouraged to clarify this by incorporating ground truth distributions for each tissue.

      Thank you for the comment. To improve visualization, we have included anatomical structures alongside the mapping results in the revised version wherever such structures are available (e.g., mouse brain cortex and human PDAC sample). Major cell type assignments for the PDAC samples, along with anatomical structures, are shown in Supplementary Figure 9. Most of these cell types were correctly mapped to their corresponding anatomical regions.

      (5) The cell assignment results from the mouse hippocampus (Supplementary Figure 6) lack a corresponding ground truth distribution for comparison. DG and CA cells were evaluated solely based on the gene expression of specific marker genes. Additional analyses are needed to further validate the robustness of glmSMA's mapping performance on Slide-seq data from the mouse hippocampus.

      Thank you for the comment. The ground truth for DG and CA cells was not available. To better evaluate the model's performance, we computed the KL divergence between the original and predicted cell type distributions, following the same approach used for the 10x Visium dataset. We identified a higher-quality dataset for the mouse hippocampus and used it to evaluate our algorithm. Additionally, we employed KL divergence as an alternative strategy to validate and benchmark our results (Fig. 5e, f, g). Most CA cells, including CA1, CA2, and CA3 principal cells, were correctly assigned back to the CA region. Dentate principal cells were accurately mapped to the DG region (Fig. 5e, f).

      (6) The tested spatial datasets primarily consist of highly structured tissues with well-defined anatomical regions, such as the brain and intestinal villus. In other tissues, such as liver, anatomical regions are not distinctly separated. Further evaluation of such tissues would help determine the method's broader applicability.

      Thank you for the insightful comment. We agree that many spatial datasets used in our study are from tissues with well-defined anatomical regions. To address the applicability of glmSMA in tissues without clearly separated anatomical structures, we applied glmSMA to the Drosophila embryo, which represents a tissue with relatively continuous spatial patterns and lacks well-demarcated anatomical boundaries compared to organs like the brain or intestinal villus.

      Despite this less structured spatial organization, glmSMA demonstrated robust performance in the fly embryo, accurately mapping cells to their correct spatial spots based on gene expression profiles. This result indicates that glmSMA is not strictly limited to highly structured tissues and can generalize to tissues with more continuous or gradient-like spatial architectures. These results suggest that glmSMA has broader applicability beyond highly compartmentalized tissues.

      Lein, E., Hawrylycz, M., Ao, N. et al. Genome-wide atlas of gene expression in the adult mouse brain. Nature 445, 168–176 (2007). https://doi.org/10.1038/nature05453

      Reviewer #3 (Public review):

      The authors aim to develop glmSMA, a network-regularized linear model that accurately infers spatial gene expression patterns by integrating single-cell RNA sequencing data with spatial transcriptomics reference atlases. Their goal is to reconstruct the spatial organization of individual cells within tissues, overcoming the limitations of existing methods that either lack spatial resolution or sensitivity.

      Strengths:

      (1) Comprehensive Benchmarking:

      Compared against CellTrek and novoSpaRc, glmSMA consistently achieved lower Kullback-Leibler divergence (KL divergence) scores, indicating better cell assignment accuracy.

      Outperformed CellTrek in mouse cortex mapping (90% accuracy vs. CellTrek's 60%) and provided more spatially coherent distributions.

      (2) Experimental Validation with Multiple Real-World Datasets:

      The study used multiple biological systems (mouse brain, Drosophila embryo, human PDAC, intestinal villus) to demonstrate generalizability.

      Validation through correlation analyses, Pearson's coefficient, and KL divergence supports the accuracy of glmSMA's predictions.

      We thank reviewer #3 for their positive feedback and thoughtful recommendations.

      Weaknesses:

      (1) The accuracy of glmSMA depends on the selection of marker genes, which might be limited by current FISH-based reference atlases.

      We agree that the accuracy of glmSMA is influenced by the selection of marker genes, and that current FISH-based reference atlases may offer a limited gene set. To address this, we incorporate multiple feature selection strategies, including highly variable genes and spatially informative genes (e.g., via Moran’s I), to optimize performance within the available gene space. As more comprehensive reference atlases become available, we expect the model’s accuracy to improve further.

      (2) glmSMA operates under the assumption that cells with similar gene expression profiles are likely to be physically close to each other in space, which may not be true in various heterogeneous environments.

      Thank you for raising this important point. We agree that glmSMA operates under the assumption that cells with similar gene expression profiles tend to be spatially proximal, and this assumption may not strictly hold in highly heterogeneous tissues where spatial organization is less coupled to transcriptional similarity.

      To address this concern, we specifically tested glmSMA on human PDAC samples, which represent moderately heterogeneous environments characterized by complex tumor microenvironments, including a mixture of ductal cells, cancer cells, stromal cells, and other components. Despite this heterogeneity, glmSMA successfully mapped major cell types to their expected anatomical regions, demonstrating that the method is robust even in the presence of substantial cellular diversity and spatial complexity.

      This result suggests that while glmSMA relies on the assumption of spatial-transcriptomic correlation, the method can tolerate a reasonable degree of spatial heterogeneity without a significant loss of performance. Nevertheless, we acknowledge that in extremely disorganized or highly mixed tissues where transcriptional similarity is decoupled from spatial proximity, the performance may be affected.

    1. Reviewer #3 (Public review):

      Summary:

      In this study, the authors perform multimodal single-cell transcriptomic and epigenomic profiling of 9,394 mouse TM cells, identifying three transcriptionally distinct TM subtypes with validated molecular signatures. TM1 cells are enriched for extracellular matrix genes, TM2 for secreted ligands supporting Schlemm's canal, and TM3 for contractile and mitochondrial/metabolic functions. The transcription factor LMX1B, previously linked to glaucoma, shows the highest expression in TM3 cells and appears to regulate mitochondrial pathways. In Lmx1bV265D mutant mice, TM3 cells exhibit transcriptional signs of mitochondrial dysfunction associated with elevated IOP. Notably, vitamin B3 treatment significantly mitigates IOP elevation, suggesting a potential therapeutic avenue.

      This is an excellent and collaborative study involving investigators from two institutions, offering the most detailed single-cell transcriptomic and epigenetic profiling of the mouse limbal tissues, including both TM and Schlemm's canal (SC), from wild-type and Lmx1bV265D mutant mice. The study defines three TM subtypes and characterizes their distinct molecular signatures, associated pathways, and transcriptional regulators. The authors also compare their dataset with previously published murine and human studies, including those by Van Zyl et al., providing valuable cross-species insights.

      Strengths:

      (1) Comprehensive dataset with high single-cell resolution

      (2) Use of multiple bioinformatic and cross-comparative approaches

      (3) Integration of 3D imaging of TM and SC for anatomical context

      (4) Convincing identification and validation of three TM subtypes using molecular markers.

      Weaknesses:

      (1) Insufficient evidence linking mitochondrial dysfunction to TM3 cells in Lmx1bV265D mice: While the identification of TM3 cells as metabolically specialized and Lmx1b-enriched is compelling, the proposed link between Lmx1b mutation and mitochondrial dysfunction remains underdeveloped. It is unclear whether mitochondrial defects are a primary consequence of Lmx1b-mediated transcriptional dysregulation or a secondary response to elevated IOP. Although the authors have responded to this, the manuscript is not sufficiently altered to address these points. I would like to suggest that the authors tone down the mitochondrial connection with Lmx1b in the title and abstract, and clearly discuss that these events are associated, and that future work is needed to dissect the role of mitochondria in this pathway.

      Furthermore, the protective effects of nicotinamide (NAM) are interpreted as evidence of mitochondrial involvement, but no direct mitochondrial measurements (e.g., immunostaining, electron microscopy, OCR assays) are provided. It is essential to validate mitochondrial dysfunction in TM3 cells using in vivo functional assays to support the central conclusion of the paper. Without this, the claim that mitochondrial dysfunction drives IOP elevation in Lmx1bV265D mice remains speculative. Alternatively, the authors should consider revising their claims that mitochondrial dysfunction in these mice is a central driver of TM dysfunction.

      (2) Mechanism of NAM-mediated protection is unclear: The manuscript states that NAM treatment prevents IOP elevation in Lmx1bV265D mice via metabolic support, yet no data are shown to confirm that NAM specifically rescues mitochondrial function. Do NAM-treated TM3 cells show improved mitochondrial integrity? Are reactive oxygen species (ROS) reduced? Does NAM also protect RGCs from glaucomatous damage? Addressing these points would clarify whether the therapeutic effects of NAM are indeed mitochondrial.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      This study provides a comprehensive single-cell and multiomic characterization of trabecular meshwork (TM) cells in the mouse eye, a structure critical to intraocular pressure (IOP) regulation and glaucoma pathogenesis. Using scRNA-seq, snATAC-seq, immunofluorescence, and in situ hybridization, the authors identify three transcriptionally and spatially distinct TM cell subtypes. The study further demonstrates that mitochondrial dysfunction, specifically in one subtype (TM3), contributes to elevated IOP in a genetic mouse model of glaucoma carrying a mutation in the transcription factor Lmx1b. Importantly, treatment with nicotinamide (vitamin B3), known to support mitochondrial health, prevents IOP elevation in this model. The authors also link their findings to human datasets, suggesting the existence of analogous TM3-like cells with potential relevance to human glaucoma.

      Strengths:

      The study is methodologically rigorous, integrating single-cell transcriptomic and chromatin accessibility profiling with spatial validation and in vivo functional testing. The identification of TM subtypes is consistent across mouse strains and institutions, providing robust evidence of conserved TM cell heterogeneity. The use of a glaucoma model to show subtype-specific vulnerability, combined with a therapeutic intervention, gives the study strong mechanistic and translational significance. The inclusion of chromatin accessibility data adds further depth by implicating active transcription factors such as LMX1B, a gene known to be associated with glaucoma risk. The integration with human single-cell datasets enhances the potential relevance of the findings to human disease.

      We thank the reviewers for their thorough reading of our manuscript and helpful comments.

      Weaknesses:

      (1) Although the LMX1B transcription factor is implicated as a key regulator in TM3 cells, its role in directly controlling mitochondrial gene expression is not fully explored. Additional analysis of motif accessibility or binding enrichment near relevant target genes could substantiate this mechanistic link. 

      We show that the Lmx1b mutation induces mitochondrial dysfunction with mitochondrial gene expression changes, but we agree with the referee that we do not show direct regulation of mitochondrial genes by LMX1B. Emerging data suggest that LMX1B regulates the expression of mitochondrial genes in other cell types [1, 2], making the direct link reasonable. Future work that is beyond the scope of the current paper will focus on sequencing cells at earlier timepoints to help distinguish gene expression changes associated with the V265D mutation from those secondary to ongoing disease and elevated IOP. Additional studies, including ATAC-seq at more ages, ChIP-seq, and/or CUT&RUN or CUT&Tag (in TM cells), will be necessary to directly investigate LMX1B target genes.

      As we studied adult mice, mitochondrial gene expression changes could be secondary to other disease-induced stresses. Because we did not intend to say we have shown a direct link, we have now added a sentence to the discussion to ensure clarity.

      Lines 932-934: “Although our studies show a clear effect of the Lmx1b mutation on mitochondria, future studies are needed to determine if LMX1B directly modulates mitochondrial genes in V265D mutant TM cells”

      (2) The therapeutic effect of vitamin B3 is clearly demonstrated phenotypically, but the underlying cellular and molecular mechanisms remain somewhat underdeveloped - for instance, changes in mitochondrial function, oxidative stress markers, or NAD+ levels are not directly measured. 

      We agree that further experiments towards a fuller mechanistic understanding of vitamin B3’s therapeutic effects are needed. Such experiments are planned but are beyond the scope of this paper, which is already very large (7 Figures and 16 Supplemental Figures).

      (3) While the human relevance of TM3 cells is suggested through marker overlap, more quantitative approaches, such as cell identity mapping or gene signature scoring in human datasets, would strengthen the translational connection.

      We appreciate the reviewer’s suggestion and agree that additional quantitative analyses will further strengthen the translational relevance of TM3 cells. It is not yet clear if humans have a direct TM3 counterpart or if TM cell roles are compartmentalized differently between human cell types. We are currently limited in our ability to perform these comparative analyses. Specifically, we were unable to obtain permission to use the underlying dataset from Patel et al., and our access to the Van Zyl et al. dataset was through the Single Cell Portal, which does not support more complex analyses (e.g., cell identity mapping or gene signature scoring). Differences between the human studies themselves also affect these comparisons. Future work aimed at resolving differences and standardizing human TM cell annotations, as well as cross-species comparisons, is needed (working groups exist, and this ongoing effort supports 3 human TM cell subtypes, as also reported by Van Zyl). This is beyond what we are currently able to do for this paper. We present a comprehensive assessment using readily available published resources.
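
      For readers unfamiliar with the term, the idea behind gene signature scoring can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any specific tool and not an analysis performed in the paper: the gene names and expression values are hypothetical, and the score is simply the mean expression of the signature genes minus the mean of the remaining genes in the same cell. Published tools generally choose control genes more carefully.

```python
def signature_score(cell_expression, signature_genes):
    """Score one cell: mean of signature genes minus mean of all other genes.

    cell_expression: dict mapping gene name -> normalized expression value.
    signature_genes: set of gene names that define the signature.
    """
    in_signature = [v for g, v in cell_expression.items() if g in signature_genes]
    background = [v for g, v in cell_expression.items() if g not in signature_genes]
    if not in_signature or not background:
        raise ValueError("need at least one signature gene and one background gene")
    return sum(in_signature) / len(in_signature) - sum(background) / len(background)


# Hypothetical cell and signature (placeholder gene names, not real markers).
example_cell = {"geneA": 4.0, "geneB": 2.0, "geneC": 1.0, "geneD": 1.0}
example_signature = {"geneA", "geneB"}
example_score = signature_score(example_cell, example_signature)
```

      A positive score means the cell expresses the signature genes above its own background level; comparing scores across annotated human clusters is what would indicate a TM3-like population.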

      Reviewer #2 (Public review):

      Summary:

      This elegant study by Tolman and colleagues provides fundamental findings that substantially advance our knowledge of the major cell types within the limbus of the mouse eye, focusing on the aqueous humor outflow pathway. The authors used single-cell and single-nuclei RNAseq to very clearly identify 3 subtypes of the trabecular meshwork (TM) cells in the mouse eye, with each subtype having unique markers and proposed functions. The U. Columbia results are strengthened by an independent replication in a different mouse strain at a separate laboratory (Duke). Bioinformatics analyses of these expression data were used to identify cellular compartments, molecular functions, and biological processes. Although there were some common pathways among the 3 subtypes of TM cells (e.g., ECM metabolism), there also were distinct functions. For example:

      TM1 cell expression supports heavy engagement in ECM metabolism and structure, as well as TGFb2 signaling.

      TM2 cells were enriched in laminin and pathways involved in phagocytosis, lysosomal function, and antigen expression, as well as End3/VEGF/angiopoietin signaling.

      TM3 cells were enriched in actin binding and mitochondrial metabolism.

      They used high-resolution immunostaining and in situ hybridization to show that these 3 TM subtypes express distinct markers and occupy distinct locations within the TM tissue. The authors compared their expression data with other published scRNAseq studies of the mouse as well as the human aqueous outflow pathway. They used ATAC-seq to map open chromatin regions in order to predict transcription factor binding sites. Their results were also evaluated in the context of human IOP and glaucoma risk alleles from published GWAS data, with interesting and meaningful correlations. Although not discussed in their manuscript, their expression data support other signaling pathways/ proteins/ genes that have been implicated in glaucoma, including: TGFb2, BMP signaling (including involvement of ID proteins), MYOC, actin cytoskeleton (CLANs), WNT signaling, etc.

      In addition to these very impressive data, the authors used scRNAseq to examine changes in TM cell gene expression in the mouse glaucoma model of mutant Lmx1b-induced ocular hypertension. In man, LMX1B is associated with Nail-Patella syndrome, which can include the development of glaucoma, demonstrating the clinical relevance of this mouse model. Among the gene expression changes detected, TM3 cells had altered expression of genes associated with mitochondrial metabolism. The authors used their previous experience using nicotinamide to metabolically protect DBA/2J mice from glaucomatous damage, and they hypothesized that nicotinamide supplementation of mutant Lmx1b mice would help restore normal mitochondrial metabolism in the TM and prevent Lmx1b-mediated ocular hypertension. Adding nicotinamide to the drinking water significantly prevented Lmx1b mutant mice from developing high intraocular pressure. This is a laudable example of dissecting the molecular pathogenic mechanisms responsible for a disease (glaucoma) and then discovering and testing a potential therapy that directly intervenes in the disease process and thereby protects from the disease.

      Strengths:

      There are numerous strengths in this comprehensive study including:

      Deep scRNA sequencing that was confirmed by an independent dataset in another mouse strain at another university.

      Identification and validation of molecular markers for each mouse TM cell subset along with localization of these subsets within the mouse aqueous outflow pathway.

      Rigorous bioinformatics analysis of these data as well as comparison of the current data with previously published mouse and human scRNAseq data.

      Correlating their current data with GWAS glaucoma and IOP "hits".

      Discovering gene expression changes in the 3 TM subgroups in the mouse mutant Lmx1b model of glaucoma.

      Further pursuing the indication of dysfunctional mitochondrial metabolism in TM3 cells from Lmx1b mutant mice to test the efficacy of dietary supplementation with nicotinamide. The authors nicely demonstrate the disease modifying efficacy of nicotinamide in preventing IOP elevation in these Lmx1b mutant mice, preventing the development of glaucoma. These results have clinical implications for new glaucoma therapies.

      We thank the reviewer for these generous and thoughtful comments on the strengths of this study.

      Weaknesses:

      (1) Occasional over-interpretation of data. The authors have used changes in gene expression (RNAseq) to implicate functions and signaling pathways. For example: they have not directly measured "changes in metabolism", "mitochondrial dysfunction" or "activity of Lmx1b".

      We thank the reviewer for this feedback. We agree and did not intend to overstate our findings. Our gene expression changes support, but do not by themselves prove, metabolic disturbances. We had felt that this was obvious and did not want to clutter the text. We have revised the manuscript to clarify that our conclusions about metabolic changes and LMX1B activity are based on gene expression patterns rather than direct functional assays and have added EM data (see below under “Recommendations for the authors”).

      We have also added the following to the results:

      Lines 715-721: “Although the documented gene expression changes strongly suggest metabolic and mitochondrial dysfunction, they do not directly prove it. Using electron microscopy to directly evaluate mitochondria in the TM, we found a reduction in total mitochondria number per cell in mutants (P = 0.015, Figure 6G). In addition, mitochondria in mutants had increased area and reduced cristae (inner membrane folds) in mutants consistent with mitochondrial swelling and metabolic dysfunction (all P < 0.001 compared to WT, Figure 6G-H).”

      More detailed EM and metabolic studies are underway but are beyond the scope of this paper.

      (2) In their very thorough data set, there is enrichment of or changes in gene expression that support other pathways that have been previously reported to be associated with glaucoma (such as TGFb2, BMP signaling, actin cytoskeletal organization (CLANs), WNT signaling, ossification, etc.). Not discussing these appears to be a lost opportunity to further enhance the significance of this work.

      We appreciate the reviewer’s suggestions for enhancing the relevance of our work. We had not initially discussed this due to length concerns. We have now incorporated some of this information into the manuscript (see below under “Recommendations for the authors”).

      Reviewer #3 (Public review):

      Summary: In this study, the authors perform multimodal single-cell transcriptomic and epigenomic profiling of 9,394 mouse TM cells, identifying three transcriptionally distinct TM subtypes with validated molecular signatures. TM1 cells are enriched for extracellular matrix genes, TM2 for secreted ligands supporting Schlemm's canal, and TM3 for contractile and mitochondrial/metabolic functions. The transcription factor LMX1B, previously linked to glaucoma, shows the highest expression in TM3 cells and appears to regulate mitochondrial pathways. In Lmx1bV265D mutant mice, TM3 cells exhibit transcriptional signs of mitochondrial dysfunction associated with elevated IOP. Notably, vitamin B3 treatment significantly mitigates IOP elevation, suggesting a potential therapeutic avenue.

      This is an excellent and collaborative study involving investigators from two institutions, offering the most detailed single-cell transcriptomic and epigenetic profiling of the mouse limbal tissues, including both TM and Schlemm's canal (SC), from wild-type and Lmx1bV265D mutant mice. The study defines three TM subtypes and characterizes their distinct molecular signatures, associated pathways, and transcriptional regulators. The authors also compare their dataset with previously published murine and human studies, including those by Van Zyl et al., providing valuable cross-species insights.

      Strengths: 

      (1) Comprehensive dataset with high single-cell resolution

      (2) Use of multiple bioinformatic and cross-comparative approaches

      (3) Integration of 3D imaging of TM and SC for anatomical context

      (4) Convincing identification and validation of three TM subtypes using molecular markers.

      We thank the reviewer for their comments on the strengths of this study.

      Weaknesses:

      (1) Insufficient evidence linking mitochondrial dysfunction to TM3 cells in Lmx1bV265D mice: While the identification of TM3 cells as metabolically specialized and Lmx1b-enriched is compelling, the proposed link between Lmx1b mutation and mitochondrial dysfunction remains underdeveloped. It is unclear whether mitochondrial defects are a primary consequence of Lmx1b-mediated transcriptional dysregulation or a secondary response to elevated IOP. Additional evidence is needed to clarify whether Lmx1b directly regulates mitochondrial genes (e.g., via ChIP-seq, motif analysis, or ATAC-seq), or whether mitochondrial changes are downstream effects.

      We agree and refer the reviewer to our responses to the other referees including Reviewer 1, Comment 1 and Reviewer 2 comments 1 and 17. As noted there, these mechanistic questions are the focus of ongoing and future studies. We have revised the text where appropriate to ensure it accurately reflects the scope of our current data.

      (2) Furthermore, the protective effects of nicotinamide (NAM) are interpreted as evidence of mitochondrial involvement, but no direct mitochondrial measurements (e.g., immunostaining, electron microscopy, OCR assays) are provided. It is essential to validate mitochondrial dysfunction in TM3 cells using in vivo functional assays to support the central conclusion of the paper. Without this, the claim that mitochondrial dysfunction drives IOP elevation in Lmx1bV265D mice remains speculative. Alternatively, authors should consider revising their claims that mitochondrial dysfunction in these mice is a central driver of TM dysfunction.

      We again refer the reviewer to our other responses, including Reviewer 1, Comment 1 and Reviewer 2, Comments 1 and 17.

      (3) Mechanism of NAM-mediated protection is unclear: The manuscript states that NAM treatment prevents IOP elevation in Lmx1bV265D mice via metabolic support, yet no data are shown to confirm that NAM specifically rescues mitochondrial function. Do NAM-treated TM3 cells show improved mitochondrial integrity? Are reactive oxygen species (ROS) reduced? Does NAM also protect RGCs from glaucomatous damage? Addressing these points would clarify whether the therapeutic effects of NAM are indeed mitochondrial.

      We refer the reviewer to our response to Reviewer 1, Comment 2.

      (4) Lack of direct evidence that LMX1B regulates mitochondrial genes: While transcriptomic and motif accessibility analyses suggest that LMX1B is enriched in TM3 cells and may influence mitochondrial function, no mechanistic data are provided to demonstrate direct regulation of mitochondrial genes. Including ChIP-seq data, motif enrichment at mitochondrial gene loci, or perturbation studies (e.g., Lmx1b knockout or overexpression in TM3 cells) would greatly strengthen this central claim.

      We refer the reviewer to our response to Reviewer 1, Comment 1.

      (5) Focus on LMX1B in Fig. 5F lacks broader context: Figure 5F shows that several transcription factors (TFs), including Tcf21, Foxs1, Arid3b, Myc, Gli2, Patz1, Plag1, Npas2, Nr1h4, and Nfatc2, exhibit stronger positive correlations or motif accessibility changes than LMX1B. Yet the manuscript focuses almost exclusively on LMX1B. The rationale for this focus should be clarified, especially given LMX1B's relatively lower ranking in the correlation analysis. Were the functions of these other highly ranked TFs examined or considered in the context of TM biology or glaucoma? Discussing their potential roles would enhance the interpretation of the transcriptional regulatory landscape and demonstrate the broader relevance of the findings.

      Our analysis (Figure 5F) indicates that Lmx1b is the transcription factor most strongly associated with its predicted target gene expression across all TM cells, as reflected by its highest value along the X-axis. While other transcription factors exhibit greater motif accessibility (Y-axis), this likely reflects their broader expression across TM subtypes. In contrast, Lmx1b is minimally expressed in TM1 and TM2 cells, which may account for its lower motif accessibility overall (motifs not accessible in cells where Lmx1b is not / minimally expressed).

      Our emphasis on LMX1B is further supported by its direct genetic association with glaucoma. In contrast, the other transcription factors lack clear links to glaucoma and are supported primarily by indirect evidence. Nonetheless, we agree that the transcription factors highlighted in our analysis are promising candidates for future investigation. However, to maintain focus on the central narrative of this study, we have chosen not to include an extended discussion of these additional genes.

      (6) In the abstract, the authors report 9,394 wild-type TM cell transcriptomes, but the number of Lmx1bV265D/+ TM cell transcriptomes analyzed is not provided. This information is essential for evaluating the comparative analysis and should be clearly stated in the Abstract and again in the main text (e.g., lines 121-123). Including both wild-type and mutant cell counts will help readers assess the balance and robustness of the dataset.

      We thank the reviewer for noticing this oversight and have added this value to the abstract and results section. 

      Lines 41 and 696: 2,491 mutant TM cells.  

      (7) Did the authors monitor mouse weight or other health parameters to assess potential systemic effects of treatment? It is known that the taste of compounds in drinking water can alter fluid or food intake, which may influence general health. Also, do Lmx1bV265D/+ mice exhibit non-ocular phenotypes, and if so, does nicotinamide confer protection in those tissues as well? Additionally, given that nicotinamide dosing started at postnatal day 2, how long were the mice treated with water containing nicotinamide, after how many days or weeks was IOP reduced, and how long was the decrease in IOP sustained?

      Water intake was monitored in both treatment groups, and dosing was based on the average volume consumed by adult mice (lines 1017–1018). Young pups do not drink water, so the drug is largely delivered through the mother’s milk until weaning, and we therefore do not know an accurate dose for young pups. Mouse health was assessed throughout the experiment through regular monitoring of body weight and general condition.

      Depending on genetic context, Lmx1b mutations can cause kidney disease and impact other systems. Non-ocular phenotypes were not the focus of this study and were not characterized.

      We added a comment to the Methods to clarify the NAM treatment timeline. NAM was administered continuously in the drinking water starting at P2 and maintained throughout the experiment. IOP was measured beginning at 2 months and then at monthly time points. NAM lessened IOP at 2 and 3 months. We terminated IOP assessment at 3 months.

      Lines 1028-1029: “Treatment was started at postnatal day 2 and continued throughout the experiment.”

      (8) While the IOP reduction observed in NAM-treated Lmx1bV265D/+ mice appears statistically significant, it is unclear whether this reflects meaningful biological protection. Several untreated mice exhibit very high IOP values, which may skew the analysis. The authors should report the mean values for IOP in both untreated and NAM-treated groups to clarify the magnitude and variability of the response.

      We have added supplemental table 7 with the statistical information. Regarding the high IOP values observed in a subset of untreated V265D mutant mice, we consistently detect individual mutant eyes with IOPs exceeding 30 mmHg across independent cohorts and time points [3-5]. It is important to note that IOP is subject to fluctuation and in disease states such as glaucoma, circadian rhythms can be disrupted with stochastic and episodic IOP spikes throughout the day. This may be occurring in those untreated mice. This is also why we strive to use sample sizes of 40 or more. Additionally, we observe that some mutant eyes with IOPs measured within the normal range have anterior chamber deepening (ACD) - a persistent anatomical change associated with sustained or recurrent high IOP that stretches the cornea and may posteriorly displace the lens. This suggests mutant mice experience transient IOP elevations that are not always captured at a single time point due to the stochastic nature of these fluctuations. To account for this, we include ACD as an additional readout alongside IOP measurements. The reduction in ACD observed in NAM-treated mice provides independent evidence supporting the biological relevance of NAM-mediated IOP reduction.   

      (9) Additionally, since NAM has been shown to protect RGCs in other glaucoma models directly, the authors should assess whether RGCs are preserved in NAM-treated Lmx1b V265D/+ mice. Demonstrating RGC protection would support a synergistic effect of NAM through both IOP reduction and direct neuroprotection, strengthening the translational relevance of the treatment.

      We again thank the referee. We note the possibility of dual IOP protection and neuroprotection in the manuscript (lines 961–963). The goal of the present study, however, was to determine mechanisms underlying IOP elevation in patients with LMX1B variants. Therefore, we limited our focus to IOP elevation (LMX1B is expressed in the TM but not RGCs). Studies of the RGCs and optic nerve in V265D mutant mice treated with NAM take considerable effort but are underway. They will be reported in a subsequent manuscript. Initial data support protection, but that is a work in progress.  

      Additionally, we recently reported a pattern of IOP protection with pyruvate similar to that reported here [4]. In those experiments we analyzed the optic nerve, because the focus of the study was the assessment of pyruvate as a resilience factor against high genetic risk of glaucoma. In that case, there was statistically significant protection from glaucomatous optic nerve damage, again arguing for translational relevance, with a possible synergistic effect through both IOP reduction and direct neuroprotection.

      (10) Can the authors add any other functional validation studies to explore the pathways enriched in the TM1, TM2, and TM3 subtypes, in addition to the IHC/IF/RNAscope validation?

      We agree with the reviewer on the importance of further functional validation of pathways active in TM cell subtypes that influence IOP. However, comprehensive investigation of the pathways active in each subtype will need to be addressed in future studies. It is beyond the scope of this already large paper.

      (11) The authors should include a representative image of the limbal dissection. While Figure S1 provides a schematic, mouse eyes are very small, and dissecting unfixed limbal tissue is technically challenging. It is also difficult to reconcile the claim that the majority of cells in the limbal region are TM and endothelium. As shown in Figure S6, DAPI staining suggests a much higher abundance of scleral cells compared to TM cells within the limbal strip. Additional clarification or visual evidence would help validate the dissection strategy and cellular composition of the captured region.

      We appreciate the reviewer’s suggestion and have added additional images to Figure S1 to show our limbal strip dissection. However, we clarify that we do not intend to suggest that TM and endothelial cells are the most abundant populations in these dissected strips. When we say “are enriched for drainage tissues” we mean in comparison to dissecting the anterior segment as a whole. We have clarified this in the text. In fact, epithelial cells (primarily from the cornea) constituted the largest cluster in our dataset (Figure 1A). Additionally, to avoid misinterpretation, we generally refrain from drawing conclusions about the relative abundance of cell types based on sequencing data. Single-cell and single-nucleus RNA sequencing results are sensitive to technical factors that alter cell proportions depending on exact methodological details. In our study, TM cells comprised 24.4% of the single-cell dataset and 11.8% of the single-nucleus dataset, illustrating the impact of methodological variability.

      Lines 163-164: “Individual eyes were dissected to isolate a strip of limbal tissue, which is enriched for TM cells in comparison to dissecting the anterior segment as a whole.”

      Reviewer #1 (Recommendations for the authors):

      To enhance the reproducibility and transparency of the findings presented in this study, we strongly recommend that the authors make all analysis scripts and computational tools publicly available.

      We agree with the reviewer’s emphasis on transparency and are currently building a GitHub page to share our scripts. However, we did not develop any new tools for this study. All tools that we used are publicly available and provided in our methods section. All data will be available as raw data and through the Broad Institute’s Single Cell Portal.

      Reviewer #2 (Recommendations for the authors):

      The authors are to be commended for a well-written presentation of high-quality data, their comparisons of datasets (other mouse and human scRNAseq data), correlation with clinical glaucoma risk alleles, and curative therapy for the mouse model of Lmx1b glaucoma. There are several minor suggestions that the authors might consider to further improve their manuscript:

      (1) Lines 42-43: Although their data strongly support the role of mitochondrial dysfunction in Lmx1b glaucoma, they might want to soften their conclusion "supports a primary role of mitochondrial dysfunction within TM3 cells initiating the IOP elevation that causes glaucoma".

      With the inclusion of EM data supporting mitochondrial dysfunction in Lmx1b mutant TM cells, we have revised this sentence to more accurately reflect our findings.

      Lines 42-44 (previously lines 42-43): “Mitochondria in TM cells of V265D/+ mice are swollen with a reduced cristae area, further supporting a role for mitochondrial dysfunction in the initiation of IOP elevation in these mice.”

      (2) Figure 1: Why is the shape of the "TM containing" cluster in 1A so different than the cluster shown in 1B?

      We isolated cells from the 'TM-containing' cluster and performed unbiased reclustering, which alters their positioning in UMAP space. The figure legend has been updated to clarify this point.

      Lines 143-144 “A separate UMAP representation of the trabecular meshwork (TM) containing cluster following subclustering.”

      (3) Line 160: change "data was" to "data were"

      Corrected

      (4) S4 Fig C: Please comment on why the Columbia and Duke heatmaps for TM3 are not as congruent as the heatmaps for TM1 and TM2.

      We cannot definitively determine the reason for this. However, differences in tissue processing techniques between the Columbia and Duke preparations may contribute. Such variations have been shown to affect cellular transcriptomes in certain contexts. It is possible that TM3 cells are more susceptible to these effects than others. We have added a statement addressing this point to the figure legend.

      Lines 238-240: “Because tissue processing techniques can alter gene expression [52], the heatmap variation between institutes likely reflects differences in processing techniques (Methods) and suggests that TM3 cells are more susceptible to these effects than other cell types.”

      (5) S9 Fig: It is very difficult to see any staining for TM1 CHIL1 (2nd panel), TM2 End3 (2nd panel), and TM3 Lypd1 (both panels)

      We apologize for the difficulty in visualizing these panels. To improve clarity, we have increased the brightness of all relevant marker signals, within standard bounds, to facilitate easier interpretation.

      (6) Line 380: "are significantly higher"; since statistical analysis was not reported, please do not use "significantly"

      Done

      (7) The authors should consider discussing several of their findings that agree with published literature. For example:

      Figure 3B: "Wnt protein binding" (PMID: 18274669), "TGFb binding" (numerous references), "integrin binding" (work of Donna Peters), "actin binding"/"actin filament binding"/"actin filament bundle" (CLANs references)

      S10 Fig C: "ossification" (work of Terete Borrás)

      S11 Fig A: ID2/ID3 (PMID: 33938911); (B) BMP4 (PMID: 17325163)

      S12 Fig A: MYOC in TM1 cells (numerous references)

      We appreciate the reviewer’s diligent review and comments regarding these pathways. We have added a comment to the discussion regarding the agreement of these pathways.

      Lines 855-858: In addition, the expression of genes that we document generally agrees with the literature. For example, the following genes and signaling molecules have been reported in TM cells, WNT signaling [78], TGF-β signaling [79-85], integrin binding [86-88], actin cytoskeletal networks [89], calcification genes [90, 91], and Myocilin [91-94].

      (8) Line 541: was confocal microscopy used to measure the "3D shapes" of nuclei or was this done with a single image to determine sphericity?

      This analysis was performed using confocal microscopy and 3D reconstructed models of the TM nuclei. We have added text to clarify this in the figure legend.

      Lines 553-556: “To rigorously assess whether TM1 nuclei are more spherical, we analyzed their reconstructed 3D shapes from whole mounts images by confocal microscopy, comparing them to TM3 nuclei using the ‘Sphericity’ tool in Imaris.”
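
      As background for the sphericity comparison, the quantity can be sketched as below. This is a minimal illustration that assumes the standard (Wadell) definition, the surface area of a sphere with the same volume as the object divided by the object's actual surface area; the response does not describe how Imaris computes the value on the reconstructed nuclei, and the function name and inputs here are hypothetical.

```python
import math


def sphericity(volume, surface_area):
    """Wadell sphericity: area of the equal-volume sphere / actual surface area.

    Returns 1.0 for a perfect sphere and smaller values for less spherical shapes.
    """
    if volume <= 0 or surface_area <= 0:
        raise ValueError("volume and surface area must be positive")
    equal_volume_sphere_area = math.pi ** (1 / 3) * (6 * volume) ** (2 / 3)
    return equal_volume_sphere_area / surface_area


# Geometric sanity checks (idealized shapes, not measured nuclei).
unit_sphere = sphericity(4 / 3 * math.pi, 4 * math.pi)
unit_cube = sphericity(1.0, 6.0)
```

      Because the value depends only on volume and surface area, it requires a 3D reconstruction; a single 2D image can give circularity but not sphericity, which is the distinction the reviewer's question raises.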

      (9) Line 545: please add a close parentheses after "scoring 1"

      Done

      (10) S15 Fig: (A) There does not appear to be "good agreement" (line 653) between the datasets for TM1. (C) please provide a better explanation on how to interpret these "Confusion Matrix" results.

      We understand the referee's concern. The patterns likely appear different to the referee because of limited sampling in the snRNA-seq data. Based on our results, TM1 seems particularly susceptible, possibly because these cells do not tolerate the isolation process as well. Although we are confident, based on our experience, that TM1 shows good agreement between the two techniques, we have revised the language in the text to “generally” to reflect this nuance.

      Lines 633-635 (previously line 653): The generated clusters and their marker genes generally agreed with our scRNA-seq analyses (Fig 5A-B, S15A Fig).

      We have also added additional clarification for how to interpret the Confusion Matrix. 

      Lines 669-672: “Colors indicate the fraction of cells identified in each ATAC cluster (row) which are also identified in each RNA cell type (columns), where darker colors represent stronger correspondence between RNA and ATAC clusters.”
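
      The interpretation described in this legend corresponds to a row-normalized confusion matrix, which can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the cluster labels and cell assignments are hypothetical, and the function simply counts, for each ATAC cluster (row), the fraction of its cells that fall into each RNA cell type (column).

```python
from collections import Counter


def row_normalized_confusion(atac_labels, rna_labels):
    """Return {atac_cluster: {rna_type: fraction}}; each row sums to 1."""
    if len(atac_labels) != len(rna_labels):
        raise ValueError("both label lists must describe the same cells")
    pair_counts = Counter(zip(atac_labels, rna_labels))
    row_totals = Counter(atac_labels)
    rna_types = sorted(set(rna_labels))
    return {
        atac: {rna: pair_counts[(atac, rna)] / row_totals[atac] for rna in rna_types}
        for atac in sorted(row_totals)
    }


# Hypothetical per-cell labels for six cells (one entry per cell, same order).
atac = ["A0", "A0", "A0", "A1", "A1", "A2"]
rna = ["TM1", "TM1", "TM2", "TM2", "TM2", "TM3"]
matrix = row_normalized_confusion(atac, rna)
```

      A row with one value near 1 (darkest color) indicates an ATAC cluster that maps cleanly onto a single RNA cell type, whereas a row spread across several columns indicates ambiguous correspondence.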

      (11) Line 676: The transition from discussing the sc/snRNAseq data to the work in Lmx1b mutant mice is quite abrupt and could use a better transition to introduce this metabolism work.

      We have revised this transition for improved flow but prefer to keep all transitions brief due to the paper's length.

      Lines 691-694 (previously line 676): To evaluate the utility of our new TM cell atlas, we used it to examine how Lmx1b mutations affect the TM cell transcriptome and to identify potential mechanisms underlying IOP elevation. We selected LMX1B because it causes IOP elevation and glaucoma in humans and was identified as a highly active transcription factor in our TM cell dataset.

      (12) Lines 696-697: It appears counter-intuitive that upregulation of ubiquitin pathways would lead to proteostasis (proteasome protein degradation requires ubiquitination).

      We have clarified that the protein tagging pathway was significantly upregulated. However, polyubiquitin precursor itself was downregulated. In general, the statistical significance of the protein tagging pathway suggests perturbation of the system tagging proteins for degradation. We have clarified this in the text. 

      Lines 711-714 (previously lines 696-697): “In addition, mutant TM3 cells showed an upregulation of protein tagging genes. However, there is a downregulation of the polyubiquitin precursor gene (Ubb, P = 4.5E-30), indicating a general dysregulation of pathways that tag proteins for degradation.”

      (13) Line 715: Please justify why "perturbed metabolism" was chosen to pursue vs the other differentially expressed pathways

      We chose to narrow our focus to TM3 cells because of their enrichment for Lmx1b expression. Most pathways identified in our analysis of TM3 cells implicate mitochondrial metabolism. Therefore, we chose to further explore this avenue. We clarified in the text that perturbed metabolism was the strongest gene expression signature.

      Lines 753-754 (previously line 715): “Our findings most strongly implicate perturbed metabolism within TM3 cells as responsible for IOP elevation in an Lmx1b glaucoma model.”

      (14) Line 759: The authors clearly demonstrate that Lmx1b is most expressed in TM3 cells; however, they did not demonstrate that "Lmx1b was most active"

      ATAC analysis showed that Lmx1b was most active in TM cells overall. We inferred its activity in TM3 because Lmx1b is most enriched in that subtype. This has been clarified in the text.

      Lines 799-800 (previously line 759): “More specifically, we demonstrate that Lmx1b is the most active TM cell TF and is enriched in TM3 cells,…”

      (15) Lines 830-835: Please include references documenting increased TGFβ2 concentrations in POAG aqueous humor and TM, effects of TGFβ2 on TM ECM deposition, and TGFβ2 induced ocular hypertension ex vivo and in vivo.

      Done.

      (16) Line 875: The authors provide no direct evidence for enhanced "oxidative stress" in Lmx1b TM3 cells

      The mitochondrial abnormalities and changed pathways support oxidative stress, but we have not directly tested this. Experiments are currently underway to evaluate its role, but these additional analyses are beyond the scope of this paper. We removed oxidative stress from the sentence.

      Lines 920-922 (previously line 875): “Importantly, in heterozygous mutant V265D/+ mice, TM3 cells had pronounced gene expression changes that implicate mitochondrial dysfunction, but that were absent or much lower in other cells including TM1 and TM2.”

      (17) Line 880: Similarly, the authors have not directly assessed effects on metabolism in TM3 cells; they only have shown changes in the expression of mitochondrial genes that may affect metabolism

      We have no way to specifically isolate TM3 cells to test this. Future work is underway to test this more broadly in isolated TM cells but is beyond the scope of this already large paper. Considering our gene expression data and the addition of supporting EM data, we have qualified the text.

      Lines 930-931 (previously 880): “Our data extend these published findings by showing that inheritance of a single dominant mutation in Lmx1b similarly affects mitochondria in TM cells.”

      (18) Line 892: What markers were used to detect "cell stress"?

      We have revised the text. Although our RNA data show stress gene changes, characterization of these markers is beyond the scope of the current study and will be included in a subsequent paper.

      Lines 945-948 (previously line 892): “However, these processes were not limited to TM3 cells or even to cell types that express detectable Lmx1b, suggesting that they are secondary damaging processes that are subsequent to the initiating, Lmx1b-induced perturbations in TM3 cells.”

      Additional author-driven change

      While revising and reviewing our data, we identified a coding error that resulted in the WT and V265D mutant group labels being switched in Figure 6. Importantly, the significance of the differentially expressed genes (DEGs), the implicated biological pathways, and the interpretation of pathway directionality in the manuscript remain accurate. The only issue was the incorrect labeling in the figure. We have corrected the labels in Figure 6 to accurately reflect the data. As noted above, all data and code will be made available to ensure full reproducibility of our results.

      References

      (1) Doucet-Beaupre H, Gilbert C, Profes MS, Chabrat A, Pacelli C, Giguere N, et al. Lmx1a and Lmx1b regulate mitochondrial functions and survival of adult midbrain dopaminergic neurons. Proc Natl Acad Sci U S A. 2016;113(30):E4387-96. Epub 2016/07/14. doi: 10.1073/pnas.1520387113. PubMed PMID: 27407143; PubMed Central PMCID: PMCPMC4968767.

      (2) Jimenez-Moreno N, Kollareddy M, Stathakos P, Moss JJ, Anton Z, Shoemark DK, et al. ATG8-dependent LMX1B-autophagy crosstalk shapes human midbrain dopaminergic neuronal resilience. J Cell Biol. 2023;222(5). Epub 2023/04/05. doi: 10.1083/jcb.201910133. PubMed PMID: 37014324; PubMed Central PMCID: PMCPMC10075225.

      (3) Cross SH, Macalinao DG, McKie L, Rose L, Kearney AL, Rainger J, et al. A dominant-negative mutation of mouse Lmx1b causes glaucoma and is semi-lethal via LDB1-mediated dimerization [corrected]. PLoS Genet. 2014;10(5):e1004359. Epub 2014/05/09. doi: 10.1371/journal.pgen.1004359. PubMed PMID: 24809698; PubMed Central PMCID: PMCPMC4014447.

      (4) Li K, Tolman N, Segre AV, Stuart KV, Zeleznik OA, Vallabh NA, et al. Pyruvate and related energetic metabolites modulate resilience against high genetic risk for glaucoma. Elife. 2025;14. Epub 2025/04/24. doi: 10.7554/eLife.105576. PubMed PMID: 40272416; PubMed Central PMCID: PMCPMC12021409.

      (5) Tolman NG, Balasubramanian R, Macalinao DG, Kearney AL, MacNicoll KH, Montgomery CL, et al. Genetic background modifies vulnerability to glaucoma-related phenotypes in Lmx1b mutant mice. Dis Model Mech. 2021;14(2). Epub 2021/01/20. doi: 10.1242/dmm.046953. PubMed PMID: 33462143; PubMed Central PMCID: PMCPMC7903917.

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript by the Yin group presents interesting findings that organelle-tethered intrinsically disordered "MEMCA" scaffolds, as exemplified by ZDHHC18 at the Golgi and MARCH8 at endosomes, enhance the engagement of cGAS with organelle-proximal condensates, thereby sequestering cGAS from cytosolic DNA sensing and negatively regulating innate immunity.

      Strengths:

      These findings suggest a previously unrecognized mechanism by which Golgi/endosomal IDR scaffolds modulate cGAS activity, with implications for antiviral defense and tumor immunology. The study is conceptually intriguing and potentially impactful.

      Weaknesses:

      While the manuscript addresses a novel aspect of cGAS regulation, additional mechanistic insights and targeted validations are needed to ensure robustness:

      (1) How do ZDHHC18/MARCH8 enhance cGAS engagement? Do they act as bridges to form a ternary, membrane-tethered cGAS-DNA-MEMCA complex, or alter cGAS condensate properties allosterically?

      (2) Is organelle cGAS capture selective? For instance, can other palmitoyltransferases/E3 ligases be substituted for ZDHHC18/MARCH8?

      (3) Why does membrane association suppress cGAS enzymatic activity, given that dsDNA still resides in cGAS condensates?

    2. Reviewer #2 (Public review):

      Summary:

      The authors found that cGAS, a DNA sensor, relocalizes to organelle membranes (ER, Golgi, endosomes) upon DNA stimulation, revealing spatial regulation of its activity. ZDHHC18 and MARCH8 recruit cGAS to Golgi/endosomes via intrinsically disordered regions (IDRs), driving phase-separated condensates. This sequestration of cGAS-dsDNA complexes suppresses innate immune signaling, uncovering a novel regulatory mechanism.

      Strengths:

      The work overall is very interesting. The authors provided molecular and biochemical evidence.

      Weaknesses:

      Overall, the work is very interesting. However, the quality of some of the data does need to be improved, and more experiments need to be performed.

      The following points need to be addressed:

      (1) In Figure S7, no direct binding between cGAS and MARCH8 or ZD18 IDR is observed, and the interaction only occurs after DNA stimulation. However, Figure 5 shows cGAS recruitment to ZD18 or MARCH8 IDR droplets, suggesting direct interactions. This apparent discrepancy should be clarified.

      (2) The authors propose that recruiting cGAS to organelle membranes reduces its activity, as demonstrated by the FKBP experiment. However, ZD18 and MARCH8 also post-translationally modify cGAS. Do both mechanisms contribute to this effect, and can the authors test this?

      (3) To demonstrate the functional importance of MEMCA, the authors should test IFN production or STING activation in cells.

      (4) Does the IDR of MARCH8 or ZD18 influence the interaction between cGAS and DNA?

      (5) Which region of cGAS does the IDR of MARCH8 or ZD18 interact with: the cGAS-CD or the cGAS-N-terminus?

      (6) The in vitro LLPS experiments with cGAS, DNA, and ZD18/MARCH8 should be conducted under physiological conditions.

    3. Reviewer #3 (Public review):

      Summary:

      In this study by Shi et al., the authors evaluate whether cGAS is recruited to the membranes of intracellular organelles. Using a combination of biochemical fractionation and imaging techniques, the authors propose that upon recognition of DNA, cGAS translocates to various subcellular locations, including the Golgi, endoplasmic reticulum, and endosomes. Mechanistically, the authors propose that upon localizing to the Golgi or endosome, cGAS binding to MARCH8 and ZDHHC18 prevents cGAS activity by incorporating cGAS and dsDNA into biomolecular condensates. However, in its current form, the study does not directly address this question.

      Strengths:

      The question of evaluating cGAS sub-cellular localization as a mechanism for controlling activity is interesting, and there is some evidence that cGAS is localized to sub-cellular organelle membranes.

      Weaknesses:

      (1) The well-established nuclear localization of cGAS is not adequately addressed in the cell lines used and is inconsistent with the findings.

      (2) Previous studies have shown that ZDHHC18 and MARCH8 control cGAS activity, which detracts somewhat from the novelty.

      (3) There is considerable inconsistency in the cell lines and artificial expression systems used across the study.

      (4) A key element missing is showing that in the absence of ZDHHC18 or MARCH8, the loss of endogenous cGAS localization to the various sub-cellular organelles increases cGAMP synthesis and downstream STING activation in primary cells. There is an over-reliance on artificial expression systems. An important experiment to validate the hypothesis would be to evaluate endogenous cGAS localization in MARCH8- and ZDHHC18-deficient primary cells. Further, there should be evaluation of endogenous STING responses in MARCH8- and ZDHHC18-deficient primary cells in tandem with the localization studies.

      (5) There are a large number of grammatical errors throughout the manuscript which should be addressed.

    4. Author response:

      Below we outline our provisional responses to the major points raised in the public reviews, and our planned revisions:

      (1) Mechanistic model of how ZDHHC18/MARCH8 engage the cGAS–DNA condensate (Reviewers #1 & #2)

      We will add a dedicated subsection and a working-model figure describing our current view: IDRs of ZDHHC18 (Golgi) and MARCH8 (endosomes) engage pre-formed cGAS–DNA condensates at organelle membranes, and thereby tune cGAS activity through PTMs. We will explicitly discuss bridge-like versus allosteric modes by performing additional LLPS experiments (e.g., FRAP assays) to detect any IDR-driven changes in condensate properties, and explain how these scenarios fit our data.

      (2) Selectivity beyond ZDHHC18/MARCH8 (Reviewer #1)

      We will expand the text to explain existing evidence indicating that, in addition to ZDHHC18 or MARCH8, other post-translational modification (PTM) enzymes and/or membrane-associated scaffolds may also modulate cGAS. We will summarize our current datasets that support this possibility and outline how this selectivity relates to organelle identity.

      (3) Why membrane association suppresses cGAS activity (Reviewer #1)

      We will provide a concise mechanistic rationale—integrating our published work—to explain how membrane-proximal sequestration can limit cGAS catalysis despite cGAS–DNA coexistence within condensates. Specifically, we will discuss (i) IDR-dependent changes in condensate properties, and (ii) PTMs by ZDHHC18/MARCH8 that allosterically reduce catalytic efficiency; we will clearly cross-reference our prior publications that bear on these points.

      (4) Reconciling Fig. S7 (DNA-dependent binding) with Fig. 5 (recruitment to IDR droplets) (Reviewer #2)

      We will add text to clarify experimental context and readouts to prove that there is no real contradiction between Fig. S7 and Fig. 5. In the experiment shown in Fig. 5, PEG (a macromolecular crowding agent) was added to the system, which facilitates the formation of IDR phase-separated droplets. Under these conditions, cGAS partitions into the IDR condensates, leading to the observed recruitment. In contrast, Fig. S7 examines the direct physical interaction between cGAS and the IDRs using biochemical pull-down assays and shows that no direct interaction occurs in the absence of DNA. These two results reflect different experimental contexts and are therefore not mutually exclusive.

      (5) Planned additional tests to address specificity and mechanism (Reviewer #2)

      DNA pull-down: to test whether IDRs alter cGAS–DNA affinity, we will compare cGAS binding to DNA with/without MEMCA IDRs (and with charged-residue mutants).

      Domain mapping: to determine which region of cGAS engages MEMCA IDRs, we will map binding using cGAS N-terminus/core-domain truncations and key surface mutants.

      Physiological in vitro LLPS: we will repeat cGAS–DNA–IDR LLPS assays under physiological buffer conditions and report partition coefficients, FRAP, and phase diagrams to ensure physiological relevance.

      (6) Image clarity and data presentation (Reviewer #2)

      We will improve image resolution, add zoomed-in insets with organelle markers, and provide images with a clearer Cy5-ISD signal.

      (7) Nuclear localization of cGAS and system considerations (Reviewer #3)

      We will explicitly document the nuclear signal of cGAS observed in our confocal experiments and detail the cell lines and expression systems used. We will also clarify cGAS nuclear localization in the cell lines used.

      (8) Endogenous validation and cell line consistency (Reviewer #3)

      We will perform experiments in primary cells (knockout macrophages) to address the concern of relying on overexpression.

      (9) Language and grammar (Reviewer #3)

      We will thoroughly revise the manuscript for grammar and clarity.

      Together, these planned revisions will strengthen the mechanistic basis of our findings and provide direct evidence for the physiological role of organelle-tethered IDRs in regulating cGAS activity.

    1. Reviewer #3 (Public review):

      Summary:

      Ruppert et al. present a well-designed 2×2 factorial study directly comparing methionine restriction (MetR) and cold exposure (CE) across liver, iBAT, iWAT, and eWAT, integrating physiology with tissue-resolved RNA-seq. This approach allows a rigorous assessment of where dietary and environmental stimuli act additively, synergistically, or antagonistically. Physiologically, MetR progressively increases energy expenditure (EE) at 22 °C and lowers RER, indicating a lipid utilization bias. By contrast, a 24-hour 4 °C challenge elevates EE across all groups and eliminates MetR-Ctrl differences. Notably, changes in food intake and activity do not explain the MetR effect at room temperature.

      Strengths:

      The data convincingly support the central claim: MetR enhances EE and shifts fuel preference to lipids at room temperature (22 °C), while CE drives robust EE increases regardless of diet and attenuates MetR-driven differences. Transcriptomic analysis reveals tissue-specific responses, with additive signatures in iWAT and CE-dominant effects in iBAT. The inclusion of explicit diet×temperature interaction modeling and GSEA provides a valuable transcriptomic resource for the field.

      Weaknesses:

      Limitations include the short intervention windows (7 d MetR, 24 h CE), use of male-only cohorts, and reliance on transcriptomics without complementary proteomic, metabolomic, or functional validation. Greater mechanistic depth, especially at the level of WAT thermogenic function, would strengthen the conclusions.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      SECTION A - Evidence, Reproducibility, and Clarity

      Summary

      The study investigates the neurodevelopmental impact of trisomy 21 on human cortical excitatory neurons derived from induced pluripotent stem cells (hiPSCs). Key findings include a modest reduction in spontaneous firing, a marked deficit in synchronized bursting, decreased neuronal connectivity, and altered ion channel expression, particularly a downregulation of voltage-gated potassium channels and HCN1. These conclusions are supported by a combination of in vitro calcium imaging, electrophysiological recordings, viral monosynaptic tracing, RNA sequencing, and in vivo transplantation with two-photon imaging.

      Major Comments

      • Convincing Nature of Key Conclusions: The study's conclusions are generally well supported by a diverse set of experimental approaches. However, certain claims regarding the intrinsic properties of the excitatory network would benefit from further qualification. In particular, the assertion that reduced synchronization is solely attributable to altered ion channel expression might be considered somewhat preliminary without additional corroborative experiments.

      1.1) We agree with the reviewer and now write in the abstract: 'Together, these findings demonstrate long-lasting impairments in human cortical excitatory neuron network function associated with Trisomy 21.' And in the Introduction: 'Collectively, the observed changes in ion channel expression, neuronal connectivity, and network activity synchronization may contribute to functional differences relevant to the cognitive and intellectual features associated with Down syndrome.'

      • One major limitation of the current experimental design is the reliance on predominantly excitatory neuronal cultures derived from hiPSCs. Although the authors convincingly demonstrate differences in network synchronization and connectivity between trisomic (TS21) and control neurons, the almost exclusive focus on excitatory cells limits the physiological relevance of the in vitro network. In the developing cortex, interneurons and astrocytes play crucial roles in modulating network excitability, synaptogenesis, and plasticity. Therefore, incorporating these cell types, either through co-culture systems or through directed differentiation protocols that yield a more heterogeneous neuronal population, could help to determine whether the observed deficits are intrinsic to excitatory neurons or are compounded by a lack of proper inhibitory regulation and glial support.

      1.2) Thank you for this thoughtful comment. We agree that interneurons and astrocytes are crucial for network function. To clarify, astrocytes are generated in this culture system, as we previously reported in our characterisation of the timecourse of network development using this approach (Kirwan et al., Development 2025). However, our primary goal was to first isolate and define the cell-autonomous defects intrinsic to TS21 excitatory neurons, minimizing the complexity introduced by additional neuronal types. This focused approach was also chosen because engineering a stable co-culture system with reproducible excitatory/inhibitory (E/I) proportions is a significant undertaking that extends beyond the scope of this initial investigation, and has proven challenging to date for the field. By establishing this foundational phenotype, our work complements prior studies on interneuron and glial contributions. Future studies building on this work will be essential to dissect the more complex, non-cell-autonomous effects within a heterogeneous network.

      Importantly, since our initial submission, two highly relevant preprints have emerged, including a notable study from the Geschwind laboratory at UCLA (Vuong et al., bioRxiv, 2025; Risgaard et al., bioRxiv, 2025), as well as our own complementary study (Lattke et al., under revision). These studies highlight widespread transcriptional changes in excitatory cells of the human fetal DS cortex, providing strong validation for our central findings. This convergence of results from multiple groups underscores the timeliness and importance of our work.

      • Furthermore, the assessment of neuronal connectivity via pseudotyped rabies virus tracing, while innovative, has inherent limitations. The quantification of connectivity as a ratio of red-to-green fluorescence pixels may be influenced by differential viral infection efficiencies, variations in the expression levels of the TVA receptor, or even by the lower basal activity levels observed in TS21 cultures. Complementary approaches, such as electron microscopy for synaptic density analysis or functional connectivity measurements using multi-electrode arrays (MEAs), could provide additional structural and functional insights that would validate the rabies tracing data.

      1.3) Thank you for this constructive feedback. While we cannot formally exclude that TS21 cells might express the TVA receptor at lower levels due to generalized gene dysregulation, we infected all WT and TS21 cultures in parallel using identical virus preparations and titers to minimize technical variability. Crucially, we also addressed the potential confound of differential basal activity by performing the rabies tracing under TTX incubation (see Suppl. Fig. 7), which blocks network activity and ensures that viral spread reflects structural connectivity alone.

      While complementary methods like EM or MEA could provide additional insight, they fall outside the scope of the current study. We are confident that our rigorous controls validate our use of the rabies tracing method to assess structural connectivity.
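      For readers unfamiliar with the readout under discussion, the following minimal sketch shows one way a red-to-green pixel ratio could be computed from two thresholded fluorescence channels. It is an assumption-laden illustration rather than the authors' pipeline: the function names, the fixed intensity thresholds, and the toy 3×3 images are all hypothetical.

```python
def count_positive_pixels(channel, threshold):
    """Count pixels in a 2D intensity grid that exceed the threshold."""
    return sum(1 for row in channel for value in row if value > threshold)


def red_to_green_ratio(red, green, red_threshold, green_threshold):
    """Ratio of above-threshold red pixels to above-threshold green pixels."""
    green_pixels = count_positive_pixels(green, green_threshold)
    if green_pixels == 0:
        raise ValueError("no green pixels above threshold; ratio is undefined")
    return count_positive_pixels(red, red_threshold) / green_pixels


# Hypothetical toy images (pixel intensities 0-255), not real data.
red_image = [[0, 120, 200], [30, 180, 10], [90, 0, 150]]
green_image = [[200, 0, 0], [0, 150, 0], [0, 0, 20]]

ratio = red_to_green_ratio(red_image, green_image, red_threshold=100, green_threshold=100)
print(f"red-to-green pixel ratio: {ratio:.2f}")
```

      Because the green pixel count sits in the denominator, anything that changes it independently of connectivity (for example, infection efficiency) shifts the ratio, which is why the parallel-infection and TTX controls described above matter.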

      • Qualification of Claims: Some conclusions, particularly those linking specific ion channel dysregulation (e.g., HCN1 loss) directly to network deficits, might be better presented as preliminary. The authors could temper their language to indicate that while the evidence is suggestive, the mechanistic link remains to be fully established.

      1.4) We have revised the text to more clearly indicate that the link between HCN1 dysregulation and network deficits is correlative and remains to be fully established. While our ex vivo recordings suggest altered Ih-like currents consistent with reduced HCN1 expression, we now present these findings as preliminary and hypothesis-generating, pending further functional validation. We write in the discussion: 'However, further targeted functional validation will be needed to confirm a causal link.'

      • Need for Additional Experiments: Additional experiments that could further consolidate the current findings include:

      o Inclusion of Inhibitory Neurons or Co-culture Systems: Incorporating interneurons or astrocytes would help determine whether the observed deficits are solely intrinsic to excitatory neurons.

      See 1.2

      o Alternative Connectivity Assessments: Complementing the rabies virus tracing with electron microscopy or multi-electrode array (MEA) recordings would add structural and functional validation of the connectivity differences.

      See 1.3

      o Extended Temporal Profiling: Monitoring network activity over a longer developmental window would clarify whether the observed deficits represent a delay or a permanent alteration in network maturation.

      1.5) In vivo, we were able to track the cells for up to five months post-transplantation, supporting the interpretation of a permanent alteration.

      • Reproducibility and Statistical Rigor: The methods and data presentation are largely clear, with adequate replication and appropriate statistical analyses. Nonetheless, a more detailed description of the experimental replicates, particularly regarding the viral tracing and in vivo transplantation studies, would enhance reproducibility. The availability of raw data and scripts for calcium imaging analysis would also further support independent verification.

      We thank the reviewer for these suggestions and now provide a more detailed description of replicates. We have also added the raw data.

      Minor Comments

      • Experimental Details: Minor revisions could include clarifying the infection efficiency and expression levels of the viral constructs used in connectivity assays to rule out technical variability.

      See 1.3

      • Literature Context: The authors reference prior studies appropriately; however, integrating a brief discussion comparing their findings with alternative DS models (e.g., organoids or other hiPSC-derived systems) would improve contextual clarity.

      We thank the reviewer for this helpful suggestion. We have now added a brief discussion comparing our findings with those reported in alternative Down syndrome models, including brain organoids and other hiPSC-derived systems. This addition helps to contextualize our results within the broader field and highlights the unique strengths and limitations of our in vitro and in vivo xenograft approach. We write: 'Our findings align with and extend previous studies using alternative Down syndrome models, such as brain organoids and other hiPSC-derived systems. Organoid models have provided valuable insights into early neurodevelopmental phenotypes in DS, including altered interneuron proportions (Xu et al Cell Stem Cell 2019) but also suggest that variability across isogenic lines can overshadow subtle trisomy 21 neurodevelopmental phenotypes (Czerminski et al Front in Neurosci 2023). However, these systems often lack the structural complexity, vascularization, and long-term maturation achievable in vivo. By using a xenotransplantation model, we were able to assess the maturation and functional properties of human neurons within a physiologically relevant environment over extended time frames, offering complementary insights into DS-associated circuit dysfunction (Huo et al Stem Cell Reports 2018; Real et al., 2018).'

      • Presentation and Clarity: Figures are generally clear, but the manuscript contains a minor labeling error. On page 13, the figure is erroneously labeled as "Fig6A", whereas, based on the context and corresponding data, it should be "Fig5A". I recommend that the authors correct this mistake to ensure consistency and avoid potential confusion for readers.

      Thank you for pointing this out. This has been corrected in the revised manuscript.

      Reviewer #1 (Significance (Required)):

      SECTION B - Significance

      • Nature and Significance of the Advance: The work offers a substantial conceptual advance by providing a mechanistic link between trisomy 21 and impaired neuronal network synchronization. Technically, the study integrates state-of-the-art imaging, electrophysiology, and transcriptomic profiling, thereby offering a multifaceted view of DS-related neural dysfunction. Clinically, the findings have the potential to inform future therapeutic strategies targeting network connectivity and ion channel function in Down syndrome.

      We thank the reviewer for this very supportive comment.

      • Context in the Existing Literature: The study builds on previous observations of altered network activity in DS patients and DS mouse models (e.g., altered EEG synchronization and reduced synaptic connectivity). It extends these findings to human-derived neuronal models, thus bridging a gap between clinical observations and molecular/cellular mechanisms. Relevant literature includes studies on DS neurodevelopment and the role of ion channels in synaptic maturation.

      • Target Audience: The reported findings will be of interest to researchers in neurodevelopmental disorders, Down syndrome, and ion channel physiology. Additionally, the study may attract the attention of those working on hiPSC-derived models of neurological diseases, as well as clinicians interested in the pathophysiology of DS.

      • Keywords and Field Contextualization: Keywords: Down syndrome, trisomy 21, neuronal connectivity, synchronized network activity, hiPSC-derived cortical neurons, ion channel dysregulation.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary

      The manuscript by Peter et al. reports on the neuronal activity and connectivity of iPSC-derived human cortical neurons from Down syndrome (DS), which is caused by trisomy of human chromosome 21 (TS21).

      Major points:

      Although the manuscript is potentially interesting, the results appear somewhat preliminary and need to be corroborated by control experiments and quantifications of effects to fully sustain the conclusions.

      (1) The authors have not assessed the percentage of WT and TS21 cells that acquire a neuronal or glial identity in their cultures. Indeed, the alterations in network activity and connectivity observed in TS21 neurons could simply derive from a reduced number of neurons arising from TS21 iPSCs. Alternatively, the same alterations in network activity and connectivity could derive from a multitude of other factors, including deficits in neuronal development, neurite extension, or intrinsic electrophysiological properties. In the current version of the manuscript, none of these has been investigated.

      2.1) We thank the reviewer for this thoughtful comment. In response, we included an in vivo characterization of cell-type proportions at the same time points where we observed network activity defects using in vivo calcium imaging (see Supplementary Fig. 6).

      Previous work has identified several cellular and molecular phenotypes in human cells, postmortem tissue, and mouse models, including those mentioned by the reviewer. In this study, our focus was on investigating neural network activity, intrinsic electrophysiological properties both in vitro and in vivo, and preliminary bulk RNA sequencing. We have also independently measured cell proportions in the human fetal cortex and conducted a more extensive transcriptomic analysis of Ts21 versus control cells in a separate study (Lattke et al., under revision). We observed a reduction of RORB/FOXP1-expressing Layer 4 neurons in the human fetal cortex at midgestation, as well as increased GFAP+ cells, reduced progenitors and a non-significant reduction of Cux2+ cells in late-stage DS human cell transplants, along with a gene network dysregulation specifically affecting excitatory neurons (Lattke et al., under revision). Here, we provide complementary findings, demonstrating reduced excitatory neuron network connectivity in vitro and decreased synchronised neural network activity in both in vitro and in vivo models (see also 2.8). We agree with the reviewer that this could be for a number of reasons, either cell-autonomous (channel expression and/or function) or non-autonomous (connectivity and/or network composition, as reflected in differences in proportions of SATB2+ neurons generated in TS21 cortical differentiations).

      (2) Electrophysiological properties of TS21 and WT neurons at day 53/54 in vitro indicate an extremely immature stage of development (i.e. RMP between -36 and -27 mV, with most of the cells firing a single action potential after current injection) in the utilized culture conditions. This is far from ideal for in vitro neuronal-network studies. Finally, reduced activity of HCN1 channels should be confirmed by specific recordings isolating or blocking the related current.

      2.2) Thank you for this thoughtful comment. We have also conducted ex vivo electrophysiological recordings and found that the neurons exhibit relatively immature properties, consistent with the known slow developmental trajectory of human neuron cultures. In light of this and the absence of direct confirmatory evidence, we now refer to the observed reduction in HCN1 as preliminary.

      Main points highlighting the preliminary character of the study.

      1) In Figure 1, immunofluorescence images of the neuronal differentiation markers (Tbr1, Ctip2 and Tuj1) are shown. However, no quantification of the percentage of cells expressing these markers for WT and TS21 neurons is reported. On the other hand, simple inspection of the representative images clearly seems to indicate a difference between the two genotypes, with TS21 cultures showing a lower number of cells expressing neuronal markers. This quantification should be corroborated by a similar staining for an astrocyte marker (GFAP, but not S100b since it is triplicated in DS). This is an extremely important point since it is obvious that any change in the percentage of neurons (or the neuron/astrocyte ratio) in the cultures will strongly affect the resulting network activity (shown in Figure 2) and the connectivity (shown in Figure 4). If possible, the quantification should be done at the same time points as the calcium imaging experiments.

      2.3) See 2.1. We included an in vivo characterization of cell-type proportions at the same time points where we observed network activity defects using in vivo calcium imaging (see Supplementary Fig. 6).

      2) In Figure 2 the authors show some calcium imaging traces of WT and TS21 cultures at different time points. However, they again do not show any quantification of neuronal activity. A power spectra analysis is shown in Supplementary Figure 2, but only for WT cultures, while in Supplementary Figure 3 a comparison between WT and Ts21 power spectra is done, but only at the 50-day time point, while differences in synchrony are assessed at 60 days. At minimum, the authors should include in main Figure 2 the quantification of the mean calcium event rate and mean event amplitude at the different time points and the power spectra analysis for both WT and TS21 cultures at the same time points.

      2.4) We thank the reviewer for this comment. We now add the power spectra analysis in the main Figure 2 and quantification of the mean calcium burst rate and mean event amplitude in Supplementary Fig. 4.

      Of note, synchronized neuronal activity is present in WT cultures at day 60, but totally lost at subsequent time points (70 and 80 days). The results at these later time points differ from previous data from the same lab (Kirwan et al., 2015). How might these data be explained? It would be important to rule out any potential issues with the health of the culture that could explain the loss of neuronal activity. It would be beneficial to check cell viability at the different time points to exclude possible confounding factors. A propidium staining or an MTT assay would strongly improve the soundness of the calcium data.

      2.5) We thank the reviewer for this important observation. The difference from the findings reported in Kirwan et al., 2015 is due to the use of a different neuronal differentiation medium in the current study (BrainPhys versus N2B27). BrainPhys medium supports robust early network activity compared to N2B27 (onset before day 60 in BrainPhys, post-day 60 in N2B27), resulting in an earlier decline in synchrony at later stages (day 70-80 in BrainPhys, compared with day 90-100 in N2B27). Importantly, in our in vivo xenograft model, burst activity is sustained up to at least 5 months post-transplantation (mpt), indicating that the neurons retain the capacity for network activity over extended periods in a more physiological environment. We adapted the text accordingly.

      3) In Figure 3 there is no quantification of the number and/or density of transplanted neurons for WT and TS21, but only representative images. As above, inspection of the representative images seems to show a decrease in cells labeled by the Tbr1 neuronal marker for TS21 cells. Moreover, the in vivo calcium imaging of transplanted WT and TS21 cells lacks most of the quantification normally done in calcium imaging experiments. Are the event rate and event amplitude different between WT and TS21 neurons? The measure of neuronal synchrony by mean pixel correlation is not well explained, and it looks somewhat simplistic. Neuronal synchrony can be more precisely measured by cross-correlation analysis or spike time tiling coefficients on the traces from single-neuron ROIs rather than on all pixels in the field of view, as apparently was done here.

      2.6) We thank the reviewer for these valuable points. We now include quantification of the number and density of transplanted neurons for both WT and Ts21 grafts in Extended Data Figure 5 (see 2.1).

      Regarding the in vivo calcium imaging, we appreciate the reviewer's suggestion to include additional standard metrics. We have quantified the event rate in Real et al., 2018. These analyses reveal that Ts21 neurons show a reduction in event rate.

      We agree that our initial description of the synchrony analysis using mean pixel correlation was not sufficiently detailed. We have now clarified this in the Methods and Results, and we acknowledge its limitations. Importantly, we note that the reduced synchronisation is a highly consistent phenotype, observed across at least six independent donor pairs, with different differentiation protocols, and in both in vitro (in two independent labs) and in vivo settings. As suggested, future studies using ROI-based approaches, such as cross-correlation or spike-time tiling coefficients, would provide a more refined characterization of synchrony at the single-neuron level (Sintes et al., in preparation). We now include this point in the discussion.
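
      To make the ROI-based alternative concrete, here is a minimal sketch that scores synchrony as the mean pairwise Pearson correlation between per-neuron traces. It is not the analysis used in the manuscript. The function names and toy traces are hypothetical, and the traces are assumed to be equal-length fluorescence series already extracted from single-neuron ROIs.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length traces."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("traces must have equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A flat (silent) trace carries no timing information;
        # treating it as uncorrelated is an assumption of this sketch.
        return 0.0
    return cov / sqrt(var_x * var_y)


def mean_pairwise_correlation(traces):
    """Average Pearson correlation over all pairs of ROI traces."""
    pairs = list(combinations(traces, 2))
    if not pairs:
        raise ValueError("need at least two ROI traces")
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy data: three ROIs bursting together, then two ROIs alternating.
synchronous = [[0, 1, 0, 1], [0, 1, 0, 1], [0, 1, 0, 1]]
alternating = [[0, 1, 0, 1], [1, 0, 1, 0]]
sync_score = mean_pairwise_correlation(synchronous)
alt_score = mean_pairwise_correlation(alternating)
```

      Working on per-neuron traces rather than on every pixel keeps neuropil and background out of the score, which is the limitation the reviewer raises. A zero-lag correlation like this one remains sensitive to firing rate, which is one reason spike-time tiling coefficients are often preferred.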

      4) The results on reduced neuronal connectivity in Figure 3 look very striking. However, these results should be accompanied by control experiments to verify the number of neuronal cells and neurite extension in WT and Ts21 cultures. These two parameters could indeed strongly influence the results. As the cultures appear to grow in clusters, bright-field images and TuJ1 staining of the cultures will also greatly help to understand the degree of morphological interconnection between the clusters.

      We now add Tuj1 staining in Supplementary Fig. 10.

      5) The authors performed RNA-seq experiments on day 50 cultures. Why do the authors not show the complete differential gene expression analysis, but only a small subset of genes? A comprehensive volcano plot and the complete list of identified genes with logFC and FDR values would be helpful. If possible, comparison of the present data (particularly on KCN and HCN expression changes) with published and publicly available expression datasets of other human or human Down syndrome iPSC-derived neurons or human Down syndrome brains would greatly increase the soundness of the present findings. In addition, the gene ontology (GO) results are mentioned in the text, but are not presented. Showing the complete GO analysis for both up- and downregulated genes will help the reader to better understand the RNA-seq results. Notably, the results shown in the Supplementary Figure on GRIN2A and GRIN2B expression (with values of 300-700 counts versus 2000-4000 counts, respectively) clearly indicate that in both WT and TS21 cultures the NMDA developmental switch has not yet occurred at the 50-day time point.

      We now show volcano plots in Supplementary Fig. 11.
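
      For readers less familiar with the FDR values requested alongside logFC, the sketch below shows the Benjamini-Hochberg adjustment, a common way to derive them from raw p-values. Whether the authors used this procedure is not stated here, so it is an assumption made for illustration. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted values are capped at 1
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four genes.
raw_p = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(raw_p)
```

      In a volcano plot, each gene would then be placed at its logFC on the x-axis and at -log10 of its adjusted value on the y-axis.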

      6) The measurements of hyperpolarization-activated currents shown in Figure 5 lack proper control experiments. First, the hyperpolarizing current in TS21 cells does not reach a steady state as it does in the controls. The two curves are therefore hard to compare. To exclude possible differences in activation kinetics, the authors should have prolonged the current injection period (1-2 seconds). Second, to ultimately prove that such currents are mediated by HCN channels in WT cells, the authors should perform some control experiments with a specific HCN blocker. A good example of a suitable protocol, which also uses current blockers to exclude all other possible current contributions, is the one reported in Matt et al., Cell. Mol. Life Sci. 68, 125-137 (2011).

      2.7) We thank the reviewer for this detailed and helpful comment. We agree that to definitively identify the recorded currents as Ih, it would be necessary to isolate them pharmacologically using specific HCN channel blockers and appropriate controls, such as those described in Matt et al., Cell. Mol. Life Sci. (2011). Unfortunately, because we no longer have access to the animals used in this study and cannot allocate the necessary time or resources, we are unable to perform the additional experiments at this stage.

      However, our goal here was to use electrophysiological recordings as an indication of altered HCN channel activity, which we then support with molecular evidence. We now emphasize this point more clearly in the revised manuscript.

      7) The manuscript lacks information on the statistical analysis used. Also, the number of samples is not clear. Were the dots shown in some graphs technical replicates from a single neuronal induction, independent neuronal inductions, or a mix of the two? Please clarify.

      We now clarify the numbers in the Figure legend.

      8) The methods section lacks important information needed to guarantee reproducibility. Just a few examples: • Only electrophysiology methods for slices are reported, but not for in vitro cultures.

      We now clarify these details in the methods.

      • Details on the Laminin coating are lacking. What concentration was used? Was poly-ornithine or poly-lysine used before the Laminin coating?

      We now clarify these details in the methods.

      • How long before calcium imaging were the cells switched to BrainPhys medium?

      We now clarify these details in the methods.

      Minor point/typos etc.

      Introduction

      • Page 4 line 6: in the line "Trisomy 21 in humans commonly results in a range in developmental and morphological changes in the forebrain ..." "in" could be replaced by "of". We have fixed this.
      • Page 5 line 2: please remove "an" before the word "another". We have fixed this.
      • Page 5 line 2: please replace "ecitatory" with "excitatory". We have fixed this typo.

      Results

      • Page 10 line 25: The concept of "pixel-wise" appears for the first time in this section and could be better introduced to facilitate the understanding of the experiment.
      • In the "results" section, page 11 line 1 and 4, references are made to "Figure 4D" and "4F," but these figures do not appear to be present in the figure section. Upon reviewing the rest of the section, the data seem to refer to "Figure 3D" and "3E." We have fixed this.

      Discussion

      • Page 15 line 20: please replace "synchronised" with "synchronized". We have fixed this typo.
      • Page 16 line 11: please replace "T21" with "TS21". We have fixed this typo.

      Methods

      • Page 19 line 12: "Pens/Strep" has to be replaced by "Pen/Strep". We have fixed this typo.
      • Page 20 line 20: "Tocris Biocience" has to be replaced by "Tocris Bioscience". We have fixed this typo.
      • Page 21 line 2: "Addegene" has to be replaced by "Addgene". We have fixed this typo.

      Figures

      • Figure 3: the schematic experimental design (Fig. 3A) could be enlarged to match the width of the images/graphs below. We have fixed this.
      • Figure 5: the reviewer suggests resizing/repositioning the graphs in Fig. 1A so that they match the width of those below. We have fixed this.
      • Figure S1D: In all the figures of the paper, the respective controls for the TS21 1 and TS21 2 lines are labelled as "WT1/WT2," while in these graphs, they are called "Ctrl1" and "Ctrl2." To ensure consistency throughout the paper, it is suggested to change the names in these graphs. We have fixed this.
      • Figure S4L: The graph is not very clear, especially regarding the significance reported at -50 pA. Please modify the graphical visualization and/or add a legend in the caption. We have fixed this.

      Reviewer #2 (Significance (Required)):

      Nature and significance of the advance for the field. The results presented in the manuscript are potentially interesting and useful, but not completely novel (current deregulation has already been highlighted in mouse models of Down Syndrome).

      2.8) We thank the reviewer for this comment. While we agree that current deregulation has been observed in mouse models of Down syndrome, the novelty and significance of our study lie in demonstrating these alterations directly in human neurons using both in vitro and in vivo xenograft models.

      This is a critical advance because the human cortex has distinct developmental and functional properties not fully recapitulated in mice. In fact, three recent studies have already highlighted significant defects, mainly in excitatory neurons, within the fetal human DS cortex (Vuong et al., bioRxiv, 2025; Risgaard et al., bioRxiv, 2025; Lattke et al., under revision). Our work builds directly on these observations by providing, for the first time, an electrophysiological and network-level characterization of these human-specific deficits.

      Our findings thus provide translationally relevant insight that is not merely confirmatory but extends previous work by grounding it in a human cellular context.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      The manuscript by Peter et al. reports on the neuronal activity and connectivity of iPSC-derived human cortical neurons from Down syndrome (DS), which is caused by trisomy of human chromosome 21 (TS21).

      Major points:

      Although the manuscript is potentially interesting, the results appear somewhat preliminary and need to be corroborated by control experiments and quantifications of effects to fully sustain the conclusions.

      (1) The authors have not assessed the percentage of WT and TS21 cells that acquire a neuronal or glial identity in their cultures. Indeed, the origin of the alterations in network activity and connectivity observed in TS21 neurons could simply derive from a reduced number of neurons arising from TS21 iPSCs. Alternatively, the same alterations in network activity and connectivity could derive from a multitude of other factors, including deficits in neuronal development, neurite extension, or intrinsic electrophysiological properties. In the current version of the manuscript, none of these has been investigated.

      (2) Electrophysiological properties of TS21 and WT neurons at day 53/54 in vitro indicate an extremely immature stage of development (i.e. RMP between -36 and -27 mV, with most of the cells firing a single action potential after current injection) in the utilized culture conditions. This is far from ideal for in vitro neuronal-network studies. Finally, reduced activity of HCN1 channels should be confirmed by specific recordings isolating or blocking the related current.

      Main points highlighting the preliminary character of the study.

      1) In Figure 1, immunofluorescence images of the neuronal differentiation markers (Tbr1, Ctip2 and Tuj1) are shown. However, no quantification of the percentage of cells expressing these markers for WT and TS21 neurons is reported. On the other hand, simple inspection of the representative images clearly seems to indicate a difference between the two genotypes, with TS21 cultures showing a lower number of cells expressing neuronal markers. This quantification should be corroborated by a similar staining for an astrocyte marker (GFAP, but not S100b since it is triplicated in DS). This is an extremely important point since it is obvious that any change in the percentage of neurons (or the neuron/astrocyte ratio) in the cultures will strongly affect the resulting network activity (shown in Figure 2) and the connectivity (shown in Figure 4). If possible, the quantification should be done at the same time points as the calcium imaging experiments.

      2) In Figure 2 the authors show some calcium imaging traces of WT and TS21 cultures at different time points. However, they again do not show any quantification of neuronal activity. A power spectra analysis is shown in Supplementary Figure 2, but only for WT cultures, while in Supplementary Figure 3 a comparison between WT and Ts21 power spectra is done, but only at the 50-day time point, while differences in synchrony are assessed at 60 days. At minimum, the authors should include in main Figure 2 the quantification of the mean calcium event rate and mean event amplitude at the different time points and the power spectra analysis for both WT and TS21 cultures at the same time points.

      Of note, synchronized neuronal activity is present in WT cultures at day 60, but totally lost at subsequent time points (70 and 80 days). The results at these later time points differ from previous data from the same lab (Kirwan et al., 2015). How might these data be explained? It would be important to rule out any potential issues with the health of the culture that could explain the loss of neuronal activity. It would be beneficial to check cell viability at the different time points to exclude possible confounding factors. A propidium staining or an MTT assay would strongly improve the soundness of the calcium data.

      3) In Figure 3 there is no quantification of the number and/or density of transplanted neurons for WT and TS21, but only representative images. As above, inspection of the representative images seems to show a decrease in cells labeled by the Tbr1 neuronal marker for TS21 cells. Moreover, the in vivo calcium imaging of transplanted WT and TS21 cells lacks most of the quantification normally done in calcium imaging experiments. Are the event rate and event amplitude different between WT and TS21 neurons? The measure of neuronal synchrony by mean pixel correlation is not well explained, and it looks somewhat simplistic. Neuronal synchrony can be more precisely measured by cross-correlation analysis or spike time tiling coefficients on the traces from single-neuron ROIs rather than on all pixels in the field of view, as apparently was done here.

      4) The results on reduced neuronal connectivity in Figure 3 look very striking. However, these results should be accompanied by control experiments to verify the number of neuronal cells and neurite extension in WT and Ts21 cultures. These two parameters could indeed strongly influence the results. As the cultures appear to grow in clusters, bright-field images and TuJ1 staining of the cultures will also greatly help to understand the degree of morphological interconnection between the clusters.

      5) The authors performed RNA-seq experiments on day 50 cultures. Why do the authors not show the complete differential gene expression analysis, but only a small subset of genes? A comprehensive volcano plot and the complete list of identified genes with logFC and FDR values would be helpful. If possible, comparison of the present data (particularly on KCN and HCN expression changes) with published and publicly available expression datasets of other human or human Down syndrome iPSC-derived neurons or human Down syndrome brains would greatly increase the soundness of the present findings. In addition, the gene ontology (GO) results are mentioned in the text, but are not presented. Showing the complete GO analysis for both up- and downregulated genes will help the reader to better understand the RNA-seq results. Notably, the results shown in the Supplementary Figure on GRIN2A and GRIN2B expression (with values of 300-700 counts versus 2000-4000 counts, respectively) clearly indicate that in both WT and TS21 cultures the NMDA developmental switch has not yet occurred at the 50-day time point.

      6) The measurements of hyperpolarization-activated currents shown in Figure 5 lack proper control experiments. First, the hyperpolarizing current in TS21 cells does not reach a steady state as it does in the controls. The two curves are therefore hard to compare. To exclude possible differences in activation kinetics, the authors should have prolonged the current injection period (1-2 seconds). Second, to ultimately prove that such currents are mediated by HCN channels in WT cells, the authors should perform some control experiments with a specific HCN blocker. A good example of a suitable protocol, which also uses current blockers to exclude all other possible current contributions, is the one reported in Matt et al., Cell. Mol. Life Sci. 68, 125-137 (2011).

      7) The manuscript lacks information on the statistical analysis used. Also, the number of samples is not clear. Were the dots shown in some graphs technical replicates from a single neuronal induction, independent neuronal inductions, or a mix of the two? Please clarify.

      8) The methods section lacks important information needed to guarantee reproducibility. Just a few examples:

      • Only electrophysiology methods for slices are reported, but not for in vitro cultures.
      • Details on the Laminin coating are lacking. What concentration was used? Was poly-ornithine or poly-lysine used before the Laminin coating?
      • How long before calcium imaging were the cells switched to BrainPhys medium?

      Minor point/typos etc.

      Introduction

      • Page 4 line 6: in the line "Trisomy 21 in humans commonly results in a range in developmental and morphological changes in the forebrain ..." "in" could be replaced by "of".
      • Page 5 line 2: please remove "an" before the word "another".
      • Page 5 line 2: please replace "ecitatory" with "excitatory"

      Results

      • Page 10 line 25: The concept of "pixel-wise" appears for the first time in this section and could be better introduced to facilitate the understanding of the experiment.
      • In the "results" section, page 11 line 1 and 4, references are made to "Figure 4D" and "4F," but these figures do not appear to be present in the figure section. Upon reviewing the rest of the section, the data seem to refer to "Figure 3D" and "3E."

      Discussion

      • Page 15 line 20: please replace "synchronised" with "synchronized".
      • Page 16 line 11: please replace "T21" with "TS21".

      Methods

      • Page 19 line 12: "Pens/Strep" has to be replaced by Pen/Strep.
      • Page 20 line 20: "Tocris Biocience" has to be replaced by "Tocris Bioscience".
      • Page 21 line 2: "Addegene" has to be replaced by "Addgene".

      Figures

      • Figure 3: the schematic experimental design (Fig. 3A) could be enlarged to match the width of the images/graphs below.
      • Figure 5: the reviewer suggests resizing/repositioning the graphs in Fig. 1A so that they match the width of those below.
      • Figure S1D: In all the figures of the paper, the respective controls for the TS21 1 and TS21 2 lines are labelled as "WT1/WT2," while in these graphs, they are called "Ctrl1" and "Ctrl2." To ensure consistency throughout the paper, it is suggested to change the names in these graphs.
      • Figure S4L: The graph is not very clear, especially regarding the significance reported at -50 pA, please modify the graphical visualization and/or add a legend in the caption.

      Significance

      Nature and significance of the advance for the field. The results presented in the manuscript are potentially interesting and useful, but not completely novel (current deregulation has already been highlighted in mouse models of Down Syndrome).

      Work in the context of the existing literature. This work follows the line of evidence that characterizes Down Syndrome in human neurons (Huo, H.-Q. et al. Stem Cell Rep. 10, 1251-1266 (2018); Briggs, J. A. et al. Etiology. Stem Cells 31, 467-478 (2013)), both in vitro and in xenotransplanted mice, by corroborating some important findings already found in animal models (Stern, S., Segal, M. & Moses, E. EBioMedicine 2, 1048-1062 (2015); Cramer, N. P., Xu, X., F. Haydar, T. & Galdzicki, Z. Physiol. Rep. 3, e12655 (2015); Stern, S., Keren, R., Kim, Y. & Moses, E. http://biorxiv.org/lookup/doi/10.1101/467522 (2018) doi:10.1101/467522).

      Audience. Scientists in the field of pre-clinical biomedical research, especially those working on neurodevelopmental disorders and iPSC-based non-animal models.

      Field of expertise. In vitro electrophysiology, neurodevelopmental disorders, Down Syndrome, iPS cells.