  1. Feb 2026
    1. Is paying the advisory fee a bribe or an acceptable cost of doing business in that area of the world? What should the executives do before agreeing to pay the fee?

      Paying this "advisory fee" is 100% a bribe and is not an acceptable cost of doing business in that area of the world. Just because it is common and may happen frequently does not make it right or acceptable. The executives should avoid paying this "advisory fee" and tell the politician that they must first consult their legal counsel and conduct due diligence to make sure this is legitimate business.

    2. The Impact of Terrorism on Global Trade

      I think this is a very relevant topic in our world today because of the events in the Red Sea over the past year. It is a clear example of how terrorism impacts global trade by introducing a security threat in a critical shipping corridor for global commerce. The result globally (and for us, stateside) is that costs rise, trade movement slows, which causes supply chain delays, and companies have to rethink their global strategies to mitigate risk as globalization continues to grow.

    3. Globalization, however, will continue because the world’s major markets are too vitally integrated for globalization to stop.

      It's important to recognize the countless years of globalization behind us. Looking deeper, globalization won't stop because of a terrorist attack, because trade is a crucial factor of everyday life. If global trade did in fact stop, the economies of all countries would shift, causing chaos, since some countries rely heavily on other countries' goods.

    4. goods. It may be a charge per unit, such as per barrel of oil or per new car; it may be a percentage of the value of the goods, such as 5 percent of a $500,000 shipment of shoes; or it may be a combination. No

      I knew what tariffs were, but I didn't know how they worked. The car example makes a lot of sense to me because I live close to the Port Hueneme Base. They ship a lot of cars like Volvo, BMW, and Land Rover. I always see them driving through the street and wonder how tariffs affect these new cars. Now I'm more informed about tariffs.
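
      The three tariff types described in the highlighted passage (per unit, percentage of value, or a combination) can be sketched in a few lines. This is only an illustration: the function names are made up, and the per-car rate and car count are hypothetical. Only the 5 percent on a $500,000 shipment comes from the text.

```python
def specific_tariff(units, rate_per_unit):
    """Duty charged per unit, e.g. per barrel of oil or per new car."""
    return units * rate_per_unit


def ad_valorem_tariff(shipment_value, percent):
    """Duty charged as a percentage of the value of the goods."""
    return shipment_value * percent / 100


def compound_tariff(units, rate_per_unit, shipment_value, percent):
    """Combination of a per-unit charge and a percentage-of-value charge."""
    return specific_tariff(units, rate_per_unit) + ad_valorem_tariff(shipment_value, percent)


# Example from the text: 5 percent of a $500,000 shipment of shoes.
shoes_duty = ad_valorem_tariff(500_000, 5)

# Hypothetical per-unit example: 100 cars at an assumed $2,000 per car.
cars_duty = specific_tariff(100, 2_000)

print(shoes_duty, cars_duty)
```

      Under these assumptions the shoe shipment owes $25,000, and the per-car charge depends only on how many cars arrive, not on what each one is worth.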

    1. Reviewer #2 (Public review):

      Summary:

      Kumar et al. aimed to assess the role of the understudied H3K115 acetylation mark, which is located in the nucleosomal core. To this end, the authors performed ChIP-seq experiments of H3K115ac in mouse embryonic stem cells as well as during differentiation into neuronal progenitor cells. Subsequent bioinformatic analyses revealed an association of H3K115ac with fragile nucleosomes at CpG island promoters, as well as with enhancers and CTCF binding sites. This is an interesting study, which provides important novel insights into the potential function of H3K115ac. However, the study is mainly descriptive, and functional experiments are missing.

      Strengths:

      (1) The authors present the first genome-wide profiling of H3K115ac and link this poorly characterized modification to fragile nucleosomes, CpG island promoters, enhancers, and CTCF binding sites.

      (2) The study provides a valuable descriptive resource and raises intriguing hypotheses about the role of H3K115ac in chromatin regulation.

      (3) The breadth of the bioinformatic analyses adds to the value of the dataset.

      Comments on revisions:

      The authors sufficiently addressed my concerns.

    2. Reviewer #3 (Public review):

      Summary:

      Kumar et al. examine the H3K115 epigenetic mark located on the lateral surface of the histone core domain and present evidence that it may serve as a marker enriched at transcription start sites (TSSs) of active CpG island promoters and at polycomb-repressed promoters. They also note that the H3K115ac mark is enriched on fragile nucleosomes within nucleosome-depleted regions, on active enhancers, and at CTCF-bound sites. They propose that these observations suggest that H3K115ac contributes to nucleosome destabilization and so may serve as a marker of functionally important regulatory elements in mammalian genomes.

      Strengths:

      The authors present novel observations suggesting that acetylation of a histone residue in a core domain (versus on a histone tail) may serve a functional role in promoting transcription at CpG islands and polycomb-repressed promoters. They present a solid amount of confirmatory in silico data, using appropriate methodology, that supports the idea that the H3K115ac mark may function to destabilize nucleosomes and contribute to regulating ESC differentiation. These findings are quite novel.

      Weaknesses:

      Additional experiments to confirm specificity of the antibodies used have been done, improving confidence in the study.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public reviews):

      (1) The absence of replicate paired-end datasets limits confidence in peak localization.

      The reviewer was under the impression that we did not perform biological replicates of our ChIP-seq experiments. All ChIP-seq (and ATAC-seq) experiments were performed with biological replicates and the Pearson’s correlations (all >0.9) between replicates were provided in Supplementary Table 1. We had indicated this in the text and methods but will try to make this even clearer.
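
      For readers unfamiliar with the replicate check mentioned above, here is a minimal sketch of a Pearson correlation between two replicates. It is not the authors' pipeline: the function name and the per-bin signal values are hypothetical, and real data would have one value per genomic bin or peak.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical ChIP-seq signal in the same five bins for two replicates.
replicate_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
replicate_2 = [1.1, 1.9, 3.2, 3.9, 5.1]

r = pearson_r(replicate_1, replicate_2)
print(f"r = {r:.3f}")
```

      A value above 0.9, as the authors report for all replicate pairs, indicates that the two replicates rise and fall together across bins. It does not by itself show that individual peak positions agree.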

      (2) The analyses are primarily correlative, making it difficult to fully assess robustness or to support strong mechanistic conclusions.

      Histone modifications are difficult to alter genetically because of the high copy number of histone genes and inhibition of HATs/HDACs in general leads to alterations in other histone modifications. It is an inherent challenge in establishing causality of histone modifications, especially histone acetylation marks.

      (3) Some claims (e.g., specificity for CpG islands, "dynamic" regulation during differentiation) are not fully supported by the analyses as presented.

      We have modified the text in response to this point. The new text reads: “Non-CGI promoters have lower overall levels of transcription compared to CGI promoters, and for this promoter class H3K115ac enrichment detected by ChIP is only really seen for the highest quartile of transcription (4SU) quartile of expression (Figure 1G). CGI promoters on the other hand, exhibit significant levels of detected H3K115ac even for the lowest quartile of expression. These results suggest a special link between CGI promoters and H3K115ac”.

      (4) Overall, the study introduces an intriguing new angle on globular PTMs, but additional rigor and mechanistic evidence are needed to substantiate the conclusions.

      We agree that the paper does not provide mechanistic details or solid causality of H3K115ac. We have only emphasized the potential role of H3K115ac in nucleosome fragility based on our in vivo data and previously published in vitro experiments (Manohar et al., 2009; Chatterjee et al., 2015). We do provide evidence that H3K115ac is enriched on subnucleosomal particles via sucrose gradient sedimentation of MNase-digested chromatin (Figure 3C-D).

      Reviewer #2 (Public review):

      (1) I am not fully convinced about the specificity of the antibody. Although the experiment in Figure S1A shows a specific binding to H3K115ac-modified peptides compared to unmodified peptides, the authors do not show any experiment that shows that the antibody does not bind to unrelated proteins. Thus, a Western of a nuclear extract or the chromatin fraction would be critical to show. Also, peptide competition using the H3K115ac peptide to block the antibody may be good to further support the specificity of the antibody. Also, I don't understand the experiment in Figure S1B. What does it tell us when the H3K115ac histone mark itself is missing? The KLF4 promoter does not appear to be a suitable positive control, given that hundreds of proteins/histone modifications are likely present at this region. It is important to clearly demonstrate that the antibody exclusively recognizes H3K115ac, given that the conclusion of the manuscript strongly depends on the reliability of the obtained ChIP-Seq data.

      ChIP-qPCR in S1B includes competition from native chromatin and shows high specificity to its target. We have provided antibody validation in three ways:

      - Western blot with dot-blot of synthetic peptides (Figure S1A).

      - Western blots with whole-cell extracts (Figure 4D).

      - ChIP-qPCR on native chromatin spiked with a cocktail of synthetic mono-nucleosomes, each carrying a single acetylation and a specific barcode (SNAP-ChIP K-AcylStat Panel).

      We could not include H3K115ac-marked nucleosomes as they are not available in the panel. Figure S1B shows that the H3K115ac antibody exhibits negligible binding to known K-acyl marks, comparable to an unmodified nucleosome. Because of the absence of an H3K115ac-modified barcoded nucleosome, we used the KLF4 promoter from mESCs as a positive control. In agreement with the ChIP-seq signal shown in the genome browser profile (Figure 1E), the KLF4 promoter shows a significantly higher signal than the gene body.

      (2) The association of H3K115ac with fragile nucleosomes is based on MNase-sensitivity and fragment length, which are indirect methods and can have technical bias. Experiments that support that the H3K115ac modified nucleosomes are indeed more fragile are missing.

      We have performed ChIP-seq on MNase digested mESC chromatin fractionated on sucrose gradients and this shows that H3K115ac is enriched in fractions containing sub-nucleosomal and fragile nucleosomes but depleted in fractions containing stable nucleosomes (Figure 3D).

      (3) The comparison of H3K115ac with H3K122ac and H3K64ac relies on publicly available datasets. Since the authors argue that these marks are distinct, data generated under identical experimental conditions would be more convincing. At a minimum, the limitations of using external datasets should be discussed.

      H3K64ac and H3K122ac datasets were generated by us in a previous publication (Pradeepa et al., 2016) using the same native MNase ChIP protocol as used here. The ChIP-seq datasets for H3K122ac and H3K27ac are processed in an identical manner, with the same computational pipelines, to the H3K115ac data sets generated in this paper.

      (4) The enrichment of H3K115ac at enhancers and CTCF binding sites is notable but remains descriptive. It would be interesting to clarify whether H3K115ac actively influences transcription factor/CTCF binding or is a downstream correlate.

      We agree with the reviewer’s comment, but we have not claimed causality.

      (5) No information is provided about how H3K115ac may be deposited/removed. Without this information, it is difficult to place this modification into established chromatin regulatory pathways.

      Due to broad target specificity, redundancies and crosstalk among different classes of HATs and HDACs, it is not tractable to answer this question in the current manuscript.

      Reviewer #3 (Public reviews):

      Reviewer 3 is mistaken in thinking our ChIP experiments are performed under cross-linked conditions. As clearly stated in the main text and methods, all our ChIP-seq for histone modifications is done on native MNase-digested chromatin – with no cross-linking. This includes the spike-in experiment shown in Fig S1B to test H3K115ac antibody specificity against the bar-coded SNAP-ChIP® K-AcylStat Panel from Epicypher. We could not include H3K115ac bar-coded nucleosomes in that experiment since they are not available in the panel.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) I have two primary concerns that resound through the entire paper:

      (a) Overall, the manuscript is making strong claims based on entirely correlative datasets. No quantitative analyses are performed to demonstrate co-occupancy/localization. Please see more detailed descriptions below.

      Our responses to specific points are provided against each comment below.

      (b) Lack of paired-end replicates for H3K115ac ChIP-seq. While the reviewer token for the deposited data was not made accessible to me, looking at Supplementary Table 1, it appears there are two H3K115ac ChIP-seq datasets. One is paired-end and one is single-read. So are peaks called with only one replicate of PE? Or are inaccurate peaks called with SR datasets? Either way, this is not a rigorous way to evaluate H3K115ac localization.

      We are sorry that this reviewer was not able to access the data – the token for the GEO accession was provided for reviewers at the journal’s request. All ChIP-seq (and ATAC-seq) experiments (paired and single-end) were performed with two biological replicates and the Pearson’s correlations (all >0.9) between replicates were provided in Supplementary Table 1. This was indicated in both the main text and in the methods. In the revised manuscript we have tried to make this even clearer and have put the relevant Pearson’s coefficient (r) into the text at the appropriate places. For the reviewer’s information, here is the complete list of data samples in the GEO Accession:

      Author response image 1.

      While I agree that H3K115ac occupancy is high at +CGIs, the authors downplay that H3K122ac and H3K27ac is also more highly enriched at these locations (page 7, last sentence of first paragraph). I imagine this is all due to the more highly transcribed nature of these genes. Sub-stratifying the K27ac and K122ac by transcription (as in Figure 1G) would help to demonstrate a unique nature of H3K115ac. But even better would be to do an analysis that plots H3K115ac enrichment vs transcription for every individual gene rather than aggregate analyses that are biased by single locations. For example, make an XY scatterplot of RNAPII occupancy or 4SU-seq signal vs H3K115ac level, where each point represents a single gene. Because the interpretation that it is CGI-based and not transcription is confounded with the fact that -CGI are more lowly transcribed. So, looking at Figure 1G, even the -CGI occupancy of H3K115ac is correlated with transcription, but it is just more lowly transcribed.

      We thank the reviewer for these suggestions but point out that Figure 1G shows H3K115ac signal for CGI+ and CGI– TSS that are matched for expression levels (quartiles of 4SU-seq). Fig 1F shows that H3K115ac is much more of a discriminator between CGI+ and CGI– than H3K27ac or H3K122ac.

      (2) H3K115ac, H3K27ac, and H3K122ac are all more enriched (in aggregate) at +CGI locations (Fig 1F); so do these locations just have more positioned nucleosomes? More H3.3? So that these PTMs are just more enriched due to the opportunity?

      Positioned nucleosomes are generally found downstream of the TSS of active CpG island promoters, so what the reviewer suggests may well account for the relative enrichment of H3K27ac and H3K122ac at CGI+ vs CGI- promoters in Fig. 1F. But H3K115ac localisation is distinct, with the peak at the nucleosome-depleted region, not the +1 nucleosome. This is also confirmed by the contour plots in Fig 3. Our observation is also not explained by an enrichment of H3.3 at CGI promoters, since we show that H3K115ac is not specific to H3.3 (Fig 4D).

      (3) The authors note in paragraph 2 of page 7 that "H3K115ac does not scale linearly with gene expression..." but the authors never show a quantification of this; stratification in four clusters is not able to make a linear correlation. Furthermore, in the second line of page 7, the authors state that the levels do generally correlate with transcription. To claim it is a specific CGI link and not transcription is tricky, but I encourage the authors to consider more quantifiable ways, rather than correlations, to demonstrate this point, if it is observed.

      We thank the reviewer for this comment, and taking it into consideration, we have decided to re-phrase this paragraph. The new text reads: “Non-CGI promoters have lower overall levels of transcription compared to CGI promoters, and for this promoter class H3K115ac enrichment detected by ChIP is only really seen for the highest quartile of transcription (4SU) quartile of expression (Figure 1G). CGI promoters on the other hand, exhibit significant levels of detected H3K115ac even for the lowest quartile of expression. These results suggest a special link between CGI promoters and H3K115ac”.

      (4) The authors claim on page 7 that "on average, transcription increased from TSS that also gained H3K115ac but to a modest extent, compared with the more substantial loss of H3K115ac from downregulated TSS". However, both upregulated and downregulated are significant; the difference in magnitude could simply be due to more highly or more lowly transcribed locations, meaning that fold change could be more robustly detected. I caution the authors to substantiate claims like this rather than stating a correlation.

      We thank the reviewer for this comment, which relates to the data in Fig 2A. Fig. 2B shows that the association of H3K115ac loss with downregulation is statistically stronger than that of H3K115ac gain with upregulation, but only for CGI promoters. With regard to the text on the original pg 7 that is referred to, we have now reworded this to read “Average levels of transcription increased from TSS that also gained H3K115ac, and there was loss of H3K115ac from downregulated TSS (Figure 2A).”

      (5) For Figure 2C, the authors argue that H3K115ac correlate with bivalent locations. So this is all qualitative and aggregate localization; please quantitatively demonstrate this claim.

      Figure S2D provides statistics for this (observed/expected and Fisher’s exact test).
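
      As a rough illustration of the two statistics named in this response, the sketch below computes an observed/expected ratio and a one-sided Fisher's exact test p-value for a 2x2 overlap table. The function name, the table layout, and the counts are all hypothetical; the response does not say which variant of the test (one- or two-sided) the authors used.

```python
from math import comb


def fisher_enrichment(a, b, c, d):
    """Enrichment statistics for the 2x2 table [[a, b], [c, d]].

    a: in both sets (e.g. H3K115ac-marked AND bivalent)
    b: in the first set only, c: in the second set only, d: in neither.
    Returns (observed/expected ratio, one-sided p-value for enrichment).
    Assumes neither margin is empty, otherwise 'expected' would be zero.
    """
    row1 = a + b          # size of the first set
    col1 = a + c          # size of the second set
    n = a + b + c + d     # all promoters considered
    expected = row1 * col1 / n
    # Hypergeometric tail: probability of an overlap of at least 'a'.
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    p_value = tail / comb(n, col1)
    return a / expected, p_value


# Hypothetical counts, chosen only so the arithmetic is easy to follow.
ratio, p = fisher_enrichment(3, 1, 1, 3)
print(ratio, p)
```

      An observed/expected ratio above 1 means the overlap is larger than chance would predict given the sizes of the two sets, and the p-value says how surprising an overlap at least that large would be.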

      (6) The authors claim in Figure 2 that H3K115ac is dynamic during differentiation (title of Figure 2). However, there are locations that gain and lose, or maintain H3K115ac. In fact, the most discussed locations are H3K115ac with no change (2C); which means it is NOT dynamic during differentiation. So what is the message for the role during differentiation? From Supplemental Table 1, it appears there is a single ChIP experiment for H3K115ac in NPC, and it is a single read. So this is also a difficult claim with one replicate. Related to this, in S2A, the authors show K115ac where there is no change in transcription; so what is the role of H3K115ac at TSSs relevant to differentiation - it is at both locations changed and unchanged in transcription, but H3K115ac levels themselves do not change at these subsets. So, how is this dynamic? This is very confusing, and clearer analyses and descriptions are necessary to deconvolute these data.

      We apologise for the misleading title for Figure 2. This has now been amended to “Changes in H3K115ac during differentiation”. The message of this figure is that whilst changes in H3K115ac at TSS are small (panels A-C), at enhancers the changes are much more dramatic (panel D). The reviewer is incorrect about the number of replicates for NPCs – there are two biological replicates (see response to point 1b).

      (7) The authors go on to examine H3K115ac enrichment on fragile nucleosomes through sucrose gradient sedimentation. A control for H3K27ac or H3K122ac would be nice for comparison.

      We do not have the material available to perform these experiments.

      (8) When discussing Figures 3 and SF3, the authors mention performing a different MNase for a second ChIP. Showing the MNase distribution for both the more highly digested and the lowly digested would be nice. a) Related to the above, the authors show input in SF3E to argue that the difference in H3K115ac vs H3K27ac is not due to the library, but they do not show the MNase digestion patterns, which is more important for this argument.

      Input libraries (first two graphs of Fig S3E) are the MNase-digested chromatin. Comparison of nucleotide frequencies from millions of reads is a more robust method than the fragment length patterns.

      (9) The authors move on to examine H3K115ac at enhancers. Just out of curiosity, given what was found at promoters, is H3K115ac enriched at +CGI enhancers? And what is the correlation with enhancer transcription?

      This is an interesting point, but the number of enhancers associated with CGI is not very high and so we did not focus on this. We have not analysed a correlation with eRNAs in this paper.

      (10) The authors state on page 14 that the most frequent changes in H3K115ac during differentiation are at these enhancers. So do these changes connect with differentiation-specific genes, and/or genes that have altered transcription during differentiation? Just trying to understand the functional role.

      Given the challenges of connecting enhancers with target genes, we have not addressed this question quantitatively. However, we draw the reviewer’s attention to the Genome Browser shots in Figures 2D and S2C, which show clear gain of H3K115ac (and ATAC-seq peaks) at intra and intergenic regions close to genes whose transcription is activated during the differentiation to NPCs.

      (11) Related, at the end of page 14, the authors state that the changes in H3K115ac correlate with changes in ATAC-seq; I imagine this dynamic is not unique for H3K115ac and this is observed for other PTMs (H3K27ac), so assessing and clarifying this, to again get to the specific interest of H3K115ac, would be ideal.

      We have not claimed that chromatin accessibility is unique to H3K115ac. It is the location of H3K115ac which is found inside the ATAC-seq peak region while H3K27ac is found only upstream/downstream of the ATAC peak that is so striking. This is apparent in Fig 4C.

      (12) The authors examine levels of H3K115ac in H3.3 KO cell lines via western blot (Figure 4D), but no replicates and/or quantification are shown.

      We now provide a biological replicate for the Western blot (new Fig S4H) together with an image of the whole gel for the data in Fig 4D.

      (13) In Figure S4 and at the end of page 17, the authors are arguing that there is a link to pioneer TF complexes, based on Oct4 binding. First, while Oct4 has pioneering activity, not all Oct4 sites (or motifs) are pioneering; this has been established. So if you want to use Oct4, substratifying by pioneer vs no pioneer is necessary. Second, demonstrating this is unique to pioneer and not to non-pioneer TFs would be an important control.

      In response to the reviewer’s comment, we have removed the term “pioneer” from the manuscript.

      (14) Minor point: Figure 4 A and B, there are some formatting issues with the scale bars.

      We thank the reviewer for pointing this out, and the errors have been corrected in the revised figure.

      (15) Minor point is that it should be clear when single replicates of data are used and when PE/SR sequences are combined or which one is used in each analysis, as this was hard to discern when reading the paper and figure legends.

      We have clearly stated in the text that, after Figure 2, we repeated all experiments in paired-end mode. All processing steps are defined separately for single-end and paired-end datasets in the methods section. Details of biological replicates are provided in Sup. Table 1. These concerns are also addressed in our response to Reviewer’s public comment-1.

      (16) Minor point: it is surprising that different MNase and different units were used in the ChIP vs sucrose sedimentation. Could the authors clarify why?

      Chromatin preps for sucrose gradients were done on a much larger scale than for ChIP-seq and required different setups to obtain the right level of MNase digestion.

      (17) The authors note that fragile nucleosomes contain H2A.Z and H3.3, but they never perform an analysis of available data to demonstrate a correlation (or better a quantifiable correlation) between H3K115ac occupancy and these marks at the locations they identify H3K115ac.

      Since we have shown (Fig. 4) that depletion of H3.3 does not affect overall levels of H3K115ac, we do not think there is value in further quantitative correlative analyses of H3K115ac and variant histones.

      (18) Minor point: What is the overlap in peaks for H3K115ac, H3K122ac, and H3K27ac (Figure 1C)?

      Nearly all H3K115ac peaks overlap with H3K122ac and/or H3K27ac. Its most distinct properties are its association with CGI promoters, fragile nucleosomes and its unique localisation within the NDRs, three points that the manuscript is focussed on.

      Reviewer #3 (Recommendations for the authors):

      (1) The western blot results in Figure 4D probing for H3, H3.3, and H3K115ac use Ponceau S staining, presumably of an area of the membrane where histones might be expected to migrate, as a measure of loading. However, the Ponceau S bands appear uniformly weaker in the H3.3KO lanes, yet despite this, blotting with H3.3 antibody detects a band in H3.3 knockout ESCs, suggesting that the antibody does not have a high degree of specificity. Again, a blocking experiment with appropriate peptides would instill more confidence in the specificity of these reagents, and/or the authors could provide independent validation of the knockout model to differentiate between a partial knockout or antibody cross-reactivity (e.g., by Sanger sequencing).

      In a revised Fig. S4H we now show the whole gel corresponding to this blot but including co-staining with an antibody for H4 to provide a better loading control. We also provide a biological replicate of this Western blot in the lower panel of Fig. S4H.

      (2) The manuscript would benefit from in vitro follow-up and validation, but if the authors intend to keep the manuscript primarily in silico, I suggest dedicating a few lines in each section to explain the plots, their axes, and their purpose, as well as to assist with interpretation, rather than directly discussing the results. This would make the manuscript more accessible and understandable for a broader audience in the field of epigenetics.

      In the revised version, we have tried to improve the text to make the data more accessible to a broad audience.

    1. eLife Assessment

      This potentially important study explores the specificity of olfactory perceptual learning. In keeping with previous work, the authors found that learning to discriminate between two enantiomers does not generalize across the nostrils or to unrelated enantiomers, whereas learning to discriminate odor mixtures does generalize across the nostrils and to other odor mixtures, with this learning effect persisting over at least two weeks. While the evidence presented to support these findings is convincing, it remains unclear why the results differ for enantiomers and why training on odor mixtures generalizes to other odor mixtures.

      Discrimination of odor enantiomers ultimately relies on the enantioselectivity of olfactory receptors, whereas mixture discrimination likely depends on relative differences in perceived configural odor notes. These processes probably engage plasticity at different stages of the olfactory pathway. The revised Discussion (p.16-18) now elaborates on this distinction and the potential underlying mechanisms. Please also refer to our responses to Reviewer 1’s Point 1 and Reviewer 2’s Points 2 and 3 below.

      Reviewer #1 (Public Review):

      This study extends a previous study by the same group on the generalization of odor discrimination from one nostril to the other. In their earlier study, the group showed that learning to discriminate between two enantiomers does not generalize across nostrils. This was surprising given the Mainland & Sobel 2001 study that found that detecting androstenone in people who do not detect it can generalize across the two nostrils. In this study, they confirmed their previous results and reported that, unlike enantiomers, learning to discriminate odor mixtures generalizes across nostrils, generalizes to other odor mixtures, and is persistent over at least two weeks.

      This interesting and important result extends our knowledge of this phenomenon and will likely steer more research. It may also help develop new training protocols for people with impairments in their sense of smell.

      We thank the reviewer for the encouraging remarks.

      The main weakness of this study is its scope, as it does not provide substantial insight into why the results differ for enantiomers and why training on odor mixtures generalizes to other odor mixtures.

      We thank the reviewer for this insightful comment. While the present study does not directly identify the neural mechanisms underlying these differences, it provides behavioral constraints on where specificity and generalization may arise within the olfactory system. Further neuroimaging and neurophysiological work will be needed to fully elucidate the underlying mechanisms.

      Reviewer #2 (Public Review):

      The manuscript from Chang et al. taps into an important issue in olfactory perceptual plasticity, namely the generalization of perceptual learning effects by training using odors. They employed a discrimination training/learning task with either binary odor mixtures or odor enantiomers, and tested for post-training effects at several time intervals. Their results showed contrasting patterns of specificity (enantiomers) and transfer (odor mixtures), and the learning effect persisted at 2 weeks post-training. They demonstrated that the effect was independent of task difficulty, olfactory adaptation and gender.

      Overall this was a well-controlled study and shows novel results. The strength of the study includes the consideration of odor structure and perceptual (dis)similarity and the control training condition.

      We appreciate the reviewer’s positive assessment of our work.

      I have two minor issues that I hope the authors could address in the next version of the manuscript.

      (1) The authors used binary odor mixtures with a ratio of 7:9 or 9:11. Why were these ratios chosen and used for the experiment?

      This ratio was selected based on pilot testing and practical constraints. During piloting, we evaluated several mixing ratios to identify those that met two key criteria: (1) Baseline indiscriminability: Most participants were unable to reliably discriminate between the two binary mixtures in a:b and b:a ratios at baseline. (2) Trainability: With 1–5 weeks of training, participants could acquire the ability to discriminate between them.

      The a:b ratios of 7:9 and 9:11 were the ratios that met both criteria in our pilot testing, making them suitable for assessing training‑induced improvements in mixture discrimination. This clarification has been added to the revised Olfactory Stimuli subsection of the Materials and Methods (p.19-20 of the revised manuscript).

      (2) Over the course of training, has the valence of the odors (odor mixtures) changed? It would be helpful to include these results in the supplements. As the authors indicated in the discussion, the potential site underlying the transfer effect is the OFC, which has been found to represent odor valence previously (Anderson, Christoff et al. 2003). It would be nice to see the authors replicate the results with odor/odor mixture valence (change) controlled.

      Anderson, A. K., K. Christoff, I. Stappen, D. Panitz, D. G. Ghahremani, G. Glover, J. D. Gabrieli and N. Sobel (2003). "Dissociated neural representations of intensity and valence in human olfaction." Nat Neurosci 6(2): 196-202.

      Odor valence ratings were not collected in Experiments 1 and 2. However, we have since conducted a new experiment examining concentration discrimination learning (see our response to Reviewer 1, Point 1), using the constituents of the mixtures from Experiment 2 as stimuli (i.e., concentration pairs of acetophenone, 2-octanone, methyl salicylate, and isoamyl butyrate). In this new experiment (now incorporated as Experiment 3 in the revised manuscript), unilateral odor valence ratings were collected at baseline (Day 0) and at the post-training test and retests on Days N, N+1, N+3, N+7, and N+14.

      For all odor pairs (training and controls), there was no significant change in perceived valence from baseline to Day N, regardless of nostril (ps > 0.05 for the main effects of session and nostril, as well as their interaction; Figure S5D). Moreover, odor valence ratings remained stable across the five post training test sessions (ps ≥ 0.29 for the main and interaction effects involving session), showing the same pattern as at baseline (Figure S5D, F). Thus, training appeared to have no measurable influence on odor valence perception. These results have been incorporated into the revised manuscript on p.14-15.

    1. Author response:

      Reviewer #1 (Public Review):

      The authors tested the hypothesis that at high elevations avian eggs will be adapted to prevent desiccation that might arise from loss of water to surrounding drier air. They used a combination of gas diffusion experiments and scanning electron microscopy to examine water vapour conductance rates and eggshell structure, including thickness, pore size, and pore density among 197 bird species distributed along an elevational gradient in the Andes. While there was a correlation between water vapour conductance and elevation among species, a decrease in water vapour conductance with elevation was not associated with eggshell thickness, pore size, and pore density, suggesting the variation in the structure of the eggshells is unlikely to do with among species differences in water loss along elevational gradients. This study is very interesting and timely, especially with increasing water vapour pressure due to climate warming. It is a very well-written study and easy to read. However, I have some concerns about the conclusions drawn from the results.

      There are more than twice as many species in low and medium-elevation sites compared to high-elevation sites, so the amount of variation at low and medium elevations should be expected to be higher by default. The argument for a wider range of variation in low-elevation species would be stronger if the comparison were based on similar sample sizes. Moreover, the pattern clearly breaks down within families. Note also that for low and medium elevations there is no difference in the amount of variation in conductance residuals, possibly because the sample sizes are similar. The seemingly strong positive correlation between eggshell conductance and egg mass may be driven by the five high and two medium-elevation species with large eggs. There seem to be hardly any high-elevation species with egg mass greater than 12g, whereas egg mass in low-elevation species seems to reach as high as 80g (Figure 2a). Since larger eggs (and thus eggs of larger birds) lose more water compared to smaller eggs, the correlation between water vapour conductance and elevation may be more strongly associated with body size distribution along elevational gradients rather than egg structure and function.

      We thank the reviewer for this thoughtful observation. As noted in our response to comment 3, we recognize that the higher number of species at low and mid-elevations reflects the natural turnover in species richness along elevational gradients, and we are transparent about this caveat in our revised Discussion section. Nevertheless, to address this specific concern, we conducted additional analyses excluding the species with large eggs (i.e., egg mass >12g, which are only present at low and mid-elevations in our dataset). These analyses are now included in Supplementary Figure 1, and the main pattern of lower water vapor conductance at high elevations holds even when larger eggs are excluded.

      We agree that the well-known scaling relationship between egg mass and conductance (recognized since the 1970s) may partially explain the observed trends across the elevational gradient. Our aim was to explore whether the known relationship between egg size and conductance varies when incorporating environmental variables such as elevation, which brings with it changes in humidity and oxygen availability. While we acknowledge the possible confounding effect of body size distributions along the gradient, our results, even after controlling for egg size (residual analysis), still suggest a decrease in conductance at higher elevations, consistent with predictions based on environmental conditions.

      We have clarified these points in the revised Discussion, including the acknowledgment that disentangling the relative contributions of body size and elevation to conductance patterns remains challenging and warrants further study.
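
      The "residual analysis" mentioned in the response above can be illustrated with a minimal sketch. It assumes the residuals come from an ordinary least-squares fit of log conductance on log egg mass, which the response does not spell out. The function names, the toy data, and the 0.8 scaling exponent are all hypothetical and are not taken from the study.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def conductance_residuals(egg_mass_g, conductance):
    """Residuals of log10(conductance) after regressing on log10(egg mass)."""
    log_mass = [math.log10(m) for m in egg_mass_g]
    log_cond = [math.log10(g) for g in conductance]
    intercept, slope = fit_line(log_mass, log_cond)
    return [c - (intercept + slope * m) for m, c in zip(log_mass, log_cond)]


# Hypothetical toy data (NOT from the study): the same egg masses in two
# elevation bands, with conductance 20% lower in the "high" band.
egg_mass = [2.0, 4.0, 8.0, 16.0, 2.0, 4.0, 8.0, 16.0]
band = ["low"] * 4 + ["high"] * 4
scale = {"low": 1.0, "high": 0.8}
conductance = [scale[b] * m ** 0.8 for m, b in zip(egg_mass, band)]

residuals = conductance_residuals(egg_mass, conductance)

# Mean size-corrected conductance per elevation band.
mean_by_band = {}
for name in ("low", "high"):
    values = [r for r, b in zip(residuals, band) if b == name]
    mean_by_band[name] = sum(values) / len(values)

print(mean_by_band)
```

      In this toy case both bands share the same egg masses, so the residuals isolate the elevation offset cleanly. When egg size and elevation are themselves correlated, as the reviewer points out for the real dataset, the pooled fit absorbs part of the elevation effect and the two contributions are harder to separate.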

      Authors argue that the observed variation in the relationship between water vapour conductance and elevation among and within bird families suggests potential differences in the adaptive response to common selective pressures in terms of eggshell thickness and pore density, and size. The evidence for this is generally weak from the data analyses because the decrease in water vapour conductance with elevation was not consistent across taxonomic groups nor were differences associated with specific patterns in eggshell thickness and pore density, and size.

      We appreciate the reviewer’s comments on the observed variation in water vapor conductance across taxonomic groups. As mentioned in response to comment 7, we have removed the explicit analyses and figures showing within-family comparisons, as these were exploratory and not directly tied to a specific hypothesis. We have also toned down our speculations regarding the potential adaptive drivers of the observed variation. In the revised Discussion, we emphasize the need for further research to explore these patterns and acknowledge the limitations of our current dataset in making strong conclusions about the adaptive responses to selective pressures.

      It is not clear how the authors expected the relationship between water vapour conductance and elevation to differ among taxonomic groups and there was no attempt to explain the biological implication of these differences among taxonomic groups based on the specific traits of the species or their families. This missing piece of information is crucial to justify the argument that differences among taxonomic groups may be due to differences in adaptive response.

      We appreciate the reviewer’s point. To clarify, we were not expecting the relationship between water vapor conductance and elevation to differ among taxonomic groups. Rather, our primary hypothesis was that water vapor conductance would decrease with elevation due to the drier conditions in highland habitats, and we sought to link this pattern with structural characteristics of the eggshell. The suggestion of potential differences among taxonomic groups arose from the lack of a consistent pattern across families, which prompted us to consider possible adaptive variation. We now address this more clearly in the Discussion section, acknowledging the need for further exploration into the potential selective pressures driving this variation among taxonomic groups.

      Reviewer #2 (Public Review):

      This paper represents a strong advance for two main reasons. First, it provides evidence that egg physiology varies with elevation as predicted by the hypothesis that eggs are physiologically adapted to certain climatic conditions. This means egg physiological adaptation is a factor that could influence species' elevational ranges. Second, it is a proof-of-concept study that shows it is possible to measure eggshell physiology for a large number of species in the field in order to test hypotheses. As such, it should inspire many further tests that examine adaptation in egg physiology in the context of species' distributions along environmental gradients.

      There are two caveats that readers should be aware of. First, measuring these traits is difficult, and there remain questions about the efficacy of different methods. For example, the authors note that quantifying eggshell structures is very difficult, with several unresolved questions about their method of using scanning electron microscopy images to measure eggshell pores. Similarly, the authors mention that temperature variation may partially influence their main result that high-elevation eggs lose water at slower rates than low-elevation eggs (temperatures were colder for experiments at high elevations than for low elevations). Second, I regard the analyses of eggshell traits for specific families as exploratory. There are no a priori expectations for how different families might be expected to differ in their patterns. These analyses are fruitful in that they generate additional hypotheses that future work can test. However, it does mean that the statistical significance of eggshell trait relationships with elevation for specific families should be interpreted with caution.

      We thank Reviewer 2 for these insightful comments. As mentioned earlier, measuring these traits is indeed very challenging, and we acknowledge the limitations of our methods, particularly when it comes to using scanning electron microscopy to quantify eggshell structures. We are aware of the unresolved questions around these techniques, and we plan to continue refining these methods in future studies. Regarding the influence of temperature variation on water loss, we recognize that colder temperatures at high elevations may have influenced our results, and we address this potential confounding factor in the Discussion section, Line 257.

      We also agree with the reviewer’s point regarding the exploratory nature of the family-specific analyses. These analyses were not guided by specific hypotheses, other than the expectation of replicating the overall pattern, and we recognize that they should be interpreted with caution. They serve primarily to generate additional hypotheses for future studies. In the revised manuscript, we have toned down the emphasis on the statistical significance of eggshell trait relationships with elevation for specific families, and we emphasize the need for further research to confirm these patterns.

    1. Reviewer #3 (Public Review):

      Primary neutrophils are difficult to modify genetically, whereas the generation of knockout mice to study the role of specific proteins is time-consuming and expensive. CRISPR-Cas9 genetic modification of neutrophil progenitors in vitro offers a platform to study neutrophil biology. Hoxb8 cells are immortalized neutrophil progenitors that differentiate into neutrophils when cultured in the presence of G-CSF, and have been shown to recapitulate the stages of murine neutrophil differentiation. They have also been shown to be amenable to CRISPR-Cas9 genetic editing with successful deletion of key transcriptional regulators of neutrophil maturation and function. The authors of this manuscript offer an extension to this technique, by generating Hoxb8 cells that constitutively express Cas9. This may reduce the variation between the generated knock-out cells by avoiding the introduction of Cas9 in a plasmid every time together with a guide RNA.

      The first part of the manuscript is dedicated to the characterisation of Cas9+HoxB8 cells throughout their differentiation. Considering the existing body of literature on HoxB8 progenitors and their differentiation into neutrophils ex vivo, it does not significantly further our understanding of these cells, but rather provides a good validation of their Cas9+ modified version. Gene editing using Cas9+ Hoxb8 progenitors seems to be highly efficient, which is an important technical point; however, it is hard to assess the degree of improvement in efficiency compared to the published protocols with Cas9 delivery in a plasmid.

      As a test, the authors use Cas9+HoxB8 progenitors to generate a knockout of CEBPE, known for its important role in neutrophil development. They convincingly demonstrate its impact on HoxB8 cell differentiation, with in vivo reconstitution of wild-type and CEBPE-deficient HoxB8 progenitors into irradiated mice being especially elegant. However, the transfer into different recipient mice assumed no differences in the recipient environment, while immunophenotyping for mature neutrophils within the HoxB8 progenitor-derived cells did not account for possible differences in numbers of wt and CEBPE KO surviving cells, limiting the conclusions.

      Finally, the authors put the system to the test by screening the Brie gRNA library of ~80K mouse sgRNAs, targeting almost 20K genes with 4 gRNAs per gene, to identify genes that are needed for the differentiation of Cas9+ERHoxb8 progenitors into mature neutrophils. They identify a number of hits, amongst which the WASH complex and CEBPE are highlighted. A comparison of cell numbers prior to differentiation and at 4 days post differentiation indicates that they are indeed required for neutrophil survival. To validate the role of these hits in neutrophil maturation itself, as they stated in the aims, i.e. "to identify genes that modulate the differentiation of Cas9+ERHoxb8 progenitors into mature neutrophils", phenotypic, functional, and morphological characterization of these cell lines could have been appropriate.

      Overall, this study has the potential to improve on the established lentiviral CRISPR-Cas9 editing of Hoxb8 cells and be valuable for library-screening approaches for neutrophil modulators, which will benefit the community.

    1. Author response:

      The following is the authors’ response to the previous reviews

      Public Reviews:

      Reviewer #1 (Public review):

      The authors assess the effectiveness of electroporating mRNA into male germ cells to rescue the expression of proteins required for spermatogenesis progression in individuals where these proteins are mutated or depleted. To set up the methodology, they first evaluated the expression of reporter proteins in wild-type mice, which showed expression in germ cells for over two weeks. Then, they attempted to recover fertility in a model of late spermatogenesis arrest that produces immotile sperm. By electroporating the mutated protein, the authors recovered the motility of ~5% of the sperm; although the sperm regenerated was not able to produce offspring using IVF, the embryos reached the 2-cell state (in contrast to controls that did not progress past the zygote state).

      This is a comprehensive evaluation of the mRNA methodology with multiple strengths. First, the authors show that naked synthetic RNA, purchased from a commercial source or generated in the laboratory with simple methods, is enough to express exogenous proteins in testicular germ cells. The authors compared RNA to DNA electroporation and found that germ cells are efficiently electroporated with RNA, but not DNA. The differences between these constructs were evaluated using in vivo imaging to track the reporter signal in individual animals through time. To understand how the reporter proteins affect the results of the experiments, the authors used different reporters: two fluorescent (eGFP and mCherry) and one bioluminescent (Luciferase). Although they observed differences among reporters, in every case expression lasted for at least two weeks. The authors used a relevant system to study the therapeutic potential of RNA electroporation. The ARMC2-deficient animals have an impaired sperm motility phenotype that affects only the later stages of spermatogenesis. The authors showed that sperm motility was recovered to ~5%, which is remarkable given the small fraction of germ cells electroporated with RNA with the current protocol. The sperm motility parameters were thoroughly assessed by CASA. The 3D reconstruction of an electroporated testis using state-of-the-art methods to show the electroporated regions is compelling.

      The main weakness of the manuscript is that although the authors manage to recover motility in a small fraction of the sperm population, it is unclear whether the increase in sperm quality is substantial enough to improve assisted reproduction outcomes. The authors found that the rescued sperm could be used to obtain 2-cell embryos via IVF, but no evidence for more advanced stages of embryo differentiation was provided. The motile rescued sperm was also successfully used to generate blastocysts by ICSI, but the statistical significance of the rate of blastocyst production compared to non-rescued sperm remains unclear. The title is thus an overstatement since fertility was never restored for IVF, and the mutant sperm was already able to produce blastocysts without the electroporation intervention.

      Overall, the authors clearly show that electroporating mRNA can improve spermatogenesis as demonstrated by the generation of motile sperm in the ARMC2 KO mouse model.

      We thank the reviewer for this thoughtful and constructive comment. We agree that our study demonstrates a partial functional recovery of spermatogenesis rather than a complete restoration of fertility. Our main objective was to establish and validate a proof-of-concept approach showing that mRNA electroporation can rescue the expression of a missing or mutated protein in post-meiotic germ cells and result in the production of motile sperm.

      To address the reviewer’s concern, we have revised the title and discussion to more accurately reflect the scope of our findings. The new title reads:

      “Sperm motility in mice with oligo-astheno-teratozoospermia restored by in vivo injection and electroporation of naked mRNA”

      In the manuscript, we now emphasize that while motility recovery was significant, complete fertility restoration was not achieved. We have also clarified that:

      The 5% recovery in motile sperm represents a substantial improvement considering the small population of germ cells reached by the current electroporation method.

      The 2-cell embryo formation observed after IVF serves as a strong indication of partial functional recovery.

      Finally, we now explicitly state in the Discussion that this approach should be considered a therapeutic proof-of-concept, demonstrating feasibility and potential, rather than a fully curative intervention.

      Reviewer #2 (Public review):

      The authors inject, into the rete testes, mRNA and plasmids encoding mRNAs for GFP and then ARMC2 (into infertile Armc2 KO mice) in a gene therapy approach to express exogenous proteins in male germ cells. They do show GFP epifluorescence and ARMC2 protein in KO tissues, although the evidence presented is weak. Overall, the data do not necessarily make sense given the biology of spermatogenesis and more rigorous testing of this model is required to fully support the conclusions, that gene therapy can be used to rescue male infertility.

      In this revision, the authors attempt to respond to the critiques from the first round of reviews. While they did address many of the minor concerns, there are still a number to be addressed. With that said, the data still do not support the conclusions of the manuscript.

      We thank the reviewer for their careful and detailed assessment of our manuscript. We appreciate the concerns raised regarding mRNA stability, GFP localization, and the interpretation of spermatogenesis stages, and we have addressed these points in the manuscript and in the responses below.

      (1) The authors have not satisfactorily provided an explanation for how a naked mRNA can persist and direct expression of GFP or luciferase for ~3 weeks. The most stable mRNAs in mammalian cells have half-lives of ~24-60 hours. The stability of the injected mRNAs should be evaluated and reported using cell lines. GFP protein's half-life is ~26 hours, and luciferase protein's half-life is ~2 hours.

      We thank the reviewer for this important comment. The stability of mRNA-GFP was assessed by RT-QPCR in HEK cells and seminiferous tubule cells (Fig. 5). mRNA-GFP was detected for up to 60 hours in HEK cells and for up to two weeks in seminiferous tubule cells (Fig. 5A). Together, these results suggest that the long-lasting fluorescence observed in our experiments reflects a combination of transcript stability, efficient translation within germ cells and the slow protein turnover that is typical of the spermatogenic lineage.
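
      The arithmetic behind the reviewer's point (1) can be sketched with a simple first-order decay calculation. This is an illustration under stated assumptions, not a model of the experiment: it assumes a single dose, no new synthesis, and pure exponential decay. The half-lives are the approximate values quoted in the reviewer's comment, and the function names are hypothetical.

```python
import math


def fraction_remaining(hours, half_life_hours):
    """Fraction of an initial pool left after first-order decay, no new synthesis."""
    return 0.5 ** (hours / half_life_hours)


def half_life_needed(hours, fraction):
    """Half-life (hours) required for `fraction` of the pool to remain after `hours`."""
    return hours * math.log(2) / math.log(1 / fraction)


three_weeks = 21 * 24  # 504 hours

# Approximate half-lives as quoted in the reviewer's comment (assumed values).
pools = [
    ("most stable mRNAs, 60 h", 60),
    ("GFP protein, 26 h", 26),
    ("luciferase protein, 2 h", 2),
]

for label, t_half in pools:
    left = fraction_remaining(three_weeks, t_half)
    print(f"{label}: {left:.2e} of the initial pool left after 3 weeks")

needed = half_life_needed(three_weeks, 0.01)
print(f"half-life needed for 1% to remain at 3 weeks: {needed:.1f} h")
```

      Three weeks is 8.4 half-lives even at the 60-hour upper bound, so under these assumptions almost none of a single dose would remain. The authors' answer is that these assumptions may not hold in germ cells, where they report transcript detection for up to two weeks and slow protein turnover.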

      (2) There is no convincing data shown in Figs. 1-8 that the GFP is even expressed in germ cells, which is obviously a prerequisite for the Armc2 KO rescue experiment shown in the later figures! In fact, to this reviewer the GFP appears to be in Sertoli cell cytoplasm, which spans the epithelium and surrounds germ cells - thus, it can be oft-confused with germ cells. In addition, if it is in germ cells, then the authors should be able to show, on subsequent days, that it is present in clones of germ cells that are maturing. Due to intracellular bridges, a molecule like GFP has been shown to diffuse readily and rapidly (in a matter of minutes) between adjacent germ cells. To clarify, the authors must generate single cell suspensions and immunostain for GFP using any of a number of excellent commercially-available antibodies to verify it is present in germ cells. It should also be present in sperm, if it is indeed in the germline.

      We thank the reviewer for this insightful comment. To directly address the concern, we performed additional experiments to assess GFP expression in germ cells following in vivo mRNA delivery. GFP-encoding mRNA was injected and electroporated into the testes on day 0. On day 1, testes were collected, enzymatically dissociated, and the resulting seminiferous tubule cell suspensions were cultured for 12 hours. Live cells were then analyzed by fluorescence microscopy (Fig. 10).

      We observed GFP expression in various germ cell types, including pachytene spermatocytes (53.4%) (Fig 10 A-), round spermatids (25%) (Fig 10B-E) and elongated spermatids (11.4%) (Fig 10 C-E). The identification of these cell types was based on DAPI nuclear staining patterns, cell size (Fig 10F), non-adherent characteristics, and the use of an enzymatic dissociation protocol.

      Fluorescence imaging revealed strong cytoplasmic GFP signals in each of these populations, confirming efficient transfection and translation of the delivered mRNA. These results demonstrate that the in vivo injection and electroporation protocol enables effective mRNA transfection across multiple stages of spermatogenesis. These results confirm that the injected mRNA is efficiently translated in germ cells at various stages of spermatogenesis. Together, these data validate the germ cell-specific nature of the GFP signal, supporting the Armc2 KO rescue experiments.

      As mentioned previously, we assessed the stability of mRNA-GFP using RT-QPCR in HEK cells and seminiferous tubule cells (see Fig. 5). mRNA-GFP was detected for up to 60 hours in HEK cells and for up to two weeks in seminiferous tubule cells. Together, these results suggest that the long-lasting fluorescence observed in our experiments reflects a combination of transcript stability and local translation within germ cells, as well as the slow protein turnover typical of the spermatogenic lineage.

      Other comments:

      70-1 This is an incorrect interpretation of the findings from Ref 5 - that review stated there were ~2,000 testis-enriched genes, but that does not mean "the whole process involves around two thousand of genes"

      We thank the reviewer for this helpful comment. We agree that our previous phrasing was imprecise. We have revised the sentence to clarify that approximately 2,000 genes show testis-enriched expression, rather than implying that the entire spermatogenic process is limited to these genes. The corrected sentence now reads:

      “Spermatogenesis involves the coordinated expression of a large number of genes, with approximately 2,000 showing testis-enriched expression, about 60% of which are expressed exclusively in the testes”

      74 would specify 'male':

      We have now specified it as you suggested.

      79-84 Are the concerns with ICSI due to the procedure itself, or the fact that it's often used when there is likely to be a genetic issue with the male whose sperm was used? This should be clarified if possible, using references from the literature, as this reviewer imagines this could be a rather contentious issue with clinicians who routinely use this procedure, even in cases where IVF would very likely have worked:

      We thank the reviewer for this important comment. Concerns about ICSI outcomes indeed reflect two partly overlapping causes: the procedure itself (direct sperm injection and associated laboratory manipulations) and the clinical/genetic background of couples undergoing ICSI (especially men with severe male-factor infertility). Large reviews and meta-analyses report a small increase in some perinatal and congenital risks after ART/ICSI, but these studies conclude that it is difficult to fully disentangle procedural effects from parental factors. Importantly, genetic or epigenetic abnormalities in the male (which motivate use of ICSI) likely contribute to adverse outcomes in offspring, while some studies also suggest that ICSI-specific manipulations may alter epigenetic marks in embryos. For these reasons, professional bodies recommend reserving ICSI for appropriate male-factor indications rather than using it as routine insemination for non-male-factor cases.

      We have revised the text accordingly to clarify this distinction:

      “ICSI can efficiently overcome the problems faced. Nevertheless, concerns persist regarding the potential risks associated with this technique, including blastogenesis defect, cardiovascular defect, gastrointestinal defect, musculoskeletal defect, orofacial defect, leukemia, central nervous system tumors, and solid tumors [1-4]. Statistical analyses of birth records have demonstrated an elevated risk of birth defects, with a 30-40% increased likelihood in cases involving ICSI [1-4], and a prevalence of birth defects between 1% and 4% [3]. It is important to note, however, that the origin of these risks remains debated. Several large epidemiological and mechanistic studies indicate that both the procedure itself (direct microinjection and in vitro manipulation) and the underlying genetic or epigenetic abnormalities often present in men requiring ICSI contribute to the observed outcomes [1, 3] [5, 6]. To overcome these drawbacks, a number of experimental strategies have been proposed to bypass ARTs and restore spermatogenesis and fertility, including gene therapy [7-10].”

      199 Codon optimization improvement of mRNA stability needs a reference;

      We have added the references accordingly: [11-15]

      In one study using yeast transcripts, optimization improved RNA stability on the order of minutes (e.g., from ~5 minutes to ~17 minutes); is there some evidence that it could be increased dramatically to days or weeks?

      We agree with the reviewer that codon optimization can enhance mRNA stability, but available evidence indicates that this effect is moderate. In Saccharomyces cerevisiae, Presnyak et al. (2015) [16] showed that codon optimization increased mRNA half-life from approximately 5 minutes to ~17 minutes, representing a several-fold improvement rather than a shift to days or weeks. Similar codon-dependent stabilization has been observed in mammalian systems, where transcripts enriched in optimal codons exhibit longer half-lives and enhanced translation efficiency ([11]; [17]). However, these studies consistently report effects on the scale of minutes to hours. In mammalian cells, the prolonged stability of therapeutic or vaccine mRNAs, lasting for days, is primarily achieved through additional features such as optimized untranslated regions, chemical nucleotide modifications (e.g., N¹-methylpseudouridine), and protective delivery systems, rather than codon usage alone ([18]; [19]).

      Other molecular optimizations that improve in vivo mRNA stability and translation include a poly(A) tail, which binds poly(A)-binding proteins to protect the transcript from 3′ exonuclease degradation and promotes ribosome recycling, and a CleanCap structure at the 5′ end, which mimics the natural Cap 1 configuration, protects against 5′ exonuclease attack, and enhances translational initiation [11-15]. Together, these modifications act synergistically to stabilize the transcript and support efficient translation.

      472-3 The reported half-life of EGFP is ~36 hours - so, if the mRNA is unstable (and not measured, but certainly could be estimated by qRT-PCR detection of the transcript on subsequent days after injection) and EGFP is comparatively more stable (but still hours), how does EGFP persist for 21 days after injection of naked mRNA??

      We thank the reviewer for this important comment. The stability of mRNA-GFP was assessed by RT-QPCR in HEK cells and seminiferous tubule cells (Fig. 5). mRNA-GFP was detected for up to 60 hours in HEK cells and for up to two weeks in seminiferous tubule cells (Fig. 5). Together, these results suggest that the long-lasting fluorescence observed in our experiments reflects a combination of transcript stability, efficient translation within germ cells and the slow protein turnover that is typical of the spermatogenic lineage.

      Curious why the authors were unable to get anti-GFP to work in immunostaining?

      We appreciate the reviewer’s question. We attempted to detect GFP using several commercially available anti-GFP antibodies under various standard immunostaining conditions. However, in our hands, these antibodies consistently produced either no signal or high background staining, making the results unreliable. We therefore relied on direct detection of GFP fluorescence, which provides a more accurate and specific readout of protein expression in our system.

      In Fig. 3-4, the GFP signals are unremarkable, in that they cannot be fairly attributed to any structure or cell type - they just look like blobs; and why, in Fig. 4D-E, does the GFP signal appear stronger at 21 days than 15 days? And why is it completely gone by 28 days? This data is unconvincing.

      We would like to thank the reviewer for their comments. Figures 3–4 provide a global overview of GFP expression on the surface of the testis. The entire testis was imaged using an inverted epifluorescence microscope, and the GFP signal represents a composite of multiple seminiferous tubules across the tissue surface. Due to this whole-organ imaging approach, it is not possible to resolve individual structures such as the basement membrane or lumen, which is why the signals may appear as diffuse “blobs.”

      Regarding the time-course in Figure 4D–E, the apparent increase in GFP signal at 21 days compared with 15 days likely reflects accumulation and translation of the delivered mRNA in germ cells over time, whereas the absence of signal at 28 days corresponds to the natural turnover and degradation of GFP protein and mRNA in the tissue. We hope this explanation clarifies the observed patterns of fluorescence.

      If the authors did a single cell suspension, what types or percentage of cells would be GFP+? Since germ cells are not adherent in culture, a simple experiment could be done whereby a single cell suspension could be made, cultured for 4-6 hours, and non-adherent cells "shaken off" and imaged vs adherent cells. Cells could also be fixed and immunostained for GFP, which has worked in many other labs using anti-GFP.

      We thank the reviewer for this insightful comment. To directly address the concern, we performed additional experiments to assess GFP expression in germ cells following in vivo mRNA delivery. GFP-encoding mRNA was injected and electroporated into the testes on day 0. On day 1, testes were collected, enzymatically dissociated, and the resulting seminiferous tubule cell suspensions were cultured for 12 hours. Live cells were then analyzed by fluorescence microscopy (Fig. 10).

      We observed GFP expression in various germ cell types, including pachytene spermatocytes (53.4%) (Fig. 10 A-), round spermatids (25%) (Fig. 10 B-E), and elongated spermatids (11.4%) (Fig. 10 C-E). The identification of these cell types was based on DAPI nuclear staining patterns, cell size (Fig. 10 F), non-adherent characteristics, and the use of an enzymatic dissociation protocol.

      Fluorescence imaging revealed strong cytoplasmic GFP signals in each of these populations, confirming efficient transfection and translation of the delivered mRNA. These results demonstrate that the in vivo injection and electroporation protocol enables effective mRNA transfection across multiple stages of spermatogenesis.

      Together, these data validate the germ cell-specific nature of the GFP signal, supporting the Armc2 KO rescue experiments.

      As mentioned previously, we assessed the stability of mRNA-GFP using RT-qPCR in HEK cells and seminiferous tubule cells (see Fig. 5). mRNA-GFP was detected for up to 60 hours in HEK cells and for up to two weeks in seminiferous tubule cells. Together, these results suggest that the long-lasting fluorescence observed in our experiments reflects a combination of transcript stability and local translation within germ cells, as well as the slow protein turnover typical of the spermatogenic lineage.

      In Fig. 5, what is the half-life of luciferase? From this reviewer's search of the literature, it appears to be ~2-3 h in mammalian cells. With this said, how do the authors envision detectable protein for up to 20 days from a naked mRNA? The stability of the injected mRNAs should be shown in a mammalian cell line - perhaps this mRNA has an incredibly long half-life, which might help explain these results. However, even the most stable endogenous mRNAs (e.g., globin) are ~24-60 hrs.

      We did not directly assess the stability of luciferase mRNA, but we evaluated the persistence of GFP mRNA, which was synthesized and optimized using the same sequence optimization and chemical modification strategy as the luciferase mRNA. In these experiments, mRNA-GFP was detectable in seminiferous tubule cells for up to two weeks after injection. We therefore expect a similar stability profile for the luciferase mRNA. These findings suggest that the prolonged fluorescence or bioluminescence observed in our study likely reflects a combination of factors, including enhanced transcript stability, local translation within germ cells, and the inherently slow protein turnover characteristic of the spermatogenic lineage.

      527-8 The Sertoli cell cytoplasm is not just present along the basement membrane as stated, but also projects all the way to the lumina:

      We clarified the sentence: "Sertoli cells have an oval to elongated nucleus and the cytoplasm presents a complex shape (“tombstone” pattern) along the basement membrane, with long projections that extend toward the lumen."

      529-30 This is incorrect, as round spermatids are never "localized between the spermatocytes and elongated spermatids" - if elongated spermatids are present, rounds are not - they are never coincident in the same testis section:

      We thank the reviewer for this important comment and for drawing attention to the detailed staging of the seminiferous epithelium. We agree that the spatial organization of germ cells varies depending on the stage of spermatogenesis. While round spermatids (steps 1–8) and elongated spermatids (steps 9–16) are typically associated with distinct stages, transitional stages of the seminiferous epithelium can contain both cell types in close proximity, reflecting the continuous and overlapping nature of spermatid differentiation (Meistrich, 2013, Methods Mol. Biol. 927:299–307). We have revised the text to clarify this point, indicating that the relative positioning of germ cell types depends on the stage of the seminiferous cycle rather than implying their constant coexistence within the same tubule section.

      Fig. 7. To this reviewer, all of the GFP appears to be in Sertoli cell cytoplasm. In Figs 1-8 there is no convincing evidence presented that GFP is expressed in germ cells! In fact, it appears to be in Sertoli cells.

      We thank the reviewer for their observation. As previously mentioned, we have included an additional experiment specifically demonstrating GFP expression in germ cells (Fig. 10). These new data provide clear evidence that the GFP signal is not restricted to Sertoli cells and confirm successful uptake and translation of GFP mRNA in germ cells.

      Fig. 9 - alpha-tubuline?

      We corrected the figure.

      Fig. 11 - how was sperm morphology/motility not rescued on "days 3, 6, 10, 15, or 28 after surgery", but it was in some at 21 and 35? How does this make sense, given the known kinetics of male germ cell development??

      We note the reviewer’s concern regarding the timing of motile sperm appearance. Variability among treated mice is expected because transfection efficiency differed between spermatogonia and spermatids. Full spermiogenesis requires ~15 days, and epididymal transit adds ~8 days, consistent with motile sperm appearing around 21 days post-injection in some mice.

      And at least one of the sperm in the KO in Fig. B5 looks relatively normal, and the flagellum may be out-of-focus in the image? With only a few sperm for reviewers to see, how can we know these represent the population?

      We thank the reviewer for their comment. Upon closer examination of the image, the flagellum of the spermatozoon in question is clearly abnormally short and this is not due to being out of focus. Furthermore, the supplementary figure shows that the KO consistently lacks normal spermatozoa. These defects are consistent with previous findings from our laboratory [22], confirming that the observed phenotype is representative of the KO population rather than an isolated occurrence.

      Reviewer #3 (Public review):

      Summary:

      The authors used a novel technique to treat male infertility. In a proof-of-concept study, the authors were able to rescue the phenotype of a knockout mouse model with immotile sperm using this technique. This could also be a promising treatment option for infertile men.

      Strengths:

      In their proof-of-concept study, the authors were able to show that the novel technique rescues the infertility phenotype of Armc2 knockout spermatozoa. In the current version of the manuscript, the authors have added data on in vitro fertilisation experiments with Armc2 mRNA-rescued sperm. The authors show that Armc2 mRNA-rescued sperm can successfully fertilise oocytes that develop to the blastocyst stage. This adds another level of reliability to the data.

      Weaknesses:

      Some minor weaknesses identified in my previous report have already been fixed. The technique is new and may not yet be fully established for all issues. Nevertheless, the data presented in this manuscript opens the way for several approaches to immotile spermatozoa to ensure successful fertilisation of oocytes and subsequent appropriate embryo development.

      [Editors' note: The images in Figure 12 do not support the authors' interpretation that 2-cell embryos resulted from in vitro fertilization. Instead, the cells shown appear to be fragmented, unfertilized eggs. Combined with the lack of further development, it seems highly unlikely that fertilization was successful.]

      We thank the reviewer for their careful evaluation and constructive feedback. We appreciate the acknowledgment of the strengths of our study, particularly the proof-of-concept demonstration that Armc2-mRNA electroporation can rescue sperm motility in Armc2 knockout mice.

      Regarding the concern raised by the editor about Figure 12, we would like to clarify two technical points. First, the IVF experiments were performed using CD1 oocytes and B6D2 sperm. Due to strain-specific incompatibilities, fertilization of CD1 oocytes by B6D2 sperm typically does not progress beyond the two-cell stage (Fernández-González et al., 2008, Biology of Reproduction [23]). Therefore, the observation of two-cell embryos represents the expected limit of development in this cross and serves as a strong indication of successful fertilization, even though further development is not possible. Second, the oocytes used in these experiments were treated with collagenase to remove cumulus cells. This enzymatic treatment can sometimes affect the morphology of early embryos, which may explain why the two-cell embryos in Figure 12 appear less regular or somewhat fragmented. We also included a control showing embryos from B6D2 sperm with the same collagenase treatment on CD1 oocytes, which yielded similar appearances (Fig. 14 A4).

      To provide additional functional evidence, we complemented the IVF experiments with ICSI using rescued Armc2<sup>–/–</sup> sperm and B6D2 oocytes, which allowed embryos to develop to the blastocyst stage. In these experiments, 25% of injected oocytes reached the blastocyst stage with rescued sperm compared to 13% for untreated Armc2<sup>–/–</sup> sperm (Supplementary Fig. 9). These results support the functional competence of rescued sperm and demonstrate partial recovery of fertilization ability following Armc2 mRNA electroporation.

      We have clarified these points in the revised Results and Discussion sections to emphasize that the IVF data indicate partial functional recovery of rescued sperm rather than full fertility restoration. These clarifications address the editor’s concern while accurately representing the technical limitations of the strain combination used in our experiments.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Fig 12 and Supplementary Fig 9 are mislabeled in the text and rebuttal.

      We thank the reviewer for pointing this out. We have carefully checked the manuscript and the rebuttal text, and corrected all references to Figure 12 and Supplementary Figure 9 to ensure they are accurately labeled and consistent throughout the text.

      Reviewer #3 (Recommendations for the authors):

      The contribution of the newly added authors should be clarified. All other aspects of inadequacy raised in my previous report have been adequately addressed.

      No further comments.

      We thank the reviewer for noting this. The contributions of the newly added authors have been clarified in the Author Contributions section of the revised manuscript. All other points raised in the previous review have been addressed as indicated.

      References

      (1) Hansen, M., et al., Assisted reproductive technologies and the risk of birth defects--a systematic review. Hum Reprod, 2005. 20(2): p. 328-38.

      (2) Halliday, J.L., et al., Increased risk of blastogenesis birth defects, arising in the first 4 weeks of pregnancy, after assisted reproductive technologies. Hum Reprod, 2010. 25(1): p. 59-65.

      (3) Davies, M.J., et al., Reproductive technologies and the risk of birth defects. N Engl J Med, 2012. 366(19): p. 1803-13.

      (4) Kurinczuk, J.J., M. Hansen, and C. Bower, The risk of birth defects in children born after assisted reproductive technologies. Curr Opin Obstet Gynecol, 2004. 16(3): p. 201-9.

      (5) Graham, M.E., et al., Assisted reproductive technology: Short- and long-term outcomes. Dev Med Child Neurol, 2023. 65(1): p. 38-49.

      (6) Palermo, G.D., et al., Intracytoplasmic sperm injection: state of the art in humans. Reproduction, 2017. 154(6): p. F93-F110.

      (7) Usmani, A., et al., A non-surgical approach for male germ cell mediated gene transmission through transgenesis. Sci Rep, 2013. 3: p. 3430.

      (8) Raina, A., et al., Testis mediated gene transfer: in vitro transfection in goat testis by electroporation. Gene, 2015. 554(1): p. 96-100.

      (9) Michaelis, M., A. Sobczak, and J.M. Weitzel, In vivo microinjection and electroporation of mouse testis. J Vis Exp, 2014(90).

      (10) Wang, L., et al., Testis electroporation coupled with autophagy inhibitor to treat non-obstructive azoospermia. Mol Ther Nucleic Acids, 2022. 30: p. 451-464.

      (11) Wu, Q., et al., Translation affects mRNA stability in a codon-dependent manner in human cells. eLife, 2019. 8: p. e45396.

      (12) Gallie, D.R., The cap and poly(A) tail function synergistically to regulate mRNA translational efficiency. Genes & Development, 1991. 5(11): p. 2108-2116.

      (13) Henderson, J.M., et al., Cap 1 messenger RNA synthesis with co-transcriptional CleanCap® analog improves protein expression in mammalian cells. Nucleic Acids Research, 2021. 49(8): p. e42.

      (14) Stepinski, J., et al., Synthesis and properties of mRNAs containing novel “anti-reverse” cap analogs. RNA, 2001. 7(10): p. 1486-1495.

      (15) Sachs, A.B., P. Sarnow, and M.W. Hentze, Starting at the beginning, middle, and end: translation initiation in eukaryotes. Cell, 1997. 89(6): p. 831-838.

      (16) Presnyak, V., et al., Codon optimality is a major determinant of mRNA stability. Cell, 2015. 160(6): p. 1111-24.

      (17) Cao, D., et al., Unlock the sustained therapeutic efficacy of mRNA. J Control Release, 2025. 383: p. 113837.

      (18) Karikó, K., et al., Incorporation of pseudouridine into mRNA yields superior nonimmunogenic vector with increased translational capacity and biological stability. Mol Ther, 2008. 16(11): p. 1833-40.

      (19) Pardi, N., et al., mRNA vaccines — a new era in vaccinology. Nature Reviews Drug Discovery, 2018. 17(4): p. 261-279.

      (20) Meistrich, M.L. and R.A. Hess, Assessment of Spermatogenesis Through Staging of Seminiferous Tubules, in Spermatogenesis: Methods and Protocols, D.T. Carrell and K.I. Aston, Editors. 2013, Humana Press: Totowa, NJ. p. 299-307.

      (21) Mäkelä, J.-A., et al., JoVE, 2020(164): p. e61800.

      (22) Coutton, C., et al., Bi-allelic Mutations in ARMC2 Lead to Severe Astheno-Teratozoospermia Due to Sperm Flagellum Malformations in Humans and Mice. Am J Hum Genet, 2019. 104(2): p. 331-340.

      (23) Fernández-González, R., et al., Long-term effects of mouse intracytoplasmic sperm injection with DNA-fragmented sperm on health and behavior of adult offspring. Biol Reprod, 2008. 78(4): p. 761-72.

    1. Chapter Outline

      CAUTION: If you are arriving here to complete your annotations for readings after Week 3 of HIST262, please look in your email for a message from Dr. Block on Feb. 6 titled "Hypothes.is problems -- ready to be solved". In order to link your account with Canvas and receive grades, you MUST complete the process to fix your login credentials. If you need help, stop by the OIT help desk, or contact Hypothesis support at support@hypothes.is for assistance.

    1. Reviewer #1 (Public review):

      Summary:

      In their study the authors investigated the F. graminearum homologue of the Drosophila Misato-Like Protein DML1 for a function in secondary metabolism and sensitivity to fungicides.

      Strengths:

      Generally, the topic of the study is interesting and timely and the manuscript is well written, albeit in some cases details on methods or controls are missing.

      Weaknesses:

      However, a major problem I see is with the core result of the study, the decrease of the DON content associated with deletion of FgDML1: Although some growth data are shown in figure 6 - indicating a severe growth defect - the DON production presented in figure 3 is not related to biomass. Also, the method and conditions for measuring DON are not described. Consequently, it could well be concluded that the decreased amount of DON detected is simply due to a decreased growth and specific DON production of the mutant remains more or less the same.

      To alleviate this concern, it is crucial to show the details on the DON measurement and growth conditions and to relate the biomass formation on the same conditions to the DON amount detected. Only then a conclusion as to an altered production in the mutant strains can be drawn.

      Comments to the revised manuscript:

      The authors carefully revised the manuscript and provided explanations for methods in several cases. However, there are still some problems - probably due to misunderstanding - that need revision.

      (1) A major problem of the first version of the manuscript was the lack of appropriate description of biomass analysis and the consideration of the respective results for evaluation of production of DON and other metabolites. Although the authors provide some explanation in the response to reviews, I could not find a corresponding explanation or description in the manuscript. It is not sufficient to explain the problem to me, but a detailed explanation and description of the method has to be provided in the manuscript along with the definition of one "unit of mycelium". It is still not entirely clear to me what such a "unit of mycelium" is.

      Please clarify this and any other uncertainties that were commented on by me and other reviewers in the manuscript, not only in the response to reviews. Also adjust the reference list accordingly.

      (2) Another problem was that the authors considered FgDML1 a regulator of DON production. As mentioned by me and reviewer 3, FgDML1 is crucial to numerous functions in F. graminearum and its lack causes a plethora of problems for fungal physiology. Hence, although it is clear that the lack of FgDML1 causes alterations in DON production, it is not appropriate to designate this factor as a "regulator". It seems to me that the authors are afraid that if FgDML1 were not a "regulator", this would decrease the value of their study, which is not the case. This is a matter of correct wording. Therefore, please revise the wording accordingly, starting with the title:

      ...FgDML1 impacts DON toxin biosynthesis...

      Moreover, the manuscript would benefit from a more detailed description of the whole cascade leading from FgDML1 to DON biosynthesis and to production of the other metabolites that change upon deletion. Such an explanation can help the reader grasp the relevance of FgDML1 for regulatory processes, as well as distinguish more general from more specific effects.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      Summary:

      In their study, the authors investigated the F. graminearum homologue of the Drosophila Misato-Like Protein DML1 for a function in secondary metabolism and sensitivity to fungicides.

      Strengths:

      Generally, the topic of the study is interesting and timely, and the manuscript is well written, albeit in some cases, details on methods or controls are missing.

      Weaknesses:

      However, a major problem I see is with the core result of the study, the decrease in the DON content associated with the deletion of FgDML1. Although some growth data are shown in Figure 6, indicating a severe growth defect, the DON production presented in Figure 3 is not related to biomass. Also, the method and conditions for measuring DON are not described. Consequently, it could well be concluded that the decreased amount of DON detected is simply due to decreased growth, and the specific DON production of the mutant remains more or less the same.

      To alleviate this concern, it is crucial to show the details on the DON measurement and growth conditions and to relate the biomass formation under the same conditions to the DON amount detected. Only then can a conclusion as to an altered production in the mutant strains be drawn.

      We greatly appreciate the time you spent on our paper and your helpful suggestions. We have done our best to revise the manuscript accordingly. Our point-by-point responses to the reviewer's comments are listed below. Our method for DON quantification was based on the amount per unit of mycelium. After obtaining the absorbance value from the ELISA reaction, the concentration of DON was calculated according to a standard curve and a formula, then divided by the dry weight of the mycelium to obtain the DON content per unit of mycelium, with the results finally expressed in µg/g.
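
      The normalization described in this response can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' actual calculation: the response does not give the standard curve or the formula, so the piecewise-linear interpolation, the standard values, the extract volume, and the function names below are all assumptions.

```python
from bisect import bisect_left

# Hypothetical (absorbance, DON concentration in µg/mL) standards.
# The values are invented for illustration; they are not from the manuscript.
HYPOTHETICAL_STANDARDS = [(1.0, 0.0), (0.5, 1.0), (0.2, 2.0)]


def interpolate_concentration(absorbance, standards):
    """Read a concentration off a standard curve by piecewise-linear interpolation.

    `standards` is a list of (absorbance, concentration) pairs. Linear
    interpolation is an assumption; real ELISA kits often specify a
    different curve model.
    """
    points = sorted(standards)  # order the standards by absorbance
    xs = [a for a, _ in points]
    if not xs[0] <= absorbance <= xs[-1]:
        raise ValueError("absorbance outside the range of the standard curve")
    i = bisect_left(xs, absorbance)
    if xs[i] == absorbance:
        return points[i][1]
    (x0, y0), (x1, y1) = points[i - 1], points[i]
    return y0 + (y1 - y0) * (absorbance - x0) / (x1 - x0)


def don_per_gram(absorbance, standards, extract_volume_ml, dry_weight_g):
    """Return DON content in µg per g of dry mycelium.

    `extract_volume_ml` (volume the toxin was measured in) is a
    hypothetical parameter needed to turn µg/mL into total µg.
    """
    if dry_weight_g <= 0:
        raise ValueError("dry weight must be positive")
    concentration = interpolate_concentration(absorbance, standards)
    total_don_ug = concentration * extract_volume_ml
    return total_don_ug / dry_weight_g


if __name__ == "__main__":
    # Same absorbance (same total DON) but different biomass:
    # the per-gram value differs, which is the reviewer's point.
    for dry_weight in (0.5, 0.1):
        value = don_per_gram(0.75, HYPOTHETICAL_STANDARDS, 10.0, dry_weight)
        print(f"dry weight {dry_weight} g -> {value:.1f} µg/g")
```

      Under these assumptions, two cultures with the same total DON but different dry weights give different µg/g values, which is why the reviewer asks for biomass to be measured under the same conditions as the DON assay.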

      (1) Line 139f

      ... FgDML1 is a critical positive regulator of virulence ....

      Clearly, the deletion of FgDML1 impacts virulence, but it is too much of a general effect to say it is a regulator. DML1 acts high up in the cascade, impacting numerous processes, one of which is virulence. Generally, it has to be considered that deletion of DML1 causes a severe growth defect, which in turn is likely to lead to a plethora of effects. Besides discussing this fact, please also revise the manuscript to avoid references to "direct effects" or "regulator".

      Thank you very much for your advice. Our method for determining DON is based on the amount per unit of mycelium. After obtaining the absorbance value from the ELISA reaction, we calculate the concentration of DON toxin according to the established standard curve and formula. We then divide it by the dry weight of the mycelium to obtain the DON toxin content per unit of mycelium, and present the results in µg/g. In summary, we conclude that the decrease in DON production by ΔFgDML1 is not due to slower hyphal growth, but rather to a reduced ability of the hyphae to produce DON toxin per unit of mycelium compared with the wild type. Given the decrease in DON toxin synthesis caused by FgDML1 deficiency, we believe that using the term "regulator" is reasonable.

      (2) Line 143

      Please define "toxin-producing conditions".

      Thank you very much for your advice. We have defined the toxin-producing conditions in the manuscript: 'toxin-inducing conditions (28°C, 145 ×g, 7 days incubation)' (in L163-164).

      (3) Line 149

      A brief intro on toxisomes should be provided in the introduction to better integrate this into the manuscript's results.

      Thank you very much for your advice. We have added corresponding content about toxisomes to the Introduction: 'The biosynthesis of DON entails a reorganization of the endoplasmic reticulum into a specialized compartment termed the "toxisome" (Tang et al., 2018). The assembly of the toxisome coincides with the aggregation of key biosynthetic enzymes, which in turn enhances the efficiency of DON production. Concurrently, this compartmentalization serves as a self-defense mechanism, protecting the fungus from the autotoxicity of TRI pathway intermediates (Boenisch et al., 2017). The proteins TRI1, TRI4, TRI14, and Hmr1 are confirmed constituents of this structure (Kistler and Broz, 2015; Menke et al., 2013).' (in L86-93)

      (4) Line 153

      DON production decreases by about 80 %, but not to 0. Consequently, DML1 is important, but NOT essential for DON production.

      Thank you very much for your advice. We have changed the wording of the corresponding section based on your suggestion: 'FgDML1 is essential for the biosynthesis of the DON toxin.' (in L161)

      (5) Line 168ff

      Please provide a reference for FgDnm1 being critical for mitochondrial fission and state whether such an interaction has been shown in other organisms.

      Thank you very much for your advice. We have changed the wording of the corresponding section based on your suggestions: 'FgDnm1 is a key dynamin-related protein mediating mitochondrial fission (Griffin et al., 2005; Kang et al., 2023), suggesting that FgDML1 may form a complex with FgDnm1 to regulate mitochondrial fission and fusion processes. To our knowledge, this is the first report documenting an interaction between DML1 and Dnm in any fungal species, including model organisms such as S. cerevisiae. This novel finding provides new insights into the molecular mechanisms underlying mitochondrial dynamics in filamentous fungi.' (in L277-283)

      (6) Line 178

      Please specify whether Complex III activity was related to biomass and provide a p-value or standard deviation for the value.

      Thank you very much for your question. Complex III activity was determined using a complex III enzyme activity kit (Solarbio, Beijing, China) (Li et al., 2022; Wang et al., 2022). A 0.1 g sample of standardized mycelium was used for each measurement. Because the same mass of homogenized mycelium was used for every sample, we believe that the measured complex III activity does not depend on biomass. We also refined the specific measurement steps in the article: 'Briefly, 0.1 g of mycelia was homogenized with 1 mL of extraction buffer in an ice bath. The homogenate was centrifuged at 600 ×g for 10 min at 4°C. The resulting supernatant was then subjected to a second centrifugation at 11,100 ×g for 10 min at 4°C. The pellet was resuspended in 200 μL of extraction buffer and disrupted by ultrasonication (200 W, 5 s pulses with 10 s intervals, 15 cycles). Complex III enzyme activity was finally measured by adding the working solution as per the manufacturer's protocol. Each treatment group contains three biological replicates and three technical replicates.' (in L511-517)

      Li C, et al. Amino acid catabolism regulates hematopoietic stem cell proteostasis via a GCN2-eIF2 axis. Cell Stem Cell. 2022 Jul 7; 29(7):1119-1134.e7. doi: 10.1016/j.stem.2022.06.004. PMID: 35803229.

      Wang K, et al. Locally organised and activated Fth1hi neutrophils aggravate inflammation of acute lung injury in an IL-10-dependent manner. Nat Commun. 2022 Dec 13;13(1):7703. doi: 10.1038/s41467-022-35492-y. PMID: 36513690; PMCID: PMC9745290

      (7) Line 185

      Albeit this headline is a reasonable hypothesis, you actually did not show that the conformation is altered. Please reword accordingly.

      Please also add references for cyazofamid acting on the QI site versus other fungicides acting on the QO site.

      Thank you very much for your advice. We have changed the wording of the corresponding sections based on your suggestions: 'Overexpression of FgQCR2, FgQCR8, and FgQCR9 may alter the conformation of the Qi site, resulting in reduced sensitivity to cyazofamid.' (in L212-213). For fungicides targeting the Qi and Qo sites, we have added corresponding descriptions in the respective sections: 'Numerous fungicides have been developed to inhibit the Qo site (e.g., pyraclostrobin, azoxystrobin) (Nuwamanya et al., 2022; Peng et al., 2022) and the Qi site (e.g., cyazofamid) (Mitani et al., 2001) of the cytochrome bc1 complex.' (in L327-329)

      (8) Line 200

      This section on growth should be moved up right after introducing the mutant strain.

      Thank you very much for your advice. We have moved the section on vegetative growth and sexual and asexual development ahead of the DON toxin section to improve readability. We had originally ordered the sections to emphasize the new finding linking mitochondria and DON toxin. We found a significant decrease in DON toxin in ΔFgDML1, defects in toxisome formation, and downregulation of FgTRIs at both the gene and protein levels. In summary, we believe that the absence of FgDML1 does indeed lead to a decrease in DON toxin content, and that FgDML1 plays a regulatory role in the synthesis of DON toxin. In addition, our measurements of DON toxin, acetyl-CoA, ATP and other indicators are all expressed per unit of mycelium, which excludes differences caused by hyphal biomass or growth. We have further refined the Materials and Methods to make this clearer.

      (9) Line 203

      "... significantly reduced growth rates ..."

      This is not what was measured here. Figure 6A shows a plate assay that can be used to assess hyphal extension. In the figure, it is also visible that the mycelium of the deletion mutant is much denser, maybe due to increased hyphal branching. Please reword.

      Additionally, it is important to include a biomass measurement here under the conditions used for DON assessment. Hyphal extension measurements cannot be used instead of biomass.

      Thank you very much for your advice. We have made changes to the wording of the corresponding sections based on your suggestions. 'The ΔFgDML1 strain displayed a distinct growth phenotype characterized by retardation in radial growth and the formation of more compact, denser hyphal networks on all tested media compared to the PH-1 and ΔFgDML-C strains. '(in L136-138).

      (10) Line 217

      Please include information on how long the cultures were monitored. Given the very slow growth of the mutant, perithecia formation may be considerably delayed beyond 14 days.

      Thank you very much for your advice. Based on your suggestion, we extended the incubation time for sexual reproduction to 21 days to evaluate sexual reproduction more accurately. Our results show that even after 21 days, ΔFgDML1 still cannot produce ascospores, which indicates that the absence of FgDML1 does indeed cause sexual reproduction defects in F. graminearum.

      Author response image 1.

      Discussion

      (11) Please mention your summary Figure 8 early on in the discussion, and explain conclusions with this figure in mind. Please avoid repetition of the results section as much as possible.

      Also, please state clearly what was already known from previous research and is in agreement with your results, and what is new (in fungi or generally).

      Thank you very much for your advice. Based on your suggestion, we now refer to Fig. 8 in the first half of the Discussion and use it to guide the text that follows. We also discuss our results more comprehensively and compare them with previous studies: 'Our study defines a novel mechanism through which FgDML1 governs mitochondrial homeostasis. We demonstrate that FgDML1 directly interacts with the key mitochondrial fission regulator FgDnm1 and positively modulates cellular bioenergetic metabolism, as evidenced by elevated ATP and acetyl-CoA levels (Fig. 8).' (in L250-253). 'The Misato/DML1 protein family is evolutionarily conserved from yeast to humans and plays a critical role in mitochondrial regulation. In S. cerevisiae, DML1 is an essential gene; its deletion is lethal, while its overexpression results in fragmented mitochondrial networks and aberrant cellular morphology, underscoring its necessity for normal mitochondrial function (Gurvitz et al., 2002). Similarly, in Homo sapiens, the homolog Misato localizes to the mitochondrial outer membrane, and both its depletion and overexpression are sufficient to disrupt mitochondrial morphology and distribution (Kimura and Okano, 2007).' (in L241-244).

      (12) Line 262ff

      Please specify if this interaction was shown previously in other organisms and provide references.

      Thank you very much for your advice. We have clearly stated in the corresponding section that this is the first report of an interaction between FgDML1 and FgDnm1 and that, to our knowledge, no such interaction has been reported in other species so far: 'Notably, FgDML1 was found to interact with FgDnm1 (Fig. 5E). FgDnm1 is a key dynamin-related protein mediating mitochondrial fission (Griffin et al., 2005; Kang et al., 2023), suggesting that FgDML1 may form a complex with FgDnm1 to regulate mitochondrial fission and fusion processes. To our knowledge, this is the first report documenting an interaction between DML1 and Dnm in any fungal species, including model organisms such as S. cerevisiae. This novel finding provides new insights into the molecular mechanisms underlying mitochondrial dynamics in filamentous fungi.' (in L276-283)

      (13) Line 287ff

      There is no result that would justify this speculation. Please remove.

      Thank you very much for your advice. We have modified the corresponding wording in the corresponding section. 'In conclusion, our findings suggest that the overexpression of assembly factors FgQCR2, FgQCR7, and FgQCR8 in ΔFgDML1 potentially modifies the conformation of the Qi site, which specifically modulates the sensitivity of F. graminearum to cyazofamid. '(in L352-355)

      Materials and methods

      (14) A table with all primer sequences used in the study and their purpose is missing. For every experiment, the number of technical and biological replicates needs to be stated.

      Thank you very much for your advice. We have listed all primers used in this study in Supplementary Table 1 (Table S1). We have also added the number of technical and biological replicates to the Materials and Methods description of each experiment. 'For each sample, a total of 200 conidia were counted. The experiment included three biological replicates with three technical replicates each.' (in L434-436). 'Each treatment group contains three biological replicates.' (in L444-445). 'Each treatment group contains three biological replicates and three technical replicates.' (in L463-464). 'Each treatment group contains three biological replicates and three technical replicates.' (in L474-475). 'Each treatment group contains three biological replicates.' (in L483). 'Each treatment group contains three biological replicates and three technical replicates.' (in L501-502). 'Each treatment group contains three biological replicates and three technical replicates.' (in L516-517). 'The experiment was independently repeated three times.' (in L533-534).

      (15) Line 369ff

      Please provide final concentrations used for assays here.

      Thank you very much for your advice. The final concentrations are shown in the figures (Fig. 6A, B; Fig. S3). We have also added Supplementary Table 2, which lists the concentrations more clearly (Table S2).

      (16) Line 383

      Please provide a reference or data on the use of F2du for transformant selection and explain the abbreviation.

      Thank you very much for your advice. Based on your suggestion, we have provided the full name and references of F2du. 'Transformants were selected on PDA plates containing either 100 μg/mL Hygromycin B (Yeasen, Shanghai, China) or 0.2 μmol/mL 5-Fluorouracil 2'-deoxyriboside (F2du) (Solarbio, Beijing, China)(Zhao et al., 2022). '(in L405-407).

      (17) Line 407

      Please provide a reference for the method and at least a brief description.

      Thank you very much for your advice. Based on your suggestion, we have added references and provided a brief introduction to the method. 'As previously described (Tang et al., 2020; Wang et al., 2025), Specifically, coleoptiles were inoculated with conidial suspensions and incubated for 14 days, while leaves were inoculated with fresh mycelial plugs and incubated for 5 days, followed by observation and quantification of disease symptoms. DON toxin was measured using a Wise Science ELISA-based kit (Wise Science, Jiangsu, China) (Li et al., 2019; Zheng et al., 2018). '(in L466-471)

      (18) Line 414ff

      Also, here, the amount of biomass has to be considered for the measurement to be able to distinguish if actually less of the compounds were produced or if the effect seen was merely due to an altered amount of biomass present.

      Thank you very much for your advice. We did not treat total biomass as a separate measurement indicator, because all values were measured and calculated per unit of hyphae. This rules out experimental bias caused by a decrease in biomass.

      RNA and RT-qPCR

      (19) Line 461

      When the strains were transferred to AEA medium, was the biomass measured, at least wet weight, and in which culture volume was it done? It makes a big difference if the amount of (wet) biomass dilutes a small amount of fungicide-containing culture or if biomass is added in at least roughly equal amounts in sufficient growth medium to ensure equal conditions.

      Thank you very much for your question. We controlled the wet weight of the samples before dosing, so that the mycelium obtained from each sample weighed 0.2 g. The mycelium was obtained from AEA medium with a volume of 100 mL. This ensured a consistent initial biomass between groups before dosing, and also ensured the accuracy of the drug concentration.

      (20) Line 466

      Please provide the name and supplier of the kit.

      Thank you very much for your advice. We have added corresponding content in the corresponding location. 'Mycelium was collected and total RNA was extracted following the instructions provided by the Total RNA Extraction Kit (Tiangen, Beijing, China).' (in L523-524).

      (21) All primer sequences must be provided in a table.

      Thank you very much for your advice. We have presented all the primers used in this study in Supplementary Table 1. (in Table S1).

      (22) For RT qPCR it is essential to check the RNA quality to be sure that the obtained results are not artifacts due to varying quality, which may exceed differences. Please state how quality control was done and which threshold was applied for high-quality RNA to be used in RTqPCR (like RIN factor, etc).

      Thank you very much for your question. We performed stringent quality control on the extracted total RNA. First, a micro-spectrophotometer was used to measure RNA concentration and purity, confirming that the A260/A280 ratio was between 1.8 and 2.0 and the A260/A230 ratio was greater than 2.0, indicating good RNA purity without significant protein or organic solvent contamination. Subsequently, verification by agarose gel electrophoresis revealed distinct 28S and 18S rRNA bands, demonstrating good RNA integrity and absence of degradation.
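
      The purity thresholds quoted in this response can be written as a simple check. The sketch below is illustrative only: the `RnaSample` type, the sample names, and the absorbance readings are hypothetical, and it covers only the spectrophotometric ratios, not the gel-based integrity check.

```python
from dataclasses import dataclass


@dataclass
class RnaSample:
    """Absorbance readings for one RNA preparation (hypothetical container)."""
    name: str
    a260: float
    a280: float
    a230: float


def passes_purity_check(sample: RnaSample) -> bool:
    """Apply the thresholds stated in the response:
    A260/A280 between 1.8 and 2.0, and A260/A230 greater than 2.0."""
    ratio_260_280 = sample.a260 / sample.a280
    ratio_260_230 = sample.a260 / sample.a230
    return 1.8 <= ratio_260_280 <= 2.0 and ratio_260_230 > 2.0


# Hypothetical readings, for illustration only (not data from the study).
samples = [
    RnaSample("sample_A", a260=1.00, a280=0.52, a230=0.40),
    RnaSample("sample_B", a260=1.00, a280=0.625, a230=0.40),
]
results = {s.name: passes_purity_check(s) for s in samples}
print(results)
```

      In this illustration, `sample_A` meets both thresholds, while `sample_B` fails because its A260/A280 ratio falls below 1.8.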

      Author response image 2.

      (B): Minor Comments:

      (1) Please increase the font size of the labels and annotations of the figures; it is hard to read as it is now.

      Thank you very much for your advice. We have increased the size of annotations or numerical labels in the corresponding images for better reading.

      (2) Throughout the manuscript: Please check that all abbreviations are explained at first use.

      Thank you very much for your advice. We have checked the entire text to ensure that abbreviations have their full names when they first appear.

      (3) I do hope that the authors can clarify all concerns and provide an amended manuscript of this interesting story.

      Thank you very much. We sincerely appreciate your suggestions and questions, which have been very helpful to us.

      Reviewer #2:

      The manuscript entitled "Mitochondrial Protein FgDML1 Regulates DON Toxin Biosynthesis and Cyazofamid Sensitivity in Fusarium graminearum by affecting mitochondrial homeostasis" identified the regulatory effect of FgDML1 in DON toxin biosynthesis and sensitivity of Fusarium graminearum to cyazofamid. The manuscript provides a theoretical framework for understanding the regulatory mechanisms of DON toxin biosynthesis in F. graminearum and identifies potential molecular targets for Fusarium head blight control. The paper is innovative, but there are issues in the writing that need to be addressed and corrected.

      We greatly appreciate the time you spent on our paper and your helpful suggestions, and we have done our best to revise the manuscript accordingly. Revisions made in response to your suggestions are marked in red in the manuscript. In our responses, the positions of the revised passages are indicated by line numbers in red. Our point-by-point responses to the reviewer's comments are listed below.

      Weaknesses:

      (1) The authors speculate that cyazofamid treatment caused upregulation of the assembly factors, leading to a change in the conformation of the Qi protein, thus restoring the enzyme activity of complex III. But no speculation was given in the discussion as to why this would lead to the upregulation of assembly factors, and how the upregulation of assembly factors would change the protein conformation, and is there any literature reporting a similar phenomenon? I would suggest adding this to the discussion.

      Thank you very much for your advice. Based on your suggestion, we have added content related to the assembly factor of complex III in the discussion section and made modifications to the corresponding wording. 'Previous studies have reported that mutations in the Complex III assembly factors TTC19, UQCC2, and UQCC3 impair the assembly and activity of Complex III (Feichtinger et al., 2017; Wanschers et al., 2014). '(in L345-347). 'In conclusion, our findings suggest that the overexpression of assembly factors FgQCR2, FgQCR7, and FgQCR8 in ΔFgDML1 potentially modifies the conformation of the Qi site, which specifically modulates the sensitivity of F. graminearum to cyazofamid. '(in L352-355).

      (2) Would increased sensitivity of the mutant to cell wall stress be responsible for the excessive curvature of the mycelium?

      Thank you very much for your question. We believe that ΔFgDML1 is less sensitive to osmotic stress and that this may not be related to hyphal bending, as shown in Author response image 3. At the conidial stage, ΔFgDML1 cannot germinate in YEPD, whereas the application of 1 M sorbitol promotes its germination. The underlying mechanism is still unknown and is a focus of our future research.

      Author response image 3.

      (3) The vertical coordinates of Figure 7B need to be modified with positive inhibition rates for the mutants.

      Thank you very much for your advice. Figure 7B accurately reflects the inhibition rate. In the ΔFgDML1 mutant, the inhibition rate under osmotic stress is negative, indicating that the colony grew more than the control (CK). Therefore, a negative inhibition rate is shown in Figure 7B.
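
      A short sketch shows how an inhibition rate can become negative. It assumes the commonly used definition, (control growth minus treated growth) divided by control growth, times 100; the response does not state the exact formula used, and the function name, the optional plug-diameter correction, and the colony diameters below are hypothetical.

```python
def inhibition_rate(control_diameter_mm: float,
                    treated_diameter_mm: float,
                    plug_diameter_mm: float = 0.0) -> float:
    """Percent inhibition of colony growth relative to the control (CK).

    Assumed formula: (control - treated) / control * 100, after optionally
    subtracting the inoculum plug diameter from both measurements.
    """
    control_growth = control_diameter_mm - plug_diameter_mm
    treated_growth = treated_diameter_mm - plug_diameter_mm
    if control_growth <= 0:
        raise ValueError("control colony must show growth beyond the plug")
    return (control_growth - treated_growth) / control_growth * 100.0


# Hypothetical diameters, for illustration only (not data from the study).
inhibited = inhibition_rate(control_diameter_mm=50.0, treated_diameter_mm=30.0)
stimulated = inhibition_rate(control_diameter_mm=20.0, treated_diameter_mm=25.0)
print(inhibited > 0, stimulated < 0)
```

      When the treated colony is larger than the control, as in the second call, the numerator is negative and so is the inhibition rate, which matches the behaviour described for the mutant in Figure 7B.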

      (1) In Figure 1B, Figure 3C, and Figure 6C, the scale below the picture is not clear. In Figure 5D, the histogram is unclear, and it is recommended to redraw the graph.

      Thank you very much for your advice. The issue with the above images may be due to Word compression. We have changed the settings and enlarged the images as much as possible to better display them.

      (2) The full Latin name of the strain should be used in the title of figures and tables.

      Thank you very much for your advice. Based on your suggestion, we have used the full names of the strains appearing in the title of figures and tables.

      (3) Proteins in line 117 should be abbreviated.

      Thank you very much for your advice. Based on your suggestion, we have abbreviated the corresponding positions. 'The DML1 protein from S. cerevisiae was used as a query for a BLAST search against the Fusarium genome database, resulting in the identification of the putative DML1 gene FgDML1 (FGSG_05390) in F. graminearum. '(in L118-120).

      (4) The sentence in lines 187-189, which is supposed to introduce why the test is sensitive to the three drugs, is currently illogical.

      Thank you very much for your advice. Based on your suggestion, we have made modifications to the corresponding sections. 'Since Complex III is involved in the action of both cyazofamid (targeting the QI site) and pyraclostrobin (targeting the QO site), the sensitivity of ΔFgDML1 to cyazofamid and pyraclostrobin was investigated. ' (in L214-216).

      (5) The expression of FgQCR2, FgQCR7, and FgQCR8 was significantly upregulated in ΔFgDML1 at transcription levels. Do FgQCR2, FgQCR8, and FgQCR9 show upregulated expression at the protein level?

      Thank you very much for your question. Based on your suggestion, we evaluated the protein expression levels of FgQCR2, FgQCR7, and FgQCR8 in PH-1 and ΔFgDML1, and we found that the protein expression levels of FgQCR2, FgQCR7, and FgQCR8 in ΔFgDML1 were higher than those in PH-1. (in Fig. 6F).

      (6) In Figure 7B, it is recommended to adjust the position of the horizontal axis labels in the histogram.

      Thank you very much for your advice. Based on your suggestion, we have made modifications to the corresponding sections (in Fig. 7B).

      (7) There are numerous errors in the writing of gene names in the text. Please check the full text and change the writing of gene names and mutant names to italic.

      Thank you very much for your advice. We have checked the entire text to ensure that all genes have been italicized.

      (8) All acronyms should be spelled out in figure and table captions. e.g., F. graminearum.

      Thank you very much for your advice. Based on your suggestion, we have used the full names of the strains appearing in the title of figures and tables.

      (9) In line 492, P should be lowercase and italic.

      Thank you very much for your advice. Based on your suggestion, we have made adjustments to the corresponding content.

      Reviewer #3:

      Summary:

      The manuscript "Mitochondrial 1 protein FgDML1 regulates DON toxin biosynthesis and cyazofamid sensitivity in Fusarium graminearum by affecting mitochondrial homeostasis" describes the construction of a null mutant for the FgDML1 gene in F. graminearum and assays characterising the effects of this mutation on the pathogen's infection process and lifecycle. While FgDML1 remains underexplored with an unclear role in the biology of filamentous fungi, and although the authors performed several experiments, there are fundamental issues with the experimental design and execution, and interpretation of the results.

      Strengths:

      FgDML1 is an interesting target, and there are novel aspects in this manuscript. Studies in other organisms have shown that this protein plays important roles in mitochondrial DNA (mtDNA) inheritance, mitochondrial compartmentalisation, chromosome segregation, mitochondrial distribution, mitochondrial fusion, and overall mitochondrial dynamics. Indeed, in Saccharomyces cerevisiae, the mutation is lethal. The authors have carried out multi-faceted experiments to characterise the mutants.

      Weaknesses:

      However, I have concerns about how the study was conceived. Given the fundamental importance of mitochondrial function in eukaryotic cells and how the absence of this protein impacts these processes, it is unsurprising that deletion of this gene in F. graminearum profoundly affects fungal biology. Therefore, it is misleading to claim a direct link between FgDML1 and DON toxin biosynthesis (and virulence), as the observed effects are likely indirect consequences of compromised mitochondrial function. In fact, it is reasonable to assume that the production of all secondary metabolites is affected to some extent in the mutant strains and that such a strain would not be competitive at all under non-laboratory conditions. The order in which the authors present the results can be misleading, too. The results on vegetative growth rate appeared much later in the manuscript, which should have come first, as the FgDML1 mutant exhibited significant growth defects, and subsequent results should be discussed in that context. Moreover, the methodologies are not described properly, making the manuscript hard to follow and difficult to replicate.

      We greatly appreciate the time you spent on our paper and your helpful suggestions, and we have done our best to revise the manuscript accordingly. Revisions made in response to your suggestions are marked in red in the manuscript. In our responses, the positions of the revised passages are indicated by line numbers in red. Our point-by-point responses to the reviewer's comments are listed below.

      Regarding the weaknesses, we arranged the results in this order to emphasize the novel link between mitochondria and DON toxin. We found a significant decrease in DON toxin in ΔFgDML1, defects in the formation of toxin-producing bodies, and downregulation of FgTri genes at both the gene and protein levels. In summary, we believe that the absence of FgDML1 does lead to a decrease in DON toxin content, and that FgDML1 plays a regulatory role in the synthesis of DON toxin. In addition, our measurements of DON toxin, acetyl-CoA, ATP, and other indicators are all based on the amount per unit of hyphae, which excludes differences caused by hyphal biomass or growth. We have further refined the Materials and Methods to make them easier to read and understand.

      (1) Lines 37-39: The disease itself does not produce toxins; it is the fungus that causes the disease that produces toxins. Moreover, the disease symptoms observed are likely caused by the toxins produced by the fungus.

      Thank you very much for your advice. We have made modifications to the wording of the corresponding sections. 'Studies have shown that increased DON levels are positively correlated with the pathogenicity rate of F. graminearum.'(in L36-37).

      (2) Lines 82-87: While it is challenging to summarise the role of ATP in just a few words, this section needs improvement for clarity and accuracy. Additionally, I do not believe that drawing a direct link between mitochondrial defects and toxin production is an appropriate strategy in this case.

      Thank you very much for your advice. Based on your suggestion, we have added corresponding descriptions in the corresponding positions to provide more information on the relationship between ATP and toxins, in order to better prepare for the following text. 'Pathogen-intrinsic ATP homeostasis is recognized as a critical, rate-limiting determinant for toxin biosynthesis. Previous studies indicate that dual-target inhibition of ATP synthase (AtpA) and adenine deaminase (Ade) by a specific small-molecule probe effectively depletes intracellular ATP, consequently suppressing the synthesis of key virulence factors TcdA and TcdB transcriptionally and translationally(Marreddy et al., 2024). The systemic toxicity of Anthrax Edema Toxin (ET) is primarily attributed to its catalytic activity, which depletes the host cell's ATP reservoir, thereby triggering a bioenergetic collapse that culminates in cell lysis and death(Liu et al., 2025). '(in L78-86).

      (3) Lines 125-126: The manuscript does not clearly describe how subcellular localisation was determined. This methodology needs to be properly detailed.

      Thank you very much for your advice. The subcellular localization was validated through co-localization analysis with MitoTracker Red CMXRos, a mitochondrial-specific dye. The observed overlap between the FgDML1-GFP signal and the mitochondrial marker confirmed mitochondrial localization. Based on these results, we determined that FgDML1 is definitively localized to the mitochondria. We have incorporated this description in the appropriate section of the manuscript. 'Furthermore, subcellular localization studies confirmed that FgDML1 localizes to mitochondria, as demonstrated by colocalization with a mitochondria-specific dye MitoTracker Red CMXRos (Fig. 1B).' (in L125-127).

      (4) Regarding the organisation of the Results section, it needs to be revised. While I understand the authors' intention to emphasise the impact on virulence, the results showing how FgDML1 deletion affects vegetative growth, asexual and sexual reproduction, and sensitivity to stressors should be presented before the virulence assays and effects on DON production. Additionally, the authors do not provide any clear evidence that FgDML1 directly interacts with proteins involved in asexual or sexual reproduction, stress responses, or virulence. Therefore, it is misleading to suggest that FgDML1 directly regulates these processes. The observed phenotypes are, rather, a consequence of severely impaired mitochondrial function. Without functional mitochondria, the cell cannot operate properly, leading to widespread physiological defects. In this regard, statements such as those in lines 139-140 and 343-344 are misleading.

      Thank you very much for your advice. Based on your suggestion, we have adjusted the order of the figures so that the characterization of ΔFgDML1 in vegetative growth, sexual reproduction, and other aspects is presented before the DON toxin results. We have also adjusted the corresponding statements. 'These findings demonstrate that FgDML1 is a positive regulator of virulence in F. graminearum.' (in L140-141).

      (5) Lines 185-186: The authors do not provide sufficient evidence to support the claim that FgQCR2, FgQCR8, and FgQCR9 overexpression is the main cause of reduced cyazofamid sensitivity. Although expression of these genes is altered, reduced sensitivity may result from changes in other proteins or pathways. To strengthen this claim, overexpression of FgQCR2, 8, and 9 in the wild-type background, followed by assessment of cyazofamid resistance, would be necessary. As it stands, there is no support for the claim presented in lines 329-332.

      Thank you very much for your advice. To establish a causal link between the overexpression of FgQCR2, FgQCR7, and FgQCR8 and the observed reduction in cyazofamid sensitivity, we first quantified the protein levels of these assembly factors. Western blot analysis confirmed their elevated expression in the ΔFgDML1 mutant compared to the wild-type PH-1. We further generated individual overexpression strains for FgQCR2, FgQCR7, and FgQCR8 in the wild-type PH-1 background. Fungicide sensitivity assays revealed that all three overexpression mutants displayed significantly reduced sensitivity to cyazofamid compared to the parental strain. These overexpression experiments confirm that upregulation of FgQCR2, FgQCR7, and FgQCR8 is sufficient to confer reduced cyazofamid sensitivity. We have incorporated these explanations and provided supporting images in the appropriate section of the manuscript. 'To further clarify whether the upregulated expression of FgQCR2, FgQCR7, and FgQCR8 genes affects their protein expression levels, we measured the protein levels. The results showed that the protein expression levels of FgQCR2, FgQCR7, and FgQCR8 in ΔFgDML1 were higher than those in PH-1 (Fig. 6F). Subsequently, we overexpressed FgQCR2, FgQCR7, and FgQCR8 in the wild-type background, and the corresponding overexpression mutants exhibited reduced sensitivity to cyazofamid (Fig. 6E).' (in L205-211) (in Fig. 6E, F)

      (6) Lines 187-190: This segment is confusing and difficult to follow. It requires rewriting for clarity.

      Thank you very much for your advice. Based on your suggestion, we have revised the corresponding passage. 'Since Complex III is involved in the action of both cyazofamid (targeting the QI site) and pyraclostrobin (targeting the QO site), the sensitivity of ΔFgDML1 to cyazofamid and pyraclostrobin was investigated.' (in L214-216)

      (7) Lines 345-346: The authors state that in this study, FgDML1 is localised in mitochondria, which implies that in other studies, its localisation was different. Is this accurate? Clarification is needed.

      Thank you very much for your question. In previous studies, whether in yeast or in Drosophila melanogaster, the localization of this protein was not clearly defined; its function was only described as being related to mitochondria (Miklos et al., 1997; Gurvitz et al., 2002).

      Miklos GLG, Yamamoto M-T, Burns RG, Maleszka R. 1997. An essential cell division gene of Drosophila, absent from Saccharomyces, encodes an unusual protein with tubulin-like and myosin-like peptide motifs. Proc Natl Acad Sci 94:5189–5194. doi:10.1073/pnas.94.10.5189

      Gurvitz A, Hartig A, Ruis H, Hamilton B, de Couet HG. 2002. Preliminary characterisation of DML1, an essential Saccharomyces cerevisiae gene related to misato of Drosophila melanogaster. FEMS Yeast Res 2:123–135. doi:10.1016/S1567-1356(02)00083-1

      Material and Methods Section

      (8) In general, the methods require more detailed descriptions, including the brands and catalog numbers of reagents and kits used. Simply stating that procedures were performed according to manufacturers' instructions is insufficient, particularly when the specific brand or kit is not identified.

      Thank you very much for your advice. We have added corresponding content based on your suggestion to more comprehensively display the reagent brand and complete product name. 'Transformants were selected on PDA plates containing either 100 μg/mL Hygromycin B (Yeasen, Shanghai, China) or 0.2 μmol/mL 5-Fluorouracil 2'-deoxyriboside (F2du) (Solarbio, Beijing, China)(Zhao et al., 2022). ' (in L405-407). 'DON toxin was measured using a Wise Science ELISA-based kit (Wise Science, Jiangsu, China) (Li et al., 2019; Zheng et al., 2018) '. (in L469-471)

      (9) Line 364: What do CM and MM stand for? Please define.

      Thank you very much for your advice. Based on your suggestion, we have made modifications in the corresponding locations. 'To evaluate vegetative growth, complete medium (CM), minimal medium (MM), and V8 Juice Agar (V8) media were prepared as described previously(Tang et al., 2020). '(in L385-387)

      Generation of Deletion and Complemented Mutants:

      (10) This section lacks detail. For example, were PCR products used directly for PEG-mediated transformation, or were the fragments cloned into a plasmid?

      Thank you very much for your question. After sequencing confirmation, we used the fused fragments directly for protoplast transformation. We have now specified the form of the fragment used for transformation at the corresponding location. 'The resulting fusion fragment was transformed into the wild-type F. graminearum PH-1 strain via polyethylene glycol (PEG)-mediated protoplast transformation.' (in L403-405).

      (11) PCR and Southern blot validation results should be included as supplementary material, along with clear interpretations of these results.

      Thank you very much for your advice. In the supplementary material we submitted, Supplementary Figure 2 already includes the results of PCR and Southern blot validation (in Fig. S2).

      (12) There is almost no description of how the mutants mentioned in lines 388-390 were generated.

      Thank you very much for your advice. Based on your suggestions, we have added relevant content in the appropriate sections to more comprehensively and clearly reflect the experimental process. 'Specifically, FgDML1, including its native promoter region and open reading frame (ORF) (excluding the stop codon), was amplified. The PCR product was then fused with the XhoI-digested pYF11 vector. After transformation into E. coli and sequence verification, the plasmid was extracted and subsequently introduced into PH-1 protoplasts. For FgDnm1-3×Flag, the 3×Flag tag was added to the C-terminus of FgDnm1 by PCR, fused with the hygromycin resistance gene and the FgDnm1 downstream arm, and then introduced into PH-1 protoplasts. The overexpression mutant was constructed according to a previously described method. Specifically, the ORF of FgDML1 was amplified and the PCR product was ligated into the SacII-digested pSXS overexpression vector. The resulting plasmid was then transformed into PH-1 protoplasts (Shi et al., 2023). For the construction of PH-1::FgTri1+GFP and ΔFgDML1::FgTri1+GFP, the ORF of FgTri1 was amplified and ligated into the XhoI-digested pYF11 vector as described above. The resulting vectors were then transformed into protoplasts of PH-1 or ΔFgDML1, respectively.' (in L413-426).

      Vegetative Growth and Conidiation Assays:

      (13) There is no information about how long the plates were incubated before photos were taken. Judging by the images, it appears that different incubation times may have been used.

      Thank you very much for your advice. Due to the slower growth of ΔFgDML1, we adopted different incubation periods and have supplemented the relevant content in the corresponding section. 'All strains were incubated at 25°C in darkness; however, due to ΔFgDML1 slower growth, the ΔFgDML1 mutant required a 5-day incubation period compared to the 3 days used for PH-1 and ΔFgDML1-C. '(in L490-493).

      (14) There is no description of the MBL medium.

      Thank you very much for your advice. Based on your suggestion, we have supplemented the corresponding content in the corresponding positions. 'Mung bean liquid (MBL) medium was used for conidial production, while carrot agar (CA) medium was utilized to assess sexual reproduction(Wang et al., 2011). '(in L387-389).

      DON Production and Pathogenicity Assays:

      (15) Were DON levels normalised to mycelial biomass? The vegetative growth assays show that FgDML1 null mutants exhibit reduced growth on all tested media. If mutant and wild-type strains were incubated for the same period under the same conditions, it is reasonable to assume that the mutants accumulated significantly less biomass. Therefore, results related to DON production, as well as acetyl-CoA and ATP levels, must be normalised to biomass.

      Thank you very much for your question. We have taken into account the differences in mycelial biomass. Therefore, when measuring DON, acetyl-CoA, and ATP levels, all data were normalized to mycelial mass and calculated as amounts per unit of mycelium, thereby avoiding discrepancies arising from variations in biomass.
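
      The effect of this normalization can be illustrated with a minimal sketch. The function name, the units, and all numbers below are hypothetical; the response does not specify whether wet or dry mycelial mass was used, so the mass argument is left generic.

```python
def per_unit_mass(total_amount: float, mycelial_mass_g: float) -> float:
    """Express a measured amount (e.g. DON, ATP, acetyl-CoA) per gram of mycelium."""
    if mycelial_mass_g <= 0:
        raise ValueError("mycelial mass must be positive")
    return total_amount / mycelial_mass_g


# Hypothetical totals (arbitrary units) and masses, for illustration only.
wild_type_total, wild_type_mass_g = 100.0, 0.5
mutant_total, mutant_mass_g = 20.0, 0.2

raw_ratio = mutant_total / wild_type_total
normalized_ratio = (per_unit_mass(mutant_total, mutant_mass_g)
                    / per_unit_mass(wild_type_total, wild_type_mass_g))
print(raw_ratio, normalized_ratio)
```

      In this made-up case the mutant appears to produce one fifth of the wild-type amount before normalization but one half after it, which is why the reviewer asks for values expressed per unit of biomass when the strains grow at different rates.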

      Sensitivity Assays:

      (16) While the authors mention that gradient concentrations were used, the specific concentrations and ranges are not provided. Importantly, have the plates shown in Figure 5 been grown for different periods or lengths? Given the significantly reduced growth rate shown in Figure 6A, the mutants should not have grown to the same size as the WT (PH-1) as shown in Figures 5A and 5B unless the pictures have been taken on different days. This needs to be explained.

      Thank you very much for your question. Due to the slower growth of ΔFgDML1, we adopted different incubation periods and have supplemented the relevant content in the corresponding section. 'All strains were incubated at 25°C in darkness; however, due to ΔFgDML1 slower growth, the ΔFgDML1 mutant required a 5-day incubation period compared to the 3 days used for PH-1 and ΔFgDML1-C. '(in L490-493).

      (17) Additionally, was inhibition measured similarly for both stress agents and fungicides? This should be clarified.

      Thank you very much for your question. We have supplemented the specific concentration gradient of fungicides. 'The concentration gradients for each fungicide in the sensitivity assays were set up according to Supplementary Table S2. '(in L493-494)(in Table. S2).

      Complex III Enzyme Activity:

      (18) A more detailed description of how this assay was performed is needed.

      Thank you very much for your advice. We have provided further detailed descriptions of the corresponding sections. 'Briefly, 0.1 g of mycelia was homogenized with 1 mL of extraction buffer in an ice bath. The homogenate was centrifuged at 600 ×g for 10 min at 4°C. The resulting supernatant was then subjected to a second centrifugation at 11,000 ×g for 10 min at 4°C. The pellet was resuspended in 200 μL of extraction buffer and disrupted by ultrasonication (200 W, 5 s pulses with 10 s intervals, 15 cycles). Complex III enzyme activity was finally measured by adding the working solution as per the manufacturer's protocol. '(in L511-517)

      (19) Were protein concentrations standardised prior to the assay?

      Thank you very much for your question. Protein concentrations for all Western blot samples were quantified using a BCA assay kit to ensure equal loading.

      (20) Line 448: Are ΔFgDML1::Tri1+GFP and ΔFgDML1+GFP the same strain? ΔFgDML1::Tri1+GFP has not been previously described.

      Thank you very much for your question. These two strains are not the same strain, and we have supplemented their construction process in the corresponding section: 'For the construction of PH-1::FgTri1+GFP and ΔFgDML1::FgTri1+GFP, the ORF of FgTri1 was amplified and ligated into the XhoI-digested pYF11 vector as described above. The resulting vectors were then transformed into protoplasts of PH-1 or ΔFgDML1, respectively.' (in L423-426)

      (21) Lines 460 and 468: Please adopt a consistent nomenclature, either RT-qPCR or qRT-PCR.

      Thank you very much for your advice. We have standardized the term as RT-qPCR and revised the corresponding sections: 'Reverse Transcription Quantitative Polymerase Chain Reaction (RT-qPCR) was carried out using the QuantStudio 6 Flex real-time PCR system (Thermo Fisher Scientific, USA) to assess the relative expression of three subunits of Complex III (FgCytb, FgCytc1, FgISP), five assembly factors (FgQCR2, FgQCR6, FgQCR7, FgQCR8, FgQCR9), and DON biosynthesis-related genes (FgTri5 and FgTri6).' (in L526-531)

      (22) Lines 472-473: Why was FgCox1 used as a reference for FgCytb? Clarification is needed.

      Thank you very much for your question. FgCytb (cytochrome b) and FgCOX1 (cytochrome c oxidase subunit I) are both encoded by the mitochondrial genome and serve as core components of the oxidative phosphorylation system (Complex III and Complex IV, respectively). Their transcription is co-regulated by mitochondrial-specific mechanisms in response to cellular energy status. Consequently, under experimental conditions that perturb energy homeostasis, FgCOX1 expression remains relatively stable with respect to FgCytb, or at least co-varies with it in the same direction, making it a more suitable reference for normalizing target gene expression. In contrast, FgGapdh operates within a distinct genetic and regulatory system. Using FgCOX1 ensures that both reference and target genes reside within the same mitochondrial compartment and functional module, thereby preventing normalization artifacts arising from independent variation across disparate pathways.
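
      The normalization argument above can be made concrete with a minimal sketch. It assumes the common 2^-ΔΔCt method with roughly 100% amplification efficiency; the response does not state which quantification method was used, and the function name and Ct values below are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene versus a control strain (2^-ddCt, assumed method)."""
    # Normalize the target (e.g. FgCytb) to the reference (e.g. FgCOX1) in each strain.
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    # Compare the normalized values between mutant and control.
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values, for illustration only (not data from the study).
fold_change = relative_expression(
    ct_target_sample=24.0, ct_ref_sample=18.0,    # mutant
    ct_target_control=22.0, ct_ref_control=17.0,  # wild type
)
print(fold_change)
```

      If the reference gene shifts together with the target, the two ΔCt values stay comparable; if the reference varies independently, the fold change absorbs that variation, which is the artifact the response describes.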

      (23) Lines 476-477: This step requires a clearer and more detailed explanation.

      Thank you very much for your advice. We have added detailed descriptions at the relevant positions: 'For FgDnm1-3×Flag, the 3×Flag tag was added to the C-terminus of FgDnm1 by PCR, fused with the hygromycin resistance gene and the FgDnm1 downstream arm, and then introduced into PH-1 protoplasts.' (in L417-419). 'The FgDnm1-3×Flag fragment was introduced into PH-1 and FgDML1+GFP protoplasts, respectively, to obtain single-tagged and double-tagged strains.' (in L541-543)

      Western blotting:

      (24) Uncropped Western blot images should be provided as supplementary material.

      Thank you very much for your advice. All Western blot images will be submitted to the supplementary material package.

      (25) Lines 485-489: A more thorough description of the antibodies used (including source, catalogue number, and dilution) is necessary.

      Thank you very much for your advice. The antibodies used are clearly stated in terms of brand, catalog number, and dilution. We have added the dilution ratio: 'All antibodies were diluted as follows: primary antibodies at 1:1000 and secondary antibodies at 1:10000.' (in L550-551)

      (26) The Western blot shown in Figure 3D appears problematic, particularly the anti-GAPDH band for FgDML1::FgTri1+GFP. Are both anti-GAPDH bands derived from the same gel?

      Thank you very much for your advice. We are unequivocally certain that these data derive from the same gel. Therefore, we are providing the original image for your inspection.

      Author response image 4.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: This manuscript reports the identification of putative orthologues of mitochondrial contact site and cristae organizing system (MICOS) proteins in Plasmodium falciparum - an organism that unusually shows an acristate mitochondrion during the asexual part of its life cycle and then develops cristae as it enters the sexual stage of its life cycle and beyond into the mosquito. The authors identify PfMIC60 and PfMIC19 as putative members and study these in detail. The authors add HA tags to both proteins and look for timing of expression during the parasite life cycle and attempt (unsuccessfully) to localise them within the parasite. They also genetically deleted both genes singly and in parallel and phenotyped the effect on parasite development. They show that both proteins are expressed in gametocytes and not asexuals, suggesting they are present at the same time as cristae development. They also show that the proteins are dispensable for the entire parasite life cycle investigated (asexuals through to sporozoites), however there is some reduction in mosquito transmission. Using EM techniques they show that the morphology of gametocyte mitochondria is abnormal in the knock out lines, although there is great variation.

      Major comments: The manuscript is interesting and is an intriguing use of a well studied organism of medical importance to answer fundamental biological questions. My main comments are that there should be greater detail in areas around methodology and statistical tests used. Also, the mosquito transmission assays (which are notoriously difficult to perform) show substantial variation between replicates and the statistical tests and data presentation are not clear enough to conclude the reduction in transmission that is claimed. Perhaps this could be improved with clearer text?

      We would like to thank the reviewer for taking the time to review our manuscript. We are happy to hear the reviewer thinks the manuscript is interesting and thank the reviewer for their constructive feedback.

      To clarify the statistical analyses used, we included a new supplementary dataset with all statistical analyses and p-values indicated per graph. Furthermore, figure legends now include the information on the exact statistical test used in each case.

      Regarding mosquito experiments, while we indeed reported a reduction in transmission and oocyst numbers, we are aware that this effect might be due to the high variability in mosquito feeding assays. To highlight this point, we deleted the sentence "with the transmission reduction of [numbers]...." and we included the sentence "The high variability encountered in the standard membrane feeding assays, though, partially obstructs a clear conclusion on the biological relevance of the observed reduction in oocyst numbers".

      More specific comments to address: Line 101/Fig1E (and figure legend) - What is this heatmap showing? It would be helpful to have a sentence or two linking it to a specific methodology. I could not find details in the M+M section and "specialized, high molecular mass gels" does not adequately explain what experiments were performed. The reference to Supplementary Information 1 also did not provide information.

      We added the information "high molecular mass gels with lower acrylamide percentage" to clarify the methodology in the text. Furthermore, we extended the figure legend to include all relevant information. Further experimental details can be found in the study cited in this context, from which the dataset originates (Evers et al., 2021).

      Line 115 and Supplementary Figure 2C + D - The main text says that the transgenic parasites contained a mitochondrially localized mScarlet for visualization and localization, but in the supplementary figure 2 it shows mitotracker labelling rather than mScarlet. This is very confusing. The figure legend also mentions both mScarlet and MitoTracker. I assume that mScarlet was used to view in regular IFAs (Fig S2C) and the MitoTracker was used for the expansion microscopy (Fig S2D)? Please clarify.

      We thank the reviewer for pointing this out - this was indeed incorrectly annotated. We used the endogenous mito-mScarlet signal in IFA and MitoTracker in U-ExM. The figure annotation has now been corrected.

      Figure 2C - what is the statistical test being used (the methods say "Mean oocysts per midgut and statistical significance were calculated using a generalized linear mixed effect model with a random experiment effect under a negative binomial distribution." but what test is this?)?

      The statistical test is now stated in the Materials and Methods section with the sentence "The fitted model was used to obtain estimated means and contrasts and were evaluated using Wald Statistics". The test is now also mentioned in the figure legend.
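
      As a rough illustration of what a Wald test on a model contrast computes, the sketch below divides an estimate by its standard error and compares the result with a standard normal distribution. The estimate and standard error are hypothetical numbers, not output of the authors' model, and the software actually used may report a t- or chi-square-based variant instead.

```python
import math


def wald_test(estimate, std_error):
    """Return the Wald z statistic and its two-sided p-value (normal approximation)."""
    z = estimate / std_error
    # P(|Z| >= |z|) for a standard normal variable Z.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical contrast on the log scale (e.g. knock-out versus wild type).
z, p = wald_test(estimate=-0.8, std_error=0.4)
print(f"z = {z:.2f}, p = {p:.4f}")
```

      In a negative binomial mixed model the contrast is usually estimated on the log scale, so exponentiating the estimate gives a ratio of mean oocyst counts.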

      Also the choice of a log10 scale for oocyst intensity is an unusual choice - how are the mosquitoes with 0 oocysts being represented on this graph? It looks like they are being plotted at 10^-1 (which would be 0.1 oocysts in a mosquito which would be impossible).

      As the data spans three orders of magnitude with low values being biologically meaningful, we decided that a log scale would best facilitate readability of the graph. As the 0 values are also important to show, we went with a standard approach to handle 0s in log transformed data and substituted the 0s with a small value (0.001). We apologize for not mentioning this transformation in the manuscript. To make this transformation transparent, we added a break at the lower end of the log‑scaled y‑axis and relabelled the lowest tick as '0'. This ensures that mosquitoes with zero oocysts are shown along the x‑axis without being assigned an artificial value on the log scale. We would furthermore like to highlight that for statistics we used the true value 0 and not 0.001.
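
      The separation between display values and analysed values described above can be sketched as follows. The placeholder of 0.001 is the value given in the response; the function name and the counts are hypothetical.

```python
import math
from statistics import mean

ZERO_PLACEHOLDER = 0.001  # display-only stand-in for zero counts (value from the response)


def log10_display_values(counts, placeholder=ZERO_PLACEHOLDER):
    """Return log10 plotting positions; zeros are drawn at the placeholder position."""
    return [math.log10(c if c > 0 else placeholder) for c in counts]


# Hypothetical oocyst counts per midgut, for illustration only.
counts = [0, 3, 0, 120, 45]

positions = log10_display_values(counts)  # used only for drawing the points
true_mean = mean(counts)                  # statistics use the untouched counts, zeros included

print(positions)
print(true_mean)
```

      Keeping the substitution inside the plotting helper means the original counts are never overwritten, so the zeros enter the statistical model unchanged.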

      Figure 2D - it is great that the data from all feeding replicates has been shared, however it is difficult to conclude any meaningful impact in transmission with the knock-out lines when there is so much variation and so few mosquitoes dissected for some datapoints (10 mosquitoes are very small sample sizes). For example, Exp1 shows a clear decrease in mic19- transmission, but then Exp2 does not really show as great an effect. Similarly, why does the double knock out have better transmission than the single knockouts? Surely there would be a greater effect?

      We agree with the reviewer and with the new sentence added, as per major point, we hope we clarified the concept. Note that original Figure 2D has been moved to the supplementary information, as per minor comment of another reviewer.

      Figure 3 legend - Please add which statistical test was used and the number of replicates.

      Done

      Figure 4 legend - Please add which statistical test was used and the number of replicates.

      Done. Regarding replicates, note that while we measured over 100 cristae from over 30 mitochondria, these all stem from the same parasite culture.

      Figure 5C - the 3D reconstructions are very nice, but what does the red and yellow coloring show?

      Indeed, the information was missing. We added it to the figure legend.

      Line 352 - "Still, it is striking that, despite the pronounced morphological phenotype, and the possibly high mitochondrial stress levels, the parasites appeared mostly unaffected in life cycle propagation, raising questions about the functional relevance of mitochondria at these stages." How do the authors reconcile this statement with the proven fact that mitochondria-targeted antimalarials (such as atovaquone) are very potent inhibitors of parasite mosquito transmission?

      Our original sentence was reductive. What we wanted to state was related to the functional relevance of crista architecture and overall mitochondrial morphology rather than the general functional relevance of the mitochondria. We changed the sentence accordingly.

      Furthermore, even though we do not discuss this in the article, we are aware of mitochondria-targeting drugs that are known to block mosquito transmission. We want to point out that it is difficult to distinguish the effect of ETC disruption on energy conversion from its effect on the essential pathway of pyrimidine synthesis, which is highly relevant in microgamete formation. Still, a recent paper from Sparkes et al. 2024 showed the essentiality of mitochondrial ATP synthesis during gametogenesis, so it is very likely that mitochondrial energy conversion is highly relevant for transmission to the mosquito.

      Reviewer #1 (Significance (Required)):

      This manuscript is a novel approach to studying mitochondrial biology and does open a lot of unanswered questions for further research directions. Currently there are limitations in the use of statistical tests and detail of methodology, but these could easily be addressed with a bit more analysis/better explanation in the text. This manuscript could be of interest to readers with a general interest in mitochondrial cell biology and those within the specific field of Plasmodium research. My expertise is in Plasmodium cell biology.

      We thank the reviewer for the praise.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Major comments: 1) In my opinion, the authors tend to sensationalize or overinterpret their results. The title of the manuscript is very misleading. While MICOS is certainly important for crista formation, it is not the only factor, as ATP synthase dimer rows make a highly significant contribution to crista morphology. Thus, one can argue with equal validity that ATP synthase should be considered the 'architect', as it is the conformation of the dimers and rows that modulates positive curvature. Secondly, while cristae are still formed upon mic60/mic19 gene knockout (KO), they are severely deformed, and likely dysfunctional (see below). Thus, I do not agree with the title that MICOS is dispensable for crista formation, because the authors' results show that it clearly is essential. So, the title should be changed.

      We thank the reviewer for taking the time to review our manuscript.

      Based on the reviewers' interpretation we conclude the title does not come across as intended. We have changed the title to: "The role of MICOS in organizing mitochondrial cristae in malaria parasites"

      The Discussion section starting from line 373 also suffers from overinterpretation as well as being repetitive and hard to understand. The authors infer that MICOS stability is compromised less in the single KOs (sKO) compared to the mic60/mic19 double KO (dKO). MICOS stability was never directly addressed here and the composition of the MICOS complex is unaddressed, so it does not make sense to speculate by such tenuous connections. The data suggest to me that mic60 and mic19 are equally important for crista formation and crista junction (CJ) stabilization, and the dKO has a more severe phenotype than either KO, further demonstrating neither is epistatic.

      We do agree with the reviewer's notion that we did not address complex stability, and our wording did not make this sufficiently clear. We shortened and rephrased the paragraph in question.

      The following paragraphs (lines 387 to 422) continue with such unnecessary overinterpretation to the point that it is confusing and contradictory. Line 387 mentions an 'almost complete loss of CJs' and then line 411 mentions an increase in CJ diameter, both upon Mic60 ablation. I do not think this discussion brings any added value to the manuscript and should be shortened. Yes, maybe there are other putative MICOS subunits that may linger in the KOs that are further destabilized in the dKO, or maybe Mic60 remains in the mic19 KO (and vice versa) to somehow salvage more CJs, which is not possible in the dKO. It is impossible to say with confidence how ATP synthase behaves in the KOs with the current data.

      We shortened this paragraph.

      2) While the authors went to impressive lengths to detect any effect on lifecycle progression, none was found except for a reduction in oocyst count. However, the authors did not address any direct effect on mitochondria, such as OXPHOS complex assembly, respiration, or membrane potential. This seems like a missed opportunity, given the team's previous and very nice work mapping these complexes by complexome profiling. However, I think there are some experiments the authors can still do to address any mitochondrial defects using what they have and not resorting to complexome profiling (although this would be definitive if it is feasible):

      i) Quantification of MitoTracker Red staining in WT and KOs. The authors used this dye to visualize mitochondria to assay their gross morphology, but unfortunately not to assay membrane potential in the mutants. The authors can compare relative intensities of the different mitochondria types they categorized in Fig. 3A in 20-30 cells to determine if membrane potential is affected when the cristae are deformed in the mutants. One would predict they are affected.

      Interesting suggestion. As our staining and imaging conditions are suitable for such analysis (as demonstrated by Sarazin et al., 2025, https://www.biorxiv.org/content/10.1101/2025.11.27.690934v1), we performed the measurements on the same dataset which we collected for Figure 3. However, we did not detect any difference in MitoTracker intensity between the different lines. The result of this analysis is included in the new version of Supplementary figure S6.

      ii) Sporozoites are shown in Fig S5. The authors can use the same set up to track their motion, with the hypothesis that they will be slower in the mutants compared to WT due to less ATP. This assumes that sporozoite mitochondria are active as in gametocytes.

      While theoretically plausible and informative, we currently do not know the relevance of mitochondrial energy conversion for general sporozoite biology or specifically features of sporozoite movement. Given the required resources and time to set this experiment up and the uncertainty whether it is a relevant proxy for mitochondrial functioning, we argue it is out of scope for this manuscript.

      iii) Shotgun proteomics to compare protein levels in mutants compared to WT, with the hypothesis that OXPHOS complex subunits will be destabilized in the mutants with deformed cristae. This could be indirect evidence that OXPHOS assembly is affected, resulting in destabilized subunits that fail to incorporate into their respective complexes.

      While this experiment could potentially further our understanding of the interaction between MICOS and levels of OXPHOS complex subunits we argue that the indirect nature of the evidence does not justify the required investments.

      To expedite resubmission, the authors can restrict the cell lines to WT and the dKO, as the latter has a stronger phenotype than the individual KOs and conclusions from this cell line are valid for overall conclusions about Plasmodium MICOS.

      I will also conclude that complexome/shotgun proteomics may be a useful tool also for identifying other putative MICOS subunits by determining if proteins sharing the same complexome profile as PfMic60 and Mic19 are affected. This would address the overinterpretation problem of point 1.

      3) I am aware of the authors' previous work in which they were not able to detect cristae in ABS, and thus have concluded that these are truly acristate. This can very well be true, or there can be immature cristae forms that evaded detection at the resolution they used in their volumetric EM acquisitions. The mitochondria and gametocyte cristae are pretty small anyway, so it is not unreasonable to assume that putative rudimentary cristae in ABS may be even smaller still. Minute levels of sampled complex III and IV plus complex V dimers in ABS that were detected previously by the authors by complexome profiling would argue for the presence of minuscule and/or very few cristae.

      I think that the authors should hedge their claim that ABS is acristate by briefly stating that there still is a possibility that minuscule cristae may have been overlooked previously.

      We acknowledge that we cannot demonstrate the absolute absence of any membrane irregularities along the inner mitochondrial membrane. At the same time, if such structures were present, they would be extremely small and unlikely to contain the full set of proteins characteristic of mature cristae. For this reason, we consider it appropriate to classify ABS mitochondria as acristate. To reflect the reviewer's point while maintaining clarity for readers, we have slightly adjusted our wording in the manuscript, changing 'fully acristate' to 'acristate'.

      This brings me to the claim that Mic19 and Mic60 proteins are not expressed in ABS. This is based on the lack of signal from the epitope tag; a weak signal is detected in gametocytes. Thus, one can counter that Mic19 and Mic60 are also expressed, but below the expression limits of the assay, as the protein exhibits low expression levels when mitochondrial activity is upregulated.

      We agree with the reviewer that the absence of a detectable epitope-tag signal does not definitively exclude low-level expression, and we have therefore replaced the term 'absent' with 'undetectable' throughout the manuscript. In context with previous findings of low-level transcripts of the proteins in studies by Lopez-Barragan et al. and Otto et al., we also added the sentence "The apparent absence could indicate that transcripts are not translated in ABS or that the proteins' expression was below detection limits of western blot analysis." to the discussion. At the same time, we would like to clarify that transcript levels for both genes fall within the

      To address this point, the authors should determine if mature mic60 and mic19 mRNAs are detected in ABS in comparison to the dKO, which will lack either transcript. RT-qPCR using polyT primers can be employed to detect these transcripts. If the levels of these mRNAs are equivalent to dKO in WT ABS, the authors can make a pretty strong case for the absence of cristae in ABS.

      We appreciate the reviewer's suggestion. As noted in the Discussion, existing transcriptomic datasets already show detectable MIC19 and MIC60 mRNAs in ABS. For this reason, we expect RT-qPCR to reveal low (but not absent) levels of both transcripts, unlike the true loss expected to be observed in the dKO. Because such residual signals have been reported previously and their biological relevance remains uncertain, we do not believe transcript levels alone can serve as a definitive indicator of cristae absence in ABS.

      They should highlight the twin CX9C motifs that are a hallmark of Mic19 and other proteins that undergo oxidative folding via the MIA pathway. Interestingly, the Mia40 oxidoreductase that is central to MIA in yeast and animals, is absent in apicomplexans (DOI: 10.1080/19420889.2015.1094593).

      Searching for the CX9C motifs is a valuable suggestion. In response to the reviewer's suggestion we analysed the conservation of the motif in PfMIC19 and included this in a new figure panel (Figure 1 F).

      Did the authors try to align Plasmodium Mic19 orthologs with conventional Mic19s? This may reveal some conserved residues within and outside of the CHCH domain.

      In response to this comment we made Figure 1 F, where we show conserved residues within the CHCH domains of a broad range of MIC19 annotated sequences across the opisthokonts, and show that the Cx9C motifs are conserved also in PfMIC19. Outside the CHCH domain, we did not find any meaningful conservation, as PfMIC19 heavily diverges from opisthokont MIC19.

      5) Statistical significance. Sometimes my eyes see population differences that are considered insignificant by the statistical methods employed by the authors, eg Fig. 4E, mutants compared to WT, especially the dKO. Have the authors considered using other methods such as Student's t-test for pairwise comparisons?

      The graphs in figures 3, 4 and 5 have been redesigned and are now shown as violin plots on a linear scale (also following a suggestion from further down in the reviewer's comments). We believe that this improves interpretability. ANOVA was kept as the statistical test to ensure correction for multiple comparisons, which cannot be performed with a standard t-test. A full overview of statistics and exact p-values can also be found in the newly added supplementary information 2.
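
      The multiple-comparison concern can be illustrated with a short calculation of how the chance of at least one false positive grows when every pair of groups is tested separately. This is a sketch under the simplifying assumption of independent tests; the group labels are illustrative, and the response does not state which post-hoc correction the authors applied.

```python
from itertools import combinations


def familywise_error_rate(n_tests, alpha=0.05):
    """Probability of at least one false positive across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(n_tests, alpha=0.05):
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


# Illustrative group labels; any four groups give the same arithmetic.
groups = ["WT", "mic19-", "mic60-", "dKO"]
pairs = list(combinations(groups, 2))

n_pairs = len(pairs)
fwer = familywise_error_rate(n_pairs)
per_test = bonferroni_alpha(n_pairs)

print(n_pairs, round(fwer, 3), round(per_test, 4))
```

      With four groups there are six pairwise comparisons, so uncorrected t-tests at 0.05 each would give roughly a one-in-four chance of at least one spurious difference.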

      Minor comments: Line 33. Anaerobes (eg Giardia) have mitochondria that do not produce ATP, unlike aerobic mitochondria

      We acknowledge that producing ATP via OXPHOS is not a characteristic of all mitochondria-like organelles (e.g. mitosomes), which is why these are typically classified separately from canonical mitochondria. When not considering mitochondria-like organelles, energy conversion is the function that the mitochondrion is most well-known for and the one associated with cristae.

      Line 56: Unclear what authors mean by "canonical model of mitochondria"

      To clarify we changed this to "yeast or human" model of mitochondria.

      Lines 75-76: This applies to Mic10 only

      We removed the "high degree of conservation in other cristate eukaryotes" statement.

      Line 80: Cite DOI: 10.1016/j.cub.2020.02.053

      Done

      Fig 2D: I find this table difficult to read. If the authors keep the table format, at least get rid of the 'mean' column, as this data is better depicted in 2C. I suggest depicting this data either like in 3B, depicting the portion of infected vs unaffected flies in all experiments, then moving the modified Table to the supplement. Important to point out experiment 5 appears to be an outlier with reduced infectivity across all cell lines, including WT.

      To clarify: the mean reported in the table indicates the mean per replicate while the mean reported in figure 2C is the overall mean for a given genotype that corrects for variability within experiments. We agree that moving the table to the supplementary data is a good idea. We decided to not include a graph for infected and non-infected mosquitoes as this information would be partially misleading, highlighting a phenotype we argue to be influenced by the strong variability.

      Fig. 3C-G: I feel like these data repeatedly lead to same conclusions. These are all different ways of showing what is depicted in Fig 2B: mitochondria gross morphology is affected upon ablation of MICOS. I suggest that these graphs be moved to supplement and replaced by the beautiful images.

      Thank you for the nice comment on our images. We have now moved part of the graphs to supplementary figure 6 and only kept the Relative Frequency, Sphericity and total mitochondria volume per cell in the main figure.

      Line 180: Be more specific with which tubulin isoform is used as a male marker and state why this marker was used in supplemental Fig S6.

      We have now specified the exact tubulin isoform used as the male gametocyte marker, both in the main text and in Supplementary Fig. S6. This is a commercial antibody previously known to work as an effective male marker, which is why we selected it for this experiment. This is now clearly stated in the manuscript.

      Line 196 and Fig 3C: the word 'intensities' in this context is very ambiguous. Please choose a different term (puncta, elements, parts?). This is related to major point 2i above.

      To clarify the biological effect that we can conclude from the measurement, we added an explanation about it in the respective section of the results, and we decided to replace the raw results of the plug-in readout with the deduced relative dispersion.

      Line 222: Report male/female crista measurements

      We added Supplementary information 2, which contains exact statistical test and outcomes on all presented quantifications as well as a per-sex statistical analysis of the data from figure 4. Correspondingly, we extended supplementary information 2 by a per-sex colour code for the thin section TEM data.

      Fig. 4B-E: depict data as violin plots or scatter plots like Fig. 2C to get a better grasp of how the crista coverage is distributed. It seems like the data spread is wider in the double KO. This would also solve the problem with the standard deviation extending beyond 0%.

      We changed this accordingly.

      Lines 331-333: Please clarify that this applies for some, but not all MICOS subunits. Please also see major point 1 above. Also, the authors should point out that despite their structural divergence, trypanosomal cryptic mitofilins Mic34 and Mic40 are essential for parasite growth, in contrast to their findings with PfMic60 (DOI: https://doi.org/10.1101/2025.01.31.635831).

      This has been changed accordingly.

      Line 320: incorrect citation. Related to point 1 above.

      Correct citation is now included in the text.

      Lines 333-335. This is related to the above. Again, some subunits appear to affect cell growth under lab conditions, and some do not. This and the previous sentence should be rewritten to reflect this.

      This has been changed accordingly.

      Line 343-345: The sentence and citation 45 are strange. Regarding the former, it is about CHCHD10, whose status as a bona fide MICOS subunit is very tenuous, so I would omit this. About the phenomenon observed, I think it makes more sense to write that Mic60 ablation results in partially fragmented mitochondria in yeast (Rabl et al., 2009 J Cell Biol. 185: 1047-63). Mitochondrial fragmentation is often a physiological response to stress. I would just rewrite so as not to imply that mitochondrial fission (or fusion) is impaired in these KOs, or at least this could be one of several possibilities.

      The sentence has been replaced as the reviewer suggested, although we still include the data from human cells, as this has also been shown in Stephens et al. 2020.

      Line 373: 'This indicates' is too strong. I would say 'may suggest' as you have no proof that any of the KOs disrupts MICOS. This hypothesis can be tested by other means, but not by penetrance of a phenotype.

      Done

      Line 376-377: 'deplete functionality' does not make sense, especially in the context of talking about MICOS subunit stability. In my opinion, this paragraph overinterprets the KO effects on MICOS stability. None of the experiments address this phenomenon, and thus the authors should not try to interpret their results in this context. See major point 1.

      We removed the sentence. Also, the entire paragraph has been shortened and restructured, and the wording was changed to address major point 1.

      Other suggestions for added value

      1) Does Plasmodium Sam50 co-fractionate with Mic60 and Mic19 in BN PAGE (Fig. 1E)?

      While we did identify SAMM50 in our BN PAGE, the protein does not co-migrate with the MICOS components but instead co-migrates with other components of a putative sorting and assembly machinery (SAM) complex. As SAMM50, the SAM complex and the overarching putative mitochondrial intermembrane space bridging (MIB) complex are not mentioned in the manuscript, we decided to not include the information in the figure.

      Reviewer #2 (Significance (Required)):

      The manuscript by Tassan-Lugrezin is predicated on the idea that Plasmodium represents the only system in which de novo crista formation can be studied. They leverage this system to ask the question whether MICOS is essential for this process. They conclude based on their data that the answer is no, which the authors consider unprecedented. But even if their claim is true that ABS is acristate, this supposed advantage does not really bring any meaningful insight into how MICOS works in Plasmodium.

      First the positives of this manuscript. As has been the case with this research team, the manuscript is very sophisticated in the experimental approaches that are made. The highlights are the beautiful and often conclusive microscopy performed by the authors. Only the localization of Mic60 and Mic19 was inconclusive due to their very low expression unfortunately.

      The examination of the MICOS mutants during the in vitro life cycle of Plasmodium falciparum is extremely impressive and yields convincing results. Mitochondrial deformation is tolerated during life cycle stage differentiation, with a modest but significant reduction in oocyst production observed.

      However, despite the herculean efforts of the authors, the manuscript as it currently stands represents only a minor advance in our understanding of the evolution of MICOS, which from the title and focus of the manuscript, is the main goal of the authors. In its current form, the manuscript reports some potentially important findings:

      1) Mic60 is verified to play a role in crista formation, as is predicted by its orthology to other characterized Mic60 orthologs.

      2) The discovery of a novel Mic19 analog (since the authors maintain there is no significant sequence homology), which exhibits a similar (or the same?) complexome profile with Mic60. This protein was upregulated in gametocytes like Mic60 and phenocopies Mic60 KO.

      3) Both of these MICOS subunits are essential (not dispensable) for proper crista formation

      4) Surprisingly, neither MICOS subunit is essential for in vitro growth or differentiation from ABS to sexual stages, and from the latter to sporozoites. This says more about the biology of Plasmodium itself than anything about the essentiality of Mic60, i.e. Plasmodium life cycle progression tolerates defects in mitochondrial morphology. But yes, I agree with the authors that Mic60's apparent insignificance for cell growth in the examined conditions does differ from its essentiality in other eukaryotes. However, fitness costs were not assayed (e.g. by competition between mutants and WT in infection of mosquitoes).

      5) Decreased fitness of the mutants is implied by a reduction in oocyst formation.

      While interesting in their own way, collectively they do not represent a major advance in our understanding of MICOS evolution. Furthermore, the findings bifurcate into categories informing MICOS or Plasmodium biology. Both aspects are somewhat underdeveloped in their current form.

      This is unfortunate because there seem to be many missed opportunities in the manuscript that could, with additional experiments, lead to a manuscript with much wider impact. For me, what is remarkable about Plasmodium MICOS that sets it apart from other iterations is the apparent absence of the Mic10 subunit. Purification of plasmodium MICOS via the epitope tagged Mic60 and Mic19 could have verified that MICOS is assembled without this core subunit. Perhaps Mic60 and Mic19 are the vestiges of the complex, and thus operate alone in shaping cristae. Such a reduction may also suggest the declining importance of mitochondria in plasmodium.

      Another missed opportunity was to assay the impact of MICOS depletion on OXPHOS in Plasmodium. This is a salient issue, as crista morphology may be decoupled from OXPHOS capacity in Plasmodium, which links to the apparent tolerance of mitochondrial morphology defects in cell growth and differentiation. I suggested experiments in section A to address this deficit.

      Finally, the authors could assay fitness costs of MICOS ablation and associated phenotypes by testing whether mosquito infectivity is reduced in the mutants when they compete directly with WT Plasmodium. Like the authors, I am also surprised that MICOS mutants can pass the population bottlenecks represented by differentiation events. Perhaps the apparent robustness of differentiation contributes to Plasmodium's remarkable ability to adapt.

      I realize that the authors put a lot of effort into their study and again, I am very impressed by the sophistication of the methods employed. Nevertheless, I think there are still better ways to increase the impact of the study than overinterpreting the conclusions from the data. But this would require more experiments along the lines I suggest in Section A and here.

      We thank the reviewer for their extensive analysis of the significance of our findings, including the compliments on our microscopy images and the sophisticated experimental approaches. We hope we have convincingly argued why we could or could not include some of the additional analyses suggested by the reviewer in section 1 above.

      With regard to the significance statement, we want to point out that our finding that PfMICOS is not needed for the initial formation of cristae (as opposed to their organization) confirms something that the field has assumed but that has not been the actual focus of studies. We argue that the distinction between formation and organization of cristae is important and deserves some attention within the manuscript, and that the finding that MICOS is not involved in the initial formation of cristae is relevant to Plasmodium biology and beyond. As for insights into how MICOS works in Plasmodium, we have confirmed that the previously annotated PfMIC60 is indeed involved in the organization of cristae. Furthermore, we have identified and characterized PfMIC19. These findings, we argue, are indeed meaningful insights into PfMICOS.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary:

      MICOS is a conserved mitochondrial protein complex responsible for organising the mitochondrial inner membrane and the maintenance of cristae junctions. This study sheds first light on the role of two MICOS subunits (Mic60 and the newly annotated Mic19) in the malaria parasite Plasmodium falciparum, which forms cristae de novo during sexual development, as demonstrated by EM of thin section and electron tomography. By generating knockout lines (including a double knockout), the authors demonstrate that knockout of both MICOS subunits leads to defects in cristae morphology and a partial loss of cristae junctions. With a formidable set of parasitological assays, the authors show that despite the metabolically important role of mitochondria for gametocytes, the knockout lines can progress through the life stages and form sporozoites, albeit with diminished infection efficiency.

      We thank the reviewer for their time and compliment.

      Major comments:

      1) The authors should present their findings in the right context, in particular by:

      (i) giving a clearer description in the introduction of what is already known about the role of MICOS. This starts in the introduction, where one main finding is missing: loss of MICOS leads to loss of cristae junctions and the detachment of cristae membranes, which are nevertheless formed, but become membrane vesicles. This needs to be clearly stated in the introduction to allow the reader to understand the consistency of the authors' findings in P. falciparum with previous reports in the literature.

      We extended the introduction to include this information.

      (ii) at the end of the introduction, the motivating hypothesis is formulated ad hoc "conclusive evidence about its involvement in the initial formation of cristae is still lacking" (line 83). If there is evidence in the literature that MICOS is strictly required for cristae formation in any organism, then this should be explained, because the bona fide role of MICOS is maintenance of cristae junctions (the hypothesis is still plausible and its testing important).

      To clarify we rephrased the sentence to: "Although MICOS has been described as an organizer of crista junctions, its role during the initial formation of nascent cristae has not been investigated."

      2) Line 96-97: "Interestingly, PfMIC60 is much larger than the human MICOS counterpart, with a large, poorly predicted N-terminal extension." This statement is lacking a reference and presumably refers to annotated ORFs. The authors should clarify if the true N-terminus is definitely known - a 120kDa size is shown for the P. falciparum protein but this is not compared to the expected length or the size in S. cerevisiae.

      To solve the reference issue, we added the uniprot IDs we compared to see that the annotated ORF is bigger in Plasmodium. We also changed the comparison to yeast instead of human, because we realized it is confusing to compare to yeast all throughout the figure, but then talk about human in this specific sentence.

      Regarding whether the true N-terminus is known. Short answer: No, not exactly.

      However, we do know that the Pf version is about double the size of the yeast protein.

      As the reviewer correctly states, we show the size of 120kDa for the tagged protein in Figure 1G. Considering that we tagged the protein C-terminally, and observed a 120kDa product on western blot, it is safe to conclude that the true N-terminus does not deviate massively from the annotated ORF, and hence, that there is a considerable extension of the protein beyond a 60kDa protein. We do not directly compare to yeast MIC60 on our western blots, however, that comparison can be drawn from literature: Tarasenko et al., 2017 showed that purified MIC60 running at ~60kDa on SDS-PAGE actively bends membranes, suggesting that in its active form, the monomer of yeast MIC60 is indeed 60kDa in size.

      To clarify, we now emphasize that we ran the Alphafold prediction on the annotated open reading frame (annotated and sequenced by Bohme et al. and Chapell et al. now cited in the manuscript), and revised the wording to make clear what we are comparing in which sentence.

      3) lines 244-245: "Furthermore, our data indicates the effect size increases with simultaneous ablation of both proteins?". The authors should explain which data they are referring to, as some of the data in Fig 3 and 4 look similar and all significance tests relate to the wild type, not between the different mutants, so it is not clear if any observed differences are significant. The authors repeat this claim in the discussion in lines 368-369 without referring to a specific significance test. This needs to be clarified.

      In response to this and other reviewer comments, we added multiple-comparison testing across all samples. In addition, to clarify the statistics used, we included a supplementary dataset with all p-values and statistical tests.

      4) lines 304-306: "Though well established as the cristae organizing system, the role of MICOS in initial formation of cristae remains hidden in model organisms that constitutively display cristae.". This sentence is misleading since even in organisms that display numerous cristae throughout their life cycle, new cristae are being formed as the cells proliferate. Thus, failure to produce cristae in MICOS knockout lines would have been observable but has apparently not been reported in the literature. Thus, the concerted process in P. falciparum makes it a great model organism, but not fundamentally different to what has been studied before in other organisms.

      We deleted this statement.

      5) lines 373-378. "where ablation of just MIC60 is sufficient to deplete functionality of the entire MICOS (11, 15),". The authors' claim appears to be contrary to what is actually stated in ref 15, which they cite:

      "MICOS subunits have non-redundant functions as the absence of both MICOS subcomplexes results in more severe morphological and respiratory growth defects than deletion of single MICOS subunits or subcomplexes."

      This seems in line with what the authors show, rather than "different".

      This sentence has been removed.

      6) lines 380-385: "... thus suggesting that membrane invaginations still arise, but are not properly arranged in these knockout lines. This suggests that MICOS either isn't fully depleted,...". These conclusions are incompatible with findings from ref. 15, which the authors cite. In that study, the authors generated a ∆MICOS line which still forms membrane invaginations, showing that MICOS is not required at all for this process in yeast. Hence the authors' implication that MICOS needs to be fully depleted before membrane invaginations cease to occur is not supported by the literature.

      This sentence has been deleted in the revised version of the manuscript.

      7) The authors should consider if the first part of their title could be seen as misleading: It suggests that MICOS is "the architect" in cristae formation, but this is not consistent with the literature nor their own findings.

      Title is changed accordingly

      Minor comments:

      • Line 43, of the three seminal papers describing the discovery of MICOS in 2011, the authors only cite two (refs 6 and 7), but miss the third paper, Hoppins et al, PMID: 21987634, which should probably be corrected.

      Done, the paper is now cited

      • Page 2, line 58: for a more complete picture the authors should also cite the work of others here which shows that although at very low levels, e.g. complex III (a drug target) and ATP synthase do assemble (Nina et al, 2011, JBC).

      Done

      • Page 3, line 80: "Irrespective of the shape of an organism's cristae, the crista junctions have been described as tubular channels that connect the cristae membrane to the inner boundary membrane (22, 24)." This omits the slit-shaped cristae junctions found in yeast (Davies et al, 2011, PNAS), which the authors should include.

      The paper and concept have been added to the manuscript, though the sentence has been moved up in the introduction, when crista junctions are first introduced.

      • Line 97: "poorly predicted N-terminal extension", as there is no experimental structure, we don't know if the prediction is poor. Presumably the authors mean either poorly ordered or the absence of secondary structure elements, or the poor confidence score for that region in the prediction? This should be clarified or corrected.

      We were referring to the poor confidence score. To address this comment as well as major point 2, we rewrote the respective paragraph. It now clearly states that confidence of the prediction is low, and we mention the tool that was used to identify conserved domains (Topology-based Evolutionary Domains).

      • Line 98: "an antiparallel array of ten β-sheets". They are actually two parallel beta-sheets stacked together. The authors could find out the name of this fold, but the confidence of the prediction is marked as low/very low. So, its existence is unknown, not just its "function".

      We adapted the domain description to "a stack of two parallel beta-sheets" and replaced the statement on unknown function by the statement "Because this domain is predicted solely from computational analysis, both its actual existence in the native protein and its biological function remain unknown."

      Fig 1B: The authors show two alphafold predictions of S. cerevisiae and P. falciparum Mic60 structures. There is however an experimental Mic60/19 (fragment) structure from the former organism (PMID: 36044574), which should be included if possible

      We appreciate the reviewer's suggestion and note that the available structural data indeed provides valuable insight into how MIC60 and MIC19 interact. However, these structures represent fusion constructs of limited protein fragments and therefore capture only a small portion of each protein, specifically the interaction interface. Because our aim in Fig. 1B is to compare the overall domain architecture of the full‑length proteins, we believe that including fragment‑based structures would be less informative in this context.

      Lines 318-321: "The same trend was observed for PfMIC19 and PfMIC60. Although transcriptomic data suggested that low-level transcripts of PfMIC19 and PfMIC60 are present in ABS (38), we did not detect either of the proteins in ABS by western blot analysis." While this statement is true, the authors should comment on the sensitivity of the respective methods - how well was the antibody working in their hands and how do they interpret the absence of a WB band compared to transcriptomics data?

      The HA antibody used in our experiments is a standard commercial reagent that performs reliably in both WB and IFA, although it shows a low background signal in gametocytes. We agree that the sensitivity of the method and the interpretation of weak or absent bands should be addressed explicitly. Transcript levels for both PfMIC19 and PfMIC60 in asexual blood stages fall within the

      • Lines 322-323: would the authors not typically have expected an IFA signal given the strength of the band in Western blot? If possible, the authors should comment if the negative fluorescence outcome can indeed be explained with the low abundance or if technical challenges are an equally good explanation.

      Considering the nature of the investigated proteins (embedded in the IMM and spread throughout the mitochondria), difficulties in achieving a clear signal in IFA or U-ExM are not very surprising. While epitopes may remain buried in IFA, U-ExM usually increases accessibility for the antibodies. However, U-ExM comes at the cost of being prone to dotty background signals, therefore potentially hiding low-abundance, naturally dotty signals such as the signal of MICOS proteins that localize to distinct foci (at the CJ) along the mitochondrion. Current literature suggests that, in both human and yeast, STED is the preferred method for accurate spatial resolution of MICOS proteins (https://www.ncbi.nlm.nih.gov/pubmed/32567732, https://www.ncbi.nlm.nih.gov/pubmed/32067344). Unfortunately, we do not have experience with, nor access to, this particular technique/method.

      Lines 357-365: the authors describe limitations of the applied methods adequately. Perhaps it would be helpful to make a similar statement about the analysis of 3D objects like mitochondria and cristae from 2D sections. E.g. the apparent cristae length depends on whether cristae are straight (e.g. coiled structures do not display long cross sections despite their true length in 3D).

      The limitations of other methods are described in the respective results section.

      We added a clarifying sentence in the results section of Figure 4:

      "Note that such measurements do not indicate the true total length or width of cristae, as the data is two-dimensional. The recorded values are to be considered indicative of possible trends, rather than absolute dimensions of cristae."

      This statement refers to the length/width measurements of cristae.

      In the context of Figure 4D we mention the following (see preprint lines 229 - 230): "We expect this effect to translate into the third dimension and thus conclude that the mean crista volume increases with the loss of either PfMIC19, PfMIC60, or both."

      For Figure 5, we included a clarifying statement in the results section of the preprint (lines 269 - 273): "Note that these mitochondrial volumes are not full mitochondria, but large segments thereof. As a result of the incompleteness of the mitochondria within the section, and the tomography specific artefact of the missing wedge, we were unable to confirm whether cristae were in fact fully detached from the boundary membrane, or just too long to fit within the observable z-range."

      Line 404: perhaps undetected or similar would be a better description than "hidden"?

      The sentence does not exist in the revised manuscript

      Reviewer #3 (Significance (Required)):

      The main strength of the study is that it provides the first characterisation of the MICOS complex in P. falciparum, a human parasite in which the mitochondrion has been shown to be a drug target. Mic60 and the newly annotated Mic19 are confirmed to be essential for proper cristae formation and morphology, as well as overall mitochondrial morphology. Furthermore, the mutant lines are characterised for their ability to complete the parasite life cycle and defects in infection effectivity are observed. This work is an important first step for deciphering the role of MICOS in the malaria parasite and the composition and function of this complex in this organism.

      The limitation of the study stems from what is already known about MICOS and its subunits in other organisms. MICOS subunit knockouts have been characterised in great detail in yeast and humans with similar findings regarding loss of cristae and cristae defects. The findings of this study do not provide dramatic new insight on MICOS function or go substantially beyond the vast existing literature in terms of the extent of the study, which focuses on parasitological assays and morphological analysis.

      Exploring the role of MICOS in an early-divergent organism and human parasite is however important given the divergence found in mitochondrial biology and P. falciparum is a uniquely suited model system. One aspect that would increase the impact of the paper would be if the authors could mechanistically link the observed morphological defects to the decreased infection efficiency, e.g. by probing effects on mitochondrial function. This will likely be challenging as the morphological defects are diverse and the fitness defects appear moderate/mild.

      As suggested by Reviewer 2, we examined mitochondrial membrane potential in gametocytes using MitoTracker staining and did not observe any obvious differences associated with the morphological defects. At present, additional assays to probe mitochondrial function in P. falciparum gametocytes are not sufficiently established, and developing and validating such methods would require substantial work before they could be applied to our mutant lines. For these reasons, a more detailed mechanistic link between the observed morphological changes and the reduced infection efficiency is currently beyond reach.

      The advance presented in this study is to pioneer the study of MICOS in P. falciparum, thus widening our understanding of the role of this complex to a different model organism. This study will likely be mainly of interest for specialised audiences such as basic research parasitologists and mitochondrial biologists. My own field of expertise is mitochondrial biology and structural biology.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      MICOS is a conserved mitochondrial protein complex responsible for organising the mitochondrial inner membrane and the maintenance of cristae junctions. This study sheds first light on the role of two MICOS subunits (Mic60 and the newly annotated Mic19) in the malaria parasite Plasmodium falciparum, which forms cristae de novo during sexual development, as demonstrated by EM of thin section and electron tomography. By generating knockout lines (including a double knockout), the authors demonstrate that knockout of both MICOS subunits leads to defects in cristae morphology and a partial loss of cristae junctions. With a formidable set of parasitological assays, the authors show that despite the metabolically important role of mitochondria for gametocytes, the knockout lines can progress through the life stages and form sporozoites, albeit with diminished infection efficiency.

      Major comments:

      1) The authors should present their findings in the right context, in particular by:

      (i) giving a clearer description in the introduction of what is already known about the role of MICOS. This starts in the introduction, where one main finding is missing: loss of MICOS leads to loss of cristae junctions and the detachment of cristae membranes, which are nevertheless formed, but become membrane vesicles. This needs to be clearly stated in the introduction to allow the reader to understand the consistency of the authors' findings in P. falciparum with previous reports in the literature.

      (ii) at the end of the introduction, the motivating hypothesis is formulated ad hoc "conclusive evidence about its involvement in the initial formation of cristae is still lacking" (line 83). If there is evidence in the literature that MICOS is strictly required for cristae formation in any organism, then this should be explained, because the bona fide role of MICOS is maintenance of cristae junctions (the hypothesis is still plausible and its testing important).

      2) Line 96-97: "Interestingly, PfMIC60 is much larger than the human MICOS counterpart, with a large, poorly predicted N-terminal extension." This statement is lacking a reference and presumably refers to annotated ORFs. The authors should clarify if the true N-terminus is definitely known - a 120kDa size is shown for the P. falciparum protein but this is not compared to the expected length or the size in S. cerevisiae.

      3) lines 244-245: "Furthermore, our data indicates the effect size increases with simultaneous ablation of both proteins?". The authors should explain which data they are referring to, as some of the data in Fig 3 and 4 look similar and all significance tests relate to the wild type, not between the different mutants, so it is not clear if any observed differences are significant. The authors repeat this claim in the discussion in lines 368-369 without referring to a specific significance test. This needs to be clarified.

      4) lines 304-306: "Though well established as the cristae organizing system, the role of MICOS in initial formation of cristae remains hidden in model organisms that constitutively display cristae.". This sentence is misleading since even in organisms that display numerous cristae throughout their life cycle, new cristae are being formed as the cells proliferate. Thus, failure to produce cristae in MICOS knockout lines would have been observable but has apparently not been reported in the literature. Thus, the concerted process in P. falciparum makes it a great model organism, but not fundamentally different to what has been studied before in other organisms.

      5) lines 373-378. "where ablation of just MIC60 is sufficient to deplete functionality of the entire MICOS (11, 15),". The authors' claim appears to be contrary to what is actually stated in ref 15, which they cite:

      "MICOS subunits have non-redundant functions as the absence of both MICOS subcomplexes results in more severe morphological and respiratory growth defects than deletion of single MICOS subunits or subcomplexes."

      This seems in line with what the authors show, rather than "different".

      6) lines 380-385: "... thus suggesting that membrane invaginations still arise, but are not properly arranged in these knockout lines. This suggests that MICOS either isn't fully depleted,...". These conclusions are incompatible with findings from ref. 15, which the authors cite. In that study, the authors generated a ∆MICOS line which still forms membrane invaginations, showing that MICOS is not required at all for this process in yeast. Hence the authors' implication that MICOS needs to be fully depleted before membrane invaginations cease to occur is not supported by the literature.

      7) The authors should consider if the first part of their title could be seen as misleading: It suggests that MICOS is "the architect" in cristae formation, but this is not consistent with the literature nor their own findings.

      Minor comments:

      • Line 43, of the three seminal papers describing the discovery of MICOS in 2011, the authors only cite two (refs 6 and 7), but miss the third paper, Hoppins et al, PMID: 21987634, which should probably be corrected.
      • Page 2, line 58: for a more complete picture the authors should also cite the work of others here which shows that although at very low levels, e.g. complex III (a drug target) and ATP synthase do assemble (Nina et al, 2011, JBC).
      • Page 3, line 80: "Irrespective of the shape of an organism's cristae, the crista junctions have been described as tubular channels that connect the cristae membrane to the inner boundary membrane (22, 24)." This omits the slit-shaped cristae junctions found in yeast (Davies et al, 2011, PNAS), which the authors should include.
      • Line 97: "poorly predicted N-terminal extension", as there is no experimental structure, we don't know if the prediction is poor. Presumably the authors mean either poorly ordered or the absence of secondary structure elements, or the poor confidence score for that region in the prediction? This should be clarified or corrected.
      • Line 98: "an antiparallel array of ten β-sheets". They are actually two parallel beta-sheets stacked together. The authors could find out the name of this fold, but the confidence of the prediction is marked as low/very low. So, its existence is unknown, not just its "function".
      • Fig 1B: The authors show two alphafold predictions of S. cerevisiae and P. falciparum Mic60 structures. There is however an experimental Mic60/19 (fragment) structure from the former organism (PMID: 36044574), which should be included if possible
      • Lines 318-321: "The same trend was observed for PfMIC19 and PfMIC60. Although transcriptomic data suggested that low-level transcripts of PfMIC19 and PfMIC60 are present in ABS (38), we did not detect either of the proteins in ABS by western blot analysis." While this statement is true, the authors should comment on the sensitivity of the respective methods - how well was the antibody working in their hands and how do they interpret the absence of a WB band compared to transcriptomics data?
      • Lines 322-323: would the authors not typically have expected an IFA signal given the strength of the band in Western blot? If possible, the authors should comment if the negative fluorescence outcome can indeed be explained with the low abundance or if technical challenges are an equally good explanation.
      • Lines 357-365: the authors describe limitations of the applied methods adequately. Perhaps it would be helpful to make a similar statement about the analysis of 3D objects like mitochondria and cristae from 2D sections. E.g. the apparent cristae length depends on whether cristae are straight (e.g. coiled structures do not display long cross sections despite their true length in 3D).
      • Line 404: perhaps undetected or similar would be a better description than "hidden"?

      Significance

      The main strength of the study is that it provides the first characterisation of the MICOS complex in P. falciparum, a human parasite in which the mitochondrion has been shown to be a drug target. Mic60 and the newly annotated Mic19 are confirmed to be essential for proper cristae formation and morphology, as well as overall mitochondrial morphology. Furthermore, the mutant lines are characterised for their ability to complete the parasite life cycle and defects in infection effectivity are observed. This work is an important first step for deciphering the role of MICOS in the malaria parasite and the composition and function of this complex in this organism.

      The limitation of the study stems from what is already known about MICOS and its subunits in other organisms. MICOS subunit knockouts have been characterised in great detail in yeast and humans with similar findings regarding loss of cristae and cristae defects. The findings of this study do not provide dramatic new insight on MICOS function or go substantially beyond the vast existing literature in terms of the extent of the study, which focuses on parasitological assays and morphological analysis.

      Exploring the role of MICOS in an early-divergent organism and human parasite is, however, important given the divergence found in mitochondrial biology, and P. falciparum is a uniquely suited model system. One aspect that would increase the impact of the paper would be if the authors could mechanistically link the observed morphological defects to the decreased infection efficiency, e.g. by probing effects on mitochondrial function. This will likely be challenging as the morphological defects are diverse and the fitness defects appear moderate/mild.

      The advance presented in this study is to pioneer the study of MICOS in P. falciparum, thus widening our understanding of the role of this complex to different model organisms. This study will likely be mainly of interest for specialised audiences such as basic research parasitologists and mitochondrial biologists. My own field of expertise is mitochondrial biology and structural biology.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Major comments:

      1) In my opinion, the authors tend to sensationalize or overinterpret their results. The title of the manuscript is very misleading. While MICOS is certainly important for crista formation, it is not the only factor, as ATP synthase dimer rows make a highly significant contribution to crista morphology. Thus, one can argue with equal validity that ATP synthase should be considered the 'architect', as it is the conformation of the dimers and rows that modulates positive curvature. Secondly, while cristae are still formed upon mic60/mic19 gene knockout (KO), they are severely deformed, and likely dysfunctional (see below). Thus, I do not agree with the title that MICOS is dispensable for crista formation, because the authors' results show that it clearly is essential. So, the title should be changed.

      The Discussion section starting from line 373 also suffers from overinterpretation as well as being repetitive and hard to understand. The authors infer that MICOS stability is compromised less in the single KOs (sKO) compared to the mic60/mic19 double KO (dKO). MICOS stability was never directly addressed here and the composition of the MICOS complex is unaddressed, so it does not make sense to speculate by such tenuous connections. The data suggest to me that mic60 and mic19 are equally important for crista formation and crista junction (CJ) stabilization, and the dKO has a more severe phenotype than either KO, further demonstrating neither is epistatic.

      The following paragraphs (lines 387 to 422) continue with such unnecessary overinterpretation to the point that it is confusing and contradictory. Line 387 mentions an 'almost complete loss of CJs' and then line 411 mentions an increase in CJ diameter, both upon Mic60 ablation. I do not think this discussion brings any added value to the manuscript and should be shortened. Yes, maybe there are other putative MICOS subunits that may linger in the KOs that are further destabilized in the dKO, or maybe Mic60 remains in the mic19 KO (and vice versa) to somehow salvage more CJs, which is not possible in the dKO. It is impossible to say with confidence how ATP synthase behaves in the KOs with the current data.

      2) While the authors went to impressive lengths to detect any effect on lifecycle progression, none was found except for a reduction in oocyst count. However, the authors did not address any direct effect on mitochondria, such as OXPHOS complex assembly, respiration, or membrane potential. This seems like a missed opportunity, given the team's previous and very nice work mapping these complexes by complexome profiling. However, I think there are some experiments the authors can still do to address any mitochondrial defects using what they have and not resorting to complexome profiling (although this would be definitive if it is feasible):

      i) Quantification of MitoTracker Red staining in WT and KOs. The authors used this dye to visualize mitochondria to assay their gross morphology, but unfortunately not to assay membrane potential in the mutants. The authors can compare relative intensities of the different mitochondria types they categorized in Fig. 3A in 20-30 cells to determine if membrane potential is affected when the cristae are deformed in the mutants. One would predict they are affected.

      ii) Sporozoites are shown in Fig S5. The authors can use the same set up to track their motion, with the hypothesis that they will be slower in the mutants compared to WT due to less ATP. This assumes that sporozoite mitochondria are active as in gametocytes.

      iii) Shotgun proteomics to compare protein levels in mutants compared to WT, with the hypothesis that OXPHOS complex subunits will be destabilized in the mutants with deformed cristae. This could be indirect evidence that OXPHOS assembly is affected, resulting in destabilized subunits that fail to incorporate into their respective complexes.

      To expedite resubmission, the authors can restrict the cell lines to WT and the dKO, as the latter has a stronger phenotype than the individual KOs and conclusions from this cell line are valid for overall conclusions about Plasmodium MICOS.

      I will also conclude that complexome/shotgun proteomics may be a useful tool also for identifying other putative MICOS subunits by determining if proteins sharing the same complexome profile as PfMic60 and Mic19 are affected. This would address the overinterpretation problem of point 1.

      3) I am aware of the authors' previous work in which they were not able to detect cristae in ABS, and thus have concluded that these are truly acristate. This can very well be true, or there can be immature cristae forms that evaded detection at the resolution they used in their volumetric EM acquisitions. The mitochondria and gametocyte cristae are pretty small anyway, so it is not unreasonable to assume that putative rudimentary cristae in ABS may be even smaller still. Minute levels of sampled complex III and IV plus complex V dimers in ABS that were detected previously by the authors by complexome profiling would argue for the presence of minuscule and/or very few cristae.

      I think that the authors should hedge their claim that ABS is acristate by briefly stating that there still is a possibility that minuscule cristae may have been overlooked previously.

      This brings me to the claim that Mic19 and Mic60 proteins are not expressed in ABS. This is based on the lack of signal from the epitope tag; a weak signal is detected in gametocytes. Thus, one can counter that Mic19 and Mic60 are also expressed, but below the expression limits of the assay, as the protein exhibits low expression levels when mitochondrial activity is upregulated.

      To address this point, the authors should determine if mature mic60 and mic19 mRNAs are detected in ABS in comparison to the dKO, which will lack either transcript. RT-qPCR using polyT primers can be employed to detect these transcripts. If the levels of these mRNAs in WT ABS are equivalent to those in the dKO, the authors can make a pretty strong case for the absence of cristae in ABS.

      4) The major finding of the manuscript, a Mic19 analog in Plasmodium, should be highlighted. As far as I know, this manuscript could represent the first instance of Mic19 outside of opisthokonts that was not found by sensitive profile HMM searches and certainly the first time such a Mic19 was functionally analyzed.

      They should highlight the twin CX9C motifs that are a hallmark of Mic19 and other proteins that undergo oxidative folding via the MIA pathway. Interestingly, the Mia40 oxidoreductase that is central to MIA in yeast and animals, is absent in apicomplexans (DOI: 10.1080/19420889.2015.1094593).

      Did the authors try to align Plasmodium Mic19 orthologs with conventional Mic19s? This may reveal some conserved residues within and outside of the CHCH domain.

      5) Statistical significance. Sometimes my eyes see population differences that are considered insignificant by the statistical methods employed by the authors, eg Fig. 4E, mutants compared to WT, especially the dKO. Have the authors considered using other methods such as Student's t-test for pairwise comparisons?

      Minor comments:

      Line 33. Anaerobes (eg Giardia) have mitochondria that do not produce ATP, unlike aerobic mitochondria

      Line 56: Unclear what authors mean by "canonical model of mitochondria"

      Lines 75-76: This applies to Mic10 only

      Line 80: Cite DOI: 10.1016/j.cub.2020.02.053

      Fig 2D: I find this table difficult to read. If the authors keep the table format, at least get rid of the 'mean' column, as this data is better depicted in 2C. I suggest depicting this data like in 3B, showing the proportion of infected vs uninfected flies in all experiments, then moving the modified table to the supplement. It is important to point out that experiment 5 appears to be an outlier with reduced infectivity across all cell lines, including WT.

      Fig. 3C-G: I feel like these data repeatedly lead to the same conclusions. These are all different ways of showing what is depicted in Fig 2B: mitochondria gross morphology is affected upon ablation of MICOS. I suggest that these graphs be moved to the supplement and replaced by the beautiful images.

      Line 180: Be more specific with which tubulin isoform is used as a male marker and state why this marker was used in supplemental Fig S6.

      Line 196 and Fig 3C: the word 'intensities' in this context is very ambiguous. Please choose a different term (puncta, elements, parts?). This is related to major point 2i above.

      Line 222: Report male/female crista measurements

      Fig. 4B-E: depict data as violin plots or scatter plots like Fig. 2C to get a better grasp of how the crista coverage is distributed. It seems like the data spread is wider in the double KO. This would also solve the problem with the standard deviation extending beyond 0%.

      Lines 331-333: Please clarify that this applies for some, but not all MICOS subunits. Please also see major point 1 above. Also, the authors should point out that despite their structural divergence, trypanosomal cryptic mitofilins Mic34 and Mic40 are essential for parasite growth, in contrast to their findings with PfMic60 (DOI: https://doi.org/10.1101/2025.01.31.635831).

      Line 320: incorrect citation. Related to point 1 above.

      Lines 333-335. This is related to the above. Again, some subunits appear to affect cell growth under lab conditions, and some do not. This and the previous sentence should be rewritten to reflect this.

      Line 343-345: The sentence and citation 45 are strange. Regarding the former, it is about CHCHD10, whose status as a bona fide MICOS subunit is very tenuous, so I would omit this. About the phenomenon observed, I think it makes more sense to write that Mic60 ablation results in partially fragmented mitochondria in yeast (Rabl et al., 2009 J Cell Biol. 185: 1047-63). Fragmented mitochondria are often a physiological response to stress. I would just rewrite so as not to imply that mitochondrial fission (or fusion) is impaired in these KOs, or at least state that this could be one of several possibilities.

      Line 373: 'This indicates' is too strong. I would say 'may suggest' as you have no proof that any of the KOs disrupts MICOS. This hypothesis can be tested by other means, but not by penetrance of a phenotype.

      Line 376-377; 'deplete functionality' does not make sense, especially in the context of talking about MICOS subunit stability. In my opinion, this paragraph overinterprets the KO effects on MICOS stability. None of the experiments address this phenomenon, and thus the authors should not try to interpret their results in this context. See major point 1.

      Other suggestions for added value

      1) Does Plasmodium Sam50 co-fractionate with Mic60 and Mic19 in BN PAGE (Fig. 1E)?

      2) Can AlphaFold3 predict a heterotetramer of PfMic60? What about the four Mic19 and Mic60 subunits together? Is this tetramer consistent with the Bock-Bierbaum model? Is this model consistent with the CJ diameter measured in Plasmodium, which is perhaps better evidence than that in lines 419-422?

      Significance

      The manuscript by Tassan-Lugrezin is predicated on the idea that Plasmodium represents the only system in which de novo crista formation can be studied. They leverage this system to ask the question whether MICOS is essential for this process. They conclude based on their data that the answer is no, which the authors consider unprecedented. But even if their claim is true that ABS is acristate, this supposed advantage does not really bring any meaningful insight into how MICOS works in Plasmodium.

      First the positives of this manuscript. As has been the case with this research team, the manuscript is very sophisticated in the experimental approaches that are made. The highlights are the beautiful and often conclusive microscopy performed by the authors. Only the localization of Mic60 and Mic19 was inconclusive due to their very low expression unfortunately.

      The examination of the MICOS mutants during the in vitro life cycle of Plasmodium falciparum is extremely impressive and yields convincing results. Mitochondrial deformation is tolerated by life cycle stage differentiation, with a modest but significant reduction of oocyst production being observed.

      However, despite the herculean efforts of the authors, the manuscript as it currently stands represents only a minor advance in our understanding of the evolution of MICOS, which from the title and focus of the manuscript, is the main goal of the authors.

      In its current form, the manuscript reports some potentially important findings:

      1) Mic60 is verified to play a role in crista formation, as is predicted by its orthology to other characterized Mic60 orthologs.

      2) The discovery of a novel Mic19 analog (since the authors maintain there is no significant sequence homology), which exhibits a similar (or the same?) complexome profile with Mic60. This protein was upregulated in gametocytes like Mic60 and phenocopies Mic60 KO.

      3) Both of these MICOS subunits are essential (not dispensable) for proper crista formation

      4) Surprisingly, neither MICOS subunit is essential for in vitro growth or differentiation from ABS to sexual stages, and from the latter to sporozoites. This says more about the biology of Plasmodium itself than anything about the essentiality of Mic60, ie Plasmodium life cycle progression tolerates defects to mitochondrial morphology. But yes, I agree with the authors that Mic60's apparent insignificance for cell growth in examined conditions does differ with its essentiality in other eukaryotes. But fitness costs were not assayed (eg by competition between mutants and WT in infection of mosquitoes).

      5) Decreased fitness of the mutants is implied by a reduction of oocyst formation.

      While interesting in their own way, collectively they do not represent a major advance in our understanding of MICOS evolution. Furthermore, the findings bifurcate into categories informing MICOS or Plasmodium biology. Both aspects are somewhat underdeveloped in their current form.

      This is unfortunate because there seem to be many missed opportunities in the manuscript that could, with additional experiments, lead to a manuscript with much wider impact.

      For me, what is remarkable about Plasmodium MICOS that sets it apart from other iterations is the apparent absence of the Mic10 subunit. Purification of plasmodium MICOS via the epitope tagged Mic60 and Mic19 could have verified that MICOS is assembled without this core subunit. Perhaps Mic60 and Mic19 are the vestiges of the complex, and thus operate alone in shaping cristae. Such a reduction may also suggest the declining importance of mitochondria in plasmodium.

      Another missed opportunity was to assay the impact of MICOS depletion on OXPHOS in Plasmodium. This is a salient issue as maybe crista morphology is decoupled from OXPHOS capacity in Plasmodium, which links to the apparent tolerance of altered mitochondrial morphology in cell growth and differentiation. I suggested in section A experiments to address this deficit.

      Finally, the authors could assay fitness costs of MICOS ablation and associated phenotypes by assaying whether mosquito infectivity is reduced in the mutants when they are directly competing with WT Plasmodium. Like the authors, I am also surprised that MICOS mutants can pass population bottlenecks represented by differentiation events. Perhaps the apparent robustness of differentiation may contribute to Plasmodium's remarkable ability to adapt.

      I realize that the authors put a lot of effort into their study and again, I am very impressed by the sophistication of the methods employed. Nevertheless, I think there are still better ways to increase the impact of the study aside from overinterpreting the conclusions from the data. But this would require more experiments along the lines I suggest in Section A and here.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary: This manuscript reports the identification of putative orthologues of mitochondrial contact site and cristae organizing system (MICOS) proteins in Plasmodium falciparum, an organism that unusually shows an acristate mitochondrion during the asexual part of its life cycle and then develops cristae as it enters the sexual stage of its life cycle and beyond into the mosquito. The authors identify PfMIC60 and PfMIC19 as putative members and study these in detail. The authors add HA tags to both proteins, look for timing of expression during the parasite life cycle, and attempt (unsuccessfully) to localise them within the parasite. They also genetically deleted both genes singly and in combination and phenotyped the effect on parasite development. They show that both proteins are expressed in gametocytes and not asexuals, suggesting they are present at the same time as cristae development. They also show that the proteins are dispensable for the entire parasite life cycle investigated (asexuals through to sporozoites); however, there is some reduction in mosquito transmission. Using EM techniques they show that the morphology of gametocyte mitochondria is abnormal in the knockout lines, although there is great variation.

      Major comments: The manuscript is interesting and is an intriguing use of a well studied organism of medical importance to answer fundamental biological questions. My main comments are that there should be greater detail in areas around methodology and statistical tests used. Also, the mosquito transmission assays (which are notoriously difficult to perform) show substantial variation between replicates and the statistical tests and data presentation are not clear enough to conclude the reduction in transmission that is claimed. Perhaps this could be improved with clearer text?

      More specific comments to address:

      Line 101/Fig1E (and figure legend) - What is this heatmap showing? It would be helpful to have a sentence or two linking it to a specific methodology. I could not find details in the M+M section and "specialized, high molecular mass gels" does not adequately explain what experiments were performed. The reference to Supplementary Information 1 also did not provide information.

      Line 115 and Supplementary Figure 2C + D - The main text says that the transgenic parasites contained a mitochondrially localized mScarlet for visualization and localization, but supplementary figure 2 shows MitoTracker labelling rather than mScarlet. This is very confusing. The figure legend also mentions both mScarlet and MitoTracker. I assume that mScarlet was used to view in regular IFAs (Fig S2C) and the MitoTracker was used for the expansion microscopy (Fig S2D)? Please clarify.

      Figure 2C - What is the statistical test being used (the methods say "Mean oocysts per midgut and statistical significance were calculated using a generalized linear mixed effect model with a random experiment effect under a negative binomial distribution." but what test is this?)? Also, a log10 scale for oocyst intensity is an unusual choice - how are the mosquitoes with 0 oocysts being represented on this graph? It looks like they are being plotted at 10^-1 (which would be 0.1 oocysts in a mosquito, which would be impossible).

      Figure 2D - It is great that the data from all feeding replicates has been shared, however it is difficult to conclude any meaningful impact on transmission with the knock-out lines when there is so much variation and so few mosquitoes dissected for some datapoints (10 mosquitoes is a very small sample size). For example, Exp1 shows a clear decrease in mic19- transmission, but then Exp2 does not really show as great an effect. Similarly, why does the double knock out have better transmission than the single knockouts? Surely there would be a greater effect?

      Figure 3 legend - Please add which statistical test was used and the number of replicates.

      Figure 4 legend - Please add which statistical test was used and the number of replicates.

      Figure 5C - The 3D reconstructions are very nice, but what does the red and yellow coloring show?

      Line 352 - "Still, it is striking that, despite the pronounced morphological phenotype, and the possibly high mitochondrial stress levels, the parasites appeared mostly unaffected in life cycle propagation, raising questions about the functional relevance of mitochondria at these stages." How do the authors reconcile this statement with the proven fact that mitochondria-targeted antimalarials (such as atovaquone) are very potent inhibitors of parasite mosquito transmission?
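
      The Figure 2C question about zero counts on a log10 axis can be made concrete with a small sketch. It is an illustration only: the function names and the example counts are hypothetical, and nothing here is taken from the authors' analysis. It contrasts two common conventions, adding 1 before taking the logarithm versus substituting a floor value such as 0.1 for zeros, the latter being what plotting zeros at 10^-1 amounts to.

```python
import numpy as np

def log10_plus_one(counts):
    """Transform counts as log10(count + 1); zero counts map to 0."""
    counts = np.asarray(counts, dtype=float)
    return np.log10(counts + 1.0)

def log10_with_floor(counts, floor=0.1):
    """Replace zero counts by a floor value before log10.

    With floor=0.1 a mosquito with 0 oocysts is drawn at the 10^-1
    position, i.e. at a value that is not an observable count.
    """
    counts = np.asarray(counts, dtype=float)
    return np.log10(np.where(counts == 0, floor, counts))

if __name__ == "__main__":
    example_counts = [0, 0, 1, 9, 99]  # hypothetical oocysts per midgut
    print(log10_plus_one(example_counts))
    print(log10_with_floor(example_counts))
```

      Whichever convention is used, stating it in the figure legend lets readers tell zero-oocyst mosquitoes apart from low non-zero counts.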

      Significance

      This manuscript is a novel approach to studying mitochondrial biology and does open a lot of unanswered questions for further research directions. Currently there are limitations in the use of statistical tests and detail of methodology, but these could easily be addressed with a bit more analysis/better explanation in the text. This manuscript could be of interest to readers with a general interest in mitochondrial cell biology and those within the specific field of Plasmodium research.

      My expertise is in Plasmodium cell biology.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      (1) I have to admit that it took a few hours of intense work to understand this paper and to even figure out where the authors were coming from. The problem setting, nomenclature, and simulation methods presented in this paper do not conform to the notation common in the field, are often contradictory, and are usually hard to understand. Most importantly, the problem that the paper is trying to solve seems to me to be quite specific to the particular memory study in question, and is very different from the normal setting of model-comparative RSA that I (and I think other readers) may be more familiar with.

      We have revised the paper for clarity at all levels: motivation, application, and parameterization. We clarify that there is a large unmet need for using RSA in a trial-wise manner, and that this approach indeed offers benefits to any team interested in decoding trial-wise representational information linked to behavioral responses, and as such is not a problem specific to a single memory study.

      (2) The definition of "classical RSA" that the authors are using is very narrow. The group around Niko Kriegeskorte has developed RSA over the last 10 years, addressing many of the perceived limitations of the technique. For example, cross-validated distance measures (Walther et al. 2016; Nili et al. 2014; Diedrichsen et al. 2021) effectively deal with an uneven number of trials per condition and unequal amounts of measurement noise across trials. Different RDM comparators (Diedrichsen et al. 2021) and statistical methods for generalization across stimuli (Schütt et al. 2023) have been developed, addressing shortcomings in sensitivity. Finally, both a Bayesian variant of RSA (pattern component modelling; Diedrichsen, Yokoi, and Arbuckle 2018) and an encoding model (Naselaris et al. 2011) can effectively deal with continuous variables or features across time points or trials in a framework that is very related to RSA (Diedrichsen and Kriegeskorte 2017). The authors may not consider these newer developments to be classical, but they are in common use and certainly provide the solution to the problems raised in this paper in the setting of model-comparative RSA in which there is more than one repetition per stimulus.

      We appreciate the summary of relevant literature and have included a revised Introduction to address this bounty of relevant work. While much is owed to these authors, new developments from a diverse array of researchers outside of a single group can aid in new research questions, and should always have a place in our research landscape. We owe much to the work of Kriegeskorte’s group, and in fact, Schutt et al., 2023 served as a very relevant touchpoint in the Discussion and helped to highlight specific needs not addressed by the assessment of the “representational geometry” of an entire presented stimulus set. Principal amongst these needs is the application of trial-wise representational information that can be related to trial-wise behavioral responses and thus used to address specific questions on brain-behavior relationships. We invite the Reviewer to consider the utility of this shift with the following revisions to the Introduction.

      Page 3. “Recently, methodological advancements have addressed many known limitations in cRSA. For example, cross-validated distance measures (e.g., Euclidean distance) have improved the reliability of representational dissimilarities in the presence of noise and trial imbalance (Walther et al., 2016; Nili et al., 2014; Diedrichsen et al., 2021). Bayesian approaches such as pattern component modeling (Diedrichsen, Yokoi, & Arbuckle, 2018) have extended representational approaches to accommodate continuous stimulus features or temporal variation. Further, model comparison RSA strategies (Diedrichsen et al., 2021) and generalization techniques across stimuli (Schütt et al., 2023) have improved sensitivity and inference. Nevertheless, a common feature shared across most of these improvements is that they require stimulus repetition to examine the representational structure. This requirement limits their ability to probe brain-behavior questions at the level of individual events”.

      Page 8. “While several extensions of RSA have addressed key limitations in noise sensitivity, stimulus variance, and modeling (e.g., Diedrichsen et al., 2021; Schütt et al., 2023), our tRSA approach introduces a new methodological step by estimating representational strength at the trial level. This accounts for the multi-level variance structure in the data, affords generalizability beyond the fixed stimulus set, and allows one to test stimulus- or trial-level modulations of neural representations in a straightforward way”.

      Page 44. “Despite such prevalent appreciation for the neurocognitive relevance of stimulus properties, cRSA often does not account for the fact that the same stimulus (e.g., “basketball”) is seen by multiple subjects and produces statistically dependent data, an issue addressed by Schütt et al., 2023, who developed cross validation and bootstrap methods that explicitly model dependence across both subjects and stimulus conditions”.

      (3) The stated problem of the paper is to estimate "representational strength" in different regions or conditions. With this, the authors define the correlation of the brain RDM with a model RDM. This metric conflates a number of factors, namely the variances of the stimulus-specific patterns, the variance of the noise, the true differences between different dissimilarities, and the match between the assumed model and the data-generating model. It took me a long time to figure out that the authors are trying to solve a quite different problem in a quite different setting from the model-comparative approach to RSA that I would consider "classical" (Diedrichsen et al. 2021; Diedrichsen and Kriegeskorte 2017). In this approach, one is trying to test whether local activity patterns are better explained by representation model A or model B, and to estimate the degree to which the representation can be fully explained. In this framework, it is common practice to measure each stimulus at least 2 times, to be able to estimate the variance of noise patterns and the variance of signal patterns directly. Using this setting, I would define "representational strength" very differently from the authors. Assume (using LaTeX notation) that the activity patterns $y_{j,n}$ for stimulus j, measurement n, are composed of a true stimulus-related pattern ($u_j$) and a trial-specific noise pattern ($e_{j,n}$). As a measure of the strength of representation (or pattern), I would use an unbiased estimate of the variance of the true stimulus-specific patterns across voxels and stimuli ($\sigma^2_{u}$). This estimator can be obtained by correlating patterns of the same stimuli across repeated measures, or equivalently, by averaging the cross-validated Euclidean distances (or with spatial prewhitening, Mahalanobis distances) across all stimulus pairs.

      In contrast, the current paper addresses a specific problem in a quite specific experimental design in which there is only one repetition per stimulus. This means that the authors have no direct way of distinguishing true stimulus patterns from noise processes. The trick that the authors apply here is to assume that the brain data comes from the assumed model RDM (a somewhat sketchy assumption IMO) and that everything that reduces this correlation must be measurement noise. I can now see why tRSA does make some sense for this particular question in this memory study. However, in the more common model-comparative RSA setting, having only one repetition per stimulus in the experiment would be quite a fatal design flaw. Thus, the paper would do better if the authors could spell out the specific problem addressed by their method right at the beginning, rather than trying to set up tRSA as a general alternative to "classical RSA".
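
      The estimator described in comment (3) can be sketched in a short simulation. This is a minimal illustration under stated assumptions, not code from the reviewer or the authors: patterns are modelled additively as $y_{j,n} = u_j + e_{j,n}$ with zero-mean Gaussian signal and noise, exactly two measurements per stimulus, and no spatial prewhitening. All function names are hypothetical. The sketch compares a within-measurement variance, which mixes signal and noise, with two cross-measurement estimates of $\sigma^2_u$.

```python
import numpy as np

def simulate_patterns(n_stimuli, n_voxels, sigma_u, sigma_e, seed=0):
    """Return y with shape (2, n_stimuli, n_voxels), where y[n, j] = u_j + e_{j,n}."""
    rng = np.random.default_rng(seed)
    u = rng.normal(0.0, sigma_u, size=(n_stimuli, n_voxels))     # true patterns
    e = rng.normal(0.0, sigma_e, size=(2, n_stimuli, n_voxels))  # trial noise
    return u[None, :, :] + e

def naive_variance(y):
    """Within-measurement second moment: estimates sigma_u^2 + sigma_e^2."""
    return float(np.mean(y[0] ** 2))

def signal_variance_crossproduct(y):
    """Cross-measurement product of the same stimulus: estimates sigma_u^2,
    because the noise in the two measurements is independent."""
    return float(np.mean(y[0] * y[1]))

def signal_variance_crossval_distance(y):
    """Average cross-validated squared Euclidean distance (per voxel) over all
    stimulus pairs, halved: the expected distance between two independent
    stimuli is 2 * sigma_u^2."""
    j, k = np.triu_indices(y.shape[1], k=1)
    diff_a = y[0][j] - y[0][k]  # pairwise differences, measurement 1
    diff_b = y[1][j] - y[1][k]  # pairwise differences, measurement 2
    return float(np.mean(diff_a * diff_b) / 2.0)

if __name__ == "__main__":
    y = simulate_patterns(n_stimuli=60, n_voxels=400, sigma_u=1.0, sigma_e=2.0)
    print("naive:", naive_variance(y))
    print("cross-product:", signal_variance_crossproduct(y))
    print("cross-validated distance:", signal_variance_crossval_distance(y))
```

      With a single measurement per stimulus only the naive quantity is available, which is the design constraint the comment points to.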

      At a general level, our approach rests on the premise that there is meaningful information present in a single presentation of a given stimulus. This assumption may have less utility when the research goals are more focused on estimating the fidelity of signal patterns for RSA, as in designs with multiple repetitions. But it is an exaggeration to state that such a trial-wise approach cannot address the difference between “true” stimulus patterns and noise. This trial-wise approach has explicit utility in relating trial-wise brain information to trial-wise behavior, across multiple cognitive domains (not only memory studies, as applied here). We have added substantial text to the Introduction distinguishing cRSA, which is widely employed, often in cases with a single repetition per stimulus, from model-comparative methods that employ multiple repetitions. We clarify that we do not consider tRSA an alternative to the model-comparative approach, and discuss that operational definitions of representational strength are constrained by the study design.

      Page 3. “In this paper, we present an advancement termed trial-level RSA, or tRSA, which addresses these limitations in cRSA (not model comparison approaches) and may be utilized in paradigms with or without repeated stimuli”.

      Page 4. “Representational geometry usually refers to the structure of similarities among repeated presentations of the same stimulus in the neural data (as captured in the brain RSM) and is often estimated utilizing a model comparison approach, whereas representational strength is a derived measure that quantifies how strongly this geometry aligns with a hypothesized model RSM. In other words, geometry characterizes the pattern space itself, while representational strength reflects the degree of correspondence between that space and the theoretical model under test”.

      Finally, we clarified that in our simulation methods we assume a true underlying activity pattern and a random error pattern. The model RSM is computed based on the true pattern, whereas the brain RSM comes from the noisy pattern, not the model RSM itself.
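
      The relationship described here (model RSM from the true pattern, brain RSM from the noisy pattern) can be illustrated with a minimal sketch. It is not the authors' R simulation: the function names, the matrix sizes, the single `noise_sd` parameter, and the use of Pearson correlations are all assumptions made for this example.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rsm(patterns):
    """Trial-by-trial similarity matrix of Pearson correlations."""
    return [[pearson(p, q) for q in patterns] for p in patterns]


def simulate(n_trials=12, n_voxels=50, noise_sd=1.0, seed=0):
    """Return (model_rsm, brain_rsm) for one simulated subject."""
    rng = random.Random(seed)
    # True patterns: i.i.d. standard normal voxel values (hypothetical sizes).
    true = [[rng.gauss(0.0, 1.0) for _ in range(n_voxels)] for _ in range(n_trials)]
    # Measured patterns: true pattern plus independent Gaussian noise.
    noisy = [[v + rng.gauss(0.0, noise_sd) for v in row] for row in true]
    # Model RSM comes from the true patterns, brain RSM from the noisy ones.
    return rsm(true), rsm(noisy)


def upper_triangle(matrix):
    """Off-diagonal cells above the diagonal, row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def correspondence(model_rsm, brain_rsm):
    """Correlation between the upper triangles of the two RSMs."""
    return pearson(upper_triangle(model_rsm), upper_triangle(brain_rsm))


# Print the model-brain correspondence at three illustrative noise levels.
for sd in (0.0, 1.0, 5.0):
    model_rsm, brain_rsm = simulate(noise_sd=sd)
    print(sd, round(correspondence(model_rsm, brain_rsm), 3))
```

      With `noise_sd=0.0` the two matrices coincide, so any drop in correspondence at higher noise levels is attributable to the added noise rather than to a mismatch between model and data.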

      Page 9. “Then, we generated two sets of noise patterns, which were controlled by parameters $\sigma_A$ and $\sigma_B$, respectively, one for each condition”.

      (4) The notation in the paper is often conflicting and should be clarified. The actual true and measured activity patterns should receive a unique notation that is distinct from the variances of these patterns across voxels. I assume that $\sigma_{ijk}$ is the noise variance (not the standard deviation)? Normally, variances are denoted with $\sigma^2$. Also, if these are variances, they cannot come from a normal distribution as indicated on page 10. Finally, multi-level models are usually defined at the level of means (i.e., patterns) rather than at the level of variances (as they seem to be done here).

      We have added notation for the true and measured activity patterns to differentiate them from our notation for variance. We agree that multilevel models are usually defined at the level of means rather than at the level of variances, and we include a figure (Fig. 1D) that describes the model in terms of the means. We clarify that the $\sigma$ terms used in the manuscript were not variances or standard deviations themselves; rather, they denoted components of the actual (multilevel) variance parameter. Each component was sampled from a normal distribution, and the components summed to the final variance parameter for each trial. We have changed our notation for each component to the lowercase letter s to minimize confusion. We have also made our R code publicly available on our lab GitHub, which should provide more clarity on the exact simulation process.

      (5) In the first set of simulations, the authors sampled both model and brain RSM by drawing each cell (similarity) of the matrix from an independent bivariate normal distribution. As the authors note themselves, this way of producing RSMs violates the constraint that correlation matrices need to be positive semi-definite. Likely more seriously, it also ignores the fact that the different elements of the upper triangular part of a correlation matrix are not independent from each other (Diedrichsen et al. 2021). Therefore, it is not clear that this simulation is close enough to reality to provide any valuable insight and should be removed from the paper, along with the extensive discussion about why this simulation setting is plainly wrong (page 21). This would shorten and clarify the paper.

      We have added justification of the mixed-effects model given the potential assumption violations. We caution readers to investigate the robustness of their models, and to employ permutation testing that does not make independence assumptions. We have also added checks of the model residuals and an example of permutation testing in the Appendix. Finally, we agree that the first simulation setting does not possess several properties of realistic RDMs/RSMs; however, we believe that there is utility in understanding the mathematical properties of correlations, an essential component of RSA, in a straightforward simulation where the ground truth is known, and we have therefore moved this simulation to Appendix 1.
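
      The positive semi-definiteness concern raised in comment (5) can be made concrete with a three-stimulus example. The sketch below is illustrative only; the function names and the chosen correlation values are hypothetical. It checks whether three independently chosen correlations can coexist in one valid 3x3 correlation matrix.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def is_valid_3x3_correlation(r12, r13, r23, tol=1e-12):
    """True if the three correlations form a positive semi-definite matrix."""
    m = [[1.0, r12, r13],
         [r12, 1.0, r23],
         [r13, r23, 1.0]]
    # With a unit diagonal and every |r| <= 1, the 1x1 and 2x2 principal
    # minors are already non-negative, so the check reduces to det >= 0.
    in_range = all(abs(r) <= 1.0 for r in (r12, r13, r23))
    return in_range and det3(m) >= -tol


# Stimulus 1 is strongly similar to both 2 and 3, yet 2 and 3 are strongly
# dissimilar: each cell is a legal correlation, but the matrix is impossible.
print(is_valid_3x3_correlation(0.9, 0.9, -0.9))  # False
print(is_valid_3x3_correlation(0.9, 0.9, 0.8))   # True
```

      Drawing each cell independently places no such joint constraint on the cells, which is why matrices generated that way need not be valid correlation matrices.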

      (6) If I understand the second simulation setting correctly, the true pattern for each stimulus was generated as an NxP matrix of i.i.d. standard normal variables. Thus, there is no condition-specific pattern at all, only condition-specific noise/signal variances. It is not clear how the tRSA would be biased if there were a condition-specific pattern (which, in reality, there usually is). Because of the i.i.d. assumption of the true signal, the correlations between all stimulus pairs within conditions are close to zero (and only differ from it by the fact that you are using a finite number of voxels). If you added a condition-specific pattern, the across-condition RSA would lead to much higher "representational strength" estimates than a within-condition RSA, with obvious problems and biases.

      The Reviewer is correct that the voxel values in the true pattern are drawn from i.i.d. standard normal distributions. We take the Reviewer’s suggestion of “condition-specific pattern” to mean that there could be a condition-voxel interaction in two non-mutually exclusive ways. The first is additive, essentially some common underlying multi-voxel pattern like [6, 34, -52, …, 8] for all condition A trials, and a different such pattern for condition B trials, etc. The second is multiplicative, essentially a vector of scaling factors [×1.5, ×0.5, ×0.8, …, ×2.7] for all condition A trials, and a different such vector for condition B trials, etc. Both possibilities could indeed affect tRSA as much as they would cRSA.

      Importantly, if such a strong condition-specific pattern is expected, one can build a condition-specific model RDM using one-hot coding of conditions (see example figure; src: https://www.newbi4fmri.com/tutorial-9-mvpa-rsa), either to capture this interesting phenomenon or to remove it as a confounding factor. This practice has been applied in multiple regression cRSA approaches (e.g., Cichy et al., 2013) and can also be applied to tRSA.
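
      A condition-coded model RDM of the kind described above can be sketched as follows. This is a minimal illustration, not the code behind the cited tutorial or the manuscript; the function names and the toy labels are assumptions.

```python
def one_hot(labels):
    """One indicator column per condition, in sorted label order."""
    levels = sorted(set(labels))
    return [[1 if lab == lev else 0 for lev in levels] for lab in labels]


def condition_model_rdm(labels):
    """Dissimilarity is 0 for trials in the same condition and 1 otherwise."""
    codes = one_hot(labels)
    n = len(codes)
    # The dot product of two one-hot rows is 1 only when the conditions match.
    return [[1 - sum(a * b for a, b in zip(codes[i], codes[j]))
             for j in range(n)]
            for i in range(n)]


labels = ["A", "A", "B", "B"]  # hypothetical trial-to-condition assignment
for row in condition_model_rdm(labels):
    print(row)
```

      Such a matrix could then enter a regression alongside the model of interest, so that condition membership is either estimated as an effect in its own right or partialled out.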

      (7) The trial-level brain RDM to model Spearman correlations was analyzed using a mixed effects model. However, given the symmetry of the RDM, the correlations coming from different rows of the matrix are not independent, which is an assumption of the mixed effect model. This does not seem to induce an increase in Type I errors in the conditions studied, but the procedure still needs a clear justification.

      We appreciate this important warning, and now caution readers to investigate the robustness of their models, and consider employing permutation testing that does not make independence assumptions. We have also added checks of the model residuals and an example of permutation testing in the supplement.

      Page 46. “While linear mixed-effects modeling offers a powerful framework for analyzing representational similarity data, it is critical that researchers carefully construct and validate their models. The multilevel structure of RSA data introduces potential dependencies across subjects, stimuli, and trials, which can violate assumptions of independence if not properly modeled. In the present study, we used a model that included random intercepts for both subjects and stimuli, which accounts for variance at these levels and improves the generalizability of fixed-effect estimates. Still, there is a potential for systematic dependence across trials within a subject. To ensure that the model assumptions were satisfied, we conducted a series of diagnostic checks on an exemplar ROI (right LOC; middle occipital gyrus) in the Object Perception dataset, including visual inspection of residual distributions and autocorrelation (Appendix 3, Figure 13). These diagnostics supported the assumptions of normality, homoscedasticity, and conditional independence of residuals. In addition, we conducted permutation-based inference, similar to prior improvements to cRSA (Nili et al. 2014), using a nested model comparison to test whether the mean similarity in this ROI was significantly greater than zero. The observed likelihood ratio test statistic fell in the extreme tail of the null distribution (Appendix 3, Figure 14), providing strong nonparametric evidence for the reliability of the observed effect. We emphasize that this type of model checking and permutation testing is not merely confirmatory but can help validate key assumptions in RSA modeling, especially when applying mixed-effects models to neural similarity data. Researchers are encouraged to adopt similar procedures to ensure the robustness and interpretability of their findings”.

      Exemplar Permutation Testing

      To test whether the mean representational strength in the ROI right LOC (middle occipital gyrus) was significantly greater than zero, we used a permutation-based likelihood ratio test implemented via the permlmer function. This test compares two nested linear mixed-effects models fit using the lmer function from the lme4 package, both including random intercepts for Participant and Stimulus ID to account for between-subject and between-item variability.

      The null model excluded a fixed intercept term, effectively constraining the mean similarity to zero after accounting for random effects:

      ROI ~ 0 + (1 | Participant) + (1 | Stimulus)

      The full model included the same random effects structure but allowed the intercept to be freely estimated:

      ROI ~ 1 + (1 | Participant) + (1 | Stimulus)

      By comparing the fit of these two models, we directly tested whether the average similarity in this ROI was significantly different from zero. Permutation testing (1,000 permutations) was used to generate a nonparametric p-value, providing inference without relying on normality assumptions. The full model, which estimated a nonzero mean similarity in the right LOC (middle occipital gyrus), showed a significantly better fit to the data than the null model that fixed the mean at zero (χ²(1) = 17.60, p = 2.72 × 10⁻⁵). The permutation-based p-value obtained from permlmer confirmed this effect as statistically significant (p = 0.0099), indicating that the mean similarity in this ROI was reliably greater than zero. These results support the conclusion that the right LOC contains representational structure consistent with the HMAXc2 RSM. A density plot of the permuted likelihood ratio statistics is shown together with the observed statistic in Appendix 3, Figure 14.
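
      The counting step behind a permutation p-value can be sketched in a few lines. The example below is a deliberately simplified stand-in and not the permlmer procedure: it ignores the random effects of Participant and Stimulus, and instead randomly flips the signs of a hypothetical list of similarity values to test whether their mean exceeds zero. The function name, the toy inputs, and the add-one correction are assumptions made for illustration.

```python
import random


def sign_flip_p_value(values, n_perm=1000, seed=0):
    """One-sided permutation p-value for H0: the mean of `values` is zero."""
    rng = random.Random(seed)
    observed = sum(values) / len(values)
    exceed = 0
    for _ in range(n_perm):
        # Under H0 the sign of each value is arbitrary, so flip at random.
        flipped = [v if rng.random() < 0.5 else -v for v in values]
        if sum(flipped) / len(flipped) >= observed:
            exceed += 1
    # Count the observed statistic itself so the p-value is never exactly 0.
    return (exceed + 1) / (n_perm + 1)


# Hypothetical trial-level similarity values, all clearly above zero.
similarities = [0.10 + 0.01 * i for i in range(20)]
print(sign_flip_p_value(similarities))
```

      In the mixed-model setting described above, the permuted statistic would be the likelihood ratio between the nested models rather than a raw mean, but the final step of locating the observed statistic within the permutation distribution is the same.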

      (8) For the empirical data, it is not clear to me to what degree the "representational strength" of cRSA and tRSA is actually comparable. In cRSA, the Spearman correlation assesses whether the distances in the data RSM are ranked in the same order as in the model. For tRSA, the comparison is made for every row of the RSM, which introduces a larger degree of flexibility (possibly explaining the higher correlations in the first simulation). Thus, could the gains presented in Figure 7D not simply arise from the fact that you are testing different questions? A clearer theoretical analysis of the difference between the average row-wise Spearman correlation and the matrix-wise Spearman correlation is urgently needed. The behavior will likely vary with the structure of the true model RDM/RSM.

      We agree that a comparison between the mean row-wise Spearman correlation and the matrix-wise Spearman correlation is needed. We believe that simulations are the best approach for this comparison, since they are much more robust than the empirical dataset and have the advantage that the true pattern and noise levels are known. We expand on our comparison of mean tRSA values and matrix-wise Spearman correlations on page 42.

      Page 42. “Although tRSA and cRSA both aim to quantify representational strength, they differ in how they operationalize this concept. cRSA summarizes the correspondence between RSMs as a single measure, such as the matrix-wise Spearman correlation. In contrast, tRSA computes such correspondence for each trial, enabling estimates at the level of individual observations. This flexibility allows trial-level variability to be modeled directly, but also introduces subtle differences in what is being measured. Nonetheless, our simulations showed that, although numerical differences occasionally emerged—particularly when comparing between-condition tRSA estimates to within-condition cRSA estimates—the magnitude of divergence was small and did not affect the outcome of downstream statistical tests”.
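
      The two operationalizations can be compared directly on a toy example. The sketch below is an illustration under stated assumptions, not the manuscript's implementation: the 4x4 matrices are invented, the helper names are hypothetical, and the trial-level estimate is taken to be the Spearman correlation between one row of the brain RSM and the same row of the model RSM with the diagonal cell excluded.

```python
import math


def ranks(x):
    """Ranks starting at 1, with ties given their average rank."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0.0] * len(x)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and x[order[j + 1]] == x[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r


def spearman(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


def matrix_wise(brain, model):
    """cRSA-style estimate: one correlation over the upper triangle."""
    n = len(brain)
    cells = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return spearman([brain[i][j] for i, j in cells],
                    [model[i][j] for i, j in cells])


def row_wise(brain, model):
    """tRSA-style estimates: one correlation per row, diagonal excluded."""
    n = len(brain)
    return [spearman([brain[i][j] for j in range(n) if j != i],
                     [model[i][j] for j in range(n) if j != i])
            for i in range(n)]


model = [[1.0, 0.8, 0.2, 0.1],
         [0.8, 1.0, 0.3, 0.25],
         [0.2, 0.3, 1.0, 0.7],
         [0.1, 0.25, 0.7, 1.0]]
brain = [[1.0, 0.5, 0.4, 0.1],
         [0.5, 1.0, 0.2, 0.3],
         [0.4, 0.2, 1.0, 0.6],
         [0.1, 0.3, 0.6, 1.0]]

per_trial = row_wise(brain, model)
print(per_trial, sum(per_trial) / len(per_trial))
print(matrix_wise(brain, model))
```

      In this toy case the mean of the row-wise estimates and the matrix-wise estimate are close but not identical, which is the kind of small numerical divergence the quoted passage refers to.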

      (9) For the real data, there are a number of additional sources of bias that need to be considered for the analysis. What if there are not only condition-specific differences in noise variance, but also a condition-specific pattern? Given that the stimuli were measured in 3 different imaging runs, you cannot assume that all measurement noise is i.i.d. - stimuli from the same run will likely have a higher correlation with each other.

      We recognize the potential of condition-specific patterns and chose to constrain the analyses to those most comparable with cRSA. However, depending on their hypotheses, researchers may consider testing condition RSMs and utilizing a model comparison approach or employ the z-scored approach, as employed in the simulations above. Regarding the potential run confounds, this is always a concern in RSA and is why we exclude within-run comparisons. We have also added to the Discussion the suggestion to include run as a covariate in mixed-effects models. However, we do not employ this covariate here as we preferred the most parsimonious model to compare with cRSA.

      Page 46 - 47. “Further, while analyses here were largely employed to be comparable with cRSA, researchers should consider taking advantage of the flexibility of the mixed-effects models and include covariates of non-interest (run, trial order, etc.)”.
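
      Excluding within-run comparisons amounts to keeping only those RSM cells whose two trials come from different runs. The sketch below shows one way this could be done; the function name and the six-trial run assignment are hypothetical and are not taken from the manuscript's code.

```python
def between_run_pairs(runs):
    """Index pairs (i, j), i < j, whose trials belong to different runs."""
    n = len(runs)
    return [(i, j)
            for i in range(n)
            for j in range(i + 1, n)
            if runs[i] != runs[j]]


runs = [1, 1, 2, 2, 3, 3]  # hypothetical: two trials in each of three runs
pairs = between_run_pairs(runs)
print(len(pairs))  # 12 of the 15 possible pairs remain
```

      Only the retained cells would then enter the similarity computation, so that shared run-level noise cannot inflate the correlations.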

      (10) The discussion should be rewritten in light of the fact that the setting considered here is very different from the model-comparative RSA in which one usually has multiple measurements per stimulus per subject. In this setting, existing approaches such as RSA or PCM do indeed allow for the full modelling of differences in the "representational strength" - i.e., pattern variance across subjects, conditions, and stimuli.

      We agree that studies advancing designs with multiple repetitions of a given stimulus image are useful in estimating the reliability of concept representations. We would argue, however, that model comparison in RSA is not restricted to such data. Many extant studies do not in fact have the multiple repetitions per stimulus per subject that would allow for that type of model-comparative approach (Wang et al., 2018 https://doi.org/10.1088/1741-2552/abecc3; Gao et al., 2022 https://doi.org/10.1093/cercor/bhac058; Li et al., 2022 https://doi.org/10.1002/hbm.26195; Staples & Graves, 2020 https://doi.org/10.1162/nol_a_00018). While beneficial in terms of noise estimation, having multiple presentations was not a requirement for implementing cRSA (Kriegeskorte, 2008 https://doi.org/10.3389/neuro.06.004.2008). The aim of this manuscript is to introduce the tRSA approach to the broad community of researchers whose research questions and datasets could vary vastly, including but not limited to the number of repeated presentations and the balance of trial counts across conditions.

      (11) Cross-validated distances provide a powerful tool to control for differences in measurement noise variances and possible covariances in measurement noise across trials, which has many distinct advantages and is conceptually very different from the approach taken here.

      We have added language on the value of cross-validation approaches to RSA in the Discussion:

      Page 47. “Additionally, we note that while our proposed tRSA framework provides a flexible and statistically principled approach for modeling trial-level representational strength, we acknowledge that there are alternative methods for addressing trial-level variability in RSA. In particular, the use of cross-validated distance metrics (e.g., crossnobis distance) has become increasingly popular for controlling differences in measurement noise variance and accounting for possible covariance structures across trials (Walther et al., 2016). These metrics offer several advantages, including unbiased estimation of representational dissimilarities under Gaussian noise assumptions and improved generalization to unseen data. However, cross-validated distances are conceptually distinct from the approach taken here: whereas cross-validation aims to correct for noise-related biases in representational dissimilarity matrices, our trial-level RSA method focuses on estimating and modeling the variability in representation strength across individual trials using mixed-effects modeling. Rather than proposing a replacement for cross-validated RSA, tRSA adds a complementary tool to the methodological toolkit—one that supports hypothesis-driven inference about condition effects and trial-level covariates, while leveraging the full structure of the data”.
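
      The noise-related bias that cross-validation removes can be demonstrated with a small simulation. This is a minimal sketch of a plain cross-validated squared Euclidean distance, without the noise prewhitening that the crossnobis distance adds; the function names, voxel count, and noise level are assumptions chosen for illustration.

```python
import random


def sq_dist(a, b):
    """Ordinary squared Euclidean distance, averaged over voxels."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def cv_sq_dist(a1, b1, a2, b2):
    """Cross-validated distance: the pattern difference from measurement 1
    is multiplied by the difference from an independent measurement 2."""
    return sum((x1 - y1) * (x2 - y2)
               for x1, y1, x2, y2 in zip(a1, b1, a2, b2)) / len(a1)


rng = random.Random(1)
n_voxels = 2000
u = [rng.gauss(0.0, 1.0) for _ in range(n_voxels)]  # shared true pattern


def measure(pattern, noise_sd=1.0):
    """One noisy measurement of a true pattern."""
    return [v + rng.gauss(0.0, noise_sd) for v in pattern]


# Two "stimuli" with the SAME true pattern, each measured twice,
# so the true distance between them is exactly zero.
a1, a2 = measure(u), measure(u)
b1, b2 = measure(u), measure(u)

biased = sq_dist(a1, b1)                # inflated by noise variance
unbiased = cv_sq_dist(a1, b1, a2, b2)   # fluctuates around zero
print(round(biased, 3), round(unbiased, 3))
```

      The ordinary distance is pushed away from zero by the noise in both measurements, whereas the cross-validated version stays near zero because independent noise terms cancel in expectation. It still requires at least two measurements per stimulus, which is the design constraint discussed above.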

      (12) One of the main limitations of tRSA is the assumption that the model RDM is actually the true brain RDM, which may not be the case. Thus, in theory, there could be a different model RDM, in which representational strength measures would be very different. These differences should be explained more fully, hopefully leading to a more accessible paper.

      Indeed, the chosen model RSM may not be the true RSM, but as the noise level increases the correlation between RSMs practically becomes zero. In our simulations we assume this to be true as a straightforward way to manipulate the correspondence between the brain data and the model. However, just like cRSA, tRSA is constrained by the model selections the researchers employ. We encourage researchers to have carefully considered theoretically-motivated models and, if their research questions require, consider multiple and potentially competing models. Furthermore, the trial-wise estimates produced by tRSA encourage testing competing models within the multiple regression framework. We have added this language to the Discussion.

      Page 46. “…choose their model RSMs carefully. In our simulations, we designed our model RSM to be the “true” RSM for demonstration purposes. However, researchers should consider if their models and model alternatives…”.

      Pages 45-46. “While a number of studies have addressed the validity of measuring representational geometry using designs with multiple repetitions, a conceptual benefit of the tRSA approach is the reliance on a regression framework that engenders the testing of competing conceptual models of stimulus representation (e.g., taxonomic vs. encyclopedic semantic features, as in Davis et al., 2021)”.

      Reviewer #2 (Public review):

      (1) While I generally welcome the contribution, I take some issue with the accusatory tone of the manuscript in the Introduction. The text there (using words such as 'ignored variances', 'erroneous inferences', 'one must', 'not well-suited', 'misleading') appears aimed at turning cRSA into a 'straw man' with many limitations that other researchers have not recognized but that the new proposed method supposedly resolves. This can be written in a more nuanced, constructive manner without accusing the numerous users of this popular method of ignorance.

      We apologize for the unintended accusatory tone. We have clarified the many robust approaches to RSA and have made our Introduction and Discussion more nuanced throughout (see also 3, 11 and 16).

      (2) The described limitations are also not entirely correct, in my view: for example, statistical inference in cRSA is not always done using classic parametric statistics such as t-tests (cf Figure 1): the rsatoolbox paper by Nili et al. (2014) outlines non-parametric alternatives based on permutation tests, bootstrapping and sign tests, which are commonly used in the field. Nor has RSA ever been conducted at the row/column level (here referred to by the authors as 'trial level'; cf King et al., 2018).

      We agree there are numerous methods that go beyond cRSA in addressing these limitations, and we have added discussion of them to our manuscript, as well as an example analysis implementing permutation tests on tRSA data (see response to 7). We thank the reviewer for bringing King et al., 2014 and their temporal generalization method to our attention; we have added a reference acknowledging their decoding-based temporal generalization approach.

      Page 8. “It is also important to note that some prior work has examined similarly fine-grained representations in time-resolved neuroimaging data, such as the temporal generalization method introduced by King et al. (see King & Dehaene, 2014). Their approach trains classifiers at each time point and tests them across all others, resulting in a temporal generalization matrix that reflects decoding accuracy over time. While such matrices share some structural similarity with RSMs, they do not involve correlating trial-level pattern vectors with model RSMs nor do their second-level models include trial-wise, subject-wise, and item-wise variability simultaneously”.

      (3) One of the advantages of cRSA is its simplicity. Adding linear mixed effects modeling to RSA introduces a host of additional 'analysis parameters' pertaining to the choice of the model setup (random effects, fixed effects, interactions, what error terms to use) - how should future users of tRSA navigate this?

      We appreciate the opportunity to offer more specific recommendations for those employing a tRSA technique, and have added them to the Discussion:

      Page 46. “While linear mixed-effects modeling offers a powerful framework for analyzing representational similarity data, it is critical that researchers carefully construct and validate their models and choose their model RSMs carefully. In our simulations, we designed our model RSM to be the “true” RSM for demonstration purposes. However, researchers should always consider if their models match the goals of their analysis, including 1) constructing the random effects structure that will converge in their dataset and 2) testing their model fits against alternative structures (Meteyard & Davies, 2020; Park et al., 2020) and 3) considering which effects should be considered random or fixed depending on their research question”.

      (4) Here, only a single real fMRI dataset is used with a quite complicated experimental design for the memory part; it's not clear if there is any benefit of using tRSA on a simpler real dataset. What's the benefit of tRSA in classic RSA datasets (e.g., Kriegeskorte et al., 2008), with fixed stimulus conditions and no behavior?

      To clarify, our empirical approach uses two different tasks: an Object Perception task more akin to the classic RSA datasets employing passive viewing, and a Conceptual Retrieval task that more directly addresses the benefits of the trial-wise approach. The Object Perception dataset is a simpler empirical fMRI dataset without explicit task conditions or a dichotomous behavioral outcome, whereas the Retrieval dataset is more involved (though old/new recognition is the most common form of memory retrieval testing) and dependent on behavioral outcomes. However, we recognize the utility of replication by other research groups and invite researchers to apply tRSA to their datasets.

      (5) The cells of an RDM/RSM reflect pairwise comparisons between response patterns (typically a brain but can be any system; cf Sucholutsky et al., 2023). Because the response patterns are repeatedly compared, the cells of this matrix are not independent of one another. Does this raise issues with the validity of the linear mixed effects model? Does it assume the observations are linearly independent?

      We recognize the potential danger of not meeting model assumptions. Though our simulation results and model checks suggest this is not a fatal flaw in the model design, we caution readers to investigate the robustness of their models, and consider employing permutation testing that does not make independence assumptions. We have also added checks of the model residuals and an example of permutation testing in the Appendix. See response to R1.

      (6) The manuscript assumes the reader is familiar with technical statistical terms such as Type I/II error, sensitivity, specificity, homoscedasticity assumptions, as well as linear mixed models (fixed effects, random effects, etc). I am concerned that this jargon makes the paper difficult to understand for a broad readership or even researchers currently using cRSA that might be interested in trying tRSA.

      We agree this jargon may make the paper difficult to understand. We have added or expanded definitions of these terms throughout the Methods and Results sections.

      Page 12. “Given data generated with $s_{cond,A} = s_{cond,B}$, the correct inference should be a failure to reject the null hypothesis of $b_{B-A} = 0$; any significant ($p < 0.05$) result in either direction was considered a false positive (spurious effect, or Type I error). Given data generated with $s_{cond,A} \neq s_{cond,B}$, the inference was considered correct if it rejected the null hypothesis of $b_{B-A} = 0$ and yielded the expected sign of the estimated contrast ($b_{B-A} < 0$). A significant result with the reverse sign of the estimated contrast ($b_{B-A} > 0$) was considered a Type I error, and a nonsignificant ($p \geq 0.05$) result was considered a false negative (failure to detect a true effect, or Type II error)”.

      Page 2. “Compared to cRSA, the multi-level framework of tRSA was both more theoretically appropriate and significantly more sensitive to (better able to detect) true effects”.

      Page 25. “The performance of cRSA and tRSA was quantified with their specificity (better avoids false positives; 1 - Type I error rate) and sensitivity (better avoids false negatives; 1 - Type II error rate)”.
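
      The scoring rules quoted from page 12 and the two summary measures defined here can be expressed as a short sketch. It is illustrative only: the function names, the toy p-values and contrast estimates, and the choice to compute specificity over runs without a true effect and sensitivity over runs with one are assumptions, not details confirmed by the manuscript.

```python
def classify(p_value, b_contrast, effect_present, alpha=0.05):
    """Label one simulated test as 'correct', 'type_I', or 'type_II'."""
    significant = p_value < alpha
    if not effect_present:
        # No true difference: any significant result is a false positive.
        return "type_I" if significant else "correct"
    if not significant:
        return "type_II"
    # True difference present: the expected sign of the contrast is negative.
    return "correct" if b_contrast < 0 else "type_I"


def rate(outcomes, kind):
    """Proportion of outcomes carrying the given label."""
    return sum(o == kind for o in outcomes) / len(outcomes)


# Hypothetical (p, b) results from runs simulated without a true effect...
null_runs = [(0.50, 0.1), (0.03, -0.2), (0.20, 0.0), (0.80, 0.3)]
# ...and from runs simulated with a true effect.
effect_runs = [(0.01, -0.3), (0.20, -0.1), (0.001, -0.5), (0.04, 0.2)]

null_outcomes = [classify(p, b, effect_present=False) for p, b in null_runs]
effect_outcomes = [classify(p, b, effect_present=True) for p, b in effect_runs]

specificity = 1 - rate(null_outcomes, "type_I")
sensitivity = 1 - rate(effect_outcomes, "type_II")
print(specificity, sensitivity)  # 0.75 0.75
```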

      Page 6. “One of the fundamental assumptions of general linear models (step 4 of cRSA; see Figure 1D) is homoscedasticity or homogeneity of variance — that is, all residuals should have equal variance” .

      Page 11. “Specifically, a linear mixed-effects model with a fixed effect of condition (which estimates the average effect across the entire sample, capturing the overall effect of interest) and random effects of both subjects and stimuli (which model variation in responses due to differences between individual subjects and items, allowing generalization beyond the sample) was fitted to tRSA estimates via the `lme4 1.1-35.3` package in R (Bates et al., 2015), and p-values were estimated using Satterthwaite’s method via the `lmerTest 3.1-3` package (Kuznetsova et al., 2017)”.

      (7) I could not find any statement on data availability or code availability. Given that the manuscript reuses prior data and proposes a new method, making data and code/tutorials openly available would greatly enhance the potential impact and utility for the community.

      We thank the reviewer for pointing out this oversight. We have added our code and data availability statements.

      Page 9. “Data are available upon request to the corresponding author, and our simulation and example tRSA code are available at https://github.com/electricdinolab”.

      Reviewer #1 (Recommendations for the authors):

      (13) Page 4: The limitations of cRSA seem to be based on the assumption that within each different experimental condition, there are different stimuli, which get combined into the condition. The framework of RSA, however, does not dictate whether you calculate a condition x condition RDM or a larger and more complete stimulus x stimulus RDM. Indeed, in practice we often do the latter. Or are you assuming that each stimulus is only shown once overall? It would be useful at this point to spell out these implicit assumptions.

      We agree that stimulus x stimulus RDMs can be constructed and are often used. However, as we mentioned in the Introduction, researchers are often interested in the difference between two (or more) conditions, such as “remembered” vs. “forgotten” (Davis et al., https://doi.org/10.1093/cercor/bhaa269) or “high cognitive load” vs. “low cognitive load” (Beynel et al., https://doi.org/10.1523/JNEUROSCI.0531-20.2020). In those cases, the most common practice with cRSA is to construct condition-specific RDMs, compute cRSA scores separately for each condition, and then compare the scores at the group level. The number of times each stimulus gets presented does not prevent one from creating a model RDM that has the same rows and columns as the brain RDM, either in the same condition (“high load”) or across different conditions.

      (14) Page 5: The difference between condition-level and stimulus-level is not clear. Indeed, this definition seems to be a function of the exact experimental design and is certainly up for interpretation. For example, if I conduct a study looking at the activity patterns for 4 different hand actions, each repeated multiple times, are these actions considered stimuli or conditions?

      We have added clarifying language about what is considered stimuli vs. conditions. Indeed, this will depend on the specific research questions being asked and will affect how researchers construct their models. In this specific example, one would most likely consider each different hand action a condition, treating them as fixed effects rather than random effects, given their very limited number and the lack of need to generalize findings to the broader “hand actions” category.

      Page 5. “Critically, the distinction between condition-level and stimulus-level is not always clear as researchers may manipulate stimulus-level features themselves. In these cases, what researchers ultimately consider condition-level and stimulus-level will depend on their specific research questions. For example, researchers intending to study generalized object representation may consider object category a stimulus-level feature, while researchers interested in if/how object representation varies by category may consider the same category variable condition-level”.

      (15) Page 5: The fact that different numbers of trials / different levels of measurement noise / noise-covariance of different conditions biases non-cross-validated distances is well known and repeatedly expressed in the literature. We have shown that cross-validation of distances effectively removes such biases - of course, it does not remove the increased estimation variability of these distances (for a formal analysis of estimation noise on condition patterns and variance of the cross-nobis estimator, see (Diedrichsen et al. 2021)).

      We thank the reviewer for drawing our attention to this literature and have added discussions of these methods.

      (16). Page 5: "Most studies present subjects with a fixed set of stimuli, which are supposedly samples representative of some broader category". This may be the case for a certain type of RSA experiments in the visual domain, but it would be unfair to say that this is a feature of RSA studies in general. In most studies I have been involved in, we use a "stimulus" x "stimulus" RDM.

      We have edited this sentence to avoid the “most” characterization. We also added substantial text to the Introduction and Discussion distinguishing cRSA, which is nonetheless widely employed, especially in cases with a single repetition per stimulus (Macklin et al., 2023; Liu et al., 2024), from the model-comparative method, and explicitly stating that we do not consider tRSA an alternative to the model-comparative approach.

      (17). Page 5: I agree that "stimuli" should ideally be considered a random effect if "stimuli" can be thought of as sampled from a larger population and one wants to make inferences about that larger population. Sometimes stimuli/conditions are more appropriately considered a fixed effect (for example, when studying the response to stimulation of the 5 fingers of the right hand). Techniques to consider stimuli/conditions as a random effect have been published by the group of Niko Kriegeskorte (Schütt et al. 2023).

      Indeed, in some cases what may be thought of as “stimuli” would be more appropriately entered into the model as a fixed effect; such questions are increasingly relevant given the focus on item-wise stimulus properties (Bainbridge et al., Westfall & Yarkoni). We have added text on this issue to the Discussion and caution researchers to employ models that most directly answer their research questions.

      Page 46. “However, researchers should always consider if their models match the goals of their analysis, including 1) constructing the random effects structure that will converge in their dataset and 2) testing their model fits against alternative structures (Meteyard & Davies, 2020; Park et al., 2020) and 3) considering which effects should be considered random or fixed depending on their research question. An effect is fixed when the levels represent the specific conditions of theoretical interest (e.g., task condition) and the goal is to estimate and interpret those differences directly. In contrast, an effect is random when the levels are sampled from a broader population (e.g., subjects) and the goal is to account for their variability while generalizing beyond the sample tested. Note that the same variable (e.g., stimuli) may be considered fixed or random depending on the research questions”.

      (18) Page 6: It is correct that the "classical" RSA depends on a categorical assignment of different trials to different stimuli/conditions, such that a stimulus x stimulus RDM can be computed. However, both Pattern Component Modelling (PCM) and Encoding models are ideally set up to deal with variables that vary continuously on a trial-by-trial or moment-by-moment basis. tRSA should be compared to these approaches, or - as it should be clarified - that the problem setting is actually quite a different one.

      We agree that PCM and encoding models offer a flexible approach and handle continuous trial-by-trial variables. We have clarified on page 6 that the problem setting of cRSA is distinct, and we have added text on the robustness and limitations of encoding models to the Discussion.

      Page 6. “While other approaches such as Pattern Component Modeling (PCM) (Diedrichsen et al., 2018) and encoding models (Naselaris et al., 2011) are well-suited to analyzing variables that vary continuously on a trial-by-trial or moment-by-moment basis, these frameworks address different inferential goals. Specifically, PCM and encoding models focus on estimating variance components or predicting activation from features, while cRSA is designed to evaluate representational geometry. Thus, cRSA as well as our proposed approach address a problem setting distinct from PCM and encoding models”.

      (19) Page 8: "Then, we generated two noise patterns, which were controlled by parameters 𝜎𝐴 and 𝜎𝐵, respectively, one for each condition." This makes little sense to me. The noise patterns should be unique to each trial - you should generate n_a + n_b noise patterns, no?

      We clarify that the “noise patterns” here are n_voxel x n_trial in size; in other words, all trial-level noise patterns are generated together and each trial has its own unique noise pattern. We have revised our description to “two sets of noise patterns” for clarity starting on page 9.
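      As a minimal sketch of what “two sets of noise patterns” could look like, the snippet below draws one n_voxel x n_trial matrix per condition, so that every trial (column) receives its own pattern. The Gaussian noise, the sizes, the noise levels and the helper name `make_noise_set` are illustrative assumptions, not details taken from the manuscript.

```python
import random


def make_noise_set(n_voxel, n_trial, sd, rng):
    """Return an n_voxel x n_trial matrix (list of rows).

    Each column is the noise pattern of one trial. Gaussian noise is an
    assumption made for this sketch only.
    """
    return [[rng.gauss(0.0, sd) for _ in range(n_trial)] for _ in range(n_voxel)]


rng = random.Random(0)            # fixed seed so the sketch is reproducible
n_voxel, n_a, n_b = 50, 20, 30    # hypothetical sizes
sd_a, sd_b = 1.0, 2.0             # hypothetical noise levels for conditions A and B

noise_a = make_noise_set(n_voxel, n_a, sd_a, rng)  # one set for condition A
noise_b = make_noise_set(n_voxel, n_b, sd_b, rng)  # one set for condition B

# Two sets, but n_a + n_b distinct trial-level noise patterns in total.
n_patterns = len(noise_a[0]) + len(noise_b[0])
```

      Under these assumptions the two sets jointly contain n_a + n_b columns, which matches the reviewer's expectation of one noise pattern per trial.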

      (20) Page 9: First, I assume if this is supposed to be a hierarchical level model, the "noise parameters" here correspond to variances? Or do these \sigma values mean to signify standard deviations? The latter would make little sense. Or is it the noise pattern itself?

      As clarified in 4., the σ values are meant to denote hierarchical components of the composite standard deviation; we have updated our notation to use lower case letter s instead for clarity.

      (21) Page 10: your formula states "𝜎<sub>𝑠𝑢𝑏𝑗</sub>~ 𝙽(0, 0.5^2)". This conflicts with your previous mention that \sigmas are noise "levels"; are they the noise patterns themselves now? Variances cannot be normally distributed, as they cannot be negative.

      As clarified in 4., the σ values are meant to denote hierarchical components of the composite standard deviation; we have updated our notation to use lower case letter s instead for clarity.

      (22) Page 13: What was the task of the subject in the Memory retrieval task? Old/new judgements relative to encoding of object perception?

      We apologize for the lack of clarity about the Memory Retrieval task and have added that information and clarified that the old/new judgements were relative to a separate encoding phase, the brain data for which has been reported elsewhere.

      Page 14. “Memory Retrieval took place one day after Memory Encoding and involved testing participants’ memory of the objects seen in the Encoding phase. Neural data during the Encoding phase has been reported elsewhere. In the main Memory Retrieval task, participants were presented with 144 labels of real-world objects, of which 114 were labels for previously seen objects and 30 were unrelated novel distractors. Participants performed old/new judgements, as well as their confidence in those judgements on a four-point scale (1 = Definitely New, 2 = Probably New, 3 = Probably Old, 4 = Definitely Old)”.

      (23) Page 13: If "Memory Retrieval consisted of three scanning runs", then some of the stimulus x stimulus correlations for the RSM must have been calculated within a run and some between runs, correct? Given that all within-run estimates share a common baseline, they share some dependence. Was there a systematic difference between the within-run and the between-run correlations?

      We have clarified in this portion of the methods that within run comparisons were excluded from our analyses. We also double-checked that the within-run exclusion was included in the description of the Neural RSMs.

      Page 14. “Retrieval consisted of three scanning runs, each with 38 trials, lasting approximately 9 minutes and 12 seconds (within-run comparisons were later excluded from RSA analyses)”.

      Page 18. “This was done by vectorizing the voxel-level activation values within each region and calculating their correlations using Pearson’s r, excluding all within-run comparisons.”
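      The exclusion of within-run comparisons described above could be sketched as follows: Pearson's r is computed only for trial pairs that come from different runs, and all other cells (including the diagonal) are left empty. The function names, the toy patterns and the use of `None` for excluded cells are assumptions made for illustration, not the authors' implementation.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length voxel vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def between_run_rsm(patterns, runs):
    """Build an N x N similarity matrix, keeping only between-run cells.

    patterns: one voxel vector per trial; runs: run label per trial.
    Within-run cells (and the diagonal) stay None so that they cannot
    enter later analyses.
    """
    n = len(patterns)
    rsm = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if runs[i] != runs[j]:
                rsm[i][j] = pearson_r(patterns[i], patterns[j])
    return rsm


# Toy data: four trials, four voxels, two runs (all values hypothetical).
patterns = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 1.0, 4.0, 3.0],
    [1.0, 3.0, 2.0, 5.0],
    [4.0, 3.0, 2.0, 1.0],
]
runs = [1, 1, 2, 2]
rsm = between_run_rsm(patterns, runs)
```

      In this toy example trials 0 and 1 share a run, so their cell is excluded, whereas trials 0 and 3 come from different runs and their correlation is retained.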

      (24) Page 20: It is not clear why the mean estimate of "representational strength" (i.e., model-brain RSM correlations) is important at all. This comes back to Major point #2, namely that you are trying to solve a very different problem from model-comparative RSA.

      We have clarified that our approach is not an alternative to model-comparative RSA, and that depending on the task constraints researchers may choose to compare models with tRSA or other approaches requiring stimulus repetition (see 3).

      (25) Page 21: I believe the problems of simulating correlation matrices directly in the way that the authors in their first simulation did should be well known and should be moved to an appendix at best. Better yet, the authors could start with the correct simulation right away.

      We agree the paper is more concise with these simulations moved to the appendix and discussed more briefly. We have implemented these changes (Appendix 1). However, we are not certain that this problem is well known, and we have several anecdotes of researchers inquiring about this “alternative” approach in talks with colleagues; thus, we still discuss the issues with this method.

      (26) Page 26: Is the "underlying continuous noise variable 𝜎𝑡𝑟𝑖𝑎𝑙 that was measured by 𝑣𝑚𝑒𝑎𝑠𝑢𝑟𝑒𝑑 " the variance of the noise pattern or the noise pattern itself? What does it mean it was "measured" - how?

      𝜎𝑡𝑟𝑖𝑎𝑙 is a vector of standard deviations for different trials, and its i-th element would be used to generate the noise pattern for trial i. v_measured is a hypothetical measurement of trial-level variability, such as “memorability” or “heartbeat variability”. We have revised our description to clarify our methods.

      Reviewer #2 (Recommendations for the authors):

      (8) It would be helpful to provide more clarity earlier on in the manuscript on what is a 'trial': in my experience, a row or column of the RDM is usually referred to as 'stimulus condition', which is typically estimated on multiple trials (instances or repeats) of that stimulus condition (or exemplars from that stimulus class) being presented to the subject. Here, a 'trial' is both one measurement (i.e., single, individual presentation of a stimulus) and also an entry in the RDM, but is this the most typical scenario for cRSA? There is a section in the Discussion that discusses repetitions, but I would welcome more clarity on this from the get-go.

      We have added discussion of stimulus repetition methods and datasets to the Introduction and clarified our use of the terms.

      Page 8. “Critically, in single-presentation designs, a “trial” refers to one stimulus presentation, and corresponds to a row or column in the RSM. In studies with repeated stimuli, these rows are often called “conditions” and may reflect aggregated patterns across trials. tRSA is compatible with both cases: whether rows represent individual trials or averaged trials that create “conditions”, tRSA estimates are computed at the row level”.

      (9) The quality of the results figures can be improved. For example, axes labels are hard to read in Figure 3A/B, panels 3C/D are hard to read in general. In Figure 7E, it's not possible to identify the 'dark red' brain regions in addition to the light red ones.

      We thank the reviewer for raising these and have edited the figures to be more readable in the manner suggested.

      (10) I would be interested to see a comparison between tRSA and cRSA in other fMRI (or other modality) datasets that have been extensively reported in the literature. These could be the original Kriegeskorte 96 stimulus monkey/fMRI datasets, commonly used open datasets in visual perception (e.g., THINGS, NSD), or the above-mentioned King et al. dataset, which has been analyzed in various papers.

      We recognize the great utility of replication from other research groups and do invite researchers to utilize tRSA on their datasets.

      (11) On P39, the authors suggest 'researchers can confidently replace their existing cRSA analysis with tRSA': Please discuss/comment on how researchers should navigate the choice of modeling parameters in tRSA's linear mixed effects setting.

      We have added discussion of the mixed-effects parameters and encourage researchers to follow best practices for their model selection.

      Page 46. “However, researchers should always consider if their models match the goals of their analysis, including 1) constructing the random effects structure that will converge in their dataset and 2) testing their model fits against alternative structures (Meteyard & Davies, 2020; Park et al., 2020) and 3) considering which effects should be considered random or fixed depending on their research question”.

      (12) The final part of the Results section, demonstrating the tRSA results for the continuous memorability factor in the real fMRI data, could benefit from some substantiation/elaboration. It wasn't clear to me, for example, to what extent the observed significant association between representational strength and item memorability in this dataset is to be 'believed'; the Discussion section (p38). Was there any evidence in the original paper for this association? Or do we just assume this is likely true in the brain, based on prior literature by e.g. Bainbridge et al (who probably did not use tRSA but rather classic methods)?

      Indeed, memorability effects have been replicated in the literature, but not using the tRSA method. We have expanded our discussion to clarify the relationship of our findings and the relevant literature and methods it has employed.

      Page 38. “Critically, memorability is a robust stimulus property that is consistent across participants and paradigms (Bainbridge, 2022). Moreover, object memorability effects have been replicated using a variety of methods aside from tRSA, including univariate analyses and representational analyses of neural activity patterns where trial-level neural activity pattern estimates are correlated directly with object memorability (Slayton et al, 2025).”

      (13) The abstract could benefit from more nuance; I'm not sure if RSA can indeed be said to be 'the principal method', and whether it's about assessing 'quality' of representations (more commonly, the term 'geometry' or 'structure' is used).

      We have edited the abstract to reflect the true nuance in the current approaches.

      Abstract. Neural representation refers to the brain activity that stands in for one’s cognitive experience, and in cognitive neuroscience, a prominent method of studying neural representations is representational similarity analysis (RSA). While there are several recent advances in RSA, the classic RSA (cRSA) approach examines the structure of representations across numerous items by assessing the correspondence between two representational similarity matrices (RSMs): usually one based on a theoretical model of stimulus similarity and the other based on similarity in measured neural data.

      (14) RSA is also not necessarily about models vs. neural data; it can also be between two neural systems (e.g., monkey vs. human as in Kriegeskorte et al., 2008) or model systems (see Sucholutsky et al., 2023). This statement is also repeated in the Introduction paragraph 1 (later on, it is correctly stated that comparing brain vs. model is most likely the 'most common' approach).

      We have added these examples in our introduction to RSA.

      Page 3. “One of the central approaches for evaluating information represented in the brain is representational similarity analysis (RSA), an analytical approach that queries the representational geometry of the brain in terms of its alignment with the representational geometry of some cognitive model (Kriegeskorte et al., 2008; Kriegeskorte & Kievit, 2013), or, in some cases, compares the representational geometry of two neural systems (e.g., Kriegeskorte et al., 2008) or two model systems (Sucholutsky et al., 2023)”.

      (15) 'theoretically appropriate' is an ambiguous statement, appropriate for what theory?

      We apologize for the ambiguous wording, and have corrected the text:

      Page 11. “Critically, tRSA estimates were submitted to a mixed-effects model which is statistically appropriate for modeling the hierarchical structure of the data, where observations are nested within both subjects and stimuli (Baayen et al., 2008; Chen et al., 2021)”.

      (16) I found the statement that cRSA "cannot model representation at the level of individual trials" confusing, as it made me think, what prohibits one from creating an RDM based on single-trial responses? Later on, I understood that what the authors are trying to say here (I think) is that cRSA cannot weigh the contributions of individual rows/columns to the overall representational strength differently.

      We thank the reviewer for their clarifying language and have added it to this section of the manuscript.

      Abstract. “However, because cRSA cannot weigh the contributions of individual trials (RSM rows/columns), it is fundamentally limited in its ability to assess subject-, stimulus-, and trial-level variances that all influence representation”.

      (17) Why use "RSM" instead of "RDM"? If the pairwise comparison metric is distance-based (e.g., 1-correlation as described by the authors), RDM is more appropriate.

      We apologize for the error, and have clarified the Methods text:

      Page 3-4. “First, brain activity responses to a series of N trials are compared against each other (typically using Pearson’s r) to form an N×N representational similarity matrix”.

      (18) Figure 2: please write 'Correlation estimate' in the y-axis label rather than 'Estimate'.

      We have edited the label in Figure 2.

      (19) Page 6 'leaving uncertain the directionality of any findings' - I do not follow this argument. Obviously one can generate an RDM or RSM from vector v or vector -v. How does that invalidate drawing conclusions where one e.g., partials out the (dis)similarity in e.g., pleasantness ratings out of another RDM/RSM of interest?

      We agree such an approach does not invalidate the partial method; we have clarified what we mean by “directionality”.

      Page 8. “For instance, even though a univariate random variable, such as pleasantness ratings, can be conveniently converted to an RSM using pairwise distance metrics (Weaverdyck et al., 2020), the very same RSM would also be derived from the opposite random variable, leaving uncertain the directionality (or whether representation is strongest for pleasant or unpleasant items) of any findings with the RSM (see also Bainbridge & Rissman, 2018)”.
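      The directionality point can be checked with a small sketch: converting a vector of ratings to a similarity matrix via a pairwise distance gives exactly the same matrix as converting the sign-flipped ratings. The choice of negative absolute difference as the similarity metric, the function name and the toy ratings are assumptions for illustration; the manuscript may use a different distance.

```python
def similarity_from_ratings(v):
    """Convert a univariate variable to an RSM using negative absolute difference.

    The metric is an illustrative assumption; any symmetric pairwise distance
    behaves the same way under a sign flip of the input.
    """
    return [[-abs(a - b) for b in v] for a in v]


ratings = [1.0, 4.0, 2.5, 5.0]        # hypothetical pleasantness ratings
flipped = [-x for x in ratings]       # the "opposite" variable

# Both variables produce the identical RSM, so the RSM alone cannot reveal
# whether an effect is driven by pleasant or unpleasant items.
same = similarity_from_ratings(ratings) == similarity_from_ratings(flipped)
```

      Because |a - b| equals |(-a) - (-b)| for every pair, the two matrices coincide cell by cell, which is why a model RSM built this way carries no information about the sign of the underlying variable.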

      (20) P7 'sampled 19900 pairs of values from a bi-variate normal distribution', but the rows/columns in an RDM are not independent samples - shouldn't this be included in the simulation? I.e., shouldn't you simulate first the n=200 vectors, and then draw samples from those, as in the next analysis?

      This section has been moved to Appendix 1 (see responses to Reviewer 1.13).

      (21) Under data acquisition, please state explicitly that the paper is re-using data from prior experiments, rather than collecting data anew for validating tRSA.

      We have clarified this in the data acquisition section.

      Page 13. “A pre-existing dataset was analyzed to evaluate tRSA. Main study findings have been reported elsewhere (S. Huang, Bogdan, et al., 2024)”.

      (22) Figure 4 could benefit from some more explanation in-text. It wasn't clear to me, for example, how to interpret the asterisks depicted in the right part of the figure.

      We clarified the meaning of the asterisks in the main text in addition to the existent text in the figure caption.

      Page 26. “see Figure 4, off-diagonal cells in blue; asterisks indicate where tRSA was statistically more sensitive than cRSA)”.

      (23) Page 38 "the outcome of tRSA's improved characterization can be seen in multiple empirical outcomes:" it seems there is one mention of 'outcomes' too many here.

      We have revised this sentence.

      Page 41. “tRSA's improved characterization can be seen in multiple empirical outcomes”.

      (24) Page 38 "model fits became the strongest" it's not clear what aspect of the reported results in the paragraph before this is referring to - the Appendix?

      Yes, the model fits are in the Appendix; we have added this in-text citation.

      Moreover, model-fits became the strongest when the models also incorporated trial-level variables such as fMRI run and reaction time (Appendix 3, Table 6).

      References

      Diedrichsen, J., Berlot, E., Mur, M., Schütt, H. H., Shahbazi, M., & Kriegeskorte, N. (2021). Comparing representational geometries using whitened unbiased-distance-matrix similarity. Neurons, Behavior, Data and Theory, 5(3). https://arxiv.org/abs/2007.02789

      Diedrichsen, J., & Kriegeskorte, N. (2017). Representational models: A common framework for understanding encoding, pattern-component, and representational-similarity analysis. PLoS Computational Biology, 13(4), e1005508.

      Diedrichsen, J., Yokoi, A., & Arbuckle, S. A. (2018). Pattern component modeling: A flexible approach for understanding the representational structure of brain activity patterns. NeuroImage, 180, 119-133.

      Naselaris, T., Kay, K. N., Nishimoto, S., & Gallant, J. L. (2011). Encoding and decoding in fMRI. NeuroImage, 56(2), 400-410.

      Nili, H., Wingfield, C., Walther, A., Su, L., Marslen-Wilson, W., & Kriegeskorte, N. (2014). A toolbox for representational similarity analysis. PLoS Computational Biology, 10(4), e1003553.

      Schütt, H. H., Kipnis, A. D., Diedrichsen, J., & Kriegeskorte, N. (2023). Statistical inference on representational geometries. eLife, 12. https://doi.org/10.7554/eLife.82566

      Walther, A., Nili, H., Ejaz, N., Alink, A., Kriegeskorte, N., & Diedrichsen, J. (2016). Reliability of dissimilarity measures for multi-voxel pattern analysis. NeuroImage, 137, 188-200.

      King, M. L., Groen, I. I., Steel, A., Kravitz, D. J., & Baker, C. I. (2019). Similarity judgments and cortical visual responses reflect different properties of object and scene categories in naturalistic images. NeuroImage, 197, 368-382.

      Kriegeskorte, N., Mur, M., Ruff, D. A., Kiani, R., Bodurka, J., Esteky, H., ... & Bandettini, P. A. (2008). Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron, 60(6), 1126-1141.

      Sucholutsky, I., Muttenthaler, L., Weller, A., Peng, A., Bobu, A., Kim, B., ... & Griffiths, T. L. (2023). Getting aligned on representational alignment. arXiv preprint arXiv:2310.13018.

    1. Reviewer #1 (Public review):

      In this manuscript, Dillard and colleagues integrate cross-species genomic data with a systems approach to identify potential driver genes underlying human GWAS loci and establish the cell type(s) within which these genes act and potentially drive disease.

      Specifically, they utilize a large single cell RNA-seq (scRNA-seq) dataset from an osteogenic cell culture model - bone marrow-derived stromal cells cultured under osteogenic conditions (BMSC-OBs) - from a genetically diverse outbred mouse population called the Diversity Outbred (DO) stock to discover network driver genes that likely underlie human bone mineral density (BMD) GWAS loci. The DO mice segregate over 40M single nucleotide variants, many of which affect gene expression levels, therefore making this an ideal population for systems genetic and co-expression analyses.

      The current study builds on previously published work from the same group that used co-expression analysis to identify co-expressed "modules" of genes that were enriched for BMD GWAS associations. In this study, the authors utilized a much larger scRNA-seq dataset from 80 DO BMSC-OBs, inferred co-expression based on Bayesian networks for each identified mesenchymal cell type, focused on networks with dynamic expression trajectories that are most likely driving differentiation of BMSC-OBs, and then prioritized genes ("differentiation driver genes" or DDGs) in these osteogenic differentiation networks that had known expression or splicing QTLs (eQTL/sQTLs) in any GTEx tissue that co-localized with human BMD GWAS loci. The systems analysis is impressive, the experimental methods are described in detail, and the experiments appear to be carefully done. The computational analysis of the single cell data is comprehensive and thorough, and the evidence presented in support of the identified DDGs, including Tpx2 and Fgfrl1, is for the most part convincing. Some limitations in the data resources and methods hamper enthusiasm somewhat and are discussed below.

      Overall, while this study will no doubt be valuable to the BMD community, the cross-species data integration and analytical framework may be more valuable and generally applicable to the study of other diseases, especially for diseases with robust human GWAS data but for which robust human genomic data in relevant cell types is lacking.

      Specific strengths of the study include the large scRNA-seq dataset on BMSC-OBs from 80 DO mice, the clustering analysis to identify specific cell types and sub-types, the comparison of cell type frequencies across the DO mice, and the CELLECT analysis to prioritize cell clusters that are enriched for BMD heritability (Figure 1). The network analysis pipeline outlined in Figure 2 is also a strength, as is the pseudotime trajectory analysis (results in Figure 3).

      Potential drawbacks of the authors' approach include their focus on genes that were previously identified as having an eQTL or sQTL in any GTEx tissue. The authors rightly point out that the GTEx database does not contain data for bone tissue, but reason that eQTLs can be shared across many tissues - this assumption is valid for many cis-eQTLs, but it could also exclude many genes as potential DDGs with effects that are specific to bone/osteoblasts. Indeed, the authors show that important BMD driver genes have cell-type specific eQTLs. Another issue concerns potential model overfitting in the iterativeWGCNA analysis of mesenchymal cell type-specific co-expression, which identified an average of 76 co-expression modules per cell cluster (range 26-153). Based on the limited number of genes that are detected as expressed in a given cell due to sparse per cell read depth (400-6200 reads/cell) and drop outs, it's surprising that as many as 153 co-expression modules could be distinguished within any cell cluster. I would suspect some degree of model overfitting is responsible for these results.

      Overall, though, these concerns are minor relative to the many strengths of the study design and results. Indeed, I expect the analytical framework employed by the authors here will be valuable to -- and replicated by -- researchers in other disease areas.

      Comments on revisions:

      Thank you for addressing my concerns. This is an impressive study and manuscript that you should be proud of.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Farber and colleagues have performed single cell RNAseq analysis on bone marrow derived stem cells from DO Mice. By performing network analysis, they look for driver genes that are associated with bone mineral density GWAS associations. They identify two genes as potential candidates to showcase the utility of this approach.

      Strengths:

      The study is very thorough and the approach is innovative and exciting. The manuscript contains some interesting data relating to how cell differentiation is occurring and the effects of genetics on this process. The section looking for genes with eQTLs that differ across the differentiation trajectory (Figure 4) was particularly exciting.

      Weaknesses:

      The manuscript is, in parts, hard to read due to the use of acronyms and there are some questions about data analysis that still need to be addressed.

      Comments on revisions:

      Dillard et al have made several improvements to their manuscript.

      (1) We previously asked the authors to determine whether any cell types were enriched for BMD-related traits since the premise of the paper is that 'many genes impacting BMD do so by influencing osteogenic differentiation or ... adipogenic differentiation'. Given the potential for the cell culture method to skew the cell type distribution non-physiologically, it is important to establish which cell types in their assay are most closely associated with BMD traits. The new CELLECT analysis and Figure 1E address this point nicely. However, it would still be nice to see the correlations between these cell types and BMD traits in the mice as this would provide independent evidence to support their physiological importance more broadly.

      (2) Shortening the introduction.

      (3) Addressing limitations that arise from not accounting for founder genome SNPs when aligning scRNA-seq data.

      (4) The main take-away of this paper is, to us, the development of a single cell approach to studying BMD-related traits. It is encouraging that the cells post-culture appear to be representative of those pre-culture (supplemental figure 3).

      However, the authors seem to have neglected several comments made by both reviewers. While we share the authors' enthusiasm for the single cell analytical approach, we do not understand their reluctance to perform further statistical tests. We feel that the following comments have still not been addressed:

      (1) The manuscript still contains the following:

      "To provide further support that tradeSeq-identified genes are involved in differentiation, we performed a cell type-specific expression quantitative trait locus (eQTL) analysis for each mesenchymal cell type from the 80 DO mice. We identified 563 genes (eGenes) regulated by a significant cis-eQTL in specific cell types of the BMSC-OB scRNA-seq data (Supplementary Table S14). In total, 73 eGenes were also tradeSeq-identified genes in one or more cell type boundaries along their respective trajectories (Supplementary Table S9)."

      The purpose of this paragraph is to convince readers that the eGenes approach aligns with the tradeSeq approach (and that their approach can therefore be trusted). It is essential that such claims are supported by statistical reasoning. Given that it would be very simple to perform permutation/enrichment analyses to address this point, and both reviewers requested similar analyses, we do not understand the authors' reluctance here. Otherwise, this section should be rewritten so that it does not imply that the identification of these genes provides support for their approach.
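      One way such an enrichment analysis could be sketched is a one-sided hypergeometric test on the overlap between eGenes and tradeSeq-identified genes. The 563 eGenes and the overlap of 73 come from the quoted paragraph; the background size and the number of tradeSeq-identified genes below are placeholders chosen only for illustration and are not values from the manuscript, so the resulting p-value has no bearing on the actual data.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) for a hypergeometric draw.

    N: background genes tested, K: genes in set A (e.g. tradeSeq-identified),
    n: genes in set B (e.g. eGenes), k: observed overlap.
    """
    total = comb(N, n)
    tail = sum(comb(K, x) * comb(N - K, n - x) for x in range(k, min(K, n) + 1))
    return tail / total


n_egenes = 563       # from the quoted paragraph
overlap = 73         # from the quoted paragraph
background = 10000   # PLACEHOLDER: number of genes tested, not from the manuscript
n_tradeseq = 1000    # PLACEHOLDER: number of tradeSeq genes, not from the manuscript

p_value = hypergeom_upper_tail(overlap, n_tradeseq, n_egenes, background)
```

      With the real background and tradeSeq gene counts substituted for the placeholders, a small p-value would indicate more overlap than expected by chance; a permutation test that resamples gene labels would be an alternative with fewer distributional assumptions.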

      (2) Given that a central purpose of this manuscript is to establish a systematic workflow for identifying candidate genes, the manuscript could still benefit from more explanation as to why the authors chose to highlight Tpx2 and Fgfrl1. Tpx2 does already have a role in bone physiology through the IMPC. The authors should comment on why they did not explore Kremen1, for instance, as this gene seems important for the transition to both OB1 and 2.

      A final minor comment is that it would be very helpful if the authors could indicate if the DDGs in Table 1 are also eGenes for the relevant cell type. This is much more meaningful than looking through GTEx.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review): 

      In this manuscript, Dillard and colleagues integrate cross-species genomic data with a systems approach to identify potential driver genes underlying human GWAS loci and establish the cell type(s) within which these genes act and potentially drive disease. Specifically, they utilize a large single-cell RNA-seq (scRNA-seq) dataset from an osteogenic cell culture model - bone marrow-derived stromal cells cultured under osteogenic conditions (BMSC-OBs) - from a genetically diverse outbred mouse population called the Diversity Outbred (DO) stock to discover network driver genes that likely underlie human bone mineral density (BMD) GWAS loci. The DO mice segregate over 40M single nucleotide variants, many of which affect gene expression levels, therefore making this an ideal population for systems genetic and co-expression analyses. The current study builds on previously published work from the same group that used co-expression analysis to identify co-expressed "modules" of genes that were enriched for BMD GWAS associations. In this study, the authors utilize a much larger scRNA-seq dataset from 80 DO BMSC-OBs, infer co-expression-based and Bayesian networks for each identified mesenchymal cell type, focused on networks with dynamic expression trajectories that are most likely driving differentiation of BMSC-OBs, and then prioritized genes ("differentiation driver genes" or DDGs) in these osteogenic differentiation networks that had known expression or splicing QTLs (eQTL/sQTLs) in any GTEx tissue that colocalized with human BMD GWAS loci. The systems analysis is impressive, the experimental methods are described in detail, and the experiments appear to be carefully done. The computational analysis of the single-cell data is comprehensive and thorough, and the evidence presented in support of the identified DDGs, including Tpx2 and Fgfrl1, is for the most part convincing. 
Some limitations in the data resources and methods hamper enthusiasm somewhat and are discussed below. Overall, while this study will no doubt be valuable to the BMD community, the cross-species data integration and analytical framework may be more valuable and generally applicable to the study of other diseases, especially for diseases with robust human GWAS data but for which robust human genomic data in relevant cell types is lacking. 

      Specific strengths of the study include the large scRNA-seq dataset on BMSC-OBs from 80 DO mice, the clustering analysis to identify specific cell types and sub-types, the comparison of cell type frequencies across the DO mice, and the CELLECT analysis to prioritize cell clusters that are enriched for BMD heritability (Figure 1). The network analysis pipeline outlined in Figure 2 is also a strength, as is the pseudotime trajectory analysis (results in Figure 3). One weakness involves the focus on genes that were previously identified as having an eQTL or sQTL in any GTEx tissue. The authors rightly point out that the GTEx database does not contain data for bone tissue, but they reason that eQTLs can be shared across many tissues. This assumption is valid for many cis-eQTLs, but it could also exclude many genes as potential DDGs with effects that are specific to bone/osteoblasts. Indeed, the authors show that important BMD driver genes have cell-type-specific eQTLs. Furthermore, the mesenchymal cell type-specific co-expression analysis by iterative WGCNA identified an average of 76 co-expression modules per cell cluster (range 26-153). Based on the limited number of genes that are detected as expressed in a given cell due to sparse per-cell read depth (400-6200 reads/cell) and dropouts, it's hard to believe that as many as 153 co-expression modules could be distinguished within any cell cluster. I would suspect some degree of model overfitting here and would expect that many/most of these identified modules have very few gene members, but the methods list a minimum module size of 20 genes. How do the numbers of modules identified in this study compare to other published scRNA-seq studies that use iterative WGCNA?

      In the section "Identification of differentiation driver genes (DDGs)", the authors identified 408 significant DDGs and found that 49 (12%) were reported by the International Mouse Knockout [sic] Consortium (IMPC) as having a significant effect on whole-body BMD when knocked out in mice. Is this enrichment significant? E.g., what is the background percentage of IMPC gene knockouts that show an effect on whole-body BMD? Similarly, they found that 21 of the 408 DDGs were genes that have BMD GWAS associations that colocalize with GTEx eQTLs/sQTLs. Given that there are > 1,000 BMD GWAS associations, is this enrichment (21/408) significant? Recommend performing a hypergeometric test to provide statistical context to the reported overlaps here. 

      We thank the reviewer for their constructive feedback and thoughtful questions. With regard to iterativeWGCNA, a larger number of modules is sometimes an outcome of the analysis, as reported in the iterativeWGCNA preprint (Greenfest-Allen et al., 2017). While we did not make a comparison to other works leveraging this tool for scRNA-seq, it has been used broadly across other published studies, such as PMID: 39640571, 40075303, 33677398, 33653874. While model overfitting, as you mention, may be a cause for more modules, the Bayesian network analysis we perform after iterativeWGCNA highlights smaller aspects of coexpression modules, as opposed to focusing on the entirety of any given module.

      We did not perform enrichment or statistical tests as our goal was to simply highlight attributes or unique features of these genes for additional context.
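
      For readers who wish to put the reported overlaps in statistical context, the hypergeometric test recommended by the reviewer could be sketched as below. This is an illustrative sketch only, not an analysis performed in the study: the counts 408 and 49 come from the text, whereas the background sizes (`n_tested_genes`, `n_bmd_hits`) are hypothetical placeholders, and the sketch assumes that every DDG was among the genes tested.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """Return P(X >= k) for X ~ Hypergeometric(population, successes, draws)."""
    total = comb(population, draws)
    upper = min(successes, draws)
    # Sum the exact probability mass from k up to the largest possible overlap.
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


# Counts taken from the text.
n_ddgs = 408          # significant DDGs
n_ddgs_with_bmd = 49  # DDGs with a reported knockout effect on BMD

# HYPOTHETICAL background values, for illustration only.
n_tested_genes = 8000  # assumed number of knockout lines tested for BMD
n_bmd_hits = 400       # assumed number of those with a significant BMD effect

p_value = hypergeom_upper_tail(n_ddgs_with_bmd, n_tested_genes, n_bmd_hits, n_ddgs)
print(f"one-sided hypergeometric p = {p_value:.3g}")
```

      The same function could be applied to the 21/408 GWAS colocalization overlap once a suitable background gene set is chosen; the conclusion depends strongly on that choice.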

      Reviewer #2 (Public review): 

      Summary: 

      In this manuscript, Farber and colleagues have performed single-cell RNAseq analysis on bone marrow-derived stem cells from DO Mice. By performing network analysis, they look for driver genes that are associated with bone mineral density GWAS associations. They identify two genes as potential candidates to showcase the utility of this approach. 

      Strengths: 

      The study is very thorough and the approach is innovative and exciting. The manuscript contains some interesting data relating to how cell differentiation is occurring and the effects of genetics on this process. The section looking for genes with eQTLs that differ across the differentiation trajectory (Figure 4) was particularly exciting. 

      Weaknesses: 

      The manuscript is in parts hard to read due to the use of acronyms and there are some questions about data analysis that need to be addressed. 

      We thank the reviewer for their feedback and shared enthusiasm for our work. We tried to minimize the use of technical acronyms as much as we could without compromising readability. Additionally, we addressed questions regarding aspects of data analysis. 

      Reviewer #1 (Recommendations for the authors):

      (1) For increased transparency and to allow reproducibility, it would be necessary for the scripts used in the analysis to be shared along with the publication of the preprint. Also, where feasible, sharing the processed data in addition to the raw data would allow the community greater access to the results and be highly beneficial. 

      Thank you for this suggestion. The raw data will be available via GEO accession codes listed in the data availability statement. We will make available scripts for some analyses on our GitHub (https://github.com/Farber-Lab/DO80_project) and processed scRNA-seq data in a Seurat object (.rds) on Zenodo (https://zenodo.org/records/15299631).

      (2) Lines 55-76: I think the summary of previous work here is too long. I understand that they would like to cover what has been done previously, but this seems like overkill. 

      Good suggestion. We have streamlined some of the summary of our previous work.

      (3) Did the authors try to map QTL for cell-type proportion differences in their BMSC-OBs? While 80 samples certainly limit mapping power, the data shown in Figs 4C/D suggest that you might identify a large-effect modifier of LMP/OB1 proportions. 

      We did try to map QTL for cell type proportion differences, but no significant associations were identified. 

      (4) Methods question: Does the read alignment method used in your analysis account for SNPs/indels that segregate among the DO/CC founder strains? If not, the authors may wish to include this in their discussion of study limitations and speculate on how unmapped reads could affect expression results. 

      The read alignment method we used does not account for SNPs/indels from the DO founder strains that fall in RNA transcripts captured in the scRNA-seq data. We have included this as a limitation in our discussion (lines 422-424).

      (5) Much of the discussion reads as an overview of the methods, while a discussion of the results and their context to the existing BMD literature is relatively lacking in comparison.

      We have added further explanation of the results and their context to the discussion (lines 381-382, 396-407).

      (6) Figure 1E and lines 146-149: Adjusted p values should be reported in the figure and accompanying text instead of switching between unadjusted and adjusted p values. 

      We updated Figure 1e to show adjusted p-values, listed the adjusted p-values in the legend of Figure 1e, and listed them in the main text (lines 153-154).

      (7) Why do the authors bring the IMPC KO gene list into the analysis so late? This seems like a highly relevant data resource (moreso than the GTEx eQTLs/sQTLs) that could have been used much earlier to help identify DDGs. 

      Given that our scRNA-seq data is also from mice, we did choose to integrate information from the IMPC to highlight supplemental features of genes in networks (i.e., genes that have an experimentally-tested and significant effect on BMD in mice). However, our primary goal was to inform human GWAS and leverage our previous work in which we identified colocalizations between human BMD GWAS and eQTL/sQTL in a human GTEx tissue, which is why this information was used to guide our network analysis.

      (8) Does Fgfrl1 and/or Tpx2 have a cis-eQTL in your BMSC-OB scRNA-seq dataset? 

      We did not identify cis-eQTL effects for Fgfrl1 and Tpx2.

      (9) Figure 4B-C: These eQTLs may be real, but based on the diplotype patterns in Figure 4C, I suspect they are artifacts of low mapping power that are driven by rare genotype classes with one or two samples having outlier expression results. For example, if you look at the results in Fig 4C for S100a1 expression, the genotype classes with the highest/lowest expression have lower sample numbers. In the case of Pkm eQTL showing a PWK-low effect, the PWK genome has many SNPs that differ from the reference genome in the 3' UTR of this gene, and I wonder if reads overlapping these SNPs are not aligning correctly (see point 4 above) and resulting (falsely) in lower expression values for samples with a PWK haplotype. 

      As mentioned above, our alignment method did not consider DO founder genetic variation that is specifically located in the 3’ end of RNA transcripts in the scRNA-seq data. We have included this as a limitation in our discussion (lines 422-424).

      In future studies, we intend to include larger populations of mice to potentially overcome, as you mention, any artifacts that may be attributable to low statistical power, rare genotype classes, or outlier expression.

      Reviewer #2 (Recommendations for the authors):

      Major Points 

      (1) The authors hypothesize "that many genes impacting BMD do so by influencing osteogenic differentiation or possibly bone marrow adipogenic differentiation". However, cell type itself does not correlate with any bone trait. Does this indicate that the hypothesis is not entirely correct, as genes that drive these phenotypes would not be enriched in one particular cell type? The authors have previously identified "high-priority target genes". So, are there any cell types that are enriched for these target genes? If not, this would indicate that all these genes are more ubiquitously expressed and this is probably why they would have a greater effect on the overall bone traits. Furthermore, are the 73 eGenes (so genes with eQTLs in a particular cell type that change around cell type boundaries) or the DDGs (Table 1) enriched for these high-priority target genes? 

      The bone traits measured in the DO mice are complex and impacted by many factors, including the differentiation propensity and abundance of certain cell types, both within and outside of bone. Though we did not identify correlations between cell type abundance and the bone traits we measured, we tailored our investigations to focus on cellular differentiation using the scRNA-seq data. However, future studies would need to be performed to investigate any connections between cellular differentiation, cell type abundance, and bone traits.

      We did not perform enrichment analyses of either the target genes identified from our other work or eGenes identified here, but instead used the target gene list to center our network analysis and the eGenes to showcase the utility of the DO mouse population.

      (2) The readability of the paper could be improved by minimising the use of acronyms and there are several instances of confusing wording throughout the paper. In many cases, this can be solved by re-organising sentences and adding a bit more detail. For example, it was unclear how you arrived at Fgfrl1 or Tpx2.

      One of the goals of our study was to identify genes that have (to our knowledge) little to no known connection to BMD. We chose to highlight Fgfrl1 and Tpx2 because there is minimal literature characterizing these genes in the context of bone, which we speak to in the results (lines 296-297). Additionally, we prioritized these genes in our previous work, and they were identified in this study through our network analyses of the scRNA-seq data, which we mention in the results (lines 276-279).

      (3) Technical aspects of the assay. In Figure 1d you show that the cell populations vary considerably between different DO mice. It would be useful to give some sense of the technical variance of this assay given that the assay involves culturing the cells in an exogenous environment. This could take the form of tests between mice within the same inbred strain, or even between different legs of the same DO mice to show that results are technically very consistent. It might also be prudent to identify that this is a potential limitation of the approach as in vitro culturing has the potential to substantially change the cell populations that are present. 

      We agree that in vitro culturing and the preparation of single cells for scRNA-seq are unavoidable sources of technical variation in this study. However, the total number of cells contributed by each of the 80 DO mice after data processing does not appear to be skewed and the distribution appears normal (see added figures, now included as Supplemental Figure 3). Therefore, technical variation is at least consistent across all samples. Nevertheless, we have mentioned the potential for technical variation artifacts in our study in the discussion (lines 414-416).

      (4) Need for permutation testing. "We identified 563 genes regulated by a significant eQTL in specific cell types. In total, 73 genes with eQTLs were also tradeSeq-identified genes in one or more cell type boundaries". These types of statements are fine but they need to be backed up with permutation testing to show that this level of enrichment is greater than one would expect by chance. 

      We did not perform enrichment tests as our only goals were to (1) determine if eQTL could be resolved in the DO mouse population using our scRNA-seq data and (2) predict in what cell type the associated eQTL and associated eGene may have an effect.
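
      The permutation test the reviewer describes could take roughly the following form. This is a hedged sketch rather than part of the study: the gene identifiers and set sizes in the usage example are synthetic, and the function name `overlap_permutation_p` is introduced here for illustration. It assumes a shared background of tested genes from which random sets can be drawn.

```python
import random


def overlap_permutation_p(background, set_a, set_b, n_perm=1000, seed=0):
    """Empirical one-sided p-value for the overlap between two gene sets.

    Random sets the size of set_a are drawn from the background and their
    overlap with set_b is compared with the observed overlap.
    """
    rng = random.Random(seed)
    background = list(background)
    set_a, set_b = set(set_a), set(set_b)
    observed = len(set_a & set_b)
    at_least_as_large = 0
    for _ in range(n_perm):
        sampled = rng.sample(background, len(set_a))
        if sum(gene in set_b for gene in sampled) >= observed:
            at_least_as_large += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (at_least_as_large + 1) / (n_perm + 1)


# SYNTHETIC example: 1000 background genes, two sets sharing 40 members.
genes = [f"gene{i}" for i in range(1000)]
egenes = genes[:50]              # stand-in for genes with an eQTL
trajectory_genes = genes[10:90]  # stand-in for tradeSeq-identified genes
observed, p_value = overlap_permutation_p(genes, egenes, trajectory_genes)
print(observed, p_value)
```

      A fuller analysis would probably draw the random sets from genes matched on expression level within each cell type, since the power to detect an eQTL depends on how well a gene is expressed.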

      (5) The main novelty of the paper seems to be that you have used single-cell RNA seq (given that you appear to have already detailed the candidates at the end). I don't think this makes the paper less interesting, but I think you need to reframe the paper more about the approach, and not the specific results. How you landed on these candidates is also not clear. So the paper might be improved by more robustly establishing the workflow and providing guidelines for how studies like this should be conducted in the future. 

      We sought to not only devise a rigorous approach to analyze our single cell data, but also showcase the utility of the approach in practice by highlighting targets for future research (i.e., Fgfrl1 and Tpx2).

      Our goal was to identify novel genes, and we landed on these candidate genes (Fgfrl1 and Tpx2) because they had substantial data supporting their causality and they have yet to be fully characterized in the context of bone and BMD (lines 295-297).

      With regard to establishing the workflow, we have included rationale for specific aspects of our approach throughout the paper. For example, Figure 2 itemizes each step of our network analysis and we explain why each step is utilized throughout various parts of the results (e.g., lines 168-170, 179-181, 191-193, 202-203, 257-260, 276-277).

      We have added a statement advocating for large-scale scRNA-seq from genetically diverse samples and network analyses for future studies (lines 436-438).

      Minor Points 

      (1) In the summary you use the word "trajectory". Trajectories for what? I assume the transition between cell types, but this is not clear. 

      We added text to clarify the use of trajectory in the summary (line 34).

      (2) This sentence: "By identifying networks enriched for genes implicated in GWAS we predicted putatively causal genes for hundreds of BMD associations based on their membership in enriched modules." is also not clear. Do you mean: "we predicted putatively causal genes by identifying clusters of co-expressed genes that were enriched for GWAS genes"? It is not clear how you identify the causal gene in the network. Is this just based on the hub gene? 

      The aforementioned sentence has since been removed to streamline the introduction, as suggested by Reviewer 1.

      With regard to causal gene identification, it is not based on whether a gene is a hub gene. We prioritized a DDG (and its associated network) if it was a causal gene that we identified in our previous work as having an eQTL/sQTL in a GTEx tissue that colocalizes with human BMD GWAS.

      (3) Figure 3C. This is good but the labels are quite small. Would be good to make all the font sizes larger. 

      We have enlarged Figure 3C.

      (4) Line 341 in the Discussion should be "pseudotemporal". 

      We have edited “temporal” to “pseudotemporal”.

    1. inorganic scintillation

      are insulators. The band gap is greater than 5 eV making the light emitted of too high energy (shorter wavelength). To reduce the energy, an activator impurity is added to the crystal.

    2. In some designs (especially ionization chambers), both electrodes can be positioned in the gas, separate from the gas pressure vessel.

      Parallel plate electrodes

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In this fMRI study, the authors wished to assess neural mechanisms supporting flexible "temporal construals". For this, human participants learned a story consisting of fifteen events. During fMRI, events were shown to them, and they were instructed to consider the event from "an internal" or from "an external" perspective. The authors found opposite patterns of brain activity in the posterior parietal cortex and the anterior hippocampus for the internal and the external viewpoint. They conclude that allocentric sequences are stored in the hippocampus, whereas egocentric sequences are used in the parietal cortex. The claims align with previous fMRI work addressing this question.

      We appreciate the reviewer's concise summary of our research. We would like to offer two clarifications to prevent any potential misunderstandings.

      First, the activity patterns in the parietal cortex and hippocampus are not entirely opposite across internal and external perspectives. Specifically, the activation level in the posterior parietal cortex shows a positive correlation with sequential distance during external-perspective tasks, but a negative correlation during internal-perspective tasks. In contrast, the activation level in the anterior hippocampus positively correlates with sequential distance, irrespective of the observer's perspective. Therefore, our results suggest that the parietal cortex, with its perspective-dependent activity, supports egocentric representation; the hippocampus, with its consistent activity across perspectives, supports allocentric representation.

      Second, while some of our findings align with previous fMRI studies, to our knowledge, no prior research has explicitly investigated how the neural representation of time may vary depending on the observer's viewpoint. This gap in the literature is the primary motivation for our current study.

      Strengths:

      The research topic is fascinating, and very few labs in the world are asking the question of how time is represented in the human brain. Working hypotheses have been recently formulated, and this work seems to want to tackle some of them.

      We appreciate the reviewer's acknowledgment of the theoretical significance of our study.

      Weaknesses:

      The current writing is fuzzy both conceptually and experimentally. I cannot provide a sufficiently well-informed assessment of the quality of the experimental work because there is a paucity of details provided in the report. Any future revisions will likely improve transparency.

      (1) Improving writing and presentation:

      The abstract and the introduction make use of loaded terms such as "construals", "mental timeline", "panoramic views" in very metaphoric and unexplained ways. The authors do not provide a comprehensive and scholarly overview of these terms, which results in verbiage and keywords/name-dropping without a clear general framework being presented. Some of these terms are not metaphors. They do refer to computational concepts that the authors should didactically explain to their readership. This is all the more important because some statements in the Introduction are misattributed or factually incorrect; some statements lack attributions (uncited published work). Once the theory, the question, and the working hypothesis are clarified, the authors should carefully explain the task.

      We appreciate the reviewer's critique.

      The formulation of the scientific question in the introduction is grounded in the spatial construals of time hypothesis and conceptual metaphor theory (e.g., Traugott, 1978; Lakoff & Johnson, 1980; see recent reviews by Núñez & Cooperrider, 2013; Bender & Beller, 2014). These frameworks were originally developed through analyses of how spatial metaphors are used to describe temporal concepts in natural language. Consequently, it is theoretically motivated and largely unavoidable to introduce the two primary temporal construals—mental time travel and mental time watching— using metaphorical expressions.

      However, we do agree with the reviewer that the introduction in the original manuscript was overly long and that the working hypothesis was not clearly stated. In the revised manuscript, we have streamlined the introduction and substantially revised the following two paragraphs to clarify the formulation of our working hypothesis (Pages 5-6):

      “Recent studies have already begun to investigate the neural representation of the memorized event sequence (e.g., Deuker et al., 2016; Thavabalasingam et al., 2018; Bellmund et al., 2019, 2022; see reviews by Cohn-Sheehy & Ranganath, 2017; Bellmund et al., 2020). Yet, the neural mechanisms that enable the brain to construct distinct construals of an event sequence remain largely unknown. Valuable insights may be drawn from research in the spatial domain, which differentiates the neural representation in allocentric and egocentric reference frames. According to an influential neurocomputational model (Byrne et al., 2007; Bicanski & Burgess, 2018; Bicanski & Burgess, 2020), allocentric and egocentric spatial representations are dissociable in the brain—they are respectively implemented in the medial temporal lobe (MTL)—including the hippocampus—and the parietal cortex. Various egocentric representations in the parietal cortex derived from different viewpoints can be transformed and integrated into a unified allocentric representation and stored in the MTL (i.e., bottom-up process). Conversely, the allocentric representation in the MTL can serve as a template for reconstructing diverse egocentric representations across different viewpoints in the parietal cortex (i.e., top-down process).”

      “In line with the spatial construals of time hypothesis, several authors have recently suggested that such mutually engaged egocentric and allocentric reference frames (in the parietal cortex and the medial temporal lobe, respectively) proposed in the spatial domain might also apply to the temporal one (e.g., Gauthier & van Wassenhove, 2016ab; Gauthier et al., 2019, 2020; Bottini & Doeller, 2020). If this hypothesis holds, it could explain how the brain flexibly generates diverse construals of the same event sequence. Specifically, the hippocampus may encode a consistent representation of an event sequence that is independent of whether an individual adopts an internal or external perspective, reflecting an allocentric representation of time. In contrast, parietal cortical representations are expected to vary flexibly with the adopted perspective that is shaped by task demands, reflecting an egocentric representation of time.”

      In the revised manuscript, we also corrected statements in the Introduction that may have been misattributed (see Reviewer 2, comment 4(ii)) and added several relevant and important publications.

      (2) The experimental approach lacks sufficient details to be comprehensible to a general audience. In my opinion, the results are thus currently uninterpretable. I highlight only a couple of specific points (out of many). I recommend revision and clarification.

      (a) No explanation of the narrative is being provided. The authors report a distribution of durations with no clear description of the actual sequence of events. The authors should provide the text that was used, how they controlled for low-level and high-level linguistic confounds.

      We thank the reviewer for the suggestions. The event sequence for the odd-numbered participants is shown in the original Figure 1. In the revised manuscript, we added to Figure 1 the figure supplement 1 to illustrate the actual sequence of events for the participants with both odd and even numbers. We also added the narratives used in the reading phase of the learning procedures for the participants with both odd and even numbers (Figure 1—source data 1).

      To control for low-level linguistic confounds, we included the number of syllables as a covariate in the first-level general linear model in the fMRI analysis. To address high-level linguistic confounds, such as semantic information (which is difficult to quantify), we randomly assigned event labels to the 15 events twice, creating two counterbalanced versions for participants with even and odd numbers (see Comment 2b below).

      (b) The authors state, "we randomly assigned 15 phrases to the events twice". It is impossible to comprehend what this means. Were these considered stimuli? Controls? It is also not clear which event or stimulus is part of the "learning set" and whether these were indicated to be such to participants.

      We apologize for any confusion in the Results section and the legend of Figure 1. Our motivation was explained in the "Stimuli" section of the Methods. In the revised manuscript, we have clarified this by adding an explanation to the legend of Figure 1 and including the supplementary Figure 1: "To minimize potential confounds between the semantic content of the event phrases and the temporal structure of the events, we randomly assigned the phrases to the events, creating two versions for participants with even and odd ID numbers. Both versions can be seen in Figure 1—figure supplement 1 and Figure 1—source data 1."

      (c) The left/right counterbalancing is not being clearly explained. The authors state that there is counterbalancing, but do not sufficiently explain what it means concretely in the experiment. If a weak correlation exists between sequential position and distance, it also means that the position and the distance have not been equated within. How do the authors control for these?

      We thank the reviewer for highlighting this point and apologize for the lack of clarity in the original manuscript. In the current version (Page 40), we have provided further clarification: “We carefully selected two sets of 20 event pairs from the 210 possible combinations, assigning them to the odd and even runs of the fMRI experiment. Using a brute-force search, we identified 20 pairs in which sequential distance showed only weak correlations with positional information for both reference and target events (ranging from 1 to 15), as well as with behavioral responses (Same vs. Different or Future vs. Past, coded as 0 and 1), with all correlation coefficients below 0.2. At the same time, we balanced the proportion of correct responses across conditions: for the external-perspective task, Same/Different = 11/9 and 12/8; for the internal-perspective task, Future/Past = 12/8 and 8/12. Under these constraints, the sequential distances in both sets ranged from 1 to 5. To further mitigate spatial response biases, we pseudorandomized the left/right on-screen positions of the two response options within each task block, while ensuring an equal number of correct responses mapped to the left and right buttons (i.e., 10 per block).”

      The event pairs we selected already represent the best possible choice given all the criteria we aimed to satisfy. It is impossible to completely eliminate all potential correlations. For instance, if the target event occurs near the beginning of the day, it will tend to fall in the past, whereas if it occurs near the end of the day, it is more likely to fall in the future. To further ensure that the significant results were not driven by these weak confounding factors, we constructed another GLM that included three additional parametric modulators: the sequence position of the target event (ranging from 1 to 15) and the behavioral responses (Future vs. Past in the internal-perspective task; Same vs. Different in the external-perspective task, coded as 0 and 1). The significant findings were unaffected.
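
      The selection procedure quoted above can be illustrated with a short sketch. It is not the authors' code: it substitutes a seeded random search for the exhaustive brute-force search they describe, checks only three of the stated correlations (sequential distance against reference position, target position, and a Future/Past code), and omits the Same/Different code and the response-balancing constraints because the part-of-day boundaries are not given here. The function names and the `max_distance` filter are assumptions introduced for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation; returns NaN if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def ordered_pairs(n_events=15):
    """All (reference, target) pairs of distinct events: 15 * 14 = 210."""
    return [(r, t) for r in range(1, n_events + 1)
            for t in range(1, n_events + 1) if r != t]


def pair_correlations(pairs):
    """Correlate sequential distance with positions and the Future/Past code."""
    dist = [abs(t - r) for r, t in pairs]
    ref = [r for r, _ in pairs]
    tgt = [t for _, t in pairs]
    future = [1 if t > r else 0 for r, t in pairs]  # 1 = Future, 0 = Past
    return [pearson(dist, ref), pearson(dist, tgt), pearson(dist, future)]


def find_pair_set(n_pairs=20, threshold=0.2, max_distance=5,
                  max_tries=100_000, seed=0):
    """Randomly sample pair sets until all |r| fall below the threshold."""
    rng = random.Random(seed)
    candidates = [p for p in ordered_pairs() if abs(p[1] - p[0]) <= max_distance]
    for _ in range(max_tries):
        subset = rng.sample(candidates, n_pairs)
        # NaN comparisons are False, so degenerate subsets are rejected.
        if all(abs(r) < threshold for r in pair_correlations(subset)):
            return subset
    return None


selected = find_pair_set()
print(selected)
```

      Calling `find_pair_set` with two different seeds would give two candidate sets, one for the odd runs and one for the even runs; the additional balancing criteria described in the quoted passage would need to be added as further checks inside the loop.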

      (d) The authors used two tasks. In the "external perspective" one, the authors asked participants to report whether events were part of the same or a different part of the day. In the "internal perspective" one, the authors asked participants to project themselves to the reference event and to determine whether the target event occurred before or after the projected viewpoint. The first task is a same/different recognition task. The second task is a temporal order task (e.g., Arzy et al. 2009). These two tasks are radically different and do not require the same operationalization. The authors should minimally provide a comprehensive comparison of task requirements, their operationalization, and, more importantly, assess the behavioral biases inherent to each of these tasks that may confound brain activity observed with fMRI.

      We understand the reviewer’s concern. We agree that there is a substantial difference between the two tasks. However, the primary goal of this study was not to directly compare these tasks to isolate a specific cognitive component. Rather, the neural correlates of temporal distance were first identified as brain regions showing a significant correlation between neural activity and temporal distance using the parametric modulation analysis. We then compared these neural correlates between the two tasks. Therefore, any general differences between the tasks should not be a confound for our main results. Our aim was to examine whether the hippocampal representation of temporal distance remains consistent across different perspectives, and whether the parietal representation of temporal distance varies as a function of the perspective adopted.

      Therefore, the main aim of our task manipulation was to ensure that participants adopted either an external or an internal perspective on the event sequence, depending on the task condition. In the Introduction (Pages 6–7), we clarify this manipulation as follows: “In the external-perspective task, participants localized events with respect to external temporal boundaries, judging whether the target event occurred in the same or a different part of the day as the reference event. In the internal-perspective task, participants were instructed to mentally project themselves into the reference event and localize the target event relative to their own temporal point, judging whether the target event happened in the future or the past of the reference event (see Methods for details of the scanning procedure).”

      We believe this task manipulation was successful. Behaviorally, the two tasks showed opposite correlations between reaction time and temporal distance, resembling the symbolic distance versus mental scanning effect. Neurally, contrasting the internal- and external-perspective tasks revealed activation of the default mode network, which is known to play a central role in self-projection (Buckner et al., 2017).

      (e) The authors systematically report interpreted results, not factual data. For instance, while not showing the results on behavioral outcomes, the authors directly interpret them as symbolic distance effects.

      Thank you for this comment. In the original paper, we reported the relevant statistics before our interpretation: “Sequential Distance was correlated positively with RT in the external-perspective task (z = 3.80, p < 0.001) but negatively in the internal-perspective task (z = -3.71, p < 0.001).” However, they may have been difficult to notice, and we are including a figure for the RT analysis in the revised manuscript.

      Crucially, the authors do not comment on the obvious differences in task difficulty in these two tasks, which demonstrates a substantial lack of control in the experimental design. The same/different task (task 1 called "external perspective") comes with known biases in psychophysics that are not present in the temporal order task (task 2 called " internal perspective"). The authors also did not discuss or try to match the performance level in these two tasks. Accordingly, the authors claim that participants had greater accuracy in the external (same/different) task than in the internal task, although no data are shown and provided to support this report. Further, the behavioral effect is trivialized by the report of a performance accuracy trade off that further illustrates that there is a difference in the task requirements, preventing accurate comparison of the two tasks.

      As noted in Question 2d, we acknowledge the substantial difference between the two tasks. However, the primary goal of this study was not to directly compare these tasks to isolate a specific cognitive component. Instead, we first identified the neural correlates of temporal distance as brain regions showing a significant correlation between neural activity and temporal distance, independent of task demands. We then compared these neural correlates across the two task conditions, which were designed to engage different temporal perspectives. Therefore, any general differences between the tasks should not be a confound for our main findings and interpretation.

      Our aim was to investigate whether the hippocampal representation of temporal distance remains consistent across different perspectives and whether the parietal representation of temporal distance varies as a function of the perspective adopted. We do not see how this double-dissociation pattern could be explained by differences in task difficulty.

      While we do not consider the overall difference in task difficulty between the two tasks to be a confounding factor, we acknowledge the potential confound posed by variations in task difficulty across temporal distances (1 to 5). This concern arises from the similarity between the activity patterns in the posterior parietal cortex and reaction time across temporal distances. To address this, we conducted control analyses to test this hypothesis (see the second and third points from Reviewer 2 for details).

      On page 8, we present the behavioral accuracy data: “Participants showed significantly higher accuracy in the external-perspective task than in the internal-perspective task (external-perspective task: M = 93.5%, SD = 4.7%; internal-perspective task: M = 89.5%, SD = 8.1%; paired t(31) = 3.33, p = 0.002).”

      All fMRI contrasts are also confounded by this experimental shortcoming, seeing as they are all reported at the interaction level across a task. For instance, in Figure 4, the authors report a significant beta difference between internal and external tasks. It is impossible to disentangle whether this effect is simply due to task difference or to an actual processing of the duration that differs across tasks, or to the nature of the representation (the most difficult to tackle, and the one chosen by the authors).

      We thank the reviewer for pointing out this important issue. Like temporal distance, the neural correlates of duration were not derived from a direct contrast between the two tasks. Instead, they were identified by detecting brain regions showing a significant correlation between neural activity and the implied duration of each event using the parametric modulation analysis. Therefore, what is shown in Figure 4 reflects the significant differences in these neural correlations with duration between the two tasks.

      The observed difference in the neural representation of duration between the two tasks was unexpected. In the original manuscript, we provided a post hoc explanation: “Since the external-perspective task in the current study encouraged the participants to compare the event sequence with the external parallel temporal landmarks, duration representation in the hippocampus may be dampened.”

      However, we agree that this difference might also arise from other factors distinguishing the two tasks. In the revised manuscript, we have clarified this possibility as follows: “The difference in duration representation between the two tasks remains open to interpretation. One possible explanation is that the hippocampus is preferentially involved in memory for durations embedded within event sequences (see review by Lee et al., 2020). In the internal-perspective task, participants indeed localized events within the event sequence itself. In contrast, the external-perspective task encouraged participants to compare the event sequence with external temporal landmarks, which may have attenuated the hippocampal representation of duration.”

      Conclusion:

      In conclusion, the current experimental work is confounded and lacks controls. Any behavioral or fMRI contrasts between the two proposed tasks can be parsimoniously accounted for by difficulty or attentional differences, not the claim of representational differences being argued for here.

      We hope that our explanations and clarifications above adequately address the reviewer’s concerns. We would like to reiterate that we did not directly compare the two tasks. Rather, we first identified the neural representations of sequential distance and duration, and then examined how these representations differed across tasks. It is unclear to us how the overall difference in task difficulty or attentional demands could lead to the observed pattern of results.

      By determining where the neural representations were consistent and where they diverged, we were able to differentiate brain regions that encode temporal information allocentrically from those that represent temporal information in a perspective-dependent manner, modulated by task demands.

      Reviewer #2 (Public review):

      Summary:

      Xu et al. used fMRI to examine the neural correlates associated with retrieving temporal information from an external compared to internal perspective ('mental time watching' vs. 'mental time travel'). Participants first learned a fictional religious ritual composed of 15 sequential events of varying durations. They were then scanned while they either (1) judged whether a target event happened in the same part of the day as a reference event (external condition); or (2) imagined themselves carrying out the reference event and judged whether the target event occurred in the past or will occur in the future (internal condition). Behavioural data suggested that the perspective manipulation was successful: RT was positively correlated with sequential distance in the external perspective task, while a negative correlation was observed between RT and sequential distance for the internal perspective task. Neurally, the two tasks activated different regions, with the external task associated with greater activity in the supplementary motor area and supramarginal gyrus, and the internal condition with greater activity in default mode network regions. Of particular interest, only a cluster in the posterior parietal cortex demonstrated a significant interaction between perspective and sequential distance, with increased activity in this region for longer sequential distances in the external task, but increased activity for shorter sequential distances in the internal task. Only a main effect of sequential distance was observed in the hippocampus head, with activity being positively correlated with sequential distance in both tasks. No regions exhibited a significant interaction between perspective and duration, although there was a main effect of duration in the hippocampus body with greater activity for longer durations, which appeared to be driven by the internal perspective condition. 
On the basis of these findings, the authors suggest that the hippocampus may represent event sequences allocentrically, whereas the posterior parietal cortex may process event sequences egocentrically.

      We sincerely thank the reviewer for providing an accurate, comprehensive, and objective summary of our study.

      Strengths:

      The topic of egocentric vs. allocentric processing has been relatively under-investigated with respect to time, having traditionally been studied in the domain of space. As such, the current study is timely and has the potential to be important for our understanding of how time is represented in the brain in the service of memory. The study is well thought out, and the behavioural paradigm is, in my opinion, a creative approach to tackling the authors' research question. A particular strength is the implementation of an imagination phase for the participants while learning the fictional religious ritual. This moves the paradigm beyond semantic/schema learning and is probably the best approach besides asking the participants to arduously enact and learn the different events with their exact timings in person. Importantly, the behavioural data point towards successful manipulation of internal vs. external perspective in participants, which is critical for the interpretation of the fMRI data. The use of syllable length as a sanity check for RT analyses, as well as neuroimaging analyses, is also much appreciated.

      We thank the reviewer for the positive and encouraging comments.

      Weaknesses/Suggestions:

      Although the design and analysis choices are generally solid, there are a few finer details/nuances that merit further clarification or consideration in order to strengthen the readers' confidence in the authors' interpretation of their data.

      (1) Given the known behavioural and neural effects of boundaries in sequence memory, I was wondering whether the number of traversed context boundaries (i.e., between morning-afternoon, and afternoon-evening) was controlled for across sequential length in the internal perspective condition? Or, was it the case that reference-target event pairs with higher sequential numbers were more likely to span across two parts of the day compared to lower sequential numbers? Similarly, did the authors examine any potential differences, whether behaviourally or neurally, for day part same vs. day part different external task trials?

      We thank the reviewer for the thoughtful comments. When we designed the experiment, we minimized the correlation between the sequential distance between the target and reference events and whether the reference and target events occurred within the same or different parts of the day (coded as Same = 0, Different = 1). The point-biserial correlation coefficient between these two variables, computed across all trials within the same run, was kept below 0.2.
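
      As a minimal sketch of the design check described above, the point-biserial correlation is simply the Pearson correlation between a binary variable and a continuous one. The trial lists below are hypothetical and are not the study's actual design; the helper name `point_biserial` is ours.

```python
import math


def point_biserial(binary, continuous):
    """Pearson correlation between a 0/1 variable and a numeric variable."""
    n = len(binary)
    mean_b = sum(binary) / n
    mean_c = sum(continuous) / n
    cov = sum((b - mean_b) * (c - mean_c) for b, c in zip(binary, continuous))
    var_b = sum((b - mean_b) ** 2 for b in binary)
    var_c = sum((c - mean_c) ** 2 for c in continuous)
    return cov / math.sqrt(var_b * var_c)


# Hypothetical run: day-part coding (Same = 0, Different = 1) and
# sequential distance (1 to 5) for ten trials, balanced by construction.
same_different = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
distance = [1, 2, 3, 4, 5, 5, 4, 3, 2, 1]
r_balanced = point_biserial(same_different, distance)

# Hypothetical confounded run: "Different" trials always span longer distances.
r_confounded = point_biserial([0, 0, 1, 1], [1, 2, 4, 5])

# A run would pass the stated criterion only if |r| stays below 0.2.
print(abs(r_balanced) < 0.2, abs(r_confounded) < 0.2)
```

      In the balanced run the mean distance is identical for Same and Different trials, so the coefficient is zero; in the confounded run it is close to 1 and the run would be rejected.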

      To investigate the effect of day-part boundaries on behavior, as well as the contribution of other factors, we conducted a new linear mixed-effects model analysis incorporating four additional variables. They are whether the target and the reference events are within the same or different parts of the day (i.e., Same vs. Different), whether the target event is in the future or the past of the reference event (i.e., Future vs. Past), and the interactions of the two factors with Task Type (i.e., internal- vs. external-perspective task).

      The results are largely the same as the original ones reported in the table: there was a significant main effect of Syllable Length, and the interaction effects between Task Type and Sequential Distance and between Task Type and Duration remained significant. In addition, we found a significant interaction effect between Task Type and Same vs. Different.

      As shown in Figure 2—figure supplement 1, this Same vs. Different effect was in line with the effect of Sequential Distance, with two events in the same and different parts of the day corresponding to the short and long sequential distances, respectively. Given that Sequential Distance had already been included in the model, the effect of parts of the day should result from the boundary effect across day parts or the chunking effect within day parts, i.e., the sequential distance across different parts of the day was perceived as longer, while the sequential distance within the same part of the day was perceived as shorter. We have incorporated these findings into the manuscript.

      Neurally, to further verify that the significant effects of sequential distance were not driven by its weak correlation with the Same/Different judgment or other potential confounding factors, we constructed another GLM that incorporated three additional parametric modulators: the sequence position of the target event (ranging from 1 to 15) and the behavioral responses (Future vs. Past in the internal-perspective task; Same vs. Different in the external-perspective task, coded as 0 and 1). The significant findings were unaffected.

      (2) I would appreciate further insight into the authors' decision to model their task trials as stick functions with duration 0 in their GLMs, as opposed to boxcar functions with varying durations, given the potential benefits of the latter (e.g., Grinband et al., 2008). I concur that in certain paradigms, RT is considered a potential confound and is taken into account as a nuisance covariate (as the authors have done here). However, given that RTs appear to be critical to the authors' interpretation of participant behavioural performance, it would imply that variations in RT actually reflect variations in cognitive processes of interest, and hence, it may be worth modelling trials as boxcar functions with varying durations.

      We appreciate the reviewer’s insightful comment on this important issue. Whether to control for RT’s influence on fMRI activation is indeed a long-standing paradox. On the one hand, RT reflects underlying cognitive processes and therefore should not be fully controlled for. On the other hand, RT can independently influence neural activity, as several brain networks vary with RT irrespective of the specific cognitive process involved—a domain-general effect. For example, regions within the multiple-demand network are often positively correlated with RT across different cognitive domains.

      Our strategy in the manuscript is to first present the results without including RT as a control variable and then examine whether the effects are preserved after controlling for RT. In the revised manuscript, we have clarified this approach (Page 13): “Here, changes in activity levels within the PPC were found to align with RT. Whether to control for RT’s influence on fMRI activation represents a well-known paradox. On the one hand, RT reflects underlying cognitive processes and therefore should not be fully controlled for. On the other hand, RT can independently influence neural activity, as several brain networks vary with RT irrespective of the specific cognitive process involved—a domain-general effect. For instance, regions within the multiple-demand network are often positively correlated with RT and task difficulty across diverse cognitive domains (e.g., Fedorenko et al., 2013; Mumford et al., 2024). To evaluate the second possibility, we conducted an additional control analysis by including trial-by-trial RT as a parametric modulator in the first-level model (see Methods). Notably, the same PPC region remained the only area in the entire brain showing a significant interaction between Task Type and Sequential Distance (voxel-level p < 0.001, cluster-level FWE-corrected p < 0.05). This finding indicates that PPC activity cannot be fully attributed to RT. Furthermore, we do not interpret the effect as reflecting a domain-general RT influence, as regions within the multiple-demand system—typically sensitive to RT and task difficulty—did not exhibit significant activation in our data.”

      The reason we did not use boxcar functions with varying durations in our original manuscript is that we also applied parametric modulation in the same model. In the parametric modulation, all parametric modulators inherit the onsets and durations of the events being modulated. Consequently, the modulators would also take the form of boxcar functions rather than stick functions—the height of each boxcar reflecting the parameter value and its length reflecting the RT. We were uncertain whether this approach would be appropriate, as we have not encountered other studies implementing parametric modulation in this manner.
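
      The difference between the two event models can be illustrated with a minimal sketch. It builds only the pre-convolution time course of a mean-centred parametric modulator (HRF convolution is omitted), and it is not the analysis code used in the study; the function name `build_modulator`, the onsets, the RTs, and the distances are all hypothetical.

```python
def build_modulator(onsets, durations, values, n_bins, dt):
    """Return a pre-convolution parametric-modulator time course.

    Each event contributes its mean-centred value over the bins covering
    [onset, onset + duration). A duration of 0 yields a one-bin stick.
    """
    mean_value = sum(values) / len(values)
    series = [0.0] * n_bins
    for onset, duration, value in zip(onsets, durations, values):
        start = int(round(onset / dt))
        n_active = max(1, int(round(duration / dt)))  # stick = one bin
        for i in range(start, min(start + n_active, n_bins)):
            series[i] += value - mean_value
    return series


onsets = [1.0, 5.0, 9.0]   # hypothetical trial onsets in seconds
rts = [0.5, 1.0, 1.5]      # hypothetical reaction times in seconds
distances = [1, 3, 5]      # hypothetical sequential distances
dt, n_bins = 0.5, 24       # time resolution (s) and number of bins

# Stick model: every event has duration 0, so only the height varies.
stick = build_modulator(onsets, [0.0] * 3, distances, n_bins, dt)
# Boxcar model: the modulator inherits the RT as its duration.
boxcar = build_modulator(onsets, rts, distances, n_bins, dt)

print(sum(stick), sum(boxcar))
```

      With sticks, the mean-centred modulator sums to zero. With boxcars, its area also depends on RT, so the parameter value and the RT are no longer separated within the modulator, which is the concern raised above.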

      For exploratory purposes, we also conducted a first-level analysis using boxcar functions with variable durations. The same PPC region remained the strongest area in the entire brain that shows an interaction effect between Task Type and Sequential Distance. However, the cluster size was slightly reduced (voxel-level p < 0.001, cluster-level FWE-corrected p = 0.0610; see the Author response image 1 below). The cross indicates the MNI coordinates at [38, –69, 35], identical to those shown in the main results (Figure 4A).

      Author response image 1.

      (3) The activity pattern across tasks and sequential distance in the posterior parietal cortex appears to parallel the RT data. Have the authors examined potential relationships between the two (e.g., individual participant slopes for RT across sequential distance vs. activity betas in the posterior parietal cortex)?

      We thank the reviewer for this helpful suggestion. As shown in the Author response image 2, the interaction between Task Type and Sequential Distance was a stronger predictor of PPC activation than of RT. Because PPC activation and RT are measured on different scales, we compared their standardized slopes (standardized β) measuring the change in a dependent variable in terms of standard deviations for a one-standard-deviation increase in an independent variable. The standardized β for the Task Type × Sequential Distance interaction was −0.30 (95% CI [−0.42, −0.19]) for PPC activation and −0.21 (95% CI [−0.30, −0.13]) for RT. The larger standardized effect for PPC activation indicates that the Task Type × Sequential Distance interaction was a stronger predictor of neural activation than of behavioral RT.

      Author response image 2.
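
      To illustrate why standardized slopes allow a comparison across measures on different scales, here is a minimal sketch for a simple one-predictor regression. It is not the mixed-effects model used in the response; the function name `standardized_slope` and the data are hypothetical.

```python
import statistics


def standardized_slope(x, y):
    """Least-squares slope of y on x, rescaled to standard-deviation units."""
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    raw_slope = sxy / sxx
    # Change in y (in SDs of y) per one-SD increase in x.
    return raw_slope * statistics.stdev(x) / statistics.stdev(y)


distance = [1, 2, 3, 4, 5]               # hypothetical predictor
rt_seconds = [2.1, 1.9, 1.6, 1.2, 1.0]   # hypothetical outcome in seconds
rt_millis = [v * 1000 for v in rt_seconds]

beta_s = standardized_slope(distance, rt_seconds)
beta_ms = standardized_slope(distance, rt_millis)

# The raw slope changes by a factor of 1000 with the unit,
# but the standardized slope does not.
print(beta_s, beta_ms)
```

      Because the standardized slope is unit-free, the same quantity can be compared between an outcome measured in seconds and one measured in arbitrary activation units.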

      A more relevant question is whether PPC activation can be explained by temporal information (i.e., the sequential distance) independently of RT. To test this, we included both Sequential Distance and RT in the same linear mixed-effects model predicting PPC Activation Level. As shown in the Author response table 1, although RT independently influenced PPC activation (F(1, 288) = 4.687, p = 0.031), the interaction between Task Type and Sequential Distance was a much stronger independent predictor (F(1, 290) = 19.319, p < 0.001).

      Author response table 1.

      PPC Activation Level Predicted by Sequential Distance and RT

      Linear Mixed Model Formula: PPC Activation Level ~ 1 + Task Type * (Sequential Distance + RT) + (1 | Participant)

      (4) There were a few places in the manuscript where the writing/discussion of the wider literature could perhaps be tightened or expanded. For instance:

      (i) On page 16, the authors state 'The negative correlation between the activation level in the right PPC and sequential distance has already been observed in a previous fMRI study (Gauthier & van Wassenhove, 2016b). The authors found a similar region (the reported MNI coordinate of the peak voxel was 42, -70, 40, and the MNI coordinate of the peak voxel in the present study was 39, -70, 35), of which the activation level went up when the target event got closer to the self-positioned event. This finding aligns with the evidence suggesting that the posterior parietal cortex implements egocentric representations.' Without providing a little more detail here about the Gauthier & van Wassenhove study and what participants were required to do (i.e., mentally position themselves at a temporal location and make 'occurred before' vs. 'occurred after' judgements of a target event), it could be a little tricky for readers to follow why this convergence in finding supports a role for the posterior parietal cortex in egocentric representations.

      We appreciate the reviewer’s comments. In the revised manuscript, we have provided a more detailed explanation of Gauthier and van Wassenhove’s study (Page 17): “The negative correlation between the activation level in the right PPC and sequential distance has already been observed in a previous fMRI study by Gauthier & van Wassenhove (2016b). In their study, the participants were instructed to mentally position themselves at a specific time point and judge whether a target event occurred before or after that time point. The authors identified a similar brain region (reported MNI coordinates of the peak voxel: 42, −70, 40), closely matching the activation observed in the present study (MNI coordinates of the peak voxel: 39, −70, 35). In both studies, activation in this region increased as the target event approached the self-positioned time point, which aligns with the evidence suggesting that the posterior parietal cortex implements egocentric representations.”

      (ii) Although the authors discuss the Lee et al. (2020) review and related studies with respect to retrospective memory, it is critical to note that this work has also often used prospective paradigms, pointing towards sequential processing being the critical determinant of hippocampal involvement, rather than the distinction between retrospective vs. prospective processing.

      We sincerely thank the reviewer for highlighting these important points. In response, we have revised the section of the Introduction discussing the neural underpinnings of duration (Pages 3-4). “Neurocognitive evidence suggests that the neural representation of duration engages distinct brain systems. The motor system—particularly the supplementary motor area—has been associated with prospective timing (e.g., Protopapa et al., 2019; Nani et al., 2019; De Kock et al., 2021; Robbe, 2023), whereas the hippocampus is considered to support the representation of duration embedded within an event sequence (e.g., Barnett et al., 2014; Thavabalasingam et al., 2018; see also the comprehensive review by Lee et al., 2020).”

      (iii) The authors make an interesting suggestion with respect to hippocampal longitudinal differences in the representation of event sequences, and may wish to relate this to Montagrin et al. (2024), who make an argument for the representation of distant goals in the anterior hippocampus and immediate goals in the posterior hippocampus.

      We thank the reviewer for bringing this intriguing and relevant study to our attention. In the Discussion of the manuscript, we have incorporated it into our discussion (Page 21): “Evidence from the spatial domain has suggested that the anterior hippocampus (or the ventral rodent hippocampus) implements global and gist-like representations (e.g., larger receptive fields), whereas the posterior hippocampus (or the dorsal rodent hippocampus) implements local and detailed ones (e.g., finer receptive fields) (e.g., Jung et al., 1994; Kjelstrup et al., 2008; Collin et al., 2015; see reviews by Poppenk et al., 2013; Robin & Moscovitch, 2017; see Strange et al., 2014 for a different opinion). Recent evidence further shows that the organizational principle observed along the hippocampal long axis may also extend to the temporal domain (Montagrin et al., 2024). In that study, the anterior hippocampus showed greater activation for remote goals, whereas the posterior hippocampus was more strongly engaged for current goals, which are presumed to be represented in finer detail.”

      Reviewing Editor Comments:

      While both reviewers acknowledged the significance of the topic, they raised several important concerns. We believe that providing conceptual clarification, adding important methodological details, as well as addressing potential confounds will further strengthen this paper.

      We thank the editor for the suggestions.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) Please, provide the actual ethical approval #.

      We have added the ethical approval number in the revised manuscript (P 36): “The ethical committee of the University of Trento approved the experimental protocol (Approval Number 2019-018),”

      (2) Thirty-two participants were tested. Please report how you estimated the sample size was sufficient to test your working hypothesis.

      We thank the editor for pointing out this omission. In the revised manuscript, we have added an explanation for our choice of sample size (p. 36): “The sample size was chosen to align with the upper range of participant numbers reported in previous fMRI studies that successfully detected sequence or distance effects in the hippocampus (N = 15–34; e.g., Morgan et al., 2011; Howard et al., 2014; Deuker et al., 2016; Garvert et al., 2017; Theves et al., 2019; Park et al., 2021; Cristoforetti et al., 2022).”

      (3) All MRI figures: please orient the reader; left/right should be stated.

      In the revised manuscript, we have added labels to all MRI figures to indicate the left and right hemispheres.

      (4) In Figure 3A-B, the clear lateralization of the activation is not discussed in the Results or in the Discussion. Was it predicted?

      We thank the editors for highlighting this important point regarding hemispheric lateralization. The right-lateralization observed in our findings is indeed consistent with previous literature. In the revised manuscript, we have expanded our discussion to emphasize this aspect more clearly.

      For the parietal cortex, we now note (Page 17-18): “The negative correlation between activation in the right posterior parietal cortex (PPC) and sequential distance has previously been reported in an fMRI study by Gauthier and van Wassenhove (2016b). In their paradigm, participants were instructed to mentally position themselves at a specific time point and judge whether a target event occurred before or after that point. The authors identified a similar region (peak voxel MNI coordinates: 42, −70, 40), closely corresponding to the activation observed in the present study (peak voxel MNI coordinates: 39, −70, 35). In both studies, activation in this region increased as the target event approached the self-positioned time point, consistent with evidence suggesting that the posterior parietal cortex supports egocentric representations. Neuropsychological studies have further shown that patients with lesions in the bilateral or right PPC exhibit ‘egocentric disorientation’ (Aguirre & D’Esposito, 1999), characterized by an inability to localize objects relative to themselves (e.g., Case 2: Levine et al., 1985; Patient DW: Stark, 1996; Patients MU: Wilson et al., 1997, 2005).”

      For the hippocampus, we have added (Page 19): “Previous research has shown that hippocampal activation correlates with distance (e.g., Morgan et al., 2011; Howard et al., 2014; Garvert et al., 2017; Theves et al., 2019; Viganò et al., 2023), and that distributed hippocampal activity encodes distance information (e.g., Deuker et al., 2016; Park et al., 2021). Most studies have reported hippocampal effects either bilaterally or predominantly in the right hemisphere, whereas only one study (Morgan et al., 2011) found the effect localized to the left hippocampus.”

    1. Reviewer #1 (Public review):

      In this study, the authors provide an integrated proteogenomics pipeline to enable the discovery of novel peptides in an Ewing sarcoma cell line (A673). To identify novel full-length resolved isoforms, they performed long-read RNA sequencing (Oxford Nanopore Technology). Then, to increase the chance of detecting Ewing-specific neopeptides, the authors combined two approaches: a multi-protease digestion and a multi-dimensional proteomics approach.

      Given the importance of novel isoforms and cryptic sites in neoantigen discovery and its putative applications in immunotherapy, this method and resource paper is of interest for the Ewing community and potentially for a broader cancer audience. The originality of this paper relies mostly on this optimized method to discover novel peptides (long-read sequencing with multi-protease, multi-dimensional trapped ion mobility spectrometry parallel accumulation-serial fragmentation mass spectrometry). Although, to my knowledge, no study combining long-read sequencing and proteomics methods has been published on Ewing sarcoma, this study appears limited by a few aspects:

      (1) The study is restricted to the analysis of a single cell line (A673). The authors should consider extending the analysis to other Ewing cell lines.

      (2) The characterization of the 1121 non-canonical transcripts can be improved. How many are just splice variants of known genes, and how many are bona fide neogenes? In this respect, the definition of what the authors call neogene is quite unclear. Is a transcript with a new exon reported as a neogene? Is a transcript with a new start site reported as a neogene? It should be clearly indicated which categories of Figure 4B are reported on Figure 4D. A general flow chart would be very useful to help follow the analysis process.

      (3) Similarly, the authors detect 3216 A673 specific proteins with no match in SwissProt. This number decreases to 72 "putative non-canonical proteoforms with unique peptides after BLASTp" against Uniprot. Again, a flow chart would conveniently enable one to follow the step-by-step analysis.

      (4) Finally, only 17 spectral matches are suggested to be derived from non-canonical proteoforms. It would be important to compare the spectrum of these detected peptides with that of synthetic peptides. Such an analysis would enable us to assess the number of reliably detected proteoforms that can be expected in an Ewing sarcoma cell line.

      (5) It is very unclear what the authors want to highlight in Supplementary Figure 5. Is it that non-canonical transcripts are broadly expressed in normal tissue? Which again raises the question of definitions of neogenes, non-canonical... Apparently, this figure shows that these non-canonical transcripts contain a large part of canonical sequences, which account for the strong signal in many normal tissues. A similar heatmap could be presented, including only the non-canonical sequences of the non-canonical transcripts. This figure should also include Ewing sarcoma samples.

    2. Reviewer #2 (Public review):

      The paper from Kulej et al. reports a set of tools for proteogenomic analysis of cancer proteomes. Their approach utilizes modern methods in long-read RNA sequencing to assemble a proteome database that is specific to Ewing sarcoma-derived A673 cells. To maximize proteome coverage and therefore increase the odds of detecting cancer-specific alterations at the protein level, the authors use multiple enzymes (trypsin, GluC, etc.) to digest cellular proteins and then perform multidimensional peptide fractionation. Peptide samples are then analyzed by LC-MS/MS using data-dependent and data-independent schemes on a timsTOF mass spectrometer. Proteogenomics is an important area of investigation for cancer research and does require new informatics tools.

      The authors describe an end-to-end workflow where they claim to have optimized four different steps:

      (1) Assembly of a sample-specific protein database using long-read transcriptomic data.

      (2) Use of 8 different proteolytic enzymes to maximize diversity of peptides.

      (3) Multiple stages of peptide fractionation using SCX and high pH rp chromatography.

      (4) Utilize acquisition methods on the timstof mass spec to provide MS/MS data from single-charged peptides and multiply-charged peptides.

      The authors published two earlier versions of ProteomeGenerator (versions 1 and 2) in the Journal of Proteome Research. In these earlier versions, 'ProteomeGenerator' was the set of software tools designed to integrate DNA and RNA sequencing to create a sample-specific protein database. To test the performance of each ProteomeGenerator version, the authors generated LC-MS/MS data using a combination of trypsin and LysC in one paper, and trypsin, LysC, and GluC in the other. In both papers, they performed some level of peptide fractionation prior to LC-MS/MS. They acquired LC-MS/MS data on a Thermo Q-Exactive in one paper and a Thermo Orbitrap mass spec in the other paper.

      In the current paper, the primary innovation is the use of long-read sequencing to potentially improve the quality of the sample specific protein database. The other three components noted above are incremental compared to the authors' previous two papers and generally accepted practices in the field of proteomics. To note one example, the authors previously digested proteins using three enzymes and now use eight. Similarly, they are now using a timstof Bruker mass spec instead of one from Thermo. The detailed descriptions around the use of many enzymes and peptide fractionation, etc., create a very technically oriented paper, similar to or more so than the authors' earlier papers in J. Proteome Research. So, while there is enthusiasm for the use of long-read sequencing across biomedical research, the impact here for proteogenomic applications is somewhat lost with all of the technical description for experimental details that are not particularly innovative. In this respect, the report is not well matched to a broad readership.

    1. Reviewer #3 (Public review):

      Summary:

      In this study, Wang et al. investigate how herbivorous insects overcome plant receptor-mediated immunity by targeting plant receptor-like proteins. The authors identify two independently evolved salivary effectors, BtRDP in whiteflies and NlSP694 in brown planthoppers, that promote the degradation of plant RLP4 through the ubiquitin-dependent proteasome pathway. NtRLP4 from tobacco and OsRLP4 from rice are shown to confer resistance against herbivores by activating defense signaling, while BtRDP and NlSP694 suppress these defenses by destabilizing RLP4 proteins.

      Strengths:

      This work highlights a convergent evolutionary strategy in distinct insect lineages and advances our understanding of insect-plant coevolution at the molecular level.

      Two minor comments:

      In line 140, yeast two-hybrid (Y2H) was used to screen for interacting proteins in plants. However, it is generally difficult to identify membrane receptors using Y2H. Please provide more methodological details to justify this approach, or alternatively, include a discussion explaining this.

      In Figure S12C, the interaction between the two proteins appears to be present in the nucleus as well. Please provide a possible explanation for this observation.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This is a well-structured and interesting manuscript that investigates how herbivorous insects, specifically whiteflies and planthoppers, utilize salivary effectors to overcome plant immunity by targeting the RLP4 receptor.

      Strengths:

      The authors present a strong case for the independent evolution of these effectors and provide compelling evidence for their functional roles.

      Weaknesses:

      Western blot evidence for effector secretion is weak. The possibility of contamination from insect tissues during the sample preparation should be avoided.

      Below are some specific comments and suggestions to strengthen the manuscript.

      Thank you very much for your comments. We have carefully revised the MS following your valuable suggestions and comments.

      (1) Western blot evidence for effector secretion:

      The western blot evidence in Figure 1, which aims to show that the insect protein is secreted into plants, is not fully convincing. The band of the expected size (~30 kDa) in the infested tissues is very weak. Furthermore, the high and low molecular weight bands that appear in the infested tissues do not match the size of the protein in the insects themselves, and a high molecular weight band also appears in the uninfested control tissues. It is difficult to draw a definitive conclusion that this protein is secreted into the plants based on this evidence. The authors should also address the possibility of contamination from insect tissues during the sample preparation and explain how they have excluded this possibility.

      Thank you for pointing this out. One or two bands between 25 and 35 kDa were specifically identified in B. tabaci-infested plants, but not in non-infested plants, and the smaller, high-intensity band is the same size as that of BtRDP in salivary glands. This experiment has been repeated six times. In the current version, we repeated this experiment and included a salivary gland sample as a positive control, which showed the same molecular weight as the specific band in the infested sample. It is noteworthy that in the experiment shown in the current version, only the smaller, high-intensity band appeared, while the low-intensity band did not. The detection of a protein within infested plant tissue is a key criterion for validating the secretion of salivary effectors, an approach supported by numerous studies in this field. Furthermore, our previous LC-MS/MS analysis of B. tabaci watery saliva identified six unique peptides matching BtRDP, providing independent evidence for its presence in saliva. Therefore, as we now state in the manuscript “the detection of BtRDP in infested plants (Fig. 1a) and in watery saliva (Fig. S1) collectively indicates that BtRDP is a salivary protein”.

      Regarding the higher molecular weight band that is present in both infested and non-infested samples, we agree that it most likely represents a non-specific band, which is a common occurrence in Western blot assays. Such bands are sometimes used to indicate comparable sample loading. To address the possibility of contamination by insect tissues, we wish to clarify that all insects and deposited eggs were carefully removed from the infested leaves prior to sample processing. Moreover, BtRDP is undetectable at the egg stage, so no BtRDP-associated band would be detected even if egg contamination occurred. We have revised the Methods section to explicitly state this procedure:

      “After feeding, the eggs deposited on the infested tobacco leaves were removed. The leaves showing no visible insect contamination were immediately frozen in liquid nitrogen and ground to a fine powder.”

      (2) Inconsistent conclusion (Line 156 and Figure 3c):

      The statement in line 156 is inconsistent with the data presented in Figure 3c. The figure clearly shows that the LRR domain of the protein is the one responsible for the interaction with BtRDP, not the region mentioned in the text. This is a critical misrepresentation of the experimental findings and must be corrected. The conclusion in the text should accurately reflect the data from the figure.

      We apologize for any confusion caused by the original phrasing. In our previous manuscript, the description “NtRLP4 without signal peptides and transmembrane domains” referred specifically to the truncated construct NtRLP4<sub>(23-541)</sub> used in the experiment. To prevent any misunderstanding, we have revised the sentence in the updated version to state explicitly: “Point-to-point Y2H assays reveal that NtRLP4<sub>(23-541)</sub> (a truncated version lacking the signal peptide and transmembrane domains) interacts with BtRDP<sup>-sp</sup>”.

      (3) Role of SOBIR1 in the RLP4/SOBIR1 Complex:

      The authors demonstrate that the salivary effectors destabilize the RLP4 receptor, leading to a decrease in its protein levels and a reduction in the RLP4/SOBIR1 complex. A key question remains regarding the fate of SOBIR1 within this complex. The authors should clarify what happens to the SOBIR1 protein after the destabilization of RLP4. Does SOBIR1 become unbound, targeted for degradation itself, or does it simply lose its function without RLP4? This would provide further insight into the mechanism of action of the effectors.

      Thank you for the suggestion. In the current version, we assessed the impact of BtRDP on NtSOBIR1 following NtRLP4 destabilization. The results showed that while NtRLP4-myc accumulation was markedly reduced, NtSOBIR1-flag levels remained unchanged, suggesting that destabilization of NtRLP4 did not affect NtSOBIR1 accumulation.

      (4) Clarification on specificity and evolutionary claims:

      The paper's most significant claim is that the effectors from both whiteflies and planthoppers "independently evolved" to target RLP4. While the functional data is compelling, this evolutionary claim would be more convincing with stronger evidence. Showing that two different effector proteins target the same host protein is a fascinating finding but without a robust phylogenetic analysis, the claim of independent evolution is not fully supported. It would be valuable to provide a more detailed evolutionary analysis, such as a phylogenetic tree of the effector proteins, showing their relationship to other known insect proteins, to definitively rule out a shared, but highly divergent, common ancestor.

      We appreciate the reviewer’s valuable suggestion to investigate a potential evolutionary link between BtRDP and NlSP104. Our initial analysis already indicated no detectable sequence similarity. To address this point more thoroughly, we attempted a phylogenetic analysis. However, we were unable to generate a meaningful alignment due to a complete lack of conserved amino acid sequences. Therefore, we conducted a comparative genomics analysis by blasting both proteins against the genomic or transcriptomic data of 30 diverse insect species. This analysis revealed that RDP is exclusively present in Aleyrodidae species, and SP104 is exclusively present in Delphacidae species (Table S1). Given the absence of sequence similarity, their distinct protein structures, and their lineage-specific distributions, we conclude that BtRDP and NlSP104 are highly unlikely to be homologous and thus did not originate from a common ancestor.

      (5) Role of SOBIR1 in the interaction:

      The results suggest that the effectors disrupt the RLP4/SOBIR1 complex. It is not entirely clear if the effectors are specifically targeting RLP4, SOBIR1, or both. Further experiments, such as a co-immunoprecipitation assay with just RLP4 and the effector, could clarify if the effector can bind to RLP4 in the absence of SOBIR1. This would help to definitively place RLP4 as the primary target.

      We appreciate the reviewer’s insightful comments regarding whether the effector preferentially targets RLP4, SOBIR1, or both. In our study, we conducted reciprocal co-immunoprecipitation assays using RLP4 and BtRDP as controls. These assays showed that BtRDP interacts with RLP4 but does not interact with SOBIR1, supporting the conclusion that SOBIR1 is unlikely to be a direct target of BtRDP. We fully agree that testing the interaction between RLP4 and BtRDP in the absence of SOBIR1 would further strengthen the conclusion. However, we were unable to obtain N. tabacum SOBIR1 knockout mutants, and therefore could not experimentally assess whether the RLP4–BtRDP interaction persists in planta without SOBIR1. Nevertheless, our yeast two-hybrid assays demonstrate that RLP4 and BtRDP can directly interact, indicating that their association does not strictly depend on SOBIR1. Together, these results support the interpretation that RLP4 is the primary target of BtRDP, while SOBIR1 is not directly engaged by the effector.

      (6) Transcriptome analysis (Lines 130-143):

      The transcriptome analysis section feels disconnected from the rest of the manuscript. The findings, or lack thereof, from this analysis do not seem to be directly linked to the other major conclusions of the paper. This section could be removed to improve the manuscript's overall focus and flow. If the authors believe this data is critical, they should more clearly and explicitly connect the conclusions of the transcriptome analysis to the core findings about the effector-RLP4 interaction.

      Thank you for the suggestion. As you and Reviewer #2 pointed out, the transcriptomic analysis was not closely linked to the major conclusions of the paper, and we obtained little information from it. Therefore, we have removed these analyses to improve the manuscript’s overall focus and flow.

      (7) Signal peptide experiments (Lines 145 and beyond):

      The experiments conducted with the signal peptide (SP) are questionable. The SP is typically cleaved before the protein reaches its final destination. As such, conducting experiments with the SP attached to the protein may have produced biased observations and could lead to unjustified conclusions about the protein's function within the plant cell. We suggest the authors remove the experiments that include the signal peptide.

      Thank you for pointing this out. The SP was retained to direct the target proteins to the extracellular space of plant cells. Theoretically, the SP is cleaved in the mature protein. This methodology is widely used in effector biology. For example, the SP directs Meloidogyne graminicola Mg01965 to the apoplast, where it functions in immune suppression, whereas Mg01965 without the SP fails to exert this function (10.1111/mpp.12759). In our study, the SP of BtRDP was expected to guide the target protein to the extracellular space, facilitating its interaction with RLP4. Moreover, the observed protein sizes of BtRDP with and without the SP in transgenic plants were identical, suggesting successful SP cleavage. Therefore, we have retained the experiments involving the SP in the current version.

      (8) Overly strong conclusion and unclear evidence (Line 176):

      The use of the word "must" on line 176 is very strong and presents a definitive conclusion without sufficient evidence. The authors state that the proteins must interact with SOBIR1, but they do not provide a clear justification for this claim. Is SOBIR1 the only interaction partner for NtRLP4? The authors should provide a specific reason for focusing on SOBIR1 instead of demonstrating an interaction with NtRLP4 first. Additionally, do BtRDP or NlSP694 also interact with SOBIR1 directly? The authors should either tone down their language to reflect the evidence or provide a clearer justification for this strong claim.

      Thank you for pointing this out. In the current version, the word “must” has been toned down to “may” due to insufficient supporting evidence. In this study, SOBIR1 was chosen because it has been widely reported to be required for the function of several RLPs involved in innate immunity. However, it remains unclear whether SOBIR1 is the only interaction partner of NtRLP4. In the current version, we have clarified the rationale for focusing on SOBIR1 prior to the experiments “The receptor-like kinase SOBIR1, which contains a kinase domain, has been widely reported to be required for the function of RLPs involved in innate immunity (Gust & Felix, 2014)” and discussed that “Although NtRLP4 interacts with SOBIR1, this alone does not confirm that it operates strictly through this canonical module. Evidence from other RLPs shows that co-receptor usage can be flexible, and some RLPs function partly or conditionally independent of SOBIR1. Therefore, a more definitive assessment of NtRLP4 signaling will therefore require genetic dissection of its co-receptor dependencies, including but not limited to SOBIR1.”. In addition, the direct interaction between BtRDP and SOBIR1 was experimentally tested, and the results showed that BtRDP failed to interact with SOBIR1.

      Minor Comments

      (9) The statement in the abstract, "However, it remains unclear how these invaders are able to overcome receptor perception and disable the plant signaling pathways," is not entirely accurate. The fields of effector biology and host-pathogen interactions have provided significant insight into how pathogens and pests manipulate both Pattern-Triggered Immunity (PTI) and Effector-Triggered Immunity (ETI). While the specific mechanism described in this paper is novel, the broader claim that the field is unclear on these processes weakens the initial hook of the paper. A more precise framing of the problem would be beneficial, perhaps by stating that the specific mechanisms used by these particular herbivores to target RLP4 were previously unknown.

      Thank you for this insightful comment. We agree that the original statement in the abstract overstated the lack of understanding in the field. In the current version, we have refined the sentence to more accurately reflect the current state of knowledge, emphasizing that while microbial suppression of plant immunity has been extensively studied, the strategies used by herbivorous insects to overcome receptor-mediated defenses remain less understood. The revised sentence now reads as follows: “Although the mechanisms used by microbial pathogens to suppress plant immunity are well studied, how herbivorous insects overcome receptor-mediated defenses remains unclear”.

      (10) The introduction is heavily focused on Pattern Recognition Receptors (PRRs), which, while central to the paper's findings, gives a somewhat narrow view of the plant's defense against herbivores. It would be beneficial to briefly acknowledge the broader context of plant defenses, such as physical barriers, direct chemical toxicity, and indirect defenses, before narrowing the focus to the specific molecular interactions of PRRs that are the core of this study. This would provide a more complete picture of the "arms race" between plants and herbivores.

      Thank you for this valuable suggestion. We agree that the original introduction focused too narrowly on pattern-recognition receptors (PRRs). In the current version, we have expanded the introductory section to provide a broader overview of plant defense mechanisms. Specifically, we now acknowledge the multiple layers of plant defenses, including physical barriers (e.g., cuticle and cell wall), chemical defenses (e.g., toxic secondary metabolites and anti-nutritive compounds), and indirect defenses mediated by herbivore-induced volatiles. This addition provides a more complete context for understanding the molecular interactions discussed in this study. The revised paragraph now reads as follows: “Plants have evolved sophisticated defense systems to survive constant attacks from pathogens and herbivorous insects. These defenses operate at multiple levels, including physical barriers such as the cuticle and cell wall, chemical defenses involving toxic secondary metabolites and anti-nutritive compounds, and indirect defenses that attract natural enemies of herbivores through the emission of herbivore-induced volatiles. Beyond these general strategies, plants also rely on highly specialized molecular immune responses that allow them to detect and respond rapidly to invaders.”

      (11) The figure legends are generally clear, but some could be more detailed. For instance, in Figure 2, it would be helpful to explicitly state what each bar represents in the graph and to include the statistical test used. Please ensure all panels in all figures have clear labels.

      Thank you for this helpful suggestion. We have revised the legend of Fig. 2 and other figures to provide more detailed information for each panel. Specifically, we now explicitly describe what each bar represents in the graphs and specify the statistical test used. In addition, we ensured that all panels are clearly labeled. These changes improve clarity and allow readers to better interpret the data.

      (12) The methods section is comprehensive, but it would be helpful to include more specifics on the statistical analyses used. For example, the type of statistical test (e.g., t-test, ANOVA) and the software used should be mentioned for each experiment.

      Thank you for your suggestion. We have revised the Methods section (Statistical analysis) to provide more detailed information on the statistical analysis used for each experiment.

      (13) The manuscript's overall impact is weakened by the inclusion of unnecessary words and a few grammatical issues. A focused revision to tighten the language would make the major findings stand out more clearly. For example, on page 2, line 18, "in whitefly Bemisia tabaci, BtRDP is an Aleyrod..." seems to have an incomplete sentence. A thorough proofreading for typos and grammatical errors is highly recommended to improve the overall readability.

      Thank you for your suggestion. We have carefully revised the abstract and the manuscript to improve clarity, readability, and grammatical correctness. In addition, we sought the assistance of a professional English editor to thoroughly proofread and polish the manuscript, ensuring that the language meets high academic standards.

      (14) The discussion section is strong, but it could benefit from a more explicit connection between the findings and the broader ecological implications. For instance, how might the independent evolution of these effectors in different insect species impact plant-insect co-evolutionary dynamics?

      We thank the reviewer for the valuable suggestion. In the current version, we have added a paragraph in the Discussion section highlighting the broader ecological and evolutionary implications of our findings. Specifically, we discuss how the independent evolution of RLP4-targeting effectors in different insect lineages may drive plant-insect co-evolution, influence selection pressures on both plants and herbivores, and potentially shape defense diversification across plant communities. This addition helps to link our molecular findings to ecological outcomes and co-evolutionary dynamics.

      (15) The sentence on line 98, which reads " A few salivary proteins have been reported to attach to salivary sheath after secretion" seems to serve an unclear purpose in the introduction. It would be helpful for the authors to clarify its relevance to the surrounding context or to the paper's overall argument. Its inclusion currently disrupts the flow of the introduction and makes it difficult for the reader to understand its intended purpose.

      We thank the reviewer for the comment. We have revised the paragraph to clarify the relevance of salivary sheath localization to the study. Specifically, we now introduce the role of the salivary sheath as a potential scaffold for effector delivery and explicitly link previous reports of sheath-associated salivary proteins to our observation that BtRDP localizes to the salivary sheath after secretion.

      (16) The writing in lines 104-106 is both grammatically inconsistent and overly wordy. The authors switch between present and past tense ("is" and "was"), and the sentences could be made more concise to improve the clarity and flow of the text. Also check entire paper.

      We thank the reviewer for pointing this out. We have revised the sentence to improve grammatical consistency and clarity, and also checked the manuscript for similar issues. The sentence is now split into two concise statements. In addition, we have thoroughly checked the entire manuscript for similar tense inconsistencies and overly wordy sentences, and have made revisions throughout to ensure consistent past tense usage and improved readability.

      (16) The sentences on lines 111-113 are quite wordy. The core conclusion, which is that the protein affects the insect's feeding probe, could be expressed more simply and directly to improve clarity and flow. I suggest rephrasing this section to be more concise and to highlight the primary finding without the added language.

      We thank the reviewer for the helpful suggestion. We have revised the sentences to make them more concise and to emphasize the main finding that BtRDP influences the whitefly’s feeding behavior, as follows: “Compared with the dsGFP control, dsBtRDP-treated B. tabaci showed a marked reduction in phloem ingestion and a longer pathway duration, indicating that BtRDP is required for efficient feeding (Fig. 2c).”

      (17) On line 118, the authors mention "subcellular location." It is not clear where the protein is localized. The authors should explicitly state the specific subcellular compartment of the protein, as this is crucial for understanding its function and interaction with other proteins.

      We thank the reviewer for this valuable comment. To clarify the subcellular localization of BtRDP, we have revised the manuscript accordingly. The transgenic line overexpressing the full-length BtRDP including the signal peptide (oeBtRDP) is expected to localize in the apoplast (extracellular space), whereas the line expressing BtRDP without the signal peptide (oeBtRDP<sup>-sp</sup>) is likely retained in the cytoplasm.

      (18) Lines 121-128, the description of the fecundity and choice assays in this section is overly wordy. The authors should present the main conclusion of these experiments more directly and concisely. The key finding is that the protein affects feeding behavior; this central point is somewhat lost in the detailed, and sometimes repetitive, phrasing.

      We thank the reviewer for this suggestion. In the revised manuscript, we have simplified the description of the fecundity and two-choice assays to highlight the main conclusion, as follows: “Fecundity and two-choice assays showed that BtRDP, whether localized in the apoplast (oeBtRDP) or cytoplasm (oeBtRDP<sup>-sp</sup>), enhanced whitefly settling and oviposition compared with EV controls (Fig. 2d-i; Fig. S10), indicating that BtRDP promotes whitefly feeding behavior regardless of its subcellular location.”

      (19) Line 148, the manuscript mentions experiments involving transformation, but the transformation efficiency is not provided. Please include the transformation efficiency for all transformation experiments, as this is crucial for the reproducibility of the results.

      We thank the reviewer for raising this point. We would like to clarify that no transformation experiments were performed in this section. The experiments described involved Y2H screening using BtRDP<sup>-sp</sup> as a bait to identify interacting proteins from a N. benthamiana cDNA library. Therefore, there is no transformation efficiency to report.

      (20) Line 159, the manuscript refers to a sequence similarity around line 159 but does not provide the specific data. It is important to show the actual sequence similarity, perhaps in a supplementary figure or table, to support the claims being made.

      We thank the reviewer for this suggestion. To support our statement regarding sequence similarity, we have added the corresponding alignment figure as Fig. S11.

      (21) Line 159, the manuscript refers to "three randomly selected salivary proteins." It is unclear from where these proteins were selected. The authors should clarify the source of this selection (e.g., a specific database or a previous study) to ensure the methodology is transparent and the results are reproducible.

      We thank the reviewer for raising this point. These proteins were selected based on previous reports (10.1093/molbev/msad221; 10.1111/1744-7917.12856). In the current version, we provide the accession numbers of these proteins in the MS.

      (22) Line 160, the description "NtcCf9 without signal peptide and transmembrane domains" is difficult to understand. It would be clearer and more consistent to use a term like "truncated NtcCf9" and then specify which domains were removed, as this is a standard practice in molecular biology for describing protein constructs.

      We thank the reviewer for this suggestion. We have revised the manuscript to describe the construct as “truncated NtCf9” and specified that the signal peptide and transmembrane domains were removed.

      (23) The phrase "incubated with anti-flag beads" on line 172 is a detail of a routine method. Such details are more appropriate for the Methods section rather than the main text, which should focus on the results and their implications. Please remove such descriptions from the main text to improve readability and flow.

      We thank the reviewer for this suggestion. We have removed the methodological detail from the main text to improve readability. We have also checked for similar details throughout the MS.

      I am excited about the potential of this work and look forward to seeing the current version.

      We sincerely thank the reviewer for the positive feedback and encouragement. We appreciate your time and thoughtful comments.

      Reviewer #2 (Public review):

      Summary:

      The authors tested an interesting hypothesis that white flies and planthoppers independently evolved salivary proteins to dampen plant immunity by targeting a receptor-like protein.

      Strengths:

      The authors used a wide range of methods to dissect the function of the white fly protein BtRDP and identify its host target NtRLP4.

      Thank you very much for your comments. We have carefully revised the MS following your valuable suggestions and comments.

      Weaknesses:

      (1) Serious concerns about protein work.

      I did not find the indicated protein bands for anti-BtRDP in Figures 1a and 1b in the original blot pictures shown in Figure S30. In Figure 1a, I can't get the point of showing an unspecific protein band with a size of ~190 kD as a loading control for a protein of ~ 30 kD.

      The data discrepancy led me to check other Western blot pictures. Similarly, Figures 2d, 3b, 3d, and S15b (anti-Myc) do not correspond to the original blots shown. In addition, the anti-Myc blot in Figure 4i, all blot pictures in Figures 5b, 5h, and S19a appeared to be compressed vertically. These data raised concerns about the quality of the manuscript.

      Blots shown in Figure 3d, 4f, 4g, and 4h appeared to be done at a different exposure rate compared to the complete blot shown in Figure S30. The undesirable connection between Western blot pictures shown in the figures and the original data might be due to the reduced quality of compressed figures during submission. Nevertheless, clarification will be necessary to support the strength of the data provided.

      We sincerely thank the reviewer for carefully examining our Western blot data and for pointing out these inconsistencies. The discrepancy between the figures in the main text and the original blots (Figure S30) resulted from an oversight during manuscript revision. This manuscript had undergone multiple rounds of revision after submission to another journal. During this process, the main figures and supplementary figures were updated separately, and we mistakenly failed to replace the original blot files with the corresponding current versions.

      Regarding the different exposure, the blots shown in the main text were adjusted for overall contrast and brightness to enhance band visibility and presentation clarity, whereas the original images in Figure S30 were raw, unprocessed scans directly from the imaging system. For example, in Author response image 1 below, the output figure was adjusted for overall contrast and brightness to visualize the loading of the input sample. Such whole-image adjustment is an accepted form of image processing (https://www.nature.com/nature-portfolio/editorial-policies/image-integrity).

      Author response image 1.

      The same figure with brightness and contrast changes across the entire image.

      For the vertical compression, in the previous version, some images were vertically compressed for layout purposes to make the composite figures appear more visually balanced. However, after consulting relevant publication guidelines, we realized that such one-dimensional compression is not encouraged by certain journals as it may alter the original aspect ratio of the image. Therefore, in the manuscript, we have avoided any non-proportional scaling and retained the original aspect ratio of all images.

      We have now carefully rechecked all Western blot data, replaced the outdated raw blot images with the correct corresponding ones, avoided vertical compression, and ensured that the processed figures in the main text match their original data. The revised supplementary figures now accurately reflect the raw experimental results.

      (2) Misinterpretation of data.

      I am afraid the authors misunderstood pattern-triggered immunity through receptor-like proteins. It is true that several LRR-type RLPs constitutively associate with SOBIR1, and further recruit BAK1 or other SERKs upon ligand binding. One should not take it for granted that every RLP works this way. To test the hypothesis that NtRLP4 confers resistance to B.tabaci infestation, the author compared transcriptional profiles between an EV plant line and an RLP4 overexpression line. If I understood the methods and figure legends correctly, this was done without B. tabaci treatment. This experimental design is seriously flawed. To provide convincing genetic evidence, independent mutant lines (optionally independent overexpression lines) in combination with different treatments will be necessary. Otherwise, one can only conclude that overexpressing the RLP4 protein generated a nervous plant. In addition, ROS burst, but not H2O2 accumulation, is a common immune response in pattern-triggered immunity.

      We agree with the reviewer that not every RLP functions through the same mechanism as the canonical SOBIR1–BAK1 pathway. In the current version, we further examined the interaction between the whitefly salivary protein and SOBIR1, and found that they do not interact. However, our interaction assays clearly demonstrated that NtRLP4 does interact with SOBIR1. Whether NtRLP4 functions through, or exclusively through, SOBIR1 remains uncertain, and we have emphasized this limitation in the Discussion section as follows: “Although NtRLP4 interacts with SOBIR1, this alone does not confirm that it operates strictly through this canonical module. Evidence from other RLPs shows that co-receptor usage can be flexible, and some RLPs function partly or conditionally independent of SOBIR1 [39]. Therefore, a more definitive assessment of NtRLP4 signaling will therefore require genetic dissection of its co-receptor dependencies, including but not limited to SOBIR1.”

      Regarding the transcriptome analysis, our original aim was to explore why B. tabaci showed such a pronounced preference among tobacco plants. As this preference was assessed using uninfested plants, we also performed transcriptome sequencing using plants without B. tabaci treatment. The enrichment analysis demonstrated that the majority of up-regulated DEGs were associated with plant–pathogen interaction, environmental adaptation, MAPK signaling, and signal transduction pathways, while down-regulated DEGs were enriched in glutathione, carbohydrate, and amino acid metabolism. Notably, many DEGs were annotated as RLK/RLPs or WRKY transcription factors, most of which were upregulated, suggesting an enhanced defense state in the NtRLP4-overexpressing plants. The altered expression of JA- and SA-related genes (e.g., upregulation of FAD7 and downregulation of PAL and NPR1) further supported this enhanced defense and hormonal crosstalk. We agree that combining overexpression or knockout lines with insect infestation treatments would provide more direct genetic evidence for NtRLP4-mediated resistance, and we have acknowledged this as an important future direction. Nevertheless, our current data are consistent with the conclusion that NtRLP4 overexpression confers increased resistance to B. tabaci infestation.

      Finally, DAB staining for H<sub>2</sub>O<sub>2</sub> accumulation is also a well-established indicator of PTI responses, and many studies have shown that overexpression of salivary elicitors can trigger such accumulation.

      (3) Lack of logic coherence.

      The written language needs substantial improvement. This impeded the readability of the work. More importantly, the logic throughout the manuscript appeared scattered. The choice of testing protein domains for protein-protein interactions, using plants overexpressing an insect protein to study its subcellular localization, switching back and forth between using proteins with signal peptides and without signal peptides, among others, lacks a clear explanation.

      We appreciate the reviewer’s careful reading and valuable comments regarding the logical coherence of our manuscript.

      (1) To improve the English quality, the entire manuscript has been professionally edited by a certified language-editing service.

      (2) Regarding the rationale for testing protein domains in the protein–protein interaction assays: NtRLP4 is a membrane-anchored receptor-like protein composed of extracellular, transmembrane, and short intracellular domains. We aimed to determine which region of NtRLP4 is responsible for interacting with the salivary protein, as this would help infer the likely site of interaction in planta. In addition, not all RLPs contain a malectin-like domain, and we sought to verify whether the BtRDP–NtRLP4 interaction depends on this domain. To enhance the logical flow, we introduced a brief statement explaining the experimental purpose before presenting the interaction assays in the current version as follows: “These findings raised the question of which domain of NtRLP4 is responsible for binding BtRDP, as identifying the interacting domain could help infer where the salivary protein contacts the receptor in planta. We therefore dissected the NtRLP4 domains accordingly.”

      (3) With respect to using plants overexpressing an insect protein to examine subcellular localization: since both the brown planthopper and the whitefly are non-model species for which stable genetic transformation is technically unfeasible, many previous studies have used Agrobacterium-mediated transient expression or transgenic plant systems to investigate the subcellular localization of insect salivary proteins within host cells. Following these precedents, our study also employed plant systems to determine the localization of the insect protein and to assess how different localizations affect plant defense responses.

      (4) As for switching between constructs with or without signal peptides: the subcellular localization of effectors can influence their biological activity and interactions. Previous studies have used the presence or absence of signal peptides, or replacement with a PR1 signal peptide, to direct protein targeting (for example, Frontiers in Plant Science, 2022, 13:813181). Because salivary sheaths are generally considered to localize in the apoplastic space, we generated two transgenic N. tabacum lines overexpressing BtRDP: one carrying the full-length coding sequence including the signal peptide (oeBtRDP), expected to be secreted into the apoplast, and another lacking the signal peptide (oeBtRDP-sp), likely retained in the cytoplasm. In the current version, we clarified this rationale and added references to similar studies to improve the manuscript’s logic and readability. Details are as follows: “To investigate the role of BtRDP in different subcellular location of host plants, we constructed two transgenic N. tabacum lines overexpressing BtRDP: one carrying the full-length coding sequence including the signal peptide (oeBtRDP), which is expected to be secreted into the apoplast (extracellular space), and the other lacking the signal peptide (oeBtRDP<sup>-sp</sup>), which is likely retained in the cytoplasm.”

      Reviewer #3 (Public review):

      Summary:

      In this study, Wang et al. investigate how herbivorous insects overcome plant receptor-mediated immunity by targeting plant receptor-like proteins. The authors identify two independently evolved salivary effectors, BtRDP in whiteflies and NlSP694 in brown planthoppers, that promote the degradation of plant RLP4 through the ubiquitin-dependent proteasome pathway. NtRLP4 from tobacco and OsRLP4 from rice are shown to confer resistance against herbivores by activating defense signaling, while BtRDP and NlSP694 suppress these defenses by destabilizing RLP4 proteins.

      Strengths:

      This work highlights a convergent evolutionary strategy in distinct insect lineages and advances our understanding of insect-plant coevolution at the molecular level.

      Thank you very much for your comments. We have carefully revised the MS following your valuable suggestions and comments.

      Weaknesses:

      (1) I found the naming of BtRDP and NlSP694 somewhat confusing. The authors defined BtRDP as "B. tabaci RLP-degrading protein," whereas NlSP694 appears to have been named after the last three digits of its GenBank accession number (MF278694, presumably). Is there a standard convention for naming newly identified proteins, for example, based on functional motifs or sequence characteristics? As it stands, the inconsistency makes it difficult for readers to clearly distinguish these proteins from those reported in other studies.

      Thank you for your comment. These are species-specific salivary proteins that have not been reported or annotated in previous studies. Because no homologous genes could be identified in other species, there are no existing names or annotations for these proteins. For such lineage-specific salivary proteins, it is common in recent studies to name them according to their experimentally identified functions. For example, a recently reported salivary protein was named SR45-interacting salivary protein (SISP) based on its function (10.1111/nph.70668). Following this convention, we adopted a similar functional naming strategy in this study. We acknowledge that there may not yet be a standardized rule for naming such proteins, and we would be glad to follow a more authoritative naming guideline if possible.

      (2) Figure 2 and other figures. Transgenic experiments require at least two independent lines, because results from a single line may be confounded by position effects or unintended genomic alterations, and multiple lines provide stronger evidence for reproducibility and reliability.

      We appreciate the reviewer’s suggestion. In our study, two independent transgenic lines were used to ensure the reproducibility and reliability of the results. One representative line was presented in the main figures, while data from the second independent line were included in the supplementary figures. To make this clearer, we have emphasized in the manuscript that bioassays were conducted using two independent transgenic lines.

      (3) Figure 3e. Quantitative analysis of NtRLP4 was required. Additionally, since only one band was observed in oeRLP, were any tags included in the construct?

      Thank you for your comment. In the current version, quantitative analysis of NtRLP4 expression has been performed and is now presented in Figure 3. For the oeRLP plants, no tag was fused to NtRLP4; thus, anti-RLP serum was used to detect the target bands. In contrast, oeBtRDP and oeBtRDP-sp were fused with C-terminal FLAG tags, and their detection was carried out using anti-FLAG serum. This information has been clarified in the revised Methods section as follows: “The oeBtRDP and oeBtRDP<sup>-sp</sup> were fused with C-terminal FLAG tags, while no tag was fused to oeNtRLP4.”

      (4) Figure 4a. The RNAi effect appears to be well rescued in Line 1 but poorly in Line 2. Could the authors clarify the reason for this difference?

      Thank you for pointing this out. We also noticed that the RNAi effect appeared to be better rescued in Line 2 than in Line 1. Based on our measurements, the silencing efficiency of NtRLP4 in RNAi-RLP4 Line 1 was markedly weaker than in Line 2, which likely explains the difference in rescue efficiency. In the current version, we have clarified this point as follows: “Both RNAi-RLP lines showed reduced NtRLP4 levels compared with EV plants, with RNAi-RLP#2 exhibiting a stronger silencing effect (Fig. S19a).” “The differential rescue effect between the two RNAi lines likely resulted from their different NtRLP4 silencing efficiencies, with the lower NtRLP4 level in RNAi-RLP#2 leading to a more complete rescue phenotype.”

      (5) ROS accumulation is shown for only a single leaf. A quantitative analysis of ROS accumulation across multiple samples would be necessary to support the conclusion. The same applies to Figure S16f.

      Thank you for pointing this out. The H<sub>2</sub>O<sub>2</sub> accumulation experiments in Figure 4 and Figure S16f were repeated five times. In the current version, we state in the figure legends that “the experiment is repeated five times with similar results”.

      (6) Figure 4f: NtRLP4 abundance was significantly reduced in oeBtRDP plants but not in oeBtRDP-SP. Although coexpression analysis suggests that BtRDP promotes NtRLP4 degradation in an ubiquitin-dependent manner, the reduced NtRLP4 levels may not result from a direct interaction between BtRDP and NtRLP4. It is possible that BtRDP influences other factors that indirectly affect NtRLP4 abundance. The authors should discuss this possibility.

      Thank you for your valuable suggestion. We agree that the reduced NtRLP4 abundance may not necessarily result from a direct interaction between BtRDP and NtRLP4. In the manuscript, we have further discussed this possibility as follows: “Notably, BtRDP and NlSP694 shared no sequence or structural similarity and lack resemblance to known eukaryotic ubiquitin-ligase domains. Their interaction with RLP4s occurs in the extracellular space (Fig. 3d; Fig. 5c), whereas the ubiquitin-proteasome system primarily functions in the cytosol and nucleus [46]. Furthermore, NtRLP4 reduction is observed only in oeBtRDP transgenic plants, not in oeBtRDP-sp plants (Fig. 4f), suggesting that BtRDP exerts its influence on NtRLP4 in the extracellular space. These observations collectively argue against the possibility that BtRDP or NlSP694 possesses intrinsic E3 ligase activity capable of directly ubiquitinating RLP4s within plant cells. Importantly, the reduced NtRLP4 levels may not result from a direct physical interaction between BtRDP and NtRLP4. Instead, BtRDP may indirectly affect RLP4 post-translational modification, thereby accelerating its degradation, which warrants further investigation.”

      (7) The statement in lines 335-336 that 'Overexpression of NtRLP4 or NtSOBIR1 enhances insect feeding, while silencing of either gene exerts the opposite effect' is not supported by the results shown in Figures S16-S19. The authors should revise this description to accurately reflect the data.

      Thank you for pointing this out. We agree that our original statement was not precise, as we measured the insect settling preference and oviposition on transgenic plants, but did not directly assess the feeding behavior of B. tabaci. Therefore, we have revised the description in the manuscript to more accurately reflect our data as follows: “Overexpression of NtRLP4 or NtSOBIR1 in N. tabacum is attractive to B. tabaci and promotes insect reproduction, whereas silencing of either gene exerts the opposite effect.”

      (8) BtRDP is reported to attach to the salivary sheath. Does the planthopper NlSP694 exhibit a similar secretion localization (e.g., attachment to the salivary sheath)? The authors should supplement this information or discuss the potential implications of any differences in secretion localization between BtRDP and NlSP694 for their respective modes of action.

      Thank you for your insightful suggestion. We agree that determining the secretion localization of NlSP694 would provide valuable information for understanding its potential mode of action. Immunohistochemical (IHC) staining is indeed a critical approach for such analysis. However, in this study, we were unable to express NlSP694 in Escherichia coli, and the antibody generated using a synthesized peptide did not show sufficient specificity or sensitivity for IHC detection. Consequently, we were unable to determine whether NlSP694 is attached to the salivary sheath. Therefore, whether BtRDP and NlSP694 act through different modes requires further investigation.

      Recommendations for the authors:

      Reviewer #3 (Recommendations for the authors):

      (1) Figure 1e. The BtRDP-labeled fluorescent signal is difficult to discern. An enlarged view of the target region would be helpful for clarity.

      Thank you for your suggestion. In the current version, an enlarged view of the target region was provided below the figure.

      (2) The finding that BtRDP accumulates in the salivary sheath secreted by Bemisia tabaci is important for understanding the subcellular localization of this protein during actual insect feeding. I suggest moving Figure S5 to the main text.

      Thank you for your suggestion. Figure S5 has been moved to Fig. 1f in the current version.

      (3) Please carefully cross-check the figure numbering to ensure that all in-text citations correspond to the correct figures and panels, e.g., lines 136, 188, 192, and 194.

      Thank you for pointing this out. We corrected them in the current version.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #3

      Evidence, reproducibility and clarity

      This study builds upon previous work in schizophrenia and other disorders using fibroblasts derived from patients, assessing mitochondrial phenotypes and then using these to identify compounds which reverse these phenotypes. The study is one of the largest of its kind performed to date, with 168 patients included. The authors undertake mitochondrial phenotyping and machine learning of the output images to segregate the patients based on clinical features and the associated cellular phenotype. The authors then go on to virtually screen publicly available datasets of cancer cells treated with compounds and also genetic modulations. In doing so, they can identify compounds which modulate the phenotypes and therefore might be of value to test in the patient-derived lines. The study has strengths in the large number of samples, the advanced machine learning and the virtual screening. Furthermore, the authors highlight and discuss the limitations of the study well. There are some weaknesses which the authors can address. Firstly, although the introduction is comprehensive in some areas, in other areas, for example outlining the fibroblast mitochondrial phenotype and indeed the use of patient fibroblasts to identify compounds, there is significant literature missing, particularly in Parkinson's Disease, where screening in fibroblasts has resulted in compounds entering Phase 3 clinical trials. In addition, the studies using 100 or more PD patient fibroblast lines for phenotyping and patient stratification have not been included. It would be useful if the authors could comment on the robustness of the phenotypes identified in the fibroblasts over multiple passages. This is important when considering the biological and disease relevance of the phenotypes and it is not something the authors show or comment on.
      In discussing the genetic manipulations, it would be useful to comment on the genes identified in more detail, particularly those which are not known to be associated with changes in mitochondrial phenotypes.

      Significance

      This study builds on work from multiple labs investigating the utility of fibroblasts to identify phenotypes and find potential novel therapeutics. The size of the cohort and the advanced machine learning methods are a particular strength and this advances the field in this area. The availability of the data and code is a strength to allow others to replicate the findings. The lack of experimental validation of any of the compounds or genes identified by the virtual screening is a weakness which could be addressed.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      In their study, Haghighi et al. seek to build upon prior literature linking alterations in mitochondrial network distribution with various kinds of psychosis. Correlations between subcellular mitochondrial localization and different psychological states are an interesting and potentially fruitful frontier and should be explored; however, despite their ambitious strategy to screen 168 skin fibroblasts from patients experiencing psychosis, and examine various online image databases, there is a concerning number of issues related to the image-analysis approach. The foremost of these is a lack of direct measures of mitochondrial distribution, which might serve to validate their proposed MITO-SLOPE protocol. There is also a worrisome lack of robust controls, which are critical in light of how admittedly subtle some of the distribution phenotypes may be. Overall, the aim to screen differences in mitochondrial distribution is a laudable goal and, in the context of psychological disorders, could be helpful in identifying new therapeutic targets; but the methodology employed in this study does not seem to be sufficiently rigorous to be able to leverage this approach for screening purposes.

      I have extensive experience investigating mitochondria with advanced imaging technologies, including super-resolution microscopy as well as high-throughput and 4D imaging modalities. I am also familiar with standard as well as machine-learning approaches for quantifying mitochondrial morphology as well as distribution or trafficking. In my opinion, this study requires substantial revision, both in terms of the indirect and often opaque image-analysis pipeline as well as the inclusion of orthogonal experiments, which could serve to lessen concerns regarding purported differences in mitochondrial distribution, which are so difficult to discern as to be imperceptible. It is worth noting, too, that this study appears to be predicated, in many ways, upon a 2010 study (Cataldo et al.) of mitochondria in patients with bipolar disorder, which appears to reflect its own lack of critical controls for cell size.

      Major comments:

      The authors state, in the first paragraph of the results section: "By eye, we observed that samples from patients in the control and MDD categories show a more fine-grained, dispersed mitochondrial network extending to the edges of the cell, whereas patients in the categories experiencing psychosis tend to show an agglomerated, thicker network more concentrated around the nucleus. The pattern is subtle and heterogeneous across a cell population." The pattern is indeed subtle. I am concerned that it is so subtle as to be imperceptible. Firstly, it is important to note that the mitochondrial reticulum in BP, SZ, and SZA is more difficult to differentiate, by eye, because the signal appears to be saturated in places, such that the boundaries of individual mitochondria are indistinguishable due to differences in contrast or possibly from the fluorescence intensity itself. Although the authors indicate in the legend that the intensity of the mitochondrial fluorescence was adjusted "for visual clarity," it appears that the contrast needs to be decreased in the BP, SZ, and SZA conditions. It is also important to note that MitoTrackers load into mitochondria in a membrane-potential-dependent fashion. Did the authors detect differences in membrane potential between these groups? While imaging, was the same laser power and gain utilized from condition to condition? With this being said, it is not clear that mitochondria in control and MDD categories have different morphologies from the other conditions. It is also not clear what "fine-grained" means in this context. Is this a comment on aspect ratio? If so, it would be better to use standard terminology. (Why are there large red circular structures in the nucleus? These are likely not mitochondria, so why are they showing up in the channel with MitoTracker?) It is also not evident that one condition has more dispersed mitochondria than another. 
Given that the authors appear to be making this a central claim of their manuscript, it would seem appropriate to highlight specifically the regions of the different cells that they believe exhibit meaningful differences. If I attempt to look at the merged image, which is important because it is really the only way that one can gauge the relative distance of the mitochondrial network from the edge of the cell, there would seem to be no obvious differences between the conditions. Another key point that I think important to mention, given that it is frequently referenced in this manuscript, Cataldo et al., 2010 indicate that mitochondria in patient fibroblasts with bipolar disorder (BD) are more perinuclear than those in control. However, a cursory inspection of the images from this study (e.g., Figure 2A-B; Figure 4A-D; and Figure 6A-H) unambiguously demonstrate that the BD cells are smaller than the control cells. Of course, if the cells are smaller, the distance from the nucleus will tend to be shorter. In Cataldo et al., 2010, the authors state, "We also measured cell area, cell length, cell width, and cell perimeter of the fibroblasts used in this analysis to verify that the observed mitochondrial distributional differences were not simply a result of BD cells being smaller, shorter, or fatter. No significant differences in any of these measurements were seen based on diagnosis after two sample t tests." Notably, the data is not shown, so it is difficult to appreciate what the variance of the population of cells from control and BD would look like, but it must be said, nevertheless, that the representative images in this paper all point to the BD cells being smaller. In light of this, it would be helpful if Haghighi et al. could add scale bars to all the images (e.g., in Figure 2), so readers can ascertain whether all the cells are portrayed at the same scale and are of similar areas.

      As the authors indicate, interpretable measures of mitochondrial morphology include values like size and shape. It is concerning, therefore, that Figure 3 purports to identify a number of significantly different mitochondrial "features" in the patient groups experiencing psychosis, but they do not appear to make an effort to clarify how any of these features might reflect ground truths of mitochondrial architecture, which can be understood directly by values such as aspect ratio, circularity, area, number of organelles, number of nodes or branching points in a network, etc. Unless the authors can specifically tie their machine-learning classifications to standard mitochondrial shape descriptors, their classifications will remain opaque and therefore of limited credibility or value. One way to improve the validation of their machine-learning classification methods would be to use empirically sound methods for manipulating mitochondrial morphology and distribution, which could serve as positive or negative controls. For example, treatment of cells with the uncoupler FCCP would induce mitochondrial fragmentation, treatment with cycloheximide results in stress-induced mitochondrial hyperfusion (SIMH), or treatment with Nocodazole would block mitochondrial trafficking. Treating control cells with these chemicals would help to establish baseline measurements for how far the patient cells are deviating from untreated controls, in one direction or another. Such considerations, I think, are especially important when the mitochondrial phenotypes are so subtle. I agree with the authors' argument that, for the purposes of screening, it is best to focus on a single metric. Based on their apparent discernment of the subtle differences in mitochondrial distribution in patients experiencing psychosis, they opted to examine possible differences in network density. To this end, they developed "MITO-SLOPE."
      Out of multiple categories of features, they highlight the following as the most powerful for establishing differences in mitochondrial network density:

      "(a) A subset of texture measures in the nuclei and cytoplasm area of the mito channel. (b) A subset of features measuring the intensity of the mitochondria area across the cell."

      Within the concentric bins around the cell nuclei, they measure:

      • FracAtD: Fraction of total stain in an object at a given radius.
      • MeanFrac: Mean fractional intensity at a given radius, calculated as the fraction of total intensity normalized by the fraction of pixels at a given radius.
      • RadialCV: Coefficient of variation of intensity within a ring, calculated across 8 slices.
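
To make the three radial measures listed above concrete, the following is a minimal Python sketch. It is not the authors' pipeline and not the implementation of any particular image-analysis package. It assumes a single-cell intensity image, a boolean cell mask, and a nuclear center given in pixel coordinates. The function name `radial_features`, the equal-width circular rings scaled to the farthest in-mask pixel, and the equal-angle wedges are illustrative simplifications.

```python
import numpy as np

def radial_features(intensity, cell_mask, center, n_bins=12, n_slices=8):
    """Illustrative per-ring FracAtD, MeanFrac and RadialCV (hypothetical helper).

    intensity : 2D float array (e.g. the mitochondrial channel)
    cell_mask : 2D boolean array marking one cell
    center    : (row, col) of the nuclear center, assumed known
    """
    yy, xx = np.indices(intensity.shape)
    dy, dx = yy - center[0], xx - center[1]
    r = np.hypot(dy, dx)
    theta = np.arctan2(dy, dx)

    # Assumption: rings are equal-width circles out to the farthest cell pixel.
    r_max = r[cell_mask].max()
    ring = np.minimum((r / r_max * n_bins).astype(int), n_bins - 1)
    # Assumption: each ring is cut into n_slices equal-angle wedges.
    wedge = np.minimum(((theta + np.pi) / (2 * np.pi) * n_slices).astype(int),
                       n_slices - 1)

    total_intensity = intensity[cell_mask].sum()
    total_pixels = cell_mask.sum()

    frac_at_d = np.zeros(n_bins)
    mean_frac = np.zeros(n_bins)
    radial_cv = np.zeros(n_bins)

    for b in range(n_bins):
        in_ring = cell_mask & (ring == b)
        n_px = in_ring.sum()
        if n_px == 0:
            continue  # empty ring: leave zeros
        # FracAtD: fraction of the total stain that falls in this ring.
        frac_at_d[b] = intensity[in_ring].sum() / total_intensity
        # MeanFrac: that fraction normalized by the ring's share of pixels.
        mean_frac[b] = frac_at_d[b] / (n_px / total_pixels)
        # RadialCV: coefficient of variation of the wedge means within the ring.
        wedge_means = [intensity[in_ring & (wedge == w)].mean()
                       for w in range(n_slices)
                       if (in_ring & (wedge == w)).any()]
        mean_of_wedges = np.mean(wedge_means)
        if mean_of_wedges > 0:
            radial_cv[b] = np.std(wedge_means) / mean_of_wedges

    return frac_at_d, mean_frac, radial_cv
```

In this sketch a MeanFrac above 1 in the inner rings and below 1 in the outer rings would indicate stain concentrated near the nucleus. All three quantities are built from intensity, so any change in dye loading alters them even if the organelles do not move.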

      While the authors have recommended the use of a single metric for purposes of screening, MITO-SLOPE appears to represent a bundle of metrics, which, in the end, do not amount to a clear readout of what is being measured. From my point of view, if one were interested in measuring mitochondrial distribution, then, in an ideal situation, one would measure the average distance of all the mitochondria from the center of the nucleus. And, since the size of the cell is critical for establishing relative distances to the boundaries or periphery of the cell, one would normalize this metric by cellular area. Thus, the readout would be: [average mitochondrial distance from the nuclear center (µm)]/[cellular area (µm²)]. An even simpler metric could be: [average mitochondrial distance from nuclear center (µm)]/[average cytoplasmic radius (µm)]. When talking about mitochondrial distribution, we typically think in terms of where is the mitochondrial network, on average, in relation to the nucleus (perinuclear) or to the edge of the cell (peripheral). By quantifying the actual mean distance of the mitochondrial network in relation to both the nucleus and the bona fide cell extremities, via the metrics I described above, one can obtain direct measurements of the truly meaningful values related to mitochondrial distribution. It seems deviating from these approaches introduces more and more opportunities for confounding variables.
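
As an illustration of the simpler, dimensionless readout described above (mean mitochondrial distance divided by a cell-size radius), here is a minimal Python sketch under stated assumptions. It assumes binary masks for the mitochondria and the cell and a known nuclear center. The "average cytoplasmic radius" is approximated, as an assumption of this sketch, by the radius of a circle with the same area as the cell mask. All function names are hypothetical, and `synthetic_cell` only builds toy data.

```python
import numpy as np

def mean_mito_distance(mito_mask, nucleus_center, pixel_size_um=1.0):
    """Mean distance (µm) of mitochondrial pixels from the nuclear center."""
    yy, xx = np.nonzero(mito_mask)
    d = np.hypot(yy - nucleus_center[0], xx - nucleus_center[1])
    return d.mean() * pixel_size_um

def equivalent_cell_radius(cell_mask, pixel_size_um=1.0):
    """Radius (µm) of a circle with the same area as the cell mask.

    Assumption: stands in for the 'average cytoplasmic radius'.
    """
    area_um2 = cell_mask.sum() * pixel_size_um ** 2
    return np.sqrt(area_um2 / np.pi)

def normalized_mito_distance(mito_mask, cell_mask, nucleus_center,
                             pixel_size_um=1.0):
    """Dimensionless ratio: mean mitochondrial distance / cell radius."""
    inside = mito_mask & cell_mask  # ignore signal outside the cell
    return (mean_mito_distance(inside, nucleus_center, pixel_size_um)
            / equivalent_cell_radius(cell_mask, pixel_size_um))

def synthetic_cell(radius_px):
    """Toy circular cell with mitochondria in a ring at 40-50% of the radius."""
    size = 2 * radius_px + 1
    yy, xx = np.indices((size, size))
    center = (radius_px, radius_px)
    r = np.hypot(yy - center[0], xx - center[1])
    cell_mask = r <= radius_px
    mito_mask = (r >= 0.4 * radius_px) & (r <= 0.5 * radius_px)
    return mito_mask, cell_mask, center

# Same relative layout at two cell sizes: the raw distance scales with the
# cell, while the normalized ratio should stay approximately constant.
small = synthetic_cell(40)
large = synthetic_cell(80)
raw_small = mean_mito_distance(small[0], small[2])
raw_large = mean_mito_distance(large[0], large[2])
ratio_small = normalized_mito_distance(*small)
ratio_large = normalized_mito_distance(*large)
```

The toy comparison reflects the cell-size concern: two cells with an identical relative layout differ in raw mean distance purely because one is larger, whereas the normalized ratio does not depend on overall scale. Dividing a distance by an area, as in the first readout, would leave a residual dependence on cell size, which is one reason this sketch uses the radius-based form.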

      However, the MITO-SLOPE analysis does not seem to consider this metric. Is this, or a similar variation, not the most direct way to establish differences in the mitochondrial network distribution? I would, of course, at least want to see a discussion of why the authors have not chosen to use the most direct form of quantification for this purely spatial value. Why opt for a multifaceted measurement of a relatively straightforward quantity, when a simpler form of quantification would not only suffice but arguably be more likely to capture the ground truth? With this being said, it is not clear to me why, within MITO-SLOPE there seems to be a reliance on measuring the "intensity" of the mitochondria. (And what intensity is it? Mean intensity per ROI?) Of course, particularly if MitoTrackers were used for staining mitochondria, there will be heterogeneity in fluorescence intensity from organelle to organelle, which introduces potential confounders into the workflow. Furthermore, as indicated above, to know if the subcellular distribution of mitochondria is truly altered, it is essential to know if the cell size has likewise changed. Therefore, any unbiased measure of mitochondrial distribution must take into consideration the size of the cell; however, based on the information provided about MITO-SLOPE, it does not appear that the authors are accounting for possible variations in cell size that might account for alterations in mitochondrial network distribution - i.e., a smaller cell will have a more constrained area in which mitochondria will be able to disperse - thus, not accounting for cell size (area) will yield ambiguous results. For example, how can we know if mitochondrial motility is impaired or if the cell is simply smaller and there is less space in which to move? Another complexity, here, is if the cell boundaries were not accounted for via staining of actin, etc., then establishing a true cell boundary will be very challenging. 
      How many bins are sufficient to capture the whole cell? Just 12? Furthermore, human fibroblasts have a tendency to be quite large (sometimes several hundred microns from end to end); how can the authors account for the whole cell, particularly in cases where part of the cell is beyond the field of view or cells are growing on top of each other, as is often the case?
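      The intensity concern can be made concrete with a generic example. The sketch below fits a least-squares slope to mean intensities in radial bins ordered from nucleus to cell edge. This is an assumed, simplified stand-in for an intensity-gradient metric, not the authors' MITO-SLOPE implementation, and the function name and the `normalize` option are hypothetical.

```python
def radial_slope(bin_means, normalize=False):
    """Least-squares slope of mean intensity versus radial bin index.

    bin_means : mean intensity per radial bin, ordered nucleus -> cell edge
    normalize : if True, divide each bin by the summed intensity first, so a
                uniform change in brightness does not change the slope
    """
    values = list(bin_means)
    n = len(values)
    if n < 2:
        raise ValueError("at least two bins are required")
    if normalize:
        total = sum(values)
        if total == 0:
            raise ValueError("total intensity is zero; cannot normalize")
        values = [v / total for v in values]

    # Ordinary least squares with bin index 0..n-1 as the predictor.
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    return sxy / sxx


# Same spatial profile, but the second cell is uniformly half as bright
# (e.g. a dimmer vital-dye signal). Values are made up for illustration.
bright = [1.0, 2.0, 3.0, 4.0]
dim = [0.5, 1.0, 1.5, 2.0]
raw_slopes = (radial_slope(bright), radial_slope(dim))
norm_slopes = (radial_slope(bright, normalize=True), radial_slope(dim, normalize=True))
```

      In this toy case the raw slopes differ even though the spatial distribution is identical, while the normalized slopes agree. Whether the published pipeline normalizes intensity in some comparable way is exactly what the authors would need to state.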

      In Figure 6, there is no control image that could be used as a frame of reference. I have extensive experience imaging A549 cells. The mitochondria in these images appear to be highly fragmented. The staining patterns, particularly of the cells treated with divalproex-sodium, are quite dim, indicating mitochondrial depolarization. Of course, depolarization affects the fluorescence intensity of mitochondria stained with vital dyes, such as MitoTrackers, which will, in turn, presumably affect the values obtained from MITO-SLOPE, which appear to rely on intensity gradients, rather than more concrete spatial coordinates. Also, as indicated above, it is unclear how the authors are establishing the edges of cells without a marker of the plasma membrane or cytoskeleton.

      The authors note that "Divalproex-sodium is a benzodiazepine receptor agonist and HDAC inhibitor (Rahman et al. 2025) used to manage a variety of seizure disorders (Willmore 2003) and bipolar disorder(Bond et al. 2010; Cipriani et al. 2013); it shows a positive MITO-SLOPE which is the direction expected to normalize the centralized mitochondrial localization associated with psychosis." Insofar as this recommends the drug for use in "normalizing" perinuclear mitochondria within neurons, it would seem only prudent to mention that this drug also appears to induce mitochondrial depolarization and fragmentation, which are both associated with a range of severe human pathologies. I would caution the authors to not highlight one potential benefit while omitting an obvious side effect involving what appears to be significant perturbation of mitochondrial structure and function. What is the point of normalizing mitochondrial distribution if the mitochondria being redistributed are dysfunctional?

      The authors note, in Figure 7, that their MITO-SLOPE analysis was unable to discern a statistically significant difference in cells with specific knockouts of genes associated with mitochondrial trafficking. If the MITO-SLOPE cannot discern a difference in the context of a substantial abrogation of mitochondrial transport capacity, how is it that it could detect meaningful differences where there is only a "subtle" change in distribution? This result would seem to militate strongly against the efficacy of this analysis pipeline and would not recommend its use for unbiased screening and discovery.

      Minor comments:

      For Figure 6 b and c, "µm" should be "µM."

      The introduction and discussion could be more concise.

      Significance

      This study attempts to fill an important gap in knowledge relating to mitochondrial distribution and psychological disorders. It aims to perform an initial screen to validate a novel analysis pipeline called MITO-SLOPE; however, the study appears to lack analytical rigor, both in the underlying cell biology and in the approach to quantification itself. Conceptually, this study has great promise, but the authors will need to improve their pipeline prior to publication, which will likely require fundamental revisions, including an array of orthogonal measures (largely lacking here) as well as detailed demonstrations of how the segmentation actually works and ultimately yields data reflecting demonstrable mitochondrial trafficking/distribution defects.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The manuscript by Haghighi and McPhie et al. builds upon their previous findings by exploring the mitochondrial localization as a disease-associated phenotype in mental disorders, particularly in psychotic disorders. They recruited a cohort of patients diagnosed with schizophrenia, schizoaffective disorder, bipolar disorder and MDD. By taking advantage of skin biopsies, they screened patient-derived fibroblasts for aberrant mitochondrial localization and morphology using common staining techniques. Then, they use a machine learning approach to classify patients into their respective groups, which was effective for BP, SZA and pooled psychotic patients. Authors then develop a single feature for phenotyping, Mito-SLOPE, a metric of mitochondria density distribution across a cell by radial areas. With this metric, psychotic patients tend to have more nuclear-localized than edge-localized mitochondria; whereas MDD patients show a trend for higher edge-to-nucleus distribution. To find candidate drugs, authors screen publicly available datasets of cells treated with small compounds using mito-SLOPE. Furthermore, authors then apply mitoSLOPE on a CRISPR screen dataset, showcasing the role of mitochondrial dynamics genes and three genes of interest because of their association with psychosis. Finally, they identified the top genes whose KO or overexpression may explain (or reverse) the mitoSLOPE phenotype.

      Overall, the manuscript is well-written, the conclusions are supported within their limitations and this work represents an advancement in the field. I recommend it for publication provided these concerns are addressed:

      Major comments:

      1. The mitoSLOPE measure is very interesting and most likely reflects subtle changes in mitochondrial transport. What does the microtubule network look like in the patient fibroblasts? Are there obvious alterations in, e.g., their posttranslational modifications? Is there a difference in mitochondrial transport speed or pausing frequency?
      2. I concur with the exclusion of compounds that obviously alter cell shape, as the authors mention for the cancer therapeutics. Some cancer therapeutics actually affect microtubule dynamics (see 1st point), which may underlie their effect on both cell shape and mitoSLOPE. To understand the mechanism of action, the top hits should also be tested for the integrity of the microtubular network and mitochondrial transport parameters.
      3. While I agree with the authors' reasoning that the observed phenotype could be a result of the disease or the result of a compensatory mechanism, their hypothesis could be experimentally tested by addition of any of the top hits in order to reverse mitoSLOPE in their patient cell lines. It may not have worked for Lithium in their last manuscript, but the mechanism of action of the novel compounds could be cell intrinsic.
      4. Does recreation of the CRISPR cell line in their hands produce the same phenotype?
      5. Additionally, the observed phenotypes could also be a product of the medication taken by the patients. Deeper patient data from the cohort may be relevant to put the findings in context. How were patients diagnosed? Which medications were the patients taking? Was substance abuse present? In Mertens et al, Lithium responders and Lithium non-responders showed a differential mitochondrial response, how does this affect their dataset?
      6. While MDD itself is not a psychotic disorder, it can still present with psychotic features. Was this evaluated during the recruitment? Also important, were they on antipsychotic medication in addition to antidepressant therapy?
      7. The fact that CACNA1C is excluded from the "unbiased" hit discovery (Fig 8) undermines the power of the filtering criteria selected by the authors. Authors should include some discussion around this.

      Minor comments:

      1. Colored images should be made colorblind-accessible. This applies to microscopy images and graphs.
      2. Fig 3: Exact p-values should be reported in the graphs
      3. Fig. 5 and Fig 7a-b: It is not immediately clear what the lines in these graphs represent. Is it the individual drug/gene hits in a pre-ranked manner?
      4. Fig 6 b-c: should the "m" be capitalized for Molarity?
      5. The annotation of divalproex/valproic acid as a "benzodiazepine receptor agonist" is incorrect. While it is known to enhance GABAergic neurotransmission, the mechanism is thought to involve GABA synthesis rather than GABA-A receptor agonism (see, e.g., PMID: 23407051).
      6. Supplementary Fig 3 and 4 could be swapped to match the main text order.
      7. One reference was inaccessible: Anon, Phenomics-Enabled Discovery and Optimization of Small Molecule RBM39 Degraders as Alternative to CDK12 Targeting in High-Grade Serious Ovarian Cancer (HGSOC).

      Significance

      Recently, mitochondria have emerged as mediators of anxious behavior and are increasingly studied in the context of neuropsychiatric disorders. However, the molecular mechanisms that connect altered mitochondrial performance to specific neuropathological conditions are unknown. This study extends our knowledge in this realm. While it is in principle an extension of earlier work from the authors (Cataldo, A.M. et al. Am. J. Pathol. 2010), it has added value due to the application of their automated analysis to publicly available datasets, providing a clear technical advance. This identified known as well as novel compounds that could revert the mitochondrial phenotype and makes this study specifically interesting to an audience interested in translational research. The strength of the manuscript certainly lies in the large number of examples studied and their well-rounded discussion of their findings. It is limited by the fact that the phenotype of neuropsychiatric conditions is studied in peripheral cells, and thus may not be a simple cell-autonomous response but a compensatory, systemic response that is not easy to replicate in a fibroblast in isolation. No mechanistic insight is gained on the underlying cell biology in the current format.

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Seegren and colleagues demonstrate that in a mouse model of neonatal E. coli meningitis, loss of endothelial toll-like receptor 4 (TLR4) leads to a marked decrease in transcriptional dysregulation across multiple leptomeningeal cell types, a decrease in vascular permeability, and a decrease in macrophage abundance. In contrast, loss of macrophage TLR4 had less pronounced effects. Using cultured wild-type and TLR4-knockout endothelial cells, the authors further demonstrate that TLR4-NF-κB signaling leads to reversible internalization of the tight junction protein claudin-5, establishing a potential mechanism of increased vascular permeability. Finally, the authors use RNA-sequencing of wild-type and TLR4-knockout endothelial cells to define the TLR4-dependent cell-autonomous transcriptional response to E. coli.

      Strengths:

      (1) The authors address an important, well-motivated hypothesis related to the cellular and molecular mechanisms of leptomeningeal inflammation.

      (2) The authors use model systems (mouse conditional knockouts and cultured endothelial cells) that are appropriate to address their hypotheses. The data are of high quality.

      Weaknesses:

      (1) The authors perform single-nucleus RNA-seq on dissected leptomeninges from control and E. coli-infected mice across three genotypes (WT, Tlr4MKO, and Tlr4ECKO). A major discovery from this experiment, as summarized by the authors, is: "Tlr4ECKO mice exhibited a global attenuation of infection-induced transcriptional responses across all major leptomeningeal cell types, as judged by the positions of cell clusters in the UMAP." This conclusion could be considerably strengthened by improving the qualitative and quantitative analysis.

      (2) The authors interpret E. coli infection-induced increases in leptomeningeal sulfo-NHS-biotin as evidence of compromised BBB integrity (i.e., extravasation from the vasculature) (Results, page 7), but another possible route in this context is sulfo-NHS-biotin entry from the dura across a compromised arachnoid barrier. The complete rescue in Tlr4ECKOs is strongly suggestive that the vascular route dominates, but it would strengthen the work if the authors could assess arachnoid barrier fidelity (e.g. via immunohistochemistry). At a minimum, authors should mention that the sulfo-NHS-biotin signal in this context may represent both vascular and arachnoid barrier extravasation.

      (3) The authors state that "deletion of TLR4 prevented both NF-κB nuclear translocation and Cldn5 internalization in response to E. coli (Figure 4A-D)" (Results, page 9). In Figures 4C and D, however, there is no indicator of a statistical test directly comparing the two genotypes. A comparison of within-genotype P-values should not be used to support a genotype difference (PMID: 34726155).

      (4) In the first paragraph of the Results, the authors summarize the meningeal layers as (1) pia, (2) subarachnoid space, (3) arachnoid, and (4) dura, and then state "The second and third layers constitute the leptomeninges." This definition of leptomeninges seems to omit the pia, which is widely considered part of the leptomeninges (PMID: 37776854).

      (5) The Cdh5-CreER/+;Tlr4 fl/- mouse lacks TLR4 in all endothelial cells (i.e., in peripheral organs as well as CNS/leptomeninges), and, as the authors note, the periphery is exposed to E. coli. It would be helpful if the authors could comment in the Discussion on the possibility that peripheral effects (e.g., peripheral endothelial cytokine production, changes to blood composition as a result of changes to peripheral endothelial permeability) may contribute to the observed leptomeningeal phenotypes.
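      To illustrate the statistical point in weakness (3), a genotype difference is supported by testing the difference between the two treatment effects directly, rather than by noting that one within-genotype comparison is significant and the other is not. The sketch below computes that interaction contrast with a large-sample normal approximation. The function name, the group labels, and the choice of a Wald z-test are assumptions for illustration; with small samples, a two-way ANOVA interaction term or a t-based test would be more appropriate, and nothing here reflects the authors' actual analysis.

```python
import math
from statistics import NormalDist, mean, variance


def interaction_test(wt_ctrl, wt_trt, ko_ctrl, ko_trt):
    """Test whether the treatment effect differs between two genotypes.

    Returns (contrast, p_value), where
    contrast = (KO treated - KO control) - (WT treated - WT control).
    """
    groups = [wt_ctrl, wt_trt, ko_ctrl, ko_trt]
    if any(len(g) < 2 for g in groups):
        raise ValueError("each group needs at least two observations")

    effect_wt = mean(wt_trt) - mean(wt_ctrl)
    effect_ko = mean(ko_trt) - mean(ko_ctrl)
    contrast = effect_ko - effect_wt

    # Standard error of the contrast from the four independent group means.
    se = math.sqrt(sum(variance(g) / len(g) for g in groups))
    if se == 0:
        raise ValueError("zero variance in all groups; test is undefined")

    # Two-sided p-value under a normal approximation (an assumption here).
    z = contrast / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return contrast, p_value
```

      The returned p-value addresses the genotype-by-treatment interaction itself, which is the comparison the quoted claim depends on.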

    2. Reviewer #2 (Public review):

      Summary:

      The authors use a postnatal mouse model of E. coli bacterial meningitis and a mouse brain endothelioma cell line, combined with cell-type-specific gene deletion, to study the function of endothelial TLR4, a cell surface receptor that recognizes gram-negative bacterial wall components, in the local leptomeningeal (LPM) response, with a focus on endothelial barrier breakdown mediated by TLR4. Single-cell transcriptional profiling and imaging studies using whole-mount preps of the LPM support that the inflammatory responses of LPM endothelial cells, CD206+ local macrophages, LPM fibroblasts, and arachnoid barrier cells are abrogated in the endothelial-specific KO of TLR4, pointing to a role for endothelial TLR4 in the local LPM response. Culture studies using Bend3.1 cells (a mouse brain endothelioma cell line) support a direct role for TLR4 in the bacteria-mediated inflammatory response and in internalization of Cldn5 via the endosomal-lysosomal pathway, resulting in loss of barrier integrity.

      Strengths:

      The local LPM cell response in meningitis and the role of specific LPM cells in inflammation and CNS barrier breakdown have not been extensively studied, despite ample evidence for primary immune response in the meninges in human patients and in animal models. The authors employ a robust, multi-model approach using both in vivo and in vitro models with cell-type-specific knockout to study the function of TLR4 in brain endothelial cell response. The authors nicely combine functional barrier assays with IF for junctional localization in their experimental design, and they delve into potential mechanisms of Cldn5 internalization using markers of endosomal-lysosomal pathway localization. The authors also describe a new type of barrier assay using a streptavidin-coated plate upon which barrier-forming cell cultures can be plated; this could be a very useful alternative or complement to other size-selective barrier assays and presumably could work for other barrier-forming cell types, likely epithelial cells.

      Weaknesses:

      (1) There are no measures of bacterial burden in peripheral organs, blood, the LPM, or the brain in the TLR4 endothelial cKO mice. Lack of TLR4 in endothelial cells could prevent bacterial 'access' into the LPM and brain, essentially preventing meningitis and leading to a lack of inflammatory responses in the LPM-located cells simply because there are no bacteria present. Bacteremia may also be reduced, as might inflammatory responses in peripheral organs with TLR4-deficient peripheral endothelium. Bacterial counts and inflammatory measures in peripheral organs and blood are important to better understand the mechanism(s) underlying the reduced inflammatory profile in LPM cells and the absence of LPM endothelial breakdown in the Tlr4 endothelial cKO mice. In other words, does deleting TLR4 in ECs protect against the development of meningitis by somehow blocking bacterial access to the LPM (this would be supported by low or no CFU counts in infected Tlr4 endothelial cKO mice), or is it, as the authors appear to propose in Figure 1J, that ECs are the only cells that respond to the bacteria, via TLR4, to trigger the immune cascade in the LPM? More data are needed to resolve this, as this is a major claim of the paper.

      (2) The authors look at the underlying cortical response (cerebral vasculature for ICAM and immune cells) but do not use markers that could identify microglia (Iba1), the primary resident immune cell (CD206 is not useful, at this stage, in perivascular macrophages that are extremely sparse in the postnatal brain). This would be important to better study the impact on CNS resident immune cell morphological activation.

      (3) The authors suggest that Cldn5 junctional localization is selectively disrupted upon bacterial exposure, mediated by TLR4. They base this on studying PECAM, GLUT-1, ZO-1 and β-catenin (all normally junction or cell surface located in cultured Bend3.1) in relationship to Cldn5 localization (normally high). It is possible that these are also impacted by bacterial exposure (maybe through different mechanisms?). A better measure would be to apply to these proteins the same cyto/PM measure used for Cldn5 in Fig. 4D, or to use intensity measurements.

      (4) The discussion could benefit from delving more into the prior literature on E. coli-mediated breakdown of junctions in cultured human brain microvascular endothelial cell models and critical host-pathogen interactions of the bacteria with ECs (PMID: 14593586), and how this might involve TLR4.

      (5) It would be important to discuss how their results relate to earlier studies on TLR4-/- and TLR2-/- global knockout mice and protection vs vulnerability to development of meningitis (see PMCID: PMC3524395). That paper showed that TLR4 global KO mice have increased susceptibility to dying from meningitis and have much higher CFU counts in the CNS. In this manuscript and their prior work (Wang et al., 2023), this group has shown that both global TLR4-/- mutants and their EC-specific KO have reduced barrier permeability, but we don't have any information about CFU or susceptibility to death from meningitis in their models.

    3. Reviewer #3 (Public review):

      Summary:

      This study investigates the molecular underpinnings of immune responses in the leptomeninges in neonatal bacterial meningitis. Bacterial meningitis is a major disease burden, particularly for neonates, and it has previously been noted that the meningeal immune environment in infants is permissive to opportunistic infection (Kim et al., Sci Immunol, 2023). There is less known about the contribution of the stromal compartment to meningeal immune responses. Seegren et al. interrogate the role of leptomeningeal endothelium in host defence in E. coli infected neonatal mice using mouse genetic tools to delete the LPS receptor Tlr4 from either endothelial cells (using Cdh5-CreER) or macrophages (using LysM-Cre). The authors use snRNAseq, cleared cortical mounts, and in vitro work to define the impact of E. coli infection on leptomeningeal endothelial cells. This study uses a range of innovative techniques to probe the role of the stromal compartment in meningitis.

      Strengths:

      This study makes excellent use of cleared cortical mounts to examine the biology of the leptomeninges, in particular changes to the endothelium, in unprecedented detail. In combination with high-quality sequencing data, these provide new insights into the impact of meningitis on the leptomeninges. The data presented by the authors are of very high quality.

      Weaknesses:

      The weaknesses of the study were in terms of interpretation and perhaps study design.

      (1) Most importantly, the authors need to provide additional validation of their conditional knockout models. The authors need to confirm that the Cdh5-CreER does not impact leptomeningeal fibroblasts and to confirm gene deletion in macrophages.

      (2) The authors could also strengthen the paper by providing data on the impact of these conditional knockout models on the course of meningitis and bacterial burden.

      (3) Finally, it is perhaps not surprising that Tlr4 is required for meningitis responses with E. coli. However, it is unclear if these findings can be generalised to other, more common, meningitis infections (streptococcal/pneumococcal).

      (4) There are additional minor issues; for instance, the arachnoid fibroblast 2 population appears to closely resemble dural border cells.

      (5) The cell line model (bEnd.3) is a relatively low-fidelity model of BBB endothelial cells, and this should be acknowledged.

      With these caveats, it is difficult to be certain that the endothelium alone is the driver of meningeal immune responses in meningitis, and what the impact of these is.

    1. Reviewer #1 (Public review):

      Summary:

      In brief, this manuscript addresses a very interesting topic, namely, the impact of the Mediterranean diet on the development of cancer. Using one mouse model and three tumor cell lines, the data show that a Mediterranean diet is sufficient to promote an anti-tumor response mediated by the microbiota, metabolites, and the immune system. Mechanistically, the Mediterranean diet promotes the expansion of Bacteroides thetaiotaomicron (B. theta for short), which converts tryptophan into 3-IAA. Both B. theta and the metabolite are sufficient to phenocopy the effect of the Mediterranean diet on cancer growth in vivo. The manuscript also shows that this effect is mediated by CD8 T cells and suggests, by way of in vitro assays, that 3-IAA sustains the functionality of CD8 T cells, preventing their exhaustion and blocking the ISR pathway.

      Strengths:

      The conclusions of this manuscript are potentially interesting and of potential clinical relevance.

      Weaknesses:

      For a full technical evaluation of the strength of the data, I am missing important technical and experimental details (e.g., number of independent experiments, statistics), and found some legends with potential labelling inaccuracies.

    2. Reviewer #2 (Public review):

      Summary:

      The authors aimed to investigate the mechanistic link between a Mediterranean-mimicking diet (MedDiet)-specifically the synergy between high fiber and fish oil-and its ability to suppress tumor growth. They successfully identify that this dietary combination alters the gut microbiome to favor the expansion of Bacteroides thetaiotaomicron. This bacterium metabolizes dietary tryptophan into indole-3-acetic acid (3-IAA), which then acts systemically to prevent CD8+ T-cell exhaustion.

      Strengths:

      The study integrates controlled dietary interventions, microbiome perturbation, metabolite profiling, and immune functional analyses into a coherent and well-organized framework, making the overall logic of the work easy to follow. The dietary design is carefully controlled, allowing clear interpretation of which broad dietary features are associated with the observed antitumor effects. The immune dependence of the phenotype is addressed using appropriate experimental approaches, and the results broadly support a role for gut microbiota-derived metabolites in shaping immune cell function. In addition, analyses of human datasets provide important context and enhance the potential relevance and usefulness of the findings for a broader research community.

      Weaknesses:

      While the manuscript provides strong support for a role of the microbial metabolite indole-3-acetic acid and downstream stress signaling in shaping immune cell function, the upstream mechanism by which this metabolite exerts its effects remains unresolved. In particular, the specific molecular sensor or binding target through which the metabolite acts has not been identified, and this uncertainty limits mechanistic precision. Framing this point more explicitly as an open question would help align the interpretation with the current data.

      In addition, at several points, the presentation may imply that a single microbial species is uniquely responsible for the observed effects. However, the experimental evidence more directly demonstrates sufficiency under the tested conditions rather than necessity. A clearer distinction between "sufficient" and "necessary" claims would help readers better assess the generality of the findings and their applicability to more complex microbial communities.

      The interpretation of the human data also warrants some caution. The diet-associated score applied to human datasets is derived from gene-expression signatures identified in mouse models and therefore represents an indirect proxy rather than a direct measure of dietary intake. Although the score correlates with clinical outcomes, it does not establish that patient survival is driven by consumption of specific dietary components such as fiber and fish oil.

    1. Reviewer #1 (Public review):

      This is an excellent paper from Dr. Yokoyama and colleagues. The experiments are technically demanding, given the very low cell numbers and the challenges of working with implantation sites at gestational days 6.5, 10.5, and 14.5. Overall, the impact of TGF-β receptor II deficiency in the NK lineage on uterine trNK cell numbers and litter size is convincing, and the authors' conclusions are well supported by the data. Less convincing, however, is the claim that the decrease in trNK cells is compensated by an increase in cNK cells; rather, the absence of TGF-β receptor II appears to result in an overall reduction of NK/ILC1 cells.

      Major Points:

      (1) Figure 1A and B

      Although a trend is evident, it does not appear that the absolute number of cNK cells at day 14 is significantly changed from day 6.5.

      (2) Figure 2E

      The authors state, "This reduction of uterine trNK cells was accompanied by a concomitant increase in the absolute number and frequency of CD49b+Eomes+ cNK cells within the pregnant uterus of TGF-βRIINcr1Δ dams (Figure 2 D, E)." The number of cNK cells appears relatively low (visually ~1,000-1,300), and although the difference is statistically significant, its physiological relevance is unclear. More importantly, this modest increase does not correlate with the marked decrease in trNK and ILC1 populations, as cNK cells do not appear to accumulate. In my opinion, the conclusion "Collectively, these findings indicate that a TGF-β-driven differentiation pathway directs the conversion of peripheral cNK cells into uterine trNK cells during murine pregnancy" should be slightly toned down.

      (3) Figures 2-4

      It is unclear whether the littermate controls are floxed mice or floxhet-Ncr1iCre mice. This distinction is important, as Ncr1iCre expression itself could potentially lead to a phenotype.

    2. Reviewer #2 (Public review):

      In their manuscript "TGF-β drives the conversion of conventional NK cells into uterine tissue-resident NK cells to support murine pregnancy", Yokoyama and colleagues investigate the role of Tgfbr2 expression by NK cells in the formation of tissue-resident uterine NK cells and subsequent importance in murine pregnancy. By transferring congenic splenic conventional NK cells into pregnant mice, they show conversion of circulating NK cells into uterine ivCD45 negative tissue-resident NK cells. When interfering with the formation of uterine trNK cells, spiral artery remodelling was impaired, fetal resorption rates were increased, and litter sizes were reduced.

      Generally, this is a research topic of high interest, yet the manuscript is lacking detailed mechanistic insights, and some questions remain open. At the current state, the data represent an interesting characterisation of the Tgfbr2-fl/fl Ncr1-Cre mice in pregnancy, but considering (a) the recent publication by the group (Reference 17) on the role of Eomes+ cNK cells during pregnancy, (b) the previously described role of Tgfbr2 and autocrine TGFb expression for uterine NK cell differentiation in virgin mice (also cited by the authors), and (c) the well-known relevance of uterine NK cells during pregnancy, additional experiments addressing the specific role of Tgfb during pregnancy would help to improve novelty and significance of the manuscript. To this end, the following aspects should be discussed and, where applicable, experimentally addressed by the authors:

      (1) The authors suggest cNK extravasation and local differentiation into iv- trNK.

      Can it be estimated how much this process contributes to the trNK pool vs. a potential local proliferation of already existing trNK? How do absolute numbers of CD49a+ Eomes+ trNK change during pregnancies? (In Figure 1A, the cell numbers of CD49a+ Eomes+ trNK seem to go down dramatically between gd 6.5 and 14.5). The plot in 1B could also include absolute numbers of ILC1s and trNKs. Would recruited cNK cells compensate for a potential loss of CD49a+ Eomes+ trNK?

      (2) Figure 1C

      2.5 million cNK cells have been transferred, but only very few cells can be detected within the uterus (concatenated FACS plot shown). What may represent the limit to generate uterine trNK out of cNK? Is the niche supporting cNK-trNK differentiation limited? Is it only a specific subset of (splenic) cNK capable of differentiating into trNK? Is gd 0.5 the optimal timepoint for the transfer? Is there continuous recruitment of cNK into the uterus and differentiation into trNK, or is it enhanced at specific timepoints of pregnancy? Could there be local proliferation of cNK-derived trNK? This could be studied by proliferation dye dilution of WT cNK cells in this transfer-setup.

      (3) The authors should consider inducible Tgfbr2 deletion (e.g. with Tamoxifen-inducible Cre) to enable development of the uterine NK compartment in virgin mice and only ablate trNK differentiation during pregnancy. This could help to estimate the turnover of cNK into trNK, or to understand if constant cNK recruitment is required to form the uterine trNK compartment during pregnancy.

      (4) Did the authors consider transfer of Tgfbr2-floxed Ncr1-Cre cNK in the same setup as in Fig. 1C? This experiment could confirm the requirement of Tgfbr-dependent signalling for cNK to trNK conversion during pregnancy versus effects of Tgfb signals on trNK numbers in the uterus at steady state (before pregnancy).

      (5) Figures 2D/E

      The authors should state that ILC1s are reduced in the virgin uterus of female Tgfbr2-floxed or Tgfb1-floxed Ncr1-Cre mice and cite the relevant work (the Ref #29 discussed in this context did not show that?). It would be helpful to include an analysis of all three uterine ILC subsets in steady state. This could help to answer the question if the cNK cell changes are pregnancy-specific or a general phenomenon in Tgfbr2-floxed Ncr1-Cre mice.

      (6) Figure 2E

      Please phrase more carefully about the "concomitant increase" of cNKs, since this increase is much less pronounced compared to the very strong reduction (absence) of trNKs in Tgfbr2-floxed Ncr1-Cre mice. Do the authors suggest that cNKs are halted at this stage and cannot differentiate into trNK, based on these data?

      (7) Figure 3/4

      Can the reduced litter size and the abnormal spiral artery formation be rescued by transfer of WT cNK into Tgfbr2-floxed Ncr1-Cre mice?

    1. Parent-School Partnership: A Pillar of Academic Success

      Executive Summary

      This briefing document analyzes the key points of the conference organized by Parents partenaires en éducation (PPE) Ontario on the crucial importance of the partnership between families and schools.

      The central message is that student success does not rest on the school alone, but on close, proactive collaboration in which parents act as "co-educators".

      Parental engagement is structured around three dimensions: personal investment, cognitive investment, and institutional engagement.

      For families, particularly immigrant families, this involvement is a major lever for dismantling unconscious biases, affirming cultural identity, and ensuring successful integration.

      The analysis shows that inclusion is a deliberate choice and that a sense of belonging can emerge only when parents' voices take an active part in decision-making within school councils and committees.

      --------------------------------------------------------------------------------

      1. Conceptual Framework of Parental Engagement

      Parental engagement is not limited to supervising homework; it is a multidimensional investment that directly influences the child's academic performance and socio-emotional well-being.

      The Three Dimensions of Engagement

      According to the scientific literature cited, engagement breaks down as follows:

      | Dimension | Description | Concrete examples |
      | --- | --- | --- |
      | Personal investment | Aspirations for, and interest shown in, the child's school life. | Talking about the day; showing interest in classmates and activities. |
      | Cognitive investment | Support with tasks and respect for school structures. | Supervising homework, visiting the library, following rules (e.g., use of electronic devices). |
      | Institutional engagement | Actual presence and participation in decision-making. | Taking part in school councils, parent committees, meetings, and active volunteering. |

      --------------------------------------------------------------------------------

      2. Identity and Values: Foundations of the Partnership

      Parents' identity and values must not be left at the school door. They are the filters through which the partnership is expressed.

      Identity as a decoding tool: The school system needs to know families' sociocultural identity in order to adapt its services (teachers, social workers).

      Decolonizing the mind: For immigrant parents, it is essential to articulate their identity in the face of culture shock and to value their origins so that the child feels safe in the school environment.

      The values filter: Major decisions about the child's education must be passed through the filter of family values. Involvement in school councils makes it possible to challenge the "one size fits all" approach of school policies.

      --------------------------------------------------------------------------------

      3. Analysis of the Benefits of Collaboration

      Collaboration between parents and the school creates a "win-win" dynamic for all stakeholders.

      For the Student

      Greater confidence: The child is proud to see the family involved and valued.

      Increased motivation: Parents' closeness stimulates the student's engagement in his or her own learning.

      Reduced bias: Close collaboration can change how school staff see the child, sometimes turning a negative perception (e.g., hyperactivity perceived as a disorder) into recognition of positive traits (e.g., curiosity and creativity).

      For Parents

      Smoother communication: Direct exchanges with teachers make it easier to resolve problems quickly.

      Agents of change: Parents can influence policies (e.g., dress code, introduction of uniforms, financial literacy).

      Combating isolation: Involvement fosters social and cultural integration, especially for newcomers.

      For School Staff

      Better cultural understanding: Parents help teachers decode students' behavior from a culturally appropriate angle.

      Operational support: Parent volunteering (e.g., accompanying a museum visit) enriches the learning experience.

      --------------------------------------------------------------------------------

      4. Diversity, Inclusion, and Belonging

      A crucial distinction is drawn between these three concepts to guide parental action:

      1. Diversity: A statistical fact (numbers, quotas, linguistic and cultural plurality).

      2. Inclusion: An individual and collective choice. It is the will to welcome others and to integrate actively.

      3. Belonging: The final stage, reached only when minority voices are brought into discussions and decision-making.

      --------------------------------------------------------------------------------

      5. Examples of Impact Through Proactive Engagement

      The source highlights several cases in which a parent's initiative transformed the school environment:

      Cultural accommodation: A proposal for a quiet corner for prayer allowed a student to practice his faith safely, bringing the values of home and school into harmony.

      Affirming identity: A session of storytelling and African dance changed how a student saw her traditional clothing, moving her from shame to pride.

      Curricular innovation: One parent's initiative led a school council to adopt financial literacy as a priority.

      Strategic redirection: The close relationship between a mother and a teacher made it possible to redirect a student toward a program better suited to his profile (International Baccalaureate), changing his academic trajectory.

      --------------------------------------------------------------------------------

      6. Conclusion and Call to Action

      The document concludes that lack of time is often a perceived barrier rather than a real one. One hour a month given to the school council can be enough to exert a positive influence.

      Key messages for the future:

      • Parents are the first educators; the school provides instruction, parents provide upbringing.

      • Parental involvement is the only effective way for the school system to know and respect the identity of the families it serves.

      • Every parent has the power to influence and must choose to be an agent of change to ensure a pluralistic society enriched by its differences.

    1. build and strengthen their professional networks, leading to further job opportunities. And don’t discount the role of your professors in helping you build your network as well! In addition to providing valuable letters of recommendation for both graduate school and job applications, professors often have well-established professional networks and may be willing to help connect dedicated students with additional opportunities.

      Networking builds highly valuable connections that can last a lifetime.

    2. Review the checklist below and mark each item if you agree. For those you cannot yet answer, consult your instructor, academic advisor, or college website to locate these important details.

      This is an excellent checklist to ensure I am on the right track.

    3. it is tough to anticipate what to expect when you’re new to college. Taking the time to create a plan and to revise it when necessary is essential to making well-informed, mindful decisions.

      I find this sentence contradictory: it's tough to anticipate what to expect, yet we are told to create a plan. How will I plan for the unexpected?

    4. Alternatively, you may be able to speed up, or accelerate, your timeline to degree by taking courses during summer or winter terms. Or if you take fewer than 15 credits per semester, you can take courses during the summer terms to “make up” those credits and stay on track toward those two- or four-year graduation goals.4

      I like that there are several options for PT students to be able to complete their degree in a decent time frame.

    1. Reviewer #1 (Public review):

      The manuscript titled "The distinct role of human PIT in attention control" by Huang et al. investigates the role of the human posterior inferotemporal cortex (hPIT) in spatial attention. Using fMRI experiments and resting-state connectivity analyses, the authors present compelling evidence that hPIT is not merely an object-processing area, but also functions as an attentional priority map, integrating both top-down and bottom-up attentional processes. This challenges the traditional view that attentional control is localized primarily in frontoparietal networks.

      The manuscript is strong and of high potential interest to the cognitive neuroscience community. Below, I raise questions and suggestions to help with the reliability, methodology, and interpretation of the findings.

      (1) The authors argue that hPIT satisfies the criteria for a priority map, but a clearer justification would strengthen this claim. For example, how does hPIT meet all four widely recognized criteria, such as spatial selectivity, attentional modulation, feature invariance, and input integration, when compared to classical regions such as LIP or FEF? A more systematic summary of how hPIT meets these benchmarks would be helpful. Additionally, to what extent are the observed attentional modulations in hPIT independent of general task difficulty or behavioral performance?

      (2) The authors report that hPIT modulation is invariant to stimulus category, but there appear to be subtle category-related effects in the data. Were the face, scene, and scrambled images matched not only in terms of luminance and spatial frequency, but also in terms of factors such as semantic familiarity and emotional salience? This may influence attentional engagement and bias interpretation.

      (3) The result that attentional load modulates hPIT is important and adds depth to the main conclusions. However, some clarifications would help with the interpretation. For example, were there observable individual differences in the strength of attentional modulation? How consistent were these effects across participants?

      (4) The resting-state data reveal strong connections between hPIT and both dorsal and ventral attention networks. However, the analysis is correlational. Are there any complementary insights from task-based functional connectivity or latency analyses that support a directional flow of information involving hPIT? In addition, do the authors interpret hPIT primarily as a convergence hub receiving input from both DAN and VAN, or as a potential control node capable of influencing activity in these networks? Also, were there any notable differences between hemispheres in either the connectivity patterns or attentional modulation?

      (5) A few additional questions arise regarding the anatomical characteristics of hPIT: How consistent were its location and size across participants? Were there any cases where hPIT could not be reliably defined? Given the proximity of hPIT to FFA and LOp, how was overlap avoided in ROI definition? Were the functional boundaries confirmed using independent contrasts?

      Comments on revisions:

      The authors have successfully addressed my previous questions and concerns. The public comments above reflect my views on the initial submission and, in my opinion, will remain helpful for general readers. Given this, I do not have additional public comments and will keep my previous public review unchanged.

    2. Reviewer #2 (Public review):

      Summary

      This study investigates the role of the human posterior inferotemporal cortex (hPIT) in attentional control, proposing that hPIT serves as an attentional priority map that integrates both top-down (endogenous) and bottom-up (exogenous) attentional processes. The authors conducted three types of fMRI experiments and collected resting-state data from 15 participants. In Experiment 1, using three different spatial attention tasks, they identified the hPIT region and demonstrated that this area is modulated by attention across tasks. In Experiment 2, by manipulating the presence or absence of visual stimuli, they showed that hPIT exhibits strong attentional modulation in both conditions, suggesting its involvement in both bottom-up and top-down attention. Experiment 3 examined the sensitivity of hPIT to stimulus features and attentional load, revealing that hPIT is insensitive to stimulus category but responsive to task load - further supporting its role as an attentional priority map. Finally, resting-state functional connectivity analyses showed that hPIT is connected to both dorsal and ventral attention networks, suggesting its potential role as a bridge between the two systems. These findings extend prior work on monkey PITd and provide new insights into the integration of endogenous and exogenous attention.

      Strengths

      (1) The study is innovative in its use of specially designed spatial attention tasks to localize and validate hPIT, and in exploring the region's role in integrating both endogenous and exogenous attention, as prior works focus primarily on its involvement in endogenous attention.

      (2) The authors provided very comprehensive experiment designs with clear figures and detailed descriptions.

      (3) A broad range of analyses was conducted to support the hypothesis that hPIT functions as an attentional priority map -- including experiments of attentional modulation under both top-down and bottom-up conditions, sensitivity to stimulus features and task load, and resting-state functional connectivity. These analyses showed consistent results.

      (4) Multiple appropriate statistical analyses - including t-tests, ANOVAs, and post-hoc tests - were conducted, and the results are clearly reported.

      Comments on revisions:

      The authors have addressed our comments in their revised manuscript and in their response to the reviewers. We don't have any further suggestions or comments.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      The manuscript titled "The distinct role of human PIT in attention control" by Huang et al. investigates the role of the human posterior inferotemporal cortex (hPIT) in spatial attention. Using fMRI experiments and resting-state connectivity analyses, the authors present compelling evidence that hPIT is not merely an object-processing area, but also functions as an attentional priority map, integrating both top-down and bottom-up attentional processes. This challenges the traditional view that attentional control is localized primarily in frontoparietal networks.

      The manuscript is strong and of high potential interest to the cognitive neuroscience community. Below, I raise questions and suggestions to help with the reliability, methodology, and interpretation of the findings.

      Thank you for a nice summary of the key points of our study. Below you will find our reply to your questions.

      (1) The authors argue that hPIT satisfies the criteria for a priority map, but a clearer justification would strengthen this claim. For example, how does hPIT meet all four widely recognized criteria, such as spatial selectivity, attentional modulation, feature invariance, and input integration, when compared to classical regions such as LIP or FEF? A more systematic summary of how hPIT meets these benchmarks would be helpful. Additionally, to what extent are the observed attentional modulations in hPIT independent of general task difficulty or behavioral performance?

      Great suggestions! For the first suggestion, we have included a clearer justification in the Discussion section of the manuscript (lines 405-406). For the second, all participants received task practice prior to scanning, and task accuracy exceeded 90%, suggesting the tasks were not overly demanding. Although ceiling effects limit the interpretability of behavioral-performance correlations, we argue that higher task demands would likely require greater attentional effort, leading to stronger modulation in hPIT, which aligns with our findings.

      (2) The authors report that hPIT modulation is invariant to stimulus category, but there appear to be subtle category-related effects in the data. Were the face, scene, and scrambled images matched not only in terms of luminance and spatial frequency, but also in terms of factors such as semantic familiarity and emotional salience? This may influence attentional engagement and bias interpretation.

      The response of hPIT is not sensitive to stimulus category, but attentional modulation in hPIT is slightly stronger for faces than for scenes and scrambled images. Although the faces used in the task had neutral expressions and the scene pictures were also neutral, we acknowledge that we cannot entirely rule out the possibility that semantic familiarity or emotional salience contributed to the subtle category-related effects in the results of Experiment 3. This limitation has been noted in the Discussion section of the manuscript (lines 440-442).

      (3) The result that attentional load modulates hPIT is important and adds depth to the main conclusions. However, some clarifications would help with the interpretation. For example, were there observable individual differences in the strength of attentional modulation? How consistent were these effects across participants?

      Yes, individual differences exist. In the manuscript, we have included individual subject data points in Figure 6B. No data point exceeded three standard deviations from the group mean, suggesting that the attentional modulation effects were generally consistent across participants.
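      The three-standard-deviation screen mentioned in this response can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `three_sd_outliers` and the modulation values are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the response does not say which estimator was used.

```python
from statistics import mean, stdev


def three_sd_outliers(values, threshold=3.0):
    """Return the values lying more than `threshold` sample SDs from the mean."""
    m = mean(values)
    sd = stdev(values)  # sample standard deviation (assumed; n - 1 denominator)
    if sd == 0:
        return []  # all values identical, so nothing can be flagged
    return [v for v in values if abs(v - m) > threshold * sd]


# Hypothetical modulation indices for 15 participants (NOT data from the study)
modulation = [0.42, 0.35, 0.51, 0.48, 0.39, 0.44, 0.95, 0.37,
              0.46, 0.41, 0.50, 0.33, 0.45, 0.40, 0.47]

flagged = three_sd_outliers(modulation)
print(flagged)
```

      An empty result corresponds to the situation the authors describe, in which every participant is retained. Because an extreme value inflates the very mean and standard deviation it is compared against, this screen is conservative in small samples.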

      (4) The resting-state data reveal strong connections between hPIT and both dorsal and ventral attention networks. However, the analysis is correlational. Are there any complementary insights from task-based functional connectivity or latency analyses that support a directional flow of information involving hPIT? In addition, do the authors interpret hPIT primarily as a convergence hub receiving input from both DAN and VAN, or as a potential control node capable of influencing activity in these networks? Also, were there any notable differences between hemispheres in either the connectivity patterns or attentional modulation?

      Although it is hard to infer the directional flow of information from fMRI because of its low temporal resolution, we agree that, beyond resting-state connectivity, task-based functional connectivity analyses could provide additional information about whether hPIT serves as a convergence node or a control hub. We have conducted task-based functional connectivity analyses, specifically psychophysiological interaction (PPI) analyses, using data from Experiment 2, which revealed task-modulated right hPIT connectivity with FFA, LOp, and TPJ, suggesting hPIT may allocate attentional resources to object-processing regions following priority map generation (lines 378-383). Given the limited number of significant PPI results and the inherent constraints of fMRI in capturing fast or transient attention-related interactions, the present data do not allow us to determine which of these roles hPIT plays. Future studies combining effective connectivity or causal perturbation methods (e.g., DCM, TMS-fMRI) would be ideal to test whether hPIT acts as a control node influencing activity within DAN and VAN.

      We also observed modest hemispheric asymmetries in connectivity; for instance, both left and right hPIT showed stronger connectivity with right-hemisphere attention nodes. This is described in the Results section of the manuscript (lines 373-377).

      (5) A few additional questions arise regarding the anatomical characteristics of hPIT: How consistent were its location and size across participants? Were there any cases where hPIT could not be reliably defined? Given the proximity of hPIT to FFA and LOp, how was overlap avoided in ROI definition? Were the functional boundaries confirmed using independent contrasts?

      Supplementary Figure 1 shows a relatively consistent size and location of hPIT across subjects; the voxel size and location for individual subjects are reported there. This consistency is also demonstrated by Figure 4C.

      We avoided overlap with the FFA and LOp by manually delineating hPIT, which is defined by conjunction maps across the three tasks, and by excluding overlapping voxels. The FFA was defined using an independent contrast (Exp3 contrast [face-scene]), and the LOp location was defined by anatomical parcellation (Glasser et al., 2016).

      Reviewer #2 (Public review):

      Summary

      This study investigates the role of the human posterior inferotemporal cortex (hPIT) in attentional control, proposing that hPIT serves as an attentional priority map that integrates both top-down (endogenous) and bottom-up (exogenous) attentional processes. The authors conducted three types of fMRI experiments and collected resting-state data from 15 participants. In Experiment 1, using three different spatial attention tasks, they identified the hPIT region and demonstrated that this area is modulated by attention across tasks. In Experiment 2, by manipulating the presence or absence of visual stimuli, they showed that hPIT exhibits strong attentional modulation in both conditions, suggesting its involvement in both bottom-up and top-down attention. Experiment 3 examined the sensitivity of hPIT to stimulus features and attentional load, revealing that hPIT is insensitive to stimulus category but responsive to task load - further supporting its role as an attentional priority map. Finally, resting-state functional connectivity analyses showed that hPIT is connected to both dorsal and ventral attention networks, suggesting its potential role as a bridge between the two systems. These findings extend prior work on monkey PITd and provide new insights into the integration of endogenous and exogenous attention.

      Strengths

      (1) The study is innovative in its use of specially designed spatial attention tasks to localize and validate hPIT, and in exploring the region's role in integrating both endogenous and exogenous attention, as prior works focus primarily on its involvement in endogenous attention.

      (2) The authors provided very comprehensive experiment designs with clear figures and detailed descriptions.

      (3) A broad range of analyses was conducted to support the hypothesis that hPIT functions as an attentional priority map -- including experiments of attentional modulation under both top-down and bottom-up conditions, sensitivity to stimulus features and task load, and resting-state functional connectivity. These analyses showed consistent results.

      (4) Multiple appropriate statistical analyses - including t-tests, ANOVAs, and post-hoc tests - were conducted, and the results are clearly reported.

      Thank you for a nice summary of the key points and strengths of our study.

      Weaknesses

      (1) The sample size is relatively small (n = 15), and inter-subject variability is big in Figures 5 and 6, as seen in the spread of individual data points and error bars. The analysis of attention-modulated voxel map intersections appears to be influenced by multiple outliers.

      We agree that the sample size (n = 15) is not ideal, and we acknowledge that some data points in Figures 5 and 6 appear to be potential outliers. However, according to conventional outlier detection criteria, all data points fell within three standard deviations of the group mean and were therefore retained for analysis.

      Moreover, the attention-modulated voxel intersection map shown in Figure 4C is insensitive to outliers, because the intersection plotted is based on the number of subjects.
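      To illustrate why a map based on the number of subjects is robust to extreme values, here is a minimal sketch under stated assumptions: each subject's result is taken to be a thresholded binary mask over the same voxels, and the function name `subject_overlap_map` and the example masks are hypothetical rather than taken from the manuscript.

```python
def subject_overlap_map(subject_masks):
    """For each voxel, count the subjects whose binary mask marks it as modulated."""
    n_voxels = len(subject_masks[0])
    if any(len(mask) != n_voxels for mask in subject_masks):
        raise ValueError("all masks must cover the same voxels")
    return [sum(mask[v] for mask in subject_masks) for v in range(n_voxels)]


# Hypothetical thresholded masks for three subjects and four voxels
# (1 = significant attentional modulation, 0 = not significant)
masks = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
]

counts = subject_overlap_map(masks)
print(counts)
```

      Each subject contributes at most 1 to any voxel, so a participant with an unusually large effect size changes the count no more than any other participant whose effect passes the threshold.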

      (2) The authors acknowledge important limitations, including the lack of exploration of feature-based attention and the temporal constraints inherent to fMRI.

      Yes, we have mentioned these limitations in the discussion.

      (3) Prior research has established that regions such as the prefrontal cortex (PFC) and posterior parietal cortex (PPC) are involved in both endogenous and exogenous attention and have been proposed as attentional priority maps. It remains unclear what is uniquely contributed by hPIT, how it functionally interacts with these classical attentional hubs, and whether its role is complementary or redundant. The study would benefit from more direct comparisons with these regions.

      In this study, we defined the ROI based on the intersection across three different types of spatial attention tasks, which is a stricter criterion, and the results did not reveal spatial attentional modulation across tasks in any region besides PITd. This could be due to the lack of lateralized responses in PFC/PPC. To evaluate whether a region qualifies as a priority map, we applied four widely accepted criteria (as mentioned in the Introduction). While dorsal and ventral attention network (DAN and VAN) regions can be considered supportive components of the priority map system, our findings suggest that among the regions tested, only hPIT fully meets all criteria. In Experiment 2, we included regions such as VFC (as part of PFC) and IPS (as part of PPC), and our findings suggest these areas are more involved in top-down attention. In the revision, we have performed additional analyses on PPC (IPS) and PFC (FEF, VFC), shown in Figure S2.

      (4) The functional connectivity analysis is only performed on resting-state data, and this approach does not capture context-dependent interactions. Task-based data analysis can provide stronger evidence.

      We acknowledge that resting-state FC is limited in assessing task-specific communication. To further investigate the role of hPIT, we have conducted a PPI analysis, which revealed task-modulated right hPIT connectivity in attention allocation (lines 378-383).

      (5) The study does not report whether attentional modulation in hPIT is consistent across the two hemispheres. A comparison of hemispheric effects could provide important insight into lateralization and inter-individual variability, especially given the bilateral localization of hPIT.

      We thank the reviewer for this suggestion. hPIT was localized bilaterally using the same intersection-based method in Experiment 1. We have now performed additional analyses and found hemispheric differences in hPIT attentional modulation (Experiment 2). In Experiment 3, by contrast, the difference in load modulation (averaged across stimulus categories) between left and right hPIT was not significant. These results are reported in the Results section of the manuscript (lines 347-351).

    1. Burgoon defined personal space as the “invisible, variable volume of space surrounding an individual that defines that individual’s preferred distance from others.”2 She claimed that the size and shape of our personal space depend on our cultural norms and individual preferences, but our space always reflects a compromise between the conflicting approach–avoidance needs that we as humans have for affiliation and privacy. The idea of personal space wasn’t original with Burgoon. In the 1960s, Illinois Institute of Technology anthropologist Edward Hall coined the term proxemics to refer to the study of people’s use of space as a special elaboration of culture.3 He entitled his book The Hidden Dimension because he was convinced that most spatial interpretation is outside our awareness. He claimed that Americans have four proxemic zones, which nicely correspond with the four interpersonal distances selected by my students:

      The part about personal space caught my attention because it shows we all kinda have an invisible bubble around us even if we do not think about it. When someone stands too close or too far it just feels off and changes how we see the interaction. The classroom example made sense because the professor reacted differently to each student just based on distance, not what they said. That shows communication is not only words, it is also space and body position. It explains why someone can feel awkward or rude even when they did not actually say anything bad. So distance affects how we judge people without us realizing it.

    2. Personal Space Expectations: Conform or Deviate? Burgoon defined personal space as the “invisible, variable volume of space surrounding an individual that defines that individual’s preferred distance from others.”2 She claimed that the size and shape of our personal space depend on our cultural norms and individual preferences, but our space always reflects a compromise between the conflicting approach–avoidance needs that we as humans have for affiliation and privacy. The idea of personal space wasn’t original with Burgoon. In the 1960s, Illinois Institute of Technology anthropologist Edward Hall coined the term proxemics to refer to the study of people’s use of space as a special elaboration of culture.3 He entitled his book The Hidden Dimension because he was convinced that most spatial interpretation is outside our awareness. He claimed that Americans have four proxemic zones, which nicely correspond with the four interpersonal distances selected by my students:

      Burgoon’s definition of personal space shows that the distance we keep from others is not accidental but shaped by both cultural norms and personal comfort levels. What stood out to me is the idea that personal space is a balance between our need for closeness and our need for privacy, which connects directly to Expectancy Violations Theory. This helps explain why the same behavior, like standing too close, can feel friendly to one person but uncomfortable or invasive to another. Hall’s concept of proxemics adds to this by emphasizing that many of these spatial expectations operate without us even realizing it, making violations more noticeable and impactful when they occur.

    1. What are 3–5 adjectives that you would want your receiver to use to describe you?

      Three adjectives I would want my receiver to use to describe me are productive, respectful, and hardworking.

    1. List each source that you have cited in your paper with an in-text citation in the Works Cited page. Only list sources you have cited in the paper. Do not list sources that you have consulted but not cited.

      only list sources that you cited

    1. AI Doesn’t Reduce Work—It Intensifies It
      • Task Expansion & Role Blurring: AI lowers the barrier to entry for complex tasks, leading employees to take on work outside their core expertise. Product managers and designers are now writing code, while researchers take on engineering tasks.
      • Specialist Burden: This expansion creates a "cleanup" tax. For example, senior engineers now spend significant time reviewing, debugging, and mentoring colleagues who produce "vibe-coded" AI outputs, often through informal and unmanaged channels like Slack.
      • The "Ambient Work" Phenomenon: Because AI interactions feel conversational and "easy," work has become ambient. Employees find themselves prompting AI during lunch, between meetings, or late at night, eliminating natural mental downtime.
      • Intensified Multitasking: Workers are running multiple AI agents in parallel while simultaneously performing manual tasks. This creates a high sense of "momentum" but leads to extreme cognitive load and constant attention-switching.
      • The Productivity Trap: AI acts as a "partner" that makes revived or deferred tasks feel doable. This creates a flywheel where people don't work less; they simply take on more volume, leading to "unsustainable intensity" that managers often mistake for genuine productivity.
      • Sustainability Risks: The researchers warn that while AI feels like "play" initially, it eventually leads to cognitive fatigue, impaired decision-making, and burnout as the quiet increase in workload becomes overwhelming.

      Hacker News Discussion

      • Cognitive Fatigue: Users highlighted that "AI fatigue" is distinct from normal work tiredness. It stems from the "constant vigilance" required to audit AI output and the lack of a "flow state" due to unpredictable waiting times for generations.
      • Executive Function Strain: Commenters noted that managing autonomous agents is more exhausting than manual work. One user compared it to Level 3 autonomous driving—you aren't driving, but you must remain "fully hands-on" to ensure the AI doesn't touch the wrong files or hallucinate.
      • The Jevons Paradox: Several participants pointed out that as the "cost" of work decreases due to AI, the demand for work increases proportionally. Instead of saving time, workers are expected to triple their output, which leaves them more stressed than before.
      • Management Expectations: A common theme was that leadership often mandates AI usage and pre-supposes productivity gains, leaving no room for cases where AI makes work slower or lower quality. This forces employees to "perform" productivity while working longer hours.
      • Vibe Coding vs. Engineering: There is a heated debate between those who see "vibe coding" (prompt-heavy development) as a massive efficiency gain and veterans who argue it produces "average code" that becomes a maintenance nightmare in large, legacy codebases.
  2. d1wqtxts1xzle7.cloudfront.net d1wqtxts1xzle7.cloudfront.net
    1. limited to stages 3 to 5 (because NHES III included children 12 to 17, considered too old to estimate stage 2 reliably) and Mexican American children from HHANES. The authors concluded that there is no evidence of an earlier puberty (as measured by median ages of Tanner stages 3–5) during the time spanning the 3 surveys (1960s through 1990s) for either non-Hispanic black or white girls but “some evidence” for Mexican American girls between 1982 and 1994. An advantage of this study is the consistent reanalysis of data from 3 national surveys. Disadvantages of the analysis are the exclusion of information on onset (Tanner stage 2), which many consider to be the key puberty-timing issue of concern, and decreased study power as a result of limiting the sample sizes for comparison

      So this study shows that there is no change in puberty timing for ages 12-17? The concern is puberty onset, not general maturation, so does that mean it is not applicable?

    1. A possibly-related but undeciphered writing system emerged about 5,200 years ago in the region of Susa in what are now the Iranian highlands east of Mesopotamia. Named after the proto-Elamite culture of the region, the script found on over 1,600 clay tablets developed during a period when the region was trading with Uruk. Between 1,000 and 1,500 symbols seem to represent words and syllables in a language that gave way to Elamite about five hundred years later.

      I think it is quite interesting how there are so many undeciphered writing systems just like this one. Goes to show how advanced their civilization may have been at the time. I can’t even fathom how a new language or writing system would be interpreted in today's day and age.

    1. The meditation training involved six 60-minute group sessions (held over 7 weeks, because of religious holidays) with 20–30 participants per group. All sessions were led by a stress-management specialist (Sandra M. Finkel) with extensive experience practicing and teaching LKM. The median number of sessions attended was five (M = 4.3, SD = 1.8). At the first session, participants were given a CD that included three guided meditations of increasing scope, led by the workshop instructor. During Week 1, participants practiced a meditation directing love and compassion toward themselves. During Week 2, the meditation added loved ones. During subsequent weeks, the meditation built from self, to loved ones, to acquaintances, to strangers, and finally, to all living beings. The first meditation lasted 15 min, and the final one lasted 22 min.

      LKM and the different levels the meditation group was exposed to: 1) guided CD meditation led by the instructor, 2) added loved ones, 3) acquaintances, 4) strangers, 5) all living beings.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      General statements.

      We thank the reviewers for their positive response and useful suggestions on our manuscript. They recognize the ‘proof of concept’ nature of the work and the importance of extending the number of human mutation-specific DMD mouse models from one to five for preclinical research. We feel that the quality of the manuscript has been improved upon implementation of the reviewer’s suggestions.

      Reviewer 1.

      OPTIONAL - From the point of view of the reviewer, it seems plausible to use CRISPR/Cas9 to "clean up" the original hDMDmdx mouse line by selectively removing one of the YACs forming the tail-to-tail tandem in the mouse genome. Once such a single-copy mouse line is generated (and proven viable?), any subsequent rearrangement of the hDMD transgene would prove much less challenging. Such a mouse line would also better represent the human model, where only one DMD copy is carried on the X chromosome.

      The reviewer gives the optional suggestion that the generation of these models could have been combined with the removal of one of the copies of the YAC to extend the use of the new models to CRISPR-based therapies. This is correct, but we note that when the data on the removal of a copy of the YAC were published, our new models were already generated and in different stages of QC, colony building and analysis. The procedure described by Chey et al could be used on our new models, but this would require additional time and funding and is therefore outside the scope of this manuscript.

      The labels in figure 2B and 3A would benefit from showing the PCR fragment lengths as well as the sizes of obtained hDMD exon deletions. One could also include an additional figure panel demonstrating the principle of ASO-induced exon skipping.

      Reviewer #1 also has a minor comment regarding the exact deletions in figures 2B and 3A. For fig. 2B, he/she suggests including the sizes of the PCR fragments next to the gel. Especially for the gel regarding PCR1, which detects the deleted YAC copy, this would not be very informative, as the fragment size can be (and is) different for different clones depending on the NHEJ-mediated repair in the specific clone. Adding sizes is only informative for each specific clone, and adding them all would make a very messy figure. The important message from this gel is the presence of any fragment, as the undeleted copy is not amplified under the conditions used. For the gel of PCR2 the opposite is the case: the PCR fragment shown is simply the undeleted YAC copy, and here we are only interested in the absence of the PCR fragment.

      We thank the reviewer for the suggestion of adding the deletion sizes to fig 3A. This made us realize that an additional table with the details of the mutant alleles in all models had been omitted, and we apologize for this error. With the revised version we include details on the size of the deletions and their genomic coordinates (in the human genome as it is in the human YAC) of each of the new models (revised Sup. Table 1). We trust that adding these details will clarify this reviewer’s minor comment.

      The reviewer requests to include an additional figure panel demonstrating the principle of ASO-induced exon skipping. We have now added this to the revised version of the manuscript (new fig. 5).

      The study is fairly limited in scope and will be of primary interest to those working in the DMD field.

      We are aware of 9 clinical trials for exon 51 and 53 studies that are ongoing or were recently stopped. For four of these compounds companies have a license to our hDMDdel52/mdx mouse model, and one of these studies has been published. An additional 7 clinical trials are planned or ongoing for exon 44, 45 and 50 skipping for which the newly developed models are being or can be used for preclinical studies.

      Reviewer 2.

      To further strengthen the rigor of the study, it would be valuable to include an analysis of potential off-target effects of CRISPR editing, particularly given that double targeting of two YAC copies was required. This is especially important for germline edits, as off-target mutations could introduce confounding phenotypes in the resulting mice. Demonstrating minimal or absent off-target activity would increase confidence in the specificity and safety of the generated models.

      There has indeed been one major study suggesting a large number of CRISPR-induced off-target mutations in mouse models. However, this publication was rapidly questioned by multiple groups for having used the wrong control animals and the original publication was retracted (https://doi.org/10.1038/nmeth0518-394a). Another study at that time, using the correct controls, did not find mutations that could be attributed to CRISPR-induced off-target mutations. A more recent study analysed founder animals from transgenic projects using 163 different guide RNAs and concluded ‘In total, only 4.9% (8/163) of guides tested have detectable off-target activity, at a rate of 0.2 Cas9 off-target mutations per founder analysed. In comparison, we observe ~1,100 unique variants in each mouse regardless of genome exposure to Cas9 indicating off-target variants comprise a small fraction of genetic heterogeneity in Cas9-edited mice.’ In short, the background mutation rate in mice is much higher than the Cas9 off-target mutation rate. In addition to this, we only used guide RNAs that did not have any predicted off-target sites (according to the CRISPOR tool; https://crispor.gi.ucsc.edu/crispor.py) on the same chromosome or in protein coding sequences, so that any undetected off-target mutation will rapidly be lost in the subsequent breeding. We also would like to refer the reviewer to the ‘referee cross-commenting remark’ from reviewer #3 on this topic.
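
      The figures quoted from that study can be put side by side with a short calculation. The following is a minimal sketch that uses only the numbers cited above; the variable names are illustrative assumptions and are not part of the cited analysis.

```python
# Figures quoted in the response above (from the cited founder-animal study).
GUIDES_TESTED = 163
GUIDES_WITH_OFF_TARGET = 8
OFF_TARGET_PER_FOUNDER = 0.2          # Cas9 off-target mutations per founder
BACKGROUND_VARIANTS_PER_MOUSE = 1100  # approximate unique variants per mouse

# Share of guide RNAs with any detectable off-target activity.
fraction_guides = GUIDES_WITH_OFF_TARGET / GUIDES_TESTED

# How many background variants there are for each Cas9 off-target mutation.
background_to_off_target = BACKGROUND_VARIANTS_PER_MOUSE / OFF_TARGET_PER_FOUNDER

print(f"Guides with detectable off-target activity: {fraction_guides:.1%}")
print(f"Background variants per off-target mutation: about {background_to_off_target:,.0f}")
```

      On these numbers, background variants outnumber Cas9 off-target mutations by several thousand to one, which is the point made in the response.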

      The validation of the dystrophic phenotype is generally convincing. However, the authors should clarify how "human dystrophin" is detected in the deletion models. Since only part of the dystrophin gene in these mice is humanized (the remainder is murine), it is important to specify, also in the results, which antibody was used and which epitope/exon it recognizes. If the antibody targets a deleted exon in a given model, this could lead to misinterpretation of the dystrophin signal. Providing this clarification would ensure the conclusions regarding dystrophin expression are fully supported.

      This question is based on the incorrect assumption that only part of the DMD gene in these models is humanized. As described in the original publication on the YAC transgenics, the complete human gene is in the YAC. Here, we deleted a particular exon from this complete human DMD gene. In combination with the mdx allele, these mice lack the full-length mouse and human dystrophin isoforms expressed in muscle. As mentioned in the materials section, the human dystrophin protein was detected with the Mandys 106 antibody (recognizing exon 43; amino acids 2063-2078), which only has reactivity with human dystrophin according to the product specification of Sigma Aldrich. We confirmed this in wild-type mouse tissue, which showed no dystrophin signal with this antibody. In fig 4 we confirm lack of human dystrophin in the deletion models using this antibody. The mouse and human dystrophin protein was detected with the AB154168 antibody of Abcam (recognizing the last 100 amino acids of the C-terminal part of the protein), which has reactivity with both mouse and human. Thus, neither antibody targeted a deleted exon. For the exon skipping validation, solely the Abcam antibody was used, as none of the deleted or skipped exons was recognized by this antibody. Information regarding the targeted protein region has now been added to the materials section.

      Additionally, to further strengthen the characterization of the muscular dystrophy phenotype, the authors could quantify muscle fibre size and the percentage of centrally nucleated fibres, both of which are widely accepted quantitative markers of ongoing degeneration/regeneration in DMD models.

      and

      The validation of exon skipping in the new hDMD deletion models is convincing at the molecular level. However, since the ASOs were injected into both gastrocnemius and triceps muscles, it would be helpful to include at least a brief characterization of the triceps, even in supplementary data, as different muscles can show slightly different pathology and responses. Additionally, while the molecular readouts (RT-PCR and Western blot) demonstrate restoration of dystrophin expression, including simple histological analysis, such as H&E staining, could further support functional improvement and reinforce the physiological relevance of exon skipping in these models.

      The proof-of-principle nature of the current manuscript is focused on restoration of dystrophin expression shortly after ASO treatment, and the current sample sizes (n=3 mice per strain) are too limited for actual quantification of histopathological improvements. Furthermore, the timespan between the intramuscular injection and tissue collection (2 weeks) does not allow sufficient time for histopathological improvements to develop. Notably, a large natural history analysis of all these new models is currently ongoing, which includes a large variety of in vivo functional outcome measures and provides a full description of the histopathological aspects of these mice. The proposed characterization of the triceps is now included as supplementary data of the manuscript (Sup. Fig 1).

      Reviewer 3.

      This reviewer starts with pointing out some typos, or requested rephrasing to sentences for clarification. We appreciate this and have addressed this in the revised version of the manuscript.

      Generation of the models: it is not clear why the authors generated line 44 in ES cells, then switched to direct gene editing in zygotes. Was this due to advent of electroporation of zygotes at the time? This may need clarification beyond the sentence "Encouraged by the specificity of our new prescreen workflow and the efficiency of correct targeting of human exon 44 in ES cells, we generated additional models ... directly in mouse zygotes".

      The simple answer to this is that we were (pleasantly) surprised ourselves by the efficiency we got in the ES cells (our expectation was based on the previous experience generating the del52 model). For animal welfare reasons we prefer to generate models via ES cells if we expect a long and cumbersome quality control process and/or very low efficiency, as ES cells allow us to do this QC before the actual animals are generated, thus reducing the number of animals generated during the model generation phase. Expecting very low efficiency, we originally picked 10 x 96 well plates of clones for this del44 targeting, but after pre-screening the first two plates (192 clones), we realized this was an enormous overkill in clones, and the additional 8 plates were not analysed. With this much higher than expected efficiency, and the power of the two-step pre-screen described in the manuscript, we decided to try the next model (the del45) directly in zygotes. This was found efficient enough to also do the last two models directly in zygotes. We can only speculate on why the efficiency was much higher than that observed for the del52 targeting. Clearly, the fact that we knew of the double integration this time allowed us to develop the successful 2-step pre-screen. Another difference is that the del52 model was generated using TALENs as genome editors, whereas now we could use CRISPR/Cas9.

      Antisense oligonucleotide treatment: there is no description of the design of the ASOs beyond their sequence in suppl. Table 4. How were they designed? Moreover, they have been injected at two different doses (i.e., 50ul for Exon 51 & 53; 100ul for Exon 44 & 45). What is the rationale for this? There is no justification in the manuscript.

      The requested additional details on ASO design and dosing have been added to the materials section of the revised manuscript. The reviewer also pointed out that fig 4 includes both a protein sample diluted to 10% of protein of both a C57BL/6J and hDMD/mdx control mouse, and requested a justification for this. We included samples of both wildtype strains to confirm species reactivity of the dystrophin antibodies used, with the AB145168 antibody being specific for both mouse and human protein (showing a dystrophin band in both wildtype samples), and the Mandys106 antibody being specific to only human protein (showing a dystrophin band in the hDMD/mdx control only).

      Phenotypic validation of the new models: a description of the mdx line with C57BL/6J mice is mentioned. Is this why Fig.4 includes "10% Bl6" and "10% hDMD/mdx"? If so, this should be clarified in the text (or deleted from the figure). The authors mentioned "As expected, the gastrocnemius of healthy hDMD/mdx mice expressed dystrophin of human origin at wildtype levels". Why would this be expected? If 2 copies of the gene, including the human promoter, are integrated, why would one expect a wildtype level of expression? In fact, in the original paper describing the hDMD/mdx model ('t Hoen et al. 2008), the human transcripts are expressed at 2 to 4-fold higher than their endogenous counterparts (which is in line with the integration of 2 copies).

      It is true, as he/she points out, that qRT-PCR data in the original YAC transgenic publication showed double expression of the human transcript, consistent with the double integration. However, fig. 3b in the same paper shows that at the protein level the expression of human DMD is comparable to the mouse protein. We don’t know the reason for the discrepancy between transcript and protein levels in this model, but in the current manuscript we are referring to this protein expression.

      A quantification of the expression levels on Figure 4 should be done (normalized to actinin) to resolve this. The size of the Marker should also be added on Figure 4.

      We feel that proper quantification can only be done with the utilization of a standard curve. As we expected no, or trace levels of dystrophin in the deletion models, we only included wildtype samples diluted to 10% of wildtype protein. This prevents us from accurate quantification of the trace dystrophin levels observed in the del45 and del51 models. However, as can be appreciated from fig 4, expression is very minimal. We added information on the marker in the materials section, and indicated the size (85 kDa) in the figure legend.

      Finally, the authors observed histological hallmarks of the disease in the new models (i.e., muscle degeneration and fibrosis). Although obvious on the images, it may be useful to add indications (e.g., arrows) on the images for readers non familiar with DMD.

      We also added the requested arrows to the pictures of fig. 4B to allow distinction between different histopathological hallmarks, and refer to these in the figure legend.

      Prescreen PCR of hDMD/mdx ES cells (Fig. 2): the authors mentioned that "The PCR conditions were chosen for not being able to amplify the undeleted allele." What does this mean? Was the elongation time reduced? As per the text, the theoretical size of a WT band is around 1.6kb. Yet, on the gel, bands higher than 1kb are visible for some clones.

      This is indeed based on the extension time of the PCR reaction shown in PCR 1 from fig 2B, amplified with primers upstream and downstream of the deleted region (see fig 1 and 2A). However, the approx. 1.6 kb fragment the reviewer refers to is the undeleted-specific amplification shown in Fig 2B PCR 2, which is the result of a primer outside and a primer inside the deleted region (fig 1 and 2A). Amplification of the undeleted copy with the primers used in PCR 1 would give a fragment of 3902 nt. The deletion of exon 44 in the final model is 3584 nt (details will be shown in the Excel file that was erroneously omitted; see our response to reviewer #1), so the PCR 1 product of the deleted copy in the clone used for the mouse model is 318 nt. It is straightforward to select an extension time that would be insufficient for a 3.9 kb fragment, but which can amplify fragments that are shorter due to the deletion. Even in a clone with a single copy of exon 44 deleted, one would not expect to see the 3902 nt fragment due to preferential amplification of the much shorter mutant band. This has now been clarified in the legend of figure 2 of the revised version of the manuscript.
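
      The fragment sizes in this reply can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the extension time and polymerase rate are assumed round numbers (a rule of thumb of about 1 kb per minute), not the PCR conditions used in the study.

```python
# Sizes taken from the response text.
UNDELETED_AMPLICON_NT = 3902  # PCR 1 product from the undeleted YAC copy
EXON44_DELETION_NT = 3584     # size of the exon 44 deletion in the final model


def deleted_amplicon_size(undeleted_nt: int, deletion_nt: int) -> int:
    """Expected PCR product length once the deleted region is removed."""
    return undeleted_nt - deletion_nt


def can_amplify(amplicon_nt: int, extension_s: float, rate_nt_per_s: float) -> bool:
    """True if the polymerase could finish the amplicon within one extension step."""
    return amplicon_nt <= extension_s * rate_nt_per_s


# Hypothetical values chosen only for illustration (not from the manuscript).
ASSUMED_EXTENSION_S = 60.0
ASSUMED_RATE_NT_PER_S = 1000 / 60  # roughly 1 kb per minute

deleted_nt = deleted_amplicon_size(UNDELETED_AMPLICON_NT, EXON44_DELETION_NT)
print(deleted_nt)  # 318
print(can_amplify(deleted_nt, ASSUMED_EXTENSION_S, ASSUMED_RATE_NT_PER_S))
print(can_amplify(UNDELETED_AMPLICON_NT, ASSUMED_EXTENSION_S, ASSUMED_RATE_NT_PER_S))
```

      Under these assumed conditions the 318 nt deleted product fits within the extension step while the 3902 nt undeleted product does not, which mirrors the selection logic described in the reply.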

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #3

      Evidence, reproducibility and clarity

      The manuscript by van Putten et al. describes the generation and initial characterization of four new mouse models of DMD, based on the previously generated hDMD/mdx murine model, which expressed human dystrophin from a yeast artificial chromosome (YAC) in a DMD null (mdx) background. The four new models are based on the deletion of four Exons (44, 45, 51 & 53), which accounts for most human deletions (hotspot) in DMD.

      The description of the generation of these models using CRISPR/Cas9 gene editing is thorough, and the quality control is adequate. Moreover, preliminary testing of exon skipping therapy using ASO showed it is possible to restore the production of dystrophin protein (albeit truncated) in these models, which increases their translational value. Although the study is valuable and methodologically sound, there are minor points that need to be addressed:

      • Few typos need to be corrected:
        • "Therapeutic approaches aiming to restore dystrophin for DMD are based on the discrepancy between DMD and BMD mutations." This needs to be rephrased to clarify the meaning for readers not familiar with DMD.
        • "Western blot and immune fluorescence analysis on gastrocnemius muscles..." replace "immune fluorescence" with "immunofluorescence".
        • "Two weeks after the last injection muscles were isolated, and RNA and protein was isolated from muscle..." protein WERE isolated.
        • "However, gene editing-based therapies could run into the same unpredictable outcome reduced efficiency of a therapy ..." This sentence is confusing, consider rephrasing.
      • Generation of the models: it is not clear why the authors generated line 44 in ES cells, then switched to direct gene editing in zygotes. Was this due to advent of electroporation of zygotes at the time? This may need clarification beyond the sentence "Encouraged by the specificity of our new prescreen workflow and the efficiency of correct targeting of human exon 44 in ES cells, we generated additional models ... directly in mouse zygotes".
      • Antisense oligonucleotide treatment: there is no description of the design of the ASOs beyond their sequence in suppl. Table 4. How were they designed? Moreover, they have been injected at two different doses (i.e., 50ul for Exon 51 & 53; 100ul for Exon 44 & 45). What is the rationale for this? There is no justification in the manuscript.
      • Phenotypic validation of the new models: a description of the mdx line with C57BL/6J mice is mentioned. Is this why Fig.4 includes "10% Bl6" and "10% hDMD/mdx"? If so, this should be clarified in the text (or deleted from the figure). The authors mentioned "As expected, the gastrocnemius of healthy hDMD/mdx mice expressed dystrophin of human origin at wildtype levels". Why would this be expected? If 2 copies of the gene, including the human promoter, are integrated, why would one expect a wildtype level of expression? In fact, in the original paper describing the hDMD/mdx model ('t Hoen et al. 2008), the human transcripts are expressed at 2 to 4-fold higher than their endogenous counterparts (which is in line with the integration of 2 copies). A quantification of the expression levels on Figure 4 should be done (normalized to actinin) to resolve this. The size of the Marker should also be added on Figure 4. Finally, the authors observed histological hallmarks of the disease in the new models (i.e., muscle degeneration and fibrosis). Although obvious on the images, it may be useful to add indications (e.g., arrows) on the images for readers non familiar with DMD.
      • Prescreen PCR of hDMD/mdx ES cells (Fig. 2): the authors mentioned that "The PCR conditions were chosen for not being able to amplify the undeleted allele." What does this mean? Was the elongation time reduced? As per the text, the theoretical size of a WT band is around 1.6kb. Yet, on the gel, bands higher than 1kb are visible for some clones.

      Referee cross-commenting

      The comments from the other reviewers seem fair, reasonable, and should be easily addressed by the authors. The off-target analysis might however be a bit of a stretch, given that (as per published data) the off-target rate is low (i.e., no higher than genetic drift) in mouse zygotes when using CRISPR RNPs, and any potential off-target mutation could easily be segregated out by means of backcrossing.

      Significance

      The four new mouse models generated in this study will advance the field both at the preclinical and the clinical levels, because they more closely recapitulate the human mutations linked to DMD than previous models, while presenting with a translational potential (the authors showed a proof of concept of exon-skipping therapy in these mice).

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      The authors generated four novel humanized DMD mouse models carrying deletions of exons 44, 45, 51, or 53 in the human DMD gene on an mdx C57BL/6J background. They developed an optimized CRISPR-Cas9 pre-screening workflow for embryonic stem cells and zygotes, allowing efficient and precise targeting of the human DMD YAC, which carries a complex double tail-to-tail integration. The models display absent or trace dystrophin and classical DMD muscle pathology, including fibrosis. ASO-mediated exon skipping of flanking exons successfully restores dystrophin expression, validating their use for preclinical testing of mutation-specific therapies. These models address a key limitation of the standard mdx mouse, which carries a mutation only in exon 23, and provide a more clinically relevant platform for evaluating human sequence-specific therapeutic strategies for the most frequently mutated DMD exons.

      Minor comments:

      1. The pre-screen workflow and model generation are impressive and well-optimized. To further strengthen the rigor of the study, it would be valuable to include an analysis of potential off-target effects of CRISPR editing, particularly given that double targeting of two YAC copies was required. This is especially important for germline edits, as off-target mutations could introduce confounding phenotypes in the resulting mice. Demonstrating minimal or absent off-target activity would increase confidence in the specificity and safety of the generated models.
      2. The validation of the dystrophic phenotype is generally convincing. However, the authors should clarify how "human dystrophin" is detected in the deletion models. Since only part of the dystrophin gene in these mice is humanized (the remainder is murine), it is important to specify, also in the results, which antibody was used and which epitope/exon it recognizes. If the antibody targets a deleted exon in a given model, this could lead to misinterpretation of the dystrophin signal. Providing this clarification would ensure the conclusions regarding dystrophin expression are fully supported. Additionally, to further strengthen the characterization of the muscular dystrophy phenotype, the authors could quantify muscle fibre size and the percentage of centrally nucleated fibres, both of which are widely accepted quantitative markers of ongoing degeneration/regeneration in DMD models.
      3. The validation of exon skipping in the new hDMD deletion models is convincing at the molecular level. However, since the ASOs were injected into both gastrocnemius and triceps muscles, it would be helpful to include at least a brief characterization of the triceps, even in supplementary data, as different muscles can show slightly different pathology and responses. Additionally, while the molecular readouts (RT-PCR and Western blot) demonstrate restoration of dystrophin expression, including simple histological analysis, such as H&E staining, could further support functional improvement and reinforce the physiological relevance of exon skipping in these models.

      Significance

      This study presents a clear and technically robust advance in the field of Duchenne muscular dystrophy (DMD) preclinical research. The strongest aspects are the generation of four novel humanized DMD mouse models carrying clinically relevant exon deletions (44, 45, 51, 53) and the development of an optimized CRISPR-Cas9 pre-screening workflow that efficiently and precisely targets the human DMD YAC, despite its complex double tail-to-tail integration. These models display relevant dystrophic phenotypes and are validated for ASO-mediated exon skipping, demonstrating their applicability for preclinical testing of mutation-specific therapies.

      Compared to existing models, such as the standard mdx mouse or previously generated hDMDdel52/mdx line, these new models address the critical limitation that most human DMD mutations cluster outside exon 23, providing a more clinically relevant system. The study extends knowledge both technically, by demonstrating an efficient pre-screening workflow for complex humanized YAC edits, and functionally, by creating models that allow preclinical evaluation of human sequence-specific therapeutic strategies for the most frequent DMD mutations. The audience for this work includes basic and translational researchers in the muscular dystrophy, gene therapy, and genome editing fields, as well as clinicians interested in the development and preclinical testing of exon skipping and gene-editing therapies. These models will likely be widely used to optimize therapy design, dosage, and delivery, enhancing translatability to clinical applications.

      Field of expertise: Duchenne muscular dystrophy, preclinical models, genome editing, exon skipping therapies, regenerative medicine.

    1. Reviewer #1 (Public review):

      Summary:

      The article investigates how the Japanese macaque makes gait transitions between quadruped and biped gaits. It presents a compelling neuromechanical simulation that replicates the transition and an interesting analysis based on an inverted pendulum that can explain why some transition strategies are successful and others are not.

      Strengths:

      I enjoyed reading this article. I think it presents an interesting study and elegant modeling approaches (musculoskeletal + inverted pendulum). The study is well conducted, and the results are interesting. I particularly liked how the success of gait transitions could be predicted based on the inverted pendulum and its saddle node stability. I think it makes a useful and interesting contribution to the state of the art.
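
      To make the saddle argument concrete, the following is a minimal sketch of a generic point-mass inverted pendulum, not the authors' model. The pendulum length, the angle convention (angle measured from upright), and the function names are assumptions chosen for illustration. Linearizing around the upright posture gives one stable and one unstable eigenvalue (a saddle), and an energy comparison indicates whether a given state carries the mass over the top or lets it fall back.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def saddle_eigenvalues(length):
    """Eigenvalues of the linearized dynamics at the upright equilibrium.

    For theta'' = (G / length) * sin(theta), with theta measured from upright,
    linearizing at theta = 0 gives theta'' = (G / length) * theta. The
    eigenvalues are -sqrt(G / length) and +sqrt(G / length): one stable and
    one unstable direction, which is what makes the equilibrium a saddle.
    """
    rate = math.sqrt(G / length)
    return (-rate, rate)


def passes_over_top(theta, omega, length):
    """Return True if the state (theta, omega) carries the mass past upright.

    theta: angle from upright in radians (the sign gives the leaning side).
    omega: angular velocity in rad/s.
    The mass crosses theta = 0 only if it is moving toward upright and its
    kinetic energy exceeds the potential energy gap to the top.
    """
    moving_toward_upright = theta * omega < 0
    kinetic = 0.5 * omega ** 2
    barrier = (G / length) * (1.0 - math.cos(theta))
    return moving_toward_upright and kinetic > barrier
```

      In this toy setting, a state that fails the energy check falls back to the side it started on, which is one way to read why some transition strategies succeed and others do not.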

      Weaknesses:

      The article is already in great shape, but could be improved a bit by:

      (1) Strengthening the comparison to animal data. In particular, videos of the real animal should be included + snapshots of their gaits (quadruped, biped, and transitions).

      (2) Exploring and testing a broader range of conditions. I think it would be very interesting to test gaits and gait transitions on up and down slopes (both with the musculoskeletal model and with the inverted pendulum model). This could be used to make predictions on how the real animal adapts to those conditions. Ideally, this should be tested on the animal as well. I think this could increase (even more) the impact of this work.

      (3) Better explaining several aspects of the PSO optimization.

      (4) (Ideally) performing a sensitivity analysis on the optimized parameters (e.g. variations of +-5, 10, 20%) in order to determine their respective importance and how much their instantiated values have influenced the results.

      (5) Running a spell checker, as there are quite a few typos.

    2. Reviewer #2 (Public review):

      Summary:

      This article presents a neuromusculoskeletal (NMS) model of the Japanese macaque. The model is combined with a neural feedforward controller, based on a CPG and synergies, that reproduces quadrupedal and bipedal gait as well as the transition between them. The model and controller were validated using experimental data. Results were also compared to an inverted pendulum model to show that the macaque's transition between quadrupedal and bipedal gait can be described by this kind of representation of transition and stability. Overall, the article is very interesting, but it sometimes lacks clarity.

      Strengths:

      The model produces impressive results for quadrupedal gait, bipedal gait, and the transition between them, validated by experimental data. This is notable because NMS controllers based on feedforward control are very difficult to fine-tune.

      Weaknesses:

      (1) The movement regulator is not clear and should be better explained. At first, it seems that it is just a new CPG/synergy (feedforward) added, but in the methods, it seems to be a feedback controller.

      (2) It is also not clear what is meant by discretizing the weight for the trigger limb from 0 to 1 (page 8).

      (3) The controller is mainly using a feedforward controller, allowing only anticipatory movement. Animals are also using a reflex-based feedback controller. A controller with feedback/reflex could reduce failed attempts in training and better represent the transition.

      (4) There are small typos throughout the article that should be corrected.

    3. Reviewer #3 (Public review):

      Summary:

      The purpose of this study was to test the hypothesis that the inverted pendulum mechanism contributes to the gait transition from quadrupedal to bipedal gait in Japanese macaques. The authors use a neuromusculoskeletal model to generate different motor tasks by varying motor command parameters during forward dynamics simulations. After the simulations, the authors used dynamical system analysis of the inverted pendulum model to reveal the underlying principles of gait transition control. The authors showed that successful gait transition from quadrupedal to bipedal gait mostly depends on increased step length of a hindlimb.

      Strengths:

      This study is important not only for understanding gait transition, but also for understanding stability control of bipedal gaits. Another advantage of this study is that it allows us to isolate one control mechanism and determine its effect and limits. In animal studies, by contrast, several compensatory stability control mechanisms act in combination.

      Weaknesses:

      No simulation is perfect, so discrepancies from experimental data are expected. A 2D model is used; the advantage of using a 3D model is not clear, and such a model would be much more complicated.

    1. Lipid transmitters: the most important lipid neurotransmitters are cannabinoids, which are produced both by the body (endocannabinoids) and by plants (phytocannabinoids). Endocannabinoids are synthesized in the postsynaptic membrane from arachidonic acid and are made on demand rather than stored in synaptic vesicles. They diffuse back to the presynaptic membrane, where they activate CB1 receptors and thereby inhibit the release of both GABA and glutamate (see below). As a result, they function as neuromodulators that regulate both inhibitory and excitatory activity. Cannabinoids play a role in processes such as appetite, pain, sleep, mood, memory, anxiety, and stress.

      Endo means inside the body; exo means outside the body. So there are little cannabis factories inside our body too.

    2. Small-molecule transmitters: the most important small-molecule neurotransmitters in the central nervous system are acetylcholine (ACh), dopamine (DA), noradrenaline (NE), and serotonin (SE). These substances are synthesized in the axon terminal from nutrients and are immediately available for release. After release into the synaptic cleft, they can be replenished quickly through reuptake and reuse.

      Important! acetylcholine --> acetate and choline; serotonin --> tryptophan --> depression; GABA --> glutamate, excitation and inhibition; tyrosine

    1. Author response:

      eLife Assessment

      This study provides a valuable contribution to understanding how negative affect influences food-choice decision making in bulimia nervosa, using a mechanistic approach with a drift diffusion model (DDM) to examine the weighting of tastiness and healthiness attributes. The solid evidence is supported by a robust crossover design and rigorous statistical methods, although concerns about the interpretation of group differences across neutral and negative conditions limit the interpretability of the results.

      We are grateful for this improved assessment. Below, we provide detailed responses that we believe address the noted concerns about interpreting group differences across conditions. If these clarifications resolve the interpretability concerns, we would be grateful if the editors would consider updating the eLife assessment accordingly.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Using a computational modeling approach based on the Drift and Diffusion Model (DDM) introduced by Ratcliff and McKoon in 2008, the article by Shevlin and colleagues investigates whether there are differences between neutral and negative emotional states in:

      (1) The timings of the integration in food choices of the perceived healthiness and tastiness of food options in individuals with bulimia nervosa (BN) and healthy participants

      (2) The weighting of the perceived healthiness and tastiness of these options.
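
      The two quantities listed above (when each attribute enters the decision, and how strongly it is weighted) can be illustrated with a minimal drift diffusion sketch. This is not the authors' implementation: the function name, the parameter names (`w_taste`, `w_health`, `t_taste`, `t_health`), and all default values are assumptions chosen for illustration. Evidence accumulates toward one of two bounds, and each attribute contributes to the drift only after its own onset time.

```python
import random


def simulate_trial(taste, health, w_taste, w_health, t_taste, t_health,
                   bound=1.0, dt=0.01, noise=1.0, max_time=5.0, rng=random):
    """Simulate one choice with attribute-specific onset times.

    taste, health: attribute values of the option (hypothetical scale).
    w_taste, w_health: weights on each attribute in the drift rate.
    t_taste, t_health: times (s) at which each attribute starts to count.
    Returns (choice, decision_time); choice is 1 for the upper bound,
    0 for the lower bound, and None if no bound is reached by max_time.
    """
    evidence = 0.0
    t = 0.0
    while t < max_time:
        # Only attributes whose onset time has passed contribute to the drift.
        drift = 0.0
        if t >= t_taste:
            drift += w_taste * taste
        if t >= t_health:
            drift += w_health * health
        # Euler step: deterministic drift plus Gaussian diffusion noise.
        evidence += drift * dt + noise * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        t += dt
        if evidence >= bound:
            return 1, t
        if evidence <= -bound:
            return 0, t
    return None, t
```

      In a sketch like this, an earlier `t_taste` than `t_health` gives tastiness a head start, which can shift decision dynamics even when the weights are unchanged.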

      Strengths:

      By looking at the mechanistic part of the decision process, the approach has potential to improve the understanding of pathological food choices.

      Weaknesses:

      I thank the authors for revising their manuscript.

      However, I still have major concerns.

      The authors say that they removed any causal claims in their revised version of the manuscript. The sentence before the last one of the abstract still says "bias for high-fat foods predicted more frequent subjective binge episodes over three months". This is a causal claim that I already highlighted in my previous review, specifically for that sentence (see my second sentence of my major point 2 of my previous review).

      We appreciate the Reviewer's continued attention to causal language. We acknowledge that our use of the term 'predicted', though intended to refer to statistical prediction in a regression model, could be misinterpreted as implying causation. We have therefore revised this sentence to read: 'bias for high-fat foods was associated with more frequent subjective binge episodes over three months'.

      I also noticed that a comment that I added was not sent to the authors. In this comment I was highlighting that in Figure 2 of Galibri et al., I was uncertain about a difference between neutral and negative inductions of the average negative rating after the induction in the BN group (i.e. comparing the negative rating after negative induction in BN to the negative rating after neutral induction in BN). Figure 2 of Galibri et al. looks to me that:

      (1) The BN participants were more negative before the induction when they came to the neutral session than when they came to the negative session.

      (2) The BN participants looked almost equally negative (taking into account the error bars reported) after the induction in both sessions.

      These observations are of high importance because they may support the fact that BN patients were likely in a similar negative state to run the food decision task in both conditions (negative and neutral). Therefore, the lack of difference in food choices in BN patients is unsurprising and nothing could be concluded from the DDM analyses. Moreover, the strong negative ratings of BN patients in the neutral condition as compared to healthy participants together with almost similar negative ratings after the two inductions contradict the authors' last sentence of their abstract.

      I appreciate that the authors reproduced an analysis of their initial paper regarding the negative ratings (i.e. Table S1). It partly answers my aforementioned point but does not address the fact that BN patients may have been in a similar negative state in both conditions (neutral and negative) when running the food decision task: if BN patients were similarly negative after both inductions (neutral and negative), nothing can be concluded from the differences in their results obtained from the DDM. As the authors put it, "not all loss-of-control eating occurs in the context of negative state"; I add that far from all negative states lead to loss-of-control eating in BN patients. This underlies all my aforementioned remarks and the remarks in my first review.

      A solution for that is to run a paired t-test in BN patients only comparing the score after the induction in the two conditions (neutral and negative) reported in Figure 2 of their initial article.

      We appreciate the reviewer’s concern. We understand how the visual representation in Figure 2, which displays between-subject error bars, might suggest similar post-induction affect levels. However, the within-subject paired comparison (which appropriately accounts for individual differences in baseline affect) reveals a significant difference, which we detail below.

      While BN participants did report higher baseline negative affect than the HC group prior to the mood inductions, this does not negate the effectiveness of the manipulation. The critical comparison is the within-subject change from pre- to post-induction (detailed below) which shows that negative affect was significantly higher after the negative induction than the neutral induction.

      As we reported in the Supplementary Information (Table S1), our initial analyses of self-reported affect ratings used a linear mixed-effects model with group (HC = 0, BN = 1), condition (Neutral = 0, Negative = 1), and time (pre-induction = 0, post-induction = 1) as fixed effects, including all interactions, and random intercepts for participants. This approach accounts for individual differences in baseline affect.

      However, to address the reviewer's concerns, we conducted two simple effects analyses using estimated marginal means. As the reviewer suggested, we directly compared post-induction affect between conditions within the BN group (described in the second analysis below). In the first analysis, we examined the diagnosis × time interaction within each condition separately. In the Negative condition, individuals with BN demonstrated a substantial increase in negative affect from pre- to post-induction (mean difference = 20.36, t = 4.84, p < 0.0001, Cohen’s d = 0.97). In the second analysis, we examined the condition × time interaction within each group separately. Among the BN group, we found that reported affect was significantly higher following the negative mood induction than after the neutral affect induction (mean difference = -17.40, t = -4.13, p = 0.0003, Cohen’s d = 0.83). This difference in post-induction negative affect between conditions within the BN group represents a meaningful and statistically robust difference in affective states. These within-group effects confirm that the negative mood induction was (1) effective in the BN group and (2) produced significantly greater negative affect than the neutral mood induction.

      These findings confirm that participants completed the food decision task under meaningfully different affective states, supporting the interpretability of the subsequent DDM analyses. We now report these analyses in the Supplementary Information.
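
      For readers who want to see the shape of the paired comparison the reviewer requested, here is a minimal sketch. It is a plain paired t-test on within-participant differences, not the estimated-marginal-means analysis reported above, and the ratings in the usage note are made-up numbers, not study data. The function name and the use of Cohen's d for paired data (mean difference divided by the standard deviation of the differences) are assumptions.

```python
import math
import statistics


def paired_t_and_dz(scores_a, scores_b):
    """Paired t statistic and Cohen's d_z for two within-subject conditions.

    scores_a, scores_b: equal-length lists with one value per participant.
    Returns (t, d_z, degrees_of_freedom).
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both conditions need one score per participant")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)       # sample SD of the differences
    t = mean_diff / (sd_diff / math.sqrt(n))
    d_z = mean_diff / sd_diff               # equals t / sqrt(n)
    return t, d_z, n - 1


# Hypothetical post-induction ratings for five participants.
post_negative = [60, 55, 70, 48, 66]
post_neutral = [45, 50, 52, 40, 47]
t_stat, d_z, df = paired_t_and_dz(post_negative, post_neutral)
```

      Because d_z = t / sqrt(n) for a paired design, the reported values can be checked against the BN sample size of 25: 4.84 / 5 = 0.968 and 4.13 / 5 = 0.826, which match the reported d = 0.97 and d = 0.83 after rounding. This is an observation about the arithmetic, not a statement of how the effect sizes were actually computed.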

      I appreciate the analysis that the authors added with the restrictive subscale of the EDE-Q.

      That this analysis does not show any association with the parameters of interest does not show that there is a difference in the link between self reported restrictions and self reported binges. Only such a difference would allow us to claim that the results the authors report may be related to binges.

      We thank the reviewer for raising this important point about specificity. To address this concern, we examined the correlation between self-reported binge frequency (both subjective binge episodes and objective binge episodes over the past three months) and EDE-Q Restraint subscale in our BN sample.

      The correlations between these measures were modest and non-significant (subjective binge frequency: Spearman’s ρ = 0.21, p = 0.306; objective binge frequency: Spearman’s ρ = 0.05, p = 0.806), indicating that both binge frequency measures and dietary restraint were relatively independent dimensions of eating pathology in our sample. This dissociation supports the specificity of our findings: the fact that our DDM parameters were associated with binge frequency but not with dietary restraint suggests that the affect-induced changes in decision-making we observed are specifically related to binge-eating behavior rather than reflecting a correlate of dietary restraint. We now report this analysis in the Supplementary Information.
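
      As a reminder of what a Spearman correlation computes, the sketch below ranks both variables (averaging tied ranks) and then takes the Pearson correlation of the ranks. It is a standard-library illustration with hypothetical function names, not the authors' analysis code, and it does not compute a p-value.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

      Rank-based correlation is a common choice for count-like measures such as episode frequencies, because it is less sensitive to skew and outliers than a Pearson correlation on the raw values.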

      I appreciate the wording of the answer of the authors to my third point: "the results suggest that individuals whose task behavior is more reactive to negative affect tend to be the most symptomatic, but the results do not allow us to determine whether this reactivity causes the symptoms". This sentence is crystal clear and sums very well the limits of the associations the authors report with binge eating frequency. However, I do not see this sentence in the manuscript. I think the manuscript would benefit substantially from adding it.

      We thank the reviewer for the suggestion. We have added the following sentences that convey this information to the end of the third paragraph of the discussion:

      “These results suggest that individuals whose task behavior is more reactive to negative affect tend to be the most symptomatic. However, our correlational design does not allow us to determine whether this reactivity causes the symptoms.”

      Statistical analyses:

      If I understood the mixed models correctly, the analyses in Supplementary Tables S1 and S27 to S32 treat all measures as independent, which means that the scores for each condition (neutral vs. negative) and each time point (before vs. after induction), although rated by the same participants, are assumed to be independent. This type of analysis does not take into account the potential correlation between the four scores of a given participant. As a consequence, the results may include false positives that a linear mixed model does not address. The appropriate analysis would be to run suitable statistical tests that pair the data, without running any mixed model.

      We appreciate the reviewer's attention to the statistical approach. However, we respectfully note that mixed-effects models do account for within-subject correlations, contrary to the reviewer’s interpretation.

      The linear mixed-effects model we employed explicitly accounts for the correlation among repeated measures from the same participant through the random intercept term. This random effect structure models the non-independence of observations within participants, allowing for correlated errors within individuals while assuming independence between individuals. This is a standard and appropriate approach for analyzing repeated-measures data (Bates et al., 2015).

      The mixed-effects model is, in fact, more appropriate than separate paired t-tests for our design because it:

      (1) Simultaneously models all fixed effects (group, condition, time) and their interactions in a single unified framework;

      (2) Properly partitions variance into within-subject and between-subject components;

      (3) Provides greater statistical power and more precise estimates by using all available data simultaneously; and

      (4) Allows for direct testing of three-way interactions that cannot be assessed through pairwise comparisons alone.

      Paired tests (e.g., t-tests), as the reviewer suggests, would require multiple separate analyses and would not allow us to test our primary hypotheses about group × condition × time interactions. The mixed-effects approach provides a more comprehensive and statistically rigorous analysis of our repeated-measures design. To clarify this even further in the manuscript, we have added the following in our methods when describing our model, “participant-level random intercepts were included to account for within-subject correlations across repeated measurements.”
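
      The point about random intercepts can be written out explicitly. The following is a generic random-intercept model, not a transcription of the authors' fitted model; the symbols (participant i, measurement j, variances σ_u² and σ_ε²) are assumptions introduced for illustration.

```latex
% Random-intercept model for measurement j of participant i
y_{ij} = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_i + \varepsilon_{ij},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma_\varepsilon^2)

% Two different measurements (j \neq k) from the same participant share u_i
\operatorname{Cov}(y_{ij}, y_{ik}) = \sigma_u^2,
\qquad
\operatorname{Corr}(y_{ij}, y_{ik}) = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_\varepsilon^2}

% Measurements from different participants (i \neq i') are uncorrelated
\operatorname{Cov}(y_{ij}, y_{i'k}) = 0
```

      Under these assumptions, the four scores of a given participant are correlated whenever σ_u² > 0, so they are not treated as independent. A random intercept alone does imply the same correlation between every pair of scores from one participant (compound symmetry).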

      Notes:

      The fact that specific methods, such as correlating self-reported measures aggregated over long periods with almost instantaneous behaviors (such as tasks), have been used extensively in studies does not mean that these methods are suited to answering a given scientific question. Measures aggregated over long periods miss the variations in instantaneous behaviors over these periods.

      We acknowledge the reviewer’s concern about the temporal mismatch between our session-level task measures and the 3-month aggregated symptom reports. This is a valid limitation of cross-sectional designs, and we agree that examining how task performance fluctuates in relation to real-time symptom variation would provide richer insights into the potential dynamics of these relationships.

      We agree that we cannot capture how daily changes in task performance relate to momentary symptom occurrence. In response to previous rounds of helpful reviews, we added this limitation to the Discussion section, noting that future research employing ecological momentary assessment (EMA) or daily diary methods could examine whether the decision-making processes we identified also fluctuate in relation to real-time symptom occurrence.

      We note that our finding that affect-induced changes in decision-making parameters were associated with subjective binge frequency suggests that this laboratory-measured reactivity may reflect a stable individual difference that manifests across contexts and time periods. While our current study provides initial evidence that individual differences in affect-related decision-making are associated with symptom severity, we acknowledge that longitudinal designs with repeated assessments would strengthen causal and temporal inferences.

      Reviewer #2 (Public review):

      Summary:

      Binge eating is often preceded by heightened negative affect, but the specific processes underlying this link are not well-understood. The purpose of this manuscript was to examine whether affect state (neutral or negative mood) impacts food choice decision-making processes that may increase the likelihood of binge eating in individuals with bulimia nervosa (BN). The researchers used a randomized crossover design in women with BN (n=25) and controls (n=21), in which participants underwent a negative or neutral mood induction prior to completing a food-choice task. The researchers found that despite no differences in food choices in the negative and neutral conditions, women with BN demonstrated a stronger bias toward considering the 'tastiness' before the 'healthiness' of the food after the negative mood induction.

      Strengths:

      The topic is important and clinically relevant, and the methods are sound. The use of computational modeling to understand nuances in decision-making processes and how that might relate to eating disorder symptom severity is a strength of the study.

      Weaknesses:

      Sample size was relatively small, and participants were all women with BN, which limits generalizability of findings to the larger population of individuals who engage in binge eating. It is likely that the negative affect manipulation was weak and may not have been potent enough to change behavior. These limitations are adequately noted in the discussion.

      We are grateful to Reviewer #2 for their careful and supportive review of our manuscript. We appreciate their recognition that computational modeling can reveal nuanced alterations in decision-making processes that may not be apparent in overt behavioral choices. Their balanced assessment of both the strengths and limitations of our work has been helpful in contextualizing our findings appropriately. We have carefully considered their comments regarding sample size and the potential limitations of our mood induction procedure, both of which we discuss in detail in the manuscript's limitations section.

      Reviewer #3 (Public review):

      Summary:

      The study uses the food choice task, a well-established method in eating disorder research, particularly in anorexia nervosa. However, it introduces a novel analytical approach, the diffusion decision model, to deconstruct food choices and assess the influence of negative affect on how and when tastiness and healthiness are considered in decision-making among individuals with bulimia nervosa and healthy controls.

      Strengths:

      The introduction provides a comprehensive review of the literature, and the study design appears robust. It incorporates separate sessions for neutral and negative affect conditions and counterbalances tastiness and healthiness ratings. The statistical methods are rigorous, employing multiple testing corrections.

      A key finding, that negative affect induction biases individuals with bulimia nervosa toward prioritizing tastiness over healthiness, offers an intriguing perspective on how negative affect may drive binge eating behaviors.

      Weaknesses:

      A notable limitation is the absence of a sample size calculation, which, combined with the relatively small sample, may have contributed to null findings. Additionally, while the affect induction method is validated, it is less effective than alternatives such as image or film-based stimuli (Dana et al., 2020), potentially influencing the results.

      We are grateful to Reviewer #3 for their thoughtful evaluation of our work. We appreciate their recognition that the diffusion decision model provides a novel analytical lens for understanding how negative affect influences the dynamics of food-related decision-making in bulimia nervosa. Their balanced assessment of both the methodological strengths of our design (counterbalancing, rigorous statistical corrections) and its limitations (sample size, mood induction efficacy) has been valuable in ensuring we appropriately contextualize our findings and their implications. Specifically, we have taken their comments regarding sample size and the relative efficacy of different mood induction methods seriously, and we address these important methodological considerations in our discussion of the study's limitations.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      The authors have addressed my previous comments, and I do not have any additional suggestions for improvement.

      We thank the reviewer for their time, effort, and insightful feedback.

      Reviewer #3 (Recommendations for the authors):

      The authors have adequately addressed my feedback. I have no further comments.

      We thank the reviewer for their time, effort, and insightful feedback.

    1. Author response:

      eLife Assessment

      Hoverflies are known for their sexually dimorphic visual systems and exquisite flight behaviors. This valuable study reports how two types of visual descending neurons differ between males and females in their motion- and speed-dependent responses, yet surprisingly, the behavior they control lacks any sexual dimorphism. The results convincingly support these findings, which will be of interest for studies of visuomotor transformations and network-level brain organization.

      This statement perfectly recapitulates our findings.

      Public Reviews:

      Reviewer #1 (Public review):  

      Summary: 

      Hoverflies are known for a striking sexual dimorphism in eye morphology and early visual system physiology. Surprisingly, the male and female flight behaviors show only subtle differences. Nicholas et al. investigate the sensori-motor transformation of sexually dimorphic visual information to flight steering commands via descending neurons. The authors combined intra- and extracellular recordings, neuroanatomy, and behavioral analysis. They convincingly demonstrate that descending neurons show sexual dimorphisms - in particular at high optic flow velocities - while wing steering responses seem relatively monomorphic. The study highlights a very interesting discrepancy between neuronal and behavioral response properties.

      Thank you for this summary. Most of the statement perfectly recapitulates the main findings of our paper. However, we want to emphasize that some hoverfly flight behaviors are strongly sexually dimorphic, especially those related to courtship and mating. Indeed, only male hoverflies pursue targets at high speed, chase away territorial intruders, and pursue females for mating. However, other flight behaviors, such as those related to optomotor responses and flights between flowers when feeding, are not sexually dimorphic. We will amend the Introduction to make the difference between flight behaviors clear.

      More specifically, the authors focused on two types of descending neurons that receive inputs from well-characterized wide-field sensitive tangential cells: OFS DN1, which receives inputs from so-called HS cells, and OFS DN2, which receives input from a set of VS cells. Their likely counterparts in Drosophila connect to the neck, wing, and haltere neuropils. The authors characterized the visual response properties of these two neuronal classes in both male and female hoverflies and identified several interesting differences. They then presented the same set of stimuli, tracked wing beat amplitude, and analyzed the sum and the difference of right and left wing beat amplitude as a readout of lift or thrust, and yaw turning, respectively. Behavioral responses showed little to no sexual dimorphism, despite the observed neuronal differences.

      Thank you for this very nice summary of our work. We want to clarify that LPTC input to DN1 and DN2 has not been shown directly in hoverflies using e.g. dye coupling, or dual recordings. Instead, the presumed HS and VS input is inferred from morphological and physiological DN evidence, and comparisons to similar data in Drosophila and blowflies. We will amend the Introduction to clarify this. The rest of the paragraph perfectly recapitulates the main findings of our paper.
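
      The behavioral readout described in the summary above (sum and difference of left and right wing beat amplitude) is simple to state in code. This is a minimal sketch with a hypothetical function name; the sign convention (left minus right) and the units are assumptions, since the review does not specify them.

```python
def wba_sum_and_difference(left_wba, right_wba):
    """Return per-frame (sum, difference) of wing beat amplitudes.

    left_wba, right_wba: equal-length sequences of amplitudes, one per frame.
    The sum is read as a proxy for lift or thrust; the difference
    (left minus right here, by assumption) as a proxy for yaw turning.
    """
    if len(left_wba) != len(right_wba):
        raise ValueError("left and right traces must have the same length")
    wba_sum = [l + r for l, r in zip(left_wba, right_wba)]
    wba_diff = [l - r for l, r in zip(left_wba, right_wba)]
    return wba_sum, wba_diff
```

      A symmetric change in both wings moves the sum and leaves the difference at zero, whereas an asymmetric change moves the difference.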

      Strengths:

      I find the question very interesting and the results both convincing and intriguing. A fundamental goal in neuroscience is to link neuronal responses and behavior. The current study highlights that the transformations - even at the level of descending neurons to motoneurons - are complex and less straightforward than one might expect.

      Thank you.

      Weaknesses:

      The authors investigated two types of descending neurons, but it was not clear to me how many other descending neurons are thought to be involved in wing steering responses to wide-field motion. I would suggest providing a more in-depth overview of what is known about hoverflies and Drosophila, since the conclusions drawn from the study would be different if these two types were the only descending neurons involved, as opposed to representing a subset of the neurons conveying visual information to the wing neuropil.

      This is a great point. There are around 1000 fly DNs, of which many could respond to widefield motion, without being specifically tuned to widefield motion. For example, many looming sensitive neurons also respond to widefield motion, and could therefore be involved in the WBA movements that we measured here. In addition, there are many multimodal neurons that could be involved in optomotor responses in free flight, but these may not have been stimulated when we only provided visual input. Furthermore, many visual neurons are modulated by proprioceptive feedback, which is lacking in immobilized physiology preps. Finally, in blowflies, up to 5 optic flow sensitive DNs have been identified morphologically, and in Drosophila 3 have been identified morphologically and physiologically. In summary, it is more than likely that other neurons project visual widefield motion information to the wing neuropil. We will amend our Introduction and Discussion to make this important point clear to the readers.

      Both neuronal classes have counterparts in Drosophila that also innervate neck motor regions. The authors filled the hoverfly DNs in intracellular recordings to characterize their arborization in the ventral nerve cord. In my opinion, these anatomical data could be further exploited and discussed a bit more: is the innervation in hoverflies also consistent with connecting to the neck and haltere motor regions? Are there any obvious differences and similarities to the Drosophila neurons mentioned by the authors? If the arborization also supports a role in neck movements, the authors could discuss whether they would expect any sexual dimorphism in head movements.

      These are all great points. We did not see any clear arborizations to the frontal nerve, where we would expect to find the neck motor neurons (NMNs). In addition, while we did see fine arborizations throughout the length of the thoracic ganglion, we saw no strong outputs projecting directly to the haltere nerve (HN). In the revised version of the MS we will modify figure 4 (morphological characterization) to clarify.

      There are important differences between the morphology of DN1 and DN2 in hoverflies and DNHS1 and DNOVS2 in Drosophila, in terms of their projections in the thoracic ganglion. For example, in Drosophila DNOVS2, there are several fine branches along the length of the neuron in the thoracic ganglia. Similarly, we found fine branches in Eristalis tenax DN2. In addition, however, we found a wide branch projecting to the area of the thoracic ganglion where the prothoracic and pterothoracic nerves likely get their inputs (Figure 4), suggesting that the neuron could contribute to controlling the wings and/or the forelegs (which is why we quantified the WBA). In Drosophila DNHS1, there is a similar fat branch to the prothoracic and pterothoracic nerves, which we also found in Eristalis tenax OFS DN1 (Figure 4). Indeed, while Drosophila DNHS1 and DNOVS2 have quite strikingly different morphology, DN1 and DN2 in Eristalis looked quite similar. We will modify the Results section to make this clear.

      To investigate this further, in the revised version of the MS we will include analysis of the movement of different body parts (including the head) to look for any potential sexual dimorphism. Unfortunately, however, this will not include the halteres, as they cannot be seen well in the videos.

      Reviewer #2 (Public review):

      Summary:

      Many fly species exhibit male-specific visual behaviors during courtship, while little is known about the circuit underlying the dimorphic visuomotor transformations. Nicholas et al focus on two types of visual descending neurons (DNs) in hoverflies, a species in which only males exhibit high-speed pursuit of conspecifics. They combined electrophysiology and behavior analysis to identify these DNs and characterize their response to a variety of visual stimuli in both male and female flies. The results show that the neurons in both sexes have similar receptive fields but exhibit speed-dependent dimorphic responses to different optic flow stimuli.

      This statement perfectly recapitulates the main findings of our paper. However, as mentioned above, while hoverfly flight behaviors related to courtship and mating are strongly sexually dimorphic, other flight behaviours, such as those related to optomotor responses and flights between flowers when feeding, are not. We will amend the Introduction to make the difference between flight behaviors clear.

      Strengths:

      Hoverflies, though not a common model system, show very interesting dimorphic behaviors and provide a unique and valuable entry point to explore the brain organization behind sexual dimorphism. The findings here are not only interesting in their own right but will also likely inspire those working in other systems, particularly Drosophila.

      Thank you.

      The authors employed rigorous morphology, electrophysiology, and behavior methods to deliver a comprehensive characterization of the neurons in question. The precision of the measurements allowed for identifying a subtle and nuanced neuronal dimorphism and set a standard for future work in this area.

      Thank you.

      Weaknesses:

      Cell-typing using receptive field preferred directions (RFPDs): if I understood correctly, this classification method mostly relies on the LPDs near the center of the receptive field (median within the contour in Fig.1). I have two concerns here. First, this method is great if we are certain there are only two types of visual DNs as described in the manuscript. But how certain is this? Given the importance of vision in flight control, I would expect many DNs that transmit optic flow information to the motor center. I'd also like to point out that there are other lobula plate tangential cells (LPTCs) than HS and VS cells, which are much less studied and could potentially contribute to dimorphic behaviors.

      This is very true, and an important point. As mentioned above, in blowflies, up to 5 optic flow sensitive DNs have been identified morphologically; however, whether these correspond to 5 different physiological types remains unclear. In both blowflies and Drosophila, 3 have been identified morphologically and physiologically (DNHS1, DNOVS1, DNOVS2). Importantly, in both blowflies and fruit flies DNOVS1 gives graded responses, and no action potentials, meaning that we would not be able to record from it using extracellular electrophysiology.

      We previously used clustering techniques to show that in Eristalis, we can reliably distinguish two types of optic flow sensitive DNs from extracellular electrophysiological data, based on a range of receptive field parameters, and we think that these correspond to DNHS1 and DNOVS2 in Drosophila (Nicholas et al, J Comp Physiol A, 2020, cited in paper). As mentioned above in response to Reviewer 1, this does not mean that there are no other neurons that could respond to widefield optic flow, and which might be involved in the WBA we recorded in the paper. However, the point of this paper was not to conclusively show that there are only two optic flow sensitive descending neurons. The point was to say that there are two quite distinct optic flow sensitive neurons that have similar receptive fields in males and females, while the responses to widefield motion show differences between males and females.

      We will modify the Introduction and Discussion to make these important points clear to the Reader, including the discussion of the 45-60 LPTCs that exist in the lobula plate, and what their role might be.

      Second, this method feels somewhat impoverished given the richness of the data. The authors have nicely mapped out the directional tuning for almost the entire visual field. Instead of reducing this measurement to 2 values (center and direction), I was wondering if there is a better method to fully utilize the data at hand to get a better characterization of these DNs. As the authors are aware, local features alone can be ambiguous in characterizing optic flows. What's more, taking into account more global features can be useful for discovering potentially new cell types.

      This is a great point, and we did an extensive analysis of other receptive field properties in this study (shown in supp fig 1). In addition, and as mentioned above, we have published a clustering analysis across receptive field properties of these neurons (Nicholas et al, J Comp Physiol A, 2020, cited in paper). The point that we attempted to make in this paper was that by using two strikingly simple metrics, we can reliably distinguish which of the two neuron types we are recording from (if we accept that there are two main types that we are likely to record from) simply based on location and overall directional preference. This makes automated analysis very easy and straightforward. Indeed, we now use this routinely to ID what neuron we are recording from, rather than making a human-based assumption.

      However, we agree that further in-depth analysis is warranted. To address this, we will provide additional receptive field analysis and clustering in the revised version of the MS. In addition, we want to highlight that all data is uploaded to DataDryad for anyone interested in doing additional in-depth analyses.

      Line 131, it wasn't clear to me why full-screen stimuli were used for comparison here, instead of the full receptive field maps. Male flies exhibit sexually dimorphic behaviors only during courtship, which would suggest that small-sized visual stimuli (mimicking an intruder or female conspecific) would be better suited to elicit dimorphic neuronal responses. A similar comment applies to the later results as well. Based on the receptive field mapping in Figure 1, I'm under the impression that these 2 DN types are more suited to detect wide-field optic flows, those induced by self-motion as mentioned in the manuscript. The results are still very interesting, but it's good to make this point clear early on to help set appropriate expectations. Conversely, this would also suggest that there are other visual DN types that are responsible for the courtship-related sexually dimorphic behaviors.

      Thank you for mentioning these important points. Our reasoning for using full-screen stimuli for the analysis on line 131 was that since we used the small sinusoidal gratings for mapping the receptive fields, and to subsequently classify the neurons, it would be unfair to use the same data to investigate potential sexual dimorphism. I.e., we selected neurons that fulfilled certain criteria, and then we cannot rightfully use the same criteria to determine differences. This was not explicitly mentioned in the paper, so we will modify the text to make this clear to the Reader.

      However, in Supp Figure 1d/e we show that there are no striking receptive field differences between males and females in terms of receptive field center or directional preference. In Supp Figure 1f we show that there is no difference between male and female receptive field height and width. We will modify the text to draw the Reader’s attention to this figure, and also mention the additional analysis done in response to the comment above.

      As a side note, I personally expected at least DNHS1 to have a smaller receptive field in males, as the hoverfly HSN is strikingly sexually dimorphic (Nordström et al, Curr Biol 2008), and also very sensitive to small objects. However, while optic flow sensitive DNs do respond to small objects (see e.g. the J Comp Physiol paper mentioned above) we did not detect any obvious sexual dimorphism in receptive field properties. Indeed, we think that a different subset of DNs control target pursuit behavior (target selective DNs (TSDNs)). This will be addressed in the modified version of the paper.

    1. Reviewer #1 (Public review):

      Summary:

      The dysgranular retrosplenial cortex (RSD) and hippocampus both encode information related to an animal's navigation through space. Here, the authors study the different ways in which these two brain regions represent spatial information when animals navigate through interconnected rooms. Most importantly, they find that the RSD contains a small fraction of neurons that encode properties of interconnected rooms by firing in different head directions within each room. This direction is shifted by 180 degrees in 2-room environments, and by 90 degrees in 4-room environments. While it cannot be definitively proven that this encoding is not just related to the presence of exits (doors) in each room, this is a noteworthy finding and will motivate further study in more complex and well-controlled environments to understand this coding scheme in the RSD. The recordings and analyses used to identify these multi-directional cells are mostly solid. Additional conclusions regarding the rotational symmetry across rooms seen in the RSD neurons that do not encode direction (representing the majority of RSD neurons) remain incomplete, given the evidence presented thus far. The differences between RSD and hippocampus encoding of space are clear and consistent with prior observations.

      Strengths:

      (1) Use of tetrode recordings from the RSD to identify multi-direction cells that only encode one direction in each room, but shift the preferred direction by either 180 or 90 degrees depending on the number of rooms in the environment.

      (2) Solid controls to show that this multi-direction encoding is stable over time and across some environmental manipulations.

      (3) Convincing evidence that these multi-direction cells can co-exist with single-direction head direction cells in the RSD (as both cell types can be simultaneously recorded).

      (4) Convincing evidence for clear differences between directional and spatial encoding in the RSD versus hippocampus, consistent with prior observations.

      Weaknesses:

      (1) The paper mostly uses the term "retrosplenial cortex", but it is important to clarify that the study is only focused on the dysgranular retrosplenial cortex (RSD; Brodmann Area 30) and not the granular retrosplenial cortex (Brodmann Area 29). These are two distinct regions (despite the similar names), each with distinct connectivity and distinct behavioral encoding and function, so it is important to clarify in the abstract and title that the present study is solely about the RSD to prevent confusion in the literature.

      (2) The proportion of each observed cell type is not clearly stated, although it is clear that the multi-directional cells are in the minority. Having the proportion of well-isolated neurons in distinct sessions that encode each type of information (e.g., multi vs single direction encoding) would greatly aid the interpretation of the result and help the field know how common each cell type is in the RSD.

      (3) The authors state that "MDCs [multi-directional cells] never exhibited multidirectional activity within a single room" - but many of the single room examples from the 4-room environment (shown in Figures 2E and 2F) reveal multi-peaked directional encoding. This suggests that the multi-direction encoding may be more compatible with encoding some property of the number of exits rather than relative room orientations.

      (4) The spatial rotation analyses of non-directional cells are considered incomplete. They are affected by the slower speed at the doors and hence altered firing rates (as evidenced in spatial rate plots). The population rate is not relevant as the correlational analyses are done on a single cell level. Since some cells fire more with increasing speed and others fire less, that will necessarily result in a population rate map that minimizes firing rate differences near the doorway, where the animals move more slowly. But on a single cell level, that reduced speed is having a big effect, as evidenced by individual rate map examples, and the rooms will need to be rotated to obtain a higher correlation by overlapping the doorway regions. This does not necessarily say anything about spatial coding across the two or four interconnected rooms being rotationally symmetric, and it would appear difficult to draw any conclusions related to spatial encoding from those analyses.
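
      The single-cell rotation comparison discussed in point (4) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function names, the restriction to square rate maps, and the random re-pairing of cells used as a chance level are all assumptions made for the example. It shows only how a rotated rate map correlation and a shuffle-based chance distribution might be computed; it does not by itself remove the doorway speed confound described above.

```python
import numpy as np

def map_correlation(a, b):
    """Pearson correlation of two rate maps, ignoring bins that are NaN in either map."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    ok = np.isfinite(a) & np.isfinite(b)
    if ok.sum() < 3 or a[ok].std() == 0 or b[ok].std() == 0:
        return np.nan
    return float(np.corrcoef(a[ok], b[ok])[0, 1])

def rotated_correlations(map_a, map_b):
    """Correlate map_a with map_b rotated by 0, 90, 180 and 270 degrees.

    Assumes square maps, so every rotation keeps the same shape.
    """
    return [map_correlation(map_a, np.rot90(map_b, k)) for k in range(4)]

def shuffled_chance(maps_a, maps_b, k, n_shuffles=1000, seed=0):
    """Chance distribution of the mean correlation at rotation k.

    Each cell's room-A map is paired with the room-B map of a randomly
    chosen cell (random re-pairing; a cell may occasionally be paired
    with itself).
    """
    rng = np.random.default_rng(seed)
    n = len(maps_a)
    out = np.empty(n_shuffles)
    for s in range(n_shuffles):
        perm = rng.permutation(n)
        out[s] = np.nanmean([
            map_correlation(maps_a[i], np.rot90(maps_b[j], k))
            for i, j in enumerate(perm)
        ])
    return out

# Synthetic demonstration: room-B maps are rotated, noisy copies of room-A maps.
rng = np.random.default_rng(0)
maps_a = [rng.random((10, 10)) for _ in range(20)]
maps_b = [np.rot90(m, -1) + 0.1 * rng.standard_normal((10, 10)) for m in maps_a]
observed = [
    np.nanmean([rotated_correlations(a, b)[k] for a, b in zip(maps_a, maps_b)])
    for k in range(4)
]
chance = shuffled_chance(maps_a, maps_b, k=int(np.argmax(observed)), n_shuffles=200)
```

      Comparing the observed mean correlation with a percentile of `chance` gives one possible reference level. A speed-matched control would still be needed to address the reviewer's concern.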

    2. Reviewer #2 (Public review):

      Summary:

      Laurent et al. perform in vivo electrophysiological recordings in the retrosplenial cortex of rats foraging in multi-compartment environments with either identical or unique visual features. The authors characterize two types of directional signals in the area that they have previously reported: classic head direction cells anchored to the global allocentric reference frame and multi-direction cells (MDCs), which have a rotationally preserved directional field anchored to local compartments. The primary finding of this work is that MDCs seem sensitive to local environmental geometry rather than visual context. They also show that MDC tuning persists in the absence of hippocampal place field repetition, further dissociating the RSC local directional signal from the broader allocentric representation of space. A novel observation is that RSC non-directional spatial signals are anchored to the local environment, which could and should be explored further. While the data is solid and the analyses are mostly appropriate, the primary findings are incremental, and more interesting novel claims are not explored in detail or not explicitly tested.

      Strengths:

      The environmental manipulations clearly demonstrate that tuning is not modulated by complex visual information.

      The finding that RSC two-dimensional spatial responses are stable and anchored to environmental features is novel and can be further explored in future work.

      Weaknesses:

      The observation that BDCs and MDCs are insensitive to visual context builds upon the authors' previous work (and replicates aspects of Zhang et al., 2022) but leaves many open questions that are not addressed with the current set of experiments. Specifically, what exactly are MDCs anchoring to? The primary theory is that they anchor to environmental geometry, but there are no explicit experimental manipulations to test this theory. It is important to note that 2- and 4-compartment environments share many features, including the same cardinal axes, making any differences/similarities in these two conditions difficult to interpret.

      The main finding presented with respect to BDC/MDC tuning is that they are not sensitive to visual context as manipulated by distinct visual patterns on the wall and floor in multicompartment environments. One could argue that the individual rooms are, in actuality, quite similar in low-level visual features - each possesses a large white background square visual feature on a single wall with a fixed relationship to the door(s). How can the authors rule out that i) BDC/MDC responses are modulated by these low-level features rather than geometry and/or ii) that the rats are not paying attention to any visual features at all? There is no task requiring them to indicate which room they are in. Furthermore, the doorways themselves are prominent visual features that are present in each context. It would be interesting to see if MDC/BDC tuning persisted in a square room where the number of doorways was manipulated to rule out this possibility.

      A strong possibility is that the rotational symmetry of both MDCs and non-directional spatial neurons is related to 1) door-related firing, 2) stereotyped movement, and 3) stereotyped directional sampling. In Supplemental Figure 8, the authors begin to address this by comparing a 'population ratemap' to a 'population speed map.' I do not think this is sufficient, and it is difficult to interpret. Instead, the authors should assess whether MDCs and BDCs fire more at doorways and what the overlap is with the speed-modulated cells they report. Moreover, they should assess whether the spatial speed profile itself is rotationally symmetric within each session. It would also be useful to look at the confluence of the variables simultaneously using some form of regression analysis. The authors could generate a directional predictor that captures the main response property of these cells and see if it accounts for greater variability in spiking than speed or x,y position. Finally, rotationally symmetric directional sampling biases could arise from the doors being present on the same two walls in each room. The authors should assess whether MDC tuning is still present if directional sampling is randomly downsampled to match directional observations in each compartment.
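
      The downsampling control suggested at the end of the paragraph above could look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' method: the function name `match_direction_sampling`, the bin count, and the synthetic head-direction samples are hypothetical. The idea is to keep, in every head-direction bin, only as many samples per compartment as the least-sampled compartment provides, so that all compartments end up with identical directional occupancy.

```python
import numpy as np

def match_direction_sampling(hd_by_room, n_bins=36, seed=0):
    """Randomly downsample so that all rooms share the same head-direction occupancy.

    hd_by_room : list of 1-D arrays of head direction in radians, one per room.
    Returns a list of sorted index arrays (one per room) of the retained samples.
    """
    rng = np.random.default_rng(seed)
    edges = np.linspace(0.0, 2.0 * np.pi, n_bins + 1)
    # Bin index (0 .. n_bins-1) of every sample in every room.
    bins = [np.digitize(np.mod(hd, 2.0 * np.pi), edges[1:-1]) for hd in hd_by_room]
    counts = np.array([np.bincount(b, minlength=n_bins) for b in bins])
    # Per bin, the number of samples available in the least-sampled room.
    target = counts.min(axis=0)
    keep = []
    for b in bins:
        chosen = [
            rng.choice(np.flatnonzero(b == k), size=int(target[k]), replace=False)
            for k in range(n_bins)
            if target[k] > 0
        ]
        if chosen:
            keep.append(np.sort(np.concatenate(chosen)))
        else:
            keep.append(np.array([], dtype=int))
    return keep

# Synthetic example: one room sampled uniformly, one biased towards pi radians.
rng = np.random.default_rng(1)
rooms = [rng.uniform(0.0, 2.0 * np.pi, 5000), rng.vonmises(np.pi, 1.0, 3000)]
keep = match_direction_sampling(rooms, n_bins=12)
matched = [hd[idx] for hd, idx in zip(rooms, keep)]
```

      Tuning curves recomputed from the retained samples (and the spikes that fall on them) can then be compared across compartments. Repeating the procedure with several seeds would give a sense of how much the result depends on the particular random subset.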

      Recent work has demonstrated that neurons with egocentric corner or boundary tuning are observed in RSC. The authors do not address whether egocentric tuning contributes to MDC signals. An explicit analysis of the relationship and potential overlap of MDC and egocentric populations is warranted.

      Many of the MDCs presented in the main figures are not especially compelling. This includes alterations to MDC tuning in Figure 2, which is a key datapoint. The authors should show significantly more (if not all) examples of MDCs in each environment. It would similarly be useful to see all/more examples of non-directional spatially tuned neurons with rotationally symmetric firing patterns.

      "One might hypothesize that specific environmental cues, such as door orientation or landmark positioning, drive these tuning shifts. However, our results argue against this interpretation. In four-room environments, each room had multiple entry points, yet MDCs never exhibited multidirectional activity within a single room."

      I do not understand the logic here. Can the authors unpack this? Also, it is clear that some of the example cells have more than one peak in individual compartments. How is this quantified?

    3. Reviewer #3 (Public review):

      Summary:

      The authors examine firing of dysgranular retrosplenial cortex (dRSC) neurons in relation to head orientation and location for rats exploring open-field environments. One environment utilized was a square arena with high walls that is split into two rectangular spaces connected by a doorway. Another environment is a square arena split into quadrants connected by doors near the center. For each, the different sub-spaces of the environments are either identical in terms of visual and tactile cues or different. For head direction neurons, the authors present one population where each neuron maintains a single tuning direction for the two or four sub-compartments of the two environments. A second population exhibits what is termed multi-directional firing, wherein neurons exhibit (overall) two or four head direction peaks in firing. For such neurons, firing in each of the sub-compartments is associated with only a single preferred direction, but the directions across compartments are shown to be at 180-degree (two-compartment environment) or 90-degree offsets. The offsets evidence tuning to the "same" orientation for the sub-compartments that are, in the global reference frame, oriented at 180 or 90 degree offsets. The results are similar whether or not the sub-compartments have the same or different tactile and visual cues. Thus, the first population is said to be global in its head direction tuning, while the second relates to each local environment in a way that is systematic across sub-compartments. Spatially-specific activity of another population of non-direction-tuned RSC neurons is examined, and comparisons of sub-compartment spatial firing maps suggest that spatial tuning in RSC also repeats across compartments when the firing maps for the compartments are rotated to match each other (as in physical space). Finally, a population of hippocampal "place" cells exhibited different location mapping across sub-compartments. 
      The findings are interpreted to indicate that RSC can simultaneously map orientation in both local and global reference frames, possibly forming a mechanism whereby the sub-compartments' shared geometry (given by the boundary shapes and the door locations) can be related to each other and to the global space they share.

      Strengths:

      This paper addresses an interesting problem and expands how the field will think about directional tuning.

      Weaknesses:

      It is not clear that the experimental design allows for a clear interpretation of the data. Rates for preferred tuning are low, as are ratemap correlations for spatially-tuned neurons.

      (1) It is concerning that the neurons with head direction tuning have fairly low peak firing rates (mean close to 5 Hz), where prior studies examining head direction tuning in dRSC found head direction-tuned neurons with peak rates more than an order of magnitude higher (100 Hz or more). Under circumstances where neurons are tuned well to variables other than head direction (for example, angular velocity of movement), weak head direction tuning may be observed if those other variables are not sampled equally across head directions. The manuscript contains no rigorous control for this possibility. One place to start to address this issue would be to map out variables such as angular velocity by head orientation, and to test whether such relationships also carry 90 and 180 degree offsets.

      (2) It is questionable whether describing dRSC neurons (spatial or directional) as following the sub-compartment "geometry" is an appropriate interpretation of the data. In the condition with sub-compartments carrying different tactile and visual cues, it seems that such cues pertain only to the floor of the environments. The distal visual space of the boundaries appears to be identical. One is left to wonder whether distinguishing environments according to boundary wall visual cues would lead to different results. The CA1 data does not help to rule this possibility out. A second reason to doubt the "shared geometry" interpretation is that there is no condition where sub-compartment geometry is varied. It is also the case that the sub-compartment doorways may stand as the only salient distal visual cue linking the environments. Local sensory cues and geometry seem not so disentangled in this study, but this is a major claim in the abstract.

      (3) There is some concern with the interpretation that the spatial tuning of some dRSC neurons repeats in rotated form across sub-compartments. The firing rate map correlations are very low on average (~0.2), and far lower than the population of CA1 having repeating fields across the same vs different visual/tactile cue conditions. The authors should define the chance level of ratemap correlation by shuffling neuron identities. Apologies if this is indeed the current approach, but it seems not to be (I was left a bit lost by the description in the methods). For any population of hippocampal place cells, the cross-neuron correlations of firing rate maps are typically not zero, and correlations at 0.2 would normally be evidence for remapping.

      (4) A somewhat picky point here that is not meant to claim that multi-compartment studies are not useful - the introduction states that real-world environments typically consist of multi-compartment rooms. This is certainly not true for rodents and is only sometimes true in humans.

      (5) The discussion lacks a consideration of how such dRSC output might impact the target structures of dRSC.

      (6) The discussion speaks to the idea that multi-directional neurons may aid in transitioning between contexts (sub-compartments). It is notable, however, that none of the multidirectional neurons have multi-directional tuning in all sub-compartments, whereas such firing was seen in the 2017 Nature Neuroscience study by Jacob/Jeffery. The discussion should address this difference and perhaps posit a means by which the firing of global and local head direction neurons can be related to each other to yield navigation that depends on both scales.

      (7) The authors should provide the size of the smoothing function for spatial firing rate maps.

      (8) The authors should devise a measure to define directional tuning in 4 directions (with 90-degree offsets).

      (9) Figures 2D and 2H - The offsets in preferred tuning across sub-compartments are rather variable.
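
      One possible way to approach the measure requested in point (8) is sketched below. It is an assumption for illustration, not a method taken from the manuscript: it extends the angle-doubling approach that circular statistics uses for bidirectional data by multiplying the angles by n (here n = 4) before computing the rate-weighted mean resultant length. The function name and the synthetic tuning curves are hypothetical.

```python
import numpy as np

def nfold_vector_length(rates, n):
    """Rate-weighted mean resultant length after multiplying angles by n.

    rates : firing rate per head-direction bin, bins evenly spaced over 360 degrees.
    n     : symmetry order (1 = unidirectional, 2 = bidirectional, 4 = four-fold).
    Returns a value between 0 and 1.
    """
    rates = np.asarray(rates, dtype=float)
    theta = np.linspace(0.0, 2.0 * np.pi, rates.size, endpoint=False)
    total = rates.sum()
    if total <= 0:
        return 0.0
    return float(np.abs(np.sum(rates * np.exp(1j * n * theta))) / total)

# Synthetic tuning curves: four peaks at 90-degree offsets, and a single peak.
theta = np.linspace(0.0, 2.0 * np.pi, 360, endpoint=False)
peaks = [0.0, 0.5 * np.pi, np.pi, 1.5 * np.pi]
four_peak = sum(np.exp(8.0 * np.cos(theta - mu)) for mu in peaks)
one_peak = np.exp(8.0 * np.cos(theta))

r4_four = nfold_vector_length(four_peak, 4)  # clearly above zero
r1_four = nfold_vector_length(four_peak, 1)  # near zero: the four peaks cancel
r1_one = nfold_vector_length(one_peak, 1)
r4_one = nfold_vector_length(one_peak, 4)
```

      A sharply tuned unidirectional cell also produces a non-zero four-fold value, so the n = 4 length would need to be read together with the n = 1 and n = 2 lengths (a four-fold cell should score high at n = 4 and low at n = 1 and n = 2), and compared against a shuffled distribution.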

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript presents a tunable Bessel-beam two-photon fluorescence microscopy (tBessel-TPFM) platform that enables high-speed volumetric imaging with stable axial focus. The work is technically strong and broadly significant, as it substantially improves the flexibility and practicality of Bessel-beam-based two-photon microscopy. The demonstrations are generally strong and bridge a wide range of neuroimaging applications, namely vascular dynamics, neurovascular coupling, optogenetic perturbation, and microglial responses. These convincingly show that the approach enables biological measurements that are difficult or impractical with existing methods.

      The evidence supporting the technical and biological claims is generally strong. The optical design is carefully motivated, clearly described, and validated through a combination of simulations and experimental characterization. The biological applications are diverse and well chosen to highlight the strengths of the proposed method, and the data are of high quality, with appropriate controls and comparative measurements where relevant.

      Strengths:

      (1) The optical innovation addresses a well-recognized limitation of existing Bessel-TPFM implementations, namely axial focus drift during tuning, and does so using a relatively simple, light-efficient, and cost-effective design.

      (2) The manuscript provides convincing experimental evidence for this being a versatile platform to map flow dynamics across diverse vessel sizes and orientations in both healthy and pathological states.

      (3) Biological demonstrations are comprehensive and span multiple domains such as hemodynamics, neurovascular coupling, and neuroimmune responses.

      (4) Quantitative analyses of blood flow across vessel sizes and orientations, including kilohertz line scanning, are particularly compelling and clearly beyond the reach of standard Gaussian TPFM.

      (5) Particular advantages are that higher blood flow speeds become measurable, up to 23 mm/s (20x more than conventional frame scanning), and that simultaneous (Bessel-)imaging and (Gaussian-)perturbation are possible because of the stable axial focus.

      Weaknesses:

      (1) At present, the paper does not properly position the new Bessel-beam method against previous work, and fails to compare it to alternative fast volumetric imaging methods without Bessel beams.

      (2) The cost-effectiveness of the proposed method is not well described or supported by evidence; it would be useful to include more detail or remove this claim.

      (3) Some biological conclusions, e.g., regarding novel features of microglial dynamics (i.e., the observed two-wave responses and coordinated extension-retraction), are based on relatively limited sample size and would benefit from clearer discussion of variability across animals and fields of view.

      (4) The use of neural network-based denoising for microglial imaging is reasonable but introduces potential concerns about trustworthiness; additional clarification of validation or failure modes would strengthen confidence in these results.

      To conclude, most of the authors' claims are well supported by the data. The central conclusion, namely that tBessel-TPFM provides tunable volumetric imaging enabling experiments not feasible with existing two-photon approaches, is justified. Some biological interpretations would benefit from a more cautious framing, but they do not undermine the main technical and methodological contributions of the study. This is a strong and technically rigorous manuscript that makes a substantial methodological advance with clear relevance to neuroscience and intravital imaging. Minor clarifications and a slightly more measured discussion of certain biological findings are recommended.

    2. Reviewer #3 (Public review):

      Summary:

      The manuscript presents an elegant and cost-effective approach for generating a tunable Bessel beam on a conventional two-photon microscope. The authors assemble a compact optical module comprising three axicons and a series of lenses that permits rapid adjustment of both lateral resolution and axial extent without modifying the focal plane. This flexibility enables the system to be readily adapted to a variety of biological preparations. As a proof of concept, the authors employ the device to record blood flow velocities in cortical microcapillaries, arterioles, and venules, thereby directly visualizing vasodilatation and vasoconstriction dynamics and permitting quantitative analysis of neurovascular coupling across cortical layers in awake mice.

      The authors demonstrate that the tunability of the Bessel beam can be exploited to match the numerical aperture to the vessel type: a high NA configuration, albeit slower scan, is optimal for resolving flow in capillaries, whereas a low NA setting provides faster acquisition suitable for arterioles and venules. By implementing a one-dimensional line scan with the Bessel beam, they achieve an imaging speed that is twentyfold faster than conventional frame-by-frame scanning, which proves sufficient to capture hemodynamic transients before and after an induced ischemic stroke.

      In addition to pure observation, the authors integrate a co-propagating Gaussian line into the system, allowing simultaneous imaging and photostimulation within the same focal plane. This capability addresses a common limitation of other Bessel beam implementations, in which the observation and perturbation planes often become misaligned when the Bessel beam is altered. The manuscript also emphasizes the advantage of Bessel beam excitation for calcium imaging after a perturbation, because it captures neuronal activity in planes both above and below the nominal focal plane, signals that would be missed with a standard Gaussian focus. Finally, the authors apply the technique to investigate the neuroimmune response following targeted microglial ablation; they report that adjacent microglia extend processes toward the injury site while retracting processes in the opposite direction.

      Overall, the work offers a technically straightforward yet powerful extension to existing two-photon platforms, providing high-speed, volumetric imaging and stimulation capabilities that are well-suited to a broad range of neurovascular and neuroimmune studies. The experimental validation is quite thorough, and the presented data convincingly illustrate the benefits of the approach.

      Strengths:

      The authors present a truly clever and inexpensive optical module that can be integrated into almost any two-photon microscope, providing a tunable Bessel beam with a minimal modification of the existing system. The experimental data and accompanying quantitative analysis convincingly demonstrate that the system can reveal physiological events, such as capillary flow, calcium transients across multiple axial planes, and microglial process dynamics, that are difficult or impossible to capture with a conventional Gaussian beam. The breadth of experiments chosen for the manuscript illustrates the practical utility of the device and supports the authors' conclusions that it extends the functional repertoire of standard two-photon microscopy.

      Weaknesses:

      The manuscript would benefit from a more detailed contextualisation of the claimed speed advantage. Although the authors mention other techniques in the introduction, they do not provide any direct comparison with other state-of-the-art high-speed two-photon approaches such as light beads microscopy (Demas et al., Nat. Methods 2021), temporal multiplexing schemes (Weisenburger et al., Cell 2019), or random access microscopy (Villette et al., Cell 2019). A brief comparison of imaging speed, spatial resolution, and instrumental complexity would enable readers to assess the relative merits of the present method.

      A second limitation that warrants discussion is the inherent trade-off between volumetric coverage and image specificity. Because the Bessel beam excites fluorescence throughout an extended axial range, the detector inevitably integrates signal from a three-dimensional volume into a two-dimensional image. In densely labelled tissue, this can lead to significant signal crosstalk, reducing contrast and complicating quantitative interpretation. A brief analysis of how labeling density affects the fidelity of flow or calcium measurements, or suggestions for mitigating crosstalk (e.g., computational deconvolution, adaptive excitation shaping, or combinatorial sparse labeling), would broaden the applicability of the technique.

    1. Reviewer #2 (Public review):

      Summary:

      This is a very interesting paper bringing new and important information about the poorly understood rhodopsin 7 photoreceptive molecule. The very ancient origin of the gene is revealed in addition to data supporting a signaling pathway that is different from the one known for the canonical rhodopsins. Precise expression data, particularly in the optic lobe of the fly, as well as clear behavioral phenotypes in responses to light changes, make this study a strong contribution to the understanding of the still-debated function of rhodopsin 7.

      Specific comments

      (1) Title and abstract: Contribution of Rh7 to circadian clock regulation

      (a) It is not that clear to me what rhodopsin does in terms of circadian regulation (even though its function might be circadianly regulated). The clear role in the light/dark distribution of activity might not be circadian per se, but mostly light/dark-driven, and there is no evidence here for a role in the entrainment of the clock.

      (b) The authors should cite Lazopulo, which nicely shows that Rh7 has an important role in peripheral neurons to allow flies to escape from blue light (see below).

      (2) Figure 2 C

      The finding that Galphaz, but not Galphaq, can trigger signaling from light-excited Rh7 is very intriguing and helps to better understand Rh7 function. Since Galphaz is related to Gi/o, it would be interesting to test those, for example, by expressing RNAi with Rh7-gal4 and testing the light-dark or light-off response behavior.

      (3) Figures 3-4

      The change in the locomotor activity distribution between light and dark in LD conditions provides a nice assay for Rh7 function. Since Lazopulo et al. (2019) have shown that wild-type but not Rh7 mutants do escape from blue light, it would be important to compare and discuss these LD behavior data with the Lazopulo results. Precisely, is this nighttime preference linked to blue light?

      The expression data are really nice and show that Rh7 is mostly a non-retinal photoreceptor. However, the paper would be strongly reinforced by correlating this with the LD behavior. The LD phenotype should be tested in flies with Rh7 expression rescued under Rh7gal4 control (as done for the startle response). This is important to show whether the expression pattern is likely responsible for the described Rh7 function in LD. If L5 and/or M11 drivers are available, they should be used to rescue Rh7. Since expression in some clock neurons is shown, the rescue experiment should also be done with a clock neuron driver.

      In the same line, can the LD phenotype (or startle response phenotype of Figure 4) be restored by expressing Rh7 under ppk control, as shown for the blue light avoidance phenotype by Lazopulo et al?

      Finally, the Rh7 "darkfly" rescued flies should be tested in LD.

    2. Reviewer #3 (Public review):

      Summary:

      While our knowledge regarding visual opsins is largely very good, a lot more uncertainty exists around the role of non-visual opsins. Using the power of the Drosophila melanogaster model system, Kirsh et al. investigate the role of the non-visual opsin Rhodopsin7 (Rh7). Expression analysis, based on Rh7-Gal4>UAS-GFP and HCR in situ staining, reveals strong expression in the optic lobes and somewhat weaker, but nevertheless extensive, expression in the brain. An investigation of motor activity reveals that loss of function leads to an altered day and night rhythm, specifically decreasing activity during the dark phase. These flies also showed a weaker, though still present, light-induced startle response and deficiencies in the optomotor response. To further investigate how Rh7 may modulate these responses, inspired by the Dark line of flies (which was kept in the dark for ~1400 generations and has accumulated C-terminal related losses), the authors conducted rescues with an intact and a C-terminal-deficient Rh7 and were able to pinpoint that region as an important driver of related behavioral shifts. These findings are particularly intriguing as Rh7 represents an ancient opsin with phylogenetic and mechanistic parallels to mammalian melanopsin.

      Strengths:

      The paper is well-written and contains high-quality data with appropriate sample sizes, and the conclusions are well supported.

      Weaknesses:

      No weaknesses were identified by this reviewer, but the following recommendations are made:

      (1) The authors should clarify exactly what tissues were taken for the comparative qPCR. This is particularly interesting in terms of the retina. Since Rh7 appears not to be expressed within the photoreceptor cells of the retina, this raises the important question as to which cells it is expressed in. To address this important question, it would also be helpful to include an expression analysis of the retina itself (by extending the Rh7-GFP expression patterns and/or adding HCR in situ of the ommatidial array). The cell types of the retina are very well classified, and some evidence already exists for Rh7 expression in support cells (e.g., Charlton-Perkins et al., (2017); PMID: 28562601). This study has a unique opportunity to investigate this further by adding these critical data for a more complete picture of Rh7.

      (2) Mammalian opsins should be included in the phylogenetic analysis illustrated in Figure 2A, and their position on the tree should be indicated. This will allow readers to better put the authors' statements regarding the intermediate position of Rh7 into perspective. In addition, note that the distinction between red and deep red is easy to miss regarding the Rh7 cluster. Perhaps the authors could use a more distinct colour scheme, for example, orange and deep red.

      (3) More details should be provided on the optomotor response experiments. Specifically, the frequencies used for the optomotor response should be specified. Results show a relatively large level of variation, which may be due to different angular perspectives that flies may have had while viewing the stimulus. If possible, provide videos as examples, as they will make it clearer to viewers how much flies could move around in the setup (from the methods, it seems they could move within the 2.2 of the 3 cm diameter of the arena, which would lead to substantial differences in the visual angle of the viewed grating).

    1. I will tell you that you don’t need an introductory paragraph, at least not of the 1) topic sentence 2) structural methodology 3) thesis statement variety that we were all taught in high school.

      In my opinion, the author structures her thoughts so well because she is one of those organized people who also has a gift for writing. If I tried to do this, it would come out the way I talk: the whole essay would be all over the place and the reader would be confused, which is why I need to stick to the traditional essay form.

    1. W.I. Thomas’s notable Thomas theorem which states, “If men define situations as real, they are real in their consequences”

      This sentence shows how people’s beliefs can shape reality, even if those beliefs are not objectively true. It helps explain why labels, expectations, and assumptions can strongly influence behavior and outcomes in everyday life.

    2. It is important to note that status refers to the rank in social hierarchy, while role is the behavior expected of a person holding a certain status.

      It explains the difference between status and role by showing that status is a person's position in society, while a role is how a person is expected to act because of that position.

    3. Even if you’re not consciously trying to alter your personality, your grandparents, coworkers, and date probably see different sides of you.

      A person may act a little differently with different people. For example, you may be more polite with grandparents, more professional at work, and more relaxed on a date, even though you are still the same person.

    1. The Examined Life is Wise Living: The Relationship Between Mindfulness, Wisdom, and the Moral Foundations. Published in: Journal of Adult Development, Dec 2020, Academic Search Complete. By: Verhaeghen, Paul

      This correlational study of two independent samples (260 college students and 173 Mechanical Turk workers aged 21–74) examined whether and how mindfulness (broadly construed as a manifold of self-awareness, self-regulation, and self-transcendence) influences wisdom about the self (Adult Self-Transcendence Inventory and Self-Assessed Wisdom Scale) and wisdom about the (social) world (Three-Dimensional Wisdom Scale), and how mindfulness and wisdom impact ethical sensitivities (the five moral foundations). Mindfulness predicted wisdom about the self, and wisdom about the self was linked to an emphasis on the individualizing moral foundations of care/harm avoidance and fairness and, to a lesser degree, on the binding moral foundations of loyalty, authority, and purity. Wisdom about the (social) world was not associated with either mindfulness or the moral foundations. Age was a significant positive predictor for wisdom about the self once the self-awareness component of mindfulness was taken into account.

      Keywords: Wisdom; Mindfulness; Moral foundations; Ethics

      This paper investigates the links between trait mindfulness, wisdom, and ethical sensitivities (operationalized as sensitivity to the five moral foundations) in two independent samples, one of college students and one of adults spanning ages 21–74. Two principal ideas guided the study. The first idea is that wisdom, whether one conceptualizes it as a form of expertise or as a virtue or personality characteristic, might be well served by the specific quality or qualities of attention the individual brings to their experiences.
It makes sense to expect that a habitual mindful attitude (i.e., taking an open, non-judgmental, reflective, self-regulatory, and sometimes self-transcendent stance towards life) might be a good indicator or exemplifier of such qualities. The second idea is that most, if not all, current adult-developmental theories consider wisdom to be of practical consequence, in the sense that wise people are expected to generally display prosocial attitudes and behavior (for a review, see Bangen et al. [10]). Consequentially, one might expect this wise stance to give rise to ethical sensitivities that are compatible with the characteristics of wisdom (as defined within these theories). Wisdom It is probably fair to say that within the field of psychology the study of wisdom started from an adult development perspective (e.g., Clayton and Birren [20]; Erikson [26]; Kramer [44]; Pascual-Leone [54]). Initial conceptualizations tended to view wisdom primarily from a cognitive angle, that is, as an advanced form of postformal thought. For instance, Baltes and Staudinger ([ 9 ]) define wisdom as 'expertise in the conduct and meaning of life' (p. 124). In this approach, wisdom is conceptualized as a form of crystallized intelligence, more specifically 'expert knowledge in the fundamental pragmatics of life that permits exceptional insight, judgment, and advice about complex and uncertain matters' (Pasupathi et al. [56], p. 351). Other approaches—Glück and Bluck ([31]) label these 'integrative views'—have supplemented this cognitive view by additionally emphasizing the reflective, affective, and conative qualities of the wise person, making wisdom more akin to a personality characteristic or a virtue (e.g., Ardelt [ 3 ]; Mitchell et al. [52])—wisdom as 'personal, concrete, applied, and involved' (Ardelt [ 3 ], p. 262). The different conceptualizations of wisdom do have a common core. From a review of 24 different key theories or definitions of wisdom, Bangen et al. 
([10]) concluded that five subcomponents were present in at least half of the papers: (a) social decision making and pragmatic knowledge of life; (b) prosocial attitudes and values; (c) reflection and self-understanding (including a desire to learn); (d) acknowledgement of and coping with uncertainty; and (e) emotional homeostasis. Although there are qualitative, performance-based measures of wisdom, such as the Berlin wisdom paradigm (Baltes and Smith [ 8 ]), where participants describe how they would solve a particular life problem and answers are scored along a series of dimensions, self-report measures were used here, simply because quantitative measures allow for more efficient data collection and scoring, which in turn allows to query a larger sample of respondents. Specifically, I used the three quantitative self-report measures for wisdom recommended by Glück ([30]), Glück et al. ([34]), and Staudinger and Glück ([64])—Ardelt's Three-Dimensional Wisdom Scale (3D-WS; [ 2 ]), Levenson's Adult Self-Transcendence Inventory (ASTI; Levenson et al. [47]), and Webster's Self-Assessed Wisdom Scale (SAWS; [71], [72]). These three scales have different emphases. The 3D-WS measures wisdom as the integration of cognitive, reflective, and affective/compassionate personal characteristics; the SAWS gauges five dimensions, namely critical life experience, emotional regulation, reminiscence and reflectiveness, humor, and openness; the ASTI taps into self-transcendent wisdom, defined as a self-expansive process entailing decreased self-concern and increased empathy, understanding, spirituality, and feelings of connectedness with past and future generations. Not all of these scales cover all five subcomponents mentioned above: Arguably, the 3D-WS does; the SAWS covers social decision making, self-reflection, and emotional homeostasis; and the ASTI includes items about prosocial attitudes, self-reflection, and emotional homeostasis. Glück et al. 
([34]) and Staudinger and Glück ([64]) additionally make a distinction between personal and general wisdom. The former refers to a person's insight into themselves and their own lives; the latter to insights into life and the world in general. The assumption is that personal wisdom is obtained through actual personal experience, whereas general wisdom does not have personal experience as a necessary condition. In Glück's conceptualization, all three scales mentioned above measure personal wisdom; only performance-based measures tap into general wisdom. Glück et al. ([34]) also posit a third, often underappreciated facet of wisdom, namely other-related wisdom, which they define as 'an empathy-based caring concern for both concrete other people and humankind at large' (p. 5); it is most evident in two of the three 3D-WS scales, namely the cognitive and reflective scales, and is possibly a subcomponent of personal wisdom. In (partial) confirmation of this view, Glück et al. found that all three 3D-WS scales loaded on a different factor than the two other quantitative scales. Given that the cognitive scale of the 3D-WS contains items that are indeed about the other (e.g., 'People are either good or bad' and 'You can classify almost all people as either honest or crooked'—both items are reverse-scored), but also items that are often general and external (e.g., 'ignorance is bliss' and 'It is better not to know too much about things that cannot be changed'—both items are reverse-scored), it seems to us that this dimension could be labeled more accurately as 'wisdom about the (social) world', in contrast with the 'wisdom about the self' tapped in personal-wisdom scales. Mindfulness Mindfulness is often defined as a particular way of paying attention—the ability or propensity to engage in "nonelaborative, non-judgmental, present-centered awareness in which each thought, feeling, or sensation that arises in the attentional field is acknowledged" (Bishop et al. [12], p. 
232); this awareness requires cultivation (Nilsson and Kazemi [53]). One corollary is that "thought or events are observed as events in the mind without over-identifying with them and without reacting to them in an automatic, habitual pattern of reactivity", thus "introducing a 'space' between one's perception and response" and allowing one "to respond to situations more reflectively (as opposed to reflexively)" (Bishop et al. [12], p. 232). Mindfulness has been found to be broadly beneficial to the individual—mindfulness interventions lead to positive outcomes regarding stress, well-being, anxiety, depression, negative emotions, emotion regulation, rumination, self-compassion, and empathy (Eberth and Sedlmeier [25]; Verhaeghen [68]). These relationships are at least partially causal: changes in dispositional mindfulness after meditation training correlate with changes in self-perceived stress, anxiety, depressed mood, positive affect, negative affect, rumination, and general well-being (Gu et al. [40]; Khoury et al. [43]). Recent theoretical work within the field has converged on the conclusion that mindfulness is a complex concept, more akin to a manifold (or even a cascade of processes) than to a singular construct. The starting point of this work has been an examination of the reasons why mindfulness interventions lead to such a wide array of positive outcomes. Many models have been advanced to explain the translation of mindfulness into positive outcomes (e.g., Baer [ 5 ]; Brown et al. [16]; Chiesa et al. [19]; Creswell and Lindsay [21]; Grabovac et al. [35]; Hölzel et al. [42]; Segal et al. [59]; Shapiro et al. [60]; Vago and Silbersweig [67]), each with their own emphases and levels of complexity. Although details of the different proposed models vary, the list of proposed mechanisms generally contains three categories, as Vago and Silbersweig ([67]) point out. A first proposed mechanism is a change in self-awareness. 
This involves recognizing automatic habits and automatic patterns of reactivity, as well as an increased awareness of momentary states of body and mind, which is what is typically meant by mindfulness. A second proposed mechanism is a change in self-regulation. This includes better regulation of emotions, heightened self-compassion, increased emotional and cognitive flexibility, decreased rumination and worry, and increased nonattachment and acceptance. A final proposed mechanism is increased self-transcendence. This implies increased decentering, a stronger awareness of interdependence between self and others, and heightened compassion. Vago and Silbersweig label this common-denominator model the S-ART model, after its three components: self-awareness, self-regulation, and self-transcendence. Our own empirical work on the subject (Verhaeghen [69]; Verhaeghen and Aikman [70]), based on exploratory and confirmatory factor analysis as well as structural equation modeling on 3 independent samples of about 300 subjects each, has indeed confirmed the plausibility of this S-ART mindfulness manifold, suggesting a flow of influence from self-awareness over self-regulation to self-transcendence, and then outward to well-being and other aspects of psychological health (for a schematic representation, see Fig. 1). Factor analysis showed that additional subdivisions were present within the components of self-awareness and self-regulation: self-awareness incorporated reflective awareness (the more active, deliberate, probing aspect of mindfulness) and controlled sense-of-self in the moment (the more passive, equanimous, non-judgmental aspect of mindfulness) (for more details on these components and how they are measured, see the "Methods" section below); self-regulation was tapped by (the opposite of) self-preoccupation and by self-compassion.

Fig. 1: The S-ART mindfulness manifold as obtained in Verhaeghen ([69])

Mindfulness and Wisdom

There are obvious points of contact between this conceptualization of mindfulness and those of wisdom, suggesting they operate in the same nomological space. First, some of the common-core wisdom subcomponents align with the mindfulness manifold. Clearly, the reflection and self-understanding subcomponent of common-core wisdom has a natural affinity (if not identity) with the reflective awareness component in the mindfulness manifold. A few examples from specific theories illustrate this quite nicely. For instance, Ardelt ([3]) explicitly claims that '[t]he development of wisdom requires the transcendence of one's subjectivity and projections, which can be accomplished through self-examination, self-awareness, and a reflection on one's own behavior and one's interactions with others' (p. 269). Likewise, Glück and Bluck's ([32]) MORE (mastery, openness, reflectivity, and emotion regulation) model of wisdom posits that wisdom-related knowledge develops through an interaction of life experiences with the four MORE resources, and that therefore wisdom should manifest itself in how people reflect upon past experiences. As a third example, Brown and Greene's model of Wisdom Development ([14]) states that wisdom ripens when individuals go through a core 'learning-from-life' process, comprised of reflection, integration, and application. Pascual-Leone ([55]), as a final example, considers meditation (one possible cultivator of mindfulness) as a path towards wisdom, through its fostering of insight, self-insight, and self-transcendence. Second, emotional homeostasis can be understood as an aspect or outcome of self-regulation. Third, some wisdom researchers explicitly view self-transcendence as a critical component of wisdom (see the Ardelt quote above; also Curnow [22]; Levenson [46]). There are a few empirical indications of a mindfulness-wisdom link as well.
One study (Brienza et al. [13]) used its own process-based measure of wisdom, and found correlations with mindfulness scales, especially observing and orienting. Two studies used a training approach to foster wisdom by incorporating mindfulness either explicitly (Sharma and Dewangan [61]) or implicitly (as reflective awareness through a self-reflection journal and a life experience journal; Bruya and Ardelt [17]). The former study did not find intervention effects on either mindfulness or wisdom, but did find significant correlations at pretest between mindfulness (measured by the Mindful Attention Awareness Scale, MAAS; Brown and Ryan [15]) and the affective and reflective components of wisdom. The latter study obtained an intervention effect of the reflective exercises over and beyond those of attending a cognitively oriented class on wisdom, but did not include a measure of mindfulness to verify the proximal cause of the effect. These intervention studies, then, are somewhat suggestive of (but far from definitive about) a positive relationship between mindfulness and wisdom. Wisdom and Ethical Sensitivities The psychological study of ethical sensitivities and attitudes (e.g., Greene [37]; Haidt [41]) has converged on the conclusion that ethical actions are not always the product of the careful application of rational thought, but instead tend to be largely (although not exclusively) based on intuitions—evolved, automatic responses, inaccessible to awareness, which sometimes operate in contradiction with logical constraints. Researchers in this field often consider the vessels for these intuitions to be innate—for instance, Haidt's Moral Foundations Theory (MFT; Graham et al. [36]) posits that ethical sensitivities ultimately boil down to the five dimensions of promoting care/avoiding harm, fairness, ingroup loyalty, (respect for) authority, and purity (or sanctity). 
The former two are often combined into an 'individualizing' foundation, because they focus on the provision and protection of individual rights; the remaining three into a 'binding' foundation, because they focus on ingroup cohesion. The idea is that every individual is sensitive to these five aspects, but that the intuitions themselves are built through experience, and are thus open to individual and cultural differences through a tuning up or down of the emotional responses due to experiences that fit into these vessels (Flanagan and Williams [28]). In our previous study (Verhaeghen and Aikman [70]), where we adopted the Moral Foundations framework, we found clear links between the mindfulness manifold and ethical sensitivities, which possibly might be mediated through wisdom. Specifically, we found that reflective awareness and self-transcendence were directly related to the individualizing aspects of morality (i.e., an emphasis on care and fairness); only self-transcendence was related to the binding aspects of morality (i.e., an emphasis on loyalty, authority, and sanctity). One reason to suspect that wisdom might play a role in the individualizing foundation stems from its very definition—prosocial attitudes and values are the second most cited key component in Bangen et al.'s ([10]) literature review (21 out of 24 theories or models incorporated this component). A key mechanism may be the self-transcendental character of wisdom, which it has in common with mindfulness. There are empirical reasons to suspect that wisdom is implicated in moral attitudes (for a review of empirical and theoretical links between wisdom and ethics, see Sternberg and Glück [65]). For instance, wisdom has been found to correlate positively with other-oriented values such as well-being of friends, societal engagement, and ecological protection (Kunzmann and Baltes [45]; Webster [73]). 
Implicit lay theories of wisdom also include value orientations that align, in Haidt's model, with care and fairness (Glück et al. submitted).

The Present Study

The literature reviewed suggests that mindfulness, wisdom, and ethical sensitivities are related, but the pieces of this puzzle have not yet been fit together. One wide-open question is how the different components of mindfulness, broadly defined as self-awareness, self-regulation, and self-transcendence, relate to wisdom; another is whether (or how) wisdom might be a mediator translating, and perhaps crystallizing, mindfully experienced events into ethical attitudes. From the literature reviewed above, I expect that all three aspects of mindfulness would be positively related to wisdom. To assess wisdom, I used the three scales most commonly used in quantitative research: the 3D-WS, the ASTI, and the SAWS. After Glück et al. ([34]), I expect that a factor analysis of these measures will yield two dimensions: wisdom about the self (ASTI and SAWS) and wisdom about the (social) world (3D-WS). Given that mindfulness is primarily associated with knowledge of the self, I would expect that the mindfulness-wisdom connection would be stronger for wisdom about the self than for wisdom about the (social) world. Extending our prior work on mindfulness and ethical sensitivities, as well as building on Glück et al. (submitted), I expect that wisdom will be positively connected to the individualizing moral foundations of care and fairness. For the binding foundations (authority, loyalty, and sanctity/purity), the connection is likely less strong. Because wisdom is very often considered an aspect of adult development, I included a group of adults sampled across a large sweep of the adult life span (Sample B, age 21–74), aside from the more usual sample of college students (Sample A).
Adding the former sample allows me, first, to check if the results from the first sample replicate, and second, to test whether or not any of the wisdom or ethical components are age-sensitive, as has sometimes been claimed (e.g., Ardelt [ 1 ]; Baltes and Kunzmann [ 7 ]; but see, e.g., Grossmann and Kross [39]; Mickler and Staudinger [51]). Methods Participants Sample A consisted of 260 undergraduate students from the Georgia Institute of Technology, who received course credit in return for their participation. They were invited to participate in a study on 'mindfulness, acceptance, and psychology'. They were aged 18–26 (mean = 19.7, SD = 1.5); 54% were women. Sample B consisted of 173 participants recruited from Mechanical Turk. They were invited to participate in a study on 'mindfulness, acceptance, and psychology', and offered $4 in return for their time. Workers needed to be highly qualified in order to participate—more than 5000 Human Intelligence Tasks (HIT; i.e., surveys or other online tasks) completed to the requesters' satisfaction, and at least 98% of all lifetime HITs approved by the requester. They were aged 21–74 (mean = 39.8, SD = 11.7); 44% were women. The age distribution was as follows: age 21–30: 38 participants; age 31–40: 69 participants; age 41–50: 33 participants; age 51–60: 18 participants; age 61–74: 12 participants. On average, participants had completed 14.9 years of education (SD = 1.9). Although Mechanical Turk is generally considered to be a useful, valid, and reliable tool for behavioral researchers (e.g., Mason and Suri [49]), we found it prudent to assess potential differences in data quality between the two samples. We did this by comparing Cronbach's α values for all subscales (see the "Measures and Procedure" section below for all α values). Sample B (Mechanical Turk) tended to have higher reliability values (median = 0.84, ranging from 0.41 to 0.93) than Sample A (students) (median = 0.71, ranging from 0.48 to 0.90). 
The correlation between the Fisher z-transformed reliability values of the two samples was 0.78 (the transformation was applied to linearize the measurement scale), suggesting that both groups were about equally sensitive to differences in the item characteristics that drive reliability.

Measures and Procedure

Participants filled out all questionnaires online; they took about 45–60 min to complete. Below, questionnaires are grouped thematically; the mindfulness measures (i.e., self-awareness, self-regulation, and self-transcendence) are presented as they resulted from the set of factor analyses (an exploratory analysis on 488 participants, and a confirmatory analysis on an independent sample of 222 participants) in Verhaeghen ([69]); this structure was replicated in Verhaeghen and Aikman ([70]). All measures were collected from both samples. Cronbach's α values reported are the values obtained in the present study, reported separately for Samples A and B, respectively. Note that some scales (notably the subscales of the Self-Compassion Scale) contain a very small number of items, possibly depressing the α values.

Control Variables

The Mini-IPIP (Donnellan et al. [23]) is a 20-item measurement of the Big Five personality factors, 4 items for each factor: Extraversion (sample item: 'I am the life of the party', Cronbach's α = 0.83 and 0.87), Agreeableness (sample item: 'I sympathize with others' feelings', Cronbach's α = 0.77 and 0.85), Conscientiousness (sample item: 'I get chores done right away', Cronbach's α = 0.68 and 0.78), Openness (which the IPIP labels Intellect/Imagination; sample item: 'I have a vivid imagination', Cronbach's α = 0.71 and 0.84), and Neuroticism (sample item: 'I have frequent mood swings', Cronbach's α = 0.74 and 0.78). Additionally, participants were asked for their age and gender.
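The reliability comparison described under Participants can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the analysis code used in the study: the function names and the toy item responses are hypothetical, and the α pairs are only the five Mini-IPIP values quoted above, so the printed correlation will not reproduce the 0.78 obtained across all subscales.

```python
import math
from statistics import mean, variance


def cronbach_alpha(items):
    """Cronbach's alpha for one scale.

    `items` holds one list of scores per item; position i in every
    list belongs to the same respondent.
    """
    k = len(items)
    n = len(items[0])
    totals = [sum(item[i] for item in items) for i in range(n)]
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


def fisher_z(r):
    """Fisher z-transform (inverse hyperbolic tangent), defined for -1 < r < 1."""
    return math.atanh(r)


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical responses: 3 items answered by 5 respondents.
toy_items = [[4, 3, 5, 2, 4], [5, 3, 4, 2, 5], [4, 2, 5, 3, 4]]
print(f"alpha (toy scale) = {cronbach_alpha(toy_items):.2f}")

# Alpha values (Sample A, Sample B) for the five Mini-IPIP scales quoted above.
alphas_a = [0.83, 0.77, 0.68, 0.71, 0.74]
alphas_b = [0.87, 0.85, 0.78, 0.84, 0.78]
z_a = [fisher_z(a) for a in alphas_a]
z_b = [fisher_z(b) for b in alphas_b]
print(f"r between z-transformed alphas (5 scales only) = {pearson_r(z_a, z_b):.2f}")
```

Transforming the α values before correlating them stretches the upper end of the scale, where reliabilities bunch up near 1, which is the linearization the text refers to.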
Social Conservatism

Social conservatism was measured via the Social Conservatism subscale (6 items; sample item: 'Please indicate the extent to which you feel positive or negative towards each issue: ... Abortion'; Cronbach's α = 0.62 and 0.69) of the Social and Economic Conservatism Scale (SECS; Everett [27]).

Self-awareness

Two constructs were assessed within self-awareness. The first, reflective awareness, is the unit-weighted composite of the z-scores of three scales: (a) the Observing subscale of the Five Facet Mindfulness Questionnaire (FFMQ; Baer et al. [6]) (8 items; sample item: 'When I'm walking, I deliberately notice the sensations of my body moving', Cronbach's α = 0.73 and 0.87); (b) the Reflectiveness subscale of the Broad Rumination Scale (BRS; Trani et al. in preparation) (4 items; sample item: 'It is important for me to understand why I feel a certain way', Cronbach's α = 0.81 and 0.81); and (c) the Search for Insight/Wisdom subscale of the Aspects of Spirituality scale (ASP; Büssing et al. [18]) (7 items; sample item: 'I strive for insight and truth', Cronbach's α = 0.84 and 0.90). In both samples, the composite was normally distributed, as ascertained via a Kolmogorov–Smirnov test (p > 0.2). The second construct, controlled sense-of-self in the moment, is the unit-weighted composite of the z-scores of three scales: (a) the Acting with Awareness subscale from the FFMQ (8 items, sample item: the reverse of 'When I'm doing things, my mind wanders off and I'm easily distracted', Cronbach's α = 0.87 and 0.91); (b) the Sense-of-self Scale (SOSS; Flury and Ickes [29]) (12 items, sample item: 'I have a clear and definite sense of who I am and what I'm all about'; Cronbach's α = 0.86 and 0.88); and (c) the Non-judging of inner experience subscale of the FFMQ (8 items, sample item: the reverse of 'I criticize myself for having irrational or inappropriate emotions', Cronbach's α = 0.90 and 0.93).
In both samples, the composite was normally distributed, as ascertained via a Kolmogorov–Smirnov test ( p > 0.2). Self-regulation Two constructs were assessed within self-regulation. The first, self-preoccupation , is the unit-weighted composite of the z -scores of two subscales from the BRS, namely Compulsivity (5 items; sample item: 'When I start to worry, it's very hard for me to stop', Cronbach's α = 0.79 and 0.87) and Worrying (3 items; sample item: 'Uncertainty about the future bothers me', Cronbach's α = 0.58 and 0.68), as well as two subscales from the Self-Compassion Scale, Short Form (SCS; Raes et al. [57]), namely Isolation (2 items; sample item: 'When I'm feeling down, I tend to feel like most other people are probably happier than I am', Cronbach's α = 0.56 and 0.63) and Over-Identified (2 items; sample item: 'When I fail at something important to me I become consumed by feelings of inadequacy', Cronbach's α = 0.66 and 0.58). In both samples, the composite was normally distributed, as ascertained via a Kolmogorov–Smirnov test ( p > 0.2). In our previous work, as here, self-preoccupation correlated negatively with other aspects of mindfulness, as one would expect—better self-regulation implies lower, not higher, levels of self-preoccupation. This may be confusing for some readers. Because the construct is, however, measured by scales that tap explicitly into the self-preoccupation aspect, and not its absence or opposite, we preferred to keep the self-preoccupation label. 
The second, self-compassion , was measured as the unit-weighted composite of the z -scores of three subscales from the SCS, namely Self-Kindness (2 items; sample item: 'I try to be understanding and patient towards those aspects of my personality I don't like', Cronbach's α = 0.61 and 0.60), Common humanity (2 items; sample item: 'I try to see my failings as part of the human condition', Cronbach's α = 0.49 and 0.57), and Mindfulness (2 items; sample item: 'When something painful happens I try to take a balanced view of the situation', Cronbach's α = 0.66 and 0.68), as well as the Decentering subscale of the Experiences Questionnaire (EQ; Fresco et al. 2007) (13 items, sample item: 'I am better able to accept myself as I am'; Cronbach's α = 0.84 and 0.93). The composite was normally distributed in Sample A, Kolmogorov–Smirnov = 0.042, p > 0.2, but not Sample B, Kolmogorov–Smirnov = 0.075, p = 0.034. Self-transcendence Self-transcendence was measured as the unit-weighted composite of the z -scores of 2 subscales from the Dispositional Positive Emotion Scale (DPES; Shiota et al. [62]), namely Joy (6 items; sample item: 'I am an intensely cheerful person', Cronbach's α = 0.84 and 0.90), and Love (6 items; sample item: 'I develop strong feelings of closeness to people easily', Cronbach's α = 0.82 and 0.90), and 1 subscale from the Resilience Scale (RS; Lundman et al. [48]), namely Meaningfulness (7 items, sample item: 'My life has meaning', Cronbach's α = 0.81 and 0.91). The composite was normally distributed in Sample A, Kolmogorov–Smirnov = 0.042, p > 0.2, but not Sample B, Kolmogorov–Smirnov = 0.072, p = 0.046. Moral Foundations This construct was measured using the 5 subscales of the Moral Foundations Questionnaire (Graham et al. [36]): (a) Care/harm (6 items; sample item: 'When you decide whether something is right or wrong, to what extent are the following considerations relevant to your thinking? 
– Whether or not someone suffered emotionally'; Cronbach's α = 0.52 and 0.76); (b) Fairness (6 items; sample item: '... Whether or not some people were treated differently than others'; Cronbach's α = 0.56 and 0.64); (c) Ingroup loyalty (6 items; sample item: '... Whether or not someone's action showed love for his or her country'; Cronbach's α = 0.48 and 0.84); (d) Authority (6 items; sample item: '... Whether or not someone showed a lack of respect for authority'; Cronbach's α = 0.61 and 0.85); and (e) Purity (6 items; sample item: '... Whether or not someone violated standards of purity and decency'; Cronbach's α = 0.69 and 0.92). Wisdom Scales Participants filled out three self-report wisdom surveys. The Adult Self-Transcendence Inventory (ASTI; Levenson et al. [47]) measures, in the words of the authors, "a decreasing reliance on externals for definition of the self, increasing interiority and spirituality, and a greater sense of connectedness with past and future generations" (p. 127). After factor analysis, Levenson et al. derived a more focused self-transcendence scale, which is used here (Factor 1 of their Table 1; 10 items; sample item: 'My peace of mind is not so easily upset as it used to be'; Cronbach's α = 0.67 and 0.79). The Self-Assessed Wisdom Scale (SAWS; Webster [71]) measures 5 interrelated dimensions of wisdom: experience (8 items; sample item: 'I have experienced many painful events in my life'; Cronbach's α = 0.81 and 0.84), emotions (8 items; sample item: 'I am good at identifying subtle emotions within myself'; Cronbach's α = 0.83 and 0.86), reminiscence (8 items; sample item: 'Reviewing my past helps gain perspective on current concerns'; Cronbach's α = 0.86 and 0.91), openness (8 items; sample item: 'I like to read books which challenge me to think differently about issues'; Cronbach's α = 0.71 and 0.80), and humor (8 items; sample item: 'I can chuckle at personal embarrassments'; Cronbach's α = 0.86 and 0.91). 
The Three-Dimensional Wisdom Scale (3D-WS; Ardelt [2]) consists of 3 subscales, tapping the cognitive (14 items, sample item: 'It is better not to know too much about things that cannot be changed'; Cronbach's α = 0.78 and 0.86), reflective (12 items, sample item: 'When I'm upset at someone, I usually try to "put myself in his or her shoes" for a while'; Cronbach's α = 0.55 and 0.54), and affective (13 items, sample item: 'I can be comfortable with all kinds of people'; Cronbach's α = 0.49 and 0.41) components of wisdom.

Table 1. Factor analysis of the nine wisdom scales in both samples; principal axis analysis with oblimin rotation

| Scale | Sample A, Factor 1 (wisdom about the self) | Sample A, Factor 2 (wisdom about the social world) | Sample B, Factor 1 (wisdom about the self) | Sample B, Factor 2 (wisdom about the social world) |
|---|---|---|---|---|
| ASTI (total) | .67 | | .80 | |
| SAWS-emotion regulation | .72 | | .78 | |
| SAWS-experience | .79 | | .75 | |
| SAWS-humor | .71 | | .77 | |
| SAWS-openness | .65 | | .74 | |
| SAWS-reminisce-reflect | .80 | | .73 | |
| 3D-WS-affective | | .71 | | .80 |
| 3D-WS-cognitive | | .57 | | .68 |
| 3D-WS-reflective | | .76 | | .68 |

N = 260 for Sample A and 173 for Sample B. For legibility reasons, factor loadings below .30 are not represented.

Measures Collected but Not Included in the Analyses

Additionally, participants filled out the Nonattachment Scale (NAS; Sahdra et al. [58]), the Emotional Resilience Scale (ERS; Gross and John [38]), the QUEST scale (Batson and Schoenrade [11]), the Varieties of Inner Speech Questionnaire (VISQ; McCarthy-Jones and Fernyhough [50]), and the Self-Verbalization Scale (SVS; Duncan and Cheyne [24]). Some of those measures (NAS, ERS, and QUEST) were remnants of an earlier attempt (Verhaeghen [69]) at casting a wide net of mindfulness measures; they failed to make the final cut after the factor analysis described in that paper. The others (VISQ and SVS) are not relevant to the present project.
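The unit-weighted composites and normality checks described under Measures can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's own code: the function names and the three short score lists are hypothetical, the composite is taken as the mean of the z-scores (a sum would differ only by a constant factor), and only the Kolmogorov–Smirnov distance is computed. Because the mean and standard deviation are estimated from the same data, a p-value for this distance would need a correction such as Lilliefors'; the text does not say which variant was used.

```python
import math
from statistics import mean, stdev


def z_scores(xs):
    """Standardize scores to mean 0 and (sample) standard deviation 1."""
    m, s = mean(xs), stdev(xs)
    return [(x - m) / s for x in xs]


def unit_weighted_composite(scales):
    """Average the z-scores of several scales, giving each scale equal weight.

    `scales` holds one list of scores per scale; position i in every
    list belongs to the same respondent.
    """
    standardized = [z_scores(scale) for scale in scales]
    n = len(scales[0])
    return [mean(z[i] for z in standardized) for i in range(n)]


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic_normal(xs):
    """Largest gap between the empirical CDF and a normal CDF fitted to xs."""
    m, s = mean(xs), stdev(xs)
    ordered = sorted(xs)
    n = len(ordered)
    distance = 0.0
    for i, x in enumerate(ordered):
        f = normal_cdf(x, m, s)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        distance = max(distance, (i + 1) / n - f, f - i / n)
    return distance


# Hypothetical scores of 6 respondents on three scales with different ranges.
scales = [
    [3, 4, 2, 5, 4, 3],
    [10, 12, 9, 15, 11, 10],
    [1, 2, 1, 3, 2, 2],
]
composite = unit_weighted_composite(scales)
print([round(value, 2) for value in composite])
print(f"KS distance = {ks_statistic_normal(composite):.3f}")
```

Standardizing first keeps a scale with a wide response range from dominating the composite, which is the point of unit weighting.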
Results Factor Analysis of the Wisdom Scales Two exploratory factor analyses (principal axis analysis with oblimin rotation), one for each sample, were conducted on the nine wisdom scales (i.e., the ASTI scale, the three 3D-WS scales and the five SAWS scales). Scale or subscale scores (i.e., not item scores) were the unit of analysis. Eigenvalues and the scree plot suggested a 2-factor solution in both samples. This solution is presented in Table 1; it explains 55% of the variance in Sample A, and 57% of the variance in Sample B. Both analyses converged on the same solution: the ASTI and all the SAWS scales loaded on one factor, and all three 3D-WS scales loaded on another. As mentioned in the introduction, the ASTI and the SAWS scale have in common that they survey wisdom from an intrapersonal perspective, that is, they appear to tap self-knowledge and self-acceptance; the 3D-WS arguably captures skills and wisdom about how to deal with the social world and with external circumstances. Consequently, I will label the first factor wisdom about the self , and the second wisdom about the ( social ) world . The two factors are relatively independent: Their intercorrelation was 0.18 in Sample A and 0.07 in Sample B. Wisdom and the Mindfulness Manifold To examine how the mindfulness manifold is related to self-assessed wisdom, as well as to control for the effects of the set of background variables (personality, age, and gender), hierarchical multiple regression analysis was applied to the data, separated by sample, with the two types of wisdom (wisdom about the self and wisdom about the [social] world) as the final outcome. For these analyses, a unit-weighted composite was constructed from the z -scores for the ASTI and the different SAWS scales to represent wisdom about the self. The unit-weighted composite of the z -scores of the three 3D-WS scales represented wisdom about the (social) world. 
Both unit-weighted wisdom composites were normally distributed in both samples; highest Kolmogorov–Smirnov = 0.057, p > 0.200. In the first step, the background variables—the five IPIP scales, age, and gender—were entered. The next step added the two self-awareness composites (reflective awareness and controlled sense-of-self in the moment); the step after that the two self-regulation composites (self-preoccupation and self-compassion); the final step added self-transcendence. Pearson correlations between all variables are reported in Table 2; results from the regression analyses in Table 3. Note that in these analyses, self-preoccupation is scored as defined above, that is, higher values indicate higher levels of self-preoccupation, which indicates a low level of self-regulation. Because of the potential conceptual overlap between the mindfulness concept of self-transcendence and wisdom as defined through the ASTI, analyses were rerun after removing the ASTI from the composite measuring wisdom about the self. The wisdom about the self variable and the wisdom about the self variable with the ASTI removed were virtually identical ( r = 0.98 in Sample A and 0.99 in Sample B); the pattern of the regression results was identical (i.e., variables that were significant remained significant and variables that were not remained non-significant). 
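The stepwise entry of predictor blocks described above can be illustrated with a small sketch. It is a simplified stand-in rather than the analysis that was run: it uses synthetic data, one hypothetical variable per block, and three steps instead of four, and it reports only R² and the R² change per step (no standardized β values or significance tests). All names are assumptions made for the example.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def r_squared(predictors, y):
    """R^2 of an ordinary least squares fit with an intercept."""
    n = len(y)
    rows = [[1.0] + [p[i] for p in predictors] for i in range(n)]
    k = len(rows[0])
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(rows, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in rows]
    y_mean = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def hierarchical_r2(blocks, y):
    """Enter predictor blocks cumulatively; return (R^2, R^2 change) per step."""
    entered, previous, steps = [], 0.0, []
    for block in blocks:
        entered.extend(block)
        current = r_squared(entered, y)
        steps.append((current, current - previous))
        previous = current
    return steps


# Synthetic, purely illustrative data (not the study's data).
random.seed(1)
n = 200
background = [random.gauss(0, 1) for _ in range(n)]
awareness = [0.3 * b + random.gauss(0, 1) for b in background]
transcendence = [0.4 * a + random.gauss(0, 1) for a in awareness]
wisdom = [0.2 * b + 0.5 * a + 0.3 * t + random.gauss(0, 1)
          for b, a, t in zip(background, awareness, transcendence)]

steps = hierarchical_r2([[background], [awareness], [transcendence]], wisdom)
for number, (r2, change) in enumerate(steps, start=1):
    print(f"Step {number}: R^2 = {r2:.3f}, change = {change:.3f}")
```

Because each step keeps all earlier predictors, R² can only stay the same or grow, and the change at a step is the variance uniquely added by the block entered there.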
Table 2. Correlation matrix for the background variables, mindfulness variables, and wisdom factors; Sample A data presented above the diagonal, Sample B below

| Variable | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1. IPIP extraversion | 1.00 | .29** | .01 | −.12* | .13* | .09 | .10 | .03 | .12 | .22** | −.22** | .13* | .40** | .31** | .19** | .06 | .06 |
| 2. IPIP agreeableness | .25** | 1.00 | .17** | −.02 | .25** | .18** | .03 | .28** | .36** | .19** | .00 | .20** | .51** | .38** | .23** | .31** | .06 |
| 3. IPIP conscientiousness | .12 | .30** | 1.00 | −.16** | .05 | .18** | .03 | .11 | .09 | .34** | −.11 | .18** | .27** | .10 | −.02 | .05 | .19** |
| 4. IPIP neuroticism | −.43** | −.34** | −.36** | 1.00 | −.09 | −.04 | −.03 | .24** | .08 | −.53** | .60** | −.48** | −.34** | −.18** | −.11 | .06 | −.04 |
| 5. IPIP intellect/imagination | .29** | .18* | −.02 | −.20** | 1.00 | .07 | .04 | −.15* | .35** | .08 | −.08 | .07 | .20** | .36** | .03 | .04 | −.11 |
| 6. Social conservatism | −.04 | .14 | .23** | −.19* | −.11 | 1.00 | −.05 | .07 | .16* | .15* | −.02 | .14* | .24** | .18* | .03 | .11 | .54** |
| 7. Age | −.05 | .13 | .07 | −.08 | −.08 | .30** | 1.00 | −.07 | .05 | .03 | .03 | −.02 | −.03 | .03 | .07 | −.03 | .08 |
| 8. Gender | .05 | −.31** | −.17* | −.02 | .03 | −.07 | −.21** | 1.00 | .04 | −.03 | .21** | −.05 | .13* | .05 | .13* | .30** | .00 |
| 9. Reflective awareness | .22** | .34** | .26** | −.18* | .43** | −.02 | −.12 | −.14 | 1.00 | −.08 | .22** | .23** | .35** | .60** | .15* | .37** | .23** |
| 10. Controlled sense-of-self in the moment | .33** | .40** | .37** | −.62** | .21** | .05 | .17* | −.10 | .17* | 1.00 | −.54** | .42** | .43** | .22** | .14* | −.03 | .01 |
| 11. Self-preoccupation | −.37** | −.22** | −.23** | .57** | −.19* | −.08 | −.17* | −.08 | −.02 | −.56** | 1.00 | −.44** | −.27** | −.08 | −.14* | .30** | .11 |
| 12. Self-compassion | .06 | .16* | −.07 | −.20** | .03 | .05 | .04 | −.04 | .17* | −.01 | .17* | 1.00 | .48** | .41** | .21** | .14* | .17** |
| 13. Self-transcendence | .52** | .59** | .34** | −.66** | .16* | .26** | .04 | −.12 | .43** | .54** | −.47** | .21** | 1.00 | .57** | .27** | .35** | .24** |
| 14. Wisdom about the self | .34** | .51** | .32** | −.47** | .40** | .10 | .11 | −.14 | .66** | .45** | −.28** | .22** | .68** | 1.00 | .28** | .41** | .26** |
| 15. Wisdom about the (social) world | .11 | .06 | .08 | −.08 | .08 | −.05 | .05 | −.06 | .10 | .05 | −.06 | .00 | .11 | .10 | 1.00 | .18** | .10 |
| 16. Individualizing foundation | .09 | .38** | .09 | −.13 | .17* | −.08 | .06 | −.15 | .31** | .13 | −.02 | .03 | .29** | .43** | .11 | 1.00 | .33** |
| 17. Binding foundation | −.04 | .20** | .20* | −.12 | −.20* | .77** | .13 | −.10 | −.01 | −.02 | .09 | .07 | .31** | .16* | .01 | .07 | 1.00 |

N = 260 for Sample A and 173 for Sample B. IPIP = International Personality Item Pool (https://ipip.ori.org/). * p < .05

Table 3. Results from hierarchical regression analyses to predict the wisdom factors (A = Sample A, B = Sample B)

Outcome: Wisdom about the self

| Predictor | Step 1 (A) | Step 1 (B) | Step 2 (A) | Step 2 (B) | Step 3 (A) | Step 3 (B) | Step 4 (A) | Step 4 (B) |
|---|---|---|---|---|---|---|---|---|
| IPIP extraversion | 0.19** | 0.08 | 0.16** | 0.02 | 0.17** | 0.03 | 0.11* | −0.06 |
| IPIP agreeableness | 0.24** | 0.26** | 0.08 | 0.17** | 0.06 | 0.17** | −0.01 | 0.05 |
| IPIP conscientiousness | 0.01 | 0.07* | −0.06 | 0.01 | −0.06 | 0.03 | −0.08 | 0.02 |
| IPIP neuroticism | −0.16** | −0.21** | −0.15** | −0.19** | −0.10 | −0.17* | −0.06 | −0.05 |
| IPIP intellect/imagination | 0.28** | 0.31** | 0.13** | 0.11 | 0.16** | 0.11 | 0.14* | 0.18** |
| Age | −0.01 | 0.08 | −0.02 | 0.13* | −0.01 | 0.12* | 0.01 | 0.13* |
| Gender | 0.07 | −0.06 | 0.08 | 0.01 | 0.07 | 0.02 | 0.05 | 0.02 |
| Reflective awareness | | | 0.52** | 0.50** | 0.46** | 0.49** | 0.40** | 0.38** |
| Controlled sense-of-self in the moment | | | 0.15* | 0.12 | 0.12 | 0.13 | 0.07 | 0.09 |
| Self-preoccupation | | | | | 0.04 | −0.01 | 0.05 | 0.05 |
| Self-compassion | | | | | 0.19** | 0.06 | 0.14* | 0.03 |
| Self-transcendence | | | | | | | 0.28** | 0.41** |
| R² | .296 | .455 | .506 | .622 | .526 | .625 | .561 | .673 |
| R² change | .296** | .455** | .210** | .167** | .020** | .003 | .035** | .048** |

Outcome: Wisdom about the (social) world

| Predictor | Step 1 (A) | Step 1 (B) | Step 2 (A) | Step 2 (B) | Step 3 (A) | Step 3 (B) | Step 4 (A) | Step 4 (B) |
|---|---|---|---|---|---|---|---|---|
| IPIP extraversion | 0.13 | 0.12 | 0.10 | 0.13 | 0.09 | 0.13 | 0.06 | 0.12 |
| IPIP agreeableness | 0.21** | −0.01 | 0.16* | 0.00 | 0.16* | 0.00 | 0.16 | −0.01 |
| IPIP conscientiousness | −0.09 | 0.03 | −0.13 | 0.04 | −0.12 | 0.04 | −0.13* | 0.04 |
| IPIP neuroticism | −0.17** | −0.02 | −0.13 | −0.08 | −0.07 | −0.09 | −0.05 | −0.08 |
| IPIP intellect/imagination | −0.05 | 0.06 | −0.08 | 0.06 | −0.08 | 0.06 | −0.08 | 0.07 |
| Age | 0.05 | 0.04 | 0.05 | 0.04 | 0.06 | 0.05 | 0.07 | 0.05 |
| Gender | 0.11 | −0.07 | 0.10 | −0.07 | 0.11 | −0.07 | 0.10 | −0.07 |
| Reflective awareness | | | 0.11 | 0.04 | 0.13 | 0.04 | 0.10 | 0.02 |
| Controlled sense-of-self in the moment | | | 0.12 | −0.12 | 0.07 | −0.11 | 0.05 | −0.12 |
| Self-preoccupation | | | | | −0.12 | 0.03 | −0.11 | 0.04 |
| Self-compassion | | | | | 0.03 | −0.00 | 0.01 | −0.08 |
| Self-transcendence | | | | | | | 0.13 | 0.06 |
| R² | .116 | .033 | .132 | .043 | .140 | .043 | .148 | .044 |
| R² change | .116* | .033 | .016 | .009 | .008 | .000 | .008 | .001 |

N = 260 for Sample A and 173 for Sample B. IPIP = International Personality Item Pool (ipip.ori.org). * p < .05, ** p < .01

Ethical Sensitivity as Consequence of Mindfulness and Wisdom

Hierarchical regression was applied to investigate how wisdom and the mindfulness manifold potentially shape ethical sensitivity, operationalized here as the
moral foundations. To keep the number of analyses manageable, the two individualizing foundations were collapsed into a single construct by taking the average of the z-scores of the Care/Harm and Fairness scales (the correlation between the two individualizing foundations was 0.50 in Sample A and 0.57 in Sample B); likewise, a unit-weighted z-score composite was built from the three binding foundations, namely Ingroup loyalty, Authority, and Purity (intercorrelations between the three binding foundations ranged from 0.59 to 0.64 in Sample A, and from 0.63 to 0.78 in Sample B). As is usual (because individuals generally tend to skew towards the ethical side of the distribution), these composites were not normally distributed: Kolmogorov–Smirnov = 0.109, 0.112, 0.139, and 0.073 for individualizing in Samples A and B and binding in Samples A and B, respectively (p < 0.001, p < 0.001, p < 0.001, and p = 0.040, respectively). Pearson correlations are reported in Table 2; results from the regression analyses in Table 4. Rerunning the regression analyses with the alternate measure of wisdom about the self, that is, with the ASTI removed, yielded the same pattern as that obtained with the original wisdom about the self composite (i.e., variables that were significant remained significant and variables that were not remained non-significant).
Table 4. Results from hierarchical regression analyses to predict the moral foundations (A = Sample A, B = Sample B)

Outcome: Individualizing foundation

| Predictor | Step 1 (A) | Step 1 (B) | Step 2 (A) | Step 2 (B) | Step 3 (A) | Step 3 (B) | Step 4 (A) | Step 4 (B) | Step 5 (A) | Step 5 (B) |
|---|---|---|---|---|---|---|---|---|---|---|
| IPIP extraversion | −0.06 | −0.02 | −0.04 | −0.03 | −0.01 | −0.03 | −0.06 | −0.11 | −0.10 | −0.09 |
| IPIP agreeableness | 0.23** | 0.34** | 0.11 | 0.33** | 0.10 | 0.34** | 0.05 | 0.25* | 0.03 | 0.23* |
| IPIP conscientiousness | 0.06 | 0.01 | 0.01 | −0.02 | −0.00 | −0.04 | −0.03 | −0.04 | 0.01 | −0.05 |
| IPIP neuroticism | −0.04 | −0.03 | −0.10 | −0.10 | −0.21* | −0.16 | −0.17 | −0.07 | −0.17* | −0.05 |
| IPIP intellect/imagination | 0.15* | 0.08 | 0.04 | 0.02 | 0.07 | 0.02 | 0.04 | 0.08 | −0.03 | 0.03 |
| Social conservatism | 0.01 | −0.15 | −0.00 | −0.16 | −0.01 | −0.16 | −0.03 | −0.22* | −0.02 | −0.20* |
| Age | −0.06 | 0.05 | −0.05 | 0.09 | −0.08 | 0.11 | −0.06 | 0.13 | −0.07 | 0.07 |
| Gender | 0.21** | −0.06 | 0.25** | −0.03 | 0.21** | −0.03 | 0.18* | −0.02 | 0.17* | −0.02 |
| Reflective awareness | | | 0.33** | 0.19 | 0.22** | 0.20* | 0.17* | 0.11 | 0.03 | −0.05 |
| Controlled sense-of-self in the moment | | | −0.05 | −0.12 | 0.05 | −0.11 | 0.02 | −0.15 | −0.00 | −0.17 |
| Self-preoccupation | | | | | 0.38** | 0.10 | 0.39** | 0.17 | 0.39** | 0.13 |
| Self-compassion | | | | | 0.10 | −0.11 | 0.04 | −0.15 | −0.01 | −0.15 |
| Self-transcendence | | | | | | | 0.27** | 0.35* | 0.16 | 0.17 |
| Wisdom about the self | | | | | | | | | 0.42** | 0.41** |
| Wisdom about the self (ASTI excluded) | | | | | | | | | (NA) | (NA) |
| Wisdom about the (social) world | | | | | | | | | 0.01 | 0.04 |
| R² | .158 | .160 | .233 | .191 | .300 | .202 | .329 | .232 | .404 | .285 |
| R² stepwise change | .158** | .160** | .075** | .033 | .067** | .011 | .029** | .031* | .075** | .053** |

Outcome: Binding foundation

| Predictor | Step 1 (A) | Step 1 (B) | Step 2 (A) | Step 2 (B) | Step 3 (A) | Step 3 (B) | Step 4 (A) | Step 4 (B) | Step 5 (A) | Step 5 (B) |
|---|---|---|---|---|---|---|---|---|---|---|
| IPIP extraversion | −0.02 | 0.03 | 0.00 | 0.04 | 0.03 | 0.05 | 0.00 | −0.02 | −0.01 | −0.02 |
| IPIP agreeableness | −0.08 | 0.09 | −0.12 | 0.10 | −0.13 | 0.11 | −0.15* | 0.04 | −0.15* | 0.03 |
| IPIP conscientiousness | 0.21** | 0.03 | 0.22** | 0.04 | 0.21** | 0.02 | 0.20** | 0.03 | 0.21** | 0.02 |
| IPIP neuroticism | 0.07 | 0.07 | −0.02 | 0.02 | −0.05 | −0.06 | −0.03 | 0.00 | −0.03 | 0.02 |
| IPIP intellect/imagination | 0.02 | −0.10 | −0.03 | −0.10 | −0.01 | −0.11 | −0.02 | −0.06 | −0.06 | −0.09 |
| Social conservatism | 0.54** | 0.80** | 0.54** | 0.80** | 0.54** | 0.80** | 0.53** | 0.74** | 0.53** | 0.75** |
| Age | 0.02 | −0.10 | 0.02 | −0.11 | 0.00 | −0.09 | 0.01 | −0.06 | 0.01 | −0.09 |
| Gender | −0.13 | −0.04 | −0.10 | −0.05 | −0.13* | −0.03 | −0.14* | −0.02 | −0.14* | −0.02 |
| Reflective awareness | | | 0.13 | 0.00 | 0.04 | 0.01 | 0.02 | −0.06 | −0.06 | −0.13 |
| Controlled sense-of-self in the moment | | | −0.15* | −0.08 | −0.12 | −0.06 | −0.13 | −0.09 | −0.15 | −0.10 |
| Self-preoccupation | | | | | 0.21* | 0.15* | 0.22** | 0.20** | 0.21* | 0.19** |
| Self-compassion | | | | | 0.14 | −0.09 | 0.12 | −0.11* | 0.09 | −0.12* |
| Self-transcendence | | | | | | | 0.10 | 0.28** | 0.05 | 0.22* |
| Wisdom about the self | | | | | | | | | 0.23** | 0.15 |
| Wisdom about the self (ASTI excluded) | | | | | | | | | (NA) | (NA) |
| Wisdom about the (social) world | | | | | | | | | −0.04 | 0.04 |
| R² | .361 | .651 | .391 | .655 | .419 | .668 | .423 | .690 | .447 | .698 |
| R² stepwise change | .361** | .651** | .030* | .004 | .029* | .013 | .004 | .024** | .023* | .008 |

N = 260 for Sample A and 173 for Sample B. IPIP = International Personality Item Pool (https://ipip.ori.org/). * p < .05, ** p < .01

Discussion

In the present study, I investigated if and how wisdom might be related to dispositional mindfulness, broadly construed as a manifold of self-awareness, self-regulation, and self-transcendence, and if and how it might promote ethical sensitivities. Wisdom was measured using the three self-report surveys most often used in quantitative research on the topic: the 3D-WS, the ASTI, and the SAWS. Two independent samples were included: a sample of college students (Sample A), and one of adult workers on Mechanical Turk with a much wider age range (viz., 21–74; Sample B).

The Structure of Wisdom

A first expectation (after Glück et al. [34]) was that factor analysis on the subscales of the three surveys would reveal a bifurcation between wisdom about the self (ASTI and SAWS) and wisdom about the (social) world (3D-WS). Factor analysis indeed confirmed this divergence, in both samples. The correlation between the two dimensions was small, 0.18 in Sample A and 0.07 in Sample B, underscoring the relative independence of these two aspects of wisdom. This result replicates that of Glück et al., who obtained a correlation of 0.11. The present study is the first to also show functional independence between the two constructs, in that both types of wisdom have different correlates, as explicated in the next two sections.
Predicting Wisdom About the Self From the literature reviewed in the Introduction, I expected that all three aspects of mindfulness—self-awareness, self-regulation, and self-transcendence—would be positively related to wisdom. Regression analysis suggested that this is (partially) true, but only for wisdom about the self. Before I detail these results, note that the background variables explained a fair amount of variance in wisdom about the self: it was negatively related to neuroticism, and positively related to agreeableness and intellect/imagination in both samples, and additionally to extraversion in the college sample and conscientiousness in the Mechanical Turk sample. After taking mindfulness into account, only the influence of intellect/imagination (in both groups) and extraversion (in the college sample) remained significant, but the coefficients were substantially reduced (with β s roughly half of those in Step 1). This suggests that the effects of agreeableness and neuroticism are wholly mediated through the effects of mindfulness, and those of extraversion and intellect/imagination are partially mediated. Levenson et al. ([47]) obtained a negative effect of neuroticism, and a positive effect of openness (i.e., imagination/intellect in this sample), agreeableness, and conscientiousness on the ASTI, a measure of wisdom about the self; only the latter correlation was absent from the present results. Within the Berlin wisdom paradigm, openness to experience is likewise a strong predictor of wisdom scores (e.g., Pasupathi et al. [56]; Staudinger and Glück [64]). This makes sense: if wisdom is at least partially based on experience, an openness to new experiences would be key for its development or flourishing. Crucially, the mindfulness manifold explained an additional 21% to 26% of the variance in wisdom about the self, over and beyond the variance explained by personality, age, and gender. 
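The 21% to 26% figure can be recovered from the R² values reported for wisdom about the self in Table 3: it is the difference between the final step and the background-only step, which equals the sum of the R² changes at Steps 2 to 4. The arithmetic below uses only values from that table; the subscript labels are notation introduced here for illustration.

```latex
\begin{align*}
\Delta R^2_{\text{mindfulness}} &= R^2_{\text{Step 4}} - R^2_{\text{Step 1}} \\
\text{Sample A:}\quad 0.561 - 0.296 &= 0.265 = 0.210 + 0.020 + 0.035 \\
\text{Sample B:}\quad 0.673 - 0.455 &= 0.218 = 0.167 + 0.003 + 0.048
\end{align*}
```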
In both samples, one aspect of self-awareness—reflective awareness—was a significant and strong predictor of wisdom about the self, with β values around 0.40 for the final step. The other aspect of self-awareness, however—controlled sense-of-self in the moment—was not a significant predictor (except in Step 2 in the college sample). It appears, then, that wisdom about the self is associated with a reflective stance about one's experiences (i.e., reflective awareness), but not with the experience of being present in the moment (i.e., controlled sense-of-self in the moment)—in other words, it is the examination of or the investigation into one's experiences rather than the mere witnessing of those experiences that is important for this type of wisdom, as many models of wisdom (e.g., Ardelt [ 3 ]; Brown and Greene [14]; Glück and Bluck [31]) indeed explicitly predict. It is interesting to note that self-compassion (at least in the college sample) was an additional predictor for wisdom about the self. The reasons might be that self-compassion allows one to step back from the immediacy of the experience, and consider oneself the way one would consider a friend—this friendly distancing, like the reflection/examination component, might possibly help to foster the transcendence Ardelt ([ 3 ]) considers so necessary for the development of wisdom. Self-preoccupation was not related to wisdom in either sample. One additional link found here was that between self-transcendence and wisdom about the self (with β values on par with or a little lower than those for reflective awareness). This association is almost self-evident, given that quite a few theorists consider self-transcendence to be a critical component of wisdom (Ardelt [ 3 ]; Curnow [22]; Levenson [46]). 
Note that this relationship remained unchanged when the ASTI, a measure of wisdom that conceptually relies on self-transcendence, was removed from the composite that tapped wisdom about the self, suggesting that the relationship cannot be explained merely by conceptual overlap between the measure of self-transcendence and the ASTI. The role of reflective awareness and self-compassion in wisdom about the self, however, is not merely to foster self-transcendence: the final step in the regression analyses clearly shows that the effects of reflective awareness (both samples) and self-compassion (college sample) are far from completely mediated by self-transcendence. It is also important to stress that the background variables (personality, age, and gender) and the mindfulness manifold provide us with a very good handle on the individual differences in wisdom about the self: together they explain a little more than half to two thirds of the variance (between 56 and 67%, to be precise), indicating that these constructs probably should be important components in any realistic theory of wisdom about the self.

Predicting Wisdom About the (Social) World

Wisdom about the (social) world, in contrast, was not predicted by the mindfulness manifold at all. There is some indication that wisdom about the (social) world might have roots in individual differences in personality instead: individuals scoring higher on agreeableness and lower on neuroticism scored higher on wisdom about the (social) world; however, this was only true in the student sample. As with wisdom about the self, the effects of agreeableness and neuroticism were wholly mediated through the effects of mindfulness, even though the latter effects did not rise to the level of significance. These personality correlates have some face validity in their predictive value.
That is, it makes sense that people who are (or want to appear) more friendly, warm, and helpful might be better at picking up on social cues or be more interested in understanding how the social world and the world in general works. Neuroticism, in general, is related to overreactivity, negative emotions, and feeling easily threatened by social situations; none of these qualities would likely be conducive to acquiring the type of equanimity associated with wisdom in general (see Wink and Staudinger [74], for a similar argument). Note that Ardelt et al. ([4]) found that openness and extraversion correlated with the 3D-WS (in a sample of 98 males who were approximately 80 years old); we found such correlations for wisdom about the self, not for wisdom about the (social) world. The reason for the discrepancy is unclear. The reason why the influence of personality variables on wisdom about the (social) world is constrained to the college group is likewise unclear. One potential reason could be adult development: perhaps as people grow older the grip of personality on their outlook on the world loosens. There is a hint of this in the present data: after a median split by age on the Mechanical Turk sample, the relevant correlations were nominally higher in the younger subsample (correlation of wisdom about the [social] world with agreeableness was 0.11, with neuroticism −0.12) than in the older subsample (0.01 and −0.04, resp.). None of these correlations, however, reached significance. This, then, remains an area for further research. Note that the Mechanical Turk sample was highly educated (about 3 years of college), so educational differences are unlikely to explain the cross-sample differences. Also note that the relationship with personality is much smaller than that observed in wisdom about the self: the background variables (personality, age, and gender) explained 30–46% of the variance in wisdom about the self, versus only 3–12% in wisdom about the (social) world.
Wisdom about the (social) world is not only distinct from wisdom about the self; it also seems, with the present measures, much harder to explain. Wisdom and the Moral Foundations Turning now to ethical sensitivity as a potential consequence of mindfulness and wisdom, I found, first, a conceptual (partial) replication of our earlier paper (Verhaeghen and Aikman [70]) on the effects of mindfulness on the moral foundations. In that paper, we found, in two independent samples, that reflective awareness, self-preoccupation, and self-transcendence were related to the individualizing aspects of morality (i.e., an emphasis on care and fairness) (note that the relationship with self-preoccupation was only significant in Sample A in the present study). Self-compassion and self-transcendence were positively related to the binding aspects of morality (i.e., an emphasis on loyalty, authority, and sanctity). In the present data, an additional effect of self-preoccupation on binding was obtained, and the effect of self-compassion on binding was not significantly different from zero in one sample, and, surprisingly, negative in the other. Wisdom about the self turned out to be a strong predictor for the individualizing foundation, that is, one's sensitivity to the ethical dimensions of care and fairness ( β for the final step = 0.42 and 0.41, resp.). In contrast, wisdom about the (social) world had only a negligible and non-significant influence on the individualizing foundation ( β = 0.01 and 0.04). While most theories about wisdom posit an effect on ethics, notably "prosocial attitudes and behaviors, which include empathy, compassion, warmth, altruism, and a sense of fairness" (Bangen et al. [10], p. 1257), the present data suggest that this effect remains restricted to wisdom about the self, and does not extend to wisdom about the (social) world. 
Within the group of mindfulness variables, the effects of self-awareness on the individualizing foundation were partially mediated through self-transcendence (i.e., the coefficients associated with self-awareness become smaller once self-transcendence enters the equation) and wholly mediated through wisdom about the self (i.e., the coefficients associated with self-awareness became non-significant once the wisdom variables enter the equation, but only wisdom about the self had a reliable effect). The effects of self-transcendence on individualizing, in turn, were fully mediated through wisdom, and particularly wisdom about the self. One possible interpretation of the latter finding is that self-transcendence is a precursor for wisdom about the self; another that self-transcendence as defined here is subsumed under or maybe even synonymous with wisdom about the self. The latter interpretation is certainly compatible with views about wisdom as a form of self-transcendence (Ardelt [ 3 ]; Curnow [22]; Levenson [46]). Whatever the mechanism, wisdom about the self thus appears to foster an increased emphasis on the ethical dimensions of care and fairness, and this is partially due to the influence of reflective awareness and self-transcendence on wisdom about the self. The effects of wisdom on the binding foundations (i.e., an emphasis on authority, ingroup loyalty, and purity) were rather small. The strongest predictor for the binding foundation remained social conservatism, with people who are more conservative showing larger interest in the binding foundation ( β for the final step = 0.53 and 0.75). Wisdom about the self had a much smaller effect ( β for the final step = 0.23 and 0.15; the latter value was ns ); the contribution of wisdom about the (social) world was essentially nil ( β for the final step = − 0.04 and 0.04, ns ). 
In the college sample, participants who were less agreeable, more conscientious, male, and more self-preoccupied showed a larger interest in the binding foundation. The latter effect replicated for the Mechanical Turk sample, where lower levels of self-compassion and higher levels of self-transcendence were additionally related to a higher interest in binding. If we look at the results that replicate across both samples, the take-away message is that an interest in the binding foundation is determined mostly by social conservatism, and maybe, but to a much smaller extent, by wisdom about the self. This implies a second amendment to the Bangen et al. ([10]) quotation above, to the effect that wisdom's fostering of prosocial attitudes applies mostly to attitudes that make the rights and concerns of others visible (i.e., treating individuals with care and fairness), and less so to attitudes pertaining to ingroup cohesion (i.e., a focus on loyalty, authority, and purity).
    1. Philosophy for Children, Values Education and the Inquiring Society. Published in: Educational Philosophy & Theory, Oct 2014, Professional Development Collection. By: Cam, Philip. How can school education best bring about moral improvement? Socrates believed that the unexamined life was not worth living and that the philosophical examination of life required a collaborative inquiry. Today, our society relegates responsibility for values to the personal sphere rather than the social one. I will argue that, overall, we need to give more emphasis to collaboration and inquiry rather than pitting students against each other and focusing too much attention on 'teaching that' instead of 'teaching how'. I will argue that we need to include philosophy in the curriculum throughout the school years, and teach it through a collaborative inquiry which enables children to participate in an open society subject to reason. Such collaborative inquiry integrates personal responsibility with social values more effectively than sectarian and didactic religious education. Keywords: religion; ethics; community of inquiry; spiral curriculum. Introduction [1] As Socrates would have it, the philosophical examination of life is a collaborative inquiry. The social nature of the enterprise goes with its spirit of inquiry to form his bifocal vision of the examined life. These days, insofar as our society teaches us to think about values, it tends to inculcate a private rather than a public conception of them. This makes reflection a personal and inward journey rather than a social and collaborative one, and a person's values a matter of parental guidance in childhood and individual decision in maturity. The relegation of responsibility for values to the personal sphere also militates against societal self-examination.
On the other hand, the traditional pontifical alternative is equally presumptive and debilitating in ignoring the possibility of personal judgement. How can education steer a course between the tyranny of unquestionable moral codes and the bankruptcy of individualistic moral relativism? It remains to be seen whether there is a way in which education could teach children to engage productively across their differences rather than responding to difference with suspicion or prejudice. Gilbert Ryle (in Cahn, 1970) made a clear distinction between 'teaching how' and 'teaching that', arguing from a behaviourist perspective that teaching how had a much more lasting impact than simply teaching the facts. However, too much emphasis on 'teaching how' can result in conditioning, training, teaching to conform to habit, teaching obedience with the threat of hellfire if the rules are broken. There is a third way, the way of philosophy espoused by Matthew Lipman ([ 8 ]) in his Philosophy for Children, which involves giving more emphasis to collaboration and inquiry rather than pitting students against each other and focusing too much attention on 'teaching that' instead of 'teaching how'. Philosophy as it is traditionally taught may well involve teaching how to follow the rules of formal logic correctly, or learning facts about the life and death of Socrates, but it also requires a capacity for critical reflection, consideration of alternative possibilities, and a genuine concern for truth and clarity. I argue that we need to include philosophy in the curriculum throughout the school years, but it needs to be a philosophy taught in the spirit of Socrates which balances individual and social values. Religious instruction tends to inculcate values through adult imposition and denies space to critical judgement. Ryle's distinction between 'learning that' and 'learning how' implied that these were discrete and exclusive ways of learning. 
However, learning how to do things is more than a matter of memorizing facts or following procedural instructions. Being able to cook is more than being able to follow a recipe book. Again, while some instruction is useful in learning to ride a bike, it is mostly a matter of trying to ride, and then, under guidance, trying again. It is a case of learning by doing, and doing it under different circumstances, in order to apply it in different circumstances. This is working out for oneself how to exercise individual judgement, rather than first learning a set of instructions and then carrying them out (Ryle, in Cahn, 1970, pp. 413–424). Whatever the rules are, they are heuristic and strategic, depending on different contexts, rather than algorithmic and learnable by rote. 'Learning how' can be important in many areas of the curriculum where training in skills is an important feature, especially in physical education and the arts. However, learning the art of inquiry requires a slightly different type of 'learning how' from training, rehearsal and repetition. A curriculum that is based on inquiry is one that is centred on thinking. There is a world of difference in the outcome to be expected from an education that treats knowledge as material with which to think and one that emphasizes memorization of knowledge. It is the difference between an inquiring society and one in which those few who have developed an inquiring mind have done so in spite of their education rather than because of it (Dewey, 1916/1966, chap. 12; Lipman, [8]). The concept of a community of inquiry owes much to Dewey who, in Democracy and education (1916/1966), described the healthy relation between an individual and his or her environment as functional. Dewey insisted that because the relationship between the individual and his or her environment must be based on mutual adjustment, fitting into society might well involve radically changing it.
Dewey believed in the importance of preparing students for democratic citizenship. He stressed that consciously guided education aimed at developing the 'mental equipment' and moral character of students was essential to the development of civic character. Is this not what religious instruction tries to do? The relationship between the individual and society was far more important for Dewey than the child's relationship with an abstract God. It was organic and continually evolving in mutual adaptation. It differs from religious instruction in that its aim is to develop a model of free inquiry, which requires tolerance of alternative viewpoints, and free communication. He also believed that children's capacity for the exercise of deliberative, practical reason in moral situations could be cultivated not by ready-made knowledge but by 'a mode of associated living' characteristic of democracy. Lipman ([ 7 ]) was to elaborate on this idea of schools as a model of a participatory democracy and his classroom community of inquiry provided close analogies with the democratic school, a microcosm of the wider society. Thinking Together When we move away from the traditional classroom to the inquiring one and the teacher becomes less occupied with conveying information—with teaching 'that'— it becomes educationally desirable for students to engage with one another. When human conduct stimulates moral inquiry it is usually because that conduct is controversial, which is to say that there are different points of view as to how it should be judged. If you and I have different opinions in regard to someone's character or conduct, then we are both in need of justification and our views are subject to each other's objections. When we make a proposal to solve a practical problem of any complexity, we rely upon others who are reasonably well placed for constructive criticism or a better suggestion. 
If we want students to grow out of the habit of going with their own first thoughts, to become used to considering a range of possibilities, and to be on the lookout for better alternatives, then we could not do better than to have them learn by exploring issues, problems and ideas together. If we want them to become used to giving reasons for what they think, to expect the same of others, and to make productive use of criticism, then we could not go past giving them plenty of practice with their peers. And if we want them to grow up so that they consider other people's points of view, and not to be so closed minded as to think that those who disagree with them must be either ignorant or vicious, then the combination of intellectual and social engagement to be found in collaborative inquiry is just the thing. These are all good reasons for having our students learn to inquire together. Philosophy for Children More than any other discipline, philosophy is an inquiry into fundamental human problems and issues, where all the general conceptions that animate society come under scrutiny. Philosophy as a formal discipline played an important part in its place as a matriculation subject in some Australian states, because there were rigorous rules by which its standards could be maintained. This would involve, say, learning that ignoratio elenchi was an informal fallacy, or that modus tollens is a legitimate move in deductive logic, or learning how to mount a reasoned argument in defence of a position. When, however, we are talking about philosophy for children, its subject matter needs to be adapted to the interests and experience of students of various ages and its tools and procedures adjusted to their stage of development.
There are models to work from, particularly the series of novels and manuals from Matthew Lipman, and in recent years we have begun to find our way forward.[ 2 ] If part of the difficulty is also that some philosophers think of philosophy as being above all that, it is salutary to remember that other disciplines have long since discovered how to recast themselves in educational form. Just as mathematics was forced to become more practical and relevant to the growing range of children who were staying on at school through the New Maths, so philosophy has been forced to become more real and relevant to children. The move towards an integrated curriculum away from discrete learning areas also required philosophy to make the connections across and through disciplines, raising the larger questions of epistemology, ontology, aesthetics and, for the purpose of this article, the important area of axiology or values. For philosophy to have a formative influence, and thereby to significantly affect both the way people think and the character of their concerns, it needs to be part of the regular fare throughout the school years. Only by this means can it effectively supply its nutrients to the developing roots of thought or knowing that and action or knowing how. We need to counter the view that philosophy is an advanced discipline, suitable only for the academically gifted and intellectually mature. Jerome Bruner made famous the startling claim that 'the foundations of any subject may be taught to anybody at any age in some form' (1960, p. 12), and he suggested that the prevailing view of certain disciplines being too difficult for younger students results in our missing important educational opportunities. 
Bruner called this structure a spiral curriculum : one that begins with the child's intuitive understanding of the fundamentals, and then returns to the same basic concepts, themes, issues and problems at increasingly elaborate and more abstract or formal levels over the years. A spiral curriculum is vital for developing the kind of deep understanding that belongs to philosophy and the humanities. What else is to be gained from building philosophy into the curriculum throughout the school years? It seems to me that an education in philosophical inquiry will assist students to achieve a rich understanding of a wide array of issues and ideas that inform life and society through an increasingly deep inquiry into them. It will help students to think more carefully about issues and problems that do not have a unique solution or a settled decision procedure, but where judgements and decisions can be better or worse in all kinds of ways. Since most of the problems that we face in life and in our society are of that character, the general-purpose tools that students acquire through philosophy will ensure that they are better prepared to face those problems. If philosophy is carried out in the collaborative style envisaged above, then its recipients will also be more likely to tackle such problems collaboratively, and thereby to be more constructive and accommodating with one another. Let me spell all this out a little under the headings of 'thinking', 'understanding' and 'community'. Thinking Philosophy is a discipline with a particular focus on thinking. It involves thinkers in the cognitive surveillance of their own thought. It is a reflective practice, in the sense that it involves not only careful thinking about some subject matter, but thinking about that thinking, in an effort to guide and improve it. 
Since philosophical thinking tends to keep one eye on the thinking process, philosophy can supply the tools that assist the thinker in such tasks as asking probing questions, making needful distinctions, constructing fruitful connections, reasoning about complex problems, evaluating propositions, elaborating concepts, and honing the criteria that are used to make judgements and decisions. Dewey's (1910) five-step model of identifying the problem and placing it in context, making creative and testable hypotheses that move towards a possible solution, analysing the hypotheses in terms of past experience, considering alternative hypotheses that may be more suitable, and checking possible solutions against actual experiences was picked up as a model of individual thinking, especially in science and design work. But in a community of inquiry each of these steps is done from the multiple perspectives of the group at any age, allowing not only the falsifiability of any conservative position to truth but also their complete contingency. The skills, abilities and habits of thinking (acquiring the habit of reflecting carefully upon your own thoughts, as well as what others think; developing the ability to imagine and evaluate new possibilities; developing the habit of changing your mind on the basis of good reasons; and acquiring skill in the establishment and use of appropriate criteria to form sound judgements) provide the methodology of Lipman's community of inquiry. Understanding Philosophy deals with ethical questions about how we should behave, social questions about the good community, epistemological questions about the justification of people's opinions, metaphysical questions about our spiritual lives, or logical questions about what we may reasonably infer, and is therefore a rich source of our cultural heritage and of contemporary thought and debate.
In terms of both its history and ways of thinking, philosophy also helps to deepen our understanding of the big ideas and key concepts that have helped to shape civilization and continue to inform the way we live. Our conceptions of what makes something right or wrong, of justice, freedom and responsibility, of our personal, cultural and national identity, of sources of knowledge, of the nature of truth, beauty and goodness, are all central to what we value and how we conduct our affairs. Since such concepts so deeply inform life and society, it is important for students to develop their understanding of them. While we may attempt to deal with these matters elsewhere in the curriculum, philosophical inquiry gives students the tools that they need in order to explore these ideas in depth. Community With regard to cooperative thinking and the importance of community, I would stress the virtues of dialogue. As we work to resolve differences in our understandings, or to subject our reasons to each other's judgement, or try to follow an argument where it leads, we are like detectives whose clues are the experience, inferences, judgements and other intellectual considerations that each thinker brings into the dialogue with others. On this view, philosophical inquiry provides a model of the inquiring community: one that is engaged in thoughtful deliberation and decision making, is driven by a desire to make advance through cooperation and dialogue, and values the kinds of regard and reciprocity that grow under its influence. Just because it has these characteristics, philosophical inquiry can provide a training-ground for people who are being brought up to live together in such a community. Dewey's five steps require the philosophical disposition to give reasons when that is appropriate; and, generally, to cooperate with others and respect different points of view. 
Values Education The vital significance of educating for judgement in regard to values is nowhere more clearly recognized than in the writings of John Dewey: 'The formation of a cultivated and effectively operative good judgment or taste with respect to what is aesthetically admirable, intellectually acceptable and morally approvable is the supreme task set to human beings by the incidents of experience' (Dewey, 1929/1980, p. 262). This makes the cultivation of judgement the ultimate educational task and the development of good judgement central to values education in particular. Values education therefore cannot be simply a matter of instructing students as to what they should value—just so much 'teaching that'—as if students did not need to inquire into values or learn to exercise their judgement. In any case, it is an intellectual mistake to think that values constitute a subject matter to be learned by heart. They are not that kind of thing. Values are embodied in commitments and actions and not merely in propositions that are verbally affirmed. Nor can values education be reduced to an effort to directly mould the character of students so that they will make the right moral choices—as if in all the contingencies of life there was never really any doubt about what one ought to do, and having the right kind of character would ensure that one did it. Being what is conventionally called 'of good character' will not prevent you from acting out of ignorance, from being blind to the limitations of your own perspective, from being overly sure that you have right on your side, or even from committing atrocities with a good conscience in the name of such things as nation or faith. History is littered with barbarities committed by men reputedly of good character who acted out of self-righteous and bigoted certainty. 
Far from being on solid moral ground, the ancient tradition that places emphasis upon being made of the right stuff has encouraged moral blindness towards those of different ethnicity, religion, politics, and the like. Whatever else we do by way of values education, we must make strenuous efforts to cultivate good judgement. When it comes to deciding what to do in a morally troubling situation, good judgement involves distinguishing more from less acceptable decisions and conduct. Such discernment needs to be made by comparing our options in the circumstances in which they occur. Any such comparison requires us to ensure that, insofar as possible, we have hold of all the relevant facts. It involves us doing our best to make sure that we have not overlooked any reasonable course of action. It requires us to think about the consequences of making one decision, or taking one course of action, by comparison with another, and to be mindful of the criteria against which we evaluate them. It requires us to monitor the consequences of our actions in order to adjust our subsequent thinking to actuality. In short, good moral judgement requires us to follow the ways of inquiry. Dewey (1920/1957, pp. 163–164) says: A moral situation is one in which judgment and choice are required antecedently to overt action. The practical meaning of the situation—that is to say the action needed to satisfy it—is not self-evident. It has to be searched for. There are conflicting desires and alternative apparent goods. What is needed is to find the right course of action, the right good. 
Hence, inquiry is exacted: observation of the detailed make-up of the situation; analysis into its diverse factors; clarification of what is obscure; discounting of the more insistent and vivid traits; tracing of the consequences of the various modes of action that suggest themselves; regarding the decision reached as hypothetical and tentative until the anticipated or supposed consequences which led to its adoption have been squared with the actual consequences. The lack of integration of our advanced empirical and scientific knowledge with the remnants of value systems of much earlier times is already a problem of considerable proportions. We should not be adding to this burden when we teach science and technology, or history, or about society and the environment. Instead, we need to introduce our students to ways of thinking that develop their values in conjunction with their other understandings. This approach to values education fits with the emphasis to be placed upon collaborative inquiry for several reasons. First, the idea that values are to be cultivated by student reflection rather than impressed upon the student from without by moral authority does not imply that the pursuit of values is a purely personal affair. That would be a pendulum swing to individualistic relativism. Collaborative inquiry supplies a middle road—a way forward between an unquestioningly traditional attitude towards values and an individualism that makes each person their own moral authority. The development of good judgement through collaborative inquiry is the path towards a truly social intelligence. Secondly, values inquiry depends upon different points of view. If something is uncontroversial and everyone is of the same opinion, then there is no motivation for inquiry. Inquiry arises in situations where something is uncertain, puzzling, contentious or in some way problematic. 
The collaborative inquiry is organic, synergistic and evolving, a kind of moral practice based on a principle of democracy. Consider such elementary aspects of philosophical practice as: learning to hear someone out when you disagree with what they are saying; learning to explore the source of your disagreement rather than engaging in personal attacks; developing the habit of giving reasons for what you say and expecting the same of others; being disposed to take other people's interests and concerns into account; and generally becoming more communicative and inclusive. To see values education as continuous with all of our other efforts to educate our young in the ways of inquiry is to place it firmly in the tradition of reflective education rather than traditional religious instruction. Religious instruction cannot take on the burden of a systematic exploration of the ethical issues involved in the various areas of the curriculum as they are presented throughout the rest of the week. If we are to cultivate good moral judgement we need to make it integral to the material that we teach and not something we attempt to establish in such a disconnected fashion. From a pedagogical perspective, while it would be possible for religious instructors to introduce students to values inquiry, they are under no obligation to do so and many of them come from traditions that are likely to use the occasion to moralize and engage in indoctrination instead. This is not to say that religious education is incompatible with values inquiry. It is rather to acknowledge the need for change. Much of traditional religious instruction is antithetical to the educational requirements of an inquiring society; and if we are to develop such a society, such an outdated approach should not retain its foothold in our schools. This still leaves it open as to whether the school takes a philosophical approach to values education, or insists upon indoctrination rather than education. 
We should not think of philosophy and religion as representing two incompatible options when it comes to values education. They are representative, however, of a deeper choice that must be made in relation to values education, the choice between appeal to reason and dogmatism as central to the way we teach. Footnotes 1 Editor's Note : This article has been substantially edited and modified since it was delivered as a keynote address in December 2010. The context in which it was written reflects an ongoing tension between the didactic teaching of ethics through religious education and a more organic process of teaching ethics by modelling it and discussing it in philosophical discussion. In New South Wales (NSW) religious education was not compulsory, but Education Department policy forbade schools from offering alternative lessons to students who chose not to take part in scripture. The NSW government tasked St James Ethics Centre, under the guidance of Professor Cam, to develop and deliver ethics education classes in urban, regional and rural primary schools as an alternative to religious education. St James Ethics Centre promptly established Primary Ethics Limited, an independent not-for-profit organization, to develop an engaging, age-appropriate, interconnected curriculum that spans the primary years from Kindergarten to Year 6 and to then deliver ethics education free of charge via a network of specially trained and accredited volunteers. Despite protests from Church leaders in NSW that they should have sole responsibility for values education, on 1 December 2010 Parliament amended the NSW Education Act to give students who do not attend special religious education/scripture classes in NSW public schools the legal right to attend philosophical ethics classes as an alternative to supervised 'private study'. 
Because of the popularity of secular ethics classes, pressure from Church leaders and a change to a conservative state government, it was legislated in 2012 that parents should be told of the availability of ethics classes in their school only after they have opted out of special religious education or scripture. 2 Since the early 1990s Lipman's followers have extended his work and this general approach is now represented in schools in many countries around the world. For a selection of Australasian resources see http://www.fapsa.org.au/resources/catalogue References 1 Bruner, J. S. (1960). The process of education. Cambridge, MA: Harvard University Press. 2 Cahn, S. M. (Ed.). (1970). The philosophical foundations of education. New York: Harper & Row. 3 Dewey, J. (1910). How we think. Chicago, IL: D. C. Heath & Co. 4 Dewey, J. (1957). Reconstruction in philosophy (enlarged ed.). Boston, MA: Beacon Press. (Original work published 1920). 5 Dewey, J. (1966). Democracy and education. London: Collier Macmillan. (Original work published 1916). 6 Dewey, J. (1980). The quest for certainty. New York: Perigee Books. (Original work published 1929). 7 Lipman, M. (1988). Philosophy goes to school. Philadelphia, PA: Temple University Press. 8 Lipman, M. (2002). Thinking in education (2nd ed.). New York: Cambridge University Press. 9 Ryle, G. (1970). Teaching and training. In S. M. Cahn (Ed.), The philosophical foundations of education (pp. 413–424). New York: Harper & Row. By Philip Cam
    1. actors involved in emergency politics may appeal to three different forms of legitimacy while making use of three different kinds of rhetorical power in their efforts to legitimize and normalize their actions

      OKAY -> so is building off of Kreuder-Sonnen / White -> says that not only do supranational entities have four LEVELS of emergency politics, all of which are united by the same script, but they have to JUSTIFY their seizure of power / deviation from the norm THROUGH THREE DIFFERENT WAYS: 1. Coercive power OVER ideas / discourse (i.e., controlling the narrative over their increased power) 2. Structural / institutional power IN ideas / discourse (idk) 3. Persuasive power THROUGH ideas via discourse (i.e., convincing people that increased power in this situation is good)

      Will look at the RHETORIC of executive actors at times of EU crisis in order to demonstrate this (i.e., what terms / language they use to justify the extraordinary power measures). Specifically Covid / Eurozone.

      "actors involved in emergency politics may appeal to three different forms of legitimacy while making use of three different kinds of rhetorical power in their efforts to legitimize and normalize their actions"

    1. Miwoks and Yokuts

      Discussed in class:
      - Two different societies, both still active today
      Miwok (3 clusters on the map shown in class: 1 big, 2 small)
      - shows up in several areas on the map
      - distinct architecture (bark homes, discussion areas, sweat house for rituals)
      - organized society, hierarchy
      Yokuts (3 large sections on the map shown in class)
      - lived in several areas across California's Central Valley
      - good fishermen
      - good technique for gathering resources
      - traded for acorns (with Miwoks? diplomatically related)

    1. As discussed in Chapter 3, the most common means of transmitting HIV are having unprotected sex and sharing contaminated needles. In addition to abstinence and condom use, the risk of HIV transmission can be reduced through pre-exposure prophylaxis or PrEP. Recommended for persons at high risk of HIV infection (e.g., persons with other STIs, those who have unprotected sex with multiple partners, injection drug users, and persons with an HIV positive partner) and taken most frequently as a daily pill, PrEP reduces the risk of infection through sex by 99 percent.

      I am curious whether education is an underlying factor here, and whether what these groups have in common (teenagers and those with same-sex partners) is that they are not supported by society's standards for sexual interactions. Also, a lot of programs run through schools or religious institutions, so they practice abstinence-based sex education.

    2. biomedical model makes four primary assumptions that limit its utility for completely understanding health and illness

      1. Assumes the presence of disease and its diagnosis and treatment are objective phenomena.
      2. Assumes that only medical professionals are capable of defining health and illness.
      3. Presumes that physiological malfunction alone defines health and illness.
      4. Defines health as the absence of disease.

    1. Yoga + Schools and Neurodiversity: behavior, mental health, academic performance, autism, and ADHD. Yoga + Migraine and Headache: frequency, intensity, duration, disability, and autonomic function. Yoga + Brain Longevity®: memory, attention, and aging.

      I would save this for further down on the page where you explore the 3 aspects in great detail

    1. Students come to the classroom with prior knowledge that must be addressed if teaching is to be effective. 2. Students need to organize and use knowledge conceptually if they are to apply it beyond the classroom. 3. Students learn more effectively if they understand how they learn and how to manage their own learning.

      These always seem to get lost in education. We do not allow students the chance to show us what they know. Nor do we allow them the time to explore.

    1. Co-education: Synergy Between School and Family Settings

      Executive summary

      Co-education is defined as a strategic alliance among all the adults in a child's life (teachers, parents, professionals, and support staff) aimed at helping the child develop to their full potential.

      This approach rests on recognizing and accepting the complementary roles of each party.

      Ideally, this relationship begins at the first parent-teacher meeting, although it can be called on at any time, particularly in critical situations.

      The success of this approach depends on a stance of benevolence that creates a climate of psychological safety, which in turn fosters transparent communication and concerted action.

      The integration of digital technologies, framed by the Plan d'action numérique en éducation (Digital Action Plan for Education), strengthens this collaboration by offering new levers for learning and by reassuring parents about the pedagogical use of technological tools.

      --------------------------------------------------------------------------------

      1. Foundations and Definition of Co-education

      Co-education is not just occasional communication; it is a genuine partnership mindset.

      It is built around three pillars: recognizing, accepting, and acting on each party's respective role.

      A meeting of worlds: It brings the family world and the school world together into a single, coherent ecosystem in the young person's life.

      A shared mission: The central goal is to support the student in developing their skills and well-being.

      A lasting alliance: This relationship must last throughout the school year, ensuring continuity across the child's different living environments.

      --------------------------------------------------------------------------------

      2. Establishing a Stance of Benevolence

      For co-education to be effective, those involved must adopt a specific stance that fosters openness and listening.

      The climate of psychological safety

      Benevolence is the engine of co-education. It makes it possible to:

      • Create a setting in which everyone feels comfortable naming their real concerns.

      • Establish genuine mutual listening.

      • Reduce misunderstandings and confrontations.

      Process for anchoring benevolence

      To cultivate this state, participants are invited to:

      1. Recall a past experience of benevolence to recover its cues (tone, attitude).

      2. Practice self-kindness before extending it to others.

      3. Ask themselves which conditions best help them stay open during exchanges.

      --------------------------------------------------------------------------------

      3. Roles and Responsibilities: Complementarity of the Parties

      Although the final objectives converge, the roles of teachers and parents are distinct and complementary.

      | Party | Mandate and Specific Objectives | Sphere of Influence |
      | --- | --- | --- |
      | Teacher | Instruct, socialize, and qualify within a limited time frame (180 days). Apply the curriculum and the progression of learning. | School setting (classroom) |
      | Parent | The child's first educator. Support through transitions, life challenges, and developmental stages. | Family and social setting |
      | Shared roles | Reassure one another, validate information, share the child's lived experience, and learn about effective strategies. | Global (co-responsibility) |

      --------------------------------------------------------------------------------

      4. Communication and Action Strategies

      Co-education shows up as constant questioning oriented toward a positive impact for the child.

      The shared guiding intention: Before each intervention, adults should ask themselves: "What positive impact will my intervention have for the good of the child?"

      Problem solving: When difficulties arise (harmful behaviors or learning delays), the recommended approach is to ask: "How could we work together to meet the child's needs?"

      Including the child: It is recommended to include the young person in this questioning to make sure the strategies developed truly meet their needs.

      --------------------------------------------------------------------------------

      5. Benefits and Signs of Success

      Successful co-education transforms the educational dynamic and produces tangible results:

      Increased engagement: Clear roles and a benevolent climate motivate adults to get involved.

      Sense of self-efficacy: Repeated positive experiences strengthen parents' and teachers' belief in their ability to succeed in educating the young person.

      Accelerated progress: Concerted, continuous action between home and school multiplies the child's progress.

      Emotional management: The parties are better able to step back from emotional overload during communications and refocus on the educational goal.

      --------------------------------------------------------------------------------

      6. Co-education in the Digital Age

      Digital technology acts as a lever to support the relationship between school and family.

      The Digital Action Plan

      This plan offers a frame of reference inspired by best practices worldwide. It targets two central dimensions:

      1. Developing an ethical citizen in the digital age.

      2. Mobilizing young people's technological skills.

      Concrete examples in the classroom

      Technology integration takes the form of new learning methods in which the child is placed in creation mode:

      • Robotics, programming, and coding workshops.

      • Use of virtual reality (e.g., for oral presentations).

      • Use of tablets for reading and other educational purposes.

      This digital framework, overseen by educators, also serves to reassure parents about the technological support their children receive, thereby strengthening the bond of trust that co-education requires.

    1. Author response:

      [Note: The final version has been published in Brain, Behavior, and Immunity: https://doi.org/10.1016/j.bbi.2026.106473]

      eLife Assessment

      This useful study raises interesting questions but provides inadequate evidence of an association between atovaquone-proguanil use (as well as toxoplasmosis seropositivity) and reduced Alzheimer's dementia risk. The findings are intriguing but they are correlative and hypothesis-generating with the strong possibility of residual confounding.

      We thank the editors and reviewers for characterizing our work as useful and for the opportunity to publish a Reviewed Preprint with a corresponding response. However, the statements in the Assessment characterizing the evidence as ‘inadequate’ and asserting a ‘strong possibility of residual confounding’ are factually incorrect as applied to our data and incompatible with the empirical findings presented in the manuscript. We have notified the editors of this factual inaccuracy. As the Assessment will be published as originally written, we provide clarification here to ensure an accurate scientific record for readers of the Reviewed Preprint.

      Our study shows that the association between atovaquone–proguanil (A/P) exposure and reduced dementia risk, first identified in a rigorously matched national cohort in Israel, is robustly reproduced across three independently constructed age-stratified cohorts in the U.S. TriNetX network (with exposure at ages 50–59, 60–69, and 70–79). In each cohort, individuals exposed to A/P were compared with rigorously matched individuals who received another medication at the same age and were then followed over a decade for incident dementia. Cases and controls were matched on all major established dementia risk factors: age, sex, race/ethnicity, diabetes, hypertension, obesity, and smoking status.

      Across all three strata, each containing more than 10,000 exposed individuals with an equal number of matched controls, we observed substantial and consistent reductions in cumulative dementia incidence (HR 0.34–0.51), extremely low P-values (10⁻¹⁶ to 10⁻⁴⁰), and continuously widening divergence of Kaplan–Meier curves over the follow-up period. To more rigorously exclude the possibility of unmeasured baseline differences in health status, we additionally performed, for the purpose of this response, comparative analyses of key indicators of frailty and clinical utilization, including emergency and inpatient encounters, as well as the prevalence of mild cognitive impairment prior to medication exposure (values provided below in response to Reviewer #2, Weakness 1). These analyses provide clear evidence showing no pattern suggestive of exposed individuals being medically or cognitively healthier at baseline.

      Taken together, these findings constitute a rigorously matched and independently replicated association across two national health systems, using TriNetX, the most widely cited real-world evidence platform in published cohort studies. Replication across three age strata, each with >10,000 exposed individuals, followed for a decade, and matched on all major known risk factors for dementia, meets the accepted epidemiologic definition of strong and reproducible evidence.

      Although we disagree with elements of the editorial Assessment that appear inconsistent with the empirical findings, we will proceed with publication of the current manuscript as a Reviewed Preprint in order to ensure timely dissemination of findings with meaningful implications for public health and dementia prevention. In this initial public version, the point-by-point responses below provide concise explanations addressing the critiques underlying the Assessment. A revised manuscript, incorporating expanded baseline comparisons across each TriNetX age stratum, additional stringent exclusions, and an expanded discussion that will address the remarks presented in this review, will be submitted shortly.

      Reviewer #1 (Public review):

      Summary:

      This useful study provides incomplete evidence of an association between atovaquone-proguanil use (as well as toxoplasmosis seropositivity) and reduced Alzheimer's dementia risk. The study reinforces findings that VZ vaccine lowers AD risk and suggests that this vaccine may be an effect modifier of A-P's protective effect. Strengths of the study include two extremely large cohorts, including a massive validation cohort in the US. Statistical analyses are sound, and the effect sizes are significant and meaningful. The CI curves are certainly impressive.

      Weaknesses include the inability to control for potentially important confounding variables. In my view, the findings are intriguing but remain correlative / hypothesis generating rather than causative. Significant mechanistic work needs to be done to link interventions which limit the impact of Toxoplasmosis and VZV reactivation on AD.

      We thank the reviewer for describing our study as useful and for highlighting several of its strengths, including the very large cohorts, sound statistical analyses, meaningful effect sizes, and the impressive CI curves. We also appreciate the reviewer’s recognition that our findings reinforce prior evidence linking VZV vaccination to reduced AD risk.

      Regarding the statement that the evidence remains incomplete due to “inability to control for potentially important confounding variables,” we refer to our introductory explanation above. As noted there, our analyses meet the accepted criteria for reproducible epidemiological evidence, and the assumption of uncontrolled confounding is contradicted by rigorous matching and by additional baseline evaluations. We fully agree that mechanistic work is warranted, and our epidemiologic findings strongly motivate such efforts.

      We address the reviewer’s specific comments in detail below.

      (1) Most of the individuals in the study received A-P for malaria prophylaxis as it is not first line for Toxo treatment. Many (probably most) of these individuals were likely to be Toxo negative (~15% seropositive in the US), thereby eliminating a potential benefit of the drug in most people in the cohort. Finally, A-P is not a first line treatment for Toxo because of lower efficacy.

      We agree that individuals in our cohort received Atovaquone-Proguanil (A-P) for malaria prophylaxis rather than for treatment of toxoplasmosis. However, this does not contradict our interpretation. Because latent CNS colonization by T. gondii is not currently considered clinically actionable, asymptomatic carriers are not offered treatment, and therefore would only receive an anti-Toxoplasma regimen unintentionally, through a medication prescribed for another indication such as malaria prophylaxis. Importantly, atovaquone is an established therapy for toxoplasmosis, including CNS disease, with documented efficacy and CNS penetration in current treatment guidelines. It is therefore reasonable to assume that, during the multi-week course typically administered for malaria prophylaxis, A-P would exert significant anti-Toxoplasma activity in individuals with latent CNS infection, potentially reducing or eliminating parasite burden even though the medication was not prescribed for that purpose.

      The reviewer notes that only ~15% of individuals in the U.S. are Toxoplasma-seropositive, based on surveys performed primarily in young adults of reproductive age (serologic testing is most commonly obtained in women during prenatal care). However, seropositivity increases cumulatively over the lifespan, and few reliable estimates exist for the age groups in which Alzheimer’s disease and dementia occur. Even if we accept the lower estimate of ~15% latent colonization in older adults, this proportion is still smaller than the lifetime cumulative incidence of dementia in the general population.

      Therefore, if latent toxoplasmosis contributes causally to dementia risk, and A-P is capable of eliminating latent Toxoplasma in the subset of individuals who harbor it, then a multi-week course of treatment—such as the one routinely taken for malaria prophylaxis—would be expected to produce a substantial reduction in dementia incidence at the population level, of the same order of magnitude reported here. A protective effect concentrated in a minority of exposed individuals is fully compatible with, and can mechanistically explain, the large overall reduction in risk that we observe.

      Finally, the reviewer notes that A-P is not a first-line treatment for toxoplasmosis due to assumed lower efficacy. This point does not undermine our results. Even a second-line agent, when administered over several weeks (as is routinely done for malaria prophylaxis), is expected to exert substantial anti-Toxoplasma activity. The long duration of exposure in large populations receiving A-P for travel provides a unique natural experiment that does not exist for other anti-Toxoplasma medications, which, when prescribed for their non-Toxoplasma indications, are not taken for more than a few days. Thus, the widespread use of A-P for malaria prophylaxis allows a unique opportunity to evaluate long-term outcomes following inadvertent anti-Toxoplasma treatment.

      Moreover, “first line” recommendations in clinical guidelines refer to treatment of acute toxoplasmosis in immunosuppressed individuals, where tachyzoites are actively replicating. These guidelines do not consider efficacy against latent CNS colonization, which is dominated by bradyzoites, a biologically distinct form, in immunocompetent individuals. Therefore, the guideline hierarchy is not informative regarding which medication is more effective at clearing latent brain infection, the stage we consider most relevant to dementia risk.

      (2) A-P exposure may be a marker of subtle demographic features not captured in the dataset such as wealth allowing for global travel and/or genetic predisposition to AD. This raises my suspicion of correlative rather than causal relationships between A-P exposure and AD reduction. The size of the cohort does not eliminate this issue, but rather narrows confidence intervals around potentially misleading odds ratios which have not been adjusted for the multitude of other variables driving incident AD.

      We agree that prior to matching, A-P exposure may be associated with demographic features such as health or the ability to travel internationally. However, this does not apply after matching. In all age-stratified analyses, exposed and control individuals were rigorously matched on all major risk factors known to influence dementia risk, including age, sex, race/ethnicity, smoking status, hypertension, diabetes, and obesity. Owing to the extremely large pool of individuals in TriNetX (~120M), our matching was performed stringently, producing exposed and unexposed cohorts that are near-identical with respect to the established determinants of dementia risk.

      The reviewer correctly identifies that large cohorts alone do not eliminate confounding; however, confounding must still be biologically and epidemiologically plausible. Any hypothetical confounder capable of producing a 50–70% reduction in dementia incidence over a decade would need to: (1) produce a very large protective effect against dementia; (2) be strongly associated with A-P exposure; and (3) remain entirely uncorrelated with age, sex, race/ethnicity, smoking, diabetes, hypertension and obesity, which have been rigorously matched. No such factor has been proposed. The suggestion that an unspecified ‘subtle demographic feature’ could produce effects of this magnitude remains hypothetical, and no such factor has been described in the dementia risk literature.

      If a specific evidence-supported confounder is proposed that meets these criteria, we would be pleased to test it empirically in our cohorts. In the absence of such a proposal, the interpretation that the association is merely “correlative rather than causal” remains speculative and does not negate the strength of a replicated, rigorously matched, long-term association across large cohorts in two national health systems.
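
      As an illustrative aside (not part of the authors' analysis), the "how strong would a confounder need to be" argument above can be quantified with the E-value of VanderWeele and Ding (2017): the minimum association, on the risk-ratio scale, that an unmeasured confounder would need with both exposure and outcome to fully explain away an observed ratio. The sketch below assumes the reported hazard ratios can be treated as approximate risk ratios, which is reasonable only when the outcome is relatively rare over follow-up; the function name `e_value` is hypothetical.

```python
import math


def e_value(ratio: float) -> float:
    """E-value for an observed risk ratio (hazard ratio used as an approximation).

    For protective estimates (ratio < 1) the reciprocal is taken first, so the
    formula is always applied to a ratio >= 1.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    rr = 1.0 / ratio if ratio < 1.0 else ratio
    return rr + math.sqrt(rr * (rr - 1.0))


if __name__ == "__main__":
    # Range of hazard ratios reported across the three age strata.
    for hr in (0.34, 0.51):
        print(f"HR {hr:.2f} -> E-value {e_value(hr):.2f}")
```

      A larger E-value means a stronger unmeasured confounder would be required; it does not by itself show that no such confounder exists.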

      (3) The relationship between herpes virus reactivation and Toxo reactivation seems speculative.

      We respectfully disagree with the characterization of the herpesvirus–Toxoplasma interaction as speculative. The mechanism we describe is biologically valid, based on established virology and parasitology literature showing that latent T. gondii infection can reactivate from its bradyzoite state under inflammatory or immune-modifying conditions, including viral triggers. A published clinical report has documented CNS co-reactivation of T. gondii and a herpesvirus, explicitly noting that HHV-6 reactivation can promote Toxoplasma reactivation in neural tissue (Chaupis et al., Int J Infect Dis, 2016).

      Moreover, this mechanism is the only currently evidence-supported explanation that simultaneously and parsimoniously accounts for all of the epidemiologic observations in our study:

      (1) Substantially higher cumulative incidence of dementia in individuals with positive Toxoplasma serology, indicating that latent infection is a risk factor for subsequent cognitive decline;

      (2) Strong protective association following A-P exposure, a medication with established activity against Toxoplasma gondii, including in the CNS;

      (3) Independent protection conferred by VZV vaccination, observed consistently for two vaccines with distinct formulations (one live attenuated, one recombinant protein), whose only shared property is suppression of VZV reactivation;

      (4) Greater protective effect of A-P among individuals who were not vaccinated against VZV, consistent with a model in which dementia risk requires both herpesvirus reactivation and persistent latent Toxoplasma infection—such that reducing either factor alone (via VZV vaccination or anti-Toxoplasma suppression) substantially lowers risk.

      Taken together, these observations are difficult to reconcile under any alternative hypothesis.  

      To date, we are unaware of any other biologically coherent mechanism that can explain all four findings simultaneously. We would welcome any alternative explanation capable of accounting for these converging epidemiologic signals, as such a proposal could meaningfully advance the scientific discussion. In the absence of a competing explanation, the interaction between latent toxoplasmosis and herpesvirus reactivation remains the most parsimonious hypothesis supported by current knowledge.

      Finally, while observational studies are inherently limited in their ability to provide causal inference, the mechanism we propose is biologically grounded and experimentally testable. Our results provide a strong rationale for mechanistic studies and clinical trials, and warrant publication precisely because they generate a verifiable hypothesis that can now be evaluated directly.

      (4) A direct effect of A-P on AD lesions independent of infection is not considered as a hypothesis. Given the limitations above and effects on metabolic pathways, it probably should be. The Toxo hypothesis would be more convincing if the authors could demonstrate an enhanced effect of the drug in Toxo positive individuals with no effect in Toxo negative individuals.

      A direct effect of A-P on established AD lesions is indeed possible, and this hypothesis would be of significant therapeutic interest. However, we did not consider it within the scope of our epidemiologic analyses because all cohorts explicitly excluded individuals with existing dementia. Under these conditions, proposing a disease-modifying effect on established Alzheimer’s lesions based on our data would itself be speculative. Such a mechanism would be better evaluated through mechanistic or interventional studies than by inference from populations without baseline disease.

      We also agree that demonstrating a stronger protective effect among Toxoplasma-positive individuals would be informative. Unfortunately, this “natural experiment” cannot be performed using the available data: Toxoplasma serology is rarely ordered in older adults, and A-P exposure is itself uncommon, resulting in a cohort overlap far too small to yield valid statistical inference (n≈25 in TriNetX).

      Thus, while both proposed hypotheses are scientifically attractive and merit further study, neither can be resolved using currently available real-world clinical data. Our findings provide the rationale to investigate both hypotheses experimentally, and we hope our report will motivate such studies.

      Reviewer #2 (Public review):

      Summary:

      This manuscript examines the association between atovaquone/proguanil use, zoster vaccination, toxoplasmosis serostatus and Alzheimer's Disease, using 2 databases of claims data. The manuscript is well written and concise. The major concerns about the manuscript center around the indications of atovaquone/proguanil use, which would not typically be active against toxoplasmosis at doses given, and the lack of control for potential confounders in the analysis.

      Strengths:

      (1) Use of 2 databases of claims data.

      (2) Unbiased review of medications associated with AD, which identified zoster vaccination associated with decreased risk of AD, replicating findings from other studies.

      We thank the reviewer for the thoughtful assessment and for noting key strengths of our work, including (1) the use of two large national databases, and (2) the unbiased discovery approach that replicated the widely reported association between zoster vaccination and reduced Alzheimer’s disease (AD) risk. We agree that these features highlight the validity and reproducibility of the analytic framework.

      Below we respond to the reviewer’s perceived weaknesses.

      Weaknesses:

      (1) Given that atovaquone/proguanil is likely to be given to a healthy population who is able to travel, concern that there are unmeasured confounders driving the association.

      We agree that, prior to matching, A-P exposure may correlate with demographic or health-related differences (e.g., ability to travel). However, this potential bias was explicitly controlled for in the study design. Across all three age-stratified TriNetX cohorts, exposed and unexposed individuals were rigorously matched on all major established dementia risk factors: age, sex, race/ethnicity, smoking status, obesity, diabetes mellitus, and hypertension. Comparative analyses confirm that these risk factors are equivalently distributed at baseline.

      As noted in our response to Reviewer #1, for any hypothetical unmeasured confounder to explain the results, it would need to satisfy three conditions simultaneously:

      (1) Be capable of producing a 50–70% reduction in dementia incidence sustained over a decade and across three distinct age strata (ages 50–79);

      (2) Be strongly associated with likelihood of receiving A-P;

      (3) Remain entirely uncorrelated with age, sex, race/ethnicity, smoking, diabetes, hypertension, or obesity, all of which were rigorously matched and balanced at baseline.

      No such factor has been proposed in the literature or by the reviewer. Thus, the concern remains hypothetical and unsupported by any measurable demographic or biological mechanism.

      Importantly, empirical evidence contradicts the notion of a “healthy traveler” bias:

      Emergency and inpatient encounter rates prior to exposure were comparable between A-P users and controls. Across the three age-stratified cohorts, emergency visits were similar or slightly higher among A-P users (EMER: 19.6% vs 16.4%, 19.9% vs 14.2%, 22.0% vs 14.8%), and inpatient encounters were effectively equivalent (IMP: 14.8% vs 15.2%, 17.7% vs 17.6%, 22.1% vs 22.2%). These patterns directly contradict the suggestion that A-P users were a healthier or less medically burdened population at baseline.

      Prevalence of mild cognitive impairment was not lower among A-P users and was, in fact, slightly higher in the oldest cohort. Across the three age groups, baseline diagnoses of mild cognitive impairment (MCI) were comparable or slightly higher among exposed individuals (0.1% vs 0.1%, 0.3% vs 0.2%, 1.1% vs 0.6%). These data contradict the suggestion that A-P users had superior baseline cognition.

      The strongest protective association occurred in the youngest stratum (age 50–59; HR 0.34). At this age, when nearly all individuals are sufficiently healthy to travel internationally, A-P uptake is the least likely to be confounded by health status. A frailty-based “healthy traveler” hypothesis would instead predict the opposite pattern, with older adults showing the greatest apparent benefit, since health limitations are more likely to restrict travel in later life. In contrast, the protective association weakens with increasing age, empirically contradicting any explanation based on differential travel capacity.

      In conclusion, the empirical evidence directly contradicts the existence of a ‘healthy traveler’ effect.
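
      For readers who want to put the baseline percentages quoted above on a common scale, one standard balance diagnostic is the standardized mean difference (SMD) for a binary variable. The sketch below is illustrative only and is not taken from the authors' analysis; it assumes the three pairs of values in each list are ordered from the youngest to the oldest stratum, and the names `smd_binary`, `BASELINE`, and `AGE_STRATA` are hypothetical.

```python
import math


def smd_binary(p_exposed: float, p_control: float) -> float:
    """Standardized mean difference between two proportions.

    Positive values mean the characteristic is more common among exposed
    (A-P) individuals than among matched controls.
    """
    pooled_var = (p_exposed * (1 - p_exposed) + p_control * (1 - p_control)) / 2
    if pooled_var == 0:
        if p_exposed == p_control:
            return 0.0
        return math.copysign(math.inf, p_exposed - p_control)
    return (p_exposed - p_control) / math.sqrt(pooled_var)


# Proportions quoted in the response: (A-P users, controls) per stratum.
# Assumption: pairs are listed in order of increasing age stratum.
AGE_STRATA = ("50-59", "60-69", "70-79")
BASELINE = {
    "emergency encounters": [(0.196, 0.164), (0.199, 0.142), (0.220, 0.148)],
    "inpatient encounters": [(0.148, 0.152), (0.177, 0.176), (0.221, 0.222)],
    "mild cognitive impairment": [(0.001, 0.001), (0.003, 0.002), (0.011, 0.006)],
}

if __name__ == "__main__":
    for label, pairs in BASELINE.items():
        for stratum, (exposed, control) in zip(AGE_STRATA, pairs):
            print(f"{label:26s} {stratum}: SMD = {smd_binary(exposed, control):+.3f}")
```

      The sign of each SMD shows the direction of any imbalance; a positive value for encounter rates corresponds to more prior healthcare use among A-P users, not less.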

      (2) The dose of atovaquone in atovaquone/proguanil is unlikely to be adequate suppression of toxo (much less for treatment/elimination of toxo), raising questions about the mechanism.

      A few important points should address the reviewer’s concern:

      In our cohorts, A-P was prescribed for malaria prophylaxis, as correctly noted. In this setting, it is taken for the entire duration of travel, plus several days before and after, typically resulting in many weeks of continuous exposure. This creates an unintentional but scientifically valuable natural experiment, in which a CNS-penetrating anti-Toxoplasma agent is administered for long durations.

      Atovaquone is an established treatment for CNS toxoplasmosis, has strong CNS penetration, and is included in current clinical guidelines for acute toxoplasmosis in immunocompromised patients, although at higher doses. Because latent, asymptomatic CNS colonization is not treated in clinical practice, there are currently no data establishing the dose required to eliminate bradyzoite-stage Toxoplasma in immunocompetent individuals.

      Our observations concern atovaquone–proguanil (A-P), a fixed-dose combination of atovaquone with proguanil, a DHFR inhibitor targeting a key metabolic pathway shared by malaria parasites and T. gondii. The combination has well-established synergistic effects in malaria prophylaxis and the same mechanism would be expected to enhance anti-Toxoplasma activity. This fixed-dose regimen has never been formally evaluated for toxoplasmosis treatment at prolonged durations or against latent bradyzoite infection.

      Our hypothesis does not require or imply complete eradication of Toxoplasma. A clinically meaningful reduction in latent cyst burden among the subset of colonized individuals may be sufficient to alter long-term disease trajectories. Thus, a population-level decrease in dementia incidence does not require universal clearance of infection, but only partial suppression or reduction of parasite load in susceptible individuals, which is entirely compatible with the known pharmacology and duration of A-P exposure.

      (3) Unmeasured bias in the small number of people who had toxoplasma serology in the TriNetX cohort.

      The relatively small number of older adults with Toxoplasma serology stems from current clinical practice: serologic testing is mostly performed in women during reproductive years due to risks in pregnancy, whereas in older adults a positive result has no clinical consequence and therefore testing is rarely ordered.

      Importantly, the seropositive and seronegative groups were drawn from the same underlying population of individuals who underwent serology testing, and the only difference between groups is the test result itself. Because the decision to order a test is made prior to and independent of the result, there is no plausible rationale by which the serology outcome (positive or negative) would introduce a bias favoring either group beyond the result of the test itself.

      Furthermore, the two groups were also rigorously matched on all major dementia risk factors, including age, sex, race/ethnicity, smoking, diabetes, hypertension, and BMI, and these characteristics are similarly distributed between groups. A small sample size does not imply bias; it simply reduces statistical power. Despite this limitation, the observed association (HR = 2.43, p = 0.001) remains strongly significant.

      Finally, this result is consistent with multiple published studies reporting higher rates of Toxoplasma seropositivity among individuals with Alzheimer’s disease, dementia, and even mild cognitive impairment, such that our finding reinforces a broader and independently observed epidemiologic pattern. Importantly, in our cohort the serology testing clearly preceded dementia diagnosis, which supports the plausibility of a causal rather than merely correlative relationship between latent toxoplasmosis and cognitive decline.

      To conclude our provisional response, we thank the editor and reviewers for raising points that will be further addressed and expanded upon in the discussion of the forthcoming revision. We welcome transparent scientific dialogue and acknowledge that, as with all observational research, residual confounding cannot be eliminated with absolute certainty. However, we disagree with the overall Assessment and emphasize that our findings, reproduced independently across two national health systems and three age-stratified cohorts, each rigorously matched on all major determinants of dementia risk, meet, and in many respects exceed, current standards for high-quality observational evidence.

      Assigning the results to “residual confounding” requires more than speculation: it requires identification of a confounding factor that is (1) anchored in established dementia risk literature, (2) empirically plausible, and (3) quantitatively capable of generating a sustained ~50 percent reduction in dementia incidence over a decade. No such factor has been identified to date. We note that the assertion of “residual confounding” has not been supported by a specific, quantitatively plausible mechanism. A hypothetical bias that is both extremely large in effect and uncorrelated with all major risk factors is not statistically or biologically credible.

      The explanation we propose, reduction in dementia risk through elimination of latent Toxoplasma gondii, is biologically grounded, directly supported by independent epidemiologic literature, and uniquely capable of accounting for all convergent observations in our data. No alternative hypothesis has been put forward that can plausibly explain these findings.

      A revised version of the manuscript will be submitted shortly, incorporating expanded baseline analyses, with the strictest possible exclusion criteria (including congenital, vascular, chromosomal, and neurodegenerative disorders such as Parkinson’s disease), and complete tabulated comparisons. These data will further reinforce that the observed protective associations are not attributable to any measurable confounding. We also plan to enhance the discussion in order to address the points raised by the reviewers.

      In light of the expanded analyses, any reservations expressed in the initial Assessment can now be re-evaluated on the basis of the empirical evidence. The findings reported in our study meet, and in several respects exceed, current epidemiologic standards for high-quality observational research, clearly warrant publication, and provide a robust scientific foundation for future mechanistic and interventional studies to determine whether elimination of latent toxoplasmosis can prevent or treat dementia.

    2. Reviewer #1 (Public review):

      Summary:

      This useful study provides incomplete evidence of an association between atovaquone-proguanil use (as well as toxoplasmosis seropositivity) and reduced Alzheimer's dementia risk. The study reinforces findings that VZ vaccine lowers AD risk and suggests that this vaccine may be an effect modifier of A-P's protective effect. Strengths of the study include two extremely large cohorts, including a massive validation cohort in the US. Statistical analyses are sound, and the effect sizes are significant and meaningful. The CI curves are certainly impressive.

      Weaknesses include the inability to control for potentially important confounding variables. In my view, the findings are intriguing but remain correlative / hypothesis generating rather than causative. Significant mechanistic work needs to be done to link interventions which limit the impact of Toxoplasmosis and VZV reactivation on AD.

      Weaknesses:

      Major:

      (1) Most of the individuals in the study received A-P for malaria prophylaxis as it is not first line for Toxo treatment. Many (probably most) of these individuals were likely to be Toxo negative (~15% seropositive in the US), thereby eliminating a potential benefit of the drug in most people in the cohort. Finally, A-P is not a first line treatment for Toxo because of lower efficacy.

      (2) A-P exposure may be a marker of subtle demographic features not captured in the dataset such as wealth allowing for global travel and/or genetic predisposition to AD. This raises my suspicion of correlative rather than causal relationships between A-P exposure and AD reduction. The size of the cohort does not eliminate this issue, but rather narrows confidence intervals around potentially misleading odds ratios which have not been adjusted for the multitude of other variables driving incident AD.

      (3) The relationship between herpes virus reactivation and Toxo reactivation seems speculative.

      (4) A direct effect of A-P on AD lesions independent of infection is not considered as a hypothesis. Given the limitations above and effects on metabolic pathways, it probably should be. The Toxo hypothesis would be more convincing if the authors could demonstrate an enhanced effect of the drug in Toxo positive individuals with no effect in Toxo negative individuals.

      Minor:

      (5) "Clinically meaningful" should be eliminated from the discussion given that this is correlative evidence.

    3. Reviewer #2 (Public review):

      Summary:

      This manuscript examines the association between atovaquone/proguanil use, zoster vaccination, toxoplasmosis serostatus and Alzheimer's Disease, using 2 databases of claims data. The manuscript is well written and concise. The major concerns about the manuscript center around the indications of atovaquone/proguanil use, which would not typically be active against toxoplasmosis at doses given, and the lack of control for potential confounders in the analysis.

      Strengths:

      (1) Use of 2 databases of claims data.

      (2) Unbiased review of medications associated with AD, which identified zoster vaccination associated with decreased risk of AD, replicating findings from other studies.

      Weaknesses:

      (1) Given that atovaquone/proguanil is likely to be given to a healthy population that is able to travel, there is concern that unmeasured confounders are driving the association.

      (2) The dose of atovaquone in atovaquone/proguanil is unlikely to be adequate for suppression of toxo (much less for treatment/elimination of toxo), raising questions about the mechanism.

      (3) Unmeasured bias in the small number of people who had toxoplasma serology in the TriNetX cohort.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      COMBINED REVIEW REPORTS

      __1.1. The biochemical and biophysical experiments performed in this study were well designed, data were clear and the conclusions were well supported by the results. One potential improvement is to check whether NLS could affect the normal activation targets of ΔNp63α, such as KRT14 and other epithelial genes. This could complement the experiments testing the inhibition effect of ΔNp63α on p53-mediated gene activation. This will be interesting, as ΔNp63α is a master regulator in epithelial cells via regulation of diverse epithelial genes.__

      We thank the Reviewer for this useful comment. In order to further investigate the relationship between p63 nuclear import and function, and the importance of the oligomerization-driven tolerance to point mutations in the latter, we have now performed a number of novel experiments. First of all, we have included both DNp63a NLSn and NLSc mutants in the DNA binding/p53-inhibition assays shown in original Figure 7. The new data are shown in Figure 4E and Supplementary Figure S5. As expected, such mutants had a much smaller effect on DNA binding/p53-inhibition than the NLSbip mutant, further establishing a functional link between p63 nuclear levels and transcriptional activity, and proving the functional relevance of the compensatory mechanism evolved by p63 to tolerate the effect of mutations inactivating either NLSn or NLSc.

      In addition, and as specifically suggested by the Reviewer, we have measured the effect of NLS-impairing mutations on the ability of DNp63a to transactivate the K14 and Bax promoters. Our results, shown in revised Figure 4F and 4G as well as in Supplementary Figure S6, clearly show that both DNp63a NLSn and NLSc mutants transactivate the promoters at levels indistinguishable from the wild-type, consistent with their minimal effect on DNA binding and nuclear transport, while the NLSbip mutation, which prevents nuclear localization and DNA binding, also prevents transcriptional transactivation.

      __1.2. A minor suggestion: authors could consider using p63 rather than ΔNp63α in the manuscript. The heterogenous sequences of NLS regions are relevant for the delta isoform of p63. In addition, all experiments performed in the study are not necessarily specific for the biology of the ΔNp63α isoform, but they are probably informative for all p63 isoforms.__

      We thank the Reviewer for this suggestion. Indeed, we expect the bipartite NLS to mediate nuclear transport of most p63 isoforms, whereas the p63 delta isoform, which lacks NLSn, would be transported into the nucleus by NLSc. We have modified the text in the Discussion section to make this point clearer and more explicit: "the bipartite NLS identified here is responsible for nuclear localization of most p63 isoforms, while p63 delta is transported into the nucleus by NLSc: SIKKRRSPD)." To further corroborate this statement, we have also included new data obtained with the TAp63a and gNp63a isoforms. Our data clearly show that nuclear import of both isoforms depends on the NLSbip identified here and is mediated by the IMPa/b1 heterodimer, so that the findings obtained for the ΔNp63α isoform can be generalized to others. The new data are shown in Figure 3 and in Supplementary Figure S3.

      __1.3. Another minor suggestion: As p63 forms a tetramer when binding to DNA sequence for gene regulation, it would be good for authors to speculate on the role of NLS and its variations in tetramerization.__

      We thank the Reviewer for this comment. Since the NLS is located outside of the tetramerization domain, it is not expected to play a direct role in tetramerization. We have addressed this issue by generating computational models of ΔNp63α and DNp63α;mNLS dimers and tetramers to allow a direct comparison. The new data are shown in Figure 5A-D and Supplementary Figure S11A-D. The data suggest that mutation of the NLS residues, which lie outside the oligomerization domain, does not affect the ability of ΔNp63α to oligomerize, supporting the experimental evidence from Figure 5E (BRET experiments).

      __2.1. In immunofluorescence images it is sometimes difficult to see nuclear accumulation. Single channels of the GFP signal may help to make the point.__

      We thank the Reviewer for pointing out this issue. We have provided single channels for every microscopic image in Supplemental Figures.

      __2.2. The binding assays in Fig. 3 would profit from using the most efficient imp a variant together with imp beta to show potential cooperative binding.__

      We thank the Reviewer for this comment, which helped enhance the physiological relevance of our binding data. We have now introduced the requested data in Supplementary Figure S2A. In the revised Figure panel, we compared binding of FITC-labelled p63-NLS peptide to full-length IMPa1 alone, IMPa1DIBB, and the pre-heterodimerized IMPa1/IMPb1 complex. The data are consistent with a classical binding mode whereby IMPb1 engages the autoinhibitory IBB domain, thereby freeing the minor and major binding sites of full-length IMPa1. To corroborate our results further and demonstrate the bipartite nature of the p63 NLS identified here, we have also performed FP experiments comparing p63-NLS and SV40 LTA NLS (a well-characterized monopartite NLS) in the presence of either wt IMPa1DIBB or its minor and major site mutants. As expected for a bipartite NLS, either mutation impaired binding of p63-NLS significantly, whereas mutation of the minor site had a much smaller effect on binding of SV40 LTA NLS. The new data, shown in Supplementary Figure S2BC and Supplementary Table S3, confirm our hypothesis by highlighting a very strong binding affinity reduction of p63 NLS peptide for IMPa1 major site mutant (

      __2.3. Please mention that NTR can also recognize 3D structures of structural RNAs, e.g. tRNAs or miRNAs.__

      We thank the Reviewer for this very useful suggestion. We have now introduced this concept in the Introduction and added two references to support our statement. The paragraph is as follows: "Additionally, Exportin 5 and Exportin-T evolved to recognize specific RNA structures within pre-miRNAs and t-RNAs, respectively (5, 6)."

      __2.4. longer TA isoforms__

      We have corrected the typo and we thank the Reviewer for noticing it.

      __2.5. homologues or orthologues?__

      We thank the Reviewer for pointing out this issue. We have corrected the text, so that IMPas and members of the p53 family are now referred to as paralogs and not as orthologs.

      __3.1. The major function of DNp63a seems to be that of a bookmarking factor that ensures the establishment of an epithelial transcriptional program. It is found to bind more to enhancer than to promoter regions. While it might also act for a few genes as a classical transcription factor (K14), this bookmarking and interaction with other transcriptional regulators seems to be its major task. This should be included in the introduction.__

      We thank the Reviewer for this suggestion. The Introduction has been modified as requested to incorporate this important concept: "Additionally, p63 has been shown to act as a pioneer factor, shaping the chromatin and enhancer landscape, thus regulating accessibility to activating and repressing transcription factors (18-20)."

      __3.2. "DNp63a can be imported into the nucleus as a dimer" What is the evidence that DNp63a is imported as a dimer and not as a tetramer? Although functionally not really relevant, because all conclusions drawn for a dimer are true for a tetramer (such as the mutation compensation), this statement (and others in the text) should either be substantiated or modified.__

      The Reviewer is correct in pointing out that, while p63 isoforms bind DNA as tetramers (7), the precise oligomeric state at which nuclear import occurs is not firmly established. Indeed, little is known about the regulation of the p63 oligomerization process during nucleocytoplasmic trafficking. While TA isoforms are generally maintained in an inactive, closed, and dimeric conformation (requiring external stimuli such as phosphorylation to undergo activation and tetramerization), ΔNp63α has been reported to form tetramers even in the absence of such stimuli (4, 8). In light of this, we have modified the text to explicitly acknowledge the possibility that ΔNp63α may be transported into the nucleus either as a dimer or as a tetramer, rather than implying a single obligatory oligomeric state.

      Importantly, to directly address the Reviewer's concern, we have broadened the scope of the manuscript to include additional p63 isoforms, particularly TAp63α, which is predominantly present as a dimer under basal conditions. Our new data (Figure 3) demonstrate that TAp63α is efficiently translocated into the nucleus via the IMPα/β1 heterodimer in an NLSbip-dependent manner. Notably, despite its inability to form tetramers, TAp63α displays a similar tolerance to mutations that inactivate individual basic clusters within the bipartite NLS, analogous to what is observed for ΔNp63α (Supplementary Figure S11).

      Together, these results formally demonstrate that dimerization is sufficient to support efficient nuclear import in the presence of NLS-inactivating mutations, and that higher-order oligomerization (i.e., tetramerization) is not required for this property. We have therefore revised the manuscript accordingly to avoid over-interpretation and to more accurately reflect the experimental evidence.

      __3.3. The explanation for the difference in the sensitivity of mutations in the bipartite NLS in the isolated peptide experiments and experiments with the full length DNp63a is intriguing. Unfortunately, it is not based on direct experimental evidence. To prove their model (which is the central claim of this manuscript) they should fuse the bipartite NLS to any dimerization module (e.g. a leucine zipper sequence) and show that by dimerization of the bipartite NLS the same results towards mutations are obtained as for full length DNp63a. This would strongly support their model.__

      We agree that the model for nuclear transport is a central claim of our work and deserves additional experimental validation. To support our hypothesis, in the revised manuscript we have generated a number of additional DNp63a mutants incapable of self-interaction, based on deletion of residues 301-347 (p63-DOD).

      We have now:

      (i) Validated the inability of the DOD mutant to self-interact by means of BRET assays in living cells, whereby a strong decrease in BRET ratio is observed compared to wild-type DNp63a (New Figure 6E and New Supplementary Figure S8).

      (ii) Shown that, in such context, substitution of either the N-terminal or C-terminal basic stretch of amino acids in the NLS is sufficient to impact p63 nuclear import, whereas in the context of the full-length protein, they are not (New Figure 6F-H, and New Supplementary Figure S9).

      (iii) Shown that FLAG-p63 wt could relocalize YFP-p63mNLSbip, but not YFP-p63;DOD;mNLSbip, to the nucleus (New Supplementary Figure S10).

      We believe that these new data further demonstrate the impact of p63 self-association on subcellular localization and strongly support our hypothesis. We greatly thank the Reviewer for their inspiring comment, which led to a significant improvement of our manuscript.

      References

      1. Lotz R, Osterburg C, Chaikuad A, Weber S, Akutsu M, Machel AC, et al. Alternative splicing in the DBD linker region of p63 modulates binding to DNA and iASPP in vitro. Cell Death Dis. 2025;16(1):4.
      2. Ciribilli Y, Monti P, Bisio A, Nguyen HT, Ethayathulla AS, Ramos A, et al. Transactivation specificity is conserved among p53 family proteins and depends on a response element sequence code. Nucleic Acids Res. 2013;41(18):8637-53.
      3. Monti P, Ciribilli Y, Bisio A, Foggetti G, Raimondi I, Campomenosi P, et al. ∆N-P63alpha and TA-P63alpha exhibit intrinsic differences in transactivation specificities that depend on distinct features of DNA target sites. Oncotarget. 2014;5(8):2116-30.
      4. Pitzius S, Osterburg C, Gebel J, Tascher G, Schafer B, Zhou H, et al. TA*p63 and GTAp63 achieve tighter transcriptional regulation in quality control by converting an inhibitory element into an additional transactivation domain. Cell Death Dis. 2019;10(10):686.
      5. Okada C, Yamashita E, Lee SJ, Shibata S, Katahira J, Nakagawa A, et al. A high-resolution structure of the pre-microRNA nuclear export machinery. Science. 2009;326(5957):1275-9.
      6. Kutay U, Lipowsky G, Izaurralde E, Bischoff FR, Schwarzmaier P, Hartmann E, et al. Identification of a tRNA-specific nuclear export receptor. Mol Cell. 1998;1(3):359-69.
      7. Enthart A, Klein C, Dehner A, Coles M, Gemmecker G, Kessler H, et al. Solution structure and binding specificity of the p63 DNA binding domain. Sci Rep. 2016;6:26707.
      8. Deutsch GB, Zielonka EM, Coutandin D, Weber TA, Schafer B, Hannewald J, et al. DNA damage in oocytes induces a switch of the quality control factor TAp63alpha from dimer to tetramer. Cell. 2011;144(4):566-76.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Demarinis et al describe a detailed analysis of different stretches of basic amino acids located between the DBD and the OD of DNp63a to act as nuclear localization signals. They convincingly show that two stretches exist that form a bipartite NLS. They combine both functional import data with structure determination of the NLS sequence with IMP⍺ showing that both parts interact with the major and the minor site. The data are presented well and provide a very good model of how nuclear import is regulated for DNp63a.

      This is a nice study of the bipartite NLS of DNp63a. Most interestingly, the authors show that nuclear import experiments using either the isolated peptide fused to GFP or DNp63a have a different outcome when the individual sequences are mutated. While in the case of the isolated peptide experiments a mutation in either of the two sequences has a measurable effect, this is not the case in the full length DNp63a context. The authors explain this with the oligomeric state of DNp63a, which provides additional sequences from the other monomers within the tetramer, even when one of the NLS sequences is mutated. They provide AlphaFold models to support this explanation. This in trans substitution effect explains why the NLS is not a mutation hotspot for inactivating DNp63a. These results are new and interesting in the context of how DNp63a regulates the development of epithelial tissues.

      Criticism:

      1. The major function of DNp63a seems to be that of a bookmarking factor that ensures the establishment of an epithelial transcriptional program. It is found to bind more to enhancer than to promoter regions. While it might also act for a few genes as a classical transcription factor (K14), this bookmarking and interaction with other transcriptional regulators seems to be its major task. This should be included in the introduction.
      2. "DNp63a can be imported into the nucleus as a dimer" What is the evidence that DNp63a is imported as a dimer and not as a tetramer? Although functionally not really relevant, because all conclusions drawn for a dimer are true for a tetramer (such as the mutation compensation), this statement (and others in the text) should either be substantiated or modified.
      3. The explanation for the difference in the sensitivity of mutations in the bipartite NLS in the isolated peptide experiments and experiments with the full length DNp63a is intriguing. Unfortunately, it is not based on direct experimental evidence. To prove their model (which is the central claim of this manuscript) they should fuse the bipartite NLS to any dimerization module (e.g. a leucine zipper sequence) and show that by dimerization of the bipartite NLS the same results towards mutations are obtained as for full length DNp63a. This would strongly support their model.

      Significance

      Demarinis et al describe a detailed analysis of different stretches of basic amino acids located between the DBD and the OD of DNp63a to act as nuclear localization signals. They convincingly show that two stretches exist that form a bipartite NLS. They combine both functional import data with structure determination of the NLS sequence with IMP⍺ showing that both parts interact with the major and the minor site. The data are presented well and provide a very good model of how nuclear import is regulated for DNp63a.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary

      General assessment:

      The authors show a number of generally very solid experiments that consistently support what is stated in the headline and further developed. They use wt and recombinant deltaN63alpha (N63) to sort out a previously published NLS whose inactivation did not lead to preventing nuclear localization of N63. The authors convincingly show that import is governed by a bipartite NLS. The interesting observation is that, when the bipartite stretch is transferred to GFP to drive import, each motif is required, but the full-length protein tolerates alterations in either motif. The puzzle is solved by further structural analysis of binding of the NLS to importin alpha that shows the bipartite signal to work as expected. However, additional binding studies using BRET demonstrate dimerization that brings two copies of N63 and thus two bipartite signals together that compensate for mutations in one or the other part. Transcriptional activity of p53 can be modulated consistently with nuclear import, i.e. functional NLS motifs.

      The manuscript is overall in a very mature state, and I foresee publication essentially in its present form. A few suggestions may be considered prior to publication:

      1. In immunofluorescence images it is sometimes difficult to see nuclear accumulation. Single channels of the GFP signal may help to make the point.
      2. The binding assays in Fig. 3 would profit from using the most efficient imp a variant together with imp beta to show potential cooperative binding.
      3. wording:

      please mention that NTR can also recognize 3D structures of structural RNAs, e.g. tRNAs or miRNAs

      longer TA isoforms

      homologues or orthologues?

      Significance

      General assessment:

      see above: this is a very consistent and mature study that can be published essentially in its present form.

      Advance:

      Even though the described mechanisms are not novel, they clarify how N63 is imported into human cell nuclei. We now understand this molecular mechanism and can deduce that the amounts of nuclear N63 are directly linked to its transcriptional response on p53.

      Audience:

      I see that this is interesting to experts in the nucleo-cytoplasmic transport field since it adds a novel aspect of how robustness of import can be reached via dimerization. Beyond this, the work brings news in translational research for the physiology and pathology of epithelial tissue differentiation and homeostasis.

    1. There was one anomaly to discover through exploratory testing: the error message displayed when clicking the "Valider le paiement" (confirm payment) button.

      There are 2, not 1: the cart is also empty. An exploratory test should flag exactly this case in addition to the error at validation; otherwise exploratory testing adds nothing in this situation compared with a simple 3-step test scenario...

    1. Bloom's Digital Taxonomy

      I found section 3.3 on Bloom's Digital Taxonomy very pertinent. Step 3, selecting the appropriate action verb, seems to me to be the critical moment of the design. If the verb is poorly chosen (e.g., merely reading instead of interpreting), the e-activity may fail to promote the active learning that the text advocates. Clarity in the verb ensures alignment between the objective and the technology used.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      The strength of the current study lies in their establishing the molecular mechanism through which PRMT1 could alter craniofacial development through regulation of the transcriptome, but the data presented to support the claim that a PRMT1-SFPQ axis directly regulates intron retention of the relevant gene networks should be robust and with multiple forms of clear validation. For example, elevated intron retention findings are based on the intron retention index, and according to the manuscript, are assessed considering the relative expression of exons and introns from a given transcript. However, delineating between intron retention and other forms of alternative splicing (i.e., cryptic splice site recognition) requires a more comprehensive consideration of the intron splicing defects that could be represented in the data. A certain threshold of intron read coverage (i.e., the percent of an intron that is covered by mapped reads) is needed to ascertain if those that are proximal to exons could represent alternative intron ends rather than full intron retention events. In other words, intron retention is a type of alternative splicing that can be difficult to analyze in isolation given the confounding influence of cryptic splicing and cryptic exon inclusion. If other forms of alternative splicing were assessed and not detected, more confident retention calls can be made.

      This manuscript is a mechanistic exploration that follows previous work we published on the role of Prmt1 in craniofacial development, in which genetic deletion of Prmt1 in CNCCs leads to cleft palate and mandibular hypoplasia (PMID: 29986157).

      As the reviewer pointed out, a certain threshold of intron read coverage is needed to assess intron retention events. We employed IRTools to assess the collective changes in intron retention between cell states associated with a given biological function or pathway. IRTools accounts for intron read coverage by checking the evenness of the read distribution within an intron. Specifically, every constitutive intronic region (CIR) is divided into 10 equally sized bins, and the proportion of reads that map to each bin is calculated. CIRs are then ranked according to the imbalance in their bin-wise read distribution, represented by the proportion of reads in the most populated bin. Those in the top 1% are considered to contain potentially false IR events and are excluded. We further addressed this question by developing another measure of intron retention, the intron retention coefficient (IRC), which assesses IR events using junction reads (Supplemental Figure-S8). Junction reads that straddle two exons are called exon-exon junction reads (spliced reads), and those that straddle an exon and a neighboring intron are called exon-intron junction reads (retained reads). The IRC of an intron is defined as the fraction of junction reads that are exon-intron junction reads: IRC = exon-intron read-count / (exon-exon read-count + exon-intron read-count), where exon-intron read-count = (5’ exon-intron read-count + 3’ exon-intron read-count) / 2. The IRC of a gene is defined as the exon-intron fraction of all junction reads overlapping or spanning the constitutive introns of this gene. In calculating the IRC, only exon-intron junction reads that cover the junction point and overlap each side by at least 8 bp were counted, and only exon-exon junction reads that span the relevant junction points and overlap each of the respective exons by at least 8 bp were counted. In this process, the evenness of the proportions of 5’ and 3’ exon-intron junction reads is taken into account. As shown in Supplemental Figure S7A and S7B, IRC analysis generated results consistent with those obtained using IRI (Figure 3A and 3I).
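
      The two measures described above can be illustrated with a minimal Python sketch. This is not the IRTools implementation: the function names, the input layout (pre-computed per-bin and per-junction read counts) and the tie handling are assumptions made purely for illustration, and the 8 bp overlap filter is assumed to have been applied upstream when the reads were counted.

```python
import math
from typing import Mapping, Sequence, Set


def max_bin_fraction(bin_counts: Sequence[int]) -> float:
    """Fraction of a CIR's reads that fall in its most populated bin."""
    total = sum(bin_counts)
    if total == 0:
        return 0.0
    return max(bin_counts) / total


def flag_uneven_cirs(
    bins_by_cir: Mapping[str, Sequence[int]], top_fraction: float = 0.01
) -> Set[str]:
    """Return the CIRs with the most imbalanced read distribution.

    CIRs are ranked by max_bin_fraction (highest first) and the top
    `top_fraction` of them are flagged as potentially false IR events.
    """
    ranked = sorted(
        bins_by_cir,
        key=lambda cir: max_bin_fraction(bins_by_cir[cir]),
        reverse=True,
    )
    n_flagged = int(len(ranked) * top_fraction)
    return set(ranked[:n_flagged])


def intron_retention_coefficient(
    exon_exon: int, exon_intron_5p: int, exon_intron_3p: int
) -> float:
    """IRC = EI / (EE + EI), where EI is the mean of the 5' and 3' counts."""
    exon_intron = (exon_intron_5p + exon_intron_3p) / 2
    total = exon_exon + exon_intron
    if total == 0:
        return math.nan  # no junction reads, so the IRC is undefined
    return exon_intron / total
```

      For example, under these assumptions an intron with 30 exon-exon junction reads, 8 reads at the 5’ exon-intron junction and 12 at the 3’ junction has an exon-intron read-count of 10 and an IRC of 10 / 40 = 0.25.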

      In addition, as the reviewer pointed out, intron retention can be difficult to analyze in isolation. We followed the reviewer’s suggestion that “If other forms of alternative splicing were assessed and not detected, more confident retention calls can be made” and used rMATS to analyze other forms of alternative splicing for all ECM and GAG genes with a significant IRI increase (genes highlighted in Figure-3A and 3I) (Supplemental Figure-S9). Among these genes, only 5 (Cthcr1, Mmp23, Adamts10, Ccdc80 and Col25a1) showed statistically significant changes in skipped exons (SE), 1 (Bmp7) showed significant changes in mutually exclusive exons (MXE), and none showed significant changes in alternative 5’ or 3’ splicing. The SE and MXE changes detected were marginal (Supplemental Figure S8), and the majority of matrix genes with significant intron retention did not exhibit other forms of alternative splicing, further supporting the confidence of the intron retention calls.

      While data presented to support the PRMT1-SFPQ activation axis is quite compelling, that this is directly responsible for the elevated intron retention remains enigmatic. First, in characterizing their PRMT1 knockout model, it is unclear whether the elevated intron retention events directly correspond to downregulated genes.

      In the revised manuscript, we demonstrate IR-triggered NMD as a mechanism for transcript decay and downregulation of matrix genes. When IR-triggered NMD was blocked by the chemical inhibitor NMDI14, the intron-retaining transcripts showed significant accumulation (new Figure-4). NMD is an RNA surveillance system that degrades aberrant RNAs. Intron retention-triggered NMD has both promotive and suppressive roles in cancer, and NMD inhibitors have been tested for cancer therapy, including immunotherapy. During embryonic development, the functional significance of the NMD machinery is suggested by human genetic findings and mouse genetic models. NMD is driven by a protein complex composed of SMG and UPF proteins. Smg6, Upf1, Upf2 and Upf3a knockout mice die at early embryonic stages (E5.5-E9.5), and Smg1 gene trap mutant mice die at E12.5 (PMID: 29272451). SMG9 mutation in human patients causes malformation of the face, hand, heart and brain (PMID: 27018474).

      We show that in CNCCs, NMD functions both as a physiological mechanism and as a response invoked by molecular insult. Blocking NMD in CNCCs caused significant accumulation of intron-retaining Adamts2, Alpl, Eln, Matn2, Loxl1 and Bgn transcripts, suggesting a basal role for NMD in degrading intron-retaining transcripts (Figure-4Ba-4Bf). We further demonstrated the accumulation of Adamts2 and Fbln5 using semi-quantitative PCR, which detected a longer product from Adamts2 intron 19 and Fbln5 intron 7 (Figure-4Ca-4Ch). In CNCCs and ST2 cells, NMD is further invoked by Prmt1 and Sfpq deficiency. In Prmt1-deficient CNCCs, NMD blockage led to higher accumulation of intron-retaining Adamts2 and Alpl transcripts, suggesting that Prmt1 deficiency triggers NMD to reduce intron-containing transcripts (Figure-4Aa, 4Ab). In Sfpq-depleted ST2 cells, blocking NMD caused accumulation of the intron-retaining transcripts Col4a2, St6galnac3 and Ptk7 (Figure-9B, 9C).

      Moreover, intron splicing is a well-documented node for gene regulation during embryogenesis and in other proliferation models, and craniofacial defects are known to be associated with 'spliceosomopathies'. However, reproduction of this phenotype does not suggest that the targets of interest are inherently splicing factors, and a more robust assessment is needed to determine the exact nature of alternative splicing in this system. Because there are several known splicing factors downstream of PRMT1 and presented in the supplemental data, the specific attribution of retention to SFPQ would be additionally served by separating its splicing footprint from that of other factors that are primed to cause alternative splicing.

      We have previously shown that a group of splicing factors, including SFPQ, depends on Prmt1 for arginine methylation (PMID: 31451547). We tested additional splicing factors that are highly expressed in CNCCs and depend on PRMT1 for arginine methylation: SRSF1, EWSR1, TAF15, TRA2B and G3BP1 (Figure-5, 6 and 10). Among these factors, EWSR1 and TRA2B are both methylated in CNCCs and depend on PRMT1 for methylation (Fig. 5 and Supplemental Figure-S3B, S3C). We were not able to assess TAF15 methylation because no efficient antibody was available for the PLA assay. We also demonstrated that, unlike SFPQ, their protein expression and subcellular localization were not altered by Prmt1 deletion in CNCCs (Supplemental Figure-S4). To define their splicing footprint, we performed siRNA-mediated knockdown in ST2 cells, followed by RNA-seq and IRI analysis to define differentially regulated genes and introns. This revealed distinct biological pathways regulated by SFPQ, EWSR1, TRA2B and TAF15, but minimal roles for EWSR1, TRA2B and TAF15 in intron retention when compared to SFPQ (Fig. 10F-10S, Supplemental Figure S7A-S7F, Supplemental Tables S4-S6). ECM genes are significantly downregulated by knockdown of all four splicing factors (Fig. 10F-10I), but EWSR1, TRA2B and TAF15 function through IR-independent mechanisms, such as exon skipping, as exemplified by Postn (Fig. 10J-10S).

      Clarifying the relationship between SFPQ and splicing regulation is important given that the observed splicing defects are incongruous with published data presented by Takeuchi et al., (2018) regarding SFPQ control of neuronal apoptosis in mice. In this system, SFPQ was more specifically attributed to the regulation of transcription elongation over long introns and its knockout did not result in significant splicing changes. Thus, to establish the specificity for the SFPQ in regulating these retention events, authors would need to show that the same phenotype is not achieved by mis-regulation of other splicing factors. That the authors chose SFPQ based on its binding profile is understandable but potentially confounding given its mechanism of action in transcription of long introns (Takeuchi 2018). Because mechanisms and rates of transcription can influence splicing and exon definition interactions, the role of SFPQ as a transcription elongation factor versus a splicing factor is inadequately disentangled by authors.

      To test whether SFPQ acts as a transcription elongation factor, we performed Pol II Cut&Tag in ST2 cells and demonstrated that depletion of SFPQ only caused marginal changes in either the promoter region or gene body of ECM genes, suggesting that the role of SFPQ as a transcriptional activator or elongation factor is minimal (Fig. 7G, 7H). This finding is distinct from SFPQ function in neurons (PMID: 29719248), suggesting that the activation or recruitment of SFPQ in transcriptional regulation may involve tissue-specific factors in neurons.

      Reviewer #2 (Public review):

      Summary:

      The manuscript by Lima et al. examines the role of Prmt1 and SFPQ in craniofacial development. Specifically, the authors test the idea that Prmt1 directly methylates specific proteins, which results in intron retention in matrix proteins. The protein SFPQ is methylated by Prmt1 and functions downstream to mediate Prmt1 activity. The genes with retained introns activate the NMD pathway to reduce the RNA levels. This paper describes an interesting mechanism for the regulation of RNA levels during development.

      Strengths:

      The phenotypes support the authors’ claim that Prmt1 is involved in craniofacial development and splicing. The use of state-of-the-art sequencing to determine the specific genes that have intron retention and changes in gene expression is a strength.

      Weaknesses:

      Some of the data seems to contradict the conclusions. And it is unclear how direct the relationships are between Prmt1 and SFPQ.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      First, the claims regarding the effect of PRMT1 loss on splicing are unclear from the section title. In other words, does loss of PRMT1 change the incidence of baseline alternative splicing events, or does it introduce new retention events that are responsible for underwriting the craniofacial phenotype? Consistent with this idea, the narrative could benefit from more cellular and/or histological validations of the transcriptomic defects discovered in the RNAseq, which could help contextualize the bioinformatics data with the developmental defects. Moreover, the conclusions drawn about intron retention could be clarified in terms of how applicable the mechanism is likely to be outside of this tissue-specific set of responsive introns.

      Loss of Prmt1 did not cause a global shift in intron retention, as shown in Supplemental Figure S2. Instead, Prmt1 deletion increased intron retention specifically in genes enriched in cartilage development, glycosaminoglycan biology, dendrite and axon, and decreased intron retention in mitochondria and metabolism genes (Table S1). We also tested matrix protein expression by histology to confirm that the transcriptomic defects revealed at the RNA level resulted in lower protein production. The new data are in Figure 3E-3H.

      Additionally, invoking NMD to align splicing and differential gene expression data is understandable but lacks sufficient controls to be conclusive, such as positive control genes to confirm inhibition of NMD.

      To validate the blockage of NMD, glutathione peroxidase 1 (Gpx1) intron 1, a well-documented substrate for NMD, was tested as a positive control (Fig 4Ac, 4Ad, 9B).

      Additionally, it should be clarified whether NMD is a basal mechanism for the regulation of these introns or whether it is an induced mechanism that is invoked by the molecular insult.

      In CNCCs, NMD functions both as a physiological mechanism and as a response invoked by molecular insult. Please refer to our responses to Reviewer 1’s public review for detailed explanations.

      Further, authors present data downstream of two siRNAs for the same gene target, but it remains unclear how siRNAs for the same gene target produce different effects. It may be helpful for authors to clarify how many of the transcriptomic defects are shared versus unique between the siRNAs.

      To address this question, we used bioinformatic analysis of the genome-wide data to assess the similarity of the changes caused by the two SFPQ-targeting siRNAs. As shown in the new Fig. 7Ba & 7Bb, transcriptomic and intron changes are consistent between the two siRNAs, suggesting that the genes targeted by the two siRNAs predominantly overlap. This overlap is illustrated by scatter plot analysis of the RNAseq DEG and IRI data from each siRNA against SFPQ.
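
      As a rough illustration of how such concordance could be summarized numerically alongside a scatter plot, the sketch below computes a Pearson correlation and a direction-of-change agreement between per-gene changes from two knockdowns. The function names and the choice of statistics are assumptions for illustration only; the response does not state which statistics, if any, accompany the scatter plots.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between matched per-gene changes (e.g. log2FC)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long vectors with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


def sign_agreement(x: Sequence[float], y: Sequence[float]) -> float:
    """Fraction of genes changing in the same direction under both siRNAs."""
    pairs = [(a, b) for a, b in zip(x, y) if a != 0 and b != 0]
    if not pairs:
        return math.nan
    same = sum(1 for a, b in pairs if (a > 0) == (b > 0))
    return same / len(pairs)
```

      Here `x` and `y` would hold the changes (log2 fold change or IRI difference) for the same genes, in the same order, from siRNA 1 and siRNA 2; values near 1 for both summaries would indicate largely shared effects.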

      Finally, we stress the importance of presenting the full conceptual basis for SFPQ's potential role in splicing and gene expression. It is significant to note that SFPQ has been previously studied as a splicing factor and was instead determined to function in support of transcription elongation rather than in splicing. Thus, if the authors are confident that SFPQ acts directly on splicing, they bear the burden of proof to show that neither its role in transcription nor another splicing factor is driving the splicing changes.

      We demonstrated that depletion of SFPQ only caused marginal changes in either the promoter region or gene body of ECM genes, suggesting that the role of SFPQ as a transcriptional activator or elongation factor is minimal (Fig. 7G, 7H). Please refer to responses to Reviewer 1’s public review for detailed explanations.

      Reviewer #2 (Recommendations for the authors):

      (1) It is not clear why the authors focused on intron retention targets vs the other possibilities. Skipped Exon is much higher in terms of the number of changes, please clarify. For the intron retention how is this quantified? The traces are nice, but it is hard to tell which part is retained at this magnification. Also, because the focus is on extracellular matrix (ECM) and NMD it would be nice to show some of those targets here. In the tbx1 trace, some are up and some are down. What does that mean for the gene expression?

      We initially investigated SE and found that genes with significant changes in Prmt1 CKO CNCCs fall into diverse functional pathways. Among them, a few genes are critical for skeletal formation, including Postn and Fn, and the function of their exon skipping has been documented. For example, the two exons that are skipped in Postn, exons 17 and 21, have been shown in transgenic mouse models to regulate craniofacial skeleton shape and the thickness of the mandibular condyle hypertrophic zone (PMID: 36859617). As illustrated in Figure 10, the skipped exon of Postn is regulated by multiple splicing factors that may perform overlapping functions in vivo.

      Intron retention of each gene is quantified as the ratio of the overall read density of its constitutive intronic regions (CIRs) to the overall read density of its constitutive exonic regions (CERs), defined as the intron retention index (IRI). In the first section of our response to Reviewer 1’s comments, we explained the additional bioinformatic analysis that was performed to address the reviewers’ questions, support the confidence of the intron retention calls and rule out other alternative splicing mechanisms, such as SE, MXE, A5SS or A3SS (Supplemental Figure S5, S6, Table S7).
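
      The IRI definition above can be written out as a short Python sketch. It assumes that "read density" means reads per base pair pooled over all CIRs (or CERs) of a gene; the function names and this normalization are illustrative assumptions, not a description of the IRTools code.

```python
import math


def read_density(read_count: float, region_length_bp: int) -> float:
    """Reads per base pair, pooled over all regions of one type in a gene."""
    if region_length_bp <= 0:
        raise ValueError("region length must be positive")
    return read_count / region_length_bp


def intron_retention_index(
    cir_reads: float, cir_length_bp: int, cer_reads: float, cer_length_bp: int
) -> float:
    """IRI = read density of the gene's CIRs / read density of its CERs."""
    exon_density = read_density(cer_reads, cer_length_bp)
    if exon_density == 0:
        return math.nan  # gene not expressed, so the IRI is undefined
    return read_density(cir_reads, cir_length_bp) / exon_density
```

      For instance, a gene with 50 reads over 10,000 bp of CIRs and 400 reads over 2,000 bp of CERs would have an IRI of 0.005 / 0.2 = 0.025 under these assumptions.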

      (2) RNA-Sequencing of Prmt1 mutants nicely shows gene expression changes, including in ECM and GAG genes. While validation of the sequencing results is not necessarily required, it would be very interesting to show the expression in situ. In addition, the heat map shows both downregulated but also upregulated transcripts. This is expected since this protein regulates many genes. However, the volcano plot shows a significant number of genes upregulated. It would be interesting to show what the upregulated genes are. And what is the proposed mechanism for Prmt1 regulation of upregulated genes?

      Validation for the transcriptomic changes is shown in Fig. 3E-3H using immunostaining.

      As for genes upregulated in the Prmt1 mutant, the top pathways include cytokine-mediated signaling pathway, signal transduction by p53 signaling pathway and cell morphogenesis (Figure 2E), which is consistent with our previous reports that Prmt1 deletion induces cytokine production in the oral epithelium and leads to p53 accumulation in the embryonic epicardium (PMID: 32521264, 29420098). Besides these pathways, Prmt1 deletion also caused upregulation of genes involved in adult behavior, postsynaptic organization and apoptotic process, which is consistent with findings from other labs on PRMT1 function in neuronal and cancer cells (PMID: 34619150, 33127433).

      (3) Specific transcripts were shown to have elevated intron retention involved in the ECM and GAG pathway. However in Figure 3D it seems to show the opposite with intronic expression decreased and exonic increases and intronic decrease. This is very important to the final conclusion of the paper. In addition, is there a direct relationship between increased intron and downregulation of this specific gene expression? It seems a bit correlational as it could also be an indirect mechanism. One way to test this is to do in vitro translation with and without the specific intron to test if it results in lower expression.

      We apologize for the mislabeling in the previous version of Figure 3D, which is now corrected. We also tried to test the direct relationship between intron retention and downregulation of matrix genes such as Adamts2 using in vitro experiments. However, the introns of matrix genes with high retention tend to be long, many of them 10 to 50 kb in length, making it challenging to generate mini-gene constructs for molecular analysis. We therefore used a different approach and demonstrated that inhibition of NMD with the chemical inhibitor NMDI14 caused dramatic accumulation of the Adamts2, Alpl, Eln, Matn2, Loxl1 and Bgn transcripts, suggesting that retained introns trigger NMD to regulate gene expression and that this mechanism acts at a physiological level in CNCCs (Fig. 4). We also blocked NMD in control and Prmt1 null CNCCs, where NMD blockage led to higher accumulation of Adamts2 and Alpl transcripts, suggesting that upon Prmt1 deficiency, NMD is further utilized to degrade intron-containing transcripts (Fig. 4). Similarly, in Sfpq-depleted ST2 cells, blocking NMD caused accumulation of the intron-retaining transcripts Col4a2, St6galnac3 and Ptk7 (Fig. 9A, 9B).

      (4) While Figure 4 nicely shows that the methylation of SFPQ is reduced in Prmt1 CKO cells, it is unclear on which residue this methylation occurs. Also, the overall expression of SFPQ is down, so it is possible that the methylation is indirect, i.e. Prmt1 regulates some other methyltransferase that regulates SFPQ, or that because the overall level of SFPQ is down, there is no protein to methylate. How do the authors differentiate between these possibilities?

      Arginine methylation of SFPQ was previously characterized by Snijders et al. in 2015 using in vitro reactions and biochemical assays in cell lines (PMID: 25605962). Among all PRMTs that catalyze asymmetric arginine dimethylation (ADMA), only PRMT1 and PRMT3 methylate SFPQ, with PRMT1 showing higher efficiency than PRMT3. Moreover, PRMT3 is mainly cytosolic, and its expression in CNCCs is about 100-fold lower than that of PRMT1 (Fig. 1). Based on this knowledge, PRMT1 is the primary arginine methyltransferase for SFPQ, a nuclear protein, in CNCCs. We and others have previously shown that SFPQ methylation on arginine 7 and 9 depends on PRMT1 (PMID: 31451547).

      To investigate SFPQ protein degradation in CNCCs, we used MG132 to block proteasomal degradation and observed that SFPQ protein was partially rescued from degradation in Prmt1 mutant embryos, suggesting that SFPQ is degraded through a proteasome-mediated mechanism. To address the relationship between SFPQ methylation and protein expression, we assessed arginine methylation of the SFPQ that accumulated after MG132 treatment. The accumulated SFPQ was not methylated, confirming the absence of methylation even when SFPQ protein expression is restored.

      Snijders et al. also showed that citrullination induced by PADI4 regulates SFPQ stability (Snijders 2015). We considered this possibility and assessed the expression levels of PADIs. In E13.5 and E15.5 CNCCs, PADI1-4 mRNA expression levels are very low (TPM<5), suggesting that PADIs may not regulate SFPQ stability in CNCCs. The detailed mechanism by which PRMT1-mediated SFPQ methylation controls stability awaits further investigation.

      (5) For the Sfpq-deleted experiment, the two knockdowns do not seem similar in their gene targets, and the GO terms differ except for Wnt signaling. This makes the data difficult to interpret. The genes identified as having intron retention differ from those identified in the Prmt1 deletion and are not reduced as much. How does this fit in with the Prmt1 story? If Prmt1 works through Sfpq, one would expect the targets to be similar and more than 8% to be in common.

      To address the first concern, we used bioinformatic analysis of the genome-wide data to assess the similarity of the changes caused by the two SFPQ-targeting siRNAs. As shown in the new Fig. 7Ba & 7Bb, transcriptomic and intron changes are consistent between the two siRNAs, suggesting that the genes targeted by the two siRNAs predominantly overlap. This overlap is illustrated by scatter plot analysis of the RNAseq DEG and IRI data from each siRNA against SFPQ.

      We have previously identified a group of splicing factors, including SFPQ, that depends on PRMT1 for arginine methylation (PMID: 31451547). In the new data in Figures 5, 6 and 10, we tested an additional five PRMT1-dependent splicing factors that are highly expressed in CNCCs: SRSF1, EWSR1, TAF15, TRA2B and G3BP1 (Fig. 5, 6 and 10). Among these factors, SRSF1 and G3BP1 are predominantly expressed in the cytosol of NCCs at E13.5. As pre-mRNA splicing requires splicing activity in the nucleus, we excluded these two and focused on the other three proteins. EWSR1 and TRA2B are both methylated in CNCCs and depend on PRMT1 for methylation (Fig. 5). We were not able to assess TAF15 methylation because no efficient antibody was available for the PLA assay. We also demonstrated that, unlike SFPQ, their protein expression and subcellular localization were not altered by Prmt1 deletion in CNCCs (Fig. S2). To define their splicing footprint, we performed siRNA-mediated knockdown in ST2 cells, followed by RNA-seq and IRI analysis to define differentially regulated genes and introns. This revealed distinct biological pathways regulated by SFPQ, EWSR1, TRA2B and TAF15, but minimal roles for EWSR1, TRA2B and TAF15 in intron retention when compared to SFPQ (Fig. 10F-10I, Supplemental Figure S7A-S7F). ECM genes are significantly downregulated by knockdown of all four splicing factors (Fig. 10J-10M), but EWSR1, TRA2B and TAF15 regulate transcription or exon skipping instead of IR, as exemplified by Alpl and Postn (Fig. 10N-10T).

      (6) The addition of an NMD mechanism is interesting but not surprising that when inhibiting the pathway broadly, there is an increase in gene expression in the mesoderm cell line. How specific is this to craniofacial development?

      NMD is driven by a protein complex composed of SMG and UPF proteins. We show in the revised manuscript that NMD is both a physiological mechanism in CNCCs and a response triggered by genetic disturbance (Fig. 4). These data are in line with human patient reports in which SMG9 mutation causes malformation of the face, hand, heart and brain (PMID: 27018474). Mouse genetic studies have also demonstrated roles for NMD components during embryonic development. Smg6, Upf1, Upf2 and Upf3a knockout mice die at early embryonic stages (E5.5-E9.5), and Smg1 gene trap mutant mice die at E12.5 (Han 2018). Additionally, intron retention-triggered NMD has both promotive and suppressive roles in cancer, and NMD inhibitors have been tested for cancer therapy and, recently, cancer immunotherapy. Our findings highlight matrix genes as one of the key targets of NMD during craniofacial development.

      Minor:

      (1) The supplemental figures are difficult to understand. In the first upload there are many figures and tables, some excel files that are separate uploads and some not. Please upload as separate files so it is clear. And also put them in order that they are in the manuscript.

      (2) For the heat map in Figure 2B, it would be good to show all the genes or none at all. It seems a bit like cherry-picking to highlight only a few. And they are not labeled where they are located in the graph. Are these the top lines? If so, please label.

      (3) Gene names in Figure 3A are difficult to read. I would also not consider BMP7 an ECM gene.

      (4) A summary diagram of the interactions proposed will help to make this more understandable.

      The supplemental figures have been reorganized and uploaded as separate Word and Excel documents. For the heat map in Fig. 2B, we have removed the gene names. For Fig. 3A, only the most significantly changed genes are labeled, as red dots with names. We did not label all the genes because of their large number. For the new Figure 3B, we have replaced BMP7. A schematic summary has also been added to Supplemental Fig. S9 to illustrate the PRMT1-SFPQ pathway.

    1. Reviewer #3 (Public review):

      Summary:

      In this manuscript, Wang and colleagues explore factors contributing to the diversification of wtf meiotic drivers. wtf genes are autonomous, single-gene poison-antidote meiotic drivers that encode both a spore-killing poison (short isoform) and an antidote to the poison (long isoform) through alternative transcriptional initiation. There are dozens of wtf drivers present in the genomes of various yeast species, yet the evolutionary forces driving their diversification remain largely unknown. This manuscript is written in a straightforward and effective manner, and the analyses and experiments are easy to follow and interpret. While I find the research question interesting and the experiments persuasive, they do not provide any deeper mechanistic understanding of this gene family.

      Revision update:

      Having read the response to the reviewers, I believe the major issues have been addressed. However, I would strongly suggest toning down the claim regarding the chimeric WTF element in the abstract, which currently reads

      "As proof-of-principle, we generate a novel meiotic driver through artificial recombination between wtf drivers, and its encoded poison cannot be detoxified by the antidotes encoded by their parental wtf genes but can be detoxified by its own antidote."

      As the authors report in their response, despite various attempts, it was not possible to show that this chimeric WTF element was indeed capable of meiotic drive in a natural context (as opposed to a transgenic overexpression experiment). Thus, the authors should not claim they generated "a novel meiotic driver".

      Strengths:

      (1) The authors present a comprehensive compendium and analysis of the evolutionary relationships among wtf genes across 21 strains of S. pombe

      (2) The authors found that a synthetic chimeric wtf gene, combining exons 1-5 of wtf23 and exon 6 of wtf18, behaves like a meiotic driver that could only be rescued by the chimeric antidote but neither of the parental antidotes. This is a very interesting observation that could account for their inception and diversification.

      Weaknesses:

      (1) Deletion strains

      The authors separately deleted all 25 wtf genes in the S. pombe reference strain. Next, the authors performed spot assays to evaluate the effect of wtf gene knockout on yeast growth. They report no difference from the WT and conclude that the wtf genes might be largely neutral to the fitness of their carriers in the asexual life cycle, at least under normal growth conditions.

      The authors could have conducted additional quantitative growth assays in yeast, such as growth curves or competition assays, which would have allowed them to detect subtle fitness effects that cannot be quantified with a spot assay. Furthermore, the authors do not rule out simpler explanations, such as genetic redundancy. This could have been addressed by crossing mutants of closely related paralogs or editing multiple wtf genes in the same genetic background.

      Another concern is the lack of detailed information about the 25 knockout strains used in the study. There is no information provided on how these strains were generated or, more importantly, validated. Many of these wtf genes have close paralogs and are flanked by repetitive regions, which could complicate the generation of such deletion strains. As currently presented, these results would be difficult to replicate in other labs due to insufficient methodological details.

      Revision update:

      The authors measured the fitness of the deletion strains using growth curves (Fig. 2C and D) and no significant differences were found, further supporting their claims. The requested information (details on the generation of the deletion strains) is now available in the methods section.

      (2) Lack of controls

      The authors found that a synthetic chimeric wtf gene, constructed by combining exons 1-5 of wtf23 and exon 6 of wtf18, behaves as a meiotic driver that can be rescued only by its corresponding chimeric antidote, but not by either of the parental antidotes (Figure 4F). In contrast, three other chimeric wtf genes did not display this property (Figure 4C-E). No additional experiments were conducted to explain these differences, and basic control experiments, such as verifying the expression of the chimeric constructs, were not performed to rule out trivial explanations. This should be at the very least discussed. Also, it would have been better to test additional chimeras.

      Revision update:

      The authors report that the expression of the construct was measured. However, they do not make reference to any specific figure or section of the main text. It would be very useful if the authors explicitly referenced where exactly changes were made (this is true for all changes made).

      (3) Statistical analyses

      In line 130 the authors state that: "Given complex phylogenetic mixing observed among wtf genes (Figure 1E), we tested whether recombination occurred. We detected signals of recombination in the 25 wtf genes of the S. pombe reference genome (p = 0) and in the wtf genes of the 21 S. pombe strains (p = 0) using pairwise homoplasy index (HPI) test." Reporting a p-value of 0 is not appropriate. Please report exact P-values.

      Revision update:

      This has been addressed.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary

      The authors determine the phylogenetic relation of the roughly two dozen wtf elements of 21 S. pombe isolates and show that none of them in the original S. pombe are essential for robust mitotic growth. It would be interesting to test their meiotic function by simply crossing each deletion mutant with the parent and analyzing spores for non-Mendelian inheritance. If this has been reported already, that information should be added to the manuscript. If not, I suggest the authors do these simple experiments and add this information.

      Thanks for the great summary! All the wtf genes have been tested for meiotic drive phenotypes previously by Bravo Nunez et al. (2020; http://doi.org/10.1371/journal.pgen.1008350). The reference was cited in our original manuscript, and we added the details in the revised manuscript.  

      Strengths:

      The most interesting data (Figure 4) show that one recombinant (wtfC4) between wtf18 and wtf23 produces in mitotic growth a poison counteracted by its own antidote but not by the parental antidotes. Again, it would be interesting to test this recombinant in a more natural setting - meiosis between it and each of the parents.

      Thanks for this insightful comment! As suggested, we have tried to test this recombinant in a more natural setting. We created a recombinant strain (wtfC4) based on the laboratory strain 972h-. Specifically, we replaced the last exon of the original wtf23 gene with the last exon of wtf18. However, we encountered a challenge: since strain 972h- has only one mating type and cannot undergo meiosis on its own, we had to mate the recombinant strain with a BN0 h⁺ strain that only carries the wtf23<sup>antidote</sup>. Unfortunately, despite tens of attempts over nearly a year, we did not observe the expected meiotic drive phenotype. This might be due to issues with the proper splicing and expression of the potential poison and antidote proteins, or to the genetic background. Similarly, the drive activity of wtf13 has been shown to be specifically suppressed in certain backgrounds.

      Weaknesses:

      In the opinion of this reviewer, some minor rewriting is needed.

      We did the rewriting as this reviewer suggested.

      Reviewer #2 (Public review):

      Summary:

      This important study provides a mechanism that can explain the rapid diversification of poison-antidote pairs (wtf genes) in fission yeast: recombination between existing genes.

      Thanks!

      Strengths:

      The authors analyzed the diversity of wtf in S. pombe strains, and found pervasive copy number variations. They further detected signals of recurrent recombination in wtf genes. To address whether recombination can generate novel wtf genes, the authors performed artificial recombination between existing wtf genes, and showed that indeed a new wtf can be generated: the poison cannot be detoxified by the antidotes encoded by parental wtf genes but can be detoxified by its own antidote.

      Thanks for the great summary!

      Weaknesses:

      The study can benefit from demonstrating that the novel poison-antidote constructed by the authors can serve as a meiotic driver.

      Thanks for this insightful comment! As suggested, we have tried to test this recombinant in a more natural setting. We created a recombinant strain (wtfC4) based on the laboratory strain 972h-. Specifically, we replaced the last exon of the original wtf23 gene with the last exon of wtf18. However, we encountered a challenge: since strain 972h- has only one mating type and cannot undergo meiosis on its own, we had to mate the recombinant strain with a BN0 h⁺ strain that carries only the wtf23<sup>antidote</sup>. Unfortunately, despite dozens of attempts over nearly a year, we did not observe the expected meiotic driver phenotype. This might be due to issues with the proper splicing and expression of the potential poison and antidote proteins or to the genetic background. Similarly, the drive activity of wtf13 has been shown to be specifically suppressed in certain backgrounds.

      Reviewer #3 (Public review):

      Summary:

      In this manuscript, Wang and colleagues explore factors contributing to the diversification of wtf meiotic drivers. wtf genes are autonomous, single-gene poison-antidote meiotic drivers that encode both a spore-killing poison (short isoform) and an antidote to the poison (long isoform) through alternative transcriptional initiation. There are dozens of wtf drivers present in the genomes of various yeast species, yet the evolutionary forces driving their diversification remain largely unknown. This manuscript is written in a straightforward and effective manner, and the analyses and experiments are easy to follow and interpret. While I find the research question interesting and the experiments persuasive, they do not provide any deeper mechanistic understanding of this gene family.

      Thanks! Please see the following for our point-by-point response.

      Strengths:

      (1) The authors present a comprehensive compendium and analysis of the evolutionary relationships among wtf genes across 21 strains of S. pombe.

      (2) The authors found that a synthetic chimeric wtf gene, combining exons 1-5 of wtf23 and exon 6 of wtf18, behaves like a meiotic driver that could only be rescued by the chimeric antidote but neither of the parental antidotes. This is a very interesting observation that could account for their inception and diversification.

      Thanks for the great summary!

      Weaknesses:

      (1) Deletion strains

      The authors separately deleted all 25 wtf genes in the S. pombe reference strain. Next, the authors performed a spot assay to evaluate the effect of wtf gene knockout on yeast growth. They report no difference from the WT and conclude that the wtf genes might be largely neutral to the fitness of their carriers in the asexual life cycle, at least in normal growth conditions.

      The authors could have conducted additional quantitative growth assays in yeast, such as growth curves or competition assays, which would have allowed them to detect subtle fitness effects that cannot be quantified with a spot assay. Furthermore, the authors do not rule out simpler explanations, such as genetic redundancy. This could have been addressed by crossing mutants of closely related paralogs or editing multiple wtf genes in the same genetic background.

      Another concern is the lack of detailed information about the 25 knockout strains used in the study. There is no information provided on how these strains were generated or, more importantly, validated. Many of these wtf genes have close paralogs and are flanked by repetitive regions, which could complicate the generation of such deletion strains. As currently presented, these results would be difficult to replicate in other labs due to insufficient methodological details.

      We generated growth curves for all 25 wtf deletion strains and have provided the details of the wtf gene knockouts. However, with 25 wtf genes there are too many combinations for editing two genes at a time, and it is technically challenging to knock out multiple wtf genes together. Nevertheless, our results suggest that single wtf genes have little effect on host fitness under normal conditions.

      (2) Lack of controls

      The authors found that a synthetic chimeric wtf gene, constructed by combining exons 1-5 of wtf23 and exon 6 of wtf18, behaves as a meiotic driver that can be rescued only by its corresponding chimeric antidote, but not by either of the parental antidotes (Figure 4F). In contrast, three other chimeric wtf genes did not display this property (Figure 4C-E). No additional experiments were conducted to explain these differences, and basic control experiments, such as verifying the expression of the chimeric constructs, were not performed to rule out trivial explanations. This should be at the very least discussed. Also, it would have been better to test additional chimeras.

      We verified the expression of the chimeric genes. The last exon of wtf18 is too small (128 bp) to construct further meaningful chimeras.

      (3) Statistical analyses

      In line 130 the authors state that: "Given complex phylogenetic mixing observed among wtf genes (Figure 1E), we tested whether recombination occurred. We detected signals of recombination in the 25 wtf genes of the S. pombe reference genome (p = 0) and in the wtf genes of the 21 S. pombe strains (p = 0) using pairwise homoplasy index (HPI) test." Reporting a p-value of 0 is not appropriate. Exact P-values should be reported. 

      Due to software limitations, the PHI test reports p-values of 0.0 for extremely significant results. We have therefore reported them as <0.0001 in the revised manuscript.
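
      The reporting point above can be illustrated with a minimal sketch. It assumes the p-value comes from a permutation procedure, in which case a result of 0 only means that no permuted statistic was as extreme as the observed one, so the value is bounded by the number of permutations rather than being exactly zero. The function names, the add-one estimator, and the 1e-4 reporting floor are illustrative assumptions, not the behaviour of the software the authors used; the direction that counts as "extreme" is left as a parameter.

```python
def permutation_p_value(observed, permuted, lower_is_extreme=True):
    """Add-one permutation p-value, (b + 1) / (n + 1), which can never be 0."""
    if lower_is_extreme:
        b = sum(1 for s in permuted if s <= observed)
    else:
        b = sum(1 for s in permuted if s >= observed)
    return (b + 1) / (len(permuted) + 1)


def format_p(p, floor=1e-4):
    """Report very small p-values as an upper bound instead of printing 0."""
    return f"<{floor:g}" if p < floor else f"{p:.4g}"


# Hypothetical example: none of 999 permuted statistics is as low as the observed one.
p = permutation_p_value(0.1, [0.5] * 999)
print(format_p(p), format_p(0.0))
```

      With 999 permutations the smallest reportable value under this convention is 1/1000, which is why a bound such as "<0.0001" is only meaningful if at least 10,000 permutations were run.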

      Recommendations for the authors:

      Reviewing Editor Comments:

      Regarding the synthetic chimeric wtf gene constructed by combining exons of wtf23 and wtf18, the authors did not explicitly test whether it acts as a meiotic driver in the natural context of a cross. Instead, they examined this possibility only through transgenic overexpression experiments. Given that this is arguably the most important claim of the paper, it is critical that the authors perform, report, and discuss such an experiment in a natural context, regardless of the outcome. It is not necessary to test other recombinants or other wtf loci.

      Thanks for this insightful comment! As suggested, we have tried to test this recombinant in a more natural setting. We created a recombinant strain (wtfC4) based on the laboratory strain 972h-. Specifically, we replaced the last exon of the original wtf23 gene with the last exon of wtf18. However, we encountered a challenge: since strain 972h- has only one mating type and cannot undergo meiosis on its own, we had to mate the recombinant strain with a BN0 h⁺ strain that carries only the wtf23<sup>antidote</sup>. Unfortunately, despite dozens of attempts over nearly a year, we did not observe the expected meiotic driver phenotype. This might be due to issues with the proper splicing and expression of the potential poison and antidote proteins or to the genetic background. Similarly, the drive activity of wtf13 has been shown to be specifically suppressed in certain backgrounds.

      Reviewer #1 (Recommendations for the authors):

      The paper is very well written, but some minor points should be corrected or checked.

      (1) Line 95: Why "Putative"? Is it not clear what a wtf pseudogene is?

      “Putative” was removed.

      (2) Line 105: Does "known functional" mean they are active (i.e., have been tested and shown to be active)? If so, a reference should be added.

      We used “known meiotic drivers” and added a reference here.

      (3) Line 135: "no recombination signal was tested". Do the authors mean no signal was inferred? 

      We changed “tested” to “detected”.

      (4) Line 147: References for "known functional meiotic drivers (wtf23) and artificially generated meiotic driver (wtf18)" should be given. A statement of how wtf18 was "artificially generated" is essential so the reader knows how that element differs from the wtfC4 generated here.

      We added a reference for wtf23. As for wtf18, we have specified this in the following text, namely “we artificially introduced an in-frame ATG codon right before the start of exon 2, generating wtf18poison/-0M.”

      (5) Lines 154 and 424 say an ATG codon was introduced "right before the start of exon 2," but Figure 4B shows it before exon 1.

      We thank the reviewer. The introduced ATG is the second start codon in the long transcript and the first in the short transcript. The right panel of Figure 4B shows the short transcript, so the text and figure are consistent.

      (6) Line 159: The wtf18 mutant with this additional ATG codon should be tested in meiosis, to see if "putative" is correct.

      Thanks. As with wtfC4, we encountered technical challenges in showing the driver phenotype in a natural setting, and we have therefore removed this statement.

      (7) Line 181: change "driver" to "drive".

      Driver is correct.

      (8) Line 184: insert to read "wtf genes tested". Also, what is the basis for proposing that "the last exon might be crucial for antidote function"?

      “Tested” was added, and the statement was removed.

      (9) Line 198: change to read "detects only large differences".

      Done as suggested.

      (10) Line 204: change "removed" to "removal".

      Done as suggested.

      (11) Lines 242 and 243: Are "Splittree4" and "SplitsTree4" different, or is this a misprint?

      Corrected!

      (12) Lines 274-5 and 412-3 would read better as "strains were diluted in five 10-fold steps" and "...μL of each dilution spotted on" "…to assay for…"

      Done as suggested.

      (13) Line 284 says "No new data were generated." This is clearly wrong. Perhaps the authors mean there are no supplementary data files.

      Corrected!

      (14) Line 406: Change "is" to "are".

      Corrected!

      (15) Line 413: Surely, they were spotted onto YE agar medium, not liquid medium.

      Corrected!

      (16) Figure 3C: Define "Rho" and the scale used.

      The definition of Rho has been added to the Methods section in the revised manuscript.

      Reviewer #2 (Recommendations for the authors):

      The evidence is largely solid, but the study can benefit from demonstrating that the novel poison-antidote constructed by the authors can serve as a meiotic driver.

      As suggested, we have tried to test this recombinant in a more natural setting. We created a recombinant strain (wtfC4) based on the laboratory strain 972h-. Specifically, we replaced the last exon of the original wtf23 gene with the last exon of wtf18. However, we encountered a challenge: since strain 972h- has only one mating type and cannot undergo meiosis on its own, we had to mate the recombinant strain with a BN0 h⁺ strain that carries the wtf23<sup>antidote</sup>. Unfortunately, despite dozens of attempts over nearly a year, we did not observe the expected meiotic driver phenotype. This might be due to issues with the proper splicing and expression of the potential poison and antidote proteins.

      Reviewer #3 (Recommendations for the authors):

      I strongly recommend the authors provide all the details concerning the generation of the knock-out strains, including specific primers used (for both the deletion and validation), the result of these validations, and the specific genotype (and ID) of the strains generated.

      These details are now included in the Materials and Methods section and in Supplementary.

      Please also provide exact P-values (see point 3).

      Due to software limitations, the PHI test reports p-values of 0.0 for extremely significant results. We have therefore reported them as <0.0001 in the revised manuscript.

    1. Author response:

      The following is the authors’ response to the previous reviews

      Public Reviews:

      Reviewer #2 (Public review):

      In this valuable manuscript, Lin et al attempt to examine the role of long non coding RNAs (lncRNAs) in human evolution, through a set of population genetics and functional genomics analyses that leverage existing datasets and tools. Although the methods are incomplete and at times inadequate, the results nonetheless point towards a possible contribution of long non coding RNAs to shaping humans, and suggest clear directions for future, more rigorous study.

      Comments on revisions:

      I thank the authors for their revision and changes in response to previous rounds of comments. As before, I appreciate the changes made in response to my comments, and I think everyone is approaching this in the spirit of arriving at the best possible manuscript, but we still have some deep disagreements on the nature of the relevant statistical approach and defining adequate controls. I highlight a couple of places that I think are particularly relevant, but note that given the authors disagree with my interpretation, they should feel free to not respond!

      (1) On the subject of the 0.034 threshold, I had previously stated: "I do not agree with the rationale for this claim, and do not agree that it supports the cutoff of 0.034 used below."

      In their reply to me, the authors state:

      "What we need is a gene number, which (a) indicates genes that effectively differentiate humans from chimpanzees, (b) can be used to set a DBS sequence distance cutoff. Since this study is the first to systematically examine DBSs in humans and chimpanzees, we must estimate this gene number based on studies that identify differentially expressed genes in humans and chimpanzees. We choose Song et al. 2021 (Song et al. Genetic studies of human-chimpanzee divergence using stem cell fusions. PNAS 2021), which identified 5984 differentially expressed genes, including 4377 genes whose differential expression is due to trans-acting differences between humans and chimpanzees. To the best of our knowledge, this is the only published data on trans-acting differences between humans and chimpanzees, and most HS lncRNAs and their DBSs/targets have trans-acting relationships (see Supplementary Table 2). Based on these numbers, we chose a DBS sequence distance cutoff of 0.034, which corresponds to 4248 genes (the top 20%), slightly fewer than 4377."

      I have some notes here. First, Agoglia et al, Nature, 2021, also examined the nature of cis vs trans regulatory differences between human and chimps using a very similar set up to Song et al; their Supplementary Table 4 enables the discovery of genes with cis vs trans effects although admittedly this is less straightforward than the Song et al data. Second, I can't actually tell how the 4377 number is arrived at. From Song et al, "Of 4,671 genes with regulatory changes between human-only and chimpanzee-only iPSC lines, 44.4% (2,073 genes) were regulated primarily in cis, 31.4% (1,465 genes) were regulated primarily in trans, and the remaining 1,133 genes were regulated both in cis and in trans (Fig. 2C). This final category was further broken down into a cis+trans category (cis- and transregulatory changes acting in the same direction) and a cis-trans category (cis- and trans-regulatory changes acting in opposite directions)." Even when combining trans-only and cis&trans genes that gives 2,598 genes with evidence for some trans regulation. I cannot find 4,377 in the main text of the Song et al paper.

      Elsewhere in their response, the authors respond to my comment that 0.034 is an arbitrary threshold by repeating the analyses using a cutoff of 0.035. I appreciate the sentiment here, but I would not expect this to make any great difference, given how similar those numbers are! A better approach, and what I had in mind when I mentioned this, would be to test multiple thresholds, ranging from, e.g., 0.05 to 0.01 <DBS dist =0.01 -> 0.034 -> 0.05> at some well-defined step size.

      (1) We sincerely thank the reviewer for this critical point. Our initial premise, based on DBS distances from the human genome to the chimpanzee genome and archaic genomes, was that genes with large DBS distances may have contributed more to human evolution. However, our ORA (overrepresentation analysis) explored only genes with large DBS distances (the legend of old Figure 2 was “1256 target genes whose DBSs have the largest distances from modern humans to chimpanzees and Altai Neanderthals are enriched in different Biological Processes GO terms”), using the cutoff (threshold) of 0.034 to define a large distance. The cutoff is not totally unreasonable (as our new results and the following sensitivity analysis indicate), but this approach was indirect and flawed.

      (2) We have now performed ORA using two methods. The first uses only DBS distances. Instead of using a cutoff, we now sort genes by DBS distance (human-chimpanzee distance and human-Altai Neanderthal distance, respectively; see Supplementary Table 5) and use the top 25% and bottom 25% of genes to perform ORA. This directly examines whether DBS distances alone indicate that genes with large DBS distances contribute more to human evolution than genes with small DBS distances. The second also explores the ASE genes (allele-specific expression, genes undergoing human/chimpanzee-specific regulation in the tetraploid human–chimpanzee hybrid iPS) reported by Agoglia et al. 2021. We select the top 50% and bottom 50% of genes with large and small DBS distances, intersect them with ASE genes from Agoglia et al. 2021 (their Supplementary Table 4), and apply ORA to the intersections. Both analyses show that (a) more GO terms are obtained from genes with large DBS distances, and (b) more human evolution-related GO terms are obtained from genes with large DBS distances (Supplementary Table 5,6,7; Figure 2; Supplementary Fig. 15). These results directly suggest that genes with large DBS distances contribute more to human evolution than genes with small DBS distances, which is a key theme of the study.

      (3) Regarding Song et al. 2021, the statement “we differentiated…allotetraploid (H1C1a, H1C1b, H2C2a, H2C2b) lines into ectoderm, mesoderm, and endoderm” made us assume that their differentiated hybrid cell lines cover more tissue types than those of Agoglia et al. 2021. Now, upon re-examining Supplementary Table 5 of Song et al. and Supplementary Table 4 of Agoglia et al. 2021, we find that the latter more clearly indicates significant ASE genes (p-adj<0.01 and |LFC|>0.5 in GRCh38 and PanTro5).

      (4) We have also performed two additional analyses in response to the suggestion of “test multiple thresholds, ranging from, eg, 0.05 to 0.01 <DBS dist =0.01 -> 0.034 -> 0.05> at some well-defined step size”. First, we performed a multi-threshold sensitivity analysis using a spectrum of cutoffs (0.03, 0.034, 0.04, 0.05), and tracked the number of genes identified and the enrichment significance of key GO terms (e.g., "neuron projection development," "behavior") across these thresholds. The result confirms that while the absolute number of genes varies with the cutoffs, the core biological conclusion (specifically, the significant enrichment of target genes in neurodevelopmental and cognitive functions) remains stable and significant. For instance, "behavior" maintains strong statistical significance (FDR<0.01) in both the human-chimpanzee and human-Altai Neanderthal comparisons across all tested cutoffs, and "neuron projection development" also remains significant across three (0.03, 0.034, 0.04) of the four cutoffs in the Altai comparison. This pattern suggests that our core findings regarding neurodevelopmental functions are robust across a range of cutoffs. Nevertheless, we did not extend the analysis to smaller cutoffs (e.g., 0.01 or 0.02) because such values would identify an excessively large number of genes (>10000) for ORA, which would render the GO-term enrichment analysis less meaningful due to a loss of specificity.

      Second, we have performed an additional validation to directly evaluate whether the 0.034 cutoff itself represents a stringent and biologically meaningful value. We sought to empirically determine how often a DBS sequence distance of 0.034 or greater might occur by chance in promoter regions, thereby testing its significance as a marker of potential evolutionary divergence. We randomly sampled 10,000 windows from annotated promoter regions across the hg38 genome, each with a size matching the average length of DBSs (147 bp). We then calculated the per-base sequence distances for these random windows between modern humans and chimpanzees, as well as between modern humans and the three archaic humans (Altai, Denisovan, Vindija). The analysis reveals that a distance of ≥0.034 is a rare event in random promoter sequences: for Human-Chimp, Human-Altai, Human-Denisovan, and Human-Vindija, 5.49% (549/10000), 0.31% (31/10000), 4.47% (447/10000), and 0.03% (3/10000) of random windows reach this distance. This empirical evidence suggests that 0.034 is a sufficiently stringent cutoff for defining a large DBS distance: such a distance is unlikely to occur in a random genomic background (P<0.1 for chimpanzee and P<0.05 for the archaic humans), and DBSs exceeding this cutoff are significantly enriched for sequences that have undergone substantial evolutionary change rather than random neutral variation.

      (5) We present new Figure 2, Supplementary Table 5,6,7, and Supplementary Fig. 15. We have substantially revised section 2.3, related sections in Results, Supplementary Note 3, and Supplementary Table 8. We have removed related descriptions and explanations in the main text and Supplementary Notes. The results of the above two analyses are presented here as Author response table 1 and Author response image 1.

      Author response table 1.

      Sensitivity analysis of GO-term enrichment across different DBS sequence distance cutoffs. The table shows the numbers of target genes identified and the false discovery rates (FDR) for the enrichment of three selected GO terms at four different distance cutoffs. Note that, unlike in the old Figure 2, the results for chimpanzees and Altai Neanderthals are not directly comparable here, as the numbers of target genes used for the enrichment analysis differ between them at each cutoff.
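
      The kind of multi-threshold sweep summarised in this table can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: overrepresentation is scored with a one-sided hypergeometric test for a single GO term, no FDR correction is applied, and the gene names, distances, and annotation set are invented toy values.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K carry the GO term."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def sweep_cutoffs(dbs_distance, term_genes, cutoffs):
    """For each cutoff, select genes with distance >= cutoff and test enrichment."""
    background = set(dbs_distance)
    annotated = term_genes & background
    rows = []
    for c in cutoffs:
        selected = {g for g, d in dbs_distance.items() if d >= c}
        overlap = len(selected & annotated)
        p = hypergeom_sf(overlap, len(background), len(annotated), len(selected))
        rows.append((c, len(selected), overlap, p))
    return rows


# Toy data (hypothetical genes and distances, for illustration only).
toy_distance = {"g1": 0.06, "g2": 0.045, "g3": 0.036,
                "g4": 0.031, "g5": 0.02, "g6": 0.01}
toy_term = {"g1", "g2", "g5"}
rows = sweep_cutoffs(toy_distance, toy_term, [0.03, 0.034, 0.04, 0.05])
for cutoff, n_selected, n_overlap, p in rows:
    print(f"cutoff={cutoff:<5} selected={n_selected} overlap={n_overlap} p={p:.3f}")
```

      Because the selected gene set shrinks as the cutoff rises, p-values at different cutoffs rest on different sample sizes, which is the same comparability caveat noted in the table legend.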

      Author response image 1.

      Distribution of per-base sequence distances for DBS size-matched random genomic windows in Ensembl-annotated promoter regions, calculated between modern humans and (A) chimpanzee, (B) Altai Neanderthal, (C) Denisovan, and (D) Vindija Neanderthal genomes.
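
      The quantity plotted here, and the tail fractions quoted in the response, can be reproduced in outline with the sketch below. It assumes each random window is already aligned base-for-base between the two genomes and that every mismatch (including a gap character) counts as one difference; window sampling and alignment are omitted, and the function names are illustrative.

```python
def per_base_distance(seq_a, seq_b):
    """Fraction of aligned positions that differ between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def tail_fraction(distances, cutoff=0.034):
    """Share of random windows whose distance reaches the cutoff."""
    return sum(d >= cutoff for d in distances) / len(distances)


# For a 147-bp window, the smallest number of differing bases that reaches 0.034:
window = 147
min_mismatches = next(m for m in range(window + 1) if m / window >= 0.034)
print(min_mismatches, min_mismatches / window)

# Reproducing the arithmetic of the quoted human-chimpanzee figure (549 of 10,000 windows):
print(tail_fraction([0.04] * 549 + [0.0] * 9451))
```

      Under these assumptions a 147-bp window needs at least five differing bases to reach the cutoff, so the tail fraction is effectively the share of random promoter windows carrying five or more differences.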

      (2) The authors have introduced a new TFBS section, as a control for their lncRNAs - this is welcome, though again I would ask for caution when interpreting results. For instance, in their reply to me the authors state: "The number of HS TFs and HS lncRNAs (5 vs 66) <HS TF vs all HS lncRNAs> alone lends strong evidence suggesting that HS lncRNAs have contributed more significantly to human evolution than HS TFs (note that 5 is the union of three intersections between <many2zero + one2zero> and the three <human TF list>)."

      But this assumes the denominator is the same! There are 35899 lncRNAs according to the current GENCODE build; 66/35899 = 0.0018, so, 0.18% of lncRNAs are HS. The authors compare this to 5 TFs. There are 19433 protein coding genes in the current GENCODE build, which naively (5/19433) gives a big depletion (0.026%) relative to the lnc number. However, this assumes all protein coding genes are TFs, which is not the case. A quick search suggests that ~2000 protein coding genes are TFs (see, eg, https://pubmed.ncbi.nlm.nih.gov/34755879/); which gives an enrichment (although I doubt it is a statistically significant one!) of HS TFs over HS lncRNAs (5/2000 = 0.0025). Hence my emphasis on needing to be sure the controls are robust and valid throughout!

      We thank the reviewer for this comment. While 5 vs 66 reveals a difference, a direct comparison is too simplified. The real take-home message of the new TFBS section is not the numbers but the distributions of HS TFs’ targets and HS lncRNAs’ targets across GTEx organs and tissues (Figure 3 and Supplementary Figures 24, 25) - correlated HS lncRNA-target transcript pairs are highly enriched in brain regions, but correlated HS TF-target transcript pairs are distributed broadly across GTEx tissues and organs. We have now removed the simple comparison of “5 vs 66” and more carefully explained our comparison in section 2.6.
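
      The reviewer's denominator argument can be checked with a short sketch. It takes the counts quoted in the comment at face value (the figure of roughly 2000 TFs is the reviewer's approximation) and treats the comparison as a 2x2 table of human-specific versus other genes in each class, which is a simplification. The Fisher exact test is implemented directly from the hypergeometric distribution; the function name is illustrative.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Probability of x "successes" in row 1 with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


hs_tf, all_tf = 5, 2000        # ~2000 TFs: the reviewer's rough figure
hs_lnc, all_lnc = 66, 35899    # lncRNA count quoted by the reviewer

frac_tf = hs_tf / all_tf
frac_lnc = hs_lnc / all_lnc
p_value = fisher_exact_two_sided(hs_tf, all_tf - hs_tf, hs_lnc, all_lnc - hs_lnc)
print(f"HS TFs: {frac_tf:.4%}  HS lncRNAs: {frac_lnc:.4%}  Fisher p = {p_value:.3g}")
```

      Normalising by class size in this way is what makes the two counts comparable; whether the resulting difference is meaningful still depends on how reliable each denominator is.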

      (3) In my original review I said: line 187: "Notably, 97.81% of the 105141 strong DBSs have counterparts in chimpanzees, suggesting that these DBSs are similar to HARs in evolution and have undergone human-specific evolution." I do not see any support for the inference here. Identifying HARs and acceleration relies on a far more thorough methodology than what's being presented here. Even generously, pairwise comparison between two taxa only cannot polarise the direction of differences; inferring human-specific change requires outgroups beyond chimpanzee.

      In their reply to me, the authors state:

      Here, we actually made an analogy but not an inference; therefore, we used such words as "suggesting" and "similar" instead of using more confirmatory words. We have revised the latter half sentence, saying "raising the possibility that these sequences have evolved considerably during human evolution".

      Is the aim here to draw attention to the ~2.2% of DBS that do not have a counterpart? In that case, it would be better to rewrite the sentence to emphasise those, not the ones that are shared between the two species? I do appreciate the revised wording, though.

      (1) Our original phrasing may be misleading, and we agree entirely that “pairwise comparison between two taxa only cannot polarise the direction of differences; inferring human-specific change requires outgroups beyond chimpanzee”. As explained in that reply, we know and think that DBSs and HARs are two different classes of sequences, and indeed, identifying HARs and acceleration relies on a far more thorough methodology. Yet, three factors prompted us to compare them. First, both suggest the importance of sequences outside genes. Second, both are quite “old” sequences and have undergone considerable evolution recently (although the references are different). Third, both have contributed greatly to human brain evolution.  

      (2) Here, our stress is 97.81% but not 2.2%, and we have made this analogy more clearly and cautiously. Relevant revisions have been made in the Results, Discussion, and Methods sections.   

      (3) We have also determined whether the 2.2% of DBSs are human-specific gains by analyzing them using the UCSC Multiz Alignments of 100 Vertebrates. The result confirms that all 2248 DBSs are present in the human genome but are absent from the chimpanzee genome and all other aligned vertebrate genomes. We have added this result to the manuscript.

      (4) Finally, Line 408: "Ensembl-annotated transcripts (release 79)" Release 79 is dated to March 2015, which is quite a few releases and genome builds ago. Is this a typo? Both the human and the chimpanzee genome have been significantly improved since then!

      (1) We thank the reviewer for this comment, which prompts us to provide further explanation and additional data. First, we began predicting HS lncRNAs’ DBSs when Ensembl release 79 was available, but did not re-predict DBSs when new Ensembl releases were published because (a) these new Ensembl releases are also based on hg38, (b) we did not find any fault in the LongTarget program during our use, nor did we receive any fault reports from users, and (c) predicting lncRNAs’ DBSs using the LongTarget program is highly time-consuming.

      (2) Second, to assess the influence of newer Ensembl releases, we compared the promoters annotated in release 79 and in release 115. We found that the vast majority (87.3%) of promoters newly annotated in release 115 belong to non-coding genes. Thus, using release 115 may predict more DBSs in non-coding genes, but downstream analyses based on protein-coding genes would be essentially the same (meaning that all figures and tables would be the same).

      (3) Third, a key element of this study is GTEx data analysis, and these data were also published years ago.  

      (4) Finally, some lncRNA genes have new gene symbols in new Ensembl releases. To allow researchers to use our data conveniently, we have added a new column titled "Gene symbol (Ensembl release115)" to Supplementary Tables 2A and 2B.  

      Summary:

      Major changes based on Reviewer’s comments:

      (1) The following revisions are made to address the comment on “the 0.034 threshold”: (a) Section 2.3, section 2.4, Supplementary Note 3, and related contents in Discussion and Methods are revised, (b) new Figure 2, Supplementary Figure 15, new Supplementary Table 5,6,7, (c) Table 2 and Supplementary Table 8 are revised.

      (2) To address the comment on “new TFBS section”, section 2.6 and section 4.13 are revised.  

      (3) To address the comment on “97.81% and 2.2% of DBSs”, section 2.3 is revised.

      (4) The following revisions are made to address the comment on “release 79”: (a) the old Supplementary Tables 2 and 3 are merged into Supplementary Table 2AB, and the new column "Gene symbol (Ensembl release115)" is added to Supplementary Table 2AB, (b) accordingly, Supplementary Tables 4 and 5 are renamed Supplementary Tables 3 and 4.

      Additional revisions:

      (1) Section 2.5 “Young weak DBSs may have greatly promoted recent human evolution” is moved into Supplementary Note 3 (which now has the subtitle “Target genes with specific DBS features are enriched in specific functions”), because this section is short and lacks sufficient cross-validation.

      (2) Considerable minor revisions of sentences have been made.

      (3) Since there are many supplementary figures, the main text now cites only Supplementary Notes, as the reader can easily access supplementary figures in Supplementary Notes.

    1. Reviewer #3 (Public review):

      Summary

      This study aimed to investigate whether the differences observed in the organization of visual brain networks between blind and sighted adults result from a reorganization of an early functional architecture due to blindness, or whether the early architecture is immature at birth and requires visual experience to develop functional connections. This question was investigated through the comparison of 3 groups of subjects with resting-state functional MRI (rs-fMRI). Based on convincing analyses, the study suggests that: 1) secondary visual cortices showed higher connectivity to prefrontal cortical regions (PFC) than to non-visual sensory areas (S1/M1 and A1) in infants like in blind adults, in contrast to sighted adults; 2) the V1 connectivity pattern of infants lies between that of sighted adults (showing stronger functional connectivity with non-visual sensory areas than with PFC) and that of blind adults (showing stronger functional connectivity with PFC than with non-visual sensory areas); 3) the laterality of the connectivity patterns of infants resembled those of sighted adults more than those of blind adults, but infants showed a less differentiated fronto-occipital connectivity pattern than adults.

      Strengths

      - The question investigated in this article is important for understanding the mechanisms of plasticity during typical and impaired development, and the approach considered, which compares different groups of subjects, including neonates/infants and blind adults, is highly original.

      - Overall, the presented analyses are solid and well-detailed, and the results and discussion are convincing.

      Weaknesses

      - While it is informative to compare the "initial" state (close to birth) and the "final" states in blind and sighted adults to study the impact of post-natal and visual experience, this study does not analyze the chronology of this development and when the specialization of functional connections is completed. This would require investigating the evolution of functional connectivity of the visual system as a function of visual experience and thus as a function of age, at least during toddlerhood given the early and intense maturation of the visual system after birth. This could be achieved by analyzing different developmental periods using open databases such as the Baby Connectome Project.

      - The rationale for grouping full-term neonates and preterm infants (scanned at term-equivalent age) is not understandable when seeking to perform comparisons with adults. Even if the study results do not show differences between full-terms and preterms in terms of functional connectivity differences between regions and of connectivity patterns, the preterm group had different neurodevelopment and post-natal (including visual) experiences (even a few weeks might have an impact). In fact, they show systematically reduced connectivity strength for all regions compared with full-terms (Sup Fig 7). Considering a more homogeneous group of neonates would have strengthened the study design.

      - The rationale for presenting results on the connectivity of secondary visual cortices before the one of primary cortices (V1) could be clarified.

      - The authors acknowledge the methodological difficulties for defining regions of interest (ROIs) in infants in a similar way as adults. Since the brain development is not homogeneous and synchronous across brain regions (in particular with the frontal and parietal lobes showing a delayed growth), this poses major problems for registration. This raises the question of whether the study findings could be biased by differences in ROI positioning across groups.

      Comments on revisions:

      The authors have addressed my specific recommendations, but some weaknesses in the study remain, particularly the inclusion of preterm infants alongside full-term neonates.

    2. Author response:

      The following is the authors’ response to the previous reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The present study evaluates the role of visual experience in shaping functional correlations between human extrastriate visual cortex and frontal regions. The authors used fMRI to assess "resting-state" temporal correlations in three groups: sighted adults, congenitally blind adults, and neonates. Previous research has already demonstrated differences in functional correlations between visual and frontal regions in sighted compared to early blind individuals. The novel contribution of the current study lies in the inclusion of an infant dataset, which allows for an assessment of the developmental origins of these differences.

      The main results of the study reveal that correlations between prefrontal and visual regions are more prominent in the blind and infant groups, with the blind group exhibiting greater lateralization. Conversely, correlations between visual and somato-motor cortices are more prominent in sighted adults. Based on these data, the authors conclude that visual experience plays an instructive role in shaping these cortical networks. This study provides valuable insights into the impact of visual experience on the development of functional connectivity in the brain.

      Strengths:

      The dissociations in functional correlations observed among the sighted adult, congenitally blind, and neonate groups provide strong support for the main conclusion regarding postnatal experience-driven shaping of visual-frontal connectivity.

      The inclusion of neonates offers a unique and valuable developmental anchor for interpreting divergence between blind and sighted adults. This is a major advance over prior studies limited to adult comparisons.

      Convergence with prior findings in the blind and sighted adult groups reinforces the reliability and external validity of the present results.

      The split-half reliability analysis in the infant data increases confidence in the robustness of the reported group differences.

      Weaknesses:

      The manuscript risks overstating a mechanistic distinction between sighted and blind development by framing visual experience as "instructive" and blindness as "reorganizing." Similarly, the binary framing of visual experience and blindness as independent may oversimplify shared plasticity mechanisms.

      The interpretation of changes in temporal correlations as altered neural communication does not adequately consider how shifts in shared variance across networks may influence these measures without reflecting true biological reorganization.

      The discussion does not substantively engage with the longstanding debate over whether sensory experience plays an instructive or permissive role in cortical development.

      The relationship between resting-state and task-based findings in blindness remains unclear.

      Reviewer #2 (Public review):

      Summary:

      Tian et al. explore the developmental origins of cortical reorganization in blindness. Previous work has found that a set of regions in the occipital cortex show different functional responses and patterns of functional correlations in blind vs. sighted adults. Here, Tian et al. explore how this organization arises over development. Is the "starting state" more like the blind pattern, or more like the adult pattern? Their analyses reveal that the answer depends on the particular networks investigated. Some functional connections in infants look more like blind than sighted adults; other functional connections look more like sighted than blind adults; and others fall somewhere in the middle, or show an altogether different pattern in infants compared with both sighted and blind adults.

      Strengths:

      The paper addresses very important questions about the starting state in the developing visual cortex, and how cortical networks are shaped by experience. Another clear strength lies in the unequivocal nature of many results. Many results have very large effect sizes, critical interactions between regions and groups are tested and found, and infant analyses are replicated in split halves of the data.

      Weaknesses:

      While potential roles of experience (e.g., visual, cross-modal) are discussed in detail, little consideration is given to the role of experience-independent maturation. The infants scanned are extremely young, only 2 weeks old. It is possible then that the sighted adult pattern may still emerge later in infancy or childhood, regardless of infant visual experience. If so, the blind adult pattern may depend on blindness-related experience only (which may or may not reflect "visual" experience per se). In short, it is not clear that birth, or the first couple weeks of life, are a clear cut "starting point" for development, after which all change can be attributed to experience.

      Reviewer #3 (Public review):

      Summary

      This study aimed to investigate whether the differences observed in the organization of visual brain networks between blind and sighted adults result from a reorganization of an early functional architecture due to blindness, or whether the early architecture is immature at birth and requires visual experience to develop functional connections. This question was investigated through the comparison of 3 groups of subjects with resting-state functional MRI (rs-fMRI). Based on convincing analyses, the study suggests that: 1) secondary visual cortices showed higher connectivity to prefrontal cortical regions (PFC) than to non-visual sensory areas (S1/M1 and A1) in infants like in blind adults, in contrast to sighted adults; 2) the V1 connectivity pattern of infants lies between that of sighted adults (showing stronger functional connectivity with non-visual sensory areas than with PFC) and that of blind adults (showing stronger functional connectivity with PFC than with non-visual sensory areas); 3) the laterality of the connectivity patterns of infants resembled those of sighted adults more than those of blind adults, but infants showed a less differentiated fronto-occipital connectivity pattern than adults.

      Strengths

      - The question investigated in this article is important for understanding the mechanisms of plasticity during typical and impaired development, and the approach considered, which compares different groups of subjects, including neonates/infants and blind adults, is highly original.

      - Overall, the presented analyses are solid and well detailed, and the results and discussion are convincing.

      Weaknesses

      - While it is informative to compare the "initial" state (close to birth) and the "final" states in blind and sighted adults to study the impact of post-natal and visual experience, this study does not analyze the chronology of this development and when the specialization of functional connections is completed. This would require investigating the evolution of functional connectivity of the visual system as a function of visual experience and thus as a function of age, at least during toddlerhood given the early and intense maturation of the visual system after birth. This could be achieved by analyzing different developmental periods using open databases such as the Baby Connectome Project.

      - The rationale for grouping full-term neonates and preterm infants (scanned at term-equivalent age) is not understandable when seeking to perform comparisons with adults. Even if the study results do not show differences between full-terms and preterms in terms of functional connectivity differences between regions and of connectivity patterns, the preterm group had different neurodevelopment and post-natal (including visual) experiences (even a few weeks might have an impact). In fact, they show systematically reduced connectivity strength for all regions compared with full-terms (Sup Fig 7). Considering a more homogeneous group of neonates would have strengthened the study design.

      - The rationale for presenting results on the connectivity of secondary visual cortices before the one of primary cortices (V1) could be clarified.

      - The authors acknowledge the methodological difficulties for defining regions of interest (ROIs) in infants in a similar way as adults. Since the brain development is not homogeneous and synchronous across brain regions (in particular with the frontal and parietal lobes showing a delayed growth), this poses major problems for registration. This raises the question of whether the study findings could be biased by differences in ROI positioning across groups.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      The authors are appropriately cautious in many parts of the discussion and include several helpful control analyses. Nonetheless, additional clarification of key assumptions and potential confounds would strengthen the paper.

      (1) The current framing labels vision as "instructive" and blindness as "reorganizing," but it is unclear why these two experiential factors are characterized differently. Both involve activity-dependent changes to functional architecture from a shared immature scaffold. Labeling them differently risks conflating divergent outcomes with distinct underlying mechanisms. Just because visual and blind adults show different patterns of functional connectivity does not mean they reflect separate processes. While the discussion briefly acknowledges the possibility of shared plasticity mechanisms, much of the framing across the manuscript, including in the abstract and introduction, implies a dichotomy. A clearer articulation of the criteria used to assign these labels, or reconsideration of whether such a distinction is warranted, would improve conceptual clarity. The current framing appears analogous to saying that "heat causes expansion" and "cold causes contraction" as if these were separate mechanisms, when they are actually two directions of change along a single factor: temperature. A more parsimonious framework, such as activity-dependent reweighting of pre-existing connectivity, may better capture the nature of plasticity at play in both sighted and blind development.

      Following the reviewer’s suggestion, we have revised the manuscript to clarify that both vision and blindness can be understood as manifestations of a common framework of experience-driven plasticity. We removed all mentions of reorganization and modified the wording throughout.

      Specifically:

      Abstract: “Are infant visual cortices functionally like those of sighted adults, with blindness leading to functional change? We find that, on the contrary that secondary visual cortices of infants are functionally more like those of blind adults: stronger coupling with PFC than with nonvisual sensory-motor networks, suggesting that visual experience modifies elements of the sighted-adult long-range functional connectivity profile. Infant primary visual cortices are in-between blind and sighted adults i.e., more balanced PFC and sensory-motor connectivity than either adult group. The lateralization of occipital-to-frontal connectivity in infants resembles the sighted adults, consistent with the idea that blindness leads to functional change. These results suggest that both vision and blindness modify functional connectivity through experience-driven (i.e., activity-dependent) plasticity.” (Page 1, Line 13)

      Introduction: We replaced “blindness leads to functional reorganization” with “blindness modifies this functional connectivity” (Page 2, Line 52), and the following sentence has also been modified to: “lifetime visual experience shapes connectivity toward the sighted-adult pattern” (Page 2, Line 54). For the lateralization patterns, we now describe them as “blindness-related modification” rather than “reorganization”, to keep the interpretation descriptive rather than mechanistic (Page 4, Line 114).

      (2) In interpreting the functional correlation differences, the discussion should more explicitly consider how statistical interdependence between areas could influence the observed results. For example, an increase in shared variance between visual and motor areas, such as might result from visually guided action, could result in a reduction in the apparent strength of visual-prefrontal temporal correlation (at the resolution of fMRI) without any true biological change in communication between visual-prefrontal cortex. This possibility is not ruled out by reporting groupwise patterns of relative connectivity. A more cautious systems-level framing could help clarify the distinction between neural plasticity and statistical redistribution of variance.

      We thank the reviewer for raising this important point. We agree that resting-state fMRI provides a measure of statistical synchrony in BOLD signals rather than direct causal interactions between regions. This is a fundamental limitation of resting-state fMRI, which we now note in the Discussion section. Such changes in correlation are consistent with a variety of underlying biological mechanisms. Online task performance is one factor that influences cross-region correlations. In the current study, both blind and sighted groups were measured while blindfolded and were not performing visually guided actions during the resting-state fMRI scans. It is possible that past experience with visually guided action changes the resting-state correlations of sighted participants. Indeed, this is one interesting hypothesis.

      In the revised Discussion, we now explicitly note this limitation and clarify that differences in FC do not by themselves establish whether or how underlying neurophysiological mechanisms are changed. We also emphasize that future work will need to investigate whether FC changes are accompanied by alterations in structural connectivity and to probe causal interactions and mechanistic underpinnings as follows:

      “Resting-state functional connectivity captures synchrony in BOLD signal fluctuations rather than causal interactions and differences in functional connectivity cannot on their own reveal how underlying neurophysiological mechanisms are modified.” (page 13,line 342)

      “Future studies will be needed to determine whether these functional changes are accompanied by alterations in structural connectivity, and to probe causal interactions and mechanistic underpinnings.” (page 13,line 350)

      (3) The mechanistic interpretation of group differences in visual-motor coupling would benefit from stronger network-level justification. Direct connections between these areas are sparse in primates. If effects reflect indirect polysynaptic interactions or shared thalamic input, as the authors suggest, one might expect corresponding group differences in intermediate regions (e.g., parietal cortex, thalamus) that mediate these interactions. Is there any evidence for this in the data?

      We thank the reviewer for raising this point. We agree and, as noted above, resting-state fMRI cannot distinguish between direct causal interactions between two regions and interactions in which a mediating region is involved. This is a fundamental limitation of resting-state fMRI. The current study focused on testing a specific hypothesis motivated by previously observed group differences between blind and sighted adults. Our analyses therefore focused on ROI-to-ROI connectivity between occipital, frontal, and sensory-motor cortices, and did not include these additional regions. In prior work, we and others have looked at effects in parietal cortices (Abboud & Cohen, 2019; Bedny et al., 2009; Deen et al., 2015; Kanjlia et al., 2016, 2021; Sen et al., 2022). In blindness, parietal networks show increased, rather than decreased, correlations with some visual areas. Regarding the thalamus, the evidence is less clear, and there is some ongoing work trying to address this question. A couple of studies suggest that there is indeed increased connectivity between some parts of the thalamus and visual cortex in blindness. Although the anatomical information is limited, some of the work suggests that this increase is with higher-cognitive nuclei of the thalamus (Bedny et al., 2011; Liu et al., 2007).

      We agree that this is an important direction for future work. To acknowledge this point, we have revised the manuscript to highlight the potential role of cortical and subcortical hub regions in mediating connectivity changes. The text has been modified as follows:

      “Connectivity changes between two areas could be mediated by ‘third-party’ hub regions. For example, posterior parietal cortex serves as a cortical hub for multisensory integration and visuo-motor coordination and could mediate occipital-to-sensory-motor communication (Rolls et al., 2023; Sereno & Huang, 2014). Subcortical structures such as the thalamus could also play a mediating role (Vega-Zuniga et al., 2025).” (page 13,line 345)

      (4) The discussion would benefit from deeper engagement with prior work on experience-dependent plasticity, particularly the longstanding distinction between instructive and permissive roles of experience. While the authors briefly define these concepts and reference their historical use, a more explicit consideration of how their findings relate to this broader literature would help clarify whether such distinctions are necessary or appropriate.

      We thank the reviewer for this thoughtful suggestion to engage more explicitly with the longstanding literature on instructive versus permissive roles of experience. However, most of this literature comes from animal models, where experimental manipulations of the anatomical structure, of experience itself (e.g., controlled rearing studies) and sometimes of neural activity patterns allow clear tests of these mechanisms. Such manipulations are not feasible in humans. The terminology in the animal literature does not directly map onto the methods and data available in the present study or in other work with humans. For this reason, the current data do not allow us to fully engage with the debates in the animal literature, and doing so risks overinterpreting our findings.

      Nevertheless, we agree that once the instructive/permissive framework has been introduced, it is important to clarify how our results relate to it, rather than only providing definitions. We have therefore added the following text to the discussion:

      “In humans, such manipulations are not feasible, leaving us to study only the consequences of the presence or absence of vision. Under an instructive account, visual and multisensory experience could strengthen coupling between visual and other non-visual sensory-motor cortices through coordinated activity, thereby establishing the sighted-adult connectivity pattern. In the absence of visual input, by contrast, the lack of such coordinated activity may prevent these couplings from being established. Alternatively, vision may act permissively, indirectly enabling maturational processes that shift connectivity toward the sighted-adult configuration.” (page 14,line 362)

      (5) The revised discussion acknowledges the divergence between resting-state and task-based findings, but does not fully frame the theoretical implications of this discrepancy. Although this study cannot resolve the issue with its own data, a more integrative discussion could help clarify whether these measures reflect distinct functional states, developmental trajectories, or mechanisms of plasticity. Without such framing, readers are left without clear guidance on how to reconcile the present results with prior work on cross-modal recruitment in blindness.

      We thank the reviewer for this thoughtful comment. We agree that knowing how resting-state evidence relates to task-based evidence is a fundamentally important issue. We now discuss this more in the Introduction as well as in the Discussion.

      There is a sizable literature of both task-based and resting-state studies. Some prior studies have measured resting-state and task-based data within the same participants and found relationships (Kanjlia et al., 2016, 2021; Lane et al., 2015). We now clarify this in the Introduction. These studies find that within visual cortices of blind people, the task-based profile of a cortical area is related to its resting-state connectivity pattern (Abboud & Cohen, 2019; Deen et al., 2015; Kanjlia et al., 2016, 2021). This suggests that these two measures are related. However, the timecourse of this relationship, the developmental trajectory and the mechanism of plasticity are not known. We note this now in the Introduction on page 2. This is primarily because there is very little relevant developmental evidence. For example, in the current study we find that the resting-state profile of secondary visual networks in infants is similar to that of blind adults. However, we do not know whether the visual cortices of infants show task-based cross-modal responses. To our knowledge, nobody has tested this question. We agree with the reviewer that raising this question in the paper is better than not commenting on the relationship at all.

      To address the reviewer’s comment, we have expanded the discussion to situate our results within a developmental framework, highlighting how early intrinsic connectivity may scaffold alternative trajectories shaped by either visual experience or blindness. The revised text now reads as follows:

      “Conversely, for people who remain blind throughout life, visual-PFC connectivity could enable recruitment of visual cortices for higher-order non-visual functions, such as language and executive control (Bedny et al., 2011; Kanjlia et al., 2021). Our results suggest that blind adults may build on connectivity patterns already present in infancy: like blind adults, sighted infants show stronger occipital–PFC than occipital–sensory–motor coupling. Repeated engagement of occipital networks during higher cognitive tasks in early development could in turn enhance connectivity and specialization of visual networks for non-visual higher-order functions.

      Some prior studies have measured resting-state and task-based functional profiles in the same participants. These studies find that within visual cortices of blind people, the task-based profile of a cortical area is related to its resting state connectivity pattern (citations.) This suggests that these two measures are related. However, the timecourse of this relationship, the developmental trajectory and mechanism of plasticity is not known. Primarily this is because there is very little relevant developmental evidence. For example, in the current study we find that the resting state profile of secondary visual networks in infants is similar to that of blind adults. However, we do not know whether the visual cortices of infants show enhanced task-based cross modal responses, relative to sighted adults and how this compares to responses observed in blind adults. Future work with infants and children would be able to address this question.

      In the current study, the clearest evidence for functional change driven by blindness was observed for laterality. Connectivity lateralization in sighted infants resembles that of sighted adults, in both V1 and secondary visual cortices. Relative to both sighted infants and sighted adults, blind adults show more lateralized connectivity patterns between occipital and prefrontal cortices. Previous studies suggest that in people born blind occipital and non-occipital language responses are co-lateralized (Lane et al., 2017; Tian et al., 2023). We speculate that habitual activation of visual cortices by higher-cognitive tasks, such as language, which are themselves highly lateralized, contributes to this biased connectivity pattern of occipital cortex in blindness. Taken together, these results suggest a developmental framework in which intrinsic connectivity present in infancy provides a scaffold that is subsequently shaped and reinforced by experience-dependent recruitment, through either visual experience or the lifelong absence of vision in blindness. Longitudinal work across successive developmental stages will be crucial to test how the alternative trajectories shaped by visual experience versus blindness unfold over development.” (page 14-15)

      (6) The split-half reliability analysis is a valuable control. Additional details would clarify what these noise ceilings reflect. Were the rsFC patterns for each ROI calculated only for the ROIs included in the current study, or was a broader assessment across the whole brain performed? It also would be helpful to report whether reliability differed for individual ROIs within and between groups. Even if global reliability is matched, selective differences could influence group comparisons. Several infants in the dHCP dataset were scanned twice. Were any second scans included in the current analyses? Comparing first versus second scans directly could strengthen the claim that several weeks of visual experience are insufficient to shift connectivity toward a sighted adult profile.

      We thank the reviewer for these comments on the reliability of the current study.

      In the present study, the noise ceiling was computed from the reliability of the ROI-wise FC profiles used across all analyses. Reliability was estimated using a split-half procedure: each rs-fMRI time series was divided into two equal halves, FC among all ROIs included in the study was computed separately for each half, and the noise ceiling for each ROI was defined as the Pearson correlation between its two FC profiles. We then averaged these ROI-wise noise ceilings to evaluate group-level reliability; the average exceeded 0.70 in all three groups and did not differ significantly across groups. This provides an estimate of the upper bound on explainable variance for the exact FC features subjected to statistical testing (Lage-Castellanos et al., 2019). A brief description has been added to the manuscript (page 19, line 518).
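
      For illustration only, the split-half procedure described above can be sketched in a few lines of Python. This is a minimal sketch, not the analysis code used in the study: the function names (`pearson`, `fc_matrix`, `split_half_noise_ceilings`, `mean_noise_ceiling`) are hypothetical, the input is assumed to be one participant's ROI time series given as equal-length lists, and leaving the self-correlation out of each FC profile is an assumption that the response does not specify.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def fc_matrix(roi_series):
    """ROI-by-ROI functional connectivity matrix (Pearson r)."""
    n = len(roi_series)
    return [[pearson(roi_series[i], roi_series[j]) for j in range(n)]
            for i in range(n)]


def split_half_noise_ceilings(roi_series):
    """Per-ROI noise ceiling for one participant.

    The time series is cut into two equal halves, FC is computed in each
    half, and each ROI's two FC profiles are correlated.
    Assumption: the diagonal (self-correlation) is excluded from a profile.
    """
    half = len(roi_series[0]) // 2
    fc_first = fc_matrix([s[:half] for s in roi_series])
    fc_second = fc_matrix([s[half:2 * half] for s in roi_series])
    n = len(roi_series)
    ceilings = []
    for i in range(n):
        profile_a = [fc_first[i][j] for j in range(n) if j != i]
        profile_b = [fc_second[i][j] for j in range(n) if j != i]
        ceilings.append(pearson(profile_a, profile_b))
    return ceilings


def mean_noise_ceiling(roi_series):
    """Average of the ROI-wise noise ceilings for one participant."""
    ceilings = split_half_noise_ceilings(roi_series)
    return sum(ceilings) / len(ceilings)
```

      At least four ROIs are needed for the sketch to be informative, because a profile with only two entries always correlates at exactly +1 or -1. Group-level reliability would then be obtained by averaging `mean_noise_ceiling` over the participants in each group.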

      Regarding the reviewer’s question about the scope of rsFC features used in the noise-ceiling analysis: we computed noise ceilings only for the ROIs included in the present study, because all analyses in this work were conducted at the ROI–ROI level and did not involve voxelwise whole-brain FC. Thus, the noise-ceiling estimates correspond directly to the full set of FC features on which all statistical comparisons were based.

      As suggested by the reviewer, we examined noise ceilings for each ROI separately. All ROIs showed high absolute reliability (noise ceiling > 0.80) across the three groups, indicating that the ROI-wise FC estimates are generally robust across participants. Although many ROIs exhibited statistically significant group differences in noise ceiling (one-way ANOVA, p < 0.05), the effect sizes were small to moderate (partial η<sup>2</sup> < 0.14). These differences indicate that reliability may vary modestly across groups at the ROI level, and we cannot fully determine whether such variability contributes to the observed different FC patterns across groups. We have included this point in the revised manuscript (page 19, line 525), along with the full statistical results for the ROI-wise noise ceilings in the Supplementary Table S2.
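
      To make the effect-size statement above concrete, the following minimal sketch computes the F statistic and partial η² for a one-way design from sums of squares. It is illustrative only and is not the authors' code: the function name `one_way_anova_effect_size` and the example values are hypothetical, and no p-value is computed because that requires the F distribution. In a one-way ANOVA, partial η² reduces to SS_between / (SS_between + SS_within).

```python
def one_way_anova_effect_size(groups):
    """Return (F, partial eta squared) for a one-way between-group design.

    `groups` is a list of lists, one list of observations per group
    (e.g. one ROI's noise ceilings for the participants in each group).
    """
    values = [v for g in groups for v in g]
    grand_mean = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]

    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variability of observations around their own group mean.
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, means) for v in g)

    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    partial_eta_sq = ss_between / (ss_between + ss_within)
    return f_stat, partial_eta_sq


# Hypothetical noise-ceiling values for three groups (not data from the study).
example_groups = [
    [0.84, 0.88, 0.86, 0.90],
    [0.85, 0.87, 0.89, 0.91],
    [0.83, 0.86, 0.88, 0.87],
]
f_value, eta_sq = one_way_anova_effect_size(example_groups)
print(f"F = {f_value:.3f}, partial eta^2 = {eta_sq:.3f}")
```

      Run once per ROI, a calculation of this kind yields the values that would be compared against the 0.14 threshold quoted above.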

      Last, we fully agree that longitudinal comparisons across multiple time points can provide important insights into how early visual experience shapes connectivity. At the same time, in the present dataset, the first scan occurred at a preterm age and the second at term-equivalent age. The differences between the first and second scans would reflect not only additional weeks of visual input, but also differences in prematurity status and overall neurodevelopmental maturity, which would make the interpretation of such comparisons difficult in the context of our current aims. We have clarified in the revised manuscript that only term-equivalent (second) scans were included. We see careful longitudinal work as an important avenue for addressing this question more directly.

      (7) The signal dropout assessment in the infant dataset is a valuable quality control step. Applying the same metric to the adult datasets would help harmonize preprocessing across groups and increase confidence in group-level comparisons.

      Thank you for this valuable suggestion. Following your comment, we applied the same signal dropout assessment to the adult datasets. One participant in the sighted adult group and two participants in the blind adult group showed signal dropout in one ROI each. The corresponding results are now included in the Supplementary Materials (Figure S13). The findings remain unchanged after this additional control analysis. We have also added the relevant content to the Methods section as follows:

      “The same signal dropout assessment was also applied to the blind and sighted adults to ensure consistent quality control across groups. One participant in the sighted adult group and two participants in the blind adult group exhibited signal dropout in one ROI each. Excluding these participants did not alter the group-level results (see Figure S13).” (page 16, line 449)

      Minor:

      (8) The authors added accurate anatomical descriptions to the methods but a less precise characterization remains in the introduction: "Anatomically, these regions correspond roughly to the location of areas such as motion area V5/MT+, the lateral occipital complex (LO), V3a and V4v in sighted people."

      We thank the reviewer for this helpful comment. We have revised the Introduction to provide a fuller anatomical description, consistent with the Methods. The text now reads:

      “Anatomically, these regions in sighted people approximately correspond to the locations of motion-sensitive V5/MT+ and the lateral occipital complex (LO), as well as ventral portions of occipito-temporal cortex including V4v and dorsal portions including V3a. The occipital ROI also extends ventrally into the middle portion of the ventral temporal lobe and dorsally into the intraparietal sulcus and superior parietal lobule.” (page 3, line 88)

      (9) Typo: "lager effect" should be "larger effect."

      Secondary visual cortices showed a significant within > between difference in both groups, with a lager effect in the blind group (post-hoc tests, Bonferroni-corrected paired: t-test: sighted adults within hemisphere > between hemisphere: t (49) = 7.441, p = 0.012; blind adults within hemisphere > between hemisphere: t (29) = 10.735, p < 0.001; V1: F(1, 78) =87.211, p < 0.001).

      We thank the reviewer for catching this typo. We have corrected “lager effect” to “larger effect” in the revised manuscript. (page 9, line 214)

      Reviewer #2 (Recommendations for the authors):

      All of my other concerns were adequately addressed.

      We thank the reviewer for their positive evaluation, and we are glad that our revisions have addressed their concerns.

      Reviewer #3 (Recommendations for the authors):

      In my view, qualifying infants as "sighted" is confusing and unnecessary: why not simplify and homogenize the wording throughout the manuscript and figures?

      We thank the reviewer for this suggestion. We agree and have revised the manuscript to use consistent wording, avoiding the qualification of infants as “sighted.”

      l188, I don't understand the sentence "By contrast, in sighted adults, this cross-hemisphere difference is weak or absent."

      We thank the reviewer for noting that this sentence was unclear. We have revised the text to provide a more precise explanation. The text now reads:

      “By contrast, in sighted adults this lateralized pattern is weaker: visual areas in each hemisphere show only a modest preference for ipsilateral prefrontal cortices, and connectivity with the contralateral PFC remains comparatively strong.” (page 8, line 207)

      l193: "Secondary visual cortices showed a significant within > between difference in both groups, with a lager effect in the blind group": providing effect sizes for the 2 groups would strengthen this result (+ note the typo laRger).

      - Figure S7, S11: Please add titles of y-axes.

      Thank you for this helpful suggestion. We have corrected the typo and added the effect sizes for both groups in the revised text. The revised sentence now reads as follows:

      “Secondary visual cortices showed a significant within > between difference in both groups, with a larger effect in the blind group (post-hoc tests, Bonferroni-corrected paired: t-test: sighted adults within hemisphere > between hemisphere: t (49) = 7.441, p = 0.012, Cohen’s d = 0.817; blind adults within hemisphere > between hemisphere: t (29) = 10.735, p < 0.001, Cohen’s d = 1.96).” (page 9, line 214)
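
      For readers unfamiliar with effect sizes for paired comparisons, the sketch below shows one common convention, d_z, defined as the mean of the paired differences divided by their standard deviation (equivalently, the paired t statistic divided by the square root of the sample size). The response does not state which convention was used for the reported values, so this is an assumption; the function names and connectivity values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_differences(x, y):
    """Element-wise differences between two paired samples."""
    return [a - b for a, b in zip(x, y)]


def paired_t_statistic(x, y):
    """Paired-samples t statistic: mean difference over its standard error."""
    diffs = paired_differences(x, y)
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def paired_cohens_dz(x, y):
    """Cohen's d_z: mean difference over the SD of the differences."""
    diffs = paired_differences(x, y)
    return mean(diffs) / stdev(diffs)


# Hypothetical within- and between-hemisphere connectivity for five participants
within_hemisphere = [0.42, 0.51, 0.38, 0.47, 0.55]
between_hemisphere = [0.30, 0.41, 0.33, 0.35, 0.40]

t_value = paired_t_statistic(within_hemisphere, between_hemisphere)
d_z = paired_cohens_dz(within_hemisphere, between_hemisphere)
print(f"t = {t_value:.3f}, d_z = {d_z:.3f}")
```

      Other conventions (for example, standardizing by the average of the two condition standard deviations) can give different values from the same data, which is why reporting the formula alongside the effect size is helpful.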

      Titles of the y-axes have also been added to Figures S7 and S11.

    1. Reviewer #1 (Public review):

      Summary:

      Lesser et al. provide a comprehensive description of Drosophila wing proprioceptive sensory neurons at the electron microscopy resolution. This "tour-de-force" provides a strong foundation for future structural and functional research aimed at understanding wing motor control in Drosophila, with implications for understanding wing control across other insects.

      Strengths:

      (1) The authors leverage previous research that described many of the fly wing proprioceptors, and combine this knowledge with EM connectome data such that they now provide a near-complete morphological description of all wing proprioceptors.

      (2) The authors cleverly leverage genetic tools and EM connectome data to tie the location of proprioceptors on the wings with axonal projections in the connectome. This enables them to both align with previous literature as well as make some novel claims.

      (3) In addition to providing a full description of wing proprioceptors, the authors also identified a novel population of sensors on the wing tegula that make direct connections with the B1 wing motor neurons, implicating the role of the tegula in wing movements that was previously underappreciated.

      (4) Despite being the most comprehensive description so far, it is reassuring that the authors clearly state the missing elements in the discussion.

      Weaknesses:

      (1) The authors do their main analysis on data from the FANC connectome but provide corresponding IDs for sensory neurons in the MANC connectome. I wonder how the connectivity matrix compares across FANC and MANC if the authors perform a similar analysis to the one they have done in Fig. 2. This could be a valuable addition and potentially also pick up any sexual dimorphism.

      (2) The authors speculate about the presence of gap junctions based on the density of mitochondria. I'm not convinced about this, given that mitochondrial densities could reflect other things that correlate with energy demands in sub-compartments.

      Overall, I consider this an exceptional analysis which will be extremely valuable to the community.

    2. Reviewer #3 (Public review):

      Summary:

      The authors aim to identify the peripheral end organ origin in the fly's wing of all sensory neurons in the Anterior Dorsal Mesothoracic nerve. They reconstruct the neurons and their downstream partners in an electron microscopy volume of a female ventral nerve cord, analyse the resulting connectome and identify their origin with review of the literature and imaging of genetic driver lines. While some of the neurons were already known through previous work, the authors expand on the identification and create a near complete map of the wing mechanosensory neurons at synapse resolution.

      Strengths:

      The authors elegantly combine electron microscopy neuron morphology, connectomics and light microscopy methods to bridge the gap between fly wing sensory neuron anatomy and ventral nerve cord morphology. Further, they use EM ultrastructural observations to make predictions on the signaling modality of some of the sensory neurons and thus their function in flight.

      The work is as comprehensive as state-of-the-art methods allow to create a near-complete map of the wing mechanosensory neurons. This work will be of importance to the field of fly connectomics and modelling of fly behavior, as well as a useful resource to the Drosophila research community.

      Through this comprehensive mapping of neurons to the connectome the authors create a lot of hypotheses on neuronal function partially already confirmed with the literature and partially to be tested in the future. The authors achieved their aim of mapping the periphery of the fly's wing to axonal projections in the ventral nerve cord, beautifully laying out their results to support their mapping.

      The authors identify the neurons in a previously published connectome of a male fly ventral nerve cord to enable cross-individual analysis of connections and find no indication of sexual dimorphism at the sensory neuron level. Further, together with their companion paper Dhawan et al., 2025 describing the haltere sensory neurons in the same EM dataset, they cover the entire mechanosensory space involved in Drosophila flight.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Lesser et al provide a comprehensive description of Drosophila wing proprioceptive sensory neurons at the electron microscopy resolution. This “tour-de-force” provides a strong foundation for future structural and functional research aimed at understanding wing motor control in Drosophila with implications for understanding wing control across other insects.

      Strengths:

      (1) The authors leverage previous research that described many of the fly wing proprioceptors, and combine this knowledge with EM connectome data such that they now provide a near-complete morphological description of all wing proprioceptors.

      (2) The authors cleverly leverage genetic tools and EM connectome data to tie the location of proprioceptors on the wings with axonal projections in the connectome. This enables them to both align with previous literature as well as make some novel claims.

      (3) In addition to providing a full description of wing proprioceptors, the authors also identified a novel population of sensors on the wing tegula that make direct connections with the B1 wing motor neurons, implicating the role of the tegula in wing movements that was previously underappreciated.

      (4) Despite being the most comprehensive description so far, it is reassuring that the authors clearly state the missing elements in the discussion.

      Weaknesses:

      (1) The authors do their main analysis on data from the FANC connectome but provide corresponding IDs for sensory neurons in the MANC connectome. I wonder how the connectivity matrix compares across FANC and MANC if the authors perform a similar analysis to the one they have done in Figure 2. This could be a valuable addition and potentially also pick up any sexual dimorphism.

      We agree that systematic comparisons will provide valuable insights as more connectome datasets become available. However, the primary goal of this study was to link central axon morphology with peripheral structures in the wing. We deliberately omitted more detailed and quantitative analyses of the downstream VNC circuitry, apart from providing a global view of the connectivity matrix and using it to cluster the sensory axon types. A more detailed and systematic comparison of wing sensorimotor circuit connectivity across different connectome datasets (FANC, MANC, BANC, IMAC) is the subject of ongoing work in our lab, which we feel is beyond the scope of this study. Here, we chose to match the wing proprioceptors to axons in MANC to demonstrate their stereotypy across individuals and to make them more accessible to other researchers. We found no obvious sexual dimorphism at the level of wing sensory neurons. We now note this in the Discussion.

      (2) The authors speculate about the presence of gap junctions based on the density of mitochondria. I’m not convinced about this, given that mitochondrial densities could reflect other things that correlate with energy demands in sub-compartments.

      We have moved speculation about mitochondria and gap junctions to the Discussion.

      (3) I’m intrigued by how the tegula CO is negative for iav. I wonder if authors tried other CO labeling genes like nompc. And what does this mean for the nature of this CO. Some more discussion on this anomaly would be helpful.

      Based on this suggestion, we have added an image showing that tegula CO neurons are labeled by nompC-Gal4.

      (4) The authors conclude there are no proprioceptive neurons in sclerite pterale C based on Chat-Gal4 expression analysis. It would be much more rigorous if authors also tried a pan-neuronal driver like nsyb/elav or other neurotransmitter drivers (Vglut, GAD, etc) to really rule this out. (I hope I didn’t miss this somewhere.)

      To address this, we imaged OK371-GFP, which labels glutamatergic neurons, in the wing and wing hinge. We saw expression in the wing, as others have reported (Neukomm et al., 2014), but we saw no expression at the wing hinge. Apart from a handful of glutamatergic gustatory neurons in the leg, we are not aware of any other sensory neurons in the fly that are not labeled by Chat-Gal4.

      Overall, I consider this an exceptional analysis that will be extremely valuable to the community.

      We sincerely appreciate the reviewer’s positive feedback.

      Reviewer #2 (Public review):

      Summary:

      Lesser et al. present an atlas of Drosophila wing sensory neurons. They proofread the axons of all sensory neurons in the wing nerve of an existing electron microscopy dataset, the female adult fly nerve cord (FANC) connectome. These reconstructed sensory axons were linked with light microscopy images of full-scale morphology to identify their origin in the periphery of the wing and encoded sensory modalities. The authors described the morphology and postsynaptic targets of proprioceptive neurons as well as previously unknown sensory neurons.

      Strengths:

      The authors present a valuable catalogue of wing sensory neurons, including previously undescribed sensory axons in the Drosophila wing. By providing both connectivity information with linked genetic drive lines, this research facilitates future work on the wing motor-sensory network and applications relating to Drosophila flight. The findings were linked to previous research as well as their putative role in the proprioceptive and nerve cord circuitry, providing testable hypotheses for future studies.

      Weaknesses:

      (1) With future use as an atlas, it should be noted that the evidence is based on sensory neurons on only one side of the nerve cord. Fruit flies have stereotyped left/right hemispheres in the brain and left/right hemisegments in the nerve cord. The comparison of left and right neurons of the nervous system can give a sense of how robust the morphological and connectivity findings are. Here, the authors have not compared the left and right side sensory axons from the wing nerve, leaving potential for developmental variability across samples and left/right hemisegments.

      The right ADMN nerve in the FANC dataset is partially severed, making left/right comparisons unreliable (see Azevedo 2024, Extended Data Figure 4). We have updated the text to explain this within the Methods section of the paper.

      (2) Not all links between the EM reconstructions and driver lines are convincing. To strengthen these, for all EM-LM matches in Figures 3-7, rotated views of the driver line (matching the rotated EM views) should be shown to provide a clearer comparison of the data. In particular, Figure 3G and Figure 7B are not very convincing based on the images shown. MCFO imaging of the driver lines in Figure 3G and 7B would make this position stronger if a clone that matches the EM reconstruction could be identified.

      Many of the z-stack images in the paper are from the Janelia FlyLight collection, and unfortunately their imaging parameters were not optimized for orthogonal views. Rotated views are blurry and not especially helpful for comparison to EM reconstruction. We now point out in the text that interested readers can access the z-stacks from FlyLight to see the dorsal-ventral projections.

      Regarding Figure 3G and 7B, we have added markers to the image with corresponding descriptions in the legend to guide the reader through the image of the busy driver line. Although these lines label many cells in the VNC as a whole, they sparsely label cells in the ADMN, making them nonetheless useful for identifying peripheral sensory neurons.

      (3) Figure 7B looks like the driver line might have stochastic expression in the sensory neuron, which further reduces confidence in the result shown in Figure 7C. Is this expression pattern in the wing consistently seen? Many split-GAL4s have stochastic expressions. The evidence would be strengthened if the authors presented multiple examples (~4-5) of each driver line’s expression pattern in the supplement.

      Figure 7B shows sparse labeling of the driver line using the MCFO technique, as specified in the legend. Its unilateral expression is therefore not due to stochastic expression of the Gal4 line. We have added the “MCFO” label to the image to clarify.

      (4) Certain claims in this work lack quantitative evidence. On line 128, for instance, “Overall, our comprehensive reconstruction revealed many morphological subgroups with overlapping postsynaptic partners, suggesting a high degree of integration within wing sensorimotor circuits.” If a claim of subgroups having shared postsynaptic partners is being made, there should have been quantitative evidence. For example, cosine similarity among members of each group compared to the cosine similarity of shuffled/randomised sets of axons from different groups. The heat map of cosine similarity in Figure 2B alone is not sufficient.

      We agree that illustrating the extent of shared postsynaptic partners across subgroups strengthens this point. We added a visualization showing pairwise similarity scores for within- and between-cluster neuron pairs (Figure 2B inset). We also performed a permutation test to determine that within-cluster similarity is significantly higher than between clusters, and we report the test in the results as well as the figure legend. This analysis provides a more quantitative summary of the qualitative trends in connectivity that are summarized in Figure 2B.
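
      To make the kind of test described above concrete, here is a minimal sketch of a label-shuffling permutation test on pairwise cosine similarity. It is not the analysis code from the paper: the test statistic (mean within-cluster similarity minus mean between-cluster similarity), the function names, and the toy synapse counts are all assumptions chosen for illustration.

```python
import random
from itertools import combinations
from math import sqrt
from statistics import mean


def cosine_similarity(u, v):
    """Cosine similarity between two connectivity vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def within_minus_between(vectors, labels):
    """Mean within-cluster similarity minus mean between-cluster similarity."""
    within, between = [], []
    for i, j in combinations(range(len(vectors)), 2):
        sim = cosine_similarity(vectors[i], vectors[j])
        (within if labels[i] == labels[j] else between).append(sim)
    return mean(within) - mean(between)


def permutation_p_value(vectors, labels, n_permutations=2000, seed=0):
    """One-sided p-value obtained by shuffling cluster labels across axons."""
    rng = random.Random(seed)
    observed = within_minus_between(vectors, labels)
    shuffled = list(labels)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if within_minus_between(vectors, shuffled) >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero
    return observed, (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical synapse counts: rows are sensory axons, columns are postsynaptic partners
outputs = [
    [10, 8, 0, 0], [9, 7, 1, 0], [11, 9, 0, 1], [8, 10, 1, 0],
    [0, 1, 9, 10], [1, 0, 8, 11], [0, 0, 10, 9], [1, 1, 11, 8],
]
cluster_labels = [0, 0, 0, 0, 1, 1, 1, 1]

observed_diff, p_value = permutation_p_value(outputs, cluster_labels)
print(f"within - between = {observed_diff:.3f}, p = {p_value:.4f}")
```

      Shuffling labels rather than vectors keeps the cluster sizes fixed, so the null distribution reflects only the assignment of axons to clusters.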

      (5) Similarly, claims about putative electrical connections to b1 motor neurons are very speculative. The authors state that “their terminals contain very densely packed mitochondria compared to other cells”, without providing a quantitative comparison to other sensory axons. There is also no quantitative comparison to the one example of another putative electrical connection from the literature. Further, it should be noted that this connection from Trimarchi and Murphey, 1997, is also stated as putative on line 167, which further weakens this evidence. Quantification would strongly strengthen this position. Identification of an example of high mitochondrial density at a confirmed electrical connection would be even better. In the related discussion section “A potential metabolic specialization for flight circuitry”, it should be more clearly noted that the dense mitochondria could be unrelated to a putative electrical connection. If the authors have an alternative hypothesis about the mitochondria density, this should be stated as well.

      We agree with the reviewer that the link between mitochondrial density and metabolic specialization is purely speculative in this context. Based on reviewer feedback, we have moved all mention of the relationship between mitochondrial density and gap junction coupling to the Discussion. We acknowledge that this may seem like a somewhat random and not quantitatively supported observation. However, we found the coincidence striking and worthy of mention, though it is only tangentially relevant to the rest of the paper. From conversations with colleagues, we have also heard that this relationship is consistent with as yet unpublished work in other model organisms (e.g., zebrafish, mouse).

      The electrical coupling to b1 motor neurons is well-established (Fayyazuddin and Dickinson, 1999), and we have updated the text to state this more clearly. However, we agree that whether the specific neurons we have identified based on their anatomy are the same ones functionally identified through whole-nerve recordings remains unknown.

      (6) It would be appropriate to cite previous work using a similar strategy to match sensory axons to their cell bodies/dendrites at the periphery using driver lines and connectomics (see Figure 5 for example in the following paper: https://doi.org/10.7554/eLife.40247 ).

      At this point, there are now dozens of papers that match the axons of sensory neurons to their cell bodies/dendrites in the periphery by comparing light microscopy and connectomics. When we dug in, we found examples in C. elegans, Ciona intestinalis, zebrafish, and mouse, all published prior to the study cited above. For basically every animal for which scientists have acquired EM volumes of neural tissue, they have used other anatomical labeling methods to determine cell types inside and outside the imaged volume. In summary, we found it difficult to establish a single primary citation for this approach. In lieu of this, we have added a citation to an earlier review by a pioneer in EM connectomics that discusses the general approach of matching cells across different labeling/imaging modalities (Meinertzhagen et al., 2009).

      The methods section is very sparse. For the sake of replicability, all sections should be expanded upon.

      We have expanded the Methods section and added a STAR Methods table.

      Reviewer #3 (Public review):

      Summary:

      The authors aim to identify the peripheral end-organ origin in the fly’s wing of all sensory neurons in the anterior dorsal mesothoracic nerve. They reconstruct the neurons and their downstream partners in an electron microscopy volume of a female ventral nerve cord, analyse the resulting connectome, and identify their origin with a review of the literature and imaging of genetic driver lines. While some of the neurons were already known through previous work, the authors expand on the identification and create a near-complete map of the wing mechanosensory neurons at synapse resolution.

      Strengths:

      The authors elegantly combine electron microscopy, neuron morphology, connectomics, and light microscopy methods to bridge the gap between fly wing sensory neuron anatomy and ventral nerve cord morphology. Further, they use EM ultrastructural observations to make predictions on the signaling modality of some of the sensory neurons and thus their function in flight.

      The work is as comprehensive as state-of-the-art methods allow to create a near-complete map of the wing mechanosensory neurons. This work will be of importance to the field of fly connectomics and modelling of fly behavior, as well as a useful resource to the Drosophila research community.

      Through this comprehensive mapping of neurons to the connectome, the authors create a lot of hypotheses on neuronal function, partially already confirmed with the literature and partially to be tested in the future. The authors achieved their aim of mapping the periphery of the fly’s wing to axonal projections in the ventral nerve cord, beautifully laying out their results to support their mapping.

      The authors identify the neurons in a previously published connectome of a male fly ventral nerve cord to enable cross-individual analysis of connections. Further, together with their companion paper, Dhawan et al. 2025, describing the haltere sensory neurons in the same EM dataset, they cover the entire mechanosensory space involved in Drosophila flight.

      Weaknesses:

      The connectomic data are only available upon request; the inclusion of a connectivity table of the reconstructed neurons would aid analysis reproducibility and cross-dataset comparisons.

      We have added a connectivity table as well as analysis scripts in the github repository for the paper (https://github.com/EllenLesser/Lesser_eLife_2025).

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      The methods section should be expanded in every aspect. Most pressing sections are:

      (1) Data and Code availability: All code should be included as a Zenodo database, the suggestion to ask authors for code upon request is inappropriate.

      We have added all code to a public github repository, which is now linked in the Methods section.

      (2) Samples: Standard cornmeal and molasses medium should have a reference, as many institutes use different recipes.

      The recipe used by the University of Washington fly kitchen is based on the Bloomington standard Cornmeal, Molasses and Yeast Medium recipe, which can be found at https://bdsc.indiana.edu/information/recipes/molassesfood.html. The UW recipe is slightly modified for different antifungal ingredients and includes tegosept, propionic acid, and phosphoric acid.

      (3) Table 3: Driver lines labelling wing sensory neurons: The genetic driver lines should have associated Bloomington stock centre numbers. Additionally, relevant information for effector lines used should be included in the methods.

      We now include the Bloomington stock numbers and more information on effector lines in the STAR methods table.

      Minor corrections:

      (1) Lines 119-120: “Notably, many of the axons do not form crisp cluster boundaries, suggesting that multimodal sensory information is integrated at early stages of sensory processing.” We do not follow the logic of this statement and suspect it is a bit too speculative.

      We removed this sentence from the manuscript.

      (2) Figure 1: The ADMN is missing in the schematics and would be helpful to depict for non-experts. Is this what is highlighted in Figure 1D?

      Yes, and we now label 1D as the ADMN wing nerve.

      (3) Figure 1B: Which driver lines are being depicted here? Looking at Table 3 does not clarify. It should be specified at least in the figure legend.

      As stated in the legend, we include a table of all of the driver lines we screened and which sensory structures they label.

      (4) Figure 1C: There are some minor placement issues with the text in the schematic. There is an arrow very close to the “CO” on the top right, which makes the “O” look like the symbol for male. “ax ii” is a bit too close to the wing hinge

      We updated the figure to address this issue.

      (5) Figure 1D: The outlined grey masks are not clear. The use of colour would be very useful for the reader to help understand what the authors are referring to here

      We now use color for the masks.

      (6) Figure 2A: It is unclear if the descending neuron and non-motor efferent neuron are not shown because they are under the described threshold, or to simplify the plot. They should be included in the plot if over the threshold.

      We have updated the legend to specify that the descending and non-motor efferent neurons are excluded to visually simplify the plot. We include the percentage of sensory output to each of these neurons in the legend, and they are included in the connectivity matrix data in the public GitHub repository associated with the paper, which is linked in the Methods.

      (7) Figure 2B: What clustering is used specifically? The method says it’s from Scikit-learn, but there are many types of clustering available in this package.

      We now include the specific clustering type used in the Methods section, which is agglomerative clustering.
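
      For readers less familiar with the technique, the following is a minimal pure-Python sketch of agglomerative (bottom-up) clustering. The authors used the scikit-learn implementation (`sklearn.cluster.AgglomerativeClustering`); the hand-written version below is only meant to show the idea. The choice of average linkage, cosine distance, the number of clusters, and the toy synapse counts are assumptions, since the response does not specify them.

```python
from itertools import combinations
from math import sqrt


def cosine_distance(u, v):
    """One minus the cosine similarity of two connectivity vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def agglomerative_clusters(vectors, n_clusters):
    """Average-linkage agglomerative clustering; returns lists of row indices."""
    # Start with every row in its own cluster
    clusters = [[i] for i in range(len(vectors))]

    def average_linkage(a, b):
        # Mean pairwise distance between members of clusters a and b
        total = sum(cosine_distance(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    return [sorted(c) for c in clusters]


# Hypothetical synapse counts: rows are sensory axons, columns are postsynaptic partners
axon_outputs = [
    [10, 8, 0, 0], [9, 7, 1, 0], [11, 9, 0, 1], [8, 10, 1, 0],
    [0, 1, 9, 10], [1, 0, 8, 11], [0, 0, 10, 9], [1, 1, 11, 8],
]
clusters_found = agglomerative_clusters(axon_outputs, n_clusters=2)
print(clusters_found)
```

      In practice the library implementation is preferable, since it is faster and exposes the linkage criterion and distance metric as parameters.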

      (8) Figure 3A: What does the green box behind the plot represent?

      The green box represents the tegula CO axons, which we now specify in the legend.

      (9) Figure 3C: the “C” is clipped at the top.

      We updated the figure to address this issue.

      (10) Figure 4A: the main text says a “group of four axons” (line 203) while the figure says 5 axons.

      We updated the text to address this issue.

      (11) Line 360: “We found that the campaniform sensilla on the tegula provide the most direct feedback onto wing steering motor neurons”. We struggled to find where this was directly shown, because several sensory axon types directly synapse onto motor neurons.

      We now specify in the text that this finding is shown in Figure 3.

      Reviewer #3 (Recommendations for the authors):

      I would like to congratulate the authors on their beautiful, easy-to-read, and easy-to-comprehend manuscript, with clear figures and nice visualizations. This work provides a valuable resource that will contribute to the interpretability of connectomic data and further to connectome-based modeling of fly behavior.

      We sincerely appreciate the reviewer’s positive feedback.

    1. Reviewer #3 (Public review):

      This paper addresses, through experiment and simulation, the combined effects of bacterial circular swimming near no-slip surfaces and chemotaxis in simple linear gradients. The authors have constructed a microfluidic device in which a gradient of L-aspartate is established, to which bacteria respond while swimming confined in channels of different widths. There is a clear effect that the chemotactic drift velocity reaches a maximum in channel widths of about 8 microns, similar in size to the circular orbits that would prevail in the absence of side walls. Numerical studies of simplified models confirm this connection.

      The experimental aspects of this study are well executed. The design of the microfluidic system is clever in that it allows a kind of "multiplexing" in which all the different channel widths are available to a given sample of bacteria.

      The authors have included a useful intuitive explanation of their results via a geometric model of the trajectories. In future work it would be interesting to analyze further the voluminous data on the trajectories of cells by formulating the mathematical problem in terms of a suitable Fokker-Planck equation for the probability distribution of swimming directions. In particular, this might help understand how incipient circular trajectories are interrupted by collisions with the walls and how this relates to enhanced chemotaxis.

      The authors argue that these findings may have relevance to a number of physiological and ecological contexts. As these would be characterized by significant heterogeneity in pore sizes and geometries, further work will be necessary to translate the present results to those situations.

    2. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      This article deals with the chemotactic behavior of E. coli bacteria in thin channels (a situation close to 2D). It combines experiments and simulations.

      The authors show experimentally that, in 2D, bacteria swim up a chemotactic gradient much more effectively when they are in the presence of lateral walls. Systematic experiments identify an optimum for chemotaxis for a channel width of ~8 µm, close to the average radius of the circular trajectories of the unconfined bacteria in 2D. It is known that these circles are chiral, which causes the bacteria to swim preferentially along the right-side wall when there is no chemotactic gradient. In the presence of a chemotactic gradient, this larger proportion of bacteria swimming on the right wall yields chemotaxis. This effect is backed by numerical simulations and a geometrical analysis.

      Although the conclusions drawn from the experiments presented in this article seem clear and interesting, I find that the key elements of the mechanism of this wall-directed chemotaxis are not sufficiently emphasized. Moreover, the paper would be clearer with more details on the hypotheses and the essential ingredients of the analyses.

      We thank the reviewer for these constructive suggestions. We agree that emphasizing the underlying mechanism is crucial for the clarity of our findings. In the revised manuscript, we have now explicitly highlighted the critical roles of chiral circular motion and the alignment effect following side-wall collisions in both the Abstract (lines 25-27) and the Discussion (lines 391-393). Furthermore, we have added a new analysis of bacterial trajectories post-collision (Fig. S2), which demonstrates that cells predominantly align with and swim along the sidewalls. We have also clarified the assumptions in our numerical simulations, specifically how the radius of circular trajectories and the alignment effect are incorporated into the equations of motion. Please refer to our detailed responses in the "Recommendations for the authors" section for further specifics.

      Reviewer #2 (Public review):

      Summary:

      In this study, the authors investigated the chemotaxis of E. coli swimming close to the bottom surface in gradients of attractant in channels of increasingly smaller width but fixed height = 30 µm and length ~160 µm. In relatively large channels, they find that on average the cells drift in response to the gradient, despite cells close to the surface away from the walls being known to not be chemotactic because they swim in circles.

      They find that this average drift is due to the cell localization close to the side walls, where they slide along the wall. Whereas the bacteria away from the walls have no chemotaxis (as shown before), the ones on the left side wall go down-gradient on average, but the ones on the right-side wall go up-gradient faster, hence the average drift. They then study the effect of reducing channel width. They find that chemotaxis is higher in channels with a width of about 8 µm, which approximately corresponds to the radius of the circular swimming R. This higher chemotactic drift is concomitant with an increased density of cells on the RSW. Their simulations and modeling suggest that the disruption of circular swimming upon collision with the wall increases the density of cells on the RSW, with a maximal effect at w ≈ 2/3 R, which is a good match for their experiments.

      Strengths:

      The overall result that confinement at the edge stabilises bacterial motion and allows chemotaxis is very interesting although not entirely unexpected. It is also important for understanding bacterial motility and chemotaxis under ecologically relevant conditions, where bacteria frequently swim under confinement (although its relevance for controlling infections could be questioned). The experimental part of the study is nicely supported by the model.

      Weaknesses:

      Several points of this study, in particular the interpretation of the width effect, need better clarification:

      (1) Context:

      There are a number of highly relevant previous publications that should have been acknowledged and discussed in relation to the current work:

      https://pubs.rsc.org/en/content/articlehtml/2023/sm/d3sm00286a

      https://link.springer.com/article/10.1140/epje/s10189-024-00450-7

      https://doi.org/10.1016/j.bpj.2022.04.008

      https://doi.org/10.1073/pnas.1816315116

      https://www.pnas.org/doi/full/10.1073/pnas.0907542106

      https://doi.org/10.1038/s41467-020-15711-0

      http://doi.org/10.1039/c5sm00939a

      We appreciate the reviewer bringing these important publications to our attention. We have now cited and discussed these works in the Introduction (lines 55-62 and 76-85) to better contextualize our study regarding bacterial motility and chemotaxis in confined geometries.

      (2) Experimental setup:

      a) The channels are built with asymmetric entrances (Figure 1), which could trigger a ratchet effect (because bacteria swim in circles) that could bias the rate at which cells enter into the channel, and which side they follow preferentially, especially for the narrow channel. Since the channel is short (160 µm), that would reflect on the statistics of cell distribution. Controls with straight entrances or with a reversed symmetry of the channel need to be performed to ensure that the reported results are not affected by this asymmetry.

      We appreciate the reviewer's insight regarding the potential ratchet effect caused by asymmetric entrances. To rule this out, we fabricated a control device with straight entrances and repeated the measurements. As shown in Figure S3, the chemotactic drift velocity follows the same trend as observed in the original setup, confirming an optimal width of ~9 µm. These results demonstrate that the entrance geometry does not bias the reported statistics. We have updated the manuscript text at lines 233-235.

      b) The authors say the motile bacteria accumulate mostly at the bottom surface. This is strange, for a small height of 30 µm, the bacteria should be more-or-less evenly spread between the top and bottom surface. How can this be explained?

      We apologize for not explaining this clearly in the text. As shown by Wei et al., Phys. Rev. Lett. 135, 188401 (2025), significant surface accumulation occurs in channels with heights exceeding 20 µm. In our specific experimental setup, we did not use Percoll to counteract gravity. Therefore, the bacteria accumulated mostly at the bottom surface under the combined influence of gravity and hydrodynamic attraction. This bottom-surface localization is supported by our observation that the bacterial trajectories were predominantly clockwise (characteristic of the bottom surface) rather than counter-clockwise (characteristic of the top surface). We have added this explanation to Line 141.

      c) At the edge, some of the bacteria could escape up in the third dimension (http://doi.org/10.1039/c5sm00939a). What is the magnitude of this phenomenon in the current setup? Does it have an effect?

      We thank the reviewer for raising this important point regarding 3D escape. We have quantified this phenomenon and found the escape rate from the edge into the third dimension to be 0.127 s<sup>-1</sup>. This corresponds to a mean residence time that allows a cell moving at 20 µm/s to travel approximately 157.5 µm along the edge. Since this distance is comparable to the full length of our lanes (~160 µm), most cells traverse the entire edge without escaping. Furthermore, our analysis is based on the average drift of the surface trajectories per unit of time; this metric is independent of the absolute number of cells present. Therefore, the escape phenomenon does not significantly impact our conclusions. We have added a statement clarifying this at line 154.
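
The travel distance quoted above follows from the escape rate and the swimming speed. A minimal check, assuming exponentially distributed residence times (so that the mean residence time is the inverse of the escape rate); the variable names are illustrative only:

```python
# Values quoted in the response above.
escape_rate = 0.127   # s^-1, escape rate from the edge into the third dimension
speed = 20.0          # µm/s, swimming speed
lane_length = 160.0   # µm, approximate lane length

# Assumption: escape is a Poisson process, so mean residence time = 1 / rate.
mean_residence_time = 1.0 / escape_rate
travel_distance = speed * mean_residence_time

print(f"mean residence time: {mean_residence_time:.2f} s")
print(f"distance travelled along the edge: {travel_distance:.1f} µm")
```

Under this assumption the mean residence time is about 7.9 s, and the resulting distance sits just under the ~160 µm lane length.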

      d) What is the cell density in the device? Should we expect cell-cell interactions to play a role here? If not, I would suggest to de-emphasize the connection to chemotaxis in the swarming paper in the introduction and discussion, which doesn't feel very relevant here, and rather focus on the other papers mentioned in point 1.

      The cell density in our experiments was approximately 1.3×10<sup>-3</sup> μm<sup>-2</sup>. Given this low density, we do not expect cell-cell interactions to play a role in the observed behaviors.

      Regarding the connection to swarming chemotaxis: We agree that our low-density setup differs from a high-density swarm; however, we believe the comparison remains relevant for two reasons. First, it provides a necessary contrast to studies showing surface inhibition of chemotaxis. Second, while we eliminate cell-cell interactions, we isolate the geometric aspect of swarming. In a swarm, cells move within narrow lanes created by their neighbors. Our device mimics this specific physical confinement by replacing neighboring cells with PDMS sidewalls. This allows us to decouple the effects of physical confinement from cell-cell interactions. We have added text (line 370) to clarify this rationale and have incorporated the additional references in the Introduction, as suggested in point 1.

      e) We are not entirely convinced by the interpretation of the results in narrow channels. What is the causal relationship between the increased density on the RSW and the higher chemotactic drift? The authors seem to attribute higher drift to this increased RSW density, which emerges due to the geometric reasons. But if there is no initial bias, the same geometric argument would induce the same increased density of down-gradient swimmers on the LSW, and so, no imbalance between RSW and LSW density. Could it be the opposite that the increased RSW density results from chemotaxis (and maybe reinforces it), not the other way around? Confinement could then deplete one wall due to the proximity of the other, and/or modify the swimming pattern - 8 µm is very close to the size of the body + flagellum. To clarify this point, we suggest measuring the bacterial distributions in the absence of a gradient for all channel widths as a control.

      We thank the reviewer for this insightful comment regarding the causal relationship between cell density and chemotactic drift. We apologize if the initial explanation was unclear.

      Regarding the no-gradient control: Without an attractant gradient (and no initial bias), there is no breaking of symmetry and the labels of "LSW" and "RSW" are arbitrary. Therefore, in the absence of a gradient, there will be no asymmetry between the bacterial distributions on the two sides (within experimental fluctuations) for any channel width.

      Regarding the causality and density imbalance: We agree that the increased RSW density is a result of chemotaxis, which is then reinforced by the lane geometry especially at narrow lane width. The mechanism relies on the coupling of chemotactic bias with surface circularity. The angle ranges that lead to RSW-UG accumulation (Fig. 6A-C) coincide with the up-gradient direction. Because these cells experience suppressed tumbling (longer runs), they can maintain the steady circular trajectories required to reach and align with the RSW. Conversely, while pure geometric analysis suggests a similar potential for LSW-DG accumulation, these trajectories coincide with the down-gradient direction. These cells experience enhanced tumbling, which distorts the circular trajectories. This prevents them from effectively reaching the LSW and also increases the probability of them leaving the wall. Therefore, the causality is indeed a positive feedback loop: the attractant gradient creates an initial bias that allows the RSW-UG fraction to form stable trajectories; the optimal lane width (matching the swimming radius) then maximizes this capture efficiency, further enriching the RSW fraction and enhancing the overall drift.

      We have added clarifications regarding these points in the revised manuscript (the last paragraph of “Results”).

      (3) Simulations:

      The simulations treat the wall interaction very crudely. We would suggest treating it as a mechanical object that exerts elastic or "hard sphere" forces and torques on the bacteria for more realistic modeling.

      We appreciate the reviewer's suggestion to incorporate more detailed mechanical interactions, such as elastic or hard-sphere forces, for the wall collisions. While we agree that a full hydrodynamic or mechanical model would offer higher fidelity, our experimental observations suggest that a simplified kinematic approach is sufficient for the specific phenomena studied here.

      As shown in the new Fig. S2, our analysis of cell trajectories in the 44-µm-wide channels reveals that cells colliding with the sidewalls tend to align with the surface almost instantaneously. The timescale required for this alignment is negligible compared to the typical wall residence time (see also Ref. 6). Consequently, to maintain computational efficiency without sacrificing the essential physics of the accumulation effect, we employed a coarse-grained phenomenological model where a bacterium immediately aligns parallel to the wall upon contact, similar to approaches used previously (Ref. 43). We have added relevant text to the manuscript on lines 168-171.

      Notably, the simulations have a constant (chemotaxis independent) rate of wall escape by tumbling. We would expect that reduced tumbling due to up-gradient motility induces a longer dwell time at the wall.

      We apologize for the confusion. The chemotaxis effect is indeed fully integrated into our simulation. Specifically, the simulated cells sense the chemical gradient and adjust their motor CW bias (B) accordingly. This adjustment directly modulates the tumble rate (k), calculated as k = B/0.31 s<sup>-1</sup>. Consequently, the wall escape rate is not constant but varies with the chemotactic response. We also imposed a maximum detention time limit which, when combined with the variable tumble rate, results in an average wall residence time of approximately 2 s, consistent with our experimental observations (Fig. S6B). We have clarified these details in the final section of 'Materials and Methods'.

      Reviewer #3 (Public review):

      This paper addresses through experiment and simulation the combined effects of bacterial circular swimming near no-slip surfaces and chemotaxis in simple linear gradients. The authors have constructed a microfluidic device in which a gradient of L-aspartate is established, to which bacteria respond while swimming confined in channels of different widths. There is a clear effect that the chemotactic drift velocity reaches a maximum in channel widths of about 8 microns, similar in size to the circular orbits that would prevail in the absence of side walls. Numerical studies of simplified models confirm this connection.

      The experimental aspects of this study are well executed. The design of the microfluidic system is clever in that it allows a kind of "multiplexing" in which all the different channel widths are available to a given sample of bacteria.

      While the data analysis is reasonably convincing, I think that the authors could make much better use of what must be voluminous data on the trajectories of cells by formulating the mathematical problem in terms of a suitable Fokker-Planck equation for the probability distribution of swimming directions. In particular, I would like to see much more analysis of how incipient circular trajectories are interrupted by collisions with the walls and how this relates to enhanced chemotaxis. In essence, there needs to be a much clearer control analysis of trajectories without sidewalls to understand the mechanism in their presence.

      We thank the reviewer for this insightful suggestion. We agree that understanding how circular trajectories are interrupted by wall collisions is central to explaining the enhanced chemotaxis. While we did not explicitly formulate a Fokker-Planck equation, we have addressed the reviewer's core point by employing two complementary mathematical approaches that model the probability distribution of swimming directions and wall interactions:

      (1) Stochastic simulations (Langevin approach): As detailed in the "Simulation of E. coli chemotaxis within lane confinements" subsection of “Results” and Figure 5, we modeled cells as self-propelled particles performing random walks. This model explicitly accounts for the "interruption" of circular trajectories by incorporating a constant angular velocity (circular swimming) and an alignment effect upon collision with sidewalls. These simulations successfully reproduced the experimental trends, confirming that the interplay between circular radius and lane width determines the optimal drift velocity.

      (2) Geometric probability analysis: To provide the "intuitive understanding", we included a specific Geometrical Analysis section (the last subsection of “Results”) and Figure 6. This analysis mathematically formulates the problem by calculating the exact proportion of swimming angles that allow a cell to transition from a circular trajectory in the bulk to an up-gradient trajectory along the Right Sidewall (RSW). By integrating over the possible swimming directions, we derived the probability of wall interception as a function of lane width (w) and swimming radius (r). This analysis reveals that the interruption of circular paths is most favorable for chemotaxis when w ≈ (0.7-0.8)×r.

      (3) Control analysis: Regarding the "control analysis of trajectories without sidewalls," we utilized the cells in the Middle Area (MA) of the wide lanes as an internal control. As shown in Fig. 2B and 4A, these cells exhibit typical surface-associated circular swimming (Fig. 3B) but generate zero net drift. This serves as the baseline "no sidewall" condition, demonstrating that the chemotactic enhancement is strictly driven by the rectification of circular swimming into wall-aligned motion at the boundaries.

      The authors argue that these findings may have relevance to a number of physiological and ecological contexts. Yet, each of these would be characterized by significant heterogeneity in pore sizes and geometries, and thus it is very unclear whether or how the findings in this work would carry over to those situations.

      We thank the reviewer for this important observation regarding environmental heterogeneity. We agree that we should be cautious about directly extrapolating to complex ecological contexts without qualification. We have revised the last sentence of the abstract to adopt a more measured tone: "Our results may offer insights into bacterial navigation in complex biological environments such as host tissues and biofilms, providing a preliminary step toward exploring microbial ecology in confined habitats and potential strategies for controlling bacterial infections."

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Key elements of the mechanism of wall-directed chemotaxis are not sufficiently emphasized:

      For instance, the chirality of the trajectories is an essential part of the analysis but is mentioned only briefly in the introduction. In the geometrical analysis, I understand that one of the critical parameters is the angle at which bacteria "collide" with the walls. But, again, this remains largely implicit in the discussion. This comes to the point that these ideas are not even mentioned in the abstract which doesn't provide any hint of a mechanism. An analysis of the actual trajectories of the cells after they hit the walls, as a function of their initial angle would be helpful in comparison with the simulations and the geometrical analysis.

      We appreciate the reviewer's insightful comment regarding the need to better emphasize the mechanism of wall-directed chemotaxis. We agree that the chirality of trajectories and the geometry of wall collisions are central to our analysis and were previously under-emphasized.

      To address this, we have made the following revisions:

      (1) We have revised the Abstract (lines 25-27) and the Discussion (lines 391-393) to explicitly highlight the crucial role of chiral circular motion and the alignment effect following sidewall collisions.

      (2) We further analyzed bacterial trajectories at different collision angles. Typical examples are shown in Supplementary Fig. S2. We observed that cells tend to align with and swim along the sidewalls regardless of their initial collision angles. This finding is now described in the main text at lines 168-171.

      The motion of the bacteria is modelled as run-and-tumble at several places in the manuscript, and in particular in the simulations. Yet, the trajectories of the bacteria seem to be smooth in this almost 2D geometry, except of course when they directly interact with the walls (I hardly see tumbles in the MA region in Figure 1B). Can the authors elaborate on the assumptions made in the numerical simulations? In particular, how is the radius of the trajectories included in these equations of motion (line 514)?

      We apologize for the lack of clarity regarding the bacterial motion model. It has been established that while bacteria do tumble near solid surfaces, they exhibit a smaller reorientation angle compared to bulk fluids; in fact, the most probable reorientation angle on a surface is zero (Ref. 41). Consequently, tumbles are often difficult to distinguish from runs with the naked eye. Additionally, the trajectories in Figure 1B are plotted on a 44 µm × 150 µm canvas with unequal coordinate scales, which may further obscure the visual distinctness of tumbling events.

      Regarding the equations of motion: We modeled the bacteria as self-propelled particles governed by the internal chemotaxis pathway, alternating between run and tumble states. As noted in the equations on lines 286 & 578, we incorporated the circular motion by introducing a constant angular velocity, −ν<sub>0</sub>/r, during the run state. Here, ν<sub>0</sub> represents the swimming speed, r denotes the radius of circular swimming, and the negative sign indicates clockwise chirality. Furthermore, to model the hydrodynamic interaction with the boundaries, we assumed that when a cell collides with a sidewall, its velocity vector instantly aligns parallel to that wall.
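
The run-state rule described above (constant angular velocity −ν<sub>0</sub>/r plus instantaneous alignment on wall contact) can be illustrated with a minimal sketch. This is not the authors' simulation code: it omits tumbling, chemotaxis and rotational noise, the function name `run_step` is hypothetical, and the radius of 12 µm in the usage note is an illustrative assumption.

```python
import math

def run_step(x, y, theta, v0, r, dt, width):
    """Advance one run-state step for a clockwise circular swimmer.

    x is the coordinate along the lane, y the coordinate across it
    (sidewalls at y = 0 and y = width); theta is the heading in radians.
    """
    # Constant angular velocity -v0/r; the negative sign gives clockwise turning.
    theta -= (v0 / r) * dt
    x += v0 * math.cos(theta) * dt
    y += v0 * math.sin(theta) * dt
    # Assumed wall rule: on contact, clamp the position to the wall and align
    # the heading parallel to it, keeping the direction of travel along x.
    if y <= 0.0 or y >= width:
        y = min(max(y, 0.0), width)
        theta = 0.0 if math.cos(theta) >= 0.0 else math.pi
    return x, y, theta
```

For example, with `v0 = 20.0` µm/s and an assumed `r = 12.0` µm, a cell started on the wall at `y = 0` with `theta = 0` stays pinned to that wall in this sketch, because clockwise turning keeps steering it into a wall on its right, whereas a cell started there with `theta = math.pi` peels away from the wall after one step.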

      The comparison of Figure 5B (simulations) with Figure 4B (experiments) does not strike me as so "similar". Why are the points at small widths so noisy (Figure 5AB)? Figure 5C is cut at these widths, it should be plotted over the entire scale.

      We acknowledge that the agreement between simulation and experiment is less robust in the narrowest channels. The discrepancy and "noise" at small widths in Figure 5 arise from the limitations of the self-propelled particle model in highly confined geometries. Specifically, our simulation treats bacteria as point particles and does not explicitly calculate the physical exclusion (steric effects) caused by the finite size of the flagella and cell body.

      In the experimental setup, steric constraints within narrow channels (comparable to the cell size) restrict the cells' ability to turn freely, effectively stabilizing their motion. However, because our model allows particles to reorient more freely than actual cells would in such confined spaces, it produces fluctuations and an overestimation of the drift velocity at small widths. If these confinement effects were fully incorporated, the cell density mismatch between the left and right sidewalls would be reduced, leading to lower drift velocities that match the experimental data more closely.

      Regarding Figure 5C: Since the "active particle" assumption loses physical validity in channels narrower than the scale of the bacterium, the simulation results in this regime are not representative of biological reality. Plotting these non-physical points would distort the analysis. Therefore, we have maintained the truncation of Figure 5C at 4 µm to ensure the data presented is physically meaningful. We have added a clear discussion of these model limitations to the manuscript at lines 310-314.

      These important precisions should be added to the text or in a supplementary section. A validated mechanism describing in detail the impact of the walls on the cell trajectories would greatly improve the conclusions.

      We thank the reviewer for the suggestions. As noted in the responses above, we have incorporated the details concerning the simulation assumptions and the model limitations at narrow widths into the revised manuscript. We have performed further analysis of the collision trajectories between bacteria and the sidewalls. As illustrated in the new Fig. S2, the data confirms that cells tend to align with and swim along the sidewalls following a collision, regardless of the initial impact angle.

      Reviewer #2 (Recommendations for the authors):

      Minor points

      (1) Related to swimming in 3D: The authors should specify the depth of field of the objective in their setup.

      We thank the reviewer for pointing this out. We have calculated the depth of field (DOF) of our objective to be approximately 3.7 µm. This estimate is based on the standard formula:

      where λ = 610 nm (emission wavelength), n = 1.0 (refractive index), NA = 0.45 (numerical aperture), M = 20 (magnification), and e = 6.5 µm (camera resolution). We have added this specification to the "Microscopy and Data Acquisition" section of “Materials and Methods”.
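
The formula itself is not reproduced in this text. As a plausibility check only, the sketch below assumes the commonly used depth-of-field expression d = λ·n/NA² + n·e/(M·NA); that this is the expression the authors used is an assumption, not something the response confirms. The variable names are illustrative.

```python
# Parameter values listed in the response above (lengths in µm).
wavelength = 0.610    # emission wavelength, 610 nm
n_medium = 1.0        # refractive index
na = 0.45             # numerical aperture
magnification = 20.0  # objective magnification
pixel_size = 6.5      # camera resolution e

# Assumed standard expression: diffraction term plus detector (geometric) term.
diffraction_term = wavelength * n_medium / na**2
detector_term = n_medium * pixel_size / (magnification * na)
dof = diffraction_term + detector_term

print(f"depth of field: {dof:.2f} µm")
```

With these inputs the assumed expression evaluates to roughly 3.7 µm, in line with the quoted estimate.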

      (2) Related to the interpretation of the width effect: We think plotting the cell enrichment, ie the probabilities P in Figure 4B normalized to the expected value if cells were homogeneously distributed ((3µm)/w for the side walls, (w - 6µm)/w for the middle) would help understand the strength of the wall 'siphoning' effect.

      We thank the reviewer for the suggestion. We have calculated the cell enrichment by normalizing the observed probabilities against the expected values for a homogeneous distribution, as suggested. The resulting relationship between cell enrichment and lane width is presented in Figure S4.
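
The normalization proposed by the reviewer can be sketched as follows. This is an illustration of the stated expected values ((3 µm)/w for each sidewall region, (w − 6 µm)/w for the middle), not the authors' analysis code; the function name and argument names are hypothetical.

```python
def cell_enrichment(p_lsw, p_ma, p_rsw, width_um, wall_zone_um=3.0):
    """Normalize observed region probabilities by the homogeneous expectation.

    A value of 1 means the region holds as many cells as a uniform
    distribution would predict; values above 1 indicate enrichment.
    """
    if width_um <= 2.0 * wall_zone_um:
        raise ValueError("lane must be wider than the two wall zones combined")
    expected_wall = wall_zone_um / width_um
    expected_middle = (width_um - 2.0 * wall_zone_um) / width_um
    return {
        "LSW": p_lsw / expected_wall,
        "MA": p_ma / expected_middle,
        "RSW": p_rsw / expected_wall,
    }
```

For a 44-µm-wide lane, a uniform distribution places 3/44 of the cells in each wall zone and 38/44 in the middle, so `cell_enrichment(3/44, 38/44, 3/44, 44.0)` returns values of 1 for all three regions.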

      Related to simulations:

      (1) Showing vd for the 3 regions in Figure S5 would be helpful also to understand the underlying mechanism.

      We thank the reviewer for the suggestion. The V<sub>d</sub> values for the three regions are shown in Fig. S5.

      (2) Figure 5B vs 4B: There is a mismatch in the right vs left side density at w=6µm in the simulations that is not here in the experiments. What could explain this difference?

      We appreciate the reviewer pointing this out. The mismatch in the simulations is due to the simplified treatment of cells as self-propelled particles, which overlooks the physical volume of the cell body and flagella. In narrow channels (w = 6 µm), these physical constraints would restrict the cells' ability to change direction freely - a factor not fully captured in the simulation. Accounting for these steric effects would trap cells more effectively against the walls, reducing the density asymmetry between the LSW and RSW and lowering the drift velocity. This would bring the simulation results closer to the experimental observations. We have added a discussion of these limitations and effects to the revised manuscript (lines 310-314).

      (3) The simulations essentially assume that the density of motile cells is homogeneous and equal at both x=0 and x=L open ends of the channel. Is it the case in the experiments, even with the gradient, and the walls creating some cell transport?

      We thank the reviewer for pointing this out. The simulation assumption is consistent with our experimental observations. Our data were recorded within 160-μm-long lanes located in the center of the wider (400 μm) cell channel. In this central region, the cells maintain a continuous flux. Furthermore, experiments were performed within 8 min of flow, limiting the time for significant cell density gradients to establish. As illustrated in Author response image 1, the inhomogeneity in the measured cell density distribution is insignificant across the length of the observation window, indicating that the walls and gradient do not create significant heterogeneity at the boundaries of the region of interest.

      Author response image 1.

      The cell density distribution along the gradient field from the data of the 44-μm-wide lane.

      (4) Line 506: There is something strange with the definition of the bias. B cannot be the tumbling bias if k=B/0.31 s<sup>-1</sup> and the tumble-to-run rate is 5/s, because then the tumbling bias is B/0.31 / (B/0.31 + 5). Please clarify.

      We apologize for the confusion caused by the notation. In our model, B represents the CW bias of the individual flagellar motor, not the macroscopic tumbling bias of the cell. We assume the run-to-tumble rate is equivalent to the motor CCW-to-CW switching rate (k). Previous studies have shown that this rate increases linearly with the motor CW bias according to k=B/t, where t is a characteristic time (Ref. 50).

      Based on experimental data for wildtype cells, the average run time in the near-surface region is ~2.0 s (corresponding to a run-to-tumble rate of ~0.5 s<sup>-1</sup>) (Ref. 11), and the steady-state wildtype CW bias is ~0.15. Using these values, we determined t ~ 0.31 s. Consequently, the switching rate is defined as k=B/0.31 s<sup>-1</sup>. Since the tumble duration is constant (0.2 s) (Ref. 51), the tumble-to-run rate is fixed at 5 s<sup>-1</sup>. We have clarified these definitions and parameter values in lines 569-573.
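
The distinction between the motor CW bias B and the cell-level tumbling bias raised by the reviewer can be made concrete with the parameter values quoted above. A minimal sketch, assuming a two-state (run/tumble) model with constant rates; the function names are illustrative:

```python
TAU = 0.31              # s, characteristic time quoted above
TUMBLE_DURATION = 0.2   # s, constant tumble duration quoted above

def run_to_tumble_rate(cw_bias, tau=TAU):
    """Run-to-tumble rate k = B / tau, in s^-1."""
    return cw_bias / tau

def tumbling_bias(cw_bias, tau=TAU, tumble_duration=TUMBLE_DURATION):
    """Steady-state fraction of time spent tumbling in a two-state model."""
    k = run_to_tumble_rate(cw_bias, tau)
    tumble_to_run_rate = 1.0 / tumble_duration   # 5 s^-1
    return k / (k + tumble_to_run_rate)

k_wt = run_to_tumble_rate(0.15)   # steady-state wildtype CW bias
print(f"k = {k_wt:.3f} s^-1, mean run time = {1.0 / k_wt:.2f} s")
print(f"tumbling bias = {tumbling_bias(0.15):.3f}")
```

For B = 0.15 this gives k ≈ 0.48 s<sup>-1</sup> (a mean run time of about 2.1 s) and a tumbling bias of about 0.09, which matches the reviewer's expression (B/0.31)/(B/0.31 + 5) and differs from B itself.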

      Other minor comments:

      (1) Line 20 and lines 34-35: We think that the connection to infection is questionable here and should be toned down.

      Thank you for the suggestion. We have revised Line 20 to read: “Understanding bacterial behavior in confined environments is helpful to elucidating microbial ecology and developing strategies to manage bacterial infections.” Additionally, we modified lines 34-35 to state: “Our results may offer insights into bacterial navigation in complex biological environments such as host tissues and biofilms, providing a preliminary step toward exploring microbial ecology in confined habitats and potential strategies for controlling bacterial infections.”

      (2) Line 49: Consider highlighting the change in the sense of rotation at the air-liquid interface.

      Thank you for the suggestion. We have now highlighted the difference in chirality between trajectories at the air-liquid interface and those at the liquid-solid interface. The text has been updated to read: “For example, E. coli swim clockwise when observed from above a solid surface, whereas Caulobacter crescentus move in tight, counter-clockwise circles when viewed from the liquid side.”

      (3) Lines 58-59: The sentence should be better formulated, explaining what is CheY-P and that its concentration changes because of a change in phosphorylation (P).

      Thank you for the suggestion. We have reformulated this section to explicitly define CheY-P and explain how its concentration is regulated through phosphorylation. The revised text reads: “The transmembrane chemoreceptors detect attractants or repellents and transmit signals into the cell by modulating the autophosphorylation of the histidine kinase CheA. Attractant binding suppresses CheA autophosphorylation, while repellent binding promotes it. This modulation alters the concentration of the phosphorylated response regulator protein, CheY-P.”

      (4) Lines 63-64: CheR CheB do a bit more than "facilitating" adaptation, they mediate it. The notation CheB(p) may be confusing, since "-P" was used above for CheY.

      Thank you for pointing this out. We have corrected the notation and strengthened the description of the enzymes' roles. The revised text is: “The adaptation enzymes CheR and CheB methylate and demethylate the receptors, respectively, mediating sensory adaptation.”

      (5) Line 130: there must be a typo in the formula.

      We have replaced the ambiguous lag time variable in Fig. 1C with nΔt to ensure mathematical consistency.

      (6) Additionally, \Delta t is both the time between the frame here and the lag time in Figure 1.

      Thank you for highlighting this ambiguity. We have updated the notation to distinguish these two values. The lag time in Figure 1 is now explicitly denoted as nΔt, while Δt remains the time interval between individual frames.

      (7) Line 162: "Consistent with previous reports," a reference to said reports is missing.

      Thank you for pointing this out. We have now added the reference (Ref. 41) to support this statement.

      (8) Figure 1B: Are these tracks in the presence of a gradient? Same as used in panel C? This needs to be explained.

      Thank you for this question. We confirm that the tracks shown in Figure 1B were indeed recorded in the presence of a gradient and represent a subset of the data used in Figure 1C. We have clarified this in the figure legend as follows: "Thirty bacterial trajectories selected from the data of the 44-µm-wide lane in gradient assays. These represent a subset of the trajectories analyzed in panel C."

      (9) Simulations: the equation for x(t) should also be given for completeness.

      Thank you for the suggestion. For completeness, we have added the position updating equations for the run state to the Materials and Methods section (lines 579-580). The equations are defined as:

      (10) Figure S2: For the swimming directions that are more unstable due to the surface friction torque, RSW-DG, and LSW-UG, one would have expected that the Up-gradient motion is more persistent than the down gradient one. It seems to be the opposite. Is it significant, and what could be the reason for this?

      We apologize for the lack of clarity in our original explanation. While we would generally expect up-gradient motion to be more persistent than down-gradient motion in bulk fluid, our measurements near the surface show a different trend due to the specific contributions of run and tumble states to the escape rate. Cells swimming up-gradient (UG) in the LSW experience a higher probability of running. Consequently, they are subjected to the destabilizing surface friction torque for a greater proportion of time than cells swimming down-gradient (DG) in the RSW. This can be explained mathematically. The escape rates for RSW-DG and LSW-UG can be expressed as:

      where B<sup>+</sup> and B<sup>−</sup> represent the tumble bias (probability of tumbling) when swimming up-gradient and down-gradient, respectively, and k<sub>T</sub> and k<sub>R</sub> denote the escape rates during a tumble and a run, respectively. Due to the chemotactic response, 0 ≤ B<sup>+</sup> < B<sup>−</sup> ≤ 1. Crucially, our system is characterized by k<sub>R</sub> > k<sub>T</sub> (the escape rate is higher during a run than during a tumble). Therefore, the lower tumble bias during up-gradient swimming (B<sup>+</sup> < B<sup>−</sup>) increases the weight of the run-state escape term ((1−B<sup>+</sup>)k<sub>R</sub>), leading to a higher overall escape rate for LSW-UG compared to RSW-DG. We have added an intuitive explanation of why k<sub>R</sub> > k<sub>T</sub> to the Supplemental text.
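
      The escape-rate expressions themselves do not appear in this text. A reconstruction consistent with the definitions above can be sketched as follows, assuming each escape rate is the tumble-bias-weighted average of the tumble-state and run-state escape rates. The symbol k<sub>esc</sub> and its superscripts are labels introduced here for illustration only and may differ from the authors' notation.

```latex
\begin{align}
k_{\mathrm{esc}}^{\mathrm{RSW\text{-}DG}} &= B^{-} k_T + (1 - B^{-})\, k_R, \\
k_{\mathrm{esc}}^{\mathrm{LSW\text{-}UG}} &= B^{+} k_T + (1 - B^{+})\, k_R, \\
k_{\mathrm{esc}}^{\mathrm{LSW\text{-}UG}} - k_{\mathrm{esc}}^{\mathrm{RSW\text{-}DG}}
  &= (B^{-} - B^{+})\,(k_R - k_T) > 0 .
\end{align}
```

      Under this assumed form, the difference is positive whenever B<sup>+</sup> < B<sup>−</sup> and k<sub>R</sub> > k<sub>T</sub>, which matches the stated conclusion that LSW-UG has the higher overall escape rate.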

    1. Reviewer #1 (Public review):

      Summary:

      This is a careful and comprehensive study demonstrating that effector-dependent conformational switching of the MT lattice from compacted to expanded deploys the alpha tubulin C-terminal tails so as to enhance their ability to bind interactors.

      Strengths:

      The authors use 3 different sensors for the exposure of the alpha CTTs. They show that all 3 sensors report exposure of the alpha CTTs when the lattice is expanded by GMPCPP, or KIF1C, or a hydrolysis-deficient tubulin. They demonstrate that expansion-dependent exposure of the alpha CTTs works in tissue culture cells as well as in vitro.

      Appraisal:

      The authors have gone to considerable lengths to test their hypothesis that microtubule expansion favours deployment of the alpha tubulin C-terminal tail, allowing its interactors, including detyrosinase enzymes, to bind. There is a real prospect that this will change thinking in the field. One very interesting possibility, touched on by the authors, is that the requirement for MAP7 to engage kinesin with the MT might include a direct effect of MAP7 on lattice expansion.

      Impact:

      The possibility that the interactions of MAPs and motors with a particular MT or region feed forward to determine its future interaction patterns is made much more real. Genuinely exciting.

    2. Reviewer #3 (Public review):

      Summary:

      In this study, the authors investigate how the structural state of the microtubule lattice influences the accessibility of the α-tubulin C-terminal tail (CTT). By developing and applying new biosensors, they reveal that the tyrosinated CTT is largely inaccessible under normal conditions but becomes more accessible upon changes to the tubulin conformational state induced by taxol treatment, MAP expression, or GTP-hydrolysis-deficient tubulin. The combination of live imaging, biochemical assays, and simulations suggests that the lattice conformation regulates the exposure of the CTT, providing a potential mechanism for modulating interactions with microtubule-associated proteins. The work addresses a highly topical question in the microtubule field and proposes a new conceptual link between lattice spacing and tail accessibility for tubulin post-translational modification. Future work is required to determine whether CTT exposure in the microtubule lattice is sensitive to additional factors present in vivo but not in vitro.

      Strengths:

      (1) The study targets a highly relevant and emerging topic: the structural plasticity of the microtubule lattice and its regulatory implications.

      (2) The biosensor design represents a methodological advance, enabling direct visualization of CTT accessibility in living cells.

      (3) Integration of imaging, biochemical assays, and simulations provides a multi-scale perspective on lattice regulation.

      (4) The proposed conceptual framework, in which lattice conformation is a determinant of post-translational modification accessibility, is novel and potentially impactful for understanding microtubule regulation.

      [Editors' note: the authors have responded to the reviewers and this version was assessed by the editors.]

    3. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This is a careful and comprehensive study demonstrating that effector-dependent conformational switching of the MT lattice from compacted to expanded deploys the alpha tubulin C-terminal tails so as to enhance their ability to bind interactors.

      Strengths:

      The authors use 3 different sensors for the exposure of the alpha CTTs. They show that all 3 sensors report exposure of the alpha CTTs when the lattice is expanded by GMPCPP, or KIF1C, or a hydrolysis-deficient tubulin. They demonstrate that expansion-dependent exposure of the alpha CTTs works in tissue culture cells as well as in vitro.

      Weaknesses:

      There is no information on the status of the beta tubulin CTTs. The study is done with mixed isotype microtubules, both in cells and in vitro. It remains unclear whether all the alpha tubulins in a mixed isotype microtubule lattice behave equivalently, or whether the effect is tubulin isotype-dependent. It remains unclear whether local binding of effectors can locally expand the lattice and locally expose the alpha CTTs.

      Appraisal:

      The authors have gone to considerable lengths to test their hypothesis that microtubule expansion favours deployment of the alpha tubulin C-terminal tail, allowing its interactors, including detyrosinase enzymes, to bind. There is a real prospect that this will change thinking in the field. One very interesting possibility, touched on by the authors, is that the requirement for MAP7 to engage kinesin with the MT might include a direct effect of MAP7 on lattice expansion.

      Impact:

      The possibility that the interactions of MAPs and motors with a particular MT or region feed forward to determine its future interaction patterns is made much more real. Genuinely exciting.

      We thank the reviewer for their positive response to our work. We agree that it will be important to determine if the bCTT is subject to regulation similar to the aCTT. However, this will first require the development of sensors that report on the accessibility of the bCTT, which is a significant undertaking for future work.

      We also agree that it will be important to examine whether all tubulin isotypes behave equivalently in terms of exposure of the aCTT in response to conformational switching of the microtubule lattice.

      We thank the reviewer for the comment about local expansion of the microtubule lattice. We believe that Figure 3 does show that local binding of effectors can locally expand the lattice and locally expose the alpha-CTTs. We have added text to clarify this.

      Reviewer #2 (Public review):

      The unstructured α- and β-tubulin C-terminal tails (CTTs), which differ between tubulin isoforms, extend from the surface of the microtubule, are post-translationally modified, and help regulate the function of MAPs and motors. Their dynamics and extent of interactions with the microtubule lattice are not well understood. Hotta et al. explore this using a set of three distinct probes that bind to the CTTs of tyrosinated (native) α-tubulin. Under normal cellular conditions, these probes associate with microtubules only to a limited extent, but this binding can be enhanced by various manipulations thought to alter the tubulin lattice conformation (expanded or compact). These include small-molecule treatment (Taxol), changes in nucleotide state, and the binding of microtubule-associated proteins and motors. Overall, the authors conclude that microtubule lattice "expanders" promote probe binding, suggesting that the CTT is generally more accessible under these conditions. Consistent with this, detyrosination is enhanced. Mechanistically, molecular dynamics simulations indicate that the CTT may interact with the microtubule lattice at several sites, and that these interactions are affected by the tubulin nucleotide state.

      Strengths:

      Key strengths of the work include the use of three distinct probes that yield broadly consistent findings, and a wide variety of experimental manipulations (drugs, motors, MAPs) that collectively support the authors' conclusions, alongside a careful quantitative approach.

      Weaknesses:

      The challenges of studying the dynamics of a short, intrinsically disordered protein region within the complex environment of the cellular microtubule lattice, amid numerous other binders and regulators, should not be understated. While it is very plausible that the probes report on CTT accessibility as proposed, the possibility of confounding factors (e.g., effects on MAP or motor binding) cannot be ruled out. Sensitivity to the expression level clearly introduces additional complications. Likewise, for each individual "expander" or "compactor" manipulation, one must consider indirect consequences (e.g., masking of binding sites) in addition to direct effects on the lattice; however, this risk is mitigated by the collective observations all pointing in the same direction.

      The discussion does a good job of placing the findings in context and acknowledging relevant caveats and limitations. Overall, this study introduces an interesting and provocative concept, well supported by experimental data, and provides a strong foundation for future work. This will be a valuable contribution to the field.

      We thank the reviewer for their positive response to our work. We are encouraged that the reviewer feels that the Discussion section does a good job of putting the findings, challenges, and possibility of confounding factors and indirect effects in context. 

      Reviewer #3 (Public review):

      Summary:

      In this study, the authors investigate how the structural state of the microtubule lattice influences the accessibility of the α-tubulin C-terminal tail (CTT). By developing and applying new biosensors, they reveal that the tyrosinated CTT is largely inaccessible under normal conditions but becomes more accessible upon changes to the tubulin conformational state induced by taxol treatment, MAP expression, or GTP-hydrolysis-deficient tubulin. The combination of live imaging, biochemical assays, and simulations suggests that the lattice conformation regulates the exposure of the CTT, providing a potential mechanism for modulating interactions with microtubule-associated proteins. The work addresses a highly topical question in the microtubule field and proposes a new conceptual link between lattice spacing and tail accessibility for tubulin post-translational modification.

      Strengths:

      (1) The study targets a highly relevant and emerging topic: the structural plasticity of the microtubule lattice and its regulatory implications.

      (2) The biosensor design represents a methodological advance, enabling direct visualization of CTT accessibility in living cells.

      (3) Integration of imaging, biochemical assays, and simulations provides a multi-scale perspective on lattice regulation.

      (4) The proposed conceptual framework, in which lattice conformation is a determinant of post-translational modification accessibility, is novel and potentially impactful for understanding microtubule regulation.

      Weaknesses:

      There are a number of weaknesses in the paper, many of which can be addressed textually. Some of the supporting evidence is preliminary and would benefit from additional experimental validation and clearer presentation before the conclusions can be considered fully supported. In particular, the authors should directly test in vitro whether Taxol addition can induce lattice exchange (see comments below).

      We thank the reviewer for their positive response to our work. We have altered the text and provided additional experimental validation as requested (see below).

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) The resolution of the figures is insufficient.

      (2) The provision of scale bars is inconsistent and insufficient.

      (3) Figure 1E, the scale bar looks like an MT.

      (4) Figure 2C, what does the grey bar indicate?

      (5) Figure 2E, missing scale bar.

      (6) Figure 3 C, D, significance brackets misaligned.

      (7) Figure 3E, consider using the same alpha-beta tubulin / MT graphic as in Figure 1B.

      (8) Figure 5E, show cell boundaries for consistency?

      (9) Figure 6D, stray box above the y-axis.

      (11) Figure S3A, scale bar wrong unit again.

      (12) S3B "fixed" and mount missing scale bar in the inset.

      (13) S4 scale bars without scale, inconsistency in scale bars throughout all the figures.

      We apologize for issues with the figures. We have corrected all of the issues indicated by the reviewer.

      (10) Figure 6F, surprising that 300 mM KCL washes out rigor binding kinesin

      We thank the reviewer for this important point. To address the reviewer’s concern, we have added a new supplementary figure (new Figure 6 – Figure Supplement 1) which shows that the washing step removes strongly-bound (apo) KIF5C(1-560)-Halo<sup>554</sup> protein from the microtubules. In addition, we have made a correction to the Materials and Methods section noting that ATP was added in addition to the KCl in the wash buffer. We apologize for omitting this detail in the original submission. We also added text noting that the wash out step was based on Shima et al., 2018 where the observation chamber was washed with either 1 mM ATP and 300 mM K-Pipes or with 10 mM ATP and 500 mM K-Pipes buffer. In our case, the chamber was washed with 3 mM ATP and 300 mM KCl. It is likely that the addition of ATP facilitates the detachment of strongly-bound KIF5C.

      (14) Supplementary movie, please identify alpha and beta tubulins for clarity. Please identify residues lighting up in interaction sites 1, 2 & 3.

      Thank you for the suggestions. We have made the requested changes to the movie.

      Reviewer #2 (Recommendations for the authors):

      There appear to have been some minor issues (perhaps with .pdf conversion) that leave some text and images pixelated in the .pdf provided, alongside some slightly jarring text and image positioning (e.g., Figure 5E panels). The authors should carefully look at the figures to ensure that they are presented in the clearest way possible.

      We apologize for these issues with the figures. We have reviewed the figures carefully to ensure that they are presented in the clearest way possible.

      The authors might consider providing a more definitive structural description of compact vs expanded lattice, highlighting what specific parameters are generally thought to change and by what magnitude. Do these differ between taxol-mediated expansion or the effects of MAPs?

      Thank you for the suggestion. We have added additional information to the Introduction section.

      Reviewer #3 (Recommendations for the authors):

      (1) Figure 1 should include a schematic overview of all constructs used in the study. A clear illustration showing the probe design, including the origin and function of each component (e.g., tags, domains), would improve clarity.

      Thank you for the suggestion. We have added new illustrations to Figure 1 showing the origin and design (including domains and tags) of each probe.

      (2) Add Western blot data for the 4×CAP-Gly construct to Figure 1C for completeness.

      We thank the reviewer for this suggestion. We carried out a far-western blot using the purified 4xCAPGly-mEGFP protein to probe GST-Y, GST-ΔY, and GST-ΔC2 proteins (new Figure 1 – Figure Supplement 1C). We note that some bleed-through signal can be seen in the lanes containing GST-ΔY and GST-ΔC2 protein due to the imaging requirements and exposure needed to visualize the 4xCAPGly-mEGFP protein. Nevertheless, the blot shows that the purified CAPGly sensor specifically recognizes the native (tyrosinated) CTT sequence of TUBA1A.

      (3) Essential background information on the CAP-Gly domain, SXIP motif, and EB proteins is missing from the Introduction. These concepts appear abruptly in the Results and should be properly introduced.

      Thank you for the suggestion. We have added additional information to the Introduction section about the CAP-Gly domain. However, we feel that introducing the SXIP motif and EB proteins at this point would detract from the flow of the Introduction and we have elected to retain this information in the Results section when we detail development of the 4xCAPGly probe.

      (4) In Figure 2E, it remains possible that the CAP-Gly domain displacement simply follows the displacement of EB proteins. An experiment comparing EB protein localization upon Taxol treatment would clarify this relationship.

      We thank the reviewer for raising this important point. To address the reviewer’s concern, we utilized HeLa cells stably expressing EB3-GFP. We performed live-cell imaging before and after Taxol addition (new Figure 2 – Figure Supplement 1C). EB3-EGFP was lost from the microtubule plus ends within minutes and did not localize to the now-expanded lattice.

      (5) Statements such as "significantly increased" (e.g., line 195) should be replaced with quantitative information (e.g., "1.5-fold increase").

      We have made the suggested changes to the text.

      (6) Phrases like "became accessible" should be revised to "became more accessible," as the observed changes are relative, not absolute. The current wording implies a binary shift, whereas the data show a modest (~1.5-fold) increase.

      We have made the suggested changes to the text.

      (7) Similarly, at line 209, the terms "minimally accessible" versus "accessible" should be rephrased to reflect the small relative change observed; saturation of accessibility is not demonstrated.

      We have made the suggested changes to the text.

      (8) Statements that MAP7 "expands the lattice" (line 222) should be made cautiously; to my knowledge, that has not been clearly established in the literature.

      We thank the reviewer for this important comment. We have added text indicating that MAP7’s ability to induce or presence an expanded lattice has not been clearly established.

      (9) In Figures 3 and 4, the overexpression of MAP7 results in a strikingly peripheral microtubule network. Why is there this unusual morphology?

      The reviewer raises an interesting question. We are not sure why the overexpression of MAP7 results in a strikingly peripheral microtubule network, but we suspect this is unique to the HeLa cells we are using. We have observed a more uniform MAP7 localization in other cell types [e.g. COS-7 cells (Tymanskyj et al. 2018)], consistent with the literature [e.g. BEAS-2B cells (Shen and Ori-McKenney 2024), HeLa cells (Hooikaas et al. 2019)].

      (10) In Supplementary Figure 5C, the Western blot of detyrosination levels is inconsistent with the text. Untreated cells appear to have higher detyrosination than both wild-type and E254A-overexpressing cells. Do you have any explanation?

      We thank the reviewer for this important comment. We do not have an explanation at this point but plan to revisit this experiment. Unfortunately, the authors who carried out this work recently moved to a new institution and it will be several months before they are able to get the cell lines going and repeat the experiment. We thus elected to remove what was Supp Fig 5C until we can revisit the results. We believe that the important results are in what is now Figure 5 - Figure Supplement 1A,B, which shows that the expression levels of the WT and E254A proteins are similar to each other.

      (11) The image analysis method in Figures 5B and 5D requires clarification. It appears that "density" was calculated from skeletonized probe length over total area, potentially using a strict intensity threshold. It looks like low-intensity binding has been excluded; otherwise, the density would be the same from the images. If so, this should be stated explicitly. A more appropriate analysis might skeletonize and integrate total fluorescence intensity relative to the overall microtubule network.

      We have added additional information to the Materials and Methods section to clarify the image analysis. We appreciate the reviewer’s valuable feedback and the suggestion to use the integrated total fluorescence intensity, which is a theoretically sound approach. While we agree that integrated intensity is a valid metric for specific applications, its appropriate use depends on two main preconditions:

      (1) Consistent microscopy image acquisition conditions.

      (2) Consistent probe expression levels across all cells and experiments.

      We successfully maintained consistent image acquisition conditions (e.g., exposure time) throughout the experiment. However, despite generating stably expressing sensor cell lines to minimize variation, there remains an inherent biological variability in probe expression levels between individual cells. Integrated intensity is highly susceptible to this cell-to-cell variability. Relying on it would lead to a systematic error in which differences in the total amount of expressed probe would be mistaken for differences in Y-aCTT accessibility.

      The density metric (skeletonized probe length / total cell area) was deliberately chosen as it serves as a geometric measure rather than an intensity-based normalization. The density metric quantifies the proportion of the microtubule network that is occupied by Y-aCTT-labeled structures, independent of fluorescence intensity. Thus, the density metric provides a more robust and interpretable measure of Y-aCTT accessibility under the variable expression conditions inherent to our experimental system. Therefore, we believe that this geometric approach represents the most appropriate analysis for our image dataset.
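
      As a minimal sketch of the geometric density metric described above (skeletonized probe length divided by total cell area), the following assumes that the skeleton and the cell outline are already available as binary masks and that skeleton length can be approximated by counting skeleton pixels. The function name `skeleton_density`, the mask layout, and the `pixel_size_um` parameter are hypothetical and are not taken from the authors' analysis pipeline.

```python
def skeleton_density(skeleton, cell_mask, pixel_size_um=1.0):
    """Return skeleton length per unit cell area (1/um).

    skeleton and cell_mask are equally sized 2D lists of booleans.
    Assumption: length is approximated as (number of skeleton pixels)
    * pixel size, which ignores the extra length of diagonal steps.
    """
    # Count skeleton pixels that fall inside the cell outline.
    skeleton_pixels = sum(
        1
        for s_row, c_row in zip(skeleton, cell_mask)
        for s, c in zip(s_row, c_row)
        if s and c
    )
    # Count all pixels belonging to the cell.
    cell_pixels = sum(1 for row in cell_mask for c in row if c)
    if cell_pixels == 0:
        raise ValueError("cell_mask contains no pixels")

    length_um = skeleton_pixels * pixel_size_um
    area_um2 = cell_pixels * pixel_size_um ** 2
    return length_um / area_um2


# Toy example: a 4 x 4 cell containing one 4-pixel-long filament.
cell_mask = [[True] * 4 for _ in range(4)]
skeleton = [
    [False, False, False, False],
    [True, True, True, True],
    [False, False, False, False],
    [False, False, False, False],
]
print(skeleton_density(skeleton, cell_mask))
```

      Because both inputs are binary masks, the result depends only on where the probe is detected, not on how bright it is. The intensity threshold used to produce the skeleton still determines which filaments are counted.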

      (12) In Figure 5D, the fold-change data are difficult to interpret due to the compressed scale. Replotting is recommended. The text should also discuss the relative fold changes between E254A and Taxol conditions, Figure 2H.

      We appreciate the reviewer's insightful comment. We agree that the presence of significant outliers led to a compressed Y-axis scale in Figure 5D, obscuring the clear difference between the WT-tubulin and E254A-tubulin groups. As suggested, we have replotted Figure 5D using a broken Y-axis to effectively expand the relevant lower range of the data while still accurately representing all data points, including the outliers. We believe that the revised graph significantly enhances the clarity and interpretability of these results. For Figure 2, we have added the relative fold changes to the text as requested.

      (13) Figure 6. The authors should directly test in vitro whether Taxol addition can induce lattice exchange, for example, by adding Taxol to GDP-microtubules and monitoring probe binding. Including such an assay would provide critical mechanistic evidence and substantially strengthen the conclusions. I was waiting for this experiment since Figure 2.

      We thank the reviewer for this suggestion. As suggested, we generated GDP-MTs from HeLa tubulin and added them to two flow chambers. We then flowed the YL1/2<sup>Fab</sup>-EGFP probe into the chambers in the presence of DMSO (vehicle control) or Taxol. Static images were taken and the fluorescence intensity of the probe on microtubules in each chamber was quantified. There was a slight but not statistically significant difference in probe binding between control and Taxol-treated GDP-MTs (Author response image 1). While disappointing, these results underscore our conclusion (Discussion section) that microtubule assembly in vitro may not produce a lattice state resembling that in cells, either due to differences in protofilament number and/or buffer conditions and/or the lack of MAPs during polymerization.

      Author response image 1.

      References

      Hooikaas, P. J., Martin, M., Muhlethaler, T., Kuijntjes, G. J., Peeters, C. A. E., Katrukha, E. A., Ferrari, L., Stucchi, R., Verhagen, D. G. F., van Riel, W. E., Grigoriev, I., Altelaar, A. F. M., Hoogenraad, C. C., Rudiger, S. G. D., Steinmetz, M. O., Kapitein, L. C. and Akhmanova, A. (2019). MAP7 family proteins regulate kinesin-1 recruitment and activation. J Cell Biol, 218, 1298-1318.

      Shen, Y. and Ori-McKenney, K. M. (2024). Microtubule-associated protein MAP7 promotes tubulin posttranslational modifications and cargo transport to enable osmotic adaptation. Dev Cell, 59, 1553-1570.

      Tymanskyj, S. R., Yang, B. H., Verhey, K. J. and Ma, L. (2018). MAP7 regulates axon morphogenesis by recruiting kinesin-1 to microtubules and modulating organelle transport. Elife, 7.

    1. Reviewer #3 (Public review):

      This manuscript presents a study combining molecular dynamics simulations and Hedgehog (Hh) pathway assays to investigate cholesterol translocation pathways to Smoothened (SMO), a G protein-coupled receptor central to Hedgehog signal transduction. The authors identify and characterize two putative cholesterol access routes to the transmembrane domain (TMD) of SMO and propose a model whereby cholesterol traverses through the TMD to the cysteine-rich domain (CRD), which is presented as the primary site of SMO activation.

      The MD simulations and biochemical experiments are carefully executed and provide useful data.

      Comments on revisions:

      I appreciate the authors' detailed response and the substantial revisions made to the manuscript. The changes addressing Comments 3.1-3.5 have significantly improved the balance and framing of the work, and my primary concerns regarding overstatement and selective interpretation have been satisfactorily addressed.

      The authors' rebuttal to my initial review includes extended argumentation regarding specific interpretations of prior studies and broader models of SMO regulation. These issues represent longstanding differences in interpretation that have already been discussed extensively in the literature and are not essential to evaluating the quality or conclusions of the present study.

      For readers seeking a comprehensive and balanced overview of cholesterol-dependent SMO activation that integrates both CRD- and TMD-centered models, I would point to recent review articles (e.g., Zhang and Beachy, Nat Rev Mol Cell Biol 2023). I do not feel it is productive to rehash these debates further in the context of this review, and I have no additional substantive concerns with the revised manuscript.

    2. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This manuscript primarily uses simulation tools to probe the pathway of cholesterol transport with the Smoothened (SMO) protein. The pathway to the protein and within SMO is clearly discovered, and interactions deemed important are tested experimentally to validate the model predictions.

      Strengths:

      The authors have clearly demonstrated how cholesterol might go from the membrane through SMO for the inner and outer leaflets of a symmetrical membrane model. The free energy profiles, structural conformations, and cholesterol-residue interactions are clearly described.

      We thank the reviewer for their kind words.

      (1) Membrane Model: The authors decided to use a rather simple symmetric membrane with just cholesterol, POPC, and PSM at the same concentration for the inner and outer leaflets. This is not representative of asymmetry known to exist in plasma membranes (SM only in the outer leaflet and more cholesterol in this leaflet). This may also be important to the free energy pathway into SMO. Moreover, PE and anionic lipids are present in the inner leaflet and are ignored. While I am not requesting new simulations, I would suggest that the authors should clearly state that their model does not consider lipid concentration leaflet asymmetry, which might play an important role.

      We thank the reviewer for their comment. Membrane asymmetry is inherent in endogenous systems; we acknowledge that as a limitation of our current model. We have addressed the comment by adding this limitation to our discussion in the manuscript.

      Added lines: (End of paragraph 6, Results subsection 2):

      “One possibility that might alter the thermodynamic barriers is native membrane asymmetry, particularly the anionic lipid-rich inner leaflet. This presents as a limitation of our current model.”

      (2) Statistical comparison of barriers: The barriers for pathways 1 and 2 are compared in the text, suggesting that pathway 2 has a slightly higher barrier than pathway 1. However, are these statistically different? If so, the authors should state the p-value. If not, then the text in the manuscript should not state that one pathway is preferred over the other.

      We thank the reviewer for their comment. We have added statistical t-tests for the barriers.

      Changes made: (Paragraph 6, Results subsection 2)

      “However, we also observe that pathway 1 shows a lower thermodynamic barrier (5.8 ± 0.7 kcal/mol v/s 6.5 ± 0.8 kcal/mol, p = 0.0013)”
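
      The response does not state which t-test was used, how many samples underlie each barrier, or whether the ± values are standard deviations. As a sketch only, a Welch (unequal-variance) t statistic and its degrees of freedom can be computed from summary statistics as below. The sample size `n = 20` is a purely hypothetical placeholder, the ± values are assumed to be standard deviations, and the example is not intended to reproduce the reported p-value.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    computed from the means, standard deviations, and sizes of two samples."""
    v1 = sd1 ** 2 / n1  # squared standard error of sample 1
    v2 = sd2 ** 2 / n2  # squared standard error of sample 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Barriers quoted in the text (kcal/mol); n is a hypothetical sample size.
n = 20
t_stat, dof = welch_t(5.8, 0.7, n, 6.5, 0.8, n)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

      A two-sided p-value would then follow from the Student t distribution with `dof` degrees of freedom, for example using a statistics library.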

      (3) Barrier of cholesterol (reasoning): The authors on page 7 argue that there is an enthalpy barrier between the membrane and SMO due to the change in environment. However, cholesterol lies in the membrane with its hydroxyl interacting with the hydrophilic part of the membrane and the other parts in the hydrophobic part. How is the SMO surface any different? It has both characteristics and is likely balanced similarly to uptake cholesterol. Unless this can be better quantified, I would suggest that this logic be removed.

      We thank the reviewer for this suggestion. We have removed the line to avoid confusion.

      Reviewer #2 (Public review):

      Summary:

      In this work, the authors applied a range of computational methods to probe the translocation of cholesterol through the Smoothened receptor. They test whether cholesterol is more likely to enter the receptor straight from the outer leaflet of the membrane or via a binding pathway in the inner leaflet first. Their data reveal that both pathways are plausible but that the free energy barriers of pathway 1 are lower, suggesting this route is preferable. They also probe the pathway of cholesterol transport from the transmembrane region to the cysteine-rich domain (CRD).

      Strengths:

      (1) A wide range of computational techniques is used, including potential of mean force calculations, adaptive sampling, dimensionality reduction using tICA, and MSM modelling. These are all applied rigorously, and the data are very convincing. The computational work is an exemplar of a well-carried out study.

      (2) The computational predictions are experimentally supported using mutagenesis, with an excellent agreement between their PMF and mRNA fold change data.

      (3) The data are described clearly and coherently, with excellent use of figures. They combine their findings into a mechanism for cholesterol transport, which on the whole seems sound.

      (4) The methods are described well, and many of their analysis methods have been made available via GitHub, which is an additional strength.

      Weaknesses:

      (1) Some of the data could be presented a little more clearly. In particular, Figure 7 needs additional annotation to be interpretable. Can the position of the cholesterol be shown on the graph so that we can see the diameter change more clearly?

      We thank the reviewer for this suggestion. We have added the cholesterol positions as requested.

      Changes made: (Caption, Figure 7)

      “The tunnel profile during cholesterol translocation in SMO. (a) Free energy plot of the z-coordinate v/s the tunnel diameter when cholesterol is present in the core TMD. The tunnel shows a spike in the radius in the TMD domain, indicating the presence of a cholesterol-accommodating cavity. (b) Representative figure for the tunnel when a cholesterol molecule is in the TMD. (c) Same as (a), when cholesterol is at the TMD-CRD interface. (d) Same as (b), when cholesterol is at the TMD-CRD interface. (e) Same as (a), when cholesterol is at the CRD binding site. (f) Same as (b), when cholesterol is at the CRD binding site. Tunnel diameters shown as spheres. Cholesterol positions marked on plots using dotted lines. All snapshots presented are frames taken from MD simulations.”

      (2) In Figure 3C, it doesn’t look like the Met is constricting the tunnel at all. What residue is constricting the tunnel here? Can we see the Ala and Met panels from the same angle to compare the landscapes? Or does the mutation significantly change the tunnel? Why not A283 to a bulkier residue? Finally, the legend says that the figure shows that cholesterol can still pass this residue, but it doesn’t really show this. Perhaps if the HOLE graph was plotted, we could see the narrowest point of the tunnel and compare it to the size of cholesterol.

      We thank the reviewer for this suggestion. A283 was mutated to methionine as it presents with a longer heavy tail containing sulfur. We have plotted the tunnel radii for both WT and A283M mutants and added them as a supplemental figure. As shown in the figure, the presence of methionine doesn’t completely block the tunnel, but occludes it, thereby increasing the barrier for cholesterol transport slightly.

      Changes made: (End of Results subsection 1)

      “When we calculated the PMF for cholesterol entry, the A<sup>2.60f</sup>M mutant showed a restricted tunnel, but the mutation did not fully block it (Figure 3—figure Supplement 3).”

      (3) The PMF axis in 3b and d confused me for a bit. Looking at the Supplementary data, it’s clear that, e.g., the F455I change increases the energy barrier for chol entering the receptor. But in 3d this is shown as a -ve change, i.e., favourable. This seems the wrong way around for me. Either switch the sign or make this clearer in the legend, please.

      We thank the reviewer for this suggestion. We measured ∆PMF as PMF<sub>WT</sub> - PMF<sub>mutant</sub>, hence the negative values. We have added additional text to the legend to clarify this.

      Changes made: (Caption, Figure 3)

      “(b) ∆Gli1 mRNA fold change (high SHH vs untreated) and ∆PMF (difference of peak PMF, calculated as PMF<sub>WT</sub> - PMF<sub>mutant</sub>) plotted for the mutants in Pathway 1. (c) Example mutant A<sup>2.60f</sup>M shows that cholesterol can enter SMO through Pathway 1 even on a bulky mutation. (d) Same as (b) but for Pathway 2. (e) Example mutant L<sup>5.62f</sup>A shows that cholesterol can enter SMO through Pathway 2 due to lesser steric hindrance. All snapshots presented are frames taken from MD simulations.”

      Changes made: (Caption, Figure 6)

      “(b) ∆Gli1 mRNA fold change (high SHH vs untreated) and ∆ PMF (difference of peak PMF, calculated as PMF<sub>WT</sub> - PMF<sub>mutant</sub>) plotted for mutants along the TMD-CRD pathway. (c, d) Example mutants Y<sup>LD</sup>A and F<sup>5.65f</sup>A show that cholesterol is unable to translocate through this pathway because of the loss of crucial hydrophobic contacts provided by Y207 and F484 and along the solvent-exposed pathway.”
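
      The sign convention in these captions can be sketched in a few lines. This is only an illustration: the function name `delta_pmf` and the mutant barrier of 7.0 kcal/mol are hypothetical, while 5.8 kcal/mol is the wild-type Pathway 1 barrier quoted in this response.

```python
def delta_pmf(pmf_wt: float, pmf_mutant: float) -> float:
    """Delta PMF as defined in the captions: PMF_WT - PMF_mutant (kcal/mol)."""
    return pmf_wt - pmf_mutant


# Wild-type peak barrier for Pathway 1, as quoted in the response.
PMF_WT = 5.8
# Hypothetical mutant peak barrier, chosen only to illustrate the sign.
PMF_MUTANT_EXAMPLE = 7.0

change = delta_pmf(PMF_WT, PMF_MUTANT_EXAMPLE)
# A mutant with a HIGHER barrier than wild type gives a NEGATIVE delta PMF.
raises_barrier = change < 0
print(f"delta PMF = {change:.1f} kcal/mol, raises barrier: {raises_barrier}")
```

      Under this convention, a negative value means the mutation makes cholesterol entry less favourable, which is the reading the reviewer asked to have spelled out.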

      (4) The impact of G280V is put down to a decrease in flexibility, but it could also be a steric hindrance. This should be discussed.

      We thank the reviewer for this suggestion. We have added it as a possible mechanism of the decrease in activity of SMO.

      Changes made: (Paragraph 5, Results subsection 1)

      “We mutated G280<sup>2.57f</sup> to valine (G<sup>2.57f</sup>V) to test whether reducing the flexibility of TM2 prevents cholesterol entry into the TMD. Consequently, the activity of mSMO showed a decrease. However, this decrease could also be attributed to steric hindrance added by the presence of a bulky isopropyl group in valine.”

      (5) Are the reported energy barriers of the two pathways (5.8 ± 0.7 and 6.5 ± 0.8 kcal/mol) significantly and/or substantially different enough to favour one over the other? This could be discussed in the manuscript.

      We thank the reviewer for this suggestion. We have added statistical t-tests for the barriers.

      Changes made: (Paragraph 6, Results subsection 2)

      “However, we also observe that pathway 1 shows a lower thermodynamic barrier (5.8 ± 0.7 kcal/mol v/s 6.5 ± 0.8 kcal/mol, p = 0.001)”
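
      For readers who want to see how such a comparison is assembled from summary statistics, here is a minimal sketch of a Welch (unequal-variance) t statistic. It makes two assumptions that the response does not confirm: that the ± values are standard deviations, and that each estimate rests on 20 independent samples (`N_SAMPLES` is hypothetical). It will therefore not necessarily reproduce the reported p-value.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t_stat = (mean1 - mean2) / math.sqrt(v1 + v2)
    dof = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, dof


# Hypothetical sample count per pathway; not stated in the response.
N_SAMPLES = 20

# Barriers quoted in the response (kcal/mol), treated here as mean and SD.
t_stat, dof = welch_t(5.8, 0.7, N_SAMPLES, 6.5, 0.8, N_SAMPLES)
print(f"t = {t_stat:.3f}, degrees of freedom = {dof:.1f}")
```

      Converting the statistic to a p-value requires the Student t distribution, which is available in statistics packages but is omitted here to keep the sketch dependency-free.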

      (6) Are the energy barriers consistent with a passive diffusion-driven process? It feels like, without a source of free energy input (e.g., ion or ATP), these barriers would be difficult to overcome. This could be discussed.

      We thank the reviewer for this suggestion. We have added a discussion to further clarify this point.

      Discussion: (Paragraph 6, Results subsection 2)

      “These values are comparable to ATP-Binding Cassette (ABC) transporters of membrane lipids, which use ATP hydrolysis (-7.54 ± 0.3 kcal/mol) (Meurer et al., 2017) to drive lipid transport from the membrane to an extracellular acceptor. Some of these transporters share the same mechanism as SMO, where the lipid from the inner leaflet is flipped and transported to the extracellular acceptor protein (Tarling et al., 2013). Additionally, for secondary active transporters that do not use ATP for the transport of substrates, a thermodynamic barrier of 5-6 kcal/mol has been reported in the literature (Chan et al., 2022; Selvam et al., 2019; McComas et al., 2023; Thangapandian et al., 2025).”
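
      One way to put these barriers in context is to express them in units of thermal energy. The sketch below assumes a temperature of 300 K, which is not stated in the response, and uses the gas constant in kcal/(mol·K). The Boltzmann ratio ignores kinetic prefactors, so it is a rough guide only.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3
# Assumed temperature in kelvin; the response does not state one.
TEMPERATURE_K = 300.0


def barrier_in_rt(barrier_kcal, temperature=TEMPERATURE_K):
    """Express a free energy barrier (kcal/mol) in multiples of RT."""
    return barrier_kcal / (R_KCAL * temperature)


def boltzmann_ratio(low_kcal, high_kcal, temperature=TEMPERATURE_K):
    """Relative crossing probability exp(-low/RT) / exp(-high/RT)."""
    return math.exp((high_kcal - low_kcal) / (R_KCAL * temperature))


pathway1_rt = barrier_in_rt(5.8)  # barrier quoted for pathway 1
pathway2_rt = barrier_in_rt(6.5)  # barrier quoted for pathway 2
ratio = boltzmann_ratio(5.8, 6.5)
print(f"pathway 1: {pathway1_rt:.1f} RT, pathway 2: {pathway2_rt:.1f} RT")
print(f"Boltzmann ratio (pathway 1 over pathway 2): {ratio:.1f}")
```

      Both barriers are on the order of ten RT under this assumption, so thermal fluctuations cross them only rarely. The estimate also ignores the reported uncertainties of ± 0.7 and ± 0.8 kcal/mol.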

      (7) Regarding the kinetics from MSM, it is stated that the values seen here are similar to MFS transporters, but this then references another MSM study. A comparison to experimental values would support this section a lot.

      We thank the reviewer for this suggestion. We have added a discussion of the millisecond-scale timescales measured for MFS transporters.

      Changes made: (Paragraph 2, Results subsection 5)

      “These timescales are comparable to the substrate transport timescales of Major Facilitator Superfamily (MFS) transporters (Chan et al., 2022). Furthermore, several experimental studies have also resolved the millisecond-scale kinetics of MFS transporters (Blodgett and Carruthers, 2005; Körner et al., 2024; Bazzone et al., 2022; Smirnova et al., 2014; Zhu et al., 2019), further corroborating the results from our study.”

      Reviewer #2 (Recommendations for the authors):

      (1) The heatmaps in Figures 2a and 4a are great. On these, an arrow denotes what looks like a minimum energy path. Is it possible to see this plotted, as this might show the height of the energy barriers more clearly?

      We thank the reviewer for this suggestion. We have computed the minimum energy paths for both pathways and presented them in a supplementary figure.

      Added lines: (Paragraph 4, Results subsection 1):

      For further clarity, we have plotted the minimum energy path taken by cholesterol as it translocates along this pathway (Figure 2—figure Supplement 3a,b).

      Added lines: (Paragraph 4, Results subsection 2):

      For further clarity, we have plotted the minimum energy path taken by cholesterol as it translocates along this pathway (Figure 2—figure Supplement 3c,d).
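
      The response does not say how the minimum energy paths were computed. As a generic illustration only, the sketch below finds a "minimax" path on a gridded free energy surface: the route between two bins whose highest point is as low as possible, found with a Dijkstra-style search. The grid values and the function name `minimax_path` are hypothetical and are not taken from the manuscript.

```python
import heapq


def minimax_path(grid, start, goal):
    """Return (path, peak): the 4-connected path whose highest value is lowest."""
    rows, cols = len(grid), len(grid[0])
    best = {start: grid[start[0]][start[1]]}  # lowest known peak per cell
    prev = {}
    heap = [(best[start], start)]
    while heap:
        peak, node = heapq.heappop(heap)
        if node == goal:
            break
        if peak > best[node]:
            continue  # stale heap entry
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new_peak = max(peak, grid[nr][nc])
                if new_peak < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_peak
                    prev[(nr, nc)] = node
                    heapq.heappush(heap, (new_peak, (nr, nc)))
    # Walk back from the goal to recover the path.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return path, best[goal]


# Hypothetical free energy surface (kcal/mol) on a 3 x 3 grid of bins.
toy_surface = [
    [0.0, 6.0, 1.0],
    [2.0, 9.0, 1.5],
    [3.0, 4.0, 0.5],
]
path, peak = minimax_path(toy_surface, (0, 0), (0, 2))
barrier = peak - toy_surface[0][0]  # barrier relative to the starting bin
print(path, barrier)
```

      Plotting the free energy along such a path gives a one-dimensional profile from which barrier heights can be read directly, which is what the reviewer asked to see.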

      (2) The tICA data in S15 is first referred to on line 137, but the technique isn’t introduced until line 222. This makes understanding the data a little confusing. Reordering this might improve readability.

      We thank the reviewer for this suggestion. We have reordered the text to make it clearer.

      Changes made: (Paragraph 2, Results subsection 1) This provides evidence for multiple stable poses along the pathway as observed in the multiple stable poses of cholesterol in Cryo-EM structures of SMO bound to sterols (Deshpande et al., 2019; Qi et al., 2019b, 2020). A reliable estimate of the barriers comes from using the time-lagged Independent Components (tICs), which project the entire dataset along the slowest kinetic degrees of freedom. Overall, the highest barrier along Pathway 1 is 5.8 ± 0.7 kcal/mol, and it is associated with the entry of cholesterol into the TMD (Figure 2—Figure Supplement 2).

      Changes made: (Paragraph 3, Results subsection 2)

      “On plotting the first two components of tICs, (Figure 2—Figure Supplement 2), we observe that the energetic barrier between η and θ is ∼6.5 ± 0.8 kcal/mol.”

      (3) Missing bracket on line 577.

      We thank the reviewer for this suggestion. The typo has been fixed.

      (4) Line 577: Fig. S2nd?

      We thank the reviewer for this suggestion. This typo has been fixed.

      Reviewer #3 (Public review):

      Summary:

      This manuscript presents a study combining molecular dynamics simulations and Hedgehog (Hh) pathway assays to investigate cholesterol translocation pathways to Smoothened (SMO), a G protein-coupled receptor central to Hedgehog signal transduction. The authors identify and characterize two putative cholesterol access routes to the transmembrane domain (TMD) of SMO and propose a model whereby cholesterol traverses through the TMD to the cysteine-rich domain (CRD), which is presented as the primary site of SMO activation. The MD simulations and biochemical experiments are carefully executed and provide useful data.

      Weaknesses:

      However, the manuscript is significantly weakened by a narrow and selective interpretation of the literature, overstatement of certain conclusions, and a lack of appropriate engagement with alternative models that are well-supported by published data-including data from prior work by several of the coauthors of this manuscript. In its current form, the manuscript gives a biased impression of the field and overemphasizes the role of the CRD in cholesterol-mediated SMO activation. Below, I provide specific points where revisions are needed to ensure a more accurate and comprehensive treatment of the biology.

      (1) Overstatement of the CRD as the Orthosteric Site of SMO Activation

      The manuscript repeatedly implies or states that the CRD is the orthosteric site of SMO activation, without adequate acknowledgment of alternative models. To give just a few examples (of many in this manuscript):

      (a) “PTCH is proposed to modulate the Hh signal by decreasing the ability of membrane cholesterol to access SMO’s extracellular cysteine-rich domain (CRD)” (p. 3).

      (b) “In recent years, there has been a vigorous debate on the orthosteric site of SMO” (p. 3).

      (c) “cholesterol must travel through the SMO TMD to reach the orthosteric site in the CRD” (p. 4).

      (d) “we observe cholesterol moving along TM6 to the TMD-CRD interface (common pathway, Fig. 1d) to access the orthosteric binding site in the CRD” (p. 6).

      While the second quote in this list at least acknowledges a debate, the surrounding text suggests that this debate has been entirely resolved in favor of the CRD model. This is misleading and not reflective of the views of other investigators in the field (see, for example, a recent comprehensive review from Zhang and Beachy, Nature Reviews Molecular and Cell Biology 2023, which makes the point that both the CRD and 7TM sites are critical for cholesterol activation of SMO as well as PTCH-mediated regulation of SMO-cholesterol interactions).

      In contrast, a large body of literature supports a dual-site model in which both the CRD and the TMD are bona fide cholesterol-binding sites essential for SMO activation. Examples include:

      (a) Byrne et al., Nature 2016: point mutation of the CRD cholesterol binding site impairs-but does not abolish-SMO activation by cholesterol (SMO D99A, Y134F, and combination mutants - Fig 3 of the 2016 study).

      (b) Myers et al., Dev Cell 2013 and PNAS 2017: CRD deletion mutants retain responsiveness to PTCH regulation and cholesterol mimetics (similar Hh responsiveness of a CRD deletion mutant is also observed in Fig. 4 Byrne et al, Nature 2016).

      (c) Deshpande et al., Nature 2019: mutation of residues in the TMD cholesterol binding site blocks SMO activation entirely, strongly implicating the TMD as a required site, in contrast to the partial effects of mutating or deleting the CRD site.

      Qi et al., Nature 2019, and Deshpande et al., Nature 2019, both reported cholesterol binding at the TMD site based on high-resolution structural data. Oddly, Deshpande et al., Nature 2019, is not cited in the discussion of TMD binding on p. 3, despite being one of the first papers to describe cholesterol in the TMD site and its necessity for activation (the authors only cite it regarding activation of SMO by synthetic small molecules).

      Kinnebrew et al., Sci Adv 2022 report that CRD deletion abolished PTCH regulation, which is seemingly at odds with several studies above (e.g., Byrne et al, Nature 2016; Myers et al, Dev Cell 2013); but this difference may reflect the use of an N-terminal GFP fusion to SMO in the Kinnebrew et al 2022, which could alter SMO activation properties by sterically hindering activation at the TMD site by cholesterol (but not synthetic SMO agonists like SAG); in contrast, the earlier work by Byrne et al is not subject to this caveat because it used an untagged, unmodified form of SMO.

      Although overexpression of PTCH1 and SMO (wild-type or mutant) has been noted as a caveat in studies of CRD-independent SMO activation by cholesterol, this reviewer points out that several of the studies listed above include experiments with endogenous PTCH1 and low-level SMO expression, demonstrating that SMO can clearly undergo activation by cholesterol (as well as regulation by PTCH1) in a manner that does not require the CRD.

      Recommendation: The authors should revise the manuscript to provide a more balanced overview of the field and explicitly acknowledge that the CRD is not the sole activation site. Instead, a dual-site model is more consistent with available structural, mutational, and functional data. In addition, the authors should reframe their interpretation of their MD studies to reflect this broader and more accurate view of how cholesterol binds and activates SMO.

      We thank the reviewer for this comprehensive overview of the existing literature. We agree that cholesterol binding to both the TMD and CRD sites is required for full activation of SMO. As described below in responses to comments, we have made changes to the manuscript to make this point clear. For instance, in the revised manuscript, we refrain from calling the CRD cholesterol binding site the “orthosteric site”. Instead, we highlight that the goal of the manuscript is not to resolve the debate over whether the TMD or CRD site is more important for PTCH1 regulation of SMO but rather to use molecular dynamics to understand the fascinating question of how cholesterol in the membrane can reach the CRD, located at a significant distance above the outer leaflet of the membrane. We believe that this is an important goal since there is an abundance of evidence that supports the view that PTCH1 inhibits SMO by reducing cholesterol access to the CRD. This evidence is now summarized succinctly in the introduction:

      Changes made: (Paragraph 4, Introduction)

      “While cholesterol binding to both the TMD and CRD sites is required for full SMO activation, our work focuses on how cholesterol gains access to the CRD site, perched above the outer leaflet of the membrane (Luchetti et al., 2016; Kinnebrew et al., 2022). Multiple lines of evidence suggest that PTCH1-regulated cholesterol binding to the CRD plays an instructive role in SMO regulation both in cells and animals. Mutations in residues predicted to make hydrogen bonds with the hydroxyl group of cholesterol bound to the CRD reduced both the potency and efficacy of SHH in cellular signaling assays (Kinnebrew et al., 2022; Byrne et al., 2016) and, more importantly, eliminated HH signaling in mouse embryos (Xiao et al., 2017). Experiments using both covalent and photocrosslinkable sterol probes in live cells directly show that PTCH1 activity reduces sterol access to the CRD (Kinnebrew et al., 2022; Xiao et al., 2017). Notably, our simulations evaluate a path of cholesterol translocation that includes both the TMD and CRD sites: cholesterol first enters the 7-transmembrane domain bundle from the membrane; it then engages the TMD site before continuing along a conduit to the CRD site. Thus, we analyze translocation energetics and residue-level contacts along a path that includes both the TMD and the CRD.”

      However, Reviewer 3 makes several comments below that are biased, inaccurate, or selective. We feel it is important to address these so readers can approach the literature from a balanced perspective. Indeed, the eLife review forum provides an ideal venue to present contrasting views on a scientific model. We encourage the editors to publish both Reviewer 3’s comments and our response in full so readers can read the original papers and reach their own conclusions. It is important to note these issues are not relevant to the quality of the computational and experimental data presented in this paper.

      We have now removed the term “orthosteric” to describe the CRD site throughout the paper and clearly state in the introduction that “both the CRD and TMD sites are required for SMO activation” but that our focus is on how cholesterol moves from the membrane to the CRD site. There is no doubt that cholesterol binding to the CRD plays a key role in SMO activation; our focus on this path is justified and does not devalue the importance of the TMD site. Our prior models (see Figure 7 of Kinnebrew et al., 2022) explicitly include contributions of both sites.

      Now we respond to some of the concerns outlined, individually:

      (1) Byrne et al., Nature 2016: point mutation of the CRD cholesterol binding site impairs-but does not abolish-SMO activation by cholesterol (SMO D99A, Y134F, and combination mutants - Fig 3 of the 2016 study)

      The fact that a point mutation dramatically diminishes (but does not abolish) signaling does not mean that the CRD cholesterol binding site is not important for SMO regulation. Indeed, the reviewer fails to mention that Song et al. (Molecular Cell, 2017) found that a subtle mutation at D99 (D95/99N; this residue makes a hydrogen bond with the cholesterol hydroxyl) completely abolishes SMO signaling in mouse embryos. Thus, the CRD site is critical for SMO activation in an intact animal, justifying our focus on evaluating the path of cholesterol translocation to the CRD site.

      (2) Myers et al., Dev Cell 2013 and PNAS 2017: CRD deletion mutants retain responsiveness to PTCH regulation and cholesterol mimetics (similar Hh responsiveness of a CRD deletion mutant is also observed in Fig 4 Byrne et al, Nature 2016).

      The reviewer fails to note that CRD-deleted versions of SMO have markedly (>10-fold) higher basal (i.e. ligand-independent) activity compared to full-length SMO. The response to SHH is minimal (∼2-fold), compared to >50-100-fold with full-length SMO. Thus, CRD-deleted SMO is likely in a non-native conformation. Local changes in cholesterol accessibility caused by PTCH1 inactivation or cholesterol loading can cause small fluctuations in delta-CRD activity, but this cannot be used to infer meaningful insights about how native, full-length SMO (with >10-fold lower basal activity) is regulated. We encourage the reviewer to read our previous paper (Kinnebrew et al., 2022), which presents a unified view of how the TMD and CRD sites together regulate SMO activation.

      A more physiological experiment, reported in Kinnebrew et al. (2022), tested mutations in residues that make hydrogen bonds with cholesterol at the CRD and TMD sites in the context of full-length SMO. These mutants were stably expressed at moderate levels in Smo<sup>−/−</sup> cells. Mutations at the CRD site reduced the fold-increase in signaling output in response to SHH, as would be expected for a PTCH1-regulated site. In contrast, analogous mutations in the TMD site reduced the magnitude of both basal and maximal signaling, without affecting the fold-change in response to SHH. In signaling assays, the key parameter in evaluating the impact of a mutation is whether it impacts the change in output in response to a signal (in this case PTCH1 inactivation by SHH). A mutation in SMO that affects PTCH1 regulation is expected to decrease the fold-change in signaling in response to SHH, a criterion that is fulfilled by mutations in the CRD site. Accordingly, mutations in the CRD site abolish SMO signaling in mouse embryos (Xiao et al., 2017).
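
      The distinction drawn here between signal magnitude and fold-change can be illustrated with made-up numbers. All values below are hypothetical reporter outputs in arbitrary units, chosen only to show the arithmetic; they are not data from Kinnebrew et al. or from this study.

```python
def fold_change(basal: float, stimulated: float) -> float:
    """Fold-change in signaling output: stimulated divided by basal."""
    return stimulated / basal


# Hypothetical (basal, SHH-stimulated) outputs in arbitrary units.
scenarios = {
    "wild type": (1.0, 50.0),
    # Both basal and maximal output scaled down by the same factor.
    "magnitude reduced": (0.25, 12.5),
    # Basal output unchanged, response to SHH blunted.
    "response reduced": (1.0, 5.0),
}

fold_changes = {name: fold_change(b, s) for name, (b, s) in scenarios.items()}
for name, value in fold_changes.items():
    print(f"{name}: {value:.0f}-fold")
```

      In this toy example, scaling basal and maximal output together leaves the fold-change at its wild-type value, whereas blunting only the stimulated output lowers it. That is the criterion the response uses to separate the two classes of mutation.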

      (3) Deshpande et al., Nature 2019: mutation of residues in the TMD cholesterol binding site blocks SMO activation entirely, strongly implicating the TMD as a required site, in contrast to the partial effects of mutating or deleting the CRD site.

      Bulky mutations at the TMD site (V333F) that abolish SMO activity were first reported by Byrne et al. (2016) and were used to markedly increase the stability of SMO for protein expression. These mutations indeed stabilize the inactive state of SMO, increasing protein abundance and completely preventing its localization at primary cilia. SMO variants carrying such bulky mutations cannot be used to infer the importance of the TMD site since they do not distinguish between the following possibilities: (1) SMO is inactive because the sterol cannot bind, or (2) SMO is inactive because it is locked in an inactive conformation, or (3) SMO is inactive because it cannot localize to primary cilia (where it must be localized to activate downstream signaling).

      As described in Response 3.3, a better evaluation of the importance of the TMD site is the use of mutations in residues that make hydrogen bonds with the hydroxyl group of TMD cholesterol. These mutations do not markedly increase protein stability or prevent ciliary localization (Kinnebrew et al., 2022, Fig. S2). While a TMD site mutation decreases the magnitude of maximal (and basal) SMO signaling, it does not impact the fold-increase in signal output in response to Hh ligands (the key parameter that should be used to evaluate PTCH1 activity).

      (4) Qi et al., Nature 2019, and Deshpande et al., Nature 2019, both reported cholesterol binding at the TMD site based on high-resolution structural data. Oddly, Deshpande et al., Nature 2019, is not cited in the discussion of TMD binding on p. 3, despite being one of the first papers to describe cholesterol in the TMD site and its necessity for activation (the authors only cite it regarding activation of SMO by synthetic small molecules).

      The reference has now been added at this location in the manuscript.

      (5) Kinnebrew et al., Sci Adv 2022 report that CRD deletion abolished PTCH regulation, which is seemingly at odds with several studies above (e.g., Byrne et al, Nature 2016; Myers et al, Dev Cell 2013); but this difference may reflect the use of an N-terminal GFP fusion to SMO in the Kinnebrew et al 2022, which could alter SMO activation properties by sterically hindering activation at the TMD site by cholesterol (but not synthetic SMO agonists like SAG); in contrast, the earlier work by Byrne et al is not subject to this caveat because it used an untagged, unmodified form of SMO.

      The reviewer fails to note that CRD-deleted versions of SMO have markedly (>10-fold) higher basal activity than full-length SMO. The response to SHH is minimal (∼2-fold), compared to >50-fold with full-length SMO. Thus, CRD-deleted SMO is likely in a non-native conformation. Local changes in cholesterol accessibility caused by PTCH1 inactivation or cholesterol loading can cause small fluctuations in delta-CRD activity, but this cannot be used to infer meaningful insights about how native, full-length SMO (with >10-fold lower basal activity) is regulated. Please see Response 3.3 for further details.

      Reviewer 3 presents an incomplete picture of the extensive experiments reported in Kinnebrew et al. to establish the functionality of YFP-tagged delta-CRD SMO. Most importantly, a TMD-selective sterol analog (KK174) can fully activate YFP-tagged delta-CRD, showing conclusively that the YFP fusion does not block sterol access to the TMD site. The fact that this protein is nearly unresponsive to SHH highlights the critical role of the CRD-bound cholesterol in SMO regulation by PTCH1. Indeed, the YFP-tagged, CRD-deleted SMO was made purposefully to test the requirement of the CRD in a construct that had normal basal activity. Again, this data justifies the value of investigating the path of cholesterol movement from the membrane via the TMD site to the CRD.

      (6) Although overexpression of PTCH1 and SMO (wild-type or mutant) has been noted as a caveat in studies of CRD-independent SMO activation by cholesterol, this reviewer points out that several of the studies listed above include experiments with endogenous PTCH1 and low-level SMO expression, demonstrating that SMO can clearly undergo activation by cholesterol (as well as regulation by PTCH1) in a manner that does not require the CRD.

      This comment is inaccurate. The data presented in Deshpande et al. (and prior work in Myers et al.) used transient transfection to overexpress SMO in Smo<sup>−/−</sup> cells. At the individual cell level, transient transfection produces expression levels that are markedly higher (10-1000-fold) than stable expression (in addition to being more variable). Most scientists would agree that stable expression (as used in Kinnebrew et al., 2022) at a moderate expression level is a better system to compare mutant phenotypes, assess basal and activated signaling, and provide an accurate measure of the fold-change in signal output in response to SHH. Notably, introduction of a mutation in the CRD cholesterol binding site at the endogenous mouse Smo locus (an even better experiment than stable expression) leads to complete loss of SMO activity (PMID 28344083). This result again justifies our investigation of the pathway of cholesterol movement from the membrane to the CRD site.

      We have changed the initial discussion to reflect a more general outlook.

      Changes made: (Paragraph 1, Introduction)

      “PTCH modulates the availability of accessible cholesterol at the primary cilium and thereby regulates SMO, with models invoking effects on both the CRD and 7TM pockets.”

      Changes made: (Results subsection 3, paragraph 1)

      “According to the dual-site model, to reach the binding site in the CRD (ζ), cholesterol must translocate along the TMD-CRD interface from the TM binding site (α∗).”

      Added lines: (Paragraph 5, Results subsection 3):

      “The computational investigation shown here covers the dual-site model, where cholesterol reaches the CRD site via binding to the TM binding site first. In comparison to the CRD site, the TM site is more stable by ∼2 kcal/mol (Figure 2—Figure Supplement 3b, d).”

      Added lines: (Paragraph 2, Conclusions):

      “Here we have explored the role the CRD site plays in SMO activation. In addition, through simulating the CRD site-dependent SMO activation hypothesis, we have also simulated the TMD site-dependent activation. We show that the overall stability of cholesterol at the TMD site is higher than at the CRD site by ∼2 kcal/mol.”

      (2) Bias in Presentation of Translocation Pathways

      The manuscript presents the model of cholesterol translocation through SMO to the CRD as the predominant (if not sole) mechanism of activation. Statements such as: "Cholesterol traverses SMO to ultimately reach the CRD binding site" (p. 6) suggest an exclusivity that is not supported by prior literature in the field. Indeed, the authors’ own MD data presented here demonstrate more stable cholesterol binding at the TMD than at the CRD (p 17), and binding of cholesterol to the TMD site is essential for SMO activation. As such, it is appropriate to acknowledge that cholesterol may activate SMO by translocating through the TM5/6 tunnel, then binding to the TMD site, as this is a likely route of SMO activation in addition to the CRD translocation route they highlight in their discussion.

      The authors describe two possible translocation pathways (Pathway 1: TM2/3 entry to TMD; Pathway 2: TM5/6 entry and direct CRD transfer), but do not sufficiently acknowledge that their own empirical data support Pathway 2 as more relevant. Indeed, because their experimental data suggest Pathway 2 is more strongly linked to SMO activation, this pathway should be weighted more heavily in the authors’ discussion. In addition, Pathway 2 is linked to cholesterol binding to both the TMD and CRD sites (the former because the TMD binding site is at the terminus of the hydrophobic tunnel, the latter via the translocation pathway described in the present manuscript), so it is appropriate that Pathway 2 figures more prominently than Pathway 1 in the authors’ discussion.

      The authors also claim that "there is no experimental structure with cholesterol in the inner leaflet region of SMO TMD" (p 16). However, a structural study of apo-SMO from the Manglik and Cheng labs (Zhang et al., Nat Comm, 2022) identified a cholesterol molecule docked at the TM5/6 interface and also proposed a "squeezing" mechanism by which cholesterol could enter the TM5/6 pocket from the membrane. The authors do not consider this SMO conformation in their models, nor do they discuss the possibility that conformational dynamics at the TM5/6 interface could facilitate cholesterol flipping and translocation into the hydrophobic conduit, despite both possibilities having precedent in the 2022 empirical cryoEM structural analysis.

      Recommendation: The authors should avoid oversimplifying the SMO cholesterol activation process, either by tempering these claims or broadening their discussion to better reflect the complexity and multiplicity of cholesterol access and activation routes for SMO. They should also consider the 2022 apo-SMO cryoEM structure in their analysis of the TM5/6 translocation pathway.

      We thank the reviewer for this comprehensive overview of the existing literature and for pointing out parts we had not included in the discussion. We agree with the reviewer, since our data show that both pathways are probable. Throughout our manuscript, we have avoided using a competitive approach (that one pathway dominates over the other). Instead, we have evaluated both pathways independently and presented a comparative rather than competitive overview of both pathways from our observations. While we agree that experimental evidence suggests the inner leaflet pathway is possible, we cannot discount the observations made in previous studies that support the outer leaflet pathway, particularly Hedger et al. (2019), Bansal et al. (2023), and Kinnebrew et al. (2021). Therefore, considering the reviewer’s comments, we have made the following changes:

      (1) Added lines: (Paragraph 3, Conclusions):

      “We show that the barriers associated with the pathway starting from the outer leaflet are lower by ∼0.7 kcal/mol (p = 0.0013). We also provide evidence that cholesterol can enter SMO via both leaflets, considering that multiple computational and experimental studies have found cholesterol entry sites and activation modulation via the outer leaflet, between TM2-TM3. This is countered by evidence from multiple experimental and computational studies corroborating entry via the inner leaflet, between TM5-TM6, including this study. Overall, we posit that cholesterol translocation from either pathway is feasible.”

      (2) Changes made: (Paragraph 6, Results subsection 2)

      “Based on our experimental and computational data, we conclude that cholesterol translocation can happen via either pathway. This is supported on the basis of the following observations: mutations along pathway 2 affect SMO activity more significantly, and the presence of a direct conduit that connects the inner leaflet to the TMD binding site. In addition, a resolved structure of SMO in the presence of cholesterol shows a cholesterol situated at the entry point from the membrane into the protein between TM5 and TM6, in the inner leaflet. However, we also observe that pathway 1 shows a lower thermodynamic barrier (5.8 ± 0.7 kcal/mol vs. 6.5 ± 0.8 kcal/mol, p = 0.0013). Additionally, PTCH1 controls cholesterol accessibility in the outer leaflet. This shows that there is a possibility for transport from both leaflets. One possibility that might alter the thermodynamic barriers is native membrane asymmetry, particularly the anionic lipid-rich inner leaflet. This presents as a limitation of our current model.”

      (3) Changes made: (Paragraph 1, Results subsection 2)

      “In a structure resolved in 2022, cholesterol was observed at the interface between the protein and the membrane, in the inner leaflet, between TMs 5 and 6. However, cholesterol in the inner leaflet has a downward orientation, with the polar hydroxyl group pointing intracellularly (η). A striking observation is that this cholesterol binding site pose was never used as a starting point for simulations and was discovered independent of the pose described in Zhang et al. (2022) (Figure 4—Figure Supplement 1).”

      (3) Alternative Possibility: Direct Membrane Access to CRD

      The possibility that the CRD extracts cholesterol directly from the membrane outer leaflet is not considered. While the crystal structures place the CRD in a stable pose above the membrane, multiple cryo-EM studies suggest that the CRD is dynamic and adopts a variety of conformations, raising the possibility that the stability of the CRD in the crystal structures is a result of crystal packing and that the CRD may be far more dynamic under more physiological conditions.

      Recommendation: The authors should explicitly acknowledge and evaluate this potential mechanism and, if feasible, assess its plausibility through MD simulations.

      We thank the reviewer for the suggestion. We have addressed this comment by calculating the distance from the lipid headgroups for each lipid in the membrane to the cholesterol binding site. We show that in our study, we do not observe any bending of the CRD over the membrane, precluding any cholesterol from being extracted from the membrane directly.

      Added lines: (Paragraph 3, Conclusions):

      “An alternative possibility states that the flexibility associated with the CRD would allow it to directly access the membrane, and consequently, cholesterol. In the extensive simulations reported in this study, the binding site of cholesterol in the CRD remains at least 20 Å away from the nearest lipid head group in the membrane, suggesting that such direct extraction and the bending of the CRD do not occur within the timescales sampled (Appendix 2 – Figure 6).

      The mechanistic details of this process are still unexplored and form the basis of future work.”
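
      The distance check described in this response can be illustrated with a minimal sketch. It is not the authors' analysis pipeline: the function names, the toy coordinates, and the frame layout are assumptions made for illustration; only the 20 Å figure comes from the quoted text.

```python
import math

def min_site_to_headgroup_distance(site_xyz, headgroup_xyz):
    """Smallest Euclidean distance (in Å) from the site to any lipid headgroup."""
    return min(math.dist(site_xyz, h) for h in headgroup_xyz)

def closest_approach(frames):
    """Minimum site-to-headgroup distance over all frames of a trajectory."""
    return min(min_site_to_headgroup_distance(site, heads) for site, heads in frames)

# Toy frames (hypothetical): each frame is (binding-site centre, headgroup positions), in Å.
frames = [
    ((0.0, 0.0, 30.0), [(0.0, 0.0, 0.0), (5.0, 5.0, 2.0)]),
    ((1.0, 0.0, 28.0), [(0.0, 0.0, 1.0), (6.0, 4.0, 3.0)]),
]

CUTOFF = 20.0  # Å, the separation quoted in the added manuscript text
stays_away = closest_approach(frames) >= CUTOFF
print(stays_away)
```

      If the closest approach over all sampled frames stays above the cutoff, direct extraction was not observed within those frames; that does not rule it out on longer timescales.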

      (4) Inconsistent Framing of Study Scope and Limitations

      The discussion contains some contradictory and misleading language. For example, the authors state that "In this study we only focused on the cholesterol movement from the membrane to the CRD binding site," and then several sentences later state that "We outline the entire translocation mechanism from a kinetic and thermodynamic perspective." These statements are at odds. The former appropriately (albeit briefly) notes the limited scope of the modeling, while the latter overstates the generality of the findings.

      In addition, the authors’ narrow focus on the CRD site constitutes a major caveat to the entire work. It should be acknowledged much earlier in the manuscript, preferably in the introduction, rather than mentioned as an aside in the penultimate paragraph of the conclusion.

      Recommendation: The authors should clarify the scope of the study and expand the discussion of its limitations. They should explicitly acknowledge that the study models one of several cholesterol access routes and that the findings do not rule out alternative pathways.

      We thank the reviewer for the suggestion. We have addressed this comment by explicitly mentioning the scope of the study.

      Changes made: (Paragraph 3, Conclusions)

      “We outline the entire translocation mechanism from a kinetic and thermodynamic perspective for one of the leading hypotheses for the activation mechanism of SMO.”

      (5) Summary:

      This study has the potential to make a useful contribution to our understanding of cholesterol translocation and SMO activation. However, in its current form, the manuscript presents an overly narrow and, at times, misleading view of the literature and biological models; as such, it is not nearly as impactful as it could be. I strongly encourage the authors to revise the manuscript to include:

      (1) A more balanced discussion of the CRD vs. TMD binding sites.

      (2) Acknowledgment of alternative cholesterol access pathways.

      (3) More comprehensive citation of prior structural and functional studies.

      (4) Clarification of assumptions and scope.

      Of note, the above suggestions require little to no additional MD simulations or experimental studies, but would significantly enhance the rigor and impact of the work.

      We thank the reviewer for the suggestions. We have taken into account the literature and diverse viewpoints. We have revised the initial discussion to reflect a more general outlook. In the revised version of the manuscript, we have refrained from referring to the CRD site as the orthosteric site. Instead, we refer to it as the CRD sterol-binding site. To better represent the dual-site model, we add further discussion in the Introduction. Throughout our manuscript, we have avoided using a competitive approach (that one pathway dominates over the other). Instead, we have evaluated both pathways independently and presented a comparative rather than competitive overview of both pathways from our observations. We explicitly mention the scope of the study.

    1. History of vaccination, Edward Jenner

      🥸 Here is why Edward Jenner was an ignoramus and a charlatan. ☣.. 💥💫 Sir Charles Creighton, MD, in the “Vaccination” entry published in the Encyclopaedia Britannica of 1888, demolishes “Jennerism” in three steps that, within his line of argument, bring down its claim to scientific status: 👉 1. Unstable object: cow-pox is described as a conditioned phenomenon, not as a simple, unambiguous and constant natural entity. 👉 2. Non-unique medium: the “vaccine lymph” appears as a supply chain of stocks and practices (passages, selections, substitutions), not as a single substance with invariable effects. 👉 3. Fallacious proof: the demonstrative framework is accused of post hoc ergo propter hoc, while the use of the statistics of the time is put severely to the test. ⚡️🔥 A case of invalidation of a pseudoscientific theory through historical testimony. 😎 Full article (with references to the original pages): 👇https://autoimmunityreactions.org/wp/2026/01/07/creighton-britannica-1888-la-storia-naturale-del-cow-pox-come-leva-per-smontare-la-vaccinazione-jenneriana/

    1. is_visited, visit

      Is it better to just "import" visit, and then use "visit.is_visited" and "visit.visit" instead of the functions directly? That is how I have done it in Clojure, and I would probably prefer to do it in Python too.
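
      The two import styles being weighed can be sketched as follows. The real `visit` module is not shown in the source, so the stand-in module built here (and the behaviour of its two functions) is purely an assumption for illustration.

```python
import sys
import types

# Hypothetical stand-in for the "visit" module; its real contents are not in the source.
_seen = set()

def _is_visited(node):
    """Return True if the node has been marked as visited."""
    return node in _seen

def _mark_visited(node):
    """Mark the node as visited."""
    _seen.add(node)

_module = types.ModuleType("visit")
_module.is_visited = _is_visited
_module.visit = _mark_visited
sys.modules["visit"] = _module  # makes "import visit" resolve to the stand-in

# Style A: import the module and qualify every call.
import visit

visit.visit("a")
print(visit.is_visited("a"))

# Style B: import the functions directly. The function is aliased here because
# a bare "from visit import visit" would rebind the name of the module.
from visit import is_visited, visit as mark_visited

mark_visited("b")
print(is_visited("b"))
```

      Style A keeps the origin of each function visible at the call site, at the cost of the slightly repetitive `visit.visit`.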

    2. only refers to the Python code

      I think this can be made clearer, for example by showing only the Clojure code and referring to the Python code.

      An alternative is to always show both Python and Clojure, so that we can skip this text.

    1. Reviewer #1 (Public review):

      Willeke et al. hypothesize that macaque V4, like other visual areas, may exhibit a topographic functional organization. One challenge to studying the functional (tuning) organization of V4 is that neurons in V4 are selective for complex visual stimuli that are hard to parameterize. Thus, the authors leverage an approach comprising digital twins and most exciting images (MEIs) that they have pioneered. This data-driven, deep-learning framework can effectively handle the difficulty of parametrizing relevant stimuli. They verify that the model-synthesized MEIs indeed drive V4 neurons more effectively than matched natural image controls. They then performed psychophysics experiments (on humans) along with the application of contrastive learning to illustrate that anatomically neighboring neurons often care about similar stimuli. Importantly, the weaknesses of the approach are clearly appreciated and discussed.

      Comments:

      (1) The correlation between predictions and data is 0.43. I'd agree with the authors that this is "reliable" and would recommend that they discuss how the fact that performance is not saturated influences the results.

      (2) Modeling V4 using a CNN and claiming that the identified functional groups look like those found in artificial vision systems may be a bit circular.

      (3) No architecture other than ResNet-50 was tested. This might be a major drawback, since the MEIs could very well be reflections of the architecture and also the statistics of the dataset, rather than intrinsic biological properties. Do the authors find the same result with different architectures as the basis of the goal-driven model?

      (4) The closed-loop analysis seems to be using a much smaller sample of the recorded neurons - "resulting in n=55 neurons for the analysis of the closed-loop paradigm".

      (5) A discussion on adversarial machine learning and the adversarial training that was used is lacking.

    2. Reviewer #2 (Public review):

      This is an ambitious and technically powerful study, investigating a long-standing question about the functional organization of area V4. The project combined large-scale single-unit electrophysiology in macaque V4 with deep learning-based activation maximization to characterize neuronal tuning in natural image space. The authors built predictive encoding models for V4 neurons and used these models to synthesize most exciting images (MEIs), which are subsequently validated in vivo using a closed-loop experimental paradigm.

      Overall, the manuscript advances three main claims:

      (1) Individual V4 neurons showed complex and highly structured selectivity for naturalistic visual features, including textures, curvatures, repeating patterns, and apparently eye-like motifs.

      (2) Neurons recorded along the same linear probe penetration tended to have more similar MEIs than neurons recorded at different cortical locations (this similarity was supported by human psychophysics and by distances in a learned, contrastive image embedding space).

      (3) MEIs clustered into a limited number of functional groups that resembled feature visualizations observed in deep convolutional neural networks.

      Strengths:

      (1) The study is important in that it is the first to apply activation maximization to neurons sampled at such fine spatial resolution. The authors used 32-channel linear silicon probes, spanning approximately 2 mm of cortical depth, with inter-contact spacing of roughly 60 µm. This enabled fine sampling across most of the cortical thickness of V4, substantially finer resolution than prior Utah-array or surface-biased approaches.

      (2) A key strength is the direct in vivo validation of model-derived synthetic images by stimulating the same neurons used to build the models, a critical step often absent in other neural network-based encoding studies.

      (3) More broadly, the study highlights the value of probing neuronal selectivity with rich, naturalistic stimulus spaces rather than relying exclusively on oversimplified stimuli such as Gabors.

      Weaknesses:

      (1) A central claim is that neurons sampled within the same penetration shared MEI tuning properties, compared to neurons sampled in different penetrations, because of functional organization. I am concerned that correlations in activity may instead arise from technical or methodological factors (for example, shared reference or grounding) rather than from functional organization alone. These recordings were obtained with linear silicon probes, and there have been observations that neuronal activity along this type of probe (including Neuropixels probes) may be more correlated than prior work using manually advanced single electrodes showed. For example, Fujita et al. (1992) showed finer micro-domains and systematic changes in selectivity along a cortical penetration, and it is not clear if that is true or detectable here. I think that the manuscript would be strengthened by a more thorough and explicit characterization of lower-level response correlations (at the neuronal electrophysiology level) prior to starting with fitting models. In particular, the authors could examine noise correlations along the electrode shaft (using the repeated test images, for example), as well as signal correlations in tuning, both within and across sessions. It would also be helpful to clarify whether these correlations depended on penetration day, recording chamber hole (how many were used?), or spatial separation between penetrations, and whether repeated use of the same hole yielded stable or changing correlations. Illustrations of the peristimulus time histogram changes across the shaft and across penetrations would also help. All of this would help us understand whether the reported clustering was an inevitable consequence of the recording technique.

      (2) It is difficult to understand a story of visual cortex neurons without more information about their receptive field locations and widths, particularly given that the stimulus was full-screen. I understand that there was a sparse random dot stimulus used to find the population RF, so it should be possible to visualize the individual and population RFs. Also, the investigators inferred the locations of the important patches using a masking algorithm, but where were those masks relative to the retinal image, and how distributed were they as a function of the shaft location? This would help us understand how similar each contact was.

      (3) A major claim is that V4 MEIs formed groups that were comparable to those produced by artificial vision systems, "suggesting potential shared encoding strategies." The issue is that the "shared encoding strategy" might be the authors' use of this same class of models in the first place. It would be useful to know if different functional groups arise as a function of other encoding neural network models, beyond the robust-trained ResNet-50. I am unsure to what extent the reported clustering, depth-wise similarity, and correspondence to artificial features depended on architectural and training bias. It would substantially strengthen the manuscript to test whether a similar organizational structure would emerge using alternative encoding models, such as attention-based vision transformers, self-supervised visual representations, or other non-convolutional architectures. Another important point of contrast would be to examine the functional groups encoded by the ResNet architecture before its activations were fit to V4 neuronal activity: put simply, is ResNet just re-stating what it already knows?

      (4) Several comparisons to prior work are presented largely at a qualitative level, without quantitative support. For example, the authors state that their MEIs are consistent with known tuning properties of macaque V4, such as selectivity for shape, curvature, and texture. However, this claim is not supported by explicit image analyses or metrics that would substantiate these correspondences beyond appeal to visual inspection. Incorporating quantitative analyses, for instance, measures of curvature, texture statistics, or comparisons to established stimulus sets, would strengthen these links to prior literature and clarify the relationship between the synthesized MEIs and previously characterized V4 tuning properties.
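
      The noise and signal correlations requested in weakness (1) can be sketched for a single pair of neurons as below. This is an illustrative sketch under stated assumptions, not the authors' or the reviewer's procedure: the data layout (`resp[s][t]`, stimulus `s`, repeat `t`), the toy numbers, and the choice of mean-subtracted (rather than z-scored) residuals are all assumptions.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def signal_and_noise_correlation(resp_a, resp_b):
    """resp_x[s][t] is the response of one neuron to stimulus s on repeat t."""
    mean_a = [sum(trials) / len(trials) for trials in resp_a]
    mean_b = [sum(trials) / len(trials) for trials in resp_b]
    # Signal correlation: similarity of trial-averaged tuning across stimuli.
    signal = pearson(mean_a, mean_b)
    # Noise correlation: shared trial-to-trial variability after removing
    # each stimulus mean, pooled over stimuli.
    resid_a = [r - m for trials, m in zip(resp_a, mean_a) for r in trials]
    resid_b = [r - m for trials, m in zip(resp_b, mean_b) for r in trials]
    noise = pearson(resid_a, resid_b)
    return signal, noise

# Toy spike counts (hypothetical): 3 repeated test images, 4 repeats each.
neuron_a = [[4, 6, 5, 5], [10, 12, 11, 11], [1, 3, 2, 2]]
neuron_b = [[3, 5, 4, 4], [9, 7, 8, 8], [2, 0, 1, 1]]
signal_corr, noise_corr = signal_and_noise_correlation(neuron_a, neuron_b)
print(signal_corr, noise_corr)
```

      Computing both quantities for every pair of contacts and plotting them against contact separation would show whether tuning similarity tracks shared variability along the shaft.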

    3. Author response:

      We thank the reviewers for their careful reading and constructive feedback. We were glad to see that they recognized both the technical scope of the study and its contribution as the first to apply activation maximization with such fine spatial sampling. Their appreciation for the critical in vivo validation of model-derived stimuli is very encouraging.

      The reviewers raised several important points that we plan to address in the revised manuscript. These center on:

      Model Architecture and Potential Circularity:

      Both reviewers raised the concern that using a CNN-based model could introduce circularity when comparing V4 functional groups to artificial vision systems, and questioned whether similar results would emerge with alternative architectures. We believe that the in vivo verification provides a critical control for this concern: the MEIs synthesized by our model were empirically validated to elicit significantly higher responses than matched natural image controls, demonstrating that the model captures genuine biological tuning properties rather than architectural artifacts. This means that even if these features emerged from the particular architectural choice, the biological neurons seem to prefer the same features. We will clarify this point in the respective section in the revised manuscript.

      Recording locations and spike sorting contamination:

      Reviewer #2 raised concerns about potential correlation artefacts along the silicon probe. Unfortunately, assessing functional correlations across sessions proved challenging because neurons recorded at different penetration sites had non-overlapping receptive fields, precluding direct comparison of responses to identical stimuli across recording sites. We will make this limitation explicit in the manuscript. Furthermore, we maintain conservative standards for spike sorting to minimize the risk of multi-unit activity (MUA) "smearing" across unit definitions. Our primary analyses are restricted to well-isolated single units that meet all isolation metrics. Due to our low-impedance ground placed on the bone, shared-reference contamination as a source of tuning similarity is also mitigated.

      Quantitative Comparisons to Prior Literature:

      Reviewer #2 also noted that our comparisons between MEIs and known V4 tuning properties (e.g., shape, curvature, texture selectivity) were presented qualitatively, and suggested that explicit image analyses or metrics would strengthen these links to prior literature. We will revise the text to more carefully frame these comparisons as qualitative observations consistent with prior findings.

      Alternative Similarity Metrics:

      We will expand our justification for the Böhm et al. contrastive embedding approach in the Methods section. However, we believe that a systematic comparison of multiple clustering and similarity methods is beyond the scope of the current study.

      In the revised manuscript, we will address these points primarily through clarifications and expanded discussion. Specifically, we will: (1) strengthen our discussion of model architecture choice emphasizing that in vivo verification serves as a critical control against architectural artifacts; (2) clarify the stringent matching criteria underlying our closed-loop sample size and its consistency with the larger population analyses; (3) explicitly describe the recording geometry, including the use of multiple grid holes, and explain why direct functional comparisons across penetrations were precluded by non-overlapping receptive fields; (4) better characterize the spatial relationship between receptive fields and MEI masks; (5) reframe comparisons to prior V4 literature as qualitative observations rather than quantitative validations; and (6) expand our justification for the contrastive embedding approach. We believe these revisions will improve the clarity and rigor of the manuscript while appropriately scoping the claims to what the current data support.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      Reviewer #1

      Evidence, reproducibility and clarity

      Authors should be commended for the availability of data/code and detailed methods. Clarity is good. Authors have clearly spent a lot of time thinking about the challenges of metabolomics data analysis.

      Significance

      Schmidt et al. present MetaProViz, a comprehensive and modular platform for metabolomics data analysis. The tool provides a full suite of processing capabilities spanning metabolite annotation, quality control, normalization, differential analysis, integration of prior knowledge, functional enrichment, and visualization. The authors also include example datasets, primarily from renal cancer studies, to demonstrate the functionality of the pipeline. The MetaProViz framework addresses several long-standing challenges in metabolomics data analysis, particularly issues of reproducibility, ambiguous metabolite annotation, and the integration of metabolite features with pathway knowledge. The platform is likely to be a valuable addition for the community, but the reviewer has some comments that need to be addressed prior to publication.

      We thank the reviewer for this positive feedback.

      Comments:

      (1) (Planned)

      The section "Improving the connection between prior knowledge and metabolomics features" could benefit from additional clarification. It is not entirely clear to the reader what specific steps were taken beyond using RaMP-DB to translate metabolite identifiers. For example, how exactly were ambiguous mappings ("different scenarios") handled in practice, and to what extent does this process "fix" or merely flag inconsistencies? A more explicit description or example of how MetaProViz resolves these cases would help readers better understand the improvements claimed.

      We thank the reviewer for pointing this out and we agree that this section requires extension to ensure clarity. Beyond using RaMP-DB, we characterise the mapping ambiguity (one-to-none, one-to-many, many-to-one, many-to-many) within and across metabolite-sets (i.e. pathways) and return this information to the user together with the translated identifiers. This is important for understanding the potential inflation/deflation of metabolite-sets that occurs due to the translation. Moreover, we also offer a manually curated amino-acid collection to ensure that L-, D- and zwitterion (without chirality) IDs are assigned for amino acids (Fig. 2b). Ambiguous mappings are handled based on the measured data (Fig. 2e). Indeed, many translation cases that deflate (many-to-one mapping) or inflate (one-to-many mapping) the metabolite-sets are resolved when merging the prior knowledge with the actual measured data (e.g. Fig. 2e, one-to-many in scenario 1, which becomes obsolete as only one/none of the many potential metabolite IDs is detected). By sorting each mapping into one of those scenarios, we only flag those cases. The reason for this decision is that in many cases multiple choices are valid (e.g. Fig. 2e, Scenario 5: here the values of the two detected metabolites could be summed, or the metabolite value with the larger Log2FC could be kept), and it should really be up to the user to make those choices based on their knowledge of the biological system and the analytical LC-MS method used.

      Since these points have not been clear enough, we will add a more explicit description to the results section by showcasing more details on how we exactly tackled this problem in the ccRCC example data. This has also been suggested by Reviewer 3 (Minor Comment 7 and 8), so feel free to also see the responses below.
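
      The four ambiguity classes named in this response can be illustrated with a small sketch. It is not MetaProViz code (MetaProViz is an R package and its implementation is not shown here): the function name, the input layout, and the placeholder identifiers are assumptions; only the class names and the alanine example come from the text.

```python
from collections import defaultdict

def classify_mappings(pairs):
    """Label each source ID by the ambiguity of its translation.

    pairs: iterable of (source_id, target_id), with target_id None when
    no translation exists.
    """
    targets = defaultdict(set)   # source ID -> translated IDs
    sources = defaultdict(set)   # translated ID -> source IDs
    for src, tgt in pairs:
        targets[src]             # register the source even if it is unmapped
        if tgt is not None:
            targets[src].add(tgt)
            sources[tgt].add(src)

    labels = {}
    for src, tgts in targets.items():
        if not tgts:
            labels[src] = "one-to-none"
            continue
        fan_out = len(tgts) > 1                          # inflates a metabolite-set
        fan_in = any(len(sources[t]) > 1 for t in tgts)  # deflates a metabolite-set
        if fan_out and fan_in:
            labels[src] = "many-to-many"
        elif fan_out:
            labels[src] = "one-to-many"
        elif fan_in:
            labels[src] = "many-to-one"
        else:
            labels[src] = "one-to-one"
    return labels

# Placeholder identifiers; names stand in for database IDs.
pairs = [
    ("L-Alanine", "Alanine zwitterion"),
    ("D-Alanine", "Alanine zwitterion"),
    ("metabolite_X", None),
    ("metabolite_Y", "Y_1"),
    ("metabolite_Y", "Y_2"),
    ("metabolite_Z", "Z_1"),
]
labels = classify_mappings(pairs)
print(labels)
```

      Returning labels rather than silently collapsing entries matches the stated design choice of flagging ambiguous cases and leaving the resolution to the user.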

      (2) (Planned)

      The introduction of MetSigDB is intriguing, but its construction and added value are not sufficiently described. It would be helpful to clarify what specific advantages MetSigDB provides over directly using existing pathway resources such as KEGG, Reactome, or WikiPathways. For example, how many features, interactions, or metabolite-set relationships are included, and in what way are these pathways improved or extended compared to those already available in public databases?

      We thank the reviewer for this valuable comment and we apologise that this was not described sufficiently. One of the major advantages is that all the resources are available in one place following the same table format without the need to visit the different original resources and perform data wrangling prior to enrichment analysis. In addition, where applicable, we have removed metabolites that are not detectable by LC-MS (i.e. ions, H2O, CO2) to circumvent pathway inflation with features that are never within the data and hence impacting the statistical testing in enrichment analysis workflows.

      During the revision, we will compile an Extended Data Table listing all the resources present in MetSigDB, their number of features and interactions. We will also extend the methods section "Prior Knowledge access" about MetSigDB and how we removed metabolites.

      (3)

      Figure 1D/1E: The reviewer appreciates the inclusion of the visualizations illustrating the different mapping scenarios, as these effectively convey the complexity of metabolite ID translation. However, it took some time to interpret what each scenario represented. It would be helpful to include brief annotations or explanatory text directly on the figures to clarify what each scenario depicts and how it relates to the underlying issue being addressed.

      We think the reviewer refers to Fig. 2D/E and we acknowledge that this is a complex problem we try to convey. We received a similar comment from Reviewer 2 (Minor Comment 1), who asked to extend the figure legend description of what the different scenarios display.

      We have extended the figure legend and specifically explained each displayed case and its meaning (Line 222-242):

      "d-e) Schematics of possible mapping cases between metabolite IDs (= each circle corresponds to one ID) of a pathway-metabolite set (e.g. KEGG) to metabolite IDs of a different database (e.g. HMDB) with (d) showing many-to-many mappings that can occur within and across pathway-metabolite sets and (e) additionally showing the mapping to metabolite IDs that were assigned to the detected peaks within and across pathway-metabolite sets. (d) Translating the metabolite IDs of a pathway-metabolite set can lead to special cases such as many-to-one mappings (Pathway 1), where for example the original resource used the ID for L-Alanine (Pathway 1, green) and D-Alanine (Pathway 1, yellow) in the amino-acid pathway, whilst the translated resource only has an entry for Alanine zwitterion (Pathway 1, blue). Additionally, many-to-one mappings can also occur across pathways (Pathway 2-4), where this mapping is only detected when mappings are analysed taking all pathways into account. Both of these cases deflate the pathways, which can also happen for one-to-none mappings (Pathway 1, white). There are also cases that inflate the pathway such as one-to-many mappings (e.g. Pathway 2-4, orange mapping to pink and violet). (e) Showcasing the different scenarios when merging measured data (detected) based on the translated metabolites within pathways (scenario 1-5) and across pathways (scenario 6-8) highlighting problematic scenarios (4-7) that require further actions. Unproblematic scenarios (1-3 and 8) can include special cases between original and translated (e.g. one-to-many in scenario 1), which become obsolete as only one/none of the many potential metabolite IDs is detected. Yet, if multiple metabolites are detected, action is required (scenario 5), which can include building the sum of the multiple detected features or only keeping the one with the highest Log2FC between two conditions. Other special cases between original and translated (e.g. many-to-one in scenario 4 and 6) also depend on what has been mapped to the measured features. If features have been measured in those scenarios, pathway deflation (i.e. only one original entry remains) or measured feature duplication (the same measurement is mapped to many features in the prior knowledge) are the possible results within and across pathways. Those scenarios should be addressed on a case-by-case basis as they also require biological information to be taken into account."

      We have also rearranged the Scenarios in Fig. 2e. We hope that together with the extended figure legend this is now clear.
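
      The two options mentioned in the legend for scenario 5 (summing the detected features, or keeping the one with the highest Log2FC) can be sketched as follows. This is a hypothetical illustration, not part of MetaProViz: the function names, the strategy labels, the feature tuples, and the choice to sum intensities before taking the fold change are assumptions.

```python
from math import log2

def log2fc(control, treated):
    """Log2 fold change of treated over control mean intensity."""
    return log2(treated / control)

def resolve_multiple_detected(features, strategy):
    """Collapse several detected features that map to one prior-knowledge entry.

    features: list of (name, mean_control, mean_treated) tuples.
    strategy: "sum" adds the intensities before computing the fold change;
              "max_log2fc" keeps the single feature with the highest Log2FC.
    """
    if strategy == "sum":
        control = sum(f[1] for f in features)
        treated = sum(f[2] for f in features)
        return "+".join(f[0] for f in features), log2fc(control, treated)
    if strategy == "max_log2fc":
        name, control, treated = max(features, key=lambda f: log2fc(f[1], f[2]))
        return name, log2fc(control, treated)
    raise ValueError(f"unknown strategy: {strategy}")

# Hypothetical mean intensities for two detected features of one entry.
features = [("feature_1", 100.0, 400.0), ("feature_2", 300.0, 400.0)]
summed = resolve_multiple_detected(features, "sum")
kept = resolve_multiple_detected(features, "max_log2fc")
print(summed, kept)
```

      The two strategies can give quite different fold changes for the same entry, which is one reason the choice is left to the user.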

      (4) (Planned)

      "By assigning other potential metabolite IDs and by translating between the present ID types, we not only increase the number of features within all ID types but also increase the feature space with HMDB and KEGG IDs (Fig. 2a, right, SFig. 2 and Supplementary Table 1)". The reviewer would appreciate additional clarification on how this was done. It is not clear what specific steps or criteria were used to assign additional metabolite IDs or to translate between identifier types. The reviewer also appreciates the inclusion of the UpSet plots. However, simply having the plots side-by-side makes it difficult to determine the specific differences. An alternative visualization, such as stacked bar plots, scatter plots summarizing the changes in feature counts, or other representation that more clearly highlights the deltas, might make these results easier to interpret.

      The main Fig. 2a shows the original (left) metabolite ID availability per detected metabolite feature in the ccRCC data and the adapted (right) metabolite IDs. The individual steps taken to extend the metabolite ID coverage of the measured features and obtain Fig. 2a (right) are shown in SFig. 2 for HMDB (SFig. 2a) and KEGG (SFig. 2b). We did not include the plots for the PubChem IDs as they follow the same principle. The individual steps we are showcasing with SFig. 2 are (I) how many of the detected features (577) have an HMDB ID (341, red bar + grey bar), (II) how this distribution changed after equivalent amino-acid IDs were added, which does not change the number of features with an HMDB ID, but the number of features with a single HMDB ID, and (III) how this distribution changed after translating from the other available ID types (KEGG and PubChem) to HMDB IDs using RaMP-DB's knowledge, which leads to 430 detected features with one or multiple HMDB IDs. The exact numbers can be extracted from Supplementary Table 1, Sheet "Feature metadata", where for example N-methylglutamate had no HMDB ID assigned in the original publication (see column HMDB_Original), yet by translating to HMDB from KEGG (see column hmdb_from_kegg) and from PubChem (see column hmdb_from_pubchem) we obtain in both cases the same HMDB ID "HMDB0062660". In order to clarify this in the manuscript, we have extended the figure legend of SFig. 2: "a-b) Bar graphs showing the frequency at which a certain number of metabolite IDs per integrated peak are available as per ccRCC patients feature metadata provided in the original publication (left), after potential equivalent IDs for amino-acid and amino-acid-related features were assigned (middle), which increases the number of features with multiple IDs (middle: grey bars), and after IDs were translated from the other available ID types (right). a) Of 577 detected features, 341 had at least one HMDB ID assigned (left graph, red + grey bar) according to the original publication (left). Translating from KEGG-to-HMDB and from PubChem-to-HMDB increased the number of features with an HMDB ID from 341 to 430 (right). b) Of 577 detected features, 306 had at least one KEGG ID assigned (left graph, red + grey bar) according to the original publication (left). Translating from HMDB-to-KEGG and from PubChem-to-KEGG did not increase the total number of features with a KEGG ID (right)."
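
      The counting behind steps (I) and (III) can be illustrated with a minimal Python sketch. It is not the MetaProViz implementation (which is in R): the feature names, the placeholder KEGG IDs, and the small lookup table standing in for a RaMP-DB query are all hypothetical; only the HMDB ID for N-methylglutamate is taken from the response above.

```python
# Hypothetical feature metadata: the IDs each detected feature had originally.
features = {
    "alanine": {"hmdb": {"HMDB_EXAMPLE_1"}, "kegg": {"KEGG_EXAMPLE_1"}},
    "N-methylglutamate": {"hmdb": set(), "kegg": {"KEGG_EXAMPLE_2"}},
    "unannotated_peak": {"hmdb": set(), "kegg": set()},
}

# Hypothetical KEGG-to-HMDB table standing in for a RaMP-DB lookup.
kegg_to_hmdb = {
    "KEGG_EXAMPLE_1": {"HMDB_EXAMPLE_1"},
    "KEGG_EXAMPLE_2": {"HMDB0062660"},
}


def translate_ids(features, mapping, source, target):
    """Return a copy of the features with target IDs extended via the mapping."""
    translated_features = {}
    for name, ids in features.items():
        translated = set(ids[target])
        for source_id in ids[source]:
            translated |= mapping.get(source_id, set())
        translated_features[name] = {**ids, target: translated}
    return translated_features


def count_with_id(features, id_type):
    """Number of features that carry at least one ID of the given type."""
    return sum(1 for ids in features.values() if ids[id_type])


before = count_with_id(features, "hmdb")
adapted = translate_ids(features, kegg_to_hmdb, "kegg", "hmdb")
after = count_with_id(adapted, "hmdb")
print(f"features with an HMDB ID: {before} before, {after} after translation")
```

      In this toy data the unannotated peak stays without an HMDB ID, mirroring the fact that translation can only help features that already carry an ID of another type.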

      We like the suggestion of the reviewer to provide representations of the deltas and will add additional plots to SFig. 2 as part of our planned revision.

      (5) (Planned)

      MetaboAnalyst is mentioned several times in the manuscript. The reviewer is familiar with some of the limitations and practical challenges associated with using MetaboAnalyst and its R package. Given that MetaboAnalyst already offers some overlapping functionality with MetaProViz (and offers it in the form of an interactive website and a sometimes functional R package), a more explicit comparison between the two tools would help readers fully understand the unique advantages and improvements provided by MetaProViz.

      This is a good point the reviewer raises. As part of the revisions, we plan to create a supplementary data table that includes both tools and their respective features. We will refer to this table within the manuscript text.

      (6)

      Page 11: The authors state that they used limma for statistical testing, including for the analysis of exometabolomics data, where the values appear to represent log2-transformed distances or ratios rather than normally distributed intensities. Since limma assumes approximately normal residuals, please provide evidence or justification that this assumption holds for these data types. If the distributions deviate substantially from normality, a non-parametric alternative might be more appropriate.

      For exometabolomics data we use data normalised to media blank and growth factor (formula (1)). Limma is performed on those data, not on the log2-transformed distances. The Log2(Distance) is calculated separately from the statistical results, using the normalised exometabolomics data. In addition, we always perform the Shapiro-Wilk test on each metabolite as part of the MetaProViz differential analysis function to understand the distribution. In this particular case we have the following distributions:

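      | Cell line | Metabolites normal distribution [%] | Metabolites not-normal distribution [%] |
      |-----------|-------------------------------------|-----------------------------------------|
      | HK2       | 82.35                               | 17.65                                   |
      | 786-O     | 95.71                               | 4.29                                    |
      | 786-M1A   | 97.14                               | 2.86                                    |
      | 786-M2A   | 88.57                               | 11.43                                   |
      | OSRC2     | 92.86                               | 7.14                                    |
      | OSLM1B    | 85.71                               | 14.29                                   |
      | RFX631    | 97.14                               | 2.86                                    |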

      If a user's data deviate substantially from normality, non-parametric alternatives are also available in MetaProViz (see methods section for all options).

      (7)

      Page 13: why were young and old defined this way? Authors should provide their reasoning and/or citations for this grouping.

      We thank the reviewer for pointing this out. Our choice of age groups is based purely on the literature:

      First, ccRCC can be sporadic (>96%) or familial (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3308682/pdf/nihms362390.pdf). This was also observed in other cohorts, where of 1233 patients only 93 (7.5%) were under 40 years of age, whilst 1140 (92.5%) were older than 40 years (https://www.europeanurology.com/article/S0302-2838(06)01316-9/fulltext). Second, given the high frequency of sporadic cases it is unsurprising that ccRCC incidences were found to peak in patients aged 60 to 79 years, with more male than female incidences (https://journals.lww.com/md-journal/Fulltext/2019/08020/Frequency,_incidence_and_survival_outcomes_of.49.aspx). Third, it was shown that the impact of sex on renal cancer-specific mortality is modified by age, which is a proxy for hormonal status, with the premenopausal period below 42 years and the postmenopausal period above 58 years (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4361860/pdf/srep09160.pdf). Putting all of this information together, we decided on our age groups of young (<42 years) and old (>58 years) following the hormonal period in order to account for the impact of sex. Additionally, our young age group is representative of the age of familial ccRCC, whilst our old age group summarises the age group where incidences were found to peak.

      To make this clear in the manuscript we have extended the method section of the manuscript (Line 547-548):

      "For the patient's ccRCC data, we compared tumour versus normal of two patient subsets, "young" (<42 years) and "old" (>58 years)."

      (8)

      Figure 4e: It may help with interpretation to have these Sankey-like graph edges be proportional to the number of metabolites.

      We thank the reviewer for this suggestion, which we also pondered. When we tested this visualisation, the plot became convoluted, hard to interpret and not all potential flows exist in the data. This is why we have opted to create an overview graph of each potential flow, with each edge representing a potentially existing flow. The number of times a flow exists is shown in Fig. 4f.

      (9)

      Figure 4h: The values appear to be on an intensity scale (e.g., on the order of 3e10), yet some of them are negative, which would not be expected for raw or log-transformed mass spectrometry intensities. It is unclear whether these represent normalized abundance values, distances, or some other transformation. In addition, for the comparison of tumour versus normal tissue, it is not specified what statistical test was applied. Since mass spectrometry data are typically log2-transformed to approximate a log-normal distribution before performing t-tests or similar parametric methods, clarification is needed on how these data were processed.

      Thanks for pointing this out; it made us realize that we need to extend our figure legend for Fig. 4h for clarity (Line 343-345). In both cases we show normalized intensities following the workflow described in Fig. 3a. In the case of the left graph labelled "CoRe", we are plotting an exometabolomics experiment, where the data were additionally normalised using both media blanks (samples in which no cells were cultured) and growth factor (accounts for cell growth during the experiment), as growth rate (accounts for variations in cell proliferation) was not available (see also formula (1) in the methods section). A result has a negative value if the metabolite has been consumed from the media, or a positive value if the metabolite has been released from the cell into the culture media.

      In addition, the reviewer refers to the comparison of tumour versus normal (Fig. 4a and 4d) and the missing description of the chosen statistical test. We have added the details to the figure legend (Lines 334 and 345).

      Adapted legend Fig. 4: "a) Differential metabolite analysis results for exometabolomics data comparing 786-O versus HK2 cells using ANOVA and false discovery rate (FDR) for p-value adjustment. b) Heatmap of mean consumption-release of the measured metabolites across cell lines. c) Heatmap of normalised ccRCC cell line exometabolomics data for the selected metabolites of amino acid metabolism for a sample subset. d) Differential metabolite analysis results for intracellular data comparing 786-O versus HK2 cells using ANOVA and false discovery rate (FDR) for p-value adjustment. e) Schematics of bioRCM process to integrate exometabolomics with intracellular metabolomics and f) number of metabolites by their combined change patterns in intracellular- and exometabolomics in 786-M1A versus HK2. g) Heatmap of the metabolite abundances in the "Both_DOWN (Released/Consumed)" cluster. h) Bar graphs of normalised methionine intensity for exometabolomics (CoRe: negative value, if the metabolite has been consumed from the media, or a positive value, if the metabolite has been released from the cell into the culture media) and intracellular metabolomics (Intra)."


      (10)

      Figure 5: "Tukey's p.adj

      We thank the reviewer for pointing this out. We have used the TukeyHSD (Tukey's Honestly Significant Difference) test in R on the ANOVA results. We have added more details to the figure legend (Line 384): "(Tukey's post-hoc test after ANOVA p.adj

      (11)

      The potential for multi-omics is mentioned. Please clarify how generalizable this framework is. Can it readily accommodate transcriptomics, proteomics, or fluxomics data, or does it require custom logic or formatting for each new data type?

      Thanks for raising this question. MetaProViz can readily accommodate transcriptomics and proteomics data for combined enrichment analysis using for example MetalinksDB metabolite-receptor pairs. Yet, MetaProViz does not support modelling fluxomics data into metabolic networks. We state in the discussion that this could be future development ("Beyond current capabilities, future developments could also incorporate mechanistic modeling to capture metabolic fluxes, subcellular compartmentalization, enzyme kinetics, regulatory feedback loops, and thermodynamic constraints to dissect metabolic response under perturbations."). To clarify on the availability of multi-omics integration for combined enrichment analysis, we have added some more details into the discussion section.

      Line 467-469: "In addition, providing knowledge of receptor-, transporter- and enzyme-metabolite pairs, MetaProViz can readily accommodate transcriptomics and proteomics data for combined enrichment analysis."

      (12)

      Please clarify if/how enrichment analyses account for varying set sizes and redundant metabolite memberships across pathways, which can bias over-representation analysis results.

      This is a very relevant point, which we have already been working on. Indeed, we agree that enrichment results can be biased by varying set sizes and redundant metabolite memberships across pathways. MetaProViz explicitly accounts for varying set sizes when running over-representation analysis (functions standard_ora() and cluster_ora()), which computes the p-value under a hypergeometric distribution. Thereby, larger pathways are penalized unless the overlap is proportionally large, while smaller pathways can be significant with fewer overlaps. Hence, the test quantifies whether the observed overlap between the query set and a pathway is larger than would be expected under random sampling. In addition, we explicitly filter by gene-set size using min_gssize/max_gssize, which further controls for extremely small or large sets. So both the statistical test itself and the size filters incorporate gene-set size variation.
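
      The size dependence of the hypergeometric test can be seen in a minimal Python sketch. This is an illustration only, not the code of standard_ora() or cluster_ora(): the function names, the toy background, and the default size limits are assumptions; only the idea of a hypergeometric upper-tail p-value combined with a minimum/maximum set-size filter comes from the response above.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) when drawing `draws` items without replacement from a
    population of size `population` that contains `successes` pathway members."""
    upper = min(successes, draws)
    total = comb(population, draws)
    hits = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return hits / total


def over_representation(query, pathways, background, min_size=2, max_size=1000):
    """Return {pathway: p-value}, skipping sets outside the size limits."""
    background = set(background)
    query = set(query) & background
    results = {}
    for name, members in pathways.items():
        members = set(members) & background
        if not min_size <= len(members) <= max_size:
            continue  # size filter, analogous in spirit to min_gssize/max_gssize
        overlap = len(query & members)
        results[name] = hypergeom_upper_tail(
            overlap, len(background), len(members), len(query)
        )
    return results


# Toy data: 20 background metabolites, one small and one large pathway.
background = [f"m{i}" for i in range(20)]
pathways = {
    "small": ["m0", "m1", "m2"],
    "large": [f"m{i}" for i in range(10)],
    "tiny": ["m0"],  # removed by the size filter
}
p_values = over_representation(["m0", "m1", "m2"], pathways, background)
print(p_values)
```

      Both retained pathways share the same three metabolites with the query, yet the small pathway receives the lower p-value because the same overlap is far less likely by chance in a set of three than in a set of ten.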

      Regarding the redundant metabolite-set (i.e. pathways) memberships, we have now implemented a new function (cluster_pk()) to cluster metabolite-sets like pathways based on overlapping metabolites. Thereby we allow investigation of enrichment results in regard to redundancy and similarity. For given metabolite-sets, the function calculates pathway similarities via either overlap- or correlation-based metrics. After optional thresholding to remove weak similarities, we implemented three clustering algorithms (connected-components clustering, Louvain community detection and hierarchical clustering) to group similar pathways. We then visualize the clustering results as a network graph using the new function viz_graph based on igraph. We have added all information into our methods section "Metabolite-set clustering" (Lines 656-671). In addition, we have also added the results of the clustering into Fig. 5f.
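
      As a rough illustration of the clustering idea, the following Python sketch computes an overlap-based similarity between metabolite-sets, thresholds it, and groups the sets by connected components. It is not the implementation of cluster_pk(): the function names, the choice of the overlap coefficient, and the threshold of 0.5 are assumptions, and the Louvain and hierarchical options are not shown.

```python
from itertools import combinations


def overlap_coefficient(a, b):
    """Shared members divided by the size of the smaller set."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def cluster_sets(sets, threshold=0.5):
    """Group set names into connected components of the thresholded
    similarity graph (an edge exists where similarity >= threshold)."""
    names = list(sets)
    neighbours = {name: set() for name in names}
    for a, b in combinations(names, 2):
        if overlap_coefficient(sets[a], sets[b]) >= threshold:
            neighbours[a].add(b)
            neighbours[b].add(a)

    clusters, seen = [], set()
    for start in names:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first walk over the similarity graph
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Hypothetical metabolite-sets: A and B share two of three members.
pathways = {
    "A": {"m1", "m2", "m3"},
    "B": {"m2", "m3", "m4"},
    "C": {"m8", "m9"},
}
clusters = cluster_sets(pathways, threshold=0.5)
print(clusters)
```

      Connected components is the simplest of the three options named above: any chain of pairwise similarities merges sets into one cluster, which is why a threshold that removes weak similarities matters.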

      New Fig. 5f:"f) *Network graph of top enriched pathways (p.adjusted

      Reviewer #2

      Evidence, reproducibility and clarity

      Schmidt et al report the development of MetaProViz, an integrated R package to process, analyze and visualize metabolomics data, including integration with prior knowledge. The authors then go on to demonstrate utility by analyzing several metabolomes of cell lines, media and patient samples from kidney cancer. The manuscript provides a concise description of key challenges in metabolomics that the authors identify and address in their software. The examples are helpful and illustrative, although I should point out that I lack the expertise to evaluate the R package itself. I only have a few very minor comments.

      Significance

      This is a very significant advance from one of the leading groups in the field that is likely to enhance metabolomics data analysis in the wider community.

      We thank the reviewer for this positive feedback on our package. We appreciate that there are no major comments from the reviewer.

      Minor comments:

      (1)

      Figure 2D, E: While the schematics are fairly intuitive, a brief figure legend description of what the different scenarios etc. represent would make this easier to grasp.

      We thank the reviewer for pointing this out and we acknowledge that this is a complex problem we try to convey. We received a similar comment from Reviewer 1 (Comment 3), so please see the extensive response there. In brief, we have extended the figure legend and specifically explained each displayed case and its meaning (Line 222-242) and extended the Figure itself by adding additional categories to Fig. 2e.

      Extended legend Fig. 2d-e: "d-e) Schematics of possible mapping cases between metabolite IDs (= each circle corresponds to one ID) of a pathway-metabolite set (e.g. KEGG) to metabolite IDs of a different database (e.g. HMDB) with (d) showing many-to-many mappings that can occur within and across pathway-metabolite sets and (e) additionally showing the mapping to metabolite IDs that were assigned to the detected peaks within and across pathway-metabolite sets. (d) Translating the metabolite IDs of a pathway-metabolite set can lead to special cases such as many-to-one mappings (Pathway 1), where for example the original resource used the ID for L-Alanine (Pathway 1, green) and D-Alanine (Pathway 1, yellow) in the amino-acid pathway, whilst the translated resource only has an entry for Alanine zwitterion (Pathway 1, blue). Additionally, many-to-one mappings can also occur across pathways (Pathway 2-4), where this mapping is only detected when mappings are analysed taking all pathways into account. Both of these cases deflate the pathways, which can also happen for one-to-none mappings (Pathway 1, white). There are also cases that inflate the pathway such as one-to-many mappings (e.g. Pathway 2-4, orange mapping to pink and violet). (e) Showcasing the different scenarios when merging measured data (detected) based on the translated metabolites within pathways (scenario 1-5) and across pathways (scenario 6-8), highlighting problematic scenarios (4-7) that require further actions. Unproblematic scenarios (1-3 and 8) can include special cases between original and translated (i.e. one-to-many in scenario 1), which become obsolete as only one/none of the many potential metabolite IDs is detected. Yet, if multiple metabolites are detected, action is required (scenario 5), which can include building the sum of the multiple detected features or only keeping the one with the highest Log2FC between two conditions. Other special cases between original and translated (i.e. many-to-one in scenario 4 and 6) also depend on what has been mapped to the measured features. If features have been measured in those scenarios, pathway deflation (i.e. only one original entry remains) or measured feature duplication (the same measurement is mapped to many features in the prior knowledge) are the possible results within and across pathways. Those scenarios should be addressed on a case-by-case basis as they also require biological information to be taken into account."
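
      The mapping cases named in this legend can be told apart programmatically. The Python sketch below is an illustration under stated assumptions, not MetaProViz code: the function name, the label strings, and all IDs are hypothetical, and it classifies each original ID by how many translated IDs it maps to and how many original IDs share those targets.

```python
from collections import defaultdict


def classify_mappings(pairs):
    """pairs: iterable of (original_id, translated_id or None).
    Returns {original_id: mapping label}."""
    forward = defaultdict(set)   # original ID -> translated IDs
    backward = defaultdict(set)  # translated ID -> original IDs
    for original, translated in pairs:
        forward[original]  # register the ID even if it has no translation
        if translated is not None:
            forward[original].add(translated)
            backward[translated].add(original)

    labels = {}
    for original, targets in forward.items():
        if not targets:
            labels[original] = "one-to-none"
        elif len(targets) > 1:
            shared = any(len(backward[t]) > 1 for t in targets)
            labels[original] = "many-to-many" if shared else "one-to-many"
        elif len(backward[next(iter(targets))]) > 1:
            labels[original] = "many-to-one"
        else:
            labels[original] = "one-to-one"
    return labels


# Hypothetical IDs mirroring the alanine example in the legend.
pairs = [
    ("L-Alanine_orig", "Alanine_zwitterion"),
    ("D-Alanine_orig", "Alanine_zwitterion"),
    ("Metabolite_X", None),
    ("Metabolite_Y", "Y_form_1"),
    ("Metabolite_Y", "Y_form_2"),
    ("Metabolite_Z", "Z_translated"),
]
labels = classify_mappings(pairs)
print(labels)
```

      Passing the pairs of a single pathway detects within-pathway cases only; passing the pairs of all pathways at once is what exposes the across-pathway cases described in the legend.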

      (2) Fig. 4: The authors briefly state that they integrate prior knowledge to identify the changes in methionine metabolism in kidney cancer, but it is not clear how exactly they contribute to this conclusion. It could be helpful to expand a bit on this to better illustrate how MetaProViz can be used to integrate prior knowledge into the analysis workflow.

      We think the reviewer refers to this section in the text (Line 363-370):

      "Next, we focused on the cluster "Both_DOWN (Released-Consumed)" and found that several amino acids are consumed by the ccRCC cell line 786-M1A but released by healthy HK2 cells. At the same time, intracellular levels are significantly lower than in HK2 (Log2FC = -0.9, p.adj = 4.4e-5) (Fig. 4g). To explore the role of these metabolites in signaling, we queried the prior knowledge resource MetalinksDB, which includes metabolite-receptor, metabolite-transporter and metabolite-enzyme relationships, for their known upstream and downstream protein interactors for the measured metabolites (Supplementary Table 5). This approach is especially valuable for exometabolomics, as it allows us to generate hypotheses about cell-cell communication. Notably, we identified links involving methionine (Fig. 4h), enzymes such as BHMT, and transporters such as SLC43A2 that were previously shown to be important in ccRCC [25,42] (Supplementary Table 5)."

      We have now extended this part to clearly state that MetalinksDB is the prior knowledge resource we used to identify the links for methionine (Line 363-364). In addition, we have extended our summary statement to make clear to the reader that we combine the biological clustering, which revealed the amino acid changes, with prior knowledge for the mechanistic insight (Line 380-381):

      "In summary, calculating consumption-release and combining it with intracellular metabolomics via biological regulated clustering reveals metabolites of interest. Further combining these results with prior knowledge using the MetaProViz toolkit facilitates biological interpretation of the data."

      (3)

      Given the functional diversity among metabolites -central to diverse pathways, are key signaling molecules, restricted functions, co-variation within a pathway - I wonder how informative approaches such as PCA or enrichment analyses are for identifying metabolic drivers of a (patho)physiological state. To some extent, this can be addressed by integrating prior knowledge, and it would be helpful if the authors could comment on (and if applicable explain) whether/how this is integrated into MetaProViz.

      The reviewer is correct about the functional diversity of metabolites, which is also why prior knowledge is needed to add mechanistic interpretation to the findings from the metadata analysis (as we showcased by focusing on the separation by age (Fig. 5c-d)). We think that approaches such as PCA or enrichment analysis can be helpful, even if admittedly limited. For example, in the metadata analysis presented in Fig. 5b and the subsequent enrichment analysis presented in Fig. 5, we used PCA to extract the eigenvectors and loadings, which act as weights indicating the contribution of each original metabolite to the separation along a specific principal component. Hence, the PCA eigenvectors show the metabolites driving the separation. This does not necessarily mean that those metabolites are drivers of a (patho)physiological state: the (patho)physiological state can equally be the reason those metabolites drive the separation along the eigenvectors. Thus, the metadata analysis presented in Fig. 5b enables us to extract the metadata variables, i.e. the (patho)physiological states, that are separated on a PC, together with the explained variance. This can also reveal co-variation, when multiple (patho)physiological states are separated on the same PC, as the reviewer correctly points out. Regarding the enrichment analysis, we provide different types of prior knowledge for classical mapping, as well as the prior knowledge we used to create the biological regulated clustering; together these help to identify key metabolic groups, as we can first cluster the metabolites and afterwards perform functional enrichment. Yet, this does not account for the technical issues of enrichment analysis. In this context, multi-omics integration building metabolic-centric networks could further elucidate the diversity of metabolic pathways and their connection to signalling and co-variation, yet this is beyond the scope of MetaProViz. 
To sum up, we are aware of the limitations of this analysis and the constraints on the downstream interpretation.

      To capture the functional diversity amongst metabolites, which leads to metabolites being present in multiple pathways of metabolite-pathway sets, we have implemented a new function to cluster metabolite-sets like pathways based on overlapping metabolites and visualize redundant metabolite-set (i.e. pathway) memberships (Fig. 5f). For more details, see also our response to Reviewer 1, Comment 12. We hope this will help to prevent mis- and over-interpretation of the enrichment results.

      In addition, we have extended the text to include the analysis pitfalls explicitly (Line 416-419): "Another variable explaining the same amount of variance in PC1 is the tumour stage, which could point to adjacent normal tissue metabolic rewiring that happens in relation to stage and showcases that biological data harbour co-variations, which cannot be disentangled by this method."

      Reviewer #3

      Evidence, reproducibility and clarity

      This manuscript introduces an R package, MetaProViz, for metabolomics data analysis (post annotation), aiming to solve a poor-analysis-choices problem and enable more people to do the analysis. MetaProViz not only guides people to select the best statistical method, but also helps solve previously unsolved problems, e.g. multiple and variable metabolite names in different databases and their connections to prior knowledge. They also created an exometabolomics analysis and the steps needed to visualise intracellular/media processes. The authors demonstrated their new package on a kidney cancer (clear-cell renal cell carcinoma) dataset, stepping one step closer to improving the biological interpretability of omics data analysis.

      Significance

      This is a great tool and I can't wait to use it on many upcoming metabolomics projects! Authors tackle multiple ongoing issues within the field: from poor selection of statistical methods (they provide guidance or have default safer options) to the messiness of data annotation between databases and improving data interpretability. The field is still evolving quickly, and it's impossible to solve all problems with one package; thus some limitations within the package could be seen as a bit rigid. Nonetheless, this fully steps toward filling an existing methodological gap. All bioinformaticians doing metabolomic analysis, or those learning how to do it, will greatly benefit from this knowledge.

      I myself lead a team of 6 bioinformaticians, and we do analysis for researchers, clinicians, drug discovery, and various companies. We run internal metabolomics pipelines every day and fully sympathise with the problems addressed by the authors.

      Major comments affecting conclusions

      none.

      We thank the reviewer for this positive feedback on evidence, reproducibility and clarity as well as on the significance of our work, given the reviewer's stated experience with metabolomics data analysis. We appreciate that there are no major comments from the reviewer.

      Minor comments

      Minor comments: important issues that could be addressed and possibly improve the clarity or general presentation of the tool. Please see all below.

      (1)

      1- You start with separating and talking about metabolomics and lipidomics, but lipidomics quickly disappears (especially beyond abstract/intro) - no real need to discuss lipidomics.

      Thanks, that's a good note and we have removed it from the abstract and introduction.

      (2)

      2- You refer to the MetImp4 imputation web tool, but I cannot find an active website, manuscript, or R package for it, and the cited link does not load. This raises doubts about whether the tool is currently usable. Additionally, imputation choice should be guided by biological context and study design, not just by testing a few methods and selecting the one that performs best.

      We fully agree with the reviewer on imputation handling. The manuscript we cite from Wei et al. (https://doi.org/10.1038/s41598-017-19120-0) compared a multitude of missing value imputation methods and made this comparison strategy available as a web-based tool, not as a code-based package such as an R package. Yet, the reviewer is right, the web-tool is no longer reachable. Hence, we have adapted the statement in our introduction (Line 61-62): "Moreover, there are tools that focus on specific steps of the pre-processing of feature intensities, which encompasses feature selection, missing value imputation (MVI) [9] and data normalisation. For example, MetImp4 is a web-tool that includes and compares multiple MVI methods [9]."

      (3)

      3- The authors address key metabolomics issues such as ambiguous metabolite names and isoforms, and their focus on resolving mapping ambiguities and translating between database identifiers is highly valuable. However, the larger challenge of de novo identification and the "dark matter" of unannotated metabolites remains unresolved (initiatives such as MassIVE might help in the future https://massive.ucsd.edu/ProteoSAFe/ ), and readers may benefit from clearer acknowledgement that MetaProViz does not operate on raw spectral data. The introduction currently emphasizes annotation, but since MetaProViz requires already annotated metabolite tables (and then deals with all the messiness), this space might be better used to frame the interpretability and pathway-analysis challenges that the tool directly addresses.

      We appreciate the comment and have highlighted this in the abstract and introduction: "MetaProViz operates on annotated intensity values..." (Line 29 and 88).

      Given the newest advancements in metabolite identification using AI-based methods, the MetaProViz toolkit, with its focus on connecting metabolite IDs to prior knowledge, becomes increasingly valuable. We added this to our discussion (Line 484-488): "Given the imminent shift in metabolite identification through AI-based approaches, including language model-guided [48] methods and self-supervised learning [49], the growing number of identified metabolites will make the MetaProViz toolkit increasingly valuable for the community to gain functional insights."

      In regard to the introduction, where we mention some tools for peak annotation: we include the paragraph naming peak annotation tools because we wanted to set the basis by (I) listing the different steps of metabolomics data analysis and (II) pointing to well-known tools for those steps. We also have a dedicated paragraph for pathway-analysis challenges.

      (4)

      4- I also really enjoyed you touching on the point of user-friendly but then inflexible and problem of reproducibility. We truly need well working packages for other bioinformaticians, rather than expecting wet-lab scientists to do all the analysis within the user interface.

      We thank the reviewer for this positive feedback.

      (5)

      5- It would be helpful to explain why the authors chose cancer/RCC samples for the demonstration. Was it because the dataset included both media and cell measurements? Does the tool perform best when multiple layers of information are available from the same experiment?

      We specifically chose the ccRCC cell line data as an example since, for a multitude of cell lines, both media (exometabolomics) and intracellular metabolomics had been performed. The combination of both data types is only used in the biological regulated clustering (Fig. 4e-g); all other analyses do not require additional data modalities. We have not specifically tested how performance differs for this particular case, as it would require multiple paired datasets (exometabolomics and intracellular metabolomics) taken at the same time and at different times.

      (6)

      6- Figure 2B: The upset plots effectively show increased overlap after adaptation, but it would be easier to compare changes if the order of the intersection bars in the "adapted" plot matched the original. For example, while total intersections increased (251→285), the PubChem+KEGG overlap decreased (24→5), likely due to reallocation to the full intersection.

      Thanks for raising this point. We had initially ordered the bars based on their intersection size, but we agree with the reviewer that for our point it makes sense to fix the order in the adapted plot to match the order of the original plot. We have done this (Fig. 2a) and also extended the figure legend text of SFig. 2, which shows the individually performed adaptations summarized in Fig. 2a.

      (7) (Planned)

      7- In your example of D-alanine and L-alanine - you mention how chirality is important biological feature, but up to this point it's not clear how do you do translation exactly and in which situations this would be treated just as "alanine" and when the more precise information would be retained? You mention RaMP-DB knowledge and one to X mappings as well as your general guidance in the "methods" part, but it would be useful to describe in this publication how you exactly tackled this problem in the ccRCC case.

      We thank the reviewer for this suggestion. Since this is a complex problem, we will add a more explicit description to the results section by showcasing more details on how we exactly tackled this problem in the ccRCC example data.

      In regard to D- and L-alanine, even though chirality is an important biological feature, in a standard experiment we cannot distinguish whether we detect the L- or the D-amino acid. This is why we assign all possible IDs to increase the overlap with the prior knowledge. In Fig. 2b we showcase that this can potentially lead to multiple mappings of the same measured feature to multiple pathways. For example, if we measure alanine, assign the PubChem IDs for L-Alanine, D-Alanine and Alanine, and map to metabolite-sets that include both L-Alanine and D-Alanine, the same measurement is mapped to both entries. In turn this could fall into Scenario 6 (Fig. 2e), where across pathways there is a D-Alanine-specific one (Pathway 1) and an L-Alanine-specific one (Pathway 2). Now we can decide whether to allow both mappings (many-to-one) or to exclude D-Alanine because we know our biological system is human and should primarily contain L-Alanine.

      (8) (Planned)

      8- In one to many mappings, it would be interesting to see quantification how frequently it was happening within a pathway or across pathways. I.e. Would going into pathway analysis "solve" the issue of "lost in translation" or not really?

      We have quantified the frequency for the example of translating the KEGG metabolite-set into HMDB IDs (Fig. 2c, left panel). Yet, we are not showcasing the quantification across the KEGG metabolite-sets with this plot. During the revision we will add the full results available to the Extended Data Table 2, which currently only includes the results displayed in Fig.2c.

      (9)

      9- QC: the coefficient of variation (CV) helps identify features with high variability and thus low detection accuracy. Here it's important to acknowledge that if the feature is very variable between groups it can be extremely important, but if the feature is very variable within the group - only then one would have low trust in the accuracy.

      Yes, we totally agree with the reviewer on this. For this reason, we have applied CV only in instances where this is not leading to any condition-driven CV differences, but is truly feature-focused: (1) Function pool_estimation performs CV on the pool samples only, which are a homogeneous mixture of all samples, and hence can be used to assess feature variability. (2) Function processing performs CV on exometabolomics media samples (=blanks), which are also not impacted by different conditions.
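
      To make the feature-focused use of CV concrete, here is a minimal Python sketch of computing the CV per feature on pool samples only. It does not reproduce the `pool_estimation` function; the helper names, the toy intensities and the 30% cutoff are assumptions chosen for illustration.

```python
import statistics


def cv_percent(values):
    """Coefficient of variation (sample SD / mean) in percent."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined for a mean of zero")
    return 100.0 * statistics.stdev(values) / mean


def high_cv_features(pool_intensities, cutoff_percent=30.0):
    """Return features whose CV across pool samples exceeds the cutoff.

    The 30% default is an illustrative assumption, not a package default.
    """
    return sorted(
        feature
        for feature, values in pool_intensities.items()
        if cv_percent(values) > cutoff_percent
    )


# Toy pool-sample intensities: pools are a homogeneous mixture of all
# samples, so variability here reflects the feature, not the condition.
pool_intensities = {
    "metabolite_a": [100.0, 100.0, 100.0],
    "metabolite_b": [50.0, 100.0, 150.0],
}
flagged = high_cv_features(pool_intensities)
```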

      (10)

      10- Missing value imputation - while missing not at random is a great way to deal with missingness, it would be great to have options for others (not just MNAR), as missingness is of a complex nature. If a pretty strong decision has been made, it would be good to support this by some supplementary data (i.e. how results change while applying various combinations of missingness and why choosing MNAR seems to be the most robust).

      We have decided to only offer support for MNAR, since we would recommend MVI only if there is a biological basis for it.

      As mentioned in the response to your minor comment 2, Wei et al. (https://doi.org/10.1038/s41598-017-19120-0) compared a multitude of missing value imputation methods. They compared six imputation methods (i.e., QRILC, Half-minimum, Zero, RF, kNN, SVD) for MNAR and systematically measured the performance of those imputation methods. They showed that QRILC and Half-minimum produced much smaller SOR values, showing consistently good performance on data with different numbers of missing variables. This was the reason for us to only provide Half-minimum.
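
      For readers unfamiliar with half-minimum imputation, a minimal Python sketch for a single feature follows. It assumes missing values are encoded as `None` and that the minimum is taken over the observed values of that one feature; whether MetaProViz computes the minimum per feature, per condition or otherwise is not stated here, so this is an illustration only.

```python
def half_minimum_impute(values):
    """Replace missing entries (None) with half of the smallest observed value.

    Returns a new list; a feature with no observed values is returned
    unchanged because there is no minimum to halve.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)
    fill = min(observed) / 2
    return [fill if v is None else v for v in values]


# One feature measured in three samples, with the second value missing.
imputed = half_minimum_impute([4.0, None, 10.0])
```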

      (11) (Planned)

      11- In the pre-processing and imputation stages - it would be interesting to see a summary table of how many features are left after each stage.

      This is a good suggestion and refers to the steps described in Fig. 3a. We will create an overview table for this, add it into the Extended Data Table and refer to it in the results section.

      (12)

      12- Is there a reason not to do UMAP or PLS-DA graphs for outlier detection? Doing more than PCA would help to have more confidence in removing or retaining outliers in the cases where biological relevance is borderline.

      We decided to use PCA because it is the standard combination with Hotelling's T² outlier testing. Since PCA is a linear dimensionality reduction technique that preserves the overall variance in the data and has a clear mathematical foundation linked to the covariance structure, it specifically fits the assumptions required by Hotelling's T² outlier testing. Indeed, Hotelling's T² relies on the properties of the covariance matrix and the assumption of a multivariate Gaussian distribution. UMAP is a non-linear dimensionality reduction technique, which prioritizes preserving local and global structures in a way that often results in good clustering visualization, but it distorts distances between clusters and does not have the same rigorous statistical underpinnings as PCA. PLS-DA focuses on maximizing the covariance between the variables and the class labels; even though this is not commonly done, one could use the optimal latent variables for discrimination and apply Hotelling's T² to those latent variables. Yet, PLS-DA is supervised and actively tries to separate data points in the latent space, which can be misleading for outlier detection, where methods like PCA that are unbiased, unsupervised and preserve global variance are advantageous.
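
      The link between PCA and the covariance structure can be sketched in a few lines of Python. The example assumes the PCA scores are already available and mean-centred; because principal component scores are uncorrelated, the T² statistic of a sample reduces to the sum of its squared scores divided by the per-component variances. The function name and toy scores are assumptions, and the F-distribution-based critical value used for the actual outlier call is deliberately left out.

```python
import statistics


def hotelling_t2(scores):
    """Hotelling's T² per sample from mean-centred PCA scores.

    `scores` is a list of samples, each a list of principal component
    scores. Uncorrelated components give a diagonal covariance matrix,
    so T² is the sum of squared scores scaled by component variance.
    """
    n_components = len(scores[0])
    variances = [
        statistics.variance([row[k] for row in scores])
        for k in range(n_components)
    ]
    return [
        sum(row[k] ** 2 / variances[k] for k in range(n_components))
        for row in scores
    ]


# Toy scores for five samples on two components; the last sample sits
# far from the others along the first component.
scores = [
    [-1.0, 1.0],
    [-1.0, -1.0],
    [-1.0, 1.0],
    [-1.0, -1.0],
    [4.0, 0.0],
]
t2 = hotelling_t2(scores)
most_extreme = t2.index(max(t2))
```

      A sample would then be flagged when its T² exceeds a critical value derived from the F distribution at the chosen confidence level.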

      (13)

      13- Metadata vs metabolite features - can this be used beyond metabolomics (i.e. proteomics, transcriptomics, etc)? It can be always very useful when there are many metadata features and it's hard to pre-select beforehand which ones are the most biologically relevant.

      Yes, definitely. In fact, we have used the metadata analysis strategy also with proteomics data and it will work equally well with any omics data type.

      (14)

      14- While authors discussed what KEGG pathways were significantly deregulated, it would be interesting to see all the pathways that were affected (e.g. aPEAR "bubble" graphs can show this (https://github.com/kerseviciute/aPEAR) , or something similar to NES scores). I appreciate the trickiness of it, but it would be quite interesting to see how authors e.g. Figure5e narrowed it down to the two pathways and how all the others looked like.

      We thank the reviewer for the suggestion of the aPEAR graphs. Following this suggestion, we have implemented a new function to enable clustering of the pathways based on overlapping metabolites (cluster_pk()). For more details regarding the method see also our response to Reviewer 1 (Comment 12) and our extended method section "Metabolite-set clustering" (Lines 656-671). We visualize the clustering results as a network graph, which we also included into Fig. 5f.
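
      As a rough illustration of clustering pathways by overlapping metabolites, the following Python sketch links two metabolite-sets when their Jaccard index reaches a threshold and reports the connected components of the resulting network. This is not the `cluster_pk()` implementation: the similarity measure, the 0.3 threshold, the function names and the toy sets are all assumptions.

```python
def jaccard(a, b):
    """Jaccard index of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_by_overlap(metabolite_sets, threshold=0.3):
    """Group pathways into connected components of an overlap network."""
    names = list(metabolite_sets)
    neighbours = {name: set() for name in names}
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            score = jaccard(metabolite_sets[first], metabolite_sets[second])
            if score >= threshold:
                neighbours[first].add(second)
                neighbours[second].add(first)

    clusters, seen = [], set()
    for start in names:
        if start in seen:
            continue
        # Depth-first walk collecting everything reachable from `start`.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


toy_sets = {
    "Pathway A": {"m1", "m2", "m3"},
    "Pathway B": {"m2", "m3", "m4"},
    "Pathway C": {"m8", "m9"},
}
clusters = cluster_by_overlap(toy_sets)
```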

      The complete result of the KEGG enrichment can be found in Extended Data Table 1, Sheet 13 (Pathway enrichment analysis using KEGG on Young patient subset). The pathways are ranked by p.adjusted value and also include a score (FoldEnrichment) from the Fisher's exact test (similar to NES scores in GSEA). Here one can find a total of seven pathways with a p.adjusted value below the significance threshold. For Fig. 5e we narrowed down to these two pathways based on the previous findings of dysregulated dipeptides (Fig. 5d), as we searched for a potential explanation of this observation.
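
      For orientation, an over-representation test of this kind can be sketched in Python as a one-sided Fisher's exact test, computed through the hypergeometric tail, together with a fold enrichment. The fold enrichment definition used here (observed hit fraction divided by the fraction expected from the pathway size) is a common convention and an assumption; it may differ from how the FoldEnrichment column is computed.

```python
from math import comb


def enrichment(hits_in_set, set_size, hits_total, universe_size):
    """One-sided Fisher's exact p-value and fold enrichment for one pathway.

    hits_in_set   : selected metabolites that belong to the pathway
    set_size      : pathway members present in the measured universe
    hits_total    : all selected metabolites
    universe_size : all measured metabolites
    """
    upper = min(set_size, hits_total)
    total = comb(universe_size, hits_total)
    # Probability of observing at least `hits_in_set` pathway members.
    p_value = sum(
        comb(set_size, k) * comb(universe_size - set_size, hits_total - k)
        for k in range(hits_in_set, upper + 1)
    ) / total
    fold = (hits_in_set / hits_total) / (set_size / universe_size)
    return p_value, fold


# Toy numbers: 10 measured metabolites, 5 in the pathway, 5 selected,
# and all 5 selected metabolites fall inside the pathway.
p_value, fold = enrichment(5, 5, 5, 10)
```

      The resulting p-values would still need multiple-testing adjustment across pathways before ranking.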

      (15)

      15- Could you comment on the runtime of the pipeline? In particular, do the additional translation steps and use of multiple databases substantially affect computational speed?

      Downloading and parsing databases takes significant time; large ones such as RaMP or HMDB in particular might take minutes on a standard laptop. Our local cache speeds up the process by eliminating the need for repeated downloads. In the future, database access will be even faster: according to our plans, all prior knowledge will be accessible in an already parsed format via our own API (omnipathdb.org). The ambiguity analysis, which is a complex data transformation pipeline, and plotting with ggplot2, another key component of MetaProViz, are the slowest parts, especially when an analysis is performed for the first time and no cache can be used. This means there are a few slow operations, which complete in at most a few dozen seconds. However, the implementation and speed of these solutions do not fall behind what we commonly find in bioinformatics packages and, most importantly, the speed of MetaProViz does not pose an obstacle to its efficient use in analysis pipelines.

      (16)

      16- I clap to the authors for automated checks if selected methods are appropriate!

      Thank you, this is something we think is important to ensure correct analysis and circumvent misinterpretation.

      (17)

      17- My suggestion would be to also look into power calculation or p-value histogram. In your example you saw some clear signal, but very frequently research studies are under-sampled and while effect can be clearly seen, there are just not enough samples to have statistically significant hits.

      We fully agree that power calculations are very important. Yet, this should ideally happen prior to the user's experiment. MetaProViz analysis starts at a later time-point and power calculations should have been done before. With regard to the p-value histogram, we have implemented a similar measure, namely a density plot, which is plotted as a quality control measure within the MetaProViz differential analysis function. The density plot is a smoothed version of a histogram that represents the distribution as a continuous probability density function and can be used to assess whether the p-values follow a uniform distribution.
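
      The idea behind this quality control can be shown with a plain histogram in Python. Note that this sketch bins the p-values rather than estimating a smoothed density, so it corresponds to the reviewer's histogram rather than to the density plot itself; the function name and the toy p-values are assumptions.

```python
def pvalue_histogram(pvalues, n_bins=10):
    """Count p-values in equal-width bins over [0, 1].

    Under the null hypothesis the counts should be roughly flat; a peak
    in the first bin suggests true signal, other shapes suggest a problem
    with the test assumptions.
    """
    counts = [0] * n_bins
    for p in pvalues:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"p-value out of range: {p}")
        # p == 1.0 is placed in the last bin.
        index = min(int(p * n_bins), n_bins - 1)
        counts[index] += 1
    return counts


counts = pvalue_histogram([0.01, 0.02, 0.5, 1.0], n_bins=4)
```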

      (18)

      18- Overall functional parts are novel and next step in helping with data interpretability, but I still found it hard to read into functionally clear insights (re to pathways / functional groupings of metabolites) - especially as you have e.g. enzyme-metabolite databases etc. I think clarity there could be improved and would help to get your message more widely across.

      Regarding the clarity of the pathway enrichment and its functional insights, we have extended the figure legends of Fig. 4 and 5, we now clearly state that MetalinkDB is the prior knowledge resource we used for the functional interpretation to identify the links for methionine (Lines 367-368), and we have extended our summary statement to highlight that we combine the biological clustering with prior knowledge for the mechanistic insight (Lines 380-381).

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      This manuscript introduces an R package, MetaProViz, for metabolomics data analysis (post-annotation), aiming to solve the problem of poor analysis choices and enable more people to do the analysis. MetaProViz not only guides people to select the best statistical method, but also makes it possible to solve previously unsolved problems, e.g. multiple and variable metabolite names in different databases and their connections to prior knowledge. The authors also created an exometabolomics analysis and the steps needed to visualise intra-cell / media processes. They demonstrated their new package on a kidney cancer (clear-cell renal cell carcinoma) dataset, stepping one step closer to improving the biological interpretability of omics data analysis.

      Major comments affecting conclusions: none.

      Minor comments, important issues that could be addressed and possibly improve the clarity or generally presentation of the tool. Please see all below.

      1. You start by separating and talking about metabolomics and lipidomics, but lipidomics quickly disappears (especially beyond abstract/intro) - no real need to discuss lipidomics.
      2. You refer to the MetImp4 imputation web tool, but I cannot find an active website, manuscript, or R package for it, and the cited link does not load. This raises doubts about whether the tool is currently usable. Additionally, imputation choice should be guided by biological context and study design, not just by testing a few methods and selecting the one that performs best.
      3. The authors address key metabolomics issues such as ambiguous metabolite names and isoforms, and their focus on resolving mapping ambiguities and translating between database identifiers is highly valuable. However, the larger challenge of de novo identification and the "dark matter" of unannotated metabolites remains unresolved (initiatives such as MassIVE might help in the future https://massive.ucsd.edu/ProteoSAFe/ ), and readers may benefit from clearer acknowledgement that MetaProViz does not operate on raw spectral data. The introduction currently emphasizes annotation, but since MetaProViz requires already annotated metabolite tables (and then deals with all the messiness), this space might be better used to frame the interpretability and pathway-analysis challenges that the tool directly addresses.
      4. I also really enjoyed you touching on the point of user-friendly but then inflexible and problem of reproducibility. We truly need well working packages for other bioinformaticians, rather than expecting wet-lab scientists to do all the analysis within the user interface.
      5. It would be helpful to explain why the authors chose cancer/RCC samples for the demonstration. Was it because the dataset included both media and cell measurements? Does the tool perform best when multiple layers of information are available from the same experiment?
      6. Figure 2B: The upset plots effectively show increased overlap after adaptation, but it would be easier to compare changes if the order of the intersection bars in the "adapted" plot matched the original. For example, while total intersections increased (251→285), the PubChem+KEGG overlap decreased (24→5), likely due to reallocation to the full intersection.
      7. In your example of D-alanine and L-alanine - you mention that chirality is an important biological feature, but up to this point it's not clear how exactly you do the translation, in which situations this would be treated just as "alanine", and when the more precise information would be retained. You mention RaMP-DB knowledge and one to X mappings as well as your general guidance in the "methods" part, but it would be useful to describe in this publication how exactly you tackled this problem in the ccRCC case.
      8. In one to many mappings, it would be interesting to see quantification how frequently it was happening within a pathway or across pathways. I.e. Would going into pathway analysis "solve" the issue of "lost in translation" or not really?
      9. QC: the coefficient of variation (CV) helps identify features with high variability and thus low detection accuracy. Here it's important to acknowledge that if the feature is very variable between groups it can be extremely important, but if the feature is very variable within the group - only then one would have low trust in the accuracy.
      10. Missing value imputation - while missing not at random is a great way to deal with missingness, it would be great to have options for others (not just MNAR), as missingness is of a complex nature. If a pretty strong decision has been made, it would be good to support this by some supplementary data (i.e. how results change while applying various combinations of missingness and why choosing MNAR seems to be the most robust).
      11. In the pre-processing and imputation stages - it would be interesting to see a summary table of how many features are left after each stage.
      12. Is there a reason not to do UMAP or PLS-DA graphs for outlier detection? Doing more than PCA would help to have more confidence in removing or retaining outliers in the cases where biological relevance is borderline.
      13. Metadata vs metabolite features - can this be used beyond metabolomics (i.e. proteomics, transcriptomics, etc)? It can be always very useful when there are many metadata features and it's hard to pre-select beforehand which ones are the most biologically relevant.
      14. While authors discussed what KEGG pathways were significantly deregulated, it would be interesting to see all the pathways that were affected (e.g. aPEAR "bubble" graphs can show this (https://github.com/kerseviciute/aPEAR) , or something similar to NES scores). I appreciate the trickiness of it, but it would be quite interesting to see how authors e.g. Figure5e narrowed it down to the two pathways and how all the others looked like.
      15. Could you comment on the runtime of the pipeline? In particular, do the additional translation steps and use of multiple databases substantially affect computational speed?
      16. I clap to the authors for automated checks if selected methods are appropriate!
      17. My suggestion would be to also look into power calculation or p-value histogram. In your example you saw some clear signal, but very frequently research studies are under-sampled and while effect can be clearly seen, there are just not enough samples to have statistically significant hits.
      18. Overall functional parts are novel and next step in helping with data interpretability, but I still found it hard to read into functionally clear insights (re to pathways / functional groupings of metabolites) - especially as you have e.g. enzyme-metabolite databases etc. I think clarity there could be improved and would help to get your message more widely across.

      Significance

      This is a great tool and I can't wait to use it on many upcoming metabolomics projects! Authors tackle multiple ongoing issues within the field: from poor selection of statistical methods (they provide guidance or have default safer options) to the messiness of data annotation between databases and improving data interpretability. The field is still evolving quickly, and it's impossible to solve all problems with one package; thus some limitations within the package could be seen as a bit rigid. Nonetheless, this fully steps toward filling an existing methodological gap. All bioinformaticians doing metabolomic analysis, or those learning how to do it, will greatly benefit from this knowledge.

      I myself lead a team of 6 bioinformaticians, and we do analysis for researchers, clinicians, drug discovery, and various companies. We run internal metabolomics pipelines every day and fully sympathise with the problems addressed by the authors.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Schmidt et al report the development of MetaProViz, an integrated R package to process, analyze and visualize metabolomics data, including integration with prior knowledge. The authors then go on to demonstrate utility by analyzing several metabolomes of cell lines, media and patient samples from kidney cancer. The manuscript provides a concise description of key challenges in metabolomics that the authors identify and address in their software. The examples are helpful and illustrative, although I should point out that I lack the expertise to evaluate the R package itself. I only have a few very minor comments.

      Minor comments:

      1. Figure 2D, E: While the schematics are fairly intuitive, a brief figure legend description of what the different scenarios etc. represent would make this easier to grasp.
      2. Fig. 4: The authors briefly state that they integrate prior knowledge to identify the changes in methionine metabolism in kidney cancer, but it is not clear how exactly they contribute to this conclusion. It could be helpful to expand a bit on this to better illustrate how MetaProViz can be used to integrate prior knowledge into the analysis workflow.
      3. Given the functional diversity among metabolites (some are central to diverse pathways, some are key signaling molecules, some have restricted functions, and some co-vary within a pathway), I wonder how informative approaches such as PCA or enrichment analyses are for identifying metabolic drivers of a (patho)physiological state. To some extent, this can be addressed by integrating prior knowledge, and it would be helpful if the authors could comment on (and if applicable explain) whether/how this is integrated into MetaProViz.

      Significance

      This is a very significant advance from one of the leading groups in the field that is likely to enhance metabolomics data analysis in the wider community.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      The author presents a new method for microRNA target prediction based on (1) a publicly available pretrained Sentence-BERT language model that the author fine-tunes using MeSH information and (2) downstream classification analysis for microRNA target prediction. In particular, the author's approach, named "miRTarDS", attempts to solve the microRNA target prediction problem by utilizing disease information (i.e., semantic similarity scores) from their language model. The author then compares the prediction performance with other sequence- and disease-based methods and attempts to show that miRTarDS is superior or at least comparable to existing methods. The author's general approach to this microRNA target prediction problem seems promising, but fails to demonstrate concrete computational evidence that miRTarDS outperforms other existing methods. The author's claim that disease information-based language models are sufficient is unfounded. The manuscript requires substantial rewriting and reorganization for readers with a strong background in biomedical research.

      We appreciate the reviewer’s careful examination of modeling, benchmarking, and interpretation, and we are particularly encouraged that they found the proposed method promising. We will make corresponding revisions to the manuscript based on the reviewer’s comments.

      A major issue related to the author's claim of computational advance of miRTarDS: The author does not introduce existing biomedical-specific language models, and does not compare them against miRTarDS's fine-tuned model. The performance of miRTarDS is largely dependent on the semantic embedding of disease terms. The author shows in Figure 5 that MeSH-based fine-tuning leads to a substantial improvement in MeSH-based correlation compared to the publicly available pretrained SBERT model "multi-qa-MiniLM-L6-cos-v1" without sacrificing a large amount of BIOSSES-based correlation. However, the author does not compare the performance of MeSH- and BIOSSES-based correlation with existing language models such as ChatGPT, BioBERT, PubMedBERT, and more. Also, the substantial improvement in MeSH-based correlation is a mere indication that the MeSH-based fine-tuning strategy was reasonable and not that it's superior to the publicly available pretrained SBERT model "multi-qa-MiniLM-L6-cos-v1".

      We thank the reviewer for the constructive suggestions regarding the benchmarking of language models. We acknowledge that the performance of miRTarDS largely depends on the semantic embeddings of disease terms. In the revision, we will therefore: 1) conduct a literature review to introduce existing biomedical-specific language models, and 2) perform a horizontal comparison between our fine-tuned model and these existing models, to evaluate the model’s capabilities more comprehensively.

      Another major issue is in the author's claim that disease-information from miRTarDS's language model is "sufficient" for accurate microRNA target prediction. Available microRNA targets with experimental evidence are largely biased for those with disease implications that have been reported in the biomedical literature. It's possible that their language model is biased by existing literature that has also been used to build microRNA target databases. Therefore, it is important that the author provides strong evidence that excludes the possibility of data leakage circularity. Similar concerns are prevalent across the manuscript, and so I highly recommend that the author reassess the evaluation frameworks and account for inflated performance, biased conclusions, and self-confirming results.

      We thank the reviewer for the comment. We recognize that existing experimentally validated microRNA targets may be biased toward those reported in the biomedical literature as disease-related. To mitigate this bias, we attempted to extract, using the K-Nearest Neighbors (KNN) method, predicted microRNA targets whose numbers of miRNA-disease and gene-disease entries are very similar to those of the experimentally validated microRNA targets. We then applied Positive-Unlabeled (PU) Learning to classify the two groups. PU Learning is designed to address scenarios where only a subset of the training data is explicitly labeled as positive while the remaining data are unlabeled, with the unlabeled set containing both potential positives and true negatives, which is highly suitable for the application context of this manuscript [1]. Preliminary results show that after applying the new data extraction and classification approach, model performance drops to around F1=0.73 (the MISIM method also shows a decline, with F1 around 0.58; detailed code is available on GitHub). The specific reasons for this require further investigation.
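
      One way to picture the matching step is the following minimal Python sketch, which selects, for each validated target, the predicted targets with the closest pair of disease-entry counts. It is a guess at the general idea only and not the code on GitHub: the function name, the squared Euclidean distance, the choice of `k` and the toy counts are all assumptions, and the subsequent PU Learning step is not shown.

```python
def match_nearest(validated, predicted, k=1):
    """Pick predicted targets whose disease-entry counts resemble validated ones.

    Each item is a (miRNA-disease entries, gene-disease entries) pair.
    For every validated pair, the indices of the k nearest predicted
    pairs are collected; the sorted union of indices is returned.
    """
    selected = set()
    for v_mirna, v_gene in validated:
        ranked = sorted(
            range(len(predicted)),
            key=lambda i: (predicted[i][0] - v_mirna) ** 2
            + (predicted[i][1] - v_gene) ** 2,
        )
        selected.update(ranked[:k])
    return sorted(selected)


# Toy counts: two validated targets and three predicted candidates.
validated = [(10, 5), (100, 48)]
predicted = [(100, 50), (11, 5), (9, 40)]
matched = match_nearest(validated, predicted)
```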

      Last but not least, the manuscript requires a deeper and more careful description and computational encoding of microRNA biology. I'd advise the author to include an expert in microRNA biology to improve the quality of this manuscript. For example, the author replaces the mature miRNA notation with the pre-miRNA notation to maintain computational encoding consistency across databases. However, the mature microRNA notation (the '-3p' or '-5p' suffix) is critical, as the 3p and 5p mature microRNAs have different seed sequences and thus different mRNA targets. The 3p mature microRNA would most likely not target an mRNA targeted by the 5p mature microRNA.

      We thank the reviewer for the critique and suggestion. We fully agree with the reviewer that the distinction between the 3p and 5p mature strands is critical for determining mRNA targeting, as they possess distinct seed sequences. In our study, we relied on the miRNA–disease associations provided by the HMDD database, which annotates interactions at the pre-miRNA level: “… the enriched functions of each mature miRNA are aggregated to the corresponding miRNA precursor.” [2] Furthermore, existing literature suggests that the pre-miRNA level can be appropriate and informative for disease association analyses: “Compared with the mature miRNA method, the pre-miRNA method is more useful for studying disease association.” [3] We also find that, in some cases, both strands cooperate to regulate the same or complementary pathways [4]. We acknowledge the reviewer’s point as an important consideration for future revision. We plan to consult or collaborate with biologists to enhance the quality of the manuscript in biology.

      Reviewer #2 (Public review):

      This study introduces a novel knowledge-driven approach, miRTarDS, which enables microRNA-Target Interaction (MTI) prediction by leveraging the disease association degree between a miRNA and its target gene. The core hypothesis is that this single feature is sufficient to distinguish experimentally validated functional MTIs from computationally predicted MTIs in a binary classification setting. To quantify the disease association, the authors fine-tuned a Sentence-BERT (SBERT) model to generate embeddings of disease descriptions and compute their semantic similarity. Using only this disease association feature, miRTarDS achieved an F1 score of 0.88 on the test set.
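
      As a reminder of what semantic similarity between embeddings means in practice, a minimal Python sketch of cosine similarity follows. The use of cosine similarity is an assumption suggested by the name of the pretrained model (which ends in "cos-v1"); the toy three-dimensional vectors stand in for real SBERT embeddings, and how the similarities are aggregated into the disease association feature is not shown.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy embeddings for disease descriptions linked to a miRNA and a gene.
mirna_disease = [1.0, 0.0, 0.0]
gene_disease_same = [2.0, 0.0, 0.0]
gene_disease_unrelated = [0.0, 3.0, 0.0]

similar = cosine_similarity(mirna_disease, gene_disease_same)
unrelated = cosine_similarity(mirna_disease, gene_disease_unrelated)
```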

      We thank the reviewer for their positive feedback, especially for their recognition of the novelty of this manuscript.

      Strengths:

      The primary strength is the innovative use of the disease association degree as an independent feature for MTI classification. In addition, this study successfully adapts and fine-tunes the Sentence-BERT (SBERT) model to quantify the semantic similarity between biomedical texts (disease descriptions). This approach establishes a critical pathway for integrating powerful language models and the vast growth in clinical/disease data into biochemical discovery, like MTI prediction.

      We would like to thank the reviewer again for their positive feedback. We appreciate their recognition of the novelty of our work, as well as their acknowledgment that the proposed method paves the way for integrating language models with clinical/disease data into biochemical discovery.

      Weaknesses:

      The main weakness lies in its definition of the ground-truth dataset, which serves as a foundation for methodological evaluation. The study defines the Negative Set as computationally predicted MTIs that lack experimental evidence. However, the absence of experimental validation does not equate to non-functionality. Similarly, the miRAW sets are classified by whether the target and miRNA could form a stable duplex structure according to RNA structure prediction. This definition is biologically irrelevant, as duplex stability does not fully encapsulate the complex in vivo binding of miRNAs within the AGO protein complex.

      We thank the reviewer for their constructive feedback. We have realized that treating predicted MTIs as a negative class may pose some issues. Therefore, we have decided to adopt Positive-Unlabeled (PU) Learning in subsequent updates. This classification method can be applied to datasets such as ours, which contain only positive classes and lack negative ones [1]. We used the miRAW dataset to enable a horizontal comparison of our method with traditional sequence-based prediction approaches. We acknowledge that miRAW may overlook some biological insights, and we plan to optimize the construction of test datasets in the future. Some preliminary explorations have already been conducted, and the relevant code is available on GitHub.

      Furthermore, we will make the following revisions: 1) clearly specify the version of miRBase and incorporate more miRNA-related databases; 2) conduct a further literature review on miRNA biological mechanisms to enhance the biological quality of the manuscript; 3) perform a more comprehensive evaluation of the model’s performance; and 4) attempt to identify some representative MTIs that have been overlooked by existing prediction tools but can be predicted by our proposed method.

      References

      (1) Li, F., Dong, S., Leier, A., Han, M., Guo, X., Xu, J., ... & Song, J. (2022). Positive-unlabeled learning in bioinformatics and computational biology: a brief review. Briefings in Bioinformatics, 23(1), bbab461.

      (2) Huang, Z., Shi, J., Gao, Y., Cui, C., Zhang, S., Li, J., ... & Cui, Q. (2019). HMDD v3.0: a database for experimentally supported human microRNA–disease associations. Nucleic Acids Research, 47(D1), D1013-D1017.

      (3) Wang, H., & Ho, C. (2023). The human pre-miRNA distance distribution for exploring disease association. International Journal of Molecular Sciences, 24(2), 1009.

      (4) Mitra, R., Adams, C. M., Jiang, W., Greenawalt, E., & Eischen, C. M. (2020). Pan-cancer analysis reveals cooperativity of both strands of microRNA that regulate tumorigenesis and patient survival. Nature Communications, 11(1), 968.

    1. Reviewer #2 (Public review):

      Summary:

      The manuscript describes a combined computational and experimental approach to investigate the ABHD5 binding to and insertion into membranes.

      Strengths:

      Mutational experiments support computational findings obtained on ABHD5 membrane insertion with enhanced-sampling atomistic simulations.

      Weaknesses:

      While the addressed problem is interesting, I have several concerns, which fall into two categories:

      (A) I see statements throughout the manuscript, e.g. on PNPLA activation, that are not supported by the results.

      (B) The presentation of the computational and experimental results partly lacks clarity and detail.

      Comments and questions on (A):

      (1) I think the following statements in the abstract, which go beyond ABHD5 membrane binding, are not supported by the presented data:

      the addition "to control lipolytic activation" in the 3rd sentence of the abstract.

      further below ".... transforming ABHD5 into an active and membrane-localized regulator".

      (2) The authors state in the Introduction (page numbers and line numbers are missing to be more specific):

      "We hypothesize that binding of ABHD5 alters the nanoscale chemical and biophysical properties of the LD monolayer, which, combined with direct protein-protein interactions, enables PNPLA paralogs to access membrane-restricted substrates. This regulatory mechanism represents a paradigm shift from conventional enzyme-substrate interactions to sophisticated allosteric control systems that operate at membrane interfaces."

      This hypothesis and the suggested paradigm shift are not supported by the data. Protein-protein interactions are not considered. What is meant by "sophisticated allosteric control"?

      (3) The authors state in the Results section:

      "We hypothesize that this TAG nanodomain is critical for ABHD5-activated TAG hydrolysis by PNPLA2." In previous pages, the authors state the location of the nanodomain: "TAG nanodomain under ABHD5".

      If the nanodomain is located under ABHD5, how can it be accessible to PNPLA2? To my understanding, ABHD5 then sterically blocks access of PNPLA2 to the TAG nanodomain.

      (4) Another statement: "Our findings suggest that ABHD5-mediated membrane remodeling regulates lipolysis in part by regulating PNPLA2 access to its TAG substrate."

      I don't see how the reported results support this statement (see point 3 above).

      Comments and questions on (B):

      (1) The authors state that the GaMD simulations started "from varying conformations observed during CGMD".

      What is missing is a clear description of the CGMD simulation conformations, and the CG simulations as a whole, prior to the results section on GaMD. The authors use standard secondary and tertiary constraints in the Martini CG simulations. Do the authors observe some (constrained) conformational changes of ABHD5 already in the CG simulations (depending on the strength of the constraints)? Or do the conformational changes occur exclusively in the GaMD simulations? Both are fine, but this needs to be described.

      (2) The authors write: "Three replicas of GaMD were performed."

      Do these replicas lead to similar, or statistically identical, membrane-bound ABHD5 conformations? Is this information, i.e. a statistical analysis of differences in the replica runs, already included in the manuscript?

      (3) The authors state on the hydrogen exchange results:

      "HDX-MS provided orthogonal experimental evidence for the dynamics of the lid. In solution, a peptide (residues 200-226) spanning the lid helix displayed a bimodal isotopic distribution (Fig. S4), indicating the coexistence of different conformations. Upon LD binding, this distribution shifted to a single, low-exchange peak, demonstrating stabilization of the membrane-bound conformation with reduced solvent accessibility. These experimental observations corroborate our MD simulations."

      I find this far too short to be understandable. Also, there are no computational results of ABHD5 in solution that show a bimodal conformational distribution of the lid helix, which is observed in the hydrogen exchange experiments. Which aspects of the MD simulations are corroborated?

    1. Reviewer #1 (Public review):

      Summary:

      The goal of the study was to address the question of the degree to which social position in a group is a stable trait that persists across conditions. Reinwald et al. use a custom-built cage system with automated tracking and continuous testing for social dominance that does not require intervention by the experimenter. Remixing of individuals from different groups revealed that social position was rather stable and not really predictable from other measures that were taken. The authors conclude that social position is multifaceted but dependent on characteristics like personality traits.

      Strengths:

      (1) Reductionistic, highly controlled setting that allows for the control of many confounding variables.

      (2) Very interesting and important question.

      (3) Confirms the emergence of inter-individual behavior-driven differences in inbred mice in a shared environment.

      (4) Innovative paradigm and experimental setup.

      (5) Fresh perspective on an old question that makes the best use of modern technology.

      (6) Intelligent use of behavioral and cognitive covariables to generate a non-social context.

      (7) Bold and almost provocative conclusion, inviting discussion and further elaboration.

      Weaknesses:

      (1) Reductionistic, highly controlled setting that blends out much of the complexity of social behavior in a community.

      (2) The motivation to enter the test tube is not a "trait" (or at least not solely a trait) but the basic need to reach food and water; chasing behavior would be less dependent on this stimulus.

      (3) Dominance is only one aspect of sociality; social structure is reduced to rank. The information that might lie in the chasing behavior is not optimally used to explain social behavior beyond the rank measure.

      (4) Focus on rank bears the risk of overgeneralization for readers not familiar with the context.

      (5) Conclusion only valid for the reductionistic setting, in which the social and non-social environment changes only within narrow limits, and in which the mouse population does not face challenges.

      (6) Animals are not naive at the beginning of the experiment, but are already several weeks old.

      In summary, this is a wonderful study, but not one that is easy to interpret. The bold conclusion is valid only within the constraints of the study, but nevertheless points in an important direction. The paradigm is clever and could be used for many interesting follow-ups.

      To define social position as a personality trait will elicit strong opposition and much debate; the nuances of the paper might be lost on many readers and call for the (re)-consideration of many concepts that are touched upon. I find this attitude a strength of the paper, but the approach bears the risk of misunderstanding.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript presents the "NoSeMaze", a novel automated platform for studying social behavior and cognitive performance in group-housed male mice. The authors report that mice form robust, transitive dominance hierarchies in this environment and that individual social rank remains largely stable across multiple group compositions. They further demonstrate that social dominance and aggressive behaviors, like chasing, are partially dissociable and that dominance traits are independent of non-social cognitive performance. The study includes a genetic manipulation of oxytocin receptor expression in the anterior olfactory nucleus, which showed only transient effects on social rank.

      Strengths:

      (1) Innovative Methodology: The NoSeMaze platform is a technically elegant and conceptually well-integrated system that enables fully automated, long-term monitoring of both social and cognitive behaviors in large groups of group-housed mice. It combines tube-test-like dominance contests, voluntary chase-escape interactions, and an embedded operant olfactory discrimination task within a single, ethologically relevant environment. This modular design allows for high-throughput, minimally invasive behavioral assessment without the need for repeated handling or artificial isolation.

      (2) Experimental Scale and Rigor: The study includes 79 male mice and over 4,000 mouse-days of observation across multiple group reshufflings. The use of RFID-based identification, automated data logging, and longitudinal design enables robust quantification of individual trait stability and group-level social structure.

      (3) Multidimensional Behavioral Profiling: The integration of social (tube dominance, proactive chasing), physical (body weight), and cognitive (olfactory learning task) measures offers a rich, multi-dimensional profile of each individual mouse. The authors' finding that social dominance traits and non-social cognitive performance are largely uncorrelated reinforces emerging models of orthogonal behavioral trait axes or "animal personalities".

      (4) Clarity and Data Analysis: The analytical framework is well-suited to the study's complexity, with appropriate use of dominance metrics, mixed-effects models, and permutation tests. The analyses are clearly explained, statistically rigorous, and supported by transparent supplementary materials.

      Weaknesses:

      (1) Conceptual Novelty and Prior Work: While the study is carefully executed and methodologically innovative, several of its core findings reaffirm concepts already established in the literature. The emergence of stable, transitive social hierarchies, the persistence of individual differences in social behavior, and the presence of non-despotic social structures have all been previously reported in mice, including under semi-naturalistic conditions (e.g., Fan et al., 2019; Forkosh et al., 2019). Although this work extends those findings with greater behavioral resolution and scale, the manuscript would benefit from a clearer articulation of what is genuinely novel at the conceptual level, beyond the technological advance.

      (2) Role of OXTR Deletion: The inclusion of the OXTR manipulation feels somewhat disconnected from the manuscript's central aims. The effects were minimal and transient, and the authors defer full interpretation to a separate study.

      (3) Scope Limitations (Sex and Age): The study is limited to male mice, and although this is acknowledged, the title and overall framing imply broader generalizability. This sex-specific focus represents a common but problematic bias. Additionally, results from the older mouse cohort are under-discussed; if age had no effect, this should be explicitly stated.

      (4) Ambiguity of Dominance as a Construct: While the study robustly quantifies social rank and hierarchy structure, the broader functional meaning of "dominance" remains unclear. As in prior work (e.g., Varholick et al., 2019), dominance rank here shows only weak associations with physical attributes (e.g., body weight), cognitive strategy, or neuromodulatory manipulation (OXTR deletion). This recurring pattern, where rank metrics are reliably established yet poorly predictive of other behavioral or biological traits, raises important questions about what such measures actually capture. In particular, it challenges the assumption that outcomes in paradigms like the tube test or chase frequency necessarily reflect dominance per se, rather than other constructs.

    3. Reviewer #3 (Public review):

      Reinwald et al. present the NoSeMaze, a semi-natural behavioral system designed to track social behaviors alongside reinforcement-learning in large groups of mice. Accumulating more than 4,000 days of behavioral monitoring, the authors demonstrate that social rank (determined by tube competitions) is a stable trait across shuffled cohorts and correlated with active chasing behaviors. The system also provides a solid platform for long-term measurements of reinforcement learning, including flexibility, response adaptation, and impulsiveness. Yet, the authors show that social ranking and chasing are mostly independent of these cognitive traits, and both seem mostly independent of oxytocin signaling in the AON.

      Strengths:

      (1) The neuroethological approach for automated tracking of several mice under semi-natural conditions is still rare in social behavioral research and should be encouraged.

      (2) The assessment of dominance by two independent measures, i.e., spontaneous tube competitions and proactive chasing, is innovative and valuable.

      (3) The integration of a long-term reinforcement-learning module into the semi-natural system provides novel opportunities to combine cognitive traits into social personality assessments.

      (4) The open-source system provides a valuable resource for the scientific community.

      Limitations:

      (1) Apparent ambiguity and inconsistency in age structure and cohort participation across rounds, raising concerns about uncontrolled confounds.

      (2) Chasing behavior appears more stable than tube-test competitions (Figure 4D vs. Figure 3D), which challenges the authors' decision to treat tube competitions as the primary basis for hierarchy determination.

      Major concerns:

      (1) Unclear and inconsistent handling of age groups and repeated sampling. The manuscript repeatedly refers to "younger" and "older" adults, but it is unclear whether age was ever controlled for or included in models. Some mice completed only one round, others 2-5 rounds, without explanation of the criteria or balancing.

      (2) Stability of chasing appears stronger than the stability of tube competitions. Figure 4D shows highly consistent chasing behavior across weeks, while Figure 3D shows weaker and more variable correlations for tube-based David scores. This is also evident from Figure 5A-B,D. Thus, it appears that chasing, which serves to quantify dominance in similar semi-natural setups, may be a more reliable and behaviorally meaningful measure of dominance than the incidental tube competitions.

      (3) Unbalanced participation across rounds compromises stability analyses. Stability analyses (e.g., ICCs, round-to-round correlations) assume comparable sampling across individuals. However, some mice contribute 1 round, others 2, 3, 4, and even 5 rounds. This imbalance may inflate stability estimates or confound group reshuffling effects, and the rationale for variable participation is not explained.
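      For readers unfamiliar with the David's score referred to in concern (2), the following is a minimal sketch of the unnormalised form of the score computed from a pairwise win matrix. The win counts are hypothetical, and the manuscript may use a corrected or normalised variant, so this illustrates the general idea rather than the authors' exact pipeline.

```python
import numpy as np

def david_scores(wins):
    """Unnormalised David's score; wins[i, j] = contests animal i won against j."""
    wins = np.asarray(wins, dtype=float)
    totals = wins + wins.T  # number of contests in each dyad
    # Proportion of contests won per dyad; dyads that never met contribute 0.
    p = np.divide(wins, totals, out=np.zeros_like(wins), where=totals > 0)
    np.fill_diagonal(p, 0.0)
    w = p.sum(axis=1)   # summed win proportions of each animal
    l = p.sum(axis=0)   # summed loss proportions of each animal
    w2 = p @ w          # wins weighted by the opponents' own w
    l2 = p.T @ l        # losses weighted by the opponents' own l
    return w + w2 - l - l2

# Hypothetical win counts for three animals (not data from the study).
example_wins = np.array([
    [0, 8, 9],
    [2, 0, 7],
    [1, 3, 0],
])
scores = david_scores(example_wins)
print(scores)
```

      Because the score depends on how many dyads an animal took part in, animals observed in different numbers of rounds are not directly comparable without some form of normalisation, which is one reason the sampling imbalance in concern (3) matters.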

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript uses serological data to quantify the effects of imprinting on subsequent influenza antibody responses. While this is an admirable goal, the HI dataset sounds impressive, and the authors developed a number of models, the manuscript came off as very dense and technical. One of the biggest pitfalls is that it is not easy to understand the lessons learned. The two Results section headers make clear statements - there was an imprinting signal in the HI titers, but much of this signal could also be seen in an imprinting-free simulation - and then the Discussion states a number of limitations. This is fine, but it leaves the reader wondering exactly how large an error would be introduced by ignoring imprinting effects altogether; alternatively, if imprinting is purposefully added, what would the expected effect size be? The comments below will provide some concrete steps to help clarify these points.

      Major comments:

      (1) Lines 107-133: The first Results section is a dense slog of information, and the reader is never given a good overview of what the imprinting coefficients exactly are. As the paper currently stands, if you do not start by reading the Methods, you will take away very little. I suggest adding a schematic for any of your models, showing what HI titers would be expected with/without imprinting effects, or age effects, or both, to tie in your modeling coefficients with quantities that all readers are familiar with.

      (1.1) Clarify what the imprinting coefficient (y-axis in Figure 1A) looks like in this schematic.

      (1.2) Another aspect that I missed: In addition to stating which models were best by BIC, what is the absolute effect size in the HI titers? During my initial reading, I had hoped that Figure 3 would answer this question, but it turned out to be just an overview of the dataset. I strongly suggest having such a figure to show the imprinting effect inferred by different models. What would the expected effect be if you kept someone's birth year constant but tuned their age? What if you kept their age at collection constant but tuned their birth year?

      (1.3) It would also help to explain in your schematic what the x-axis labels (H1, H2, H1/H3) would look like in these scenarios, and what imprinting relative to H3 means.

      (2) As mentioned above, it was hard to understand the takeaway messages, such as:

      (2.1) A similar question would be: If you model antibody titers without imprinting, how far off would you be from the actual measurements (2x off, 4x off...)? If you add the imprinting effect, how much closer do you get?

      (2.2) Are there specific age groups that require imprinting to be taken into consideration, since otherwise HI analyses will be markedly off?

      (2.3) Are there age groups where imprinting can be safely ignored?

      (3) HI titers against multiple H1 and H3 variants were measured, but it is unclear how these are used, and why titers against a single variant each season would not have worked equally well.
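      To make the request in (2.1) concrete, the following sketch shows one way a "how many fold off" summary could be reported. It assumes errors are measured on the log2 titer scale, and every titer value is hypothetical; nothing here is taken from the manuscript's data or models.

```python
import numpy as np

def fold_error(observed_titers, predicted_titers):
    """Fold error: 2 raised to the mean absolute log2 difference."""
    obs = np.log2(np.asarray(observed_titers, dtype=float))
    pred = np.log2(np.asarray(predicted_titers, dtype=float))
    return 2.0 ** np.mean(np.abs(obs - pred))

# Hypothetical titers for four individuals (illustration only).
observed = [40, 80, 160, 320]
pred_without_imprinting = [160, 160, 160, 160]
pred_with_imprinting = [40, 160, 160, 320]

err_without = fold_error(observed, pred_without_imprinting)
err_with = fold_error(observed, pred_with_imprinting)
print(f"without imprinting: {err_without:.2f}-fold off")
print(f"with imprinting:    {err_with:.2f}-fold off")
```

      Reporting such a number per age group would also address (2.2) and (2.3) directly.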

    1. Reviewer #3 (Public review):

      Summary:

      The manuscript introduces a visual paradigm aimed at studying trans-saccadic memory.

      The authors observe how memory of object location is selectively impaired across eye movements, whereas object colour memory is relatively immune to intervening eye movements. Results are reported for young and elderly healthy controls, as well as PD and AD participants.

      A computational model is introduced to account for these results, indicating how early differences in memory encoding and decay (but not trans-saccadic updating per se) can account for the observed differences between healthy controls and clinical groups.

      In the revised manuscript, the authors have addressed most of my initial concerns. The dataset is generally compelling, as it includes healthy younger and older adults as well as clinical populations. In addition, the authors propose an interesting modelling approach designed to isolate and characterize the key components underlying the observed patterns of results.

      It is important to acknowledge potential limitations of the modelling approach, particularly the differences in the number of parameters across the tested models. As models with more parameters typically achieve better fit, this issue warrants careful consideration. The authors have substantially addressed this point in their rebuttal.

      Concerns regarding the specificity of the findings were also raised and have been adequately discussed in the authors' response. Specifically, they clarified the selective impact of saccade-related costs on spatial working memory updating across eye movements, without affecting feature-based memory (e.g., color), as well as the specificity of the updating effects observed with the Rey-Osterrieth Complex Figure.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Thank you so much for your comprehensive and insightful assessment of our manuscript. We appreciate your recognition of the novelty of our experimental design and the utility of our computational framework for interpreting visual remapping across the lifespan and in clinical populations. We are very grateful for your suggestions regarding the narrative flow, which have helped us to improve the manuscript's focus and coherence. Our responses to your specific concerns are detailed below.

      (1) Relevance of the figure-copy results (pp. 13-15). Is it necessary to include the figure-copy task results within the main text? The manuscript already presents a clear and coherent narrative without this section. The figure-copy task represents a substantial shift from the LOCUS paradigm to an entirely different task that does not measure the same construct. Moreover, the ROCF findings are not fully consistent with the LOCUS results, which introduces confusion and weakens the manuscript's coherence. While I understand the authors' intention to assess the ecological validity of their model, this section does not effectively strengthen the manuscript and may be better removed or placed in the Supplementary Materials.

      We thank the reviewer for their perspective regarding the narrative flow and the transition between the LOCUS paradigm and the ROCF results. However, we remain keen to retain these findings in the main text, as they provide critical ecological and clinical validation for the computational mechanisms identified in our study.

      We think these results strengthen the manuscript for the following main reasons:

      (1) The ROCF we used is a standard neuropsychological tool for identifying constructional apraxia. Our results bridge the gap between basic cognitive neuroscience and clinical application by demonstrating that specific remapping parameters—rather than general memory precision—predict real-world deficits in patients.

      (2) The finding that our winning model explains approximately 62% of the variance in ROCF copy scores across all diagnostic groups further indicates that these parameters from the LOCUS task represent core computational phenotypes that underpin complex, real-life visuospatial construction (copying drawings).

      (3) Previous research has often observed only a weak or indirect link between drawing ability and traditional working memory measures, such as digit span (Senese et al., 2020). This was previously attributed to “deictic” strategies—like frequent eye and hand movements—that minimise the need to hold large amounts of information in memory (Ballard et al., 1995; Cohen, 2005; Draschkow et al., 2021). While our study was not exclusively designed to catalogue all cognitive contributions to drawing, the findings provide significant and novel evidence indicating that transsaccadic integration is a critical driver of constructional (copying drawing) ability. By demonstrating this link, the results provide evidence to stimulate a new direction for future research, shifting the focus from general memory capacity toward the precision of spatial updating across eye movements.

      In summary, by including the ROCF results in the main text, we provide evidence for a functional role for spatial remapping that extends beyond perceptual stability into the domain of complex visuomotor control. We have expanded on these points throughout the revised manuscript:

      In the Introduction: p.2:

      “The clinical relevance of these spatial mechanisms is underscored by significant disruptions to visuospatial processing and constructional apraxia—a deficit in copying and drawing figures—observed in neurodegenerative conditions such as Alzheimer's disease (AD) and Parkinson's disease (PD).[20,21] This raises a crucial question: do clinical impairments in complex visuomotor tasks stem from specific failures in transsaccadic remapping? If so, the computational parameters that define normal spatial updating should also provide a mechanistic account of these clinical deficits, differentiating them from general age-related decline.”

      p.3: "Finally, by linking these mechanistic parameters to a standard clinical measure of constructional ability (the Rey-Osterrieth Complex Figure task), we demonstrate that transsaccadic updating represents a core computational phenotype underpinning real-world visuospatial construction in both health and neurodegeneration."

      In the Results:

      “To assess whether the mechanistic parameters derived from the LOCUS task represent core phenotypes of real-world visuospatial abilities, we also instructed all participants to complete the Rey-Osterrieth Complex Figure copy task (ROCF; Figure 7A) on an Android tablet using a digital pen (see examples in Figure 7B; all Copy data are available in the open dataset: https://osf.io/95ecp/). The ROCF is a gold-standard neuropsychological tool for identifying constructional apraxia.[29] Historically, drawing performance has shown only weak or indirect correlations with traditional working memory measures.[30] This disconnect has been attributed to active visual-sampling strategies—frequent eye movements that treat the environment as an external memory buffer, minimising the necessity of holding large volumes of information in internal working memory.[3–5]

      We hypothesised that drawing accuracy is primarily constrained by the precision of spatial updating across frequent saccades rather than raw memory capacity. To evaluate the ecological validity of the identified saccade-updating mechanism, we modelled individual ROCF copy scores across all four groups using the estimated (maximum a posteriori) parameters from the winning “Dual (Saccade) + Interference” model (Model 7; Figure 8) as regressors in a Bayesian linear model. Prior to inclusion, each regressor was normalised by dividing by the square root of its variance.

      This model successfully explained 61.99% of the variance in ROCF copy scores, indicating that these computational parameters are strong predictors of real-world constructional ability (Figure 8A). … This highlights the critical role of accurate remapping based on saccadic information; even if the core saccadic update mechanism is preserved across groups (as shown in previous analyses), the precision of this updating process is crucial for complex visuospatial tasks. Moreover, worse ROCF copy performance is associated particularly with higher initial angular encoding error. This indicates that imprecision in the initial registration of angular spatial information contributes to difficulties in accurately reproducing complex visual stimuli.”
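      As an illustration of the two steps described in the passage above (scaling each regressor by the square root of its variance, then reporting variance explained), here is a minimal sketch. It swaps in ordinary least squares for the Bayesian linear model used in the manuscript, and the data, group size, and weights are all simulated placeholders rather than values from the study.

```python
import numpy as np

# Simulated stand-ins for model parameters and copy scores (not study data).
rng = np.random.default_rng(0)
n_participants, n_params = 80, 4  # hypothetical sizes
params = rng.normal(size=(n_participants, n_params)) * np.array([0.5, 2.0, 1.0, 3.0])
true_weights = np.array([-1.0, -0.5, 0.0, 0.2])  # made-up effect sizes
scores = params @ true_weights + rng.normal(scale=1.0, size=n_participants)

def normalise(x):
    """Divide each column by the square root of its (sample) variance."""
    return x / np.sqrt(x.var(axis=0, ddof=1))

def r_squared(x, y):
    """Proportion of variance in y explained by an OLS fit on x plus intercept."""
    design = np.column_stack([np.ones(len(y)), x])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

normalised = normalise(params)
r2_raw = r_squared(params, scores)
r2_norm = r_squared(normalised, scores)
print(f"R^2 (raw regressors):        {r2_raw:.3f}")
print(f"R^2 (normalised regressors): {r2_norm:.3f}")
```

      In this least-squares setting the scaling leaves the variance explained unchanged and only puts the regression weights on a common scale; in a Bayesian model with priors on the weights, the scaling can additionally affect the fit.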

      In the Discussion:

      “Importantly, our computational framework establishes a direct mechanistic link between transsaccadic updating and real-world constructional ability. Specifically, higher saccade and angular encoding errors contribute to poorer ROCF copy scores. By mapping these mechanistic estimates onto clinical scores, we found that the parameters derived from our winning model explain approximately 62% of the variance in constructional performance across groups. These findings suggest that the computational parameters identified in the LOCUS task represent core phenotypes of visuospatial ability, providing a mechanistic bridge between basic cognitive theory and clinical presentation.

      This relationship provides novel insights into the cognitive processes underlying drawing, specifically highlighting the role of transsaccadic working memory. Previous research has primarily focused on the roles of fine motor control and eye-hand coordination in this skill.[4,50–55] This is partly because of a consistent failure to find a strong relation between traditional memory measures and copying ability.[4,31] For instance, common measures of working memory, such as digit span and Corsi block tasks, do not directly predict ROCF copying performance.[31,56] Furthermore, in patients with constructional apraxia, these memory performance measures often remain relatively preserved despite significant drawing impairments.[56–58] In the literature, this lack of association has often been attributed to “deictic” visual-sampling strategies, characterised by frequent eye movements that treat the environment as an external memory buffer, thereby minimising the need to maintain a detailed internal representation.[4,59] In a real-world copying task, the ROCF requires a high volume of saccades, making it uniquely sensitive to the precision of the dynamic remapping signals identified here. Recent eye-tracking evidence confirms that patients with AD exhibit significantly more saccades and longer fixations during figure copying compared to controls, potentially as a compensatory response to transsaccadic working memory constraints.[56] This high-frequency sampling, averaging between 150 and 260 saccades for AD patients compared to approximately 100 for healthy controls, renders the task highly dependent on the precision of dynamic remapping signals.[56] To ensure this relationship was not driven by a general "g-factor" or non-spatial memory impairment, we further investigated the role of broader cognitive performance using the ACE-III Memory subscale. We found that the relationship between transsaccadic working memory and ROCF performance remains highly significant, even after controlling for age, education, and ACE-III Memory subscore. This suggests that transsaccadic updating may represent a discrete computational phenotype required for visuomotor control, rather than a non-specific proxy for global cognitive decline.

      In other words, even when visual information is readily available in the world, the act of copying depends critically on working memory across saccades. This reveals a fundamental computational trade-off: while active sampling strategies (characterised by frequent eye-hand movements) effectively reduce the load on capacity-limited working memory, they simultaneously increase the demand for precise spatial updating across eye movements. By treating the external world as an "outside" memory buffer, the brain minimises the volume of information it must hold internally, but it becomes entirely dependent on the reliability with which that information is remapped after each eye movement. This perspective aligns with, rather than contradicts, the traditional view of active sampling, which posits that individuals adapt their gaze and memory strategies based on specific task demands.[3,60] Furthermore, this perspective provides a mechanistic framework for understanding constructional apraxia; in these clinical populations, the impairment may not lie in a reduced memory "span," but rather in the cumulative noise introduced by the constant spatial remapping required during the copying process.[58,61]

      Beyond constructional ability, these findings suggest that the primary evolutionary utility of high-resolution spatial remapping lies in the service of action rather than perception. While spatial remapping is often invoked to explain perceptual stability,[11–13,15] the necessity of high-resolution transsaccadic memory for basic visual perception is debated.[13,62–64] A prevailing view suggests that detailed internal models are unnecessary for perception, given the continuous availability of visual information in the external world.[13,44] Our findings support an alternative perspective, aligning with the proposal that high-resolution transsaccadic memory primarily serves action rather than perception.[13] This is consistent with the need for precise localisation in eye-hand coordination tasks such as pointing or grasping.[65] Even when unaware of intrasaccadic target displacements, individuals rapidly adjust their reaching movements, suggesting direct access of the motor system to remapping signals.[66] Further support comes from evidence that pointing to remembered locations is biased by changes in eye position,[67] and that remapping neurons reside within the dorsal “action” visual pathway, rather than the ventral “perception” visual pathway.[13,68,69] By demonstrating a strong link between transsaccadic working memory and drawing (a complex fine motor skill), our findings suggest that precise visual working memory across eye movements plays an important role in complex fine motor control.”

      (2) Model fitting across age groups (p. 9).

      It is unclear whether it is appropriate to fit healthy young and healthy elderly participants' data to the same model simultaneously. If the goal of the model fitting is to account for behavioral performance across all conditions, combining these groups may be problematic, as the groups differ significantly in overall performance despite showing similar remapping costs. This suggests that model performance might differ meaningfully between age groups. For example, in Figure 4A, participants 22-42 (presumably the elderly group) show the best fit for the Dual (Saccade) model, implying that the Interference component may contribute less to explaining elderly performance.

      Furthermore, although the most complex model emerges as the best-fitting model, the manuscript should explain how model complexity is penalized or balanced in the model comparison procedure. Additionally, are Fixation Decay and Saccade Update necessarily alternative mechanisms? Could both contribute simultaneously to spatial memory representation? A model that includes both mechanisms (e.g., Dual (Fixation) + Dual (Saccade) + Interference) could be tested to determine whether it outperforms Model 7 to rule out the sole contribution of complexity.

      We thank you for the opportunity to expand upon and clarify our modelling approach. Our decision to use a common generative model for both young and older adults was grounded in the empirical finding that there was no significant interaction between age group and saccade condition for either location or colour memory. While older adults demonstrated lower baseline precision, the specific "saccade cost" remained remarkably consistent across cohorts. On this basis, we used a common model to assess quantitative differences in parameter estimates while maintaining a consistent mechanistic framework for comparison.

      Moreover, our winning model nests simpler models as special cases, providing the flexibility to naturally accommodate groups where certain components—such as interference—might play a reduced role. This ultimately confirms that the mechanisms for age-related memory deficits in this task reflect more general decline rather than a qualitative failure of the saccadic remapping process.

      This approach is further supported by the properties of the Bayesian model selection (BMS) procedure we used, which inherently penalises the inclusion of unnecessary parameters. Unlike maximum likelihood methods, BMS compares marginal likelihoods, representing the evidence for a model integrated over its entire parameter space. This follows the principle of Bayesian Occam’s Razor, where a model is only favoured if the improvement in fit justifies the additional parameter space; redundant parameters instead "dilute" the probability mass and lower the model evidence.
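The Occam's razor argument above can be illustrated with a toy comparison that is unrelated to the models in the manuscript. The sketch below assumes a binary outcome sequence and two hypothetical models: one with no free parameter (success probability fixed at 0.5) and one with a free success probability under a uniform prior. All function names and data counts are illustrative assumptions.

```python
from math import exp, lgamma, log


def log_evidence_fixed(k: int, n: int, p: float = 0.5) -> float:
    """Log marginal likelihood of one specific sequence with k successes in
    n trials when the success probability is fixed (no free parameter)."""
    return k * log(p) + (n - k) * log(1.0 - p)


def log_evidence_uniform(k: int, n: int) -> float:
    """Log marginal likelihood of the same sequence when the success
    probability is free with a uniform prior on [0, 1].
    The integral of p^k (1 - p)^(n - k) is the Beta function B(k+1, n-k+1)."""
    return lgamma(k + 1) + lgamma(n - k + 1) - lgamma(n + 2)


def bayes_factor_flexible_vs_fixed(k: int, n: int) -> float:
    """Evidence ratio: flexible model over fixed model."""
    return exp(log_evidence_uniform(k, n) - log_evidence_fixed(k, n))


# Balanced data: the extra parameter buys no better fit, so it is penalised.
bf_balanced = bayes_factor_flexible_vs_fixed(10, 20)
# Skewed data: the improvement in fit outweighs the complexity penalty.
bf_skewed = bayes_factor_flexible_vs_fixed(18, 20)
print(f"balanced: {bf_balanced:.3f}, skewed: {bf_skewed:.1f}")
```

With balanced data the ratio falls below 1 even though the flexible model can fit at least as well at its best parameter value, which is the "dilution" effect described in the response. With skewed data the ratio exceeds 1.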

      Consequently, we contend that a hybrid model combining fixation and saccade mechanisms is unnecessary, as we have already adjudicated between alternative mechanisms of equal complexity. Specifically, Model 6 (Dual Fixation + Interference) and Model 7 (Dual Saccade + Interference) possess an identical number of parameters. The fact that Model 7 emerged as the clear winner—providing substantial evidence against Model 6 with a Bayes Factor of 6.11—demonstrates that our model selection is driven by the specific mechanistic account of the data rather than a simple preference for complexity.
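As a rough reading aid, a Bayes factor can be converted into a posterior model probability. The sketch below assumes a plain two-model comparison with equal prior probabilities; the random-effects procedure used in the manuscript may define its quantities differently, so the function name and the resulting number are illustrative only.

```python
def posterior_probability(bayes_factor: float, prior_odds: float = 1.0) -> float:
    """Posterior probability of the favoured model in a two-model comparison.

    Assumes posterior odds = Bayes factor * prior odds (equal priors by default).
    """
    posterior_odds = bayes_factor * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


# Bayes factor quoted in the response for Model 7 over Model 6.
p_model7 = posterior_probability(6.11)
print(f"{p_model7:.3f}")
```

Under these assumptions a Bayes factor of 6.11 corresponds to a posterior probability of about 0.86 for the favoured model.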

      We have revised the Results and Discussion sections of the manuscript to state these points more explicitly for readers and have included references to established literature regarding the robustness of marginal likelihoods in guarding against overfitting.

      In the Results,

      “By fitting these models to the trial-by-trial response data from all healthy participants (N=42), we adjudicated between competing mechanisms to determine which best explained participant performance (Figure 4). We used random-effects Bayesian model selection to identify the most plausible generative model. This process relies on the marginal likelihood (model evidence), which inherently balances model fit against complexity—a principle often referred to as Occam’s razor.[25–27] The analysis yielded a strong result: the “Dual (Saccade) + Interference” model (Model 7 in Table 1) emerged as the winning model, providing substantial evidence against the next best alternative with a Bayes Factor of 6.11.”

      In the Discussion:

      “Our framework employs Variational Laplace, a method used to recover computational phenotypes in clinical populations like those with substance use disorders,[34,35] and the models we fit using this procedure feature time-dependent parameterisation of variance—conceptually similar to the widely-used Hierarchical Gaussian Filter.[36–39] Importantly, the risk of overfitting is mitigated by the Bayesian Model Selection framework; by utilising the marginal likelihood for model comparison, the procedure inherently penalises excessive model complexity and promotes generalisability.[25–27,40] This generalisability was further evidenced by the model's ability to predict performance on the independent ROCF task, confirming that these parameters represent robust mechanistic phenotypes rather than idiosyncratic fits to the initial dataset.”

      Minor point: On p. 9, line 336, Figure 4A does not appear to include the red dashed vertical line that is mentioned as separating the age groups.

      Thank you for pointing out this inconsistency. We apologise for the oversight; upon further review, we concluded that the red dashed vertical line was unnecessary for the clear presentation of the data. We have therefore removed the line from Figure 4A and deleted the corresponding sentence in the figure caption.

      (3) Clarification of conceptual terminology.

      Some conceptual distinctions are unclear. For example, the relationship between "retinal memory" and "transsaccadic memory," as well as between "allocentric map" and "retinotopic representation," is not fully explained. Are these constructs related or distinct? Additionally, the manuscript uses terms such as "allocentric map," "retinotopic representation," and "reference frame" interchangeably, which creates ambiguity. It would be helpful for the authors to clarify the relationships among these terms and apply them consistently.

      Thank you for pointing this out. We have revised the manuscript to ensure that these terms are applied with greater precision and consistency. Our revisions standardise the terminology based on the following distinctions:

      Reference frames: We distinguish between the eye-centred reference frame (coordinate systems that shift with gaze) and the world-centred reference frame (coordinate systems anchored to the environment).

      Retinotopic representation vs. allocentric map: We clarify that retinotopic representations are encoded within an eye-centred reference frame and are updated with every ocular movement. Conversely, the allocentric map is anchored to stable environmental features, remaining invariant to the observer’s gaze direction or position.

      Retinotopic memory vs. transsaccadic memory: We have removed the term "retinal memory" to avoid ambiguity. We now consistently use retinotopic memory to describe the persistence of visual information in eye-centred coordinates within a single fixation. In contrast, transsaccadic memory refers to the higher-level integration of visual information across saccades, which involves the active updating or remapping of representations to maintain stability.
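The distinctions above can be made concrete with a small coordinate sketch. It assumes two-dimensional positions in degrees of visual angle and a noise-free saccade signal; the function names and numbers are hypothetical and are not taken from the manuscript's model.

```python
from typing import Tuple

Vec = Tuple[float, float]


def to_retinotopic(world: Vec, gaze: Vec) -> Vec:
    """Eye-centred coordinates: world position relative to current gaze."""
    return (world[0] - gaze[0], world[1] - gaze[1])


def remap(retinotopic: Vec, saccade: Vec) -> Vec:
    """Update an eye-centred memory after a saccade by vector subtraction."""
    return (retinotopic[0] - saccade[0], retinotopic[1] - saccade[1])


target_world = (5.0, 2.0)   # fixed in the world (allocentric) frame
gaze_before = (0.0, 0.0)
saccade = (3.0, -1.0)
gaze_after = (gaze_before[0] + saccade[0], gaze_before[1] + saccade[1])

stored = to_retinotopic(target_world, gaze_before)   # memory within a fixation
remapped = remap(stored, saccade)                    # transsaccadic update
direct = to_retinotopic(target_world, gaze_after)    # ground truth after the saccade
print(stored, remapped, direct)
```

The world-centred position never changes, whereas the eye-centred position must be updated after each saccade. With a perfect saccade signal the remapped memory matches the true post-saccadic position; any noise in the saccade vector would carry over into the remapped location.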

      To incorporate these clarifications, we have implemented the following changes:

      In the Introduction, the second paragraph has been entirely rewritten to establish these definitions at the outset, providing a clearer theoretical framework for the study.

      “Central to this enquiry is the nature of the coordinate system used for the brain's internal spatial representation. Does the brain maintain a single, world-centred (allocentric) map, or does it rely on a dynamic, eye-centred (retinotopic) representation?[11,13,15,16] In the latter system, retinotopic memory preserves spatial information within a fixation, whereas transsaccadic memory describes the active process of updating these representations across eye movements to achieve spatiotopic stability—the perception of a stable world despite eye movements.[11,16–18] If spatial stability is indeed reconstructed through such remapping, the mechanism remains unresolved: do we retain memories of absolute fixation locations, or do we reconstruct these positions from noisy memories of the intervening saccade vectors? We can test these hypotheses by analysing when and where memory errors occur. Assuming that memory precision declines over time,[19] the resulting error distributions should reveal the specific variables that are represented and updated across each saccade.”

      In the Results, the opening section of the Results has been reorganised to align with this terminology. We have ensured that the hypotheses and behavioural data—specifically the definition of "saccade cost"—are introduced using this consistent conceptual vocabulary to improve the overall coherence of the narrative.

      (4) Rationale for the selective disruption hypothesis (p. 4, lines 153-154). The authors hypothesize that "saccades would selectively disrupt location memory while leaving colour memory intact." Providing theoretical or empirical justification for this prediction would strengthen the argument.

      We have revised the Results to state the hypothesis more explicitly and expanded the Discussion to provide a robust theoretical and empirical rationale:

      In the Results,

      “This design allowed us to isolate and quantify the unique impact of saccades on spatial memory, enabling us to test competing hypotheses regarding spatial representation. If spatial memory were solely underpinned by an allocentric mechanism, precision should remain comparable across all conditions as the representation would be world-centred and unaffected by eye movements. Thus, performance in the no-saccade condition should be comparable to the two-saccade condition. Conversely, if spatial memory relies on a retinotopic representation requiring active updating across eye movements, the two-saccade condition was anticipated to be the most challenging due to cumulative decay in the memory traces used for stimulus reconstruction after each saccade.[22] Critically, we hypothesised that this saccade cost would be specific to the spatial domain; while location requires active remapping via noisy oculomotor signals, non-spatial features like colour are not inherently tied to coordinate transformations and should therefore remain stable (see more in Discussion below).

      Meanwhile, the no-saccade condition was expected to yield the most accurate localisation, relying solely on retinotopic information (retinotopic working memory). These predictions were confirmed in young healthy adults (N = 21, mean age = 24.1 years, ranged between 19 and 34). A repeated measures ANOVA revealed a significant main effect of saccades on location memory (F(2.2,43.9)=33.2, p<0.001, partial η²=0.62), indicating substantial impairment after eye movements (Figure 2A). In contrast, colour memory remained remarkably stable across all saccade conditions (Figure 2B; F(2.2, 44.7) = 0.68, p=0.53, partial η² =0.03).

      This “saccade cost”—the loss of memory precision following an eye movement—indicates that spatial representations require active updating across saccades rather than being maintained in a static, world-centred reference frame.

      Critically, our comparison between spatial and colour memory does not rely on the absolute magnitude of errors, which are measured in different units (degrees of visual angle vs. radians). Instead, we assessed the relative impact of the same saccadic demand on each feature within the same trial. While location recall showed a robust saccade cost, colour recall remained statistically unchanged. To ensure this null effect was not due to a lack of measurement sensitivity, we examined the recency effect; recall performance for the second item was predicted to be better than for the first stimulus in each condition.[23,24] As expected, colour memory for Item 2 was significantly more accurate than for Item 1 (F(1,20) = 6.52, p = 0.02, partial η² = 0.25), demonstrating that the task was sufficiently sensitive to detect standard working memory fluctuations despite the absence of a saccade-induced deficit.”

      In the Discussion (p. 18), we now write:

      “A clear finding was the specificity of the saccade cost to spatial features; it was not observed for non-spatial features like colour, even in neurodegenerative conditions. This discrepancy challenges notions of fixed visual working memory capacity unaffected by saccades.[16,44–46] The differential impact on spatial versus non-spatial features in transsaccadic memory aligns with the established "what" and "where" pathways in visual processing.[32,33] For objects to remain unified, object features must be bound to stable representations of location across saccades.[19] One possibility is that remapping updates both features and location through a shared mechanism, predicting equal saccadic interference for both colour and location in the present study.

      However, our findings suggest otherwise. One potential concern is whether this dissociation simply reflects the inherent spatial noise introduced by fixational eye movements (FEMs), such as microsaccades and drifts.[47] Because locations are stored in a retinotopic frame, fixational instability necessarily shifts retinal coordinates over time. However, the "saccade cost" here was defined as the error increase relative to a no-saccade baseline of equal duration; because both conditions are subject to the same fixational drift, any FEM-induced noise is effectively subtracted out. Thus, despite the ballistic and non-Gaussian nature of FEMs,[48] they cannot account for the presence of a saccade cost in spatial memory and its total absence in the colour domain. Another possibility is that this dissociation reflects differences in baseline task difficulty or dynamic range. Yet, the presence of a robust recency effect in colour memory (Figure 2B) confirms that our paradigm was sensitive to memory-dependent variance and was not limited by floor or ceiling effects.

      The fact that identical eye movements, executed simultaneously and with identical vectors, systematically degraded spatial precision while sparing colour suggests a feature-specific susceptibility to transsaccadic remapping. This supports the view that the computational process of updating an object’s location involves a vector-subtraction mechanism, incorporating noisy oculomotor commands (efference copies), that introduces specific spatial variance. Because this remapping is a coordinate transformation, the resulting sensorimotor noise does not functionally propagate to non-spatial feature representations. Consequently, features like colour may be preserved or automatically remapped without the precision loss associated with spatial updating.[11,49] Our paradigm thus provides a refined tool to investigate the architecture of transsaccadic working memory across distinct object features.”

      (5) Relationship between saccade cost and individual memory performance (p. 4, last paragraph).

      The authors report that larger saccades were associated with greater spatial memory disruption. It would be informative to examine whether individual differences in the magnitude of saccade cost correlate with participants' overall/baseline memory performance (e.g. their memory precision in the no-saccade condition). Such analyses might offer insights into how memory capacity/ability relates to resilience against saccade-induced updating.

      We have now conducted the correlation analysis to determine whether baseline memory capacity (no-saccade condition) predicts resilience to saccade-induced updating. The results indicate that these two factors are independent.

      To clarify the nature of the saccade-induced impairment, we have updated the text as follows:

      p.4: “This “saccade cost”—the loss of memory precision following an eye movement—indicates that spatial representations require active updating across saccades rather than being maintained in a static, world-centred reference frame.”

      p.5: “Further analysis examined whether individual differences in baseline memory precision (no-saccade condition) predicted resilience to saccadic disruption. Crucially, individual saccade costs (defined as the precision loss relative to baseline) did not correlate with baseline precision (rho = 0.20, p = 0.20). This suggests that the noise introduced by transsaccadic remapping acts as an independent, additive source of variance that is not modulated by an individual’s underlying memory capacity. These findings imply a functional dissociation between the mechanisms responsible for maintaining a representation and those involved in its coordinate transformation.”
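The correlation reported above can be sketched as a rank correlation between baseline error and saccade cost. The sketch assumes that the cost is the difference in error between a saccade condition and the no-saccade baseline, which may not match the manuscript's exact definition. The per-participant values are invented for illustration and the helper names are hypothetical.

```python
from math import sqrt
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Invented per-participant location errors (arbitrary units).
baseline_error = [10, 12, 9, 15, 11, 14]
two_saccade_error = [16, 17, 18, 19, 15, 21]
saccade_cost = [s - b for s, b in zip(two_saccade_error, baseline_error)]
rho = spearman_rho(baseline_error, saccade_cost)
print(f"rho = {rho:.2f}")
```

A rho near zero would be consistent with the additive, capacity-independent noise interpretation given in the text. The significance test is omitted here.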

      (6) Model fitting for the healthy elderly group to reveal memory-deficit factors (pp. 11-12). The manuscript discusses model-based insights into components that contribute to spatial memory deficits in AD and PD, but does not discuss components that contribute to spatial memory deficits in the healthy elderly group. Given that the EC group also shows impairments in certain parameters, explaining and discussing these outcomes of the EC group could provide additional insights into age-related memory decline, which would strengthen the study's broader conclusions.

      This is a very good point. We rewrote the corresponding results section (p.12-13):

      “Modelling reveals the sources of spatial memory deficits in healthy ageing and neurodegeneration - To understand the source of the observed deficits, we applied the winning ‘Dual (Saccade) + Interference’ model to the data from all participants (YC, EC, AD, and PD). By fitting the model to the entire dataset, we obtained estimates of the parameters for each individual, which then formed the basis for our group-level analysis. To formally test for group differences, we used Parametric Empirical Bayes (PEB), a hierarchical Bayesian approach that compares parameter estimates across groups while accounting for the uncertainty of each estimate [28]. This allowed us to identify which specific cognitive mechanisms, as formalised by the model parameters, were affected by age and disease.

      The Bayesian inversion used here allows us to quantify the posterior mode and variance for each parameter and the covariance between parameters. From these, we can compute the probabilities that pairs of parameters differ from one another, which we report as P(A>B)—meaning the posterior probability that the parameter for group A was greater than that for group B.

      We first examined the specific parameters differentiating healthy elderly (EC) from young controls (YC) to isolate the factors contributing to non-pathological, age-related decline. The analysis revealed that healthy ageing is primarily characterised by a significant increase in Radial Decay (P(EC > YC) = 0.995), a heightened susceptibility to Interference (P(EC > YC) = 1.000), and a reduction in initial Angular Encoding precision (P(YC < EC) = 0.002; Figure 6). These results suggest that normal ageing degrades the fidelity of the initial memory trace and its resilience over time, while the core computational process of updating information across saccades remains intact.

      Beyond these baseline ageing effects, our clinical cohorts exhibited more severe and condition-dependent impairments. Radial decay showed a clear, graded impairment: AD patients had a greater decay rate than PD patients (P(AD > PD) = 1.000), who in turn were more impaired than the EC group (P(PD > EC) = 0.996). A similar graded pattern was observed for Interference, where AD patients were most susceptible (P(AD > PD) = 0.999), while the PD and EC groups did not significantly differ (P(PD > EC) = 0.532).

      Patients with AD also showed a tendency towards greater angular decay than controls (P(AD > EC) = 0.772), although this fell below the 95% probability threshold. This effect was influenced by a lower decay rate in the PD group compared to the EC group (P(PD < EC) = 0.037). In contrast, group differences in encoding were less pronounced. While YC exhibited significantly higher precision than all other groups, AD patients showed significantly higher angular encoding error than PD patients (P(AD > PD) = 0.985), though neither group differed significantly from the EC group.

      Crucially, parameters related to the saccade itself—saccade encoding and saccade decay—did not differentiate the groups. This indicates that neither healthy ageing nor the early stages of AD and PD significantly impair the fundamental machinery for transsaccadic remapping. Instead, the visuospatial deficits in these conditions arise from specific mechanistic failures: a faster decay of radial position information and increased susceptibility to interference, both of which are present in healthy ageing but significantly amplified by neurodegeneration.”
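The P(A > B) values quoted in the passage above can be sketched as follows, assuming the posterior over the two group parameters is jointly Gaussian (as a Laplace approximation provides). Whether the PEB implementation computes the quantity in exactly this way is not stated in the text, and the function name and numbers below are illustrative.

```python
from math import erf, sqrt


def prob_greater(mean_a: float, var_a: float,
                 mean_b: float, var_b: float,
                 cov_ab: float = 0.0) -> float:
    """Posterior probability that parameter A exceeds parameter B.

    Under a joint Gaussian posterior, A - B is Gaussian with
    mean (mean_a - mean_b) and variance (var_a + var_b - 2 * cov_ab).
    """
    diff_var = var_a + var_b - 2.0 * cov_ab
    if diff_var <= 0.0:
        raise ValueError("variance of the difference must be positive")
    z = (mean_a - mean_b) / sqrt(diff_var)
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))  # standard normal CDF at z


# Invented posterior summaries for two groups.
p_equal = prob_greater(1.0, 0.2, 1.0, 0.3)             # identical means
p_high = prob_greater(2.0, 0.25, 1.0, 0.25)            # A clearly larger
p_corr = prob_greater(2.0, 0.25, 1.0, 0.25, cov_ab=0.2)  # positive covariance
print(p_equal, round(p_high, 4), round(p_corr, 4))
```

Positive posterior covariance between the two estimates shrinks the variance of their difference, so the same difference in means yields a probability further from 0.5. This is why the covariance, and not only the variances, enters the calculation.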

      In the Discussion, we added:

      “Although saccade updating was an essential component of the winning model, its two key parameters (initial encoding error and decay rate during maintenance) did not significantly differ across groups. This indicates that the core computational process of updating spatial information based on eye movements is largely preserved in healthy ageing and neurodegeneration.

      Instead, group differences were driven by deficits in angular encoding error (precision of initial angle from fixation), angular decay, radial decay (decay in memory of distance from fixation), and interference susceptibility. This implies a functional and neuroanatomical dissociation: while the ventral stream (the “what” pathway) shows an age-related decline in the quality and stability of stored representations, the dorsal-stream (the “where” pathway) parietal-frontal circuits responsible for coordinate transformations remain functionally robust.[31–34] These spatial updating mechanisms appear resilient to the normal ageing trajectory and only break down when challenged by the specific pathological processes seen in Alzheimer’s or Parkinson’s disease.”

      (7) Presentation of saccade conditions in Figure 5 (p. 11). In Figure 5, it may be clearer to group the four saccade conditions together within each patient group. Since the main point is that saccadic interference on spatial memory remains robust across patient groups, grouping conditions by patient type rather than intermixing conditions would emphasize this interpretation.

      There are several valid ways to present these plots, but we chose this format because it allows for a direct visual comparison of the post-hoc group differences within each specific task demand. This arrangement clearly illustrates the graded impairment from young controls through to patients with Alzheimer’s disease across every condition. This structure also directly mirrors our two-way ANOVA, which identified significant main effects for both Group and Condition, but crucially, no significant Group x Condition interaction. We felt that grouping the data by participant group would force readers to look across four separate clusters to compare the slopes, making the stability of the saccadic remapping mechanism much harder to grasp at a glance.

      Reviewer #1 (Recommendations for the authors):

      (1) Formatting of statistical parameters.

      The formatting of statistical symbols should be consistent throughout the manuscript. Some instances of F, p, and t are italicized, while others are not. All statistical symbols should be italicized.

      Thank you for pointing this out. We have audited the manuscript. While we have revised the text to address these instances throughout the Results and Methods sections, any remaining minor formatting inconsistencies will be corrected during the final typesetting stage.

      (2) Minor typographical issues.

      (a) Line 532: "are" should be "be."

      (b) Line 654: "cantered" should be "centered."

      (c) Line 213: In "(p(bonf) < 0.001, |t| ≥ 5.94)," the t value should be reported with its degrees of freedom, and t should be reported before p. The same applies to line 215.

      Thank you for your careful reading. All corrected.

      Reviewer #2 (Public review):

      We thank you for your positive feedback regarding our eye-tracking methodology and computational approach. We appreciate your critical insights into the feature-specific disruption hypothesis and the task structure. We have substantially revised the results and discussion of saccadic interference on colour memory. Below, we respond to your suggestions point by point:

      Reviewer #2 (Recommendations for the authors):

      (1) The study treats colour and location errors as comparable when arguing that saccades selectively disrupt spatial but not colour memory. However, these measures are defined in entirely different units (degrees of visual angle vs radians on a colour wheel) and are not psychophysically or statistically calibrated. Baseline task difficulty, noise level, or dynamic range do not appear to be calibrated or matched across features. As a result, the null effect of saccades on colour could reflect lower sensitivity or ceiling effects rather than implicit feature-specific robustness.

      We agree that direct comparisons of absolute error magnitudes across different dimensions are not appropriate. Our argument for feature-specific disruption relies not on the scale of errors, but on the presence or absence of a saccade cost within identical trials. In our within-subject design, the same saccade vectors produced a systematic increase in location error while leaving colour error statistically unchanged. To address sensitivity, we observed that colour memory was sufficiently precise to show a significant recency effect (p = 0.02). To further quantify the evidence for the null effect, we performed Bayesian repeated measures ANOVAs, which yielded a BF10 = 0.22. This provides substantial evidence that saccades do not disrupt colour precision, regardless of baseline sensitivity.
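For readers less familiar with the notation, BF10 expresses evidence for the alternative (an effect of saccades) over the null, so evidence for the null is its reciprocal. The short sketch below assumes nothing beyond that definition; the function name is illustrative, and the 3-to-10 band mentioned afterwards is a common convention rather than a rule taken from the manuscript.

```python
def evidence_for_null(bf10: float) -> float:
    """Convert BF10 (alternative over null) into BF01 (null over alternative)."""
    if bf10 <= 0.0:
        raise ValueError("a Bayes factor must be positive")
    return 1.0 / bf10


bf01 = evidence_for_null(0.22)  # value reported for colour memory
print(f"BF01 = {bf01:.2f}")
```

A BF10 of 0.22 therefore means the data are about 4.5 times more likely under the null than under the alternative, which falls within the 3-to-10 band commonly described as substantial or moderate evidence.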

      We have substantially revised this in Results, Methods and Discussion:

      In the Results:

      “This design allowed us to isolate and quantify the unique impact of saccades on spatial memory, enabling us to test competing hypotheses regarding spatial representation. If spatial memory were solely underpinned by an allocentric mechanism, precision should remain comparable across all conditions as the representation would be world-centred and unaffected by eye movements. Thus, performance in the no-saccade condition should be comparable to the two-saccade condition. Conversely, if spatial memory relies on a retinotopic representation requiring active updating across eye movements, the two-saccade condition was anticipated to be the most challenging due to cumulative decay in the memory traces used for stimulus reconstruction after each saccade.[22] Critically, we hypothesised that this saccade cost would be specific to the spatial domain; while location requires active remapping via noisy oculomotor signals, non-spatial features like colour are not inherently tied to coordinate transformations and should therefore remain stable (see more in Discussion below).

      Meanwhile, the no-saccade condition was expected to yield the most accurate localisation, relying solely on retinotopic information (retinotopic working memory). These predictions were confirmed in young healthy adults (N = 21, mean age = 24.1 years, ranged between 19 and 34). A repeated measures ANOVA revealed a significant main effect of saccades on location memory (F(2.2,43.9)=33.2, p<0.001, partial η²=0.62), indicating substantial impairment after eye movements (Figure 2A). In contrast, colour memory remained remarkably stable across all saccade conditions (Figure 2B; F(2.2, 44.7) = 0.68, p=0.53, partial η² =0.03).

      This “saccade cost”—the loss of memory precision following an eye movement—indicates that spatial representations require active updating across saccades rather than being maintained in a static, world-centred reference frame.

      Critically, our comparison between spatial and colour memory does not rely on the absolute magnitude of errors, which are measured in different units (degrees of visual angle vs. radians). Instead, we assessed the relative impact of the same saccadic demand on each feature within the same trial. While location recall showed a robust saccade cost, colour recall remained statistically unchanged. To ensure this null effect was not due to a lack of measurement sensitivity, we examined the recency effect; recall performance for the second item was predicted to be better than for the first stimulus in each condition.[23,24] As expected, colour memory for Item 2 was significantly more accurate than for Item 1 (F(1,20) = 6.52, p = 0.02, partial η² = 0.25), demonstrating that the task was sufficiently sensitive to detect standard working memory fluctuations despite the absence of a saccade-induced deficit.”

      In the Methods, at the beginning of “Statistical Analysis”, we added

      “Because location and colour recall involve different scales and units, all analyses were performed independently for each feature to avoid cross-dimensional magnitude comparisons.” (p25)

      In the Discussion, we added:

      “A potential concern is whether the observed dissociation between colour and location reflects differences in baseline task difficulty or dynamic range. Yet, the presence of a robust recency effect in colour memory (Figure 2B) confirms that our paradigm was sensitive to memory-dependent variance and was not limited by floor or ceiling effects.”

      (2) Colour and then location are probed serially, without a counter-balanced order. This fixed response order could introduce a systematic bias because location recall is consistently subject to longer memory retention intervals and cognitive interference from the colour decision. The observed dissociation (saccades impair location but not colour) may therefore reflect task structure rather than implicit feature-specific differences in trans-saccadic memory.

      Thank you for the insightful observation regarding our fixed response order. We acknowledge that a counterbalanced design is typically preferred to mitigate potential order effects. However, we chose this consistent sequence to ensure the task remained accessible for cognitively impaired patients (i.e., the Alzheimer’s disease (AD) and Parkinson’s disease (PD) cohorts). Conducting an eye-tracking memory task with cognitively impaired patients is challenging, as they may struggle with task engagement or forget complex instructions. During the design phase, we prioritised a consistent structure to reduce the cognitive load and task-switching demands that typically challenge these cohorts.

      Critically, because the saccade cost is a relative measure calculated by comparing conditions with identical timings, any bias from the fixed order is present in both the baseline and saccade trials. The disruption we report is therefore a specific effect of eye movements that goes beyond the noise introduced by the retention interval or the preceding colour report.

      We added the following text in the Methods – experimental procedure (p.22):

      “Recall was performed in a fixed order, with colour reported before location. This sequence was primarily chosen to minimise cognitive load and task-switching demands for the two neurological patient cohorts, ensuring the paradigm remained accessible for individuals with AD and PD. While this order results in a slightly longer retention interval for location recall, the saccade cost was identified by comparing location error across experimental conditions with similar timings but varying saccadic demands.”
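The logic of the relative measure can be shown with a minimal numerical sketch. It assumes that any bias from the fixed response order is additive and identical in the baseline and saccade conditions; that assumption is what makes the subtraction work, and it is not tested here. All values and names are hypothetical.

```python
def saccade_cost(error_saccade: float, error_baseline: float) -> float:
    """Saccade cost as the increase in error relative to the no-saccade baseline."""
    return error_saccade - error_baseline


true_baseline_error = 1.0   # hypothetical error with no saccade and no order bias
true_saccade_effect = 0.5   # hypothetical extra error caused by the saccades
order_bias = 0.3            # hypothetical extra error from reporting colour first

# The same order bias is added to both conditions because timings are matched.
observed_baseline = true_baseline_error + order_bias
observed_saccade = true_baseline_error + true_saccade_effect + order_bias

cost = saccade_cost(observed_saccade, observed_baseline)
print(f"{cost:.2f}")
```

The recovered cost equals the saccade effect alone because the shared bias cancels. If the bias interacted with the saccade condition, the cancellation would no longer be exact.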

      (3) Relatedly, because spatial representations are retinotopic, fixational eye movements (FEMs - microsaccades and drift) displace the retinal coordinates of encoded positions, increasing apparent spatial noise with time delays. Colour memory, however, is feature-based and unaffected by small retinal translations. Thus, any between-condition or between-group differences in FEMs could selectively inflate location error and the associated model parameters (encoding noise, decay, interference), while leaving colour error unchanged. Note that FEMs tend to be slightly ballistic [1,2], hence not well modelled with a Gaussian blur.

      This is a very insightful point. We have now addressed this in detail within the discussion:

      “However, our findings suggest otherwise. One potential concern is whether this dissociation simply reflects the inherent spatial noise introduced by fixational eye movements (FEMs), such as microsaccades and drifts.[46] Because locations are stored in a retinotopic frame, fixational instability necessarily shifts retinal coordinates over time. However, the "saccade cost" here was defined as the error increase relative to a no-saccade baseline of equal duration; because both conditions are subject to the same fixational drift, any FEM-induced noise is effectively subtracted out. Thus, despite the ballistic and non-Gaussian nature of FEMs,[47] they cannot account for the presence of a saccade cost in spatial memory and its total absence in the colour domain. Another possibility is that this dissociation reflects differences in baseline task difficulty or dynamic range. Yet, the presence of a robust recency effect in colour memory (Figure 2B) confirms that our paradigm was sensitive to memory-dependent variance and was not limited by floor or ceiling effects.”

      (4) There is no in silico demonstration that the modelling framework can recover the true generating model from synthetic data or recover accurate parameters under realistic noise levels, which can be challenging in generative models with a hierarchical structure (as per [3], for example). Figure 8b shows that the parameters possess substantial posterior covariance, which raises concerns as to whether they can be reliably disambiguated.

      Many thanks for this comment. We have added a simple recovery analysis as detailed below but are also keen to ensure we fully answer your question—which has more to do with empirical rather than simulated data—and make clear the rationale for this analysis in this instance.

      We added this in Supplementary Materials:

      “Model validation and recovery analysis

      The following section provides a detailed technical assessment of the model inversion scheme, focusing on the discriminability of the model space and the identifiability of individual parameters.

      Recovery analyses of this sort are typically used prior to collecting data to allow one to determine whether, in principle, the data are useful in disambiguating between hypotheses. In this sense, they have a role analogous to a classical power calculation. However, their utility is limited when used post-hoc when data have already been collected, as the question of whether the models can be disambiguated becomes one of whether non-trivial Bayes factors can be identified from those data.

      The reason for including a recovery analysis here is not to identify whether the model inversion scheme identifies a ‘true’ model. The concept of ‘true generative models’ commits to a strong philosophical position which is at odds with the ‘all models are wrong, but some are useful’ perspective held by many in statistics, e.g., (So, 2017). Of note, one can always confound a model recovery scheme by generating the same data in a simple way, and in (one of an infinite number of) more complex ways. A good model inversion scheme will always recover the simple model and therefore would appear to select the ‘wrong’ model in a recovery analysis. However, it is still the best explanation for the data. For these reasons, we do not necessarily expect ‘good’ recoverability in all parameter ranges. This is further confounded by the relationship between the models we have proposed—e.g., an interference model with very low interference will look almost identical to a model with no interference. The important question here is whether they can be disambiguated with real data.

      Instead, the value of a post-hoc recovery analysis here is to evaluate whether there was a sensible choice of model space—i.e., that it was not a priori guaranteed that a single model (and, specifically, the model we found to be the best explanation for the data) would explain the results of all others. To address this, for each model, we simulated 16 datasets, each of which relied upon parameters sampled from the model priors, which included examples of each of the experimental conditions. We then fit each of these datasets to each of the 7 models to construct the confusion matrix shown in the lower panel of Supplementary Figure 3, by accumulating evidence over each of the 16 participants generated according to each ‘true’ model (columns) for each of the possible explanatory models (rows). This shows that no one model, for the parameter ranges sampled here, explains all other datasets. Interestingly, our ‘winning’ model in the empirical analysis is not the best explanation for any of the datasets simulated (including its own). This is reassuring, in that it implies this model winning was not a foregone conclusion and is driven by the data—not just the choice of model space.”
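
The construction of the confusion matrix can be sketched as below. This is an illustrative outline under stated assumptions: it supposes that "accumulating evidence" means summing log model evidence over the simulated datasets, and the function names and toy numbers are hypothetical rather than taken from the authors' code.

```python
def confusion_matrix(log_evidence):
    """Pool log evidence over simulated datasets.

    log_evidence[t][s][f] is the log evidence of fitted model f for
    simulated dataset s generated by model t.
    Returns matrix[f][t]: rows are fitted models, columns are generating models.
    """
    n_true = len(log_evidence)
    n_fit = len(log_evidence[0][0])
    matrix = [[0.0] * n_true for _ in range(n_fit)]
    for t, datasets in enumerate(log_evidence):
        for per_dataset in datasets:
            for f, value in enumerate(per_dataset):
                matrix[f][t] += value  # summing logs pools evidence across datasets
    return matrix

def best_explanation(matrix):
    """For each generating model (column), the fitted model with the highest pooled evidence."""
    n_fit, n_true = len(matrix), len(matrix[0])
    return [max(range(n_fit), key=lambda f: matrix[f][t]) for t in range(n_true)]

# Toy case: 2 models, 2 simulated datasets each (values are invented).
toy = [
    [[-10.0, -12.0], [-11.0, -13.0]],  # data generated by model 0
    [[-9.0, -8.0], [-10.0, -7.0]],     # data generated by model 1
]
pooled = confusion_matrix(toy)
winners = best_explanation(pooled)
```

In the analysis described above the same idea would apply with 7 models and 16 simulated datasets per model; a column whose winner differs from its own index is a case where the generating model is not its own best explanation.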

      Your point about the posterior covariance is well founded. As we describe in Supplementary Materials, this is an inherent feature of inverse problems (analogous to EEG source localisation). However, the fact that our posterior densities move significantly away from the prior expectations demonstrates that the data are indeed informative. By adopting a Bayesian framework, we are able to explicitly quantify this uncertainty rather than ignoring it, providing a more transparent account of parameter identifiability. We have added the following in the same section of Supplementary Materials:

      “This problem is an inverse problem—inferring parameters from a non-linear model. We therefore expect a degree of posterior covariance between parameters and, consequently, that they cannot be disambiguated with complete certainty. While some degree of posterior covariance is inherent to inverse models—including established methods like EEG source localisation—the fact that many of the parameters are estimated with posterior densities that do not include their prior expectations implies the data are informative about these.

      The advantage of the Bayesian approach we have adopted here is that we can explicitly quantify posterior covariance between these parameters, and therefore the degree to which they can be disambiguated. While the posterior covariance matrices from empirical data are the relevant measure here, we can better understand the behaviour of the model inversion scheme in relation to the specific models used using the model recovery analysis reported in Supplementary figure 3.

      The middle panel of the figure is key, along with the correlation coefficients reported in the figure caption. Here, we see at least a weak positive correlation (in some cases much stronger) for almost all parameters and limited movement from prior expectations for those parameters that are less convincingly recovered. This reinforces that the ability of the scheme to recover parameters is best assessed in terms of the degree of movement of posterior from prior values following fitting to empirical data.”

      (5) The authors employ Bayes factors (BFs) to disambiguate models, but BFs would also strengthen the claims that location, but not colour, is impacted by saccades. Despite colour being a circular variable, colour error is analysed using ANOVA on linearised differences (radians). The authors should also arguably use circular statistics, such as the von Mises distribution, for the analysis of colour.

      Regarding the use of circular statistics, you are correct that such error distributions are not suitable for ANOVA, and it is better to use circular statistics. However, for the present dataset, we used the mean absolute angular error per condition (ranging from 0 to π radians), which represents the shortest distance on the colour wheel between the target and the response.

      This approach effectively linearises the measure by removing the 2π wrap-around boundary. Because the observed errors were relatively small and did not cluster near the π boundary, even in the patient cohorts (Figure 5B), the "wrap-around" effect of circular space is negligible. Moreover, by analysing the mean error across trials for each condition, rather than trial-wise data, we invoke the Central Limit Theorem. This ensures that the distribution of these means is approximately normal, satisfying the fundamental assumptions of ANOVA. For these reasons, we adopted simpler linear models. We confirmed that the data did not violate the assumptions of linear statistics. In this low-noise regime, linear and circular models converge on the same conclusions. This has been revised in Methods:

      “For colour memory, we calculated the absolute angular error, defined as the shortest distance on the colour wheel between the target and the reported colour (range 0 to π radians). For the primary statistical analyses, we utilised the mean absolute error per condition for each participant. By analysing these condition-wise means rather than trial-wise raw data, we invoke the Central Limit Theorem, which ensures that the sampling distribution of these means approximates normality. Because the absolute errors in this paradigm were relatively small and did not approach the π boundary (Figure 5B) even in the clinical cohorts, the data were treated as a continuous measure in our linear ANOVAs and regression models. Moreover, because location and colour recall involve different scales and units, all analyses were performed independently for each feature to avoid cross-dimensional magnitude comparisons.”
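
As a concrete illustration of the measure described above, the following sketch computes the shortest distance on a circle between a target and a response angle, then averages it over trials. The function names and example angles are hypothetical; this is not the authors' code.

```python
import math

def abs_angular_error(target, response):
    """Shortest distance on the circle between two angles (radians); result lies in [0, pi]."""
    diff = (response - target) % (2 * math.pi)  # wrapped into [0, 2*pi)
    return min(diff, 2 * math.pi - diff)

def mean_abs_error(targets, responses):
    """Mean absolute angular error over the trials of one condition."""
    errors = [abs_angular_error(t, r) for t, r in zip(targets, responses)]
    return sum(errors) / len(errors)

# A response just "behind" the target across the wrap-around point is a small error.
wrapped = abs_angular_error(0.1, 2 * math.pi - 0.1)
opposite = abs_angular_error(0.0, math.pi)
condition_mean = mean_abs_error([0.0, 1.0], [0.2, 0.7])
```

Here `wrapped` is about 0.2 rad rather than about 6.08 rad, which is the sense in which taking the shortest arc removes the 2π discontinuity.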

      We have also now integrated Bayesian repeated measures ANOVA throughout the manuscript. The Results section for the young healthy adults now reads (p. 4):

      “A repeated measures ANOVA revealed a significant main effect of saccades on location memory (F(3, 20) = 51.52, p < 0.001, partial η²=0.72), with Bayesian analysis providing decisive evidence for the inclusion of the saccade factor (BF<sub>incl</sub> = 3.52 x 10^13, P(incl|data) = 1.00). In contrast, colour memory remained remarkably stable across all saccade conditions (F(3, 20) = 0.57, p = 0.64, partial η² =0.03). This null effect was supported by Bayesian analysis, which provided moderate evidence in favour of the null hypothesis (BF<sub>01</sub> = 8.46, P(excl|data) = 0.89), indicating that the data were more than eight times more likely under the null model than a model including saccade-related impairment.”

      For elderly healthy adults:

      “In contrast, colour memory remained unaffected by saccade demands (F(3, 20) = 0.57, p = 0.65, partial η² =0.03), again supported by the Bayesian analysis: BF<sub>01</sub> = 8.68, P(excl|data) = 0.90.”

      For patient cohorts:

      “Bayesian repeated measures ANOVAs further supported this dissociation, providing moderate evidence for the null hypothesis in the AD group (BF<sub>01</sub> = 3.35, P(excl|data) = 0.77) and weak evidence in the PD group (BF<sub>01</sub> = 2.23, P(excl|data) = 0.69). This indicates that even in populations with established neurodegeneration, the detrimental impact of eye movements is specific to the spatial domain.”

      Related description is also updated in Methods – Statistical Analysis.
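
For readers relating the two Bayesian quantities quoted above, the sketch below converts a Bayes factor into a posterior probability. It assumes equal prior odds for the null and alternative models, which is an assumption of this illustration and is not stated in the response; the function name is hypothetical.

```python
def posterior_prob(bayes_factor, prior_odds=1.0):
    """Posterior probability of the favoured model, given its Bayes factor and prior odds."""
    posterior_odds = bayes_factor * prior_odds
    return posterior_odds / (1.0 + posterior_odds)

# BF01 values as quoted in the text above.
reported_bf01 = {"young": 8.46, "older": 8.68, "AD": 3.35, "PD": 2.23}
p_excl = {group: round(posterior_prob(bf), 2) for group, bf in reported_bf01.items()}
```

Under that assumption, the resulting values (0.89, 0.90, 0.77, 0.69) agree with the reported P(excl|data) figures.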

      Minor:

      (1) The modelling is described as computational but is arguably better characterised as a heuristic generative model at Marr's algorithmic level. It does not derive from normative computational principles or describe an implementation in neural circuits.

      We appreciate your perspective on the classification of our model within Marr’s hierarchy. We agree that our framework is best characterised as an algorithmic-level generative model. Our objective was to identify the mechanistic principles governing transsaccadic updating rather than to provide a normative derivation or a specific circuit-level implementation.

      To ensure readers do not over-interpret the term ‘computational’, we have added a clarifying statement in the Discussion acknowledging the algorithmic nature of the model. Interestingly, we note that a model predicated on this form of spatial diffusion implies a neural field representation with a spatial connectivity kernel whose limit approximates the second derivative of a Dirac delta function. While a formal neural field implementation is beyond the scope of the present work, our algorithmic results provide the necessary constraints for such future biophysical models.

      p.20: “While we describe the present framework as 'computational', it is more precisely characterised as an algorithmic-level generative model within Marr’s hierarchy. Our focus was on defining the rules of spatial integration and the sources of eye-movement-induced noise, rather than deriving these processes from normative principles or defining their specific neural implementation.”

      (2) I did not find a description of the recruitment and characterization of the AD and PD patients.

      Apologies for this omission. We have now included a detailed description of participant recruitment and clinical characterisation in the Methods section and also updated Table 2:

      “A total of 87 participants completed the study: 21 young healthy adults (YC), 21 older healthy adults (EC), 23 patients with Parkinson’s disease (PD), and 22 patients with Alzheimer’s disease (AD). Their demographic and clinical details are summarised in Table 2. Initially, 90 participants were recruited (22 YC, 21 EC, 25 PD, 22 AD); however, three individuals (1 YC and 2 PD) were excluded from all analyses due to technical issues during data acquisition.

      All participants were recruited locally in Oxford, UK. None were professional artists, had a history of psychiatric illness, or were taking psychoactive medications (excluding standard dopamine replacement therapy for PD patients). Young participants were recruited via the University of Oxford Department of Experimental Psychology recruitment system. Older healthy volunteers (all >50 years of age) were recruited from the Oxford Dementia and Ageing Research (OxDARE) database.

      Patients with PD were recruited from specialist clinics in Oxfordshire. All had a clinical diagnosis of idiopathic Parkinson’s disease and no history of other major neurological or psychiatric conditions. All patients were tested while on their regular medication regimen (‘ON’ state); however, the specific pharmacological profiles, including the types of medication (e.g., levodopa, dopamine agonists, or combinations) and dosages (e.g., levodopa equivalent doses), were not systematically recorded. Disease duration and PD severity were also not recorded for this study.

      Patients with AD were recruited from the Cognitive Disorders Clinic at the John Radcliffe Hospital, Oxford, UK. All AD participants presented with a progressive, multidomain, predominantly amnestic cognitive impairment. Clinical diagnoses were supported by structural MRI and FDG-PET imaging consistent with a clinical diagnosis of AD dementia (e.g., temporo-parietal atrophy and hypometabolism).[69] All neuroimaging was reviewed independently by two senior neurologists (S.T. and M.H.).

      Global cognitive function was assessed using the Addenbrooke’s Cognitive Examination-III (ACE-III).[70] All healthy participants scored above the standard cut-off of 88, with the exception of one elderly participant who scored 85. In the PD group, two participants scored below the cut-off (85 and 79). In the AD group, six participants scored above 88; these individuals were included based on robust clinical and radiological evidence of AD pathology rather than their ACE-III score alone.”

      (3) YA and OA patients appear to differ in gender distribution.

      We acknowledge the difference in gender distribution between the young (71.4% female) and older adult (57.1% female) cohorts. However, we do not anticipate that gender influences the fundamental computational mechanisms of retinotopic maintenance or transsaccadic remapping. These processes represent low-level visuospatial functions for which there is no established evidence of gender-specific differences in precision or coordinate transformation. We have ensured that the gender distribution for each cohort is clearly listed in the demographics table (Table 2) for full transparency.

      Thank you very much for your very insightful feedback!

      Reviewer #3 (Public review):

      Thank you for the positive feedback regarding our inclusion of clinical groups and the identification of computational phenotypes that differentiate these cohorts.

      To address your concerns about the model, we have clarified our use of Bayesian Model Selection, which inherently penalises model complexity to ensure that our results are not driven solely by the number of parameters. We will also provide further evidence regarding model generalisability to address the concern of overfitting.

      Regarding the link with the ROCF, we have revised the manuscript to better highlight the specific relationship between our transsaccadic parameters and the ROCF data and better motivate the inclusion of these results in the main text.

      Below is our response to your suggestions point-by-point:

      (1) The models tested differ in terms of the number of parameters. In general, a larger number of parameters leads to a better goodness of fit. It is not clear how the difference in the number of parameters between the models was taken into account. It is not clear whether the modelling results could be influenced by overfitting (it is not clear how well the model can generalize to new observations).

      To ensure our results were not driven by the number of parameters, we utilised random-effects Bayesian Model Selection (BMS) to adjudicate between our candidate models. Unlike maximum likelihood methods, BMS relies on the marginal likelihood (model evidence), which inherently balances model fit against parsimony, a principle known as Occam’s razor (Rasmussen and Ghahramani, 2000). In this framework, a model is only preferred if the improvement in fit justifies the additional parameter space; redundant parameters actually lower model evidence by diluting the probability mass. We would be happy to point toward literature that discusses how these marginal likelihood approximations provide a more robust guard against overfitting than standard metrics like BIC or AIC (MacKay, 2003; Murray and Ghahramani, 2005; Penny, 2012).
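
The trade-off between fit and parsimony can be written out explicitly. The notation below ($y$ for data, $\theta$ for parameters, $m$ for a model, $q(\theta)$ for an approximate posterior) is introduced here for illustration, and the free-energy bound is shown as one common approximation to the log evidence; the response does not state which approximation was used.

```latex
% Marginal likelihood (model evidence): the likelihood averaged over the prior
p(y \mid m) = \int p(y \mid \theta, m)\, p(\theta \mid m)\, d\theta

% A standard lower bound on the log evidence, split into accuracy minus complexity
\ln p(y \mid m) \;\ge\; F
  = \underbrace{\mathbb{E}_{q(\theta)}\!\left[\ln p(y \mid \theta, m)\right]}_{\text{accuracy}}
  - \underbrace{\mathrm{KL}\!\left[\, q(\theta) \,\|\, p(\theta \mid m) \right]}_{\text{complexity}}
```

An extra parameter raises the evidence only if the gain in the accuracy term exceeds the added complexity cost, which is the sense in which redundant parameters lower model evidence.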

      The fact that the "Dual (Saccade) + Interference" model (Model 7) emerged as the winner—with a Bayes Factor of 6.11 against the next best alternative—demonstrates that its complexity was statistically justified by its superior account of the trial-by-trial data.

      Furthermore, to address the risk of overfitting, we established the generalisability of these parameters by using them to predict performance on an independent clinical task. These parameters successfully explained ~62% of the variance in ROCF copy scores (a very distinct, real-world task), confirming that they represent robust computational phenotypes rather than idiosyncratic fits to the initial dataset.

      In the Results (p10):

      “We used random-effects Bayesian model selection to identify the most plausible generative model. This process relies on the marginal likelihood (model evidence), which inherently balances model fit against complexity—a principle often referred to as Occam’s razor.[25–27]”

      In the Discussion (p17):

      “Importantly, the risk of overfitting is mitigated by the Bayesian Model Selection framework; by utilising the marginal likelihood for model comparison, the procedure inherently penalises excessive model complexity and promotes generalisability.[25–27,42] This generalisability was further evidenced by the model's ability to predict performance on the independent ROCF task, confirming that these parameters represent robust mechanistic phenotypes rather than idiosyncratic fits to the initial dataset.”

      (2) Results specificity: it is not clear how specific the modelling results are with respect to constructional ability (measured via the Rey-Osterrieth Complex Figure test). As with any cognitive test, performance can also be influenced by general, non-specific abilities that contribute broadly to test success.

      We agree that constructional performance is influenced by both specific mechanistic constraints and general cognitive abilities. To isolate the unique contribution of transsaccadic updating, we therefore performed a partial correlation analysis across the entire sample. We examined the relationship between location error in the two-saccades condition (our primary behavioural measure of transsaccadic memory) and ROCF copy scores. Even after partialling out the effects of global cognitive status (ACE-III total score), age, and years of education, the correlation remained highly significant (rho = -0.39, p < 0.001).

      This suggests that our model captures a specific computational phenotype—the precision of spatial updating during active visual sampling—rather than acting as a proxy for non-specific cognitive decline. This mechanistic link explains why traditional working memory measures (e.g., digit span or Corsi blocks) frequently fail to predict drawing performance; unlike those tasks, figure copying requires thousands of saccades, making it uniquely sensitive to the precision of the dynamic remapping signals identified by our modelling framework.

      We added the following text in the Discussion (p19):

      “We also found that the relationship between transsaccadic working memory and ROCF performance remains highly significant (rho = -0.39, p < 0.001), even after controlling for age, education, and global cognitive status (ACE-III total score). Consequently, transsaccadic updating may represent a discrete computational phenotype required for visuomotor control, rather than a non-specific proxy for global cognitive decline.[57]”

      Reviewer #3 (Recommendations for the authors):

      (1) The authors mention in the introduction the following: "One key hypothesis is that we use working memory across visual fixations to update perception dynamically", citing the following manuscript:

      Harrison, W. J., Stead, I., Wallis, T. S. A., Bex, P. J. & Mattingley, J. B. A computational account of transsaccadic attentional allocation based on visual gain fields. Proc. Natl. Acad. Sci. U.S.A. 121, e2316608121 (2024).

      However, the manuscript above does not refer explicitly to the involvement of working memory in transsaccadic integration of object location in space. Rather, it takes advantage of recent evidence showing how the true location of a visual object is represented in the activity of neurons in primary visual cortex (A. P. Morris, B. Krekelberg, A stable visual world in primate primary visual cortex. Curr. Biol. 29, 1471-1480.e6 (2019)). The model hypothesizes that true locations of objects are readily available, and then allocates attention in real-world coordinates, allowing efficient coordination of attention and saccadic eye movements.

      Thank you for the clarification. As suggested, we have now included the citation of Morris & Krekelberg (2019) to acknowledge the evidence for stable object locations within the primary visual cortex.

      (2) The authors in the introduction and the title use the terms 'transsaccadic memory' and 'spatial working memory'. However, it is not clear whether these can be used interchangeably or reflect different constructs.

      Classical measures of visuo-spatial working memory are derived from the Corsi task (or similar), where the location of multiple objects is displayed and subsequently remembered. In such tasks, eye movements and saccades are not generally considered, only memory performance, representing the visuo-spatial span.

      Transsaccadic memory tasks instead explicitly measure performance on remembered object locations or features across explicit eye movements, usually using a very limited number of objects (1 or 2, as is the case for the current manuscript).

      While the two constructs share some features, it is not clear whether they represent the same underlying ability or not, especially because in transsaccadic tasks, participants are required to perform one or more saccades, thus representing a dual-task case.

      I think the relationship between 'transsaccadic memory' and 'spatial working memory' should be clarified in the manuscript.

      Thank you. Yes, we have added this within the Methods - Measurement of saccade cost to clarify that spatial working memory is the broad cognitive construct responsible for short-term maintenance, whereas transsaccadic memory is the specific, dynamic process of remapping representations to maintain stability across eye movements.

      In Methods (p.22):

      “Within this framework, it is important to distinguish between the broad construct of spatial working memory and the specific process of transsaccadic memory. While spatial working memory refers to the general ability to maintain spatial information over short intervals, transsaccadic memory describes the dynamic updating of these representations—termed remapping—to ensure stability across eye movements. Unlike classical 'static' measures of spatial working memory, such as the Corsi block task which focuses on memory span, transsaccadic memory tasks explicitly require the integration of stored visual information with motor signals from intervening saccades. Our paradigm treats transsaccadic updating as a core computational process within spatial working memory, where eye-centred representations are actively reconstructed based on noisy memories of the intervening saccade vectors.”

      (3) In Figure 1, the second row indicates the presentation of item 2. Indeed, in the condition 'saccade-after-item-1', the target in the second row of Figure 1 is displaced, as expected. This clarifies the direction and amplitude of the first saccade requested. However, from Figure 1, it is hard to understand the amplitude and direction of the second requested saccade. I think the figure should be updated, giving a full description of the direction and amplitude of the second saccade as well ('saccade-after-item-2' and 'two-saccades' conditions).

      We agree that making the figure legend more self-contained is beneficial for the reader. While the specific physical parameters and the trial sequence for each condition are detailed in the Results and Methods sections, we have now updated the legend for Figure 1 to explicitly define these details. Specifically, we have clarified that the colour wheel itself served as the target for the second instructed saccade (i.e., the movement from the second fixation cross to the colour wheel location). We have also included the quantitative constraint that all saccade vectors were at least 8.5 degrees of visual angle in amplitude. Given the limited space within a figure legend, we hope these concise additions provide the transparency requested without interrupting the conceptual flow of the diagram.

      Updated Figure 1 legend:

      “Participants were asked to fixate a white cross, wherever it appeared. They had to remember the colour and location of a sequence of two briefly presented coloured squares (Item 1 and 2), each appearing within a white square frame. They then fixated a colour wheel wherever it appeared on the screen, which served as the target for the second instructed saccade (i.e., a movement from the second fixation cross to the colour wheel location). This cued recall of a specific square (Item 1 or Item 2 labelled within the colour wheel). Participants selected the remembered colour on the colour wheel which led to a square of that colour appearing on the screen. They then dragged this square to its remembered location on the screen. Saccadic demands were manipulated by varying the locations of the second frame and the colour wheel, resulting in four conditions in their reliance on retinotopic versus transsaccadic memory: (1) No-Saccade condition providing a baseline measure of within-fixation precision as no eye movements were required. (2) Saccade After Item 1; (3) Saccade After Item 2; (4) Saccades after both items (Two Saccades condition). In all conditions requiring eye movements, saccade vectors were constrained to a minimum amplitude of 8.5° (degrees of visual angle). While the No-Saccade condition isolates retinotopic working memory, conditions (2) to (4) collectively quantify the impact of varying saccadic demands and timings on the maintenance of spatial information, thereby assessing the efficacy of the transsaccadic updating process.”

      (4) The authors write: "Eye tracking analysis confirmed high compliance: participants correctly maintained fixation or executed saccades as instructed on the vast majority of trials (83% ± 14%). Non-compliant trials were excluded from further analysis." 14% of excluded trials are a substantial fraction of trials, given the task requirements. Is this proportion of excluded trials different between experimental groups, and are experimental groups contributing equally to this proportion?

      We thank the reviewer for pointing this out, and we apologise for the confusion. The 83% figure was the compliance rate pooled across all four cohorts and all conditions; it was above 90% for the YC, EC and even AD groups, but dropped to roughly 60% in the PD group.

      We have now conducted a full analysis of compliant trial counts using a mixed ANOVA (4 saccade conditions x 4 cohorts). This analysis revealed a main effect of group (F(3, 80) = 8.06, p < 0.001), which was driven by lower compliance in the PD cohort (mean approx. 25.4 trials per condition) compared to the AD, EC, and YC cohorts (means ranging from 35.8 to 38.9 trials per condition). Crucially, however, the interaction between group and condition was not statistically significant (p = 0.151). This indicates that the relative impact of saccade demands on trial retention was consistent across all four groups.

      Because our primary behavioural measure—the saccade cost—is a within-subject comparison of impairment across conditions, these differences in absolute trial numbers do not introduce a systematic bias into our findings. Furthermore, even with the higher attrition in the PD group, we retained a sufficient number of high-quality trials (minimum mean of ~23 trials in the most demanding condition) to support robust trial-by-trial parameter estimation and valid statistical inference. We have updated the Results and Methods to reflect these details.

      In Results (p4):

      “To mitigate potential confounds, we monitored eye position throughout the experiment. Eye-tracking analysis confirmed high compliance in healthy adults, who followed instructions on the vast majority of trials (Younger Adults: 97.2 ± 5.2 %; Older Adults: 91.3 ± 20.4 %). The mean difference between these groups was negligible, representing just 1.25 trials per condition, and was not statistically significant (t(80) = 0.16, p = 1.000; see more in Methods – Eyetracking data analysis). Non-compliant trials were excluded from all further analyses.”

      In Methods (p27):

      “Eye-tracking analysis confirmed high compliance overall, with participants correctly maintaining fixation or executing saccades on the vast majority of trials (83% across all participants). A mixed ANOVA revealed a main effect of group on trial retention (F(3, 80) = 8.06, p < 0.001, partial η² = 0.23), primarily due to lower compliance in the PD cohort (YC: 97±4%; EC: 91±10%; AD: 95±5%; PD: 63±38%). Importantly, there was no significant interaction between group and saccade condition (F(3.36, 80) = 1.78, p = 0.15, partial η² = 0.008), suggesting that trial attrition was not disproportionately affected by specific task demands in any group.

      We acknowledge that this reduced trial count in the PD group represents a limitation for across-cohort comparison. However, the absolute number of compliant trials in the PD group (mean approx. 25 per condition) remained sufficient for robust trial-by-trial parameter estimation. Furthermore, the lack of a significant group-by-condition interaction confirms that the results reported for this cohort remain valid and that our primary finding of a selective spatial memory deficit is robust to these differences in data retention.”

      (5) Modelling

      (a) Degrees of freedom, cross-validation, number of parameters.

      I appreciate the effort in introducing and testing different models. The models increase in complexity and are based on different assumptions about the main drivers and mechanisms underlying the dependent variable. The models differ in the number of parameters. How are the differences in the number of parameters between models taken into account in the modelling analysis? Is there a cost associated with the extra parameters included in the more complex models?

      (b) Cross-validation and overfitting.

      Overfitting can occur when a model learns the training data but cannot generalize to novel datasets. Cross-validation is one approach that can be used to avoid overfitting. Was cross-validation (or other approaches) implemented in the fitting procedure against overfitting? Otherwise, the inference that can be derived from the modelled parameters can be limited.

      To address your concerns regarding model complexity and overfitting, we would like to clarify our use of Bayesian Model Selection (BMS). Unlike frequentist methods that often rely on cross-validation to assess generalisability, we used random-effects BMS based on the marginal likelihood (model evidence). This approach inherently implements Bayesian Occam’s Razor by integrating out the parameters. Under this framework, the use of the marginal likelihood for model selection provides a mathematically equivalent safeguard to frequentist cross-validation, as it evaluates the model's ability to generalise across the entire parameter space rather than just finding a maximum likelihood fit for the training data. Thus, models are penalised not just for the absolute number of parameters, but for their overall functional flexibility. A more complex model is only preferred if the improvement in model fit is substantial enough to outweigh this inherent penalty. The emergence of Model 7 as the winner (Bayes Factor = 6.11 against the next best alternative) confirms that its additional complexity is statistically justified.
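
      The Occam effect described above can be illustrated with a minimal sketch. This is not the authors' random-effects BMS procedure; it is a toy coin-flip comparison in which a fixed model (no free parameter) competes with a flexible model whose parameter is integrated out under a uniform prior. All function names and the toy counts are illustrative assumptions.

```python
from math import comb


def evidence_fixed(n: int, k: int) -> float:
    """Marginal likelihood of one specific sequence of n flips with k heads
    under M0: P(heads) is fixed at 0.5, so there is nothing to integrate."""
    return 0.5 ** n


def evidence_flexible(n: int, k: int) -> float:
    """Marginal likelihood of the same sequence under M1: P(heads) = p with
    p ~ Uniform(0, 1). Integrating p^k (1 - p)^(n - k) over p gives
    k! (n - k)! / (n + 1)! = 1 / ((n + 1) * C(n, k))."""
    return 1.0 / ((n + 1) * comb(n, k))


def bayes_factor_flexible_vs_fixed(n: int, k: int) -> float:
    """Bayes factor in favour of the flexible model M1 over the fixed model M0."""
    return evidence_flexible(n, k) / evidence_fixed(n, k)


# Toy data (hypothetical): 20 flips.
# With 10 heads the flexible model fits no better, so its spread-out prior
# costs it evidence: the Bayes factor is about 0.27 (the simpler model wins).
print(bayes_factor_flexible_vs_fixed(20, 10))
# With 18 heads the extra flexibility is justified: the Bayes factor is about 263.
print(bayes_factor_flexible_vs_fixed(20, 18))
```

      In this sketch the flexible model is never given an explicit parameter-count penalty; it loses evidence only because its prior spreads probability over outcomes that did not occur.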

      Furthermore, in this study we provided an external validation of these recovered parameters by demonstrating that they explain 62% of the variance in an independent, real-world, clinical task (ROCF copy). This empirical evidence confirms that our model captures robust mechanistic phenotypes rather than idiosyncratic noise. We have updated the Results and Discussion to explicitly state these.

      In Results: (p10)

      “We used random-effects Bayesian model selection to identify the most plausible generative model. This process relies on the marginal likelihood (model evidence), which inherently balances model fit against complexity—a principle often referred to as Occam’s razor.[26–28]”

      In Discussion: (p17)

      “Importantly, the risk of overfitting is mitigated by the Bayesian Model Selection framework; by utilising the marginal likelihood for model comparison, the procedure inherently penalises excessive model complexity and promotes generalisability.[26–28,43] This generalisability was further evidenced by the model's ability to predict performance on the independent ROCF task, confirming that these parameters represent robust mechanistic phenotypes rather than idiosyncratic fits to the initial dataset.”

      (6) n. of participants.

      (a) The authors write the following: "A total of healthy volunteers (21 young adults, mean age = 24.1 years; 21 older adults, mean age = 72.4 years) participated in this study. Their demographics are shown in Table 1. All participants were recruited locally in Oxford." However, Table 1 reports the data from more than 80 participants, divided into 4 groups. Details about the PD and AD groups are missing. Please clarify.

      We apologize for this lack of clarity in the text. We have rewritten and expanded the “Participants” section and corrected Table 2 in the Methods section to reflect the correct number of participants.

      In Methods (p20):

      “A total of 87 participants completed the study: 21 young healthy adults (YC), 21 older healthy adults (EC), 23 patients with Parkinson’s disease (PD), and 22 patients with Alzheimer’s disease (AD). Their demographic and clinical details are summarised in Table 2. Initially, 90 participants were recruited (22 YC, 21 EC, 25 PD, 22 AD); however, three individuals (1 YC and 2 PD) were excluded from all analyses due to technical issues during data acquisition.

      All participants were recruited locally in Oxford, UK. None were professional artists, had a history of psychiatric illness, or were taking psychoactive medications (excluding standard dopamine replacement therapy for PD patients). Young participants were recruited via the University of Oxford Department of Experimental Psychology recruitment system. Older healthy volunteers (all >50 years of age) were recruited from the Oxford Dementia and Ageing Research (OxDARE) database.

      Patients with PD were recruited from specialist clinics in Oxfordshire. All had a clinical diagnosis of idiopathic Parkinson's disease and no history of other major neurological or psychiatric conditions. While specific dosages of dopamine replacement therapy (e.g., levodopa equivalent doses) were not systematically recorded, all patients were tested while on their regular medication regimen ('ON' state).

      Patients with PD were recruited from clinics in the Oxfordshire area. All had a clinical diagnosis of idiopathic Parkinson’s disease and no history of other major neurological or psychiatric illnesses. While all patients were tested in their regular medication ‘ON’ state, the specific pharmacological profiles, including the exact types of medication (e.g., levodopa, dopamine agonists, or combinations) and dosages, were not systematically recorded. Disease duration and PD severity were also not recorded for this study.

      Patients with AD were recruited from the Cognitive Disorders Clinic at the John Radcliffe Hospital, Oxford, UK. All AD participants presented with a progressive, multidomain, predominantly amnestic cognitive impairment. Clinical diagnoses were supported by structural MRI and FDG-PET imaging consistent with a clinical diagnosis of AD dementia (e.g., temporo-parietal atrophy and hypometabolism).[70] All neuroimaging was reviewed independently by two senior neurologists (S.T. and M.H.).

      Global cognitive function was assessed using the Addenbrooke’s Cognitive Examination-III (ACE-III).[71] All healthy participants scored above the standard cut-off of 88, with the exception of one elderly participant who scored 85. In the PD group, two participants scored below the cut-off (85 and 79). In the AD group, six participants scored above 88; these individuals were included based on robust clinical and radiological evidence of AD pathology rather than their ACE-III score alone.”

      (b) As modelling results rely heavily on the quality of eye movements and eye traces, I believe it is necessary to report details about eye movement calibration quality and eye traces quality for the 4 experimental groups, as noisier data could be expected from naïve and possibly older participants, especially in case of clinical conditions. Potential differences in quality between groups should be discussed in light of the results obtained and whether these could contribute to the observed patterns.

      Thank you for pointing this out. We have revised the Methods to describe how calibration was performed:

      (p27) “Prior to the experiment, a standard nine-point calibration and validation procedure was performed. Participants were instructed to fixate a small black circle with a white centre (0.5 degrees) as it appeared sequentially at nine points forming a 3 x 3 grid across the screen. Calibration was accepted only if the mean validation error was below 0.5 degrees and the maximum error at any single point was below 1.0 degree. If these criteria were not met, or if the experimenter noticed significant gaze drift between blocks, the calibration procedure was repeated. This calibration ensured high spatial accuracy across the entire display area, facilitating the precise monitoring of fixations on item frames and saccadic movements to the response colour wheel.”
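
      The acceptance rule described above can be sketched as follows. This is an illustrative check only: the function name, the constant names, and the input format (one validation error in degrees per grid point) are assumptions, and the eye tracker's own software would normally report these values.

```python
from statistics import mean

# Thresholds taken from the quoted Methods text (degrees of visual angle).
MEAN_ERROR_LIMIT_DEG = 0.5
MAX_ERROR_LIMIT_DEG = 1.0
N_GRID_POINTS = 9  # 3 x 3 calibration grid


def calibration_accepted(validation_errors_deg: list[float]) -> bool:
    """Return True only if the mean validation error is below 0.5 degrees
    and the error at every single point is below 1.0 degree."""
    if len(validation_errors_deg) != N_GRID_POINTS:
        raise ValueError("expected one validation error per grid point (9)")
    mean_ok = mean(validation_errors_deg) < MEAN_ERROR_LIMIT_DEG
    max_ok = max(validation_errors_deg) < MAX_ERROR_LIMIT_DEG
    return mean_ok and max_ok
```

      A single outlying point is enough to trigger recalibration even when the mean error is small, which is why both criteria are checked.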

      Moreover, as detailed in our response to Point 4, while the PD group exhibited lower compliance, there was no interaction between group and saccade condition for compliance (p = 0.151). This confirms that any noise or trial attrition was distributed evenly across experimental conditions. Consequently, the observed "saccade cost" (the difference in error between conditions) is not an artefact of unequal noise but represents a genuine mechanistic impairment in spatial updating. We have updated the Methods to clarify this distinction.

      Furthermore, our Bayesian framework explicitly estimates precision (random noise) as a distinct parameter from updating cost (saccade cost). This allows the model to partition the variance: even if a clinical group is "noisier" overall, this is captured by the precision parameter, ensuring it does not inflate the specific estimate of saccade-driven memory impairment.

      (7) Figure 5. I suggest reporting these results using boxplots instead of barplots, as the former gives a better overview of the distributions.

      We appreciate the suggestion to use boxplots to better illustrate data distributions. However, we have chosen to retain the current bar plot format due to the visual and statistical complexity of our 4 x 4 x 2 experimental design. Figure 5 represents 16 distinct distributions across four groups and four conditions for both location and colour measures; employing boxplots/violins for this density of data would significantly increase visual clutter and make the figure difficult to parse.

      Furthermore, the primary objective of this figure is to reflect the statistical analysis: it illustrates group differences in overall performance and highlights the specific finding that patients with AD were significantly more impaired across all conditions than the YC, EC, and PD groups. Our statistical focus remains on the mean effects, specifically the significant main effect of group (F(3, 318) = 59.71, p < 0.001) and the critical null interaction between group and condition (p = 0.90). The error measure most relevant to these comparisons is the standard error of the mean (SEM), rather than the interquartile range (IQR). We think that bar plots provide the most straightforward and scannable representation of these mean differences and the consistent pattern of decay across cohorts for the final manuscript layout.

      To address the reviewer’s request for distributional transparency, we have provided a version of Figure 5 using grouped boxplots in the supplementary material (Supplementary figure 2). We note, however, that the spread of raw data points in these plots does not directly reflect the variance associated with our within-subject statistical comparisons.

      (8) Results specificity, trans-saccadic integration and ROCF. The authors demonstrate that the derived model parameters account for a significant amount of variability in ROCF performance across the experimental groups tested (Figure 8A). However, it remains unclear how specific the modelling results are with respect to the ROCF.

      The ROCF is generally interpreted as a measure of constructional ability. Nevertheless, as with any cognitive test, performance can also be influenced by more general, non-specific abilities that contribute broadly to test success. To more clearly link the specificity between modelling results and constructional ability, it would be helpful to include a test measure for which the model parameters would not be expected to explain performance, for example, a verbal working memory task.

      I am not necessarily suggesting that new data should be collected. However, I believe that the issue of specificity should be acknowledged and discussed as a potential limitation in the current context.

      We appreciate this important point regarding the discriminant validity of our findings. We agree that cognitive performance in clinical populations is often influenced by a general "g-factor" or non-specific executive decline. However, we chose the ROCF Copy task specifically because it is a hallmark clinical measure of constructional ability that effectively serves as a real-world transsaccadic task, requiring participants to integrate spatial information across hundreds of saccades between the model figure and the drawing surface.

      To address the reviewer’s concern regarding specificity, we leveraged the fact that all participants completed the ACE-III, which includes a dedicated verbal memory component (the ACE Memory subscale). We conducted a partial correlation analysis and found that the relationship between transsaccadic working memory and ROCF copy performance remains highly significant (rho = -0.46, p < 0.001), even after controlling for age, education, and the ACE-III Memory subscale score. This suggests that the link between transsaccadic updating and constructional ability is mechanistically specific rather than a byproduct of global cognitive impairment. We have substantially revised the Discussion to highlight this link and the supporting statistical evidence.
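
      One common way to compute a partial rank correlation of this kind is sketched below. The authors' exact procedure is not specified here, so this is an assumption-laden illustration: variables are rank-transformed and the standard recursive partial-correlation formula is applied. All function names and the toy data are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, covariates):
    """Correlation of x and y after removing the covariates one at a time,
    using r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))."""
    if not covariates:
        return pearson(x, y)
    z, rest = covariates[0], covariates[1:]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def partial_spearman(x, y, covariates):
    """Partial Spearman correlation: rank-transform everything, then partial out."""
    ranked_covariates = [average_ranks(c) for c in covariates]
    return partial_corr(average_ranks(x), average_ranks(y), ranked_covariates)


# Made-up toy values, not data from the study.
memory_error = [1, 2, 3, 4, 5, 6]
copy_score = [2, 1, 4, 3, 6, 5]
covariate = [1, 3, 2, 5, 4, 6]
print(partial_spearman(memory_error, copy_score, [covariate]))
```

      Additional covariates (for example age, education, and a memory subscale score) would be passed as further lists in `covariates`; a significance test would need to be added separately.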

      We first updated the last paragraph of Introduction:

      “Finally, by linking these mechanistic parameters to a standard clinical measure of constructional ability (the Rey-Osterrieth Complex Figure task), we demonstrate that transsaccadic updating represents a core computational phenotype underpinning real-world visuospatial construction in both health and neurodegeneration.”

      The new section in Discussion highlighting the ROCF copy link:

      “Importantly, our computational framework establishes a direct mechanistic link between transsaccadic updating and real-world constructional ability. Specifically, higher saccade and angular encoding errors contribute to poorer ROCF copy scores. By mapping these mechanistic estimates onto clinical scores, we found that the parameters derived from our winning model explain approximately 62% of the variance in constructional performance across groups. These findings suggest that the computational parameters identified in the LOCUS task represent core phenotypes of visuospatial ability, providing a mechanistic bridge between basic cognitive theory and clinical presentation.

      This relationship provides novel insights into the cognitive processes underlying drawing, specifically highlighting the role of transsaccadic working memory. Previous research has primarily focused on the roles of fine motor control and eye-hand coordination in this skill.[4,50–55] This is partly because of a consistent failure to find a strong relation between traditional memory measures and copying ability.[4,31] For instance, common measures of working memory, such as digit span and Corsi block tasks, do not directly predict ROCF copying performance.[31,56] Furthermore, in patients with constructional apraxia, performance on these memory measures often remains relatively preserved despite significant drawing impairments.[56–58] In the literature, this lack of association has often been attributed to “deictic” visual-sampling strategies, characterised by frequent eye movements that treat the environment as an external memory buffer, thereby minimising the need to maintain a detailed internal representation.[4,59] In a real-world copying task, the ROCF requires a high volume of saccades, making it uniquely sensitive to the precision of the dynamic remapping signals identified here. Recent eye-tracking evidence confirms that patients with AD exhibit significantly more saccades and longer fixations during figure copying compared to controls, potentially as a compensatory response to transsaccadic working memory constraints.[56] This high-frequency sampling, averaging between 150 and 260 saccades for AD patients compared to approximately 100 for healthy controls, renders the task highly dependent on the precision of dynamic remapping signals.[56] We also found that the relationship between transsaccadic working memory and ROCF performance remains highly significant (rho = -0.46, p < 0.001), even after controlling for age, education, and ACE-III Memory subscore. Consequently, transsaccadic updating may represent a discrete computational phenotype required for visuomotor control, rather than a non-specific proxy for global cognitive decline.[58]

      In other words, even when visual information is readily available in the world, drawing performance depends critically on working memory across saccades. This reveals a fundamental computational trade-off: while active sampling strategies (characterised by frequent eye-hand movements) effectively reduce the load on capacity-limited working memory, they simultaneously increase the demand for precise spatial updating across eye movements. By treating the external world as an "outside" memory buffer, the brain minimises the volume of information it must hold internally, but it becomes entirely dependent on the reliability with which that information is remapped after each eye movement. This perspective aligns with, rather than contradicts, the traditional view of active sampling, which posits that individuals adapt their gaze and memory strategies based on specific task demands.[3,60] Furthermore, this perspective provides a mechanistic framework for understanding constructional apraxia; in these clinical populations, the impairment may not lie in a reduced memory "span," but rather in the cumulative noise introduced by the constant spatial remapping required during the copying process.[58,61]

      Beyond constructional ability, these findings suggest that the primary evolutionary utility of high-resolution spatial remapping lies in the service of action rather than perception. While spatial remapping is often invoked to explain perceptual stability,[11–13,15] the necessity of high-resolution transsaccadic memory for basic visual perception is debated.[13,62–64] A prevailing view suggests that detailed internal models are unnecessary for perception, given the continuous availability of visual information in the external world.[13,44] Our findings support an alternative perspective, aligning with the proposal that high-resolution transsaccadic memory primarily serves action rather than perception.[13] This is consistent with the need for precise localisation in eye-hand coordination tasks such as pointing or grasping.[65] Even when unaware of intrasaccadic target displacements, individuals rapidly adjust their reaching movements, suggesting direct access of the motor system to remapping signals.[66] Further support comes from evidence that pointing to remembered locations is biased by changes in eye position,[67] and that remapping neurons reside within the dorsal “action” visual pathway, rather than the ventral “perception” visual pathway.[13,68,69] By demonstrating a strong link between transsaccadic working memory and drawing (a complex fine motor skill), our findings suggest that precise visual working memory across eye movements plays an important role in complex fine motor control.”

      We are deeply grateful to the reviewers for their meticulous reading of our manuscript and for the constructive feedback provided throughout this process. Your insights have significantly enhanced the clarity and rigour of our work.

      In addition to the changes requested by the reviewers, we wish to acknowledge a reporting error identified during the revision process. In the original Results section, the repeated measures ANOVA statistics for YC included Greenhouse-Geisser corrections, and the between-subjects degrees of freedom were incorrectly reported as within-subjects residuals. Upon re-evaluation of the data, we confirmed that the assumption of sphericity was not violated; therefore, we have removed the unnecessary Greenhouse-Geisser corrections and corrected the degrees of freedom throughout the Results and Methods sections. We have ensured that these statistical updates are reflected accurately in the revised manuscript and that they do not alter the significance or interpretation of any of our primary findings.

      We hope that these revisions address all the concerns raised and provide a more robust account of our findings. We look forward to your further assessment of our work.

    1. Consolidating Co-education around the Child and the Adolescent: Summary and Guidelines

      Analytical Summary

      Co-education should not be seen as a mere wish or a distant ideal, but as an established fact.

      Every child or adolescent grows up within an ecosystem of multiple educators (family, school, leisure organizations, media).

      The main challenge is not to merge these roles, but to consolidate this ecosystem by preserving the specific role of each actor while ensuring overall coherence.

      This coherence rests on shared projects, balanced educational behaviors (avoiding both unpredictability and rigidity), and close collaboration in the face of modern societal challenges, such as managing digital tools.

      1. Co-education: A De Facto Ecosystem

      Co-education is intrinsic to the child's development. As soon as an individual leaves isolation, they are exposed to a multiplicity of educational influences that make up their everyday environment.

      A plurality of actors: Education is not the work of parents alone.

      It includes school, leisure clubs, the extended family, friends, and media influences (television, internet, social networks).

      The notion of an ecosystem: This set of interactions forms a framework within which the child develops.

      The different educators complement one another de facto, each exerting an influence on the construction of the subject.

      2. The Principle of Role Specificity

      A fundamental pillar of successful co-education is respect for each actor's areas of competence and vocation. Collaboration must not lead to a confusion of roles.

      Maintaining identities: Each educator must keep their specific role. Parents are not meant to become teachers, and teachers must not intrude into the private family sphere.

      Differentiated methods: A leisure activity leader may address academic concepts (such as proportionality), but must do so in ways suited to their own setting, not by strictly reproducing school methods.

      Complementarity rather than imitation: The aim of bringing adults together is not to try to resemble one another or to act identically, but to organize an effective complementarity.

      3. Levers for Educational Coherence

      Although specificity is essential, it must not lead to incoherence.

      The document highlights two main ways to harmonize the actions of adults:

      Implementing shared projects

      Coherence can arise from concrete actions that simultaneously involve several spheres of the child's life.

      Example: Residential field trips ("classes découvertes") or school outings that bring together parents, teachers, and after-school activity leaders around a single objective.

      Harmonizing educational behaviors

      The aim is to weave a system that serves the child's development, helping them understand the world and become an autonomous subject.

      A healthy educational system is defined by its structure:

      | Type of environment | Characteristics | Impact on the child |
      | --- | --- | --- |
      | Unpredictable environment | Unpredictable. Adults' reactions (sanction or praise) are not consistent. | The child cannot anticipate the consequences of their actions. |
      | Rigid environment | Rules defined in advance, immutable and not open to discussion. Everything is confined within strict norms. | Absence of dialogue and of questioning. |
      | Balanced environment | A framework that is reassuring but flexible. Rules can be discussed as events arise. | Fosters reflexivity and dialogue between child and adult. |

      4. A Shared Challenge: Managing Digital Technology

      Co-education is called upon especially in the face of complex societal issues, of which the use of mobile phones and digital tools is the most salient example.

      The impossibility of an isolated solution: Neither parents, nor teachers, nor specialized educators can legislate on or resolve the question of screens on their own.

      The need for coherent "legislation": Adults have every interest in consulting one another to adopt coherent behaviors and rules around this object.

      The solution lies in consultation and in establishing a shared line of conduct within the ecosystem.

      Conclusion

      Consolidating co-education rests on a constructive paradox: working together while remaining different.

      Bringing adults together is not an end in itself, but a means of structuring a predictable and reflective environment for the child.

      By establishing a constant dialogue and agreeing on coherent behaviors in the face of modern challenges, educators foster an ecosystem conducive to the autonomy and overall development of the child and the adolescent.

    1. If an airplane seat was designed with little leg room, assuming people’s legs wouldn’t be too long, then someone who is very tall, or who has difficulty bending their legs would have a disability in that situation.

      Although I wouldn't consider this particular example a "disability," I think it definitely affects a large group of people who don't fit into the "average height" category. As someone who's 5'5", I've never had to deal with this issue, as airplane seats are usually quite comfy for me, but when I met my boyfriend I found out relatively quickly that booking flights is a pretty big hassle for this reason. A flight of 3 hours or less could be bearable, but a 15-hour flight around the world is a nightmare for anyone who is taller than average. You have to either pay extra to choose a seat by the emergency exit or keep your legs in an awkward position in the aisle for the entirety of the flight. It's easy to overlook the minority of a population when designing something for consumers, but it is essential to keep the minority in mind, especially because their lives could be at risk if the device or technology being designed is health-related.

    1. 10.5. Design Analysis: Accessibility. We want to provide you, the reader, a chance to explore accessibility more. In this activity you will be looking at a social media site on your device (e.g., your phone or computer). We will again follow the five step CIDER method (Critique, Imagine, Design, Expand, Repeat). So open a social media site on your device (the website or app may have additional accessibility settings, but don’t use those for now, just consider how it works as it is currently). Then do the following (preferably on paper or in a blank computer document): 10.5.1. Critique (3-5 minutes, by yourself): What assumptions do the site and your device make about individuals or groups using social media, which might not be true or might cause problems? List as many as you can think of (bullet points encouraged). 10.5.2. Imagine (2-3 minutes, by yourself): Select one of the above assumptions that you think is important to address. Then write a 1-2 sentence scenario where a user faces difficulties because of the assumption you selected. This represents one way the design could exclude certain users. 10.5.3. Design (3-5 minutes, by yourself): Brainstorm ways to change the site or your device to avoid the scenario you wrote above. List as many different kinds of potential solutions you can think of – aim for ten or more (bullet points encouraged). 10.5.4. Expand (5-10 minutes, with others): Combine your list of critiques with someone else’s (or if possible, have a whole class combine theirs). 10.5.5. Repeat the Imagine and Design Tasks: Select another assumption from the list above that you think is important to address. Make sure to choose a different assumption than you used before. Choose one that you didn’t come up with yourself, if possible. Repeat the Imagine and Design steps. 10.5.6. Explore accessibility settings: Now, try to find the accessibility settings on the social media site and on your device. For each setting you see, try to come up with what disabilities that setting would be beneficial for (there may be multiple).

      This activity is a really effective way to make accessibility feel concrete instead of abstract. By starting with critique and assumptions, it highlights how many “default” design choices silently exclude users before accessibility settings are even considered. I especially like how the Imagine and Design steps force you to think through a specific user’s experience and then brainstorm multiple solutions, rather than jumping straight to a single fix. Ending with exploring existing accessibility settings also reinforces that accessibility is often an afterthought in design, even though it should be part of the core system from the beginning.

    1. This is not apologetics for neocolonialism, or nostalgia for ‘Eurafrica’.3 If our neighbours’ houses fall into disrepair, there are repercussions for us. In the twenty-first century large parts of North Africa have fallen apart and none of it looks secure.

      This statement sheds light on how much instability there has been in North Africa during the 21st century. That instability creates challenges for European security and policy. The quote emphasizes the article's broader theme so far: geography and stability matter. Europe cannot ignore the problems happening outside its borders.

    1. Therefore, even if we adhere to the belief that news reporters are impartial and objective truth-tellers, we must still accept that there will always be parts of the story that are left out and some stories that are never told. The question here is: Who decides on which stories to tell and how to tell them? The theoretical perspectives discussed in this section offer different views of the filtering process that leads to the creation of crime news content.

      Crime is viewed through the choices people make and the power people have, for example journalists, because sometimes they can try to be objective.

    2. It is not possible for the media to tell us about everything that happens in the world, the details of every crime committed, the harm done to every victim, the behaviour of every law enforcement officer, the decisions of every court, or the rehabilitation of every offender. There is necessarily a selection process, a filtering of the range of stories available to determine which ones are “news.” Theorists from different perspectives argue that the stories that become news normally carry one or more features called news values or newsworthiness criteria

      What I got from this paragraph is that the media cannot report every event, so reporting news involves a selection process in which only certain stories are chosen based on what is considered newsworthy.

    1. I was in your shoes and I dove in head first. After reading, owning, and watching countless videos on the matter, here's what I have learned: Don't buy online. Only buy what you can have your hands on before exchanging money. Be picky; don't just get any machine on the belief you'll start fixing them. Do not view them as being "rescued" when you buy another broken machine. Start off with a solid machine with no issues (I suggest an Olympia brand, sm-3, etc.). Honorable mention: only acquire organically through yard sales, estate sales, antique stores, etc. It imbues your machine with magic 🪄

      via u/Forge_Le_Femme

    1. The arguments I have examined can be sorted into three somewhat overlapping types: religion causes violence because it is (1) absolutist, (2) divisive, and (3) insufficiently rational.

      argument for religion causing violence

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We are grateful for the reviewers' constructive comments and suggestions, which contributed to improving our manuscript. We are pleased to see that our work was described as an "interesting manuscript in which a lot of work has been undertaken". We are also encouraged by the fact that the experiments were considered "on the whole well done, carefully documented, and support most of the conclusions drawn," and that our findings were viewed as providing "mechanistic insight into how HNRNPK modulates prion propagation" and potentially offering "new mechanical insight of hnRNPK function and its interaction with TFAP2C."

      We conducted several new experiments and revised specific sections of the manuscript, as detailed below in the point-by-point response in this letter.

      Referee #1

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The paper by Sellitto describes studies to determine the mechanism by which hnRNPK modulates the propagation of prions. The authors use cell models lacking HNRNPK, which is lethal, in a CRISPR screen to identify genes that suppress lethality. Based on this screen in 2 different cell lines, a gene termed Tfap2C emerged as a candidate for interaction with HNRNPK. They show that Tfap2C counteracts the actions of HNRNPK with respect to prion propagation. Cells lacking HNRNPK show increased PrPSc levels. Overexpression of Tfap2C suppresses PrPSc levels. These effects on PrPSc are independent of PrPC levels. By RNAseq analysis, the authors hone in on metabolic pathways regulated by HNRNPK and Tfap2C, then follow the data to autophagy regulation by mTor. Ultimately, the authors show that short-term treatments of these cell models with mTor inhibitors cause increased accumulation of PrPSc. The authors conclude that the loss of HNRNPK leads to reduced energy metabolism causing mTor inhibition, which reduces translation by dephosphorylation of S6.

      Major comments:

      1) Fig H and I, Fig 3L. The interaction between Tfap2C and HNRNPK is pretty weak. The interaction may not be consequential. The experiment seems to be well controlled, yielding limited interaction. The co-ip was done in PBS with no detergent. The authors indicate that the cells were mechanically disrupted. Since both of these are DNA binding proteins, is it possible that the observed interaction is due to proximity on DNA that links the 2 proteins? Including a DNase treatment would clarify.

      Response: We agree that the observed co-IP between Tfap2c and hnRNP K is weak (previously Fig. 2H-I and Supp. Fig. 3L, now moved to Supp. Fig. 4C-E), and we have now highlighted this in the relevant section of the manuscript to better reflect this observation.

      Importantly, the co-IP was performed using endogenous proteins without overexpression or tagging, which can sometimes artificially enhance protein-protein interactions. However, we acknowledge that the use of a detergent-free lysis buffer and mechanical disruption alone may have limited nuclear protein extraction and solubilization, potentially contributing to the low co-IP signal.

      To address the reviewer's concerns and clarify whether the observed interaction could be DNA-mediated, we repeated the co-IP experiments under low-detergent conditions and included benzonase nuclease treatment to digest nucleic acids (Fig. 2H-I). DNA digestion was confirmed by agarose gel electrophoresis (Supp. Fig. 4F-G). Additionally, we performed the reciprocal IPs using both hnRNP K and Tfap2c antibodies (Fig. 2H-I). Although the level of co-immunoprecipitation remains modest, these updated experiments continue to demonstrate a specific co-immunoprecipitation between Tfap2c and hnRNP K, independent of DNA bridging. These additional controls and experimental refinements strengthen the validity of our findings. These results are also attached here for your convenience.

      2) Supplemental Fig 5B - The western blot images for pAMPK don't really look like a 2 fold increase in phosphorylation in HNRNPK deletion.

      Response: We thank the reviewer for raising this point. We re-examined the original pAMPK western blot (previously Supp. Fig. 5B; now presented as Supp. Fig. 6B) and confirmed the reported results. We note that the overall loading is not perfectly uniform across lanes (as suggested by the actin signal), which may affect the visual impression of band intensity. However, the phosphorylation change reported in the manuscript is based on the pAMPK/total AMPK ratio, which accounts for differences in AMPK expression and accurately reflects relative phosphorylation levels. To further address this concern, we performed three additional independent experiments. These new data reproduce the increase in pAMPK/AMPK upon HNRNPK deletion and are now included in the revised Supplementary Fig. 6B, together with the updated quantification. The new blot and the quantification are also attached here for your convenience.

      3) Fig. 5A - I don't think it is proper to do statistics on an n of 2.

      Response: We believe the reviewer's comment refers to Fig. 5B, as Fig. 5A already has sufficient replication. We have now added two additional replicates, bringing the total to four. The updated statistical analysis corroborates our initial results. The new quantification is provided in the revised manuscript (Fig. 5B) along with the new blot (Supp. Fig. 6C). Both data are also attached here for your convenience.

      4) Fig 6D. The data look a bit more complicated than described in the text. At 7 days, compared to 2 days, it looks like there is a decrease in % cells positive for 6D11. Is there clearance of PrPSc or proliferation of un-infected cells?

      Response: We have now reworded our text in the results paragraph as follows:

      "These data show that TFAP2C overexpression and HNRNPK downregulation bidirectionally regulate prion levels in cell culture."

      We have now also included the following comments in the discussion section:

      "However, prion propagation relies on a combination of intracellular PrPSc seeding and amplification, as well as intercellular spread, which together contribute to the maintenance and expansion of infected cells within the cultured population. In this study, we were limited in our ability to dissect which specific steps of the prion life cycle are affected by TFAP2C. We also cannot fully exclude the possibility that TFAP2C overexpression influenced the relative proliferation of prion-infected versus uninfected cells in the PG127-infected HovL culture, thereby contributing to the observed reduction in the percentage of 6D11+ cells and overall 6D11+ fluorescence. However, we did not observe any signs of cell death, growth impairment, or increased proliferation under TFAP2C overexpression in PG127-infected HovL cells compared to NBH controls (data not shown). This suggests that a negative selective pressure on infected cells or a proliferative advantage of uninfected cells is unlikely in this context".

      5) The authors might consider a different order of presenting the data. Fig 6 could follow Fig. 2 before the mechanistic studies in Figs 3-5.

      Response: We believe that the current order of presenting the data is more appropriate. The first part of the manuscript focuses on the genetic and functional interactions between hnRNP K and its partners, particularly TFAP2C, which is a critical point for understanding the broader context before delving into the mechanistic studies involving prion-infected cells.

      6) The authors use SEM throughout the paper and while this is often used, there has been some interest in using StdDev to show the full scope of variability.

      Response: We chose to use SEM as it reflects the precision of the mean, which is central to our statistical comparisons. As the reviewer notes, this is a common and appropriate practice. To address variability, almost all graphs already include individual data points, which provide a direct visual representation of data spread. To further enhance clarity, we have now included StdDev in the Supplementary Source Data table of the revised manuscript.
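      The distinction drawn in this response can be made concrete with a minimal sketch. It is an illustration only, not the authors' analysis code: the function name `sd_and_sem` and the replicate values are hypothetical. The standard deviation describes the spread of the replicates, whereas the SEM is that spread divided by the square root of the number of replicates and so describes the precision of the mean.

```python
import math
import statistics


def sd_and_sem(values):
    """Return (sample standard deviation, standard error of the mean)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two replicates")
    sd = statistics.stdev(values)  # sample SD, uses an n - 1 denominator
    sem = sd / math.sqrt(n)        # SEM shrinks as the number of replicates grows
    return sd, sem


# Hypothetical replicate measurements, for illustration only.
replicates = [1.0, 2.0, 3.0, 4.0]
sd, sem = sd_and_sem(replicates)
print(f"SD = {sd:.3f}, SEM = {sem:.3f}")
```

      With four replicates the SEM is exactly half the SD, which is why error bars drawn as SEM look tighter than the underlying scatter of individual data points.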

      Discussion:

      The discrepancy between short-term and long-term treatments with mTor inhibitors is only briefly mentioned with a bit of a hand-waving explanation. The authors may need a better explanation.

      Response: We have now integrated a more detailed explanation in the discussion section of the revised manuscript as follows:

      "Previous studies showed that mTORC1/2 inhibition and autophagy activation generally reduce, rather than increase, PrPSc aggregation (79, 80). The reason for this discrepancy remains unclear and may be multifactorial. First, most prior studies were based on long-term mTOR inhibition, whereas our work examined acute inhibition, mimicking the time frame of HNRNPK and TFAP2C manipulation. Acute inhibition may trigger transient metabolic or signaling shifts that differ from adaptive changes associated with mTOR chronic inhibition, potentially overriding autophagy's effects on prion propagation. Additionally, while previous works were primarily conducted in murine in vivo models, our study focused on a human cell system propagating ovine prions. Differences in species background, model complexity (e.g., interactions between different cell types), and prion strain variability, as certain strains exhibit distinct responses to autophagy and mTOR modulation (https://doi.org/10.1371/journal.pone.0137958), likely contributed to the observed differences".

      Minor comments:

      Page 12 - no mention of chloroquine in the text or related data.

      Page 12 - Supp. Fig. E - should be 5E

      Response: We thank the reviewer for pointing this out. We have now better highlighted the use of chloroquine in Fig. 5B (see reviewer #1 - Point 3 - Major comments) and in the text as follows:

      "Furthermore, in the presence of chloroquine, LC3-II levels rose almost proportionally across all conditions (Fig. 5B), suggesting that the effects of HNRNPK and TFAP2C on autophagy occur at the level of autophagosome formation, rather than autophagosome-lysosome fusion and degradation."

      We have corrected the reference to Supp. Fig. 5E.

      Reviewer #1 (Significance (Required)):

      The study provides mechanistic insight into how HNRNPK modulates prion propagation. The paper is limited to cell models, and the authors note that long term treatment with mTor inhibitors reduced PrPSc levels in an in vivo model.

      The primary audience will be other prion researchers. There may be some broader interest in the mTor pathway and the role of HNRNPK in other neurodegenerative diseases.

      Referee #2

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The manuscript "Prion propagation is controlled by a hierarchical network involving the nuclear Tfap2c and hnRNP K factors and the cytosolic mTORC1 complex" by Sellitto et al. aims to examine how heterogeneous nuclear ribonucleoprotein K (hnRNPK) limits prion propagation. They perform a synthetic-viability CRISPR ablation screen to identify epistatic interactors of HNRNPK. They found that deletion of Transcription factor AP-2γ (TFAP2C) suppressed the death of hnRNP K-depleted LN-229 and U-251 MG cells whereas its overexpression hypersensitized them to hnRNP K loss. Moreover, HNRNPK ablation decreased cellular ATP, downregulated genes related to lipid and glucose metabolism and enhanced autophagy. Simultaneous deletion of TFAP2C reversed these effects, restored transcription and alleviated energy deficiency. They state that HNRNPK and TFAP2C are linked to mTOR signalling and observe that HNRNPK ablation inhibits mTORC1 activity through downregulation of mTOR and Rptor while TFAP2C overexpression enhances mTORC1 downstream functions. In prion-infected cells, TFAP2C activation reduced prion levels and countered the increased prion propagation due to HNRNPK suppression. Pharmacological inhibition of mTOR also elevated prion levels and partially mimicked the effects of HNRNPK silencing. They state their study identifies TFAP2C as a genetic interactor of HNRNPK and implicates their roles in mTOR metabolic regulation and establishes a causative link between these activities and prion propagation.

      This is an interesting manuscript in which a lot of work has been undertaken. The experiments are on the whole well done, carefully documented and support most of the conclusions drawn. However, there are places where it was quite difficult to read as some of the important results are in the supplementary Figures and it was necessary to go back and forth between the Figs in the main body of the paper and the supplementary Figs. There are also Figures in the supplementary which should have been presented in the main body of the paper. These are indicated in our comments below.

      We have the following questions /points:

      Major comments:

      1) A plasmid harbouring four guide RNAs driven by four distinct constitutive promoters is used for targeting HNRNPK. Is there a reason for using 4 guides? Is it simply to obtain maximal editing? In their experience, is this required for all genes or specific to HNRNPK?

      Response: The use of four guide RNAs driven by distinct promoters was chosen to maximize editing efficiency for HNRNPK. As previously demonstrated by J. A. Yin et al. (Ref. 32), this system provides better efficiency for gene knockout (or activation). For HNRNPK, achieving full knockout was crucial for observing a complete lethal phenotype, which made the four-guide-RNA approach fundamental. However, other knockout systems, while potentially less efficient, have been shown to work well in other circumstances. We have now included this explanation in the revised manuscript as follows:

      "We employed a plasmid harboring quadruple non-overlapping single-guide RNAs (qgRNAs), driven by four distinct constitutive promoters, to target the human HNRNPK gene and maximize editing efficiency in polyclonal LN-229 and U-251 MG cells stably expressing Cas9 (32)."

      2) Is there a minimal amount of Cas9 required for editing?

      Response: We did not observe a correlation between Cas9 levels and activity, yet the C3 clone was the one with higher Cas9 expression and higher activity (Supp. Fig. 1A-B). We agree that comments about the amount of Cas9 expression may be misleading here. Thus, in the first result paragraph of the revised manuscript, we have now modified the text "we isolated by limiting dilutions LN-229 clones expressing high Cas9 levels" to "we isolated by limiting dilutions LN-229 single-cell clones expressing Cas9".

      3) It is stated that cell death is delayed in U251-MG cells compared to LN-229-C3 cells- why? Also, why use glioblastoma cells other than that they have high levels of HNRNPK? Would neuroblastoma cells be more appropriate if they are aiming to test for prion propagation?

      Response: As shown in Fig. 1A, U251-MG cells reached complete cell death at day 13, while LN-229 C3 reached it already at day 10. The percentage of viable U251-MG cells is higher (statistically significant) than LN-229 C3 cells at all time points before day 13, when both lines show complete death. The underlying reasons for this partial and relative resistance are probably multiple, but we clearly showed in Fig. 2 that TFAP2C differential expression is one modulator of cell sensitivity to HNRNPK ablation.

      We selected glioblastoma cells because their high expression of HNRNPK was essential for developing our synthetic lethality screen strategy, and we have now clarified it in the revised manuscript as follows:

      "As model systems, we chose the human glioblastoma-derived LN-229 and U-251 MG cell lines, which express high levels of HNRNPK (2, 3), a key factor for optimizing our synthetic lethality screen."

      While neuroblastoma cells might be more relevant in terms of prion neurotoxicity, glial cells, despite their resistance to prion toxicity, are fully capable of propagating prions. Prion propagation in glial cells has been shown to play crucial roles in mediating prion-dependent neuronal loss in a non-autonomous manner (see 10.1111/bpa.13056). This makes glioblastoma cells a valuable model for studying prion propagation (that is the focus of our study), despite the lack of direct toxicity (which is not the focus of our study). We have now added this explanation to the revised manuscript as follows:

      "Therefore, we continued our experiments using LN-229 cells, which provide a relevant model for studying prions, as glial cells can propagate prions and contribute to prion-induced neuronal loss through non-cell-autonomous mechanisms."

      4) Human CRISPR Brunello pooled library- does the Brunello library use constructs which have four independent guide RNAs as used for the silencing of HNRNPK?

      Response: No, the Human CRISPR Brunello pooled library does not use constructs with four independent guide RNAs (qgRNAs). Instead, each gene is targeted by 4 different single-guide RNAs (sgRNAs), each expressed on a separate plasmid. We have now clarified this in the main text of the revised manuscript as follows:

      "To identify functionally relevant epistatic interactors of HNRNPK, we conducted a whole-genome ablation screen in LN-229 C3 cells using the Human CRISPR Brunello pooled library (33), which targets 19,114 genes with an average of four distinct sgRNAs per gene, each expressed by a separate plasmid (total = 76,441 sgRNA plasmids)."

      5) To rank the 763 enriched genes, they multiply the -log10FDR with their effect size - is this a standard step that is normally undertaken?

      Response: The approach of ranking hits using the product of effect size and statistical significance is a well-established method in CRISPR screening studies. This strategy has been explicitly used in high-impact work by Martin Kampmann and others (see https://doi.org/10.1371/journal.pgen.1009103 and https://doi.org/10.1016/j.neuron.2019.07.014 as references). We have now added both references to the revised manuscript.
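      The ranking described in this response, the product of effect size and -log10(FDR), can be sketched as follows. This is a hedged illustration rather than the authors' pipeline: the function name `rank_hits`, the gene labels, and all numeric values are hypothetical.

```python
import math


def rank_hits(hits):
    """Rank (gene, effect_size, fdr) tuples by effect_size * -log10(fdr), highest first."""
    scored = []
    for gene, effect_size, fdr in hits:
        if not 0 < fdr <= 1:
            raise ValueError(f"FDR for {gene} must be in (0, 1]")
        score = effect_size * -math.log10(fdr)
        scored.append((gene, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical screen hits, for illustration only.
hits = [
    ("GENE_A", 2.0, 0.001),  # moderate effect, very significant
    ("GENE_B", 3.0, 0.1),    # large effect, weakly significant
    ("GENE_C", 1.0, 0.01),   # small effect, significant
]
ranked = rank_hits(hits)
for gene, score in ranked:
    print(gene, round(score, 2))
```

      Multiplying the two quantities rewards hits that are both strong and statistically well supported, so a gene with a large effect but weak significance does not automatically outrank a gene with a moderate but well-supported effect.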

      6) The 32 genes selected- they were ablated individually using constructs with one guide RNA or four guide RNAs?

      Response: The 32 genes selected were ablated individually using constructs with quadruple-guide RNAs (qgRNAs), as this approach was intended to maximize editing efficiency for each gene. We have now clarified this in the main text of the revised manuscript as follows:

      "We ablated each gene individually using qgRNAs and then deleted HNRNPK."

      7) The identified targets were also tested in U251-MG cells and nine were confirmed but the percent viability was variable - is the variability simply a reflection of the different cell line?

      Response: The variability in percent viability observed in U251-MG cells likely reflects the inherent differences between cell lines, which can contribute to varying levels of susceptibility to gene ablation, even for the same targets. We have now highlighted these small differences in the main text of the revised manuscript as follows:

      "We confirmed a total of 9 hits (Fig. 1H), including the ELPs gene IKBKAP and the transcription factor TFAP2C, the two strongest hits identified in LN-229 C3 cells. However, in the U251-Cas9 the rescue effect did not always fall within the exact range observed in LN-229 C3 cells, likely due to intrinsic differences between the two cell lines."

      8) The two strongest hits were IKBKAP and TFAP2C. As TFAP2C is a transcription factor - is it known to modulate expression of any of the genes that were identified to be perturbed in the screen? Moreover, it is stated that it regulates expression of several lncRNAs- have the authors looked at expression of these lncRNAs- is the expression affected- can modulation of expression of these lncRNAs modulate the observed phenotypic effects and also some of the targets they have identified in the screen?

      Response: While TFAP2C is a transcription factor known to regulate the expression of several genes and lncRNAs, we did not identify any of its known target genes among the hits of our screen. However, our RNA-seq data and RT-qPCR (data not shown) indicate that the expression of lncRNA MALAT1 and NEAT1 (reported to interact with both HNRNPK and TFAP2C; ref 37, 41, 47) is strongly affected by HNRNPK ablation and to a lesser extent by TFAP2C deletion. However, the double deletion condition does not appear to change these lncRNA levels beyond what is observed with HNRNPK ablation alone. Therefore, we concluded that these changes do not play a primary role in the phenotypic effects observed in our study. Thus, although interesting, we believe that the description of such observations goes beyond the scope of this manuscript and the relevance of this work.

      9) As both HNRNPK and TFAP2C modulate glucose metabolism, the authors have chosen to explore the epistatic interaction. This is most reasonable.

      Response: We do not have further comments on this point.

      10) The orthogonal assay to confirm that deletion of TFAP2C suppresses cell death upon removing HNRNPK- was this done using a single guide RNA or multiple guides - is there a level of suppression required to observe rescue? Interestingly ablation of HNRNPK increases TFAP2C expression in LN-229-C3 whereas in U251-Cas9 cells HNRNPK ablation has the opposite effect- both RNA and protein levels of TFAP2C are decreased - is this the cause of the smaller protective effect of TFAP2C deletion in this cell line?

      Response: TFAP2C deletion was performed using quadruple-guide RNAs (qgRNAs). We have clarified this point in our response to reviewer #2's point 6 in "Major comments".

      We did not directly test the threshold of TFAP2C inhibition required to suppress HNRNPK ablation-induced cell death. We cannot exclude that other effectors play a role in the smaller protective effect of TFAP2C deletion in U251-Cas9 cells; however, multiple lines of evidence from our study suggest that TFAP2C expression levels influence cellular sensitivity to HNRNPK loss:

      1) Both LN-229 C3 and U251-Cas9 cells are less sensitive to HNRNPK ablation upon TFAP2C deletion (Fig. 1G-H, Fig. 2A-B, Supp. Fig.3A-B).

      2) We observed a correlation between endogenous TFAP2C levels and HNRNPK ablation sensitivity. In U251-Cas9 cells, TFAP2C expression is reduced upon HNRNPK ablation, in contrast to LN-229 C3 cells, where HNRNPK ablation leads to an increase in TFAP2C expression (Fig. 2C-F). Correspondingly, U251-Cas9 cells are a) less sensitive to HNRNPK deletion than LN-229 C3 cells (Fig. 1A, 2A-B) and b) less strongly protected by TFAP2C deletion than LN-229 C3 cells (Fig. 1G-H, Fig. 2A-B, Supp. Fig.3A-B).

      3) TFAP2C overexpression experiments (Fig. 2G) establish a causal relationship to the former correlation: TFAP2C overexpression increased U251-Cas9 sensitivity to HNRNPK ablation.

      As clearly mentioned in the manuscript, we believe that, taken together, these findings strongly demonstrate a causal role for TFAP2C in modulating sensitivity to HNRNPK loss. Thus, despite the differences in the expression, the proposed viability interaction between TFAP2C and HNRNPK is conserved across cell lines.

      To further strengthen our conclusions, we have now added LN-229 C3 TFAP2C overexpression in Fig. 2G (also attached below for your convenience). As for the U251-Cas9, LN-229 C3 cells show increased sensitivity to HNRNPK ablation upon TFAP2C overexpression.

      11) Nuclear localisation studies indicate that the HNRNPK and TFAP2C proteins colocalise in the nucleus however the co-IP data is not convincing- although appropriate controls are present, the level of interaction is very low - the amount of HNRNPK pulled down by TFAP2C is really very low in the LN-229C3 cells and even lower in the U251-Cas9 cells. Have they undertaken the reciprocal co-IP expt?

      Response: We rephrased our text to better highlight this as also mentioned in our response to reviewer #1 (Point 1 - Major comments). However, as also noted by the reviewer, the experiments included all the relevant controls. Thus, the results are solid and confirm a degree of co-immunoprecipitation (although weak). As detailed in our response to reviewer #1 (Point 1 - Major comments), to strengthen our conclusion, we have now repeated the experiment in low-detergent conditions and used benzonase nuclease for DNA digestion. We also have performed the reciprocal experiment as suggested by the reviewer, confirming the initial results. In our opinion, these additional experiments support the conclusion that Tfap2c and hnRNP K co-immunoprecipitate through a weak, but direct, interaction.

      12) They state that LN-229 C3 ∆TFAP2C and U251-Cas9 ∆TFAP2C were only mildly resistant to the apoptotic action of staurosporine Fig 3E and F - I accept they have undertaken the stats which support their statement that at high concentrations of staurosporine the LN-229 C3 ∆TFAP2C cells are less sensitive but the U251-Cas9 ∆TFAP2C decreased sensitivity is hard to believe. Has this been replicated? I agree that HNRNPK deletion causes apoptosis in both LN-229 C3 and U251-Cas9 cells and this is blocked by Z-VAD-FMK - however the block is not complete- the max viability for HNRNPK deletion in LN-229 C3 cells is about 40% whereas for U251-Cas9 cells it is about 30% - does this suggest that cells are being lost by another pathway? Have they tested concentrations higher than 10nM?

      Response: The experiments in Fig. 3E-F have been replicated four times, as stated in the figure legend. We agree that TFAP2C plays a limited role in response to staurosporine-induced apoptosis, particularly in U251-Cas9 cells. To ensure clarity, we have now modified our previous sentence as follows:

      "LN-229 C3ΔTFAP2C cells were only mildly resistant to the apoptotic action of staurosporine, and U251-Cas9ΔTFAP2C showed even lower and minimal recovery (Fig. 3E-F). These results indicate that TFAP2C plays a limited role in apoptosis regulation and suggest that its suppressive effect on HNRNPK essentiality is not mediated through direct modulation of apoptosis but rather through upstream processes that eventually converge on it."

      The incomplete blockade of apoptosis by Z-VAD-FMK suggests that HNRNPK ablation may activate alternative, non-caspase-mediated cell death pathways. Regarding this point, we decided to not test Z-VAD-FMK above 10 nM as we noted that the rescue effect at the lowest concentration (2nM) was not proportionally increasing at higher concentrations, suggesting we already reached saturation. We have now added and clarified these observations in the revised manuscript as follows:

      "Z-VAD-FMK decreased cell death consistently and significantly in LN-229 C3 and U251-Cas9 cells transduced with HNRNPK ablation qgRNAs (Fig. 3C‑D), confirming that HNRNPK deletion promotes cell apoptosis. However, we observed that viability recovery plateaued already at the lowest concentration (2 nM) without further increase at higher doses, suggesting a saturation effect. This indicates that while caspase inhibition alleviates part of the cell death, HNRNPK loss triggers additional mechanisms beyond apoptosis".

      Following the suggestion of the reviewer, we have now also tested two higher concentrations of Z-VAD (20 and 50nM) in LN-229 cells. At these concentrations, we observed a slight decrease in cell viability in the NT condition, with a rescue effect in the HNRNPK-ablated cells comparable to what was observed at 2-10nM Z-VAD. For this reason, we did not include these data in the revised manuscript, and we attached them here for transparency.

      13) The RNA-seq comparisons- the authors use log2 FC

      Response: We used a log2 FC threshold of >0.5 and 0.25) is commonly used in RNA-seq studies to capture biologically relevant shifts (e.g., https://doi.org/10.1371/journal.ppat.1012552; https://doi.org/10.1371/journal.ppat.1008653; https://doi.org/10.1016/j.neuron.2025.03.008; https://doi.org/10.15252/embj.2022112338). We complemented this analysis with Gene Set Enrichment Analysis (GSEA) to assess coordinated changes in biological/genetic pathways, ensuring that our conclusions are not based on isolated, minor expression changes nor on arbitrary thresholds. Finally, to enhance the robustness of our results, we applied False Discovery Rate (FDR) statistics, which is more stringent than a p-value cutoff. We hope this clarification strengthens the reviewer's confidence in the significance of the observed changes.
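      The combination of a log2 fold-change cutoff with an FDR cutoff can be illustrated with a minimal sketch. Several things here are assumptions rather than statements about the authors' pipeline: the Benjamini-Hochberg procedure is used as one common way of computing FDR (the response does not name the method), the symmetric cutoff of |log2 FC| > 0.5 is a guess because only ">0.5" is legible in the text above, the FDR cutoff of 0.05 is illustrative, and all gene names and values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_genes(genes, log2fc_cutoff=0.5, fdr_cutoff=0.05):
    """Keep genes passing both the |log2 FC| and the FDR cutoff (assumed values)."""
    fdrs = benjamini_hochberg([g["p"] for g in genes])
    return [
        g["name"]
        for g, fdr in zip(genes, fdrs)
        if abs(g["log2fc"]) > log2fc_cutoff and fdr < fdr_cutoff
    ]


# Hypothetical differential-expression results, for illustration only.
genes = [
    {"name": "GENE_A", "log2fc": 1.2, "p": 0.001},
    {"name": "GENE_B", "log2fc": 0.3, "p": 0.002},   # significant but small change
    {"name": "GENE_C", "log2fc": -0.8, "p": 0.01},
    {"name": "GENE_D", "log2fc": 0.9, "p": 0.02},
    {"name": "GENE_E", "log2fc": 2.0, "p": 0.5},     # large change but not significant
]
selected = select_genes(genes)
print(selected)
```

      A log2 FC of 0.5 corresponds to roughly a 1.41-fold change on the linear scale, so the fold-change filter alone is permissive; requiring the FDR cutoff as well removes genes whose apparent change is not statistically supported.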

      14) It is stated: "Accordingly, we observed increased AMPK phosphorylation (pAMPK) upon ablation of HNRNPK, which was consistently reduced in LN-229 C3ΔTFAP2C cells (Supp. Fig. 5B). LN-229 C3ΔTFAP2C; ΔHNRNPK cells also showed a partial reduction of pAMPK relative to LN-229 C3ΔHNRNPK cells (Supp. Fig. 5B). These results suggest that hnRNP K depletion causes an energy shortfall, leading to cell death."

      I am not totally convinced by the data presented in this Fig. The authors have quantified the band intensity and present the ratio of pAMPK to AMPK. Please note that the actin levels are variable across the samples - did they normalise the data using the actin level before undertaking the comparisons? Also, if the authors think this is an important point which supports their conclusion, then it should be in the main body of the paper rather than the supplementary. If AMPK is being phosphorylated, this should lead to activation of the metabolic check point which involves p53 activation by phosphorylation. Activated p53 would turn on p21CIP1 which is a very sensitive indicator of p53 activation.

      Response: We also refer the reviewer to our response to reviewer #1 (Point 2 - Major comments). We understand the point of the reviewer as pAMPK/Actin (absolute AMPK phosphorylation) may provide additional context regarding the downstream effects of AMPK activation, which, however, is not the primary scope of our experiment. We believe that in our specific case, a) the pAMPK/AMPK ratio is the most appropriate metric, as it reflects the energy status of the cell (ATP/AMP levels), which was our main point to assess in this experiment, and b) phospho-protein/total protein is the standard approach for quantifying phosphorylation ratio. For completeness, we have now included pAMPK/Actin quantifications in Supp. Fig. 6B of the revised manuscript (also attached below). pAMPK/Actin levels follow the same trend of pAMPK/AMPK in HNRNPK and TFAP2C single ablations. The pAMPK/AMPK partial rescue in HNRNPK;TFAP2C double ablation relative to HNRNPK single deletion is instead not observed at pAMPK/Actin level. We have now added the pAMPK/Actin quantification and this observation to the revised manuscript as follows:

      "Accordingly, we observed increased AMPK phosphorylation (pAMPK/AMPK ratio and pAMPK/Actin) upon ablation of HNRNPK, with a trend toward reduction in LN-229 C3ΔTFAP2C cells (Supp. Fig. 6B). LN-229 C3ΔTFAP2C;ΔHNRNPK cells also showed a reduction of pAMPK/AMPK ratio relative to LN-229 C3ΔHNRNPK cells, although absolute AMPK phosphorylation (pAMPK/Actin) remained high (Supp. Fig. 6B)."

      We prefer to keep the AMPK blots in Supplementary Fig. 6B, as we believe the main take-home message of the manuscript should remain centered on mTORC1 activity.
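      The two normalizations discussed under this point can be compared with a minimal sketch. It is an illustration under stated assumptions, not the authors' quantification: the function name `normalize_phospho`, the lane labels, and every band intensity are hypothetical. It shows that the phospho/total ratio and the phospho/loading-control ratio can change by different amounts when the total protein level itself shifts.

```python
def normalize_phospho(phospho, total, loading):
    """Return (phospho/total, phospho/loading-control) for one lane."""
    if total <= 0 or loading <= 0:
        raise ValueError("total and loading-control intensities must be positive")
    return phospho / total, phospho / loading


# Hypothetical densitometry values (arbitrary units), for illustration only.
lanes = {
    "control": {"pAMPK": 100.0, "AMPK": 200.0, "actin": 400.0},
    "knockout": {"pAMPK": 300.0, "AMPK": 300.0, "actin": 400.0},
}

ratios = {
    name: normalize_phospho(v["pAMPK"], v["AMPK"], v["actin"])
    for name, v in lanes.items()
}

# Fold change of each metric relative to the control lane.
fold_ratio = ratios["knockout"][0] / ratios["control"][0]
fold_absolute = ratios["knockout"][1] / ratios["control"][1]
print(f"pAMPK/AMPK fold change:  {fold_ratio:.2f}")
print(f"pAMPK/actin fold change: {fold_absolute:.2f}")
```

      In this made-up example total AMPK also rises in the knockout lane, so the phosphorylated fraction doubles while the actin-normalized signal triples. The two metrics answer different questions, the fraction of the protein that is phosphorylated versus the absolute amount of phosphorylated protein per cell equivalent.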

      15) We also do not understand why the mTOR Suppl. Fig. 5E is not in the main body of the paper. It's clear that RNA and protein levels of mTOR were downregulated in LN-229 C3ΔHNRNPK cells but were partially rebalanced by the ΔTFAP2C- however the ΔTFAP2C;ΔHNRNPK double deletion levels are only slightly higher than the ΔHNRNPK - they are not at the level of NT or even ΔTFAP2C (Fig. 4C, Supp. Fig. 5E).

      Response: We moved the mTOR blot to Fig.5D of the revised manuscript. About the low rescue effect, this is in line with all the other observations where a full rescue of the effects of HNRNPK ablation is never achieved, but is only partial. As suggested by reviewer #3 (Figure 5 - Point 2), we have now added RT-qPCR in Fig.5C, which corroborates these data.

      16) The authors state: "Deletion of HNRNPK diminished the highly phosphorylated forms of 4EBP1, which instead were preserved in both LN-229 C3ΔTFAP2C and LN-229 C3ΔTFAP2C;ΔHNRNPK cells (Fig. 5C). Similarly, the S6 phosphorylation ratio was reduced in LN-229 C3ΔHNRNPK cells and was restored in the ΔTFAP2C;ΔHNRNPK double-ablated cells (Fig. 5C)."

      We are not convinced that p4EBP1 is preserved in the LN-229 C3ΔTFAP2C cells: there is a very faint band, which is at a lower level than the band in the LN-229 C3ΔHNRNPK cells. However, when both HNRNPK and TFAP2C were ablated, the p4EBP1 band is clear cut. I agree with the quantitation that deletions of HNRNPK and TFAP2C both reduce the level of 4EBP1; the reduction is greater with TFAP2C, but when both are deleted together the levels of 4EBP1 are higher and p4EBP1 is clearly present. In quantifying the S6 and pS6 levels, did the authors consider the actin levels? They present a ratio of pS6 to S6. I may be lacking some understanding, but why is the ratio of pS6/S6 being calculated? Is the level of pS6 not what is important? Phosphorylation of S6 should lead to its activation, and thus it is the actual level of pS6 that is important, not the ratio to the non-phosphorylated protein.

      Response: In Fig. 5C, the three-band pattern of 4EBP1 is clearly visible in the NT+NT or WT condition, with the top band representing the highest phosphorylation state. Upon HNRNPK deletion, this top band almost completely disappears, mimicking the effect of our starvation control (Starv.). This top band remains clearly visible in both TFAP2C-ablated and double-ablated cells, supporting our conclusion. In our original text, we referred to the "highly phosphorylated forms" of 4EBP1, which might have caused some confusion, suggesting we were evaluating the two top bands. We are specifically referring only to the very top band (high p4EBP1), which represents the most highly phosphorylated form of 4EBP1. This is the relevant phosphorylated form to focus on, as it is the only one that disappears in the starvation control (Starv.) or upon mTORC1/2 inhibition with Torin-1 (Fig. 7B).

      To clarify these points, we have now indicated the "high p4EBP1" band more clearly with an asterisk in Fig. 5E, added quantification of high p4EBP1/4EBP1, and rephrased the text as follows:

      "Deletion of HNRNPK diminished the highest phosphorylated form of 4EBP1 (high p4EBP1, marked with an asterisk), mimicking the effect observed in starved cells (Starv.). This high p4EBP1 band was preserved in both LN-229 C3ΔTFAP2C and LN-229 C3ΔTFAP2C;ΔHNRNPK cells (Fig. 5C)."

      Regarding pS6 quantification, we added pS6/Actin quantification in Supp. Fig. 6E and F of the revised manuscript, also attached here for your convenience.

      17) When determining ATP levels, do they control for cell number? HNRNPK depletion results in lower ATP levels, and co-deletion of TFAP2C rescues this. But could this be because there is less cell death, so that more cells contribute ATP? Have they controlled for relative numbers of cells?

      Response: As described in the Materials and Methods, we normalized ATP levels to total protein content, which is a standard approach for this type of quantification (see DOI:10.1038/nature19312).
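As a minimal sketch of the normalization described in this response, the snippet below divides an ATP readout by total protein and then expresses each sample relative to a control. The function names, sample labels, and numbers are hypothetical and are not taken from the manuscript or its Methods.

```python
def normalize_atp(atp_signal, protein_ug):
    """ATP readout per microgram of total protein."""
    if protein_ug <= 0:
        raise ValueError("protein amount must be positive")
    return atp_signal / protein_ug


def relative_to_control(samples, control):
    """Express protein-normalized ATP of every sample relative to the control."""
    per_protein = {
        name: normalize_atp(atp, protein) for name, (atp, protein) in samples.items()
    }
    reference = per_protein[control]
    return {name: value / reference for name, value in per_protein.items()}


# Hypothetical (ATP luminescence, total protein in ug) pairs, illustration only.
samples = {"NT": (1000.0, 20.0), "dHNRNPK": (300.0, 10.0)}
relative = relative_to_control(samples, "NT")
print(relative)
```

With these made-up values the raw ATP signal of the second sample is 30% of the control, but after accounting for its lower protein content the normalized level is 60%, which is the kind of cell-mass effect the reviewer asks about.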

      18) The construction of the HovL cell line that propagate ovine prions - very few details are provided of the susceptibility of the cell line to PG127 prions.

      Response: As with other prion-infected cell lines, HovL cells do not exhibit any specific growth defects, susceptibilities, or phenotypes beyond their ability to propagate prions. This is consistent with established observations in prion research, where immortalized cell lines (and in general in vitro cultures) normally do not show cytotoxicity upon prion infection and, therefore, are used as models for prion propagation rather than for prion toxicity (see https://doi.org/10.1111/jnc.14956 for reference).

      We have now expanded the relevant section, including technical and conceptual details, in the main text of the revised manuscript as follows:

      "As reported for other ovinized cell models (66), HovL cells were susceptible to infection by the PG127 strain of ovine prions and capable of sustaining chronic prion propagation, as shown by proteinase K (PK)-digested western blot and by detection of PrPSc using the anti-PrP antibody 6D11, which selectively stains prion-infected cells after fixation and guanidinium treatment (67) (Supp. Fig. 7C-E). Consistent with most prion-propagating cell lines (68), HovL cells did not exhibit specific growth defects, susceptibilities, or overt phenotypes beyond their ability to propagate prions."

      19) It is stated that HNRNPK depletion from HovL cells increases PrPSc as determined by 6D11 fluorescence, but in the manuscript HNRNPK depletion results in cell death. How does this come together?

      Response: As explicitly stated in the main text and shown in Fig.6-7, HNRNPK is downregulated (via siRNAs) in the prion experiments rather than fully deleted (via CRISPR) as in the first part of the manuscript. As shown in Supp. Fig. 8B, this downregulation does not affect cell viability within the experimental time window. Therefore, the observed increase in PrPSc levels upon HNRNPK downregulation, as determined by western blot and 6D11 staining, is independent of any potential cell death effects. Moreover, the same siRNA downregulation approach was used by M. Avar et al. (Ref. 26) in comparable experiments, yielding similar outcomes.

      20) They show that mTOR inhibition mimics the effect of HNRNPK deletion, why didn't they overexpress mTOR and see if that rescues this? This would indicate a causal relationship.

      Response: We appreciate the reviewer's suggestion. We agree that the proposed rescue strategy would be the best approach to indicate a causal relationship. However, we linked the activity of the mTORC1 complex (and not only that of mTOR) to prion propagation. Overexpression of only mTOR would not restore full mTORC1 function, as Rptor would still be downregulated in the context of HNRNPK siRNA silencing (Fig. 7A and Supp. Fig. 8E). Moreover, our RNA-seq data (Supp. Table 5) from HNRNPK ablation indicate the downregulation of other mTORC1 components (namely Pras40 (AKT1S1) and mLST8). Therefore, the rescue of the mTORC1 activity by an overexpression strategy would be a very challenging approach. Given these complexities, to infer causality, we used mTORC1 inhibition (via rapamycin and Torin-1) to mimic the effects of HNRNPK downregulation in reducing mTORC1 activity (Fig. 7B).

      For clarification, we have now highlighted in Fig. 4C that HNRNPK ablation also downregulates AKT1S1 and mLST8, in addition to mTOR and Rptor (also attached below), and we have discussed this in the main text as well. We have also clarified in the revised manuscript (where we sometimes inadvertently referred to it as just mTOR inhibition) that the observed effects are due to mTORC1 inhibition, and not simply mTOR inhibition.

      21) Flow cytometric data (supplementary figure for Fig. 6D): when looking at fixed cells, the gating strategy for cells results in the inclusion of a lot of debris. The gate needs to be moved and made more specific to ensure results are interpreted properly. The same applies to the singlet gating: it is not tight enough and includes doublets as well, which will skew the data. The data need to be regated.

      Response: We have reanalyzed the flow cytometry data in Fig. 6D with a more stringent gating approach to better exclude debris and ensure proper singlet selection. We confirm that there is no change in the final interpretation of the results after applying the updated gating strategy.

      Reviewer #2 (Significance (Required)):

      The manuscript "Prion propagation is controlled by a hierarchical network involving the nuclear Tfap2c and hnRNP K factors and the cytosolic mTORC1 complex" by Sellitto et al. aims to examine how heterogeneous nuclear ribonucleoprotein K (hnRNPK) limits prion propagation. They perform a synthetic-viability CRISPR ablation screen to identify epistatic interactors of HNRNPK. They found that deletion of transcription factor AP-2γ (TFAP2C) suppressed the death of hnRNP K-depleted LN-229 and U-251 MG cells, whereas its overexpression hypersensitized them to hnRNP K loss. Moreover, HNRNPK ablation decreased cellular ATP, downregulated genes related to lipid and glucose metabolism, and enhanced autophagy. Simultaneous deletion of TFAP2C reversed these effects, restored transcription, and alleviated energy deficiency.

      Referee #3

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: Using a CRISPR-based high-throughput ablation assay, Sellitto et al. identified a list of genes that improve cell viability when deleted in hnRNP K knockout cells. Tfap2c, a transcription factor, was identified as a candidate with potential overlap with a hnRNP K function like modulating glucose metabolism. The deletion of Tfap2c in hnRNP K-deletion background prevented caspase-dependent apoptosis observed in hnRNP K single-deletion cells. Further analysis of bulk RNA-seq in hnRNP K/TFAP2C single- and double-deletion cells revealed the impairment in cellular ATP level. Accordingly, activation of AMPK led to perturbed autophagy in hnRNP K deleted cells. Moreover, the reduction and/or inactivation of the downstream mTOR protein resulted in the reduced phosphorylation of S6. Conversely, the phosphorylation of S6 and 4EBP1 can be increased by TFAP2C overexpression. Finally, the pharmacological inhibition of the mTOR pathway increased the PrPSc level. This is an interesting paper potentially providing new mechanistic insight into hnRNPK function and its interaction with TFAP2C. However, inconsistencies in TFAP2C expression across cell lines and conflicting mechanistic interpretations complicate conclusions. Co-IP experiments suggested hnRNP K and Tfap2c may interact, though further validation is needed. Several figures require additional clarification, statistical analysis, or experimental validation to strengthen conclusions.

      Major comments:

      1) Different responses of the TFAP2C expression level to deletion of hnRNPK in the two cell lines (LN-229 C3 and U251-Cas9) should be more adequately addressed. The manuscript focuses on the interaction between hnRNPK and TFAP2C, yet the hnRNPK deletion causes different changes in TFAP2C level in two different lines. Furthermore, in studies where the mechanistic link between hnRNPK and TFAP2C is being investigated, only results from the LN-229 line are presented (Figure 4-7). Thus, it is not clear whether these mechanisms also apply to another line, U251-Cas9, where hnRNPK deletion has the opposite effect on the TFAP2C level. Thus, key experiments should be performed in both lines.

      Response: The opposite effects of hnRNPK ablation on TFAP2C expression between LN-229 C3 and U251-Cas9 cells likely reflect intrinsic differences between the two cell lines. However, the viability interaction between hnRNPK and TFAP2C is conserved in both cell models (Fig. 1G-H, 2A-B, Supp. Fig. 3A-B), suggesting that shared molecular functions at the interface of this interaction exist across the lines. In fact, we believe that the opposite effect of hnRNPK ablation on TFAP2C expression in the two lines strengthens (rather than weakens) our model by highlighting how TFAP2C expression modulates cellular sensitivity to HNRNPK ablation, as detailed in our response to Reviewer #2 (Point 10 - Major comments).

      Regarding the mechanistic studies presented in Fig. 4-7, our initial goal in using two cell lines was to validate the functional viability interaction between HNRNPK and TFAP2C, as identified in our screening (performed in LN-229 C3 cells). After confirming this interaction, we chose to focus only on LN-229 C3 (beginning with RNA-seq analysis, which then led to subsequent mechanistic studies), as this provided the necessary foundation to investigate prion propagation in HovL cells (derived from LN-229). As a U251 model propagating prions does not exist, we are technically limited to performing prion experiments in HovL only, and we do not believe that conducting additional experiments in U251 cells would add substantial value to our work or further our investigation.

      We hope this explanation clarifies our rationale and addresses the reviewer's concerns.

      2) Although a lot of data are presented, it is not clear how deletion of TFAP2C reverses the toxicity caused by deletion of hnRNPK. Specifically, the first half of the paper seems to suggest a mechanism opposite to that of the second half. In Figure 2-4, the authors suggest a model in which TFAP2C deletion has the opposite effect of hnRNPK deletion, thus rescuing toxicity. However, in Figure 5-6, it is suggested that TFAP2C overexpression has the opposite effect of hnRNPK deletion. These two opposite effects of TFAP2C make it difficult to understand the models that the authors are proposing. Please also see below comment 2 for Figure 5.

      Response: We respectfully disagree with the notion that the first and second halves of the manuscript propose contradictory mechanisms.

      In Fig. 2-4, we describe the phenotypic rescue of cell viability upon TFAP2C deletion in hnRNPK-deficient cells. At this stage, we are not proposing a specific molecular mechanism but simply observing a rescue of viability and highlighting underlying transcriptional differences. There is no implication of an opposite molecular mechanism involving the individual activities of hnRNPK and TFAP2C; rather, we focused on the broader effect of TFAP2C deletion on the viability of HNRNPK-lacking cells. In Fig. 5, we isolated a partial mechanism underlying this interaction. We state that: "These data specify a role for TFAP2C in promoting mTORC1-mediated cell anabolism and suggest that its overexpression might hypersensitize cells to HNRNPK ablation by depleting the already limited ATP available, thus making its deletion advantageous". In the discussion, we have now further revised our explanation: "HNRNPK deletion might cause a metabolic impairment leading to a nutritional crisis and a catabolic shift, whereas TFAP2C activation could promote mTORC1 anabolic functions. Thus, Tfap2c removal may rewire the bioenergetic needs of cells by modulating the mTORC1 signaling and augmenting their resilience to metabolic stress like the one induced by HNRNPK ablation". Therefore, we propose that TFAP2C expression might be particularly detrimental in hnRNPK-deficient cells, as it could push the cell into an anabolic biosynthetic state, further depleting energy stores that the cell is attempting to conserve in response to hnRNPK depletion. Removal of TFAP2C alleviates this metabolic strain. In our view, there is no contradiction between our observations.

      We hope this explanation clarifies our rationale and resolves any perceived inconsistency in our model. To further enhance the understanding of our interpretations, we have now also added (in substitution of Fig. 5E of the original manuscript) a graphical scheme (Fig. 5G of the revised manuscript) to visually explain and illustrate our model (attached below for your convenience).

      3) Similar to the point above, the first half of the paper focuses on hnRNPK deletion-induced toxicity (Fig. 1-5), while the second half of the paper focuses on hnRNPK deletion-induced PrPSC level (Fig. 6-7). The mechanistic link between these two downstream effects of hnRNPK deletion is not clear and thus, it is difficult to understand the reason that hnRNPK deletion-induced toxicity can be rescued by TFAP2C deletion, while hnRNPK deletion-induced PrPSC level increase can be rescued by TFAP2C overexpression.

      Response: Our study is not aimed at comparing viability and prion propagation as interconnected phenotypes but rather at identifying molecular processes regulated by the HNRNPK-TFAP2C interaction. Our study identifies mTORC1 activity as a molecular process at the interface of the HNRNPK-TFAP2C interaction. HNRNPK knockout (or knockdown, which does not affect viability and is therefore used in the prion section of the manuscript) tones mTORC1 activity down, while TFAP2C overexpression enhances it. This finding suggested an explanation for the viability interaction we observed (see reply to reviewer #3 - Point 2 - Major comments) and provided a partial mechanism (mTORC1 activity) to explain the effect of HNRNPK knockdown and TFAP2C overexpression on prions.

      We hope this clarification addresses the reviewer's concern.

      Abstract:

      1) Please rephrase and clarify "We linked HNRNPK and TFAP2C interaction to mTOR signaling..." by distinguishing functional, genetic, and direct (molecule-to-molecule) interactions.

      Response: 1) We have now clarified it in the text of the revised manuscript as follows:

      "We linked HNRNPK and TFAP2C functional and genetic interaction to mTOR signaling, observing that HNRNPK ablation inhibited mTORC1 activity through downregulation of mTOR and Rptor, while TFAP2C overexpression enhanced mTORC1 downstream functions."

      2) A sentence reads, "...HNRNPK ablation inhibited mTORC1 activity through downregulation of mTOR and Rptor," although the downregulation of Rptor is observed only at the RNA level. The change in Rptor protein expression level is not reported in the manuscript. Please consider adding an experiment to address this or rephrase the sentence.

      Response: 2) We have now added the experiment in Supp. Fig. 9A of the revised manuscript. The blot shows that hnRNP K depletion reduces both mTOR and Rptor protein levels. "hnRNP K depletion inhibited mTORC1 activity through downregulation of mTOR and Rptor".

      Figure 2:

      1. H and I. Co-IP experiments were done using anti-TFAP2C antibody conjugated to the beads. Although the TFAP2C bands show robust signals on the blots, indicating successful enrichment of the protein, hnRNP K bands are very faint. Has the experiment been done by conjugating the hnRNP K antibody to the beads instead? Was the input lysate enriched in the nuclear fraction? Did the lysis buffer include nuclease (if so, please indicate in the figure legend and the methods section)? Addressing these would make the argument, "We also observed specific co-immunoprecipitation of hnRNP K and Tfap2c in LN-229 C3 and U251-Cas9 cells (Fig. 2H-I, Supp. Fig. 3L), suggesting that the two proteins form a complex inside the nucleus" stronger, providing information on potential direct binding.

      Response: 1. We refer the reviewer to our response to reviewers #1 and #2 regarding the weak interaction, the nuclease treatment, and the HNRNPK IP (reviewer #1 Point 1 and reviewer #2 Point 11 - Major comments). As for the co-IP input, it was not enriched in the nuclear fraction, but as shown in Supp. Fig. 4A-B hnRNPK and Tfap2c are exclusively nuclear.

      Figure 3:

      1. C and D. Please add a sentence in the figure legend explaining which means the multiple comparisons were made between (DMSO vs each drug concentration?). Graphing individual data points instead of bars would also be helpful and more informative. Please discuss the lack of dose dependency.

      Response: 1. We have now added information about the comparison in the figure legend ("Multiple comparison was made between Z-VAD-FMK and DMSO treatments in ΔHNRNPK cells."), modified the graph to show the individual data points (attached below for your convenience), and expanded the discussion as detailed for reviewer #2 (Point 14 - Major comments). For completeness, we have also modified Supp. Fig. 5F to show individual data points, and we have combined the graphs, as the DMSO control was shared across treatments.

      Supplemental Figure 4 (now Supplemental Figure 5):

      1. A. Although the trend can be observed, the deletion of hnRNP K does not significantly reduce the GPX4 protein level in LN-229 C3. Therefore, the following statement requires more data points and additional statistical analysis to be accurate: "In LN-229 C3 and U251-Cas9 cells, the deletion of HNRNPK reduced the protein level of GPX4, whereas TFAP2C deletion increased it (Supp. Fig. 4A-B)."

      2. A and B. The results are confusing, considering the previous report cited (ref 49) shows an increase in GPX4 with TFAP2C. It may be possible that the deletion of TFAP2C upregulates the expression of proteins with similar functions (e.g., Sp1). If this is the case, the changes in GPX4 expression observed here are a consequence of TFAP2C deletion and may not "suggest a role for HNRNPK and TFAP2C in balancing the protein levels of GPX4."

      Response: 1. We agree with the reviewer that in LN-229 C3 cells the reduction of GPX4 protein levels upon HNRNPK deletion did not reach statistical significance in our initial Western blot analysis. To address this concern, we performed six additional independent experiments and repeated the statistical analysis. Although the trend toward reduced GPX4 protein levels remained consistent, statistical significance was still not achieved (p > 0.05). Importantly, this trend is supported by our RNA-seq dataset (Supplementary Table 5), which shows decreased GPX4 expression upon HNRNPK deletion. We have now revised the text to more accurately reflect the experimental observations and to avoid overstating the effect in LN-229 C3 cells as follows:

      "In LN-229 C3 and U251-Cas9 cells, deletion of HNRNPK was associated with reduced glutathione peroxidase 4 (GPX4) protein abundance (although not statistically significant in LN-229 C3; p ≈ 0.08), whereas deletion of TFAP2C increased it (Supp. Fig. 5A-B)."

      The six new experimental replicates have been added to the uncropped western blot section.

      Response: 2. Concerning the potential role of TFAP2C deletion in upregulating proteins with similar functions, we recognize the reviewer's perspective. However, our primary focus is on the observed trends rather than a definitive mechanistic conclusion. We clarified our wording to acknowledge this possibility while maintaining the relevance of our findings within the broader context of hnRNPK and TFAP2C interactions.

      "This last result was interesting as a previous study reported that Tfap2c enhances GPX4 expression (51). Thus, the observed increase upon TFAP2C deletion suggests additional layers of regulation, potentially involving compensatory mechanisms."

      Supplemental Figure 5 (now Supplemental Figure 6):

      1. B. To obtain statistical significance and strengthen the conclusion, more repeated Western blot experiments can be done to quantify the pAMPK/AMPK ratio.

      Response: We included three more experiments as detailed in our response to reviewer #1 (Point 2 - Major comments) and reviewer #2 (Point 14 - Major comments).

      Figure 5:

      1. B. I believe statistical analysis with two replicates or less is not recommended. Although the assay is robust, and the blot is convincing, please consider adding more replicates if the blot is to be quantified and statistically analyzed.

      2. "Interestingly, RNA and protein levels of mTOR were downregulated in LN-229 C3ΔHNRNPK cells but were partially rebalanced by the ΔTFAP2C;ΔHNRNPK double deletion (Fig. 4C, Supp. Fig. E)." The statement is based on a slight difference at the protein level between the single deletion and the double deletion, as well as the observation from the bulk RNA-seq data. mTOR (and Rptor) mRNA level can be assessed by RT-qPCR to validate and further support the existing data. It is also curious why deletion of TFAP2C alone, also induced decrease in mTOR, but double deletion rescued mTOR level slightly compared to deletion of HNRNPK alone.

      3. C. The main text refers to the changes in the level of phosphorylated 4EBP1, stating, "Deletion of HNRNPK diminished the highly phosphorylated forms of 4EBP1, which instead were preserved in both LN-229 C3ΔTFAP2C and LN-229 C3ΔTFAP2C;ΔHNRNPK cells (Fig. 5C)." However, the quantification was done on the total 4EBP1, which may be because separating p4EBP1 and 4EBP1 bands on a blot is challenging. Please consider using a phospho-4EBP1-specific antibody or rephrasing the sentence mentioned above. The current data suggest that the single and double deletions of hnRNP K/TFAP2C affect the overall stability of 4EBP1, which may be a correlation and not due to the mTOR activity as claimed in "We conclude that HNRNPK and TFAP2C play an essential role in co-regulating cell metabolism homeostasis by influencing mTOR and AMPK activity and expression." How does the cap-dependent translation (or total protein level) change in TFAP2C-deleted and TFAP2C-overexpressing cells?

      Response: 1. We added two additional experiments as detailed in our response to reviewer #1 (Point 3 - Major comment).

      Response: 2. Deletion of TFAP2C does not decrease mTOR levels, as shown by the quantification in Fig. 5D. To further support our results, we have now included RT-qPCR in Fig. 5C as suggested by the reviewer. Data are also attached here for your convenience.

      Response: 3. Regarding the assessment of phosphorylated 4EBP1, we think we achieved a clear separation of the differently phosphorylated forms of 4EBP1 in our blots, and we have now added the quantification for High p4EBP1/4EBP1 in Fig. 5E (see also our response to reviewer #2 Point 16 - Major comments). The quantification of total 4EBP1 represents an additional dataset, and we do not claim that 4EBP1 stability is affected by HNRNPK and TFAP2C directly through mTOR, which could be, in fact, correlative. We claim that HNRNPK and TFAP2C modulate mTORC1 and AMPK metabolic signaling as shown by the changed phosphorylation of 4EBP1, S6, AMPK, and ULK1 (Fig. 5C-E, Supp. Fig. 6B, D) and by the regulation of autophagy (Fig. 5B, Supp. Fig. 6C); we did not directly check cap-dependent translation.

      We have now rephrased our text to ensure clarity as follows:

      "We conclude that HNRNPK and TFAP2C play a role in co-regulating mTORC1 and AMPK expression, signaling, and activity."

      Figure 6:

      1. A. Did the sihnRNP K increase the TFAP2C level?

      2. A and C. Are the total PrP levels lower in TFAP2C overexpressing cells compared to mCherry cells when they are infected?

      3. D. Do the TFAP2C protein levels differ between 2-day+72-h and 7-day+96-h?

      Response: 1. Yes, it does. We have now provided the quantification in Fig. 6A, C, and Supp. Fig. 8A (also attached below for your convenience).

      Response: 2. We have now provided the quantification in Fig. 6A and Supp. Fig. 8A. The total PrP does not change in TFAP2C overexpressing cells. Total PrP consists of both PK-resistant PrP (PrPSc) and PK-sensitive PrP (PrPC plus potential other intermediate species), with PrPSc typically present at much lower levels. In our model, PrPC is exogenously expressed at high levels via a vector and remains constant across conditions (Fig. 6C and Supp. Fig. 8C). As a result, any changes in PrPSc may not necessarily be reflected in total PrP levels.

      Response: 3. No, there is no statistically significant change. We have now added a representative western blot and the quantification of 3 independent replicates in Supp. Fig. 8D. The other two western blots are only shown in the uncropped western blots section. This dataset is also attached here for your convenience.

      Figure 7:

      1. I agree with the latter half of the statement: "These findings suggest that HNRNPK influences prion propagation at least in part through mTORC1 signaling, although additional mechanisms may be involved." The first half requires careful rephrasing since (A) independent of the background siRNA treatment, TFAP2C overexpression by itself can modulate PrPSc level as seen in Fig 6A and B, (B) although the increase in TFAP2C level is observed with the hnRNP K deletion (Fig 1; LN-229 C3), sihnRNP K treatment may or may not influence the TFAP2C level (Fig 6; quantified data not provided), and (C) in the sihnRNP K-treated cells, 4EBP1 level is increased compared to the siNT-treated cells, which was not observed in hnRNP K-deleted cells. Discussions and additional experiments (e.g., mTOR knockdown) addressing these points would be helpful.

      Response: A, B) We respectfully disagree with the possibility that HNRNPK downregulation may increase prion propagation via TFAP2C upregulation. As shown in Fig. 6A-B, D and in Supp. Fig. 8A, TFAP2C overexpression reduces, rather than increases, prion levels. Therefore, it would be inconsistent to suggest that HNRNPK siRNA promotes prion propagation through TFAP2C upregulation (quantification is now provided, see reviewer #3 - Figure 6 - Point 1). C) Concerning 4EBP1 levels, we have quantified the total 4EBP1 (also attached below) and expanded the discussion on potential discrepancies between HNRNPK knockout and knockdown, as the former affects cell viability, while the latter does not. However, as explained also in the previous reply to reviewer #3 - Figure 5 - Point 3, our focus is on the highly phosphorylated band of 4EBP1 (High p4EBP1), which is the direct target of mTORC1 activity. In both the hnRNPK knockout LN-229 C3 (Fig. 5E) and knockdown HovL models (Fig. 7B), phosphorylation of 4EBP1, along with phosphorylation of S6, is clearly reduced (we have now included quantification for Fig. 7B), reinforcing our conclusion that mTORC1 activity is affected by hnRNPK depletion. As the reviewer noted, we do not claim that mTORC1 is the sole mediator of hnRNPK's effect on prion regulation. However, we think that our interpretation of a potential and partial role of mTORC1 inhibition in the effect of HNRNPK downregulation on prion propagation is in line with the data presented in Fig. 6-7 and Supp. Fig. 8-9. For further clarification, we expanded the text according to the new experiments and analysis, and we added mTOR and Raptor siRNA knockdown (Supp. Fig. 9C) to further support our conclusions (also attached below for your convenience).

      Minor comments:

      1. Please clarify "independent cultures." Does this mean technical replicates on the same cell culture plate but different wells or replicated experiments on different days?

      Response: We have now clarified this in each figure legend. "Individually treated wells" means different parental cultures grown and treated separately on the same day; n represents independent experiments on different days.

      1. Fig 2G. Please explain how the sigmoidal curves were fitted to the data points under the materials and methods section.

      2. Fig 3E and F. Please refer to the comment on Fig 2G above.

      Response: We have now added the explanation in Materials and Methods as follows:

      "Curve Fitting

      For sigmoidal curve fitting, we used GraphPad Prism (version X, GraphPad Software). Data in Figure 2G were fitted using nonlinear regression with a least squares regression model. For Figures 3E and 3F, data fitting was performed using an asymmetric sigmoidal model with five parameters (5PL) and log-transformed X-values (log[concentration])."
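For readers unfamiliar with the five-parameter logistic (5PL) model mentioned in this Methods text, the sketch below evaluates one generic form of an asymmetric sigmoid on log-transformed concentrations. The function and parameter names are illustrative assumptions, the exact parameterization used by GraphPad Prism may differ, and the least-squares fitting step itself is not shown.

```python
def five_pl(log_x, bottom, top, log_xb, hill_slope, asymmetry):
    """Generic asymmetric (five-parameter) logistic curve.

    log_x      : log10 of the concentration
    bottom/top : lower and upper plateaus
    log_xb     : log10 position parameter of the transition
    hill_slope : steepness of the transition
    asymmetry  : exponent; a value of 1 gives the symmetric 4PL curve
    """
    denominator = (1.0 + 10.0 ** ((log_xb - log_x) * hill_slope)) ** asymmetry
    return bottom + (top - bottom) / denominator


# Hypothetical parameters for illustration only (plateaus at 0 and 100).
symmetric_mid = five_pl(-6.0, 0.0, 100.0, -6.0, 1.0, 1.0)
asymmetric_mid = five_pl(-6.0, 0.0, 100.0, -6.0, 1.0, 2.0)
print(symmetric_mid, asymmetric_mid)
```

When the asymmetry exponent equals 1 the curve passes through the midpoint of the two plateaus at the position parameter; other values shift the response at that position, which is what makes the curve asymmetric.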

      3. Fig S3 F/H. Quantification of gel bands would be helpful when comparing protein expression changes after different treatments, as band intensities look different across.

      Response: We have now added the quantifications in Supp. Fig. 3D-H (attached below for your convenience). They confirm that there are no significant differences in the means of the normalized values.

      5. Supp Fig 5C and F. These panels can be combined with the corresponding panels in main Figure 5 if space allows so that the readers do not have to flip pages between the main text and Supplemental material.

      __Response: __We have now combined the panels. Previous Supp. Fig. 5C and F are now shown in Fig. 6C and E, respectively.

      Reviewer #3 (Significance (Required)):

      This is an interesting paper potentially providing new mechanistic insight into hnRNPK function and its interaction with TFAP2C. It is also important to understand how hnRNPK deletion induces prion propagation and to develop methods to mitigate its spread. However, inconsistencies in TFAP2C expression across cell lines and conflicting mechanistic interpretations complicate conclusions. I have expertise in RNA-binding proteins, cell biology, and prion disease.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      Using a CRISPR-based high-throughput ablation assay, Sellitto et al. identified a list of genes that improve cell viability when deleted in hnRNP K knockout cells. Tfap2c, a transcription factor, was identified as a candidate with potential overlap with hnRNP K functions such as modulating glucose metabolism. The deletion of Tfap2c in the hnRNP K-deletion background prevented the caspase-dependent apoptosis observed in hnRNP K single-deletion cells. Further analysis of bulk RNA-seq in hnRNP K/TFAP2C single- and double-deletion cells revealed an impairment in cellular ATP level. Accordingly, activation of AMPK led to perturbed autophagy in hnRNP K-deleted cells. Moreover, the reduction and/or inactivation of the downstream mTOR protein resulted in reduced phosphorylation of S6. Conversely, the phosphorylation of S6 and 4EBP1 can be increased by TFAP2C overexpression. Finally, the pharmacological inhibition of the mTOR pathway increased the PrPSC level. This is an interesting paper potentially providing new mechanistic insight into hnRNPK function and its interaction with TFAP2C. However, inconsistencies in TFAP2C expression across cell lines and conflicting mechanistic interpretations complicate conclusions. Co-IP experiments suggested hnRNP K and Tfap2c may interact, though further validation is needed. Several figures require additional clarification, statistical analysis, or experimental validation to strengthen conclusions.

      Major comments:

      1. Different responses of the TFAP2C expression level to deletion of hnRNPK in the two cell lines (LN-229 C3 and U251-Cas9) should be more adequately addressed. The manuscript focuses on the interaction between hnRNPK and TFAP2C, yet the hnRNPK deletion causes different changes in TFAP2C level in the two lines. Furthermore, in studies where the mechanistic link between hnRNPK and TFAP2C is being investigated, only results from the LN-229 line are presented (Figure 4-7). Thus, it is not clear whether these mechanisms also apply to the other line, U251-Cas9, where hnRNPK deletion has the opposite effect on the TFAP2C level. Key experiments should therefore be performed in both lines.
      2. Although a lot of data are presented, it is not clear how deletion of TFAP2C reverses the toxicity caused by deletion of hnRNPK. Specifically, the first half of the paper seems to suggest a mechanism opposite to that of the second half. In Figure 2-4, the authors suggest a model in which TFAP2C deletion has the opposite effect of hnRNPK deletion, thus rescuing toxicity. However, in Figure 5-6, it is suggested that TFAP2C overexpression has the opposite effect of hnRNPK deletion. These two opposite effects of TFAP2C make it difficult to understand the models that the authors are proposing. Please also see below comment 2 for Figure 5.
      3. Similar to the point above, the first half of the paper focuses on hnRNPK deletion-induced toxicity (Fig. 1-5), while the second half of the paper focuses on hnRNPK deletion-induced PrPSC level (Fig. 6-7). The mechanistic link between these two downstream effects of hnRNPK deletion is not clear and thus, it is difficult to understand the reason that hnRNPK deletion-induced toxicity can be rescued by TFAP2C deletion, while hnRNPK deletion-induced PrPSC level increase can be rescued by TFAP2C overexpression.

      Abstract.

      1. Please rephrase and clarify "We linked HNRNPK and TFAP2C interaction to mTOR signaling..." by distinguishing functional, genetic, and direct (molecule-to-molecule) interactions.
      2. A sentence reads, "...HNRNPK ablation inhibited mTORC1 activity through downregulation of mTOR and Rptor," although the downregulation of Rptor is observed only at the RNA level. The change in Rptor protein expression level is not reported in the manuscript. Please consider adding an experiment to address this or rephrase the sentence.

      Figure 2.

      1. H and I. Co-IP experiments were done with anti-TFAP2C antibody conjugated to the beads. Although the TFAP2C bands show robust signals on the blots, indicating successful enrichment of the protein, hnRNP K bands are very faint. Has the experiment been done by conjugating the hnRNP K antibody to the beads instead? Was the input lysate enriched in the nuclear fraction? Did the lysis buffer include nuclease (if so, please indicate in the figure legend and the methods section)? Addressing these would make the argument, "We also observed specific co-immunoprecipitation of hnRNP K and Tfap2c in LN-229 C3 and U251-Cas9 cells (Fig. 2H-I, Supp. Fig. 3L), suggesting that the two proteins form a complex inside the nucleus" stronger, providing information on potential direct binding.

      Figure 3.

      1. C and D. Please add a sentence in the figure legend explaining which means the multiple comparisons were made between (DMSO vs each drug concentration?). Graphing individual data points instead of bars would also be helpful and more informative. Please discuss the lack of dose dependency.

      Supplemental Figure 4.

      1. A. Although the trend can be observed, the deletion of hnRNP K does not significantly reduce the GPX4 protein level in LN-229 C3. Therefore, the following statement requires more data points and additional statistical analysis to be accurate: "In LN-229 C3 and U251-Cas9 cells, the deletion of HNRNPK reduced the protein level of GPX4, whereas TFAP2C deletion increased it (Supp. Fig. 4A-B)."
      2. A and B. The results are confusing, considering the previous report cited (ref 49) shows an increase in GPX4 with TFAP2C. It may be possible that the deletion of TFAP2C upregulates the expression of proteins with similar functions (e.g., Sp1). If this is the case, the changes in GPX4 expression observed here are a consequence of TFAP2C deletion and may not "suggest a role for HNRNPK and TFAP2C in balancing the protein levels of GPX4."

      Supplemental Figure 5.

      1. B. To obtain statistical significance and strengthen the conclusion, more repeated Western blot experiments can be done to quantify the pAMPK/AMPK ratio.

      Figure 5.

      1. B. I believe statistical analysis with two replicates or fewer is not recommended. Although the assay is robust, and the blot is convincing, please consider adding more replicates if the blot is to be quantified and statistically analyzed.
      2. "Interestingly, RNA and protein levels of mTOR were downregulated in LN-229 C3ΔHNRNPK cells but were partially rebalanced by the ΔTFAP2C;ΔHNRNPK double deletion (Fig. 4C, Supp. Fig. E)." The statement is based on a slight difference at the protein level between the single deletion and the double deletion, as well as the observation from the bulk RNA-seq data. mTOR (and Rptor) mRNA levels can be assessed by RT-qPCR to validate and further support the existing data. It is also curious why deletion of TFAP2C alone also induced a decrease in mTOR, while the double deletion rescued the mTOR level slightly compared to deletion of HNRNPK alone.
      3. C. The main text refers to the changes in the level of phosphorylated 4EBP1, stating, "Deletion of HNRNPK diminished the highly phosphorylated forms of 4EBP1, which instead were preserved in both LN-229 C3ΔTFAP2C and LN-229 C3ΔTFAP2C;ΔHNRNPK cells (Fig. 5C)." However, the quantification was done on the total 4EBP1, which may be because separating p4EBP1 and 4EBP1 bands on a blot is challenging. Please consider using a phospho-4EBP1-specific antibody or rephrase the sentence mentioned above. The current data suggest that the single and double deletion of hnRNP K/TFAP2C affect the overall stability of 4EBP1, which may be a correlation and not due to the mTOR activity as claimed in "We conclude that HNRNPK and TFAP2C play an essential role in co-regulating cell metabolism homeostasis by influencing mTOR and AMPK activity and expression." How does the cap-dependent translation (or total protein level) change in TFAP2C-deleted and TFAP2C-overexpressing cells?

      Figure 6.

      1. A. Did the sihnRNP K increase the TFAP2C level?
      2. A and C. Are the total PrP levels lower in TFAP2C overexpressing cells compared to mCherry cells when they are infected?
      3. D. Do the TFAP2C protein levels differ between 2-day+72-h and 7-day+96-h?

      Figure 7.

      1. I agree with the latter half of the statement: "These findings suggest that HNRNPK influences prion propagation at least in part through mTORC1 signaling, although additional mechanisms may be involved." The first half requires careful rephrasing since (A) independent of the background siRNA treatment, TFAP2C overexpression by itself can modulate the PrPSC level, as seen in Fig 6A and B, (B) although the increase in TFAP2C level is observed with the hnRNP K deletion (Fig 1; LN-229 C3), sihnRNP K treatment may or may not influence the TFAP2C level (Fig 6; quantified data not provided), and (C) in the sihnRNP K-treated cells, the 4EBP1 level is increased compared to the siNT-treated cells, which was not observed in hnRNP K-deleted cells. Discussions and additional experiments (e.g., mTOR knockdown) addressing these points would be helpful.

      Minor comments:

      1. Please clarify "independent cultures." Does this mean technical replicates on the same cell culture plate but different wells or replicated experiments on different days?
      2. Fig 2G. Please explain how the sigmoidal curves were fitted to the data points under the materials and methods section.
      3. Fig 3E and F. Please refer to the comment on Fig 2G above.
      4. Fig S3 F/H. Quantification of gel bands would be helpful when comparing protein expression changes after different treatments, as band intensities look different across treatments.
      5. Supp Fig 5C and F. These panels can be combined with the corresponding panels in main Figure 5 if space allows so that the readers do not have to flip pages between the main text and Supplemental material.

      Significance

      This is an interesting paper potentially providing new mechanistic insight into hnRNPK function and its interaction with TFAP2C. It is also important to understand how hnRNPK deletion induces prion propagation and to develop methods to mitigate its spread. However, inconsistencies in TFAP2C expression across cell lines and conflicting mechanistic interpretations complicate conclusions. I have expertise in RNA-binding proteins, cell biology, and prion disease.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #2

      Evidence, reproducibility and clarity

      The manuscript "Prion propagation is controlled by a hierarchical network involving the nuclear Tfap2c and hnRNP K factors and the cytosolic mTORC1 complex" by Sellitto et al. aims to examine how heterogeneous nuclear ribonucleoprotein K (hnRNPK) limits prion propagation. They perform a synthetic-viability CRISPR-ablation screen to identify epistatic interactors of HNRNPK. They found that deletion of Transcription factor AP-2 (TFAP2C) suppressed the death of hnRNP K-depleted LN-229 and U-251 MG cells, whereas its overexpression hypersensitized them to hnRNP K loss. Moreover, HNRNPK ablation decreased cellular ATP, downregulated genes related to lipid and glucose metabolism and enhanced autophagy. Simultaneous deletion of TFAP2C reversed these effects, restored transcription and alleviated energy deficiency.

      They state that HNRNPK and TFAP2C are linked to mTOR signalling and observe that HNRNPK ablation inhibits mTORC1 activity through downregulation of mTOR and Rptor while TFAP2C overexpression enhances mTORC1 downstream functions. In prion infected cells, TFAP2C activation reduced prion levels and countered the increased prion propagation due to HNRNPK suppression. Pharmacological inhibition of mTOR also elevated prion levels and partially mimicked the effects of HNRNPK silencing. They state their study identifies TFAP2C as a genetic interactor of HNRNPK and implicates their roles in mTOR metabolic regulation and establishes a causative link between these activities and prion propagation.

      This is an interesting manuscript in which a lot of work has been undertaken. The experiments are on the whole well done, carefully documented and support most of the conclusions drawn. However, there are places where it was quite difficult to read as some of the important results are in the supplementary Figures and it was necessary to go back and forth between the Figs in the main body of the paper and the supplementary Figs. There are also Figures in the supplementary which should have been presented in the main body of the paper. These are indicated in our comments below.

      We have the following questions /points:

      1. A plasmid harbouring four guide RNAs driven by four distinct constitutive promoters is used for targeting HNRNPK. Is there a reason for using four guides? Is it simply to obtain maximal editing? In their experience, is this required for all genes or specific to HNRNPK?
      2. Is there a minimal amount of Cas9 required for editing?
      3. It is stated that cell death is delayed in U251-MG cells compared to LN-229-C3 cells- why? Also, why use glioblastoma cells other than that they have high levels of HNRNPK? Would neuroblastoma cells be more appropriate if they are aiming to test for prion propagation?
      4. Human CRISPR Brunello pooled library: does the Brunello library use constructs which have four independent guide RNAs, as used for the silencing of HNRNPK?
      5. To rank the 763 enriched genes, they multiply the -log10FDR with their effect size - is this a standard step that is normally undertaken?
      6. The 32 genes selected- they were ablated individually using constructs with one guide RNA or four guide RNAs?
      7. The identified targets were also tested in U251-MG cells and nine were confirmed but the percent viability was variable - is the variability simply a reflection of the different cell line?
      8. The two strongest hits were IKBKAP and TFAP2C. As TFAP2C is a transcription factor, is it known to modulate expression of any of the genes that were identified to be perturbed in the screen? Moreover, it is stated that it regulates expression of several lncRNAs. Have the authors looked at expression of these lncRNAs? Is the expression affected? Can modulation of expression of these lncRNAs modulate the observed phenotypic effects and also some of the targets they have identified in the screen?
      9. As both HNRNPK and TFAP2C modulate glucose metabolism, the authors have chosen to explore the epistatic interaction. This is most reasonable.
      10. The orthogonal assay to confirm that deletion of TFAP2C suppresses cell death upon removing HNRNPK: was this done using a single guide RNA or multiple guides? Is there a level of suppression required to observe rescue? Interestingly, ablation of HNRNPK increases TFAP2C expression in LN-229-C3, whereas in U251-Cas9 cells HNRNPK ablation has the opposite effect: both RNA and protein levels of TFAP2C are decreased. Is this the cause of the smaller protective effect of TFAP2C deletion in this cell line?
      11. Nuclear localisation studies indicate that the HNRNPK and TFAP2C proteins colocalise in the nucleus however the co-IP data is not convincing- although appropriate controls are present, the level of interaction is very low - the amount of HNRNPK pulled down by TFAP2C is really very low in the LN-229C3 cells and even lower in the U251-Cas9 cells. Have they undertaken the reciprocal co-IP expt?
      12. They state that LN-229 C3 TFAP2C and U251-Cas9TFAP2C were only mildly resistant to the apoptotic action of staurosporin Fig 3E and F - I accept they have undertaken the stats which support their statement that at high concentrations of staurosporin the LN-229 C3 TFAP2C cells are less sensitive but the U251-Cas9TFAP2C decreased sensitivity is hard to believe. Has this been replicated? I agree that HNRNPK deletion causes apoptosis in both LN-229 C3 and U251-Cas9 cells and this is blocked by Z-VAD-FMK - however the block is not complete- the max viability for HNRNPK deletion in LN-229 C3 cells is about 40% whereas for U251-Cas9 cells it is about 30% - does this suggest that cells are being lost by another pathway. Have they tested concentrations higher than 10nM?
      13. The RNA-seq comparisons- the authors use log2 FC <0.5 upregulated or genes downregulated by a similar amount- this is a very low cut off and would include essentially minimal changes in expression - not convinced of the significance of such low-level changes.
      14. It is stated: "Accordingly, we observed increased AMPK phosphorylation (pAMPK) upon ablation of HNRNPK, which was consistently reduced in LN-229 C3ΔTFAP2C cells (Supp. Fig. 5B). LN-229 C3ΔTFAP2C; ΔHNRNPK cells also showed a partial reduction of pAMPK relative to LN-229 C3ΔHNRNPK cells (Supp. Fig. 5B). These results suggest that hnRNP K depletion causes an energy shortfall, leading to cell death." I am not totally convinced by the data presented in this Fig. The authors have quantified the band intensity and present the ratio of pAMPK to AMPK. Please note that the actin levels are variable across the samples. Did they normalise the data using the actin level before undertaking the comparisons? Also, if the authors think this is an important point which supports their conclusion, then it should be in the main body of the paper rather than the supplementary. If AMPK is being phosphorylated, this should lead to activation of the metabolic checkpoint, which involves p53 activation by phosphorylation. Activated p53 would turn on p21CIP1, which is a very sensitive indicator of p53 activation.
      15. We also do not understand why the mTOR Suppl. Fig. 5E is not in the main body of the paper. It is clear that RNA and protein levels of mTOR were downregulated in LN-229 C3ΔHNRNPK cells but were partially rebalanced by the ΔTFAP2C. However, the ΔTFAP2C;ΔHNRNPK double deletion levels are only slightly higher than those of ΔHNRNPK; they are not at the level of NT or even ΔTFAP2C (Fig. 4C, Supp. Fig. 5E).
      16. The authors state: "Deletion of HNRNPK diminished the highly phosphorylated forms of 4EBP1, which instead were preserved in both LN-229 C3ΔTFAP2C and LN-229 C3ΔTFAP2C;ΔHNRNPK cells (Fig. 5C). Similarly, the S6 phosphorylation ratio was reduced in LN-229 C3ΔHNRNPK cells and was restored in the ΔTFAP2C;ΔHNRNPK double-ablated cells (Fig. 5C)."

      We are not convinced that p4EBP1 is preserved in the LN-229 C3ΔTFAP2C cells: there is a very faint band which is at a lower level than the band in the LN-229 C3ΔHNRNPK cells. However, when both HNRNPK and TFAP2C were ablated, the p4EBP1 band is clear cut. I agree with the quantitation that deletion of HNRNPK and TFAP2C both reduce the level of 4EBP1; the reduction is greater with TFAP2C, but when both are deleted together the levels of 4EBP1 are higher and p4EBP1 is clearly present. In quantifying the S6 and pS6 levels, did the authors consider the actin levels? They present a ratio of pS6 to S6. I may be lacking some understanding, but why is the ratio of pS6/S6 being calculated? Is the level of pS6 not what is important? Phosphorylation of S6 should lead to its activation, and thus it is the actual level of pS6 that is important, not the ratio to the non-phosphorylated protein.
      17. When determining ATP levels, do they control for cell number? HNRNPK depletion results in lower ATP levels, and co-deletion of TFAP2C rescues this. But could this be because there is less cell death, so that more cells contribute ATP? Have they controlled for relative numbers of cells?
      18. The construction of the HovL cell line that propagates ovine prions: very few details are provided on the susceptibility of the cell line to PG127 prions.
      19. It is stated that HNRNPK depletion from HovL cells increases PrPSc as determined by 6D11 fluorescence, but in the manuscript HNRNPK depletion results in cell death. How are these reconciled?
      20. They show that mTOR inhibition mimics the effect of HNRNPK deletion. Why didn't they overexpress mTOR and see if that rescues this? This would indicate a causal relationship.
      21. Flow cytometric data (supplementary Fig. of Fig. 6D): when they are looking at fixed cells, the gating strategy for cells results in the inclusion of a lot of debris. The gate needs to be moved and made more specific to ensure that results are interpreted properly. The same applies to the singlet gating: it is not tight enough and includes doublets as well, which will skew the data. The data need to be regated.
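For readers unfamiliar with the ranking step queried in point 5 above, the following minimal sketch shows what multiplying -log10(FDR) by the effect size does to a list of hits. The gene labels, FDR values and effect sizes are invented for illustration and are not from the manuscript; the helper name `rank_score` is likewise hypothetical.

```python
import math

def rank_score(fdr, effect_size):
    """Combine significance and magnitude into a single ranking score.

    The score is -log10(FDR) multiplied by the effect size, so a hit ranks
    highly only if it is both statistically robust and large in effect.
    """
    return -math.log10(fdr) * effect_size

# Hypothetical hits: (label, FDR, effect size such as a log2 fold change).
hits = [
    ("GENE_A", 0.001, 1.5),   # highly significant, moderate effect
    ("GENE_B", 0.04, 3.0),    # borderline significance, large effect
    ("GENE_C", 0.0001, 0.5),  # very significant, small effect
]

# Sort from the highest to the lowest combined score.
ranked = sorted(hits, key=lambda hit: rank_score(hit[1], hit[2]), reverse=True)
for label, fdr, effect in ranked:
    print(f"{label}: score = {rank_score(fdr, effect):.2f}")
```

Such a product score downweights hits that are significant but tiny, as well as hits that are large but poorly supported. Whether it is an appropriate way to rank the 763 enriched genes is the question the referee raises.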

      Significance

      The manuscript "Prion propagation is controlled by a hierarchical network involving the nuclear Tfap2c and hnRNP K factors and the cytosolic mTORC1 complex" by Sellitto et al. aims to examine how heterogeneous nuclear ribonucleoprotein K (hnRNPK) limits prion propagation. They perform a synthetic-viability CRISPR-ablation screen to identify epistatic interactors of HNRNPK. They found that deletion of Transcription factor AP-2 (TFAP2C) suppressed the death of hnRNP K-depleted LN-229 and U-251 MG cells, whereas its overexpression hypersensitized them to hnRNP K loss. Moreover, HNRNPK ablation decreased cellular ATP, downregulated genes related to lipid and glucose metabolism and enhanced autophagy. Simultaneous deletion of TFAP2C reversed these effects, restored transcription and alleviated energy deficiency.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #1

      Evidence, reproducibility and clarity

      The paper by Sellitto et al. describes studies to determine the mechanism by which hnRNPK modulates the propagation of prions. The authors use cell models lacking HNRNPK, which is lethal, in a CRISPR screen to identify genes that suppress lethality. Based on this screen in two different cell lines, a gene termed Tfap2C emerged as a candidate for interaction with HNRNPK. They show that Tfap2C counteracts the actions of HNRNPK with respect to prion propagation. Cells lacking HNRNPK show increased PrPSc levels. Overexpression of Tfap2C suppresses PrPSc levels. These effects on PrPSc are independent of PrPC levels. By RNAseq analysis, the authors home in on metabolic pathways regulated by HNRNPK and Tfap2C, then follow the data to autophagy regulation by mTor. Ultimately, the authors show that short-term treatments of these cell models with mTor inhibitors cause increased accumulation of PrPSc. The authors conclude that the loss of HNRNPK leads to reduced energy metabolism, causing mTor inhibition, which reduces translation by dephosphorylation of S6.

      Major Comments

      Fig H and I, Fig 3L. The interaction between Tfap2C and HNRNPK is pretty weak. The interaction may not be consequential. The experiment seems to be well controlled, yielding limited interaction. The co-IP was done in PBS with no detergent. The authors indicate that the cells were mechanically disrupted. Since both of these are DNA-binding proteins, is it possible that the observed interaction is due to proximity on DNA that is linking the two proteins? Including a DNase treatment would clarify this.

      Supplemental Fig 5B - The western blot images for pAMPK don't really look like a 2-fold increase in phosphorylation in HNRNPK deletion.

      Fig. 5A - I don't think it is proper to do statistics on an n of 2.

      Fig 6D - The data look a bit more complicated than described in the text. At 7 days, compared to 2 days, it looks like there is a decrease in % cells positive for 6D11. Is there clearance of PrPSc or proliferation of uninfected cells?

      The authors might consider a different order of presenting the data. Fig 6 could follow Fig. 2 before the mechanistic studies in Figs 3-5.

      The authors use SEM throughout the paper, and while this is often used, there has been some interest in using StdDev to show the full scope of variability.

      Discussion - The discrepancy between short-term and long-term treatments with mTor inhibitors is only briefly mentioned with a bit of a hand-waving explanation. The authors may need a better explanation.

      Minor Comments

      Page 12 - no mention of chloroquine in the text or related data.

      Page 12 - Supp. Fig. E - should be 5E

      Significance

      The study provides mechanistic insight into how HNRNPK modulates prion propagation. The paper is limited to cell models, and the authors note that long term treatment with mTor inhibitors reduced PrPSc levels in an in vivo model.

      The primary audience will be other prion researchers. There may be some broader interest in the mTor pathway and the role of HNRNPK in other neurodegenerative diseases.

    1. Groundwater extraction exceeds natural recharge by two- to threefold in many plains, causing land subsidence of 25–35 centimeters per year and collapsing centuries-old qanats.

      Groundwater extraction is two to three times larger than replenishment, causing subsidence of 25-35 cm per year. Old qanat systems collapse, compounding the problem.

    1. Finally, in one word, their Ambition and Avarice, than which the heart of Man never entertained greater, and the vast Wealth of those Regions; the Humility and Patience of the Inhabitants (which made their approach to these Lands more easy) did much promote the business: Whom they so despicably contemned, that they treated them (I speak of things which I was an Eye Witness of, without the least fallacy) not as Beasts, which I cordially wished they would, but as the most abject dung and filth of the Earth; and so solicitous they were of their Life and Soul, that the above-mentioned number of People died without understanding the true Faith or Sacraments.

      The passage says that the Spaniards were driven by greed and desire for wealth. The Indigenous people were humble and patient, which made it easier for the Spaniards to take their land. The Spaniards treated the people terribly, worse than animals, and many died without learning about Christianity. It shows how cruel and disrespectful the Spaniards were.

    2. The first whereof was raising an unjust, bloody, cruel War. The other, by putting them to death, who hitherto, thirsted after their Liberty, or designed (which the most Potent, Strenuous and Magnanimous Spirits intended) to recover their pristine Freedom, and shake off the Shackles of so injurious a Captivity: For they being taken off in War, none but Women and Children were permitted to enjoy the benefit of that Country-Air…

      The passage says that Spanish Christians attacked the people on the islands. They fought a war and killed anyone who resisted. Most men died, and only women and children survived. The Spanish were cruel and wanted total control over the people.

    3. above Twelve Millions (computing Men, Women, and Children) have undeservedly perished; nor do I conceive that I should deviate from the Truth by saying that above Fifty Millions in all paid their last Debt to Nature.

      The numbers Las Casas uses are horrifying. He wants the king to realize that kingdoms that are greater than all Spain are being turned into ruins.

    4. Now the ultimate end and scope that incited the Spaniards to endeavor the Extirpation and Desolation of this People, was Gold only…

      This is an attack on the idea that the Spaniards were there for religious reasons. Las Casas calls out their greed, arguing that the search for wealth overrode the "Christian" mission they claimed to have.

    5. Now this infinite multitude of Men are by the Creation of God innocently simple, altogether void of and averse to all manner of Craft, Subtlety and Malice, and most Obedient and Loyal Subjects to their Native Sovereigns

      Las Casas describes the natives as simple and submissive to frame them as perfect candidates for Christianity.

    6. Religion was central to Maya society, and stories of gods and goddesses led to the building of temples and development of a calendar that recorded religious dates but also the best times for planting and harvest.

      I find this very interesting because they combined science and religion together. This kind of suggests that their temples were not just places of worship, but also a way for their society to survive, by tracking different seasons for planting and harvesting.

    7. Those that arrived at these Islands from the remotest parts of Spain, and who pride themselves in the Name of Christians, steered Two courses principally, in order to the Extirpation, and Exterminating of this People from the face of the Earth.

      Las Casas is saying that the Spaniards are committing massacres and exterminating people in the name of Christianity.

    8. As to the firm land, we are certainly satisfied, and assured, that the Spaniards by their barbarous and execrable Actions have absolutely depopulated Ten Kingdoms, of greater extent than all Spain

      This claim that the Spaniards destroyed ten kingdoms, which combined were greater than Spain, must have been Las Casas's way of explaining the scale of the destruction committed by the Spaniards to the rest of Europe.

    9. The Spaniards first assaulted the innocent Sheep, so qualified by the Almighty, like most cruel tigers, wolves, and lions, hunger-starved, studying nothing, for the space of Forty Years, after their first landing

      He refers to the Natives as "innocent Sheep, so qualified by the Almighty". I believe this was an appeal to the majority Christian population of Europe at the time.

    10. Mayan religious beliefs included scraping down and redecorating their temples every sixty years.

      I didn't know this. It's interesting to consider that even though these temples were only meant to last sixty years, the ones that didn't get torn down are still standing. I think it shows they had a lot of proficiency in temple construction.

    1. While China held the secret, silk was one of the most sought-after products of the ancient world; worth its weight in gold in imperial Rome. Silk production and trade was a key to China's wealth, cultural prestige, and diplomatic power, accounting for possibly a quarter of China's income at the peak of Silk Road exports.

      It makes sense that silk was worth so much; I believe it is labor-intensive to produce and luxurious.

    1. Many people say they work better with distractions—they prefer to leave the television or the radio on—but the truth is that an environment with too many interruptions is rarely helpful when focus is required. Before deciding that the television or talkative roommates do not bother you when you work, take an honest accounting of the work you produce with interruptions compared to work you do without.

      Something to really consider. People even put their headsets on with music playing while they are reading. I wonder how they absorb or even understand what they are reading, let alone retain and remember it. For someone like me, the lyrics would just keep ringing in my head.

    2. I have extrapolated three important components to this skill. First, knowing your values is imperative. Values will serve as a guide, which will help you to determine which actions bring you closer to your goals and those that don't. Second, know your constraints. Constraints (in form of time or other responsibilities) can help you set the parameter within which you can function efficiently. The last component is action. This component was the hardest for me to master, but it was the most fruitful. Because knowing values and limitations without engaging in appropriate actions does not serve any meaningful purpose.

      Thanks for sharing. I just learned something new.

    3. Imagine a scenario where one of your class projects is to create a poster. It is your intent to use some kind of imaging software to produce professional-looking graphics and charts for the poster, but you have never used the software in that way before. It seems easy enough, but once you begin, you find the charts keep printing out in the wrong resolution. You search online for a solution, but the only thing you can find requires you to recreate them all over again in a different setting. Unfortunately, that part of the project will now take twice as long.

      I am a victim of this.

    4. Poor planning or a bad assumption in this area can be disastrous, especially if some part of the task has a steep learning curve. No matter how well you planned the other parts of the project, if there is some skill needed that you do not have and you have no idea how long it will take to learn, it can be a bad situation.

      Poor planning is important to avoid; it can lead to a total mess and a waste of time.

    1. incomplete contracts.

      A complete contract is an important concept from contract theory. If the parties to an agreement could specify their respective rights and duties for every possible future state of the world, their contract would be complete. There would be no gaps in the terms of the contract.

      However, because it would be prohibitively expensive to write a complete contract, contracts in the real world are usually incomplete. When a dispute arises and the case falls into a gap in the contract, either the parties must engage in bargaining or the courts must step in and fill in the gap. The idea of a complete contract is closely related to the notion of default rules, i.e. legal rules that will fill the gap in a contract in the absence of an agreed-upon provision.

      In contract law, an incomplete contract is one that is defective or uncertain in a material respect. In economic theory, an incomplete contract (as opposed to a complete contract) is one that does not provide for the rights, obligations and remedies of the parties in every possible state of the world.[1]

      Since the human mind is a scarce resource that cannot collect, process, and understand an infinite amount of information, economic actors are limited in their rationality (that is, in their capacity to understand and solve complex problems) and cannot anticipate all possible contingencies.[2][3] Alternatively, because it is too expensive to write a complete contract, the parties may opt for a "sufficiently complete" contract.[4] In short, every contract is in practice incomplete, for a variety of reasons. The incompleteness of a contract also means that the protection it provides may be inadequate.[5] Incompleteness does not by itself make a contract invalid or unenforceable: its terms and provisions still bind the parties. As for contractual incompleteness, the law is concerned with when and how a court should fill gaps in a contract, when the gaps are too many or too uncertain for the contract to be enforceable, and when the parties are obliged to negotiate to complete an incomplete contract or to achieve the desired final contract.[1]

    Annotators