10,000 Matching Annotations
  1. Last 7 days
    1. There is something so ugly about crushing an acoustic guitar. Making it buckle, making the middle of it explode in splinters. That might be personal to me, as someone who grew up with a dad who was what you might call a campfire guitarist — not a performer, just a dad who used to entertain us with songs like "Dark as a Dungeon," a little folk tune about the lethal dangers of coal mining. Maybe to you, it's not the guitar. Maybe it's the cameras or the vinyl records.

      She talks about how the ad takes things that people love and are connected to and destroys them. While the ad is meant to show off all the features of the iPad, many people saw it as saying that you only need the iPad and that the other things don't matter.

    1. So in this chapter, we will not consider internet-based social media as inherently toxic or beneficial for mental health. We will be looking for more nuance: where things go well, where they do not, and why.

      This section makes me think social media isn’t just “showing posts”—it’s actively shaping what we believe is important, and even what feels true. The scary part is that the rules can be based on tons of signals we never consented to (like location or contacts), and because the algorithm is hidden, it’s hard to hold anyone accountable when the recommendations go wrong.

    2. “If [social media] was just bad, I’d just tell all the kids to throw their phone in the ocean, and it’d be really easy. The problem is it - we are hyper-connected, and we’re lonely. We’re overstimulated, and we’re numb. We’re expressing our self, and we’re objectifying ourselves. So I think it just sort of widens and deepens the experiences of what kids are going through. But in regards to social anxiety, social anxiety - there’s a part of social anxiety I think that feels like you’re a little bit disassociated from yourself. And it’s sort of like you’re in a situation, but you’re also floating above yourself, watching yourself in that situation, judging it. And social media literally is that. You know, it forces kids to not just live their experience but be nostalgic for their experience while they’re living it, watch people watch them, watch people watch them watch them. My sort of impulse is like when the 13 year olds of today grow up to be social scientists, I’ll be very curious to hear what they have to say about it. But until then, it just feels like we just need to gather the data.” Director Bo Burnham On Growing Up With Anxiety — And An Audience - NPR Fresh Air (10:15-11:20)

      This quote shows why it’s so hard to address most issues around social media. It’s almost a necessary evil, as it is an essential part of all of our lives. It is difficult to keep up with real life if you’re not connected online too, so solutions to social media addiction or negative effects must be found on the other end — the developers’ end.

    1. Because of similar reasons, people whose material needs are covered may act to improve the status of those who do not share their prosperity.

      I think this speaks to the divide we often see between empathetic people vs. non-empathetic people. We often see wealthy people who are raised to believe that the world revolves around them, and that breeds apathy. It's important now more than ever to value the wellness of every person, black or white, gay or straight, rich or poor, immigrant or citizen. If people don't understand that everyone getting their needs met leads to a greater outcome for all, the ultra-wealthy will continue to accumulate wealth, and everyone below them will fall further and further into oppression. You don't need firsthand experience to be empathetic to those who are disadvantaged; you just need enough common sense to know it's them today but could be you tomorrow.

    1. It’s really eye-opening to see how social media algorithms are designed to keep us scrolling, even when it’s bad for our mental health. I’ve definitely felt the pressure to keep up with others’ posts, and it’s helpful to understand that this isn’t just my own issue—it’s a feature of the platforms we use.

    1. Like other visual aids, images or photographs should be clear, bright, and of adequate size. If possible, crop large images to include only the relevant parts. Cropping reduces extraneous information, and you may be able to enlarge the cropped image for easier viewing.

      It's also important to remember that just because a person can see an image doesn't mean that they understand what's going on in it, so keep that in mind when it comes to visuals. Give students time to really understand what's in the photo.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      General Response to Review

      We would like to thank all three reviewers for their encouraging comments on our manuscript. We now submit our revised study after considerable efforts to address each of the reviewer concerns. I will first provide a response related to a major change we have made in the revision that addressed a concern common to all three reviewers, followed by a point-by-point response to individual comments.

      Replacing LRRK2ARM data with a LRRK2-specific type II kinase inhibitor: The most critical issue for all 3 reviewers was the use of our new CRISPR-generated truncation mutant of LRRK2 that we called LRRK2ARM. We had not provided direct evidence of the protein product of this truncation, which was a significant limitation. To address this we performed proteomics analysis of all clones, and to our surprise, we identified 7 peptides that were C-terminal to the "predicted" stop codon we had engineered into the CRISPR design. A repeat of the deep sequencing analysis in both directions then more clearly revealed site-specific mutations leading to 4 amino acid changes at the junction of exon 19, without introducing a stop codon. Given that we could not detect the protein by western blot (even though proteomics now indicated the region of LRRK2 recognized by our antibodies was present), we decided to remove this clone from the manuscript. In the meantime we had compared the ineffectiveness of MLi-2 to block Rab8 phosphorylation during iron overload in the LRRK2G2019S cells with a type II kinase inhibitor called rebastinib. The data showed very clearly that treatment with rebastinib reversed the iron-induced phospho-Rab8 at the plasma membrane (and by western blot, in new Fig 3). Since this inhibitor is very broad spectrum, inhibiting ~30% of the kinome, we reached out to Sam Reck-Peterson and Andres Leschziner, experts in LRRK2 structure/function, who recently developed much more selective LRRK2-specific type II kinase inhibitors they called RN341 and RN277 (developed with Stefan Knapp, PMID: 40465731). These compounds effectively couple the MLi-2 scaffold through an indole ring to a rebastinib-type type II compound, providing LRRK2 binding specificity to the efficient DYG "out" type II inhibitor. As with rebastinib, the new LRRK2-specific kinase inhibitors also effectively reversed the cell surface p-Rab8 seen in LRRK2G2019S, iron-loaded cells.
These new data provide the first biological paradigm in which the kinase activity of LRRK2 is resistant to the type I inhibitor MLi-2, yet remains highly sensitive to type II inhibitors. While the loss of our LRRK2ARM clone marks a significant change in the manuscript, we believe the main message is stronger with the addition of the new LRRK2-specific type II kinase inhibitor. Our data show that it is indeed the active kinase function of LRRK2G2019S that drives the iron phenotypes we observe, but they also highlight a conformational specificity upon iron overload such that MLi-2 is ineffective. The overall phenotypes we observe in LRRK2G2019S macrophages remain unchanged and are now expanded within the manuscript. We hope reviewers will agree that our work provides important new insights into LRRK2 function in iron homeostasis while opening new avenues of research for future studies.

      Given this new information we have changed the title from "LRRK2G2019S acts as a dominant interfering mutant in the context of iron overload" to the more accurate "LRRK2G2019S interferes with NCOA4 trafficking in response to iron overload leading to oxidative stress and ferroptotic cell death."

      Response to Reviewer 1

      Reviewer 1 (R1): There are two major concerns with the data in their present form. In brief, first, the G2019S cells express much less LRRK2 and more Rab8 than the WT cells and this severely affects interpretability.

      Heidi McBride (HM): We agree that the LRRK2G2019S lines express lower levels of LRRK2 than wild type, which is a previously documented phenomenon, presumably as the cell attempts to downregulate the increased kinase activity by reducing protein expression. However, the levels of Rab8 across tens of experiments do not consistently show any differences between the wild type, G2019S and KO lines. We have provided more comprehensive quantifications of the blots in the revised version, and the Rab8 levels are consistent across all the blots presented in the manuscript (Figure 1A and 1B).

      R1: Second, the investigators used CRISPR to truncate the endogenous LRRK2 locus to produce a hypothetical truncated LRRK2-ARM polypeptide. This appears to have robust effects on NCOA4, in particular, which drives the overall interpretation of the data. However, the expression of this novel LRRK2 species is not confirmed nor compared to WT or G2019S in these cells (although admittedly the investigators did seek to address this with subsequent KO in the ARM cells). It would be premature to account for the changes reported without evidence of protein expression. This latter issue may be more easily addressed and could provide very strong support for a novel function/finding, see more detailed comments below, most seeking clarifications beyond the above.

      HM: As described in my common response above, we have removed the LRRK2ARM data from the manuscript.

      R1: Need to make clear in the results whether the G2019S CRISPR mutant is heterozygous or homozygous (presumably homozygous, same for ARM)

      HM: The RAW cell line we generated is homozygous for the G2019S and the KO alleles. We added this to the beginning of the results section and methods.

      R1: The text of the results implies that MLi2 was used in both WT and G2019S Raw cells, but it's only shown for G2019S. Given the premise for the use of RAW cells, it's important to show that there is basal LRRK2 kinase activity in WT cells to go along with its high protein expression. This is particularly important as the G2019S blot suggests minor LRRK2-independent phosphorylation of Rab8a (and other detected pRabs). One would imagine that pRab8 levels in both WT and G2019S would reduce to the same base line or ratio of total Rab in the presence of MLi2, but WT untreated is similar to G2019S with MLi2. This suggests no basal LRRK2 activity in the Raw cells, but I don't think that is the case.

      HM: We have included the data from MLi-2 treatment of wild type cells in Fig 3C, quantified in D. Again, the baseline levels of Rab8 are unchanged across the genotypes. However, the reviewer is correct that there is some baseline LRRK2 kinase activity that is sensitive to MLi-2 in wild type cells. This is seen most clearly in the autophosphorylation of LRRK2 at S1292 in Fig 3C. The pRab8 blots are not as clear in wild type cells. It is likely that LRRK2 must be actively recruited to membranes (as seen by others with LLOME, etc.) to easily visualize p-Rabs in wild type cells. Nevertheless, we do clearly see the activity of autophosphorylation in wild type cells. Therefore, while we understand the reviewer's point that there should be some Rab8 phosphorylation in wild type cells, we don't see a significant, or very convincing, amount of it in our RAW macrophages.

      R1: Also, in terms of these cells, the levels of LRRK2 are surprisingly unmatched (Fig 1A, 1D, 1H, S1D, etc.) as are total levels of Rab8 (but in opposite directions) between the WT and G2019S. This is not mentioned in the Results text and is clearly reproducible and significant. Why do the investigators think this is? If Rab8 plays a role in iron, how do these differences affect the interpretation of the G2019S cells (especially given that MLi2 does not rescue)? Are other LRRK2-related Rabs affected at the protein (not phosphorylation) level? Could reduced levels of LRRK2 or increased Rab8, alone or together, account for some of these differences? Substantial further characterization is required as this seriously affects the interpretability of the data. Since pRab8 is not normalized to total Rab8, this G2019S model may not reflect a total increase in LRRK2 kinase activity, and could in fact have both less LRRK2 protein and less cellular kinase activity than WT (in this case).

      HM: In our hands, the RAW cells with homozygous LRRK2G2019S mutations show clearly that the total protein levels of LRRK2 are reduced compared to wild type, which is likely a compensatory effect to reduce cellular kinase activity overall. We understand that some of our previous blots were not so clear on the total Rab8 levels across the different experiments. We have repeated many of these experiments and hope the reviewer can see in Figs 1A, 3C, 3E, 3J, and Sup3A that the total Rab8 levels are stable across the conditions. We also present quantifications from 3 independent experiments normalizing the pRab8/Rab8 levels in all three genotypes in untreated and iron-loaded conditions (Supp Fig 3A and B), and upon MLi2 treatment (Fig 3C). In 3C and D the data show the effectiveness of MLi-2 to reduce pRab8 in control conditions, but the resistance to MLi-2 in FAS-treated cells.

      R1: Presumably, the blots in 1H are whole cell lysates and account for the pooled soluble and insoluble NCOA4 (increased in G2019S), as there is no difference in soluble NCOA4 (Fig 2H). I suspect the prior difference is nicely reflected in the insoluble fraction (Fig 2H). This should be better explained in the Results text. This is a very interesting finding and I wonder what the investigators believe is driving this phenotype? Is the NCOA4 partitioning into a detergent-inaccessible compartment? Does this replicate with other detergents, those perhaps better at solubilizing lipid rafts? Is this a phenotype reversible with MLi2? Very interesting data.

      HM: We apologize for not being clearer in the text describing the behavior of NCOA4. The reviewer is correct that the major change in G2019S is the increased Triton X-100 insoluble NCOA4. Previous work has established that NCOA4 segregates into detergent-insoluble foci upon iron overload as a way to release it from ferritin cages, and this fraction is then internalized into lysosomes through a microautophagy pathway (see Mizushima's work PMID: 36066504). In Fig 1I we show that the elevation in NCOA4 and ferritin heavy chain seen in untreated G2019S cells can be cleared upon iron chelation with DFO, indicating that the canonical NCOA4-mediated ferritinophagy (macroautophagy) pathway remains intact to recycle the iron in conditions of iron starvation. However, in Figure 2 we show that under conditions of iron overload, when NCOA4 segregates from ferritin (to allow cytosolic storage of iron), this form of NCOA4 cannot be degraded within the lysosome through the microautophagy pathway, and begins to accumulate. We see this with our live and fixed imaging compared to wild type cells (Fig 2A,D), and by the lack of clearance seen by western blot (Fig 2E). As for the impact of MLi-2, we observe some reversal of NCOA4 accumulation in untreated cells at 4 and 8 hrs after MLi-2 treatment (Supp Fig 2F). However, in iron-loaded conditions the high NCOA4 levels in G2019S cells are MLi2 insensitive, while the elevated NCOA4 in wild type cells is reduced upon MLi2 addition (Fig. 2F, compare lanes 3 vs 4 in WT with lanes 7 vs 8 in G2019S). This is consistent with a block in the microautophagy pathway of phase-separated NCOA4 degradation in G2019S cells.

      R1: Figure 2 describes the increased NCOA4-positive iron structures after iron load, but does not emphasize that the G2019S cells begin preloaded with more NCOA4. How do the investigators account for differential NCOA4 in this interpretation? Is this simply a reflection of more NCOA4 available in G2019S cells? This seems reasonable.

      HM: The reviewer is correct: we showed that there is some turnover of NCOA4 in untreated conditions through canonical ferritinophagy, but in iron overload this appears to be blocked; the NCOA4 segregates from ferritin and remains within insoluble, phase-separated structures that cannot be degraded through microautophagy. We have rewritten the text to be clearer on these points.

      R1: These are very long exposures to iron, some as high as 48 hr which will then take into account novel transcriptomic and protein changes. Did the investigators evaluate cell death? Iron uptake would be trackable much quicker.

      HM: We agree that many things will change after our FAS treatments and now provide a full proteomics dataset on wild type and G2019S cells with and without iron overload, which is presented in Figure 4A-B. Indeed Figure 4 is entirely new to this revised submission. The proteomics highlighted a series of cellular changes that reflect major cell stress responses, including the upregulation of HMOX1 (western blots to validate in Supp Fig 4A), an NRF2 transcriptional target consistent with our observation that NRF2 is stabilized and translocated to the nucleus in G2019S iron-loaded cells (Sup Fig 4B,C). There are several interesting changes, and we highlighted the three major nodes: changes in iron response proteins; lysosomal proteins, particularly a loss of catalytic enzymes like lysozymes and granzymes, consistent with the loss of hydrolytic capacity we show in Fig. 4C,D; and cytoskeletal proteins, which we suspect is consistent with the "blebbing" of the plasma membrane we see decorated with pRab8 in Fig 3. To test the activation of lipid oxidation likely resulting from the elevation in Fe2+ and oxidation signatures, we employed the C11-BODIPY probe and observe strong signal specific to the iron-loaded G2019S cells, particularly labelling endocytic compartments and the cell surface (Fig. 4E-G).

      Lastly, we performed SYTOX Green uptake experiments to monitor entry of the dye into cells that have died by plasma membrane rupture, a readout commonly used to examine ferroptotic cell death. We now show the G2019S cells are very susceptible to this form of death (Fig 4H,I). These data add new functional evidence that the G2019S mutation confers increased susceptibility to iron stress.

      R1: The legend for 2F is awkward (BSADQRED)

      HM: We have changed this to BSA-DQRed, which is a widely used probe to monitor the hydrolytic capacity of the lysosome.

      R1: Why are WT cells not included in Fig 2G?

      HM: We have now included new panels in Fig 3C,D showing wild type and G2019S +/- FAS and +/- MLi-2, with quantifications of pRab8/Rab8.

      R1: The biochemical characterization of NCOA4 in the LRRK2-arm cells is a great experiment and strength of the paper. The field would benefit by a bit further interrogation, other detergents, etc.

      HM: We have removed all of the LRRK2ARM data given our confusion over the impact of the 4 amino acid changes in exon 19 and our inability to monitor this protein by western blot. The concept that NCOA4 enters into TX100 insoluble, phase separated compartments has been well established, so we didn't explore other detergents at this point.

      R1: Have the investigators looked for aberrant Rab trafficking to lysosomes in the LRRK2-arm cells? Is pRab8 mislocalized compared to WT? Other pRabs?

      HM: We did initially show that pRab8 was also at the plasma membrane in the LRRK2ARM cells, and we still focus on this finding for the G2019S, seen in Fig 3A,B,F,H. We did try to look at other p-Rabs known to be targets of LRRK2 but none of them worked in immunofluorescence so we couldn't easily monitor specific traffic and/or localization changes for them.

      R1: The expression levels and therefore stability of the ARM fragment are not shown. This is necessary for interpretation. While very intriguing, the data in Aim 3 rely on the assumption that the ARM fragment is expressed, and at comparable levels to G2019S, to account for phenotypes. The generation of a second clone is admirable, but the expression of the protein must be characterized. This is especially true because of the different LRRK2 levels between WT and G2019S. One could easily conceive of exogenous expression of a tagged ARM fragment into LRRK2 KO cells, for example, as another proof-of-concept experiment. If it is truly dominant, does this effect require or benefit from some FL LRRK2? It seems easy enough to express the LRRK2-ARM in at least WT and KO RAW cells.

      HM: We agree and our attempts to understand this clone resulted in its removal from the manuscript. We did also express cDNA encoding our ARM domain (up to exon 19), but it didn't phenocopy the CRISPR clone, which of course made sense once we had better proteomics and repeated our deep sequencing.

      In our further efforts to understand why our phenotype was MLi-2 resistant upon iron overload, we expanded to examine the impact of pan-specific type II kinase inhibitors, and then reached out to the Reck-Peterson and Leschziner labs to obtain newly developed LRRK2-selective type II kinase inhibitors. These all very efficiently reversed the pRab8 signals seen at the plasma membrane of G2019S cells upon iron overload (Fig 3E-K). Therefore the G2019S is not dominant negative, as we had initially supposed; rather, there is a specific conformation of LRRK2 in high iron that potentially opens the ATP-binding pocket to bind the type II inhibitors, but not MLi-2. We do not understand exactly what this conformation is, but it likely involves new protein interactions specific to high iron, or perhaps LRRK2 somehow binds iron directly as a sensor, ultimately leading to the differential sensitivity we observe between type I and type II kinase inhibitors. Our data indicate that MLi-2 treatment in the clinic will not be protective against iron toxicity phenotypes that may contribute to PD, whereas these newer selective type II LRRK2 kinase inhibitors would be effective in this conformation-specific context of iron toxicity.

      R1: Does iron overload induce Rab8a phosphorylation in a LRRK2 KO cell? This would be a solid extension on the ARM data and support the important finding that an additional kinase(s) can phosphorylate Rab8a under these conditions, and while not unexpected, this may not have been demonstrated by others as clearly. It also addresses whether the ARM domain is important to this other putative kinase(s), which may add value to the authors' model.

      HM: Iron overload does not induce pRab8 in LRRK2 KO cells, as seen by immunofluorescence in Fig 3A,B, and western blot in Supp Fig 3 A,B. With our new type II kinase inhibitor data we can confirm that the plasma membrane localized Rab8 is indeed phosphorylated by LRRK2.

      R1: Minor concern - the abstract but not the introduction emphasizes a hypothesis that loss of neuromelanin may promote cell loss in PD (through loss of iron chelation), while post mortem studies are by definition only correlative, early works suggested that the higher melanized DA neurons were preferentially lost when compared to poorly melanized neurons in PD. This speculation in the abstract is not necessary to the novel findings of the paper.

      HM: We appreciate that the links to iron in PD are correlative; we have maintained some of our discussion on this point within the manuscript given the lack of attention the field has paid to the cell biology of iron homeostasis in PD models. If there is a cell-autonomous nature to the loss of DA neurons in PD, iron is very likely to be a part of this specificity, in our opinion. Most of the newer MRI studies looking at iron levels in patient brains are showing higher free iron, and groups are pursuing this as a potential biomarker of disease. The precise timing of this relative to the stability/loss of neuromelanin is, I agree, not really clear.

      R1: (Significance (Required)): This study could shed light on a novel and unexpected behavior of the LRRK2 protein, and open new insights into how pathogenic mutations may affect the cell. While studied in one cell line known for unusually high LRRK2 expression levels, data in this cell type have been broadly applicable elsewhere. Given the link to Parkinson's disease, Rab-dependent trafficking, and iron homeostasis, the findings could have import and relevance to a rather broad audience.

      HM: We are so very appreciative that reviewer 1 feels our work will be of interest to the PD and cell biology communities.

      Response to Reviewer 2

      Reviewer 2 (R2): Major: Please confirm that the observed phenotype is conserved within bone marrow-derived macrophages of LRRK2 G2019S mice. These mice are widely available within the community and frozen bone marrow could be sent to the labs. The main reason for this experiment is that CRISPR macrophage cell lines do sometimes acquire weird phenotypes (at least in our lab they sometimes do!) and it would strengthen the validity of the observations.

      HM: We did a series of experiments on primary BMDM derived from 3 pairs of wild type, LRRK2G2019S and LRRK2KO mice. We examined levels of ferritin heavy and light chains in steady state and in FAS treatment experiments. Unfortunately the data did not phenocopy the RAW macrophage lines we present here, since FTL and FTH were mostly unchanged. We did observe an increase in NCOA4 levels, consistent with potential issues with microautophagy as observed in our RAW system.

      While we understand the danger that our phenotypes are nonspecific and linked to a CRISPR-based anomaly, there are a number of arguments we would make that these data and pathways are potentially very important to our understanding of LRRK2 mutant phenotypes and pathology. The first point is that we now include a LRRK2-specific type II kinase inhibitor that reverses the iron-overload pRab8 accumulation at the plasma membrane in LRRK2G2019S cells, showing that this is at least directly linked to LRRK2 kinase activity, even though it is resistant to MLi2.

      Second, Suzanne Pfeffer recently published her single-cell RNAseq datasets from brains of untreated LRRK2G2019S mice (PMID: 39088390). She reported major changes in ferritin heavy chain (it is lost) in very specific cell types of the brain (astrocytes, microglia and oligodendrocytes), with no changes in other cell types at all (her Fig 6, included at left). This is consistent with a very context-specific impact of LRRK2 on iron homeostasis that we don't yet understand.

      Third, the labs of Cookson, Mamais and Lavoie have been working on the impact of LRRK2 mutations on iron handling in a few different model systems, including iPSCs, and see changes in transferrin recycling and iron accumulation. Those studies did not go into much detail on ferritin, NCOA4 and other readouts of iron homeostasis but are roughly in agreement with our work here. In the most recent bioRxiv study, submitted after we sent this work for review, they concluded their phenotypes were reversed by MLi2 treatment; however, they required 7 days of treatment for a ~20% restoration in iron levels. Given our work it would seem the impact of LRRK2G2019S in high iron conditions is also very resistant to MLi2 treatment. In all these studies we do not yet know for sure whether iron overload in the brain may be a precursor to DA neuron cell death, which could be exacerbated in G2019S carriers. But we hope the reviewer will agree that our approach and findings will be useful for the field to expand on these concepts within different models of PD.

      R2: Minor comments: Supplementary Fig 1: I don't think one should normalize all controls to 1 and then do a statistical test as obviously the standard deviation of control is 0.

      HM: We agree with the reviewer that statistical testing is not appropriate when the WT control is fixed to a value of 1, as this necessarily eliminates variance in that group; accordingly, we have removed both statistical comparisons and standard deviation from the WT control while retaining variability measures for all experimental conditions. Raw densitometry values could not be pooled across independent experiments due to substantial inter-blot variability, and therefore normalization to the WT control was used solely to allow relative comparison within experiments, acknowledging the inherent quantitative limitations of Western blot densitometry. Ultimately the magnitude of the changes relative to the control lanes in each biological replicate was consistent across experiments, even if the absolute density of the bands between experiments was not always the same.
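      The pitfall the reviewer describes can be illustrated with a quick sketch (the densitometry numbers below are hypothetical, not taken from the manuscript): once each blot's control band is divided by itself, the control group collapses to 1.0 with zero variance, so any statistical test against that group is meaningless.

      ```python
      import statistics

      # Hypothetical band intensities from three independent blots.
      control_raw = [1200.0, 950.0, 1410.0]
      treated_raw = [1800.0, 1330.0, 2050.0]

      # Normalizing each blot's control to itself forces every control value to 1.0.
      control_norm = [c / c for c in control_raw]
      treated_norm = [t / c for t, c in zip(treated_raw, control_raw)]

      print(statistics.stdev(control_norm))          # 0.0: no variance left to test against
      print([round(x, 2) for x in treated_norm])     # fold changes relative to each blot's control
      ```

      This is the rationale for dropping the statistical test and SD on the fixed WT control while retaining variability measures for the treated conditions.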

      R2: The raw data needs to be submitted to PRIDE or similar.

      HM: All of our data are being uploaded to the GEO databases, protocols to protocols.io, and raw data deposited on the Zenodo site, in compliance with our ASAP funding requirements and the journal's.

      R2: Some of the western blots could be improved. If these are the best shown, I am a little concerned about the reproducibility. How often have they been done?

      HM: We now ensure there is quantification of all the blots for at least 3 independent experiments and have worked to improve the quality of them throughout the revision period.

      R2: (Significance (Required)): Considering the importance of LRRK2 biology in Parkinson's and the new biology shown, this paper will be of great interest to the community and wider research fields.

      HM: We are so very grateful that the reviewer appreciates that the LRRK2 and PD community will find our work of interest. We hope our revisions will prove satisfactory even in the absence of ferritin changes in primary G2019S BMDM.

      Response to Reviewer 3

      Reviewer 3 (R3): What is missing in the study is the physiological relevance of these findings, mainly whether this effect actually results in higher cell death during iron overload. Since iron overload is known to result in ferroptosis, it is surprising that the authors have not checked whether the LRRK2 G2019S and ARM cells undergo more ferroptosis relative to LRRK2 WT cells.

      HM: We thank the reviewer for pushing us to monitor the functional implications of the iron mishandling upon iron overload in the G2019S RAW cell system. We now add a completely new Figure 4 to address these functional points. We employed two tools to look at established aspects of ferroptosis: first, the C11-BODIPY probe that labels oxidized lipids, where we see significant signals specific to the iron-loaded G2019S cells, labelling endocytic membranes and the cell surface (Fig 4 E-G). This is consistent with the elevation of free Fe2+. We also used the SYTOX Green death assay, in which the dye is internalized into cells when the cell surface is ruptured, and show that G2019S cells die upon iron overload, but not the LRRK2KO or wild type cells (Fig 4 H,I). Lastly, we performed full proteomics analysis of the WT and G2019S RAW cells in iron overload conditions. These data provide a better view of the full stress response initiated in the G2019S cells, including the upregulation of HMOX1 (an NRF2 target gene), changes in lysosomal hydrolytic enzymes consistent with the reduction in BSA-DQRed signals, and changes in the cytoskeleton, consistent with the plasma membrane blebbing phenotypes we see in G2019S (Fig. 4A-D and Supp. Fig 4 data). We hope these new data help to position the phenotype within a more physiological output.

      R3: Moreover, their conclusion of the findings as "resistant to LRRK2 kinase inhibitors" is not convincing, since in most of the studies, they have removed the kinase domain, and this description implies the use of pharmacological kinase inhibition which has not been done in this paper.

      HM: We took this comment to heart and, as explained in the general response, removed the LRRK2 ARM clones from the study. To understand the kinase function under iron overload conditions, we first explored rebastinib, a broad-specificity type II kinase inhibitor shown to inhibit LRRK2. In contrast to MLi-2, this drug effectively blocked p-Rab8 in G2019S cells exposed to high iron. However, since rebastinib is not specific and likely inhibits about 30-40% of all kinases, we reached out to the Reck-Peterson and Leschziner labs, who have developed LRRK2-specific type II kinase inhibitors (published in June 2025, PMID: 40465731). They provided these to us (along with a great deal of discussion), and both drugs blocked the effect of LRRK2 G2019S on p-Rab8 at the plasma membrane. These data show that the phenotypes we observe are indeed linked to the increased kinase activity of LRRK2, even though they are fully resistant to MLi-2. This suggests that high iron induces some alteration in LRRK2 conformation that prevents MLi-2 from blocking the kinase activity, while still allowing the type II kinase inhibitors, which bind deeper in the ATP-binding pocket, to functionally block activity. We believe these new data remove a great deal of the confusion in our initial submission around the MLi-2 resistance.

      R3: There is lower LRRK2 expression in LRRK2 G2019S cells, have the authors checked Rab phosphorylation to validate the mutation?

      HM: We agree that the G2019S mutation leads to a reduction in total LRRK2 levels in the cell, likely a compensatory mechanism to lower overall kinase activity. We show that the G2019S mutant has clear activation of phosphorylation of both Rab8 and the LRRK2 autophosphorylation site S1292, as seen in Fig 1A and quantified in Fig 1B. In untreated conditions, these phosphorylation events are reversible upon treatment with MLi-2. We also provide sequencing data in the supplement to confirm the presence of the G2019S mutation in this clone (Supp Fig. 1A).

      R3: The authors should specify if their cells are heterozygous or homozygous since they are discussing a dominant interfering mutant.

      HM: The G2019S and LRRK2 KO are both homozygous. We state this early in the results section and the methods.

      R3: The transferrin phenotype validated through proteomics and western blot is solid. HM: We agree, thank you very much!

      R3: Quantification in Figure 1F-G is problematic; it is not clear what they mean by "diffuse and lysosomal". Puncta are either colocalising with lysosomes or not. This needs to be clarified and re-analysed.

      HM: We apologize for the confusion. In control cells the Cherry-tagged FTL cycles efficiently through the lysosomes and we do not see a strong cytosolic (diffuse) pool, which likely reflects the relatively iron-poor culture conditions. In G2019S cells, however, FTL is highly elevated, with a strong cytosolic/diffuse stain at steady state and some flux into lysosomes. In this experiment we chelated iron to test whether this cytosolic pool of FTL was capable of clearing through the lysosomes (ferritinophagy). While a cytosolic (diffuse) pool remains, the pool that fluxes into the lysosome increases in chelated G2019S cells. This is also seen in the reduction of total endogenous FTL by western blot. Our conclusion is that the general ferritinophagy machinery remains functional in G2019S cells. We have changed the term "diffuse" to "cytosolic" and improved our description of this experiment in the text.

      R3: The first results section, "LRRK2G2019S RAW macrophages have altered iron homeostasis", is very long. It could be divided into more sections to improve readability. HM: We have improved the text to be more descriptive of the conclusions and added new section headings.

      R3: If the effect is armadillo-dependent, where does LRRK2 G2019S is implicated since there is no kinase domain in these cells?

      HM: Our new data employing the LRRK2-specific type II kinase inhibitors now confirm that the effects of G2019S under iron overload are indeed kinase-dependent; the kinase is simply insensitive to MLi-2.

      R3: The authors do not show any controls (PCR, sequencing) confirming knockout or truncation. HM: We performed higher-resolution proteomics and deep sequencing and learned that the "ARM" mutation was not a truncation but a series of 4 point mutations around exon 19. We therefore removed all data referring to this clone and replaced it with the type II kinase inhibitor experiments. We feel this removed a lot of confusion and provides much clearer conclusions on the role of the kinase activity in iron overload. We may continue to explore how these 4 point mutations created such strong phenotypes, as they could reflect a critical conformational change that impacts the kinase activity; but that is for future work. We now include the sequencing files of the G2019S and KO clones as Supplementary Data Files 1 and 2.

      R3: The data is interesting and the image quality with the insets is very high. HM: We thank the reviewer for their positive comments!

      R3: Mutant not clearly described in text, did the authors remove just the kinase and ROC-COR domains or all the domains downstream of the Armadillo domain? This is not clear. HM: We have removed the clone from the manuscript.

      R3: The authors cannot conclude that their phenotype is due to the independence of the kinase domain specifically as they are also interfering with the GTPase activity by removing the ROC-COR domains. HM: We agree and our new drugs allow us to confirm that the phenotypes are due to kinase activity, but there is a new conformation of LRRK2 induced in high iron that renders the kinase domain resistant to MLi-2 inhibition. We discuss this in the manuscript now.

      R3: In Figure 3E, is the difference between the "ARM CTRL" and the "ARM FAS" conditions significant? A trend appears to be there, but the p-value is not shown. HM: These data are now removed.

      R3: In figure 4A, it would have been important to check if Rab8 phosphorylation is also observed in LRRK2 KO cells after administration of FAS to further evaluate the mechanism through which this Rab8 phosphorylation is occurring.

      HM: We show that the pRab8 is specific to the G2019S lines and not seen in LRRK2 KO (Fig 3A,B, Supp. Fig. 3A,B).

      R3: The vinculin bands in figure 4A are misaligned with the rest of the bands.

      HM: We now provide new blots for all of these experiments (in Fig 3) as we removed the LRRK2ARM data from the manuscript and the appropriate loading controls are all included.

      R3: The authors do not have any controls to validate the pRab8 staining in IF. This is an important caveat and needs to be addressed. HM: We now include siRNA validation of Rab8 (vs Rab10) to confirm the specificity of the pRab8 antibody in IF, where it labels the plasma membrane in iron-loaded G2019S cells.

      R3: The authors should have checked if FAS administration in the LRRK2 G2019S and the ARM cells is leading to ferroptotic cell death (or cell death in general). This is key to validate the link between the altered iron homeostasis in LRRK2 G2019S cells and increased cytotoxicity observed during neurodegeneration.

      HM: As mentioned above, we have added extensively to our new Fig 4 to include full proteomics analysis of the changes in iron loaded G2019S cells, we use C11-Bodipy probes to monitor lipid oxidation, and SYTOX green assays to monitor cell death through cell surface rupture (consistent with ferroptosis). We thank the reviewer for pushing us to do these experiments and provide further relevance to the potential for LRRK2 mutations to promote cell toxicity during neurodegeneration.

      R3: Regarding the literature, the authors are missing some important papers that are preprinted and these studies need to be discussed. This includes a report with opposite findings (https://www.biorxiv.org/content/10.1101/2025.09.26.678370v1.full) and a report showing kinase-independent cell death in macrophages (https://www.biorxiv.org/content/10.1101/2023.09.27.559807v1.abstract).

      HM: We thank the reviewer for alerting us to the bioRxiv papers, one of which was posted after we sent our manuscript for review. We are excited to see the growing interest in the impact of LRRK2 function on iron homeostasis and hope our work will contribute to this. The study from the LaVoie lab does show some sensitivity of the iron-loading phenotype in G2019S cells; however, they see only a ~20% reduction in lysosomal iron after 7 days of MLi-2 treatment in astrocytes (their Fig 2L). To us, this very likely indicates a relatively high resistance to the drug, and we suspect the new type II inhibitors would reverse the iron load much more rapidly. The specificity of their phenotype to Rab8 is also very interesting considering the cell surface localization we see for pRab8 in our iron-loaded system. Similar comments apply to the Guttierez study in macrophages. We have included the findings of these papers within the manuscript and thank the reviewer for pointing them out.

    1. What kind of exploration, then, do the worlds of walking simulators support? Contrary to expectations, these games are rarely just about exploration. There are a few exceptions: Proteus (2013) is a joyful exploration of a shifting island purely for its own sake, and experimental games like Césure and Lumiere (both 2013) place the player in explorable abstracted spaces of light, color, and shadow (Reed 2013). But the most famous and successful walking simulators are best understood as explorations not of environment, but of character. Just as the environments in first-person shooters exist to support action-packed combat, the environments in most walking sims are designed to be platforms for understanding and empathizing with characters. In games like Dear Esther, Virginia (2016), What Remains of Edith Finch (2017), and many others, 3D game worlds come to be understood as metaphorical spaces offering windows into the minds and stories of the people within them. Sometimes this is made literal as part of the game’s fiction (as in the 2014 games Mind: Path to Thalamus and Ether One, both about entering an environmental representation of another character’s mind) but more commonly we understand this reification as working in the same way experimental films signify abstract meanings with concrete visuals, or the reality-bending conventions of magical realism or unreliable narrators creating layers of truth in literature.

      This argument is definitely backed by my own prior experience. Whether in Gone Home or other similar games I've played and seen, some with horror and mystery aspects, I never truly explored the environments simply for the sake of exploring my surroundings. Rather, I was always driven by a sense of curiosity to unravel the mysteries surrounding my character and the others around me.

    2. Stripping the violence from a first-person shooter, however, often results in a strange interstitial kind of experience, something in-between and unrecognizable. O’Connor’s review of the tourism Quake mod highlights some of the unsuitability of these environments to casual exploration. The architecture of these games in their original form is a means to the end of success in combat: to the extent the player notices it at all, it is while looking for places to hide, physical obstacles, routes for evasion or ambush. Details are designed to be glanced at briefly, not lingered over.

      I find it interesting that calmer games like walking sims originated from more violent, action-packed games like first-person shooters. Wanting to explore a world usually off-limits because of action and conflict is common in human beings. It's like setting a pack of cookies on a table, saying no one can eat them, and then leaving. Surely no one will miss one cookie in the pack. It's the same concept for these shooter games. Players are busy running around, surviving, and fighting, so they don't get to actually appreciate the world they are in (by design). It makes a part of their brain curious to experience the world they are already in without the pressures of battle, so a modder will take a cookie from the pack and eat it, opening up the rest of the pack for other people to enjoy. That's when they realize the game they were praising for graphics and immersion is actually just a rough sketch: raisin cookies when they thought they were chocolate chip. This creates a sense of disappointment, and I can see how walking sims were born from it: to cure the disappointment players felt when they realized the game they loved wasn't as polished as they thought. But without the allure of fighting, walking sims need something else, some other temptation. So they promise ice cream with the cookies, sprinkling lore and stories into their worlds. Finding pieces of the story gives players a dopamine boost while satisfying their curiosity for adventure. The article then mentions punishments within games, and how walking sims remove them and instead explore living with the consequences of your actions. This adds more depth to the ice cream and cookies players have been enjoying, forcing them to either love or hate them more intensely.

    1. It’s not just common sense that tells us that learning to write in general is not possible.

      SIDE NOTE: I'm not speaking about learning to write your A, B, C's, even though there is meaning behind that as well.


    1. Reviewer #2 (Public review):

      Summary:

      This work addresses the question whether artificial deep neural network models of the brain could be improved by incorporating top-down feedback, inspired by the architecture of neocortex.

      In line with known biological features of cortical top-down feedback, the authors model such feedback connections with both a typical driving effect and a purely modulatory effect on the activation of units in the network.

      To assess the functional impact of these top-down connections, they compare different architectures of feedforward and feedback connections in a model that mimics the ventral visual and auditory pathways in cortex on an audiovisual integration task.

      Notably, one architecture is inspired by human anatomical data, where higher visual and auditory layers possess modulatory top-down connections to all lower-level layers of the same modality, and visual areas provide feedforward input to auditory layers, whereas auditory areas provide modulatory feedback to visual areas.

      First, the authors find that this brain-like architecture imparts the models with a slight visual bias, similar to what is seen in human data; the bias is reversed in a reversed architecture, where auditory areas provide feedforward drive to the visual areas.

      Second, they find that, in their model, modulatory feedback should be complemented by a driving component to enable effective audiovisual integration, similar to what is observed in neural data.

      Overall, the study shows some possible functional implications of adding feedback connections to a deep artificial neural network that mimics some functional aspects of visual perception in humans.
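
      The driving-versus-modulatory distinction at the heart of this review can be sketched in a few lines of NumPy. This is a hypothetical minimal illustration, not the authors' actual implementation; the layer sizes, weight scales, and the `1 + sigmoid` gain are all assumptions made here for clarity:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def layer_with_feedback(x, fb, W_ff, W_drive, W_mod):
    """One lower-level layer receiving feedforward input x and
    top-down feedback fb from a higher layer.

    Driving feedback enters the pre-activation sum directly;
    modulatory feedback only rescales the resulting activation."""
    drive = W_ff @ x + W_drive @ fb       # driving component
    gain = 1.0 + sigmoid(W_mod @ fb)      # multiplicative gain, strictly in (1, 2)
    return relu(drive) * gain

n_in, n_fb, n_out = 4, 3, 5
W_ff = rng.normal(size=(n_out, n_in))
W_drive = rng.normal(size=(n_out, n_fb))
W_mod = rng.normal(size=(n_out, n_fb))

x = rng.normal(size=n_in)
fb = rng.normal(size=n_fb)

# Purely modulatory feedback (driving weights zeroed out) cannot create
# activity on its own: with silent feedforward input the output stays at zero.
out_mod_only = layer_with_feedback(np.zeros(n_in), fb,
                                   W_ff, np.zeros((n_out, n_fb)), W_mod)
print(np.allclose(out_mod_only, 0.0))  # True
```

      The last lines mirror the reviewer's observation that modulatory feedback alone is insufficient for cross-modal integration: a multiplicative gain has nothing to scale unless some driving input is present.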

      Strengths:

      The study contains innovative ideas, such as incorporating an anatomically inspired architecture into a deep ANN, and comparing its impact on a relevant task to alternative architectures.

      Moreover, the simplicity of the model makes it possible to draw conclusions on how features of the architecture and functional aspects of the top-down feedback affect performance of the network.

      This could be a helpful resource for future studies of the impact of top-down connections in deep artificial neural network models of neocortex.

      Weaknesses:

      Some claims not yet supported.

      The problem is that the results are phrased quite generally in the abstract and discussion, while the actual results shown in the paper are very specific to certain implementations of top-down feedback and certain architectures. This could lead to misunderstanding and requires some revision of the claims in the abstract and discussion (see below).

      "Altogether our findings demonstrate that modulatory top-down feedback is a computationally relevant feature of biological brain..."

      This claim is not supported, since no performance increase is demonstrated for modulatory feedback. So far, only the second half of the sentence is supported: "...and that incorporating it into ANNs affects their behavior and constrains the solutions it's likely to discover."

      "This bias does not impair performance on the audiovisual tasks."

      This is only true for the composite top-down feedback that combines driving and modulatory effects, whereas modulatory feedback alone can impair performance (e.g., in the visual tasks VS1 and VS2). The fact that modulatory feedback alone is insufficient in ANNs to enable effective cross-modal integration and requires some driving component is actually very interesting, but it is not stressed enough in the abstract. This is hinted at in the following sentence, but should be made more explicit:

      "The results further suggest that different configurations of top-down feedback make otherwise identically connected models functionally distinct from each other, and from traditional feedforward and laterally recurrent models."

      "Here we develop a deep neural network model that captures the core functional properties of top-down feedback in the neocortex" -> this is too strong, take out "the", because very likely there are other important properties that are not yet incorporated.

      "Altogether, our results demonstrate that the distinction between feedforward and feedback inputs has clear computational implications, and that ANN models of the brain should therefore consider top-down feedback as an important biological feature."

      This claim is still not substantiated by the evidence provided in the paper. First, the wording is a bit imprecise, because mechanistically it is not really feedforward versus feedback (a purely feedforward model is not considered at all in the paper), but modulatory versus driving. Moreover, the second part of the sentence is problematic: the results imply that, computationally/functionally, the driving connections are doing the job, while modulatory feedback does not really seem to improve performance (at best, it does no harm). It is true that it is a feature inspired by biology, but I don't see why the results imply that (modulatory) top-down feedback should be considered in ANN models of the brain. This would require showing that such models either improve performance or improve the ability to fit neural data, both of which are beyond the scope of the paper.

      The same argument holds for the following sentence, which is not supported by the results of the paper:

      "More broadly, our work supports the conclusion that both the cellular neurophysiology and structure of feed-back inputs have critical functional implications that need to be considered by computational models of brain function."

      Additional supplementary material required

      Although the second version checked the influence of processing time, this was not done for the most important figure of the paper, Figure 4. A central claim in the abstract, "This bias does not impair performance on the audiovisual tasks", relies on this figure, because only with composite feedback is the performance comparable between the "drive-only" and "brain-like" models. Thus, supplementary Figure 3 should also include the composite and drive-only networks to check the robustness of the claim with respect to processing time. This robustness analysis should then also be mentioned in the text. For example, it should be mentioned whether results in these networks are robust or not with respect to processing time, whether there are differences between network architectures or types of feedback in general, etc.

      Moreover, the current analysis for networks with modulatory feedback is a bit confusing. Why is the performance so low for the reverse model at processing times of 3 and 10? This is a very strong effect that warrants explanation. More details should be added to the caption as well. For example, are the models trained separately for the output after 3 and 10 processing steps, or just evaluated at these times? Not training the networks separately might explain the low performance of some networks, so ideally the networks should be trained for each choice of processing steps.
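
      The train-versus-evaluate distinction the reviewer raises can be made concrete with a toy recurrent unroll. This is a hypothetical sketch, not the authors' model; the sizes, weight scales, and seed are invented here. The point is simply that the hidden state, and hence any readout trained on it, depends on how many processing steps the network is run for:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy recurrent layer, unrolled for a fixed number of processing steps.
W_in = rng.normal(scale=0.5, size=(8, 4))
W_rec = rng.normal(scale=0.5, size=(8, 8))

def run(x, steps):
    h = np.zeros(8)
    for _ in range(steps):
        h = np.tanh(W_in @ x + W_rec @ h)  # one processing step
    return h

x = rng.normal(size=4)
h3, h10 = run(x, 3), run(x, 10)

# A readout trained on 10-step hidden states sees different inputs than
# one trained on 3-step states, so evaluating a fixed readout at a new
# step count is not guaranteed to be meaningful.
print(np.linalg.norm(h3 - h10))
```

      This is why separately training the networks for each choice of processing steps, as the reviewer suggests, is the cleaner comparison.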

    1. Reviewer #2 (Public review):

      Summary:

      The manuscript reports a cryo-EM structure of TMAO demethylase from Paracoccus sp. This is an important enzyme in the metabolism of trimethylamine oxide (TMAO) and trimethylamine (TMA) in human gut microbiota, so new information about this enzyme would certainly be of interest.

      Strengths:

      The cryo-EM structure for this enzyme is new and provides new insights into the function of the different protein domains, and a channel for formaldehyde between the two domains.

      Weaknesses:

      (1) The proposed catalytic mechanism in this manuscript does not make sense. Previous mechanistic studies on the Methylocella silvestris TMAO demethylase (FEBS Journal 2016, 283, 3979-3993, reference 7) reported that, as well as a Zn2+ cofactor, there was a dependence upon non-heme Fe2+, and proposed a catalytic mechanism involving deoxygenation to form TMA and an iron(IV)-oxo species, followed by oxidative demethylation to form DMA and formaldehyde.

      In this work, the authors do not mention the previously proposed mechanism, but instead say that elemental analysis "excluded iron". This is alarming, since the previous work has a key role for non-heme iron in the mechanism. The elemental analysis here gives a Zn content of about 0.5 mol/mol protein (and no Fe), whereas the Methylocella TMAO demethylase was reported to contain 0.97 mol Zn/mol protein, and 0.35-0.38 mol Fe/mol protein. It does, therefore, appear that their enzyme is depleted in Zn, and the absence of Fe impacts the mechanism, as explained below.

      The proposed catalytic mechanism in this manuscript, I am sorry to say, does not make sense to me, for several reasons:

      (i) Demethylation to form formaldehyde is not a hydrolytic process; it is an oxidative process (normally accomplished by either cytochrome P450 or non-heme iron-dependent oxygenase). The authors propose that a zinc (II) hydroxide attacks the methyl group, which is unprecedented, and even if it were possible, would generate methanol, not formaldehyde.

      (ii) The amine oxide is then proposed to deoxygenate, with hydroxide appearing on the Zn - unfortunately, amine oxide deoxygenation is a reductive process, for which a reducing agent is needed, and Zn2+ is not a redox-active metal ion;

      (iii) The authors say "forming a tetrahedral intermediate, as described for metalloproteinase", but zinc metalloproteases attack an amide carbonyl to form an oxyanion intermediate, whereas in this mechanism, there is no carbonyl to attack, so this statement is just wrong.

      So on several counts, the proposed mechanism cannot be correct. Some redox cofactor is needed in order to carry out amine oxide deoxygenation, and Zn2+ cannot fulfil that role. Fe2+ could do, which is why the previously proposed mechanism involving an iron(IV)-oxo intermediate is feasible. But the authors claim that their enzyme has no Fe. If so, then there must be some other redox cofactor present. Therefore, the authors need to re-analyse their enzyme carefully and look either for Fe or for some other redox-active metal ion, and then provide convincing experimental evidence for a feasible catalytic mechanism. As it stands, the proposed catalytic mechanism is unacceptable.

      (2) Given the metal content reported here, it is important to be able to compare the specific activity of the enzyme reported here with earlier preparations. The authors do quote a Vmax of 16.52 µM/min/mg; however, these are incorrect units for Vmax; they should be µmol/min/mg. There is a further inconsistency between the text saying µM/min/mg and the figure saying µM/min/µg.
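
      The units point matters because µM/min is a concentration change, not an amount per time: converting to conventional specific-activity units requires multiplying by the assay volume. A quick worked example, where all numbers are invented for illustration and not taken from the manuscript:

```python
# Hypothetical example: converting a measured rate in µM/min into a
# specific activity in µmol/min/mg, the conventional units for Vmax.
# All numbers below are illustrative, not from the manuscript.

rate_uM_per_min = 10.0   # observed rate: 10 µM product per minute
volume_L = 0.0002        # assay volume: 200 µL
enzyme_mg = 0.005        # enzyme in the assay: 5 µg = 0.005 mg

# µM/min x L = µmol/min  (1 µM = 1 µmol per litre)
rate_umol_per_min = rate_uM_per_min * volume_L     # 0.002 µmol/min

specific_activity = rate_umol_per_min / enzyme_mg  # µmol/min/mg
print(round(specific_activity, 3))  # 0.4
```

      Quoting µM/min/mg (or µM/min/µg) omits the volume factor entirely, which is why such values cannot be compared with specific activities from earlier preparations.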

      (3) The consumption of formaldehyde to form methylene-THF is potentially interesting, but the authors say "HCHO levels decreased in the presence of THF", which could potentially be due to enzyme inhibition by THF. Is there evidence that this is a time-dependent and protein-dependent reaction? Also in Figure 1C, HCHO reduction (%) is not very helpful, because we don't know what concentration of formaldehyde is formed under these conditions; it would be better to quote in units of concentration, rather than %.

      (4) Has this particular TMAO demethylase been reported before? It's not clear which Paracoccus strain the enzyme is from; the Experimental Section just says "Paracoccus sp.", which is not very precise. There has been published work on the Paracoccus PS1 enzyme; is that the strain used? Details about the strain are needed, and the accession for the protein sequence.

    2. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Thach et al. report on the structure and function of trimethylamine N-oxide demethylase (TDM). They identify a novel complex assembly composed of multiple TDM monomers and obtain high-resolution structural information for the catalytic site, including an analysis of its metal composition, which leads them to propose a mechanism for the catalytic reaction.

      In addition, the authors describe a novel substrate channel within the TDM complex that connects the N-terminal Zn2+-dependent TMAO demethylation domain with the C-terminal tetrahydrofolate (THF)-binding domain. This continuous intramolecular tunnel appears highly optimized for shuttling formaldehyde (HCHO), based on its negative electrostatic properties and restricted width. The authors propose that this channel facilitates the safe transfer of HCHO, enabling its efficient conversion to methylenetetrahydrofolate (MTHF) at the C-terminal domain as a microbial detoxification strategy.

      Strengths:

      The authors provide convincing high-resolution cryo-EM structural evidence (up to 2 Å) revealing an intriguing complex composed of two full monomers and two half-domains. They further present evidence for the metal ion bound at the active site and articulate a plausible hypothesis for the catalytic cycle. Substantial effort is devoted to optimizing and characterizing enzyme activity, including detailed kinetic analyses across a range of pH values, temperatures, and substrate concentrations. Furthermore, the authors validate their structural insights through functional analysis of active-site point mutants.

      In addition, the authors identify a continuous channel for formaldehyde (HCHO) passage within the structure and support this interpretation through molecular dynamics simulations. These analyses suggest an exciting mechanism of specific, dynamic, and gated channeling of HCHO. This finding is particularly appealing, as it implies the existence of a unique, completely enclosed conduit that may be of broad interest, including potential applications in bioengineering.

      Weaknesses:

      Although the idea of an enclosed channel for HCHO is compelling, the experimental evidence supporting enzymatic assistance in the reaction of HCHO with THF is less convincing. The linear regression analysis shown in Figure 1C demonstrates a THF concentration-dependent decrease in HCHO, but the concentrations used for THF greatly exceed its reported KD (enzyme concentration used in this assay is not reported). It has previously been shown that HCHO and THF can couple spontaneously in a non-enzymatic manner, raising the possibility that the observed effect does not require enzymatic channeling. An additional control that can rule out this possibility would help to strengthen the evidence. For example, mutating the THF binding site to prevent THF binding to the protein complex could clarify whether the observed decrease in HCHO depends on enzyme-mediated proximity effects. A mutation which would specifically disable channeling could be even more convincing (maybe at the narrowest bottleneck).

      We agree with the reviewer that HCHO and THF can react spontaneously in a non-enzymatic manner, and our experiments were not intended to demonstrate enzymatic channeling. The linear regression analysis in Figure 1C was designed solely to confirm that HCHO reacts with THF under our assay conditions. Accordingly, THF was titrated over a broad concentration range starting from zero, and the observed THF concentration–dependent decrease in HCHO reflects this chemical reactivity.

      We do not interpret these data as evidence that the enzyme catalyzes or is required for the HCHO–THF coupling reaction. Instead, the structural observation of an enclosed channel is presented as a separate finding. We have clarified this point in the revised text to avoid overinterpretation of the biochemical data (page 2, line 16).

      Another concern is that the observed decrease in HCHO could alternatively arise from a reduced production of HCHO due to a negative allosteric effect of THF binding on the active site. From this perspective, the interpretation would be more convincing if a clear coupled effect could be demonstrated, specifically, that removal of the product (HCHO) from the reaction equilibrium leads to an increase in the catalytic efficiency of the demethylation reaction.

      We agree that, in principle, a decrease in detectable HCHO could also arise from an indirect effect of THF binding on enzyme activity. However, in our study the experiment was not designed to assess catalytic coupling or allosteric regulation. The assay in question monitors HCHO levels under defined conditions and does not distinguish between changes in HCHO production and downstream consumption.

      Additionally, we do not interpret the observed decrease in HCHO as evidence that THF binding enhances catalytic efficiency, or that removal of HCHO shifts the reaction equilibrium. Instead, the data are presented to establish that HCHO can react with THF under the assay conditions. Any potential allosteric effects of THF on the demethylation reaction, or kinetic coupling between HCHO removal and catalysis, are beyond the scope of the current study, and are not claimed.

      While the enzyme kinetics appear to have been performed thoroughly, the description of the kinetic assays in the Methods section is very brief. Important details such as reaction buffer composition, cofactor identity and concentration (Zn2+), enzyme concentration, defined temperature, and precise pH are not clearly stated. Moreover, a detailed methodological description could not be found in the cited reference (6), if I am not mistaken.

      Thank you for the suggestion. We have added reference [24] to the methodological description on page 8. The Methods section has been revised accordingly on page 8 under “TDM Activity Assay,” without altering the Zn2+ concentration.

      The composition of the complex is intriguing but raises some questions. Based on SDS-PAGE analysis, the purified protein appears to be predominantly full-length TDM, and size-exclusion chromatography suggests an apparent molecular weight below 100 kDa. However, the cryo-EM structure reveals a substantially larger complex composed of two full-length monomers and two half-domains.

      We appreciate the reviewer’s careful analysis of the apparent discrepancy between the biochemical characterization and the cryo-EM structure. This issue is addressed in Figure S1, which may have been overlooked.

      As shown in Figure S1, the stability of TDM is highly dependent on protein and salt conditions. At 150 mM NaCl, SEC reveals a dominant peak eluting between 10.5 and 12 mL, corresponding to an estimated molecular weight of ~170–305 kDa (blue dot, Author response image 1). This fraction was explicitly selected for cryo-EM analysis and yields the larger complex observed in the reconstruction. At lower (50 mM) or higher (>150 mM) NaCl concentrations, the protein either aggregates or elutes near the void volume (~8 mL).

      SDS–PAGE analysis detects full-length TDM together with smaller fragments (~40–50 kDa and ~22–25 kDa). The apparent predominance of full-length protein on SDS–PAGE likely reflects its greater staining intensity per molecule and/or a higher population, rather than the absence of truncated species.

      Author response image 1.

      Given the lack of clear evidence for proteolytic fragments on the SDS-PAGE gel, it is unclear how the observed stoichiometry arises. This raises the possibility of higher-order assemblies or alternative oligomeric states. Did the authors attempt to pick or analyze larger particles during cryo-EM processing? Additional biophysical characterization of particle size distribution - for example, using interferometric scattering microscopy (iSCAT)-could help clarify the oligomeric state of the complex in solution.

      Cryo-EM data were collected exclusively from the size-exclusion chromatography fraction eluting between 10.5 and 12 mL. This fraction was selected to isolate the dominant assembly in solution. Extensive 2D and 3D particle classification did not reveal distinct classes corresponding to smaller species or higher-order oligomeric assemblies. Instead, the vast majority of particles converged to a single, well-defined structure consistent with the 2 full-length + 2 half-domain stoichiometry.

      A minor subpopulation (~2%) exhibited increased flexibility in the N-terminal region of the two full-length subunits, but these particles did not form a separate oligomeric class, indicating conformational heterogeneity rather than alternative assembly states (Author response image 2). Together, these data support the 2+2½ architecture as the predominant and stable complex under the conditions used for cryo-EM. Additional techniques, such as iSCAT, would provide complementary information, but are not required to support the conclusions drawn from the SEC and cryo-EM analyses presented here.

      Author response image 2.

      The authors mention strict symmetry in the complex, yet C2 symmetry was enforced during refinement. While this is reasonable as an initial approach, it would strengthen the structural interpretation to relax the symmetry to C1 using the C2-refined map as a reference. This could reveal subtle asymmetries or domain-specific differences without sacrificing the overall quality of the reconstruction.

      We thank the reviewer for this thoughtful suggestion. In standard cryo-EM data processing, symmetry is typically not imposed initially to minimize potential model bias; accordingly, we first performed C1 refinement before applying C2 symmetry. The resulting C1 reconstructions revealed no detectable asymmetry or domain-specific differences relative to the C2 map. In addition, relaxing the symmetry consistently reduced overall resolution, indicating lower alignment accuracy and further supporting the presence of a predominantly symmetric assembly.

      In this context, the proposed catalytic role of Zn<sup>2+</sup> raises additional questions. Why is a 2:1 enzyme-to-metal stoichiometry observed, and how does this reconcile with previous reports? This point warrants discussion. Does this imply asymmetric catalysis within the complex? Would the stoichiometry change under Zn<sup>2+</sup>-saturating conditions, as no Zn<sup>2+</sup> appears to be added to the buffers? It would be helpful to clarify whether Zn<sup>2+</sup> occupancy is equivalent in both active sites when symmetry is not imposed, or whether partial occupancy is observed.

      The observed ~2:1 enzyme-to-Zn<sup>2+</sup> stoichiometry likely reflects the composition of the 2 full-length + 2 half-domain (2+2½) complex. In this assembly, only the core domains that are fully present in the complex contribute to metal binding; the truncated half-domains lack the Zn<sup>2+</sup>-binding domain. As a result, only two metal-binding sites are occupied per assembled complex, consistent with the measured stoichiometry.

      We note that Zn<sup>2+</sup> was not deliberately added to the buffers, so occupancy may not reflect full saturation. Based on our cryo-EM and biochemical data, both metal-binding sites in the full-length subunits appear to be occupied to an equivalent extent, and no clear evidence of asymmetric catalysis is observed under the current experimental conditions. Full Zn<sup>2+</sup> saturation could potentially increase occupancy, but was not explored in these experiments.

      The divalent ion Zn<sup>2+</sup> is suggested to activate water for the catalytic reaction. I am not sure if there is a need for a water molecule to explain this catalytic mechanism. Can you please elaborate on this more? As one aspect, it might be helpful to explain in more detail how Zn-OH and D220 are recovered in the last step before a new water molecule comes in.

      Thank you for your suggestion. We revised the text on page 2 as below.

      Based on our structural and biochemical data, we propose a structurally informed working model for TMAO turnover by TDM (Scheme 1). In this model, Zn<sup>2+</sup> plays a non-redox role by polarizing the O–H bond of the bound hydroxyl, thereby lowering its pK<sub>a</sub>. The D220 carboxylate functions as a general base, abstracting the proton to generate a hydroxide nucleophile. This hydroxide then attacks the electrophilic N-methyl carbon of TMAO, forming a tetrahedral carbinolamine (hemiaminal) intermediate. Subsequent heterolytic cleavage of the C–N bond leads to the release of HCHO. D220 then switches roles to act as a general acid, donating a proton to the departing nitrogen, which facilitates product release and regenerates the active site. This sequence allows a new water molecule to rebind Zn<sup>2+</sup>, enabling subsequent catalytic turnovers. This proposed pathway is consistent with prior mechanistic studies, in which water addition to the azomethine carbon of a cationic Schiff base generates a carbinolamine intermediate, followed by a rate-limiting breakdown to yield an amino alcohol and a carbonyl compound, in the published case, an aldehyde (Pihlaja et al., J. Chem. Soc. Perkin Trans. 2, 1983, 8, 1223–1226).

      Overall, the authors were successful in advancing our structural and functional understanding of the TDM complex. They suggest an interesting oligomeric complex composition which should be investigated with additional biophysical techniques.

      Additionally, they provide an intriguing hypothesis for a new type of substrate channeling. Additional kinetic experiments focusing on HCHO and THF turnover by enzymatic proximity effects would strengthen this potentially fundamental finding. If this channeling mechanism can be supported by stronger experimental evidence, it would substantially advance our understanding of biological conduits and enable future efforts in the design of artificial cascade-catalysis systems with high conversion rates and efficiency, as well as detoxification pathways.

      Reviewer #2 (Public review):

      Summary:

      The manuscript reports a cryo-EM structure of TMAO demethylase from Paracoccus sp. This is an important enzyme in the metabolism of trimethylamine oxide (TMAO) and trimethylamine (TMA) in human gut microbiota, so new information about this enzyme would certainly be of interest.

      Strengths:

      The cryo-EM structure for this enzyme is new and provides new insights into the function of the different protein domains, and a channel for formaldehyde between the two domains.

      Weaknesses:

      (1) The proposed catalytic mechanism in this manuscript does not make sense. Previous mechanistic studies on the Methylocella silvestris TMAO demethylase (FEBS Journal 2016, 283, 3979-3993, reference 7) reported that, as well as a Zn<sup>2+</sup> cofactor, there was a dependence upon non-heme Fe<sup>2+</sup>, and proposed a catalytic mechanism involving deoxygenation to form TMA and an iron(IV)-oxo species, followed by oxidative demethylation to form DMA and formaldehyde.

      In this work, the authors do not mention the previously proposed mechanism, but instead say that elemental analysis "excluded iron". This is alarming, since the previous work has a key role for non-heme iron in the mechanism. The elemental analysis here gives a Zn content of about 0.5 mol/mol protein (and no Fe), whereas the Methylocella TMAO demethylase was reported to contain 0.97 mol Zn/mol protein, and 0.35-0.38 mol Fe/mol protein. It does, therefore, appear that their enzyme is depleted in Zn, and the absence of Fe impacts the mechanism, as explained below.

      The proposed catalytic mechanism in this manuscript, I am sorry to say, does not make sense to me, for several reasons:

      (i) Demethylation to form formaldehyde is not a hydrolytic process; it is an oxidative process (normally accomplished by either cytochrome P450 or non-heme iron-dependent oxygenase). The authors propose that a zinc (II) hydroxide attacks the methyl group, which is unprecedented, and even if it were possible, would generate methanol, not formaldehyde.

      (ii) The amine oxide is then proposed to deoxygenate, with hydroxide appearing on the Zn - unfortunately, amine oxide deoxygenation is a reductive process, for which a reducing agent is needed, and Zn<sup>2+</sup> is not a redox-active metal ion;

      (iii) The authors say "forming a tetrahedral intermediate, as described for metalloproteinase", but zinc metalloproteases attack an amide carbonyl to form an oxyanion intermediate, whereas in this mechanism, there is no carbonyl to attack, so this statement is just wrong.

      So on several counts, the proposed mechanism cannot be correct. Some redox cofactor is needed in order to carry out amine oxide deoxygenation, and Zn<sup>2+</sup> cannot fulfil that role. Fe<sup>2+</sup> could do, which is why the previously proposed mechanism involving an iron(IV)-oxo intermediate is feasible. But the authors claim that their enzyme has no Fe. If so, then there must be some other redox cofactor present. Therefore, the authors need to re-analyse their enzyme carefully and look either for Fe or for some other redox-active metal ion, and then provide convincing experimental evidence for a feasible catalytic mechanism. As it stands, the proposed catalytic mechanism is unacceptable.

      We thank the reviewer for the detailed and thoughtful mechanistic critique. We fully agree that Zn<sup>2+</sup> is not redox-active, and cannot directly mediate oxidative demethylation or amine oxide deoxygenation. We acknowledge that the oxidative step required for the conversion of TMAO to HCHO is not explicitly resolved in the present study. Accordingly, we have revised the manuscript to remove any implication of Zn<sup>2+</sup>-mediated redox chemistry, and have eliminated the previously imprecise analogy to zinc metalloproteases.

      We recognize and now discuss prior biochemical work on TMAO demethylase from Methylocella silvestris (MsTDM), which proposed an iron-dependent oxidative mechanism (Zhu et al., FEBS 2016, 3979–3993). That study reported approximately one Zn<sup>2+</sup> and one non-heme Fe<sup>2+</sup> per active enzyme, implicated iron in catalysis through homology modeling and mutagenesis, and used crossover experiments suggesting a trimethylamine-like intermediate and oxygen transfer from TMAO, consistent with an Fe-dependent redox process. However, that system lacked experimental structural information, and did not define discrete metal-binding sites.

      In contrast,

      (1) Our high-resolution cryo-EM structures and metal analyses of TDM consistently reveal only a single, well-defined Zn<sup>2+</sup>-binding site, with no structural evidence for an additional iron-binding site as in the previous report (Zhu et al., FEBS 2016, 3979–3993).

      (2) To investigate the potential involvement of iron, we expressed TDM in LB medium supplemented with ferrous ammonium sulfate, (NH<sub>4</sub>)<sub>2</sub>Fe(SO<sub>4</sub>)<sub>2</sub>, and determined its cryo-EM structure. This structure is identical to the original one, and no EM density corresponding to a second iron ion was observed. Moreover, the previously proposed Fe<sup>2+</sup>-binding residues are spatially distant (Figure S6).

      (3) ICP-MS analysis shows no detectable iron, only zinc (Figure S5).

      (4) The enzyme kinetics of our iron-free TDM are comparable to those of MsTDM (Figure 1A). We propose that the differences in K<sub>m</sub> and V<sub>max</sub> arise from differences in the overall sequences of the two enzymes. Please also see the comment at the end on a newly published paper on MsTDM.

      While we cannot comment on the MsTDM results, our experimental results do not support the presence of an iron-binding site. Our data indicate that this chemistry is unlikely to be mediated by a canonical non-heme iron center as proposed for MsTDM. We therefore revised our model into a structural framework that rationalizes substrate binding, metal coordination, and product stabilization, while clearly delineating the limits of mechanistic inference supported by the current data.

      Scheme 1 and the proposed-mechanism section were revised on page 4. Figure S6 was added.

      (2) Given the metal content reported here, it is important to be able to compare the specific activity of the enzyme reported here with earlier preparations. The authors do quote a Vmax of 16.52 µM/min/mg; however, these are incorrect units for Vmax, they should be µmol/min/mg. There is a further inconsistency between the text saying µM/min/mg and the Figure saying µM/min/µg.

      Thank you for the correction. We converted the V<sub>max</sub> unit to nmol/min/mg and revised the text on page 2. We also compared our value with that of the previous report on the TDM enzyme in the revised text on page 2. See also the note on a newly published manuscript and its comparison.
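For readers checking the unit change, the dimensional reasoning can be sketched as follows; the reaction volume (100 µL) and enzyme amount (0.01 mg) below are illustrative assumptions, not values from the assay:

```python
def to_specific_activity(rate_uM_per_min, reaction_volume_mL, enzyme_mg):
    # 1 µM = 1 µmol/L = 1 nmol/mL, so a volumetric rate in µM/min times
    # the reaction volume in mL gives an absolute rate in nmol/min;
    # dividing by the enzyme mass in the reaction yields nmol/min/mg.
    return rate_uM_per_min * reaction_volume_mL / enzyme_mg

# Hypothetical assay: 16.52 µM/min in a 100 µL reaction containing 0.01 mg enzyme
print(to_specific_activity(16.52, 0.1, 0.01))
```

The key point is that µM/min is a concentration rate, so converting to a specific activity requires both the reaction volume and the enzyme mass.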

      (3) The consumption of formaldehyde to form methylene-THF is potentially interesting, but the authors say "HCHO levels decreased in the presence of THF", which could potentially be due to enzyme inhibition by THF. Is there evidence that this is a time-dependent and protein-dependent reaction? Also in Figure 1C, HCHO reduction (%) is not very helpful, because we don't know what concentration of formaldehyde is formed under these conditions; it would be better to quote in units of concentration, rather than %.

      We appreciate this important point. We have revised Figure 1C to present HCHO levels in absolute concentration units. While the current data demonstrate reduced detectable HCHO in the presence of THF, we agree that distinguishing between HCHO consumption and potential THF-mediated enzyme inhibition would require dedicated time-course and protein-dependence experiments. We have therefore revised the description to avoid overinterpretation and limit our conclusions to the observed changes in HCHO concentration on page 2, lines 18–19.

      (4) Has this particular TMAO demethylase been reported before? It's not clear which Paracoccus strain the enzyme is from; the Experimental Section just says "Paracoccus sp.", which is not very precise. There has been published work on the Paracoccus PS1 enzyme; is that the strain used? Details about the strain are needed, and the accession for the protein sequence.

      Thank you for this comment. We now indicate that the enzyme is derived from Paracoccus sp. DMF and provide the accession number for the protein sequence (WP_263566861) in the Experimental Section (page 8, line 4).

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) The ITC experiment requires a ligand-into-buffer titration as an additional control. Also, maybe I misunderstood the molar ratio or the concentrations you used, but if you indeed added a total of 4.75 μL of 20 μM THF into 250 μL of 5 μM TDM, it is not clear to me how this leads to a final molar ratio of 3.

      We thank the reviewer for this suggestion. A ligand-into-buffer control ITC experiment was performed and is now included in Figure S8C, which shows no detectable signal.

      Regarding the molar ratio, this was our mistake. The experiment used 2.45 μL injections of 80 μM THF into 250 μL of 5 μM TDM. This corresponds to a final ligand concentration of ~12.8 μM, giving a ligand-to-protein molar ratio of ~2.6. We revised the text in the ITC section on page 9.
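As a rough sanity check of these numbers, the final ligand concentration and molar ratio can be estimated with a simple dilution model; the injection count (20) is our assumption for illustration and is not stated in the Methods:

```python
def final_ligand_conc_uM(n_injections, inj_uL, ligand_uM, cell_uL):
    # Simple-dilution estimate: the injected ligand is distributed over
    # the combined cell + injected volume (ignoring displacement effects).
    v_inj = n_injections * inj_uL
    return ligand_uM * v_inj / (cell_uL + v_inj)

# 80 µM THF injected in 2.45 µL steps into a 250 µL cell of 5 µM TDM
conc = final_ligand_conc_uM(20, 2.45, 80.0, 250.0)
ratio = conc / 5.0
print(round(conc, 1), round(ratio, 1))
```

With ~20 injections this simple model reproduces a final ligand concentration near 13 µM and a molar ratio near 2.6, consistent with the corrected values above.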

      (2) Characterization/quality check of all mutant enzymes should be performed by NanoDSF, CD spectroscopy or similar techniques to confirm that proteins are properly folded and fit for kinetic testing.

      We appreciate the reviewer’s suggestion. All mutant proteins, including D220A, D367A, and F327A, were purified with yields similar to the wild-type enzyme. Additionally, cryo-EM maps of the mutants show well-defined density and overall structural integrity consistent with the wild-type. These findings indicate that the introduced mutations do not significantly affect protein folding, supporting their use for kinetic analysis. While NanoDSF might reveal differences in thermal stability due to mutations, it does not provide structural information. Our conclusions are not based on minor differences in thermostability. Our cryo-EM structures of the mutants offer much more reliable structural data than CD spectroscopy.

      (3) Best practice would suggest overlapping pH ranges between different buffer systems in the pH-dependence experiments to rule out buffer-specific effects independent of pH.

      We thank the reviewer for this helpful suggestion. We agree that overlapping pH ranges between different buffer systems can be valuable for excluding buffer-specific effects. In this study, the pH-dependence experiments were intended to provide a qualitative assessment of pH sensitivity rather than a detailed analysis of buffer-independent pK<sub>a</sub> values. While we cannot fully exclude minor buffer-specific contributions, the overall trends observed were reproducible and sufficient to support the conclusions drawn. We have added a clarifying statement to the revised manuscript to reflect this consideration (page 2, line 12).

      (4) "Structural comparison revealed high similarity to a THF-binding protein, with superposition onto a T protein": It would be nice to show this as an additional figure, as resolution and occupancy for THF are low.

      We thank the reviewer for this suggestion. To address this point, we have revised Figure S6 by adding an additional panel (panel C, now Figure S7C) showing the structural superposition of TDM with the THF-binding T protein. This comparison is included to better illustrate the structural similarity, despite the limited resolution and partial occupancy of THF density in our map.

      (5) Editing could have been done more thoroughly. Some spelling mistakes, e.g. "RESEULTS", "redius", "complec"; kinetic rate constants should be written in italic (not uniform between text and figures); Prism version is missing; Vmax of 16.52 µM/min/mg - doublecheck units; Figure S1B: The "arrow on the right" might have gone missing.

      We corrected the spelling on page 2, line 10; page 5, line 34; and page 6, line 40. The Prism version was added. The arrow was added to Figure S1B. The V<sub>max</sub> unit was corrected to nmol/min/mg.

      Reviewer #2 (Recommendations for the authors):

      (1) The authors must re-examine the metal content of their purified enzyme, looking in particular for Fe or another redox-active metal ion, which could be involved in a reasonable catalytic mechanism.

      We thank the reviewer for this suggestion and have carefully re-examined the metal content of TDM. Elemental analyses by EDX and ICP-MS consistently detected Zn<sup>2+</sup> in purified TDM (Zn:protein ≈ 1:2), whereas Fe was below the detection limit across multiple independent preparations (Fig. S5A,B). To assess whether iron could be incorporated or play a functional role, we expressed TDM in E. coli grown in LB medium supplemented with (NH<sub>4</sub>)<sub>2</sub>Fe(SO<sub>4</sub>)<sub>2</sub> and performed activity assays in the presence of exogenous Fe<sup>2+</sup>. Neither condition resulted in enhanced enzymatic activity.

      Consistent with these biochemical data, all cryo-EM structures reveal a single, well-defined metal-binding site coordinated by three conserved cysteine residues and occupied by Zn<sup>2+</sup>, with no evidence for an additional iron species or other redox-active metal site.

      (2) The specific activity of the enzyme should be quoted in the same units as other literature papers, so that the enzyme activity can be compared. It could be, for example, that the content of Fe (or other redox-active metal) is low, and that could then give rise to a low specific activity.

      Thank you for the suggestion. We quoted the enzyme activity in the same units as the previous report and revised the text on page 2.

      Since the submission of our paper, a new report on MsTDM has been published (Cappa et al., Protein Science 33(11), e70364). It further supports our findings. First, the kinetic parameters reported using ITC (V<sub>max</sub> = 0.309 μmol/s, approximately 240 nmol/min/mg; K<sub>m</sub> = 0.866 mM) are comparable to our observed values (156 nmol/min/mg and 1.33 mM, respectively) in the absence of exogenous iron. Second, the optimal pH for enzymatic activity is similar to that observed for our paraTDM. Third, the reported two-state unfolding behavior is consistent with our cryo-EM structural observations, in which the more dynamic subunits appear to destabilize prior to unfolding of the core domains. Based on these findings, we now propose that Zn<sup>2+</sup> functions primarily as an organizational cofactor at the core catalytic domain (revised Scheme 1).
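The comparison of kinetic parameters can be made concrete with a short Michaelis-Menten sketch using only the values quoted in this response (substrate concentrations are illustrative):

```python
def michaelis_menten(s_mM, vmax, km_mM):
    # v = Vmax * [S] / (Km + [S]); Vmax in nmol/min/mg, [S] and Km in mM
    return vmax * s_mM / (km_mM + s_mM)

# Reported parameters: paraTDM (this work) and MsTDM (Cappa et al.)
for name, vmax, km in [("paraTDM", 156.0, 1.33), ("MsTDM", 240.0, 0.866)]:
    # At [S] = Km the rate is half of Vmax, a quick consistency check
    print(name, round(michaelis_menten(km, vmax, km), 1))
```

This illustrates that, despite the difference in K<sub>m</sub>, the two enzymes reach half-maximal rates of the same order of magnitude, consistent with the comparison drawn above.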

    1. In random moments, you find yourself wondering if you should just burn it all to the ground and start over.

      I would make this bolder - on first read I'm thinking that I would maybe even make this the header because it's very striking and head-nod-y

    1. First, phishing. Fishing with an F: there's a way to catch fish using bait like worms. It's fun, except for the worms. Phishing with a Ph isn't fun. It's scary internet stuff. Phishing is a way for bad people to catch your private details, like your bank account number or passwords. The bait they use is lies. Here's how phishing works. Say you get an email. It looks like it's from someone you trust, like your bank, but it's not. It tells you to confirm your bank details or your account may be closed. Scary. You click on the link and go to a website. It looks like a real bank website, but it's not. You enter your details, and someone uses them to steal your identity and buy things with your money. Which is not nice. Here's how to be safe from phishing. Number one: your bank will never ask you to confirm your details via an email, like, ever. This is the most obvious way to spot a phishing attempt. If you receive an email like this, be suspicious. Don't click it. Number two: look for your name. Phishing messages say things like, um, "dear valued customer." If it doesn't say your name, don't click it. Number three: look at the URL in your browser window. If the URL shows a different name from the name of the company, don't click it. Number four: rest your mouse pointer on the link; that will show the real web address. If it doesn't look like the proper company name, don't click it. Number five: look for spelling mistakes. If there are any spelling mistakes, or the email doesn't look professional, don't click it. Number six: get security software that includes anti-phishing and identity-protection features, like Norton 360 from Symantec. Best of all, just don't use links in emails to get to websites, like, ever. Always type the URL instead. Thank you for looking at our quick guide to scary internet stuff and how to be safe.

    2. That's why innovative startups are popping up around the world, solving the security problem with biometrics. We actually use the vein patterns in your eye. EyeVerify discovered that the blood vessels in the whites of your eyes are as unique as the whorls on the tips of your fingers. So the iris: many people know the iris, and they expect that to be a good biometric, and it is, but it actually takes infrared light and an infrared camera, which phones don't have. We use just an ordinary phone. Enter my name. Gosh, this is fast. Done. EyeVerify caught the attention of Samsung, Sprint, and Wells Fargo. They all invested. Now the Kansas City-based company is using that backing to crack two markets where security is paramount. Number one is enterprises. The companies want to protect access to their networks. There's a breach, like, every day. The second one is, you know, broadly, mobile banking or mobile financial services. According to one forecast, by 2019 five and a half billion people will use biometric authentication on mobile and wearable devices. MasterCard has been experimenting with biometrics, too, and recently tested an app using voice and facial recognition for 14,000 e-commerce transactions. MasterCard also invested in Bionym. This Toronto-based startup has developed a wristband, the $79 Nymi, that harnesses what may be the most secure biometric of all: your heartbeat. You look at all of the little peaks and valleys in the shape of that waveform; that's unique to you. We've now tied this to you. It's in an active state because it's, no, it's still on your wrist, right? And now we can have it do stuff. For example, we can have it unlock a phone. That's the beauty of persistent identity. Imagine how seamless life would be if logging onto your email, unlocking your car, or checking into a flight were as easy as strapping on a wristband and verifying your ECG. The device does the rest. Bionym aims to kill the password.
I don't know if you use that word too much, but probably.
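Tips three and four in the phishing guide above (comparing a link's real domain to the company's name) can be sketched in code; `yourbank.com` and the helper name are hypothetical examples, not a real service:

```python
from urllib.parse import urlparse

def link_matches_company(url, company_domain):
    # A link is plausibly legitimate only if its actual hostname is the
    # company's domain or a subdomain of it; anything else fails the check.
    host = urlparse(url).hostname or ""
    return host == company_domain or host.endswith("." + company_domain)

print(link_matches_company("https://www.yourbank.com/login", "yourbank.com"))            # True
print(link_matches_company("http://yourbank.com.evil.example/confirm", "yourbank.com"))  # False
```

Note that the second URL starts with the bank's name but actually points at `evil.example`, which is exactly the trick the guide warns about.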

    1. it’s easy to forget that it lies on top of these other experiences of subjective feeling and of wanting to communicate that feeling.

      I think it's a bit of an exaggeration to demonize AI; we already know it's not alive when it says it's happy to see us. It's like Dora the Explorer asking how your day is going: it's just a cartoon, and behind it is a voice actor who will never know you and your struggles in life.

    2. : they let you engage in something like plagiarism, but there’s no guilt associated with it because it’s not clear even to you that you’re copying.

      I've heard that they got their data to train the AI illegally, but I don't think that's our problem; I think that's between them and the people they stole from. As long as you inspect the sources they use and credit them, then that's not your issue. I think sometimes we dig deep into stuff that's none of our business. There are a lot of sketchy things going on in the world, and if we focused too much on their origins, we would die from boycotting everything. It's like if your boss does something illegal to get his money, but you're just working a normal job: as long as you don't use that money in a bad way, it's none of your business what he does, because you're not directly assisting his act.

    3. What I’m saying is that art requires making choices at every scale; the countless small-scale choices made during implementation are just as important to the final product as the few large-scale choices made during the conception.

      I guess there are details you can find profound that you won't find AI doing, since it's mostly taking what's common, but who's to say it won't in the future make the small decisions we do. At the end of the day, when we draw, it's from our own experience in life, and AI is just observing our observations.

    1. Why are so many techies dysphoric?It must be said that some people in tech are closeted or unaware trans people, and it's probably significantly more of the population than we might think given that a lot of trans people wind up drawn towards tech as a field. In these cases, the dysphoria makes a considerable amount of sense. However, even at the outer limit, that would account for no more than a quarter of tech people, which isn't enough to explain the general prevalence of dysphoria that we observe in the tech community. This means that we need an explanation for why our tech industry is so dead-eyed and void of emotion or motivation that isn't just that they need estrogen.

      while there are relatively more trans people in tech (a clear pattern yes), it does not explain the overall presence of dysphoria in tech.

    2. Without knowing that it's gender-related, dysphoria often presents precisely as this kind of directionlessness, not having desires or not knowing what you want. You often wind up kind of sleepwalking through life, trying to pursue the things that you think that you should want or the desires that society tells you are appropriate for someone in your supposed social position. Nothing ever quite works though, and often enough, until you figure out the problem, you kinda just... stop wanting things and stop trying entirely.

      Describes dysphoria as directionlessness: as long as you don't know its cause, you mimic the desires and motions others go through and society suggests, leading to detachment and withdrawal.

  2. drive.google.com
    1. By seeing authors as genius artists only, we remove ourselves from the activity of writing, which is social and contextual, and are distracted by the product itself.

      I hadn't thought about it that way, but I think it's right. Instead of just using the product, it would be much more interesting to create something myself.

    2. In fact, at times throughout history, the best authors were believed to have been chosen and directly inspired by God Himself.

      I don't see it exactly the same way, but I agree with part of it. I think writing is different from other talents. For example, being good at math comes from practice, just repeating things over and over. But with writing, I believe people are born with that imagination; it's a special gift you're born with.

    1. For example, author Roxane Gay has said, “Content going viral is overwhelming, intimidating, exciting, and downright scary.”

      It’s so interesting that such a seemingly regular aspect of our modern-day society is so disorienting for those at the center of it. It just seems like something people aren’t meant to experience, and yet it increasingly becomes an outcome more people want to achieve.

    1. Much like a standardized test can be hard to see as a problem if you score well on it, video games that cater to one’s own skills, abilities, and habits can be hard to see as a systemic problem. Hardcore gamers can often be seen calling for more video games that suit them, rather than for diverse games that draw in more players or test new kinds of skills. The creative director at Ubisoft argues that this predictability and consistency in video games is “less about familiarity than it’s just enjoying the content.”

      And that's what's dark(ening), I'd say! It naturalises conservatism. By matching mainstream culture to individualistic desire, it establishes A Brave New World's premise: "I am giving you what you want". But desire is manufactured AND HISTORICAL (inherited, slightly molded through Overton window and Creeping normality), there was no choice, there is no choice, we have no free will, we are interdependent!

    2. if you are one of those people with such a degree you still have ninety-three fellow villagers who have not had the same opportunities. If you have access to a computer, you would be one of twenty-two that do, numbers that are largely matched by the seventeen who are unable to read and write, the twenty-three with no shelter from wind and rain, and the thirteen who do not have safe water to drink. At the point that you are fortunate enough to be in a position where you can have the ability, the time, and the resources to read a book about video games, you are one of a very small group of people in the world with that opportunity.

      Matthew effect. Survivors survive, the rich get richer. Of course, spotlight bias would have us imagine it's possible anyway, and rich people actively cultivate that image by buying news outlets, marketing, philanthropies, or even social media sites, to shout about how many people went from zero to hero. Just check the 1%, and see how monopolistic aristocrats relate to and feed on other monopolistic aristocrats.

      Trump, Elon, Bezos, Gates, Zuckerberg, are white, straight, cis, men BORN IN RICH FAMILIES.


    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Response to Reviewers

      We thank the Reviewers for their appreciative comments (Reviewer 1: “first time that a well-established existing mathematical model of signaling response extended and applied to heterogeneous ligand mixtures”) and constructive suggestions for improvement. In this extensive revision, we have not only addressed the suggestions comprehensively but also extended our analysis of signaling antagonism to all doses and at the single-cell level using novel computational workflows. This resulted in the discovery of several mechanisms of antagonism and synergy that are dose-dependent, and dependent on the cell-specific state of the signaling network, thereby manifesting in only a subset of cells.

      In addressing the Reviewers' comments, we have made substantial revisions to improve clarity, rigor, and biological interpretation. Below we briefly summarize the main concerns raised by Reviewers 1-3 and how we have addressed them.

      • We have rewritten the Methods section to clarify our approaches. We have also added the explanation of methodology and the rationale in the main text to improve readability and comprehensiveness (Addressing Reviewer #1 comments). This includes explaining and justifying the signaling codon approaches (Reviewer 1), our core-module parameter matching methodology and discussion (Reviewer #1, point 11, Reviewer #2, point 1), and the model schematic (Reviewer #1, point 5).
      • For one of our major conclusions – that macrophages may distinguish stimuli in the context of ligand mixtures – we have validated these results with experiments, which increases confidence in this conclusion (Reviewer #2, point 3, Reviewer #3, point 2).
      • We have updated the model for CpG-pIC competition using Michaelis–Menten kinetics without any additional parameters, rather than introducing new free parameters. This change removes parameter freedom for fitting combinatorial conditions, leading to a more constrained and mechanistically grounded model whose predictions align better with experimental data (Updated Figures 2 and S2; Reviewer #2, point 2).
      • We have addressed all other editorial and clarification-related concerns as well, as detailed in our point-by-point response below. In addition, we have extended the scope of the manuscript. We have extended our analysis of ligand combinations across a broad dose range, from non-responsive to saturated conditions. This led to several additional discoveries. For example, we show that ultrasensitive IKK activation can underlie synergistic combinations of ligands at low doses. In contrast, beyond the CpG-poly(I:C) antagonism, we identify that competition for CD14 uptake by LPS and Pam can generate antagonism between these ligands within specific dose ranges.
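As a minimal illustration of the endosomal-competition mechanism summarized above, competitive Michaelis–Menten-style uptake can be sketched as below (function name, `vmax`, and `km` are assumed illustrative values, not the model's fitted kinetics):

```python
def endosomal_uptake(cpg, pic, vmax=1.0, km=0.5):
    """Michaelis-Menten-style competition for shared endosomal transport:
    both ligands share the same saturable denominator, so each ligand's
    uptake rate is reduced by the other's presence.
    vmax and km are illustrative values, not fitted parameters."""
    denom = km + cpg + pic
    return vmax * cpg / denom, vmax * pic / denom

v_cpg_mix, v_pic_mix = endosomal_uptake(10.0, 10.0)
v_cpg_alone, _ = endosomal_uptake(10.0, 0.0)
print(v_cpg_alone, v_cpg_mix)  # co-stimulation lowers CpG uptake
```

Because the shared denominator caps total transport capacity, no additional free parameters are needed to produce antagonism, which is the constraint the revision highlights.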

      Importantly, such antagonism or synergy is not evident in all cells in the population. It may also not be picked up by studies of the mean behavior. With our new computational workflow that allows for single-cell resolution we identify the conditions that must be met by the signaling network state, for antagonism or synergy to take place.

      Further, we examine the hypothesis that such signaling pathway interactions affect stimulus-response specificity in combinatorial stimulus conditions. By comparing models with and without this antagonism, we demonstrate that antagonistic interactions can improve stimulus-response specificity in complex ligand mixtures.

      These additional analyses provide a new mechanistic understanding of cellular information processing and elucidate how synergy and antagonism can mechanistically shape signaling fidelity in response to complex ligand mixtures.

      Point-by-Point Response

      Reviewer #1

      Evidence, reproducibility and clarity

      The authors extend an existing mathematical model of NFkB signalling under stimulation of various single receptors, to a model that describes responses to stimulation of multiple receptors simultaneously. They compare this model to experimental data derived from live-cell imaging of mouse macrophages, and modify the model to account for potential antagonism between TLR3 and TLR9 response due to competition for endosomal transport. Using this framework they show that, despite distinguishability decreasing with increasing numbers of heterogenous stimuli, macrophages are still able in principle to distinguish these to a statistically significant degree. I congratulate the authors on an interesting approach that extends and validates an existing mathematical model, and also provides valuable information regarding macrophage response.

      Response: We thank the reviewer for this appreciative assessment and for the careful reading of our work. The constructive comments helped us substantially improve the rigor and clarity of the manuscript.

      In addition to revising the text for clarity, we have extended our analysis to systematically investigate dose-response behavior for each pair of ligands. Using the experimentally validated model, we explored 10 ligand pairs across a range of doses from non-responsive to saturating. This allowed us to identify mechanistic regimes in which synergy and antagonism arise at the single-cell level. In particular, we found that low-dose synergy can be explained by ultrasensitive IKK activation (Figure 4 and corresponding supplementary figures), while antagonism can emerge from competition for shared components such as CD14 (Figure 5 and corresponding supplementary figures). We further show that antagonism can enhance condition distinguishability in ligand mixtures, thereby contributing to stimulus-response specificity (Figure 5 and corresponding supplementary figures).
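The low-dose synergy attributed to ultrasensitive IKK activation can be illustrated with a toy Hill function (a sketch only; the exponent `n` and threshold `k` are assumed values, not fitted parameters):

```python
def ikk_response(x, n=4, k=1.0):
    """Ultrasensitive (Hill, n > 1) IKK activation as a function of the
    combined upstream signal; n and k are illustrative, not fitted."""
    return x**n / (k**n + x**n)

a = b = 0.4  # two sub-saturating single-ligand inputs
mixture = ikk_response(a + b)                  # response to the mixture
additive = ikk_response(a) + ikk_response(b)   # sum of single-ligand responses
print(mixture > additive)  # True: superadditive (synergistic) at low dose
```

With a steep sigmoid, two individually sub-threshold inputs can jointly cross the activation threshold, which is the qualitative signature of low-dose synergy.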

      There are no major issues affecting the scientific conclusions of the paper, however the lack of detail surrounding the mathematical model and the 'signaling codons' that are used throughout the paper make it difficult to read. This is exacerbated by the fact that I was unable to find Ref 25 which apparently describes the model, however I was able to piece together the essential components from the description in Ref 8 and the supplementary material.

      Response: This comment helped us to improve the writing. We apologize that the key reference 25 was still not publicly available. It is now published in Nature Communications. In addition, we have added more details to clarify the mathematical model as well as the signaling codons, in results and in methods. Please see below for details.

      Lots of the minor comments below stem from this, however there are also a few other places that could benefit from some additional clarification and explanation.

      Significance: 1. '...it remains unclear complex...' -> '...it remains unclear whether complex...' Response: We have rewritten the Significance (now it is Synopsis).

      Introduction: 2. 'temporal dynamics of NFkB' - it would be good to be more concrete regarding the temporal dynamics of what aspect of this (expression, binding, conformation, etc), if possible. Response: It refers to the presence of NFκB in the nucleus, which represents active NFκB capable of activating gene expression. We have clarified this (Lines 59-61 in introduction paragraph 2): “Upon stimulation, NFκB translocates into the nucleus, … activating immune gene expression (10, 15–19).”

      'signaling codons' - the behaviour of these is key to the entire paper, so even if they are well described in the reference, it would be good to have a short description as early as possible so that the reader can get an idea in their mind what exactly is being discussed here. Later, it would be good to have concrete description of exactly what these capture.

      Response: We thank the reviewer for this comment. We have added one whole paragraph in the early introduction to describe the concept of Signaling Codons, which allow quantitative characterization of NFkB stimulus-response-specific dynamics (Lines 60-67). We have also added a more concrete description of Signaling Codons in the results, as well as an illustration for the signaling codons (Lines 169-175, Figure S2B).

      'This challenge...population of macrophages' - this seems a bit out of place, and is a bit of a run on sentence, so I suggest moving this to the next paragraph and working it into the first sentence there '...regulatory mechanisms, and this challenge could be addressed with a model parameterised to account for heterogeneous...Early models ...', or something similar.

      Response: We thank the reviewer for this suggestion, we have revised this as suggested. This improves the logic flow (Lines 87-88).

      Ref 25: I can't find a paper with this title anywhere, so if it's an accepted preprint then it would be good to have this available as well. That said, I still think it would be difficult to grasp the work done in this paper without some description of the mathematical model here, at least schematically, if not the full set of ODEs. For example, there are numerous references to how this incorporates heterogeneous responses, the 'core module', etc, and the reader has no context of these if they aren't familiar with the structure of the model. Response: We apologize that Ref 25 was not on PubMed. Now it’s published, and we have updated the corresponding information. This comment also helped us to improve the writing by adding a description of the mathematical model in the Introduction (Lines 95-105), the results (Lines 129-141), and a detailed description of the model in the Methods (Simulation of heterogenous NFκB dynamical responses.)

      We have also added the schematic of the model topology in Figure S1 (adapted from previous publications Guo et al 2025, Adelaja et al 2021) to make sure the paper is self-contained.

      'A key challenge which is...' -> 'A key challenge is...' Response: We have revised the Introduction and removed this sentence.

      'With model simulation ...' -> a bit of a run on sentence, I suggest breaking after 'conditions'. Response: We have revised the introduction and removed this sentence.

      Results:

      1. This section would benefit from a more in-depth description of the model and experimental setup. In particular for the experiment, the reader never really knows what this workflow for this is, nor what the model ingests as input, and what the predictions are of. Response: This comment helped us to improve clarity by adding an in-depth description of the model and experimental setup. We have revised the Results as suggested (Lines 129-141). We also appended the corresponding revision here for reviewer reference.

      “This mechanistic model was trained on single-ligand response experimental datasets, capturing the single-ligand stimulus-response specificity of the population of macrophages while accounting for cellular heterogeneity. Specifically, quantitative NFκB dynamic trajectory data from hundreds of single macrophages responding to five single ligands (TNF, pIC, Pam, CpG, LPS) at 3-5 doses was obtained from live cell imaging experiments. The mathematical model (Figure S1) consists of a 52-dimensional system of ordinary differential equations, including 52 intracellular species, 101 reactions and 133 parameters, and is divided into five receptor modules, which respond to the corresponding ligands respectively, and the IKK-NFκB core module that contains the prominent IκBα negative feedback loop. By fitting the single-cell experimental data set with a non-linear mixed effect statistical model (coupled with the 52-dimensional NFκB ODE model), the parameter distributions for the single-cell population were inferred. Analyzing the resulting simulated NFκB trajectories with information-theoretic and machine learning classification analyses confirmed that the virtual cell model simulations reproduced key SRS performance characteristics of live macrophages.”

      '..mechanistic model was trained...' - trained in this study, or in the previous referenced study? Response: The mechanistic model was trained in a previous study (Guo et al 2025 Nature Comm), and we have clarified this in the revision (Lines 127 - 129).

      1. 'determined parameter distributions' - this is where it would be good to have more background on the model. What parameters are these, and what do they correspond to biologically? It would also be nice to see in the methods or supplementary material how this is done (maximum likelihood, etc). Response: This comment helped us clarify the predetermined parameter distributions. We have revised the methods to include this information (Simulation of heterogenous NFκB dynamical responses, paragraph 3). We have appended the corresponding text here for the reviewer's convenience.

      “The ODE model was then fitted to the population of single-cell trajectories to recapitulate the cell-to-cell heterogeneity in the experimental data (2). This is achieved by solving the non-linear mixed effects model (NLME) through the stochastic approximation of expectation-maximization algorithm (SAEM) (3–6). Seventeen parameters were estimated. Within the core module, the estimated parameters included the rates governing TAK1 activation (k52, k65), the time delays of IκBα transcription regulated by NFκB (k99, k101), and the total cellular NFκB abundance (tot NFκB). Within the receptor module, receptor synthesis rates (k54 for TNF, k68 for Pam, k85 for CpG, k35 for LPS, k77 for pIC), degradation rates of the receptor–ligand complexes (k56, k61, k64 for TNF; k75 for Pam; k93 for CpG; k44 for LPS; k83 for pIC), and endosomal uptake rates (k87 for CpG; k36 and k40 for LPS; k79 for pIC) were fitted. All remaining parameters were fixed at literature-suggested values (1). The single-cell parameters inferred from experimental individual-cell trajectories then served as empirical distributions for generating the new dataset (see Supplementary Dataset 2).”

      'matching cells with similar core model...' - it's difficult to follow the logic as to why this is done, so I think this needs to be a little clearer. My guess would be that the assumption is that simulated cells with similar 'core' parameters have a similar downstream signalling response, and therefore the receptors can be 'transplanted'. So it would be nice to see exactly what these distributions are and what the effect of a bad match would be. Response: We thank the reviewer for this comment. In the revision, we have explained the rationale for matching cells with similar core module (Lines 145-152).

      Previous work determined parameter distributions for only the cognate receptor module (and the core module) that provided the best fit for the relevant single ligand experimental data (Figure 1A, Step 1), but other receptor modules’ parameter values were not determined. To simulate stimulus responses to more than two ligands, we imputed the other ligand-receptor module parameters using shared core-module parameters as common variables and employing nearest-neighbor hot-deck imputation (35). In this setup, the core module functions as an “anchor” to harmonize two or more receptor-specific parameter distributions.

      This nearest-neighbor hot-deck imputation approach (the core module matching method) was shown to outperform other approaches, including random matching and rescaled-similarity matching (Guo et al. 2025, Supplementary Figure S11). For the reviewer’s convenience, we have also appended the corresponding figure below.

      Figure S11 from (Guo et al., 2025). Assessment of matching techniques for predicting single-cell responses to various ligand stimuli (a-d). Heatmaps illustrating the Wasserstein distance between the signaling codon distributions predicted by the model and those observed in experiments. The analysis employs four distinct matching methods to align the five ligand-receptor module parameters: (a) “Random Matching”, (b) “Similarity Matching” (the method used in our study), (c) “Rescaled-Similarity Matching”, and (d) “Sampling Approximated Distribution”. In the heatmaps, rows represent signaling codons, columns denote ligands, and the color intensity indicates the Wasserstein distance, providing a visual metric of similarity between model predictions and experimental data. e-f. Histogram of the average Wasserstein distance between the model-predicted and experimentally observed signaling codon distributions, summarized across signaling codons (e) and ligands (f).
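The nearest-neighbor hot-deck imputation step can be sketched as follows (a minimal illustration with assumed array names and toy values, not the authors' actual implementation):

```python
import numpy as np

def hot_deck_impute(core_donor, receptor_donor, core_recipient):
    """Nearest-neighbor hot-deck imputation: each recipient cell borrows
    the receptor-module parameters of the donor cell whose core-module
    parameters are closest in Euclidean distance.
    Array names and shapes are illustrative."""
    out = np.empty((len(core_recipient), receptor_donor.shape[1]))
    for i, core in enumerate(core_recipient):
        nearest = np.argmin(np.linalg.norm(core_donor - core, axis=1))
        out[i] = receptor_donor[nearest]
    return out

# Toy example: three donor cells, two recipient cells.
core_a = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
receptor_a = np.array([[10.0], [20.0], [30.0]])
core_b = np.array([[0.1, 0.1], [1.9, 2.1]])
print(hot_deck_impute(core_a, receptor_a, core_b))  # donors 0 and 2 are matched
```

Using the shared core-module parameters as the matching variables is what lets them act as the "anchor" described above: recipient cells inherit receptor parameters only from donors with a similar downstream signaling state.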

      Some explanation of how this relates to the experimental data the parameters are fit on would also be useful. (a) Is there a correspondence between individual simulated cells and the experimental data for the single ligand stimulation, and then the smallest set of these is taken? Is there also a matching from the simulated multi-receptor modules and the multi-receptor data, and if so, is this done in the same way? Response: This comment helped us clarify the correspondence between model simulations and experimental data.

      Yes, there is a correspondence between individual simulated cells and the previously published experimental data (Guo et al., 2025b) for single-ligand stimulation. We have revised the first paragraph of the Results (Lines 136–148) and the Methods (Lines 544-557) to clarify how the model simulations were fit to the previous experimental dataset. See Reviewer 1, Comment 10 for the updates in Methods. We have pasted the revised Results section below for the reviewer's reference.

      By fitting the single-cell experimental data set with a non-linear mixed effect statistical model (coupling with 52-dimensional NFκB ODE model), the parameter distributions for the single cell population were inferred.

      'six signaling codons' - here it would be good to recapitulate what these represent, but also what the 'strength' and 'activity' correspond to (total integrated value, maximum value, etc) Response: We thank the reviewer for the suggestion and have clarified this point (Lines 169-175, Figure S2B).

      'pre-defined thresholds' - no need to state these numerically in the text (although giving some sense of how/why these were chosen would give some context), but I couldn't find the values of these, nor values corresponding to the signaling codons. Response: We appreciate the reviewer’s comment. We have added this information in the figure legend (Figure 1B-C) and Method -- “Responder fraction” (Lines 666-672). Specifically, for the model simulation data, the integral thresholds are 0.4 (µM·h), 0.5 (µM·h), and 0.6 (µM·h). The peak thresholds are 0.12 (µM), 0.14 (µM), and 0.16 (µM). For the experimental data, the integral thresholds are 0.2 (A.U.·h), 0.3 (A.U.·h), and 0.4 (A.U.·h). The peak thresholds are 0.14 (A.U.), 0.18 (A.U.), and 0.22 (A.U.). Thresholds were selected so that the medium threshold yields 50% responder cells under single-ligand conditions, while the responder ratio remains unsaturated under three-ligand stimulation.
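The responder-fraction computation can be sketched as follows (the either/or threshold semantics and the sampling interval are assumptions for illustration; the medium simulation thresholds quoted above are reused as example values):

```python
import numpy as np

def responder_fraction(trajectories, dt, integral_thr, peak_thr):
    """Fraction of cells classed as responders. Here a cell responds if
    its NFkB trajectory exceeds either the integral or the peak
    threshold; the either/or semantics are an assumption for
    illustration. trajectories: cells x timepoints."""
    integrals = trajectories.sum(axis=1) * dt
    peaks = trajectories.max(axis=1)
    return ((integrals > integral_thr) | (peaks > peak_thr)).mean()

# Toy trajectories sampled every 0.5 h; thresholds follow the quoted
# medium simulation values (0.5 uM*h integral, 0.14 uM peak).
traj = np.array([[0.00, 0.10, 0.05, 0.00],   # weak, sub-threshold cell
                 [0.10, 0.50, 0.40, 0.20]])  # strong responder
print(responder_fraction(traj, dt=0.5, integral_thr=0.5, peak_thr=0.14))  # 0.5
```

Sweeping `integral_thr` and `peak_thr` over the three quoted levels is what produces the low/medium/high responder-fraction curves described in the Methods.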

      'non-responder cells are likely a result of cellular heterogeneity in receptor modules rather than the core module' - is this the 'ill health' referenced earlier? If so make this clear. Response: Yes, this is the ‘ill health’ referenced earlier, and we have clarified this (Lines 198-199).

      It's also very difficult to follow this chain of logic, given that the reader at this point doesn't have any knowledge of what the 'core' module is, nor the significance of the thresholds on the signaling codons. I would suggest making this much clearer, with reference to each of these. Response: We apologize for the poor explanation. We have now explained in the Introduction (Lines 95-106) and the results (Lines 129-141) how the model is structured into receptor-proximal modules that converge on the common core module. We have also added a schematic for clarity (Figure S1). For further clarification of the math models, we have significantly revised the Methods (Simulation of heterogenous NFκB dynamical responses). The defined thresholds are clarified in the Methods -- “Responder fraction”.

      '...but the model represented these as independent mass action reactions' - the significance of this may not be clear to someone not familiar with biophysical models, so probably better to make it explicit. Response: We thank the reviewer for this reminder, and we have added a description of the significance of this point (Lines 225-227).

      '...we trained a random forest classifier...' - is this trained on the 'raw' experimental time series data, or on the signaling codons? Response: It is trained on the signaling codons calculated from model simulations of NFκB trajectories. We have clarified this (Lines 260-261).
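A random-forest classification of signaling-codon features, as described here, can be sketched with scikit-learn (the Gaussian stand-ins for codon features and their class separation are arbitrary, purely for illustration):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Stand-in for signaling-codon features (cells x 6 codons) from two
# simulated stimulus conditions; the separation between the two
# Gaussian clouds is an arbitrary illustrative choice.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (100, 6)),
               rng.normal(1.0, 1.0, (100, 6))])
y = np.repeat([0, 1], 100)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
acc = cross_val_score(clf, X, y, cv=5).mean()  # distinguishability proxy
print(acc > 0.7)
```

Cross-validated accuracy on the codon features serves as the stimulus-distinguishability readout; the same pipeline generalizes to the multi-ligand conditions discussed in the manuscript.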

      'We also applied a Long Short-Term Memory (LSTM) machine learning model...' - it might be good to reference these three approaches at the beginning of this section, otherwise they seem to come out of the blue a little. Response: We have added the references of these three approaches in the beginning of this section (Lines 242-246).

      'We then used machine learning classifiers...' - random forests, LSTMs, or a different model? Response: We have clarified that this is a random forest classifier (Line 276).

      Discussion:

      1. '...over statistical models...' - suggest maybe 'purely statistical models' Response: We thank the reviewer for this suggestion. We have rewritten the whole Discussion to include the new insights of antagonism and synergy and their roles in maintaining unexpectedly high SRS performance. Thus, this sentence was removed.

      'We found that endosomal transport...' - A paper by Huang, et. al. (https://www.jneurosci.org/content/40/33/6428) observed a synergistic phagocytic response between CpC and pIC stimulation in microglia. This is still consistent with a saturation effect dependent on dose, but may be worth a mention. Response: We thank the reviewer for referring this interesting paper to us; this comment helped us improve the Discussion of inflammatory signaling pathways besides NFκB. This paper demonstrates synergistic effects between CpG and pIC in inhibiting tumor growth and promoting cytokine production (Huang et al., 2020), such as IFN-β and TNF-α, whose expression is also regulated by the IRF and MAPK signaling pathways (Luecke et al., 2021; Sheu et al., 2023). This finding does not contradict our finding that CpG and pIC act antagonistically in the NFκB signaling pathway because of the combinatorial pathways that act on gene expression: CpG can activate the MAPK signaling pathway (Luecke et al., 2024) but not the IRF signaling pathway, whereas pIC activates the IRF signaling pathway (Akira and Takeda, 2004) but only weakly the MAPK pathway. Therefore, their combination can synergistically regulate inflammatory responses. We have added this to the discussion (Lines 515-522).

      '...features termed...' -> 'features, termed' Response: We thank the reviewer for their careful reading, and we have rewritten the Discussion.

      '...we applied a Long Short-Term Memory (LSTM) machine learning model..' - maybe make clear that this is on the time-series data (also LSTM has already been defined). Response: We thank the reviewer for their careful reading, and we have rewritten the Discussion.

      Materials and methods:

      1. The descriptions in this section are quite vague, so I would suggest expanding this with more detail from the supplementary material, where things are quite well explained. Response: We thank the reviewer for this suggestion, and we have rewritten the whole Methods as suggested.

      'sampling distribution' - not clear what this refers to in this context Response: We have clarified this in the revision (Methods -- Simulation of heterogenous NFκB dynamical responses, paragraph 3). The single-cell signaling-pathway parameter values used for bootstrapping sampling to generate model simulations are given in Supplementary dataset 2.

      'RelA-mVenus mouse strain' - it would be good to mention the relevance of the reporter for NFkB signaling Response: We have added the relevance of the reporter for NFkB signaling (Methods, Lines 624-626).

      '...A random forest classifier...' -> a random forest classifier

      Response: We have rewritten the methods.

      Significance

      This study provides mechanistically interpretable insight on the important question of how immune cells perform target recognition in realistic scenarios, and also provides validation of existing mathematical models by extending these beyond their original domain. The paper uses 'signaling codons' as a proxy for information processing, however in this instance it is cross-validated with an LSTM model that is applied directly to the time series data. Nevertheless, the scope of the paper is such that it does not deal with the question of how these signals are transmitted or used in a downstream immune response. To my knowledge, this is the first time that a well established existing mathematical model of signalling response has been extended and applied to heterogeneous ligand mixtures. These results will be of interest to those studying immune cell responses, and to those interested in basic research on mathematical models of signaling and cellular information processing more generally.

      My background is in biophysical models, machine learning, and signaling in cancer. I have a basic understanding of immunology, but no experience in experimental cell biology.

      Response: We thank the reviewer for highlighting the novelty of our study. We appreciate the reviewer’s recognition that our work advances the understanding of cellular information processing in the context of ligand mixtures, particularly as the first to extend computational models to investigate signaling fidelity under mixed-ligand conditions.

      We agree that this work will interest computational biologists focused on signaling network modeling and information processing. In addition, we believe it will also be valuable for all signaling biologists, as we provide fundamental insights. For experimental biologists in particular, our model provides an efficient, quantitative framework for exploring and generating testable hypotheses.

      We would also like to gently emphasize that evaluating specificity within signaling pathways is as essential as studying downstream functional responses. While immune function outcomes are certainly important, they rely on the upstream signaling pathways that first respond to environmental cues. Understanding how these signaling pathways achieve specificity and discriminability is therefore crucial. For example, this is particularly relevant for drug development targeting pathways such as NFκB, where assessing the direct signaling output—NFκB activation dynamics—can provide valuable insight into the effects of pharmacological interventions.

      Reviewer #2

      Evidence, reproducibility and clarity

      Guo et al. developed a heterogeneous, single-cell ODE model of NFκB signaling parameterized on five individual ligands (TNF, Pam, LPS, CpG, pIC) and extended it, via core-module parameter matching, to predict responses to all 31 combinations of up to five ligands. They found that simulated responder fractions and signaling codon features generally agreed with live-cell imaging data. A notable discrepancy emerged for the CpG (TLR9) + pIC (TLR3) pair: experiments exhibited non-integrative antagonism unpredicted by the original model. This issue was resolved by incorporating a Hill-type term for competitive, limited endosomal trafficking of these ligands. Finally, by decomposing NFκB trajectories into six "signaling codons" and applying Wasserstein distances plus random-forest and LSTM classifiers, the authors showed that stimulus-response specificity (SRS) declines with ligand complexity but remains statistically significant even for quintuple mixtures. This is a well written and scientifically sound manuscript about complexities of cellular signaling, especially considering the limitations of in vitro experiments in recapitulating in vivo dynamics.

      Response: We thank the reviewer for carefully reading the manuscript and for this endorsement. We have significantly improved the manuscript thanks to the reviewer's insightful comments (see below for point-by-point responses).

      Besides addressing the reviewer’s questions, we have further extended our work to investigate how ligand pairs interact across all doses and how those interactions affect stimulus-response specificity. As the reviewer pointed out, experimental studies are limited in recapitulating the multitude of complex physiological contexts. The model is helpful to explore more complex scenarios beyond the feasibility of in-vitro experimental setups. Using computational simulations, we have further explored 360 conditions generated from 10 ligand pairs, each evaluated at 6 doses spanning non-responsive to saturating levels, with 1,000 cells simulated per condition to capture the heterogeneity of the population.

      From this extended analysis, we identified the mechanistic bases for observations of both synergy and antagonism. Synergy for certain low-dose ligand combinations can be explained by ultrasensitive IKK activation (Figure 4), while antagonism between LPS and Pam arises from competition for the cofactor CD14 (Figure 5). We show that these phenomena are dependent on the signaling network state and therefore are not observed in all cells of the population. We define the network conditions that must be met for antagonism and synergy to occur. Importantly, we then show that antagonism can contribute to stimulus-response specificity in ligand mixtures (Figure 5).

      Here are a few comments and recommendations:

      1. The modeling approach used in this manuscript, while interesting, might need further validation. Inferring multi-ligand receptor parameters by matching single-ligand cells on core-module similarity may not capture true co-variation in receptor expression or adaptor availability. Single cell measurements of receptor expressions could be done (e.g. via flow cytometry) to ground this assumption in real data. If the authors think this is out of scope for this manuscript, they could fit core-matched single cell models with two receptor modules from scratch to the two-ligand experimental data. Would this fitted model produce similar receptor parameters compared to the presented approach? At least the authors should add a bit more explanation for why their modeling approach is better (or valid) than fitting the models with 2/3/4/5 receptor modules from scratch to the experimental data.

      Response: We thank the reviewer for this comment, which helped us improve the explanation of the methodology, its rationale, and its validation. The methodology is based on the well-established statistical method of nearest-neighbor hot-deck imputation (Andridge and Little, 2010). In this implementation, the core module functions as a stabilizing “anchor” (common variables) to harmonize various receptor-specific parameter distributions. Similar methodologies have been successfully applied to correct batch effects or to integrate single-cell RNAseq datasets using anchor cell types (Stuart et al., 2019). Our workflow was validated on single-ligand stimulus conditions in a previous study (Guo et al., 2025) (see the 3rd paragraph below). Here, we used this method to generate predictions for ligand mixtures and validated them with experimental studies of dual-ligand stimuli; we found that our predictions align well with the experimental data. As the reviewer suggested in point 3, in the revision we also added experimental validation of the binary classifiers that determine whether specific stimuli are present in a ligand mixture. The question we address in this work is how macrophages process ligand-specific information in the context of ligand mixtures. For this question, the experimental results align with the model predictions, reaching consistent conclusions.

      In the revision, we have explained the rationale for using the nearest-neighbor hot-deck imputation by matching cells with similar core module (Lines 143-150).

      Previous work determined parameter distributions for only the cognate receptor module (and the core module) that provided the best fit to the single-ligand experimental data (Figure 1A, Step 1); parameter information for the other receptor modules is missing. To simulate stimulus responses to more than two ligands, we imputed the other ligand–receptor module parameters using the shared core-module parameters as common variables and employing nearest-neighbor hot-deck imputation (35). In this setup, the core module functions as an “anchor” to harmonize two or more receptor-specific parameter distributions. This was achieved by minimizing the Euclidean distance between the core module parameters associated with the independently parameterized single-ligand models (Figure 1A, Step 2).
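The matching step described above can be sketched as follows (a minimal illustration with hypothetical array names and sizes, not the authors’ code; in the actual workflow, the vectors are the fitted core-module parameters of the single-ligand models):

```python
import numpy as np

def hotdeck_impute(core_recipients, core_donors, receptor_donors):
    """Nearest-neighbor hot-deck imputation: each recipient cell borrows the
    receptor-module parameters of the donor cell whose core-module
    parameters are closest in Euclidean distance."""
    # pairwise distances between recipient and donor core-module vectors
    dists = np.linalg.norm(
        core_recipients[:, None, :] - core_donors[None, :, :], axis=2
    )
    nearest = dists.argmin(axis=1)  # best-matching donor index per recipient
    return receptor_donors[nearest]

# toy example: 3 cells fitted to ligand A borrow ligand-B receptor parameters
rng = np.random.default_rng(1)
core_a = rng.normal(size=(3, 5))           # 5 core-module parameters per cell
core_b = rng.normal(size=(200, 5))         # 200 donor cells fitted to ligand B
receptor_b = rng.lognormal(size=(200, 4))  # 4 receptor-module parameters
imputed = hotdeck_impute(core_a, core_b, receptor_b)  # shape (3, 4)
```

Because the imputed values are always whole donor parameter sets, the within-cell co-variation of each receptor module is preserved, which is the appeal of hot-deck over independent per-parameter imputation.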

      In Guo et al. (2025) (see Supplementary Figure S11), the nearest-neighbor hot-deck imputation approach (core module similarity matching method) was compared with other approaches, including random matching and rescaled-similarity matching. The results show that, after matching, the core module method best preserves the single-ligand stimulus signaling codon distributions. For the reviewer’s convenience, we have also appended the figure in the response to Reviewer 1, Comment 11.

      The advantage of our workflow is that it does not need to be fit to new experimental data and still gives reliable predictions of signaling dynamics. For the reviewer’s interest, we have tried fitting core-matched single-cell models with two receptor modules. As parameter fitting requires sufficiently large and high-quality datasets, single-ligand stimulation data with more than 1,000 cells can be adequate to estimate 6~7 parameters (Guo et al., 2025) (approx. 1,400 to 2,000 cells per ligand). However, our current experimental dataset for combinatorial-ligand conditions contains only 500~1,000 cells; we tested these datasets, but the results show a poor fit of the heterogeneous signaling dynamics, due to an insufficient number of cells for estimating 8~10 parameters. We estimate that at least ~1,500 cells would be needed for reliable parameter estimation under dual-ligand stimulation (and more cells may be needed for combinatorial stimuli involving more ligands). This is currently not feasible to obtain for mixed ligands given the large number of combinatorial conditions.

      Overall, in this paper, the nearest-neighbor hot-deck imputation approach is presented as a feasible and acceptable approach that best reflects our current understanding of the signaling network. Importantly, it helps identify potential knowledge gaps by highlighting discrepancies between model predictions and experimental observations.

      (a) The refined model posits competitive, saturable endosomal transport for CpG and pIC, but no direct measurements of endosomal uptake rates or compartmental saturation thresholds are provided, leaving the Hill parameters under-constrained. The authors could produce dose-response curves for CpG and pIC individually and in combination across a range of concentrations to fit the Hill parameters for competitive uptake. (b) If this is out of scope for this paper, the authors should at least comment on why the endosome hypothesis is better than others e.g. crosstalks and other parallel pathway activations. Especially given that even the refined model simulations with Hill equations for CpG and pIC do not quite match with the experimental data (Fig 2 B,E).

      Response: (a) The reviewer’s comments helped us improve our work by employing Michaelis-Menten kinetics for the substrate competition reactions, which increases the mathematical rigor of the CpG-pIC competition model. In this updated model, there are no free parameters to tune, as all Vmax and Kd values are kept consistent with the single-ligand scenario, and the Hill coefficient is the same as in the single-ligand case, equal to 1.
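For reference, the competitive Michaelis-Menten form for two substrates sharing one transport capacity can be sketched as below (a generic illustration; the symbols vmax and km stand in for the model’s fixed single-ligand constants and are not tuned):

```python
def competitive_uptake(s_cpg, s_pic, vmax, km_cpg, km_pic):
    """Michaelis-Menten uptake of two substrates competing for the same
    transporter: each ligand inflates the other's effective Km, so neither
    rate exceeds its single-ligand value (Hill coefficient = 1)."""
    occupancy = 1.0 + s_cpg / km_cpg + s_pic / km_pic
    v_cpg = vmax * (s_cpg / km_cpg) / occupancy
    v_pic = vmax * (s_pic / km_pic) / occupancy
    return v_cpg, v_pic
```

With the competitor concentration set to zero, the expression reduces exactly to the standard single-substrate form vmax * s / (km + s), which is why no new parameters are introduced relative to the single-ligand model.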

      The comments on examining dose-response curves for CpG and pIC inspired us to extend the dose-response curves to all ligand pair combinations, allowing us to identify synergy in low-dose ligand pairs and antagonism for high-dose LPS-Pam, in addition to CpG-pIC (new Figures 4 & 5).

      (b) Regarding alternative hypotheses for antagonism, such as crosstalk or parallel-pathway activation: any antagonistic effect would have to arise from negative regulation acting within the first 30 min. However, IκBα-mediated feedback only becomes appreciable after ~30 min (Hoffmann et al., 2002), and A20-dependent attenuation requires ≥2 h (Werner et al., 2005). Beyond these delayed feedbacks, NFκB activation depends primarily on phosphorylation and K63-linked ubiquitination, for which no known mechanism produces true antagonism; at most, combinatorial inputs saturate the response at the level of the strongest single ligand. We have added this rationale to the Discussion to explain why we favor the endosome saturation hypothesis over other mechanisms (Lines 459-465). While this may not capture every nuance, it represents the simplest model extension capable of reproducing the observed antagonism.

      The authors assess the distinguishability of single-ligand stimuli and combinatorial-ligand stimuli using simulations from the refined model. While this is informative, the simulated data could propagate deviations from the experimental data to the classifiers. How would the classifiers fare when the experimental data is used to assess single-stimulus distinguishability? The authors could use the experimental data they already have to confirm the main claim of the paper, that cells retain stimulus-response specificity even with multiple ligand exposure. In short, how would Fig 3E look when trained/validated on available experimental data?

      Response: We thank the reviewer for these valuable comments, which helped us strengthen the rigor of our analysis by incorporating cross-model testing. Specifically, we refined our analysis of ligand presence/absence classification by including ROC AUC and balanced accuracy metrics. This adjustment accounts for the fact that the experimental data did not cover all combinatorial conditions, thereby mitigating potential biases from data imbalance and threshold choice. The experimental results are qualitatively consistent with the simulations, though, as expected, they show somewhat lower ligand distinguishability compared to the noise-free simulated dataset. We have updated Figures 3E–F (previously Figure 3E), added Figure S8, and revised the manuscript accordingly (Lines 292–301). For the reviewer’s convenience, we have also pasted the revised manuscript text below.

      “Classifiers trained to distinguish TNF-present from TNF-absent conditions achieved a Receiver Operating Characteristic-Area Under the Curve (ROC AUC) of 0.96, significantly above the 0.5 baseline (Figure 3D, Figure S8A). Extending this analysis to other ligands, cells detected LPS (0.85), Pam (0.84), pIC (0.73), and CpG (0.63) in mixtures (Figure 3D, S8A). Using experimental data from double- and triple-ligand stimuli (Figure 1D), ROC AUC values were TNF 0.74, LPS 0.74, Pam 0.66, pIC 0.75, and CpG 0.66 (Figure 3E, S8B). Classifier accuracies yielded consistent results (Figure S8C-D). These results indicated a remarkable capability of preserving ligand-specific dynamic features within complex NFκB signal trajectories that enable nuclear detection of extracellular ligands even in complex stimulus mixtures.”
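The ROC AUC values quoted above can be read as the probability that the classifier scores a randomly chosen ligand-present cell above a randomly chosen ligand-absent cell. A minimal, library-free sketch of that rank-sum computation (illustrative only, with toy scores; not the authors’ random-forest pipeline):

```python
import numpy as np

def roc_auc(scores, labels):
    """ROC AUC via the Mann-Whitney identity: the probability that a
    randomly chosen positive outscores a randomly chosen negative
    (ties count one half)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    wins = (pos[:, None] > neg[None, :]).sum()   # positive beats negative
    ties = (pos[:, None] == neg[None, :]).sum()  # tied scores
    return (wins + 0.5 * ties) / (pos.size * neg.size)

# toy example: classifier scores for "TNF present?" on 6 cells
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.6]
labels = [1, 1, 1, 0, 0, 0]
print(roc_auc(scores, labels))  # 1.0: every positive outscores every negative
```

An AUC of 0.5 corresponds to the chance baseline mentioned in the quoted text, which is why it is the natural reference for the reported values.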

      While the approach presented here with multiple simultaneous ligand exposures is a major step towards in vivo-like conditions, the temporal aspect is still missing, i.e., temporal phasing: sequential exposure to multiple ligands, as one would expect in vivo, rather than all at once. This is probably out of scope for this paper, but the authors could comment on how their work could be taken forward in such a direction and whether the SRS would be better or worse in such conditions.

      Response: We thank the reviewer for this insightful comment. We have added “the temporal aspect of multiple ligand exposures” to the discussion (Lines 503-510), and we have pasted the corresponding paragraph here for the reviewer’s reference (black font is the previous version, and blue font is the revised new text):

      Cells may be expected to interpret not only the combination of signals but also their timing and duration to mount appropriate transcriptional responses (58, 59). For example, acute inflammation integrates pathogen-derived cues with pro- and anti-inflammatory signals over a timeframe of hours to days (58) to coordinate the pathogen removal and tissue repair process. Investigating sequential stimulus combinations in our model is therefore crucial for understanding how cells process complex physiological inputs. Simulations that account for longer timescales may require additional feedback mechanisms, as described in some of our previous studies of NFκB (15, 60).

      There is no caption for Figure 3F in the figure legend nor a reference in the main text.

      Response: In the revised manuscript, we have removed Figure 3F.

      Significance

      General assessment: This is a good manuscript in its present form which could get better with revision. More supporting data and validation are needed to back the main claim presented in the manuscript.

      Significance/impact/readership: When revised, this manuscript could be of interest to a broad community involved in single-cell biology, cell and immune signaling, and mathematical modeling. In particular, the models presented here could be used as a starting point for more complex and detailed modeling approaches.

      Response: We thank the reviewer for this endorsement. The reviewer’s constructive suggestion helped us significantly improve the clarity and rigor of our main conclusion.

      In summary, we have strengthened the computational framework in several ways. We improved the model’s fit to experimental single-ligand training data and reformulated the antagonistic CpG-pIC model using Michaelis–Menten kinetics, thereby reducing parameter arbitrariness and increasing mechanistic interpretability. These changes led to better agreement between model predictions and experimental observations for combinatorial ligand responses (Updated Figure 2 and Figure S2), which we hope will further increase experimentalists’ confidence in the modeling results. We have also validated one key conclusion (“cells retain stimulus-response specificity even with multiple ligand exposure”) using the experimental dataset, and it aligns with the model predictions.

      In addition, we have further extended our analysis and its scope. Inspired by the reviewer’s advice (and Reviewer 3’s comment 1b) on the dose-combination study for the CpG-pIC pair, we expanded our research to dose-response relationships for all dual-ligand combinations (Lines 302-406, Figures 4-5). This additional comprehensive analysis allowed us to identify the mechanisms of synergistic and antagonistic effects in single-cell responses and to pinpoint the corresponding dose ranges among different ligand pairs.

      Interestingly, we found that ultrasensitive IKK activation may lead to synergistic responses to low-dose ligand combinations in single cells. We also found that competition between LPS and Pam for CD14-mediated uptake may lead to an antagonistic/non-integrative combination. Our simulation-based finding of a non-integrative combination of LPS-Pam stimuli aligns with a previous independent experimental finding of a non-integrative response to the LPS and Pam combination (Kellogg et al., 2017), and this independent experimental study validates our model prediction.

      We further analyzed stimulus-response specificity under conditions predicted to exhibit synergy or antagonism. Our results indicate that antagonistic combinations of ligands can increase stimulus-response specificity in the context of ligand mixtures.

      Reviewer #3

      Evidence, reproducibility and clarity

      The authors investigate experimentally single macrophages' NF-kB responses to five ligands, separately and to 3 pairs of ligands. Using the single ligand stimulations, they train an existing mathematical model to replicate single-cell NF-kB nuclear trajectories. From what I understand, for each single cell trajectory in response to a given ligand, the best fit parameters of the core module and the receptor module (specific for the given ligand) are found.

      Then (again, from what I understand), single ligand models are used to generate responses to combinations of ligands. The parametrizations of single ligand models (to be combined) are chosen to have the most similar core modules. It is not described how the responses to more than one ligand are calculated - I expect that respective receptor modules work in parallel, providing signals to the core module. After observing that the response to CpG+pIC is lower (in terms of duration and total) than for CpG alone, the model is modified to account for competition for endosomal transport required by both ligands.

      Having the trained model, simulations of responses to all 31 combinations of ligands are performed, and each NF-κB trajectory is described by six signaling codons-Speed, Peak, Duration, Total, Early vs. Late, and Oscillations. Next, these codons are used to reconstruct (using a random forest model) the stimuli (which may be the combination of ligands). The single and even the two ligand stimuli are relatively well recognized, which is interpreted as the ability of macrophages to distinguish ligands even if present in combination.

      We thank the reviewer for careful reading of the manuscript.

      Major comments

      1) The demonstrated ability to recognize stimuli is based on several key assumptions that can hardly be met in reality.

      Response: We thank the reviewer for this comment, which prompted us to carefully reflect on the rigor of our work, inspired us to extend our analysis to a broad range of ligand-dose combinations, and helped us clarify the limitations of our approach. Please see our detailed responses below.

      a) The cell knows the stimulation time, and then it can use speed as a codon. Look at Fig. S4A: the trajectories in response to pIC are similar to those in response to TNF, but just delayed.

      Response: We thank the reviewer for this comment. We updated the model parameterization to better fit the single-ligand pIC condition (Lines 557-559). In the updated model, the simulated responses to TNF and pIC are quite different (Fig. S2A-B, Fig. S5A-B). Specifically, the Peak, Duration, EarlyVsLate, and Total signaling codons have different values. In addition, the literature suggests that timing differences in NFκB activation are sufficient to elicit differences in downstream gene expression responses, especially for early response genes (ERGs) and intermediate response genes (INGs) (Figure 1 in Ando et al., 2021). For the reviewer’s convenience, we have also appended the figures. Specifically, within the first 60 minutes, ctrl exhibits a higher Speed of NFκB activation, and the NFκB-regulated ERGs and INGs show differences in the first 60 minutes (below, Fig 1a,b). Ando et al. then identified the gene regulatory mechanism that is able to distinguish between differences in the Speed codon. Importantly, this mechanism does not require knowledge of t=0, i.e. when the timer was started.

      The signaling codon Speed, which is based on derivatives, is one way to quantify such timing differences in activation. It was selected from a library of more than 900 different dynamic features using an information-maximizing algorithm (Adelaja et al., 2021). It is possible that other ways of measuring time, e.g. time to half-max, might not be distinguished as well by these regulatory mechanisms.
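To illustrate why a derivative-based feature can separate fast from delayed activation without requiring a shared clock, consider a toy version of a Speed-like codon (illustrative only; the published Speed codon is selected from a larger feature library, as noted above, and its exact definition is not reproduced here):

```python
import numpy as np

def speed_like_codon(trajectory, dt=1.0):
    """Maximum instantaneous rate of increase of a nuclear NFkB trajectory.
    Depends only on the steepest rise, not on the absolute time of
    stimulation, so no knowledge of t=0 is required."""
    return float(np.max(np.gradient(np.asarray(trajectory, dtype=float), dt)))

fast = [0.0, 0.8, 1.0, 1.0, 0.9]   # rapid early activation
slow = [0.0, 0.1, 0.3, 0.6, 1.0]   # gradual, delayed activation

print(speed_like_codon(fast) > speed_like_codon(slow))  # True
```

Shifting either trajectory later in time leaves its maximum derivative unchanged, which is the sense in which such a feature does not need a timer started at stimulation.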

      b) The increase of stimulus concentration typically increases Peak, Duration, and Total, so a similar effect can be achieved by changing the ligand or concentration.

      Response: This (“the increase of stimulus concentration typically increases Peak, Duration, and Total”) is not an assumption. What the reviewer describes (“a similar effect can be achieved by changing the ligand or concentration”) may or may not occur. The six informative signaling codons can vary under different ligands or doses. For example, with increasing doses of Pam, the NFκB response shows a higher peak, potentially making it appear more like LPS stimulation. However, as the Pam dose increases, the response duration decreases, which distinguishes it from LPS stimulation (see the experimental data shown in Figure 4A, second row, and Figure 3A, second row, in Luecke et al. (2024); we have also pasted the corresponding figure below for the reviewer’s convenience).

      Figure 4A and Figure 3A from Luecke et al., (2024). Figure 4A: NFκB activity dynamics in the single cells in response to 0, 0.01, 0.1, 1, 10, and 100 ng/ml P3C4 stimulation. Eight hours were measured by fluorescence microscopy of reporter hMPDMs. Each row of the heatmap represents the p38 or NFκB signaling trajectory of one cell. Trajectories are sorted by the maximum amplitude of p38 activity. Data from two pooled biological replicates are depicted. Total # of cells: 898, 834, 827, 787, 778, and 923. Figure 3A: NFκB activity dynamics in the single cells in response to 100 ng/ml LPS stimulation. Eight hours were measured by fluorescence microscopy of reporter hMPDMs. Each row of the heatmap represents the NFκB signaling trajectory of one cell (with p38 measured shown in the original paper). Trajectories are sorted by the maximum amplitude of p38 activity. Data from two pooled biological replicates are depicted.

      Inspired by the reviewer’s comment (and also Reviewer 2’s comments), in the revision, we expanded our research to dose-response relationships for all dual-ligand combinations (Lines 302-406, Figure 4-5). This additional comprehensive analysis allowed us to identify the mechanism of synergistic and antagonistic effects in single-cell responses and to pinpoint the corresponding dose ranges among different ligand pairs.

      Interestingly, we found that IKK ultrasensitive activation may lead to synergistic responses to low-dose ligand combinations but only in a subset of single cells. We also found that CD14 uptake competition between LPS and Pam may lead to antagonistic/non-integrative combination. Our simulation-based finding of non-integrative combination of LPS-Pam stimuli aligns with previous independent experimental findings of non-integrative response for LPS and Pam combination (Kellogg et al., 2017).

      c) Distinguishing a given ligand in the presence of some others, even stronger ones, is based on the assumption that these ligands were given at the same time, which is hardly justified.

      Response: We agree with the reviewer that ligands could be given at different times. Considering time delays between ligands (their onset and also their removal) dramatically adds to the combinatorial complexity. Some initial studies by the Tay lab are beginning to explore scenarios of time-shifted ligand pairs (Wang et al., 2025). Here we focus on a systematic exploration of all ligand combinations at 6 different doses. The fact that we do not consider time delays is not an assumption but admittedly a limitation that may well be addressed in future studies. We have included a brief discussion of this issue in the Discussion (Lines 503-514), appended here for the reviewer’s convenience.

      Cells may be expected to interpret not only the combination of signals but also their timing and duration to mount appropriate transcriptional responses (Kumar et al., 2004; Son et al., 2023). For example, acute inflammation integrates pathogen-derived cues with pro- and anti-inflammatory signals over a timeframe of hours to days (Kumar et al., 2004) to coordinate the pathogen removal and tissue repair process. Investigating sequential stimulus combinations in our model is therefore crucial for understanding how cells process complex physiological inputs. Simulations that account for longer timescales may require additional feedback mechanisms, as described in some of our previous studies of NFκB (Werner et al., 2008, 2005).

      We would like to suggest that despite (or maybe because of) limiting our study to coincident stimuli, we made some noteworthy discoveries.

      2) For single ligands, it would be nice to see how the random forest classifier works on experimental data, not only on in silico data (even if generated by a fitted model).

      Response: This comment and Reviewer 2’s comment 3 helped us strengthen the rigor of our analysis by incorporating cross-model testing. We paste the response below.

      Specifically, we refined our analysis of ligand presence/absence classification by including ROC AUC and balanced accuracy metrics. This adjustment accounts for the fact that the experimental data did not cover all combinatorial conditions, thereby mitigating potential biases from data imbalance and threshold choice. The experimental results are qualitatively consistent with the simulations, though—as expected—they show somewhat lower ligand distinguishability compared to the noise-free simulated dataset. We have updated Figures 3E–F (previously Figure 3E), added Figure S8, and revised the manuscript accordingly (Lines 292–301). For the reviewer’s convenience, we have also included the revised manuscript text below.

      “Classifiers trained to distinguish TNF-present from TNF-absent conditions achieved a Receiver Operating Characteristic-Area Under the Curve (ROC AUC) of 0.96, significantly above the 0.5 baseline (Figure 3D, Figure S8A). Extending this analysis to other ligands, cells detected LPS (0.85), Pam (0.84), pIC (0.73), and CpG (0.63) in mixtures (Figure 3D, S8A). Using experimental data from double- and triple-ligand stimuli (Figure 1D), ROC AUC values were TNF 0.74, LPS 0.74, Pam 0.66, pIC 0.75, and CpG 0.66 (Figure 3E, S8B). Classifier accuracies yielded consistent results (Figure S8C-D). These results indicated a remarkable capability of preserving ligand-specific dynamic features within complex NFκB signal trajectories that enable nuclear detection of extracellular ligands even in complex stimulus mixtures.”

      3) My understanding of ligand discrimination is such that it is rather based on a combination of pathways triggered than solely on a single transcription factor response trajectory, which varies with ligand concentration and ligand concentration time profile (no reason to assume it is OFF-ON-OFF). For example, some of the considered ligands (plC and CpG) activate IRF3/IRF7 in addition to NF-kB, which leads to IFN production and activation of STATs. This should at least be discussed.

      Response: We thank the reviewer for this comment and fully agree. In the previous version, we discussed different signaling pathways combinatorically distinguishing stimulus. In the revision, we have extended this discussion to include the example of pIC and CpG activation, as suggested (Lines 515-522). We pasted the corresponding text below.

      “Furthermore, innate immune responses do not solely rely on NFκB but also involve the critical functions of AP1, p38, and the IRF3-ISGF3 axis. These additional pathways are likely activated in a coordinated manner and provide additional information (Luecke et al., 2021). This is exemplified by studies demonstrating synergistic effects between CpG and pIC in inhibiting tumor growth and promoting cytokine production (Huang et al., 2020), such as IFNβ and TNFα, whose expression is also regulated by the IRF and MAPK signaling pathways (Luecke et al., 2021; Sheu et al., 2023). Therefore, the inclusion of the parallel AP1 and MAPK pathways, as well as the type I interferon network (Cheng et al., 2015; Davies et al., 2020; Hanson and Batchelor, 2022; Luecke et al., 2024; Paek et al., 2016; Peterson et al., 2022), is a next step for expanding the mathematical models presented here.”

      Technical comments

      1) Reference 25: X. Guo, A. Adelaja, A. Singh, W. Roy, A. Hoffmann, Modeling single-cell heterogeneity in signaling dynamics of macrophages reveals principles of information transmission. Nature Communications (2025) does not lead to any paper with the same or a similar title and author list. This Ref is given as a reference to the model. Fortunately, Ref 8 is helpful. Nevertheless, authors should include a schematic of the model.

      Response: We apologize that the paper was not accessible in time; it is now. We have also added a schematic of the model as suggested (Figure S1) and a detailed description of the model and simulations in the Introduction (Lines 95-106), Results (Lines 129-141), and Methods (Simulation of heterogeneous NFκB dynamical responses).

      2) Also Mendeley Data DOI:10.17632/bv957x6frk.1 and GitHub https://github.com/Xiaolu-Guo/Combinatorial_ligand_NFkB lead to nowhere.

      Response: We thank the reviewer for this comment, and we have made the GitHub code public. Mendeley Data DOI:10.17632/bv957x6frk.1 can be accessed via the shared link https://data.mendeley.com/preview/bv957x6frk?a=6d56e079-d7b0-482e-951f-8a8e06ee8797 and will be made public once the paper is accepted.

      3) Dataset 1 is not described. Possibly it contains sets of parameters of receptor modules (different numbers of sets for each module, why?), but the names of parameters never appear in the text, which makes it impossible to reproduce the data.

      Response: We thank the reviewer for this comment. We have added a description of the dataset (S3 SupplementaryDataset2_NFkB_network_single_cell_parameter_distribution.xlsx) and added the parameter names in the Methods (Simulation of heterogeneous NFκB dynamical responses).


      4) It is difficult to understand how the simulations in response to more than one ligand are performed.

      Response: We thank the reviewer for this comment, and we have improved the explanation of the methods (Results, Lines 145-152) and included a detailed description of the model and simulations for combinatorial ligands (Methods, Predicting heterogeneous single-cell responses to combinatorial-ligand stimulation).

      Significance

      A lot of work has been done, the methodology is interesting, but the biological conclusions are overstated.

      Response: We thank the reviewer for their interest in the methodology. We have revised the title and abstract and added discussion of our findings to more accurately document what we have found. In the revision, we have increased the clarity and rigor of the work. For the key conclusion that macrophages maintain some level of NFκB signaling fidelity in response to ligand mixtures, we have validated the binary classifier results on experimental data, as the reviewer suggested.

      In the revision, we have also extended our methodology to further explore the dose-response curves for different dosage combinations of ligand pairs. This further work allowed us to identify the synergistic and antagonistic regimes. By comparing the stimulus-response specificity of the antagonistic model vs. the non-antagonistic model, we demonstrated that signaling antagonism may increase the distinguishability of the presence or absence of specific ligands within complex ligand mixtures. This provides a mechanism for how signaling fidelity is maintained to the surprising degree we reported.

      REFERENCES

      Adelaja, A., Taylor, B., Sheu, K.M., Liu, Y., Luecke, S., Hoffmann, A., 2021. Six distinct NFκB signaling codons convey discrete information to distinguish stimuli and enable appropriate macrophage responses. Immunity 54, 916-930.e7. https://doi.org/10.1016/j.immuni.2021.04.011

      Akira, S., Takeda, K., 2004. Toll-like receptor signalling. Nat Rev Immunol 4, 499–511. https://doi.org/10.1038/nri1391

      Andridge, R.R., Little, R.J.A., 2010. A Review of Hot Deck Imputation for Survey Non-response. Int Stat Rev 78, 40–64. https://doi.org/10.1111/j.1751-5823.2010.00103.x

      Cheng, Z., Taylor, B., Ourthiague, D.R., Hoffmann, A., 2015. Distinct single-cell signaling characteristics are conferred by the MyD88 and TRIF pathways during TLR4 activation. Sci Signal 8, ra69. https://doi.org/10.1126/scisignal.aaa5208

    1. Can I feel inside my hand? It's a really... I've never... -Sense your hand. -I've never... I've never tried to do this. Okay. We're gonna try it right now. Okay, so sense your hand. Sense your index finger from the inside. Index. That's pinkie. Okay. -Yeah, it's kind of warm. -Yeah. Warm and what else? Numb. I don't know. Warm and numb. -Numb? -I don't know if numb is the right word. It just feels like an energy, you know.

Yeah, you're right. See if you can sense your whole hand from the inside in the same way. Take your time. If you've never done this before, it's hard. Yeah, I can feel something in there. Like... You do? Let it just... Gently let your attention go up your arm. Yeah. See if you can sense your whole arm from the inside. Yeah, it feels like there's wolves running in my arm. Oh. It does not. See if you can sense both of your arms at the same time from the inside. Yeah, I could do that. And if you can do that, add your legs, so that you can sense your arms and your legs at the same time from the inside. Yeah, it's weird. Yeah, I can do that.

That... is presence. That's weird. -I've never done that before. It's weird. -That's presence. And what you're doing is you're not in your mind. This gets you out of your mind. You are not mediating your sense of your arms and legs through your mind. Through your mind. Mental. So that's a way to get out of your head. That's what you can do immediately when you're on the side of the river being nibbled by weasels and bees.

And so the next thing you can do, Duncan... Yes? ...is, while you are sensing your arms and your legs simultaneously... Yes. ...add two more facets. Look. Sense your arms and legs and look. -Look at what? -Whatever you see. Just let it come to you. -Don't search it out. -Okay. Just take in what you see. It's a bee with a thousand eyes and horns, -holding a dagger covered in pentagrams. -If you're on the side of the river, what else could it be? -You know... -Wait a minute. There's one more piece. Go ahead. And listen. So you sense your arms and legs... you look and you listen, all at the same time. Three things you're doing. Yeah, I get it. It gives you a kind of clarity. It kind of... It gives you clarity and it takes you into a slightly different dimension of consciousness. It's quite different, actually. It is quite different. Because I've got a microphone in my face, it's hard for me to do what I would... Like if I was really sitting by myself... But I think that this recipe for a meditative exercise is awesome.

Well, what it does is it allows you to get into presence without a teacher and without a penny to your name. -Yes. -And even as you are in that... um, state of presence... if you are in it, even for a minute, if you can stay in it for a minute, you will begin to sense the flow... -Right. -...of energy. You will sense that there is a river, because a lot of people on the side of the river don't even realize there's a river. Right. -So... -What's the river? It's reality, with its... flowing dynamism.
    2. But also, you know, if you look at the world, what you see is things appearing and disappearing, and humans are a part of the whole of that, and humans appear and they disappear. - Hmm. - Off the face of the Earth. That just happens. You know, our egos personalize it and we consider ourselves special cases. -Yes. -But we're really not, you know? We are a part of the whole and everything in the whole transforms all the time. It changes form. Transfigures.

You're a special case. - That's 'cause I'm your mama. - No! No, I know there's... I know, but come on. There's no way to stop the heartbreak. How do you... What do you do about that? You cry. You cry. It's really hard. But it's definitely something everyone's got to, you know, deal with. Yeah.

And it's such a strange thing. I mean, the universe, it... it seems so stable... if you are... you know... in this kind of automatic state. And the encounter with truth, which, for me, you dying... your dying... This thing has been probably the greatest... run-in with truth that I've had in my whole life. You can't really... It's inexpressible. Yeah.

But it's not like it makes you feel... This... This is not a feeling of, like... -This is not a desirable feeling. -No. But it's a feeling that every single human being will experience one way or the other. It is. But so much... So many of us are spending so much time engaged in just ridiculous activities, it seems like, just to try to avoid this experience. Exactly. People really try to avoid the consideration that they are gonna die and that people they love are gonna die.

It opens your heart. It breaks your heart open. - Yes. - You know? Our hearts have been closed, because we've closed them. We've defended ourselves against pain. And this opens them. Opening your heart sucks. - It hurts. - This is the thing. This is the thing Ram Dass talks about all the... It hurts.

Does it always hurt? Does opening your heart always just hurt? Are you just in a constant state of... No, it doesn't always hurt. But when it really cracks open, it hurts. You know? And it does. Even the hurt transforms, because if you inquire into the hurt, you know what you're experiencing is love. -Right. -The real deal. Yeah, it's... Yeah, right. Because it's seeing newness. It's seeing how much you value life. Yes. And the reason I look better now than I ever have -is because I am more fully living. - Right. Because I'm living and dying consciously. Simultaneously, I'm holding both.
      • our egos personalize our life experience
      • transformation is what's constant in the universe
      • the only way to grieve is to cry
      • pain opens your heart
      • "if you inquire into the hurt, you know what you're experiencing is love"
      • living and dying consciously and simultaneously
    1. Through social annotation, you can contribute to conversations about reading and writing to enrich learning in the course, much like when you participate in class discussions

      This statement is strong because it clearly connects social annotation to active participation. By comparing it to class discussions, it helps readers understand that annotation is not just marking up a text—it’s a form of engagement and conversation.

    1. Doomscrolling

      I think doomscrolling can be really damaging to mental health, but I also think that recently it's become more than just bad news: it's when you're on a social media platform (and say to yourself: just 5 more minutes) and then end up scrolling for another hour or so, without being able to stop. This loss of self-control can be really hard to grasp, and can take a toll on one's mental health.

    1. Arguments for Utilitarianism

Table of Contents: Introduction: Moral Methodology & Reflective Equilibrium · Arguments for Utilitarianism · What Fundamentally Matters · The Veil of Ignorance · Ex Ante Pareto · Expanding the Moral Circle · The Poverty of the Alternatives · The Paradox of Deontology · The Hope Objection · Skepticism About the Distinction Between Doing and Allowing · Status Quo Bias · Evolutionary Debunking Arguments · Conclusion · Resources and Further Reading

Introduction: Moral Methodology & Reflective Equilibrium

You cannot prove a moral theory. Whatever arguments you come up with, it’s always possible for someone else to reject your premises—if they are willing to accept the costs of doing so. Different theories offer different advantages. This chapter will set out some of the major considerations that plausibly count in favor of utilitarianism. A complete view also needs to consider the costs of utilitarianism (or the advantages of its competitors), which are addressed in Chapter 8: Objections to Utilitarianism. You can then reach an all-things-considered judgment as to which moral theory strikes you as overall best or most plausible.

To this end, moral philosophers typically use the methodology of reflective equilibrium.[1] This involves balancing two broad kinds of evidence as applied to moral theories:

1. Intuitions about specific cases (thought experiments).
2. General theoretical considerations, including the plausibility of the theory’s principles or systematic claims about what matters.

General principles can be challenged by coming up with putative counterexamples, or cases in which they give an intuitively incorrect verdict. In response to such putative counterexamples, we must weigh the force of the case-based intuition against the inherent plausibility of the principle being challenged.
This could lead you to either revise the principle to accommodate your intuitions about cases or to reconsider your verdict about the specific case, if you judge the general principle to be better supported (especially if you are able to “explain away” the opposing intuition as resting on some implicit mistake or confusion).

As we will see, the arguments in favor of utilitarianism rest overwhelmingly on general theoretical considerations. Challenges to the view can take either form, but many of the most pressing objections involve thought experiments in which utilitarianism is held to yield counterintuitive verdicts.

There is no neutral, non-question-begging answer to how one ought to resolve such conflicts.[2] It takes judgment, and different people may be disposed to react in different ways depending on their philosophical temperament. As a general rule, those of a temperament that favors systematic theorizing are more likely to be drawn to utilitarianism (and related views), whereas those who hew close to common sense intuitions are less likely to be swayed by its theoretical virtues. Considering the arguments below may thus do more than just illuminate utilitarianism; it may also help you to discern your own philosophical temperament!

While our presentation focuses on utilitarianism, it’s worth noting that many of the arguments below could also be taken to support other forms of welfarist consequentialism (just as many of the objections to utilitarianism also apply to these related views). This chapter explores arguments for utilitarianism and closely related views over non-consequentialist approaches to ethics.

Arguments for Utilitarianism

What Fundamentally Matters

Moral theories serve to specify what fundamentally matters, and utilitarianism offers a particularly compelling answer to this question.

Almost anyone would agree with utilitarianism that suffering is bad, and well-being is good. What could be more obvious?
If anything matters morally, human well-being surely does. And it would be arbitrary to limit moral concern to our own species, so we should instead conclude that well-being generally is what matters. That is, we ought to want the lives of sentient beings to go as well as possible (whether that ultimately comes down to maximizing happiness, desire satisfaction, or other welfare goods).

Could anything else be more important? Such a suggestion can seem puzzling. Consider: it is (usually) wrong to steal.[3] But that is plausibly because stealing tends to be harmful, reducing people’s well-being.[4] By contrast, most people are open to redistributive taxation, if it allows governments to provide benefits that reliably raise the overall level of well-being in society. So it’s not that individuals just have a natural right to not be interfered with no matter what. When judging institutional arrangements (such as property and tax law), we recognize that what matters is coming up with arrangements that tend to secure overall good results, and that the most important factor in what makes a result good is that it promotes well-being.[5]

Such reasoning may justify viewing utilitarianism as the default starting point for moral theorizing.[6] If someone wants to claim that there is some other moral consideration that can override overall well-being (trumping the importance of saving lives, reducing suffering, and promoting flourishing), they face the challenge of explaining how that could possibly be so. Many common moral rules (like those that prohibit theft, lying, or breaking promises), while not explicitly utilitarian in content, nonetheless have a clear utilitarian rationale. If they did not generally promote well-being—but instead actively harmed people—it’s hard to see what reason we would have to still want people to follow them.
To follow and enforce harmful moral rules (such as rules prohibiting same-sex relationships) would seem like a kind of “rule worship”, and not truly ethical at all.[7] Since the only moral rules that seem plausible are those that tend to promote well-being, that’s some reason to think that moral rules are, as utilitarianism suggests, purely instrumental to promoting well-being.

Similar judgments apply to hypothetical cases in which you somehow know for sure that a typically reliable rule is, in this particular instance, counterproductive. In the extreme case, we all recognize that you ought to lie or break a promise if lives are on the line. In practice, of course, the best way to achieve good results over the long run is to respect commonsense moral rules and virtues while seeking opportunities to help others. (It’s important not to mistake the hypothetical verdicts utilitarianism offers in stylized thought experiments with the practical guidance it offers in real life.) The key point is just that utilitarianism offers a seemingly unbeatable answer to the question of what fundamentally matters: protecting and promoting the interests of all sentient beings to make the world as good as it can be.

The Veil of Ignorance

Humans are masters of self-deception and motivated reasoning. If something benefits us personally, it’s all too easy to convince ourselves that it must be okay. We are also more easily swayed by the interests of more salient or sympathetic individuals (favoring puppies over pigs, for example). To correct for such biases, it can be helpful to force impartiality by imagining that you are looking down on the world from behind a “veil of ignorance”. This veil reveals the facts about each individual’s circumstances in society—their income, happiness level, preferences, etc.—and the effects that each choice would have on each person, while hiding from you the knowledge of which of these individuals you are.
To more fairly determine what ideally ought to be done, we may ask what everyone would have most personal reason to prefer from behind this veil of ignorance. If you’re equally likely to end up being anyone in the world, it would seem prudent to maximize overall well-being, just as utilitarianism prescribes.[9]

How much weight should we give to the verdicts that would be chosen, on self-interested grounds, from behind the veil? The veil thought experiment highlights how utilitarianism gives equal weight to everyone’s interests, without bias. That is, utilitarianism is just what we get when we are beneficent to all: extending to everyone the kind of careful concern that prudent people have for their own interests.[10] But it may seem question-begging to those who reject welfarism, and so deny that interests are all that matter. For example, the veil thought experiment clearly doesn’t speak to whether non-sentient life or natural beauty has intrinsic value. It’s restricted to that sub-domain of morality that concerns what we owe to each other, where this includes just those individuals over whom our veil-induced uncertainty about our identity extends: presently existing sentient beings, perhaps.[11] Accordingly, any verdicts reached via the veil of ignorance will still need to be weighed against what we might yet owe to any excluded others (such as future generations, or non-welfarist values).

Still, in many contexts other factors will not be relevant, and the question of what we morally ought to do will reduce to the question of how we should treat each other. Many of the deepest disagreements between utilitarians and their critics concern precisely this question. And the veil of ignorance seems relevant here.
The fact that some action is what everyone affected would personally prefer from behind the veil of ignorance seems to undermine critics’ claims that any individual has been mistreated by, or has grounds to complain about, that action.

Ex Ante Pareto

A Pareto improvement is better for some people, and worse for none. When outcomes are uncertain, we may instead assess the prospect associated with an action—the range of possible outcomes, weighted by their probabilities. A prospect can be assessed as better for you when it offers you greater well-being in expectation, or ex ante.[12] Putting these concepts together, we may formulate the following principle:

Ex ante Pareto: in a choice between two prospects, one is morally preferable to another if it offers a better prospect for some individuals and a worse prospect for none.

This bridge between personal value (or well-being) and moral assessment is further developed in economist John Harsanyi’s aggregation theorem.[13] But the underlying idea, that reasonable beneficence requires us to wish well to all, and prefer prospects that are in everyone’s ex ante interests, has also been defended and developed in more intuitive terms by philosophers.[14]

A powerful objection to most non-utilitarian views is that they sometimes violate ex ante Pareto, such as when choosing policies from behind the veil of ignorance. Many rival views imply, absurdly, that prospect Y could be morally preferable to prospect X, even when Y is worse in expectation for everyone involved.

Caspar Hare illustrates the point with a Trolley case in which all six possible victims are stuffed inside suitcases: one is atop a footbridge, five are on the tracks below, and a train will hit and kill the five unless you topple the one on the footbridge (in which case the train will instead kill this one and then stop before reaching the others).[15] As the suitcases have recently been shuffled, nobody knows which position they are in.
So, from each victim’s perspective, their prospects are best if you topple the one suitcase off the footbridge, increasing their chances of survival from 1/6 to 5/6. Given that this is in everyone’s ex ante interests, it’s deeply puzzling to think that it would be morally preferable to override this unanimous preference, shared by everyone involved, and instead let five of the six die; yet that is the implication of most non-utilitarian views.[16]

Expanding the Moral Circle

When we look back on past moral atrocities—like slavery or denying women equal rights—we recognize that they were often sanctioned by the dominant societal norms at the time. The perpetrators of these atrocities were grievously wrong to exclude their victims from their “circle” of moral concern.[17] That is, they were wrong to be indifferent towards (or even delight in) their victims’ suffering. But such exclusion seemed normal to people at the time. So we should question whether we might likewise be blindly accepting of some practices that future generations will see as evil but that seem “normal” to us.[18] The best protection against making such an error ourselves would be to deliberately expand our moral concern outward, to include all sentient beings—anyone who can suffer—and so recognize that we have strong moral reasons to reduce suffering and promote well-being wherever we can, no matter who it is that is experiencing it.

While this conclusion is not yet all the way to full-blown utilitarianism, since it’s compatible with, for example, holding that there are side-constraints limiting one’s pursuit of the good, it is likely sufficient to secure agreement with the most important practical implications of utilitarianism (stemming from cosmopolitanism, anti-speciesism, and longtermism).

The Poverty of the Alternatives

We’ve seen that there is a strong presumptive case in favor of utilitarianism.
If no competing view can be shown to be superior, then utilitarianism has a strong claim to be the “default” moral theory. In fact, one of the strongest considerations in favor of utilitarianism (and related consequentialist views) is the deficiencies of the alternatives. Deontological (or rule-based) theories, in particular, seem to rest on questionable foundations.[19]

Deontological theories are explicitly non-consequentialist: instead of morally assessing actions by evaluating their consequences, these theories tend to take certain types of action (such as killing an innocent person) to be intrinsically wrong.[20] There are reasons to be dubious of this approach to ethics, however.

The Paradox of Deontology

Deontologists hold that there is a constraint against killing: that it’s wrong to kill an innocent person even if this would save five other innocent people from being killed. This verdict can seem puzzling on its face.[21] After all, given how terrible killing is, should we not want there to be less of it? Rational choice in general tends to be goal-directed, a conception which fits poorly with deontic constraints.[22] A deontologist might claim that their goal is simply to avoid violating moral constraints themselves, which they can best achieve by not killing anyone, even if that results in more individuals being killed. While this explanation can render deontological verdicts coherent, it does so at the cost of making them seem awfully narcissistic, as though the deontologist’s central concern was just to maintain their own moral purity or “clean hands”.

Deontologists might push back against this characterization by instead insisting that moral action need not be goal-directed at all.
Rather than only seeking to promote value (or minimize harm), they claim that moral agents may sometimes be called upon to respect another’s value (by not harming them, even as a means to preventing greater harm to others), which would seem an appropriately outwardly-directed, non-narcissistic motivation.

The challenge remains that such a proposal makes moral norms puzzlingly divergent from other kinds of practical norms. If morality sometimes calls for respecting value rather than promoting it, why is the same not true of prudence? (Given that pain is bad for you, for example, it would not seem prudent to refuse a painful operation now if the refusal commits you to five comparably painful operations in future.) Deontologists may offer various answers to this question, but insofar as we are inclined to think, pre-theoretically, that ethics ought to be continuous with other forms of rational choice, that gives us some reason to prefer consequentialist accounts.[24]

Deontologists also face a tricky question about where to draw the line. Is it at least okay to kill one person to prevent a hundred killings? Or a million? Absolutists never permit killing, no matter the stakes. But such a view seems too extreme for many. Moderate deontologists allow that sufficiently high stakes can justify violations. But how high? Any answer they offer is apt to seem arbitrary and unprincipled. Between the principled options of consequentialism or absolutism, many will find consequentialism to be the more plausible of the two.

The Hope Objection

Impartial observers should want and hope for the best outcome. Non-consequentialists claim, nonetheless, that it’s sometimes wrong to bring about the best outcome. Putting the two claims together yields the striking result that you should sometimes hope that others act wrongly.

Suppose it would be wrong for some stranger—call him Jack—to kill one innocent person to prevent five other (morally comparable) killings.
Non-consequentialists may claim that Jack has a special responsibility to ensure that he does not kill anyone, even if this results in more killings by others. But you are not Jack. From your perspective as an impartial observer, Jack’s killing one innocent person is no more or less intrinsically bad than any of the five other killings that would thereby be prevented. You have most reason to hope that there is only one killing rather than five. So you have reason to hope that Jack acts “wrongly” (killing one to save five). But that seems odd.

More than merely being odd, this might even be taken to undermine the claim that deontic constraints matter, or are genuinely important to abide by. After all, to be important just is to be worth caring about. For example, we should care if others are harmed, which validates the claim that others’ interests are morally important. But if we should not care more about Jack’s abiding by the moral constraint against killing than we should about his saving five lives, that would seem to suggest that the constraint against killing is not in fact more morally important than saving five lives.

Finally, since our moral obligations ought to track what is genuinely morally important, if deontic constraints are not in fact important then we cannot be obligated to abide by them.[25] We cannot be obliged to prioritize deontic constraints over others’ lives, if we ought to care more about others’ lives than about deontic constraints. So deontic constraints must not accurately describe our obligations after all. Jack really ought to do whatever would do the most good overall, and so should we.

Skepticism About the Distinction Between Doing and Allowing

You might wonder: if respect for others requires not harming them (even to help others more), why does it not equally require not allowing them to be harmed?
Deontological moral theories place great weight on distinctions such as those between doing and allowing harm, or killing and letting die, or intended versus merely foreseen harms. But why should these be treated so differently? If a victim ends up equally dead either way, whether they were killed or “merely” allowed to die would not seem to make much difference to them—surely what matters to them is just their death. Consequentialism accordingly denies any fundamental significance to these distinctions.[26]

Indeed, it’s far from clear that there is any robust distinction between “doing” and “allowing”. Sometimes you might “do” something by remaining perfectly still.[27] Also, when a doctor unplugs a terminal patient from life support machines, this is typically thought of as “letting die”; but if a mafioso, worried about an informant’s potentially incriminating testimony, snuck in to the hospital and unplugged the informant’s life support, we are more likely to judge it to constitute “killing”.[28] Jonathan Bennett argues at length that there is no satisfactory, fully general distinction between doing and allowing—at least, none that would vindicate the moral significance that deontologists want to attribute to such a distinction.[29] If Bennett is right, then that might force us towards some form of consequentialism (such as utilitarianism) instead.

Status Quo Bias

Opposition to utilitarian trade-offs—that is, benefiting some at a lesser cost to others—arguably amounts to a kind of status quo bias, prioritizing the preservation of privilege over promoting well-being more generally.

Such conservatism might stem from the Just World fallacy: the mistake of assuming that the status quo is just, and that people naturally get what they deserve. Of course, reality offers no such guarantees of justice.
What circumstances one is born into depends on sheer luck, including one’s endowment of physical and cognitive abilities which may pave the way for future success or failure. Thus, even later in life we never manage to fully wrest back control from the whimsies of fortune and, consequently, some people are vastly better off than others despite being no more deserving. In such cases, why should we not be willing to benefit one person at a lesser cost to privileged others? They have no special entitlement to the extra well-being that fortune has granted them.[30] Clearly, it’s good for people to be well-off, and we certainly would not want to harm anyone unnecessarily.[31] However, if we can increase overall well-being by benefiting one person at the lesser cost to another, we should not refrain from doing so merely due to a prejudice in favor of the existing distribution.[32] It’s easy to see why traditional elites would want to promote a “morality” which favors their entrenched interests. It’s less clear why others should go along with such a distorted view of what (and who) matters.

It can similarly be argued that there is no real distinction between imposing harms and withholding benefits. The only difference between the two cases concerns what we understand to be the status quo, which lacks moral significance. Suppose scenario A is better for someone than B. Then to shift from A to B would be a “harm”, while to prevent a shift from B to A would be to “withhold a benefit”. But this is merely a descriptive difference. If we deny that the historically given starting point provides a morally privileged baseline, then we must say that the cost in either case is the same, namely the difference in well-being between A and B. In principle, it should not matter where we start from.[33]

Now suppose that scenario B is vastly better for someone else than A is: perhaps it will save their life, at the cost of the first person’s arm.
Nobody would think it okay to kill a person just to save another’s arm (that is, to shift from B to A). So if we are to avoid status quo bias, we must similarly judge that it would be wrong to oppose the shift from A to B—that is, we should not object to saving someone’s life at the cost of another’s arm.[34] We should not care especially about preserving the privilege of whoever stood to benefit by default; such conservatism is not truly fair or just. Instead, our goal should be to bring about whatever outcome would be best overall, counting everyone equally, just as utilitarianism prescribes.

Evolutionary Debunking Arguments

Against these powerful theoretical objections, the main consideration that deontological theories have going for them is closer conformity with our intuitions about particular cases. But if these intuitions cannot be supported by independently plausible principles, that may undermine their force—or suggest that we should interpret these intuitions as good rules of thumb for practical guidance, rather than as indicating what fundamentally matters.

The force of deontological intuitions may also be undermined if it can be demonstrated that they result from an unreliable process. For example, evolutionary processes may have endowed us with an emotional bias favoring those who look, speak, and behave like ourselves; this, however, offers no justification for discriminating against those unlike ourselves. Evolution is a blind, amoral process whose only “goal” is the propagation of genes, not the promotion of well-being or moral rightness. Our moral intuitions require scrutiny, especially in scenarios very different from our evolutionary environment. If we identify a moral intuition as stemming from our evolutionary ancestry, we may decide not to give much weight to it in our moral reasoning—the practice of evolutionary debunking.
35 35Katarzyna de Lazari-Radek and Peter Singer argue that views permitting partiality are especially susceptible to evolutionary debunking, whereas impartial views like utilitarianism are more likely to result from undistorted reasoning. 36 36 Joshua Greene offers a different psychological debunking argument. He argues that deontological judgments—for instance, in response to trolley cases—tend to stem from unreliable and inconsistent emotional responses, including our favoritism of identifiable over faceless victims and our aversion to harming someone up close rather than from afar. By contrast, utilitarian judgments involve the more deliberate application of widely respected moral principles. 37 37Such debunking arguments raise worries about whether they “prove too much”: after all, the foundational moral judgment that pain is bad would itself seem emotionally-laden and susceptible to evolutionary explanation—physically vulnerable creatures would have powerful evolutionary reasons to want to avoid pain whether or not it was objectively bad, after all! 38 38However, debunking arguments may be most applicable in cases where we feel that a principled explanation for the truth of the judgment is lacking. We do not tend to feel any such lack regarding the badness of pain—that is surely an intrinsically plausible judgment if anything is. Some intuitions may be over-determined: explicable both by evolutionary causes and by their rational merits. In such a case, we need not take the evolutionary explanation to undermine the judgment, because the judgment also results from a reliable process (namely, rationality). By contrast, deontological principles and partiality are far less self-evidently justified, and so may be considered more vulnerable to debunking. 
Once we have an explanation for these psychological intuitions that can explain why we would have them even if they were rationally baseless, we may be more justified in concluding that they are indeed rationally baseless.As such, debunking objections are unlikely to change the mind of one who is drawn to the target view (or regards it as independently justified and defensible). But they may help to confirm the doubts of those who already felt there were some grounds for scepticism regarding the intrinsic merits of the target view.ConclusionUtilitarianism can be supported by several theoretical arguments, the strongest perhaps being its ability to capture what fundamentally matters. Its main competitors, by contrast, seem to rely on dubious distinctions—like “doing” vs. “allowing”—and built-in status quo bias. At least, that is how things are apt to look to one who is broadly sympathetic to a utilitarian approach. Given the flexibility inherent in reflective equilibrium, these arguments are unlikely to sway a committed opponent of the view. For those readers who find a utilitarian approach to ethics deeply unappealing, we hope that this chapter may at least help you to better understand what appeal others might see in the view.However strong you judge the arguments in favor of utilitarianism to be, your ultimate verdict on the theory will also depend upon how well the view is able to counter the influential objections that critics have raised against it.The next chapter discusses theories of well-being, or what counts as being good for an individual.Next Chapter: Theories of Well-BeingHow to Cite This PageChappell, R.Y. and Meissner, D. (2023). Arguments for Utilitarianism. In R.Y. Chappell, D. Meissner, and W. MacAskill (eds.), An Introduction to Utilitarianism, <https://www.utilitarianism.net/arguments-for-utilitarianism>, accessed document.write((new Date).toLocaleDateString("en-US"))2/13/2026.
  3. Feb 2026
    1. You are an important part of your classroom

      This makes me realize that social annotation isn’t just highlighting... it’s contributing to an ongoing conversation about the reading.

    1. Author response:

      The following is the authors’ response to the previous reviews

      Public Reviews:

      Reviewer #1 (Public review):

      The authors assess the effectiveness of electroporating mRNA into male germ cells to rescue the expression of proteins required for spermatogenesis progression in individuals where these proteins are mutated or depleted. To set up the methodology, they first evaluated the expression of reporter proteins in wild-type mice, which showed expression in germ cells for over two weeks. Then, they attempted to recover fertility in a model of late spermatogenesis arrest that produces immotile sperm. By electroporating the mutated protein, the authors recovered the motility of ~5% of the sperm; although the sperm regenerated was not able to produce offspring using IVF, the embryos reached the 2-cell state (in contrast to controls that did not progress past the zygote state).

      This is a comprehensive evaluation of the mRNA methodology with multiple strengths. First, the authors show that naked synthetic RNA, purchased from a commercial source or generated in the laboratory with simple methods, is enough to express exogenous proteins in testicular germ cells. The authors compared RNA to DNA electroporation and found that germ cells are efficiently electroporated with RNA, but not DNA. The differences between these constructs were evaluated using in vivo imaging to track the reporter signal in individual animals through time. To understand how the reporter proteins affect the results of the experiments, the authors used different reporters: two fluorescent (eGFP and mCherry) and one bioluminescent (Luciferase). Although they observed differences among reporters, in every case expression lasted for at least two weeks. The authors used a relevant system to study the therapeutic potential of RNA electroporation. The ARMC2-deficient animals have impaired sperm motility phenotype that affects only the later stages of spermatogenesis. The authors showed that sperm motility was recovered to ~5%, which is remarkable due to the small fraction of germ cells electroporated with RNA with the current protocol. The sperm motility parameters were thoroughly assessed by CASA. The 3D reconstruction of an electroporated testis using state-of-the-art methods to show the electroporated regions is compelling.

      The main weakness of the manuscript is that although the authors manage to recover motility in a small fraction of the sperm population, it is unclear whether the increased sperm quality is substantial to improve assisted reproduction outcomes. The authors found that the rescued sperm could be used to obtain 2-cell embryos via IVF, but no evidence for more advanced stages of embryo differentiation was provided. The motile rescued sperm was also successfully used to generate blastocyst by ICSI, but the statistical significance of the rate of blastocyst production compared to non-rescued sperm remains unclear. The title is thus an overstatement since fertility was never restored for IVF, and the mutant sperm was already able to produce blastocysts without the electroporation intervention.

      Overall, the authors clearly show that electroporating mRNA can improve spermatogenesis as demonstrated by the generation of motile sperm in the ARMC2 KO mouse model.

      We thank the reviewer for this thoughtful and constructive comment. We agree that our study demonstrates a partial functional recovery of spermatogenesis rather than a complete restoration of fertility. Our main objective was to establish and validate a proof-of-concept approach showing that mRNA electroporation can rescue the expression of a missing or mutated protein in post-meiotic germ cells and result in the production of motile sperm.

      To address the reviewer’s concern, we have revised the title and discussion to more accurately reflect the scope of our findings. The new title reads:

      “Sperm motility in mice with oligo-astheno-teratozoospermia restored by in vivo injection and electroporation of naked mRNA”

      In the manuscript, we now emphasize that while motility recovery was significant, complete fertility restoration was not achieved. We have also clarified that:

      The 5% recovery in motile sperm represents a substantial improvement considering the small population of germ cells reached by the current electroporation method.

      The 2-cell embryo formation observed after IVF serves as a strong indication of partial functional recovery.

      Finally, we now explicitly state in the Discussion that this approach should be considered a therapeutic proof-of-concept, demonstrating feasibility and potential, rather than a fully curative intervention.

      Reviewer #2 (Public review):

      The authors inject, into the rete testes, mRNA and plasmids encoding mRNAs for GFP and then ARMC2 (into infertile Armc2 KO mice) in a gene therapy approach to express exogenous proteins in male germ cells. They do show GFP epifluorescence and ARMC2 protein in KO tissues, although the evidence presented is weak. Overall, the data do not necessarily make sense given the biology of spermatogenesis and more rigorous testing of this model is required to fully support the conclusions, that gene therapy can be used to rescue male infertility.

      In this revision, the authors attempt to respond to the critiques from the first round of reviews. While they did address many of the minor concerns, there are still a number to be addressed. With that said, the data still do not support the conclusions of the manuscript.

      We thank the reviewer for their careful and detailed assessment of our manuscript. We appreciate the concerns raised regarding mRNA stability, GFP localization, and the interpretation of spermatogenesis stages, and we have addressed these points in the manuscript and in the responses below.

      (1) The authors have not satisfactorily provided an explanation for how a naked mRNA can persist and direct expression of GFP or luciferase for ~3 weeks. The most stable mRNAs in mammalian cells have half-lives of ~24-60 hours. The stability of the injected mRNAs should be evaluated and reported using cell lines. GFP protein's half-life is ~26 hours, and luciferase protein's half-life is ~2 hours.

      We thank the reviewer for this important comment. The stability of mRNA-GFP was assessed by RT-qPCR in HEK cells and seminiferous tubule cells (Fig. 5). mRNA-GFP was detected for up to 60 hours in HEK cells and for up to two weeks in seminiferous tubule cells (Fig. 5A). Together, these results suggest that the long-lasting fluorescence observed in our experiments reflects a combination of transcript stability, efficient translation within germ cells, and the slow protein turnover that is typical of the spermatogenic lineage.

      (2) There is no convincing data shown in Figs. 1-8 that the GFP is even expressed in germ cells, which is obviously a prerequisite for the Armc2 KO rescue experiment shown in the later figures! In fact, to this reviewer the GFP appears to be in Sertoli cell cytoplasm, which spans the epithelium and surrounds germ cells - thus, it can be oft-confused with germ cells. In addition, if it is in germ cells, then the authors should be able to show, on subsequent days, that it is present in clones of germ cells that are maturing. Due to intracellular bridges, a molecule like GFP has been shown to diffuse readily and rapidly (in a matter of minutes) between adjacent germ cells. To clarify, the authors must generate single cell suspensions and immunostain for GFP using any of a number of excellent commercially-available antibodies to verify it is present in germ cells. It should also be present in sperm, if it is indeed in the germline.

      We thank the reviewer for this insightful comment. To directly address the concern, we performed additional experiments to assess GFP expression in germ cells following in vivo mRNA delivery. GFP-encoding mRNA was injected and electroporated into the testes on day 0. On day 1, testes were collected, enzymatically dissociated, and the resulting seminiferous tubule cell suspensions were cultured for 12 hours. Live cells were then analyzed by fluorescence microscopy (Fig. 10).

      We observed GFP expression in various germ cell types, including pachytene spermatocytes (53.4%) (Fig. 10A), round spermatids (25%) (Fig. 10B-E), and elongated spermatids (11.4%) (Fig. 10C-E). The identification of these cell types was based on DAPI nuclear staining patterns, cell size (Fig. 10F), non-adherent characteristics, and the use of an enzymatic dissociation protocol.

      Fluorescence imaging revealed strong cytoplasmic GFP signals in each of these populations, confirming efficient transfection and translation of the delivered mRNA. These results demonstrate that the in vivo injection and electroporation protocol enables effective mRNA delivery across multiple stages of spermatogenesis, and that the injected mRNA is efficiently translated in germ cells at each of these stages. Together, these data validate the germ cell-specific nature of the GFP signal, supporting the Armc2 KO rescue experiments.

      As mentioned previously, we assessed the stability of mRNA-GFP using RT-qPCR in HEK cells and seminiferous tubule cells (see Fig. 5). mRNA-GFP was detected for up to 60 hours in HEK cells and for up to two weeks in seminiferous tubule cells. Together, these results suggest that the long-lasting fluorescence observed in our experiments reflects a combination of transcript stability and local translation within germ cells, as well as the slow protein turnover typical of the spermatogenic lineage.

      Other comments:

      70-1 This is an incorrect interpretation of the findings from Ref 5 - that review stated there were ~2,000 testis-enriched genes, but that does not mean "the whole process involves around two thousand of genes"

      We thank the reviewer for this helpful comment. We agree that our previous phrasing was imprecise. We have revised the sentence to clarify that approximately 2,000 genes show testis-enriched expression, rather than implying that the entire spermatogenic process is limited to these genes. The corrected sentence now reads:

      “Spermatogenesis involves the coordinated expression of a large number of genes, with approximately 2,000 showing testis-enriched expression, about 60% of which are expressed exclusively in the testes”

      74 would specify 'male':

      We have now specified it as you suggested.

      79-84 Are the concerns with ICSI due to the procedure itself, or the fact that it's often used when there is likely to be a genetic issue with the male whose sperm was used? This should be clarified if possible, using references from the literature, as this reviewer imagines this could be a rather contentious issue with clinicians who routinely use this procedure, even in cases where IVF would very likely have worked:

      We thank the reviewer for this important comment. Concerns about ICSI outcomes indeed reflect two partly overlapping causes: the procedure itself (direct sperm injection and associated laboratory manipulations) and the clinical/genetic background of couples undergoing ICSI (especially men with severe male-factor infertility). Large reviews and meta-analyses report a small increase in some perinatal and congenital risks after ART/ICSI, but these studies conclude that it is difficult to fully disentangle procedural effects from parental factors. Importantly, genetic or epigenetic abnormalities in the male (which motivate the use of ICSI) likely contribute to adverse outcomes in offspring, while some studies also suggest that ICSI-specific manipulations may alter epigenetic marks in embryos. For these reasons, professional bodies recommend reserving ICSI for appropriate male-factor indications rather than using it as routine insemination for non-male-factor cases.

      We have revised the text accordingly to clarify this distinction:

      “ICSI can efficiently overcome the problems faced. Nevertheless, concerns persist regarding the potential risks associated with this technique, including blastogenesis defects, cardiovascular defects, gastrointestinal defects, musculoskeletal defects, orofacial defects, leukemia, central nervous system tumors, and solid tumors [1-4]. Statistical analyses of birth records have demonstrated an elevated risk of birth defects, with a 30-40% increased likelihood in cases involving ICSI [1-4], and a prevalence of birth defects between 1% and 4% [3]. It is important to note, however, that the origin of these risks remains debated. Several large epidemiological and mechanistic studies indicate that both the procedure itself (direct microinjection and in vitro manipulation) and the underlying genetic or epigenetic abnormalities often present in men requiring ICSI contribute to the observed outcomes [1, 3, 5, 6]. To overcome these drawbacks, a number of experimental strategies have been proposed to bypass ARTs and restore spermatogenesis and fertility, including gene therapy [7-10].”

      199 Codon optimization improvement of mRNA stability needs a reference;

      We have added the references accordingly: [11-15].

      In one study using yeast transcripts, optimization improved RNA stability on the order of minutes (e.g., from ~5 minutes to ~17 minutes); is there some evidence that it could be increased dramatically to days or weeks?

      We agree with the reviewer that codon optimization can enhance mRNA stability, but available evidence indicates that this effect is moderate. In Saccharomyces cerevisiae, Presnyak et al. (2015) [16] showed that codon optimization increased mRNA half-life from approximately 5 minutes to ~17 minutes, representing a several-fold improvement rather than a shift to days or weeks. Similar codon-dependent stabilization has been observed in mammalian systems, where transcripts enriched in optimal codons exhibit longer half-lives and enhanced translation efficiency [11, 17]. However, these studies consistently report effects on the scale of minutes to hours. In mammalian cells, the prolonged stability of therapeutic or vaccine mRNAs—lasting for days—is primarily achieved through additional features such as optimized untranslated regions, chemical nucleotide modifications (e.g., N¹-methylpseudouridine), and protective delivery systems, rather than codon usage alone [18, 19].

      Other molecular optimizations that improve in vivo mRNA stability and translation include a poly(A) tail, which binds poly(A)-binding proteins to protect the transcript from 3′ exonuclease degradation and promotes ribosome recycling, and a CleanCap structure at the 5′ end, which mimics the natural Cap 1 configuration, protects against 5′ exonuclease attack, and enhances translational initiation [11-15]. Together, these modifications act synergistically to stabilize the transcript and support efficient translation.

      472-3 The reported half-life of EGFP is ~36 hours - so, if the mRNA is unstable (and not measured, but certainly could be estimated by qRT-PCR detection of the transcript on subsequent days after injection) and EGFP is comparatively more stable (but still hours), how does EGFP persist for 21 days after injection of naked mRNA??

      We thank the reviewer for this important comment. The stability of mRNA-GFP was assessed by RT-qPCR in HEK cells and seminiferous tubule cells (Fig. 5). mRNA-GFP was detected for up to 60 hours in HEK cells and for up to two weeks in seminiferous tubule cells (Fig. 5). Together, these results suggest that the long-lasting fluorescence observed in our experiments reflects a combination of transcript stability, efficient translation within germ cells, and the slow protein turnover that is typical of the spermatogenic lineage.

      Curious why the authors were unable to get anti-GFP to work in immunostaining?

      We appreciate the reviewer’s question. We attempted to detect GFP using several commercially available anti-GFP antibodies under various standard immunostaining conditions. However, in our hands, these antibodies consistently produced either no signal or high background staining, making the results unreliable. We therefore relied on direct detection of GFP fluorescence, which provides a more accurate and specific readout of protein expression in our system.

      In Fig. 3-4, the GFP signals are unremarkable, in that they cannot be fairly attributed to any structure or cell type - they just look like blobs; and why, in Fig. 4D-E, why does the GFP signal appear stronger at 21 days than 15 days? And why is it completely gone by 28 days? This data is unconvincing.

      We would like to thank the reviewer for their comments. Figure 3–4 provides a global overview of GFP expression on the surface of the testis. The entire testis was imaged using an inverted epifluorescence microscope, and the GFP signal represents a composite of multiple seminiferous tubules across the tissue surface. Due to this whole-organ imaging approach, it is not possible to resolve individual structures such as the basement membrane or lumen, which is why the signals may appear as diffuse “blobs.”

      Regarding the time-course in Figure 4D–E, the apparent increase in GFP signal at 21 days compared with 15 days likely reflects accumulation and translation of the delivered mRNA in germ cells over time, whereas the absence of signal at 28 days corresponds to the natural turnover and degradation of GFP protein and mRNA in the tissue. We hope this explanation clarifies the observed patterns of fluorescence.

      If the authors did a single cell suspension, what types or percentage of cells would be GFP+? Since germ cells are not adherent in culture, a simple experiment could be done whereby a single cell suspension could be made, cultured for 4-6 hours, and non-adherent cells "shaken off" and imaged vs adherent cells. Cells could also be fixed and immunostained for GFP, which has worked in many other labs using anti-GFP.

      We thank the reviewer for this insightful comment. To directly address the concern, we performed additional experiments to assess GFP expression in germ cells following in vivo mRNA delivery. GFP-encoding mRNA was injected and electroporated into the testes on day 0. On day 1, testes were collected, enzymatically dissociated, and the resulting seminiferous tubule cell suspensions were cultured for 12 hours. Live cells were then analyzed by fluorescence microscopy (Fig. 10).

      We observed GFP expression in various germ cell types, including pachytene spermatocytes (53.4%) (Fig. 10A), round spermatids (25%) (Fig. 10B-E), and elongated spermatids (11.4%) (Fig. 10C-E). The identification of these cell types was based on DAPI nuclear staining patterns, cell size (Fig. 10F), non-adherent characteristics, and the use of an enzymatic dissociation protocol.

      Fluorescence imaging revealed strong cytoplasmic GFP signals in each of these populations, confirming efficient transfection and translation of the delivered mRNA. These results demonstrate that the in vivo injection and electroporation protocol enables effective mRNA transfection across multiple stages of spermatogenesis.

      These results confirm that the injected mRNA is efficiently translated in germ cells at various stages of spermatogenesis. Together, these data validate the germ cell-specific nature of the GFP signal, supporting the Armc2 KO rescue experiments.

      As mentioned previously, we assessed the stability of mRNA-GFP using RT-qPCR in HEK cells and seminiferous tubule cells (see Fig. 5). mRNA-GFP was detected for up to 60 hours in HEK cells and for up to two weeks in seminiferous tubule cells. Together, these results suggest that the long-lasting fluorescence observed in our experiments reflects a combination of transcript stability and local translation within germ cells, as well as the slow protein turnover typical of the spermatogenic lineage.

      In Fig. 5, what is the half-life of luciferase? From this reviewer's search of the literature, it appears to be ~2-3 h in mammalian cells. With this said, how do the authors envision detectable protein for up to 20 days from a naked mRNA? The stability of the injected mRNAs should be shown in a mammalian cell line - perhaps this mRNA has an incredibly long half-life, which might help explain these results. However, even the most stable endogenous mRNAs (e.g., globin) are ~24-60 hrs.

      We did not directly assess the stability of luciferase mRNA, but we evaluated the persistence of GFP mRNA, which was synthesized and optimized using the same sequence optimization and chemical modification strategy as the luciferase mRNA. In these experiments, mRNA-GFP was detectable in seminiferous tubule cells for up to two weeks after injection. We therefore expect a similar stability profile for the luciferase mRNA. These findings suggest that the prolonged fluorescence or bioluminescence observed in our study likely reflects a combination of factors, including enhanced transcript stability, local translation within germ cells, and the inherently slow protein turnover characteristic of the spermatogenic lineage.

      527-8 The Sertoli cell cytoplasm is not just present along the basement membrane as stated, but also projects all the way to the lumina:

      We clarified the sentence: "Sertoli cells have an oval to elongated nucleus, and the cytoplasm presents a complex shape (“tombstone” pattern) along the basement membrane, with long projections that extend toward the lumen."

      529-30 This is incorrect, as round spermatids are never "localized between the spermatocytes and elongated spermatids" - if elongated spermatids are present, rounds are not - they are never coincident in the same testis section:

      We thank the reviewer for this important comment and for drawing attention to the detailed staging of the seminiferous epithelium. We agree that the spatial organization of germ cells varies depending on the stage of spermatogenesis. While round spermatids (steps 1–8) and elongated spermatids (steps 9–16) are typically associated with distinct stages, transitional stages of the seminiferous epithelium can contain both cell types in close proximity, reflecting the continuous and overlapping nature of spermatid differentiation (Meistrich, 2013, Methods Mol. Biol. 927:299–307). We have revised the text to clarify this point, indicating that the relative positioning of germ cell types depends on the stage of the seminiferous cycle rather than implying their constant coexistence within the same tubule section.

      Fig. 7. To this reviewer, all of the GFP appears to be in Sertoli cell cytoplasm In Figs 1-8 there is no convincing evidence presented that GFP is expressed in germ cells! In fact, it appears to be in Sertoli cells.

      We thank the reviewer for their observation. As previously mentioned, we have included an additional experiment specifically demonstrating GFP expression in germ cells (fig 10). This new data provides clear evidence that the GFP signal is not restricted to Sertoli cells and confirms successful uptake and translation of GFP mRNA in germ cells.

      Fig. 9 - alpha-tubuline?

      We corrected the figure.

      Fig. 11 - how was sperm morphology/motility not rescued on "days 3, 6, 10, 15, or 28 after surgery", but it was in some at 21 and 35? How does this make sense, given the known kinetics of male germ cell development??

      We note the reviewer’s concern regarding the timing of motile sperm appearance. Variability among treated mice is expected because transfection efficiency differed between spermatogonia and spermatids. Full spermiogenesis requires ~15 days, and epididymal transit adds ~8 days, consistent with motile sperm appearing around 21 days post-injection in some mice.

      And at least one of the sperm in the KO in Fig. B5 looks relatively normal, and the flagellum may be out-of-focus in the image? With only a few sperm for reviewers to see, how can we know these represent the population?

      We thank the reviewer for their comment. Upon closer examination of the image, the flagellum of the spermatozoon in question is clearly abnormally short and this is not due to being out of focus. Furthermore, the supplementary figure shows that the KO consistently lacks normal spermatozoa. These defects are consistent with previous findings from our laboratory [22], confirming that the observed phenotype is representative of the KO population rather than an isolated occurrence.

      Reviewer #3 (Public review):

      Summary:

      The authors used a novel technique to treat male infertility. In a proof-of-concept study, the authors were able to rescue the phenotype of a knockout mouse model with immotile sperm using this technique. This could also be a promising treatment option for infertile men.

      Strengths:

      In their proof-of-concept study, the authors were able to show that the novel technique rescues the infertility phenotype of Armc2 knockout spermatozoa. In the current version of the manuscript, the authors have added data on in vitro fertilisation experiments with Armc2 mRNA-rescued sperm. The authors show that Armc2 mRNA-rescued sperm can successfully fertilise oocytes that develop to the blastocyst stage. This adds another level of reliability to the data.

      Weaknesses:

      Some minor weaknesses identified in my previous report have already been fixed. The technique is new and may not yet be fully established for all issues. Nevertheless, the data presented in this manuscript opens the way for several approaches to immotile spermatozoa to ensure successful fertilisation of oocytes and subsequent appropriate embryo development.

      [Editors' note: The images in Figure 12 do not support the authors' interpretation that 2-cell embryos resulted from in vitro fertilization. Instead, the cells shown appear to be fragmented, unfertilized eggs. Combined with the lack of further development, it seems highly unlikely that fertilization was successful.]

      We thank the reviewer for their careful evaluation and constructive feedback. We appreciate the acknowledgment of the strengths of our study, particularly the proof-of-concept demonstration that Armc2-mRNA electroporation can rescue sperm motility in Armc2 knockout mice.

      Regarding the concern raised by the editor about Figure 12, we would like to clarify two technical points. First, the IVF experiments were performed using CD1 oocytes and B6D2 sperm. Due to strain-specific incompatibilities, fertilization of CD1 oocytes by B6D2 sperm typically does not progress beyond the two-cell stage (Fernández-González et al., 2008, Biology of Reproduction [23]). Therefore, the observation of two-cell embryos represents the expected limit of development in this cross and serves as a strong indication of successful fertilization, even though further development is not possible. Second, the oocytes used in these experiments were treated with collagenase to remove cumulus cells. This enzymatic treatment can sometimes affect the morphology of early embryos, which may explain why the two-cell embryos in Figure 12 appear less regular or somewhat fragmented. We also included a control showing embryos from B6D2 sperm with the same collagenase treatment on CD1 oocytes, which yielded similar appearances (Fig. 14A4).

      To provide additional functional evidence, we complemented the IVF experiments with ICSI using rescued Armc2–/– sperm and B6D2 oocytes, which allowed embryos to develop to the blastocyst stage. In these experiments, 25% of injected oocytes reached the blastocyst stage with rescued sperm, compared to 13% for untreated Armc2–/– sperm (Supplementary Fig. 9). These results support the functional competence of rescued sperm and demonstrate partial recovery of fertilization ability following Armc2 mRNA electroporation.

      We have clarified these points in the revised Results and Discussion sections to emphasize that the IVF data indicate partial functional recovery of rescued sperm rather than full fertility restoration. These clarifications address the editor’s concern while accurately representing the technical limitations of the strain combination used in our experiments.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Fig 12 and Supplementary Fig 9 are mislabeled in the text and rebuttal.

      We thank the reviewer for pointing this out. We have carefully checked the manuscript and the rebuttal text, and corrected all references to Figure 12 and Supplementary Figure 9 to ensure they are accurately labeled and consistent throughout the text.

      Reviewer #3 (Recommendations for the authors):

      The contribution of the newly added authors should be clarified. All other aspects of inadequacy raised in my previous report have been adequately addressed.

      No further comments.

      We thank the reviewer for noting this. The contributions of the newly added authors have been clarified in the Author Contributions section of the revised manuscript. All other points raised in the previous review have been addressed as indicated.

      References

      (1) Hansen, M., et al., Assisted reproductive technologies and the risk of birth defects--a systematic review. Hum Reprod, 2005. 20(2): p. 328-38.

      (2) Halliday, J.L., et al., Increased risk of blastogenesis birth defects, arising in the first 4 weeks of pregnancy, after assisted reproductive technologies. Hum Reprod, 2010. 25(1): p. 59-65.

      (3) Davies, M.J., et al., Reproductive technologies and the risk of birth defects. N Engl J Med, 2012. 366(19): p. 1803-13.

      (4) Kurinczuk, J.J., M. Hansen, and C. Bower, The risk of birth defects in children born after assisted reproductive technologies. Curr Opin Obstet Gynecol, 2004. 16(3): p. 201-9.

      (5) Graham, M.E., et al., Assisted reproductive technology: Short- and long-term outcomes. Dev Med Child Neurol, 2023. 65(1): p. 38-49.

      (6) Palermo, G.D., et al., Intracytoplasmic sperm injection: state of the art in humans. Reproduction, 2017. 154(6): p. F93-f110.

      (7) Usmani, A., et al., A non-surgical approach for male germ cell mediated gene transmission through transgenesis. Sci Rep, 2013. 3: p. 3430.

      (8) Raina, A., et al., Testis mediated gene transfer: in vitro transfection in goat testis by electroporation. Gene, 2015. 554(1): p. 96-100.

      (9) Michaelis, M., A. Sobczak, and J.M. Weitzel, In vivo microinjection and electroporation of mouse testis. J Vis Exp, 2014(90).

      (10) Wang, L., et al., Testis electroporation coupled with autophagy inhibitor to treat non-obstructive azoospermia. Mol Ther Nucleic Acids, 2022. 30: p. 451-464.

      (11) Wu, Q., et al., Translation affects mRNA stability in a codon-dependent manner in human cells. eLife, 2019. 8: p. e45396.

      (12) Gallie, D.R., The cap and poly(A) tail function synergistically to regulate mRNA translational efficiency. Genes & Development, 1991. 5(11): p. 2108-2116.

      (13) Henderson, J.M., et al., Cap 1 messenger RNA synthesis with co-transcriptional CleanCap® analog improves protein expression in mammalian cells. Nucleic Acids Research, 2021. 49(8): p. e42.

      (14) Stepinski, J., et al., Synthesis and properties of mRNAs containing novel “anti-reverse” cap analogs. RNA, 2001. 7(10): p. 1486-1495.

      (15) Sachs, A.B., P. Sarnow, and M.W. Hentze, Starting at the beginning, middle, and end: translation initiation in eukaryotes. Cell, 1997. 89(6): p. 831-838.

      (16) Presnyak, V., et al., Codon optimality is a major determinant of mRNA stability. Cell, 2015. 160(6): p. 1111-24.

      (17) Cao, D., et al., Unlock the sustained therapeutic efficacy of mRNA. J Control Release, 2025. 383: p. 113837.

      (18) Karikó, K., et al., Incorporation of pseudouridine into mRNA yields superior nonimmunogenic vector with increased translational capacity and biological stability. Mol Ther, 2008. 16(11): p. 1833-40.

      (19) Pardi, N., et al., mRNA vaccines — a new era in vaccinology. Nature Reviews Drug Discovery, 2018. 17(4): p. 261-279.

      (20) Meistrich, M.L. and R.A. Hess, Assessment of Spermatogenesis Through Staging of Seminiferous Tubules, in Spermatogenesis: Methods and Protocols, D.T. Carrell and K.I. Aston, Editors. 2013, Humana Press: Totowa, NJ. p. 299-307.

(21) Mäkelä, J.-A., et al., JoVE, 2020(164): p. e61800.

      (22) Coutton, C., et al., Bi-allelic Mutations in ARMC2 Lead to Severe Astheno-Teratozoospermia Due to Sperm Flagellum Malformations in Humans and Mice. Am J Hum Genet, 2019. 104(2): p. 331-340.

      (23) Fernández-Gonzalez, R., et al., Long-term effects of mouse intracytoplasmic sperm injection with DNA-fragmented sperm on health and behavior of adult offspring. Biol Reprod, 2008. 78(4): p. 761-72.

    1. Chapter 1: Introduction to College Writing at CNM

This textbook was designed for English 1110 and 1120, Composition I and Composition II, respectively. If you are enrolled in one of these courses, you may be nearing the end of your studies at Central New Mexico Community College (CNM), you may be just starting your studies at CNM, or you may have already taken this class but didn’t finish. The reality is that every English 1110 and 1120 course at CNM contains a diverse range of students. If you are enrolled in English 1110 or 1120 at CNM, you are likely a resident of New Mexico (NM). You might have gone to an elementary or secondary school here. You might feel like a part of the unique culture here in NM. Wherever you started, we welcome you to CNM! The graphic below lists the outcomes for English 1110 and 1120, which will be introduced by your instructor and included in your syllabus.

Course Outcomes: Composition I & II

Composition I: English 1110
Analyze communication through reading and writing skills.
Employ writing processes such as planning, organizing, composing, and revising.
Express a primary purpose and organize supporting points logically.
Use and document research evidence appropriate for college-level writing.
Employ academic writing styles appropriate for different genres and audiences.
Identify and correct grammatical and mechanical errors in their writing.

Composition II: English 1120
Analyze the rhetorical situation for purpose, main ideas, support, audience, and organizational strategies in a variety of genres.
Employ writing processes such as planning, organizing, composing, and revising.
Use a variety of research methods to gather appropriate, credible information.
Evaluate sources, claims, and evidence for their relevance, credibility, and purpose.
Quote, paraphrase, and summarize sources ethically, citing and documenting them appropriately.
Integrate information from sources to effectively support claims and for other purposes (to provide background information, evidence/examples, illustrate an alternative view, etc.).
Use an appropriate voice (including syntax and word choice).
Did You Know
Being a CNM student means that you are enrolled at the largest post-secondary institution in the state. CNM offers resources that can help you not only with your studies but also with managing your other responsibilities. In this textbook, we’ll cover the conventions of writing, and we’ll also cover some of the resources available to you as a CNM student. And since this book is free and available on the internet, you can keep it…forever! This textbook is an Open Educational Resource (OER) text, which means it was created using free and available sources on the Internet, namely eight different open access books. Our compiled textbook will shift between free, outside writing resources and the first-person plural voice, or the we voice, signaling the English teachers who compiled and developed sections of the text. Throughout this text, the writers (all CNM English faculty, some of whom are still paying back student loans) are the we who compiled this textbook. We did so because we believe that a college education should be engaging, enlightening, informative, life-affirming, worldview-upturning, and affordable. We believe it shouldn’t cost money to learn how to write, and that is why we are making this book available to you. This project also would not have happened without the support of CNM’s OER initiative and Liberal Arts administration. This textbook will cover ways to communicate effectively as you develop insight into your own style, writing process, grammatical choices, and rhetorical situations. With these skills, you should be able to improve your writing regardless of the discipline you enter after completing this course.
Knowing your rhetorical situation, or the circumstances under which you communicate, and knowing which tone, style, and genre will most effectively persuade your audience will help you regardless of whether you are enrolling in history, biology, theater, or music next semester, because when you get to college, you write in every discipline. To help launch our introduction, this chapter includes a section from the open access textbook Successful Writing. As you begin this chapter, you may wonder why you need an introduction. After all, you have been writing and reading since elementary school. You completed numerous assessments of your reading and writing skills in high school and as part of your application process for college. You may write on the job, too. Why is a college writing course even necessary? It can be difficult to feel excited about an intro writing course when you are eager to begin the coursework in your major (and if you are an English major, let your teacher know so you can talk about your future education plans). Regardless of your field of study, honing your writing skills, plus your reading and critical-thinking skills, will help you build a solid academic foundation. In college, academic expectations change from what you may have experienced in high school. The quantity of work you are expected to complete increases. When instructors expect you to read pages upon pages or study hours and hours for one particular course, managing your workload can be challenging. This chapter includes strategies for studying efficiently and managing your time. The quality of the work you do also changes. It is not enough to understand course material and summarize it on an exam. You will also be expected to seriously engage with new ideas by reflecting on them, analyzing them, critiquing them, making connections, drawing conclusions, or finding new ways of thinking about a given subject. Educationally, you are moving into deeper waters.
A good introductory writing course will help you swim.
Infographic comparing various aspects of high school and college, adapted from “Chapter One” of Successful Writing, 2012, used according to Creative Commons 3.0 cc-by-nc-sa.
Seeking Help
Meeting College Expectations
Depending on your education before coming to CNM, you will have varied writing experiences compared with other students in class. Some students might have earned a GED, some might be returning to school after a decades-long break, and still other students might either be graduating high school or be freshly graduated. If the latter is the case, you might enter college with a wealth of experience writing five-paragraph essays, book reports, and lab reports. Even the best students, however, need to make big adjustments to learn the conventions of academic writing. College-level writing obeys different rules, and learning them will help you hone your writing skills. Think of it as ascending another step up the writing ladder. Many students feel intimidated asking for help with academic writing; after all, it’s something you’ve been doing your entire life in school. However, there’s no need to feel like it’s a sign of a lack of ability; on the contrary, many of the strongest student writers regularly seek help and support with their writing (that’s why they’re so strong). College instructors are familiar with the ups and downs of writing, and most colleges have support systems in place to help students learn how to write for an academic audience. The following sections discuss common on-campus writing services, what to expect from them, and how they can help you.
Tutoring Center
CNM students have access to The Learning and Computer Center (TLCc), which is available on six campuses: Advanced Technology Center, Main, Montoya, Rio Rancho, South Valley, and Westside. At these writing centers, trained tutors help students meet college-level expectations.
The tutoring centers offer one-on-one, online, and group sessions for multiple disciplines. TLCc also offers workshops on citing sources and on developing a writing process. CNM’s Ace Tutoring Lab provides students with resources and support for their academic needs.
Student-Led Workshops
Some courses encourage students to share their research and writing with each other, and even offer workshops where students can present their own writing and offer constructive comments to their classmates. Independent paper-writing workshops provide a space for peers with varying interests, work styles, and areas of expertise to brainstorm. Writing in drafts makes academic work more manageable. Drafting gets your ideas onto paper, which gives you more to work with than the perfectionist’s daunting blank screen. You can always return later to fix the problems that bother you.
Communicating in a College Course
Communication courses teach students that communication involves two parties: the sender and the receiver of the communicated message. Sometimes there is more than one sender, and often there is more than one receiver of the message. The main purpose of communication, whether it be email, text, tweet, blog, discussion, presentation, written assignment, or speech, is always to help the receiver(s) of the message understand the idea that the sender of the message is trying to share. This section will focus on electronic communication in a college course.
Email or Message
An email or message sent to your instructor is often the result of a question you may have. Many students think contacting their instructor shows that they weren’t paying attention or that they are the only student who did not understand something, so they often keep quiet and go on trying to do work that they do not understand.
Other students think that their teacher is their own private tutor, so they email or message the teacher several times a day to ask questions that likely have answers in the syllabus and in the learning module instructions. Both of these behaviors are unhelpful and frustrating to the students and the instructor. In particular, avoid monopolizing your teacher’s email inbox with dozens of emails and messages per week and expecting them to respond immediately. Nobody enjoys having their inbox blown up with multiple messages by the same person. Try to remember your instructor will likely have many other emails from administrators, staff, and other students. Avoid sending harsh or demanding emails or messages when you are panicked, frustrated, or angry. Walk away from your computer and return at a later time when you feel calmer. Then re-read the instructions, syllabus, or course materials you find confusing, and if you still cannot find the answer because it is not there, definitely email or message your instructor.
Tips for Emailing Your Instructor
Be polite: Address your professor formally, using the title “Professor” or “Instructor” with their last name. Depending on how formal your professor seems, use a salutation (“Dear” or “Hello”) followed by your professor’s name/title (Dr. XYZ, Professor XYZ, etc.).
Pose a question. Clearly introduce the purpose of your email and the information you are requesting. If you are not asking a specific question, be aware that you may not receive a response to your email.
Be concise. Instructors are busy people, and although they are typically more than happy to help you, kindly get to your point quickly.
Sign off with your first and last name, the course number, and the class time. This will make it easy for your professor to identify you.
Do not ask, “When will you return our papers?” If you MUST ask, make it specific and realistic (e.g., “Will we get our papers back by the end of next week?”).
Most instructors teach multiple classes and could have hundreds of assignments to grade.
Do not ask your instructor if you missed anything important when you were absent. Instructors work diligently to design their coursework, so asking if any of that content was important can be considered rude or dismissive of their hard work. Instead, ask if you missed anything that was not included on the course schedule.
Creating an appropriate tone can feel overwhelming. We know that all emails should be polite, and emails to your instructor may be more formal or professional. Not all instructors will expect formal emails, but it’s important to remember that your instructor is not your friend and that an email or message is not a text message.
Sample Email to an Instructor
Subject: English 1110 Section 102: Absence
Dear/Hello Professor [Last name],
I was unable to attend class today, so I wanted to ask if there are any handouts or additional assignments I should complete before we meet on Thursday? I did review the syllabus and course outline, and I will complete the quiz and reading homework listed there.
Many thanks,
[First name] [Last name]
Communication on Public Discussion Boards
Whenever you are being asked to communicate or post in a discussion forum or other communication mode, you need to ask yourself if there will be one recipient or several. In other words, who will be your readers? Is the forum private, so that only your instructor, only a group of classmates, or only a specific classmate can see it, or is it public, so that everyone, including all of your classmates and your instructor, can see your post? Check the forum to which you are posting for these settings. The discussion board is a public forum, so you might have a broad audience. Create a post according to the recipient(s).
It is nice to address a classmate by name if you are responding to a specific person in a discussion forum.


    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

Summary: This manuscript reports the identification of putative orthologues of mitochondrial contact site and cristae organizing system (MICOS) proteins in Plasmodium falciparum - an organism that unusually shows an acristate mitochondrion during the asexual part of its life cycle, which then develops cristae as the parasite enters the sexual stage of its life cycle and beyond into the mosquito. The authors identify PfMIC60 and PfMIC19 as putative members and study these in detail. The authors add HA tags to both proteins, look for timing of expression during the parasite life cycle, and attempt (unsuccessfully) to localise them within the parasite. They also genetically deleted both genes singly and in combination and phenotyped the effect on parasite development. They show that both proteins are expressed in gametocytes and not asexuals, suggesting they are present at the same time as cristae development. They also show that the proteins are dispensable for the entire part of the parasite life cycle investigated (asexuals through to sporozoites); however, there is some reduction in mosquito transmission. Using EM techniques they show that the morphology of gametocyte mitochondria is abnormal in the knockout lines, although there is great variation.

Major comments: The manuscript is interesting and is an intriguing use of a well-studied organism of medical importance to answer fundamental biological questions. My main comments are that there should be greater detail in areas around methodology and the statistical tests used. Also, the mosquito transmission assays (which are notoriously difficult to perform) show substantial variation between replicates, and the statistical tests and data presentation are not clear enough to support the claimed reduction in transmission. Perhaps this could be improved with clearer text?

      We would like to thank the reviewer for taking the time to review our manuscript. We are happy to hear the reviewer thinks the manuscript is interesting and thank the reviewer for their constructive feedback.

      To clarify the statistical analyses used, we included a new supplementary dataset with all statistical analyses and p-values indicated per graph. Furthermore, figure legends now include the information on the exact statistical test used in each case.

Regarding mosquito experiments, while we indeed reported a reduction in transmission and oocyst numbers, we are aware that this effect might be due to the high variability in mosquito feeding assays. To highlight this point, we deleted the sentence "with the transmission reduction of [numbers]...." and included the sentence "The high variability encountered in the standard membrane feeding assays, though, partially obstructs a clear conclusion on the biological relevance of the observed reduction in oocyst numbers".

More specific comments to address: Line 101/Fig1E (and figure legend) - What is this heatmap showing? It would be helpful to have a sentence or two linking it to a specific methodology. I could not find details in the M+M section, and "specialized, high molecular mass gels" does not adequately explain what experiments were performed. The reference to Supplementary Information 1 also did not provide information.

      We added the information "high molecular mass gels with lower acrylamide percentage" to clarify methodology in the text. Furthermore, we extended the figure legend to include all relevant information. Further experimental details can be found in the study cited in this context, where the dataset originates from (Evers et al., 2021).

      Line 115 and Supplementary Figure 2C + D - The main text says that the transgenic parasites contained a mitochondrially localized mScarlet for visualization and localization, but in the supplementary figure 2 it shows mitotracker labelling rather than mScarlet. This is very confusing. The figure legend also mentions both mScarlet and MitoTracker. I assume that mScarlet was used to view in regular IFAs (Fig S2C) and the MitoTracker was used for the expansion microscopy (Fig S2D)? Please clarify.

      We thank the reviewer for pointing this out - this was indeed incorrectly annotated. We used the endogenous mito-mScarlet signal in IFA and mitoTracker in U-ExM. The figure annotation has now been corrected.

      Figure 2C - what is the statistical test being used (the methods say "Mean oocysts per midgut and statistical significance were calculated using a generalized linear mixed effect model with a random experiment effect under a negative binomial distribution." but what test is this?)?

The statistical test is now included in the materials and methods section with the sentence "The fitted model was used to obtain estimated means and contrasts, which were evaluated using Wald statistics". The test is now also mentioned in the figure legend.
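For context, the Wald statistic mentioned here is simply a fitted coefficient divided by its standard error, compared against the standard normal distribution. A minimal sketch; the estimate and standard error below are made up for illustration and are not values from the fitted model:

```python
from math import erf, sqrt

def wald_test(estimate, se):
    """Wald z-statistic and two-sided p-value for H0: coefficient = 0.

    z = estimate / SE; p is taken from the standard normal distribution.
    """
    z = estimate / se
    phi = 0.5 * (1 + erf(abs(z) / sqrt(2)))  # standard normal CDF at |z|
    return z, 2 * (1 - phi)

# Hypothetical contrast from a fitted model: a log fold change of -0.9
# with a standard error of 0.45.
z, p = wald_test(-0.9, 0.45)
```

In practice, software fitting a generalized linear mixed model reports these z-values and p-values directly for each contrast; the sketch only shows what the reported numbers mean.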

Also, the choice of a log10 scale for oocyst intensity is unusual - how are the mosquitoes with 0 oocysts being represented on this graph? It looks like they are being plotted at 10^-1 (which would be 0.1 oocysts in a mosquito, which would be impossible).

As the data spans three orders of magnitude with low values being biologically meaningful, we decided that a log scale would best facilitate readability of the graph. As the 0 values are also important to show, we went with a standard approach to handling 0s in log-transformed data and substituted the 0s with a small value (0.001). We apologize for not mentioning this transformation in the manuscript. To make this transformation transparent, we added a break at the lower end of the log-scaled y-axis and relabelled the lowest tick as '0'. This ensures that mosquitoes with zero oocysts are shown along the x-axis without being assigned an artificial value on the log scale. We would furthermore like to highlight that for statistics we used the true value 0 and not 0.001.
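The zero-handling described here amounts to a small pre-plotting transform. A minimal sketch, with a hypothetical function name and example counts; as the response stresses, statistics should be run on the raw zeros, not the substituted values:

```python
def floor_zeros_for_log(counts, floor=0.001):
    """Replace zero counts with a small positive floor so every mosquito
    can be drawn on a log-scaled axis; statistics should still use the
    raw counts with their true zeros."""
    return [floor if c == 0 else c for c in counts]

# Hypothetical oocyst counts per mosquito midgut.
raw = [0, 3, 0, 27, 150, 0, 812]
plotted = floor_zeros_for_log(raw)  # zeros become 0.001 for display only
```

A broken axis with the lowest tick relabelled '0', as the authors describe, then keeps the display honest about what the floored points represent.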

Figure 2D - it is great that the data from all feeding replicates has been shared; however, it is difficult to conclude any meaningful impact on transmission with the knockout lines when there is so much variation and so few mosquitoes dissected for some datapoints (10 mosquitoes is a very small sample size). For example, Exp1 shows a clear decrease in mic19- transmission, but then Exp2 does not show as great an effect. Similarly, why does the double knockout have better transmission than the single knockouts? Surely there would be a greater effect?

We agree with the reviewer and hope that the new sentence added, as per the major point above, clarifies this. Note that original Figure 2D has been moved to the supplementary information, as per a minor comment of another reviewer.

      Figure 3 legend - Please add which statistical test was used and the number of replicates.

Done.

      Figure 4 legend - Please add which statistical test was used and the number of replicates.

      Done. Regarding replicates, note that while we measured over 100 cristae from over 30 mitochondria, these all stem from the same parasite culture.

      Figure 5C - the 3D reconstructions are very nice, but what does the red and yellow coloring show?

      Indeed, the information was missing. We added it to the figure legend.

      Line 352 - "Still, it is striking that, despite the pronounced morphological phenotype, and the possibly high mitochondrial stress levels, the parasites appeared mostly unaffected in life cycle propagation, raising questions about the functional relevance of mitochondria at these stages." How do the authors reconcile this statement with the proven fact that mitochondria-targeted antimalarials (such as atovaquone) are very potent inhibitors of parasite mosquito transmission?

      Our original sentence was reductive. What we wanted to state was related to the functional relevance of crista architecture and overall mitochondrial morphology rather than the general functional relevance of the mitochondria. We changed the sentence accordingly.

Furthermore, even though we do not discuss this in the article, we are aware of mitochondria-targeting drugs that are known to block mosquito transmission. We want to point out that it is difficult to distinguish the disruption of the ETC, and therefore an impact on energy conversion, from the impact on the essential pathway of pyrimidine synthesis, which is highly relevant in microgamete formation. Still, a recent paper from Sparkes et al. 2024 showed the essentiality of mitochondrial ATP synthesis during gametogenesis, so it is very likely that mitochondrial energy conversion is highly relevant for transmission to the mosquito.

      Reviewer #1 (Significance (Required)):

      This manuscript is a novel approach to studying mitochondrial biology and does open a lot of unanswered questions for further research directions. Currently there are limitations in the use of statistical tests and detail of methodology, but these could be easily be addressed with a bit more analysis/better explanation in the text. This manuscript could be of interest to readers with a general interest in mitochondrial cell biology and those within the specific field of Plasmodium research. My expertise is in Plasmodium cell biology.

      We thank the reviewer for the praise.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

Major comments: 1) In my opinion, the authors tend to sensationalize or overinterpret their results. The title of the manuscript is very misleading. While MICOS is certainly important for crista formation, it is not the only factor, as ATP synthase dimer rows make a highly significant contribution to crista morphology. Thus, one can argue with equal validity that ATP synthase should be considered the 'architect', as it is the conformation of the dimers and rows that modulates positive curvature. Secondly, while cristae are still formed upon mic60/mic19 gene knockout (KO), they are severely deformed, and likely dysfunctional (see below). Thus, I do not agree with the title that MICOS is dispensable for crista formation, because the authors' results show that it clearly is essential. So, the title should be changed.

      We thank the reviewer for taking the time to review our manuscript.

Based on the reviewer's interpretation, we conclude that the title does not come across as intended. We have changed the title to: "The role of MICOS in organizing mitochondrial cristae in malaria parasites"

The Discussion section starting from line 373 also suffers from overinterpretation as well as being repetitive and hard to understand. The authors infer that MICOS stability is compromised less in the single KOs (sKOs) than in the mic60/mic19 double KO (dKO). MICOS stability was never directly addressed here and the composition of the MICOS complex is unaddressed, so it does not make sense to speculate on such tenuous connections. The data suggest to me that mic60 and mic19 are equally important for crista formation and crista junction (CJ) stabilization, and the dKO has a more severe phenotype than either sKO, further demonstrating that neither is epistatic.

      We do agree with the reviewer's notion that we did not address complex stability, and our wording did not make this sufficiently clear. We shortened and rephrased the paragraph in question.

The following paragraphs (lines 387 to 422) continue with such unnecessary overinterpretation to the point that it is confusing and contradictory. Line 387 mentions an 'almost complete loss of CJs' and then line 411 mentions an increase in CJ diameter, both upon Mic60 ablation. I do not think this discussion brings any added value to the manuscript and it should be shortened. Yes, maybe there are other putative MICOS subunits that may linger in the KOs that are further destabilized in the dKO, or maybe Mic60 remains in the mic19 KO (and vice versa) to somehow salvage more CJs, which is not possible in the dKO. It is impossible to say with confidence how ATP synthase behaves in the KOs with the current data.

      We shortened this paragraph.

2) While the authors went to impressive lengths to detect any effect on lifecycle progression, none was found except for a reduction in oocyst count. However, the authors did not address any direct effect on mitochondria, such as OXPHOS complex assembly, respiration, or membrane potential. This seems like a missed opportunity, given the team's previous and very nice work mapping these complexes by complexome profiling. However, I think there are some experiments the authors can still do to address any mitochondrial defects using what they have and not resorting to complexome profiling (although this would be definitive if it is feasible):

      i) Quantification of MitoTracker Red staining in WT and KOs. The authors used this dye to visualize mitochondria to assay their gross morphology, but unfortunately not to assay membrane potential in the mutants. The authors can compare relative intensities of the different mitochondria types they categorized in Fig. 3A in 20-30 cells to determine if membrane potential is affected when the cristae are deformed in the mutants. One would predict they are affected.

Interesting suggestion. As our staining and imaging conditions are suitable for such an analysis (as demonstrated by Sarazin et al., 2025, https://www.biorxiv.org/content/10.1101/2025.11.27.690934v1), we performed the measurements on the same dataset collected for Figure 3. We did not, however, detect any difference in MitoTracker intensity between the different lines. The result of this analysis is included in the new version of Supplementary Figure S6.

      ii) Sporozoites are shown in Fig S5. The authors can use the same set up to track their motion, with the hypothesis that they will be slower in the mutants compared to WT due to less ATP. This assumes that sporozoite mitochondria are active as in gametocytes.

      While theoretically plausible and informative, we currently do not know the relevance of mitochondrial energy conversion for general sporozoite biology or specifically features of sporozoite movement. Given the required resources and time to set this experiment up and the uncertainty whether it is a relevant proxy for mitochondrial functioning, we argue it is out of scope for this manuscript.

      iii) Shotgun proteomics to compare protein levels in mutants compared to WT, with the hypothesis that OXPHOS complex subunits will be destabilized in the mutants with deformed cristae. This could be indirect evidence that OXPHOS assembly is affected, resulting in destabilized subunits that fail to incorporate into their respective complexes.

While this experiment could potentially further our understanding of the interaction between MICOS and the levels of OXPHOS complex subunits, we argue that the indirect nature of the evidence does not justify the required investment.

To expedite resubmission, the authors can restrict the cell lines to WT and the dKO, as the latter has a stronger phenotype than the individual KOs, and conclusions from this cell line are valid for overall conclusions about Plasmodium MICOS.

I will also note that complexome/shotgun proteomics may also be a useful tool for identifying other putative MICOS subunits, by determining whether proteins sharing the same complexome profile as PfMic60 and Mic19 are affected. This would address the overinterpretation problem of point 1.

3) I am aware of the authors' previous work in which they were not able to detect cristae in ABS and thus concluded that these are truly acristate. This may very well be true, or there could be immature crista forms that evaded detection at the resolution used in their volumetric EM acquisitions. The mitochondria and gametocyte cristae are pretty small anyway, so it is not unreasonable to assume that putative rudimentary cristae in ABS may be smaller still. The minute levels of assembled complex III and IV plus complex V dimers in ABS that the authors previously detected by complexome profiling would argue for the presence of minuscule and/or very few cristae.

I think that the authors should hedge their claim that ABS are acristate by briefly stating that there is still a possibility that minuscule cristae may have been overlooked previously.

      We acknowledge that we cannot demonstrate the absolute absence of any membrane irregularities along the inner mitochondrial membrane. At the same time, if such structures were present, they would be extremely small and unlikely to contain the full set of proteins characteristic of mature cristae. For this reason, we consider it appropriate to classify ABS mitochondria as acristate. To reflect the reviewer's point while maintaining clarity for readers, we have slightly adjusted our wording in the manuscript, changing 'fully acristate' to 'acristate'.

      This brings me to the claim that Mic19 and Mic60 proteins are not expressed in ABS. This is based on the lack of signal from the epitope tag; a weak signal is detected in gametocytes. Thus, one can counter that Mic19 and Mic60 are also expressed, but below the expression limits of the assay, as the protein exhibits low expression levels when mitochondrial activity is upregulated.

We agree with the reviewer that the absence of a detectable epitope-tag signal does not definitively exclude low-level expression, and we have therefore replaced the term 'absent' with 'undetectable' throughout the manuscript. In the context of previous findings of low-level transcripts of the proteins in studies by López-Barragán et al. and Otto et al., we also added the sentence "The apparent absence could indicate that transcripts are not translated in ABS or that the proteins' expression was below detection limits of western blot analysis." to the discussion. At the same time, we would like to clarify that transcript levels for both genes fall within the

To address this point, the authors should determine whether mature mic60 and mic19 mRNAs are detected in ABS in comparison to the dKO, which will lack either transcript. RT-qPCR using polyT primers can be employed to detect these transcripts. If the levels of these mRNAs in WT ABS are equivalent to the dKO, the authors can make a pretty strong case for the absence of cristae in ABS.

      We appreciate the reviewer's suggestion. As noted in the Discussion, existing transcriptomic datasets already show detectable MIC19 and MIC60 mRNAs in ABS. For this reason, we expect RT-qPCR to reveal low (but not absent) levels of both transcripts, unlike the true loss expected to be observed in the dKO. Because such residual signals have been reported previously and their biological relevance remains uncertain, we do not believe transcript levels alone can serve as a definitive indicator of cristae absence in ABS.

      They should highlight the twin CX9C motifs that are a hallmark of Mic19 and other proteins that undergo oxidative folding via the MIA pathway. Interestingly, the Mia40 oxidoreductase that is central to MIA in yeast and animals, is absent in apicomplexans (DOI: 10.1080/19420889.2015.1094593).

Searching for the CX9C motifs is a valuable suggestion. In response, we analysed the conservation of the motif in PfMIC19 and included this in a new figure panel (Figure 1F).

      Did the authors try to align Plasmodium Mic19 orthologs with conventional Mic19s? This may reveal some conserved residues within and outside of the CHCH domain.

In response to this comment we made Figure 1F, where we show conserved residues within the CHCH domains of a broad range of MIC19-annotated sequences across the opisthokonts, and show that the Cx9C motifs are also conserved in PfMIC19. Outside the CHCH domain, we did not find any meaningful conservation, as PfMIC19 diverges heavily from opisthokont MIC19.

5) Statistical significance. Sometimes my eyes see population differences that are considered insignificant by the statistical methods employed by the authors, e.g. Fig. 4E, mutants compared to WT, especially the dKO. Have the authors considered using other methods, such as Student's t-test for pairwise comparisons?

The graphs in Figures 3, 4 and 5 have been reworked: they are now shown on a linear scale as violin plots (also following a suggestion further down in the reviewer's comments). We believe this improves interpretability. ANOVA was kept as the statistical test to ensure correction for multiple comparisons, which cannot be performed with a standard t-test. A full overview of the statistics and exact p-values can be found in the newly added Supplementary Information 2.
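For readers unfamiliar with the rationale, the difference between repeated pairwise t-tests and ANOVA followed by a multiple-comparison correction can be sketched as follows. This is an illustrative example only, with hypothetical group names and simulated values, not the analysis pipeline used in the manuscript:

```python
# Illustrative only: simulated data and hypothetical group labels,
# not the manuscript's actual measurements.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
groups = {
    "WT": rng.normal(1.0, 0.2, 30),
    "mic19-KO": rng.normal(0.95, 0.2, 30),
    "mic60-KO": rng.normal(0.95, 0.2, 30),
    "dKO": rng.normal(0.85, 0.2, 30),
}

# One-way ANOVA tests all groups in a single model, so the family-wise
# error rate is controlled from the start.
f_stat, p_anova = stats.f_oneway(*groups.values())

# Running all pairwise Student's t-tests instead performs 6 separate
# tests; without correction, the chance of at least one false positive
# is inflated well above the nominal 5%.
names = list(groups)
raw_p = [
    stats.ttest_ind(groups[a], groups[b]).pvalue
    for i, a in enumerate(names)
    for b in names[i + 1:]
]

# A Bonferroni correction restores the family-wise error rate,
# at the cost of statistical power.
bonf_p = [min(p * len(raw_p), 1.0) for p in raw_p]
```

This is why a pairwise t-test that looks "significant" on its own can legitimately fall below the significance threshold once the correction for multiple comparisons is applied.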

Minor comments:

Line 33: Anaerobes (e.g. Giardia) have mitochondria that do not produce ATP, unlike aerobic mitochondria.

We acknowledge that producing ATP via OXPHOS is not a characteristic of all mitochondria-like organelles (e.g. mitosomes), which is why these are typically classified separately from canonical mitochondria. Setting mitochondria-like organelles aside, energy conversion is the function the mitochondrion is best known for and the one associated with cristae.

      Line 56: Unclear what authors mean by "canonical model of mitochondria"

      To clarify we changed this to "yeast or human" model of mitochondria.

      Lines 75-76: This applies to Mic10 only

      We removed the "high degree of conservation in other cristate eukaryotes" statement.

      Line 80: Cite DOI: 10.1016/j.cub.2020.02.053

      Done

Fig 2D: I find this table difficult to read. If the authors keep the table format, at least remove the 'mean' column, as these data are better depicted in 2C. I suggest depicting these data as in 3B, showing the portion of infected vs. uninfected flies in all experiments, and moving the modified table to the supplement. It is important to point out that experiment 5 appears to be an outlier with reduced infectivity across all cell lines, including WT.

To clarify: the mean reported in the table indicates the mean per replicate, while the mean reported in Figure 2C is the overall mean for a given genotype, which corrects for variability within experiments. We agree that moving the table to the supplementary data is a good idea. We decided not to include a graph of infected and non-infected mosquitoes, as this information could be misleading, highlighting a phenotype that we argue is strongly influenced by the inter-experiment variability.

      Fig. 3C-G: I feel like these data repeatedly lead to same conclusions. These are all different ways of showing what is depicted in Fig 2B: mitochondria gross morphology is affected upon ablation of MICOS. I suggest that these graphs be moved to supplement and replaced by the beautiful images.

Thank you for the nice comment on our images. We have now moved some of the graphs to Supplementary Figure 6 and kept only the relative frequency, sphericity and total mitochondrial volume per cell in the main figure.

      Line 180: Be more specific with which tubulin isoform is used as a male marker and state why this marker was used in supplemental Fig S6.

      We have now specified the exact tubulin isoform used as the male gametocyte marker, both in the main text and in Supplementary Fig. S6. This is a commercial antibody previously known to work as an effective male marker, which is why we selected it for this experiment. This is now clearly stated in the manuscript.

      Line 196 and Fig 3C: the word 'intensities' in this context is very ambiguous. Please choose a different term (puncta, elements, parts?). This is related to major point 2i above.

To clarify the biological effect that can be concluded from this measurement, we added an explanation in the respective section of the Results, and we decided to replace the raw results of the plug-in readout with the deduced relative dispersion.

      Line 222: Report male/female crista measurements

We added Supplementary Information 2, which contains the exact statistical tests and outcomes for all presented quantifications, as well as a per-sex statistical analysis of the data from Figure 4. Correspondingly, we extended Supplementary Information 2 with a per-sex colour code for the thin-section TEM data.

      Fig. 4B-E: depict data as violin plots or scatter plots like Fig. 2C to get a better grasp of how the crista coverage is distributed. It seems like the data spread is wider in the double KO. This would also solve the problem with the standard deviation extending beyond 0%.

      We changed this accordingly.

      Lines 331-333: Please clarify that this applies for some, but not all MICOS subunits. Please also see major point 1 above. Also, the authors should point out that despite their structural divergence, trypanosomal cryptic mitofilins Mic34 and Mic40 are essential for parasite growth, in contrast to their findings with PfMic60 (DOI: https://doi.org/10.1101/2025.01.31.635831).

      This has been changed accordingly.

Line 320: incorrect citation. Related to point 1 above.

      Correct citation is now included in the text.

      Lines 333-335. This is related to the above. Again, some subunits appear to affect cell growth under lab conditions, and some do not. This and the previous sentence should be rewritten to reflect this.

      This has been changed accordingly.

Line 343-345: The sentence and citation 45 are strange. The former is about CHCHD10, whose status as a bona fide MICOS subunit is very tenuous, so I would omit this. About the phenomenon observed, I think it makes more sense to write that Mic60 ablation results in partially fragmented mitochondria in yeast (Rabl et al., 2009, J Cell Biol. 185: 1047-63). Fragmented mitochondria are often a physiological response to stress. I would rewrite this so as not to imply that mitochondrial fission (or fusion) is impaired in these KOs, or at least present this as one of several possibilities.

The sentence has been replaced following the reviewer's indication, though we still include the human cell data, as this has also been shown in Stephens et al. 2020.

      Line 373: 'This indicates' is too strong. I would say 'may suggest' as you have no proof that any of the KOs disrupts MICOS. This hypothesis can be tested by other means, but not by penetrance of a phenotype.

      Done

Line 376-377: 'deplete functionality' does not make sense, especially in the context of talking about MICOS subunit stability. In my opinion, this paragraph overinterprets the KO effects on MICOS stability. None of the experiments address this phenomenon, and thus the authors should not try to interpret their results in this context. See major point 1.

Other suggestions for added value:

      We removed the sentence. Also, the entire paragraph has been shortened, restructured and wording was changed to address major point 1.

1) Does Plasmodium Sam50 co-fractionate with Mic60 and Mic19 in BN PAGE (Fig. 1E)?

While we did identify SAMM50 in our BN PAGE, the protein does not co-migrate with the MICOS components but instead comigrates with other components of a putative sorting and assembly machinery (SAM) complex. As SAMM50, the SAM complex and the overarching putative mitochondrial intermembrane space bridging (MIB) complex are not mentioned in the manuscript, we decided not to include this information in the figure.

      Reviewer #2 (Significance (Required)):

      The manuscript by Tassan-Lugrezin is predicated on the idea that Plasmodium represents the only system in which de novo crista formation can be studied. They leverage this system to ask the question whether MICOS is essential for this process. They conclude based on their data that the answer is no, which the authors consider unprecedented. But even if their claim is true that ABS is acristate, this supposed advantage does not really bring any meaningful insight into how MICOS works in Plasmodium.

First, the positives of this manuscript. As is typical of this research team, the experimental approaches are very sophisticated. The highlights are the beautiful and often conclusive microscopy performed by the authors. Only the localization of Mic60 and Mic19 was, unfortunately, inconclusive due to their very low expression.

The examination of the MICOS mutants during the in vitro life cycle of Plasmodium falciparum is extremely impressive and yields convincing results. Mitochondrial deformation is tolerated by life cycle stage differentiation, with a modest but significant reduction of oocyst production being observed.

      However, despite the herculean efforts of the authors, the manuscript as it currently stands represents only a minor advance in our understanding of the evolution of MICOS, which from the title and focus of the manuscript, is the main goal of the authors. In its current form, the manuscript reports some potentially important findings:

      1) Mic60 is verified to play a role in crista formation, as is predicted by its orthology to other characterized Mic60 orthologs.

      2) The discovery of a novel Mic19 analog (since the authors maintain there is no significant sequence homology), which exhibits a similar (or the same?) complexome profile with Mic60. This protein was upregulated in gametocytes like Mic60 and phenocopies Mic60 KO.

      3) Both of these MICOS subunits are essential (not dispensable) for proper crista formation

4) Surprisingly, neither MICOS subunit is essential for in vitro growth or for differentiation from ABS to sexual stages, and from the latter to sporozoites. This says more about the biology of Plasmodium itself than about the essentiality of Mic60, i.e. Plasmodium life cycle progression tolerates defects in mitochondrial morphology. But yes, I agree with the authors that Mic60's apparent insignificance for cell growth under the examined conditions does contrast with its essentiality in other eukaryotes. However, fitness costs were not assayed (e.g. by competition between mutants and WT in infection of mosquitoes).

5) Decreased fitness of the mutants is implied by a reduction of oocyst formation.

      While interesting in their own way, collectively they do not represent a major advance in our understanding of MICOS evolution. Furthermore, the findings bifurcate into categories informing MICOS or Plasmodium biology. Both aspects are somewhat underdeveloped in their current form.

This is unfortunate because there seem to be many missed opportunities in the manuscript that could, with additional experiments, lead to a manuscript with much wider impact. For me, what is remarkable about Plasmodium MICOS, and what sets it apart from other iterations, is the apparent absence of the Mic10 subunit. Purification of Plasmodium MICOS via the epitope-tagged Mic60 and Mic19 could have verified that MICOS assembles without this core subunit. Perhaps Mic60 and Mic19 are the vestiges of the complex and thus operate alone in shaping cristae. Such a reduction may also suggest a declining importance of mitochondria in Plasmodium.

Another missed opportunity was to assay the impact of MICOS depletion on OXPHOS in Plasmodium. This is a salient issue, as crista morphology may be decoupled from OXPHOS capacity in Plasmodium, which links to the apparent tolerance of altered mitochondrial morphology during cell growth and differentiation. I suggested experiments in section A to address this deficit.

Finally, the authors could assay the fitness costs of MICOS ablation and the associated phenotypes by testing whether mosquito infectivity is reduced in the mutants when they compete directly with WT Plasmodium. Like the authors, I am also surprised that MICOS mutants can pass the population bottlenecks represented by differentiation events. Perhaps this apparent robustness of differentiation contributes to Plasmodium's remarkable ability to adapt.

I realize that the authors put a lot of effort into their study and, again, I am very impressed by the sophistication of the methods employed. Nevertheless, I think there are still better ways to increase the impact of the study than overinterpreting the conclusions from the data. But this would require more experiments along the lines I suggest in Section A and here.

      We thank the reviewer for their extensive analysis of the significance of our findings, including the compliments on our microscopy images and the sophisticated experimental approaches. We hope we have convincingly argued why we could or could not include some of the additional analyses suggested by the reviewer in section 1 above.

With regard to the significance statement, we want to point out that our finding that PfMICOS is not needed for the initial formation of cristae (as opposed to their organization) confirms something that has been assumed by the field without being the actual focus of any study. We argue that the distinction between formation and organization of cristae is important and deserves attention within the manuscript, and that the finding that MICOS is not involved in the initial formation of cristae is relevant to Plasmodium biology and beyond. As for insights into how MICOS works in Plasmodium, we have confirmed that the previously annotated PfMIC60 is indeed involved in the organization of cristae. Furthermore, we have identified and characterized PfMIC19. These findings, we argue, are indeed meaningful insights into PfMICOS.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary:

MICOS is a conserved mitochondrial protein complex responsible for organising the mitochondrial inner membrane and the maintenance of cristae junctions. This study sheds first light on the role of two MICOS subunits (Mic60 and the newly annotated Mic19) in the malaria parasite Plasmodium falciparum, which forms cristae de novo during sexual development, as demonstrated by thin-section EM and electron tomography. By generating knockout lines (including a double knockout), the authors demonstrate that knockout of both MICOS subunits leads to defects in cristae morphology and a partial loss of cristae junctions. With a formidable set of parasitological assays, the authors show that despite the metabolically important role of mitochondria for gametocytes, the knockout lines can progress through the life stages and form sporozoites, albeit with diminished infection efficiency.

      We thank the reviewer for their time and compliment.

      Major comments:

1) The authors should improve the presentation of their findings in the right context, in particular by:

      (i) giving a clearer description in the introduction of what is already known about the role of MICOS. This starts in the introduction, where one main finding is missing: loss of MICOS leads to loss of cristae junctions and the detachment of cristae membranes, which are nevertheless formed, but become membrane vesicles. This needs to be clearly stated in the introduction to allow the reader to understand the consistency of the authors' findings in P. falciparum with previous reports in the literature.

      We extended the introduction to include this information.

      (ii) at the end to the introduction, the motivating hypothesis is formulated ad hoc "conclusive evidence about its involvement in the initial formation of cristae is still lacking" (line 83). If there is evidence in the literature that MICOS is strictly required for cristae formation in any organism, then this should be explained, because the bona fide role of MICOS is maintenance of cristae junctions (the hypothesis is still plausible and its testing important).

      To clarify we rephrased the sentence to: "Although MICOS has been described as an organizer of crista junctions, its role during the initial formation of nascent cristae has not been investigated."

2) Line 96-97: "Interestingly, PfMIC60 is much larger than the human MICOS counterpart, with a large, poorly predicted N-terminal extension." This statement lacks a reference and presumably refers to annotated ORFs. The authors should clarify whether the true N-terminus is definitely known - a 120 kDa size is shown for the P. falciparum protein, but this is not compared to the expected length or to the size in S. cerevisiae.

To solve the reference issue, we added the UniProt IDs we compared to show that the annotated ORF is larger in Plasmodium. We also changed the comparison to yeast instead of human, because we realized it is confusing to compare to yeast throughout the figure but then refer to human in this specific sentence.

Regarding whether the true N-terminus is known: the short answer is no, not exactly.

      However, we do know that the Pf version is about double the size of the yeast protein.

As the reviewer correctly states, we show a size of 120 kDa for the tagged protein in Figure 1G. Considering that we tagged the protein C-terminally and observed a 120 kDa product on the western blot, it is safe to conclude that the true N-terminus does not deviate massively from the annotated ORF and, hence, that there is a considerable extension of the protein beyond a 60 kDa protein. We do not directly compare to yeast MIC60 on our western blots; however, that comparison can be drawn from the literature: Tarasenko et al. (2017) showed that purified MIC60 running at ~60 kDa on SDS-PAGE actively bends membranes, suggesting that, in its active form, the yeast MIC60 monomer is indeed 60 kDa in size.

To clarify, we now emphasize that we ran the AlphaFold prediction on the annotated open reading frame (annotated and sequenced by Böhme et al. and Chappell et al., now cited in the manuscript), and we revised the wording to make clear what is being compared in each sentence.

3) lines 244-245: "Furthermore, our data indicates the effect size increases with simultaneous ablation of both proteins?". The authors should explain which data they are referring to, as some of the data in Figs 3 and 4 look similar and all significance tests relate to the wild type, not to comparisons between the different mutants, so it is not clear if any observed differences are significant. The authors repeat this claim in the discussion in lines 368-369 without referring to a specific significance test. This needs to be clarified.

In response to this and other reviewer comments, we added multiple-comparison testing across all samples. In addition, to clarify the statistics used, we included a supplementary dataset with all p-values and statistical tests.

      4) lines 304-306: "Though well established as the cristae organizing system, the role of MICOS in initial formation of cristae remains hidden in model organisms that constitutively display cristae.". This sentence is misleading since even in organisms that display numerous cristae throughout their life cycle, new cristae are being formed as the cells proliferate. Thus, failure to produce cristae in MICOS knockout lines would have been observable but has apparently not been reported in the literature. Thus, the concerted process in P. falciparum makes it a great model organism, but not fundamentally different to what has been studied before in other organisms.

      We deleted this statement.

      5) lines 373-378. "where ablation of just MIC60 is sufficient to deplete functionality of the entire MICOS (11, 15),". The authors' claim appears to be contrary to what is actually stated in ref 15, which they cite:

      "MICOS subunits have non-redundant functions as the absence of both MICOS subcomplexes results in more severe morphological and respiratory growth defects than deletion of single MICOS subunits or subcomplexes."

      This seems in line with what the authors show, rather than "different".

      This sentence has been removed.

      6) lines 380-385: "... thus suggesting that membrane invaginations still arise, but are not properly arranged in these knockout lines. This suggests that MICOS either isn't fully depleted,...". These conclusions are incompatible with findings from ref. 15, which the authors cite. In that study, the authors generated a ∆MICOS line which still forms membrane invaginations, showing that MICOS is not required at all for this process in yeast. Hence the authors' implication that MICOS needs to be fully depleted before membrane invaginations cease to occur is not supported by the literature.

      This sentence has been deleted in the revised version of the manuscript.

      Minor comments:

      7) The authors should consider if the first part of their title could be seen as misleading: It suggests that MICOS is "the architect" in cristae formation, but this is not consistent with the literature nor their own findings.

The title has been changed accordingly.


      • Line 43, of the three seminal papers describing the discovery of MICOS in 2011, the authors only cite two (refs 6 and 7), but miss the third paper, Hoppins et al, PMID: 21987634, which should probably be corrected.

      Done, the paper is now cited

• Page 2, line 58: for a more complete picture, the authors should also cite work by others here showing that, although at very low levels, complex III (a drug target) and ATP synthase do assemble (Nina et al., 2011, JBC).

      Done

      • Page 3, line 80: "Irrespective of the shape of an organism's cristae, the crista junctions have been described as tubular channels that connect the cristae membrane to the inner boundary membrane (22, 24)." This omits the slit-shaped cristae junctions found in yeast (Davies et al, 2011, PNAS), which the authors should include.

      The paper and concept have been added to the manuscript, though the sentence has been moved up in the introduction, when crista junctions are first introduced.

      • Line 97: "poorly predicted N-terminal extension", as there is no experimental structure, we don't know if the prediction is poor. Presumably the authors mean either poorly ordered or the absence of secondary structure elements, or the poor confidence score for that region in the prediction? This should be clarified or corrected.

      We were referring to the poor confidence score. To address this comment as well as major point 2, we rewrote the respective paragraph. It now clearly states that confidence of the prediction is low, and we mention the tool that was used to identify conserved domains (Topology-based Evolutionary Domains).

• Line 98: "an antiparallel array of ten β-sheets". They are actually two parallel beta-sheets stacked together. The authors could find out the name of this fold, but the confidence of the prediction is marked as low/very low. So its existence is unknown, not just its "function".

      We adapted the domain description to "a stack of two parallel beta-sheets" and replaced the statement on unknown function by the statement "Because this domain is predicted solely from computational analysis, both its actual existence in the native protein and its biological function remain unknown."

• Fig 1B: The authors show AlphaFold predictions of the S. cerevisiae and P. falciparum Mic60 structures. There is, however, an experimental Mic60/Mic19 (fragment) structure from the former organism (PMID: 36044574), which should be included if possible.

      We appreciate the reviewer's suggestion and note that the available structural data indeed provides valuable insight into how MIC60 and MIC19 interact. However, these structures represent fusion constructs of limited protein fragments and therefore capture only a small portion of each protein, specifically the interaction interface. Because our aim in Fig. 1B is to compare the overall domain architecture of the full‑length proteins, we believe that including fragment‑based structures would be less informative in this context.

• Lines 318-321: "The same trend was observed for PfMIC19 and PfMIC60. Although transcriptomic data suggested that low-level transcripts of PfMIC19 and PfMIC60 are present in ABS (38), we did not detect either of the proteins in ABS by western blot analysis." While this statement is true, the authors should comment on the sensitivity of the respective methods - how well was the antibody working in their hands, and how do they interpret the absence of a WB band compared to the transcriptomics data?

      The HA antibody used in our experiments is a standard commercial reagent that performs reliably in both WB and IFA, although it shows a low background signal in gametocytes. We agree that the sensitivity of the method and the interpretation of weak or absent bands should be addressed explicitly. Transcript levels for both PfMIC19 and PfMIC60 in asexual blood stages fall within the

      • Lines 322-323: would the authors not typically have expected an IFA signal given the strength of the band in Western blot? If possible, the authors should comment if the negative fluorescence outcome can indeed be explained with the low abundance or if technical challenges are an equally good explanation.

      Considering the nature of the investigated proteins (embedded in the IMM and spread throughout the mitochondrion), difficulties in achieving a clear signal in IFA or U-ExM are not very surprising. While epitopes may remain buried in IFA, U-ExM usually increases antibody accessibility. However, U-ExM comes at the cost of being prone to punctate background signals, potentially masking low-abundance, naturally punctate signals such as those of MICOS proteins, which localize to distinct foci (at the CJs) along the mitochondrion. Current literature suggests that, in both human and yeast cells, STED is the preferred method for accurate spatial resolution of MICOS proteins (https://www.ncbi.nlm.nih.gov/pubmed/32567732, https://www.ncbi.nlm.nih.gov/pubmed/32067344). Unfortunately, we have neither experience with, nor access to, this particular technique.

      Lines 357-365: the authors describe limitations of the applied methods adequately. Perhaps it would be helpful to make a similar statement about the analysis of 3D objects like mitochondria and cristae from 2D sections. E.g. the apparent cristae length depends on whether cristae are straight (e.g. coiled structures do not display long cross sections despite their true length in 3D).

      The limitations of other methods are described in the respective results section.

      We added a clarifying sentence in the results section of Figure 4:

      "Note that such measurements do not indicate the true total length or width of cristae, as the data is two-dimensional. The recorded values are to be considered indicative of possible trends, rather than absolute dimensions of cristae."

      This statement refers to the length/width measurements of cristae.

      In the context of Figure 4D we mention the following (see preprint lines 229 - 230): "We expect this effect to translate into the third dimension and thus conclude that the mean crista volume increases with the loss of either PfMIC19, PfMIC60, or both."

      For Figure 5, we included a clarifying statement in the results section of the preprint (lines 269 - 273): "Note that these mitochondrial volumes are not full mitochondria, but large segments thereof. As a result of the incompleteness of the mitochondria within the section, and the tomography-specific artefact of the missing wedge, we were unable to confirm whether cristae were in fact fully detached from the boundary membrane, or just too long to fit within the observable z-range."

      Line 404: perhaps undetected or similar would be a better description than "hidden"?

      The sentence does not exist in the revised manuscript.

      Reviewer #3 (Significance (Required)):

      The main strength of the study is that it provides the first characterisation of the MICOS complex in P. falciparum, a human parasite in which the mitochondrion has been shown to be a drug target. Mic60 and the newly annotated Mic19 are confirmed to be essential for proper cristae formation and morphology, as well as overall mitochondrial morphology. Furthermore, the mutant lines are characterised for their ability to complete the parasite life cycle, and defects in infection efficiency are observed. This work is an important first step for deciphering the role of MICOS in the malaria parasite and the composition and function of this complex in this organism. The limitation of the study stems from what is already known about MICOS and its subunits in great detail in yeast and humans, with similar findings regarding loss of cristae and cristae defects. The findings of this study do not provide dramatic new insight on MICOS function or go substantially beyond the vast existing literature in terms of the extent of the study, which focuses on parasitological assays and morphological analysis. Exploring the role of MICOS in an early-divergent organism and human parasite is, however, important given the divergence found in mitochondrial biology, and P. falciparum is a uniquely suited model system. One aspect that would increase the impact of the paper would be if the authors could mechanistically link the observed morphological defects to the decreased infection efficiency, e.g. by probing effects on mitochondrial function. This will likely be challenging as the morphological defects are diverse and the fitness defects appear moderate/mild.

      As suggested by Reviewer 2, we examined mitochondrial membrane potential in gametocytes using MitoTracker staining and did not observe any obvious differences associated with the morphological defects. At present, additional assays to probe mitochondrial function in P. falciparum gametocytes are not sufficiently established, and developing and validating such methods would require substantial work before they could be applied to our mutant lines. For these reasons, a more detailed mechanistic link between the observed morphological changes and the reduced infection efficiency is currently beyond reach.

      The advance presented in this study is to pioneer the study of MICOS in P. falciparum, thus extending our understanding of the role of this complex to a different model organism. This study will likely be of interest mainly to specialised audiences such as basic-research parasitologists and mitochondrial biologists. My own field of expertise is mitochondrial biology and structural biology.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Major comments:

      1) In my opinion, the authors tend to sensationalize or overinterpret their results. The title of the manuscript is very misleading. While MICOS is certainly important for crista formation, it is not the only factor, as ATP synthase dimer rows make a highly significant contribution to crista morphology. Thus, one can argue with equal validity that ATP synthase should be considered the 'architect', as it is the conformation of the dimers and rows that modulates positive curvature. Secondly, while cristae are still formed upon mic60/mic19 gene knockout (KO), they are severely deformed and likely dysfunctional (see below). Thus, I do not agree with the title that MICOS is dispensable for crista formation, because the authors' results show that it clearly is essential. So, the title should be changed.

      The Discussion section starting from line 373 also suffers from overinterpretation, as well as being repetitive and hard to understand. The authors infer that MICOS stability is compromised less in the single KOs (sKO) compared to the mic60/mic19 double KO (dKO). MICOS stability was never directly addressed here and the composition of the MICOS complex is unaddressed, so it does not make sense to speculate on the basis of such tenuous connections. The data suggest to me that mic60 and mic19 are equally important for crista formation and crista junction (CJ) stabilization, and the dKO has a more severe phenotype than either KO, further demonstrating that neither is epistatic.

      The following paragraphs (lines 387 to 422) continue with such unnecessary overinterpretation to the point that it is confusing and contradictory. Line 387 mentions an 'almost complete loss of CJs' and then line 411 mentions an increase in CJ diameter, both upon Mic60 ablation. I do not think this discussion brings any added value to the manuscript and it should be shortened. Yes, maybe there are other putative MICOS subunits that linger in the KOs and are further destabilized in the dKO, or maybe Mic60 remains in the mic19 KO (and vice versa) to somehow salvage more CJs, which is not possible in the dKO. It is impossible to say with confidence how ATP synthase behaves in the KOs with the current data.

      2) While the authors went to impressive lengths to detect any effect on lifecycle progression, none was found except for a reduction in oocyst count. However, the authors did not address any direct effect on mitochondria, such as OXPHOS complex assembly, respiration, or membrane potential. This seems like a missed opportunity, given the team's previous and very nice work mapping these complexes by complexome profiling. However, I think there are some experiments the authors can still do to address any mitochondrial defects using what they have, without resorting to complexome profiling (although this would be definitive if it is feasible):

      i) Quantification of MitoTracker Red staining in WT and KOs. The authors used this dye to visualize mitochondria to assay their gross morphology, but unfortunately not to assay membrane potential in the mutants. The authors can compare relative intensities of the different mitochondria types they categorized in Fig. 3A in 20-30 cells to determine if membrane potential is affected when the cristae are deformed in the mutants. One would predict they are affected.

      ii) Sporozoites are shown in Fig S5. The authors can use the same setup to track their motion, with the hypothesis that they will be slower in the mutants compared to WT due to less ATP. This assumes that sporozoite mitochondria are as active as in gametocytes.

      iii) Shotgun proteomics to compare protein levels in mutants compared to WT, with the hypothesis that OXPHOS complex subunits will be destabilized in the mutants with deformed cristae. This could be indirect evidence that OXPHOS assembly is affected, resulting in destabilized subunits that fail to incorporate into their respective complexes.

      To expedite resubmission, the authors can restrict the cell lines to WT and the dKO, as the latter has a stronger phenotype than the individual KOs, and conclusions from this cell line are valid for overall conclusions about Plasmodium MICOS.

      I would also note that complexome/shotgun proteomics may be a useful tool for identifying other putative MICOS subunits, by determining whether proteins sharing the same complexome profile as PfMic60 and PfMic19 are affected. This would address the overinterpretation problem of point 1.

      3) I am aware of the authors' previous work in which they were not able to detect cristae in ABS, and thus have concluded that these are truly acristate. This can very well be true, or there can be immature cristae forms that evaded detection at the resolution they used in their volumetric EM acquisitions. The mitochondria and gametocyte cristae are pretty small anyway, so it is not unreasonable to assume that putative rudimentary cristae in ABS may be even smaller still. The minute levels of assembled complex III and IV plus complex V dimers in ABS that were previously detected by the authors by complexome profiling would argue for the presence of minuscule and/or very few cristae.

      I think that the authors should hedge their claim that ABS are acristate by briefly stating that there is still a possibility that minuscule cristae may have been overlooked previously.

      This brings me to the claim that the Mic19 and Mic60 proteins are not expressed in ABS. This is based on the lack of signal from the epitope tag; a weak signal is detected in gametocytes. Thus, one can counter that Mic19 and Mic60 are also expressed in ABS, but below the detection limits of the assay, given that the proteins exhibit low expression levels even when mitochondrial activity is upregulated.

      To address this point, the authors should determine whether mature mic60 and mic19 mRNAs are detected in ABS in comparison to the dKO, which will lack either transcript. RT-qPCR using poly(dT) primers can be employed to detect these transcripts. If the levels of these mRNAs in WT ABS are equivalent to those in the dKO, the authors can make a pretty strong case for the absence of cristae in ABS.
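      For reference, relative transcript abundance from RT-qPCR is commonly reported with the 2^-ΔΔCt (Livak) quantification; a minimal sketch of the arithmetic, using entirely hypothetical Ct values (the function and sample names are illustrative, not taken from the manuscript):

```python
# 2^-ddCt relative quantification (Livak method); all Ct values below are
# hypothetical and only illustrate the arithmetic, not real measurements.
def rel_expression(ct_target, ct_ref, ct_target_cal, ct_ref_cal):
    d_ct = ct_target - ct_ref              # normalize target to a reference gene
    d_ct_cal = ct_target_cal - ct_ref_cal  # same normalization for the calibrator
    return 2 ** -(d_ct - d_ct_cal)         # fold change relative to the calibrator

# e.g., a putative mic60 signal in WT ABS relative to a calibrator sample
fold = rel_expression(ct_target=30.0, ct_ref=22.0,
                      ct_target_cal=28.0, ct_ref_cal=22.0)
print(fold)  # 0.25, i.e., four-fold lower than the calibrator
```

      Comparing such normalized values between WT ABS and the dKO (which lacks the transcript and so defines the background) would indicate whether any ABS signal rises above noise.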

      4) The major finding of the manuscript, the identification of a Mic19 analog in Plasmodium, should be highlighted. As far as I know, this manuscript could represent the first instance of a Mic19 outside of opisthokonts, one that was not found by sensitive profile HMM searches, and certainly the first time such a Mic19 was functionally analyzed.

      They should highlight the twin CX9C motifs that are a hallmark of Mic19 and other proteins that undergo oxidative folding via the MIA pathway. Interestingly, the Mia40 oxidoreductase that is central to MIA in yeast and animals is absent in apicomplexans (DOI: 10.1080/19420889.2015.1094593).

      Did the authors try to align Plasmodium Mic19 orthologs with conventional Mic19s? This may reveal some conserved residues within and outside of the CHCH domain.

      5) Statistical significance. Sometimes my eye sees population differences that are considered insignificant by the statistical methods employed by the authors, e.g. Fig. 4E, mutants compared to WT, especially the dKO. Have the authors considered using other methods, such as Student's t-test for pairwise comparisons?
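      On the statistics point: Welch's t-test (e.g., scipy.stats.ttest_ind with equal_var=False) is one option for such pairwise comparisons, provided the p-values are corrected for multiple testing (e.g., Holm). A distribution-free alternative in the same spirit is a permutation test on the difference of group means, sketched here on made-up coverage values (all numbers hypothetical, not the authors' data):

```python
import random
from statistics import mean

random.seed(1)

def perm_test(a, b, n_perm=5000):
    """Two-sided permutation test on the difference of group means."""
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    count = 0
    for _ in range(n_perm):
        random.shuffle(pooled)
        pa, pb = pooled[:len(a)], pooled[len(a):]
        if abs(mean(pa) - mean(pb)) >= observed:
            count += 1
    # add-one correction keeps the p-value strictly positive
    return (count + 1) / (n_perm + 1)

# hypothetical crista-coverage values (% per cell) for WT and the dKO
wt  = [62, 58, 65, 60, 70, 63, 59, 66, 61, 64]
dko = [45, 50, 41, 48, 52, 44, 47, 43, 49, 46]
p = perm_test(wt, dko)  # small p: the groups are clearly separated
```

      When several mutant lines are each compared against WT, the resulting p-values should still be adjusted for the number of comparisons.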

      Minor comments:

      Line 33: Anaerobes (e.g. Giardia) have mitochondria-derived organelles that do not produce ATP, unlike aerobic mitochondria

      Line 56: Unclear what authors mean by "canonical model of mitochondria"

      Lines 75-76: This applies to Mic10 only

      Line 80: Cite DOI: 10.1016/j.cub.2020.02.053

      Fig 2D: I find this table difficult to read. If the authors keep the table format, they should at least remove the 'mean' column, as these data are better depicted in 2C. I suggest depicting these data as in 3B, showing the proportion of infected vs uninfected flies in all experiments, and moving the modified table to the supplement. It is important to point out that experiment 5 appears to be an outlier, with reduced infectivity across all cell lines, including WT.

      Fig. 3C-G: I feel like these data repeatedly lead to the same conclusions. These are all different ways of showing what is depicted in Fig 2B: mitochondrial gross morphology is affected upon ablation of MICOS. I suggest that these graphs be moved to the supplement and replaced by the beautiful images.

      Line 180: Be more specific about which tubulin isoform is used as a male marker, and state why this marker was used in supplemental Fig S6.

      Line 196 and Fig 3C: the word 'intensities' in this context is very ambiguous. Please choose a different term (puncta, elements, parts?). This is related to major point 2i above.

      Line 222: Report male/female crista measurements

      Fig. 4B-E: depict data as violin plots or scatter plots like Fig. 2C to get a better grasp of how the crista coverage is distributed. It seems like the data spread is wider in the double KO. This would also solve the problem with the standard deviation extending beyond 0%.

      Lines 331-333: Please clarify that this applies to some, but not all, MICOS subunits. Please also see major point 1 above. Also, the authors should point out that, despite their structural divergence, the trypanosomal cryptic mitofilins Mic34 and Mic40 are essential for parasite growth, in contrast to their findings with PfMic60 (DOI: https://doi.org/10.1101/2025.01.31.635831).

      Line 320: incorrect citation. Related to point 1 above.

      Lines 333-335: This is related to the above. Again, some subunits appear to affect cell growth under lab conditions, and some do not. This and the previous sentence should be rewritten to reflect this.

      Lines 343-345: The sentence and citation 45 are strange. Regarding the former, it is about CHCHD10, whose status as a bona fide MICOS subunit is very tenuous, so I would omit this. Regarding the phenomenon observed, I think it makes more sense to write that Mic60 ablation results in partially fragmented mitochondria in yeast (Rabl et al., 2009, J Cell Biol. 185: 1047-63). Fragmented mitochondria are often a physiological response to stress. I would rewrite so as not to imply that mitochondrial fission (or fusion) is impaired in these KOs, or at least make clear that this is only one of several possibilities.

      Line 373: 'This indicates' is too strong. I would say 'may suggest', as there is no proof that any of the KOs disrupts MICOS. This hypothesis can be tested by other means, but not by the penetrance of a phenotype.

      Lines 376-377: 'deplete functionality' does not make sense, especially in the context of MICOS subunit stability. In my opinion, this paragraph overinterprets the KO effects on MICOS stability. None of the experiments address this phenomenon, and thus the authors should not try to interpret their results in this context. See major point 1.

      Other suggestions for added value

      1) Does Plasmodium Sam50 co-fractionate with Mic60 and Mic19 in BN-PAGE (Fig. 1E)?

      2) Can AlphaFold3 predict a tetramer of PfMic60? What about the four Mic19 and Mic60 subunits together? Is this tetramer consistent with the Bock-Bierbaum model? Is this model consistent with the CJ diameter measured in Plasmodium, which would perhaps be better evidence than that in lines 419-422?

      Significance

      The manuscript by Tassan-Lugrezin is predicated on the idea that Plasmodium represents the only system in which de novo crista formation can be studied. They leverage this system to ask the question whether MICOS is essential for this process. They conclude based on their data that the answer is no, which the authors consider unprecedented. But even if their claim is true that ABS is acristate, this supposed advantage does not really bring any meaningful insight into how MICOS works in Plasmodium.

      First, the positives of this manuscript. As has been the case with this research team, the manuscript is very sophisticated in the experimental approaches employed. The highlights are the beautiful and often conclusive microscopy performed by the authors. Unfortunately, only the localization of Mic60 and Mic19 was inconclusive, due to their very low expression.

      The examination of the MICOS mutants during the in vitro life cycle of Plasmodium falciparum is extremely impressive and yields convincing results. Mitochondrial deformation is tolerated by life-cycle stage differentiation, with a modest but significant reduction of oocyst production being observed.

      However, despite the herculean efforts of the authors, the manuscript as it currently stands represents only a minor advance in our understanding of the evolution of MICOS, which from the title and focus of the manuscript, is the main goal of the authors.

      In its current form, the manuscript reports some potentially important findings:

      1) Mic60 is verified to play a role in crista formation, as is predicted by its orthology to other characterized Mic60 orthologs.

      2) The discovery of a novel Mic19 analog (since the authors maintain there is no significant sequence homology), which exhibits a similar (or the same?) complexome profile as Mic60. This protein was upregulated in gametocytes like Mic60, and its KO phenocopies the Mic60 KO.

      3) Both of these MICOS subunits are essential (not dispensable) for proper crista formation.

      4) Surprisingly, neither MICOS subunit is essential for in vitro growth or differentiation from ABS to sexual stages, and from the latter to sporozoites. This says more about the biology of Plasmodium itself than about the essentiality of Mic60, i.e., Plasmodium life-cycle progression tolerates defects in mitochondrial morphology. But yes, I agree with the authors that Mic60's apparent insignificance for cell growth under the examined conditions does differ from its essentiality in other eukaryotes. However, fitness costs were not assayed (e.g. by competition between mutants and WT in the infection of mosquitoes).

      5) Decreased fitness of the mutants is implied by a reduction of oocyst formation.

      While interesting in their own way, collectively they do not represent a major advance in our understanding of MICOS evolution. Furthermore, the findings bifurcate into categories informing MICOS or Plasmodium biology. Both aspects are somewhat underdeveloped in their current form.

      This is unfortunate because there seem to be many missed opportunities in the manuscript that could, with additional experiments, lead to a manuscript with much wider impact.

      For me, what is remarkable about Plasmodium MICOS, and what sets it apart from other iterations, is the apparent absence of the Mic10 subunit. Purification of Plasmodium MICOS via the epitope-tagged Mic60 and Mic19 could have verified that MICOS is assembled without this core subunit. Perhaps Mic60 and Mic19 are the vestiges of the complex and thus operate alone in shaping cristae. Such a reduction may also suggest the declining importance of mitochondria in Plasmodium.

      Another missed opportunity was to assay the impact of MICOS depletion on OXPHOS in Plasmodium. This is a salient issue, as crista morphology may be decoupled from OXPHOS capacity in Plasmodium, which would link to the apparent tolerance of mitochondrial morphology defects in cell growth and differentiation. I suggested experiments in section A to address this deficit.

      Finally, the authors could assay the fitness costs of MICOS ablation and the associated phenotypes by testing whether mosquito infectivity is reduced in the mutants when they directly compete with WT Plasmodium. Like the authors, I am also surprised that MICOS mutants can pass the population bottlenecks represented by differentiation events. Perhaps the apparent robustness of differentiation contributes to Plasmodium's remarkable ability to adapt.

      I realize that the authors put a lot of effort into their study and, again, I am very impressed by the sophistication of the methods employed. Nevertheless, I think there are still better ways to increase the impact of the study than overinterpreting the conclusions from the data, although this would require more experiments along the lines I suggest in Section A and here.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      (1) I have to admit that it took a few hours of intense work to understand this paper and to even figure out where the authors were coming from. The problem setting, nomenclature, and simulation methods presented in this paper do not conform to the notation common in the field, are often contradictory, and are usually hard to understand. Most importantly, the problem that the paper is trying to solve seems to me to be quite specific to the particular memory study in question, and is very different from the normal setting of model-comparative RSA that I (and I think other readers) may be more familiar with.

      We have revised the paper for clarity at all levels: motivation, application, and parameterization. We clarify that there is a large unmet need for using RSA in a trial-wise manner, and that this approach offers benefits to any team interested in decoding trial-wise representational information linked to behavioral responses; as such, it is not a problem specific to a single memory study.

      (2) The definition of "classical RSA" that the authors are using is very narrow. The group around Niko Kriegeskorte has developed RSA over the last 10 years, addressing many of the perceived limitations of the technique. For example, cross-validated distance measures (Walther et al. 2016; Nili et al. 2014; Diedrichsen et al. 2021) effectively deal with an uneven number of trials per condition and unequal amounts of measurement noise across trials. Different RDM comparators (Diedrichsen et al. 2021) and statistical methods for generalization across stimuli (Schütt et al. 2023) have been developed, addressing shortcomings in sensitivity. Finally, both a Bayesian variant of RSA (Pattern component modelling, (Diedrichsen, Yokoi, and Arbuckle 2018) and an encoding model (Naselaris et al. 2011) can effectively deal with continuous variables or features across time points or trials in a framework that is very related to RSA (Diedrichsen and Kriegeskorte 2017). The author may not consider these newer developments to be classical, but they are in common use and certainly provide the solution to the problems raised in this paper in the setting of model-comparative RSA in which there is more than one repetition per stimulus.

      We appreciate the summary of relevant literature and have included a revised Introduction to address this bounty of relevant work. While much is owed to these authors, new developments from a diverse array of researchers outside of a single group can aid in new research questions, and should always have a place in our research landscape. We owe much to the work of Kriegeskorte’s group, and in fact, Schutt et al., 2023 served as a very relevant touchpoint in the Discussion and helped to highlight specific needs not addressed by the assessment of the “representational geometry” of an entire presented stimulus set. Principal amongst these needs is the application of trial-wise representational information that can be related to trial-wise behavioral responses and thus used to address specific questions on brain-behavior relationships. We invite the Reviewer to consider the utility of this shift with the following revisions to the Introduction.

      Page 3. “Recently, methodological advancements have addressed many known limitations in cRSA. For example, cross-validated distance measures (e.g., Euclidean distance) have improved the reliability of representational dissimilarities in the presence of noise and trial imbalance (Walther et al., 2016; Nili et al., 2014; Diedrichsen et al., 2021). Bayesian approaches such as pattern component modeling (Diedrichsen, Yokoi, & Arbuckle, 2018) have extended representational approaches to accommodate continuous stimulus features or temporal variation. Further, model comparison RSA strategies (Diedrichsen et al., 2021) and generalization techniques across stimuli (Schütt et al., 2023) have improved sensitivity and inference. Nevertheless, a common feature shared across most of these improvements is that they require stimulus repetition to examine the representational structure. This requirement limits their ability to probe brain-behavior questions at the level of individual events”.

      Page 8. “While several extensions of RSA have addressed key limitations in noise sensitivity, stimulus variance, and modeling (e.g., Diedrichsen et al., 2021; Schütt et al., 2023), our tRSA approach introduces a new methodological step by estimating representational strength at the trial level. This accounts for the multi-level variance structure in the data, affords generalizability beyond the fixed stimulus set, and allows one to test stimulus- or trial-level modulations of neural representations in a straightforward way”.

      Page 44. “Despite such prevalent appreciation for the neurocognitive relevance of stimulus properties, cRSA often does not account for the fact that the same stimulus (e.g., “basketball”) is seen by multiple subjects and produces statistically dependent data, an issue addressed by Schütt et al., 2023, who developed cross validation and bootstrap methods that explicitly model dependence across both subjects and stimulus conditions”.

      (3) The stated problem of the paper is to estimate "representational strength" in different regions or conditions. By this, the authors mean the correlation of the brain RDM with a model RDM. This metric conflates a number of factors, namely the variances of the stimulus-specific patterns, the variance of the noise, the true differences between different dissimilarities, and the match between the assumed model and the data-generating model. It took me a long time to figure out that the authors are trying to solve a quite different problem in a quite different setting from the model-comparative approach to RSA that I would consider "classical" (Diedrichsen et al. 2021; Diedrichsen and Kriegeskorte 2017). In this approach, one is trying to test whether local activity patterns are better explained by representation model A or model B, and to estimate the degree to which the representation can be fully explained. In this framework, it is common practice to measure each stimulus at least 2 times, to be able to estimate the variance of noise patterns and the variance of signal patterns directly. Using this setting, I would define "representational strength" very differently from the authors. Assume (using LaTeX notation) that the activity patterns $y_{j,n}$ for stimulus j, measurement n, are composed of a true stimulus-related pattern ($u_j$) and a trial-specific noise pattern ($e_{j,n}$). As a measure of the strength of representation (or pattern), I would use an unbiased estimate of the variance of the true stimulus-specific patterns across voxels and stimuli ($\sigma^2_{u}$). This estimator can be obtained by correlating patterns of the same stimuli across repeated measures, or equivalently, by averaging the cross-validated Euclidean distances (or, with spatial prewhitening, Mahalanobis distances) across all stimulus pairs.

      In contrast, the current paper addresses a specific problem in a quite specific experimental design in which there is only one repetition per stimulus. This means that the authors have no direct way of distinguishing true stimulus patterns from noise processes. The trick that the authors apply here is to assume that the brain data come from the assumed model RDM (a somewhat sketchy assumption IMO) and that everything that reduces this correlation must be measurement noise. I can now see why tRSA does make some sense for this particular question in this memory study. However, in the more common model-comparative RSA setting, having only one repetition per stimulus in the experiment would be quite a fatal design flaw. Thus, the paper would do better if the authors could spell out the specific problem addressed by their method right at the beginning, rather than trying to set up tRSA as a general alternative to "classical RSA".
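      The cross-validated estimator of $\sigma^2_u$ described above can be illustrated numerically; the following is a toy sketch (all data simulated, all parameters arbitrary) of recovering the signal variance by correlating two independent measurements of the same stimuli, exploiting the fact that noise is independent across repeats:

```python
import random

random.seed(0)

n_stim, n_vox = 200, 100
sigma_u, sigma_e = 2.0, 1.0   # true signal / noise SDs (arbitrary values)

est = 0.0
for _ in range(n_stim):
    # true stimulus pattern u_j and two noisy repeated measurements y1, y2
    u = [random.gauss(0, sigma_u) for _ in range(n_vox)]
    y1 = [ui + random.gauss(0, sigma_e) for ui in u]
    y2 = [ui + random.gauss(0, sigma_e) for ui in u]
    # noise is independent across repeats, so E[y1 * y2] = sigma_u**2 per voxel
    est += sum(a * b for a, b in zip(y1, y2)) / n_vox
est /= n_stim
print(est)  # close to the true signal variance sigma_u**2 = 4.0
```

      With only one measurement per stimulus this cross-repeat product is unavailable, which is exactly why a single-repetition design cannot separate signal variance from noise variance in this way.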

      At a general level, our approach rests on the premise that there is meaningful information present in a single presentation of a given stimulus. This assumption may have less utility when the research goals are more focused on estimating the fidelity of signal patterns for RSA, as in designs with multiple repetitions. But it is an exaggeration to state that such a trial-wise approach cannot address the difference between "true" stimulus patterns and noise. This trial-wise approach has explicit utility in relating trial-wise brain information to trial-wise behavior, across multiple cognitive domains (not only memory studies, as applied here). We have added substantial text to the Introduction distinguishing cRSA, which is widely employed, often in cases with a single repetition per stimulus, from model-comparative methods that employ multiple repetitions. We clarify that we do not consider tRSA an alternative to the model-comparative approach, and discuss that operational definitions of representational strength are constrained by the study design.

      Page 3. “In this paper, we present an advancement termed trial-level RSA, or tRSA, which addresses these limitations in cRSA (not model comparison approaches) and may be utilized in paradigms with or without repeated stimuli”.

      Page 4. “Representational geometry usually refers to the structure of similarities among repeated presentations of the same stimulus in the neural data (as captured in the brain RSM) and is often estimated utilizing a model comparison approach, whereas representational strength is a derived measure that quantifies how strongly this geometry aligns with a hypothesized model RSM. In other words, geometry characterizes the pattern space itself, while representational strength reflects the degree of correspondence between that space and the theoretical model under test”.

      Finally, we clarified that in our simulation methods we assume a true underlying activity pattern and a random error pattern. The model RSM is computed based on the true pattern, whereas the brain RSM comes from the noisy pattern, not the model RSM itself.

      Page 9. “Then, we generated two sets of noise patterns, which were controlled by parameters σ<sub>A</sub> and σ<sub>B</sub>, respectively, one for each condition”.

      (4) The notation in the paper is often conflicting and should be clarified. The actual true and measured activity patterns should receive a unique notation that is distinct from the variances of these patterns across voxels. I assume that $\sigma_{ijk}$ are the noise variances (not standard deviations)? Normally, variances are denoted with $\sigma^2$. Also, if these are variances, they cannot come from a normal distribution as indicated on page 10. Finally, multi-level models are usually defined at the level of means (i.e., patterns) rather than at the level of variances (as seems to be done here).

      We have added notations for the true and measured activity patterns to differentiate them from our notation for variance. We agree that multilevel models are usually defined at the level of means rather than at the level of variances, and we include a Figure (Fig 1D) that describes the model in terms of the means. We clarify that the σ ($\sigma$) parameters used in the manuscript were not variances or standard deviations themselves; rather, they were meant to denote components of the actual (multilevel) variance parameter. Each component was sampled from a normal distribution, and the components collectively summed to comprise the final variance parameter for each trial. We have modified our notation for each component to the lowercase letter s to minimize confusion. We have also made our R code publicly available on our lab GitHub, which should provide more clarity on the exact simulation process.

      (5) In the first set of simulations, the authors sampled both model and brain RSM by drawing each cell (similarity) of the matrix from an independent bivariate normal distribution. As the authors note themselves, this way of producing RSMs violates the constraint that correlation matrices need to be positive semi-definite. Likely more seriously, it also ignores the fact that the different elements of the upper triangular part of a correlation matrix are not independent from each other (Diedrichsen et al. 2021). Therefore, it is not clear that this simulation is close enough to reality to provide any valuable insight and should be removed from the paper, along with the extensive discussion about why this simulation setting is plainly wrong (page 21). This would shorten and clarify the paper.

      We have added justification of the mixed-effects model given the potential assumption violations. We caution readers to investigate the robustness of their models and to employ permutation testing that does not make independence assumptions. We have also added checks of the model residuals and an example of permutation testing in the Appendix. Finally, we agree that the first simulation setting does not possess several properties of realistic RDMs/RSMs; however, we believe that there is utility in understanding the mathematical properties of correlations – an essential component of RSA – in a straightforward simulation where the ground truth is known, and we have therefore moved this simulation to Appendix 1.

      (6) If I understand the second simulation setting correctly, the true pattern for each stimulus was generated as an NxP matrix of i.i.d. standard normal variables. Thus, there is no condition-specific pattern at all, only condition-specific noise/signal variances. It is not clear how the tRSA would be biased if there were a condition-specific pattern (which, in reality, there usually is). Because of the i.i.d. assumption of the true signal, the correlations between all stimulus pairs within conditions are close to zero (and only differ from it by the fact that you are using a finite number of voxels). If you added a condition-specific pattern, the across-condition RSA would lead to much higher "representational strength" estimates than a within-condition RSA, with obvious problems and biases.

      The Reviewer is correct that the voxel values in the true pattern are drawn from i.i.d. standard normal distributions. We take the Reviewer’s suggestion of “condition-specific pattern” to mean that there could be a condition-voxel interaction in two non-mutually exclusive ways. The first is additive: essentially some common underlying multi-voxel pattern like [6, 34, -52, …, 8] for all condition A trials, a different such pattern for condition B trials, and so on. The second is multiplicative: essentially a vector of scaling factors [x1.5, x0.5, x0.8, …, x2.7] for all condition A trials, a different such vector for condition B trials, and so on. Both possibilities could indeed affect tRSA as much as they would cRSA.

      Importantly, if such a strong condition-specific pattern is expected, one can build a condition-specific model RDM using one-hot coding of conditions (see example figure; src: https://www.newbi4fmri.com/tutorial-9-mvpa-rsa), either to capture this interesting phenomenon or to remove it as a confounding factor. This practice has been applied in multiple-regression cRSA approaches (e.g., Cichy et al., 2013) and can also be applied to tRSA.
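As a concrete illustration of the one-hot condition coding mentioned above, the following minimal Python sketch (with hypothetical trial labels, not code from the manuscript) builds a condition model RSM in which pairs of trials from the same condition get similarity 1 and all other pairs get 0:

```python
import numpy as np

# Hypothetical condition labels for six trials (three per condition)
labels = np.array(["A", "A", "A", "B", "B", "B"])

# One-hot condition model RSM: 1 where two trials share a condition, 0 otherwise.
# Such a matrix can be entered as a regressor to capture (or partial out)
# condition-level structure in a multiple-regression RSA.
cond_rsm = (labels[:, None] == labels[None, :]).astype(float)
print(cond_rsm)
```

The same construction generalizes to any number of conditions, and the resulting matrix can be included alongside feature-based model RSMs as a covariate.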

      (7) The trial-level brain RDM to model Spearman correlations was analyzed using a mixed effects model. However, given the symmetry of the RDM, the correlations coming from different rows of the matrix are not independent, which is an assumption of the mixed effect model. This does not seem to induce an increase in Type I errors in the conditions studied, but there is no clear justification for this procedure, which needs to be justified.

      We appreciate this important warning, and now caution readers to investigate the robustness of their models, and consider employing permutation testing that does not make independence assumptions. We have also added checks of the model residuals and an example of permutation testing in the supplement.

      Page 46. “While linear mixed-effects modeling offers a powerful framework for analyzing representational similarity data, it is critical that researchers carefully construct and validate their models. The multilevel structure of RSA data introduces potential dependencies across subjects, stimuli, and trials, which can violate assumptions of independence if not properly modeled. In the present study, we used a model that included random intercepts for both subjects and stimuli, which accounts for variance at these levels and improves the generalizability of fixed-effect estimates. Still, there is a potential for systematic dependence across trials within a subject. To ensure that the model assumptions were satisfied, we conducted a series of diagnostic checks on an exemplar ROI (right LOC; middle occipital gyrus) in the Object Perception dataset, including visual inspection of residual distributions and autocorrelation (Appendix 3, Figure 13). These diagnostics supported the assumptions of normality, homoscedasticity, and conditional independence of residuals. In addition, we conducted permutation-based inference, similar to prior improvements to cRSA (Nili et al. 2014), using a nested model comparison to test whether the mean similarity in this ROI was significantly greater than zero. The observed likelihood ratio test statistic fell in the extreme tail of the null distribution (Appendix 3, Figure 14), providing strong nonparametric evidence for the reliability of the observed effect. We emphasize that this type of model checking and permutation testing is not merely confirmatory but can help validate key assumptions in RSA modeling, especially when applying mixed-effects models to neural similarity data. Researchers are encouraged to adopt similar procedures to ensure the robustness and interpretability of their findings”.

      Exemplar Permutation Testing

      To test whether the mean representational strength in the ROI right LOC (middle occipital gyrus) was significantly greater than zero, we used a permutation-based likelihood ratio test implemented via the permlmer function. This test compares two nested linear mixed-effects models fit using the lmer function from the lme4 package, both including random intercepts for Participant and Stimulus ID to account for between-subject and between-item variability.

      The null model excluded a fixed intercept term, effectively constraining the mean similarity to zero after accounting for random effects:

      ROI ~ 0 + (1 | Participant) + (1 | Stimulus)

      The full model included the same random effects structure but allowed the intercept to be freely estimated:

      ROI ~ 1 + (1 | Participant) + (1 | Stimulus)

      By comparing the fit of these two models, we directly tested whether the average similarity in this ROI was significantly different from zero. Permutation testing (1,000 permutations) was used to generate a nonparametric p-value, providing inference without relying on normality assumptions. The full model, which estimated a nonzero mean similarity in the right LOC (middle occipital gyrus), showed a significantly better fit to the data than the null model that fixed the mean at zero (χ²(1) = 17.60, p = 2.72 × 10⁻⁵). The permutation-based p-value obtained from permlmer confirmed this effect as statistically significant (p = 0.0099), indicating that the mean similarity in this ROI was reliably greater than zero. These results support the conclusion that the right LOC contains representational structure consistent with the HMAXc2 RSM. A density plot of the permuted likelihood ratio tests is plotted along with the observed likelihood ratio test in Appendix 3 Figure 14.
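The logic of permutation-based inference on trial-level scores can be illustrated with a deliberately simplified Python sketch (made-up scores; a sign-flip test on the mean, which ignores the random-effects structure that `permlmer` handles in R, so it is a conceptual analogue rather than the analysis reported above):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical trial-level similarity scores for one ROI (values are made up)
scores = rng.normal(0.08, 0.2, size=200)

observed = scores.mean()

# Sign-flip permutation: under H0 the trial scores are symmetric around zero,
# so randomly flipping signs generates a null distribution of the mean
n_perm = 2000
null = np.empty(n_perm)
for i in range(n_perm):
    signs = rng.choice([-1.0, 1.0], size=scores.size)
    null[i] = (signs * scores).mean()

# One-sided p-value with the standard +1 correction
p = (np.sum(null >= observed) + 1) / (n_perm + 1)
print(p)
```

Because no normality assumption is made, this style of test remains valid when residual diagnostics are in doubt; the nested-model likelihood-ratio permutation used above extends the same idea to the full mixed-effects model.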

      (8) For the empirical data, it is not clear to me to what degree the "representational strength" of cRSA and tRSA is actually comparable. In cRSA, the Spearman correlation assesses whether the distances in the data RSM are ranked in the same order as in the model. For tRSA, the comparison is made for every row of the RSM, which introduces a larger degree of flexibility (possibly explaining the higher correlations in the first simulation). Thus, could the gains presented in Figure 7D not simply arise from the fact that you are testing different questions? A clearer theoretical analysis of the difference between the average row-wise Spearman correlation and the matrix-wise Spearman correlation is urgently needed. The behavior will likely vary with the structure of the true model RDM/RSM.

      We agree that a clearer analysis of the comparability between mean row-wise Spearman correlations and the matrix-wise Spearman correlation is needed. We believe that simulations are the best approach for this comparison, since they are much more robust than the empirical dataset and have the advantage of a known true pattern and noise level. We expand on our comparison of mean tRSA values and matrix-wise Spearman correlations on page 42.

      Page 42. “Although tRSA and cRSA both aim to quantify representational strength, they differ in how they operationalize this concept. cRSA summarizes the correspondence between RSMs as a single measure, such as the matrix-wise Spearman correlation. In contrast, tRSA computes such correspondence for each trial, enabling estimates at the level of individual observations. This flexibility allows trial-level variability to be modeled directly, but also introduces subtle differences in what is being measured. Nonetheless, our simulations showed that, although numerical differences occasionally emerged—particularly when comparing between-condition tRSA estimates to within-condition cRSA estimates—the magnitude of divergence was small and did not affect the outcome of downstream statistical tests”.
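To make the distinction concrete, the two summary measures can be computed side by side in a short Python sketch (toy matrices that are symmetric but, as in the paper's first simulation, not guaranteed to be valid correlation matrices; all values illustrative):

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(2)
n = 12  # number of trials

def sym(m):
    """Symmetrize a matrix and set the diagonal to 1 (toy RSM)."""
    m = (m + m.T) / 2.0
    np.fill_diagonal(m, 1.0)
    return m

model = sym(rng.uniform(-1, 1, (n, n)))
brain = sym(0.7 * model + 0.3 * rng.uniform(-1, 1, (n, n)))

# Matrix-wise summary (cRSA-style): one Spearman rho over the upper triangle
iu = np.triu_indices(n, k=1)
matrix_rho = spearmanr(brain[iu], model[iu])[0]

# Row-wise summary (tRSA-style): Spearman rho per trial's row
# (excluding the diagonal), then averaged across trials
rows = []
for t in range(n):
    keep = np.arange(n) != t
    rows.append(spearmanr(brain[t, keep], model[t, keep])[0])
row_mean = float(np.mean(rows))

print(matrix_rho, row_mean)
```

Running such a sketch across different model RDM structures and noise levels is one way to probe when the two measures diverge, which is what the simulations in the paper do more systematically.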

      (9) For the real data, there are a number of additional sources of bias that need to be considered for the analysis. What if there are not only condition-specific differences in noise variance, but also a condition-specific pattern? Given that the stimuli were measured in 3 different imaging runs, you cannot assume that all measurement noise is i.i.d. - stimuli from the same run will likely have a higher correlation with each other.

      We recognize the potential for condition-specific patterns and chose to constrain the analyses to those most comparable with cRSA. However, depending on their hypotheses, researchers may consider testing condition RSMs and utilizing a model comparison approach, or employ the z-scored approach used in the simulations above. Regarding the potential run confounds, this is always a concern in RSA and is why we exclude within-run comparisons. We have also added to the Discussion the suggestion to include run as a covariate in mixed-effects models. However, we do not employ this covariate here, as we preferred the most parsimonious model for comparison with cRSA.

      Pages 46-47. “Further, while analyses here were largely employed to be comparable with cRSA, researchers should consider taking advantage of the flexibility of mixed-effects models and including covariates of non-interest (run, trial order, etc.)”.

      (10) The discussion should be rewritten in light of the fact that the setting considered here is very different from the model-comparative RSA in which one usually has multiple measurements per stimulus per subject. In this setting, existing approaches such as RSA or PCM do indeed allow for the full modelling of differences in the "representational strength" - i.e., pattern variance across subjects, conditions, and stimuli.

      We agree that studies advancing designs with multiple repetitions of a given stimulus image are useful in estimating the reliability of concept representations. We would argue, however, that model comparison in RSA is not restricted to such data. Many extant studies (Wang et al., 2018, https://doi.org/10.1088/1741-2552/abecc3; Gao et al., 2022, https://doi.org/10.1093/cercor/bhac058; Li et al., 2022, https://doi.org/10.1002/hbm.26195; Staples & Graves, 2020, https://doi.org/10.1162/nol_a_00018) do not in fact have multiple repetitions per stimulus per subject that would allow for that type of model-comparative approach. While beneficial in terms of noise estimation, having multiple presentations was not a requirement for implementing cRSA (Kriegeskorte, 2008, https://doi.org/10.3389/neuro.06.004.2008). The aim of this manuscript is to introduce the tRSA approach to the broad community of researchers whose research questions and datasets could vary vastly, including but not limited to the number of repeated presentations and the balance of trial counts across conditions.

      (11) Cross-validated distances provide a powerful tool to control for differences in measurement noise variances and possible covariances in measurement noise across trials, which has many distinct advantages and is conceptually very different from the approach taken here.

      We have added language on the value of cross-validation approaches to RSA in the Discussion:

      Page 47. “Additionally, we note that while our proposed tRSA framework provides a flexible and statistically principled approach for modeling trial-level representational strength, we acknowledge that there are alternative methods for addressing trial-level variability in RSA. In particular, the use of cross-validated distance metrics (e.g., crossnobis distance) has become increasingly popular for controlling differences in measurement noise variance and accounting for possible covariance structures across trials (Walther et al., 2016). These metrics offer several advantages, including unbiased estimation of representational dissimilarities under Gaussian noise assumptions and improved generalization to unseen data. However, cross-validated distances are conceptually distinct from the approach taken here: whereas cross-validation aims to correct for noise-related biases in representational dissimilarity matrices, our trial-level RSA method focuses on estimating and modeling the variability in representation strength across individual trials using mixed-effects modeling. Rather than proposing a replacement for cross-validated RSA, tRSA adds a complementary tool to the methodological toolkit—one that supports hypothesis-driven inference about condition effects and trial-level covariates, while leveraging the full structure of the data”.
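The contrast between cross-validated and naive distances mentioned in the quoted passage can be illustrated with a short Python sketch (hypothetical patterns and noise levels, not the crossnobis implementation of Walther et al.): the inner product of difference patterns estimated from independent runs is unbiased, whereas a single-run squared distance is inflated by the noise variance.

```python
import numpy as np

rng = np.random.default_rng(3)
n_vox = 300

# Hypothetical true patterns for stimuli i and j (all numbers illustrative)
u_i = rng.normal(0.0, 1.0, n_vox)
u_j = rng.normal(0.0, 1.0, n_vox)

def noisy(u):
    return u + rng.normal(0.0, 1.5, n_vox)  # independent measurement noise

yi1, yi2 = noisy(u_i), noisy(u_i)  # stimulus i measured in runs 1 and 2
yj1, yj2 = noisy(u_j), noisy(u_j)  # stimulus j measured in runs 1 and 2

# Cross-validated squared distance: inner product of the difference pattern
# from run 1 with the one from run 2; noise cancels in expectation
d_cv = np.dot(yi1 - yj1, yi2 - yj2) / n_vox

# Naive squared distance from a single run is inflated by the noise variance
d_naive = np.sum((yi1 - yj1) ** 2) / n_vox

print(d_cv, d_naive)
```

The naive estimate exceeds the cross-validated one by roughly twice the per-voxel noise variance, which is exactly the condition-dependent bias that cross-validation removes.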

      (12) One of the main limitations of tRSA is the assumption that the model RDM is actually the true brain RDM, which may not be the case. Thus, in theory, there could be a different model RDM, in which representational strength measures would be very different. These differences should be explained more fully, hopefully leading to a more accessible paper.

      Indeed, the chosen model RSM may not be the true RSM, but as the noise level increases the correlation between RSMs practically becomes zero. In our simulations we assume this to be true as a straightforward way to manipulate the correspondence between the brain data and the model. However, just like cRSA, tRSA is constrained by the model selections the researchers employ. We encourage researchers to have carefully considered theoretically-motivated models and, if their research questions require, consider multiple and potentially competing models. Furthermore, the trial-wise estimates produced by tRSA encourage testing competing models within the multiple regression framework. We have added this language to the Discussion.

      Page 46. “…choose their model RSMs carefully. In our simulations, we designed our model RSM to be the “true” RSM for demonstration purposes. However, researchers should consider if their models and model alternatives…”.

      Pages 45-46. “While a number of studies have addressed the validity of measuring representational geometry using designs with multiple repetitions, a conceptual benefit of the tRSA approach is the reliance on a regression framework that engenders the testing of competing conceptual models of stimulus representation (e.g., taxonomic vs. encyclopedic semantic features, as in Davis et al., 2021)”.

      Reviewer #2 (Public review):

      (1) While I generally welcome the contribution, I take some issue with the accusatory tone of the manuscript in the Introduction. The text there (using words such as 'ignored variances', 'erroneous inferences', 'one must', 'not well-suited', 'misleading') appears aimed at turning cRSA into a 'straw man' with many limitations that other researchers have not recognized but that the newly proposed method supposedly resolves. This could be written in a more nuanced, constructive manner without accusing the numerous users of this popular method of ignorance.

      We apologize for the unintended accusatory tone. We have clarified the many robust approaches to RSA and have made our Introduction and Discussion more nuanced throughout (see also 3, 11, and 16).

      (2) The described limitations are also not entirely correct, in my view: for example, statistical inference in cRSA is not always done using classic parametric statistics such as t-tests (cf Figure 1): the rsatoolbox paper by Nili et al. (2014) outlines non-parametric alternatives based on permutation tests, bootstrapping and sign tests, which are commonly used in the field. Nor has RSA ever been conducted at the row/column level (here referred to by the authors as 'trial level'; cf King et al., 2018).

      We agree there are numerous methods that go beyond cRSA in addressing these limitations and have added discussion of them to our manuscript, as well as an example analysis implementing permutation tests on tRSA data (see response to 7). We thank the reviewer for bringing King and colleagues’ temporal generalization method to our attention and have added a reference acknowledging their decoding-based temporal generalization approach.

      Page 8. “It is also important to note that some prior work has examined similarly fine-grained representations in time-resolved neuroimaging data, such as the temporal generalization method introduced by King et al. (see King & Dehaene, 2014). Their approach trains classifiers at each time point and tests them across all others, resulting in a temporal generalization matrix that reflects decoding accuracy over time. While such matrices share some structural similarity with RSMs, they do not involve correlating trial-level pattern vectors with model RSMs nor do their second-level models include trial-wise, subject-wise, and item-wise variability simultaneously”.

      (3) One of the advantages of cRSA is its simplicity. Adding linear mixed effects modeling to RSA introduces a host of additional 'analysis parameters' pertaining to the choice of the model setup (random effects, fixed effects, interactions, what error terms to use) - how should future users of tRSA navigate this?

      We appreciate the opportunity to offer more specific recommendations for those employing the tRSA technique, and have added them to the Discussion:

      Page 46. “While linear mixed-effects modeling offers a powerful framework for analyzing representational similarity data, it is critical that researchers carefully construct and validate their models and choose their model RSMs carefully. In our simulations, we designed our model RSM to be the “true” RSM for demonstration purposes. However, researchers should always consider if their models match the goals of their analysis, including 1) constructing the random effects structure that will converge in their dataset, 2) testing their model fits against alternative structures (Meteyard & Davies, 2020; Park et al., 2020), and 3) considering which effects should be considered random or fixed depending on their research question”.

      (4) Here, only a single real fMRI dataset is used with a quite complicated experimental design for the memory part; it's not clear if there is any benefit of using tRSA on a simpler real dataset. What's the benefit of tRSA in classic RSA datasets (e.g., Kriegeskorte et al., 2008), with fixed stimulus conditions and no behavior?

      To clarify, our empirical approach uses two different tasks: an Object Perception task more akin to the classic RSA datasets employing passive viewing, and a Conceptual Retrieval task that more directly demonstrates the benefits of the trial-wise approach. We felt that our Object Perception dataset is a simpler empirical fMRI dataset without explicit task conditions or a dichotomous behavioral outcome, whereas the Retrieval dataset is more involved (though old/new recognition is the most common form of memory retrieval testing) and dependent on behavioral outcomes. However, we recognize the utility of replication from other research groups and invite researchers to utilize tRSA on their datasets.

      (5) The cells of an RDM/RSM reflect pairwise comparisons between response patterns (typically a brain but can be any system; cf Sucholutsky et al., 2023). Because the response patterns are repeatedly compared, the cells of this matrix are not independent of one another. Does this raise issues with the validity of the linear mixed effects model? Does it assume the observations are linearly independent?

      We recognize the potential danger for not meeting model assumptions. Though our simulation results and model checks suggest this is not a fatal flaw in the model design, we caution readers to investigate the robustness of their models, and consider employing permutation testing that does not make independence assumptions. We have also added checks of the model residuals and an example of permutation testing in the Appendix. See response to R1.

      (6) The manuscript assumes the reader is familiar with technical statistical terms such as Type I/II error, sensitivity, specificity, homoscedasticity assumptions, as well as linear mixed models (fixed effects, random effects, etc). I am concerned that this jargon makes the paper difficult to understand for a broad readership or even researchers currently using cRSA that might be interested in trying tRSA.

      We agree this jargon may cause the paper to be difficult to understand. We have expanded/added definitions to these terms throughout the methods and results sections.

      Page 12. “Given data generated with 𝑠<sub>𝑐𝑜𝑛𝑑,𝐴</sub> = 𝑠<sub>𝑐𝑜𝑛𝑑,𝐵</sub>, the correct inference should be a failure to reject the null hypothesis of b<sub>B-𝐴</sub> = 0; any significant (𝑝 < 0.05) result in either direction was considered a false positive (spurious effect, or Type I error). Given data generated with 𝑠<sub>𝑐𝑜𝑛𝑑,𝐵</sub> > 𝑠<sub>𝑐𝑜𝑛𝑑,𝐴</sub>, the inference was considered correct if it rejected the null hypothesis of b<sub>B-𝐴</sub> = 0 and yielded the expected sign of the estimated contrast (b<sub>B-𝐴</sub> < 0). A significant result with the reverse sign of the estimated contrast (b<sub>B-𝐴</sub> > 0) was considered a Type I error, and a nonsignificant (𝑝 ≥ 0.05) result was considered a false negative (failure to detect a true effect, or Type II error)”.

      Page 2. “Compared to cRSA, the multi-level framework of tRSA was both more theoretically appropriate and significantly more sensitive (better able to detect true effects)”.

      Page 25. “The performance of cRSA and tRSA was quantified with their specificity (better avoids false positives; 1 - Type I error rate) and sensitivity (better avoids false negatives; 1 - Type II error rate)”.

      Page 6. “One of the fundamental assumptions of general linear models (step 4 of cRSA; see Figure 1D) is homoscedasticity, or homogeneity of variance — that is, all residuals should have equal variance”.

      Page 11. “Specifically, a linear mixed-effects model with a fixed effect of condition (which estimates the average effect across the entire sample, capturing the overall effect of interest) and random effects of both subjects and stimuli (which model variation in responses due to differences between individual subjects and items, allowing generalization beyond the sample) was fitted to tRSA estimates via the `lme4 1.1-35.3` package in R (Bates et al., 2015), and p-values were estimated using Satterthwaite’s method via the `lmerTest 3.1-3` package (Kuznetsova et al., 2017)”.

      (7) I could not find any statement on data availability or code availability. Given that the manuscript reuses prior data and proposes a new method, making data and code/tutorials openly available would greatly enhance the potential impact and utility for the community.

      We thank the reviewer for raising our oversight here. We have added our code and data availability statements.

      Page 9. “Data is available upon request to the corresponding author and our simulations and example tRSA code is available at https://github.com/electricdinolab”.

      Reviewer #1 (Recommendations for the authors):

      (13) Page 4: The limitations of cRSA seem to be based on the assumption that within each different experimental condition, there are different stimuli, which get combined into the condition. The framework of RSA, however, does not dictate whether you calculate a condition x condition RDM or a larger and more complete stimulus x stimulus RDM. Indeed, in practice we often do the latter? Or are you assuming that each stimulus is only shown once overall? It would be useful at this point to spell out these implicit assumptions.

      We agree that stimulus x stimulus RDMs can be constructed and are often used. However, as we mentioned in the Introduction, researchers are often interested in the difference between two (or more) conditions, such as “remembered” vs. “forgotten” (Davis et al., https://doi.org/10.1093/cercor/bhaa269) or “high cognitive load” vs. “low cognitive load” (Beynel et al., https://doi.org/10.1523/JNEUROSCI.0531-20.2020). In those cases, the most common practice with cRSA is to construct condition-specific RDMs, compute cRSA scores separately for each condition, and then compare the scores at the group level. The number of times each stimulus gets presented does not prevent one from creating a model RDM that has the same rows and columns as the brain RDM, either in the same condition (“high load”) or across different conditions.

      (14) Page 5: The difference between condition-level and stimulus-level is not clear. Indeed, this definition seems to be a function of the exact experimental design and is certainly up for interpretation. For example, if I conduct a study looking at the activity patterns for 4 different hand actions, each repeated multiple times, are these actions considered stimuli or conditions?

      We have added clarifying language about what is considered stimuli vs conditions. Indeed, this will depend on the specific research questions being employed and will affect how researchers construct their models. In this specific example, one would most likely consider each different hand action a condition, treating them as fixed effects rather than random effects, given their very limited number and the lack of need to generalize findings to the broader “hand actions” category.

      Page 5. “Critically, the distinction between condition-level and stimulus level is not always clear as researchers may manipulate stimulus-level features themselves. In these cases, what researchers ultimately consider condition-level and stimulus-level will depend on their specific research questions. For example, researchers intending to study generalized object representation may consider object category a stimulus-level feature, while researchers interested in if/how object representation varies by category may consider the same category variable condition-level”.

      (15) Page 5: The fact that different numbers of trials / different levels of measurement noise / noise-covariance of different conditions biases non-cross-validated distances is well known and repeatedly expressed in the literature. We have shown that cross-validation of distances effectively removes such biases - of course, it does not remove the increased estimation variability of these distances (for a formal analysis of estimation noise on condition patterns and variance of the cross-nobis estimator, see (Diedrichsen et al. 2021)).

      We thank the reviewer for drawing our attention to this literature and have added discussions of these methods.

      (16). Page 5: "Most studies present subjects with a fixed set of stimuli, which are supposedly samples representative of some broader category". This may be the case for a certain type of RSA experiments in the visual domain, but it would be unfair to say that this is a feature of RSA studies in general. In most studies I have been involved in, we use a "stimulus" x "stimulus" RDM.

      We have edited this sentence to avoid the “most” characterization. We have also added substantial text to the Introduction and Discussion distinguishing cRSA, which is nonetheless widely employed, especially in cases with a single repetition per stimulus (Macklin et al., 2023; Liu et al., 2024), from the model-comparative method, and explicitly state that we do not consider tRSA an alternative to the model-comparative approach.

      (17). Page 5: I agree that "stimuli" should ideally be considered a random effect if "stimuli" can be thought of as sampled from a larger population and one wants to make inferences about that larger population. Sometimes stimuli/conditions are more appropriately considered a fixed effect (for example, when studying the response to stimulation of the 5 fingers of the right hand). Techniques to consider stimuli/conditions as a random effect have been published by the group of Niko Kriegeskorte (Schütt et al. 2023).

      Indeed, in some cases what may be thought of as “stimuli” would be more appropriately entered into the model as a fixed effect; such questions are increasingly relevant given the focus on item-wise stimulus properties (Bainbridge et al., Westfall & Yarkoni). We have added text on this issue to the Discussion and caution researchers to employ models that most directly answer their research questions.

      Page 46. “However, researchers should always consider whether their models match the goals of their analysis, including 1) constructing a random effects structure that will converge in their dataset, 2) testing their model fits against alternative structures (Meteyard & Davies, 2020; Park et al., 2020), and 3) deciding which effects should be treated as random or fixed depending on their research question. An effect is fixed when the levels represent the specific conditions of theoretical interest (e.g., task condition) and the goal is to estimate and interpret those differences directly. In contrast, an effect is random when the levels are sampled from a broader population (e.g., subjects) and the goal is to account for their variability while generalizing beyond the sample tested. Note that the same variable (e.g., stimuli) may be considered fixed or random depending on the research question”.

      (18) Page 6: It is correct that the "classical" RSA depends on a categorical assignment of different trials to different stimuli/conditions, such that a stimulus x stimulus RDM can be computed. However, both Pattern Component Modelling (PCM) and Encoding models are ideally set up to deal with variables that vary continuously on a trial-by-trial or moment-by-moment basis. tRSA should be compared to these approaches, or - as it should be clarified - that the problem setting is actually quite a different one.

We agree that PCM and encoding models offer flexible approaches that handle continuous trial-by-trial variables. We have clarified on page 6 that the problem setting of cRSA is distinct, and we have added discussion of the robustness of encoding models and their limitations to the Discussion.

      Page 6. “While other approaches such as Pattern Component Modeling (PCM) (Diedrichsen et al., 2018) and encoding models (Naselaris et al., 2011) are well-suited to analyzing variables that vary continuously on a trial-by-trial or moment-by-moment basis, these frameworks address different inferential goals. Specifically, PCM and encoding models focus on estimating variance components or predicting activation from features, while cRSA is designed to evaluate representational geometry. Thus, cRSA as well as our proposed approach address a problem setting distinct from PCM and encoding models”.

      (19) Page 8: "Then, we generated two noise patterns, which were controlled by parameters 𝜎 𝐴 and 𝜎𝐵, respectively, one for each condition." This makes little sense to me. The noise patterns should be unique to each trial - you should generate n_a + n_b noise patterns, no?

We clarify that the “noise patterns” here are n_voxel × n_trial in size; in other words, all trial-level noise patterns are generated together, and each trial has its own unique noise pattern. We have revised our description to “two sets of noise patterns” for clarity, starting on page 9.
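As a concrete sketch of this setup (the sizes and noise levels below are arbitrary illustrative values, not the simulation's actual parameters), each condition's set of noise patterns can be generated as one n_voxel × n_trial matrix, so that every trial (column) receives its own independent noise pattern:

```python
import numpy as np

rng = np.random.default_rng(0)

n_voxel, n_a, n_b = 100, 50, 50   # illustrative sizes, not the paper's values
s_A, s_B = 1.0, 2.0               # condition-level noise standard deviations

# One n_voxel x n_trial matrix per condition: all trial-level noise
# patterns are generated together, one column per trial.
noise_A = rng.normal(0.0, s_A, size=(n_voxel, n_a))
noise_B = rng.normal(0.0, s_B, size=(n_voxel, n_b))
```

Here the two matrices play the role of the “two sets of noise patterns”: columns within a matrix differ from each other, so no two trials share a noise pattern.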

      (20) Page 9: First, I assume if this is supposed to be a hierarchical level model, the "noise parameters" here correspond to variances? Or do these \sigma values mean to signify standard deviations? The latter would make little sense. Or is it the noise pattern itself?

As clarified in 4, the σ values denote hierarchical components of the composite standard deviation; we have updated our notation to use the lowercase letter s for clarity.

(21) Page 10: your formula states “σ_subj ~ N(0, 0.5^2)”. This conflicts with your previous mention that the σ values are noise “levels”; are they the noise patterns themselves now? Variances cannot be normally distributed, as they cannot be negative.

As clarified in 4, the σ values denote hierarchical components of the composite standard deviation; we have updated our notation to use the lowercase letter s for clarity.

      (22) Page 13: What was the task of the subject in the Memory retrieval task? Old/new judgements relative to encoding of object perception?

      We apologize for the lack of clarity about the Memory Retrieval task and have added that information and clarified that the old/new judgements were relative to a separate encoding phase, the brain data for which has been reported elsewhere.

      Page 14. “Memory Retrieval took place one day after Memory Encoding and involved testing participants’ memory of the objects seen in the Encoding phase. Neural data during the Encoding phase has been reported elsewhere. In the main Memory Retrieval task, participants were presented with 144 labels of real-world objects, of which 114 were labels for previously seen objects and 30 were unrelated novel distractors. Participants performed old/new judgements, as well as their confidence in those judgements on a four-point scale (1 = Definitely New, 2 = Probably New, 3 = Probably Old, 4 = Definitely Old)”.

      (23) Page 13: If "Memory Retrieval consisted of three scanning runs", then some of the stimulus x stimulus correlations for the RSM must have been calculated within a run and some between runs, correct? Given that all within-run estimates share a common baseline, they share some dependence. Was there a systematic difference between the within-run and the between-run correlations?

      We have clarified in this portion of the methods that within run comparisons were excluded from our analyses. We also double-checked that the within-run exclusion was included in the description of the Neural RSMs.

      Page 14. “Retrieval consisted of three scanning runs, each with 38 trials, lasting approximately 9 minutes and 12 seconds (within-run comparisons were later excluded from RSA analyses)”.

      Page 18. “This was done by vectorizing the voxel-level activation values within each region and calculating their correlations using Pearson’s r, excluding all within-run comparisons.”
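A minimal sketch of this exclusion step, assuming toy trial counts and run labels (not the actual analysis pipeline): compute the Pearson RSM across trial patterns, then mask any cell whose two trials come from the same run.

```python
import numpy as np

rng = np.random.default_rng(1)

n_trials, n_voxels = 9, 40            # toy sizes for illustration
runs = np.repeat([1, 2, 3], 3)        # run label for each trial
patterns = rng.normal(size=(n_trials, n_voxels))

# Pearson correlation between every pair of trial patterns -> N x N RSM
rsm = np.corrcoef(patterns)

# Exclude within-run comparisons (which also removes the diagonal)
same_run = runs[:, None] == runs[None, :]
rsm_masked = np.where(same_run, np.nan, rsm)
```

Any downstream RSA step would then operate only on the non-NaN (between-run) cells.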

      (24) Page 20: It is not clear why the mean estimate of "representational strength" (i.e., model-brain RSM correlations) is important at all. This comes back to Major point #2, namely that you are trying to solve a very different problem from model-comparative RSA.

      We have clarified that our approach is not an alternative to model-comparative RSA, and that depending on the task constraints researchers may choose to compare models with tRSA or other approaches requiring stimulus repetition (see 3).

      (25) Page 21: I believe the problems of simulating correlation matrices directly in the way that the authors in their first simulation did should be well known and should be moved to an appendix at best. Better yet, the authors could start with the correct simulation right away.

We agree that the paper is more concise with these simulations moved to the appendix and discussed more briefly, and we have implemented these changes (Appendix 1). However, we are not certain that this problem is widely known; we have several anecdotes of researchers inquiring about this “alternative” approach in conversations with colleagues, so we do still discuss the issues with this method.

(26) Page 26: Is the “underlying continuous noise variable σ_trial that was measured by v_measured” the variance of the noise pattern or the noise pattern itself? What does it mean that it was “measured”, and how?

σ_trial is a vector of standard deviations across trials; its i-th element is used to generate the noise pattern for trial i. v_measured is a hypothetical measurement of trial-level variability, such as “memorability” or “heartbeat variability”. We have revised our description to clarify our methods.

      Reviewer #2 (Recommendations for the authors):

      (8) It would be helpful to provide more clarity earlier on in the manuscript on what is a 'trial': in my experience, a row or column of the RDM is usually referred to as 'stimulus condition', which is typically estimated on multiple trials (instances or repeats) of that stimulus condition (or exemplars from that stimulus class) being presented to the subject. Here, a 'trial' is both one measurement (i.e., single, individual presentation of a stimulus) and also an entry in the RDM, but is this the most typical scenario for cRSA? There is a section in the Discussion that discusses repetitions, but I would welcome more clarity on this from the get-go.

      We have added discussion of stimulus repetition methods and datasets to the Introduction and clarified our use of the terms.

      Page 8. “Critically, in single-presentation designs, a “trial” refers to one stimulus presentation, and corresponds to a row or column in the RSM. In studies with repeated stimuli, these rows are often called “conditions” and may reflect aggregated patterns across trials. tRSA is compatible with both cases: whether rows represent individual trials or averaged trials that create “conditions”, tRSA estimates are computed at the row level”.
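The row-level idea can be sketched as follows; this is a simplified illustration rather than the paper's exact estimator (which, for example, also excludes within-run cells): each row's tRSA estimate is the correlation between that row of the neural RSM and the corresponding row of the model RSM.

```python
import numpy as np

def trial_level_rsa(neural_rsm, model_rsm):
    """Correlate each trial's row of the neural RSM with the same row of
    the model RSM, skipping the diagonal self-similarity cell."""
    n = neural_rsm.shape[0]
    estimates = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i          # drop the diagonal entry
        x, y = neural_rsm[i, keep], model_rsm[i, keep]
        estimates[i] = np.corrcoef(x, y)[0, 1]
    return estimates

# Toy check: when the neural RSM equals the model RSM,
# every row-level estimate is 1.0.
rng = np.random.default_rng(2)
model = np.corrcoef(rng.normal(size=(6, 20)))
est = trial_level_rsa(model, model)
```

These row-level estimates are what would then be submitted to the mixed-effects model.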

      (9) The quality of the results figures can be improved. For example, axes labels are hard to read in Figure 3A/B, panels 3C/D are hard to read in general. In Figure 7E, it's not possible to identify the 'dark red' brain regions in addition to the light red ones.

      We thank the reviewer for raising these and have edited the figures to be more readable in the manner suggested.

      (10) I would be interested to see a comparison between tRSA and cRSA in other fMRI (or other modality) datasets that have been extensively reported in the literature. These could be the original Kriegeskorte 96 stimulus monkey/fMRI datasets, commonly used open datasets in visual perception (e.g., THINGS, NSD), or the above-mentioned King et al. dataset, which has been analyzed in various papers.

      We recognize the great utility of replication from other research groups and do invite researchers to utilize tRSA on their datasets.

      (11) On P39, the authors suggest 'researchers can confidently replace their existing cRSA analysis with tRSA': Please discuss/comment on how researchers should navigate the choice of modeling parameters in tRSA's linear mixed effects setting.

We have added discussion of the mixed-effects parameters and the various modeling considerations, and we encourage researchers to follow best practices for their model selection.

      Page 46. “However, researchers should always consider whether their models match the goals of their analysis, including 1) constructing a random effects structure that will converge in their dataset, 2) testing their model fits against alternative structures (Meteyard & Davies, 2020; Park et al., 2020), and 3) deciding which effects should be treated as random or fixed depending on their research question”.

      (12) The final part of the Results section, demonstrating the tRSA results for the continuous memorability factor in the real fMRI data, could benefit from some substantiation/elaboration. It wasn't clear to me, for example, to what extent the observed significant association between representational strength and item memorability in this dataset is to be 'believed'; the Discussion section (p38). Was there any evidence in the original paper for this association? Or do we just assume this is likely true in the brain, based on prior literature by e.g. Bainbridge et al (who probably did not use tRSA but rather classic methods)?

Indeed, memorability effects have been replicated in the literature, but not using the tRSA method. We have expanded our Discussion to clarify the relationship between our findings and the relevant literature, including the methods it has employed.

      Page 38. “Critically, memorability is a robust stimulus property that is consistent across participants and paradigms (Bainbridge, 2022). Moreover, object memorability effects have been replicated using a variety of methods aside from tRSA, including univariate analyses and representational analyses of neural activity patterns where trial-level neural activity pattern estimates are correlated directly with object memorability (Slayton et al, 2025).”

      (13) The abstract could benefit from more nuance; I'm not sure if RSA can indeed be said to be 'the principal method', and whether it's about assessing 'quality' of representations (more commonly, the term 'geometry' or 'structure' is used).

We have edited the abstract to reflect more nuance about the current approaches.

      Abstract. Neural representation refers to the brain activity that stands in for one’s cognitive experience, and in cognitive neuroscience, a prominent method of studying neural representations is representational similarity analysis (RSA). While there are several recent advances in RSA, the classic RSA (cRSA) approach examines the structure of representations across numerous items by assessing the correspondence between two representational similarity matrices (RSMs): usually one based on a theoretical model of stimulus similarity and the other based on similarity in measured neural data.

      (14) RSA is also not necessarily about models vs. neural data; it can also be between two neural systems (e.g., monkey vs. human as in Kriegeskorte et al., 2008) or model systems (see Sucholutsky et al., 2023). This statement is also repeated in the Introduction paragraph 1 (later on, it is correctly stated that comparing brain vs. model is most likely the 'most common' approach).

      We have added these examples in our introduction to RSA.

      Page 3. “One of the central approaches for evaluating information represented in the brain is representational similarity analysis (RSA), an analytical approach that queries the representational geometry of the brain in terms of its alignment with the representational geometry of some cognitive model (Kriegeskorte et al., 2008; Kriegeskorte & Kievit, 2013), or, in some cases, compares the representational geometry of two neural systems (e.g., Kriegeskorte et al., 2008) or two model systems (Sucholutsky et al., 2023)”.

      (15) 'theoretically appropriate' is an ambiguous statement, appropriate for what theory?

      We apologize for the ambiguous wording, and have corrected the text:

      Page 11. “Critically, tRSA estimates were submitted to a mixed-effects model which is statistically appropriate for modeling the hierarchical structure of the data, where observations are nested within both subjects and stimuli (Baayen et al., 2008; Chen et al., 2021)”.

      (16) I found the statement that cRSA "cannot model representation at the level of individual trials" confusing, as it made me think, what prohibits one from creating an RDM based on single-trial responses? Later on, I understood that what the authors are trying to say here (I think) is that cRSA cannot weigh the contributions of individual rows/columns to the overall representational strength differently.

      We thank the reviewer for their clarifying language and have added it to this section of the manuscript.

      Abstract. “However, because cRSA cannot weigh the contributions of individual trials (RSM rows/columns), it is fundamentally limited in its ability to assess subject-, stimulus-, and trial-level variances that all influence representation”.

(17) Why use "RSM" instead of "RDM"? If the pairwise comparison metric is distance-based (e.g., 1-correlation as described by the authors), RDM is more appropriate.

      We apologize for the error, and have clarified the Methods text:

      Page 3-4. “First, brain activity responses to a series of N trials are compared against each other (typically using Pearson’s r) to form an N×N representational similarity matrix”.

      (18) Figure 2: please write 'Correlation estimate' in the y-axis label rather than 'Estimate'.

      We have edited the label in Figure 2.

      (19) Page 6 'leaving uncertain the directionality of any findings' - I do not follow this argument. Obviously one can generate an RDM or RSM from vector v or vector -v. How does that invalidate drawing conclusions where one e.g., partials out the (dis)similarity in e.g., pleasantness ratings out of another RDM/RSM of interest?

      We agree such an approach does not invalidate the partial method; we have clarified what we mean by “directionality”.

      Page 8. “For instance, even though a univariate random variable v, such as pleasantness ratings, can be conveniently converted to an RSM using pairwise distance metrics (Weaverdyck et al., 2020), the very same RSM would also be derived from the opposite random variable −v, leaving uncertain the directionality (i.e., whether representation is strongest for pleasant or unpleasant items) of any findings with the RSM (see also Bainbridge & Rissman, 2018)”.
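This sign ambiguity is easy to verify numerically (a toy check with a hypothetical ratings vector): the pairwise-distance RSM built from a variable v is identical to the one built from its negation.

```python
import numpy as np

rng = np.random.default_rng(3)
v = rng.normal(size=10)     # stand-in for, e.g., pleasantness ratings

def distance_rsm(x):
    # Pairwise absolute-difference dissimilarity matrix for a 1-D variable
    return np.abs(x[:, None] - x[None, :])

rsm_v = distance_rsm(v)
rsm_neg_v = distance_rsm(-v)
# rsm_v and rsm_neg_v are identical, so the direction of v
# cannot be recovered from the RSM alone.
```

The same holds for other symmetric distance metrics, which is why the directionality of an effect is lost at the matrix level.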

      (20) P7 'sampled 19900 pairs of values from a bi-variate normal distribution', but the rows/columns in an RDM are not independent samples - shouldn't this be included in the simulation? I.e., shouldn't you simulate first the n=200 vectors, and then draw samples from those, as in the next analysis?

      This section has been moved to Appendix 1 (see responses to Reviewer 1.13).

      (21) Under data acquisition, please state explicitly that the paper is re-using data from prior experiments, rather than collecting data anew for validating tRSA.

      We have clarified this in the data acquisition section.

      Page 13. “A pre-existing dataset was analyzed to evaluate tRSA. Main study findings have been reported elsewhere (S. Huang, Bogdan, et al., 2024)”.

      (22) Figure 4 could benefit from some more explanation in-text. It wasn't clear to me, for example, how to interpret the asterisks depicted in the right part of the figure.

      We clarified the meaning of the asterisks in the main text in addition to the existent text in the figure caption.

      Page 26. “(see Figure 4, off-diagonal cells in blue; asterisks indicate where tRSA was statistically more sensitive than cRSA)”.

      (23) Page 38 "the outcome of tRSA's improved characterization can be seen in multiple empirical outcomes:" it seems there is one mention of 'outcomes' too many here.

      We have revised this sentence.

      Page 41. “tRSA's improved characterization can be seen in multiple empirical outcomes”.

      (24) Page 38 "model fits became the strongest" it's not clear what aspect of the reported results in the paragraph before this is referring to - the Appendix?

Yes, the model fits are in the Appendix; we have added this in-text citation.

      “Moreover, model fits became the strongest when the models also incorporated trial-level variables such as fMRI run and reaction time (Appendix 3, Table 6)”.

      References

      Diedrichsen, J., Berlot, E., Mur, M., Schütt, H. H., Shahbazi, M., & Kriegeskorte, N. (2021). Comparing representational geometries using whitened unbiased-distance-matrix similarity. Neurons, Behavior, Data and Theory, 5(3). https://arxiv.org/abs/2007.02789

      Diedrichsen, J., & Kriegeskorte, N. (2017). Representational models: A common framework for understanding encoding, pattern-component, and representational-similarity analysis. PLoS Computational Biology, 13(4), e1005508.

      Diedrichsen, J., Yokoi, A., & Arbuckle, S. A. (2018). Pattern component modeling: A flexible approach for understanding the representational structure of brain activity patterns. NeuroImage, 180, 119-133.

      Naselaris, T., Kay, K. N., Nishimoto, S., & Gallant, J. L. (2011). Encoding and decoding in fMRI. NeuroImage, 56(2), 400-410.

      Nili, H., Wingfield, C., Walther, A., Su, L., Marslen-Wilson, W., & Kriegeskorte, N. (2014). A toolbox for representational similarity analysis. PLoS Computational Biology, 10(4), e1003553.

      Schütt, H. H., Kipnis, A. D., Diedrichsen, J., & Kriegeskorte, N. (2023). Statistical inference on representational geometries. ELife, 12. https://doi.org/10.7554/eLife.82566

      Walther, A., Nili, H., Ejaz, N., Alink, A., Kriegeskorte, N., & Diedrichsen, J. (2016). Reliability of dissimilarity measures for multi-voxel pattern analysis. NeuroImage, 137, 188-200.

      King, M. L., Groen, I. I., Steel, A., Kravitz, D. J., & Baker, C. I. (2019). Similarity judgments and cortical visual responses reflect different properties of object and scene categories in naturalistic images. NeuroImage, 197, 368-382.

      Kriegeskorte, N., Mur, M., Ruff, D. A., Kiani, R., Bodurka, J., Esteky, H., ... & Bandettini, P. A. (2008). Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron, 60(6), 1126-1141.

      Sucholutsky, I., Muttenthaler, L., Weller, A., Peng, A., Bobu, A., Kim, B., ... & Griffiths, T. L. (2023). Getting aligned on representational alignment. arXiv preprint arXiv:2310.13018.

    2. Reviewer #2 (Public review):

This paper proposes two changes to classic RSA, a popular method for probing neural representation in neuroimaging experiments: computing RSA at the row/column level of the RDM, and using linear mixed modeling to compute second-level statistics, using the individual rows/columns to estimate a random effect of stimulus. The benefit of the new method is demonstrated using simulations and a re-analysis of a prior fMRI dataset on object perception and memory encoding.

The authors’ claim that tRSA is a promising approach to perform more complete modeling of cognitive neuroscience data, and to conceptualize representation at the single trial/event level (cf. Discussion section on P42), is appealing.

In their revised manuscript, the authors have addressed some previous concerns, now referencing more literature aiming to improve RSA and its associated statistical inferences, and providing more guidance on methodological considerations in the Discussion. However, I wish the authors had more extensively edited the Introduction to better contextualize the work and clarify the specific settings in which they see the method as being beneficial over classic RSA. For example, some of the limitations of cRSA mentioned on page 6, e.g. related to presenting the same stimuli to multiple subjects, seem to be quite specific to settings where the researcher expects differential responses across subjects to fundamentally alter the interpretation, rather than something that will just average out by repeatedly offering the same stimulus, or combining data across subjects. It's not clear to me how the switch from 'matrix-level' to 'row-level' analysis in tRSA necessarily addresses this problem. It would be very helpful if the authors would more explicitly outline what problem the row-level aspect of tRSA is solving; what problem statistical inference via LMM is solving; and walk the reader through a very specific use case (perhaps a toy version of the real-data experiment which is now at the end of the paper). Explaining the utility of tRSA for experimental settings in which assessing representational strength for single events is crucial would clarify the contribution of this new method better.

A few weaknesses mentioned in my previous review were not adequately addressed. To demonstrate the utility of the method on real neural recordings, only a single dataset is used, with a quite complicated experimental design; it's not clear if there is any benefit of using tRSA on a simpler real dataset. Moreover, the cells of an RDM/RSM reflect pairwise comparisons between response patterns. Because the response patterns are repeatedly compared, the cells of this matrix are not independent of one another. While the authors show examples where failure to meet independence assumptions does not affect results in their specific dataset, it does not get acknowledged as a problem at a more fundamental level. Finally, while the paper now states that 'simulations and example tRSA code' are publicly available, the link points to the lab's general GitHub page containing many lab repositories, in which I could not identify a specific repository related to this paper. This is disappointing given that the main goal of this manuscript is to provide a new method that they encourage others to use; a clear pointer to available code is only a minimal requirement to achieve that goal. A dedicated repository, including documentation, READMEs, and tutorials/demos to run simulations, compare methods, etc., would greatly enhance the paper's contribution.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review): 

In this manuscript, Dillard and colleagues integrate cross-species genomic data with a systems approach to identify potential driver genes underlying human GWAS loci and establish the cell type(s) within which these genes act and potentially drive disease. Specifically, they utilize a large single-cell RNA-seq (scRNA-seq) dataset from an osteogenic cell culture model - bone marrow-derived stromal cells cultured under osteogenic conditions (BMSC-OBs) - from a genetically diverse outbred mouse population called the Diversity Outbred (DO) stock to discover network driver genes that likely underlie human bone mineral density (BMD) GWAS loci. The DO mice segregate over 40M single nucleotide variants, many of which affect gene expression levels, therefore making this an ideal population for systems genetic and co-expression analyses. The current study builds on previously published work from the same group that used co-expression analysis to identify co-expressed "modules" of genes that were enriched for BMD GWAS associations. In this study, the authors utilize a much larger scRNA-seq dataset from 80 DO BMSC-OBs, infer co-expression-based and Bayesian networks for each identified mesenchymal cell type, focus on networks with dynamic expression trajectories that are most likely driving differentiation of BMSC-OBs, and then prioritize genes ("differentiation driver genes" or DDGs) in these osteogenic differentiation networks that had known expression or splicing QTLs (eQTL/sQTLs) in any GTEx tissue that colocalized with human BMD GWAS loci. The systems analysis is impressive, the experimental methods are described in detail, and the experiments appear to be carefully done. The computational analysis of the single-cell data is comprehensive and thorough, and the evidence presented in support of the identified DDGs, including Tpx2 and Fgfrl1, is for the most part convincing.
Some limitations in the data resources and methods hamper enthusiasm somewhat and are discussed below. Overall, while this study will no doubt be valuable to the BMD community, the cross-species data integration and analytical framework may be more valuable and generally applicable to the study of other diseases, especially for diseases with robust human GWAS data but for which robust human genomic data in relevant cell types is lacking. 

Specific strengths of the study include the large scRNA-seq dataset on BMSC-OBs from 80 DO mice, the clustering analysis to identify specific cell types and sub-types, the comparison of cell type frequencies across the DO mice, and the CELLECT analysis to prioritize cell clusters that are enriched for BMD heritability (Figure 1). The network analysis pipeline outlined in Figure 2 is also a strength, as is the pseudotime trajectory analysis (results in Figure 3). One weakness involves the focus on genes that were previously identified as having an eQTL or sQTL in any GTEx tissue. The authors rightly point out that the GTEx database does not contain data for bone tissue, but reason that eQTLs can be shared across many tissues; this assumption is valid for many cis-eQTLs, but it could also exclude many genes as potential DDGs with effects that are specific to bone/osteoblasts. Indeed, the authors show that important BMD driver genes have cell-type-specific eQTLs. Furthermore, the mesenchymal cell type-specific co-expression analysis by iterative WGCNA identified an average of 76 co-expression modules per cell cluster (range 26-153). Based on the limited number of genes that are detected as expressed in a given cell due to sparse per-cell read depth (400-6200 reads/cell) and dropouts, it's hard to believe that as many as 153 co-expression modules could be distinguished within any cell cluster. I would suspect some degree of model overfitting here and would expect that many/most of these identified modules have very few gene members, but the methods list a minimum module size of 20 genes. How do the numbers of modules identified in this study compare to other published scRNA-seq studies that use iterative WGCNA?

      In the section "Identification of differentiation driver genes (DDGs)", the authors identified 408 significant DDGs and found that 49 (12%) were reported by the International Mouse Knockout [sic] Consortium (IMPC) as having a significant effect on whole-body BMD when knocked out in mice. Is this enrichment significant? E.g., what is the background percentage of IMPC gene knockouts that show an effect on whole-body BMD? Similarly, they found that 21 of the 408 DDGs were genes that have BMD GWAS associations that colocalize with GTEx eQTLs/sQTLs. Given that there are > 1,000 BMD GWAS associations, is this enrichment (21/408) significant? Recommend performing a hypergeometric test to provide statistical context to the reported overlaps here. 
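For context, the enrichment test suggested here could be sketched as follows; the background totals M and n below are hypothetical placeholders (the true IMPC totals would need to be looked up), while N and k are the counts reported above (408 DDGs, 49 with an IMPC BMD phenotype).

```python
from math import comb

def hypergeom_sf(k, M, n, N):
    """P(X >= k) for a hypergeometric draw: N genes drawn from a pool of M,
    of which n are 'successes' (here, knockouts with a whole-body BMD effect)."""
    total = comb(M, N)
    return sum(comb(n, i) * comb(M - n, N - i)
               for i in range(k, min(n, N) + 1)) / total

# Hypothetical background: M genes tested by the IMPC, n with a BMD effect.
M, n = 7000, 500
# Observed overlap from the study: 408 DDGs, 49 with an IMPC BMD phenotype.
N, k = 408, 49

p = hypergeom_sf(k, M, n, N)
print(f"hypergeometric enrichment p = {p:.3g}")
```

With these placeholder background numbers the expected overlap by chance is about 29 genes, so an overlap of 49 would be a significant enrichment; the actual conclusion depends on the real IMPC totals.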

We thank the reviewer for their constructive feedback and thoughtful questions. Regarding iterativeWGCNA, a larger number of modules is sometimes an outcome of the analysis, as reported in the iterativeWGCNA preprint (Greenfest-Allen et al., 2017). While we did not make a comparison to other works leveraging this tool for scRNA-seq, it has been used broadly across other published studies, such as PMID: 39640571, 40075303, 33677398, 33653874. While model overfitting, as you mention, may be a cause of the larger number of modules, the Bayesian network analysis we perform after iterativeWGCNA highlights smaller aspects of co-expression modules, as opposed to focusing on the entirety of any given module.

      We did not perform enrichment or statistical tests as our goal was to simply highlight attributes or unique features of these genes for additional context.

      Reviewer #2 (Public review): 

      Summary: 

      In this manuscript, Farber and colleagues have performed single-cell RNAseq analysis on bone marrow-derived stem cells from DO Mice. By performing network analysis, they look for driver genes that are associated with bone mineral density GWAS associations. They identify two genes as potential candidates to showcase the utility of this approach. 

      Strengths: 

      The study is very thorough and the approach is innovative and exciting. The manuscript contains some interesting data relating to how cell differentiation is occurring and the effects of genetics on this process. The section looking for genes with eQTLs that differ across the differentiation trajectory (Figure 4) was particularly exciting. 

      Weaknesses: 

      The manuscript is in parts hard to read due to the use of acronyms and there are some questions about data analysis that need to be addressed. 

      We thank the reviewer for their feedback and shared enthusiasm for our work. We tried to minimize the use of technical acronyms as much as we could without compromising readability. Additionally, we addressed questions regarding aspects of data analysis. 

      Reviewer #1 (Recommendations for the authors):

      (1) For increased transparency and to allow reproducibility, it would be necessary for the scripts used in the analysis to be shared along with the publication of the preprint. Also, where feasible, sharing the processed data in addition to the raw data would allow the community greater access to the results and be highly beneficial. 

      Thank you for this suggestion. The raw data will be available via GEO accession codes listed in the data availability statement. We will make available scripts for some analyses on our Github (https://github.com/Farber-Lab/DO80_project) and processed scRNA-seq data in a Seurat object (.rds) on Zenodo (https://zenodo.org/records/15299631).

      (2) Lines 55-76: I think the summary of previous work here is too long. I understand that they would like to cover what has been done previously, but this seems like overkill. 

      Good suggestion. We have streamlined some of the summary of our previous work.

      (3) Did the authors try to map QTL for cell-type proportion differences in their BMSC-OBs? While 80 samples certainly limit mapping power, the data shown in Figs 4C/D suggest that you might identify a large-effect modifier of LMP/OB1 proportions. 

      We did try to map QTL for cell type proportion differences, but no significant associations were identified. 

      (4) Methods question: Does the read alignment method used in your analysis account for SNPs/indels that segregate among the DO/CC founder strains? If not, the authors may wish to include this in their discussion of study limitations and speculate on how unmapped reads could affect expression results. 

      The read alignment method we used does not account for SNPs/indels from the DO founder strains that fall in RNA transcripts captured in the scRNA-seq data. We have included this as a limitation in our discussion (line 422-424). 

      (5) Much of the discussion reads as an overview of the methods, while a discussion of the results and their context to the existing BMD literature is relatively lacking in comparison.

      We have added additional explanation of the results and context to the discussion (line 381-382, 396-407). 

      (6) Figure 1E and lines 146-149: Adjusted p values should be reported in the figure and accompanying text instead of switching between unadjusted and adjusted p values. 

      We updated Figure 1e to portray adjusted p-values and listed them in the legend of Figure 1e and in the main text (line 153-154).

      (7) Why do the authors bring the IMPC KO gene list into the analysis so late? This seems like a highly relevant data resource (moreso than the GTEx eQTLs/sQTLs) that could have been used much earlier to help identify DDGs. 

      Given that our scRNA-seq data is also from mice, we did choose to integrate information from the IMPC to highlight supplemental features of genes in networks (i.e., genes that have an experimentally-tested and significant effect on BMD in mice). However, our primary goal was to inform human GWAS and leverage our previous work in which we identified colocalizations between human BMD GWAS and eQTL/sQTL in a human GTEx tissue, which is why this information was used to guide our network analysis.

      (8) Does Fgfrl1 and/or Tpx2 have a cis-eQTL in your BMSC-OB scRNA-seq dataset? 

      We did not identify cis-eQTL effects for Fgfrl1 and Tpx2.

      (9) Figure 4B-C: These eQTLs may be real, but based on the diplotype patterns in Figure 4C, I suspect they are artifacts of low mapping power that are driven by rare genotype classes with one or two samples having outlier expression results. For example, if you look at the results in Fig 4C for S100a1 expression, the genotype classes with the highest/lowest expression have lower sample numbers. In the case of Pkm eQTL showing a PWK-low effect, the PWK genome has many SNPs that differ from the reference genome in the 3' UTR of this gene, and I wonder if reads overlapping these SNPs are not aligning correctly (see point 4 above) and resulting (falsely) in lower expression values for samples with a PWK haplotype. 

      As mentioned above, our alignment method did not consider DO founder genetic variation that is specifically located in the 3’ end of RNA transcripts in the scRNA-seq data. We have included this as a limitation in our discussion (line 422-424).

      In future studies, we intend to include larger populations of mice to potentially overcome, as you mention, any artifacts that may be attributable to low statistical power, rare genotype classes, or outlier expression.

      Reviewer #2 (Recommendations for the authors):

      Major Points 

      (1) The authors hypothesize "that many genes impacting BMD do so by influencing osteogenic differentiation or possibly bone marrow adipogenic differentiation". However, cell type itself does not correlate with any bone trait. Does this indicate that the hypothesis is not entirely correct, as genes that drive these phenotypes would not be enriched in one particular cell type? The authors have previously identified "high-priority target genes". So, are there any cell types that are enriched for these target genes? If not, this would indicate that all these genes are more ubiquitously expressed and this is probably why they would have a greater effect on the overall bone traits. Furthermore, are the 73 eGenes (so genes with eQTLs in a particular cell type that change around cell type boundaries) or the DDGs (Table 1) enriched for these high-priority target genes? 

      The bone traits measured in the DO mice are complex and impacted by many factors, including the differentiation propensity and abundance of certain cell types, both within and outside of bone. Though we did not identify correlations between cell type abundance and the bone traits we measured, we tailored our investigations to focus on cellular differentiation using the scRNA-seq data. However, future studies would need to be performed to investigate any connections between cellular differentiation, cell type abundance, and bone traits.

      We did not perform enrichment analyses of either the target genes identified from our other work or eGenes identified here, but instead used the target gene list to center our network analysis and the eGenes to showcase the utility of the DO mouse population.

      (2) The readability of the paper could be improved by minimising the use of acronyms and there are several instances of confusing wording throughout the paper. In many cases, this can be solved by re-organising sentences and adding a bit more detail. For example, it was unclear how you arrived at Fgfrl1 or Tpx2.

      One of the goals of our study was to identify genes that have (to our knowledge) little to no known connection to BMD. We chose to highlight Fgfrl1 and Tpx2 because there is minimal literature characterizing these genes in the context of bone, which we speak to in the results (line 296-297). Additionally, we prioritized these genes in our previous work, and they were identified in this study through our network analyses of the scRNA-seq data, which we mention in the results (line 276-279).

      (3) Technical aspects of the assay. In Figure 1d you show that the cell populations vary considerably between different DO mice. It would be useful to give some sense of the technical variance of this assay given that the assay involves culturing the cells in an exogenous environment. This could take the form of tests between mice within the same inbred strain, or even between different legs of the same DO mice to show that results are technically very consistent. It might also be prudent to identify that this is a potential limitation of the approach as in vitro culturing has the potential to substantially change the cell populations that are present. 

      We agree that in vitro culturing and the preparation of single cells for scRNA-seq are unavoidable sources of technical variation in this study. However, the total number of cells contributed by each of the 80 DO mice after data processing does not appear to be skewed, and the distribution appears normal (see added figures, now included as Supplemental Figure 3). Therefore, technical variation is at least consistent across all samples. Nevertheless, we have mentioned the potential for technical variation artifacts in our study in the discussion (line 414-416).

      (4) Need for permutation testing. "We identified 563 genes regulated by a significant eQTL in specific cell types. In total, 73 genes with eQTLs were also tradeSeq-identified genes in one or more cell type boundaries". These types of statements are fine but they need to be backed up with permutation testing to show that this level of enrichment is greater than one would expect by chance. 

      We did not perform enrichment tests as our only goals were to (1) determine whether eQTL could be resolved in the DO mouse population using our scRNA-seq data and (2) predict in which cell type the associated eQTL and eGene may have an effect.

      (5) The main novelty of the paper seems to be that you have used single-cell RNA seq (given that you appear to have already detailed the candidates at the end). I don't think this makes the paper less interesting, but I think you need to reframe the paper more about the approach, and not the specific results. How you landed on these candidates is also not clear. So the paper might be improved by more robustly establishing the workflow and providing guidelines for how studies like this should be conducted in the future. 

      We sought to not only devise a rigorous approach to analyze our single cell data, but also showcase the utility of the approach in practice by highlighting targets for future research (i.e., Fgfrl1 and Tpx2).

      Our goal was to identify novel genes and we landed on these candidate genes (Fgfrl1 and Tpx2) because they had substantial data supporting their causality and they have yet to be fully characterized in the context of bone and BMD (line 295-297).

      In regards to establishing the workflow, we have included rationale for specific aspects of our approach throughout the paper. For example, Figure 2 itemizes each step of our network analysis, and we explain why each step is utilized throughout various parts of the results (e.g., lines 168-170, 179-181, 191-193, 202-203, 257-260, 276-277).

      We have added a statement advocating for large-scale scRNA-seq from genetically diverse samples and network analyses for future studies (line 436-438).

      Minor Points 

      (1) In the summary you use the word "trajectory". Trajectories for what? I assume the transition between cell types, but this is not clear. 

      We added text to clarify the use of trajectory in the summary (line 34).

      (2) This sentence: "By identifying networks enriched for genes implicated in GWAS we predicted putatively causal genes for hundreds of BMD associations based on their membership in enriched modules." is also not clear. Do you mean: "we predicted putatively causal genes by identifying clusters of co-expressed genes that were enriched for GWAS genes"? It is not clear how you identify the causal gene in the network. Is this just based on the hub gene? 

      The aforementioned sentence has since been removed to streamline the introduction, as suggested by Reviewer 1.

      In regards to causal gene identification, it is not based on whether a gene is a hub gene. We prioritized a DDG (and its associated network) if it was a causal gene that we identified in our previous work as having eQTL/sQTL in a GTEx tissue that colocalizes with human BMD GWAS.

      (3) Figure 3C. This is good but the labels are quite small. Would be good to make all the font sizes larger. 

      We have enlarged Figure 3C.

      (4) Line 341 in the Discussion should be "pseudotemporal". 

      We have edited “temporal” to “pseudotemporal”.

    1. A traffic signal was installed at the intersection of Clay Street and Eagle Avenue. Before it was installed, many vehicles raced through the intersection without regard to the posted speed limit. Pedestrians, many of whom were students, crossed the intersection to get to and from campus, but they had to dodge in and out of constant traffic. The number of accidents rose far past an acceptable limit. Indeed, one recent accident resulted in a loss of life.

      When I read this paragraph, it feels like the writer is showing how one problem led to another until things got serious. Instead of just saying “a traffic light was needed,” the paragraph walks me through all the small issues that built up to that decision. It’s basically connecting the dots so I can see why the final outcome actually makes sense.

    1. When someone creates content that goes viral, they didn’t necessarily intend it to go viral, or viral in the way that it does. If a user posts a joke, and people share it because they think it is funny, then their intention and the way the content goes viral is at least somewhat aligned. If a user tries to say something serious, but it goes viral for being funny, then their intention and the virality are not aligned. Let’s look at some examples of the relationship between virality and intent. 12.4.1. Building on the original intention# Content is sometimes shared without modification fitting the original intention, but let’s look at ones where there is some sort of modification that aligns with the original intention. We’ll include several examples on this page from the TikTok Duet feature, which allows people to build off the original video by recording a video of themselves to play at the same time next to the original. So for example, This tweet thread of TikTok videos (cross-posted to Twitter) starts with one Tiktok user singing a short parody musical of an argument in a grocery store. The subsequent tweets in the thread build on the prior versions, first where someone adds themselves singing the other half of the argument, then where someone adds themselves singing the part of their child, then where someone adds themselves singing the part of an employee working at the store1:

      This section shows that virality isn’t just about how many people see something, it’s also about how others reshape it, sometimes in ways that support the creator’s original purpose. Features like TikTok Duets can amplify a post by letting others add on in a way that stays aligned with the initial intent, but that alignment isn’t guaranteed.

    1. people susceptible to conspiratorial and reactionary thinking and sending them increasingly deeper into Flat Earth evangelism.

      When I first started reading this section, I was prepared to argue that we shouldn't downplay crazy theories just because they're crazy (content like that will always exist). However, I had no idea that these sorts of videos were being peddled to people prone to believing it. On one hand, it's a matter of prioritizing interests and creating a good algorithm. I'm not entirely sure about my ethical stance on this.

    1. Nor did the Declaration speak for all the white colonists.

      It's important to remember that the ideas of America that are valued today weren't what was being fought over then. In a nutshell, it was just the ruling class of the colonies doing what they were already doing, just without outside influence.

    1. As a device this takes some getting used to, though it makes more sense as the book goes on and the claustrophobia builds. To begin with it acts as a barrier, as though one must fight one’s way into a book that is resisting being read.

      Wow, okay so it's not just me, thank you! Though during the last hundred pages I had gotten used to it, and the resistance turned into an urge to not stop reading, while simultaneously being repulsed.

    1. 12.7. Activity: Value statements in what goes viral# 12.7.1. Choose three scenarios# When content goes viral there may be many people with a stake in it’s going viral, such as: The person (or people) whose content or actions are going viral, who might want attention, or get financial gain, or might be embarrassed or might get criticism or harassment, etc. Different people involved might have different interests. Some may not have awareness of it happening at all (like a video of an infant). Different audiences might have interests such as curiosity or desire to bring justice to a situation or desire to get attention for themselves or their ideas based on engaging the viral content, or desire to troll or harass others. Social networking platforms might have interests such as increased attention to their platform or increased advertising, or increased or decreased reputation (in views of different audiences). List at least three different scenarios of content going viral and list out the interests of different groups and people in the content going viral. 12.7.2. Create value statements# Social media platforms have some ability to influence what goes viral and how (e.g., recommendation algorithms, what actions are available, what data is displayed, etc.), though they only have partial control, since human interaction and organization also play a large role. Still, regardless of whether we can force any particular outcome, we can still consider of what you think would be best for what content should go viral, how much, and in what ways. Create a set of value statements for when and how you ideally would want content to go viral. Try to come up with at least 10 value statements. We encourage you to consider different ethics frameworks as you try to come up with ideas.

      This section clearly shows that virality isn’t neutral and always involves tradeoffs between different groups. I liked how the examples highlight that what benefits platforms or audiences can still harm individuals, especially through misinformation or loss of privacy. It also made me think more about how recommendation systems should reflect ethical values, not just engagement metrics.

    1. An essay is good if it has sprung from necessity. Imagine if we could teach it that way.

      This feels like the main point of the whole essay. Writing works best when it comes from something you actually need to work through, not just because it’s assigned.

    2. What you do need is That Thing; maybe a question, a fear or a fury. It makes your blood boil. It’s all you can talk about when you sit down with your friends over a glass of wine or two or five, or maybe you can’t talk about it with anyone, just your own heart, alone with the impossible architecture of words

      It's me, Ethan. This part of the text seems to let students help produce ideas, but also includes methods for doing so. If they can compare the subject to something they are already interested in, they will obviously want to find out more about it.

    1. How Socrates inspired Rangers' motivational T-shirt slogan
       Published in: Fort Worth Star-Telegram (TX), Feb 21, 2017, Points of View Reference Source
       By: Stevenson, Stefan

       Feb. 21--SURPRISE, Ariz. -- Before the Texas Rangers opened their first full squad workout of spring training Tuesday morning, manager Jeff Banister held his annual team meeting with players, coaches and support staff. Besides introducing faces to the new players in the major league clubhouse, Banister reminded the players of the meaning of the '86400' and 'The Game Knows' slogans on the front and back of the workout shirts he had made up. The front of the shirts read simply: "86400." "The Game Knows" runs across the back. There are 86,400 seconds in a day, Banister said, and "They go by very quickly. Don't waste any of them." "Very simple message," said Banister, who nevertheless invoked Socrates' famous quote "An unexamined life is not worth living" as inspiration. "Socrates talked about evaluating and reflecting on your own life and where you've been and how you take inventory," he said. "I thought about how do we take inventory on what we do each day. It's a lot of seconds but they go by very fast but how are you preparing and utilizing them on a daily basis to play a game that you're very skilled at?" The message on the back, "The Game Knows," is a reminder to the players that they're "all in this game together." "It's a reflection of how you play is what you put into it," Banister said. "It's just a reminder that great teammates challenge each other and they lead by example. Our guys are great at that. They prepare extremely well, they work hard." Banister's message also included a reminder to his players that spring is about baby steps. "We didn't say this was go time. Go time is when you get the flyovers and you're standing on the foul lines and the stands are full," he said. "That's go time. Right now, it's get ready time and get better time." He also reminded them of a mission unaccomplished. "The reality is we haven't completed our ultimate mission. That will always be the goal for us. But we can't do that today, we can't complete it today," he said. "If you're a team that looks back on what you've done in the past you don't really pay attention to what's right in front of you." Stefan Stevenson: 817-390-7760, @StevensonFWST

       (c)2017 the Fort Worth Star-Telegram. Visit the Fort Worth Star-Telegram at www.star-telegram.com. Distributed by Tribune Content Agency, LLC.
    1. reply to u/Yiqu at https://old.reddit.com/r/typewriters/comments/1r08q2i/buying_a_first_typewriter_for_writing/

      The first three articles you'll find under https://boffosocko.com/research/typewriter-collection/#Typewriter%20Market will give you a quick crash course about what to consider and look out for in your search.

      If you want to get to work, your best bet (and honestly the best value) is to get something from a repair shop that is serviced and ready to go. In the US this means a budget range typically from US$300-$600, or perhaps slightly more if they've recovered the platen, which will improve your experience. Prices dramatically in excess of this often include a lot more custom work or less common typefaces, which don't necessarily improve your performance (or come from people selling typewriters who have less of an idea about typewriters than you do).

      Many hobbyists here may say to get something cheap that "works", but the amount of time and knowledge you have to scaffold to do that is worth a lot of writing time, and often still requires a lot of cleaning, restoration, and potential tinkering which is even more onerous when you just want to get to writing.

      In case you haven't found them, some great resources on leveraging typewriters as distraction-free writing devices:

      And if you need some serious distraction free advice, since it's hiding as deep knowledge amongst a handful of serious collectors/writers, the bigger your (standard) machine, the more visual space it takes up as you're writing and subtly helps your concentration. Similarly placing it in front of a wall (and not a window) helps a lot too.

    1. Design Justice

      It’s wild how much tech can fail when the people building it don't represent the people using it, like that soap dispenser that literally couldn't see dark skin. It really drives home the point that design justice isn't just about the final product, but about making sure disabled and marginalized people are actually the ones in the room making the decisions.

    1. Assistive technologies

      This section highlights how managing disability accessibility is shifting from putting the burden on individuals to "mask" or fix themselves toward designers creating flexible tools that actually adapt to the person. It’s a move from just making someone act "normal" with assistive tech to using ability-based design where programs change their own behavior to match what a user can do.

    1. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This manuscript uses primarily simulation tools to probe the pathway of cholesterol transport with the smoothened (SMO) protein. The pathway to the protein and within SMO is clearly discovered, and interactions deemed important are tested experimentally to validate the model predictions.

      Strengths:

      The authors have clearly demonstrated how cholesterol might go from the membrane through SMO for the inner and outer leaflets of a symmetrical membrane model. The free energy profiles, structural conformations, and cholesterol-residue interactions are clearly described.

      We thank the reviewer for their kind words.

      (1) Membrane Model: The authors decided to use a rather simple symmetric membrane with just cholesterol, POPC, and PSM at the same concentration for the inner and outer leaflets. This is not representative of asymmetry known to exist in plasma membranes (SM only in the outer leaflet and more cholesterol in this leaflet). This may also be important to the free energy pathway into SMO. Moreover, PE and anionic lipids are present in the inner leaflet and are ignored. While I am not requesting new simulations, I would suggest that the authors should clearly state that their model does not consider lipid concentration leaflet asymmetry, which might play an important role.

      We thank the reviewer for their comment. Membrane asymmetry is inherent in endogenous systems; we acknowledge that as a limitation of our current model. We have addressed the comment by adding this limitation to our discussion in the manuscript.

      Added lines: (End of paragraph 6, Results subsection 2):

      “One possibility that might alter the thermodynamic barriers is native membrane asymmetry, particularly the anionic lipid-rich inner leaflet. This presents as a limitation of our current model.”

      (2) Statistical comparison of barriers: The barriers for pathways 1 and 2 are compared in the text, suggesting that pathway 2 has a slightly higher barrier than pathway 1. However, are these statistically different? If so, the authors should state the p-value. If not, then the text in the manuscript should not state that one pathway is preferred over the other.

      We thank the reviewer for their comment. We have added statistical t-tests for the barriers.

      Changes made: (Paragraph 6, Results subsection 2)

      “However, we also observe that pathway 1 shows a lower thermodynamic barrier (5.8 ± 0.7 kcal/mol v/s 6.5 ± 0.8 kcal/mol, p = 0.0013)”
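For reference, a two-sample t-test on barrier heights can be run directly from summary statistics. In this sketch the replica counts (`n1`, `n2`) are assumed placeholders, since the number of independent PMF estimates is not stated in this exchange, so the printed p-value is illustrative only and is not expected to match the reported value:

```python
# Sketch of a Welch's t-test comparing two free-energy barriers from
# summary statistics (mean ± s.d.), as one might compare pathway 1 vs 2.
# NOTE: n1 and n2 are assumed replica counts, not values from the study.
from scipy.stats import ttest_ind_from_stats

mean1, sd1 = 5.8, 0.7   # pathway 1 barrier (kcal/mol)
mean2, sd2 = 6.5, 0.8   # pathway 2 barrier (kcal/mol)
n1 = n2 = 20            # assumed number of independent barrier estimates

t_stat, p_value = ttest_ind_from_stats(mean1, sd1, n1,
                                       mean2, sd2, n2,
                                       equal_var=False)  # Welch's variant
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

`ttest_ind_from_stats` is convenient here because only the means, standard deviations, and sample sizes are needed, not the underlying per-replica PMF profiles.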

      (3) Barrier of cholesterol (reasoning): The authors on page 7 argue that there is an enthalpy barrier between the membrane and SMO due to the change in environment. However, cholesterol lies in the membrane with its hydroxyl interacting with the hydrophilic part of the membrane and the other parts in the hydrophobic part. How is the SMO surface any different? It has both characteristics and is likely balanced similarly to uptake cholesterol. Unless this can be better quantified, I would suggest that this logic be removed.

      We thank the reviewer for this suggestion. We have removed the line to avoid confusion.

      Reviewer #2 (Public review):

      Summary:

      In this work, the authors applied a range of computational methods to probe the translocation of cholesterol through the Smoothened receptor. They test whether cholesterol is more likely to enter the receptor straight from the outer leaflet of the membrane or via a binding pathway in the inner leaflet first. Their data reveal that both pathways are plausible but that the free energy barriers of pathway 1 are lower, suggesting this route is preferable. They also probe the pathway of cholesterol transport from the transmembrane region to the cysteine-rich domain (CRD).

      Strengths:

      (1) A wide range of computational techniques is used, including potential of mean force calculations, adaptive sampling, dimensionality reduction using tICA, and MSM modelling. These are all applied rigorously, and the data are very convincing. The computational work is an exemplar of a well-carried out study.

      (2) The computational predictions are experimentally supported using mutagenesis, with an excellent agreement between their PMF and mRNA fold change data.

      (3) The data are described clearly and coherently, with excellent use of figures. They combine their findings into a mechanism for cholesterol transport, which on the whole seems sound.

      (4) The methods are described well, and many of their analysis methods have been made available via GitHub, which is an additional strength.

      Weaknesses:

      (1) Some of the data could be presented a little more clearly. In particular, Figure 7 needs additional annotation to be interpretable. Can the position of the cholesterol be shown on the graph so that we can see the diameter change more clearly?

      We thank the reviewer for this suggestion. We have added the cholesterol positions as requested.

      Changes made: (Caption, Figure 7)

      “The tunnel profile during cholesterol translocation in SMO. (a) Free energy plot of the z-coordinate v/s the tunnel diameter when cholesterol is present in the core TMD. The tunnel shows a spike in the radius in the TMD domain, indicating the presence of a cholesterol-accommodating cavity. (b) Representative figure for the tunnel when a cholesterol molecule is in the TMD. (c) Same as (a), when cholesterol is at the TMD-CRD interface. (d) Same as (b), when cholesterol is at the TMD-CRD interface. (e) Same as (a), when cholesterol is at the CRD binding site. (f) Same as (b), when cholesterol is at the CRD binding site. Tunnel diameters shown as spheres. Cholesterol positions marked on plots using dotted lines. All snapshots presented are frames taken from MD simulations.”

      (2) In Figure 3C, it doesn’t look like the Met is constricting the tunnel at all. What residue is constricting the tunnel here? Can we see the Ala and Met panels from the same angle to compare the landscapes? Or does the mutation significantly change the tunnel? Why not A283 to a bulkier residue? Finally, the legend says that the figure shows that cholesterol can still pass this residue, but it doesn’t really show this. Perhaps if the HOLE graph was plotted, we could see the narrowest point of the tunnel and compare it to the size of cholesterol.

      We thank the reviewer for this suggestion. A283 was mutated to methionine as it has a longer side chain containing sulfur. We have plotted the tunnel radii for both WT and A283M mutants and added them as a supplemental figure. As shown in the figure, the methionine does not completely block the tunnel but occludes it, slightly increasing the barrier for cholesterol transport.

      Changes made: (End of Results subsection 1)

“When we calculated the PMF for cholesterol entry, the A<sup>2.60f</sup>M mutant showed a restricted tunnel but did not fully block it (Figure 3—figure Supplement 3).”

      (3) The PMF axis in 3b and d confused me for a bit. Looking at the Supplementary data, it’s clear that, e.g., the F455I change increases the energy barrier for chol entering the receptor. But in 3d this is shown as a -ve change, i.e., favourable. This seems the wrong way around for me. Either switch the sign or make this clearer in the legend, please.

We thank the reviewer for this suggestion. We measured ∆PMF as PMF<sub>WT</sub> − PMF<sub>mutant</sub>, hence the negative values. We have added additional text to the legend to clarify this.

      Changes made: (Caption, Figure 3)

“(b) ∆Gli1 mRNA fold change (high SHH vs. untreated) and ∆PMF (difference of peak PMF, calculated as PMF<sub>WT</sub> − PMF<sub>mutant</sub>) plotted for the mutants in Pathway 1. (c) Example mutant A<sup>2.60f</sup>M shows that cholesterol can enter SMO through Pathway 1 even with a bulky mutation. (d) Same as (b) but for Pathway 2. (e) Example mutant L<sup>5.62f</sup>A shows that cholesterol can enter SMO through Pathway 2 due to lesser steric hindrance. All snapshots presented are frames taken from MD simulations.”

      Changes made: (Caption, Figure 6)

“(b) ∆Gli1 mRNA fold change (high SHH vs. untreated) and ∆PMF (difference of peak PMF, calculated as PMF<sub>WT</sub> − PMF<sub>mutant</sub>) plotted for mutants along the TMD-CRD pathway. (c, d) Example mutants Y<sup>LD</sup>A and F<sup>5.65f</sup>A show that cholesterol is unable to translocate through this pathway because of the loss of crucial hydrophobic contacts provided by Y207 and F484 along the solvent-exposed pathway.”
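The sign convention in these legends (∆PMF = PMF<sub>WT</sub> − PMF<sub>mutant</sub>) can be illustrated with a toy calculation. The numbers below are hypothetical placeholders, not the manuscript's actual PMF values:

```python
# Sign convention from the figure legends: dPMF = PMF_WT - PMF_mutant.
# A mutant with a HIGHER entry barrier therefore yields a NEGATIVE dPMF,
# which is why an unfavourable mutation such as F455I appears as a
# negative change. All numbers here are hypothetical placeholders.
pmf_wt = 5.8       # peak PMF for wild type (kcal/mol, hypothetical)
pmf_f455i = 7.1    # peak PMF for the F455I mutant (kcal/mol, hypothetical)

d_pmf = pmf_wt - pmf_f455i
print(d_pmf)  # negative: the barrier increased, so entry is less favourable
```
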

      (4) The impact of G280V is put down to a decrease in flexibility, but it could also be a steric hindrance. This should be discussed.

      We thank the reviewer for this suggestion. We have added it as a possible mechanism of the decrease in activity of SMO.

      Changes made: (Paragraph 5, Results subsection 1)

“We mutated G280<sup>2.57f</sup> to valine (G<sup>2.57f</sup>V) to test whether reducing the flexibility of TM2 prevents cholesterol entry into the TMD. Consequently, the activity of mSMO decreased. However, this decrease could also be attributed to the steric hindrance added by valine’s bulky isopropyl group.”

(5) Are the reported energy barriers of the two pathways (5.8 ± 0.7 and 6.5 ± 0.8 kcal/mol) significantly and/or substantially different enough to favour one over the other? This could be discussed in the manuscript.

      We thank the reviewer for this suggestion. We have added statistical t-tests for the barriers.

      Changes made: (Paragraph 6, Results subsection 2)

“However, we also observe that pathway 1 shows a lower thermodynamic barrier (5.8 ± 0.7 kcal/mol vs. 6.5 ± 0.8 kcal/mol, p = 0.001).”
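A minimal sketch of the kind of significance test added here is a Welch's two-sample t-test on bootstrap estimates of the two barriers. The replicate values below are hypothetical, since the actual estimates are not reported in this response:

```python
import math

# Hypothetical bootstrap estimates of the peak PMF barrier (kcal/mol)
# for each pathway; the real replicate values are not given in the text.
pathway1 = [5.2, 5.6, 5.8, 6.1, 6.3]   # mean ~5.8 kcal/mol
pathway2 = [6.0, 6.3, 6.5, 6.7, 7.0]   # mean ~6.5 kcal/mol

def welch_t(a, b):
    """Welch's unequal-variance t statistic and degrees of freedom."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)  # sample variances
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    se2 = va / na + vb / nb
    t = (ma - mb) / math.sqrt(se2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

t, df = welch_t(pathway1, pathway2)
print(f"t = {t:.2f}, df = {df:.1f}")
# A |t| exceeding the two-sided 5% critical value (~2.31 for df ~ 8)
# would indicate a significant difference between the two barriers.
```
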

      (6) Are the energy barriers consistent with a passive diffusion-driven process? It feels like, without a source of free energy input (e.g., ion or ATP), these barriers would be difficult to overcome. This could be discussed.

      We thank the reviewer for this suggestion. We have added a discussion to further clarify this point.

      Discussion: (Paragraph 6, Results subsection 2)

“These values are comparable to ATP-Binding Cassette (ABC) transporters of membrane lipids, which use ATP hydrolysis (-7.54 ± 0.3 kcal/mol) (Meurer et al., 2017) to drive lipid transport from the membrane to an extracellular acceptor. Some of these transporters share the same mechanism as SMO, where the lipid from the inner leaflet is flipped and transported to the extracellular acceptor protein (Tarling et al., 2013). Additionally, for secondary active transporters that do not use ATP for the transport of substrates, a thermodynamic barrier of 5-6 kcal/mol has been reported in the literature (Chan et al., 2022; Selvam et al., 2019; McComas et al., 2023; Thangapandian et al., 2025).”

      (7) Regarding the kinetics from MSM, it is stated that the values seen here are similar to MFS transporters, but this then references another MSM study. A comparison to experimental values would support this section a lot.

We thank the reviewer for this suggestion. We have added a discussion of the millisecond-scale timescales measured experimentally for MFS transporters.

      Changes made: (Paragraph 2, Results subsection 5)

      “These timescales are comparable to the substrate transport timescales of Major Facilitator Superfamily (MFS) transporters (Chan et al., 2022). Furthermore, several experimental studies have also resolved the millisecond-scale kinetics of MFS transporters (Blodgett and Carruthers, 2005; Körner et al., 2024; Bazzone et al., 2022; Smirnova et al., 2014; Zhu et al., 2019), further corroborating the results from our study.”

      Reviewer #2 (Recommendations for the authors):

      (1) The heatmaps in Figures 2a and 4a are great. On these, an arrow denotes what looks like a minimum energy path. Is it possible to see this plotted, as this might show the height of the energy barriers more clearly?

      We thank the reviewer for this suggestion. We have computed the minimum energy paths for both pathways and presented them in a supplementary figure.

      Added lines: (Paragraph 4, Results subsection 1):

“For further clarity, we have plotted the minimum energy path taken by cholesterol as it translocates along this pathway (Figure 2—figure Supplement 3a,b).”

      Added lines: (Paragraph 4, Results subsection 2):

“For further clarity, we have plotted the minimum energy path taken by cholesterol as it translocates along this pathway (Figure 2—figure Supplement 3c,d).”
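As a minimal illustration of how a minimum energy path can be read off a 2D free-energy surface, the sketch below scans a toy PMF grid and follows the lowest-energy bin of the second coordinate for each bin of the first. The grid values are illustrative, not the manuscript's actual data:

```python
# Toy 2D free-energy surface F[i][j] (kcal/mol) over two reaction
# coordinates; values are illustrative, not the actual PMF data.
F = [
    [4.0, 2.5, 3.0, 5.0],
    [3.5, 1.0, 2.0, 4.5],
    [5.0, 2.0, 0.5, 3.0],
    [6.0, 4.0, 1.5, 2.5],
]

# For each bin of the first coordinate, pick the lowest-energy bin of the
# second coordinate: a simple column-wise minimum-energy path.
path = [min(range(len(row)), key=row.__getitem__) for row in F]
energies = [F[i][j] for i, j in enumerate(path)]

# The apparent barrier along this path is the highest point visited,
# measured relative to the lowest point visited.
barrier = max(energies) - min(energies)
print(path, barrier)
```
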

      (2) The tiCA data in S15 is first referred to on line 137, but the technique isn’t introduced until line 222. This makes understanding the data a little confusing. Reordering this might improve readability.

      We thank the reviewer for this suggestion. We have reordered the text to make it clearer.

Changes made: (Paragraph 2, Results subsection 1)

“This provides evidence for multiple stable poses along the pathway, as observed in the multiple stable poses of cholesterol in cryo-EM structures of SMO bound to sterols (Deshpande et al., 2019; Qi et al., 2019b, 2020). A reliable estimate of the barriers comes from using the time-lagged Independent Components (tICs), which project the entire dataset along the slowest kinetic degrees of freedom. Overall, the highest barrier along Pathway 1 is 5.8 ± 0.7 kcal/mol, and it is associated with the entry of cholesterol into the TMD (Figure 2—Figure Supplement 2).”

      Changes made: (Paragraph 3, Results subsection 2)

“On plotting the first two components of the tICs (Figure 2—Figure Supplement 2), we observe that the energetic barrier between η and θ is ∼6.5 ± 0.8 kcal/mol.”

      (3) Missing bracket on line 577.

      We thank the reviewer for this suggestion. The typo has been fixed.

      (4) Line 577: Fig. S2nd?

      We thank the reviewer for this suggestion. This typo has been fixed.

      Reviewer #3 (Public review):

      Summary:

      This manuscript presents a study combining molecular dynamics simulations and Hedgehog (Hh) pathway assays to investigate cholesterol translocation pathways to Smoothened (SMO), a G protein-coupled receptor central to Hedgehog signal transduction. The authors identify and characterize two putative cholesterol access routes to the transmembrane domain (TMD) of SMO and propose a model whereby cholesterol traverses through the TMD to the cysteine-rich domain (CRD), which is presented as the primary site of SMO activation. The MD simulations and biochemical experiments are carefully executed and provide useful data.

      Weaknesses:

However, the manuscript is significantly weakened by a narrow and selective interpretation of the literature, overstatement of certain conclusions, and a lack of appropriate engagement with alternative models that are well supported by published data, including data from prior work by several of the coauthors of this manuscript. In its current form, the manuscript gives a biased impression of the field and overemphasizes the role of the CRD in cholesterol-mediated SMO activation. Below, I provide specific points where revisions are needed to ensure a more accurate and comprehensive treatment of the biology.

      (1) Overstatement of the CRD as the Orthosteric Site of SMO Activation

      The manuscript repeatedly implies or states that the CRD is the orthosteric site of SMO activation, without adequate acknowledgment of alternative models. To give just a few examples (of many in this manuscript):

      (a) “PTCH is proposed to modulate the Hh signal by decreasing the ability of membrane cholesterol to access SMO’s extracellular cysteine-rich domain (CRD)” (p. 3).

      (b) “In recent years, there has been a vigorous debate on the orthosteric site of SMO” (p. 3).

      (c) “cholesterol must travel through the SMO TMD to reach the orthosteric site in the CRD” (p. 4).

      (d) “we observe cholesterol moving along TM6 to the TMD-CRD interface (common pathway, Fig. 1d) to access the orthosteric binding site in the CRD” (p. 6).

While the second quote in this list at least acknowledges a debate, the surrounding text suggests that this debate has been entirely resolved in favor of the CRD model. This is misleading and not reflective of the views of other investigators in the field (see, for example, a recent comprehensive review from Zhang and Beachy, Nature Reviews Molecular Cell Biology 2023, which makes the point that both the CRD and 7TM sites are critical for cholesterol activation of SMO as well as PTCH-mediated regulation of SMO-cholesterol interactions).

      In contrast, a large body of literature supports a dual-site model in which both the CRD and the TMD are bona fide cholesterol-binding sites essential for SMO activation. Examples include:

(a) Byrne et al., Nature 2016: point mutation of the CRD cholesterol binding site impairs, but does not abolish, SMO activation by cholesterol (SMO D99A, Y134F, and combination mutants; Fig. 3 of the 2016 study).

(b) Myers et al., Dev Cell 2013 and PNAS 2017: CRD deletion mutants retain responsiveness to PTCH regulation and cholesterol mimetics (similar Hh responsiveness of a CRD deletion mutant is also observed in Fig. 4 of Byrne et al., Nature 2016).

      (c) Deshpande et al., Nature 2019: mutation of residues in the TMD cholesterol binding site blocks SMO activation entirely, strongly implicating the TMD as a required site, in contrast to the partial effects of mutating or deleting the CRD site.

      Qi et al., Nature 2019, and Deshpande et al., Nature 2019, both reported cholesterol binding at the TMD site based on high-resolution structural data. Oddly, Deshpande et al., Nature 2019, is not cited in the discussion of TMD binding on p. 3, despite being one of the first papers to describe cholesterol in the TMD site and its necessity for activation (the authors only cite it regarding activation of SMO by synthetic small molecules).

Kinnebrew et al., Sci Adv 2022 report that CRD deletion abolished PTCH regulation, which is seemingly at odds with several studies above (e.g., Byrne et al., Nature 2016; Myers et al., Dev Cell 2013); but this difference may reflect the use of an N-terminal GFP fusion to SMO in Kinnebrew et al., 2022, which could alter SMO activation properties by sterically hindering activation at the TMD site by cholesterol (but not synthetic SMO agonists like SAG); in contrast, the earlier work by Byrne et al. is not subject to this caveat because it used an untagged, unmodified form of SMO.

      Although overexpression of PTCH1 and SMO (wild-type or mutant) has been noted as a caveat in studies of CRD-independent SMO activation by cholesterol, this reviewer points out that several of the studies listed above include experiments with endogenous PTCH1 and low-level SMO expression, demonstrating that SMO can clearly undergo activation by cholesterol (as well as regulation by PTCH1) in a manner that does not require the CRD.

      Recommendation: The authors should revise the manuscript to provide a more balanced overview of the field and explicitly acknowledge that the CRD is not the sole activation site. Instead, a dual-site model is more consistent with available structural, mutational, and functional data. In addition, the authors should reframe their interpretation of their MD studies to reflect this broader and more accurate view of how cholesterol binds and activates SMO.

We thank the reviewer for this comprehensive overview of the existing literature. We agree that cholesterol binding to both the TMD and CRD sites is required for full activation of SMO. As described below in the responses to individual comments, we have made changes to the manuscript to make this point clear. For instance, in the revised manuscript, we refrain from calling the CRD cholesterol binding site the “orthosteric site”. Instead, we highlight that the goal of the manuscript is not to resolve the debate over whether the TMD or CRD site is more important for PTCH1 regulation of SMO but rather to use molecular dynamics to understand the fascinating question of how cholesterol in the membrane can reach the CRD, located at a significant distance above the outer leaflet of the membrane. We believe that this is an important goal since there is an abundance of evidence that supports the view that PTCH1 inhibits SMO by reducing cholesterol access to the CRD. This evidence is now summarized succinctly in the introduction:

      Changes made: (Paragraph 4, Introduction)

      “While cholesterol binding to both the TMD and CRD sites is required for full SMO activation, our work focuses on how cholesterol gains access to the CRD site, perched above the outer leaflet of the membrane (Luchetti et al., 2016; Kinnebrew et al., 2022). Multiple lines of evidence suggest that PTCH1-regulated cholesterol binding to the CRD plays an instructive role in SMO regulation both in cells and animals. Mutations in residues predicted to make hydrogen bonds with the hydroxyl group of cholesterol bound to the CRD reduced both the potency and efficacy of SHH in cellular signaling assays (Kinnebrew et al., 2022; Byrne et al., 2016) and, more importantly, eliminated HH signaling in mouse embryos (Xiao et al., 2017). Experiments using both covalent and photocrosslinkable sterol probes in live cells directly show that PTCH1 activity reduces sterol access to the CRD (Kinnebrew et al., 2022; Xiao et al., 2017). Notably, our simulations evaluate a path of cholesterol translocation that includes both the TMD and CRD sites: cholesterol first enters the 7-transmembrane domain bundle from the membrane; it then engages the TMD site before continuing along a conduit to the CRD site. Thus, we analyze translocation energetics and residue-level contacts along a path that includes both the TMD and the CRD.”

      However, Reviewer 3 makes several comments below that are biased, inaccurate, or selective. We feel it is important to address these so readers can approach the literature from a balanced perspective. Indeed, the eLife review forum provides an ideal venue to present contrasting views on a scientific model. We encourage the editors to publish both Reviewer 3’s comments and our response in full so readers can read the original papers and reach their own conclusions. It is important to note these issues are not relevant to the quality of the computational and experimental data presented in this paper.

We have now removed the term “orthosteric” to describe the CRD site throughout the paper and clearly state in the introduction that “both the CRD and TMD sites are required for SMO activation”, but that our focus is on how cholesterol moves from the membrane to the CRD site. There is no doubt that cholesterol binding to the CRD plays a key role in SMO activation; our focus on this path is justified and does not devalue the importance of the TMD site. Our prior models (see Figure 7 of Kinnebrew et al., 2022) explicitly include contributions of both sites.

We now respond individually to some of the concerns outlined above:

(1) Byrne et al., Nature 2016: point mutation of the CRD cholesterol binding site impairs, but does not abolish, SMO activation by cholesterol (SMO D99A, Y134F, and combination mutants; Fig. 3 of the 2016 study)

The fact that a point mutation dramatically diminishes (but does not abolish) signaling does not mean that the CRD cholesterol binding site is not important for SMO regulation. Indeed, the reviewer fails to mention that Song et al. (Molecular Cell, 2017) found that a SMO protein carrying a subtle mutation at D99 (D95/99N, a residue that makes a hydrogen bond with the cholesterol hydroxyl) completely abolishes SMO signaling in mouse embryos. Thus, the CRD site is critical for SMO activation in an intact animal, justifying our focus on evaluating the path of cholesterol translocation to the CRD site.

(2) Myers et al., Dev Cell 2013 and PNAS 2017: CRD deletion mutants retain responsiveness to PTCH regulation and cholesterol mimetics (similar Hh responsiveness of a CRD deletion mutant is also observed in Fig. 4 of Byrne et al., Nature 2016).

The Reviewer fails to note that CRD-deleted versions of SMO have markedly (>10-fold) higher basal (i.e., ligand-independent) activity compared to full-length SMO. The response to SHH is minimal (∼2-fold), compared to >50-100-fold with full-length SMO. Thus, CRD-deleted SMO is likely in a non-native conformation. Local changes in cholesterol accessibility caused by PTCH1 inactivation or cholesterol loading can cause small fluctuations in delta-CRD activity, but this cannot be used to infer meaningful insights about how native, full-length SMO (with >10-fold lower basal activity) is regulated. We encourage the reviewer to read our previous paper (Kinnebrew et al., 2022), which presents a unified view of how the TMD and CRD sites together regulate SMO activation.

A more physiological experiment, reported in Kinnebrew et al. (2022), tested mutations in residues that make hydrogen bonds with cholesterol at the CRD and TMD sites in the context of full-length SMO. These mutants were stably expressed at moderate levels in Smo<sup>−/−</sup> cells. Mutations at the CRD site reduced the fold-increase in signaling output in response to SHH, as would be expected for a PTCH1-regulated site. In contrast, analogous mutations in the TMD site reduced the magnitude of both basal and maximal signaling, without affecting the fold-change in response to SHH. In signaling assays, the key parameter in evaluating the impact of a mutation is whether it affects the change in output in response to a signal (in this case, PTCH1 inactivation by SHH). A mutation in SMO that affects PTCH1 regulation is expected to decrease the fold-change in signaling in response to SHH, a criterion that is fulfilled by mutations in the CRD site. Accordingly, mutations in the CRD site abolish SMO signaling in mouse embryos (Xiao et al., 2017).

      (3) Deshpande et al., Nature 2019: mutation of residues in the TMD cholesterol binding site blocks SMO activation entirely, strongly implicating the TMD as a required site, in contrast to the partial effects of mutating or deleting the CRD site.

The introduction of bulky mutations at the TMD site (V333F) that abolish SMO activity was first reported by Byrne et al. (2016); these mutations were used to markedly increase the stability of SMO for protein expression. They indeed stabilize the inactive state of SMO, increasing protein abundance and completely preventing its localization at primary cilia. SMO variants carrying such bulky mutations cannot be used to infer the importance of the TMD site since they do not distinguish between the following possibilities: (1) SMO is inactive because the sterol cannot bind, (2) SMO is inactive because it is locked in an inactive conformation, or (3) SMO is inactive because it cannot localize to primary cilia (where it must be localized to activate downstream signaling).

As described in Response 3.3, a better evaluation of the importance of the TMD site is the use of mutations in residues that make hydrogen bonds with the hydroxyl group of TMD cholesterol. These mutations do not markedly increase protein stability or prevent ciliary localization (Kinnebrew et al., 2022, Fig. S2). While a TMD site mutation decreases the magnitude of maximal (and basal) SMO signaling, it does not affect the fold-increase in signal output in response to Hh ligands (the key parameter that should be used to evaluate PTCH1 activity).

(4) Qi et al., Nature 2019, and Deshpande et al., Nature 2019, both reported cholesterol binding at the TMD site based on high-resolution structural data. Oddly, Deshpande et al., Nature 2019, is not cited in the discussion of TMD binding on p. 3, despite being one of the first papers to describe cholesterol in the TMD site and its necessity for activation (the authors only cite it regarding activation of SMO by synthetic small molecules).

      The reference has now been added at this location in the manuscript.

(5) Kinnebrew et al., Sci Adv 2022 report that CRD deletion abolished PTCH regulation, which is seemingly at odds with several studies above (e.g., Byrne et al., Nature 2016; Myers et al., Dev Cell 2013); but this difference may reflect the use of an N-terminal GFP fusion to SMO in Kinnebrew et al., 2022, which could alter SMO activation properties by sterically hindering activation at the TMD site by cholesterol (but not synthetic SMO agonists like SAG); in contrast, the earlier work by Byrne et al. is not subject to this caveat because it used an untagged, unmodified form of SMO.

The reviewer fails to note that CRD-deleted versions of SMO have markedly (>10-fold) higher basal activity than full-length SMO. The response to SHH is minimal (∼2-fold), compared to >50-fold with full-length SMO. Thus, CRD-deleted SMO is likely in a non-native conformation. Local changes in cholesterol accessibility caused by PTCH1 inactivation or cholesterol loading can cause small fluctuations in delta-CRD activity, but this cannot be used to infer meaningful insights about how native, full-length SMO (with >10-fold lower basal activity) is regulated. Please see Response 3.3 for further details.

Reviewer 3 presents an incomplete picture of the extensive experiments reported in Kinnebrew et al. (2022) to establish the functionality of YFP-tagged delta-CRD SMO. Most importantly, a TMD-selective sterol analog (KK174) can fully activate YFP-tagged delta-CRD SMO, showing conclusively that the YFP fusion does not block sterol access to the TMD site. The fact that this protein is nearly unresponsive to SHH highlights the critical role of the CRD-bound cholesterol in SMO regulation by PTCH1. Indeed, the YFP-tagged, CRD-deleted SMO was made purposefully to test the requirement of the CRD in a construct that had normal basal activity. Again, these data justify the value of investigating the path of cholesterol movement from the membrane via the TMD site to the CRD.

      (6) Although overexpression of PTCH1 and SMO (wild-type or mutant) has been noted as a caveat in studies of CRD-independent SMO activation by cholesterol, this reviewer points out that several of the studies listed above include experiments with endogenous PTCH1 and low-level SMO expression, demonstrating that SMO can clearly undergo activation by cholesterol (as well as regulation by PTCH1) in a manner that does not require the CRD.

This comment is inaccurate. The data presented in Deshpande et al. (and prior work in Myers et al.) used transient transfection to overexpress SMO in Smo<sup>−/−</sup> cells. At the individual-cell level, transient transfection produces expression levels that are markedly higher (10-1000-fold) than stable expression (in addition to being more variable). Most scientists would agree that stable expression (as used in Kinnebrew et al., 2022) at a moderate expression level is a better system to compare mutant phenotypes, assess basal and activated signaling, and provide an accurate measure of the fold-change in signal output in response to SHH. Notably, introduction of a mutation in the CRD cholesterol binding site at the endogenous mouse Smo locus (an even better experiment than stable expression) leads to complete loss of SMO activity (PMID 28344083). This result again justifies our investigation of the pathway of cholesterol movement from the membrane to the CRD site.

We have changed the initial discussion to reflect a more general outlook.

      Changes made: (Paragraph 1, Introduction)

      “PTCH modulates the availability of accessible cholesterol at the primary cilium and thereby regulates SMO, with models invoking effects on both the CRD and 7TM pockets.”

      Changes made: (Results subsection 3, paragraph 1)

“According to the dual-site model, to reach the binding site in the CRD (ζ), cholesterol must translocate along the TMD-CRD interface from the TM binding site (α∗).”

      Added lines: (Paragraph 5, Results subsection 3):

“The computational investigation presented here covers the dual-site model, in which cholesterol reaches the CRD site by first binding to the TM binding site. In comparison to the CRD site, the TM site is more stable by ∼2 kcal/mol (Figure 2—Figure Supplement 3b, d).”

      Added lines: (Paragraph 2, Conclusions):

“Here we have explored the role the CRD site plays in SMO activation. In addition, by simulating the CRD site-dependent SMO activation hypothesis, we have also simulated TMD site-dependent activation. We show that the overall stability of cholesterol at the TMD site is higher than at the CRD site by ∼2 kcal/mol.”

      (2) Bias in Presentation of Translocation Pathways

      The manuscript presents the model of cholesterol translocation through SMO to the CRD as the predominant (if not sole) mechanism of activation. Statements such as: "Cholesterol traverses SMO to ultimately reach the CRD binding site" (p. 6) suggest an exclusivity that is not supported by prior literature in the field. Indeed, the authors’ own MD data presented here demonstrate more stable cholesterol binding at the TMD than at the CRD (p 17), and binding of cholesterol to the TMD site is essential for SMO activation. As such, it is appropriate to acknowledge that cholesterol may activate SMO by translocating through the TM5/6 tunnel, then binding to the TMD site, as this is a likely route of SMO activation in addition to the CRD translocation route they highlight in their discussion.

      The authors describe two possible translocation pathways (Pathway 1: TM2/3 entry to TMD; Pathway 2: TM5/6 entry and direct CRD transfer), but do not sufficiently acknowledge that their own empirical data support Pathway 2 as more relevant. Indeed, because their experimental data suggest Pathway 2 is more strongly linked to SMO activation, this pathway should be weighted more heavily in the authors’ discussion. In addition, Pathway 2 is linked to cholesterol binding to both the TMD and CRD sites (the former because the TMD binding site is at the terminus of the hydrophobic tunnel, the latter via the translocation pathway described in the present manuscript), so it is appropriate that Pathway 2 figures more prominently than Pathway 1 in the authors’ discussion.

      The authors also claim that "there is no experimental structure with cholesterol in the inner leaflet region of SMO TMD" (p 16). However, a structural study of apo-SMO from the Manglik and Cheng labs (Zhang et al., Nat Comm, 2022) identified a cholesterol molecule docked at the TM5/6 interface and also proposed a "squeezing" mechanism by which cholesterol could enter the TM5/6 pocket from the membrane. The authors do not consider this SMO conformation in their models, nor do they discuss the possibility that conformational dynamics at the TM5/6 interface could facilitate cholesterol flipping and translocation into the hydrophobic conduit, despite both possibilities having precedent in the 2022 empirical cryoEM structural analysis.

      Recommendation: The authors should avoid oversimplifying the SMO cholesterol activation process, either by tempering these claims or broadening their discussion to better reflect the complexity and multiplicity of cholesterol access and activation routes for SMO. They should also consider the 2022 apo-SMO cryoEM structure in their analysis of the TM5/6 translocation pathway.

We thank the reviewer for this comprehensive overview of the existing literature and for pointing out the parts we had missed in the discussion. We agree with the reviewer, since our data show that both pathways are probable. Throughout our manuscript, we have avoided a competitive framing (that one pathway dominates over the other). Instead, we have evaluated both pathways independently and presented a comparative rather than competitive overview of both pathways based on our observations. While we agree that experimental evidence suggests the inner leaflet pathway is possible, we cannot discount the observations made in previous studies that support the outer leaflet pathway, particularly Hedger et al. (2019), Bansal et al. (2023), and Kinnebrew et al. (2021). Therefore, considering the reviewer’s comments, we have made the following changes:

      (1) Added lines: (Paragraph 3, Conclusions):

“We show that the barriers associated with the pathway starting from the outer leaflet are lower by ∼0.7 kcal/mol (p = 0.0013). We also provide evidence that cholesterol can enter SMO via both leaflets, considering that multiple computational and experimental studies have found cholesterol entry sites and activation modulation via the outer leaflet, between TM2 and TM3. This is countered by evidence from multiple experimental and computational studies corroborating entry via the inner leaflet, between TM5 and TM6, including this study. Overall, we posit that cholesterol translocation via either pathway is feasible.”

(2) Changes made: (Paragraph 6, Results subsection 2)

“Based on our experimental and computational data, we conclude that cholesterol translocation can happen via either pathway. This is supported by the following observations: mutations along pathway 2 affect SMO activity more significantly, and there is a direct conduit that connects the inner leaflet to the TMD binding site. In addition, a resolved structure of SMO in the presence of cholesterol shows a cholesterol situated at the entry point from the membrane into the protein between TM5 and TM6, in the inner leaflet. However, we also observe that pathway 1 shows a lower thermodynamic barrier (5.8 ± 0.7 kcal/mol vs. 6.5 ± 0.8 kcal/mol, p = 0.0013). Additionally, PTCH1 controls cholesterol accessibility in the outer leaflet. This shows that there is a possibility for transport from both leaflets. One possibility that might alter the thermodynamic barriers is native membrane asymmetry, particularly the anionic lipid-rich inner leaflet. This is a limitation of our current model.”

(3) Changes made: (Paragraph 1, Results subsection 2)

“In a structure resolved in 2022, cholesterol was observed at the interface between the protein and the membrane, in the inner leaflet, between TMs 5 and 6. However, cholesterol in the inner leaflet has a downward orientation, with the polar hydroxyl group pointing intracellularly (η). A striking observation is that this cholesterol binding pose was never used as a starting point for our simulations and was discovered independently of the pose described in Zhang et al. (2022) (Figure 4—Figure Supplement 1).”

      (3) Alternative Possibility: Direct Membrane Access to CRD

      The possibility that the CRD extracts cholesterol directly from the membrane outer leaflet is not considered. While the crystal structures place the CRD in a stable pose above the membrane, multiple cryo-EM studies suggest that the CRD is dynamic and adopts a variety of conformations, raising the possibility that the stability of the CRD in the crystal structures is a result of crystal packing and that the CRD may be far more dynamic under more physiological conditions.

      Recommendation: The authors should explicitly acknowledge and evaluate this potential mechanism and, if feasible, assess its plausibility through MD simulations.

We thank the reviewer for the suggestion. We have addressed this comment by calculating, for each lipid in the membrane, the distance from the lipid headgroups to the cholesterol binding site. In our simulations, we do not observe any bending of the CRD toward the membrane, precluding direct extraction of cholesterol from the membrane.

      Added lines: (Paragraph 3, Conclusions):

      “An alternative possibility states that the flexibility associated with the CRD would allow it to directly access the membrane, and consequently, cholesterol. In the extensive simulations reported in this study, the binding site of cholesterol in the CRD remains at least 20 Å away from the nearest lipid head group in the membrane, suggesting that such direct extraction and the bending of the CRD do not occur within the timescales sampled (Appendix 2 – Figure 6).

      The mechanistic details of this process are still unexplored and form the basis of future work.”
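The distance criterion quoted above (binding site remaining at least 20 Å from the nearest lipid head group) can be sketched as a minimum pairwise distance per trajectory frame. This is an illustrative NumPy computation on made-up coordinates, not the analysis code used in the study:

```python
import numpy as np

def min_headgroup_distance(site_xyz, headgroup_xyz):
    """Minimum pairwise distance (in Å, assuming Å inputs) between
    binding-site atoms and lipid head-group atoms for one frame."""
    # pairwise distance matrix via broadcasting: shape (n_site, n_head)
    diff = site_xyz[:, None, :] - headgroup_xyz[None, :, :]
    return float(np.sqrt((diff ** 2).sum(axis=-1)).min())

# toy frame: two binding-site atoms well above two head-group atoms
site = np.array([[0.0, 0.0, 30.0], [1.0, 0.0, 31.0]])
heads = np.array([[0.0, 0.0, 0.0], [20.0, 20.0, 0.0]])
print(min_headgroup_distance(site, heads))  # 30.0 in this toy example
```

Tracking this quantity over every frame and reporting its minimum is one straightforward way to support a "never closer than 20 Å" statement.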

      (4) Inconsistent Framing of Study Scope and Limitations

      The discussion contains some contradictory and misleading language. For example, the authors state that "In this study we only focused on the cholesterol movement from the membrane to the CRD binding site," and then several sentences later state that "We outline the entire translocation mechanism from a kinetic and thermodynamic perspective." These statements are at odds. The former appropriately (albeit briefly) notes the limited scope of the modeling, while the latter overstates the generality of the findings.

      In addition, the authors’ narrow focus on the CRD site constitutes a major caveat to the entire work. It should be acknowledged much earlier in the manuscript, preferably in the introduction, rather than mentioned as an aside in the penultimate paragraph of the conclusion.

      Recommendation: The authors should clarify the scope of the study and expand the discussion of its limitations. They should explicitly acknowledge that the study models one of several cholesterol access routes and that the findings do not rule out alternative pathways.

      We thank the reviewer for the suggestion. We have addressed this comment by explicitly mentioning the scope of the study.

      Changes made: (Paragraph 3, Conclusions)

      “We outline the entire translocation mechanism from a kinetic and thermodynamic perspective for one of the leading hypotheses for the activation mechanism of SMO.”

      (5) Summary:

      This study has the potential to make a useful contribution to our understanding of cholesterol translocation and SMO activation. However, in its current form, the manuscript presents an overly narrow and, at times, misleading view of the literature and biological models; as such, it is not nearly as impactful as it could be. I strongly encourage the authors to revise the manuscript to include:

      (1) A more balanced discussion of the CRD vs. TMD binding sites.

      (2) Acknowledgment of alternative cholesterol access pathways.

      (3) More comprehensive citation of prior structural and functional studies.

      (4) Clarification of assumptions and scope.

      Of note, the above suggestions require little to no additional MD simulations or experimental studies, but would significantly enhance the rigor and impact of the work.

We thank the reviewer for the suggestions. We have taken the literature and its diverse viewpoints into account, changed the initial discussion, and reflected a more general outlook. In the revised version of the manuscript, we have refrained from referring to the CRD site as the orthosteric site; instead, we refer to it as the CRD sterol-binding site. To better represent the dual-site model, we add further discussion in the Introduction. Throughout our manuscript, we have avoided a competitive framing (i.e. that one pathway dominates over the other); instead, we have evaluated both pathways independently and presented a comparative rather than competitive overview of both pathways based on our observations. We explicitly mention the scope of the study.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1

      Evidence, reproducibility and clarity

      Authors should be commended for the availability of data/code and detailed methods. Clarity is good. Authors have clearly spent a lot of time thinking about the challenges of metabolomics data analysis.

      Significance

      Schmidt et al. present MetaProViz, a comprehensive and modular platform for metabolomics data analysis. The tool provides a full suite of processing capabilities spanning metabolite annotation, quality control, normalization, differential analysis, integration of prior knowledge, functional enrichment, and visualization. The authors also include example datasets, primarily from renal cancer studies, to demonstrate the functionality of the pipeline. The MetaProViz framework addresses several long-standing challenges in metabolomics data analysis, particularly issues of reproducibility, ambiguous metabolite annotation, and the integration of metabolite features with pathway knowledge. The platform is likely to be a valuable addition for the community, but the reviewer has some comments that need to be addressed prior to publication.

      We thank the reviewer for this positive feedback.

      Comments:

      (1) (Planned)

      The section "Improving the connection between prior knowledge and metabolomics features" could benefit from additional clarification. It is not entirely clear to the reader what specific steps were taken beyond using RaMP-DB to translate metabolite identifiers. For example, how exactly were ambiguous mappings ("different scenarios") handled in practice, and to what extent does this process "fix" or merely flag inconsistencies? A more explicit description or example of how MetaProViz resolves these cases would help readers better understand the improvements claimed.

We thank the reviewer for pointing this out and we agree that this section requires extension to ensure clarity. Beyond using RaMP-DB, we characterise the mapping ambiguity (one-to-none, one-to-many, many-to-one, many-to-many) within and across metabolite-sets (i.e. pathways) and return this information to the user together with the translated identifiers. This is important to understand potential inflation/deflation of metabolite-sets that occurs due to the translation. Moreover, we also offer the manually curated amino-acid collection to ensure that L-, D- and achiral zwitterion IDs are assigned for amino acids (Fig. 2b). Ambiguous mappings are handled based on the measured data (Fig. 2e). Indeed, many translation cases that deflate (many-to-one mapping) or inflate (one-to-many mapping) the metabolite-sets are resolved when merging the prior knowledge with the actual measured data (i.e. Fig. 2e, one-to-many in scenario 1, which becomes obsolete as only one/none of the many potential metabolite IDs is detected). By sorting each mapping into one of those scenarios, we only flag those cases. The reason for this decision is that in many cases multiple decisions are valid (i.e. Fig. 2e, Scenario 5: here the values of the two detected metabolites could be summed, or the metabolite value with the larger Log2FC could be kept), and it should really be up to the user to make those decisions based on their knowledge of the biological system and the analytical LC-MS method used.

      Since these points have not been clear enough, we will add a more explicit description to the results section by showcasing more details on how we exactly tackled this problem in the ccRCC example data. This has also been suggested by Reviewer 3 (Minor Comment 7 and 8), so feel free to also see the responses below.
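The four ambiguity classes described above can be illustrated with a short sketch. This is a hypothetical Python illustration of the classification logic only, not MetaProViz's R implementation, and the ID pairs are invented for the example:

```python
from collections import defaultdict

def classify_mappings(pairs):
    """Classify each source ID's translation as one-to-one, one-to-many,
    many-to-one or many-to-many, given (source, target) ID pairs.
    Source IDs with no pair at all would be one-to-none."""
    src2tgt, tgt2src = defaultdict(set), defaultdict(set)
    for s, t in pairs:
        src2tgt[s].add(t)
        tgt2src[t].add(s)
    out = {}
    for s, tgts in src2tgt.items():
        many_targets = len(tgts) > 1
        shared_target = any(len(tgt2src[t]) > 1 for t in tgts)
        out[s] = ("many-to-many" if many_targets and shared_target
                  else "one-to-many" if many_targets
                  else "many-to-one" if shared_target
                  else "one-to-one")
    return out

# hypothetical pairs: two chirality-specific IDs collapse onto one target ID
pairs = [("C00041", "HMDB0000161"),
         ("C00133", "HMDB0000161"),
         ("C00079", "HMDB0000159")]
print(classify_mappings(pairs))
# {'C00041': 'many-to-one', 'C00133': 'many-to-one', 'C00079': 'one-to-one'}
```

Flagging rather than resolving, as described above, would correspond to returning these labels to the user alongside the translated IDs.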

      (2) (Planned)

      The introduction of MetSigDB is intriguing, but its construction and added value are not sufficiently described. It would be helpful to clarify what specific advantages MetSigDB provides over directly using existing pathway resources such as KEGG, Reactome, or WikiPathways. For example, how many features, interactions, or metabolite-set relationships are included, and in what way are these pathways improved or extended compared to those already available in public databases?

We thank the reviewer for this valuable comment and we apologise that this was not described sufficiently. One of the major advantages is that all the resources are available in one place, following the same table format, without the need to visit the different original resources and perform data wrangling prior to enrichment analysis. In addition, where applicable, we have removed metabolites that are not detectable by LC-MS (e.g. ions, H2O, CO2) to circumvent pathway inflation with features that are never present in the data and would hence bias the statistical testing in enrichment analysis workflows.

      During the revision, we will compile an Extended Data Table listing all the resources present in MetSigDB, their number of features and interactions. We will also extend the methods section "Prior Knowledge access" about MetSigDB and how we removed metabolites.

      (3)

      Figure 1D/1E: The reviewer appreciates the inclusion of the visualizations illustrating the different mapping scenarios, as these effectively convey the complexity of metabolite ID translation. However, it took some time to interpret what each scenario represented. It would be helpful to include brief annotations or explanatory text directly on the figures to clarify what each scenario depicts and how it relates to the underlying issue being addressed.

We think the reviewer refers to Fig. 2D/E and we acknowledge that this is a complex problem we try to convey. We received a similar comment from Reviewer 2 (Minor Comment 1), who asked to extend the figure legend description of what the different scenarios display.

      We have extended the figure legend and specifically explained each displayed case and its meaning (Line 222-242):

"d-e) Schematics of possible mapping cases between metabolite IDs (= each circle corresponds to one ID) of a pathway-metabolite set (e.g. KEGG) to metabolite IDs of a different database (e.g. HMDB), with (d) showing many-to-many mappings that can occur within and across pathway-metabolite sets and (e) additionally showing the mapping to metabolite IDs that were assigned to the detected peaks within and across pathway-metabolite sets. (d) Translating the metabolite IDs of a pathway-metabolite set can lead to special cases such as many-to-one mappings (Pathway 1), where for example the original resource used the ID for L-Alanine (Pathway 1, green) and D-Alanine (Pathway 1, yellow) in the amino-acid pathway, whilst the translated resource only has an entry for Alanine zwitterion (Pathway 1, blue). Additionally, many-to-one mappings can also occur across pathways (Pathway 2-4), where this mapping is only detected when mappings are analysed taking all pathways into account. Both of these cases deflate the pathways, which can also happen for one-to-none mappings (Pathway 1, white). There are also cases that inflate the pathway, such as one-to-many mappings (e.g. Pathway 2-4, orange mapping to pink and violet). (e) Showcasing the different scenarios when merging measured data (detected) based on the translated metabolites within pathways (scenarios 1-5) and across pathways (scenarios 6-8), highlighting problematic scenarios (4-7) that require further actions. Unproblematic scenarios (1-3 and 8) can include special cases between original and translated (i.e. one-to-many in scenario 1), which become obsolete as only one/none of the many potential metabolite IDs is detected. Yet, if multiple metabolites are detected, action is required (scenario 5), which can include building the sum of the multiple detected features or only keeping the one with the highest Log2FC between two conditions. Other special cases between original and translated (i.e. many-to-one in scenarios 4 and 6) also depend on what has been mapped to the measured features. If features have been measured in those scenarios, pathway deflation (i.e. only one original entry remains) or measured-feature duplication (the same measurement is mapped to many features in the prior knowledge) are the possible results within and across pathways. Those scenarios should be addressed on a case-by-case basis as they also require biological information to be taken into account."

      We have also rearranged the Scenarios in Fig. 2e. We hope that together with the extended figure legend this is now clear.

      (4) (Planned)

      "By assigning other potential metabolite IDs and by translating between the present ID types, we not only increase the number of features within all ID types but also increase the feature space with HMDB and KEGG IDs (Fig. 2a, right, SFig. 2 and Supplementary Table 1)". The reviewer would appreciate additional clarification on how this was done. It is not clear what specific steps or criteria were used to assign additional metabolite IDs or to translate between identifier types. The reviewer also appreciates the inclusion of the UpSet plots. However, simply having the plots side-by-side makes it difficult to determine the specific differences. An alternative visualization, such as stacked bar plots, scatter plots summarizing the changes in feature counts, or other representation that more clearly highlights the deltas, might make these results easier to interpret.

The main Fig. 2a shows the original (left) metabolite ID availability per detected metabolite feature in the ccRCC data and the adapted (right) metabolite IDs. The individual steps taken to extend the metabolite ID coverage of the measured features and obtain Fig. 2a (right) are shown in SFig. 2 for HMDB (SFig. 2a) and KEGG (SFig. 2b). We did not include the plots for the PubChem IDs as they follow the same principle. The individual steps we showcase with SFig. 2 are (I) how many of the detected features (577) have an HMDB ID (341, red bar + grey bar), (II) how this distribution changed after equivalent amino-acid IDs were added, which does not change the number of features with an HMDB ID but does change the number of features with a single HMDB ID, and (III) how this distribution changed after translating from the other available ID types (KEGG and PubChem) to HMDB IDs using RaMP-DB's knowledge, which leads to 430 detected features with one or multiple HMDB IDs. The exact numbers can be extracted from Supplementary Table 1, Sheet "Feature metadata", where for example N-methylglutamate had no HMDB ID assigned in the original publication (see column HMDB_Original), yet by translating to HMDB from KEGG (see column hmdb_from_kegg) and PubChem (see column hmdb_from_pubchem) we obtain in both cases the same HMDB ID "HMDB0062660". In order to clarify this in the manuscript, we have extended the figure legend of SFig. 2: "a-b) Bar graphs showing the frequency at which a certain number of metabolite IDs per integrated peak are available as per the ccRCC patients' feature metadata provided in the original publication (left), after potential equivalent IDs for amino-acid and amino-acid-related features were assigned (middle), which increases the number of features with multiple IDs (middle: grey bars), and after IDs were translated from the other available ID types (right). a) Of 577 detected features, 341 had at least one HMDB ID assigned (left graph, red + grey bar) according to the original publication. Translating from KEGG-to-HMDB and from PubChem-to-HMDB increased the number of features with an HMDB ID from 341 to 430 (right). b) Of 577 detected features, 306 had at least one KEGG ID assigned (left graph, red + grey bar) according to the original publication. Translating from HMDB-to-KEGG and from PubChem-to-KEGG did not increase the total number of features with a KEGG ID (right)."

      We like the suggestion of the reviewer to provide representations of the deltas and will add additional plots to SFig. 2 as part of our planned revision.

      (5) (Planned)

      MetaboAnalyst is mentioned several times in the manuscript. The reviewer is familiar with some of the limitations and practical challenges associated with using MetaboAnalyst and its R package. Given that MetaboAnalyst already offers some overlapping functionality with MetaProViz (and offers it in the form of an interactive website and a sometimes functional R package), a more explicit comparison between the two tools would help readers fully understand the unique advantages and improvements provided by MetaProViz.

      This is a good point the reviewer raises. As part of the revisions, we plan to create a supplementary data table that includes both tools and their respective features. We will refer to this table within the manuscript text.

      (6)

      Page 11: The authors state that they used limma for statistical testing, including for the analysis of exometabolomics data, where the values appear to represent log2-transformed distances or ratios rather than normally distributed intensities. Since limma assumes approximately normal residuals, please provide evidence or justification that this assumption holds for these data types. If the distributions deviate substantially from normality, a non-parametric alternative might be more appropriate.

      For exometabolomics data we use data normalised to media blank and growth factor (formula (1)). Limma is performed on those data, not on the log2-transformed distances. The Log2(Distance) is calculated separately to the statistical results using the normalised exometabolomics data. In addition, we always perform the Shapiro-Wilk test as part of MetaProViz differential analysis function on each metabolite to understand the distribution. In this particular case we have the following distributions:

| Cell line | Metabolites normal distribution [%] | Metabolites not-normal distribution [%] |
| --- | --- | --- |
| HK2 | 82.35 | 17.65 |
| 786-O | 95.71 | 4.29 |
| 786-M1A | 97.14 | 2.86 |
| 786-M2A | 88.57 | 11.43 |
| OSRC2 | 92.86 | 7.14 |
| OSLM1B | 85.71 | 14.29 |
| RFX631 | 97.14 | 2.86 |

If a user's distributions deviate substantially from normality, non-parametric alternatives are also available in MetaProViz (see the methods section for all options).
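A per-metabolite normality screen of the kind described above can be sketched as follows. This is an illustrative Python example on simulated data (using SciPy's Shapiro-Wilk test), not the MetaProViz R code:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# toy data: one intensity vector per metabolite across 30 replicates
data = {
    "metA": rng.normal(10, 1, size=30),      # roughly normal
    "metB": rng.lognormal(2, 1.5, size=30),  # heavily right-skewed
}

# Shapiro-Wilk per metabolite; small p-values indicate non-normality
for name, values in data.items():
    w, p = stats.shapiro(values)
    verdict = "normal" if p >= 0.05 else "not normal"
    print(f"{name}: W={w:.3f}, p={p:.3g} -> {verdict}")
```

The fraction of metabolites falling into each verdict, computed this way per cell line, is what the table above summarises.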

(7)

      Page 13: why were young and old defined this way? Authors should provide their reasoning and/or citations for this grouping.

      We thank the reviewer for pointing this out. The explanation of our choices of the age groups is purely based on the literature:

First, ccRCC can be sporadic (>96%) or familial (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3308682/pdf/nihms362390.pdf). This was also observed in other cohorts, where of 1233 patients only 93 were under 40 years of age, whilst 1140 were older than 40 years (https://www.europeanurology.com/article/S0302-2838(06)01316-9/fulltext). Second, given the high frequency of sporadic cases, it is unsurprising that ccRCC incidences were found to peak in patients aged 60 to 79 years, with more male than female incidences (https://journals.lww.com/md-journal/Fulltext/2019/08020/Frequency,_incidence_and_survival_outcomes_of.49.aspx). Third, it was shown that sex impacts renal cancer-specific mortality and that this effect is modified by age, which is a proxy for hormonal status, with the premenopausal period below 42 years and the postmenopausal period above 58 years (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4361860/pdf/srep09160.pdf). Putting all of this information together, we decided on our age groups of young (<42 years) and old (>58 years) following the hormonal periods in order to account for the impact of sex. Additionally, our young age group is representative of the age of familial ccRCC, whilst our old age group covers the age range where incidences were found to peak.

      To make this clear in the manuscript we have extended the method section of the manuscript (Line 547-548):

"For the patients' ccRCC data, we compared tumour versus normal in two patient subsets, "young" (<42 years) and "old" (>58 years)."

      (8)

      Figure 4e: It may help with interpretation to have these Sankey-like graph edges be proportional to the number of metabolites.

We thank the reviewer for this suggestion, which we had also considered. When we tested this visualisation, the plot became convoluted and hard to interpret, and not all potential flows exist in the data. This is why we opted to create an overview graph of each potential flow, with each edge representing a potentially existing flow. The number of times a flow exists is shown in Fig. 4f.

      (9)

      Figure 4h: The values appear to be on an intensity scale (e.g., on the order of 3e10), yet some of them are negative, which would not be expected for raw or log-transformed mass spectrometry intensities. It is unclear whether these represent normalized abundance values, distances, or some other transformation. In addition, for the comparison of tumour versus normal tissue, it is not specified what statistical test was applied. Since mass spectrometry data are typically log2-transformed to approximate a log-normal distribution before performing t-tests or similar parametric methods, clarification is needed on how these data were processed.

Thanks for pointing this out; it made us realize that we need to extend our figure legend for Fig. 4h for clarity (Line 343-345). In both cases we show normalized intensities following the workflow described in Fig. 3a. In the case of the left graph labelled "CoRe", we are plotting an exometabolomics experiment, which was additionally normalised using both media blanks (samples in which no cells were cultured) and a growth factor (accounting for cell growth during the experiment), as growth rate (accounting for variations in cell proliferation) was not available (see also formula (1) in the methods section). A result has a negative value if the metabolite has been consumed from the media, or a positive value if the metabolite has been released from the cell into the culture media.
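The sign convention described above can be illustrated with a minimal sketch. The exact normalisation is given by formula (1) in the manuscript (not reproduced here), so the form and the values below are only a plausible simplification for illustration:

```python
def core_value(sample, blank, growth_factor):
    """Illustrative consumption-release (CoRe) value for one metabolite:
    intensity relative to a no-cell media blank, scaled by a growth
    factor. Negative -> consumed from the media; positive -> released.
    (Simplified stand-in for formula (1) of the manuscript.)"""
    return (sample - blank) / growth_factor

# hypothetical normalised intensities on the order of the plotted scale
print(core_value(sample=2.0e10, blank=5.0e10, growth_factor=1.5))  # negative: consumed
print(core_value(sample=8.0e10, blank=5.0e10, growth_factor=1.5))  # positive: released
```

This is why negative values in Fig. 4h are expected and meaningful rather than an artifact of log transformation.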

In addition, the reviewer refers to the comparison of tumour versus normal (Fig. 4a and 4d) and the missing description of the chosen statistical test. We have added the details to the figure legend (Lines 334 and 345).

Adapted legend Fig. 4: "a) Differential metabolite analysis results for exometabolomics data comparing 786-O versus HK2 cells using ANOVA and false discovery rate (FDR) for p-value adjustment. b) Heatmap of mean consumption-release of the measured metabolites across cell lines. c) Heatmap of normalised ccRCC cell line exometabolomics data for the selected metabolites of amino acid metabolism for a sample subset. d) Differential metabolite analysis results for intracellular data comparing 786-O versus HK2 cells using ANOVA and false discovery rate (FDR) for p-value adjustment. e) Schematics of the bioRCM process to integrate exometabolomics with intracellular metabolomics and f) number of metabolites by their combined change patterns in intracellular- and exometabolomics in 786-M1A versus HK2. g) Heatmap of the metabolite abundances in the "Both_DOWN (Released/Consumed)" cluster. h) Bar graphs of normalised methionine intensity for exometabolomics (CoRe: negative value if the metabolite has been consumed from the media, or positive value if the metabolite has been released from the cell into the culture media) and intracellular metabolomics (Intra)."


      (10)

Figure 5: "Tukey's p.adj"

We thank the reviewer for pointing this out. We have used the TukeyHSD (Tukey's Honestly Significant Difference) test in R on the ANOVA results. We have added more details into the figure legend (Line 384): "(Tukey's post-hoc test after ANOVA, p.adj)".

(11)

      The potential for multi-omics is mentioned. Please clarify how generalizable this framework is. Can it readily accommodate transcriptomics, proteomics, or fluxomics data, or does it require custom logic or formatting for each new data type?

Thanks for raising this question. MetaProViz can readily accommodate transcriptomics and proteomics data for combined enrichment analysis, using for example MetalinksDB metabolite-receptor pairs. However, MetaProViz does not support modelling fluxomics data into metabolic networks. We state in the discussion that this could be a future development ("Beyond current capabilities, future developments could also incorporate mechanistic modeling to capture metabolic fluxes, subcellular compartmentalization, enzyme kinetics, regulatory feedback loops, and thermodynamic constraints to dissect metabolic response under perturbations."). To clarify the availability of multi-omics integration for combined enrichment analysis, we have added some more details to the discussion section.

Line 467-469: "In addition, by providing knowledge of receptor-, transporter- and enzyme-metabolite pairs, MetaProViz can readily accommodate transcriptomics and proteomics data for combined enrichment analysis."

      (12)

      Please clarify if/how enrichment analyses account for varying set sizes and redundant metabolite memberships across pathways, which can bias over-representation analysis results.

This is a very relevant point, which we have already been working on. Indeed, we agree that results from enrichment analyses can be biased by varying set sizes and redundant metabolite memberships across pathways. MetaProViz explicitly accounts for varying set sizes when running over-representation analysis (functions standard_ora() and cluster_ora()), which computes the p-value under a hypergeometric distribution. Thereby, larger pathways are penalized unless the overlap is proportionally large, while smaller pathways can be significant with fewer overlaps. Hence, the test quantifies whether the observed overlap between the query set and a pathway is larger than would be expected under random sampling. In addition, we explicitly filter by set size using min_gssize/max_gssize, which further controls for extremely small or large sets. So both the statistical test itself and the size filters incorporate set-size variation.
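The hypergeometric test described above can be sketched in a few lines. This is an illustrative Python version of the standard over-representation p-value, not the standard_ora() R implementation, and the counts are invented for the example:

```python
from math import comb

def ora_pvalue(k, n_query, m_set, n_universe):
    """P(overlap >= k) under the hypergeometric null: draw n_query
    metabolites at random from a universe of n_universe, of which
    m_set belong to the pathway. Larger sets need proportionally
    larger overlaps to reach significance."""
    total = comb(n_universe, n_query)
    return sum(
        comb(m_set, i) * comb(n_universe - m_set, n_query - i)
        for i in range(k, min(n_query, m_set) + 1)
    ) / total

# same overlap of 5, small vs. large pathway, universe of 500 measured metabolites:
# the small set (expected overlap ~1) is far more surprising than the
# large set (expected overlap ~10).
print(ora_pvalue(5, n_query=50, m_set=10, n_universe=500))
print(ora_pvalue(5, n_query=50, m_set=100, n_universe=500))
```

This is the sense in which the test itself already penalizes large pathways: the same absolute overlap yields very different tail probabilities.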

Regarding the redundant metabolite-set (i.e. pathway) memberships, we have now implemented a new function (cluster_pk()) to cluster metabolite-sets such as pathways based on overlapping metabolites. This allows investigation of enrichment results with regard to redundancy and similarity. For given metabolite-sets, the function calculates pathway similarities via either overlap- or correlation-based metrics. After optional thresholding to remove weak similarities, we implemented three clustering algorithms (connected-components clustering, Louvain community detection and hierarchical clustering) to group similar pathways. We then visualize the clustering results as a network graph using the new function viz_graph(), based on igraph. We have added all of this information to our methods section "Metabolite-set clustering" (Lines 656-671). In addition, we have also added the results of the clustering to Fig. 5f.
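The overlap-based grouping can be sketched with connected components (one of the three algorithms named above; Louvain and hierarchical clustering are alternatives). This is an illustrative Python sketch with made-up metabolite-sets, not the cluster_pk() R code:

```python
def jaccard(a, b):
    """Jaccard similarity between two metabolite-sets."""
    return len(a & b) / len(a | b)

def cluster_pathways(sets, threshold=0.3):
    """Link pathways whose pairwise Jaccard similarity exceeds the
    threshold, then group them via connected components."""
    names = list(sets)
    adj = {n: set() for n in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if jaccard(sets[a], sets[b]) > threshold:
                adj[a].add(b)
                adj[b].add(a)
    seen, clusters = set(), []
    for n in names:  # depth-first flood fill over the similarity graph
        if n in seen:
            continue
        stack, comp = [n], set()
        while stack:
            cur = stack.pop()
            if cur in comp:
                continue
            comp.add(cur)
            stack.extend(adj[cur] - comp)
        seen |= comp
        clusters.append(comp)
    return clusters

# hypothetical metabolite-sets with heavy membership overlap
sets = {
    "Glycolysis":      {"glucose", "g6p", "f6p", "pyruvate"},
    "Gluconeogenesis": {"glucose", "g6p", "f6p", "oxaloacetate"},
    "Urea cycle":      {"arginine", "ornithine", "citrulline"},
}
print(cluster_pathways(sets))
```

Here the two heavily overlapping glucose pathways fall into one cluster while the urea cycle stays separate, which is the kind of redundancy structure the network graph in Fig. 5f is meant to expose.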

New Fig. 5f: "f) Network graph of top enriched pathways (p.adjusted)"

      Reviewer #2

      Evidence, reproducibility and clarity

      Schmidt et al report the development of MetaProViz, an integrated R package to process, analyze and visualize metabolomics data, including integration with prior knowledge. The authors then go on to demonstrate utility by analyzing several metabolomes of cell lines, media and patient samples from kidney cancer. The manuscript provides a concise description of key challenges in metabolomics that the authors identify and address in their software. The examples are helpful and illustrative, although I should point out that I lack the expertise to evaluate the R package itself. I only have a few very minor comments.

      Significance

      This is a very significant advance from one of the leading groups in the field that is likely to enhance metabolomics data analysis in the wider community.

      We thank the reviewer for this positive feedback on our package. We appreciate that there are no major comments from the reviewer.

      Minor comments:

      (1)

      Figure 2D, E: While the schematics are fairly intuitive, a brief figure legend description of what the different scenarios etc. represent would make this easier to grasp.

      We thank the reviewer for pointing this out and we acknowledge that this is a complex problem we try to convey. We received a similar comment from Reviewer 1 (Comment 3), so please see the extensive response there. In brief, we have extended the figure legend and specifically explained each displayed case and its meaning (Line 222-242) and extended the Figure itself by adding additional categories to Fig. 2e.

Extended legend Fig. 2 d-e: "d-e) Schematics of possible mapping cases between metabolite IDs (= each circle corresponds to one ID) of a pathway-metabolite set (e.g. KEGG) to metabolite IDs of a different database (e.g. HMDB), with (d) showing many-to-many mappings that can occur within and across pathway-metabolite sets and (e) additionally showing the mapping to metabolite IDs that were assigned to the detected peaks within and across pathway-metabolite sets. (d) Translating the metabolite IDs of a pathway-metabolite set can lead to special cases such as many-to-one mappings (Pathway 1), where for example the original resource used the ID for L-Alanine (Pathway 1, green) and D-Alanine (Pathway 1, yellow) in the amino-acid pathway, whilst the translated resource only has an entry for Alanine zwitterion (Pathway 1, blue). Additionally, many-to-one mappings can also occur across pathways (Pathway 2-4), where this mapping is only detected when mappings are analysed taking all pathways into account. Both of these cases deflate the pathways, which can also happen for one-to-none mappings (Pathway 1, white). There are also cases that inflate the pathway, such as one-to-many mappings (e.g. Pathway 2-4, orange mapping to pink and violet). (e) Showcasing the different scenarios when merging measured data (detected) based on the translated metabolites within pathways (scenarios 1-5) and across pathways (scenarios 6-8), highlighting problematic scenarios (4-7) that require further actions. Unproblematic scenarios (1-3 and 8) can include special cases between original and translated (i.e. one-to-many in scenario 1), which become obsolete as only one/none of the many potential metabolite IDs is detected. Yet, if multiple metabolites are detected, action is required (scenario 5), which can include building the sum of the multiple detected features or only keeping the one with the highest Log2FC between two conditions. Other special cases between original and translated (i.e. many-to-one in scenarios 4 and 6) also depend on what has been mapped to the measured features. If features have been measured in those scenarios, pathway deflation (i.e. only one original entry remains) or measured-feature duplication (the same measurement is mapped to many features in the prior knowledge) are the possible results within and across pathways. Those scenarios should be addressed on a case-by-case basis as they also require biological information to be taken into account."

      (2) Fig. 4: The authors briefly state that they integrate prior knowledge to identify the changes in methionine metabolism in kidney cancer, but it is not clear how exactly they contribute to this conclusion. It could be helpful to expand a bit on this to better illustrate how MetaProViz can be used to integrate prior knowledge into the analysis workflow.

      We think the reviewer refers to this section in the text (Line 363-370):

      "Next, we focused on the cluster "Both_DOWN (Released-Consumed)" and found that several amino acids are consumed by the ccRCC cell line 786-M1A but released by healthy HK2 cells. At the same time, intracellular levels are significantly lower than in HK2 (Log2FC = -0.9, p.adj = 4.4e-5) (Fig. 4g). To explore the role of these metabolites in signaling, we queried the prior knowledge resource MetalinksDB, which includes metabolite-receptor, metabolite-transporter and metabolite-enzyme relationships, for their known upstream and downstream protein interactors for the measured metabolites (Supplementary Table 5). This approach is especially valuable for exometabolomics, as it allows us to generate hypotheses about cell-cell communication. Notably, we identified links involving methionine (Fig. 4h), enzymes such as BHMT, and transporters such as SLC43A2 that were previously shown to be important in ccRCC25,42 (Supplementary Table 5)."

      We have now extended this part to clearly state that MetalinksDB is the prior knowledge resource we used to identify the links for methionine (Line 363-364). In addition, we have extended our summary statement to ensure clarity for the reader that we combine the biological clustering, which revealed the amino acid changes, with prior knowledge for the mechanistic insight (Line 380-381):

      "In summary, calculating consumption-release and combining it with intracellular metabolomics via biological regulated clustering reveals metabolites of interest. Further combining these results with prior knowledge using the MetaProViz toolkit facilitates biological interpretation of the data."

      (3)

      Given the functional diversity among metabolites (central to diverse pathways, key signaling molecules, restricted functions, co-variation within a pathway), I wonder how informative approaches such as PCA or enrichment analyses are for identifying metabolic drivers of a (patho)physiological state. To some extent, this can be addressed by integrating prior knowledge, and it would be helpful if the authors could comment on (and if applicable explain) whether/how this is integrated into MetaProViz.

      The reviewer is correct in stating the functional diversity of metabolites, which is also why prior knowledge is needed to add mechanistic interpretation to the findings from the metadata analysis (as we showcased by focusing on the separation by age (Fig. 5c-d)). We think that approaches such as PCA or enrichment can be helpful, even if admittedly limited. For example, in the metadata analysis presented in Fig. 5b and the subsequent enrichment analysis presented in Fig. 5, we used PCA to extract the eigenvector and the loadings, which act as weights indicating the contribution of each original metabolite to the separation on that specific principal component. Hence, the eigenvector of the PCA shows the metabolite drivers of the separation. This does not necessarily mean that those metabolites are drivers of a (patho)physiological state - the (patho)physiological state can equally be the reason for those metabolites driving the separation on the eigenvectors. Thus, the metadata analysis presented in Fig. 5b enables us to extract the metadata variables, i.e. (patho)physiological states, separated on a PC together with the explained variance. This can also lead to co-variation, when multiple (patho)physiological states are separated on the same PC, as the reviewer correctly points out. Regarding the enrichment analysis, we provide different types of prior knowledge for classical mapping, but also the prior knowledge we used to create the biological regulated clustering, which together help to identify key metabolic groups, as we can first cluster the metabolites and afterwards perform functional enrichment. Yet, this does not account for the technical issues of enrichment analysis. In this context, multi-omics integration building metabolite-centric networks could further elucidate the diversity of metabolic pathways and their connection to signalling and co-variation, yet this is beyond the scope of MetaProViz.
To sum up, we are aware of the limitations of this analysis and the constraints on the downstream interpretation.
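The loadings-as-weights idea described above can be illustrated generically. The following is a minimal Python sketch, not the MetaProViz R implementation; the data and feature names are invented for illustration:

```python
import numpy as np

def rank_features_by_loading(X, feature_names, pc=0):
    """Rank features by the magnitude of their loading on one
    principal component, i.e. their weight in that PC's separation."""
    Xc = X - X.mean(axis=0)                      # centre each feature
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[pc]                            # eigenvector of the covariance matrix
    order = np.argsort(-np.abs(loadings))        # largest |loading| first
    return [(feature_names[i], float(loadings[i])) for i in order]

# Toy data: the first metabolite dominates the variance, so it
# should carry the largest PC1 loading.
X = np.array([
    [0.0, 1.0, 0.5],
    [10.0, 1.1, 0.4],
    [20.0, 0.9, 0.6],
    [30.0, 1.0, 0.5],
])
ranked = rank_features_by_loading(X, ["metab_A", "metab_B", "metab_C"])
```

As noted in the response, a large loading identifies a driver of the separation on that component, not necessarily a causal driver of the underlying (patho)physiological state.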

      To capture the functional diversity amongst metabolites, which leads to metabolites being present in multiple pathways of metabolite-pathway sets, we have implemented a new function to cluster metabolite sets such as pathways based on overlapping metabolites and visualize redundant metabolite-set (i.e. pathway) memberships (Fig. 5f). For more details also see our response to Reviewer 1, Comment 12. We hope this will circumvent mis- and over-interpretation of the enrichment results.

      In addition, we have extended the text to include the analysis pitfalls explicitly (Line 416-419): "Another variable explaining the same amount of variance in PC1 is the tumour stage, which could point to adjacent normal tissue metabolic rewiring that happens in relation to stage and showcases that biological data harbour co-variations, which cannot be disentangled by this method."

      Reviewer #3

      Evidence, reproducibility and clarity

      This manuscript introduces an R package, MetaProViz, for metabolomics data analysis (post-annotation), aiming to solve a poor-analysis-choices problem and enable more people to do the analysis. MetaProViz not only guides people to select the best statistical method, but also enables them to solve previously unsolved problems: e.g. multiple and variable metabolite names in different databases and their connections to prior knowledge. They also created exometabolomics analysis and the needed steps to visualise intra-cell / media processes. The authors demonstrated their new package via kidney cancer (a clear-cell renal cell carcinoma dataset), stepping one step closer to improving the biological interpretability of omics data analysis.

      Significance

      This is a great tool and I can't wait to use it on many upcoming metabolomics projects! Authors tackle multiple ongoing issues within the field: from poor selection of statistical methods (they provide guidance or have default safer options) to the messiness of data annotation between databases and improving data interpretability. The field is still evolving quickly, and it's impossible to solve all problems with one package; thus some limitations within the package could be seen as a bit rigid. Nonetheless, this fully steps toward filling an existing methodological gap. All bioinformaticians doing metabolomic analysis, or those learning how to do it, will greatly benefit from this knowledge.

      I myself lead a team of 6 bioinformaticians, and we do analysis for researchers, clinicians, drug discovery, and various companies. We run internal metabolomics pipelines every day and fully sympathise with the problems addressed by the authors.

      Major comments affecting conclusions

      none.

      We thank the reviewer for this positive feedback on the evidence, reproducibility and clarity as well as the significance of our work, given the reviewer's stated experience with metabolomics data analysis. We appreciate that there are no major comments from the reviewer.

      Minor comments

      Minor comments, important issues that could be addressed and possibly improve the clarity or general presentation of the tool. Please see all below.

      (1)

      1- You start with separating and talking about metabolomics and lipidomics, but lipidomics quickly disappears (especially beyond abstract/intro) - no real need to discuss lipidomics.

      Thanks, that's a good note and we have removed it from the abstract and introduction.

      (2)

      2- You refer to the MetImp4 imputation web tool, but I cannot find an active website, manuscript, or R package for it, and the cited link does not load. This raises doubts about whether the tool is currently usable. Additionally, imputation choice should be guided by biological context and study design, not just by testing a few methods and selecting the one that performs best.

      We fully agree with the reviewer on imputation handling. The manuscript we cite from Wei et al. (https://doi.org/10.1038/s41598-017-19120-0) compared a multitude of missing value imputation methods and made this comparison strategy available as a web-based tool, not as a code-based package such as an R package. Yet, the reviewer is right, the web-tool is no longer reachable. Hence, we have adapted the statement in our introduction (Line 61-62): "Moreover, there are tools that focus on specific steps of the pre-processing of feature intensities, which encompasses feature selection, missing value imputation (MVI)9 and data normalisation. For example, MetImp4 is a web-tool that includes and compares multiple MVI methods9."

      (3)

      3- The authors address key metabolomics issues such as ambiguous metabolite names and isoforms, and their focus on resolving mapping ambiguities and translating between database identifiers is highly valuable. However, the larger challenge of de novo identification and the "dark matter" of unannotated metabolites remains unresolved (initiatives as MassIVE might help in the future https://massive.ucsd.edu/ProteoSAFe/ ), and readers may benefit from clearer acknowledgement that MetaProViz does not operate on raw spectral data. The introduction currently emphasizes annotation, but since MetaProViz requires already annotated metabolite tables (and then deals with all the messiness), this space might be better used to frame the interpretability and pathway-analysis challenges that the tool directly addresses.

      We appreciate the comment and have highlighted this in the abstract and introduction: "MetaProViz operates on annotated intensity values..." (Line 29 and 88).

      Given the newest advancements in metabolite identification using AI-based methods, MetaProViz toolkit with a focus on connecting metabolite IDs to prior knowledge becomes increasingly valuable. We added this to our discussion (Line 484-488): "Given the imminent shift in metabolite identification through AI-based approaches, including language model-guided48 methods and self-supervised learning49, the growing number of identified metabolites will make the MetaProViz toolkit increasingly valuable for the community to gain functional insights."

      In regards to the introduction, where we mention some tools for peak annotation: the reason we have this paragraph naming peak annotation tools is that we wanted to set the basis by (I) listing the different steps of metabolomics data analysis and (II) pointing to well-known tools for those steps. We also have a dedicated paragraph for pathway-analysis challenges.

      (4)

      4- I also really enjoyed you touching on the point of user-friendly but then inflexible and problem of reproducibility. We truly need well working packages for other bioinformaticians, rather than expecting wet-lab scientists to do all the analysis within the user interface.

      We thank the reviewer for this positive feedback.

      (5)

      5- It would be helpful to explain why the authors chose cancer/RCC samples for the demonstration. Was it because the dataset included both media and cell measurements? Does the tool perform best when multiple layers of information are available from the same experiment?

      We specifically chose the ccRCC cell line data as an example since, for a multitude of cell lines, both media (exometabolomics) and intracellular metabolomics had been measured. The combination of both data types is only used in the biological regulated clustering (Fig. 5e-g); all other analyses do not require additional data modalities. We have not specifically tested how performance differs when multiple layers are available, as this would require multiple paired datasets (exometabolomics and intracellular metabolomics) taken at the same and at different time points.

      (6)

      6- Figure 2B: The upset plots effectively show increased overlap after adaptation, but it would be easier to compare changes if the order of the intersection bars in the "adapted" plot matched the original. For example, while total intersections increased (251→285), the PubChem+KEGG overlap decreased (24→5), likely due to reallocation to the full intersection.

      Thanks for raising this point. We initially ordered the bars based on their intersection size, but we agree with the reviewer that for our point it makes sense to fix the order in the adapted plot to match the order of the original plot. We have done this (Fig. 2a) and also extended the figure legend text of SFig. 2, which shows the individually performed adaptations summarized in Fig. 2a.

      (7) (Planned)

      7- In your example of D-alanine and L-alanine - you mention how chirality is important biological feature, but up to this point it's not clear how do you do translation exactly and in which situations this would be treated just as "alanine" and when the more precise information would be retained? You mention RaMP-DB knowledge and one to X mappings as well as your general guidance in the "methods" part, but it would be useful to describe in this publication how you exactly tackled this problem in the ccRCC case.

      We thank the reviewer for this suggestion. Since this is a complex problem, we will add a more explicit description to the results section by showcasing more details on how we exactly tackled this problem in the ccRCC example data.

      In regards to D- and L-alanine: even though chirality is an important biological feature, in a standard experiment we cannot distinguish whether we detect the L- or the D-amino acid. This is why we try to assign all possible IDs to increase the overlap with the prior knowledge. In Fig. 2b we showcase that this can potentially lead to multiple mappings of the same measured feature to multiple pathways. For example, if we measure alanine, we assign the PubChem IDs for L-Alanine, D-Alanine and Alanine and try to map to metabolite sets that include both L-Alanine and D-Alanine. In turn this could fall into Scenario 6 (Fig. 2e), where across pathways there is a D-Alanine-specific one (Pathway 1) and an L-Alanine-specific one (Pathway 2). Now we can decide if we want to allow both mappings (many-to-one) or if we exclude D-Alanine because we know our biological system is human and should primarily have L-Alanine.

      (8) (Planned)

      8- In one to many mappings, it would be interesting to see quantification how frequently it was happening within a pathway or across pathways. I.e. Would going into pathway analysis "solve" the issue of "lost in translation" or not really?

      We have quantified the frequency for the example of translating the KEGG metabolite sets into HMDB IDs (Fig. 2c, left panel). Yet, we are not showcasing the quantification across the KEGG metabolite sets with this plot. During the revision we will add the full results to Extended Data Table 2, which currently only includes the results displayed in Fig. 2c.

      (9)

      9- QC: the coefficient of variation (CV) helps identify features with high variability and thus low detection accuracy. Here it's important to acknowledge that if the feature is very variable between groups it can be extremely important, but if the feature is very variable within the group - only then one would have low trust in the accuracy.

      Yes, we totally agree with the reviewer on this. For this reason, we have applied the CV only in instances where it does not lead to any condition-driven CV differences but is truly feature-focused: (1) the function pool_estimation performs the CV check on the pool samples only, which are a homogeneous mixture of all samples and hence can be used to assess feature variability; (2) the function processing performs the CV check on exometabolomics media samples (= blanks), which are also not impacted by different conditions.
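The pool-sample CV check can be sketched generically as follows (a Python illustration on invented data, not the MetaProViz implementation; the 30% cutoff is a common convention, not a value taken from the manuscript):

```python
import numpy as np

def flag_high_cv(pool_intensities, threshold=0.30):
    """Flag features whose coefficient of variation (CV) across pool
    samples exceeds a threshold. Rows = pool samples, columns = features.
    Pool samples are a homogeneous mixture of all samples, so high CV
    here reflects technical variability rather than biology."""
    means = pool_intensities.mean(axis=0)
    stds = pool_intensities.std(axis=0, ddof=1)   # sample standard deviation
    cv = stds / means
    return cv, cv > threshold

# Three pool injections, three features; feature 2 is unstable.
pools = np.array([
    [100.0, 50.0, 10.0],
    [102.0, 80.0, 11.0],
    [ 98.0, 20.0,  9.0],
])
cv, flagged = flag_high_cv(pools)
# cv ~ [0.02, 0.60, 0.10]; only the second feature is flagged
```

Because the CV is computed on technical replicates of the same mixture, the caveat raised by the reviewer (biological between-group variability) does not apply.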

      (10)

      10- Missing value imputation - while missing not at random is a great way to deal with missingness, it would be great to have options for others (not just MNAR), as missingness is of a complex nature. If a pretty strong decision has been made, it would be good to support this by some supplementary data (i.e. how results change while applying various combinations of missingness and why choosing MNAR seems to be the most robust).

      We have decided to only offer support for MNAR, since we would recommend MVI only if there is a biological basis for it.

      As mentioned in the response to your minor comment 2, Wei et al. (https://doi.org/10.1038/s41598-017-19120-0) compared a multitude of missing value imputation methods. They compared six imputation methods (i.e. QRILC, Half-minimum, Zero, RF, kNN, SVD) for MNAR and systematically measured the performance of those imputation methods. They showed that QRILC and Half-minimum produced much smaller SOR values, showing consistently good performance on data with different numbers of missing variables. This was the reason for us to only provide Half-minimum.
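The Half-minimum rule itself is simple: each feature's missing values are replaced by half of that feature's minimum observed value, on the assumption that the metabolite fell below the detection limit (MNAR). A generic Python sketch on invented data, not the MetaProViz implementation:

```python
import numpy as np

def half_min_impute(data):
    """Half-minimum imputation for MNAR missingness: per feature
    (column), replace NaN with half of the minimum observed value."""
    out = data.astype(float).copy()
    for j in range(out.shape[1]):
        col = out[:, j]                    # view into out, edits propagate
        observed = col[~np.isnan(col)]
        if observed.size:                  # leave all-NaN features untouched
            col[np.isnan(col)] = observed.min() / 2.0
    return out

# Rows = samples, columns = features; NaN marks a missing value.
m = np.array([[4.0, np.nan],
              [8.0, 6.0],
              [np.nan, 2.0]])
imputed = half_min_impute(m)
# column 1 minimum is 4.0 -> its missing value becomes 2.0
# column 2 minimum is 2.0 -> its missing value becomes 1.0
```

This is exactly why the method is only defensible when missingness is plausibly below-detection-limit, as the response argues.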

      (11) (Planned)

      11- In the pre-processing and imputation stages - it would be interesting to see a summary table of how many features are left after each stage.

      This is a good suggestion and refers to the steps described in Fig. 3a. We will create an overview table for this, add it into the Extended Data Table and refer to it in the results section.

      (12)

      12- Is there a reason not to do UMAP or PLS-DA graphs for outlier detection? Doing more than PCA would help to have more confidence in removing or retaining outliers in the cases where biological relevance is borderline.

      The reason we decided to use PCA was its standard combination with Hotelling's T² outlier testing. Since PCA is a linear dimensionality reduction technique that preserves the overall variance in the data and has a clear mathematical foundation linked to the covariance structure, it specifically fits the required assumptions of Hotelling's T² outlier testing. Indeed, Hotelling's T² relies on the properties of the covariance matrix and the assumption of a multivariate Gaussian distribution. UMAP is a non-linear dimensionality reduction technique that prioritizes preserving local and global structures in a way that often results in good clustering visualization, but it distorts distances between clusters and does not have the same rigorous statistical underpinnings as PCA. In terms of PLS-DA, which maximizes the covariance between the variables and the class labels, one could, even though this is not commonly done, use the optimal latent variables for discrimination and apply Hotelling's T² to those latent variables. Yet, PLS-DA is supervised and actively tries to separate data points in the latent space, which can be misleading for outlier detection, where methods like PCA that are unbiased, unsupervised and preserve global variance are advantageous.
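A minimal sketch of the PCA plus Hotelling's T² combination (a generic Python illustration, not the MetaProViz code; the number of retained components and the synthetic data are assumptions for demonstration):

```python
import numpy as np

def hotelling_t2_on_pca(X, n_components=2):
    """Hotelling's T2 per sample on the scores of the first principal
    components: squared scores weighted by the inverse of each
    component's variance. Large T2 flags potential outliers."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T                 # PCA scores
    lam = s[:n_components] ** 2 / (X.shape[0] - 1)    # component variances
    return (scores ** 2 / lam).sum(axis=1)

# Synthetic data with one injected outlier in sample 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
X[0] += 10.0
t2 = hotelling_t2_on_pca(X)
# sample 0 should have the largest T2 statistic
```

In practice the T² values are compared against an F-distribution-based critical value at a chosen confidence level; the sketch only shows the statistic itself.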

      (13)

      13- Metadata vs metabolite features - can this be used beyond metabolomics (i.e. proteomics, transcriptomics, etc)? It can be always very useful when there are many metadata features and it's hard to pre-select beforehand which ones are the most biologically relevant.

      Yes, definitely. In fact, we have used the metadata analysis strategy also with proteomics data and it will work equally with any omics data type.

      (14)

      14- While authors discussed what KEGG pathways were significantly deregulated, it would be interesting to see all the pathways that were affected (e.g. aPEAR "bubble" graphs can show this (https://github.com/kerseviciute/aPEAR) , or something similar to NES scores). I appreciate the trickiness of it, but it would be quite interesting to see how authors e.g. Figure5e narrowed it down to the two pathways and how all the others looked like.

      We thank the reviewer for the suggestion of the aPEAR graphs. Following this suggestion, we have implemented a new function to enable clustering of the pathways based on overlapping metabolites (cluster_pk()). For more details regarding the method see also our response to Reviewer 1 (Comment 12) and our extended method section "Metabolite-set clustering" (Lines 656-671). We visualize the clustering results as a network graph, which we also included into Fig. 5f.

      The complete result of the KEGG enrichment can be found in Extended Data Table 1, Sheet 13 (pathway enrichment analysis using KEGG on the young-patient subset). The pathways are ranked by p.adjusted value and also include a score (FoldEnrichment) from the Fisher's exact test (similar to NES scores in GSEA). Here one can find a total of seven pathways passing the p.adjusted cutoff. For Fig. 5e we narrowed down to these two pathways based on the previous finding of dysregulated dipeptides (Fig. 5d), as we searched for a potential explanation of this observation.

      (15)

      15- Could you comment on the runtime of the pipeline? In particular, do the additional translation steps and use of multiple databases substantially affect computational speed?

      Downloading and parsing databases takes significant time; especially large ones like RaMP or HMDB might take minutes on a standard laptop. Our local cache speeds up the process by eliminating the need for repeated downloads. In the future, database access will be even faster: according to our plans, all prior knowledge will be accessible in an already parsed format via our own API (omnipathdb.org). The ambiguity analysis, which is a complex data transformation pipeline, and plotting by ggplot2, another key component of MetaProViz, are the slowest parts, especially when performing an analysis for the first time, when no cache can be used. This means there are a few slow operations, which complete in at most a few dozen seconds. However, the implementation and speed of these solutions doesn't fall behind what we commonly find in bioinformatics packages, and most importantly, the speed of MetaProViz doesn't pose an obstacle or difficulty regarding its efficient use in analysis pipelines.

      (16)

      16- I clap to the authors for automated checks if selected methods are appropriate!

      Thank you, this is something we think is important to ensure correct analysis and circumvent misinterpretation.

      (17)

      17- My suggestion would be to also look into power calculation or p-value histogram. In your example you saw some clear signal, but very frequently research studies are under-sampled and while effect can be clearly seen, there are just not enough samples to have statistically significant hits.

      We fully agree that power calculations are very important. Yet, these should ideally happen prior to the user's experiment; MetaProViz analysis starts at a later time point, when power calculations should already have been done. In regards to the p-value histogram, we have implemented a similar measure, namely a density plot, which is plotted as a quality control measure within the MetaProViz differential analysis function. The density plot is a smoothed version of a histogram that represents the distribution as a continuous probability density function and can be used to assess whether the p-values follow a uniform distribution.
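The diagnostic rests on a simple fact: under the global null, p-values are Uniform(0,1), so their distribution should be flat, with an excess near zero indicating true signal. A generic histogram-based sketch of the same idea (Python, synthetic p-values, not the MetaProViz density plot):

```python
import numpy as np

def first_bin_excess(pvals, bins=10):
    """Compare the count in the first p-value bin to the count expected
    under uniformity. A ratio well above 1 suggests true signal; a
    deficit or oddly shaped histogram suggests a miscalibrated test."""
    counts, _ = np.histogram(pvals, bins=bins, range=(0.0, 1.0))
    expected = len(pvals) / bins
    return counts[0] / expected

rng = np.random.default_rng(1)
null_only = rng.uniform(size=1000)                       # no signal
with_signal = np.concatenate([rng.uniform(size=900),
                              rng.uniform(0.0, 0.01, size=100)])
excess_null = first_bin_excess(null_only)                # ~1.0
excess_signal = first_bin_excess(with_signal)            # well above 1
```

The density plot in the differential analysis function serves the same purpose in smoothed form.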

      (18)

      18- Overall functional parts are novel and next step in helping with data interpretability, but I still found it hard to read into functionally clear insights (re to pathways / functional groupings of metabolites) - especially as you have e.g. enzyme-metabolite databases etc. I think clarity there could be improved and would help to get your message more widely across.

      Regarding the clarity of the pathway enrichment and its functional insights, we have extended the figure legends of Fig. 4 and 5, clearly state that MetalinksDB is the prior knowledge resource we used for the functional interpretation to identify the links for methionine (Line 367-368), and have extended our summary statement to highlight that we combine the biological clustering with prior knowledge for the mechanistic insight (Line 380-381).

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


    1. When receiving peer feedback, remember that your classmates are being asked to perform a task and that they, just like you, are just trying to perform the task the teacher asked them to perform.

      It’s important to keep an open mind and accept constructive criticism positively, because it’s meant to help you grow, not to attack you. On the contrary, it shows that others are on your side and want to see you succeed.

    1. Besides, there is something unattractive, unhealthy even, about Eucalyptus desertorum. It’s more like a bush than a tree; has hardly a trunk at all: just several stems sprouting at ground level, stunted and itchy-looking.

      This description makes the tree feel almost wrong or unsettling, especially compared to how trees are usually associated with strength or stability. Calling it “unattractive” and “unhealthy” seems intentional, as if the tree’s physical form mirrors something emotionally off, suggesting that nature here reflects deeper unease rather than comfort.

    2. But fiction doesn’t “merely narrate”: this is one of its great potencies. In the centuries that Western fiction has taken to arise, it’s evolved to do many things, especially in the most cannibalistic form, the novel.

      This stood out to me because it pushes back against the idea of fiction as just storytelling for entertainment. Calling the novel “cannibalistic” suggests it absorbs and reworks history, philosophy, and other forms of knowledge, which makes fiction feel less passive and more like an active way of thinking and understanding the world.

    3. Given all of this, my writer self thinks two things: first, being aware of visual elements such as texture, color, or symmetry can open windows and let us design as much as write. Text comes from texere, after all: to weave. Next, we

      This passage makes me think a lot about art and how one can go about making a painting, or a sculpture, etc. I think it's cool to think of literature this way as well because it is art that is really just expressed more through words.

    1. With the growing diversity of students in classrooms and a deeper understanding of how students learn, educators have been taking a fresh look at making teaching and learning accessible to all students.

      I agree with this because classrooms look a lot different than they used to, and teaching has had to adjust. Students come in with different backgrounds, learning styles, and needs, so using the same approach for everyone just doesn’t work anymore. I think it’s a good thing that educators are rethinking how lessons are taught, because when learning is more accessible, more students are actually able to participate and succeed instead of feeling left out or behind.

    1. This portion of the presentation will focus on the RSA public key encryption algorithm. As we saw at the end of the previous presentation, the Diffie Hellman algorithm has some impracticalities. It requires a setup phase, in which Alice has to exchange some public information with Bob, and not only with that Bob but with any Bob (meaning with an imposter without knowing it?) she wishes to exchange a key with. These impracticalities are avoided by another algorithm that we're going to look at, the RSA algorithm, which is the most widely used cryptographic algorithm. Here's the public key model that it uses. The cryptographic key is broken into two parts, a public and a private part. The public part is used for encrypting, so Bob's public key is used by Alice to encrypt the word hello, running it through the encrypt algorithm and then sending the result, this gibberish here, over to Bob. He uses the private part of his key to decrypt the gibberish and retrieve the word hello. The encryption happens using a key that's been divided into two, and as you can see visually, these two parts are related. They really are one key that's been divided in half. Include public model, slide 36

      We'll want to see how that happens, how that works. The key is broken into a public and private part. Bob and Alice publish their public keys, and that's the big difference between RSA and Diffie Hellman. They publish them so that all people who want to encrypt messages to Bob can use Bob's public key, Alice and anyone else. Alice encrypts the word hello using Bob's public key and sends the encrypted gibberish to Bob, who decrypts it with his private key. So let's see what makes RSA hard. What protects RSA from Eve? It's also based on a one-way function, and it's our familiar modular arithmetic function, which is easy in one direction and hard in the other.

      The expression:

      m^e mod N -> c

      m: message (a number) <--RSA requires messages to be represented by numbers. But this is not a problem, right? Everything is binary!

      In this case, the m in this expression represents the secret message that's being communicated. The e is a public exponent; it's part of the public key. The N is a public modulus, also part of the public key. And the c is the resulting encrypted message. Again, it's easy to compute m^e mod N knowing m, e, and N. But it's very difficult, in fact intractable, to find m given only the encrypted message plus the public key, just e and N.

      And again, the idea for the public and private key is that they are two halves of this exponent used in the expression m raised to an exponent mod N, and so the trick is to mathematically construct this exponent in such a way that it's very hard to break it in half if you don't know some secrets. So that's a very high level summary of the RSA algorithm. Slide 51

      Let's summarize its key features. First, like Diffie Hellman, the RSA algorithm solves the key exchange problem. Unlike Diffie Hellman, however, RSA public keys can be widely published and distributed rather than needing to be shared among the parties in an encryption transaction, and that makes it especially well suited for internet encryption. And finally, RSA is secured by the intractability of the prime factorization problem: the problem of trying to discover the prime factors of a very, very large number. So here again, we see intractability being used to protect information, just as we did when we used it to help protect passwords from brute force attacks (by having too many possible options).
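      These mechanics can be sketched in a few lines of Python. This is a toy illustration with hypothetical tiny primes chosen so the arithmetic is visible; real RSA uses primes hundreds of digits long plus padding schemes.

```python
# Toy RSA sketch (illustrative only; real RSA needs huge primes and padding).
p, q = 61, 53              # two secret primes (tiny, hypothetical values)
N = p * q                  # public modulus: 3233
phi = (p - 1) * (q - 1)    # Euler's totient of N: 3120 (kept secret)

e = 17                     # public exponent, coprime with phi
d = pow(e, -1, phi)        # private exponent: modular inverse of e mod phi

m = 65                     # the message as a number ("everything is binary")
c = pow(m, e, N)           # encrypt with the public half: m^e mod N
print(pow(c, d, N))        # decrypt with the private half: prints 65
```

      Recovering d from the public pair (e, N) requires factoring N back into p and q, which is exactly the intractable prime factorization problem that secures RSA.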

    1. To accommodate that and my readers’ tastes, I rejected all forms of sentimentality in my prose for a restrained austerity that read like knock-off Hemingway.

      It’s sad that Sindu felt she had to copy a "Western" writing style like Hemingway's just to be taken seriously by her teachers and editors. It shows how much pressure there is for immigrant writers to change their natural voice just to fit into what American readers expect "good" writing to look like.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers


      We are grateful for the reviewers' constructive comments and suggestions, which contributed to improving our manuscript. We are pleased to see that our work was described as an "interesting manuscript in which a lot of work has been undertaken". We are also encouraged by the fact that the experiments were considered "on the whole well done, carefully documented, and support most of the conclusions drawn," and that our findings were viewed as providing "mechanistic insight into how HNRNPK modulates prion propagation" and potentially offering "new mechanical insight of hnRNPK function and its interaction with TFAP2C."

      We conducted several new experiments and revised specific sections of the manuscript, as detailed below in the point-by-point response in this letter.

      Referee #1

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      The paper by Sellitto describes studies to determine the mechanism by which hnRNPK modulates the propagation of prions. The authors use cell models lacking HNRNPK, which is lethal, in a CRISPR screen to identify genes that suppress lethality. Based on this screen in 2 different cell lines, a gene termed Tfap2C emerged as a candidate for interaction with HNRNPK. They show that Tfap2C counteracts the actions of HNRNPK with respect to prion propagation. Cells lacking HNRNPK show increased PrPSc levels. Overexpression of Tfap2C suppresses PrPSc levels. These effects on PrPSc are independent of PrPC levels. By RNAseq analysis, the authors home in on metabolic pathways regulated by HNRNPK and Tfap2C, then follow the data to autophagy regulation by mTor. Ultimately, the authors show that short-term treatments of these cell models with mTor inhibitors cause increased accumulation of PrPSc. The authors conclude that the loss of HNRNPK leads to reduced energy metabolism causing mTor inhibition, which reduces translation by dephosphorylation of S6.

      Major comments:

      1) Fig. 2H and I, Supp. Fig. 3L. The interaction between Tfap2C and HNRNPK is pretty weak. The interaction may not be consequential. The experiment seems to be well controlled, yielding limited interaction. The co-IP was done in PBS with no detergent. The authors indicate that the cells were mechanically disrupted. Since both of these are DNA-binding proteins, is it possible that the observed interaction is due to proximity on DNA linking the 2 proteins? Including a DNase treatment would clarify this.

      Response: We agree that the observed co-IP between Tfap2c and hnRNP K is weak (previous Fig. 2H-I, Supp. Fig. 3L now shifted in Supp. Fig. 4C-E), and we have now highlighted this in the relevant section of the manuscript to reflect this observation better.

      Importantly, the co-IP was performed using endogenous proteins without overexpression or tagging, which can sometimes artificially enhance protein-protein interactions. However, we acknowledge that the use of a detergent-free lysis buffer and mechanical disruption alone may have limited nuclear protein extraction and solubilization, potentially contributing to the low co-IP signal.

      To address the reviewer's concerns and clarify whether the observed interaction could be DNA-mediated, we repeated the co-IP experiments under low-detergent conditions and included benzonase nuclease treatment to digest nucleic acids (Fig. 2H-I). DNA digestion was confirmed by agarose gel electrophoresis (Supp. Fig. 4F-G). Additionally, we performed the reciprocal IPs using both hnRNP K and Tfap2c antibodies (Fig. 2H-I). Although the level of co-immunoprecipitation remains modest, these updated experiments continue to demonstrate a specific co-immunoprecipitation between Tfap2c and hnRNP K, independent of DNA bridging. These additional controls and experimental refinements strengthen the validity of our findings. These results are also attached here for your convenience.

      2) Supplemental Fig 5B - The western blot images for pAMPK don't really look like a 2 fold increase in phosphorylation in HNRNPK deletion.

      Response: We thank the reviewer for raising this point. We re-examined the original pAMPK western blot (previously Supp. Fig. 5B; now presented as Supp. Fig. 6B) and confirmed the reported results. We note that the overall loading is not perfectly uniform across lanes (as suggested by the actin signal), which may affect the visual impression of band intensity. However, the phosphorylation change reported in the manuscript is based on the pAMPK/total AMPK ratio, which accounts for differences in AMPK expression and accurately reflects relative phosphorylation levels. To further address this concern, we performed three additional independent experiments. These new data reproduce the increase in pAMPK/AMPK upon HNRNPK deletion and are now included in the revised Supplementary Fig. 6B, together with the updated quantification. The new blot and the quantification are also attached here for your convenience.

      3) Fig. 5A - I don't think it is proper to do statistics on an n of 2.

      Response: We believe the reviewer's comment refers to Fig. 5B, as Fig. 5A already has sufficient replication. We have now added two additional replicates, bringing the total to four. The updated statistical analysis corroborates our initial results. The new quantification is provided in the revised manuscript (Fig. 5B) along with the new blot (Supp. Fig. 6C). Both data are also attached here for your convenience.

      4) Fig 6D. The data look a bit more complicated than described in the text. At 7 days, compared to 2 days, it looks like there is a decrease in % cells positive for 6D11. Is there clearance of PrPSc or proliferation of un-infected cells?

      Response: We have now reworded our text in the results paragraph as follows:

      "These data show that TFAP2C overexpression and HNRNPK downregulation bidirectionally regulate prion levels in cell culture."

      We have now also included the following comments in the discussion section:

      "However, prion propagation relies on a combination of intracellular PrPSc seeding and amplification, as well as intercellular spread, which together contribute to the maintenance and expansion of infected cells within the cultured population. In this study, we were limited in our ability to dissect which specific steps of the prion life cycle are affected by TFAP2C. We also cannot fully exclude the possibility that TFAP2C overexpression influenced the relative proliferation of prion-infected versus uninfected cells in the PG127-infected HovL culture, thereby contributing to the observed reduction in the percentage of 6D11+ cells and overall 6D11+ fluorescence. However, we did not observe any signs of cell death, growth impairment, or increased proliferation under TFAP2C overexpression in PG127-infected HovL cells compared to NBH controls (data not shown). This suggests that a negative selective pressure on infected cells or a proliferative advantage of uninfected cells is unlikely in this context".

      5) The authors might consider a different order of presenting the data. Fig 6 could follow Fig. 2 before the mechanistic studies in Figs 3-5.

      Response: We believe that the current order of presenting the data is more appropriate. The first part of the manuscript focuses on the genetic and functional interactions between hnRNP K and its partners, particularly TFAP2C, which is a critical point for understanding the broader context before delving into the mechanistic studies involving prion-infected cells.

      6) The authors use SEM throughout the paper and while this is often used, there has been some interest in using StdDev to show the full scope of variability.

      Response: We chose to use SEM as it reflects the precision of the mean, which is central to our statistical comparisons. As the reviewer notes, this is a common and appropriate practice. To address variability, almost all graphs already include individual data points, which provide a direct visual representation of data spread. To further enhance clarity, we have now included StdDev in the Supplementary Source Data table of the revised manuscript.
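      For reference, the two statistics differ only by a factor of the square root of the sample size (SEM = SD / sqrt(n)), so readers can convert between them whenever n is reported. A minimal sketch with invented replicate values:

```python
import math
import statistics

data = [4.1, 3.8, 4.4, 4.0]      # hypothetical replicate measurements

sd = statistics.stdev(data)       # sample SD: the spread of the data
sem = sd / math.sqrt(len(data))   # SEM: the precision of the mean
```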

      Discussion:

      The discrepancy between short-term and long-term treatments with mTor inhibitors is only briefly mentioned with a bit of a hand-waving explanation. The authors may need a better explanation.

      Response: We have now integrated a more detailed explanation in the discussion section of the revised manuscript as follows:

      "Previous studies showed that mTORC1/2 inhibition and autophagy activation generally reduce, rather than increase, PrPSc aggregation (79, 80). The reason for this discrepancy remains unclear and may be multifactorial. First, most prior studies were based on long-term mTOR inhibition, whereas our work examined acute inhibition, mimicking the time frame of HNRNPK and TFAP2C manipulation. Acute inhibition may trigger transient metabolic or signaling shifts that differ from adaptive changes associated with mTOR chronic inhibition, potentially overriding autophagy's effects on prion propagation. Additionally, while previous works were primarily conducted in murine in vivo models, our study focused on a human cell system propagating ovine prions. Differences in species background, model complexity (e.g., interactions between different cell types), and prion strain variability, as certain strains exhibit distinct responses to autophagy and mTOR modulation (https://doi.org/10.1371/journal.pone.0137958), likely contributed to the observed differences".

      Minor comments:

      Page 12 - no mention of chloroquine in the text or related data.

      Page 12 - Supp. Fig. E - should be 5E

      Response: We thank the reviewer for pointing this out. We have now better highlighted the use of chloroquine in Fig. 5B (see reviewer #1 - Point 3 - Major comments) and in the text as follows:

      "Furthermore, in the presence of chloroquine, LC3-II levels rose almost proportionally across all conditions (Fig. 5B), suggesting that the effects of HNRNPK and TFAP2C on autophagy occur at the level of autophagosome formation, rather than autophagosome-lysosome fusion and degradation."

      We have corrected the reference to Supp. Fig. 5E.

      Reviewer #1 (Significance (Required)):

      The study provides mechanistic insight into how HNRNPK modulates prion propagation. The paper is limited to cell models, and the authors note that long term treatment with mTor inhibitors reduced PrPSc levels in an in vivo model.

      The primary audience will be other prion researchers. There may be some broader interest in the mTor pathway and the role of HNRNPK in other neurodegenerative diseases.

      Referee #2

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The manuscript "Prion propagation is controlled by a hierarchical network involving the nuclear Tfap2c and hnRNP K factors and the cytosolic mTORC1 complex" by Sellitto et al. aims to examine how heterogeneous nuclear ribonucleoprotein K (hnRNPK) limits prion propagation. They perform a synthetic-viability CRISPR-ablation screen to identify epistatic interactors of HNRNPK. They found that deletion of Transcription factor AP-2γ (TFAP2C) suppressed the death of hnRNP K-depleted LN-229 and U-251 MG cells, whereas its overexpression hypersensitized them to hnRNP K loss. Moreover, HNRNPK ablation decreased cellular ATP, downregulated genes related to lipid and glucose metabolism, and enhanced autophagy. Simultaneous deletion of TFAP2C reversed these effects, restored transcription and alleviated the energy deficiency. They state that HNRNPK and TFAP2C are linked to mTOR signalling and observe that HNRNPK ablation inhibits mTORC1 activity through downregulation of mTOR and Rptor, while TFAP2C overexpression enhances mTORC1 downstream functions. In prion-infected cells, TFAP2C activation reduced prion levels and countered the increased prion propagation due to HNRNPK suppression. Pharmacological inhibition of mTOR also elevated prion levels and partially mimicked the effects of HNRNPK silencing. They state that their study identifies TFAP2C as a genetic interactor of HNRNPK, implicates their roles in mTOR metabolic regulation, and establishes a causative link between these activities and prion propagation.

      This is an interesting manuscript in which a lot of work has been undertaken. The experiments are on the whole well done, carefully documented and support most of the conclusions drawn. However, there are places where it was quite difficult to read as some of the important results are in the supplementary Figures and it was necessary to go back and forth between the Figs in the main body of the paper and the supplementary Figs. There are also Figures in the supplementary which should have been presented in the main body of the paper. These are indicated in our comments below.

      We have the following questions /points:

      Major comments:

      1) A plasmid harbouring four guide RNAs driven by four distinct constitutive promoters is used for targeting HNRNPK - is there a reason for using 4 guides? Is it simply to obtain maximal editing? In their experience, is this required for all genes or specific to HNRNPK?

      Response: The use of four guide RNAs driven by distinct promoters is chosen to maximize editing efficiency for HNRNPK. As previously demonstrated by J. A. Yin et al. (Ref. 32), this system provides better efficiency for gene knockout (or activation). For HNRNPK, achieving full knockout was crucial for observing a complete lethal phenotype, which made the four guide RNAs approach fundamental. However, other knockout systems, while potentially less efficient, have been shown to work well in other circumstances. We have now included this explanation in the revised manuscript as follows:

      "We employed a plasmid harboring quadruple non-overlapping single-guide RNAs (qgRNAs), driven by four distinct constitutive promoters, to target the human HNRNPK gene and maximize editing efficiency in polyclonal LN-229 and U-251 MG cells stably expressing Cas9 (32)."

      2) Is there a minimal amount of Cas9 required for editing?

      Response: We did not observe a correlation between Cas9 levels and activity, yet the C3 clone was the one with higher Cas9 expression and higher activity (Supp. Fig. 1A-B). We agree that comments about the amount of Cas9 expression may be misleading here. Thus, in the first result paragraph of the revised manuscript, we have now modified the text "we isolated by limiting dilutions LN-229 clones expressing high Cas9 levels" to "we isolated by limiting dilutions LN-229 single-cell clones expressing Cas9".

      3) It is stated that cell death is delayed in U251-MG cells compared to LN-229-C3 cells- why? Also, why use glioblastoma cells other than that they have high levels of HNRNPK? Would neuroblastoma cells be more appropriate if they are aiming to test for prion propagation?

      Response: As shown in Fig. 1A, U251-MG cells reached complete cell death at day 13, while LN-229 C3 reached it already at day 10. The percentage of viable U251-MG cells is higher (statistically significant) than LN-229 C3 cells at all time points before day 13, when both lines show complete death. The underlying reasons for this partial and relative resistance are probably multiple, but we clearly showed in Fig. 2 that TFAP2C differential expression is one modulator of cell sensitivity to HNRNPK ablation.

      We selected glioblastoma cells because their high expression of HNRNPK was essential for developing our synthetic lethality screen strategy, and we have now clarified it in the revised manuscript as follows:

      "As model systems, we chose the human glioblastoma-derived LN-229 and U-251 MG cell lines, which express high levels of HNRNPK (2, 3), a key factor for optimizing our synthetic lethality screen."

      While neuroblastoma cells might be more relevant in terms of prion neurotoxicity, glial cells, despite their resistance to prion toxicity, are fully capable of propagating prions. Prion propagation in glial cells has been shown to play crucial roles in mediating prion-dependent neuronal loss in a non-autonomous manner (see 10.1111/bpa.13056). This makes glioblastoma cells a valuable model for studying prion propagation (that is the focus of our study), despite the lack of direct toxicity (which is not the focus of our study). We have now added this explanation to the revised manuscript as follows:

      "Therefore, we continued our experiments using LN-229 cells, which provide a relevant model for studying prions, as glial cells can propagate prions and contribute to prion-induced neuronal loss through non-cell-autonomous mechanisms."

      4) Human CRISPR Brunello pooled library - does the Brunello library use constructs which have four independent guide RNAs, as used for the silencing of HNRNPK?

      Response: No, the Human CRISPR Brunello pooled library does not use constructs with four independent guide RNAs (qgRNAs). Instead, each gene is targeted by 4 different single-guide RNAs (sgRNAs), each expressed on a separate plasmid. We have now clarified this in the main text of the revised manuscript as follows:

      "To identify functionally relevant epistatic interactors of HNRNPK, we conducted a whole-genome ablation screen in LN-229 C3 cells using the Human CRISPR Brunello pooled library (33), which targets 19,114 genes with an average of four distinct sgRNAs per gene, each expressed by a separate plasmid (total = 76,441 sgRNA plasmids)."

      5) To rank the 763 enriched genes, they multiply the -log10FDR with their effect size - is this a standard step that is normally undertaken?

      Response: The approach of ranking hits using the product of effect size and statistical significance is a well-established method in CRISPR screening studies. This strategy has been explicitly used in high-impact work by Martin Kampmann and others (see https://doi.org/10.1371/journal.pgen.1009103 and https://doi.org/10.1016/j.neuron.2019.07.014 as references). We have now added both references to the revised manuscript.
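      As a minimal illustration of this ranking scheme (made-up gene names and values, not the actual screen data):

```python
import math

# Hypothetical hits: (gene, FDR, effect size). All values are invented.
hits = [("geneA", 1e-8, 2.1), ("geneB", 1e-6, 3.0), ("geneC", 0.01, 0.5)]

# Score each gene by -log10(FDR) times effect size, so a top-ranked hit
# must be both statistically significant and large in magnitude.
ranked = sorted(hits, key=lambda g: -math.log10(g[1]) * g[2], reverse=True)
```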

      6) The 32 genes selected- they were ablated individually using constructs with one guide RNA or four guide RNAs?

      Response: The 32 genes selected were ablated individually using constructs with quadruple-guide RNAs (qgRNAs), as this approach was intended to maximize editing efficiency for each gene. We have now clarified this in the main text of the revised manuscript as follows:

      "We ablated each gene individually using qgRNAs and then deleted HNRNPK."

      7) The identified targets were also tested in U251-MG cells and nine were confirmed but the percent viability was variable - is the variability simply a reflection of the different cell line?

      Response: The variability in percent viability observed in U251-MG cells likely reflects the inherent differences between cell lines, which can contribute to varying levels of susceptibility to gene ablation, even for the same targets. We have now highlighted these small differences in the main text of the revised manuscript as follows:

      "We confirmed a total of 9 hits (Fig. 1H), including the ELPs gene IKBAKP and the transcription factor TFAP2C, the two strongest hits identified in LN-229 C3 cells. However, in the U251-Cas9 the rescue effect did not always fall within the exact range observed in LN-229 C3 cells, likely due to intrinsic differences between the two cell lines."

      8) The two strongest hits were IKBAKP and TFAP2C. As TFAP2C is a transcription factor, is it known to modulate expression of any of the genes that were identified to be perturbed in the screen? Moreover, it is stated that it regulates expression of several lncRNAs - have the authors looked at expression of these lncRNAs? Is their expression affected? Can modulation of expression of these lncRNAs modulate the observed phenotypic effects and also some of the targets they have identified in the screen?

      Response: While TFAP2C is a transcription factor known to regulate the expression of several genes and lncRNAs, we did not identify any of its known target genes among the hits of our screen. However, our RNA-seq data and RT-qPCR (data not shown) indicate that the expression of lncRNA MALAT1 and NEAT1 (reported to interact with both HNRNPK and TFAP2C; ref 37, 41, 47) is strongly affected by HNRNPK ablation and to a lesser extent by TFAP2C deletion. However, the double deletion condition does not appear to change these lncRNA levels beyond what is observed with HNRNPK ablation alone. Therefore, we concluded that these changes do not play a primary role in the phenotypic effects observed in our study. Thus, although interesting, we believe that the description of such observations goes beyond the scope of this manuscript and the relevance of this work.

      9) As both HNRNPK and TFAP2C modulate glucose metabolism, the authors have chosen to explore the epistatic interaction. This is most reasonable.

      Response: We do not have further comments on this point.

      10) The orthogonal assay to confirm that deletion of TFAP2C suppresses cell death upon removing HNRNPK - was this done using a single guide RNA or multiple guides? Is there a level of suppression required to observe rescue? Interestingly, ablation of HNRNPK increases TFAP2C expression in LN-229-C3 cells, whereas in U251-Cas9 cells HNRNPK ablation has the opposite effect - both RNA and protein levels of TFAP2C are decreased. Is this the cause of the smaller protective effect of TFAP2C deletion in this cell line?

      Response: TFAP2C deletion was performed using quadruple-guide RNAs (qgRNAs). We have clarified this point in addressing reviewer #2's point 6 in "Major comments".

      We did not directly test the threshold of TFAP2C inhibition required to suppress HNRNPK ablation-induced cell death. We cannot exclude that other effectors may play a role in the smaller protective effect of TFAP2C deletion in the U251-Cas9 cells; however, multiple lines of evidence from our study suggest that TFAP2C expression levels influence cellular sensitivity to HNRNPK loss:

      1) Both LN-229 C3 and U251-Cas9 cells are less sensitive to HNRNPK ablation upon TFAP2C deletion (Fig. 1G-H, Fig. 2A-B, Supp. Fig.3A-B).

      2) We observed a correlation between endogenous TFAP2C levels and HNRNPK ablation sensitivity. U251-Cas9 cells, where TFAP2C expression is reduced upon HNRNPK ablation (in contrast to LN-229 C3 cells, where HNRNPK ablation leads to an increase in TFAP2C expression) (Fig. 2C-F), are a) less sensitive to HNRNPK deletion than LN-229 C3 (Fig. 1A, 2A-B) and b) the protective effect of TFAP2C deletion is less pronounced than in LN-229 C3 (Fig. 1G-H, Fig. 2A-B, Supp. Fig.3A-B).

      3) TFAP2C overexpression experiments (Fig. 2G) establish a causal relationship to the former correlation: TFAP2C overexpression increased U251-Cas9 sensitivity to HNRNPK ablation.

      As clearly mentioned in the manuscript, we believe that, taken together, these findings strongly demonstrate a causal role for TFAP2C in modulating sensitivity to HNRNPK loss. Thus, despite the differences in the expression, the proposed viability interaction between TFAP2C and HNRNPK is conserved across cell lines.

      To further strengthen our conclusions, we have now added LN-229 C3 TFAP2C overexpression in Fig. 2G (also attached below for your convenience). As for the U251-Cas9, LN-229 C3 cells show increased sensitivity to HNRNPK ablation upon TFAP2C overexpression.

      11) Nuclear localisation studies indicate that the HNRNPK and TFAP2C proteins colocalise in the nucleus; however, the co-IP data are not convincing - although appropriate controls are present, the level of interaction is very low. The amount of HNRNPK pulled down by TFAP2C is really very low in the LN-229 C3 cells and even lower in the U251-Cas9 cells. Have they undertaken the reciprocal co-IP experiment?

      Response: We rephrased our text to better highlight this, as also mentioned in our response to reviewer #1 (Point 1 - Major comments). However, as also noted by the reviewer, the experiments included all the relevant controls. Thus, the results are solid and confirm a degree of co-immunoprecipitation (although weak). As detailed in our response to reviewer #1 (Point 1 - Major comments), to strengthen our conclusion, we have now repeated the experiment in low-detergent conditions and used benzonase nuclease for DNA digestion. We have also performed the reciprocal experiment as suggested by the reviewer, confirming the initial results. In our opinion, these additional experiments support the conclusion that Tfap2c and hnRNP K co-immunoprecipitate through a weak, but direct, interaction.

      12) They state that LN-229 C3 ∆TFAP2C and U251-Cas9 ∆TFAP2C were only mildly resistant to the apoptotic action of staurosporin Fig 3E and F - I accept they have undertaken the stats which support their statement that at high concentrations of staurosporin the LN-229 C3 ∆TFAP2C cells are less sensitive but the U251-Cas9 ∆TFAP2C decreased sensitivity is hard to believe. Has this been replicated? I agree that HNRNPK deletion causes apoptosis in both LN-229 C3 and U251-Cas9 cells and this is blocked by Z-VAD-FMK - however the block is not complete- the max viability for HNRNPK deletion in LN-229 C3 cells is about 40% whereas for U251-Cas9 cells it is about 30% - does this suggest that cells are being lost by another pathway. Have they tested concentrations higher than 10nM?

      Response: The experiments in Fig. 3E-F have been replicated four times, as stated in the figure legend. We agree that TFAP2C plays a limited role in the response to staurosporine-induced apoptosis, particularly in U251-Cas9 cells. To ensure clarity, we have now modified our previous sentence as follows:

      "LN-229 C3ΔTFAP2C cells were only mildly resistant to the apoptotic action of staurosporine, and U251-Cas9ΔTFAP2C cells showed even lower, minimal recovery (Fig. 3E-F). These results indicate that TFAP2C plays a limited role in apoptosis regulation and suggest that its suppressive effect on HNRNPK essentiality is not mediated through direct modulation of apoptosis but rather through upstream processes that eventually converge on it."

      The incomplete blockade of apoptosis by Z-VAD-FMK suggests that HNRNPK ablation may activate alternative, non-caspase-mediated cell death pathways. Regarding this point, we decided not to test Z-VAD-FMK above 10 nM because the rescue effect observed at the lowest concentration (2 nM) did not increase proportionally at higher concentrations, suggesting that saturation had already been reached. We have now added and clarified these observations in the revised manuscript as follows:

      "Z-VAD-FMK decreased cell death consistently and significantly in LN-229 C3 and U251-Cas9 cells transduced with HNRNPK ablation qgRNAs (Fig. 3C‑D), confirming that HNRNPK deletion promotes cell apoptosis. However, we observed that viability recovery already plateaued at the lowest concentration (2 nM), without further increase at higher doses, suggesting a saturation effect. This indicates that while caspase inhibition alleviates part of the cell death, HNRNPK loss triggers additional mechanisms beyond apoptosis."

      Following the suggestion of the reviewer, we have now also tested two higher concentrations of Z-VAD (20 and 50 nM) in LN-229 cells. At these concentrations, we observed a slight decrease in cell viability in the NT condition, with a rescue effect in the HNRNPK-ablated cells comparable to what was observed at 2-10 nM Z-VAD. For this reason, we did not include these data in the revised manuscript, and we have attached them here for transparency.

      13) The RNA-seq comparisons - the authors use a low log2 FC threshold.

      Response: We used a log2 FC threshold of >0.5, and a moderate log2 FC cutoff (>0.25) is commonly used in RNA-seq studies to capture biologically relevant shifts (e.g., https://doi.org/10.1371/journal.ppat.1012552; https://doi.org/10.1371/journal.ppat.1008653; https://doi.org/10.1016/j.neuron.2025.03.008; https://doi.org/10.15252/embj.2022112338). We complemented this analysis with Gene Set Enrichment Analysis (GSEA) to assess coordinated changes in biological/genetic pathways, ensuring that our conclusions are not based on isolated, minor expression changes nor on arbitrary thresholds. Finally, to enhance the robustness of our results, we applied False Discovery Rate (FDR) statistics, which are more stringent than a p-value cutoff. We hope this clarification strengthens the reviewer's confidence in the significance of the observed changes.
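      For illustration only (this is not the actual analysis pipeline; the gene names, fold changes, and p-values below are invented), the filtering strategy described in the response, a |log2 FC| cutoff combined with Benjamini-Hochberg FDR correction, can be sketched as:

```python
# Sketch of DEG filtering: keep genes with |log2FC| > 0.5 and FDR < 0.05.
# All gene names and statistics are hypothetical, for demonstration only.

def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values (FDR q-values)."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    prev = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        prev = min(prev, pvals[i] * n / rank)
        adjusted[i] = prev
    return adjusted

def filter_degs(genes, log2fc, pvals, fc_cutoff=0.5, fdr_cutoff=0.05):
    """Return genes passing both the fold-change and FDR thresholds."""
    fdr = benjamini_hochberg(pvals)
    return [g for g, fc, q in zip(genes, log2fc, fdr)
            if abs(fc) > fc_cutoff and q < fdr_cutoff]

genes  = ["MTOR", "RPTOR", "AKT1S1", "GAPDH"]
log2fc = [-1.2,   -0.8,    -0.6,     0.1]
pvals  = [1e-5,   1e-4,    2e-3,     0.7]
print(filter_degs(genes, log2fc, pvals))  # → ['MTOR', 'RPTOR', 'AKT1S1']
```

In practice this filtering is done by dedicated packages (e.g., DESeq2 or edgeR); the sketch only illustrates why an FDR cutoff is more stringent than a raw p-value cutoff.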

      14) It is stated: "Accordingly, we observed increased AMPK phosphorylation (pAMPK) upon ablation of HNRNPK, which was consistently reduced in LN-229 C3ΔTFAP2C cells (Supp. Fig. 5B). LN-229 C3ΔTFAP2C;ΔHNRNPK cells also showed a partial reduction of pAMPK relative to LN-229 C3ΔHNRNPK cells (Supp. Fig. 5B). These results suggest that hnRNP K depletion causes an energy shortfall, leading to cell death."

      I am not totally convinced by the data presented in this Fig. The authors have quantified the band intensity and present the ratio of pAMPK to AMPK. Please note that the actin levels are variable across the samples - did they normalise the data using the actin level before undertaking the comparisons? Also, if the authors think this is an important point which supports their conclusion, then it should be in the main body of the paper rather than the supplementary. If AMPK is being phosphorylated, this should lead to activation of the metabolic check point which involves p53 activation by phosphorylation. Activated p53 would turn on p21CIP1 which is a very sensitive indicator of p53 activation.

      Response: We also refer the reviewer to our response to reviewer #1 (Point 2 - Major comments). We understand the reviewer's point, as pAMPK/Actin (absolute AMPK phosphorylation) may provide additional context regarding the downstream effects of AMPK activation, which, however, is not the primary scope of our experiment. We believe that, in our specific case, a) the pAMPK/AMPK ratio is the most appropriate metric, as it reflects the energy status of the cell (ATP/AMP levels), which was the main point we aimed to assess in this experiment, and b) phospho-protein/total protein is the standard approach for quantifying a phosphorylation ratio. For completeness, we have now included pAMPK/Actin quantifications in Supp. Fig. 6B of the revised manuscript (also attached below). pAMPK/Actin levels follow the same trend as pAMPK/AMPK in HNRNPK and TFAP2C single ablations. The partial pAMPK/AMPK rescue in the HNRNPK;TFAP2C double ablation relative to the HNRNPK single deletion is instead not observed at the pAMPK/Actin level. We have now added the pAMPK/Actin quantification and this observation to the revised manuscript as follows:

      "Accordingly, we observed increased AMPK phosphorylation (pAMPK/AMPK ratio and pAMPK/Actin) upon ablation of HNRNPK, with a trend toward reduction in LN-229 C3ΔTFAP2C cells (Supp. Fig. 6B). LN-229 C3ΔTFAP2C;ΔHNRNPK cells also showed a reduction of pAMPK/AMPK ratio relative to LN-229 C3ΔHNRNPK cells, although absolute AMPK phosphorylation (pAMPK/Actin) remained high (Supp. Fig. 6B)."

      We prefer to keep the AMPK blots in Supplementary Fig. 6B, as we believe the main take-home message of the manuscript should remain centered on mTORC1 activity.
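      As an aside, the distinction between the two normalizations discussed in this point can be illustrated with a minimal sketch (all band intensities below are hypothetical numbers, not our densitometry data):

```python
# Sketch of the two densitometry normalizations: pAMPK/AMPK reflects the
# fraction of AMPK that is phosphorylated, while pAMPK/Actin reflects the
# absolute phospho-signal per loaded protein. Intensities are invented.

def ratios(pampk, ampk, actin):
    """Return (pAMPK/AMPK, pAMPK/Actin) for one lane of a blot."""
    return pampk / ampk, pampk / actin

# Hypothetical lanes: untreated control vs HNRNPK-ablated.
nt      = ratios(pampk=1.0, ampk=2.0, actin=1.0)  # (0.5, 1.0)
ablated = ratios(pampk=3.0, ampk=2.0, actin=1.5)  # (1.5, 2.0)

# Both metrics report increased AMPK phosphorylation upon ablation,
# but they can diverge whenever total AMPK itself changes between lanes.
print(nt, ablated)
```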

      15) We also do not understand why the mTOR Suppl. Fig. 5E is not in the main body of the paper. It's clear that RNA and protein levels of mTOR were downregulated in LN-229 C3ΔHNRNPK cells but were partially rebalanced by the ΔTFAP2C- however the ΔTFAP2C;ΔHNRNPK double deletion levels are only slightly higher than the ΔHNRNPK - they are not at the level NT or even ΔTFAP2C (Fig. 4C, Supp. Fig. 5E).

      Response: We moved the mTOR blot to Fig. 5D of the revised manuscript. Regarding the partial rescue effect, this is in line with all our other observations, in which rescue of the effects of HNRNPK ablation is never complete but only partial. As suggested by reviewer #3 (Figure 5 - Point 2), we have now added RT-qPCR data in Fig. 5C, which corroborate these results.

      16) The authors state: "Deletion of HNRNPK diminished the highly phosphorylated forms of 4EBP1, which instead were preserved in both LN-229 C3ΔTFAP2C and LN-229 C3ΔTFAP2C;ΔHNRNPK cells (Fig. 5C). Similarly, the S6 phosphorylation ratio was reduced in LN-229 C3ΔHNRNPK cells and was restored in the ΔTFAP2C;ΔHNRNPK double-ablated cells (Fig. 5C)."

      We are not convinced that p4EBP1 is preserved in the LN-229 C3ΔTFAP2C cells - there is a very faint band which is at a lower level than the band in the LN-229 C3ΔHNRNPK cells. However, when both HNRNPK and TFAP2C were ablated, the p4EBP1 band is clear cut. I agree with the quantitation that deletion of HNRNPK and TFAP2C both reduce the level of 4EBP1 - the reduction is greater with TFAP2C but when both are deleted together the levels of 4EBP1 are higher and p4EBP1 is clearly present. In quantifying the S6 and pS6 levels, did the authors consider the actin levels? They present a ratio of the pS6 to S6. I may be lacking some understanding, but why is the ratio of pS6/S6 being calculated? Is the level of pS6 not what is important - phosphorylation of S6 should lead to it being activated and thus it's the actual level of pS6 that is important, not the ratio to the non-phosphorylated protein.

      Response: In Fig. 5C, the three-band pattern of 4EBP1 is clearly visible in the NT+NT or WT condition, with the top band representing the highest phosphorylation state. Upon HNRNPK deletion, this top band almost completely disappears, mimicking the effect of our starvation control (Starv.). This top band remains clearly visible in both TFAP2C-ablated and double-ablated cells, supporting our conclusion. In our original text, we referred to the "highly phosphorylated forms" of 4EBP1, which might have caused confusion by suggesting that we were evaluating the two top bands. We are specifically referring only to the very top band (high p4EBP1), which represents the most highly phosphorylated form of 4EBP1. This is the relevant phosphorylated form to focus on, as it is the only one that disappears in the starvation control (Starv.) or upon mTORC1/2 inhibition with Torin-1 (Fig. 7B).

      To better clarify these points, we have now more clearly indicated the "high p4EBP1" band with an asterisk in Fig. 5E, added quantification of high p4EBP1/4EBP1, and rephrased the text as follows:

      "Deletion of HNRNPK diminished the highest phosphorylated form of 4EBP1 (high p4EBP1, marked with an asterisk), mimicking the effect observed in starved cells (Starv.). This high p4EBP1 band was preserved in both LN-229 C3ΔTFAP2C and LN-229 C3ΔTFAP2C;ΔHNRNPK cells (Fig. 5C)."

      Regarding pS6 quantification, we added pS6/Actin quantification in Supp. Fig. 6E and F of the revised manuscript, also attached here for your convenience.

      17) When determining ATP levels, do they control for cell number? HNRNPK depletion results in lower ATP levels, co-deletion of TFAP2C rescues this. But this could be because there is less cell-death? So, more cells express ATP. Have they controlled for relative numbers of cells.

      Response: As described in the Materials and Methods, we normalized ATP levels to total protein content, which is a standard approach for this type of quantification (see DOI:10.1038/nature19312).
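      A minimal sketch of this kind of normalization, using hypothetical luminescence and protein values (not our data), shows how it removes the cell-number confound the reviewer raises:

```python
# Sketch: normalizing an ATP luminescence readout to total protein content,
# so that differences in surviving cell number/biomass do not confound the
# comparison. All numbers are invented for illustration.

def atp_per_protein(luminescence, protein_ug):
    """ATP signal normalized to total protein (RLU per µg protein)."""
    if protein_ug <= 0:
        raise ValueError("protein content must be positive")
    return luminescence / protein_ug

# Hypothetical wells: a well with fewer surviving cells gives both a lower
# raw signal and lower protein, so the normalized values stay comparable.
control = atp_per_protein(luminescence=50000, protein_ug=25.0)  # 2000.0
ablated = atp_per_protein(luminescence=20000, protein_ug=20.0)  # 1000.0
print(control, ablated)
```

Under this normalization, a lower value for the ablated condition reflects less ATP per unit of biomass rather than simply fewer cells.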

      18) The construction of the HovL cell line that propagate ovine prions - very few details are provided of the susceptibility of the cell line to PG127 prions.

      Response: As with other prion-infected cell lines, HovL cells do not exhibit any specific growth defects, susceptibilities, or phenotypes beyond their ability to propagate prions. This is consistent with established observations in prion research, where immortalized cell lines (and in general in vitro cultures) normally do not show cytotoxicity upon prion infection and, therefore, are used as models for prion propagation rather than for prion toxicity (see https://doi.org/10.1111/jnc.14956 for reference).

      We now expanded the relevant section, including technical and conceptual details in the main text of the revised manuscript as follows:

      "As reported for other ovinized cell models (66), HovL cells were susceptible to infection by the PG127 strain of ovine prions and capable of sustaining chronic prion propagation, as shown by proteinase K (PK)-digested western blot and by detection of PrPSc using the anti-PrP antibody 6D11, which selectively stains prion-infected cells after fixation and guanidinium treatment (67) (Supp. Fig. 7C-E). Consistent with most prion-propagating cell lines (68), HovL cells did not exhibit specific growth defects, susceptibilities, or overt phenotypes beyond their ability to propagate prions."

      19) It is stated that HNRNPK depletion from HovL cells increases PrPSc as determined by 6D11 fluorescence, but in the manuscript HNRNPK depletion results in cell death. How does this come together?

      Response: As explicitly stated in the main text and shown in Fig.6-7, HNRNPK is downregulated (via siRNAs) in the prion experiments rather than fully deleted (via CRISPR) as in the first part of the manuscript. As shown in Supp. Fig. 8B, this downregulation does not affect cell viability within the experimental time window. Therefore, the observed increase in PrPSc levels upon HNRNPK downregulation, as determined by western blot and 6D11 staining, is independent of any potential cell death effects. Moreover, the same siRNA downregulation approach was used by M. Avar et al. (Ref. 26) in comparable experiments, yielding similar outcomes.

      20) They show that mTOR inhibition mimics the effect of HNRNPK deletion, why didn't they overexpress mTOR and see if that rescues this? This would indicate a causal relationship.

      Response: We appreciate the reviewer's suggestion. We agree that the proposed rescue strategy would be the best approach to establish a causal relationship. However, we linked the activity of the mTORC1 complex (and not only that of mTOR) to prion propagation. Overexpression of mTOR alone would not restore full mTORC1 function, as Rptor would still be downregulated in the context of HNRNPK siRNA silencing (Fig. 7A and Supp. Fig. 8E). Moreover, our RNA-seq data (Supp. Table 5) from HNRNPK ablation indicate the downregulation of other mTORC1 components (namely Pras40 (AKT1S1) and mLST8). Therefore, rescuing mTORC1 activity through an overexpression strategy would be very challenging. Given these complexities, to infer causality, we used mTORC1 inhibition (via rapamycin and Torin-1) to mimic the effects of HNRNPK downregulation in reducing mTORC1 activity (Fig. 7B).

      For clarification, we have now highlighted in Fig. 4C that HNRNPK ablation also downregulates AKT1S1 and mLST8, in addition to mTOR and Rptor (also attached below), and we have discussed this in the main text as well. We have also clarified in the revised manuscript (where we sometimes inadvertently referred to it as just mTOR inhibition) that the observed effects are due to mTORC1 inhibition, and not simply mTOR inhibition.

      21) Flow cytometric data: supplementary Fig of Fig6d. - when they are looking at fixed cells the gating strategy for cells results in the inclusion of a lot of debris. The gate needs to be moved and be more specific to ensure results are interpreted properly. Same with the singlet gating. It's not tight enough, they include doublets as well which will skew their data. The gating strategy needs to be regated.

      Response: We have reanalyzed the flow cytometry data in Fig. 6D with a more stringent gating approach to better exclude debris and ensure proper singlet selection. We confirm that there is no change in the final interpretation of the results after applying the updated gating strategy.

      Reviewer #2 (Significance (Required)):

      The manuscript "Prion propagation is controlled by a hierarchical network involving the nuclear Tfap2c and hnRNP K factors and the cytosolic mTORC1 complex" by Sellitto et al. aims to examine how heterogeneous nuclear ribonucleoprotein K (hnRNP K) limits prion propagation. They perform a synthetic-viability CRISPR-ablation screen to identify epistatic interactors of HNRNPK. They found that deletion of Transcription factor AP-2γ (TFAP2C) suppressed the death of hnRNP K-depleted LN-229 and U-251 MG cells, whereas its overexpression hypersensitized them to hnRNP K loss. Moreover, HNRNPK ablation decreased cellular ATP, downregulated genes related to lipid and glucose metabolism, and enhanced autophagy. Simultaneous deletion of TFAP2C reversed these effects, restored transcription, and alleviated the energy deficiency.

      Referee #3

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary: Using a CRISPR-based high-throughput ablation assay, Sellitto et al. identified a list of genes that improve cell viability when deleted in hnRNP K knockout cells. Tfap2c, a transcription factor, was identified as a candidate with potential overlap with hnRNP K functions such as modulating glucose metabolism. The deletion of Tfap2c in the hnRNP K-deletion background prevented the caspase-dependent apoptosis observed in hnRNP K single-deletion cells. Further analysis of bulk RNA-seq in hnRNP K/TFAP2C single- and double-deletion cells revealed an impairment in cellular ATP levels. Accordingly, activation of AMPK led to perturbed autophagy in hnRNP K-deleted cells. Moreover, the reduction and/or inactivation of the downstream mTOR protein resulted in reduced phosphorylation of S6. Conversely, the phosphorylation of S6 and 4EBP1 can be increased by TFAP2C overexpression. Finally, pharmacological inhibition of the mTOR pathway increased the PrPSc level. This is an interesting paper potentially providing new mechanistic insight into hnRNP K function and its interaction with TFAP2C. However, inconsistencies in TFAP2C expression across cell lines and conflicting mechanistic interpretations complicate the conclusions. Co-IP experiments suggested that hnRNP K and Tfap2c may interact, though further validation is needed. Several figures require additional clarification, statistical analysis, or experimental validation to strengthen the conclusions.

      Major comments:

      1) Different responses of the TFAP2C expression level to deletion of hnRNPK in the two cell lines (LN-229 C3 and U251-Cas9) should be more adequately addressed. The manuscript focuses on the interaction between hnRNPK and TFAP2C, yet hnRNPK deletion causes different changes in the TFAP2C level in the two lines. Furthermore, in studies where the mechanistic link between hnRNPK and TFAP2C is being investigated, only results from the LN-229 line are presented (Figure 4-7). Thus, it is not clear whether these mechanisms also apply to the other line, U251-Cas9, where hnRNPK deletion has the opposite effect on the TFAP2C level. Thus, key experiments should be performed in both lines.

      Response: The opposite effects of hnRNPK ablation on TFAP2C expression between LN-229 C3 and U251-Cas9 cells likely reflect intrinsic differences between the two cell lines. However, the viability interaction between hnRNPK and TFAP2C is conserved in both cell models (Fig. 1G-H, 2A-B, Supp. Fig. 3A-B), suggesting that shared molecular functions at the interface of this interaction exist across the lines. In fact, we believe that the opposite effect of hnRNPK ablation on TFAP2C expression in the two lines strengthens (rather than weakens) our model by highlighting how TFAP2C expression modulates cellular sensitivity to HNRNPK ablation, as detailed in our response to Reviewer #2 (Point 10 - Major comments).

      Regarding the mechanistic studies presented in Fig. 4-7, our initial goal in using two cell lines was to validate the functional viability interaction between HNRNPK and TFAP2C, as identified in our screening (performed in LN-229 C3 cells). After confirming this interaction, we chose to focus only on LN-229 C3 (beginning with RNA-seq analysis, which then led to subsequent mechanistic studies), as this provided the necessary foundation to investigate prion propagation in HovL cells (derived from LN-229). As a U251 model propagating prions does not exist, we are technically limited to performing prion experiments in HovL cells, and we do not believe that conducting additional experiments in U251 cells would add substantial value to our work or further our investigation.

      We hope this explanation clarifies our rationale and addresses the reviewer's concerns.

      2) Although a lot of data are presented, it is not clear how deletion of TFAP2C reverses the toxicity caused by deletion of hnRNPK. Specifically, the first half of the paper seems to suggest the opposite mechanism to the second half of the paper. In Figure 2-4, the authors suggest a model in which TFAP2C deletion has the opposite effect of hnRNPK deletion, thus rescuing toxicity. However, in Figure 5-6, it is suggested that TFAP2C overexpression has the opposite effect of hnRNPK deletion. These two opposite effects of TFAP2C make it difficult to understand the model that the authors are proposing. Please also see below comment 2 for Figure 5.

      Response: We respectfully disagree with the notion that the first and second halves of the manuscript propose contradictory mechanisms.

      In Fig. 2-4, we describe the phenotypic rescue of cell viability upon TFAP2C deletion in hnRNPK-deficient cells. At this stage, we are not proposing a specific molecular mechanism but simply observing a rescue of viability and highlighting underlying transcriptional differences. There is no implication of an opposite molecular mechanism involving the individual activities of hnRNPK and TFAP2C; rather, we focused on the broader effect of TFAP2C deletion on the viability of HNRNPK-lacking cells. In Fig. 5, we isolated a partial mechanism underlying this interaction. We state that: "These data specify a role for TFAP2C in promoting mTORC1-mediated cell anabolism and suggest that its overexpression might hypersensitize cells to HNRNPK ablation by depleting the already limited ATP available, thus making its deletion advantageous". In the discussion, we have now further refined our explanation: "HNRNPK deletion might cause a metabolic impairment leading to a nutritional crisis and a catabolic shift, whereas TFAP2C activation could promote mTORC1 anabolic functions. Thus, Tfap2c removal may rewire the bioenergetic needs of cells by modulating mTORC1 signaling and augmenting their resilience to metabolic stress like the one induced by HNRNPK ablation". Therefore, we propose that TFAP2C expression might be particularly detrimental in hnRNPK-deficient cells, as it could push the cell into an anabolic biosynthetic state, further depleting energy stores that the cell is attempting to conserve in response to hnRNPK depletion. Removal of TFAP2C alleviates this metabolic strain. In our view, there is no contradiction between our observations.

      We hope this explanation clarifies our rationale and resolves any perceived inconsistency in our model. To further enhance the understanding of our interpretations, we have now also added (in substitution of Fig. 5E of the original manuscript) a graphical scheme (Fig. 5G of the revised manuscript) to visually explain and illustrate our model (attached below for your convenience).

      3) Similar to the point above, the first half of the paper focuses on hnRNPK deletion-induced toxicity (Fig. 1-5), while the second half of the paper focuses on hnRNPK deletion-induced PrPSC level (Fig. 6-7). The mechanistic link between these two downstream effects of hnRNPK deletion is not clear and thus, it is difficult to understand the reason that hnRNPK deletion-induced toxicity can be rescued by TFAP2C deletion, while hnRNPK deletion-induced PrPSC level increase can be rescued by TFAP2C overexpression.

      Response: Our study is not aimed at comparing viability and prion propagation as interconnected phenotypes but rather at identifying molecular processes regulated by the HNRNPK-TFAP2C interaction. Our study identifies mTORC1 activity as a molecular process at the interface of this interaction. HNRNPK knockout (or knockdown, which does not affect viability and is therefore used in the prion section of the manuscript) dampens mTORC1 activity, while TFAP2C overexpression enhances it. This finding suggested an explanation for the viability interaction we observed (see our reply to reviewer #3, Point 2 - Major comments) and provided a partial mechanism (mTORC1 activity) to explain the effect of HNRNPK knockdown and TFAP2C overexpression on prions.

      We hope this clarification addresses the reviewer's concern.

      Abstract:

      1) Please rephrase and clarify "We linked HNRNPK and TFAP2C interaction to mTOR signaling..." by distinguishing functional, genetic, and direct (molecule-to-molecule) interactions.

      Response: 1) We have now clarified it in the text of the revised manuscript as follows:

      "We linked HNRNPK and TFAP2C functional and genetic interaction to mTOR signaling, observing that HNRNPK ablation inhibited mTORC1 activity through downregulation of mTOR and Rptor, while TFAP2C overexpression enhanced mTORC1 downstream functions."

      2) A sentence reads, "...HNRNPK ablation inhibited mTORC1 activity through downregulation of mTOR and Rptor," although the downregulation of Rptor is observed only at the RNA level. The change in Rptor protein expression level is not reported in the manuscript. Please consider adding an experiment to address this or rephrase the sentence.

      Response: 2) We have now added this experiment in Supp. Fig. 9A of the revised manuscript. The blot shows that hnRNP K depletion reduces both mTOR and Rptor protein levels. We have rephrased the sentence accordingly: "hnRNP K depletion inhibited mTORC1 activity through downregulation of mTOR and Rptor".

      Figure 2:

      1. H and I. Co-IP experiments were done using anti-TFAP2C antibody to the bead. Although the TFAP2C bands show robust signals on the blots, indicating successful enrichment of the protein, hnRNP K bands are very faint. Has the experiment been done by conjugating the hnRNP K antibody to the beads instead? Was the input lysate enriched in the nuclear fraction? Did the lysis buffer include nuclease (if so, please indicate in the figure legend and the methods section)? Addressing these would make the argument, "We also observed specific co-immunoprecipitation of hnRNP K and Tfap2c in LN-229 C3 and U251-Cas9 cells (Fig. 2H-I, Supp. Fig. 3L), suggesting that the two proteins form a complex inside the nucleus" stronger, providing information on potential direct binding.

      Response: 1. We refer the reviewer to our responses to reviewers #1 and #2 regarding the weak interaction, the nuclease treatment, and the HNRNPK IP (reviewer #1, Point 1, and reviewer #2, Point 11 - Major comments). As for the co-IP input, it was not enriched in the nuclear fraction; however, as shown in Supp. Fig. 4A-B, hnRNPK and Tfap2c are exclusively nuclear.

      Figure 3:

      1. C and D. Please add a sentence in the figure legend explaining which means the multiple comparisons were made between (DMSO vs each drug concentration?). Graphing individual data points instead of bars would also be helpful and more informative. Please discuss the lack of dose dependency.

      Response: 1. We have now added information about the comparison in the figure legend ("Multiple comparison was made between Z-VAD-FMK and DMSO treatments in ΔHNRNPK cells."), modified the graph to show the individual data points (attached below for your convenience), and expanded the discussion as detailed for reviewer #2 (Point 12 - Major comments). For completeness, we have also modified Supp. Fig. 5F to show individual data points and combined the graphs, as the DMSO control was shared across treatments.

      Supplemental Figure 4 (Now shifted in Supplemental Figure 5):

      1. A. Although the trend can be observed, the deletion of hnRNP K does not significantly reduce the GPX4 protein level in LN-229 C3. Therefore, the following statement requires more data points and additional statistical analysis to be accurate: "In LN-229 C3 and U251-Cas9 cells, the deletion of HNRNPK reduced the protein level of GPX4, whereas TFAP2C deletion increased it (Supp. Fig. 4A-B)."

      2. A and B. The results are confusing, considering the previous report cited (ref 49) shows an increase in GPX4 with TFAP2C. It may be possible that the deletion of TFAP2C upregulates the expression of proteins with similar functions (e.g., Sp1). If this is the case, the changes in GPX4 expression observed here are a consequence of TFAP2C deletion and may not "suggest a role for HNRNPK and TFAP2C in balancing the protein levels of GPX4."

      Response: 1. We agree with the reviewer that in LN-229 C3 cells the reduction of GPX4 protein levels upon HNRNPK deletion did not reach statistical significance in our initial Western blot analysis. To address this concern, we performed six additional independent experiments and repeated the statistical analysis. Although the trend toward reduced GPX4 protein levels remained consistent, statistical significance was still not achieved (p > 0.05). Importantly, this trend is supported by our RNA-seq dataset (Supplementary Table 5), which shows decreased GPX4 expression upon HNRNPK deletion. We have now revised the text to more accurately reflect the experimental observations and to avoid overstating the effect in LN-229 C3 cells as follows:

      "In LN-229 C3 and U251-Cas9 cells, deletion of HNRNPK was associated with reduced glutathione peroxidase 4 (GPX4) protein abundance (although not statistically significant in LN-229 C3; p ≈ 0.08), whereas deletion of TFAP2C increased it (Supp. Fig. 5A-B)."

      The six new experimental replicates have been added to the uncropped western blot section.

      Response: 2. Concerning the potential role of TFAP2C deletion in upregulating proteins with similar functions, we recognize the reviewer's perspective. However, our primary focus is on the observed trends rather than a definitive mechanistic conclusion. We clarified our wording to acknowledge this possibility while maintaining the relevance of our findings within the broader context of hnRNPK and TFAP2C interactions.

      "This last result was interesting as a previous study reported that Tfap2c enhances GPX4 expression (51). Thus, the observed increase upon TFAP2C deletion suggests additional layers of regulation, potentially involving compensatory mechanisms."

      Supplemental Figure 5 (Now shifted in Supplemental Figure 6):

      1. B. To obtain statistical significance and strengthen the conclusion, more repeated Western blot experiments can be done to quantify the pAMPK/AMPK ratio.

      Response: We included three more experiments as detailed in our response to reviewer #1 (Point 2 - Major comments) and reviewer #2 (Point 14 - Major comments).

      Figure 5:

      1. B. I believe statistical analysis with two replicates or less is not recommended. Although the assay is robust, and the blot is convincing, please consider adding more replicates if the blot is to be quantified and statistically analyzed.

      2. "Interestingly, RNA and protein levels of mTOR were downregulated in LN-229 C3ΔHNRNPK cells but were partially rebalanced by the ΔTFAP2C;ΔHNRNPK double deletion (Fig. 4C, Supp. Fig. 5E)." The statement is based on a slight difference at the protein level between the single deletion and the double deletion, as well as the observation from the bulk RNA-seq data. mTOR (and Rptor) mRNA levels can be assessed by RT-qPCR to validate and further support the existing data. It is also curious that deletion of TFAP2C alone induced a decrease in mTOR, but the double deletion slightly rescued the mTOR level compared to deletion of HNRNPK alone.

      3. C. The main text refers to the changes in the level of phosphorylated 4EBP1, stating, "Deletion of HNRNPK diminished the highly phosphorylated forms of 4EBP1, which instead were preserved in both LN-229 C3ΔTFAP2C and LN-229 C3ΔTFAP2C;ΔHNRNPK cells (Fig. 5C)." However, the quantification was done on total 4EBP1, which may be because separating p4EBP1 and 4EBP1 bands on a blot is challenging. Please consider using a phospho-4EBP1-specific antibody or rephrase the sentence mentioned above. The current data suggest the single- and double-deletion of hnRNP K/TFAP2C affect the overall stability of 4EBP1, which may be a correlation and not due to the mTOR activity as claimed in "We conclude that HNRNPK and TFAP2C play an essential role in co-regulating cell metabolism homeostasis by influencing mTOR and AMPK activity and expression." How does the cap-dependent translation (or total protein level) change in TFAP2C-deleted and -overexpressing cells?

__Response:__ 1. We added two additional experiments as detailed in our response to reviewer #1 (Point 3 - Major comments).

__Response:__ 2. Deletion of TFAP2C does not decrease mTOR levels, as shown by the quantification in Fig. 5D. To further support our results, we have now included RT-qPCR data in Fig. 5C as suggested by the reviewer. Data are also attached here for your convenience.

__Response:__ 3. Regarding the assessment of phosphorylated 4EBP1, we think we achieved a clear separation of the differently phosphorylated forms of 4EBP1 in our blots, and we have now added the quantification for High p4EBP1/4EBP1 in Fig. 5E (see also our response to reviewer #2, Point 16 - Major comments). The quantification of total 4EBP1 represents an additional dataset, and we do not claim that 4EBP1 stability is affected by HNRNPK and TFAP2C directly through mTOR, which could be, in fact, correlative. We claim that HNRNPK and TFAP2C modulate mTORC1 and AMPK metabolic signaling, as shown by the changed phosphorylation of 4EBP1, S6, AMPK, and ULK1 (Fig. 5C-E, Supp. Fig. 6B, D) and by the regulation of autophagy (Fig. 5B, Supp. Fig. 6C); we did not directly check cap-dependent translation.

      We have now rephrased our text to ensure clarity as follows:

      "We conclude that HNRNPK and TFAP2C play a role in co-regulating mTORC1 and AMPK expression, signaling, and activity."

      Figure 6:

      1. A. Did the sihnRNP K increase the TFAP2C level?

      2. A and C. Are the total PrP levels lower in TFAP2C overexpressing cells compared to mCherry cells when they are infected?

      3. D. Do the TFAP2C protein levels differ between 2-day+72-h and 7-day+96-h?

__Response:__ 1. Yes, it does. We have now provided the quantification in Fig. 6A, C, and Supp. Fig. 8A (also attached below for your convenience).

__Response:__ 2. We have now provided the quantification in Fig. 6A and Supp. Fig. 8A. Total PrP does not change in TFAP2C-overexpressing cells. Total PrP consists of both PK-resistant PrP (PrPSc) and PK-sensitive PrP (PrPC plus potential other intermediate species), with PrPSc typically present at much lower levels. In our model, PrPC is exogenously expressed at high levels via a vector and remains constant across conditions (Fig. 6C and Supp. Fig. 8C). As a result, any changes in PrPSc may not necessarily be reflected in total PrP levels.

__Response:__ 3. No, there is no statistically significant change. We have now added a representative western blot and the quantification of 3 independent replicates in Supp. Fig. 8D. The other two western blots are shown only in the uncropped western blots section. This dataset is also attached here for your convenience.

      Figure 7:

1. I agree with the latter half of the statement: "These findings suggest that HNRNPK influences prion propagation at least in part through mTORC1 signaling, although additional mechanisms may be involved." The first half requires careful rephrasing since (A) independent of the background siRNA treatment, TFAP2C overexpression by itself can modulate the PrPSc level, as seen in Fig 6A and B; (B) although the increase in TFAP2C level is observed with the hnRNP K deletion (Fig 1; LN-229 C3), sihnRNP K treatment may or may not influence the TFAP2C level (Fig 6; quantified data not provided); and (C) in the sihnRNP K-treated cells, the 4EBP1 level is increased compared to the siNT-treated cells, which was not observed in hnRNP K-deleted cells. Discussions and additional experiments (e.g., mTOR knockdown) addressing these points would be helpful.

__Response:__ A, B) We respectfully disagree with the possibility that HNRNPK downregulation may increase prion propagation via TFAP2C upregulation. As shown in Fig. 6A-B, D and in Supp. Fig. 8A, TFAP2C overexpression reduces, rather than increases, prion levels. Therefore, it would be inconsistent to suggest that HNRNPK siRNA promotes prion propagation through TFAP2C upregulation (quantification is now provided, see reviewer #3 - Figure 6 - Point 1). C) Concerning 4EBP1 levels, we have quantified the total 4EBP1 (also attached below) and expanded the discussion on potential discrepancies between HNRNPK knockout and knockdown, as the former affects cell viability, while the latter does not. However, as also explained in the previous reply to reviewer #3 - Figure 5 - Point 3, our focus is on the highly phosphorylated band of 4EBP1 (High p4EBP1), which is the direct target of mTORC1 activity. In both the hnRNPK knockout LN-229 C3 (Fig. 5E) and knockdown HovL models (Fig. 7B), phosphorylation of 4EBP1, along with phosphorylation of S6, is clearly reduced (we have now included quantification for Fig. 7B), reinforcing our conclusion that mTORC1 activity is affected by hnRNPK depletion. As the reviewer noted, we do not claim that mTORC1 is the sole mediator of hnRNPK's effect on prion regulation. However, we think that our interpretation of a potential and partial role of mTORC1 inhibition in the effect of HNRNPK downregulation on prion propagation is in line with the data presented in Fig. 6-7 and Supp. Fig. 8-9. For further clarification, we expanded the text according to the new experiments and analysis, and we added mTOR and Raptor siRNA knockdown (Supp. Fig. 9C) to further support our conclusions (also attached below for your convenience).

      Minor comments:

      1. Please clarify "independent cultures." Does this mean technical replicates on the same cell culture plate but different wells or replicated experiments on different days?

__Response:__ We have now clarified this in each figure legend. "Individually treated wells" means different parental cultures grown and treated separately on the same day; n represents independent experiments performed on different days.

      1. Fig 2G. Please explain how the sigmoidal curves were fitted to the data points under the materials and methods section.

      2. Fig 3E and F. Please refer to the comment on Fig 2G above.

__Response:__ We have now added the explanation in Materials and Methods as follows:

      "Curve Fitting

      For sigmoidal curve fitting, we used GraphPad Prism (version X, GraphPad Software). Data in Figure 2G were fitted using nonlinear regression with a least squares regression model. For Figures 3E and 3F, data fitting was performed using an asymmetric sigmoidal model with five parameters (5PL) and log-transformed X-values (log[concentration])."
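As an illustration of what such an asymmetric five-parameter logistic (5PL) fit involves, here is a minimal Python sketch using SciPy rather than GraphPad Prism. The dose-response data, starting values, and bounds are entirely hypothetical; Prism's version fits against log-transformed X, while the equivalent form in linear concentration is used here.

```python
import numpy as np
from scipy.optimize import curve_fit

def five_pl(x, a, d, c, b, g):
    """Asymmetric five-parameter logistic (5PL).
    a: lower asymptote, d: upper asymptote, c: inflection scale (EC50-like),
    b: slope, g: asymmetry factor (g = 1 recovers the symmetric 4PL)."""
    return d + (a - d) / (1.0 + (x / c) ** b) ** g

# Hypothetical dose-response data (x in concentration units)
x = np.array([0.01, 0.03, 0.1, 0.3, 1.0, 3.0, 10.0])
y = np.array([0.02, 0.05, 0.18, 0.45, 0.78, 0.95, 0.99])

# Bounds keep c positive so (x / c) ** b stays well defined during fitting
popt, _ = curve_fit(
    five_pl, x, y,
    p0=[0.0, 1.0, 0.3, 1.5, 1.0],
    bounds=([-0.5, 0.5, 1e-3, 0.1, 0.1], [0.5, 1.5, 10.0, 10.0, 10.0]),
)
a, d, c, b, g = popt
print(f"upper asymptote ~ {d:.3g}, inflection ~ {c:.3g}, asymmetry ~ {g:.3g}")
```

The fifth parameter g is what makes the curve asymmetric; dropping it (fixing g = 1) gives the ordinary four-parameter logistic that a least-squares fit like the one described for Figure 2G would use.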

3. Fig S3 F/H. Quantification of gel bands would be helpful when comparing protein expression changes after different treatments, as band intensities look different across conditions.

__Response:__ We have now added the quantifications in Supp. Fig. 3D-H (attached below for your convenience). They confirm that there are no significant differences in the means of the normalized values.

      1. Supp Fig 5C and F. These panels can be combined with the corresponding panels in main Figure 5 if space allows so that the readers do not have to flip pages between the main text and Supplemental material.

__Response:__ We have now combined the panels. Previous Supp. Fig. 5C and F are now shown in Fig. 6C and E, respectively.

      Reviewer #3 (Significance (Required)):

This is an interesting paper potentially providing new mechanistic insight into hnRNPK function and its interaction with TFAP2C. It is also important to understand how hnRNPK deletion induces prion propagation and to develop methods to mitigate its spread. However, inconsistencies in TFAP2C expression across cell lines and conflicting mechanistic interpretations complicate the conclusions. I have expertise in RNA-binding proteins, cell biology, and prion disease.

    1. For example, the proper security practice for storing user passwords is to use a special individual encryption process for each individual password. This way the database can only confirm that a password was the right one, but it can’t independently look up what the password is or even tell if two people used the same password. Therefore if someone had access to the database, the only way to figure out the right password is to use “brute force,” that is, keep guessing passwords until they guess the right one (and each guess takes a lot of time).

      This section makes the privacy vs. security distinction feel very concrete: users may accept some privacy trade-offs, but they still expect platforms to protect the data they collect. The examples about password storage and breaches underline that security failures aren’t just “technical accidents”—they’re governance choices (process, incentives, access controls), and it’s why protections like unique password hashing and 2-factor authentication matter in practice.
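The "special individual encryption process" described in the quoted passage is, in practice, salted password hashing with a deliberately slow key-derivation function. A minimal Python sketch (illustrative only; the function names, salt size, and iteration count are my choices, not the book's):

```python
import hashlib
import hmac
import os

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Store a unique random salt plus a slow salted hash - never the password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    return salt, digest

def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    """The database can only confirm a guess, not recover the password."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    return hmac.compare_digest(candidate, digest)

salt1, d1 = hash_password("hunter2")
salt2, d2 = hash_password("hunter2")
print(d1 != d2)                                 # same password, different stored hashes
print(verify_password("hunter2", salt1, d1))    # correct guess confirmed
print(verify_password("guess", salt1, d1))      # wrong guess rejected
```

Because each password gets its own random salt, identical passwords produce different stored digests, which is exactly what prevents the database from looking up a password or telling that two people chose the same one; the high iteration count is what makes each brute-force guess slow.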

    2. While we have our concerns about the privacy of our information, we often share it with social media platforms under the understanding that they will hold that information securely. But social media companies often fail at keeping our information secure.

I find the concept of security and privacy on social media incredibly intriguing, as users are almost always promised that just by having a username and password, your data is completely protected, or at least we are given that assumption. However, it's relatively easy to hack into anyone's account if you have the right knowledge and know where to look. Especially if you have the capability to manage an app or website from the back end, having the ability to go through data within an application can lead to data leaks, and private information you thought no one would have can suddenly be given to the world.

3. But while that is the proper security practice for storing passwords, companies don't always follow it. For example, Facebook stored millions of Instagram passwords in plain text, meaning the passwords weren't encrypted and anyone with access to the database could simply read everyone's passwords. And Adobe encrypted their passwords improperly, and then hackers leaked their password database of 153 million users.

      I have heard multiple cases/instances of this happening but never heard of any follow up about this, at least in the cases of social media. Were people's accounts then hacked? What came from this? I'm aware it's bad to generalize based on your own experiences, but I was only ever hacked once on instagram (my account started posting scam advertisements and changed my profile picture), but I just changed my password and the problem went away. Was it a similar case for different users? Did anything serious come from these cases?

    1. Then said, Leif “It has not come to pass with us in regard to this land as with Biarni, that we have not gone upon it. To this country I will now give a name and call it Helluland [the land of flat rocks].”

It's interesting to see how Leif Erikson named places based on their practical value. Names like the "Land of Flat Rocks" and "Land of Forests" show that they weren't just exploring for fun, but searching for materials that would be useful to them.

    1. “Now, John, I don’t know anything about politics, but I can read my Bible; and there I see that I must feed the hungry, clothe the naked, and comfort the desolate; and that Bible I mean to follow.” “But in cases where your doing so would involve a great public evil—” “Obeying God never brings on public evils. I know it can’t. It’s always safest, all round, to do as He bids us. “Now, listen to me, Mary, and I can state to you a very clear argument, to show—” “O, nonsense, John! you can talk all night, but you wouldn’t do it. I put it to you, John,—would you now turn away a poor, shivering, hungry creature from your door, because he was a runaway? Would you, now?”

      Mrs. Bird doesn’t let John hide behind “politics.” She basically says: it’s easy to defend this law when it’s just an idea, but what would you actually do if a freezing, hungry runaway showed up at your door? John can talk about the “greater good” in theory, but Mary drags it into real life, where it’s suddenly obvious how cruel it is. Stowe is showing that these arguments only work when you don’t have to look the suffering person in the face.

    1. This is such an awesome effort. Some thoughts while reading:

I have been wary of claims-based review given how hard it is to decouple claims from those made to chase editorial outcomes. I'd be very interested if we might prospectively use this type of result extraction to re-derive a supported claim (regardless of whether that was the claim made) and have those be the content we operate on. I could even imagine authors using this type of supported content earlier in their workflow to make their narratives more robust, rather than only post hoc in evaluation.

      That said, a useful distinction made here is to separate claims based on data/citations from claims that are inference or speculation. The latter of course are important for proposing parsimonious models that one can use as a theoretical framework for subsequent hypothesis generation or orthogonal testing. One awesome application of this in a future machine-readable world would be a forward-looking comparison of inference and speculation type claims with results from the rest of the corpus not just within-paper support.

      Not at all a surprise is the increased coverage by LLMs compared to a few human reviewers who indeed likely only have expertise that covers a fraction of any interdisciplinary or collaborative work (the expertise of many authors went into a paper - the expertise of a reviewer is unlikely to span those).

Another thing that strikes me as invaluable here is the implicit and uncited result pairs, which really will be essential for generating knowledge graphs that can be more usefully built upon by new science/other scientists and are far less biased than citation graphs.

      Finally, as someone who spent many years invested in making the flip to the new eLife model (eliminating accept/reject editorial decisions), I was particularly interested in figure 2a comparing the old and new models. Even if unsupported claims had been dramatically higher or even high in an absolute sense, it would highlight the benefit of public reviews for such content rather than the same content being rejected and published as-is with a stamp of editorial acceptance elsewhere. However it's striking that even with many more claims evaluated by OpenEval, the unsupported claims are low whether authors decide when to stop revising their papers or when acceptance is gated by editors.

    1. In answering this letter, please state if there would be any safety for my Milly and Jane, who are now grown up, and both good-looking girls. You know how it was with poor Matilda and Catherine. I would rather stay here and starve—and die, if it come to that—than have my girls brought to shame by the violence and wickedness of their young masters. You will also please state if there has been any schools opened for the colored children in your neighborhood. The great desire of my life now is to give my children an education, and have them form virtuous habits.

This is also the same sort of passive-aggressive tone as before, since he is coupling negotiations from a position of power and basically drawing a hard line that he will not come back unless Colonel Anderson can guarantee his daughters' safety. But he couples it with a sort of innocent question about schooling as well. It's surprisingly educated in its delivery: every time it delivers a devastating statement, it balances it with either an outright compliment or a mundane question to dilute the previous blow. As a whole it seems to establish the idea that, yes, he's willing to return, but he's actually not, as he's demanding many things that the Colonel can't provide. It's basically a soft rejection of his offer, framed not as a personal refusal to work for him but as a logical decision based on ordinary working conditions and terms of employment, such as wages and benefits. Again, it seems surprisingly well put together and educated, which is likely also intentionally done to undermine the Colonel's position.

    1. Well, PKM is not actually my idea. There are other people who are writing about personal knowledge management. I changed it to mastery because I wanted to move away from the knowledge management world, which was too much about big systems and big databases. I wanted to focus on what I, as an individual, do, and what we as a community or a network do. For example, how do we enable that kind of collaboration and cooperation? So, personal knowledge mastery just became the term because it is about mastery, and you never master it completely, right? It’s like any discipline, a lifelong thing. You continuously try to get better.

How Harold replaced the M in PKM with mastery. This comes very close to my [[Kenniswerk is ambacht 20040924200250]] artisanal view on knowledge work. The focus on practice in community, in networks, and on your own. Cf. [[% Practice Praktijk OP]]

    1. re and more​ in his fiction, Matthiessen made a fetish of the Faulknerian device of multiple, conflicting and elliptical points of view. The tendency is tamed in At Play in the Fields of the Lord. For all the hyperbolic reactions to it, Far Tortuga is a fine if difficult novel that teaches you how to read it as the narrative advances through foul weather. Shadow Country in its final form is a heap of jumbled repetitions about the life of Edgar Watson, a real-life sugar planter on the south-west coast of Florida accused of various crimes, including multiple murders, who died at the hands of a mob of his neighbours in 1910. The first volume is told by a rotation of dozens of neighbours and relatives, speaking in various forms of swamp hillbilly dialect. The next book, told in the third person, follows Watson’s descendants, who try to piece together the truth of his legend. The final volume is narrated by Watson himself in a higher register. It’s hard to defy the blurb from Don DeLillo that appears on the cover of the current edition – ‘His writing does every justice to the blood fury of his themes’ – but just as hard to call Watson a hero or villain deserving of this epic treatment.

      Some nicely acid comments.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This study presents a system for delivering precisely controlled cutaneous stimuli to freely moving mice by coupling markerless real-time tracking to transdermal optogenetic stimulation, using the tracking signal to direct a laser via galvanometer mirrors. The principal claims are that the system achieves sub-mm targeting accuracy with a latency of <100 ms. The nature of mouse gait enables accurate targeting of forepaws even when mice are moving.

      Strengths:

      The study is of high quality and the evidence for the claims is convincing. There is increasing focus in neurobiology in studying neural function in freely moving animals, engaged in natural behaviour. However, a substantial challenge is how to deliver controlled stimuli to sense organs under such conditions. The system presented here constitutes notable progress towards such experiments in the somatosensory system and is, in my view, a highly significant development that will be of interest to a broad readership.

      Weaknesses:

      (1) "laser spot size was set to 2.00 } 0.08 mm2 diameter (coefficient of variation = 3.85)" is unclear. Is the 0.08 SD or SEM? (not stated). Also, is this systematic variation across the arena (or something else)? Readers will want to know how much the spot size varies across the arena - ie SD. CV=4 implies that SD~7 mm. ie non-trivial variation in spot size, implying substantial differences in power delivery (and hence stimulus intensity) when the mouse is in different locations. If I misunderstood, perhaps this helps the authors to clarify. Similarly, it would be informative to have mean & SD (or mean & CV) for power and power density. In future refinements of the system, would it be possible/useful to vary laser power according to arena location?

      We thank the reviewer for their comments and for identifying areas needing more clarity. The previous version was ambiguous: 0.08 refers to the standard deviation (SD). We have removed the ambiguity by stating mean ± SD and reporting a unitless coefficient of variation (CV).

      The revised text reads “laser spot size was set to 2.00 ± 0.08 mm<sup>2</sup> (mean ± SD; coefficient of variation = 0.039).” This makes clear that the variability in spot size is minimal: it is 0.08 mm<sup>2</sup> SD (≈0.03 mm SD in diameter). This should help clarify that spot size variability across the arena is minute and unlikely to contribute meaningfully to differences in stimulus intensity across locations. The power was modulated depending on the experiment, so we provide the unitless CV here in “The absolute optical power and power density were uniform across the glass platform (coefficient of variation 0.035 and 0.029, respectively; Figure 2—figure supplement)”. We are grateful to the reviewer for spotting these omissions.

      The reviewer also asks whether, in the future, it is “possible/useful to vary laser power according to arena location”. This is already possible in our system for infrared cutaneous stimulation using analog modulation (Figure 4). We have added the following sentence to make this clearer: “Laser power could be modulated using the analog control.”

      (2) "The video resolution (1920 x 1200) required a processing time higher than the frame interval (33.33 ms), resulting in real-time pose estimation on a sub-sample of all frames recorded". Given this, how was it possible to achieve 84 ms latency? An important issue for closed-loop research will relate to such delays. Therefore please explain in more depth and (in Discussion) comment on how the latency of the current system might be improved/generalised. For example, although the current system works well for paws it would seem to be less suited to body parts such as the snout that do not naturally have a stationary period during the gait cycle.

      We captured and stored video with a frame-to-frame interval of 33.33 ms (30 fps). DeepLabCut-live! was run in a latency-optimization mode, meaning that new frames are not processed while the network is busy - only the most recent frame is processed when free. The processing latency is measured per processed frame, and intermediate frames are thus skipped while the network is busy. Although a wide field of view and high resolution is required to capture the large environment, increasing the per-frame compute time, the processing latency remained small enough to track and stimulate moving mice. This processing latency of 84 ± 12 ms (mean ± SD) was calculated using the timestamps stored in the output files from DeepLabCut-live!: subtracting the frame acquisition timestamp from the frame processing timestamp across 16,000 processed frames recorded across four mice (4,000 each). In addition, there is a small delay to move the galvanometers and trigger the laser, calculated as 3.3 ± 0.5 ms (mean ± SD; 245 trials). This is described in the manuscript, but can be combined with the processing latency to indicate a total closed-loop delay of ≈87 ms so we have expanded on the ‘Optical system characterization’ subsection in the Methods, adding “We estimated a processing latency of 84 ± 12 ms (mean ± SD) by subtracting…” and that “In the current configuration the end-to-end closed-loop delay is ≈87 ms from the combination of the processing latency and other delays”. To the Discussion, we now comment on how this latency can be reduced and how this can allow for generalization to more rapidly moving body parts.
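The latency calculation the authors describe (subtracting frame acquisition timestamps from processing timestamps, then summarizing as mean ± SD and adding the galvanometer delay) can be sketched as follows; the timestamps are synthetic and do not reflect the actual DeepLabCut-live! output format:

```python
import numpy as np

# Hypothetical per-frame timestamps (seconds): acquisition time vs. completion
# of pose estimation, mimicking the values stored by a closed-loop tracker
rng = np.random.default_rng(0)
t_acquire = np.cumsum(rng.uniform(0.0333, 0.0334, size=1000))  # ~30 fps capture
t_processed = t_acquire + rng.normal(0.084, 0.012, size=1000)  # ~84 +/- 12 ms compute

latency_ms = (t_processed - t_acquire) * 1000.0
galvo_ms = 3.3  # additional galvanometer + laser trigger delay reported in the text

print(f"processing latency: {latency_ms.mean():.1f} +/- {latency_ms.std(ddof=1):.1f} ms")
print(f"end-to-end closed-loop delay: ~{latency_ms.mean() + galvo_ms:.0f} ms")
```

Note that in a latency-optimized pipeline these per-frame latencies are computed only over the frames that were actually processed; frames skipped while the network is busy contribute nothing to the average, which is why an 84 ms compute time is compatible with 33 ms capture intervals.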

      Reviewer #2 (Public review):

      Parkes et al. combined real-time keypoint tracking with transdermal activation of sensory neurons to examine the effects of recruitment of sensory neurons in freely moving mice. This builds on the authors' previous investigations involving transdermal stimulation of sensory neurons in stationary mice. They illustrate multiple scenarios in which their engineering improvements enable more sophisticated behavioral assessments, including (1) stimulation of animals in multiple states in large arenas, (2) multi-animal nociceptive behavior screening through thermal and optogenetic activation, and (3) stimulation of animals running through maze corridors. Overall, the experiments and the methodology, in particular, are written clearly. However, there are multiple concerns and opportunities to fully describe their newfound capabilities that, if addressed, would make it more likely for the community to adopt this methodology:

      The characterization of laser spot size and power density is reported as a coefficient of variation, in which a value of ~3 is interpreted as uniform. My interpretation would differ - data spread so that the standard deviation is three times larger than the mean indicates there is substantial variability in the data. The 2D polynomial fit is shown in Figure 2 - Figure Supplement 1A and, if the fit is good, this does support the uniformity claim (range of spot size is 1.97 to 2.08 mm2 and range of power densities is 66.60 to 73.80 mW). The inclusion of the raw data for these measurements and an estimate of the goodness of fit to the polynomials would better help the reader evaluate whether these parameters are uniform across space and how stable the power density is across repeated stimulations of the same location. Even more helpful would be an estimate of whether the variation in the power density is expected to meaningfully affect the responses of ChR2-expressing sensory neurons.

      We thank the reviewer for their comments. As also noted in response to Reviewer 1, the coefficient of variation (CV) is now reported in unitless form (rather than a percentage) to ensure clarity. For avoidance of doubt, the CV is 0.039 (3.9%), so the variation in laser spot size is minimal – there is negligible spot size variability across the system. The ranges are indeed consistent with uniformity. We have included the goodness-of-fit estimates in the appropriate figure legend “fit with a two-dimensional polynomial (area R<sup>2</sup> = 0.91; power R<sup>2</sup> = 0.75)”. This indicates that the polynomials fit well overall.

      The system already allows for control of spot size. To examine whether the variation in the power density affects the responses of ChR2-expressing sensory neurons, we examined this in our previous work that focused more on input-output relationships, demonstrating a steep relationship between spot size (range of 0.02 mm<sup>2</sup> to 2.30 mm<sup>2</sup>) and the probability of paw response, demonstrating a meaningful change in response probability (Schorscher-Petcu et al. eLife, 2021). In future studies, we aim to use this approach to “titrate” cutaneous inputs as mice move through their environments.

      While the error between the keypoint and laser spot error was reported as ~0.7 to 0.8 mm MAE in Figure 2L, in the methods, the authors report that there is an additional error between predicted keypoints and ground-truth labeling of 1.36 mm MAE during real-time tracking. This suggests that the overall error is not submillimeter, as claimed by the authors, but rather on the order of 1.5 - 2.5 mm, which is considerable given the width of a hind paw is ~5-6 mm and fore paws are even smaller. In my opinion, the claim for submillimeter precision should be softened and the authors should consider that the area of the paw stimulated may differ from trial to trial if, for example, the error is substantial enough that the spot overlaps with the edge of the paw.

We thank the reviewer for identifying a discrepancy in these reported errors. We clarify this below and in the manuscript.

      The real-time tracking error is the mean absolute Euclidean distance (MAE) between ground truth and DLC on the left hind paw where likelihood was relatively high. More specifically, ground truth was obtained by manual annotation of the left hind paw center. The corresponding DLC keypoint was evaluated in frames with likelihood >0.8 (the stimulation threshold). Across 1,281 frames from five videos of freely exploring mice (30 fps), the MAE was 1.36 mm.

The targeting error is the MAE between ground truth and the laser spot location, so it should reflect the real-time tracking error plus errors introduced by targeting the laser. More specifically, this metric was determined by comparing the manually determined ground-truth keypoint of the left hind paw and the actual center of the laser spot. Importantly, this metric was calculated using four five-minute high-speed videos recorded at 270 fps of mice freely exploring the open arena (463 frames), and frames were selected with a likelihood threshold >0.8. This allowed us to resolve the brief laser pulses but inadvertently introduced a difference in spatial scaling. After rescaling, the values give a targeting-error MAE now in line with the real-time tracking error (see corrected Figure 2L). This is approximately 1.3 mm across all locomotion speed categories. These errors are small and are limited by the spatial resolution of the cameras. We thank the reviewer for noting this discrepancy and prompting us to get to its root cause.

      We have amended the subtitle on Figure 2L as “Ground truth keypoint to laser spot error” and have avoided the use of submillimeter throughout. We have added the following sentence to clarify this point: “As laser targeting relies on real-time tracking to direct the laser to the specified body part, this metric includes any errors introduced by tracking and targeting”.
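For readers unfamiliar with the metric, the targeting error described here (mean absolute Euclidean distance between ground-truth keypoints and laser spot centers, converted to millimetres with a single calibration factor) can be sketched as below; the coordinates and scale factor are hypothetical:

```python
import numpy as np

def mae_mm(ground_truth_px, measured_px, mm_per_px):
    """Mean absolute Euclidean distance between paired 2D points,
    converted from pixels to millimetres with one shared scale factor."""
    d_px = np.linalg.norm(ground_truth_px - measured_px, axis=1)
    return (d_px * mm_per_px).mean()

# Hypothetical ground-truth keypoints vs. laser-spot centers (pixels)
gt = np.array([[100.0, 200.0], [150.0, 240.0], [310.0, 95.0]])
spot = np.array([[103.0, 202.0], [148.0, 244.0], [314.0, 92.0]])

print(f"targeting error: {mae_mm(gt, spot, mm_per_px=0.30):.2f} mm MAE")
```

The correction the authors describe amounts to ensuring both point sets use the same mm-per-pixel calibration; as they note, the high-speed recordings had a different spatial scale, and mixing scale factors systematically distorts the reported MAE.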

      As the major advance of this paper is the ability to stimulate animals during ongoing movement, it seems that the Figure 3 experiment misses an opportunity to evaluate state-dependent whole-body reactions to nociceptor activation. How does the behavioral response relate to the animal's activity just prior to stimulation?

The reviewer suggests analysis of state-dependent responses. In the Figure 3 experiment, mice were stimulated up to five times when stationary. Analysis of whole-body reactions in stationary mice has been described in Schorscher-Petcu et al. (eLife, 2021), and repeating it here would be redundant, so instead we now analyse the responses of moving mice in Figure 5. This new analysis shows robust state-dependent responses during movement, as suggested by the reviewer. We find two behavioral clusters: one for faster, direct (coherent) movement and the other for slower assessment (incoherent) movement. Stimulation during the former results in robust and consistent slowing and a shift towards assessment, whereas stimulation during the latter results in a reduction in assessment. We describe and interpret these new data in the Results and Discussion sections and add information in the Methods and figure legend, as given below. We believe that demonstrating movement state-dependence is a valuable addition to the paper and thank the reviewer for suggesting this.

      Given the characterization of full-body responses to activation of TrpV1 sensory neurons in Figure 4 and in the authors' previous work, stimulation of TrpV1 sensory neurons has surprisingly subtle effects as the mice run through the alternating T maze. The authors indicate that the mice are moving quickly and thus that precise targeting is required, but no evidence is shared about the precision of targeting in this context beyond images of four trials. From the characterization in Figure 2, at max speed (reported at 241 +/- 53 mm/s, which is faster than the high speeds in Figure 2), successful targeting occurs less than 50% of the time. Is the initial characterization consistent with the accuracy in this context? To what extent does inaccuracy in targeting contribute to the subtlety of affecting trajectory coherence and speed? Is there a relationship between animal speed and disruption of the trajectory?

      We thank the reviewer for pointing out the discrepancy in the reported maximum speed. We have corrected the error in the main text: the average maximum speed is 142 ± 26 mm/s (four mice).

The self-paced T-maze alternation task in Figure 5 demonstrates that mice running in a maze can be stimulated using this method. We did not optimize this particular experimental design to assess hit accuracy, as this was determined in Figure 2. Instead, we optimized for pulse frequency, meaning the galvanometers tracked each processed frame but the laser was triggered whether or not the paw was actually targeted. However, even with the system pulsing in this free-run mode, the laser hit rate was 54 ± 6% (mean ± sem, n = 7 mice). We have weakened references to submillimeter precision, as it was only inferred from other experiments and was not directly measured here. We find in this experiment that stimulation in freely moving mice can cause them to briefly halt and evaluate. In the future, we will use experimental designs to more optimally examine learning.

      The reviewer also asks if there is a relationship between speed and disruption of the trajectory. We find that this is the case as described above with our additional analysis.

      Reviewer #3 (Public review):

      Summary:

      To explore the diverse nature of somatosensation, Parkes et al. established and characterized a system for precise cutaneous stimulation of mice as they walk and run in naturalistic settings. This paper provides a framework for real-time body part tracking and targeted optical stimuli with high precision, ensuring reliable and consistent cutaneous stimulation. It can be adapted in somatosensation labs as a general technique to explore somatosensory stimulation and its impact on behavior, enabling rigorous investigation of behaviors that were previously difficult or impossible to study.

      Strengths:

      The authors characterized the closed-loop system to ensure that it is optically precise and can precisely target moving mice. The integration of accurate and consistent optogenetic stimulation of the cutaneous afferents allows systematic investigation of somatosensory subtypes during a variety of naturalistic behaviors. Although this study focused on nociceptors innervating the skin (Trpv1::ChR2 animals), this setup can be extended to other cutaneous sensory neuron subtypes, such as low-threshold mechanoreceptors and pruriceptors. This system can also be adapted for studying more complex behaviors, such as the maze assay and goal-directed movements.

      Weaknesses:

      Although the paper has strengths, its weakness is that some behavioral outputs could be analyzed in more detail to reveal different types of responses to painful cutaneous stimuli. For example, paw withdrawals were detected after optogenetically stimulating the paw (Figures 3E and 3F). Animals exhibit different types of responses to painful stimuli on the hind paw in standard pain assays, such as paw lifting, biting, and flicking, each indicating a different level of pain. Improving the behavioral readouts from body part tracking would greatly strengthen this system by providing deeper insights into the role of somatosensation in naturalistic behaviors. Additionally, if the laser spot size could be reduced to a diameter of 2 mm², it would allow the activation of a smaller number of cutaneous afferents, or even a single one, across different skin types in the paw, such as glabrous or hairy skin.

We thank the reviewer for highlighting how our system can be combined with improved readouts of coping behavior to provide deeper insights. Optogenetic and infrared cutaneous stimulation are well-established generators of coping behaviors (lifting, flicking, licking, biting, guarding). Detection of these behaviors is an active and evolving field with progress being made regularly (e.g. Jones et al., eLife 2020 [PAWS]; Wotton et al., Mol Pain 2020; Zhang et al., Pain 2022; Oswell et al., bioRxiv 2024 [LUPE]; Barkai et al., Cell Reports Methods 2025 [BAREfoot]), along with more general tools (Hsu et al., Nature Communications 2021 [B-SOiD]; Luxem et al., Communications Biology 2022 [VAME]; Weinreb et al., Nature Methods 2024 [Keypoints-MoSeq]). One output of our system is body-part keypoints, which are the typical input to many of these tools. We leave it to readers and users of the system to decide which tools are appropriate for their experimental designs - the focus of the current manuscript is describing the novel stimulation approach in moving animals.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) It is hard to see how the rig is arranged from the render of Figure 2AB due to the components being black on black. A particularly useful part of Fig2AB is the aerial view in panel B that shows the light paths. I suggest adding the labelling of Figure 2A also to that. The side/rear views could perhaps be deleted, allowing the aerial view to be larger.

      We appreciate this suggestion and have revised Figure 2B to improve the visibility of the optomechanical components. We have enlarged the side and aerial views, removed the rear view, and added further labels to the aerial view.

      (2) MAE - to interpret the 0.54 result, it would be useful to state the arena size in this paragraph.

      Thank you. We have added the arena size in this paragraph and also added scales in the relevant figure (Figure 2).

      (3) "pairwise correlations of R = 0.999 along both x- and y-axes". Is this correlation between hindpaw keypoint and galvo coordinates?

      Yes, we have added the following to clarify: “...between galvanometer coordinates and hind paw keypoints”

      (4) Latency was 84 ms. Is this mainly/entirely the delay between DLC receiving the camera image and outputting key point coordinates?

      Yes, we hope that the additional detail in the Methods and Discussion described above will now clarify the current closed-loop latencies.

      (5) "Mice move at variable speeds": in this sentence, spell out when "speed" refers to mouse and when it refers to hindpaw. Similarly, Fig 2i. The sentence is potentially confusing to general readers (paws stationary although the mouse is moving). Presumably, it's due to gait. I suggest explaining this here.

      The speed values that relate to the mouse body and paws are now clearer in the main text and in the legend for Figure 2I.

      (6) Figure 2k and associated main text. It is not clear what "success/hit rate" means here.

      We have added the following sentence in the main text: “Hit accuracy refers to the percentage of trials in which the laser successfully targeted (‘hit’) the intended hind paw.” and use hit accuracy throughout instead of success rate.

      (7) Figure 2L. All these points are greater than the "average" 0.54 reported in the text. How is this possible?

The MAE of 0.54 mm refers to the “predicted and actual laser spot locations” (that is, the difference between where the calibration map should place the laser spot and where it actually fell), while the MAE values in Figure 2L refer to the error between the ground-truth keypoint and the laser spot (that is, the error between the human-observed paw target and where the laser spot fell). The latter error includes the former, so it is expected to be larger. We have clarified this point throughout the text, for example, stating “As laser targeting relies on real-time tracking to direct the laser to the specified body part, this metric inherently accounts for any errors introduced by the tracking and targeting.”. This is also discussed above in response to Reviewer 2.

      (8) "large circular arena". State the size here

      We have added this to the Figure 2 legend.

      (9) Figure 3c-left. Can the contrast between the mouse and floor be increased here?

      We have improved the contrast in this image.

      (10) Figure 5c. It is unclear what C1, C2, etc refers to. Mice?

      Yes, these refer to mice. We have removed reference to these now as they are not needed.

      (11) Discussion. A comment. There is scope for elaborating on the potential for new research by combining it with new methods for measurements of neural activity in freely moving animals in the somatosensory system.

      Thank you. We agree and have added more detail on this in the discussion stating “The system may be combined with existing tools to record neural activity in freely-moving mice, such as fiber photometry, miniscopes, or large-scale electrophysiology, and manipulations of this neural activity, such as optogenetics and chemogenetics. This can allow mechanistic dissection of cell and circuit biology in the context of naturalistic behaviors.”

      Reviewer #3 (Recommendations for the authors):

      (1) Include the number of animals for behavior assays for the panels (e.g., Figures 4G).

      Where missing, we now state the number of animals in panels.

      (2) If representative responses are shown, such as in Figures 3E and 4F, include the average response with standard deviation so readers can appreciate the variation in the responses.

      We appreciate the suggestion to show variability in the responses. We have made several changes to Figures 3 and 4. Specifically, to illustrate the variability across multiple trials more clearly, Figure 3E now shows representative keypoint traces for each body part from two mice during their 5 trials. For Figure 4, we have re-analyzed the thermal stimulation trials and shown a raster plot of keypoint-based local motion energy (Figure 4E) sorted by response latency for hundreds of trials. Figure 4G now presents the cumulative distribution for all trials and animals for thermal (18 wild-type mice, 315 trials) and optogenetic stimulation trials (9 Trpv1::ChR2 mice, 181 trials). We also now provide means ± SD for the key metrics for optogenetic and thermal stimulation trials in Figure 4 in the Results section. This keeps the manuscript focused on the methodological advances while showing the trial variability.

      (3) "optical targeting of freely-moving mice in a large environments" should be "optical targeting of freely-moving mice in a large environment".

      Corrected

      (4) Define fps when you first mention this in the manuscript.

      Added

      (5) Data needs to be shown for the claim "Mice concurrently turned their heads toward the stimulus location while repositioning their bodies away from it".

We state this observation to qualify that the stimulation of stationary mice resulted in behavioral responses “consistent with previous studies”. It would be redundant to repeat our full analysis and might distract from the novelty of the current manuscript. We have restricted this sentence to make it clearer: “Consistent with previous studies, we observed whole-body behaviors such as head orienting concurrent with local withdrawal (Browne et al., Cell Reports 2017; Blivis et al., eLife 2017).”

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary:

      Damaris et al. perform what is effectively an eQTL analysis on microbial pangenomes of E. coli and P. aeruginosa. Specifically, they leverage a large dataset of paired DNA/RNA-seq information for hundreds of strains of these microbes to establish correlations between genetic variants and changes in gene expression. Ultimately, their claim is that this approach identifies non-coding variants that affect expression of genes in a predictable manner and explain differences in phenotypes. They attempt to reinforce these claims through use of a widely regarded promoter calculator to quantify promoter effects, as well as some validation studies in living cells. Lastly, they show that these non-coding variations can explain some cases of antibiotic resistance in these microbes.

      Major comments

      Are the claims and the conclusions supported by the data or do they require additional experiments or analyses to support them?

      The authors convincingly demonstrate that they can identify non-coding variation in pangenomes of bacteria and associate these with phenotypes of interest. What is unclear is the extent by which they account for covariation of genetic variation? Are the SNPs they implicate truly responsible for the changes in expression they observe? Or are they merely genetically linked to the true causal variants. This has been solved by other GWAS studies but isn't discussed as far as I can tell here.

We thank the reviewer for their effective summary of our study. Regarding our ability to identify variants that are causal for gene expression changes versus those that only “tag” the causal ones, we must again apologize for not spelling out this limitation of GWAS approaches, namely the difficulty in separating associated from causal variants. This inherent difficulty is the main reason why we added the in-silico and in-vitro validation experiments; while they each have their own limitations, we argue that together they point towards a causal link between some of our associations and measured gene expression changes. We have amended the Discussion section (e.g. at L548) to spell out our intention better and provide better context for readers who are not familiar with the pitfalls of (bacterial) GWAS.

      They need to justify why they consider the 30bp downstream of the start codon as non-coding. While this region certainly has regulatory impact, it is also definitely coding. To what extent could this confound results and how many significant associations to expression are in this region vs upstream?

      We agree with the reviewer that defining this region as “non-coding” is formally not correct, as it includes the first 10 codons of the focal gene. We have amended the text to change the definition to “cis regulatory region” and avoided using the term “non-coding” throughout the manuscript. Regarding the relevance of this including the early coding region, we have looked at the distribution of associated hits in the cis regulatory regions we have defined; the results are shown in Supplementary Figure 3.

We quantified the distribution of cis-associated variants and compared it to 2,000 permutations restricted to the -200bp to +30bp window in both E. coli (panel A) and P. aeruginosa (panel B). As can be seen, the associated variants we have identified are mostly present in the -200bp upstream region, while the +30bp region shows a mild depletion relative to the random expectation, which we derived through a variant position shuffling approach (2,000 replicates). Therefore, we believe that the inclusion of the early coding region results in an appreciable number of associations, which in our opinion justifies its inclusion in a putative “cis regulatory region”.
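For readers unfamiliar with this kind of null model, the position-shuffling expectation described above can be sketched as follows. This is a minimal illustration only: the -200bp/+30bp window and the 2,000 replicates match the text, while the uniform resampling of positions and the `variant_positions` input are simplifying assumptions, not our exact implementation.

```python
import random

def permuted_fraction_upstream(variant_positions, n_perm=2000, lo=-200, hi=30, seed=0):
    """Compare the observed fraction of variants upstream of the start codon
    (position < 0) against a null built by shuffling positions uniformly
    within the [-200, +30] window (illustrative simplification)."""
    rng = random.Random(seed)
    n = len(variant_positions)
    observed = sum(p < 0 for p in variant_positions) / n
    null = []
    for _ in range(n_perm):
        shuffled = [rng.randint(lo, hi) for _ in range(n)]
        null.append(sum(p < 0 for p in shuffled) / n)
    # one-sided empirical p-value for upstream enrichment
    p = (1 + sum(f >= observed for f in null)) / (1 + n_perm)
    return observed, p
```

A mild depletion of the +30bp region would show up here as an observed upstream fraction above the bulk of the null distribution.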

      The claim that promoter variation correlates with changes in measured gene expression is not convincingly demonstrated (although, yes, very intuitive). Figure 3 is a convoluted way of demonstrating that predicted transcription rates correlate with measured gene expression. For each variant, can you do the basic analysis of just comparing differences in promoter calculator predictions and actual gene expression? I.e. correlation between (promoter activity variant X)-(promoter activity variant Y) vs (measured gene expression variant X)-(measured gene expression variant Y). You'll probably have to

We realize that we may have failed to properly explain how we carried out this analysis, which we did exactly in the way the reviewer suggests here. We had in fact provided four example scatterplots of the kind the reviewer is requesting as part of Figure 4. We have added a mention of their presence to the caption of Figure 3.

      Figure 7 it is unclear what this experiment was. How were they tested? Did you generate the data themselves? Did you do RNA-seq (which is what is described in the methods) or just test and compare known genomic data?

      We apologize for the lack of clarity here; we have amended the figure’s caption and the corresponding section of the results (i.e. L411 and L418) to better highlight how the underlying drug susceptibility data and genomes came from previously published studies.

      Are the data and the methods presented in such a way that they can be reproduced?

No, this is the biggest flaw of the work. The RNA-Seq experiment that initiated this project is not described at all, nor are other key experiments. Descriptions of methods in the text are far too vague to understand the approach or rationale at many points. The scripts are available on GitHub but there is no description of what they correspond to outside of the file names, and none of the data files needed to replicate the plots are provided.

We have taken this critique to heart, and have given more details about the experimental setup for the generation of the RNA-seq data in the Methods as well as the Results sections. We have also thoroughly reviewed every description of the methods we employed to make sure they are more clearly presented to the readers. We have also updated our code repository to provide more information about the purpose of each script; we would, however, like to point out that the code is not intended to be general purpose, but rather to serve as open documentation of how the data was analyzed.

Figure 8B is intended to show that the WaaQ operon is connected to known Abx resistance genes but uses the STRING method. This requires a list of genes, but how did they build this list? Why look at these known Abx genes in particular? STRING does not really show evidence; these claims need to be substantiated, or at least the authors need to justify why this analysis was performed.

      We have amended the Methods section (“Gene interaction analysis”, L799) to better clarify how the network shown in this panel was obtained. In short, we have filtered the STRING database to identify genes connected to members of the waa operon with an interaction score of at least 0.4 (“moderate confidence”), excluding the “text mining” field. Antimicrobial resistance genes were identified according to the CARD database. We believe these changes will help the readers to better understand how we derived this interaction.
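As a rough sketch of this filtering step, assuming a STRING detailed-links table loaded into pandas with STRING's standard per-channel score columns (0-1000 scale). Requiring a single non-textmining channel to reach the threshold is a simplification of STRING's probabilistic score combination, and the `waa_genes` input is a hypothetical placeholder for the operon members:

```python
import pandas as pd

def waa_neighbors(links: pd.DataFrame, waa_genes, min_score=400):
    """Keep moderate-confidence partners of the waa operon, ignoring
    text-mining evidence: a pair qualifies if it involves a waa gene and
    at least one non-textmining channel reaches min_score (0.4 on
    STRING's 0-1 scale). Simplified relative to STRING's own scoring."""
    channels = ["neighborhood", "fusion", "cooccurence", "coexpression",
                "experimental", "database"]
    in_waa = links["protein1"].isin(waa_genes) | links["protein2"].isin(waa_genes)
    supported = (links[channels] >= min_score).any(axis=1)
    return links[in_waa & supported]
```

The surviving partners could then be cross-referenced against a resistance-gene list such as CARD, as described in the Methods.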

      Are the experiments adequately replicated and statistical analysis adequate?

      An important claim on MIC of variants for supplementary table 8 has no raw data and no clear replicates available. Only figure 6, the in vitro testing of variant expression, mentions any replicates.

      We have expanded the relevant section in the Methods (“Antibiotic exposure and RNA extraction”, L778) to provide more information on the way these assays were carried out. In short, we carried out three biological replicates, the average MIC of two replicates in closest agreement was the representative MIC for the strain. We believe that we have followed standard practice in the field of microbiology, but we agree that more details were needed to be provided in order for readers to appreciate this.
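The replicate-selection rule can be written out explicitly. This sketch treats MICs as plain numbers and is only an illustration of the rule stated above, not the assay protocol itself:

```python
from itertools import combinations

def representative_mic(replicates):
    """Return the mean of the two MIC replicates in closest agreement,
    per the rule described above (three or more numeric MICs expected)."""
    pair = min(combinations(replicates, 2), key=lambda p: abs(p[0] - p[1]))
    return sum(pair) / 2
```

For example, with replicates of 2, 4 and 16 µg/mL, the (2, 4) pair agrees most closely and the representative MIC is their mean.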

      Minor comments

      Specific experimental issues that are easily addressable..

      Are prior studies referenced appropriately?

      There should be a discussion of eQTLs in this. Although these have mostly been in eukaryotes a. https://doi.org/10.1038/s41588-024-01769-9 ; https://doi.org/10.1038/nrg3891.

      We have added these two references, which provide a broader context to our study and methodology, in the introduction.

      Line 67. Missing important citation for Ireland et al. 2020 https://doi.org/10.7554/eLife.55308

      Line 69. Should mention Johns et al. 2018 (https://doi.org/10.1038/nmeth.4633) where they study promoter sequences outside of E. coli

      Line 90 - replace 'hypothesis-free' with unbiased

      We have implemented these changes.

      Line 102 - state % of DEGs relative to the entire pan-genome

Given that the study is focused on identifying variants associated with changes in expression for reference genes (i.e. those present in the reference genome), we think that providing this percentage would give the false impression that our analysis includes accessory genes that are not encoded by the reference isolate, which is not what we have done.

      Figure 1A is not discussed in the text

      We have added an explicit mention of the panels in the relevant section of the results.

Line 111: it is unclear what enrichment was being compared between. Figures 1C/D have 'Gene counts' but is this of the total DEGs? How is the p-value derived, and what statistical test was performed? Comparing DEG enrichment vs the pangenome? K12 genome?

      We have amended the results and methods section, as well as Figure 1’s caption to provide more details on how this analysis was carried out.

      Line 122-123: State what letters correspond to these COG categories here

We have implemented the clarifications and edits suggested above.

Line 155: Need to clarify how you use k-mers here and how they differ from SNPs. Are you looking at k-mer content of these regions? K-mers up to hexamers, or what? How are these compared? You can't just say we used k-mers.

We have amended that line in the results section to more explicitly describe the actual encoding of the k-mer variants, which were presence/absence patterns for k-mers extracted from each target gene’s promoter region separately, using our own method, panfeed. We note that more details were already given in the Methods section, but we recognize that it is better to also clarify this in the Results, so that readers skimming the text still get the proper information about this class of genetic variants.

      Line 172: It would be VERY helpful to have a supplementary figure describing these types of variants, perhaps a multiple-sequence alignment containing each example

      We thank the reviewer for this suggestion. We have now added Supplementary Figure 3, which shows the sequence alignments of the cis-regulatory regions underlying each class of the genetic marker for both E. coli and P. aeruginosa.

Figure 4: This figure is too small. Why are WaaQ and UlaE being used as examples here when you are supposed to be explicitly showing variants with strong positive correlations?

      We rearranged the figure’s layout to improve its readability. We agree that the correlation for waaQ and ulaE is weaker than for yfgJ and kgtP, but our intention was to not simply cherry-pick strong examples, but also those for which the link between predicted promoter strength and recorded gene expression was less obvious.

      Figure 4: Why is there variation between variants present and variant absent? Is this due to other changes in the variant? Should mention this in the text somewhere

Variability in the predicted transcription rate for isolates encoding the same variant is due to the presence of other (different) variants in the region surrounding the target variant. PromoterCalculator uses nucleotide regions of variable length (78 to 83bp) to make its predictions, while the variants we focus on are typically shorter (as shown in Figure 4). This results in other variants being included in the calculation and therefore slightly different predicted transcription rates for each strain. We have amended the caption of Figure 4 to provide a succinct explanation of these differences.

      Line 359: Need to talk about each supplementary figure 4 to 9 and how they demonstrate your point.

      We have expanded this section to more explicitly mention the contents of these supplementary figures and why they are relevant for the findings of this section (L425).

      Are the text and figures clear and accurate?

      Figure 4 too small

      We have fixed the figure, as described above

      Acronyms are defined multiple times in the manuscript, sometimes not the first time they are used (e.g. SNP, InDel)

      Figure 8A - Remove red box, increase label size

      Figure 8B - Low resolution, grey text is unreadable and should be darker and higher resolution

      Line 35 - be more specific about types of carbon metabolism and catabolite repression

      Line 67 - include citation for ireland et al. 2020 https://doi.org/10.7554/eLife.55308

Line 74 - You talk about looking in cis but don't specify how far away cis is

      Line 75 - we encoded genetic variants..... It is unclear what you mean here

      Line 104 - 'were apart of operons' should clarify you mean polycistronic or multi-gene operons. Single genes may be considered operonic units as well.

      We have addressed all the issues indicated above.

Figure 2: There is no axis for the percents and the percents don't make sense relative to the bars they represent??

We realize that this visualization might not have been the clearest for readers, and have made the following improvement: we have added the number of genes with at least one association before the percentage. We note that the x-axis is in log scale, which may make it seem like the light-colored bars are off. With the addition of the actual number of associated genes we think that this confusion has been removed.

      Figure 2: Figure 2B legend should clarify that these are individual examples of Differential expression between variants

      Line 198-199: This sentence doesn't make sense, 'encoded using kmers' is not descriptive enough

      Line 205: Should be upfront about that you're using the Promoter Calculator that models biophysical properties of promoter sequences to predict activity.

      Line 251: 'Scanned the non-coding sequences of the DEGs'. This is far too vague of a description of an approach. Need to clarify how you did this and I didn't see in the method. Is this an HMM? Perfect sequence match to consensus sequence? Some type of alignment?

      Line 257-259: This sentence lacks clarity

      We have implemented all the suggested changes and clarified the points that the reviewer has highlighted above.

      Line346: How were the E. coli isolates tested? Was this an experiment you did? This is a massive undertaking (1600 isolates * 12 conditions) if so so should be clearly defined

      While we have indicated in the previous paragraph that the genomes and antimicrobial susceptibility data were obtained from previously published studies, we have now modified this paragraph (e.g. L411 and L418) slightly to make this point even clearer.

      Figure 6A: The tile plot on the right side is not clearly labeled and it is unclear what it is showing and how that relates to the bar plots.

      In the revised figure, we have clarified the labeling of the heatmap to now read “Log2(Fold Change) (measured expression)” to indicate that it represents each gene’s fold changes obtained from our initial transcriptomic analysis. We have also included this information in the caption of the figure, making the relationship between the measured gene expression (heatmap) and the reporter assay data (bar plots) clear to the reader.

      FIgure 6B: typo in legend 'Downreglation'

We thank the reviewer for pointing this out. The typo has been corrected to “Down regulation” in the revised figure.

      Line 398: Need to state rationale for why Waaq operon is being investigated here. WHy did you look into individual example?

We thank the reviewer for asking for a clarification here. Our decision to investigate the waaQ gene was based on both biological relevance and empirical evidence. In our analysis associating non-coding variants with antimicrobial resistance using the Moradigaravand et al. dataset, we identified a T>C variant at position 3808241 that was associated with resistance to tobramycin. We also observed this variant in our strain collection, where it was associated with expression changes of the gene, suggesting a possible functional impact. The waa operon is involved in LPS synthesis, a central determinant of the bacteria’s outer membrane integrity and a well-established virulence factor. This provided a plausible biological mechanism through which variation could influence antimicrobial susceptibility. As its role in resistance has not been extensively characterized, waaQ represented a good candidate for our experimental validation. We have now included this rationale in our revised manuscript (i.e. L476).

      Figure 8: Can get rid of red box

      We have now removed the red box from Figure 8 in the revised version.

      Line 463 - 'account for all kinds' is too informal

      Mix of font styles throughout document

      We have implemented all the suggestions and formatting changes indicated above.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In their manuscript "Cis non-coding genetic variation drives gene expression changes in the E. coli and P. aeruginosa pangenomes", Damaris and co-authors present an extensive meta-analysis, plus some useful follow up experiments, attempting to apply GWAS principles to identify the extent to which differences in gene expression between different strains within a given species can be directly assigned to cis-regulatory mutations. The overall principle, and the question raised by the study, is one of substantial interest, and the manuscript here represents a careful and fascinating effort at unravelling these important questions. I want to preface my review below (which may otherwise sound more harsh than I intend) with the acknowledgment that this is an EXTREMELY difficult and challenging problem that the authors are approaching, and they have clearly put in a substantial amount of high quality work in their efforts to address it. I applaud the work done here, I think it presents some very interesting findings, and I acknowledge fully that there is no one perfect approach to addressing these challenges, and while I will object to some of the decisions made by the authors below, I readily admit that others might challenge my own suggestions and approaches here. With that said, however, there is one fundamental decision that the authors made which I simply cannot agree with, and which in my view undermines much of the analysis and utility of the study: that decision is to treat both gene expression and the identification of cis-regulatory regions at the level of individual genes, rather than transcriptional units. Below I will expand on why I find this problematic, how it might be addressed, and what other areas for improvement I see in the manuscript:

      We thank the reviewer for their praise of our work. A careful set of replies to the major and minor critiques are reported below each point.

In the entire discussion from lines roughly 100-130, the authors frequently dissect out apparently differentially expressed genes from non differentially expressed genes within the same operons... I honestly wonder whether this is a useful distinction. I understand that by the criteria set forth by the authors it is technically correct, and yet, I wonder if this is more due to thresholding artifacts (i.e., some genes passing the authors' reasonable-yet-arbitrary thresholds whereas others in the same operon do not), and in the process causing a distraction from an operon that is in fact largely moving in the same direction. The authors might wish to either aggregate data in some way across known transcriptional units for the purposes of their analysis, and/or consider a more lenient 'rescue' set of significance thresholds for genes that are in the same operons as differentially expressed genes. I would favor the former approach, performing virtually all of their analysis at the level of transcriptional units rather than individual genes, as much of their analysis in any case relies upon proper assignment of genes to promoters, and this way they could focus on the most important signals rather than get lost sometimes in the weeds of looking at every single gene when really what they seem to be looking at in this paper is a property OF THE PROMOTERS, not the genes. (Of course there are phenomena, such as rho-dependent termination specifically titrating expression of late genes in operons, but I think on the balance the operon-level analysis might provide more insights and a cleaner analysis and discussion).

      We agree with the reviewer that the peculiar nature of transcription in bacteria has to be taken into account in order to properly quantify the influence of cis variants in gene expression changes. We therefore added the exact analysis the reviewer suggested; that is, we ran associations between the variants in cis to the first gene of each operon and a phenotype that considered the fold-change of all genes in the operon, via a weighted average (see Methods for more details). As reported in the results section (L223), we found a similar trend as with the original analysis: we found the highest proportion of associations when encoding cis variants using k-mers (42% for E. coli and 45% for P. aeruginosa). More importantly, we found a high degree of overlap between this new “operon-level” association analysis and the original one (only including the first gene in each operon). We found a range of 90%-94% of associations overlapping for E. coli and between 75% and 91% for P. aeruginosa, depending on the variant type. We note that operon definitions are less precise for P. aeruginosa, which might explain the higher variability in the level of overlap. We have added the results of this analysis in the results section.
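      The operon-level phenotype described in this reply can be sketched as a weighted average of the per-gene fold-changes within each operon. The snippet below is a minimal illustration only; the weighting scheme shown (hypothetical mean-expression weights) is an assumption for demonstration, not necessarily the exact scheme detailed in the Methods:

```python
import numpy as np

def operon_phenotype(lfc, weights=None):
    """Collapse per-gene log fold-changes of one operon into a single
    phenotype via a weighted average (uniform weights by default)."""
    lfc = np.asarray(lfc, dtype=float)
    if weights is None:
        weights = np.ones_like(lfc)
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(lfc * weights) / np.sum(weights))

# toy example: three genes in one operon, optionally weighted by
# (hypothetical) mean expression of each gene
lfc = [1.2, 0.8, 1.0]
mean_expr = [300.0, 100.0, 100.0]
print(operon_phenotype(lfc))             # uniform average -> 1.0
print(operon_phenotype(lfc, mean_expr))  # expression-weighted -> 1.08
```

The resulting per-operon value can then be used as the phenotype in an association test against variants in cis to the first gene of the operon.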

      This also leads to a more general point, however, which I think is potentially more deeply problematic. At the end of the day, all of the analysis being done here centers on the cis regulatory logic upstream of each individual open reading frame, even though in many cases (i.e., genes after the first one in multi-gene operons), this is not where the relevant promoter is. This problem, in turn, raises potential misattributions of causality running in both directions, where the causal impact of a bona fide promoter mutation on many genes in an operon may only be associated with the first gene, or on the other side, where a mutation that co-occurs with, but is causally independent from, an actual promoter mutation may be flagged as the one driving an expression change. This becomes an especially serious issue for genes that are not the first gene in an operon, in cases like ulaE: at least according to standard annotations, the UlaE transcript should be part of a polycistronic mRNA beginning from the ulaA promoter, and the role played by cis-regulatory logic immediately upstream of ulaE is uncertain and certainly merits deeper consideration. I suspect that many other similar cases likewise lurk in the dataset used here (perhaps even more so for the Pseudomonas data, where the operon definitions are likely less robust). Of course there are many possible explanations, such as a separate ulaE promoter only in some strains, but this should perhaps be carefully stated and explored, and seems likely to be the exception rather than the rule.

      While we again agree with the reviewer that some of our associations might not reflect a direct causal link because the focal variant may not belong to an actual promoter element, we also want to point out that identifying the composition of transcriptional units in bacteria is far from a solved problem (see references at the bottom of this comment, two in general terms, and one characterizing a specific example), even for a well-studied species such as E. coli. Therefore, even if carrying out associations at the operon level (e.g. by focusing exclusively on variants in cis for the first gene in the operon) might be theoretically correct, a number of the associations we find further down the putative operons might be the result of a true biological signal.

      1. Conway, T., Creecy, J. P., Maddox, S. M., Grissom, J. E., Conkle, T. L., Shadid, T. M., Teramoto, J., San Miguel, P., Shimada, T., Ishihama, A., Mori, H., & Wanner, B. L. (2014). Unprecedented High-Resolution View of Bacterial Operon Architecture Revealed by RNA Sequencing. mBio, 5(4), 10.1128/mbio.01442-14. https://doi.org/10.1128/mbio.01442-14

      2. Sáenz-Lahoya, S., Bitarte, N., García, B., Burgui, S., Vergara-Irigaray, M., Valle, J., Solano, C., Toledo-Arana, A., & Lasa, I. (2019). Noncontiguous operon is a genetic organization for coordinating bacterial gene expression. Proceedings of the National Academy of Sciences, 116(5), 1733–1738. https://doi.org/10.1073/pnas.1812746116

      3. Zehentner, B., Scherer, S., & Neuhaus, K. (2023). Non-canonical transcriptional start sites in E. coli O157:H7 EDL933 are regulated and appear in surprisingly high numbers. BMC Microbiology, 23(1), 243. https://doi.org/10.1186/s12866-023-02988-6

      Another issue with the current definition of regulatory regions, which should perhaps also be accounted for, is that it is likely that for many operons, the 'regulatory regions' of one gene might overlap the ORF of the previous gene, and in some cases actual coding mutations in an upstream gene may contaminate the set of potential regulatory mutations identified in this dataset.

      We agree that defining regulatory regions might be challenging, and that those regions might overlap with coding regions, either for the focal gene or the one immediately upstream. For these reasons we have defined a wide region to identify putative regulatory variants (-200 to +30 bp around the start codon of the focal gene). We believe this relatively wide region allows us to capture the most cis genetic variation.

      Taken together, I feel that all of the above concerns need to be addressed in some way. At the absolute barest minimum, the authors need to acknowledge the weaknesses that I have pointed out in the definition of cis-regulatory logic at a gene level. I think it would be far BETTER if they performed a re-analysis at the level of transcriptional units, which I think might substantially strengthen the work as a whole, but I recognize that this would also constitute a substantial amount of additional effort.

      As indicated above, we have added a section in the results section to report on the analysis carried out at the level of operons as individual units, with more details provided in the methods section. We believe these results, which largely overlap with the original analysis, are a good way to recognize the limitation of our approach and to acknowledge the importance of gaining a better understanding of the number and composition of transcriptional units in bacteria, for which, as the references above indicate, our knowledge is still incomplete.

      Having reached the end of the paper, and considering the evidence and arguments of the authors in their totality, I find myself wondering how much local x background interactions - that is, the effects of cis regulatory mutations (like those being considered here, with or without the modified definitions that I proposed above) IN THE CONTEXT OF A PARTICULAR STRAIN BACKGROUND, might matter more than the effects of the cis regulatory mutations per se. This is a particularly tricky problem to address because it would require a moderate number of targeted experiments with a moderate number of promoters in a moderate number of strains (which of course makes it maximally annoying since one can't simply scale up hugely on either axis individually and really expect to tease things out). I think that trying to address this question experimentally is FAR beyond the scope of the current paper, but I think perhaps the authors could at least begin to address it by acknowledging it as a challenge in their discussion section, and possibly even identify candidate promoters that might show the largest divergence of activities across strains when there IS no detectable cis regulatory mutation (which might be indicative of local x background interactions), or those with the largest divergences of effect for a given mutation across strains. A differential expression model incorporating shrinkage is essential in such analysis to avoid putting too much weight on low expression genes with a lot of Poisson noise.

      We again thank the reviewer for their thoughtful comments on the limitations of correlative studies in general, and microbial GWAS in particular. In regards to microbial GWAS we feel we may have failed to properly explain how the implementation we have used allows us to, at least partially, correct for population structure effects. That is, the linear mixed model we have used relies on population structure to remove the part of the association signal that is due to the genetic background and thus focus the analysis on the specific loci. Obviously examples in which strong epistatic interactions are present would not be accounted for, but those would be extremely challenging to measure or predict at scale, as the reviewer rightfully suggests. We have added a brief recap of the ability of microbial GWAS to account for population structure in the results section (“A large fraction of gene expression changes can be attributed to genetic variations in cis regulatory regions”, e.g. L195).

      I also have some more minor concerns and suggestions, which I outline below:

      It seems that the differential expression analysis treats the lab reference strains as the 'centerpoint' against which everything else is compared, and yet I wonder if this is the best approach... it might be interesting to see how the results differ if the authors instead take a more 'average' strain (either chosen based on genetics or transcriptomics) as a reference and compared everything else to that.

      While we don’t necessarily disagree with the reviewer that a “wild” strain would be better to compare against, we think that our choice to go for the reference isolates is still justified on two grounds. First, while it is true that comparing against a reference introduces biases in the analysis, this concern would not be removed had we chosen another strain as reference; which strain would then be best as a reference to compare against? We think that the second point provides an answer to this question; the “traditional” reference isolates have a rich ecosystem of annotations, experimental data, and computational predictions. These can in turn be used for validation and hypothesis generation, which we have done extensively in the manuscript. Had we chosen a different reference isolate we would have had to still map associations to the traditional reference, resulting in a probable reduction in precision. An example that will likely resonate with this reviewer is that we have used experimentally-validated and high quality computational operon predictions to look into likely associations between cis-variants and “operon DEGs”. This analysis would have likely been of worse quality had we used another strain as reference, for which operon definitions would have had to come from lower-quality predictions or be “lifted” from the traditional reference.

      Line 104 - the statement about the differentially expressed genes being "part of operons with diverse biological functions" seems unclear - it is not apparent whether the authors are referring to diversity of function within each operon, or between the different operons, and in any case one should consider whether the observation reflects any useful information or is just an apparently random collection of operons.

      We agree that this formulation could create confusion and we have elected to remove the expression “with diverse biological functions”, given that we discuss those functions immediately after that sentence.

      Line 292 - I find the argument here somewhat unconvincing, for two reasons. First, the fact that only half of the observed changes went in the same direction as the GWAS results would indicate, which is trivially a result that would be expected by random chance, does not lend much confidence to the overall premise of the study that there are meaningful cis regulatory changes being detected (in fact, it seems to argue that the background in which a variant occurs may matter a great deal, at least as much as the cis regulatory logic itself). Second, in order to even assess whether the GWAS is useful to "find the genetic determinants of gene expression changes" as the authors indicate, it would be necessary to compare to a reasonable, non-straw-man, null approach simply identifying common sequence variants that are predicted to cause major changes in sigma 70 binding at known promoters; such a test would be especially important given the lack of directional accuracy observed here. Along these same lines, it is perhaps worth noting, in the discussion beginning on line 329, that the comparison is perhaps biased in favor of the GWAS study, since the validation targets here were prioritized based on (presumably strong) GWAS data.

      We thank the reviewer for prompting us into reasoning about the results of the in-vitro validation experiments. We agree that the measured gene expression changes agree only partly with those measured with the reporter system, and that this discrepancy could likely be attributed to regulatory elements that are not in cis, and thus that were not present in the in-vitro reporter system. We have noted this possibility in the discussion. Additionally, we have amended the results section to note that even though the prediction of the direction of gene expression change was not as accurate as could be expected, the accuracy in predicting whether a change would be present (thus ignoring directionality) was much higher.

      I don't find the Venn diagrams in Fig 7C-D useful or clear given the large number of zero-overlap regions, and would strongly advocate that the authors find another way to show these data.

      While we are aware of alternative ways to show overlap between sets, such as upset plots, we don’t actually find them that much easier to parse. We think that the simple and direct Venn diagrams we have drawn convey the clear message that overlaps only exist between certain drug classes in E. coli, and virtually none for P. aeruginosa. We have added a comment on the lack of overlap between all drug classes and the differences between the two species in the results section (i.e. L436 and L465).

      In the analysis of waa operon gene expression beginning on line 400, it is perhaps important to note that most of the waa operon doesn't do anything in laboratory K12 strains due to the lack of complete O-antigen... the same is not true, however, for many wild/clinical isolates. It would be interesting to see how those results compare, and also how the absolute TPMs (rather than just LFCs) of genes in this operon vary across the strains being investigated during TOB treatment.

      We thank the reviewer for this helpful suggestion. We examined the absolute expression (TPMs) of waa operon genes under the baseline condition (A) and following exposure to Tobramycin (B). The representative TPMs per strain were obtained by averaging across biological replicates. We observed constitutive expression of these genes in the reference strain (MG1655) and the other isolates containing the variant of interest (MC4100, BW25113). In contrast, strains lacking the variants of interest (IAI76 and IAI78) showed lower expression of these operon genes under both conditions. Strain IAI77, on the other hand, displayed increased expression of a subset of waa genes post Tobramycin exposure, indicating strain-specific variation in transcriptional response. While the reference isolate might not have the O-antigen, it certainly expresses the waa operon, both constitutively and under TOB exposure.

      I don't think that the second conclusion on lines 479-480 is fully justified by the data, given both the disparity in available annotation information between the two species, AND the fact that only two species were considered.

      While we feel that the “Discussion” section of a research paper allows for speculative statements, we have to concede that we have perhaps overreached here. We have amended this sentence to be more cautious and not mislead readers.

      Line 118: "Double of DEGs"

      Line 288 - presumably these are LOG fold changes

      Fig 6b - legend contains typos

      Line 661 - please report the read count (more relevant for RNA-seq analysis) rather than Gb

      We thank the reviewer for pointing out the need to make these edits. We have implemented them all.

      Source code - I appreciate that the authors provide their source code on github, but it is very poorly documented - both a license and some top-level documentation about which code goes with each major operation/conclusion/figure should be provided. Also, ipython notebooks are in general a poor way in my view to distribute code, due to their encouragement of nonlinear development practices; while they are fine for software development, actual complete python programs along with accompanying source data would be preferable.

      We agree with the reviewer that a software license and some documentation about what each notebook is about is warranted, and we have added them both. While we agree that for “consumer-grade” software jupyter notebooks are not the most ergonomic format, we believe that as a documentation of how one-time analyses were carried out they are actually one of the best formats we could think of. They in fact allow for code and outputs to be presented alongside each other, which greatly helped us to iterate on our research and to ensure that what was presented in the manuscript matched the analyses we reported in the code. This is of course up for debate and ultimately specific to someone’s taste, and so we will keep the reviewer’s critique in mind for our next manuscript. And, if we ever decide to package the analyses presented in the manuscript as a “consumer-grade” application for others to use, we would follow higher standards of documentation and design.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Damaris et al. collected genome sequences and transcriptomes from isolates from two bacterial species. Data for E. coli were produced for this paper, while data for P. aeruginosa had been measured earlier. The authors integrated these data to detect genes with differential expression (DE) among isolates as well as cis-expression quantitative trait loci (cis-eQTLs). The authors used sample sizes that were adequate for an initial exploration of gene regulatory variation (n=117 for E. coli and n=413 for P. aeruginosa) and were able to discover cis eQTLs at about 39% of genes. In a creative addition, the authors compared their results to transcription rates predicted from a biophysical promoter model as well as to annotated transcription factor binding sites. They also attempted to validate some of their associations experimentally using GFP-reporter assays. Finally, the paper presents a mapping of antibiotic resistance traits. Many of the detected associations for this important trait group were in non-coding genome regions, suggesting a role of regulatory variation in antibiotic resistance.

      A major strength of the paper is that it covers an impressive range of distinct analyses, some of which in two different species. Weaknesses include the fact that this breadth comes at the expense of depth and detail. Some sections are underdeveloped, not fully explained and/or thought-through enough. Important methodological details are missing, as detailed below.

      We thank the reviewer for highlighting the strengths of our study. We hope that our replies to their comments and the other two reviewers will address some of the limitations.

      Major comments:

      1. An interesting aspect of the paper is that genetic variation is represented in different ways (SNPs & indels, IRG presence/absence, and k-mers). However, it is not entirely clear how these three different encodings relate to each other. Specifically, more information should be given on these two points:

      2. it is not clear how "presence/absence of intergenic regions" is different from larger indels.

      In order to better guide readers through the different kinds of genetic variants we considered, we have added a brief explanation about what “promoter switches” are in the introduction (“meaning that the entire promoter region may differ between isolates due to recombination events”, L56). We believe this clarifies how they are very different in character from a large deletion. We have kept the reference to the original study (10.1073/pnas.1413272111) describing how widespread these switches are in E. coli as a way for readers to discover more about them.

      • I recommend providing more narration on how the k-mers compare to the more traditional genetic variants (SNPs and indels). It seems like the k-mers include the SNPs and indels somehow? More explanation would be good here, as k-mer based mapping is not usually done in other species and is not standard practice in the field. Likewise, how is multiple testing handled for association mapping with k-mers, since presumably each gene region harbors a large number of k-mers, potentially hugely increasing the multiple testing burden?

      We indeed agree with the reviewer in thinking that representing genetic variants as k-mers would encompass short variants (SNP/InDels) as well as larger variants and promoters presence/absence patterns. We believe that this assumption is validated by the fact that we identify the highest proportion of DEGs with a significant association when using this representation of variants (Figure 2A, 39% for both species). We have added a reference to a recent review on the advantages of k-mer methods for population genetics (10.1093/molbev/msaf047) in the introduction. Regarding the issue of multiple testing correction, we have employed a commonly recognized approach that, unlike a crude Bonferroni correction using the number of tested variants, allows for a realistic correction of association p-values. We used the number of unique presence/absence patterns, which can be shared between multiple genetic variants, and applied a Bonferroni correction using this number rather than the number of variants tested. We have expanded the corresponding section in the methods (e.g. L697) to better explain this point for readers not familiar with this approach.
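      A minimal sketch of this pattern-based correction is shown below (this approach is popularized by microbial GWAS tools such as pyseer; the exact implementation used in the manuscript may differ in its details). Variants that share an identical presence/absence pattern across strains are counted once, and the Bonferroni threshold is derived from the number of distinct patterns rather than the number of variants:

```python
import numpy as np

def pattern_threshold(genotypes, alpha=0.05):
    """Bonferroni threshold based on the number of *unique*
    presence/absence patterns rather than the number of variants.

    `genotypes` is a (variants x strains) 0/1 matrix; variants sharing
    the same pattern across all strains are counted only once."""
    G = np.asarray(genotypes)
    patterns = {tuple(row) for row in G}
    return alpha / len(patterns), len(patterns)

# toy example: 4 variants across 5 strains, but only 2 distinct patterns
G = [[0, 1, 1, 0, 1],
     [0, 1, 1, 0, 1],   # same pattern as variant 1
     [1, 0, 0, 1, 0],
     [1, 0, 0, 1, 0]]   # same pattern as variant 3
thr, n_patterns = pattern_threshold(G)
print(n_patterns, thr)  # 2 patterns -> 0.05 / 2 = 0.025, not 0.05 / 4
```

Because identical patterns yield identical association statistics, this gives a less punishing yet still principled correction, which matters especially for k-mers, where the raw number of tested variants is very large.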

      1. What was the distribution of association effect sizes for the three types of variants? Did IRGs have larger effects than SNPs as may be expected if they are indeed larger events that involve more DNA differences? What were their relative allele frequencies?

      We appreciate the suggestion made by the reviewer to look into the distribution of effect sizes divided by variant type. We have now evaluated the distribution of the effect sizes and allele frequencies for the genetic markers (SNPs/InDels, IGRs, and k-mers) for both species (Supplementary Figure 2). In E. coli, IGR variants showed somewhat larger median effect sizes (median |β| = 4.5) than SNPs (median |β| = 3.8), whereas k-mers displayed the widest distribution (median |β| = 5.2). In P. aeruginosa, the trend differed, with IGRs exhibiting smaller effects (median |β| = 3.2) compared to SNPs/InDels (median |β| = 5.1) and k-mers (median |β| = 6.2). With respect to allele frequencies, SNPs/InDels generally occurred at lower frequencies (median AF = 0.34 for E. coli, median AF = 0.33 for P. aeruginosa), whereas IGRs (median AF = 0.65 for E. coli and 0.75 for P. aeruginosa) and k-mers (median AF = 0.71 for E. coli and 0.65 for P. aeruginosa) were more often at intermediate to higher frequencies, respectively. We have added a visualization for the distribution of effect sizes (Supplementary Figure 2).
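      The per-class summary described in this reply amounts to grouping associations by variant class and taking medians of |β| and allele frequency. The sketch below uses entirely made-up toy values (not the study's data) to illustrate the computation:

```python
import pandas as pd

# hypothetical association table: one row per significant association,
# with its variant class, effect size (beta), and allele frequency (af)
df = pd.DataFrame({
    "variant_class": ["snp", "snp", "igr", "igr", "kmer", "kmer"],
    "beta": [3.0, -4.0, 4.2, -4.8, 5.2, -5.8],
    "af":   [0.30, 0.40, 0.60, 0.70, 0.70, 0.72],
})

# median absolute effect size and median allele frequency per class
summary = (df.assign(abs_beta=df["beta"].abs())
             .groupby("variant_class")[["abs_beta", "af"]]
             .median())
print(summary)
```

On real data, `df` would hold one row per significant GWAS hit, and the resulting table directly gives the per-class medians reported in the text.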

      1. The GFP-based experiments attempting to validate the promoter effects for 18 genes are laudable, and the fact that 16 of them showed differences is nice. However, the fact that half of the validation attempts yielded effects in the opposite direction of what was expected is quite alarming. I am not sure this really "further validates" the GWAS in the way the authors state in line 292 - in fact, quite the opposite in that the validations appear random with regards to what was predicted from the computational analyses. How do the authors interpret this result? Given the higher concordance between GWAS, promoter prediction, and DE, are the GFP assays just not relevant for what is going on in the genome? If not, what are these assays missing? Overall, more interpretation of this result would be helpful.

      We thank the reviewer for their comment, which is similar in nature to that raised by reviewer #2 above. As noted in our reply above, we have amended the results and discussion to indicate that although the direction of gene expression change was not highly accurate, focusing on the magnitude (or rather whether there would be a change in gene expression, regardless of the direction) resulted in a higher accuracy. We postulate that the cases in which the direction of the change was not correctly identified could be due to the influence of other genetic elements in trans with the gene of interest.

      1. On the same note, it would be really interesting to expand the GFP experiments to promoters that did not show association in the GWAS. Based on Figure 6, effects of promoter differences on GFP reporters seem to be very common (all but three were significant). Is this a higher rate than for the average promoter with sequence variation but without detected association? A handful of extra reporter experiments might address this. My larger question here is: what is the null expectation for how much functional promoter variation there is?

      We thank the reviewer for this comment. We agree that estimating the null expectation for functional promoter variation would require testing promoter alleles with sequence variation that are not associated in the GWAS. Such experiments, which would directly address whether the observed effects in our study exceed background, would have required us to prepare multiple constructs, which was unfortunately not possible for us due to staff constraints. We therefore elected to clarify the scope of our GFP reporter assays instead. These experiments were designed as a paired comparison of the wild-type and the GWAS-associated variant alleles of the same promoter in an identical reporter background, with the aim of testing allele-specific functional effects for GWAS hits (Supplementary Figure 6). We also included a comparison in GFP fluorescence between the promoterless vector (pOT2) and promoter-containing constructs; we observed higher GFP signals in all but four (yfgJ, fimI, agaI, and yfdQ) variant-containing promoter constructs, which indicates that for most of the constructs we cloned active promoter elements. We have revised the manuscript text accordingly to reflect this clarification and included the control in the supplementary information as Supplementary Figure 6.

      1. Were the fold-changes in the GFP experiments statistically significant? Based on Figure 6 it certainly looks like they are, but this should be spelled out, along with the test used.

      We thank the reviewer for pointing this out. We have revised Figure 6 to indicate significant differences between the test and control reporter constructs. We used a paired Student’s t-test on the matched plate/time point measurements. We also corrected for multiple testing using the Benjamini-Hochberg correction. As seen in the updated Figure 6A, 16 out of the 18 reporter constructs displayed significant differences (adjusted p-value
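      The multiple-testing step described in this reply can be sketched as a Benjamini-Hochberg step-up procedure on the per-construct p-values (in practice the p-values themselves would come from paired t-tests, e.g. `scipy.stats.ttest_rel`; the values below are hypothetical placeholders):

```python
import numpy as np

def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values (step-up procedure)."""
    p = np.asarray(pvals, dtype=float)
    n = len(p)
    order = np.argsort(p)
    # scale sorted p-values by n / rank
    ranked = p[order] * n / np.arange(1, n + 1)
    # enforce monotonicity from the largest rank downwards
    adj = np.minimum.accumulate(ranked[::-1])[::-1]
    out = np.empty(n)
    out[order] = np.clip(adj, 0.0, 1.0)
    return out

# hypothetical per-construct paired t-test p-values
pvals = [0.001, 0.004, 0.03, 0.2]
print(bh_adjust(pvals))  # -> [0.004, 0.008, 0.04, 0.2]
```

Constructs whose adjusted p-value falls below the chosen threshold are then flagged as significant in the figure.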

      1. What was the overall correlation between GWAS-based fold changes and those from the GFP-based validation? What does Figure 6A look like as a scatter plot comparing these two sets of values?

      We thank the reviewer for this helpful suggestion, which allows us to more closely look into the results of our in-vitro validation. We performed a direct comparison of RNA-seq fold changes from the GWAS (x-axis) with the GFP reporter measurements (y-axis) as depicted in the figure above. The overall correlation between the two was weak (Pearson r = 0.17), reflecting the limited agreement between the associations and the reporter constructs. We however note that the two metrics are not directly comparable in our opinion, since on the x-axis we are measuring changes in gene expression and on the y-axis changes in fluorescence, which is downstream of it. As mentioned above and in reply to a comment from reviewer 2, the agreement between measured gene expression and all other in-silico and in-vitro techniques increases when ignoring the direction of the change. Overall, we believe that these results partly validate our associations and predictions, while indicating that other factors in trans with the regulatory region contribute to changes in gene expression, which is to be expected. The scatter plot has been included as a new supplementary figure (Supplementary Figure 7).
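      The correlation reported in this reply is a plain Pearson coefficient between the two paired fold-change vectors (equivalent to `scipy.stats.pearsonr`). The sketch below uses illustrative values only, not the study's measurements; the second call shows the direction-agnostic comparison on absolute values mentioned in the text:

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two paired vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))

# hypothetical paired values: RNA-seq log fold changes vs GFP reporter changes
rnaseq_lfc = [1.0, -0.5, 2.0, -1.0, 0.3]
gfp_fc     = [0.2,  0.8, 1.5, -0.1, -0.6]

r_signed = pearson_r(rnaseq_lfc, gfp_fc)              # directional agreement
r_magnitude = pearson_r(np.abs(rnaseq_lfc),
                        np.abs(gfp_fc))               # magnitude-only agreement
print(round(r_signed, 2), round(r_magnitude, 2))
```

A scatter of the two vectors (as in Supplementary Figure 7) makes the same comparison visually, with r summarizing the strength of the linear trend.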

      1. Was the SNP analyzed in the last Results section significant in the gene expression GWAS? Did the DE results reported in this final section correspond to that GWAS in some way?

      The T>C SNP upstream of waaQ did not show significant association with gene expression in our cis GWAS analysis. Instead, this variant was associated with resistance to tobramycin when referencing data from Danesh et al, and we observed the variant in our strain collection. We subsequently investigated whether this variant also influenced expression of the waa operon under sub-inhibitory tobramycin exposure. The differential expression results shown in the final section therefore represent a functional follow-up experiment, and not a direct replication of the GWAS presented in the first part of the manuscript.

      1. Line 470: "Consistent with the differences in the genetic structure of the two species" It is not clear what differences in genetic structure this refers to. Population structure? Genome architecture? Differences in the biology of regulatory regions?

      The awkwardness of that sentence is perhaps the consequence of our assumption that readers would be aware of the population genetics differences between the two species. We have however realized that not much literature is available (if at all!) about these differences, which we have observed during the course of this and other studies we have carried out. As a result, we agree that we cannot assume that the reader is similarly familiar with these differences, and have changed that sentence (i.e. L548) to more directly address the differences between the two species, which presumably result in different population structures. We thank the reviewer for making us aware of a gap in the literature concerning the comparison of pangenome structures across relevant species.

      1. Line 480: the reference to "adaption" is not warranted, as the paper contains no analyses of evolutionary patterns or processes. Genetic variation is not the same as adaptation.

      We have amended this sentence to be more adherent to what we can conclude from our analyses.

      1. There is insufficient information on how the E. coli RNA-seq data was generated. How was RNA extracted? Which QC was done on the RNA; what was its quality? Which library kits were used? Which sequencing technology? How many reads? What QC was done on the RNA-seq data? For this section, the Methods are seriously deficient in their current form and need to be greatly expanded.

      We thank the reviewer for highlighting the need for clearer methodological detail. We have expanded this section (i.e. L608) to fully describe the generation and quality control of the E. coli RNA-seq data including RNA extraction and sequencing platform.

      1. How were the DEG p-values adjusted for multiple testing?

      As indicated in the methods section (“Differential gene expression and functional enrichment analysis”), we have used DEseq2 for E. coli, and LPEseq for P. aeruginosa. Both methods use the statistical framework of the False Discovery Rate (FDR) to compute an adjusted p-value for each gene. We have added a brief mention of us following the standard practice indicated by both software packages in the methods.

      1. Were there replicates for the E. coli strains? The methods do not say, but there is a hint there might have been replicates given their absence was noted for the other species.

      In the context of providing more information about the transcriptomics experiments for E. coli, we have also more clearly indicated that we have two biological replicates for the E. coli dataset.

      1. There needs to be more information on the "pattern-based method" that was used to correct the GWAS for multiple tests. How does this method work? What genome-wide threshold did it end up producing? Was there adjustment for the number of genes tested in addition to the number of variants? Was the correction done per variant class or across all variant classes?

      In line with an earlier comment from this reviewer, we have expanded the section in the Methods (e.g. L689) that explains how this correction worked to include as many details as possible, in order to provide the readers with the full context under which our analyses were carried out.

      1. For a paper that, at its core, performs a cis-eQTL mapping, it is an oversight that there seems not to be a single reference to the rich literature in this space, comprising hundreds of papers, in other species ranging from humans, many other animals, to yeast and plants.

      We thank both reviewers #1 and #3 for pointing out this lack of references to the extensive literature on the subject. We have added to the introduction a number of references on the applications of eQTL studies, and specifically their application in microbial pangenomes, which we believe is most relevant to our study.

      Minor comments:

      1. I wasn't able to understand the top panels in Figure 4. For ulaE, most strains have the solid colors, and the corresponding bottom panel shows mostly red points. But for waaQ, most strains have solid color in the top panel, but only a few strains in the bottom panel are red. So solid color in the top does not indicate a variant allele? And why are there so many solid alleles; are these all indels? Even if so, for kgtP, the same colors (i.e., nucleotides) seem to seamlessly continue into the bottom, pale part of the top panel. How are these strains different genotypically? Are these blocks of solid color counted as one indel or several SNPs, or somehow as k-mer differences? As the authors can see, these figures are really hard to understand and should be reworked. The same comment applies to Figure 5, where it seems that all (!) strains have the "variant"?

      We thank the reviewer for pointing out some limitations with our visualizations, most importantly with the way we explained how to read those two figures. We have amended the captions to more explicitly explain what is shown. The solid colors in the “sequence pseudo-alignment” panels indicate the focal cis variant, which is indicated in red in the corresponding “predicted transcription rate” panels below. In the case of Figure 5, the solid color indicates instead the position of the TFBS in the reference.

      1. Figure 1A & B: It would be helpful to add the total number of analyzed genes somewhere so that the numbers denoted in the colored outer rings can be interpreted in comparison to the total.

      We have added the total number of genes being considered for either species in the legend.

      1. Figure 1C & D: It would be better to spell out the COG names in the figure, as it is cumbersome for the reader to have to look up what the letters stand for in a supplementary table in a separate file.

      While we agree that having to consult a supplementary table to identify the full name of a COG category is cumbersome, we would also like to point out that the very long names of each category would clutter the figure to a degree that would make it difficult to read. We had indeed attempted something similar to what the reviewer suggests in early drafts of this manuscript, which led to small, hard-to-read labels. We have therefore left the full names of each COG category in Supplementary Table 3.

      1. Line 107: "Similarly," does not fit here as the following example (with one differentially expressed gene in an operon) is conceptually different from the one before, where all genes in the operon were differentially expressed.

      We agree and have amended the sentence accordingly.

      1. Figure 5 bottom panel: it is odd that on the left the swarm plots (i.e., the dots) are on the inside of the boxplots while on the right they are on the outside.

      We have fixed the position of the dots so that they are centered with respect to the underlying boxplots.

      1. It is not clear to me how only one or a few genes in an operon can show differential mRNA abundance. Aren't all genes in an operon encoded by the same mRNA? If so, shouldn't this mRNA be up- or downregulated in the same manner for all genes it encodes? As I am not closely familiar with bacterial systems, it is well possible that I am missing some critical fact about bacterial gene expression here. If this is not an analysis artifact, the authors could briefly explain how this observation is possible.

      We thank the reviewer for their comment, which again echoes one of the main concerns of reviewer #2. As noted in our reply above, multiple studies (see the three indicated above in our reply to reviewer #2) have established that bacteria encode multiple “non-canonical” transcriptional units (i.e. operons), due to the presence of accessory terminators and promoters. This, together with other biological effects such as the presence of mRNA molecules of different lengths due to active transcription and degradation, and technical noise introduced by RNA isolation and sequencing, can result in variability in the estimated abundance of each gene.

    1. Learn how to watch and rate movies people (rated for balance only) The people who rated this movie 1-star should get their heads out of their posteriors. Too many movie-goers these days seem to only see movies as either being the best thing ever or the worst thing ever. The only way a movie should get 10 stars is if it would be difficult to improve upon, and the only way a movie should get 1 star is if it was absolutely ineptly made on every level, and I assure you this movie doesn't come close to that. Even solely rating on personal taste and ignoring the technical filmmaking and how successfully the movie achieves the filmmakers' apparent intent, this movie could hardly be in the worst 10% of movies for anyone's taste. This movie fails in many respects, but it has some redeeming moments, and taken as a movie for small kids, it's not bad. The humor and acting both fall flat or miss the mark about as often as they're on target, but that is a sign of mediocrity, not atrocity. Unfortunately, at this point most of the IMDb users seem to think that if they enjoyed a movie they should give it a 10, and if it wasn't all they hoped for they should give it a 1. For instance, the Lord of the Rings movies were entertaining, but have no business being rated higher than Citizen Kane or any of the countless classics relegated to lower ranks here. Similarly, Zoom has no business being rated lower than a piece of garbage like I Accuse My Parents, which wasn't even watchable when it was skewered on Mystery Science Theater 3000. Remember, folks: most movies are mediocre. That means a low rating, not the bottom rating. Furthermore, just because a movie is exciting or satisfying doesn't make it a 10. For example, one can love the original Star Wars movies and still realize they have occasional flaws in acting, direction, pacing, or script. Is Zoom a great movie? Absolutely not. Will some children, some parents, and even some adults without children enjoy it? Yes. Will it go down in history for being remarkable in any way? Probably not.
    1. So is this simple substitution cipher uncrackable? Well, let's wait a minute here. It's not. For example, one of the pieces of information we can use in analyzing simple substitution is the letter frequencies of the English language, or any language. If you looked at a histogram of English letter frequencies, you'd see that E is the most frequent letter, at just above 12 percent; T is the second most frequent, at somewhere around nine percent; A is very frequent, at more than eight percent; and so forth. So you would see this familiar pattern in any text that you encrypted. For example, if you encrypted it using the simple substitution cipher, you might get this pattern, but you'd still see a peak where the letter that was substituted for E occurs. And maybe this is the peak for T, and maybe this is the peak for A. By analyzing those peaks and valleys, you can determine how the message was encrypted and figure out the key. But, wait a minute… frequency analysis works! [Slides 21-22: frequency histograms]

      For example, if you sort the two histograms by frequency, you can see this very clearly: they're practically identical, and that's what lets the analyst, with some effort of course, figure out the key and break the message. So you can break the simple substitution cipher using frequency analysis. That makes Eve happy; Alice and Bob are not very happy about that. Eve wins… you don't need brute force. Frequency analysis will break simple substitution. [Slides 23-24: sorted histograms]
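A minimal sketch of the frequency-analysis attack the lecture describes: rank the ciphertext's letters by frequency and align that ranking with the typical English ranking. The ranking string and helper names here are illustrative, not from the slides; a real attack would refine this first guess with bigram statistics and trial decryption.

```python
from collections import Counter
import string

# Typical English letter ranking, most to least frequent (approximate).
ENGLISH_ORDER = "ETAOINSHRDLCUMWFGYPBVKJXQZ"

def letter_frequencies(text):
    """Return the letters of `text` ranked from most to least frequent."""
    counts = Counter(c for c in text.upper() if c in string.ascii_uppercase)
    return "".join(letter for letter, _ in counts.most_common())

def guess_key(ciphertext):
    """First-pass guess at a substitution key: pair the ciphertext's
    frequency ranking with the expected English ranking."""
    ranked = letter_frequencies(ciphertext)
    return {cipher: plain for cipher, plain in zip(ranked, ENGLISH_ORDER)}
```

On a long enough ciphertext, the most frequent cipher letter is almost certainly the substitute for E, the next for T, and so on, which is exactly the peak-matching Eve performs in the transcript.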

    1. Missionaries had their greatest success in Spanish America and in the Philippines, areas that shared two critical elements beyond their colonization by Spain

      The lasting success of these missionaries is still visible in the present-day Philippines. With nearly 90% of the country a part of the Christian faith, it is the largest Christian nation on the Asian continent. The vast majority of these Christians are Roman Catholic, just like the Spaniards who settled in the Philippines in the 1500s. https://blogs.loc.gov/international-collections/2018/07/catholicism-in-the-philippines-during-the-spanish-colonial-period-1521-1898/

    1. Reference:

      Preprints. (2024). Traditional ecological knowledge and environmental stewardship. https://doi.org/10.20944/preprints202406.1838.v1

      “Traditional Ecological Knowledge (TEK) represents a knowledge system grounded in lived experience and intergenerational learning.” (Preprints, 2024)

      This article connects strongly to our course because in class we will study sustainability and social-ecological systems. We have already discussed social-ecological systems, where we learned that environmental management is not just technical but also shaped by social and historical factors. In class we discussed how ecosystems and humans influence each other, which is exactly what Traditional Ecological Knowledge is based on. TEK is built on generations of observation and trial and error, rather than on short-term scientific studies and policies. This article also connects to our discussion about whose knowledge is considered valid in environmental decision-making. The course argues that ignoring Indigenous peoples' knowledge leads to ineffective policies and poor outcomes. This article supports the course argument that effective environmental management requires multiple ways of knowing, of which TEK is a necessary part. It also reinforces that sustainability is not just about protecting nature but also about respecting culture, people, and historical experience.

    1. How is it that she could only see junk where I saw my entire life story?

      This sentence in itself is very important and impactful. People can judge others' hobbies or interests and see them as boring or uninteresting, but for that person it is something more than what meets the eye: something they enjoy and find meaningful and interesting. This introduces the differences between people. We all have different ideas, interests, and so on.

    1. For example, a social media application might offer us a way of “Private Messaging” (also called Direct Messaging) with another user. But in most cases those “private” messages are stored in the computers at those companies, and the company might have computer programs that automatically search through the messages, and people with the right permissions might be able to view them directly.

      That’s such a good reminder that “private” messaging usually just means “not public,” not “only between two people.” It makes me want to be more careful about what I assume is truly confidential when I DM. I also think it’s wild how much of this is automated—like filtering, scanning, or flagging messages without us noticing. It raises a real question of how much privacy we’re actually trading for convenience.

    1. 3.4. Bots and Responsibility. As we think about the responsibility in ethical scenarios on social media, the existence of bots causes some complications. 3.4.1. A Protesting Donkey? To get an idea of the type of complications we run into, let’s look at the use of donkeys in protests in Oman: “public expressions of discontent in the form of occasional student demonstrations, anonymous leaflets, and other rather creative forms of public communication. Only in Oman has the occasional donkey…been used as a mobile billboard to express anti-regime sentiments. There is no way in which police can maintain dignity in seizing and destroying a donkey on whose flank a political message has been inscribed.” From Kings and People: Information and Authority in Oman, Qatar, and the Persian Gulf by Dale F. Eickelman1 In this example, some clever protesters have made a donkey perform the act of protest: walking through the streets displaying a political message. But, since the donkey does not understand the act of protest it is performing, it can’t be rightly punished for protesting. The protesters have managed to separate the intention of protest (the political message inscribed on the donkey) and the act of protest (the donkey wandering through the streets). This allows the protesters to remain anonymous and the donkey unaware of its political mission. 3.4.2. Bots and responsibility. Bots present a similar disconnect between intentions and actions. Bot programs are written by one or more people, potentially all with different intentions, and they are run by other people, or sometimes scheduled by people to be run by computers. This means we can analyze the ethics of the action of the bot, as well as the intentions of the various people involved, though those all might be disconnected. 3.4.3. Reflection questions. How are people’s expectations different for a bot and a “normal” user?
Choose an example social media bot (find on your own or look at Examples of Bots (or apps).) What does this bot do that a normal person wouldn’t be able to, or wouldn’t be able to as easily? Who is in charge of creating and running this bot? Does the fact that it is a bot change how you feel about its actions? Why do you think social media platforms allow bots to operate? Why would users want to be able to make bots? How does allowing bots influence social media sites’ profitability? 1 We haven’t been able to get the original chapter to load to see if it indeed says that, but I found it quoted here and here. We also don’t know if this is common or representative of protests in Oman, nor that we fully understand the cultural importance of what is happening in this story. Still, we are using it at least as a thought experiment.

      I found the donkey protest example helpful for understanding how responsibility can be separated from action. Just like the donkey does not understand the protest it carries, bots can perform actions without intention or awareness. This makes it harder to assign responsibility, since the people who design, deploy, or benefit from a bot may all have different roles and intentions.

    1. often require manual intervention or complex mapping logic in Clickpipes to prevent the pipeline from breaking.

      It's not as simple as just using CDC - there are a ton of downstream issues, like these schema issues, that make it harder.

    1. Reviewer #1 (Public review):

      Summary:

      This paper investigates the control signals that drive event model updating during continuous experience. The authors apply predictions from previously published computational models to fMRI data acquired while participants watched naturalistic video stimuli. They first examine the time course of BOLD pattern changes around human-annotated event boundaries, revealing pattern changes preceding the boundary in anterior temporal and then parietal regions, followed by pattern stabilization across many regions. The authors then analyze time courses around boundaries generated by a model that updates event models based on prediction error and another that uses prediction uncertainty. These analyses reveal overlapping but partially distinct dynamics for each boundary type, suggesting that both signals may contribute to event segmentation processes in the brain.

      Strengths:

      The question addressed by this paper is of high interest to researchers working on event cognition, perception, and memory. There has been considerable debate about what kinds of signals drive event boundaries, and this paper directly engages with that debate by comparing prediction error and prediction uncertainty as candidate control signals.

      The authors use computational models that explain significant variance in human boundary judgments, and they report the variance explained clearly in the paper.

      The authors' method of using computational models to generate predictions about when event model updating should occur is a valuable mechanistic alternative to methods like HMM or GSBS, which are data-driven.

      The paper utilizes an analysis framework that characterizes how multivariate BOLD pattern dissimilarity evolves before and after boundaries. This approach offers an advance over previous work focused on just the boundary or post-boundary points.

      Weaknesses:

      Boundaries derived from prediction error and uncertainty are correlated for the naturalistic stimuli. This raises some concerns about how well their distinct contributions to brain activity can be separated. While the authors attempt to look at the unique variance, there is a limit to how effectively this can be done without experimentally dissociating prediction error and uncertainty.

      The authors report an average event length of ~20 seconds, and they also look +20 and -20 seconds around each event boundary. Thus, it's unclear how often pre- and post-boundary timepoints are part of adjacent events. This complicates the interpretation of the reported timecourses.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This paper investigates the control signals that drive event model updating during continuous experience. The authors apply predictions from previously published computational models to fMRI data acquired while participants watched naturalistic video stimuli. They first examine the time course of BOLD pattern changes around human-annotated event boundaries, revealing pattern changes preceding the boundary in anterior temporal and then parietal regions, followed by pattern stabilization across many regions. The authors then analyze time courses around boundaries generated by a model that updates event models based on prediction error and another that uses prediction uncertainty. These analyses reveal overlapping but partially distinct dynamics for each boundary type, suggesting that both signals may contribute to event segmentation processes in the brain.

      Strengths:

      (1) The question addressed by this paper is of high interest to researchers working on event cognition, perception, and memory. There has been considerable debate about what kinds of signals drive event boundaries, and this paper directly engages with that debate by comparing prediction error and prediction uncertainty as candidate control signals.

      (2) The authors use computational models that explain significant variance in human boundary judgments, and they report the variance explained clearly in the paper.

      (3) The authors' method of using computational models to generate predictions about when event model updating should occur is a valuable mechanistic alternative to methods like HMM or GSBS, which are data-driven.

      (4) The paper utilizes an analysis framework that characterizes how multivariate BOLD pattern dissimilarity evolves before and after boundaries. This approach offers an advance over previous work focused on just the boundary or post-boundary points.

      We appreciate this reviewer’s recognition of the significance of this research problem, and of the value of the approach taken by this paper.

      Weaknesses:

      (1) While the paper raises the possibility that both prediction error and uncertainty could serve as control signals, it does not offer a strong theoretical rationale for why the brain would benefit from multiple (empirically correlated) signals. What distinct advantages do these signals provide? This may be discussed in the authors' prior modeling work, but is left too implicit in this paper.

      We added a brief discussion in the introduction highlighting the complementary advantages of prediction error and prediction uncertainty, and cited prior theoretical work that elaborates on this point. Specifically, we now note that prediction error can act as a reactive trigger, signaling when the current event model is no longer sufficient (Zacks et al., 2007). In contrast, prediction uncertainty is framed as proactive, allowing the system to prepare for upcoming changes even before they occur (Baldwin & Kosie, 2021; Kuperberg, 2021). Together, this makes clearer why these two signals could each provide complementary benefits for effective event model updating.

      "One potential signal to control event model updating is prediction error—the difference between the system’s prediction and what actually occurs. A transient increase in prediction error is a valid indicator that the current model no longer adequately captures the current activity. Event Segmentation Theory (EST; Zacks et al., 2007) proposes that event models are updated when prediction error increases beyond a threshold, indicating that the current model no longer adequately captures ongoing activity. A related but computationally distinct proposal is that prediction uncertainty (also termed "unpredictability") can serve as a control signal (Baldwin & Kosie, 2021). The advantage of relying on prediction uncertainty to detect event boundaries is that it is inherently proactive: the cognitive system can start looking for cues about what might come next before the next event starts (Baldwin & Kosie, 2021; Kuperberg, 2021). "

      (2) Boundaries derived from prediction error and uncertainty are correlated for the naturalistic stimuli. This raises some concerns about how well their distinct contributions to brain activity can be separated. The authors should consider whether they can leverage timepoints where the models make different predictions to make a stronger case for brain regions that are responsive to one vs the other.

      We addressed this concern by adding an analysis that explicitly tests the unique contributions of prediction error– and prediction uncertainty–driven boundaries to neural pattern shifts. In the revised manuscript, we describe how we fit a combined FIR model that included both boundary types as predictors and then compared this model against versions with only one predictor. This allowed us to identify the variance explained by each boundary type over and above the other. The results revealed two partially dissociable sets of brain regions sensitive to error- versus uncertainty-driven boundaries (see Figure S1), strengthening our argument that these signals make distinct contributions.

      "To account for the correlation between uncertainty-driven boundaries and error-driven boundaries, we also fitted a FIR model that predicted pattern dissimilarity from both types of boundaries (combined FIR) for each parcel. Then, we performed two likelihood ratio tests: combined FIR to error FIR, which measures the unique contribution of uncertainty boundaries to pattern dissimilarity, and combined FIR to uncertainty FIR, which measures the unique contribution of error boundaries to pattern dissimilarity. The analysis also revealed two dissociable sets of brain regions associated with each boundary type (see Figure S1)."
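The nested-model comparison described in that excerpt (a combined FIR model tested against single-boundary-type FIR models via likelihood-ratio tests) follows a standard pattern; here is a generic sketch under Gaussian-error assumptions, with made-up data and a function name that are not taken from the authors' pipeline.

```python
import numpy as np
from scipy.stats import chi2

def lrt_nested_ols(y, X_full, X_reduced):
    """Likelihood-ratio test for nested least-squares models.

    Under Gaussian errors the LR statistic reduces to
    n * ln(RSS_reduced / RSS_full), asymptotically chi-squared with
    df equal to the number of extra parameters in the full model."""
    n = len(y)

    def rss(X):
        # Residual sum of squares of the least-squares fit of y on X.
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        return float(resid @ resid)

    stat = n * np.log(rss(X_reduced) / rss(X_full))
    df = X_full.shape[1] - X_reduced.shape[1]
    return stat, chi2.sf(stat, df)
```

In the paper's setting, `X_full` would hold FIR regressors for both boundary types and `X_reduced` only one, so a small p-value indicates that the omitted boundary type explains pattern dissimilarity over and above the other.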

      (3) The authors refer to a baseline measure of pattern dissimilarity, which their dissimilarity measure of interest is relative to, but it's not clear how this baseline is computed. Since the interpretation of increases or decreases in dissimilarity depends on this reference point, more clarity is needed.

      We clarified how the FIR baseline is estimated in the methods section. Specifically, we now explain that the FIR coefficients should be interpreted relative to a reference level, which reflects the expected dissimilarity when timepoints are far from an event boundary. This makes it clear what serves as the comparison point for observed increases or decreases in dissimilarity.

      "The coefficients from the FIR model indicate changes relative to baseline, which can be conceptualized as the expected value when far from event boundaries."

      (4) The authors report an average event length of ~20 seconds, and they also look at +20 and -20 seconds around each event boundary. Thus, it's unclear how often pre- and post-boundary timepoints are part of adjacent events. This complicates the interpretations of the reported time courses.

      This is related to Reviewer #2's comment and will be addressed below.

      (5) The authors describe a sequence of neural pattern shifts during each type of boundary, but offer little setup of what pattern shifts we might expect or why. They also offer little discussion of what cognitive processes these shifts might reflect. The paper would benefit from a more thorough setup for the neural results and a discussion that comments on how the results inform our understanding of what these brain regions contribute to event models.

      We thank the reviewer for this advice on how better to set the context for the different potential outcomes of the study. We expanded both the introduction and discussion to better set up expectations for neural pattern shifts and to interpret what these shifts may reflect. In the introduction, we now describe prior findings showing that sensory regions tend to update more quickly than higher-order multimodal regions (Baldassano et al., 2017; Geerligs et al., 2021, 2022), and we highlight that it remains unclear whether higher-order updates precede or follow those in lower-order regions. We also note that our analytic approach is well-suited to address this open question. In the discussion, we then interpret our results in light of this framework. Specifically, we describe how we observed early shifts in higher-order areas such as anterior temporal and prefrontal cortex, followed by shifts in parietal and dorsal attention regions closer to event boundaries. This pattern runs counter to the traditional bottom-up temporal hierarchy view and instead supports a model of top-down updating, where high-level representations are updated first and subsequently influence lower-level processing (Friston, 2005; Kuperberg, 2021). To make this interpretation concrete, we added an example: in a narrative where a goal is reached midway—for instance, a mystery solved before the story formally ends—higher-order regions may update the event representation at that point, and this updated model then cascades down to shape processing in lower-level regions. Finally, we note that the widespread stabilization of neural patterns after boundaries may signal the establishment of a new event model.

      Excerpt from Introduction:

      “More recently, multivariate approaches have provided insights into neural representations during event segmentation. One prominent approach uses hidden Markov models (HMMs) to detect moments when the brain switches from one stable activity pattern to another (Baldassano et al., 2017) during movie viewing; these periods of relative stability were referred to as "neural states" to distinguish them from subjectively perceived events. Sensory regions like visual and auditory cortex showed faster transitions between neural states. Multi-modal regions like the posterior medial cortex, angular gyrus, and intraparietal sulcus showed slower neural state shifts, and these shifts aligned with subjectively reported event boundaries. Geerligs et al. (2021, 2022) employed a different analytical approach called Greedy State Boundary Search (GSBS) to identify neural state boundaries. Their findings echoed the HMM results: short-lived neural states were observed in early sensory areas (visual, auditory, and somatosensory cortex), while longer-lasting states appeared in multi-modal regions, including the angular gyrus, posterior middle/inferior temporal cortex, precuneus, anterior temporal pole, and anterior insula. Particularly prolonged states were found in higher-order regions such as lateral and medial prefrontal cortex.

      The previous evidence about evoked responses at event boundaries indicates that these are dynamic phenomena evolving over many seconds, with different brain areas showing different dynamics (Ben-Yakov & Henson, 2018; Burunat et al., 2024; Kurby & Zacks, 2018; Speer et al., 2007; Zacks, 2010). Less is known about the dynamics of pattern shifts at event boundaries (e.g. whether shifts observed in higher-order regions precede or follow shifts observed in lower-level regions), because the HMM and GSBS analysis methods do not directly provide moment-by-moment measures of pattern shifts. Both the spatial and temporal aspects of evoked responses and pattern shifts at event boundaries have the potential to provide evidence about two potential control processes (error-driven and uncertainty-driven) for event model updating.”

      Excerpt from Discussion:

      “We first characterized the neural signatures of human event segmentation by examining both univariate activity changes and multivariate pattern changes around subjectively identified event boundaries. Using multivariate pattern dissimilarity, we observed a structured progression of neural reconfiguration surrounding human-identified event boundaries. The largest pattern shifts were observed near event boundaries (~4.5s before) in dorsal attention and parietal regions; these correspond with regions identified by Geerligs et al. (2022) as shifting their patterns on a fast to intermediate timescale. We also observed smaller pattern shifts roughly 12 seconds prior to event boundaries in higher-order regions within anterior temporal cortex and prefrontal cortex, and these are slow-changing regions identified by Geerligs et al. (2022). This is puzzling. One prevalent proposal, based on the idea of a cortical hierarchy of increasing temporal receptive windows (TRWs), suggests that higher-order regions should update representations after lower-order regions do (Chang et al., 2021). In this view, areas with shorter TRWs (e.g., word-level processors) pass information upward, where it is integrated into progressively larger narrative units (phrases, sentences, events). This proposal predicts neural shifts in higher-order regions to follow those in lower-order regions. By contrast, our findings indicate the opposite sequence. Our findings suggest that the brain might engage in top-down event representation updating, with changes in coarser-grain representations propagating downward to influence finer-grain representations (Friston, 2005; Kuperberg, 2021).
For example, in a narrative where the main goal is achieved midway—such as a detective solving a mystery before the story formally ends—higher-order regions might update the overarching event representation at that point, and this updated model could then cascade down to reconfigure how lower-level regions process the remaining sensory and contextual details. In the period after a boundary (around +12 seconds), we found widespread stabilization of neural patterns across the brain, suggesting the establishment of a new event model. Future work could focus on understanding the mechanisms behind the temporal progression of neural pattern changes around event boundaries.”

      Reviewer #2 (Public review):

      Summary:

      Tan et al. examined how multivoxel patterns shift in time windows surrounding event boundaries caused by both prediction errors and prediction uncertainty. They observed that some regions of the brain show earlier pattern shifts than others, followed by periods of increased stability. The authors combine their recent computational model to estimate event boundaries that are based on prediction error vs. uncertainty and use this to examine the moment-to-moment dynamics of pattern changes. I believe this is a meaningful contribution that will be of interest to memory, attention, and complex cognition research.

      Strengths:

      The authors have shown exceptional transparency in terms of sharing their data, code, and stimuli, which is beneficial to the field for future examinations and to the reproduction of findings. The manuscript is well written with clear figures. The study starts from a strong theoretical background to understand how the brain represents events and has used a well-curated set of stimuli. Overall, the authors extend the event segmentation theory beyond prediction error to include prediction uncertainty, which is an important theoretical shift that has implications in episodic memory encoding, the use of semantic and schematic knowledge, and attentional processing.

      We thank the reviewer for their support for our use of open science practices, and for their appreciation of the importance of incorporating prediction uncertainty into models of event comprehension.

      Weaknesses:

      The data presented is limited to the cortex, and subcortical contributions would be interesting to explore. Further, the temporal window around event boundaries of 20 seconds is approximately the length of the average event (21.4 seconds), and many of the observed pattern effects occur relatively distal from event boundaries themselves, which makes the link to the theoretical background challenging. Finally, while multivariate pattern shifts were examined at event boundaries related to either prediction error or prediction uncertainty, there was no exploration of univariate activity differences between these two different types of boundaries, which would be valuable.

      The fact that we observed neural pattern shifts well before boundaries was indeed unexpected, and we now offer a more extensive interpretation in the discussion section. Specifically, we added text noting that shifts emerged in higher-order anterior temporal and prefrontal regions roughly 12 seconds before boundaries, whereas shifts occurred in lower-level dorsal attention and parietal regions closer to boundaries. This sequence contrasts with the traditional bottom-up temporal hierarchy view and instead suggests a possible top-down updating mechanism, in which higher-order representations reorganize first and propagate changes to lower-level areas (Friston, 2005; Kuperberg, 2021). (See excerpt for Reviewer 1’s comment #5.)

      With respect to univariate activity, we did not find strong differences between error-driven and uncertainty-driven boundaries. This makes the multivariate analyses particularly informative for detecting differences in neural pattern dynamics. To support further exploration, we have also shared the temporal progression of univariate BOLD responses on OpenNeuro (BOLD_coefficients_brain_animation_pe_SEM_bold.html and BOLD_coefficients_brain_animation_uncertainty_SEM_bold.html in the derivatives/figures/brain_maps_and_timecourses/ directory; https://doi.org/10.18112/openneuro.ds005551.v1.0.4) for interested researchers.

      Reviewer #3 (Public review):

      Summary:

      The aim of this study was to investigate the temporal progression of the neural response to event boundaries in relation to uncertainty and error. Specifically, the authors asked (1) how neural activity changes before and after event boundaries, (2) if uncertainty and error both contribute to explaining the occurrence of event boundaries, and (3) if uncertainty and error have unique contributions to explaining the temporal progression of neural activity.

      Strengths:

      One strength of this paper is that it builds on an already validated computational model. It relies on straightforward and interpretable analysis techniques to answer the main question, with a smart combination of pattern similarity metrics and FIR. This combination of methods may also be an inspiration to other researchers in the field working on similar questions. The paper is well written and easy to follow. The paper convincingly shows that (1) there is a temporal progression of neural activity change before and after an event boundary, and (2) event boundaries are predicted best by the combination of uncertainty and error signals.

      We thank the reviewer for their thoughtful and supportive comments, particularly regarding the use of the computational model and the analysis approaches.

      Weaknesses:

      (1) The current analysis of the neural data does not convincingly show that uncertainty and prediction error both contribute to the neural responses. As both terms are modelled in separate FIR models, it may be that the responses we see for both are mostly driven by shared variance. Given that the correlation between the two is very high (r=0.49), this seems likely. The strong overlap in the neural responses elicited by both, as shown in Figure 6, also suggests that what we see may mainly be shared variance. To improve the interpretability of these effects, I think it is essential to know whether uncertainty and error explain similar or unique parts of the variance. The observation that they have distinct temporal profiles is suggestive of some dissociation, but not as convincing as adding them both to a single model.

      We appreciate this point. It is closely related to Reviewer 1's comment 2; please refer to our response above.

      (2) The results for uncertainty and error show that uncertainty has strong effects before or at boundary onset, while error is related to more stabilization after boundary onset. This makes me wonder about the temporal contribution of each of these. Could it be the case that increases in uncertainty are early indicators of a boundary, and errors tend to occur later?

      We also share the intuition that increases in uncertainty are early indicators of a boundary, and errors tend to occur later. If that is the case, we would expect some lags between prediction uncertainty and prediction error. We examined lagged correlation between prediction uncertainty and prediction error, and the optimal lag is 0 for both uncertainty-driven and error-driven models. This indicates that when prediction uncertainty rises, prediction error also simultaneously rises.

      Author response image 1.
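
      The lagged-correlation check described above can be sketched as follows. This is an illustrative reimplementation, not the authors' code: the two time series are hypothetical stand-ins for the uncertainty and error signals, and `best_lag` simply returns the lag at which the Pearson correlation peaks.

```python
# Illustrative sketch of a lagged-correlation analysis (hypothetical data,
# not the authors' code or time series).

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5  # assumes neither segment is constant

def best_lag(x, y, max_lag):
    """Lag (in samples) at which corr(x[t], y[t + lag]) is highest."""
    best = None
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[:len(x) - lag], y[lag:]
        else:
            a, b = x[-lag:], y[:len(y) + lag]
        r = pearson(a, b)
        if best is None or r > best[1]:
            best = (lag, r)
    return best[0]

# Hypothetical signals: y is x delayed by 3 samples.
x = [0, 0, 1, 3, 1, 0, 0, 2, 5, 2, 0, 0, 0, 1, 0]
y = [0, 0, 0] + x[:-3]
print(best_lag(x, y, 5))  # → 3
```

      For a series that is a delayed copy of another, the recovered lag equals the delay; for the actual uncertainty and error signals, the reported optimal lag was 0.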

      (3) Given that there is a 24-second period during which the neural responses are shaped by event boundaries, it would be important to know more about the average distance between boundaries and the variability of this distance. This will help establish whether the FIR model can properly capture a return to baseline.

      We have added details about the distribution of event lengths. Specifically, we now report that the mean length of subjectively identified events was 21.4 seconds (median 22.2 s, SD 16.1 s). For model-derived boundaries, the average event lengths were 28.96 seconds for the uncertainty-driven model and 24.7 seconds for the error-driven model.

      " For each activity, a separate group of 30 participants had previously segmented each movie to identify fine-grained event boundaries (Bezdek et al., 2022). The mean event length was 21.4 s (median 22.2 s, SD 16.1 s). Mean event lengths for uncertainty-driven model and error-driven model were 28.96s, and 24.7s, respectively (Nguyen et al., 2024)."

      (4) Given that there is an early onset and long-lasting response of the brain to these event boundaries, I wonder what causes this. Is it the case that uncertainty or errors already increase at 12 seconds before the boundaries occur? Or are there other markers in the movie that the brain can use to foreshadow an event boundary? And if uncertainty or errors do increase already 12 seconds before an event boundary, do you see a similar neural response at moments with similar levels of error or uncertainty, which are not followed by a boundary? This would reveal whether the neural activity patterns are specific to event boundaries or whether these are general markers of error and uncertainty.

      We appreciate this point; it is similar to reviewer 2’s comment 2. Please see our response to that comment above.

      (5) It is known that different brain regions have different delays of their BOLD response. Could these delays contribute to the propagation of the neural activity across different brain areas in this study?

      Our analyses use ±20 s FIR windows, and the key effects we report include shifts ~12 s before boundaries in higher-order cortex and ~4.5 s pre-boundary in dorsal attention/parietal areas. Region-dependent BOLD delays reported in the literature are much smaller (~1–2 s) than the temporal structure we observe (Taylor et al., 2018), making it unlikely that HRF lag alone explains our multi-second, region-specific progression.

      (6) In the FIR plots, timepoints -12, 0, and 12 are shown. These long intervals preclude an understanding of the full temporal progression of these effects.

      For reasons of page length, we did not include all timepoints. We have uploaded a brain animation of all timepoints and coefficients for each parcel to OpenNeuro (PATTERN_coefficients_brain_animation_human_fine_pattern.html and PATTERN_coefficients_lines_human_fine.html in the derivatives/figures/brain_maps_and_timecourses/ directory; https://doi.org/10.18112/openneuro.ds005551.v1.0.4) for interested researchers.

      References

      Taylor, A. J., Kim, J. H., & Ress, D. (2018). Characterization of the hemodynamic response function across the majority of human cerebral cortex. NeuroImage, 173, 322–331. https://doi.org/10.1016/j.neuroimage.2018.02.061

    1. 8.2. Data From the Reddit API

      We’ve been accessing Reddit through Python and the “PRAW” code library. The praw code library works by sending requests across the internet to Reddit, using what is called an “application programming interface”, or API for short. APIs have a set of rules for what requests you can make, what happens when you make the request, and what information you can get back. If you are interested in learning more about what you can do with praw and what information you can get back, you can look at the official praw library documentation, though be warned: it is not organized in a newcomer-friendly way and takes some getting used to. You can learn a little more by clicking on the praw models and finding a list of the types of data for each of the models, and a list of functions (i.e., actions) you can do with them. You can also look up information on the data that you can get from the Reddit API in the Reddit API Documentation. The Reddit API lets you access just some of the data that Reddit tracks, but Reddit and other social media platforms track much more than they let you have access to.

      This section helped me better understand what the Reddit API actually is and how PRAW works behind the scenes. I didn’t really think about the fact that it’s just sending requests to Reddit and getting specific data back based on rules. The warning about the documentation being hard to read feels very accurate, because most official docs are kind of confusing for beginners. It was also interesting to realize that Reddit collects way more data than what the API lets us see.
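
      As a concrete (and entirely hypothetical) sketch of what a praw request looks like: the subreddit name, helper names, and credential placeholders below are illustrative only, and actually sending the request requires your own Reddit API credentials.

```python
# Hypothetical sketch of using the praw library (illustrative only; the
# subreddit name and credentials below are placeholders, not real values).

def summarize_post(title, score):
    """Pure helper: format one post's score and title for display."""
    return f"{score:>5}  {title}"

def fetch_hot_titles(reddit, subreddit_name, limit=5):
    """Send a request to the Reddit API (via a praw.Reddit client) and
    return formatted lines for the current 'hot' posts."""
    return [summarize_post(s.title, s.score)
            for s in reddit.subreddit(subreddit_name).hot(limit=limit)]

# To actually run this you need praw installed and your own credentials:
#   import praw
#   reddit = praw.Reddit(client_id="YOUR_ID", client_secret="YOUR_SECRET",
#                        user_agent="example script")
#   for line in fetch_hot_titles(reddit, "python"):
#       print(line)
```

      The point is the same one the excerpt makes: the library just packages up a request, sends it to Reddit, and hands back only the fields the API chooses to expose (here, `title` and `score`).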

    2. The Reddit API lets you access just some of the data that Reddit tracks, but Reddit and other social media platforms track much more than they let you have access to.

      I found it interesting that APIs create a filtered version of reality for researchers and developers. When we analyze Reddit data through the API, it’s easy to forget that we are only seeing what the platform allows us to see, not the full picture of user behavior. This makes me wonder how data mining conclusions might change if hidden data—like deleted interactions or internal metrics—were included.

    1. nclusion, I argue for univeris within our grasp financially. It's our gr

      Kahle’s main argument is that universal access to knowledge is not just an ideal, it’s something we already have the tools, money, and systems to achieve if we choose to act on it.

    1. Digital libraries will be critical to future humanities scholarship. Not only will they provide access to a host of source materials that humanists need in order to do their work, but these libraries will also enable new forms of research that were difficult or impossible to undertake before.

      The author emphasizes that digital libraries aren’t just about easy access, they change how research can be done. It’s a shift from visiting a library to interacting with the material in new computational ways.

  4. muhlenbergcollege.instructure.com
    1. The consistency is also impressive given the fact that when the causal arrow is reversed and one considers research that investigates the influence of repressive behavior on dissent (exploring one of the core aspects of collective action theory regarding the importance of costs), the results are highly inconsistent.

      This is the part that confuses researchers. We know that dissent causes repression, but we don't know if repression actually stops dissent. Sometimes people get scared and go home (negative impact), but other times it just makes them angrier and leads to more protests (positive impact). It’s a puzzle because the government keeps using repression even though they aren't totally sure it works.

    1. The alleged ‘‘purity’’ of the archival fondsas an integral organic whole was increasingly challenged in working reality,however much the evidential rhetoric lingered from an earlier period.

      (I remember we only need to do one post, I'm doing an additional one because I had thoughts I wanted to express) It seems to me that the early focuses on truth, impartiality, purity, etc. especially as they relate to the history of a nation are indicative of a sort of imperialist, or just broadly nationalist approach to record keeping. The concept of a pure truth that is muddied by human inadequacies seems tied to other scientific movements of the time like eugenics. It feels like a weird comparison to make, but you see the common thread of logic drawn from evolutionary theory and various industrial philosophies: Humans are flawed, thus we must erase the fallible parts of humanity so that there is only the most rational and advantageous parts left. It's interesting to see how much archival theory was dominated by a similar approach until the material impracticality forced a change in perspective.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We sincerely appreciate the feedback, attention to detail and timeliness of the referees for our manuscript. Below, we provide a point-by-point response to all comments from the referees, detailing the changes we have already made, and those that are in progress. Referee's comments will appear in bolded text, while our responses will be unbolded. Any text quoted directly from the manuscript will be italicised and contained within "quotation marks". Additionally, we have grouped all comments into four categories (structural changes, minor text changes, experimental changes, figure changes), comments are numbered 1-n in each of these categories. Please note: this response to reviewer's comments included some images that cannot be embedded in this text-only section.

      1. General Statements

      We appreciate the overall highly positive and enthusiastic comments from all reviewers, who clearly appreciated the technical difficulty of this study, and noted amongst other things that this study represents "a major contribution to the future advancement of oocyst-sporozoite biology" and that the development of the segmentation score for oocysts is a "major advance[ment]". We apologise for the omission of line numbers on the document sent to reviewers; we removed these for the bioRxiv submission without considering that this PDF would be transferred across to Review Commons.

      We have responded to all reviewers' comments through a variety of text changes, experimental inclusions, or direct query responses. Significant changes to the manuscript since initial submission are as follows:

      1. Refinement of rhoptry biogenesis model: Reviewers requested more detail around the content of the AORs, which we had previously suggested were a vehicle for rhoptry biogenesis as we saw they carried the rhoptry neck protein RON4. To address this, we first attempted to use antibodies against rhoptry bulb proteins but were unsuccessful. We then developed a *P. berghei* line where the rhoptry bulb protein RhopH3 was GFP-tagged. Using this parasite line, we observed that the earliest rhoptry-like structure, which we had previously interpreted as an AOR, contained RhopH3. By contrast, RhopH3 was absent from AORs. Reflecting these observations, we have renamed this initial structure the 'pre-rhoptry' and suggested a model for rhoptry biogenesis where rhoptry neck cargo are trafficked via the AOR but rhoptry bulb cargo are trafficked by small vesicles that move along the rootlet fibre (previously observed by EM).
      2. Measurement of rhoptry neck vs bulb: While not directly suggested by the reviewers, we have also included an analysis that estimates the proportion of the sporozoite rhoptry that represents the rhoptry neck. By contrast to merozoites, which we show are overwhelmingly represented by the rhoptry bulb, the vast majority of the sporozoite rhoptry represents the rhoptry neck.
      3. Measurement of subpellicular microtubules: One reviewer asked if we could measure the length of subpellicular microtubules where we had previously observed that they were longer on one side of the sporozoite than the other. We have now provided absolute and relative (% sporozoite length) length measurements for these subpellicular microtubules and also calculated the proportion of the microtubule that is polyglutamylated.
      4. More detailed analysis of RON11cKD rhoptries: Multiple comments suggested a more detailed analysis of the rhoptries that were formed/not formed in RON11cKD sporozoites. We have included an updated analysis that shows the relative position of these rhoptries in sporozoites.

      2. Point-by-point description of the revisions

      Reviewer #1

      Minor text changes (Reviewer #1)

      1. __Text on page 12 could be condensed to highlight the new data of ron4 staining of the AOR. __

      We agree with the reviewer that this is a reasonable suggestion. After obtaining additional data on the contents of the AOR (as described in General Statements #1), this section has been significantly rewritten to highlight these findings.

      2. __Add reference on page 3 after 'disrupted parasites'__

      This sentence has been rewritten slightly with some references included and now reads: "Most data on these processes comes from electron microscopy studies6-8, with relatively few functional reports on gene deleted or disrupted parasites9-11."

      3. __Change 'the basal complex at the leading edge' - this seems counterintuitive__

      This change has been made.

      4. __Change 'mechanisms underlying SG are poorly' - what mechanisms? of invasion or infection?__

      This was supposed to read "SG invasion" and has now been fixed.

      5. __On page 4: 'handful of proteins'__

      This error has been corrected.

      6. __What are the 'three microtubule spindle structures'?__

      The three microtubule spindle structures (hemispindle, mitotic spindle, and interpolar spindle) are now listed explicitly in the text.

      7. __On page 5: 'little is known' - please describe what is known, also in other stages. At the end of the paper I would like to know what is the key difference to rhoptry function in other stages?__

      The following sentence already detailed that we had recently used U-ExM to visualise rhoptry biogenesis in blood-stage parasites, but the following two sentences have been added to provide extra detail on these findings: "In that study, we defined the timing of rhoptry biogenesis, showing that it began prior to cytokinesis and completed approximately coincident with the final round of mitosis. Additionally, we observed that rhoptry duplication and inheritance was coupled with centriolar plaque duplication and nuclear fission."

      8. __change 'rhoptries golgi-derived, made de novo'__

      This has been fixed.

      9. __change 'new understand to'__

      This change has been made.

      10. __'rhoptry malformations' seem to be similar in sporozoites and merozoites. Is that surprising/new?__

      We assume this is in reference to the mention of "rhoptry malformations" in the abstract. In the RON11 merozoite study (PMID:39292724) the authors noted no gross rhoptry malformations, only that one was not formed/missing. The abstract sentence has been changed to the following to better reflect this nuance: "We show that stage-specific disruption of RON11 leads to the formation of sporozoites that only contain half the number of rhoptries of controls, like in merozoites; however, unlike in merozoites, the majority of rhoptries appear grossly malformed."

      11. __What is known about crossing the basal lamina. Were rhoptries thought to be involved in this process? Or is it proteins on the surface or in other secretory organelles?__

      We are unaware of any studies that specifically look at sporozoites crossing the SG basal lamina. A review, although now ~15 years old, stated that "No information is available as to how the sporozoites traverse the basal lamina" (PMID:19608457), and we are not aware of any further information since then. To try and better define our understanding of rhoptry secretion during SG invasion, we have added the following sentence:

      "It is currently unclear precisely when during these steps of SG invasion rhoptry proteins are required, but rhoptry secretion is thought to begin before in the haemolymph before SG invasion16." 12.

      __On page change/specify: 'wide range of parasite structures' __

      The structures observed have been listed: centriolar plaque, rhoptry, apical polar rings, rootlet fibre, basal complex, apicoplast.

      13. __On page 7: is Airyscan2 a particular method or a specific microscope?__

      Airyscan2 is a detector setup on Zeiss LSM microscopes. This was already detailed in the materials and methods section, but figure legends have been clarified to read: "...imaged on an LSM900 microscope with an Airyscan2 detector".

      14. __how large is RON11?__

      RON11 is 112 kDa in *P. berghei*, as noted in the text.

      15. __There is no causal link between ookinete invasion and oocyst developmental asynchrony__

      We have deleted the sentence that implied that ookinete invasion was responsible for oocyst asynchrony. This section now simply states that "Development of each oocyst within a midgut is asynchronous..."

      16. __First sentence of page 24 appears to contradict what is written in results. I don't understand the first two sentences in the paragraph titled Comparison between Plasmodium spp.__

      This sentence was worded confusingly, making it appear contradictory when that was not the intention. The sentence has been changed to more clearly support what is written in the discussion and now reads: "Our extensive analysis only found one additional ultrastructural difference between Plasmodium spp."

      __On page 25 or before the vast number of electron microscopy studies should be discussed and compared with the authors new data. __

      It is not entirely clear which new data should be specifically discussed based on this comment. However, we have added a new paragraph that broadly compares MoTissU-ExM and our findings with other imaging methods previously used on mosquito-stage malaria parasites:

      "*Comparison of MoTissU-ExM and other imaging modalities

      Prior to the development of MoTissU-ExM, imaging of mosquito-stage malaria parasites in situ had been performed using electron microscopy7,8,11,28, conventional immunofluorescence assays (IFA)10, and live-cell microscopy25. MoTissU-ExM offers significant advantages over electron microscopy techniques, especially volume electron microscopy, in terms of accessibility, throughput, and detection of multiple targets. While we have benchmarked many of our observations against previous electron microscopy studies, the intracellular detail that can be observed by MoTissU-ExM is not as clear as electron microscopy. For example, previous electron microscopy studies have observed Golgi-derived vesicles trafficking along the rootlet fibre8 and distinguished the apical polar rings44; both of which we could not observe using MoTissU-ExM. Compared to conventional IFA, MoTissU-ExM dramatically improves the number and detail of parasite structures/organelles that can be visualised while maintaining the flexibility of target detection. By contrast, it can be difficult or impossible to reliably quantify fluorescence intensity in samples prepared by expansion microscopy, something that is routine for conventional IFA. For studying temporally complex processes, live-cell microscopy is the 'gold-standard' and there are some processes that fundamentally cannot be studied or observed in fixed cells. We attempt to increase the utility of MoTissU-ExM in discerning temporal relationships through the development of the segmentation score but note that this cannot be applied to the majority of oocyst development. Collectively, MoTissU-ExM offers some benefits over these previously applied techniques but does not replace them and instead serves as a novel and complementary tool in studying the cell biology of mosquito-stage malaria parasites.**"

      *

      __First sentence on page 27: there are many studies on parasite proteins involved in salivary gland invasion that could be mentioned/discussed. __

      The sentence in question is "To the best of our knowledge, the ability of sporozoites to cross the basal lamina and accumulate in the SG intercellular space has never previously been reported."

      This sentence has now been changed to read as follows: "While numerous studies have characterized proteins whose disruption inhibited SG invasion9,10,15,59-63, to the best of our knowledge the ability of sporozoites to cross the basal lamina and accumulate in the SG intercellular space has never previously been reported."

      __On page 10 I suggest to qualify the statement 'oocyst development has typcially been inferred by'. There seem a few studies that show that size doesn't reflect maturation. __

      In our opinion, this statement is already qualified in the following sentence, which reads: "Recent studies have shown that while oocysts increase in size initially, their size eventually plateaus (11 days post infection (dpi) in P. falciparum4)."

      __On page 16 the authors state that different rhoptries might have different function. This is an interesting hypothesis/result that could be mentioned in the abstract. __

      The abstract already contains the following statement: "...and provide the first evidence that rhoptry pairs are specialised for different invasion events." We see this as an equivalent statement.


      Experimental changes (Reviewer #1)

      1. On page 19: do the parasites with the RON11 knockout only have the cytoplasmic or only the apical rhoptries?

      The answer to this is not completely clear. We have added the following data to Figures 6 and 8 where we quantify the proportion of rhoptries that are either apical or cytoplasmic: In both wildtype parasites and RON11ctrl parasites, oocyst spz rhoptries are roughly 50:50 apical:cytoplasmic (with a small but consistent majority apical), while almost all rhoptries are found at the apical end (>90%) in SG spz. Presumably, after the initial apical rhoptries are 'used up' during SG invasion, the rhoptries that were previously cytoplasmic take their place. In RON11cKD the ratio of apical:cytoplasmic rhoptries is fairly similar to control oocyst spz. In RON11cKD SG spz, the proportion of cytoplasmic rhoptries decreases but not to the same extent as in wildtype or RON11Ctrl. From this, we infer that the two rhoptries that are lost/not made in RON11cKD sporozoites are likely a combination of both the apical and cytoplasmic rhoptries we find in control sporozoites.

      __in panel G: Are the dense granules not micronemes? What are the dark lines? Rhoptries?? __

      We have labelled all of Figure 1 more clearly to point out that the 'dark lines' are indeed rhoptries. Additionally, we have renamed the 'protein-dense granules' to 'protein-rich granules', as the original name seemed to suggest that these structures are dense granules, the secretory organelle. At this stage we simply do not know what all of these granules are. The observation that some but not all of these granules contain CSP (Supplementary Figure 2) suggests that they may represent heterogeneous structures. It is indeed possible that some are micronemes; however, we think it is unlikely that they are all micronemes for a number of reasons: (1) micronemes are not nearly this protein dense in other Plasmodium lifecycle stages, (2) some of them carry CSP, which has not been demonstrated to be micronemal, and (3) very few of these granules are present in SG sporozoites, which would be unexpected because microneme secretion is required for hepatocyte invasion.

      __Figure 2 seems to add little extra compared to the following figures and could in my view go to the supplement. __

      We agree that Figure 2b adds little and so have moved it to Supplementary Figure 2, but we think that the relative ease with which it can be determined whether sporozoites are in the secretory cavity or in an SG epithelial cell is a key observation, given the difficulty of doing this by conventional IFA.

      __On page 8 the authors mention a second layer of CSP but do not further investigate it. It is likely hard to investigate this further but to just let it stand as it is seems unsatisfactory, considering that CSP is the malaria vaccine. What happens if you add anti-CSP antibodies? I would suggest to shorten the opening paragraphs of this paper and to focus on the rhoptries. This could be done be toning down the text on all aspects that are not rhoptries and point to the open question some of the observations such as the CSP layers raise for future studies. __

      When writing the manuscript, we were unsure whether to include this data at all as it is a purely incidental finding. We had no intention of investigating CSP specifically, but anti-CSP antibodies were included in most of the salivary gland imaging experiments so we could more easily find sporozoites. Given the tremendous importance of CSP to the field, we figured that these observations were potentially important enough that they should be reported in the literature even though they are not something we have the intention or resources to investigate subsequently. Additionally, after consultation with other microscopists we think there is a reasonable chance that this double-layer effect could be a product of chemical fixation. To account for this, we have qualified the paragraph on CSP with this sentence:

      "We cannot determine if there is any functional significance of this second CSP layer and considering that it has not been observed previously it may well represent an artefact of chemical (paraformaldehyde) fixation."

      __Maybe include more detail of the differences between species on rhoptry structure into Figure 4. I would encourage to move the Data on rhoptries in Figure S6 to the main text ie to Figure 4. __

We have moved the images of developing rhoptries in *P. falciparum* (previously Figure S6a and b) into Figure 4, which now looks as follows:

Figure S8 (previously S6c) now consists only of the MG sporozoite rhoptry quantification.

      Manuscript structural changes (Reviewer #1)

      1. Abstract: don't focus on technique but on the questions you tried to answer (ie rewrite or delete the 3rd and 4th sentence)

      2. 'range of cell biology processes' - I understand the paper that the key discovery concerns rhoptry biogenesis and function, so focus on that, all other aspects appear rather peripheral.

      3. 'Much of this study focuses on the secretory organelles': I would suggest to rewrite the intro to focus solely on those, which yield interesting findings.

4. Page 11: I am tempted to suggest the authors start their study with Figure 3 and add panel A from Figure 2 to it. This leads directly to their nice work on rhoptries. Other features reported in Figures 1 and 2 are comparatively less exciting and could be moved to the supplement or reported in a separate study. Page 23: I suggest to delete the first sentence and focus on the functional aspects and the discoveries.

      5. __Maybe add a conclusion section rather than a future application section, which reads as if you want to promoted the use of ultrastructure expansion microscopy. To my taste the technological advance is a bit overplayed considering the many applications of this techniques over the last years, especially in parasitology, where it seems widely used. In any case, please delete 'extraordinarily' __

Response to Reviewer#1 manuscript structural changes 1-5: This reviewer considers the findings related to rhoptry biology as the most significant aspect of the study and suggests rewriting the manuscript to emphasize these findings specifically. Doing so might make the key findings easier to interpret. However, in our view, this approach could misrepresent how the study originated and what we see as the most important outcomes. We did not develop MoTissU-ExM specifically to investigate rhoptry biology. Instead, this technique was created independently of any particular biological question, and once established, we asked what questions it could answer, using rhoptry biology as a proof of concept. Given our previous work and available resources, we chose to focus on rhoptry biology. Since this work was driven by basic research rather than a specific hypothesis, it is important to acknowledge this in the manuscript. While we agree that the findings related to rhoptry biology are valuable, we believe that highlighting the technique's ability to observe organelles, structures, and phenotypes with unprecedented ease and detail is more important than emphasizing the rhoptry findings alone. For these reasons, we have decided not to restructure the manuscript as suggested.


      Reviewer #2

      Minor text changes (Reviewer #2)

      1. __The 'image Z-depth' value indicated in the figures is ambiguous. It is not clear whether this refers to the distance from the coverslip surface or the starting point of the z-stack image acquisition. A precise definition of this parameter would be beneficial. __

In the legend of Figure 1, the image Z-depth has been clarified as "sum distance of Z-slices in max intensity projection".

2. __Paragraph 3 of the introduction - line 7, "handful or proteins" should be handful of proteins __

This has been corrected.

3. __Paragraph 5 of the introduction - line 7, "also able to observed" should be observe __

This has been changed.

4. __In the final paragraph of the introduction - line 1, "leverage this new understand" should be understanding __

This has been fixed.

5. __The first paragraph of the discussion summary contains an incomplete sentence on line 7, "PbRON11ctrl-infected SGs." __

This has been removed.

6. __The second paragraph of the discussion - line 10, "until cytokinesis beings" should be begins __

This mistake has been corrected.

7. __One minor point that author suggest that oocyst diameter is not appropriate for the development of sporozoite develop. This is not so true as oocyst diameter tells between cell division and cell growth so it is important parameter especially where the proliferation with oocyst does not take place but the growth of oocyst takes place. __

      We agree that this was not highlighted enough in the text. The final sentence of the results section about this now reads:

"While diameter is a useful readout for oocyst development in the early stages of its growth, this suggests that diameter is a poor readout for oocyst development once sporozoite formation has begun and highlights the usefulness of the segmentation score as an alternative." The final sentence of the discussion section about this now reads: "Considering that oocyst size does not plateau until cytokinesis begins4, measuring oocyst diameter may represent a useful biological clock specifically when investigating the early stages of oocyst development."

8. __How is the apical polarity different to merozoite as some conoid genes are present in ookinete and sporozoite but not in merozoite. __

Our hypothesis is that apical polarity is established by the positioning and attachment of the centriolar plaque to the parasite plasma membrane in both forming merozoites and sporozoites. While the apical polar ring proteins are obviously present at the apical end, and have important functions, we think that they themselves are unlikely to regulate polarity establishment directly. Additionally, it seems that the apical polar rings are visible in forming sporozoites far before the comparable stages of merozoite formation. An important note here is that at this point these are largely inferences based on observational differences, and there is relatively little functional data on proteins that regulate polarity establishment at any stage of the *Plasmodium* lifecycle.

9. __Therefore, I think that electron microscopy remains essential for the observation of such ultra-fine structures __

We have added a paragraph in the discussion that provides a clearer comparison between MoTissU-ExM and other imaging modalities previously applied to mosquito-stage parasites (see response to Reviewer#1 (Minor text changes) comment #17).

10. __The author have not mentioned that sometimes the stage oocyst development is also dependent on the age of mosquito and it vary between different mosquito gut even if the blood feed is done on same day. __

      In our opinion this can be inferred through the more general statement that "development of each oocyst within a midgut is asynchronous..."


      Figure changes (Reviewer #2)

      1. __Fig 3B: stage 2 and 6 does not show the DNA cyan, it would-be good show the sate of DNA at that particular stage, especially at stage 2 when APR is visible. And box the segment in the parent picture whose subset is enlarged below it. __

We completely agree with the reviewer that the stage 2 image would benefit from the addition of a DNA stain. Many of the images in Figure 3b were done on samples that did not have a DNA stain, and so in these *P. yoelii* samples we did not find examples of all segmentation scores with the DNA stain. Examples of segmentation scores 2 and 6 for *P. berghei*, and 6 for *P. falciparum*, can be found with DNA stains in Figure S8.

2. __For clarity, it would be helpful to add indicators for the centriolar plaques in Figure 1b, as their locations are not immediately obvious. __

The CPs in Figure 1a and 1b have been circled on the NHS-ester-only panel for clarity.

3. __Regarding Figure 1c, the authors state that 'the rootlet fiber is visible'. However, such a structure cannot be confirmed from the provided NHS ester image. Can the authors present a clearer image where the rootlet fibre is more distinct? Furthermore, please provide the basis for identifying this structure as a rootlet fiber based on the NHS ester observation alone. __

      The image in Figure 1c has been replaced with one that more clearly shows the rootlet fibre.

      Based on electron microscopy studies, the rootlet fibre has been defined as a protein dense structure that connects the centriolar plaque to the apical polar rings (PMID: 17908361). Through NHS ester and tubulin staining, we could identify the apical polar rings and centriolar plaque as sites on the apical end of the parasite and nucleus that microtubules are nucleated from. There is a protein dense fibre that connects these two structures. Based on the fact that the protein density of this structure was previously considered sufficient for its identification by electron microscopy, we consider its visualisation by NHS ester staining sufficient for its identification by U-ExM.

4. __Fig 1B - could the tubulin image in the hemispindle panel be made brighter? __

      The tubulin staining in this panel was not saturated, and so this change has been made.

5. __Fig 4A - the green text in the first image panel is not visible. Also, the cyan text in the 3rd image in Fig 1A is also difficult to see. There's a few places where this is the case __

      We have made all microscopy labels legible at least when printed in A4/Letter size.

6. __Fig 6A - how do the authors know ron11 expression is reduced by 99%? Did they test this themselves or rely on data from the lab that gifted them the construct? Also please provide mention the number of oocyst and sporozoites were observed. __

The way Figure 6a was previously designed and described was an oversight that wrongly suggested we had quantified a >99% reduction in *ron11* expression ourselves. The 99% reduction has been removed from Figure 6a and the corresponding part of the figure legend has been rewritten to emphasise that this was previously established:

      "(a) Schematic showing previously established Ron11Ctrl and Ron11cKD parasite lines where ron11 expression was reduced by >99%9."

As to the second part of the question, we did not independently test either protein- or RNA-level expression of RON11, but we were gifted the clonal parasite lines established by Prof. Ishino's lab in PMID: 31247198, not just the genetic constructs.

7. __Fig 6E - are the data point colours the wrong way round on this graph? Just looking at the graph it looks as though the RON11cKD has more rhoptries than the control which does not match what is said in the text. __

      Thank you for pointing out this mistake, the colours have now been corrected.

8. __Fig S8C, PbRON11 ctrl, pie chart shows 89.7 % spz are present in the secretory cavity while the text shows 100 %, 35/35 __

The text saying 100% (35/35) only considered salivary glands that were infected (i.e. uninfected SGs were removed from the count). The two sentences that report this data have been clarified to reflect this better:

"Of *PbRON11ctrl* SGs that were infected (35/39), 100% (35/35) contained sporozoites in the secretory cavity (Figure S8c). Conversely, of infected *PbRON11cKD* SGs (59/82), only 24% (14/59) contained sporozoites within the secretory cavity (Figure S9d)."


9. __Fig S9D shows that RON11 ckd contains 17.1% sporozoites in secretory cavity while the text says 24%. __

      Please see the response to Reviewer#2 Figure Changes Comment #8 where this was addressed.


      Experimental changes (Reviewer #2)

      1. __Why do the congruent rhoptries have similar lengths to each other, while the dimorphic rhoptries have different lengths? Is this morphological difference related to the function of these rhoptries? __

We hypothesise that this morphological difference arises because the congruent rhoptries are 'used' during SG invasion, while the dimorphic rhoptries are utilised during hepatocyte invasion. It is not straightforward to test this functionally at this point, as no protein is known to have differential localisation between the two. Additionally, RON11 is likely directly involved in both SG and hepatocyte invasion through a secreted portion of the protein (as seen in RBC invasion). Therefore, RON11cKD sporozoites may have combined defects, meaning we cannot assume any defect is solely due to the absence of two rhoptries. Determining this functionally is of high interest to our research groups and remains an area of ongoing study, but it is beyond the scope of this study.

2. __Would it be possible to show whether RON11 localises to the dimorphic rhoptries, the congruent rhoptries, or both, by using expansion microscopy and a parasite line that expresses RON11 tagged with GFP or a peptide tag? __

We do not have access to a parasite line that expresses a tagged copy of RON11, or to anti-PbRON11 antibodies. Based on previously published localisation data, however, it seems likely that RON11 localises to both sets of rhoptries. Below are excerpts from Figure 1c of PMID: 31247198, where RON11 (in green) seems to have a more basally-extended localisation in midgut (MG) sporozoites than in salivary gland (SG) sporozoites. From this we infer that in the MG sporozoite RON11 is present in both pairs of rhoptries, but only in the one remaining pair in the SG sporozoite.


3. __The knockdown of RON11 disrupts the rhoptry structure, making the dimorphic and congruent rhoptries indistinguishable. Does this suggest that RON11 is important for the formation of both types of rhoptries? I believe that it would be crucial to confirm whether RON11 localises to all rhoptries or is restricted to specific rhoptries for a more precise discussion of RON11's function. __

Based on our analysis, it does indeed seem that RON11 is important for both types of rhoptries, as when RON11 is not expressed sporozoites still have both apical and cytoplasmic rhoptries (i.e. not just one pair is lost; see Reviewer #1 Experimental changes comment #1).

4. __The authors state that 64% of RON11cKD SG sporozoites contained no rhoptries at all. Does this mean RON11cKD SG sporozoites used up all rhoptries corresponding to the dimorphic and congruent pairs during SG invasion? If so, this contradicts your claims that sporozoites are 'leaving the dimorphic rhoptries for hepatocyte invasion' and that 'rhoptry pairs are specialized for different invasion events'. If that is not the case, does it mean that RON11cKD sporozoites failed to form the rhoptries corresponding to the dimorphic pair? A more detailed discussion would be needed on this point and, as I mentioned above, on the specific role of RON11 in the formation of each rhoptry pair. __

We do not agree that this constitutes a contradiction; instead, more nuance is needed to fully explain the phenotype. As shown in the new graph added in response to Reviewer#1 Figure changes comment #1, in RON11cKD oocyst sporozoites 64% of all rhoptries are located at the apical end. Our hypothesis is that these rhoptries are used for SG invasion and, therefore, would not be present in RON11cKD SG sporozoites. Consequently, the fact that 64% of RON11cKD sporozoites lack rhoptries is exactly what we would expect. Essentially, we predict three slightly different 'pathways' for RON11cKD sporozoites: if they had two apical rhoptries in the oocyst, we predict they would have zero rhoptries in the SG; if they had two cytoplasmic rhoptries in the oocyst, two rhoptries in the SG; and if they had one apical and one cytoplasmic rhoptry in the oocyst, one rhoptry in the SG. In any case, we expect the apical rhoptries to be 'used up', which appears to be supported by the data.

5. __Out of pure curiosity, is it possible to measure the length and number of subpellicular microtubules in the sporozoites observed in this study using expansion microscopy? __

      We have performed an analysis of subpellicular microtubules which is now included as Supplementary Figure 2. We could not always distinguish every SPMT from each other and so have not quantified SPMT number. We have, however, quantified their absolute length on both the 'long side' and 'short side', their relative length (as % sporozoite length) and the degree to which they are polyglutamylated.

A description of this analysis is now found in the results section as follows: *"We quantified the length and degree of polyglutamylation of SPMTs on the 'long side' and 'short side' of the sporozoite (Figure S2). 'Short side' SPMTs were on average 33% shorter (mean = 3.6 µm ± SD 1.0 µm) than 'long side' SPMTs (mean = 5.3 µm ± SD 1.5 µm) and extended 17.4% less of the total sporozoite length. While 'short side' SPMTs were significantly shorter, a greater proportion of their length (87.9% ± SD 11.2%) was polyglutamylated compared to 'long side' SPMTs (69.4% ± SD 13.8%)."*

Supplementary Figure 2: Analysis of sporozoite subpellicular microtubules. Isolated *P. yoelii* salivary gland sporozoites were prepared by U-ExM and stained with anti-tubulin (microtubules) and anti-PolyE (polyglutamylated SPMTs) antibodies. SPMTs were defined as being on either the 'long side' (nucleus distant from plasma membrane) or 'short side' (nucleus close to plasma membrane) of the sporozoite as depicted in Figure 1f. (a) SPMT length along with (b) SPMT length as a proportion of sporozoite length were both measured. (c) Additionally, the proportion of the SPMT that was polyglutamylated was measured. Analysis comprises 25 SPMTs (11 long side, 14 short side) from 6 SG sporozoites. ** = p

The following section has also been added to the methods to describe this analysis:

"Subpellicular microtubule measurement

• To measure subpellicular microtubule length and polyglutamylation, maximum intensity projections were made of sporozoites stained with NHS Ester, anti-tubulin and anti-PolyE antibodies, and SYTOX Deep Red. The side where the nucleus was closest to the parasite plasma membrane was defined as the 'short side', while the side where the nucleus was furthest from the parasite plasma membrane was defined as the 'long side'. Subpellicular microtubules were then measured using a spline contour from the apical end of the sporozoite to the basal-most end of the microtubule with fluorescence intensity across the contour plotted (Zeiss ZEN 3.8). Sporozoite length was defined as the distance from the sporozoite apical polar rings to the basal complex, measuring through the centre of the cytoplasm. The percentage of the subpellicular microtubule that was polyglutamylated was determined by assessing when along the subpellicular microtubule contour the anti-PolyE fluorescence intensity last dropped below a pre-defined threshold."

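As a minimal illustration of the thresholding step described in this methods text, the calculation could be sketched as below. This is a hypothetical re-implementation for clarity only; the actual measurements were performed on spline contours in Zeiss ZEN 3.8, and the function name and inputs are our own assumptions.

```python
def polyglutamylated_fraction(polye_intensity, threshold):
    """Estimate the polyglutamylated fraction of one SPMT.

    polye_intensity: anti-PolyE fluorescence sampled at equally spaced
    points along the microtubule contour, apical end first.
    threshold: the pre-defined intensity cutoff.

    Following the methods text, the polyglutamylated stretch is taken to
    end where the intensity last drops below the threshold.
    """
    last_above = -1
    for i, value in enumerate(polye_intensity):
        if value >= threshold:
            last_above = i  # last contour point still at or above threshold
    # If the signal never reaches the threshold, report 0%.
    return 0.0 if last_above < 0 else (last_above + 1) / len(polye_intensity)
```

For example, a profile that stays above the threshold over the first half of the contour and below it thereafter would yield a fraction of roughly 0.5.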

      __In addition to the previous point, in the text accompanying Figure 7a, the authors claim that "64% of PbRON11cKD SG sporozoites contained no rhoptries at all, while 9% contained 1 rhoptry and 27% contained 2 rhoptries". Could this data be used to infer which rhoptry pair are missing from the RON11cKD oocyst sporozoites? Can it be inferred that the 64% of salivary gland sporozoites that had no rhoptries in fact had 2 congruent rhoptries in the oocyst sporozoite stage and that these have been discharged already? __

      Please see the response to Reviewer #2 Experimental Changes Comment #4.

7. __Is it possible that the dimorphic rhoptries are simply precursors to the congruent rhoptries? Could it be that after the congruent rhoptries are used for SG invasion, new congruent rhoptries are formed from the dimorphic ones and are then used for the next invasion? Would it be possible to investigate this by isolating sporozoites some time after they have invaded the SG and performing expansion microscopy? This would allow you to confirm whether the dimorphic rhoptries truly remain in the same form, or if new congruent rhoptries have been formed, or if there have been any other changes to the morphology of the dimorphic rhoptries. __

In theory, it is possible that the dimorphic rhoptries are precursors to the congruent rhoptries; specifically, the larger of the two in the dimorphic pair might be a precursor. Perhaps the smaller one is too, but we have no evidence to suggest that this rhoptry lengthens after SG invasion. We are interested in isolating sporozoites from SGs to add a temporal perspective, but currently this is not feasible. When sporozoites are isolated from SGs, they are collected at all stages of invasion. Additionally, we do not know how long each step of SG invasion takes, so a time-based method might not be effective either. We are developing an assay to better determine the timing of events during SG invasion with MoTissU-ExM, but this is beyond the scope of this study.

8. __In the section titled "Presence of PbRON11cKD sporozoites in the SG intercellular space", the authors state that "the majority of PbRON11cKD-infected mosquitoes contained some sporozoites in their SGs, but these sporozoites were rarely inside either the SG epithelial cell or secretory cavity". - this is suggestive of an invasion defect as the authors suggest. Could the authors collect these sporozoites and see if liver hepatocyte infection can be established by the mutant sporozoites? They previously speculate that the two different types of rhoptries (congruent and dimorphic) may be specific to the two invasion events (salivary gland epithelial cell and liver cell infection). __

It has already been shown that RON11cKD sporozoites fail to invade hepatocytes (PMID: 31247198), even when isolated from the haemolymph, so it seems very unlikely that they would be invasive following SG isolation. As mentioned in the discussion, RON11 in merozoites has a 'dual function', being partially secreted during merozoite invasion in addition to its rhoptry biogenesis functions. Assuming this is also the case in sporozoites, using the RON11cKD parasite line we cannot differentiate these two functions and therefore cannot ascribe invasion defects purely to issues with rhoptry biogenesis. In order to answer this question functionally, we would need to identify a protein that has roles only in rhoptry biogenesis and not in invasion directly.

      Reviewer #3

      Minor text changes (Reviewer #3)

      1. __Page 3 last paragraph: ...the molecular mechanisms underlying SG (invasion?) are poorly understood. __

This has been corrected.

2. __The term "APR" does not refer to a tubulin structure per se, but rather to the proteinaceous structure to which tubulin anchors. Are there any specific APR markers that can be used in Figure 1C? If not, I recommend avoiding the use of "APR" in this context. __

The text does not state that the APR is a tubulin structure. Given that it is a proteinaceous structure, we visualise the APRs through protein density (NHS Ester). It has been standard for decades to define APRs by protein density using electron microscopy, and it has previously been sufficient in *Plasmodium* using expansion microscopy (PMIDs: 41542479, 33705377), so it is unclear why it should not be done in this study.

3. __I politely disagree with the bold statements 'Little is known about cell biology of sporozoite formation.....from electron microscopy studies now decades old' (p.3, 2nd paragraph); 'To date, only a handful of (instead of 'or') proteins have been implicated in SG invasion' (p. 4, 1st paragraph). These claims may overlook existing studies; a more thorough review of the literature is recommended. __

This study includes at least 50 references from papers broadly related to sporozoite biology, covering publications from every decade since the 1970s. The most recent review that discusses salivary gland invasion cites 11 proteins involved in SG invasion. We have replaced "handful" with a more precise term, as it was not the best choice of word, but it is hardly an exaggeration.


      Figure changes (Reviewer #3)

      1. __The hypothesis that Plasmodium utilizes two distinct rhoptry pairs for invading the salivary gland and liver cells is intriguing but remains clearly speculative. Are the "cytoplasmic pair" and "docked pair" composed of the same secretory proteins? Are the paired rhoptries identical? How does the parasite determine which pair to use for salivary gland versus liver cell invasion? Is there any experimental evidence showing that the second pair is activated upon successful liver cell invasion? Without such data this hypothesis seems rather premature. __

We are unaware of any direct protein localisation evidence suggesting that the rhoptry pairs may carry different cargo. However, only a few proteins have been localised in a way that would allow us to determine if they are associated with distinct rhoptry pairs, so this possibility cannot be ruled out either. It seems unlikely that the parasite 'selects' a specific pair, as rhoptries are typically found at the apical end. What appears more plausible is that the "docked pair" forms first and immediately occupies the apical docking site, preventing the cytoplasmic pair from docking there. Regarding any evidence that the second pair is activated during liver cell invasion, it has been well documented over decades that rhoptries are involved in hepatocyte invasion. If the dimorphic rhoptries are the only ones present in the parasite during hepatocyte invasion, then they must be used for this process.

2. __The quality of the "Roolet fibre" image is not good and resembles background noise from PolyE staining. Additional or alternative images should be provided to convincingly demonstrate that PolyE staining indeed visualizes the Roolet fibre. It is puzzling that the structure is visible with PolyE staining but not with tubulin staining. __

We believe this is a misinterpretation based on the image previously provided in Figure 1c. Our intention was not to imply that PolyE staining enables us to see the rootlet fibre, but rather that PolyE and tubulin allow us to see the APR to which the rootlet fibre is connected. There is some PolyE staining, likely corresponding to the early SPMTs, that in 1c appears to run along the rootlet fibre, but this is a product of the max-intensity projection. Please see Reviewer #2 Figure Changes Comment #3 for the updated Figure 1c.

3. __More arrows should be added to Figures 6b and 6c to guide readers and improve clarity. __

We have added arrows to Figure 6b and 6c which point out more clearly what we have defined as normal and aberrant rhoptries. These panels now look like this:

4. __Figure 2a zoomed image of P. yoelii infected SG is different than the highligted square. __

We agree that the highlighted square and the zoomed area appear different, but this is due to the differing amounts of light captured by the objectives used in these two panels. The entire SG panel was captured with a 5x objective, while the zoomed panel was captured with a 63x objective. Because of this difference, the plane of focus of the zoomed area is hard to distinguish in the whole SG image. The zoomed image is on the 'top' of the SG (closest to the coverslip), while most of the signal in the whole SG image comes from the 'middle' of the SG. To demonstrate this more clearly, we have provided the exact region of interest shown in the 63x image alongside a 5x image and an additional 20x image, all of which are clearly superimposable.

5. __Figure 3 legend: "P. yoelii infected midguts harvested on day 15" should be corrected. More general, yes, "...development of each oocyst within a single midgut is asynchronous." but it is still required to provide the dissection days. __

We are unsure what the suggested change here is. We do not see what is wrong with the statement about day 15 post-infection; that is when these midguts were dissected.

Experimental Changes (Reviewer #3)

      1. __The proposed role of AOR in rhoptry biogenesis appears highly speculative. It is unclear how the authors conclude that "AORs carry rhoptry cargo" solely based on the presence of RON4 within the structure. Inclusion of additional markers to characterize the content of AOR and rhoptries will be essential to substantiate the hypothesis that this enigmatic structure supports rhoptry biogenesis. __

It is important to note that the hypothesis that AORs, or rhoptry anlagen, carry rhoptry cargo and serve as vehicles of rhoptry biogenesis was proposed long before this study (PMID: 17908361). In that study, it was assumed that structures now called AORs or rhoptry anlagen were developing rhoptries. Although often visualised by EM and presumed to carry rhoptry cargo (PMID: 33600048, 26565797, 25438048), it was only more recently that AORs became the subject of dedicated investigation (PMID: 31805442), where the authors stated that "...AORs could be immature rhoptr[ies]...". We therefore consider our observation that AORs contain the rhoptry protein RON4, which is not known to localize to any other organelle, sufficient to conclude that AORs carry rhoptry cargo and are thus vehicles for rhoptry biogenesis.

2. __The study of RON11 appears to be a continuation of previous work by a collaborator in the same group. However, neither this study nor the previous one adequately addresses the evolutionary context or structural characteristics of RON11. Notably, the presence of an EF-hand motif is an important feature, especially considering the critical role of calcium signaling in parasite stage conversion. Given the absence of a clear ortholog, it would be interesting to know whether other Apicomplexan parasites harbor rhoptry proteins with transmembrane domains and EF-hand motifs, and if these proteins might respond similarly to calcium stimulation. Investigating mutations within the EF-hand domain could provide valuable functional insights into RON11. __

We are unsure what suggests that RON11 lacks a clear orthologue. RON11 is conserved across all apicomplexans and is also present in Vitrella brassicaformis (OrthoMCL orthogroup: OG7_0028843). A phylogenetic comparison of RON11 across apicomplexans has previously been performed (PMID: 31247198), and our study provides a structural prediction of PbRON11 with the dual EF-hand domains annotated (Supplementary Figure 9).

3. __The study cannot directly confirm that membrane fusion occurs between rhoptries and AORs. __

      This is already stated verbatim in the results "Our data cannot directly confirm that membrane fusion occurs between rhoptries and AORs..." 4.

      __It is unclear what leads to the formation of the aberrant rhoptries observed in RON11cKD sporozoites, since mosquitoes were not screened for infection prior to salivary gland dissection. The defects reported for RON11 knockdown do not aid in interpreting rhoptry pair specialization, as there was no consistent trend as to which rhoptry pair was missing in RON11cKD oocyst sporozoites. The notion that RON11cKD parasites likely have 'combinatorial defects that affect both rhoptry biogenesis and invasion' poses challenges to understanding the molecular role(s) of RON11 in biogenesis versus invasion. Of note, RON11 also plays a role in merozoite invasion. __

      We are unclear about the comment or suggestion here, as the claims that RON11cKD does not help interpret rhoptry pair specialization, and that these parasites have combined defects, are both directly stated in the manuscript.

      __Do all SG PbRON11cKD sporozoites lose their reduced number of rhoptries during SG invasion as in Figure 7a (no rhoptries)? __

      Not all RON11cKD SG sporozoites 'use up' their rhoptries during SG invasion. This is quantified in both Figure 7a and the text, which states: "64% of *PbRON11cKD* SG sporozoites contained no rhoptries at all, while 9% contained 1 rhoptry and 27% contained 2 rhoptries."

      __Different mosquito species/strains are used for P. yoelii, P. berghei, and P. falciparum. Does it affect oocyst sizes/stages? Is it ok to compare? __

      We agree that a direct comparison between, for example, *P. yoelii* and *P. berghei* oocyst size would be inappropriate; however, Figure 3c and Supplementary Figure 4 are not direct comparisons between two species, but a summation of all oocysts measured in this study to indicate that the trends we observe transcend parasite/mosquito species differences. Our study was not set up with the experimental power to determine if mosquito host species alters oocyst size.

      __While I acknowledge that UExM has significantly advanced resolution capabilities in parasite studies, the value of standard microscopy technique should not be overlooked. Particularly, when discussing the function of RON11, relevant IFA and electron microscopy (EM) images should be included to support claims about RON11's role in rhoptry biogenesis. This would complement the UExM data and substantially strengthen the conclusions. Importantly, UExM can sometimes produce unexpected localization patterns due to the denaturation process, which warrants caution. __

      The purpose of this study is not to discredit, undermine, or supersede other imaging techniques. It is simply to use U-ExM to answer biological questions that cannot or have not been answered using other techniques. Please refer to Reviewer #1, Minor text changes, comment #17 for the new paragraph "Comparison of MoTissU-ExM and other imaging modalities" that addresses this.

      Both conventional IFA and immunoEM have already been performed on RON11 in sporozoites before (PMID: 31247198). When assessing defects caused by RON11 knockdown, conventional IFA isn't especially helpful because it doesn't allow visualization of individual rhoptries. Thin-section TEM also doesn't provide the whole-cell view needed to draw these kinds of conclusions. Volume EM could likely support these observations, but we don't have access to or expertise in this technique, and we believe it is beyond the scope of this study. It's also important to note that for the defect we observe (missing or abnormal rhoptries), the visualization with NHS ester isn't significantly different from what would be seen with EM-based techniques, where rhoptries are easily identified based on their protein density.

      The statement that "UExM can sometimes produce unexpected localisation patterns due to the denaturation process..." is partially correct but lacks important nuance in this context. Based on our extensive experience with U-ExM, there are two main reasons why the localisation of a single protein may look different when comparing U-ExM and traditional IFA images. First, denaturation: in conventional IFAs, antibodies need to recognize conformational epitopes to bind to their target, whereas in U-ExM, antibodies must recognize linear epitopes. This doesn't mean the target protein's localisation changes, only that the antibody's ability to recognize it does. Second, antibody complexes seem unable to freely diffuse out of the gel, which can result in highly fluorescent signals not related to the target protein appearing in the image, as we have previously reported (PMID: 36993603). Importantly, neither of these factors applies to our phenotypic analysis of RON11 knockdown. All phenotypes described are based solely on NHS Ester (total protein) staining, so the considerations about changes in the localisation of individual proteins are not relevant.

    1. So those Redditors suggested they spam the site with fake applications, poisoning the job application data, so Kellogg’s wouldn’t be able to figure out which applications were legitimate or not (we could consider this a form of trolling).

      This statement reminds us that data poisoning is not just a technical problem; it's also a social action strategy. By creating a large number of fake applications, the attack targets the "data entry points" of a company's screening and decision-making processes, causing the system to lose its ability to distinguish between genuine and fraudulent applications. While it can change action costs and slow down replacement processes in the short term, it also raises ethical concerns and risks: could it inadvertently harm genuine job applicants and lead to a broader collapse of trust? Therefore, when discussing this issue, it's necessary to simultaneously evaluate the motives, the affected parties, and the long-term consequences.

    2. Datasets can be poisoned unintentionally. For example, many scientists posted online surveys that people can get paid to take. Getting useful results depended on a wide range of people taking them. But when one TikToker’s video about taking them went viral, the surveys got filled out with mostly one narrow demographic, preventing many of the datasets from being used as intended.

      Yeah, that’s such a good example of “poisoning” that isn’t even malicious — it’s just the internet doing internet things. A dataset can get totally skewed if one group floods it, because suddenly the results stop representing the wider population the researchers were trying to study.

    1. COVID-19

      It's fascinating how data mining can reveal stories from simple numbers, like the connection between COVID-19 surges and candle reviews. However, the section on spurious correlations is a great reminder that just because two trends line up—like margarine consumption and divorce rates—it doesn't mean one actually causes the other.
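      A quick sketch of why lined-up trends can mislead: the numbers below are illustrative (loosely echoing the famous margarine-vs-divorce example, not real data), yet a plain Pearson correlation comes out near 1 even though neither series causes the other.

```python
# Two unrelated series that both happen to decline over ten years
# (made-up illustrative values, not real statistics).
margarine = [8.2, 7.0, 6.5, 5.3, 5.2, 4.0, 4.6, 4.5, 4.2, 3.7]
divorce   = [5.0, 4.7, 4.6, 4.4, 4.3, 4.1, 4.2, 4.2, 4.1, 4.1]

def pearson(x, y):
    """Plain-Python Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

r = pearson(margarine, divorce)
print(f"r = {r:.2f}")  # close to 1, despite no causal link
```

      High correlation here reflects only that both series trend the same way over time; any third variable (or pure coincidence) can produce that.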

    1. is just a clever sort of . . . thing to disguise the fact that we’ve become the most aggressively inarticulate generation to come along since . . . you know, a long, long time ago!

      His tone while writing this still shows that this generation has become one where many of us can't express ourselves freely without some type of pushback from people of generations who lived long before us.

    2. Or do we have, like, nothing to say? Has society become so, like, totally . . . I mean absolutely . . . You know? That we’ve just gotten to the point where it’s just, like . . . whatever!

      Now Taylor is telling us how the society that has been targeting his voice is ... you know. :)

    1. “death by a 1000 cuts”.  One small cut might seem inconsequential, but it is the everyday, continuous experience of being belittled, ignored, or mischaracterized that can become injurious if not deadly.

      Reading this reminded me of Taylor Swift's song "Death by a Thousand Cuts" where she perfectly captures the feeling of the breaking point that microaggressions can lead to. Popular culture has the unique ability to convey lasting messages through art. Taylor Swift has undoubtedly played an integral role in this. Her music gives words to feelings we often can't put words to. While both Taylor Swift and I are part of many privileged demographics, we can also relate to microaggressions in the sense that we are women.

      Simply the way our society is designed to confine women to their traditional role instead of encouraging them to thrive is an example of this. Taylor says in her song "My heart, my hips, my body, my love//Trying to find a part of me that you didn't touch." Society both subtly and unsubtly places expectations and policies on every aspect of women's lives. Like we discussed last class, just because things have gotten dramatically better for women, that does not mean they are equitable. Taylor describes the feeling that accompanies the exhaustion of this fight by saying "Gave you too much but it wasn't enough//But I'll be all right, it's just a thousand cuts." All of the small stabs women take on a daily basis might just lead to "death by a thousand cuts" if they are not addressed, and still we will persist as we bleed out.

    1. and in that moment you murder it with all the poison in your like softness you let it know that like this like this moment it’s like, uhm, you know me using my voice

      And then it ends with all the softness they believe she has to have, so that suddenly they finally listen to just her voice.

    2. It’s like rapes happen all the time on campuses but as soon as John Krakauer writes about it, suddenly it’s like innovative nonfiction and not like something girls are like making up for like attention

      This is just a bombshell and it all makes sense now. Girls face a lot of stuff but weren't believed, as supported by the earlier lines. Others only believe it because someone many people favor, such as the author John Krakauer, wrote about it. Now they treat the issue like it's a spectacle, a new genre of writing, or just pretend like they always knew and try to assume some sort of high ground. Meanwhile the girl who was trying to say that the whole time is still left forgotten, as if this is not her issue, one she isn't making up but actively suffered through.

    3. has only gotten in the way of helping them declare more shit about how they will never be forgotten like ever

      I think that the way the old white men, or just white men, "help" people like her (it's not help) is by just declaring nonsense that in turn serves their own egos, so they can go down in history as all-time greats while saying absolutely nothing.

    4. Invisible red pens and college degrees have been making their way into the middle of my sentences

      Like it was said earlier about the waitlist, people usually mark her sentences as wrong, since some believe that because they have a certain achievement they know more than the person they are talking to. It's like when a professor grades your work as all wrong even though it's all right, just not the way he wanted the work done, all because he's the teacher and you're the student.

    1. An expanded version of chaos theory, complexity theory looks to uncover a system's beginning (its “initial conditions”) and to trace development as variables and subsystems are added to combine and shape outcomes in ways that are unpredictable.

      What's fascinating about complexity theory is that it's not just about unpredictability... it's also about finding patterns within that unpredictability. Complex systems often exhibit self-organization, where order emerges without central control. Think of how cities develop distinct neighborhoods, how ecosystems stabilize into food webs, or how free markets coordinate resources without anyone orchestrating the whole system.

    1. We are still using these algorithms called humans that are really biased," says Ghani. "We've tested them and known that they're horrible, but we still use them to make really important decisions every day." These unregulated tools can harm individuals and society, causing anxiety, unnecessary medical expenses, stigmatization and worse. "It's the Wild West of genetics," says Erin Demo, a genetic counsellor at Sibley Heart Center Cardiology in Atlanta, Georgia. "This is just going to get harder and harder." Bellenson posted his app on GenePlaza, an online marketplace for DNA-interpretation tools, in early October. For US$5.50, a person could upload their genetic data (as supplied by consumer DNA sequencing companies such as 23andMe of Mountain View, California) and the app would place them along a Nature | Vol 574 | 31 October 2019 | 609 © 2019 Springer Nature Limited. All rights reserved.

      Scary

    2. we still use them to make really important decisions every day." These unregulated tools can harm individuals and society, causing anxiety, unnecessary medical expenses, stigmatization and worse. "It's the Wild West of genetics," says Erin Demo, a genetic counsellor at Sibley Heart Center Cardiology in Atlanta, Georgia. "This is just going to get harder and harder." Bellenson posted his app on GenePlaza, an online marketplace for DNA-interpretation tools, in early October. For US$5.50, a person could upload their genetic data (as supplied by consumer DNA sequencing companies such as 23andMe of Mountain View, California) and the app would place them along a

      In the long run, making bad decisions for your business, whatever it is, will not be good for that business. How do we stop looking at perceived short-term gains and look to improvement for the long term?

    1. Another term related to internet performance is latency. So we just talked about bandwidth, which is the throughput, or the speed at which you can upload and download information. Latency, then, is a measure of the time it takes a piece of data to reach its destination. It's typically measured in milliseconds, or thousandths of a second. A ping activity will measure the latency of your network. Ping is a networking utility used by network administrators to measure latency on the internet. So what you can do is go to a website and use their ping utility to test the reachability of certain hosts.
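      Since the real `ping` utility sends ICMP echo packets (and needs network access and often raw-socket privileges), here is a minimal, self-contained sketch of the same idea: time a round trip in milliseconds, using a TCP connect to a throwaway server on localhost. The server, port choice, and thread setup are all illustrative, not part of any real ping implementation.

```python
import socket
import threading
import time

def run_server(port_holder, ready):
    """Throwaway localhost server: accept one connection, then exit."""
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))            # let the OS pick a free port
    srv.listen(1)
    port_holder.append(srv.getsockname()[1])
    ready.set()                           # tell the client the port is known
    conn, _ = srv.accept()
    conn.close()
    srv.close()

port_holder, ready = [], threading.Event()
threading.Thread(target=run_server, args=(port_holder, ready)).start()
ready.wait()

# "Ping": time how long one connection round trip takes.
start = time.perf_counter()
cli = socket.create_connection(("127.0.0.1", port_holder[0]))
latency_ms = (time.perf_counter() - start) * 1000
cli.close()

print(f"latency: {latency_ms:.3f} ms")
```

      On the loopback interface this is typically a fraction of a millisecond; over the internet, the same measurement would be tens to hundreds of milliseconds.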

    2. Broadband access is definitely an issue in terms of thinking about internet access: you may have access to the internet but not broadband access, where broadband refers to high-speed internet service. That means you may not be able to use the web, or the resources that are on the web, as effectively. If you have broadband, that usually means you have it through a cable or DSL modem, where DSL, a digital subscriber line, is one connected via your home telephone network. You can see here there's about 72 percent broadband penetration within the OECD (the Organisation for Economic Co-operation and Development) countries, and you can see some of them listed there. You'll notice that a couple of them (Finland, Australia, Japan, Sweden, Denmark, even the United States) are just barely over that 100 percent mark. What that means is that there is more than one broadband connection for every person. For example, I have multiple connections: my cable network here at home, an internet service plan for one of my tablets, and cellular data plans for each of the phones in our house. If you start to think about the different plans you have for different devices, you can see how we're now getting over one hundred percent penetration. But notice that many countries, like the last one there on the list, Mexico, don't even have 25 percent broadband penetration yet. Since more than 25 percent of Mexico's population has internet access, that means some of those with internet access are using something slower than broadband.
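      The "over 100 percent" figure is just subscriptions per 100 inhabitants, so several plans per person push a country past 100. A tiny sketch of the arithmetic, with made-up numbers rather than OECD data:

```python
def penetration(subscriptions, population):
    """Broadband penetration as subscriptions per 100 inhabitants."""
    return subscriptions / population * 100

# A country with more plans than people exceeds 100 percent:
print(round(penetration(5_600_000, 5_500_000), 1))     # 101.8
# Fewer than one plan per four people stays under 25 percent:
print(round(penetration(30_000_000, 126_000_000), 1))  # 23.8
```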

    1. Most sociologists define health as the capacity to perform roles and tasks of everyday living, while acknowledging social differences in people’s perceptions of health

      Health is not just a biological state but a social one: it's measured by a person's ability to function in their roles. This part of the reading also touches on how health is socially constructed, meaning it is shaped by social factors like culture, age, gender, class, and experiences. Findings from Parsons, Twaddle, and Blaxter expand this idea that health is tied to social norms and expectations.

    2. Health is a broad concept with multiple dimensions.

      This paragraph points out why you can't just perceive health as one dimension, disease. Ware says it has six dimensions, I agree with these dimensions. I think it's important to define health in a way that accounts for physical, mental, social, functional, and perceptual components, not just the disease aspect.

  5. Jan 2026
    1. Living with family members often introduces additional time stresses. You may have family obligations that require careful time management. Use all the strategies described earlier, including family time in your daily plans the same as you would hours spent at work. Don’t assume that you’ll be “free” every hour you’re home, because family events or a family member’s need for your assistance may occur at unexpected times. Schedule your important academic work well ahead and in blocks of time you control. See also the earlier suggestions for controlling your space: you may need to use the library or another space to ensure you are not interrupted or distracted during important study times. Students with their own families are likely to feel time pressures. After all, you can’t just tell your partner or kids that you’ll see them in a couple years when you’re not so busy with job and college! In addition to all the planning and study strategies discussed so far, you also need to manage your family relationships and time spent with family. While there’s no magical solution for making more hours in the day, even with this added time pressure there are ways to balance your life well: Talk everything over with your family. If you’re going back to school, your family members may not have realized changes will occur. Don’t let them be shocked by sudden household changes. Keep communication lines open so that your partner and children feel they’re together with you in this new adventure. Eventually you will need their support. Work to enjoy your time together, whatever you’re doing. You may not have as much time together as previously, but cherish the time you do have—even if it’s washing dishes together or cleaning house. If you’ve been studying for two hours and need a break, spend the next ten minutes with family instead of checking e-mail or watching television. 
Ultimately, the important thing is being together, not going out to movies or dinners or the special things you used to do when you had more time. Look forward to being with family and appreciate every moment you are together, and they will share your attitude.

      The main idea of this passage is that students with family responsibilities must plan their time carefully to balance academics and family life. It emphasizes scheduling study time in blocks you control, using distraction-free spaces, and including family time in your plans. Open communication with family and valuing the time you do spend together helps maintain support and harmony while managing school, work, and family obligations.

    2. Here are some more tips for successful schedule planning: Studying is often most effective immediately after a class meeting. If your schedule allows, block out appropriate study time after class periods. Be realistic about time when you make your schedule. If your class runs to four o’clock and it takes you twenty minutes to wrap things up and reach your study location, don’t figure you’ll have a full hour of study between four o’clock and five o’clock. Don’t overdo it. Few people can study four or five hours nonstop, and scheduling extended time periods like that may just set you up for failure. Schedule social events that occur at set times, but just leave holes in the schedule for other activities. Enjoy those open times and recharge your energies! Try to schedule some time for exercise at least three days a week. Plan to use your time between classes wisely. If three days a week you have the same hour free between two classes, what should you do with those three hours? Maybe you need to eat, walk across campus, or run an errand. But say you have an average forty minutes free at that time on each day. Instead of just frittering the time away, use it to review your notes from the previous class or for the coming class or to read a short assignment. Over the whole term, that forty minutes three times a week adds up to a lot of study time. If a study activity is taking longer than you had scheduled, look ahead and adjust your weekly planner to prevent the stress of feeling behind. If you maintain your schedule on your computer or smartphone, it’s still a good idea to print and carry it with you. Don’t risk losing valuable study time if you’re away from the device. If you’re not paying close attention to everything in your planner, use a colored highlighter to mark the times blocked out for really important things. When following your schedule, pay attention to starting and stopping times. 
If you planned to start your test review at four o’clock after an hour of reading for a different class, don’t let the reading run long and take time away from studying for the test.

      Effective schedule planning means studying at the right times, especially soon after class when the material is fresh. Being realistic about how much time you truly have helps avoid stress and disappointment. Short, focused study sessions work better than long hours without breaks, and free moments between classes can add up to valuable study time. A good schedule also leaves room for social life, exercise, and rest to stay balanced and motivated.

    3. Calendar Planners and To-Do Lists Calendar planners and to-do lists are effective ways to organize your time. Many types of academic planners are commercially available (check your college bookstore), or you can make your own. Some people like a page for each day, and some like a week at a time. Some use computer calendars and planners. Almost any system will work well if you use it consistently. Some college students think they don’t need to actually write down their schedule and daily to-do lists. They’ve always kept it in their head before, so why write it down in a planner now? Some first-year students were talking about this one day in a study group, and one bragged that she had never had to write down her calendar because she never forgot dates. Another student reminded her how she’d forgotten a preregistration date and missed taking a course she really wanted because the class was full by the time she went online to register. “Well,” she said, “except for that time, I never forget anything!” Of course, none of us ever forgets anything—until we do. Calendars and planners help you look ahead and write in important dates and deadlines so you don’t forget. But it’s just as important to use the planner to schedule your own time, not just deadlines. For example, you’ll learn later that the most effective way to study for an exam is to study in several short periods over several days. You can easily do this by choosing time slots in your weekly planner over several days that you will commit to studying for this test. You don’t need to fill every time slot, or to schedule every single thing that you do, but the more carefully and consistently you use your planner, the more successfully will you manage your time. But a planner cannot contain every single thing that may occur in a day. We’d go crazy if we tried to schedule every telephone call, every e-mail, every bill to pay, every trip to the grocery store. 
For these items, we use a to-do list, which may be kept on a separate page in the planner. Check the example of a weekly planner form in Figure 2.5 “Weekly Planner”. (You can copy this page and use it to begin your schedule planning. By using this first, you will find out whether these time slots are big enough for you or whether you’d prefer a separate planner page for each day.) Fill in this planner form for next week. First write in all your class meeting times; your work or volunteer schedule; and your usual hours for sleep, family activities, and any other activities at fixed times. Don’t forget time needed for transportation, meals, and so on. Your first goal is to find all the blocks of “free time” that are left over. Remember that this is an academic planner. Don’t try to schedule in everything in your life—this is to plan ahead to use your study time most effectively. Next, check the syllabus for each of your courses and write important dates in the planner. If your planner has pages for the whole term, write in all exams and deadlines. Use red ink or a highlighter for these key dates. Write them in the hour slot for the class when the test occurs or when the paper is due, for example. (If you don’t yet have a planner large enough for the whole term, use Figure 2.5 “Weekly Planner” and write any deadlines for your second week in the margin to the right. You need to know what’s coming next week to help schedule how you’re studying this week.)

      Calendar planners and to-do lists help students organize their time and avoid forgetting important dates. Writing schedules down is more reliable than keeping everything in your head, because everyone forgets things sometimes. Planners are not only for deadlines but also for scheduling study time in advance so work is spread out and less stressful. To-do lists are useful for smaller daily tasks that don’t fit into a planner, helping you stay organized without feeling overwhelmed.

    4. Procrastination is a way of thinking that lets one put off doing something that should be done now. This can happen to anyone at any time. It’s like a voice inside your head keeps coming up with these brilliant ideas for things to do right now other than studying: “I really ought to get this room cleaned up before I study” or “I can study anytime, but tonight’s the only chance I have to do X.” That voice is also very good at rationalizing: “I really don’t need to read that chapter now; I’ll have plenty of time tomorrow at lunch.…” Procrastination is very powerful. Some people battle it daily, others only occasionally. Most college students procrastinate often, and about half say they need help avoiding procrastination. Procrastination can threaten one’s ability to do well on an assignment or test. People procrastinate for different reasons. Some people are too relaxed in their priorities, seldom worry, and easily put off responsibilities. Others worry constantly, and that stress keeps them from focusing on the task at hand. Some procrastinate because they fear failure; others procrastinate because they fear success or are so perfectionistic that they don’t want to let themselves down. Some are dreamers. Many different factors are involved, and there are different styles of procrastinating. Just as there are different causes, there are different possible solutions for procrastination. Different strategies work for different people. The time management strategies described earlier can help you avoid procrastination. Because this is a psychological issue, some additional psychological strategies can also help: Since procrastination is usually a habit, accept that and work on breaking it as you would any other bad habit: one day at a time. Know that every time you overcome feelings of procrastination, the habit becomes weaker—and eventually you’ll have a new habit of being able to start studying right away. 
Schedule times for studying using a daily or weekly planner. Carry it with you and look at it often. Just being aware of the time and what you need to do today can help you get organized and stay on track. If you keep thinking of something else you might forget to do later (making you feel like you “must” do it now), write yourself a note about it for later and get it out of your mind. Counter a negative with a positive. If you’re procrastinating because you’re not looking forward to a certain task, try to think of the positive future results of doing the work. Counter a negative with a worse negative. If thinking about the positive results of completing the task doesn’t motivate you to get started, think about what could happen if you keep procrastinating. You’ll have to study tomorrow instead of doing something fun you had planned. Or you could fail the test. Some people can jolt themselves right out of procrastination. On the other hand, fear causes procrastination in some people—so don’t dwell on the thought of failing. If you’re studying for a test, and you’re so afraid of failing it that you can’t focus on studying and you start procrastinating, try to put things in perspective. Even if it’s your most difficult class and you don’t understand everything about the topic, that doesn’t mean you’ll fail, even if you may not receive an A or a B. Study with a motivated friend. Form a study group with other students who are motivated and won’t procrastinate along with you. You’ll learn good habits from them while getting the work done now. Keep a study journal. At least once a day write an entry about how you have used your time and whether you succeeded with your schedule for the day. If not, identify what factors kept you from doing your work. (Use the form at the end of this chapter.) This journal will help you see your own habits and distractions so that you can avoid things that lead to procrastination. Get help. 
If you really can’t stay on track with your study schedule, or if you’re always putting things off until the last minute, see a college counselor. They have lots of experience with this common student problem and can help you find ways to overcome this habit.

      Procrastination is a common habit where people delay important tasks by making excuses to do something else. It affects many students for different reasons, such as stress, fear of failure, or poor time management, and can hurt academic performance. However, with planning, positive thinking, and the right strategies, procrastination can be reduced and overcome over time.

    5. Time Management Strategies for Success Following are some strategies you can begin using immediately to make the most of your time: Prepare to be successful. When planning ahead for studying, think yourself into the right mood. Focus on the positive. “When I get these chapters read tonight, I’ll be ahead in studying for the next test, and I’ll also have plenty of time tomorrow to do X.” Visualize yourself studying well! Use your best—and most appropriate—time of day. Different tasks require different mental skills. Some kinds of studying you may be able to start first thing in the morning as you wake, while others need your most alert moments at another time. Break up large projects into small pieces. Whether it’s writing a paper for class, studying for a final exam, or reading a long assignment or full book, students often feel daunted at the beginning of a large project. It’s easier to get going if you break it up into stages that you schedule at separate times—and then begin with the first section that requires only an hour or two. Do the most important studying first. When two or more things require your attention, do the more crucial one first. If something happens and you can’t complete everything, you’ll suffer less if the most crucial work is done. If you have trouble getting started, do an easier task first. Like large tasks, complex or difficult ones can be daunting. If you can’t get going, switch to an easier task you can accomplish quickly. That will give you momentum, and often you feel more confident tackling the difficult task after being successful in the first one. If you’re feeling overwhelmed and stressed because you have too much to do, revisit your time planner. Sometimes it’s hard to get started if you keep thinking about other things you need to get done. Review your schedule for the next few days and make sure everything important is scheduled, then relax and concentrate on the task at hand. 
If you’re really floundering, talk to someone. Maybe you just don’t understand what you should be doing. Talk with your instructor or another student in the class to get back on track. Take a break. We all need breaks to help us concentrate without becoming fatigued and burned out. As a general rule, a short break every hour or so is effective in helping recharge your study energy. Get up and move around to get your blood flowing, clear your thoughts, and work off stress. Use unscheduled times to work ahead. You’ve scheduled that hundred pages of reading for later today, but you have the textbook with you as you’re waiting for the bus. Start reading now, or flip through the chapter to get a sense of what you’ll be reading later. Either way, you’ll save time later. You may be amazed how much studying you can get done during downtimes throughout the day. Keep your momentum. Prevent distractions, such as multitasking, that will only slow you down. Check for messages, for example, only at scheduled break times. Reward yourself. It’s not easy to sit still for hours of studying. When you successfully complete the task, you should feel good and deserve a small reward. A healthy snack, a quick video game session, or social activity can help you feel even better about your successful use of time. Just say no. Always tell others nearby when you’re studying, to reduce the chances of being interrupted. Still, interruptions happen, and if you are in a situation where you are frequently interrupted by a family member, spouse, roommate, or friend, it helps to have your “no” prepared in advance: “No, I really have to be ready for this test” or “That’s a great idea, but let’s do it tomorrow—I just can’t today.” You shouldn’t feel bad about saying no—especially if you told that person in advance that you needed to study. Have a life. Never schedule your day or week so full of work and study that you have no time at all for yourself, your family and friends, and your larger life. 
Use a calendar planner and daily to-do list. We’ll look at these time management tools in the next section.

      The main idea of “Time Management Strategies for Success” is that managing your time well is about working smarter, not just harder. This section gives practical, realistic strategies students can use right away to stay productive, reduce stress, and avoid procrastination—while still having a life.

      In simple terms, it teaches you how to:

      Plan ahead with a positive mindset, so studying feels less stressful and more motivating.

      Use your energy wisely by doing tasks at the time of day when you focus best.

      Break big tasks into smaller, manageable pieces to avoid feeling overwhelmed.

      Set priorities, so the most important work gets done first.

      Build momentum by starting with easier tasks when motivation is low.

      Stay flexible by reviewing your schedule when things feel out of control.

      Ask for help when needed, instead of staying stuck and confused.

      Take regular breaks to avoid burnout and stay mentally fresh.

      Use small pockets of free time during the day to get work done early.

      Avoid distractions, especially multitasking, to keep your focus strong.

      Reward yourself after completing tasks to stay motivated.

      Learn to say no to interruptions without feeling guilty.

      Balance work and life, making time for rest, friends, and personal well-being.

      Use planners and to-do lists to stay organized and on track.

    1. Sometimes going to the library or elsewhere is not practical for studying, and you have to find a way to cope in a shared space. Part of the solution is time management. Agree with others on certain times that will be reserved for studying; agree to keep the place quiet, not to have guests visiting, and to prevent other distractions. These arrangements can be made with a roommate, spouse, and older children. If there are younger children in your household and you have child-care responsibility, it’s usually more complicated. You may have to schedule your studying during their nap time or find quiet activities for them to enjoy while you study. Try to spend some time with your kids before you study, so they don’t feel like you’re ignoring them. (More tips are offered later in this chapter.) The key is to plan ahead. You don’t want to find yourself, the night before an exam, in a place that offers no space for studying. Finally, accept that sometimes you’ll just have to say no. If your roommate or a friend often tries to engage you in conversation or suggests doing something else when you need to study, just say no. Learn to be firm but polite as you explain that you just really have to get your work done first. Students who live at home may also have to learn how to say no to parents or family members—just be sure to explain the importance of the studying you need to do! Remember, you can’t be everything to everyone all the time.

      This paragraph talks about studying in shared spaces: If you must study at home or with roommates/family, interruptions are common. A solution is to agree on specific quiet study times, times when no guests visit, etc.

      With kids or family responsibilities, you may need to plan your study around their schedules (e.g., nap times). The paragraph ends by reminding you that sometimes you’ll need to say “no” politely when friends/family want your attention — you can’t always be available to everyone.

    2. Multitasking is the term commonly used for being engaged in two or more different activities at the same time, usually referring to activities using devices such as cell phones, smartphones, computers, and so on. Many people claim to be able to do as many as four or five things simultaneously, such as writing an e-mail while responding to an instant message (IM) and reading a tweet, all while watching a video on their computer monitor or talking on the phone. Many people who have grown up with computers consider this kind of multitasking a normal way to get things done, including studying. Even people in business sometimes speak of multitasking as an essential component of today’s fast-paced world. It is true that some things can be attended to while you’re doing something else, such as checking e-mail while you watch television news—but only when none of those things demands your full attention. You can concentrate 80 percent on the e-mail, for example, while 20 percent of your attention is listening for something on the news that catches your attention. Then you turn to the television for a minute, watch that segment, and go back to the e-mail. But you’re not actually watching the television at the same time you’re composing the e-mail—you’re rapidly going back and forth. In reality, the mind can focus only on one thing at any given moment. Even things that don’t require much thinking are severely impacted by multitasking, such as driving while talking on a cell phone or texting. An astonishing number of people end up in the emergency room from just trying to walk down the sidewalk while texting, so common is it now to walk into a pole or parked car while multitasking! 
“Okay,” you might be thinking, “why should it matter if I write my paper first and then answer e-mails or do them back and forth at the same time?” It actually takes you longer to do two or more things at the same time than if you do them separately—at least with anything that you actually have to focus on, such as studying. That’s true because each time you go back to studying after looking away to a message or tweet, it takes time for your mind to shift gears to get back to where you were. Every time your attention shifts, add up some more “downtime”—and pretty soon it’s evident that multitasking is costing you a lot more time than you think. And that’s assuming that your mind does fully shift back to where you were every time, without losing your train of thought or forgetting an important detail. It doesn’t always. The other problem with multitasking is the effect it can have on the attention span—and even on how the brain works. Scientists have shown that in people who constantly shift their attention from one thing to another in short bursts, the brain forms patterns that make it more difficult to keep sustained attention on any one thing. So when you really do need to concentrate for a while on one thing, such as when studying for a big test, it becomes more difficult to do even if you’re not multitasking at that time. It’s as if your mind makes a habit of wandering from one thing to another and then can’t stop.

      This section explains multitasking, that is, doing more than one thing at once: many people think they can multitask by checking messages, emails, and social media while studying, but the brain can actually focus on only one thing at a time. What seems like multitasking is really rapid switching between tasks, and every switch costs focus and time. Constantly shifting attention trains your brain to be less able to sustain focus, even when you stop multitasking. To avoid this, it’s better to turn off technology distractions when studying (phones, messaging, Wi-Fi) or to go somewhere like the library without your phone.

    3. Everyone needs his or her own space. This may seem simple, but everyone needs some physical area, regardless of size, that is really his or her own—even if it’s only a small part of a shared space. Within your own space, you generally feel more secure and in control. Physical space reinforces habits. For example, using your bed primarily for sleeping makes it easier to fall asleep there than elsewhere and also makes it not a good place to try to stay awake and alert for studying. Different places create different moods. While this may seem obvious, students don’t always use places to their best advantage. One place may be bright and full of energy, with happy students passing through and enjoying themselves—a place that puts you in a good mood. But that may actually make it more difficult to concentrate on your studying. Yet the opposite—a totally quiet, austere place devoid of color and sound and pleasant decorations—can be just as unproductive if it makes you associate studying with something unpleasant. Everyone needs to discover what space works best for himself or herself—and then let that space reinforce good study habits.

      These sentences list several reasons space is important:

      Everyone needs their own physical study area — even a small spot where you feel in control helps you focus.

      Space reinforces habits — for example, if you always use your bed for sleep, your brain connects it with sleeping — making it hard to study there.

      Different places create different moods — a lively place might lift your mood but also distract you; a very quiet place might feel boring and make studying unpleasant. -> You need to find the space that works best for you

    1. It's probably worth mentioning that much of this work (creating maize from teosinte, learning to freeze-dry potatoes and detoxify cassava, and identifying wheat and rice as valuable plants) was probably done by women.

      I just think it's fascinating how many of these techniques are still used in today's society, like refrigerating and freeze-drying a variety of things.

    1. and the Ottoman Empire in the Middle East.

      This little empire would last all the way into the 20th century, ultimately collapsing after World War I. It's crazy to think about just how long this empire, one that had been an enemy to the late Roman Empire, survived.

    2. Year 9, Month 6, Day 16 (6 Jul 1411)

      It's hard to imagine this kind of expedition taking this long. I know it did, but still that is a commitment that's hard to imagine. Just leaving everything for years, decades even. It's a lot.

    3. While baby boys were a blessing, girls were considered an expense to their birth families, since they only became valuable when they married and bore sons for their husbands' families.

      It’s sad to think that girls were seen as less important just because of their gender, and it makes me realize how much cultural values can shape people's lives in both harmful and lasting ways.

    1. many conquistadors took native women as wives and mistresses

      Due to the racism, classism, oppression, and discrimination against indigenous people in the past, I wonder if these women were forced into these unions with the conquistadors. By forced, I mean through violence, being seen as objects, having economic issues and a poor class status, and/or being sold off to them. I believe this even happened to Pocahontas (held hostage, forced to change her name, and made to marry an English colonizer), so it wouldn't be too far-fetched if that were the case for other women of similar backgrounds. It's just depressing to learn and hear from others or textbooks about these types of experiences. No one should be dehumanized and treated so wrongly for things they cannot control.

    1. What type of system best helps you to manage your resources? When conducting any type of thinking, you need to have a firm grasp on information literacy, or knowing how to access the sources you may need. Practicing good information literacy skills involves more than simply using a search engine such as Google, although that could be a starting point. You also engage in creative thinking (i.e., generating topics to research), analytical thinking (i.e., reading and examining the parts of sources), and critical thinking (i.e., evaluating sources for accuracy, authority, etc.). Then there is synthesis that is used when incorporating multiple sources into a research project. Information literacy utilizes all of the necessary thinking skills. If you saw the name of a person on the cover of a magazine, for instance, you might assume the person did something important to merit the attention. If you were to google the person’s name, you would instantly need to use context clues to determine if the information your search produced is actually about your person and not someone else with the same or a similar name, whether the information is accurate, and if it is current. If it is not, you would need to continue your research with other sources.

      I think this is helpful because it shows that good research isn’t just about finding information, it’s about analyzing and synthesizing it properly.

    1. Thinking helps in many situations, as we’ve discussed throughout this chapter. When we work out a problem or situation systematically, breaking the whole into its component parts for separate analysis, to come to a solution or a variety of possible solutions, we call that analytical thinking. Characteristics of analytical thinking include setting up the parts, using information literacy, and verifying the validity of any sources you reference. While the phrase analytical thinking may sound daunting, we actually do this sort of thinking in our everyday lives when we brainstorm, budget, detect patterns, plan, compare, work puzzles, and make decisions based on multiple sources of information. Think of all the thinking that goes into the logistics of a dinner-and-a-movie date—where to eat, what to watch, who to invite, what to wear, popcorn or candy—when choices and decisions are rapid-fire, but we do it relatively successfully all the time.

      I like that the reading shows analytical thinking isn’t just for school or work, it’s something we practice all the time in normal life.

    1. Simply put, procrastination is the act of delaying some task that needs to be completed. It is something we all do to greater and lesser degrees. For most people, a little minor procrastination is not a cause for great concern. But there are situations where procrastination can become a serious problem with a lot of risk. These include: when it becomes a chronic habit, when there are a number of tasks to complete and little time, or when the task being avoided is very important. Because we all procrastinate from time to time, we usually do not give it much thought, let alone think about its causes or effects. Ironically, many of the psychological reasons for why we avoid a given task also keep us from using critical thinking to understand why procrastination can be extremely detrimental, and in some cases difficult to overcome. To succeed at time management, you must understand some of the hurdles that may stand in your way. Procrastination is often one of the biggest. What follows is an overview of procrastination with a few suggestions on how to avoid it.

      I think it’s helpful that the reading points out that procrastination isn’t just laziness, it often has psychological causes that make it hard to overcome.

    2. As a pure survival mechanism it does have its usefulness in that it reminds us to be wary of behaviors that can result in undesirable outcomes.

      Sometimes it's not just our own selves sabotaging our abilities and mistakes; a lot of the time it is the people around us, our "support" system (parents, co-workers, spouse). If these people are also giving negative bias, it makes it even more difficult to overcome.

    1. Duff and litter are not reported for post-PCT stands for two reasons. First, these are not expected to have changed significantly given the relatively short timespan before and after PCT. Second, our sampling protocol did not make it clear how to quantify the loading for leaves attached to recently cut branches, especially given that these “suspended litter” particles were in a state of active transition to the ground, as they dry, abscise, and sift through the coarser woody fuels as they make their way to the ground. This class may deserve more attention because of its potentially dynamic relationship with the timing of prescribed fires: as suspended particles settle and begin to decompose, their bulk density changes, and bulk density is an important determinant of fire behavior.

      I think you can just mention this in the methods and not worry about discussing this here.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      We thank the reviewers and editors for their careful evaluation of our manuscript and their positive comments on the importance and rigor of the work. Below you will find our point-by-point response to each reviewer's suggestions. We believe that we have addressed (in the response and the revised manuscript) all of the concerns. Please note that in some cases, we have numbered a reviewer's comments for clarity, however beyond this, we have not altered any of the reviewers' text.

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Lo et al. report a high-throughput functional profiling study of the gene encoding argininosuccinate synthase (ASS1), done in a yeast experimental system. The study design is robust (see lines 141-143, main text, Methods), whereby "approximately three to four independent transformants of each variant would be isolated and assayed." (lines 140 - 141, main text, Methods). Such a manner of analysis will allow for uncertainty of the functional readout for the tested variants to be accounted for.

      This is an outstanding study providing insights on the functional landscape of ASS1. Functionally impaired ASS1 may cause citrullinemia type I, and disease severity varies according to the degree of enzyme impairment (line 30, main text; Abstract). Data from this study forms a valuable resource in allowing for functional interpretation of protein-altering ASS1 variants that could be newly identified from large-scale whole-genome sequencing efforts done in biobanks or national precision medicine programs. I have some suggestions for the Authors to consider:

      1. The specific function of ASS1 is to condense L-citrulline and L-aspartate to form argininosuccinate. Instead of measuring either depletion of substrate or formation of product, the Authors elected to study 'growth' of the yeast cells. This is a broader phenotype which could be determined by other factors outside of ASS1. Whereas I agree that the experiments were beautifully done, the selection of an indirect phenotype such as ability of the yeast cells to grow could be more vigorously discussed.

      We appreciate the reviewer's point regarding the indirect nature of growth as a functional readout. In our system, yeast growth is tightly and specifically coupled to ASS enzymatic activity. The strains used are isogenic and lack the native yeast argininosuccinate synthetase, such that arginine biosynthesis, and therefore yeast replication on minimal medium lacking arginine, depends exclusively on the activity of human ASS1. Under these defined and limiting conditions, growth provides a quantitative proxy for ASS1 function. However, we acknowledge that this assay does not resolve specific molecular mechanisms underlying reduced function, such as altered catalytic activity versus effects on protein stability. We have updated the text to clarify these points.

      "While growth is an indirect phenotype relative to direct measurement of substrate turnover or product formation, it is tightly coupled to ASS enzymatic activity in this system and is expected to be impaired by amino acid substitutions that reduce catalytic activity or protein stability. Therefore, growth on minimal medium lacking arginine is a quantitative measure of ASS enzyme function, allowing the impact of ASS1 missense variants to be assessed at scale through a high-throughput growth assay, in a single isogenic strain background, under controlled, defined conditions that limit confounding factors unrelated to ASS1 activity. We expect that the assay will detect reductions in both catalytic activity and protein stability but will not distinguish between these mechanisms."

      2. One of the key reasons why studies such as this one are valuable is due to the limitations of current variant classification methods that rely on 'conservation' status of amino acid residues to predict which variants might be 'pathogenic' and which variants might be 'likely benign'. However, there are serious limitations, and Figures 2 and 6 in the main text show this clearly. Specifically, there is an appreciable number of variants that, despite being classified as "ClinVar Pathogenic", were shown by the assay to unlikely be functionally impaired. This should be discussed vigorously. Could these inconsistencies be potentially due to the readout (growth instead of a more direct evaluation of ASS1 function)?

      We interpret this discrepancy as reflecting a sensitivity limitation of the growth-based readout rather than a fundamental disagreement between functional effect and clinical annotation. Specifically, we believe that our assay is unable to resolve the very mildest hypomorphic variants from true wild type, i.e., the residual activity of these variants is sufficient to fully support yeast growth under the conditions used. On this basis, we have chosen not to treat wild-type-like growth in our assay as informative for benignity; conversely, reduced growth provides evidence supporting pathogenicity (all clinically validated variants examined in this range are pathogenic).

      We have revised the manuscript to clarify this point explicitly and to frame these variants as lying outside the effective resolution limit of the assay rather than representing true false positives. Additional discussion of this limitation and its implications is provided in our responses to Reviewer 2 (points 1 and 4) along with specific changes made to the text.

      3. Figure 3 is very interesting, showing a continuum of functional readout ranging from 'wild-type' to 'null'. It is very interesting that the Authors used a threshold of less than 0.85 as functionally hypomorphic. What does this mean? It would be very nice if they have data from patients carrying two hypomorphic ASS1 alleles, and correlate their functional readout with severity of clinical presentation. The reader might be curious as to the clinical presentation of individuals carrying, for example, two ASS1 alleles with normalized growth of 0.7 to 0.8.

      I hope you will find these suggestions helpful.

      We thank the reviewer for this thoughtful comment. Figure 3 indeed illustrates a continuum of functional effects, and we agree that careful interpretation of the thresholds used is important. To clarify the rationale for the hypomorphic threshold, the interpretation of intermediate growth values, and to emphasize that these labels reflect only behavior in the functional assay, we have rewritten the relevant section of the Results:

      "The normalized growth scores of the 2,193 variants tested in our functional assay form a clear bimodal distribution (Figure 3), with two distinct peaks corresponding to functional extremes, as is commonly reported in large-scale functional assays of protein function [9, 10]. The smaller peak, centered around the null control (normalized growth = 0), represents variants that fail to support growth in the assay, while the larger peak, centered around the wild-type control (normalized growth = 1), represents variants with wild-type-like function (growth > 0.85). Variants with growth values falling between these two peak-based thresholds display partial functional impairment and are classified as functionally hypomorphic (n = 323). Crucially, these classifications are entirely derived from the observed peaks in the distribution of growth values and reflect differences in functional activity under the assay conditions. They do not provide direct evidence for clinical pathogenicity or benignity and should not be used for clinical variant interpretation without proper benchmarking against clinical reference datasets, as implemented below within an OddsPath framework."

      We agree with the reviewer that correlating functional measurements with clinical severity in individuals carrying two hypomorphic ASS1 alleles would be highly informative, particularly given that ASS1 deficiency is an autosomal recessive disorder. While mild hypomorphic variants (for example, variants with normalized growth values of 0.7-0.8 in our assay) could plausibly contribute to disease when paired with a complete loss-of-function allele, systematic analysis of combinatorial genotype effects and genotype-phenotype correlations is beyond the scope of the present study, which focuses on the functional effects of individual variants. We view this as an important direction for future work.
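      The peak-based classification discussed in this reply can be sketched as a simple thresholding rule over normalized growth scores. This is a minimal illustration, not the authors' pipeline: the 0.85 wild-type cutoff is stated in the text, while the null-side cutoff (0.15 here) and the variant names are placeholder assumptions.

```python
# Sketch of a peak-based variant classification by normalized growth.
# WT_CUTOFF = 0.85 comes from the text; NULL_CUTOFF = 0.15 is a placeholder
# assumption, since this excerpt does not state the null-peak threshold.
WT_CUTOFF = 0.85
NULL_CUTOFF = 0.15

def classify_variant(normalized_growth: float) -> str:
    """Bin a variant by normalized growth (0 = null control, 1 = wild type)."""
    if normalized_growth >= WT_CUTOFF:
        return "wild-type-like"
    if normalized_growth <= NULL_CUTOFF:
        return "null-like"
    return "hypomorphic"

# Hypothetical growth scores spanning the bimodal distribution.
scores = {"V1": 0.97, "V2": 0.72, "V3": 0.04}
labels = {variant: classify_variant(g) for variant, g in scores.items()}
print(labels)  # {'V1': 'wild-type-like', 'V2': 'hypomorphic', 'V3': 'null-like'}
```

      As the quoted passage stresses, labels produced this way describe behavior in the assay only; mapping them onto clinical pathogenicity requires separate benchmarking against reference variants.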

      Reviewer #1 (Significance (Required)):

      This is an outstanding study providing insights on the functional landscape of ASS1. Functionally impaired ASS1 may cause citrullinemia type I, and disease severity varies according to the degree of enzyme impairment (line 30, main text; Abstract). Data from this study forms a valuable resource in allowing for functional interpretation of protein-altering ASS1 variants that could be newly identified from large-scale whole-genome sequencing efforts done in biobanks or national precision medicine programs.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this manuscript, Lo et al characterize the phenotypic effect of ~90% of all possible ASS1 missense mutations using an elegant yeast-based system, and use this dataset to aid the interpretation of clinical ASS1 variants. Overall, the manuscript is well-written and the experimental data are interpreted rigorously. Of particular interest is the identification of pairs of deleterious alleles that rescue ASS1 activity in trans. My comments mainly pertain to the relevance of using a yeast screening methodology to infer functional effects of human ASS1 mutations.

      1. Since human ASS1 is heterologously expressed in yeast for this mutational screen, direct comparison of native expression levels between human cells and yeast is not possible. Could the expression level of human ASS1 (driven by the pARG1 promoter) in yeast alter the measured fitness defect of each variant? For instance, if ASS1 expression in yeast is sufficiently high to mask modest reductions in catalytic activity, such variants may be misclassified as hypomorphic rather than amorphic. Conversely, if expression is intrinsically low, even mild catalytic impairments could appear deleterious. While it is helpful that the authors used non-human primate SNV data to calibrate their assay, experiments could be performed to directly address this possibility.

      The nature of the relationship between yeast growth and availability of functional ASS1 could also influence the interpretation of results from the yeast-based screen. Does yeast growth scale proportionately with ASS1 enzymatic activity?

      We completely agree that the expression level of human ASS1 in yeast could influence the measured fitness effects of individual variants. We expect the rank ordering of variants in our growth assay to reflect their relative enzymatic activity (i.e. a monotonic relationship) but acknowledge that the precise mapping between activity and growth is unknown and may include ceiling and floor effects that limit the assay's dynamic range. As the reviewer notes, under high expression conditions moderate loss-of-function variants could appear indistinguishable from wild type (ceiling effect), whereas under lower expression the same variants could behave closer to the null control (floor effect).

      In our system, ASS1 is expressed from the pARG1 promoter, chosen under the assumption that the native expression level of ARG1 (the yeast ASS1 ortholog) is appropriately tuned for yeast growth. Crucially, rather than assuming a fixed mapping from assay growth to clinical pathogenicity (given potential nonlinearities in the relationship between ASS function and growth) we benchmark the assay against external data, including known pathogenic and benign variants and non-human primate SNVs, to calibrate thresholds and guide interpretation within an OddsPath framework. This benchmarking indicates that ceiling effects are likely present, with some mild loss-of-function pathogenic variants appearing indistinguishable from wild type in the growth assay. We explicitly account for this by not using high-growth scores as evidence toward benignity. We have made the following changes to the manuscript:

      "A subset of clinically pathogenic ASS1 variants exhibit near-wild-type growth in our yeast assay. In general, we expect a monotonic relationship between ASS function and yeast growth, but with the potential for floor and ceiling effects that constrain the assay's dynamic range. In this context, we interpret high-growth pathogenic variants as likely causing mild loss of function that cannot be distinguished from wild type in our assay."

      "Based on these findings and given that 22/56 pathogenic variants show >85% growth, we conclude that growth above this threshold should not be used as evidence toward benignity."

      2. It would be helpful to add an additional diagram to Figure 1A explaining how the screen was performed, in particular: when genotype and phenotype were measured, relative to plating on selective vs non-selective media? This is described in "Variant library sequence confirmation" and "Measuring the growth of individual isolates" of the Methods section but could also be distilled into a diagram.

      We thank the reviewer for this helpful suggestion. We have updated Figure 1 by adding a new schematic panel (Figure 1C) that distills the experimental workflow into a visual overview. This diagram is intended to complement the detailed descriptions in the Methods and improve clarity for the reader.

      3. The authors rationalize the biochemical consequences of ASS1 mutations in the context of ASS1 per se - for example, mutations in the active site pocket impair substrate binding and therefore catalytic activity, which is expected. Does ASS1 physically interact with other proteins in human cells, and could these interactions be altered in the presence of specific ASS1 mutations? Such effects may not be captured by performing mutational scanning in yeast.

      We are not aware of any specific protein-protein interactions involving ASS that are required for its enzymatic function. However, we agree that ASS could engage in non-essential interactions with other human proteins that might be altered by specific missense variants and that such interactions would not necessarily be captured in a yeast-based assay.

      Importantly, our complementation system depends on human ASS providing the essential enzymatic activity required for arginine biosynthesis in yeast. If ASS1 required obligate human-specific protein interactions to function, even the wild-type enzyme would fail to support yeast growth, which is clearly not the case. We therefore conclude that the assay robustly reports on the intrinsic enzymatic activity of ASS, while acknowledging that non-essential human-specific interactions may not be assessed. We have updated the manuscript to reflect this point.

      "Importantly, successful functional complementation indicates that ASS enzymatic activity does not depend on any obligate human-specific protein interactions."

      1. The authors note that only a small number (2/11) of mutations at the ASS1 monomer-monomer interface lead to growth defects in yeast. It would be helpful for the authors to discuss this further.

      As discussed in response to the reviewer's comments on the relationship between ASS activity and yeast growth (point 1 above), we expect growth to be a monotonic but nonlinear function of enzymatic activity, with potential ceiling effects at high activity. Under this model, variants causing weak or moderate loss of function may remain indistinguishable from wild type when residual activity is sufficient to support normal growth. We favor this explanation for the observation that only 2/11 interface variants show reduced growth, as many pathogenic interface substitutions are associated with milder disease presentations, consistent with higher residual enzyme function. Consistent with this interpretation, variants affecting the active site, where substitutions are expected to cause large reductions in catalytic activity, are readily detected by the assay.

      Although we cannot exclude partial buffering of dimerization defects in yeast, we interpret the reduced sensitivity to interface variants primarily as a general limitation of growth-based assays. Accordingly, our decision not to use growth >85% as evidence toward benignity is conservative relative to approaches that would classify high-growth variants as benign except at the monomer-monomer interface, avoiding reliance on structural subclassification and minimizing the risk of false benign interpretation. Reduced growth, by contrast, provides strong evidence of loss of ASS1 function and pathogenicity, validated under the OddsPath framework.

      We have updated the Results and Discussion sections to clarify these points (also see response to the reviewer's point 1).

      "A subset of clinically pathogenic ASS1 variants exhibit near-wild-type growth in our yeast assay. In general, we expect a monotonic relationship between ASS function and yeast growth, but with the potential for floor and ceiling effects that constrain the assay's dynamic range. In this context, we interpret high-growth pathogenic variants as likely causing mild loss of function that cannot be distinguished from wild type in our assay. Consistent with this view, many pathogenic variants with high assay growth are located at the monomer-monomer interface rather than the active site, and are associated with milder or later-onset clinical presentations, suggesting partial enzymatic impairment that is clinically relevant in humans but not resolved by the yeast assay."

      "Based on these findings and given that 22/56 pathogenic variants show >85% growth, we conclude that growth above this threshold should not be used as evidence toward benignity. Notably, this approach is conservative relative to treating high-growth variants as benign except at the monomer-monomer interface, avoiding reliance on structural subclassification and minimizing the risk of false benign interpretation arising from assay ceiling effects. Conversely, the variants with

      Reviewer #2 (Significance (Required)):

      This study presents the first comprehensive mutational profiling of human ASS1 and would be of broad interest to clinical geneticists as well as those seeking biochemical insights into the enzymology of ASS1. The authors' use of a yeast system to profile human mutations would be particularly useful for researchers performing deep mutational scans, given that it provides functional insights in a rapid and inexpensive manner.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Section 1 - Evidence, reproducibility, and clarity

      Summary

      This manuscript presents a comprehensive functional profiling of 2,193 ASS1 missense variants using a yeast complementation assay, providing valuable data for variant interpretation in the rare disease citrullinemia type I. The dataset is extensive, technically sound, and clinically relevant. The demonstration of intragenic complementation in ASS1 is novel and conceptually important. Overall, the study represents a substantial contribution to functional genomics and rare disease variant interpretation.

      Major comments

      1. This is an exciting paper as it can provide support to clinicians to make actionable decisions when diagnosing infants. I have a few major comments, but I want to emphasize that the label of "functionally unimpaired" variants is misleading. The authors explain that there are several pathogenic ClinVar variants that fall into this category (above the >0.85 growth threshold), but I think this category needs a more specific name and I would ask the authors to reiterate the shortcomings of the assay again in the Discussion section.

      We thank the reviewer for raising this important point. We agree that the label "functionally unimpaired" could be misleading if interpreted as implying clinical benignity rather than assay behavior. We have therefore clarified that this designation refers strictly to variant behavior in the yeast growth assay and does not imply absence of pathogenicity.

      In addition, we have expanded the Discussion to explicitly address the existence of clinically pathogenic variants with high growth scores (>0.85), emphasizing that these likely reflect a ceiling effect of the assay and represent a key limitation for interpretation. This clarification reiterates that high-growth scores should not be used as evidence toward benignity, while reduced growth provides strong functional evidence of pathogenicity. Relevant revisions are described in our responses to Reviewers 1 and 2.

      2. I think there's an important discussion to be had here: is the assay detecting variants that alter the function of ASS, or is it detecting a complete ablation of enzymatic activity? The results might be strengthened with a follow-up experiment that identifies stably expressed ASS1 variants.

      We agree with the reviewer that distinguishing between stability and enzyme activity would be valuable information. Unfortunately, we do not currently have the resources to perform this type of large-scale study. We have acknowledged in the text that our assay does not distinguish between enzyme activity and protein stability:

      "We expect that the assay will detect reductions in both catalytic activity and protein stability, but will not distinguish between these mechanisms."

      At the very least, it would be great to see the authors replicate some of their interesting results from the high-throughput screen by down-selecting to ~12 variants of uncertain significance that could be newly considered pathogenic.

      We have included new analysis of all 25 VUS variants falling in the pathogenic range of our assay (Supplemental Table S7). Reclassification under current guidelines (in the absence of our data) shifts six variants to Pathogenic/Likely Pathogenic and 11 more are reclassified to Likely Pathogenic with the application of our functional data as PS3_Supporting. The remaining eight VUS are all reclassified to Likely Pathogenic when inclusion of homozygous PrimateAI-benign variants allows the assay to satisfy full PS3 criteria.

      3. I would ask the authors to provide more citations of the literature in the introduction of the manuscript. I would be especially interested in knowing more about human ASS being identified as a homolog of yeast ARG1, as they share little sequence similarity (27.5%) at the protein level. That said, I find the yeast complementation assay exciting.

      We thank the reviewer for this suggestion. Human ASS and yeast Arg1 catalyze the same biochemical reaction and share approximately 49% amino acid sequence identity. We have revised the Introduction to clarify this relationship and to note explicitly that the Saccharomyces Genome Database (SGD) identifies the human gene encoding argininosuccinate synthase (ASS1) as the ortholog of yeast ARG1. An appropriate citation has been added to support this statement. The protein alignments have been provided as File S2.

      "This assay is based on the ability of human ASS to functionally replace (complement) its yeast ortholog (Arg1) in S. cerevisiae (Saccharomyces Genome Database, 2026). Importantly, successful functional complementation indicates that ASS enzymatic activity does not depend on any obligate human-specific protein interactions. At the protein level, human ASS and yeast Arg1 display 49% sequence identity (File S2) and share identical enzymatic roles in converting citrulline and aspartate into argininisuccinate."

      4. I appreciate the efforts made by the authors to share their work and make this study more reproducible, such as the hASS1 and yASS1 plasmids being shared on NCBI GenBank (Line 121) and the ONT reads being published on SRA (Line 154). I have made requests for additional data to be shared, such as the custom method/code for codon optimization and a table of the Twist variant cassettes that were ordered. I would also love to see these results shared on MaveDB.org.

      We thank the reviewer for these suggestions regarding data sharing and reproducibility. As requested, we have provided the custom codon optimization script as File S1 and the amino acid alignment used to perform codon harmonization as File S2. The sequence of the underlying variant cassette is included in the corresponding GenBank entry, and we have clarified this point in the legend of Figure 1. For each amino acid substitution, Twist Bioscience used a yeast-specific codon scheme with a single consistent codon per amino acid; accordingly, the sequence of each variant cassette can be inferred from the base construct and the specified amino acid change. A complete list of variant amino acid substitutions used in this study is provided in Table S3.
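For readers wishing to reproduce this inference, the mapping from base construct plus amino acid change to cassette sequence can be sketched as follows. The codon table, sequences, and function name below are invented for illustration and are not Twist's actual scheme:

```python
# Sketch: infer a variant cassette sequence from the base ORF plus an amino
# acid substitution, under a one-codon-per-amino-acid scheme. The codon
# table here is hypothetical, for illustration only.
ONE_CODON = {
    "A": "GCT", "V": "GTT", "L": "TTG", "S": "TCT", "T": "ACT", "E": "GAA",
}

def variant_cassette(base_orf: str, substitution: str) -> str:
    """Apply a substitution like 'T2V' (wild-type aa, 1-based position, mutant aa)."""
    wt_aa, pos, mut_aa = substitution[0], int(substitution[1:-1]), substitution[-1]
    i = (pos - 1) * 3  # 0-based nucleotide offset of the affected codon
    # sanity check: the base construct must encode the expected wild-type codon
    assert base_orf[i:i + 3] == ONE_CODON[wt_aa], f"position {pos} is not {wt_aa}"
    return base_orf[:i] + ONE_CODON[mut_aa] + base_orf[i + 3:]

orf = ONE_CODON["A"] + ONE_CODON["T"] + ONE_CODON["E"]  # toy ORF encoding A-T-E
print(variant_cassette(orf, "T2V"))  # → GCTGTTGAA
```

Because the scheme uses one fixed codon per amino acid, the inference is deterministic, which is why a table of amino acid substitutions suffices in place of a table of full cassette sequences.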

      5. I find this manuscript very exciting as the authors have a compelling assay that identifies pathogenic variants, but I was generally disappointed by the quality and organization of the figures. For example, Figure 4 provides very little insight, but could be dramatically improved with an overlay of the normalized growth score data or highlighting variants surrounding the substrate or ATP interfaces. There are some very interesting aspects of this manuscript that could shine through with some polished figures.

      We thank the reviewer for this feedback and agree that clear and well-organized figures are essential for conveying the key results of the study. In response, we have substantially revised Figure 4 by adding colored overlays showing residue conservation and median normalized growth scores (new panels Figure 4C and 4D), which more directly link structural context to functional outcomes and highlight patterns surrounding the active site and substrate interfaces.

      I would also encourage the authors to generate a heatmap of the data represented in Figure 2 (see Fowler and Fields 2014, PMID 25075907, Figure 2); this would be a more helpful reference for readers.

      The reviewer also suggested that a heatmap representation, similar to that used in Fowler and Fields (2014), might aid interpretation of the data shown in Figure 2. Because our dataset consists of sparse single-amino acid substitutions rather than a complete mutational scan, such heatmaps are inherently less dense and less effective at conveying patterns than in saturation mutagenesis studies. Nevertheless, to aid readers who may find this visualization useful, we have generated and included a single-nucleotide variant heatmap as Supplemental Figure S1.
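To illustrate why SNV-only data yield a patchy heatmap, here is a minimal sketch of the pivot underlying such a figure; all positions, substitutions, and growth values are invented and are not the study's data:

```python
import pandas as pd

# Toy variant table (values invented): one row per assayed amino acid substitution.
df = pd.DataFrame({
    "pos":    [15, 15, 119, 123, 258],
    "aa":     ["V", "M", "A", "D", "V"],
    "growth": [0.97, 0.40, 0.10, 0.05, 0.86],
})

# Pivot into a substitution-by-position grid; (aa, pos) cells never assayed
# stay NaN, which is the sparsity that makes SNV-only heatmaps look patchy
# compared with saturation mutagenesis data.
grid = df.pivot(index="aa", columns="pos", values="growth")
print(grid)
print(f"blank cells: {grid.isna().to_numpy().mean():.0%}")
```

In this toy grid only 5 of 16 cells are filled; a genuine SNV-accessible library would show the same effect at scale, with most amino acid substitutions unreachable by single-nucleotide changes.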

      My major comments are as follows:

      6. Citations needed - especially in the introduction and for establishing that hASS is a homolog of yARG1

      We have added the requested citations and clarified the ASS1-ARG1 orthology in the Introduction, as described in our response to point 3 above.

      7. Generally, the authors do a nice job distinguishing the ASS1 gene from the ASS enzyme, though I found some ambiguities (Line 685). Please double-check the use of each throughout the manuscript.

      We have edited the manuscript to ensure consistent and unambiguous use of gene and enzyme nomenclature throughout.

      8. Generally, I'm confused about what strain was used for integrating all these variants: was it the arg1 knock-out strain from the yeast knockout collection, or was it FY4? I think FY4 was used for the preliminary experiments, then the KO collection strain was used for making the variant library, but I think this could be made more clear in the text and figures. Lines 226-229 describe introducing the hASS1 and yASS1 sequences into the native ARG1 locus in strain FY4, but the Fig1A image depicts the ASS1 variants going into the arg1 KO locus. Fig1A should be moved to Fig2.

      We agree that the strain construction steps were not described as clearly as they could have been. We have therefore clarified the strain construction workflow in the Materials & Methods and Results sections, as well as in the Figure 1 legend, to explicitly distinguish preliminary experiments performed in strain FY4 from construction of the variant library in the arg1 knockout background.

      As we have also added an additional panel to Figure 1 that schematically explains how the screen was performed (per Reviewer #2's request), we believe that Figure 1A is appropriately placed and should remain in Figure 1.

      1. Line 303 - "We classify these variants as 'functionally unimpaired'", this is not an accurate description of these variants as Figure 2 highlights 24 pathogenic ClinVar variants that would fall into this category of "functionally unimpaired". The yeast growth assay appears to capture pathogenic variants, but there is likely some nuance of human ASS functionality that is not being assessed here. I would make the language more specific, e.g. "complementary to Arg1" or "growth-compatible".

      We agree that the label "functionally unimpaired" could be misinterpreted if read as implying clinical benignity. We have therefore clarified within the manuscript that this designation refers strictly to variant behavior in the yeast growth assay (i.e., wild-type-like growth under assay conditions) and does not imply absence of pathogenicity. We also expanded the Discussion to explicitly address the subset of clinically pathogenic variants with high growth scores (>0.85), consistent with a ceiling effect of the assay and a key limitation for interpretation. See response to reviewer #3 point 1. Relevant revisions are also discussed in our responses to Reviewers #1 and #2.

      10. Lines 345-355 - It is interesting that there are variants that appear functional at the substrate-interfacing sites. Is there anything common across these variants? Are they maintaining the polarity or hydrophobicity of the WT residue? Are any of these variants included in ClinVar or gnomAD? Are pathogenic variants found at any of these sites?

      Yes. For highly sensitive active-site residues that have few permissible variants, the vast majority of amino acid substitutions that do retain activity preserve key physicochemical properties of the wild-type residue, such as hydrophobicity or charge. We have added this important observation to the manuscript:

      "Any variants at these sensitive residues that are permissive for activity in our assay retain hydrophobicity or charged states relative to the original amino acid side chain (Figure 5A & Table S5)."

      None of these variants are present in ClinVar. Only L15V and E191D are present in gnomAD (Table S4).

      11. Lines 423-430 - The OddsPath calculation would seem to rely heavily on the threshold of 0.85 for normalized growth. The OddsPath calculation could be bolstered with some additional analysis that emphasizes its robustness to alternative thresholds.

      We agree that the sensitivity of the OddsPath calculation to the choice of growth thresholds is an important consideration. In our assay, benign ClinVar variants and non-human primate variants are observed exclusively within the peak centered on wild-type growth, whereas clinically annotated variants falling below this peak are exclusively pathogenic. On this basis, we defined the upper boundary of the assay range interpreted as supporting pathogenicity as the lower boundary of the wild-type-centered peak in the growth distribution (as defined in Figure 3), rather than selecting a cutoff by direct optimization of the OddsPath. This choice reflects the observed concordance, in our dataset, between the onset of measurable functional impairment in the assay and clinical pathogenic annotation. Importantly, in practice the OddsPath value is locally robust to the precise placement of this boundary, remaining invariant across the range 0.82-0.88. Supporting our chosen threshold of 0.85, the lowest-growth benign or primate variant observed has a normalized growth value of 0.88, while the lowest growth observed among variants present as homozygotes in gnomAD was 0.86. We have clarified this rationale and analysis in the revised manuscript.

      "Notably, the "Among all nine of the human ASS1 missense variants observed as homozygotes in gnomAD which were tested as amino acid substitutions in our assay, the lowest observed growth value was 0.86 (Ala258Val) consistent with the lower boundary of the PrimateAI variants which was a growth value of 0.87 (Ala81Thr) (Figure 6) and with our use of a 0.85 classification threshold."

      "If we treat PrimateAI variants as benign (solely for OddsPath calculation purposes), the OddsPath for growth

      12. Lines 432-441 - This is an interesting idea to use variants observed in primates; has ACMG weighed in on this? I understand that CTLN1 is an autosomal recessive disorder, but I'd still be interested in seeing how the observed ASS1 missense variants in gnomAD perform in your growth assay, possibly a supplemental figure?

      To our knowledge, the ACMG/AMP guidelines do not currently address the use of homozygous missense variants observed in non-human primates. We are currently in discussion with two ClinGen working groups to discuss the possibility of formalizing the use of this data source.

      We agree that comparison with human population data is also important. Accordingly, total gnomAD allele counts and homozygous counts for all applicable ASS1 missense variants are provided in Table S4, and the growth behavior of ASS1 missense variants observed in the homozygous state in gnomAD is shown in Figure 6. These homozygous variants uniformly exhibit high growth in our assay, consistent with the absence of strong loss-of-function effects. We have updated the manuscript text to clarify these points.

      Minor comments

      1. Lines 53-59 - This paragraph needs to cite the literature, especially lines 56, 57, and 59
      2. Line 61 - no need to repeat "citrullinemia type I", just use the abbreviation as it was introduced in the paragraph above
      3. Lines 61-71 - again, this paragraph needs more literature citations
      4. Line 62 - change to "results"

      The changes suggested in points 1-4 have all been implemented in the revised manuscript.

      1. Line 74-75 - "RUSP" acronym not needed as it's never used in the manuscript, the same goes for "HHS"

      We agree that the acronyms "RUSP" and "HHS" are not reused elsewhere in the manuscript. We have nevertheless retained them at first mention, alongside the expanded names, because these acronyms are commonly used in newborn screening and public health policy contexts and may be more familiar to some readers than the expanded terms. We would be happy to remove the acronyms if preferred.

      1. Line 86 - "ASS1" I think is referring to the enzyme and should just be "ASS"? If referring to the gene then italicize to "ASS1"
      2. Lines 91-93 - It would be helpful to mention this is a functional screen in yeast
      3. Line 101 - It would be helpful to the readers to define SD before using the acronym, consider changing to "minimal synthetic defined (SD) medium" and afterwards can refer to as "SD medium"
      4. 109-114 - It would be great if you could share your method for designing the codon-harmonized yASS1 gene, consider sharing as a supplemental script or creating a GitHub repository linked to a Zenodo DOI for publication.

      The changes suggested in points 6-9 have all been implemented in the revised manuscript. The codon harmonization script has been provided as File S1.

      10. Lines 135-137 - I think it's helpful to provide a full table of the cassettes ordered from Twist as well as the primers used to amplify them, consider a supplemental table.

      Details of the Twist cassettes and the primer sequences used for amplification have been added to the Materials & Methods.

      1. Line 138 - "standard methods" is a bit vague, I'm guessing this is a Geitz and Schiestl 2007 LiAc/ssDNA protocol (PMID 17401334)? Also, was ClonNAT used to select for natMX colonies?

      The reviewer is correct about which protocol was used, and we have added the citation. We have also clarified that selection was carried out based on resistance to nourseothricin.

      1. Line 150 - change to "sequence the entire open reading frame, as previously described [4]."
      2. Line 222-223 - remove "replace" and just use "complement" (and remove the parenthesis)
      3. Line 249 - It would be great to see a supplemental alignment of the hASS1 and yASS1 sequences.
      4. Line 261 - spelling "citrullemia" should be corrected to "citrullinemia"
      5. Line 280 - "using Oxford Nanopore sequencing" is a bit vague, I suggest specifying the equipment used (e.g. Oxford Nanopore Technologies MinION platform) or simplify to "via long-read sequencing (see Materials & Methods)"

      The changes suggested in points 12-16 have all been implemented in the revised manuscript. An alignment of the ASS and Arg1 protein sequences has been provided as File S2.

      17. Line 287-289 - It would be great to see the average number of isolates per variant, as well as a plot of the variant growth estimate vs individual isolate growth.

      We agree with the reviewer that conveying measurement precision is important. The number of isolates assayed per variant is provided in Table S4, and we have added explicit mention of this in the text. Because variants were assayed with a mixture of 1, 2, or ≥3 independent isolates, a scatterplot of variant-level growth estimates versus individual isolate measurements would be difficult to interpret and potentially misleading. Instead, we report standard error estimates for each variant in Table S4, derived from the linear model used to estimate growth effects, which more appropriately summarizes measurement uncertainty given the experimental design.
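The logic of deriving a standard error for every variant from a single linear model, even for variants with one isolate, can be illustrated with a toy one-way fixed-effects fit. The variant names, isolate counts, and growth values below are invented; this is a sketch of the general approach, not the study's actual model code:

```python
import numpy as np

# Invented isolate-level growth data: the real design mixes 1, 2, or >=3
# independent isolates per variant.
isolates = {
    "A258V": [0.84, 0.88, 0.86],
    "G362V": [0.05],
    "R127Q": [0.51, 0.55],
}

# One-way fixed-effects model (growth ~ variant): each variant effect is its
# isolate mean, and a pooled residual variance supplies a standard error even
# for variants measured only once.
resid_ss, resid_df = 0.0, 0
for vals in isolates.values():
    v = np.asarray(vals)
    resid_ss += float(((v - v.mean()) ** 2).sum())
    resid_df += len(v) - 1
sigma2 = resid_ss / resid_df  # pooled error variance across all variants

for name, vals in isolates.items():
    n = len(vals)
    print(f"{name}: growth = {np.mean(vals):.2f} +/- {np.sqrt(sigma2 / n):.3f} (n={n})")
```

Pooling the residual variance is what allows single-isolate variants like the invented G362V above to receive an uncertainty estimate at all, which a per-variant scatterplot could not convey.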

      18. Lines 324-325 - consider removing the last sentence of this paragraph, it is redundant as the following paragraph starts with the same statement.

      We have removed this sentence.

      19. Lines 327-335 - This is interesting and would benefit from its own subpanel or plot in which the normalized growth score is plotted against variants that are at conserved or diverse residues in human ASS, to see if there's a statistical difference in score between the two groupings.

      As suggested by the reviewer, we have added Supplemental Figure 2 (Figure S2) in which the normalized growth score of each variant is plotted against the conservation of the corresponding residue, as measured by ConSurf. The manuscript already includes a statistical analysis of the relationship between residue conservation and functional impact, showing that amorphic variants occur significantly more frequently at highly conserved residues than unimpaired variants do (one-sided Fisher's exact test). We now refer to this new supplemental figure in the relevant Results section.
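For completeness, the one-sided Fisher's exact test referenced here can be computed with the Python standard library alone. The 2x2 counts below are invented for illustration, not the manuscript's actual contingency table:

```python
# Stdlib-only one-sided Fisher's exact test (enrichment direction) for a 2x2
# table [[a, b], [c, d]]; counts below are invented for illustration.
from math import comb

def fisher_one_sided(a, b, c, d):
    """P(X >= a) under the hypergeometric null with the table's margins fixed."""
    n = a + b + c + d
    row1, col1 = a + b, a + c
    p = 0.0
    for k in range(a, min(row1, col1) + 1):
        p += comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)
    return p

# rows: amorphic vs unimpaired variants; cols: highly conserved vs other residues
p = fisher_one_sided(30, 10, 15, 45)
print(f"one-sided p = {p:.3g}")
```

With these invented counts, 30 of 40 amorphic variants sit at conserved residues against an expectation of 18 under the null, so the test reports strong enrichment, the same directional question asked of the real data.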

      20. Lines 339-341 - As written, it is unclear if aspartate interacts with all of the same residues as citrulline or just Asn123 and Thr119.
      21. Lines 345-355 - As with my above comment, I find this interesting and would
      22. Line 353 - add a period to "al" in "Diez-Fernandez et al."

      The issues raised in points 20 and 22 have both been addressed. Point 21 appears to be truncated.

      23. Figure 1
      a. Remove "Figure" from the subpanels and show just "A" and "B" (as you do for Figure 4) and combine the two images into a single image. Also make this correction to Figure 5 and Figure 8.
      b. Panel A - I thought the hASS1 and yASS1 were dropped into FY4, not the arg1 KO strain. This needs clarification.
      c. Panel A - I'm assuming the natMX cassette contains its own promoter, you could use a right-angled arrow to indicate where the promoters are in your construct.
      d. Panel B - I'm not sure the bar graph is necessary, it would be more helpful to see calculations of the colony size (or growth curves for each strain) and plot the raw values (maybe pixel counts?) for each replicate rather than normalizing to yeast ARG1. It would be great to have a supplemental figure showing all the replicates side-by-side.
      e. Panel B - Would be helpful to denote the pathogenic and benign ClinVar variants with an icon or colored text.

      f. Figure 1 Caption - make "A)" and "B)" bold.

      We have implemented the requested changes in Figure 1 with the following exceptions. We have retained panels A and B as separate subfigures because they illustrate distinct experimental concepts. In addition, we respectfully disagree with point (d). The bar graph is intended to provide a clear, high-level comparison of functional complementation by hASS1 versus yASS1 and to illustrate the gross differences in growth between benign and pathogenic proof-of-principle variants. As the bar graph includes error bars for standard deviations, presenting raw colony size measurements or growth curves for individual replicates would substantially complicate the figure without materially improving interpretability for this purpose.

      1. Figure 2 a. "Shown in magenta are amino acid substitutions corresponding to ClinVar pathogenic, pathogenic/likely pathogenic, and likely pathogenic variants" is repeated in the figure caption. b. "Shown in green are amino acid substitutions corresponding to ClinVar benign and likely benign variants." I don't see any green points. c. Identify the colors used for ASS1 substrate binding residues. d. This plot would benefit from a depiction of the human ASS secondary structure and any protein domains (nucleotide-binding domain, synthase domain, and C-terminal helix from Fig4B)

      e. Line 685 675 - "ASS1" is being used in reference to the enzyme, is this correct or should it be "ASS"?

      We have made the requested changes to Figure 2. The repeated caption text has been removed, and references to green points have been corrected to orange points to match the figure. The colors used to indicate ASS substrate-binding residues are explicitly described in the figure key. Secondary structure annotations have been added. References to the enzyme have been corrected to "ASS" rather than "ASS1" where appropriate.

      1. Figure 3 a. Rename the "unimpaired" category as there are several pathogenic ClinVar variants that fall into this category.

      To address this point, we have clarified the labeling by adding "in our yeast assay" to the figure legend, making explicit that the "unimpaired" category refers only to wild-type-like behavior under assay conditions and does not imply clinical benignity. See also response to Reviewer #3, Major Comment 1.

      26. Figure 4
      a. List the PDB or AlphaFold accession used for this structure
      b. Panel A - state which colors are used to depict each monomer. It is confusing to see several shades of pink/purple used to depict a single monomer in Panel A.
      c. It is very difficult to make out the aspartate and citrulline substrates in the catalytic binding pocket, consider making an inset zooming in on this domain and displaying a ribbon diagram of the structure rather than the surface.
      d. Generally, it would be more helpful here to label any particular residues that were identified as pathogenic from your screen, or to overlay average growth scores per residue onto the structure

      We have implemented the requested changes to Figure 4. The relevant PDB/AlphaFold accession is now listed, and the colors used to depict each monomer in Panel A are clarified in the figure legend. An inset focusing on the active site has been added to improve visualization of the citrulline and aspartate substrates. In addition, we have added new panels (Figure 4C and 4D) overlaying pathogenic residues and average growth scores onto the structure to more directly link structural context with functional data.

      27. Figure 5
      a. Line 716 - Insert a page break to place Figure 5 on its own page
      b. I suggest using a heatmap for this type of plot, as it is very difficult to track which color corresponds to which residue.

      c. Fig5A - This plot could be improved by identifying which residue positions interface with which substrate.

      We have placed Figure 5 on its own page and added information to the legend identifying which residue positions interface with each substrate. Regarding point (b), we have retained the active-site variant strip charts, as we believe they effectively illustrate how the distribution of variant effects differs between residues. In addition, we have provided a supplemental heatmap showing variant growth across the entire protein (Figure S1), and individual variant scores for all residues are provided in Table S4.

      28. Figure 7
      a. Line 735 - Insert page break to place figure on a new page
      b. List the PDB accession used for these images.
      c. For clarity I would mention "human ASS" in the figure title
      d. State the colors of the substrates
      e. Panels A and B could be combined into a single panel, making it easier to distinguish the active site and dimerization variants.

      f. Could be interesting to get SASA scores for the ClinVar structural variants to determine if they are surface-accessible

      We have implemented the requested changes in Figure 7 with the following exceptions. For point (e), there is no single orientation of the structure that allows a clear simultaneous view of both active-site and dimerization variants; accordingly, we have retained panels A and B as separate subfigures to preserve clarity. With respect to point (f), we agree that solvent accessibility analysis could be informative in other contexts. However, such an analysis does not integrate naturally with the functional and assay-based framework of the present study and was therefore not included.

      29. Figure 8
      a. Panel B - overlay a square frame in the larger protein structure that depicts where the below inset is focused, and frame the inset image as well.

      We have framed the inset image as requested. We did not add a corresponding frame to the full protein structure, as doing so obscured structural details in the region of interest.

      Reviewer #3 (Significance (Required)):

      Section 2 - Significance This study represents a substantial technical, functional, and translational advance in the interpretation of missense variation in ASS1, a gene of high clinical relevance for the rare disease citrullinemia type I. Its principal strength lies in the generation of an experimentally validated functional atlas of ASS1 missense variants that covers ~90% of all SNV-accessible substitutions. The scale, internal reproducibility, and careful benchmarking of the yeast complementation assay against known pathogenic and benign variants provide a robust foundation for identifying pathogenic ASS1 variants. Particularly strong aspects include the rigorous quality control of variant identities, the quantitative nature of the functional readout, and the thoughtful integration of results into the ACMG/AMP OddsPath framework. The discovery of intragenic complementation between variants affecting distinct structural regions of the enzyme is a notable conceptual and mechanistic contribution. Limitations include the assay's reduced sensitivity to variants impacting oligomerization or subtle folding defects, and the use of yeast as a heterologous system, which may mask disease-relevant mechanisms as several pathogenic ClinVar variants were found to be "functionally unimpaired". Future work extending functional testing to additional cellular contexts or expanding genotype-level combinatorial analyses would further enhance clinical applicability. Relative to prior studies, which have relied on small numbers of patient-derived variants or low-throughput biochemical assays, this work extends the field decisively by delivering a comprehensive, variant-resolved functional map for ASS1. 
To the best of my current knowledge, this is the first systematic functional screen of ASS1 at this scale and the first direct experimental demonstration that ASS active sites span multiple subunits, enabling intragenic complementation consistent with Crick and Orgel's classic variant sequestration model. As such, the advance is simultaneously technical (high-throughput functional genomics), mechanistic (defining structural contributors to catalysis and epistasis), and clinical (enabling evidence-based reclassification of VUS). I find the use of homozygous non-human primate variants as an orthogonal benign calibration set both creative and controversial; my hope is that this manuscript will prompt a productive discussion.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary

      This manuscript presents a comprehensive functional profiling of 2,193 ASS1 missense variants using a yeast complementation assay, providing valuable data for variant interpretation in the rare disease citrullinemia type I. The dataset is extensive, technically sound, and clinically relevant. The demonstration of intragenic complementation in ASS1 is novel and conceptually important. Overall, the study represents a substantial contribution to functional genomics and rare disease variant interpretation.

      Major comments

      This is an exciting paper as it can provide support to clinicians to make actionable decisions when diagnosing infants. I have a few major comments, but I want to emphasize that the label of "functionally unimpaired" variants is misleading. The authors explain that there are several pathogenic ClinVar variants that fall into this category (above the >0.85 growth threshold), but I think this category needs a more specific name, and I would ask the authors to reiterate the shortcomings of the assay again in the Discussion section. I think there's an important discussion to be had here: is the assay detecting variants that alter the function of ASS, or is it detecting only a complete ablation of enzymatic activity? The results might be strengthened with a follow-up experiment that identifies stably expressed ASS1 variants. At the very least, it would be great to see the authors replicate some of their interesting results from the high-throughput screen by down-selecting to ~12 variants of uncertain significance that could be newly considered pathogenic. I would ask the authors to provide more citations of the literature in the introduction of the manuscript. I would be especially interested in knowing more about human ASS being identified as a homolog of yeast ARG1, as they share little sequence similarity (27.5%) at the protein level. That said, I find the yeast complementation assay exciting. I appreciate the efforts made by the authors to share their work and make this study more reproducible, such as sharing the hASS1 and yASS1 plasmids on NCBI GenBank (Line 121) and publishing the ONT reads on SRA (Line 154). I have made requests for additional data to be shared, such as the custom method/code for codon optimization and a table of the Twist variant cassettes that were ordered. I would also love to see these results shared on MaveDB.org. 
I find this manuscript very exciting as the authors have a compelling assay that identifies pathogenic variants, but I was generally disappointed by the quality and organization of the figures. For example, Figure 4 provides very little insight, but could be dramatically improved with an overlay of the normalized growth score data or by highlighting variants surrounding the substrate or ATP interfaces. There are some very interesting aspects of this manuscript that could shine through with some polished figures. I would also encourage the authors to generate a heatmap of the data represented in Figure 2 (see Fowler and Fields 2014, PMID 25075907, Figure 2); this would be a more helpful reference for readers.

      My major comments are as follows:

      1. Citations needed - especially in the introduction and for establishing that hASS is a homolog of yARG1
      2. Generally, the authors do a nice job distinguishing the ASS1 gene from the ASS enzyme, though I found some ambiguities (Line 685). Please double-check the use of each throughout the manuscript
      3. Generally, I'm confused about what strain was used for integrating all these variants: was it the arg1 knock-out strain from the yeast knockout collection, or was it FY4? I think FY4 was used for the preliminary experiments, then the KO collection strain was used for making the variant library, but I think this could be made clearer in the text and figures. Lines 226-229 describe introducing the hASS1 and yASS1 sequences into the native ARG1 locus in strain FY4, but the Fig1A image depicts the ASS1 variants going into the arg1 KO locus. Fig1A should be moved to Fig2.
      4. Line 303 - "We classify these variants as 'functionally unimpaired'", this is not an accurate description of these variants as Figure 2 highlights 24 pathogenic ClinVar variants that would fall into this category of "functionally unimpaired". The yeast growth assay appears to capture pathogenic variants, but there is likely some nuance of human ASS functionality that is not being assessed here. I would make the language more specific, e.g. "complementary to Arg1" or "growth-compatible".
      5. Lines 345-355 - It is interesting that there are variants that appear functional at the substrate interfacing sites. Is there anything common across these variants? Are they maintaining the polarity or hydrophobicity of the WT residue? Are any of these variants included in ClinVar or gnomAD? Are pathogenic variants found at any of these sites?
      6. Lines 423-430 - The OddsPath calculation would seem to rely heavily on the thresholds of <0.05 and >0.85 for normalized growth. It could be bolstered with an additional analysis that demonstrates robustness to alternative thresholds.
      7. Lines 432-441 - This is an interesting idea to use variants observed in primates, has ACMG weighed in on this? I understand that CTLN1 is an autosomal recessive disorder but I'd still be interested in seeing how the observed ASS1 missense variants in gnomAD perform in your growth assay, possibly a supplemental figure?

      Minor comments

      1. Lines 53-59 - This paragraph needs to cite the literature, especially lines 56, 57, and 59
      2. Line 61 - no need to repeat "citrullinemia type I", just use the abbreviation as it was introduced in the paragraph above
      3. Lines 61-71 - again, this paragraph needs more literature citations
      4. Line 62 - change to "results"
      5. Line 74-75 - "RUSP" acronym not needed as it's never used in the manuscript, the same goes for "HHS"
      6. Line 86 - "ASS1" I think is referring to the enzyme and should just be "ASS"? If referring to the gene then italicize to "ASS1"
      7. Lines 91-93 - It would be helpful to mention this is a functional screen in yeast
      8. Line 101 - It would be helpful to the readers to define SD before using the acronym, consider changing to "minimal synthetic defined (SD) medium" and afterwards can refer to as "SD medium"
      9. Lines 109-114 - It would be great if you could share your method for designing the codon-harmonized yASS1 gene, consider sharing as a supplemental script or creating a GitHub repository linked to a Zenodo DOI for publication.
      10. Lines 135-137 - I think it's helpful to provide a full table of the cassettes ordered from Twist as well as the primers used to amplify them, consider a supplemental table
      11. Line 138 - "standard methods" is a bit vague, I'm guessing this is a Gietz and Schiestl 2007 LiAc/ssDNA protocol (PMID 17401334)? Also, was ClonNAT used to select for natMX colonies?
      12. Line 150 - change to "sequence the entire open reading frame, as previously described [4]."
      13. Line 222-223 - remove "replace" and just use "complement" (and remove the parenthesis)
      14. Line 249 - It would be great to see a supplemental alignment of the hASS1 and yASS1 sequences
      15. Line 261 - spelling "citrullemia" should be corrected to "citrullinemia"
      16. Line 280 - "using Oxford Nanopore sequencing" is a bit vague, I suggest specifying the equipment used (e.g. Oxford Nanopore Technologies MinION platform) or simplify to "via long-read sequencing (see Materials & Methods)"
      17. Line 287-289 - It would be great to see the average number of isolates per variant, as well as a plot of the variant growth estimate vs individual isolate growth
      18. Lines 324-25 - consider removing the last sentence of this paragraph, it is redundant as the following paragraph starts with the same statement
      19. Lines 327-335 - This is interesting and would benefit from its own subpanel or plot in which the normalized growth score is plotted for variants at conserved versus diverse residues in human ASS, to see whether there is a statistical difference in score between the two groupings
      20. Lines 339-341 - As written, it is unclear if aspartate interacts with all of the same residues as citrulline or just Asn123 and Thr119.
      21. Lines 345-355 - As with my above comment, I find this interesting and would
      22. Line 353 - add a period to "al" in "Diez-Fernandex et al."
      23. Figure 1

      a. Remove "Figure" from the subpanels and show just "A" and "B" (as you do for Figure 4) and combine the two images into a single image. Also make this correction to Figure 5 and Figure 8

      b. Panel A - I thought the hASS1 and yASS1 were dropped into FY4, not the arg1 KO strain. This needs clarification

      c. Panel A - I'm assuming the natMX cassette contains its own promoter, you could use a right-angled arrow to indicate where the promoters are in your construct

      d. Panel B - I'm not sure the bar graph is necessary, it would be more helpful to see calculations of the colony size (or growth curves for each strain) and plot the raw values (maybe pixel counts?) for each replicate rather than normalizing to yeast ARG1. It would be great to have a supplemental figure showing all the replicates side-by-side

      e. Panel B - Would be helpful to denote the pathogenic and benign ClinVar variants with an icon or colored text

      f. Figure 1 Caption - make "A)" and "B)" bold

      24. Figure 2

      a. "Shown in magenta are amino acid substitutions corresponding to ClinVar pathogenic, pathogenic/likely pathogenic, and likely pathogenic variants" is repeated in the figure caption

      b. "Shown in green are amino acid substitutions corresponding to ClinVar benign and likely benign variants." I don't see any green points

      c. Identify the colors used for ASS1 substrate binding residues

      d. This plot would benefit from a depiction of the human ASS secondary structure and any protein domains (nucleotide-binding domain, synthase domain, and C-terminal helix from Fig4B)

      e. Line 685 - "ASS1" is being used in reference to the enzyme; is this correct or should it be "ASS"?

      25. Figure 3

      a. Rename the "unimpaired" category as there are several pathogenic ClinVar variants that fall into this category

      26. Figure 4

      a. List the PDB or AlphaFold accession used for this structure

      b. Panel A - state which colors are used to depict each monomer. It is confusing to see several shades of pink/purple used to depict a single monomer in Panel A

      c. It is very difficult to make out the aspartate and citrulline substrates in the catalytic site; consider making an inset zooming in on this domain and displaying a ribbon diagram of the structure rather than the surface.

      d. Generally, it would be more helpful here to label any particular residues that were identified as pathogenic from your screen, or to overlay average growth scores per residue onto the structure

      27. Figure 5

      a. Line 716 - Insert a page break to place Figure 5 on its own page

      b. I suggest using a heatmap for this type of plot, as it is very difficult to track which color corresponds to which residue

      c. Fig5A - This plot could be improved by identifying which residue positions interface with which substrate

      28. Figure 7

      a. Line 735 - Insert page break to place figure on a new page

      b. List the PDB accession used for these images

      c. For clarity I would mention "human ASS" in the figure title

      d. State the colors of the substrates

      e. Panels A and B could be combined into a single panel, making it easier to distinguish the active site and dimerization variants

      f. Could be interesting to get SASA scores for the ClinVar structural variants to determine if they are surface-accessible

      29. Figure 8

      a. Panel B - overlay a square frame in the larger protein structure that depicts where the below inset is focused, and frame the inset image as well.
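
      The conserved-versus-diverse comparison suggested in minor comment 19 could be backed by a simple two-group permutation test on normalized growth scores. A minimal sketch with invented scores (the groupings and values are hypothetical, not the study's data):

```python
# Two-sided permutation test on the difference in mean normalized growth
# between variants at conserved vs. diverse residues.
# All scores below are invented placeholders.
import random
from statistics import fmean

def permutation_p(group_a, group_b, n_perm=10_000, seed=0):
    rng = random.Random(seed)
    observed = abs(fmean(group_a) - fmean(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        a, b = pooled[:len(group_a)], pooled[len(group_a):]
        if abs(fmean(a) - fmean(b)) >= observed:
            hits += 1
    return hits / n_perm

conserved = [0.05, 0.10, 0.02, 0.08, 0.12, 0.04]  # hypothetical scores at conserved residues
diverse = [0.80, 0.95, 0.70, 0.85, 0.90, 0.75]    # hypothetical scores at diverse residues
print(permutation_p(conserved, diverse))  # small p-value: the groupings differ
```

      With real data, the same test (or a rank-based equivalent) would quantify whether growth scores differ between the two residue classes.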

      Significance

      This study represents a substantial technical, functional, and translational advance in the interpretation of missense variation in ASS1, a gene of high clinical relevance for the rare disease citrullinemia type I. Its principal strength lies in the generation of an experimentally validated functional atlas of ASS1 missense variants that covers ~90% of all SNV-accessible substitutions. The scale, internal reproducibility, and careful benchmarking of the yeast complementation assay against known pathogenic and benign variants provide a robust foundation for identifying pathogenic ASS1 variants. Particularly strong aspects include the rigorous quality control of variant identities, the quantitative nature of the functional readout, and the thoughtful integration of results into the ACMG/AMP OddsPath framework. The discovery of intragenic complementation between variants affecting distinct structural regions of the enzyme is a notable conceptual and mechanistic contribution. Limitations include the assay's reduced sensitivity to variants impacting oligomerization or subtle folding defects, and the use of yeast as a heterologous system, which may mask disease-relevant mechanisms as several pathogenic ClinVar variants were found to be "functionally unimpaired". Future work extending functional testing to additional cellular contexts or expanding genotype-level combinatorial analyses would further enhance clinical applicability.

      Relative to prior studies, which have relied on small numbers of patient-derived variants or low-throughput biochemical assays, this work extends the field decisively by delivering a comprehensive, variant-resolved functional map for ASS1. To the best of my current knowledge, this is the first systematic functional screen of ASS1 at this scale and the first direct experimental demonstration that ASS active sites span multiple subunits, enabling intragenic complementation consistent with Crick and Orgel's classic variant sequestration model. As such, the advance is simultaneously technical (high-throughput functional genomics), mechanistic (defining structural contributors to catalysis and epistasis), and clinical (enabling evidence-based reclassification of VUS). I find the use of homozygous non-human primate variants as an orthogonal benign calibration set both creative and controversial; my hope is that this manuscript will prompt a productive discussion.

    1. But it’s still not entirely natural, and he sometimes slips. About halfway through a recent nine-month research project, he’d built up so many files that he gave up on keeping them all structured.

      I find this interesting and relatable because even people with years of experience can make mistakes. I sometimes give up and just put everything in one folder too. This shows how people often shift from structured folders to a “one bucket” approach when things become overwhelming.

    1. ‘repulsive creatures’ who menaced the very foundations of American civilization

      It's interesting how this fear has not subsided as we get to modern times, just been redirected. Even though the overwhelming majority of Americans trace their roots to immigration, there is constantly a sense of "us versus them", born from the fear of the unknown. This also reminds me of a book set around this time that I'm sure a lot of us have read in our English classes --- The Jungle by Upton Sinclair.

    1. Author response:

      The following is the authors’ response to the current reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      The revised manuscript presents an interesting and technically competent set of experiments exploring the role of the infralimbic cortex (IL) in extinction learning. The inclusion of histological validation in the supplemental material improves the transparency and credibility of the results, and the overall presentation has been clarified. However, several key issues remain that limit the strength of the conclusions.

      We thank the Reviewer for their positive assessment of our revised manuscript. We discuss the issues raised by the Reviewer below.

      The behavioral effects reported are modest, as evident from the trial-by-trial data included in the supplemental figures. Although the authors interpret their findings as evidence that IL stimulation facilitates extinction only after prior inhibitory learning, this conclusion is not directly supported by their data. The experiments do not include a condition in which IL stimulation is delivered during extinction training alone, without prior inhibitory experience. Without this control, the claim that prior inhibitory memory is necessary for facilitation remains speculative.

      The manuscript provides evidence across five experiments (Figures 2-6) that IL stimulation fails to facilitate extinction training in the absence of prior inhibitory experience. We therefore remain confident that the data support our conclusion: prior inhibitory learning enables IL stimulation to facilitate subsequent inhibitory learning.

      The electrophysiological example provided shows that IL stimulation induces a sustained inhibition that outlasts the stimulation period. This prolonged suppression could potentially interfere with consolidation processes following tone presentation rather than facilitating them. The authors should consider and discuss this alternative interpretation in light of their behavioral data.

      The possibility that IL stimulation exerted its effects by interfering with consolidation processes is inconsistent with the literature. Disrupting consolidation processes in the IL impairs extinction learning (1), even when animals have prior inhibitory learning experience (2). Yet our experiments found that IL stimulation failed to interfere with initial extinction learning but instead facilitated subsequent learning. Furthermore, the electrophysiological example demonstrates that the inhibitory effect is transient: the cell returned to firing properties similar to those observed pre-stimulation, making it unlikely that inhibition persists during the consolidation window.

      It is unfortunate that several animals had to be excluded after histological verification, but the resulting mismatch between groups remains a concern. Without a power analysis indicating the number of subjects required to achieve reliable effects, it is difficult to determine whether the modest behavioral differences reflect genuine biological variability or insufficient statistical power. Additional animals may be needed to properly address this imbalance.

      As noted in the revised manuscript, we are confident about the reliability of the findings reported. The manuscript provides evidence across five experiments that IL stimulation fails to facilitate brief extinction in the absence of prior inhibitory experience, replicating previous findings (3, 4). The manuscript also replicates these prior studies by demonstrating that experience with either fear or appetitive extinction enables IL stimulation to facilitate subsequent fear extinction. Furthermore, the present experiments replicate the facilitative effects of IL stimulation following fear or appetitive backward conditioning.

      Overall, while the manuscript is improved in clarity and methodological detail, the behavioral effects remain weak, and the mechanistic interpretation requires stronger experimental support and consideration of alternative explanations.

      We respectfully disagree with the assertion that the reported results are weak. The manuscript replicates all main findings internally or reproduces findings from previously published studies. While alternative explanations cannot be entirely excluded, we are not aware of any competing account that predicts the pattern of results reported here.

      Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors examine the mechanisms by which stimulation of the infralimbic cortex (IL) facilitates the retention and retrieval of inhibitory memories. Previous work has shown that optogenetic stimulation of the IL suppresses freezing during extinction but does not improve extinction recall when extinction memory is probed one day later. When stimulation occurs during a second extinction session (following a prior stimulation-free extinction session), freezing is suppressed during the second extinction as well as during the tone test the following day. The current study was designed to further explore the facilitatory role of the IL in inhibitory learning and memory recall. The authors conducted a series of experiments to determine whether recruitment of IL extends to other forms of inhibitory learning (e.g., backward conditioning) and to inhibitory learning involving appetitive conditioning. Further, they assessed whether their effects could be explained by stimulus familiarity. The results of their experiments show that backward conditioning, another form of inhibitory learning, also enabled IL stimulation to enhance fear extinction. This phenomenon was not specific to aversive learning as backward appetitive conditioning similarly allowed IL stimulation to facilitate extinction of aversive memories. Finally, the authors ruled out the possibility that IL facilitated extinction merely because of prior experience with the stimulus (e.g., reducing the novelty of the stimulus). These findings significantly advance our understanding of the contribution of IL to inhibitory learning. Namely, they show that the IL is recruited during various forms of inhibitory learning and its involvement is independent of the motivational value associated with the unconditioned stimulus.

      We thank the Reviewer for their positive assessment.

      Strengths to highlight:

      (1) Transparency about the inclusion of both sexes and the representation of data from both sexes in figures

      We thank the Reviewer for their positive assessment.

      (2) Very clear representation of groups and experimental design for each figure

      We thank the Reviewer for their positive assessment.

      (3) The authors were very rigorous in determining the neurobehavioral basis for the effects of IL stimulation on extinction. They considered multiple interpretations and designed experiments to address these possible accounts of their data.

      We thank the Reviewer for their positive assessment.

      (4) The rationale for and the design of the experiments in this manuscript are clearly based on a wealth of knowledge about learning theory. The authors leveraged this expertise to narrow down how the IL encodes and retrieves inhibitory memories.

      We thank the Reviewer for their positive assessment.

      Reviewer #3 (Public review):

      Summary:

      This is a really nice manuscript with different lines of evidence to show that the IL encodes inhibitory memories that can then be manipulated by optogenetic stimulation of these neurons during extinction. The behavioral designs are excellent, with converging evidence using extinction/re-extinction, backwards/forwards aversive conditioning, and backwards appetitive/forwards aversive conditioning. Additional factors, such as nonassociative effects of the CS or US, also are considered, and the authors evaluate the inhibitory properties of the CS with tests of conditioned inhibition. The authors have addressed the prior reviews. I still think it is unfortunate that the groups were not properly balanced in some of the figures (as noted by the authors, they were matched appropriately in real time, but some animals had to be dropped after histology, which caused some balancing issues). I think the overall pattern of results is compelling enough that more subjects do not need to be added, but it would still be nice to see more acknowledgement and statistical analyses of how these pre-existing differences may have impacted test performance.

      We thank the Reviewer for their positive assessment of our revised manuscript. We discuss the comments regarding group balancing below.

      Strengths:

      The experimental designs are very rigorous with an unusual level of behavioral sophistication.

      We thank the Reviewer for their positive assessment.

      Weaknesses:

      The various group differences in Figure 2 prior to any manipulation are still problematic. There was a reliable effect of subsequent group assignment in Figure 2 (p<0.05, described as "marginal" in multiple places). Then there are differences in extinction (nonsignificant at p=.07). The test difference between ReExt OFF/ON is identical to the difference at the end of extinction and the beginning of Forward 2, in terms of absolute size. I really don't think much can be made of the test result. The authors state in their response that this difference was not evident during the forward phase, but there clearly is a large ordinal difference on the first trial. I think it is appropriate to only focus on test differences when groups are appropriately matched, but when there are pre-existing differences (even when not statistically significant) then they really need to be incorporated into the statistical test somehow.

      We carefully considered the Reviewer's suggestion, but it is not possible to adjust the statistical analyses at test because these analyses do not directly compare the two ReExt groups. Any scaling of performance would require including the two Ext groups, which is not feasible since these groups did not receive initial extinction. Moreover, the analyses provide no conclusive evidence of pre-existing differences between the two ReExt groups: the difference was not significant during initial extinction and was absent during the Forward 2 stage. We acknowledge that closer performance between the two ReExt groups during initial extinction would have been preferable. However, we remain confident in the results obtained because they replicate previous experiments in which the two ReExt groups displayed identical performance during initial extinction.

      The same problem is evident in Figure 4B, but here the large differences in the Same groups are opposite to the test differences. It's hard to say how those large differences ultimately impacted the test results. I suppose it is good that the differences during Forward conditioning did not ultimately predict test differences, but this really should have been addressed with more subjects in these experiments. The authors explore the interactions appropriately but with n=6 in the various subgroups, it's not surprising that some of these effects were not detected statistically.

      As the Reviewer noted, the unexpected differences in Figure 4B are opposite in direction to the test differences. Importantly, Figure 4B replicates the main findings from Figure 3, which did not show these unexpected differences.

      It is useful to see the trial-by-trial test data now presented in the supplement. I think the discussion does a good job of addressing the issues of retrieval, but the ideas of Estes about session cues that the authors bring up in their response haven't really held up over the years (e.g., Robbins, 1990, who explicitly tested this; other demonstrations of within-session spontaneous recovery), for what it's worth.

      We thank the Reviewer for bringing our attention to Robbins’ work on session cues. We understand that the issue of retrieval is important but as we noted before, our manuscript and its conclusions do not claim to differentiate retrieval from additional learning.

      References

      (1) K. E. Nett, R. T. LaLumiere, Infralimbic cortex functioning across motivated behaviors: Can the differences be reconciled? Neurosci Biobehav Rev 131, 704–721 (2021).

      (2) V. Laurent, R. F. Westbrook, Inactivation of the infralimbic but not the prelimbic cortex impairs consolidation and retrieval of fear extinction. Learn Mem 16, 520–529 (2009).

      (3) N. W. Lingawi, R. F. Westbrook, V. Laurent, Extinction and Latent Inhibition Involve a Similar Form of Inhibitory Learning that is Stored in and Retrieved from the Infralimbic Cortex. Cereb Cortex 27, 5547–5556 (2017).

      (4) N. W. Lingawi, N. M. Holmes, R. F. Westbrook, V. Laurent, The infralimbic cortex encodes inhibition irrespective of motivational significance. Neurobiol Learn Mem 150, 64–74 (2018).


      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The manuscript reports a series of experiments designed to test whether optogenetic activation of infralimbic (IL) neurons facilitates extinction retrieval and whether this depends on animals' prior experience. In Experiment 1, rats underwent fear conditioning followed by either one or two extinction sessions, with IL stimulation given during the second extinction; stimulation facilitated extinction retrieval only in rats with prior extinction experience. Experiments 2 and 3 examined whether backward conditioning (CS presented after the US) could establish inhibitory properties that allowed IL stimulation to enhance extinction, and whether this effect was specific to the same stimulus or generalized to different stimuli. Experiments 5 - 7 extended this approach to appetitive learning: rats received backward or forward appetitive conditioning followed by extinction, and then fear conditioning, to determine whether IL stimulation could enhance extinction in contexts beyond aversive learning and across conditioning sequences. Across studies, the key claim is that IL activation facilitates extinction retrieval only when animals possess a prior inhibitory memory, and that this effect generalizes across aversive and appetitive paradigms.

      Strengths:

      (1) The design attempts to dissect the role of IL activity as a function of prior learning, which is conceptually valuable.

      We thank the Reviewer for their positive assessment.

      (2) The experimental design of probing different inhibitory learning approaches to probe how IL activation facilitates extinction learning was creative and innovative.

      We thank the Reviewer for their positive assessment.

      Weaknesses:

      (1) Non-specific manipulation.

      ChR2 was expressed in IL without distinction between glutamatergic and GABAergic populations. Without knowing the relative contribution of these cell types or the percentage of neurons affected, the circuit-level interpretation of the results is unclear.

      ChR2 was intentionally expressed in the infralimbic cortex (IL) without distinction between local neuronal populations for two reasons. First, the primary aim of this work was to uncover some of the features characterizing the encoding of inhibitory memories in the IL, and this encoding likely engages interactions among various neuronal populations within the IL. Second, the hypotheses tested in the manuscript derived from findings that indiscriminately stimulated the IL using the GABA<sub>A</sub> receptor antagonist picrotoxin, which is best mimicked by the approach taken. We agree that it is also important to determine the respective contributions of distinct IL neuronal populations to inhibitory encoding; however, the global approach implemented in the present experiments represents a necessary initial step. These matters have been incorporated in the Discussion of the revised manuscript.

      (2) Extinction retrieval test conflates processes

      The retrieval test included 8 tones. Averaging across this many tone presentations conflates extinction retrieval/expression (early tones) with further extinction learning (later tones). A more appropriate analysis would focus on the first 2-4 tones to capture retrieval only. As currently presented, the data do not isolate extinction retrieval.

      It is unclear when retrieval of what has been learned across extinction ceases and additional extinction learning occurs. In fact, it is only the first stimulus presentation that unequivocally permits a distinction between retrieval and additional extinction learning, as the conditions for this additional learning have not been fulfilled at that presentation. However, confining evidence for retrieval to the first stimulus presentation introduces concerns that other factors could influence performance. For instance, processing of the stimulus present at the start of the session may differ from that present at the end of the previous session, thereby affecting what is retrieved. Such differences between the stimuli present at the start and end of an extinction session have long been recognized as a potential explanation for spontaneous recovery (Estes, 1955). More importantly, whether the test data presented confound retrieval and additional extinction learning or not, the interpretation remains the same with respect to the effects of a prior history of inhibitory learning on enabling the facilitative effects of IL stimulation. Finally, it is unclear how these facilitative effects could occur in the absence of the subjects retrieving the extinction memory formed under the stimulation. Nevertheless, the revised manuscript now provides the trial-by-trial performance (see Supplemental Figure 3) during the post-extinction retrieval tests and addresses this issue in the Discussion.

      (3) Under-sampling and poor group matching.

      Sample sizes appear small, which may explain why groups are not well matched in several figures (e.g., 2b, 3b, 6b, 6c) and why there are several instances of unexpected interactions (protocol, virus, and period). This baseline mismatch raises concerns about the reliability of group differences.

      Efforts were made to match group performance upon completion of each training stage and before IL stimulation. Unfortunately, these efforts were not completely successful due to exclusions following post-mortem analyses. This has been made explicit in the revised manuscript (Materials and Methods, Subjects section). However, we acknowledge that the unexpected interactions deserve further discussion, and this has been incorporated into the revised manuscript (see also comment from Reviewer 2). Although we cannot exclude the possibility that sample sizes may have contributed to some of these interactions, we remain confident about the reliability of the main findings reported, especially given their replication across the various protocols. Overall, the manuscript provides evidence that IL stimulation does not facilitate brief extinction in the absence of prior inhibitory experience in five different experiments, replicating previous findings (Lingawi et al., 2018; Lingawi et al., 2017). It also replicates these previous findings by showing that prior experience with either fear or appetitive extinction enables IL stimulation to facilitate subsequent fear extinction. Furthermore, the facilitative effects of such stimulation following fear or appetitive backward conditioning are replicated in the present manuscript. This is discussed in the Discussion of the revised manuscript.

      (4) Incomplete presentation of conditioning data

      Figure 3 only shows a single conditioning session despite five days of training. Without the full dataset, it is difficult to evaluate learning dynamics or whether groups were equivalent before testing.

      We apologize, as we incorrectly labeled the X axis for the backward conditioning data in Figures 3B, 4B, 4D and 5B. It should have indicated “Days” instead of “Trials”. This error has been corrected in the revised manuscript (see also second comment from Reviewer 2).

      (5) Interpretation stronger than evidence.

      The authors conclude that IL activation facilitates extinction retrieval only when an inhibitory memory has been formed. However, given the caveats above, the data are insufficient to support such a strong mechanistic claim. The results could reflect nonspecific facilitation or disruption of behavior by broad prefrontal activation. Moreover, there is compelling evidence that optogenetic activation of IL during fear extinction does facilitate subsequent extinction retrieval without prior extinction training (Do-Monte et al., 2015; Chen et al., 2021), which the authors do not directly test in this study.

      As noted above, the interpretations of the main findings stand whether the test data confounds retrieval with additional extinction learning or not. The revised manuscript also clarifies the plotting of the data for the backward conditioning stages. We do agree that further discussion of the unexpected interactions is necessary, and this has been incorporated into the revised manuscript. However, the various replications of the core findings provide strong evidence for their reliability and the interpretations advanced in the original manuscript. The proposal that the results reflect non-specific facilitation or disruption of behavior seems highly unlikely. Indeed, the present experiments and previous findings (Lingawi et al., 2018; Lingawi et al., 2017) provide multiple demonstrations that IL stimulation fails to produce any facilitation in the absence of prior inhibitory experience with the target stimulus. Although these demonstrations appear inconsistent with previous studies (Do-Monte et al., 2015; Chen et al., 2021), this inconsistency is likely explained by the fact that these studies manipulated activity in specific IL neuronal populations. Previous work has already revealed differences between manipulations targeting discrete IL neuronal populations as opposed to general IL activity (Kim et al., 2016). Importantly, as previously noted, the present manuscript aimed to generally explore inhibitory encoding in the IL that is likely to engage several neuronal populations within the IL. Adequate statements on these matters have been included in the Discussion of the revised manuscript.

      Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors examine the mechanisms by which stimulation of the infralimbic cortex (IL) facilitates the retention and retrieval of inhibitory memories. Previous work has shown that optogenetic stimulation of the IL suppresses freezing during extinction but does not improve extinction recall when extinction memory is probed one day later. When stimulation occurs during a second extinction session (following a prior stimulation-free extinction session), freezing is suppressed during the second extinction as well as during the tone test the following day. The current study was designed to further explore the facilitatory role of the IL in inhibitory learning and memory recall. The authors conducted a series of experiments to determine whether recruitment of IL extends to other forms of inhibitory learning (e.g., backward conditioning) and to inhibitory learning involving appetitive conditioning. Further, they assessed whether their effects could be explained by stimulus familiarity. The results of their experiments show that backward conditioning, another form of inhibitory learning, also enabled IL stimulation to enhance fear extinction. This phenomenon was not specific to aversive learning, as backward appetitive conditioning similarly allowed IL stimulation to facilitate extinction of aversive memories. Finally, the authors ruled out the possibility that IL facilitated extinction merely because of prior experience with the stimulus (e.g., reducing the novelty of the stimulus). These findings significantly advance our understanding of the contribution of IL to inhibitory learning. Namely, they show that the IL is recruited during various forms of inhibitory learning, and its involvement is independent of the motivational value associated with the unconditioned stimulus.

      Strengths:

      (1) Transparency about the inclusion of both sexes and the representation of data from both sexes in figures.

      We thank the Reviewer for their positive assessment.

      (2) Very clear representation of groups and experimental design for each figure.

      We thank the Reviewer for their positive assessment.

      (3) The authors were very rigorous in determining the neurobehavioral basis for the effects of IL stimulation on extinction. They considered multiple interpretations and designed experiments to address these possible accounts of their data.

      We thank the Reviewer for their positive assessment.

      (4) The rationale for and the design of the experiments in this manuscript are clearly based on a wealth of knowledge about learning theory. The authors leveraged this expertise to narrow down how the IL encodes and retrieves inhibitory memories.

      We thank the Reviewer for their positive assessment.

      Weaknesses:

      (1) In Experiment 1, although not statistically significant, it does appear as though the stimulation groups (OFF and ON) differ during Extinction 1. It seems like this may be due to a difference between these groups after the first forward conditioning. Could the authors have prevented this potential group difference in Extinction 1 by re-balancing group assignment after the first forward conditioning session to minimize the differences in fear acquisition (the authors do report a marginally significant effect between the groups that would undergo one vs. two extinction sessions in their freezing during the first conditioning session)?

      Efforts were made daily to match group performance across the training stages, but these efforts were ultimately hampered by the necessary exclusions following postmortem analyses. This has been made explicit in the revised manuscript (Materials and Methods, Subjects section). Regarding freezing during Extinction 1, as noted by the Reviewer, the difference, which was not statistically significant, was absent across trials during the subsequent forward fear conditioning stage. Likewise, the protocol difference observed during the initial forward fear conditioning was absent in subsequent stages. We are therefore confident that these initial differences (significant or not) did not impact the main findings at test. Importantly, these findings replicate previous work using identical protocols in which no differences were present during the training stages. These considerations have been addressed in the revised manuscript (see Results for Experiment 1).

      (2) Across all experiments (except for Experiment 1), the authors state that freezing during the initial conditioning increased across "days". The figures that correspond to this text, however, show that freezing changes across trials. In the methods, the authors report that backward conditioning occurred over 5 days. It would be helpful to understand how these data were analyzed and collated to create the final figures. Was the freezing averaged across the five days for each trial for analyses and figures?

      We apologize, as noted above, for having incorrectly labeled the X axis across the backward conditioning data sets in Figures 3B, 4B, 4D and 5B. It should have indicated “Days” instead of “Trials”. The data shown in these Figures use the average of all trials on a given day. This has been clarified in the methods section of the revised manuscript (Statistical Analyses section). The labeling errors on the Figures have been corrected.

      (3) In Experiment 3, the authors report a significant Protocol X Virus interaction. It would be useful if the authors could conduct post-hoc analyses to determine the source of this interaction. Inspection of Figure 4B suggests that freezing during the two different variants of backward conditioning differs between the virus groups. Did the authors expect to see a difference in backward conditioning depending on the stimulus used in the conditioning procedure (light vs. tone)? The authors don't really address this confounding interaction, but I do think a discussion is warranted.

      We agree with the Reviewer that further discussion of the Protocol x Virus interaction that emerged during the backward conditioning and forward conditioning stages of Experiment 3 is warranted. This discussion has been provided in the revised manuscript (see Results section). Briefly, during both stages, follow-up analyses did not reveal any differences (main effects or interactions) between the two groups trained with the light stimulus (Diff-EYFP and Diff-ChR2). By contrast, the ChR2 group trained with the tone (Back-ChR2) froze more overall than the EYFP group (Back-EYFP), but there were no other significant differences between the two groups. Based on these analyses, the Protocol x Virus interaction appears to be driven by greater freezing in the ChR2 group trained with the tone rather than a difference in the backward conditioning performance based on stimulus identity. Consistent with this, the statistical analyses did not reveal a main effect of Protocol during either the backward conditioning stage or the stimulus trials during the forward conditioning stage. Nevertheless, during this latter stage, a main effect of Protocol emerged during baseline performance, but once again, this seems to be driven by the Back-ChR2 group. Critically, it is unclear how greater stimulus freezing in the Back-ChR2 group during forward conditioning would lead to lower freezing during the post-extinction retrieval test.

      We note that an unexpected Protocol x Period interaction was found during appetitive backward conditioning in Experiment 5. For consistency, we conducted additional analyses to determine the source of this interaction (see Results section). As previously noted, performance during appetitive backward conditioning is noisy and cannot be taken as a failure to generate inhibitory learning. It is therefore unlikely that this interaction implied a difference in such learning.

      (4) In this same experiment, the authors state that freezing decreased during extinction; however, freezing in the Diff-EYFP group at the start of extinction (first bin of trials) doesn't look appreciably different than their freezing at the end of the session. Did this group actually extinguish their fear? Freezing on the tone test day also does not look too different from freezing during the last block of extinction trials.

      We confirm that overall, there was a significant decline in freezing across the extinction session shown in Figure 4B. The Reviewer is correct to point out that this decline was modest (if not negligible) in the Diff-EYFP group, which was receiving its first inhibitory training with the target tone stimulus. It is worth noting that across all experiments, most groups that did not receive infralimbic stimulation displayed a modest decline in freezing during the extinction session since it was relatively brief, involving only 6 or 8 tone alone presentations. This was intentional, as we aimed for the brief extinction session to generate minimal inhibitory learning and thereby to detect any facilitatory effect of infralimbic stimulation. This has been clarified and explained in the revised version of the manuscript (see Results section, description of Experiment 1).

      (5) The Discussion explored the outcomes of the experiments in detail, but it would be useful for the authors to discuss the implications of their findings for our understanding of circuits in which the IL is embedded that are involved in inhibitory learning and memory. It would also be useful for the authors to acknowledge in the Discussion that although they did not have the statistical power to detect sex differences, future work is needed to explore whether IL functions similarly in both sexes.

      In line with the Reviewer’s suggestion (see also Reviewer 3), the Discussion section has been substantially altered in the revised manuscript. Among other things, it does mention that future studies will need to examine the role of additional brain regions in the effects reported and it acknowledges the need to further explore sex differences and IL functions.

      Reviewer #3 (Public review):

      Summary:

      This is a really nice manuscript with different lines of evidence to show that the IL encodes inhibitory memories that can then be manipulated by optogenetic stimulation of these neurons during extinction. The behavioral designs are excellent, with converging evidence using extinction/re-extinction, backwards/forwards aversive conditioning, and backwards appetitive/forwards aversive conditioning. Additional factors, such as nonassociative effects of the CS or US, are also considered, and the authors evaluate the inhibitory properties of the CS with tests of conditioned inhibition.

      Strengths:

      The experimental designs are very rigorous with an unusual level of behavioral sophistication.

      We thank the Reviewer for their positive assessment.

      Weaknesses:

      (1) More justification for parametric choices (number of days of backwards vs forwards conditioning) could be provided.

      All experimental parameters were based on previously published experiments showing the capacity of the backward conditioning protocols to generate inhibitory learning and the forward conditioning protocols to produce excitatory learning. Although this was mentioned in the methods section, we acknowledge that further explanation was required to justify the need for multiple days of backward training. This has been provided in the revised manuscript (see Results section and description of the backward conditioning parameters).

      (2) The current discussion could be condensed and could focus on broader implications for the literature.

      The discussion has been substantially condensed, and broader implications have been discussed with respect to the existing literature on the neural circuitry underlying inhibitory learning.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) Re-analyze extinction retrieval, focusing only on the first 2-4 tones to capture extinction expression.

      This recommendation corresponds to the second public comment made by the Reviewer, and we have replied to this comment.

      (2) Directly test whether activation of IL during fear extinction is insufficient to facilitate extinction retrieval without prior extinction training.

      The manuscript provides five separate demonstrations that the optogenetic approach to stimulate IL activity did not facilitate the initial brief extinction session. This reproduces what had been found with indiscriminate pharmacological stimulation in our previous research (Lingawi et al., 2018; Lingawi et al., 2017). We appreciate that other work that stimulated specific IL neuronal populations has observed facilitation of extinction, but the present manuscript focuses on the role of all IL neuronal populations in encoding inhibitory memories. The Reviewer’s request would imply contrasting the roles of various neuronal populations, which is beyond the scope of this manuscript. Nevertheless, we have modified our discussion to indicate that future research should establish which IL neuronal population(s) contribute to the effects reported here.

      (3) Show the percentage of neurons that exhibit excitatory or inhibitory responses in IL after non-specific optogenetic activation to better understand how this manipulation is affecting IL circuitry.

      All electrophysiological recordings (n = 10 cells) are presented in Figure 1C. ChR2 excitation was substantial and overwhelming. Based on the physiological and morphological characteristics of the recorded cells, one was non-pyramidal and was excited by LED light delivery. The remaining 9 cells were pyramidal. One did not respond to LED delivery, but we cannot exclude the possibility that this was due to a lack of ChR2 expression in the somatic compartment. Another cell showed a mild reduction in activity following LED stimulation, while the remaining 7 cells displayed clear excitation upon LED stimulation. We have modified our manuscript to reflect these observations. We did not include percentages since only 10 recordings are shown.

      (4) Present data from all five conditioning sessions, not just one, to allow evaluation of learning history.

      This recommendation corresponds to the fourth public comment made by the Reviewer, and we have replied to this comment.

      (5) Address the issue of small and poorly matched groups, particularly in Figures 2b, 3b, 6b, and 6c.

      This recommendation corresponds to the third public comment made by the Reviewer, and we have replied to this comment.

      (6) Temper the conclusions to reflect the limitations of sampling, group matching, and the lack of specificity in the manipulation.

      We have modified our Discussion to address potential issues related to sampling and group matching. However, we are unsure how the lack of specificity of the IL stimulation has any impact on the interpretations made, since no statement is made about neuronal specificity. That said, as noted above, “we have modified our discussion to indicate that future research should establish which IL neuronal population(s) contribute to the effects reported here”.

      Reviewer #2 (Recommendations for the authors):

      Nothing additional to include beyond what is written for public view.

      Reviewer #3 (Recommendations for the authors):

      This is a really nice manuscript with different lines of evidence to show that the IL encodes inhibitory memories that can then be manipulated by optogenetic stimulation of these neurons during extinction. The behavioral designs are excellent, with converging evidence using extinction/re-extinction, backwards/forwards aversive conditioning, and backwards appetitive/forwards aversive conditioning. Additional factors, such as nonassociative effects of the CS or US, are also considered, and the authors evaluate the inhibitory properties of the CS with tests of conditioned inhibition. I only have a couple of comments that the authors may want to consider.

      We thank the Reviewer for their positive assessment.

      First, in Figure 2, it is unfortunate that there is a general effect of the LED assignment before the LED experience (p=.07 during that first extinction session). This is in the same direction as the difference during the test, so it is not clear that the test difference really reflects differences due to Extinction 2 treatment or to preexisting differences based on group assignments.

      The Reviewer’s comment is identical to the first public comment of Reviewer 2, which has been addressed.

      Second, it is notable that the backwards fear conditioning phase was conducted over 5 days, but the forward conditioning phase was conducted over one day. The rationale for these differences should be presented. There is an old idea going back to Konorski that backwards conditioning may lead to excitation initially, and it is only after more extensive trials that inhibitory conditioning occurs (a finding supported by Heth, 1976). Some discussion of the potential biphasic nature of backwards conditioning would be useful, especially for people who want to run this type of experiment but with only a single session of backwards conditioning.

      In line with the Reviewer’s suggestion, the revised manuscript (see Results section) provides an explanation for conducting backward conditioning across multiple days.

      Third, as written, each paragraph of the discussion is mostly a recapitulation of the findings from each experiment. This could be condensed significantly, and it would be nice to see more integration with the current literature and how these results challenge or suggest nuance in current thinking about IL function.

      We have significantly condensed the recapitulation of our findings in the Discussion of the revised manuscript. The Discussion now dedicates space to address comments from the other Reviewers and integrate the present findings with the current literature.

      References

      Chen, Y.-H., Wu, J.-L., Hu, N.-Y., Zhuang, J.-P., Li, W.-P., Zhang, S.-R., Li, X.-W., Yang, J.-M., & Gao, T.-M. (2021). Distinct projections from the infralimbic cortex exert opposing effects in modulating anxiety and fear. J Clin Invest, 131(14), e145692. https://doi.org/10.1172/JCI145692

      Do-Monte, F. H., Manzano-Nieves, G., Quiñones-Laracuente, K., Ramos-Medina, L., & Quirk, G. J. (2015). Revisiting the role of infralimbic cortex in fear extinction with optogenetics. J Neurosci, 35(8), 3607-3615. https://doi.org/10.1523/JNEUROSCI.3137-14.2015

      Estes, W. K. (1955). Statistical theory of spontaneous recovery and regression. Psychol Rev, 62(3), 145-154. https://doi.org/10.1037/h0048509

      Kim, H.-S., Cho, H.-Y., Augustine, G. J., & Han, J.-H. (2016). Selective Control of Fear Expression by Optogenetic Manipulation of Infralimbic Cortex after Extinction. Neuropsychopharmacology, 41(5), 1261-1273. https://doi.org/10.1038/npp.2015.276

      Lingawi, N. W., Holmes, N. M., Westbrook, R. F., & Laurent, V. (2018). The infralimbic cortex encodes inhibition irrespective of motivational significance. Neurobiol Learn Mem, 150, 64-74. https://doi.org/10.1016/j.nlm.2018.03.001

      Lingawi, N. W., Westbrook, R. F., & Laurent, V. (2017). Extinction and Latent Inhibition Involve a Similar Form of Inhibitory Learning that is Stored in and Retrieved from the Infralimbic Cortex. Cereb Cortex, 27(12), 5547-5556. https://doi.org/10.1093/cercor/bhw322

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Monziani and Ulitsky present a large and exhaustive study on the lncRNA EPB41L4A-AS1 using a variety of genomic methods. They uncover a rather complex picture of an RNA transcript that appears to act via diverse pathways to regulate the expression of large numbers of genes, including many snoRNAs. The activity of EPB41L4A-AS1 seems to be intimately linked with the protein SUB1, via both direct physical interactions and direct/indirect regulation of SUB1 mRNA expression.

      The study is characterised by thoughtful, innovative, integrative genomic analysis. It is shown that EPB41L4A-AS1 interacts with SUB1 protein and that this may lead to extensive changes in SUB1's other RNA partners. Disruption of EPB41L4A-AS1 leads to widespread changes in non-polyA RNA expression, as well as local cis changes. At the clinical level, it is possible that EPB41L4A-AS1 plays disease-relevant roles, although these seem to be somewhat contradictory with evidence supporting both oncogenic and tumour suppressive activities.

      A couple of issues could be better addressed here. Firstly, the copy number of EPB41L4A-AS1 is an important missing piece of the puzzle. It is apparently highly expressed in the FISH experiments. To get an understanding of how EPB41L4A-AS1 regulates SUB1, an abundant protein, we need to know the relative stoichiometry of these two factors. Secondly, while many of the experiments use two independent Gapmers for EPB41L4A-AS1 knockdown, the RNA-sequencing experiments apparently use just one, with one negative control (?). Evidence is emerging that Gapmers produce extensive off-target gene expression effects in cells, potentially exceeding the amount of on-target changes arising through the intended target gene. Therefore, it is important to estimate this through the use of multiple targeting and non-targeting ASOs, if one is to get a true picture of EPB41L4A-AS1 target genes. In this Reviewer's opinion, this casts some doubt over the interpretation of RNA-seq experiments until that work is done. Nonetheless, the Authors have designed thorough experiments, including overexpression rescue constructs, to quite confidently assess the role of EPB41L4A-AS1 in snoRNA expression.

      It is possible that EPB41L4A-AS1 plays roles in cancer, either as an oncogene or a tumour suppressor. However, it will in the future be important to extend these observations to a greater variety of cell contexts.

      This work is valuable in providing an extensive and thorough analysis of the global mechanisms of an important regulatory lncRNA and highlights the complexity of such mechanisms via cis and trans regulation and extensive protein interactions.

      Reviewer #2 (Public review):

      Summary:

      In this manuscript, Monziani et al. identified long noncoding RNAs (lncRNAs) that act in cis and are coregulated with their target genes located in close genomic proximity. The authors mined the GeneHancer database, and this analysis led to the identification of four lncRNA-target pairs. The authors decided to focus on lncRNA EPB41L4A-AS1.

      They thoroughly characterised this lncRNA, demonstrating that it is located in the cytoplasm and the nuclei, and that its expression is altered in response to different stimuli. Furthermore, the authors showed that EPB41L4A-AS1 regulates EPB41L4A transcription, leading to a mild reduction in EPB41L4A protein levels. This was not recapitulated with siRNA-mediated depletion of EPB41L4A-AS1. RNA-seq in EPB41L4A-AS1-depleted cells with a single LNA revealed 2364 DEGs linked to pathways including the cell cycle, cell adhesion, and inflammatory response. To understand the mechanism of action of EPB41L4A-AS1, the authors mined the ENCODE eCLIP data and identified SUB1 as an lncRNA interactor. The authors also found that the loss of EPB41L4A-AS1 and SUB1 leads to the accumulation of snoRNAs, and that SUB1 localisation changes upon the loss of EPB41L4A-AS1. Finally, the authors showed that EPB41L4A-AS1 deficiency did not change the steady-state levels of SNORA13 nor RNA modification driven by this RNA. The phenotype associated with the loss of EPB41L4A-AS1 is linked to increased invasion and an EMT gene signature.

Overall, this is an interesting and nicely done study on the versatile role of EPB41L4A-AS1 and the multifaceted interplay between SUB1 and this lncRNA, but some conclusions and claims need to be supported with additional experiments. My primary concerns are the use of a single LNA gapmer for critical experiments, the increased invasion, and the nucleolar distribution of SUB1 in EPB41L4A-AS1-depleted cells. These experiments need to be validated with orthogonal methods.

      Strengths:

      The authors used complementary tools to dissect the complex role of lncRNA EPB41L4A-AS1 in regulating EPB41L4A, which is highly commendable. There are few papers in the literature on lncRNAs at this standard. They employed LNA gapmers, siRNAs, CRISPRi/a, and exogenous overexpression of EPB41L4A-AS1 to demonstrate that the transcription of EPB41L4A-AS1 acts in cis to promote the expression of EPB41L4A by ensuring spatial proximity between the TAD boundary and the EPB41L4A promoter. At the same time, this lncRNA binds to SUB1 and regulates snoRNA expression and nucleolar biology. Overall, the manuscript is easy to read, and the figures are well presented. The methods are sound, and the expected standards are met.

      Weaknesses:

The authors should clarify how many lncRNA-target pairs were included in the initial computational screen for cis-acting lncRNAs and why MCF7 was chosen as the cell line of choice. Most of the data uses a single LNA gapmer targeting the EPB41L4A-AS1 lncRNA (e.g., Fig. 2c, 3B, and RNA-seq), and the critical experiments should use at least 2 LNA gapmers. The specificity of the SUB1 CUT&RUN is lacking, as is direct binding of SUB1 to lncRNA EPB41L4A-AS1, which should be confirmed by CLIP qPCR in MCF7 cells. Finally, the role of EPB41L4A-AS1 in SUB1 distribution (Figure 5) and cell invasion (Figure 8) needs to be complemented with additional experiments, which should finally demonstrate the role of this lncRNA in the nucleolus and cancer-associated pathways. The use of MCF7 as a single cancer cell line is not ideal.

      Reviewer #3 (Public review):

      Summary:

      In this paper, the authors made some interesting observations that EPB41L4A-AS1 lncRNA can regulate the transcription of both the nearby coding gene and genes on other chromosomes. They started by computationally examining lncRNA-gene pairs by analyzing co-expression, chromatin features of enhancers, TF binding, HiC connectome, and eQTLs. They then zoomed in on four pairs of lncRNA-gene pairs and used LNA antisense oligonucleotides to knock down these lncRNAs. This revealed EPB41L4A-AS1 as the only one that can regulate the expression of its cis-gene target EPB41L4A. By RNA-FISH, the authors found this lncRNA to be located in all three parts of a cell: chromatin, nucleoplasm, and cytoplasm. RNA-seq after LNA knockdown of EPB41L4A-AS1 showed that this increased >1100 genes and decreased >1250 genes, including both nearby genes and genes on other chromosomes. They later found that EPB41L4A-AS1 may interact with SUB1 protein (an RNA-binding protein) to impact the target genes of SUB1. EPB41L4A-AS1 knockdown reduced the mRNA level of SUB1 and altered the nuclear location of SUB1. Later, the authors observed that EPB41L4A-AS1 knockdown caused an increase of snRNAs and snoRNAs, likely via disrupted SUB1 function. In the last part of the paper, the authors conducted rescue experiments that suggested that the full-length, intron- and SNORA13-containing EPB41L4A-AS1 is required to partially rescue snoRNA expression. They also conducted SLAM-Seq and showed that the increased abundance of snoRNAs is primarily due to their hosts' increased transcription and stability. They end with data showing that EPB41L4A-AS1 knockdown reduced MCF7 cell proliferation but increased its migration, suggesting a link to breast cancer progression and/or metastasis.

      Strengths:

      Overall, the paper is well-written, and the results are presented with good technical rigor and appropriate interpretation. The observation that a complex lncRNA EPB41L4A-AS1 regulates both cis and trans target genes, if fully proven, is interesting and important.

      Weaknesses:

The paper is a bit disjointed, as it starts from cis and trans gene regulation but later switches to the partially related topic of snoRNA metabolism via SUB1. The paper did not follow up on the interesting observation that many potential trans target genes are affected by EPB41L4A-AS1 knockdown, and there was limited study of the mechanisms by which these trans genes (including the SUB1 or NPM1 genes themselves) are affected by EPB41L4A-AS1 knockdown. There are discrepancies in the results upon EPB41L4A-AS1 knockdown by LNA versus by CRISPR activation, or by plasmid overexpression of this lncRNA.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) Copy number:

      Perhaps I missed it, but it seems that no attempt is made to estimate the number of copies of EPB41L4A-AS1 transcripts per cell. This should be possible given RNAseq and FISH. At least an order of magnitude estimate. This is important for shedding light on the later observations that EPB41L4A-AS1 may interact with SUB1 protein and regulate the expression of thousands of mRNAs.

      We thank the reviewer for the insightful suggestion. We agree that an estimate of EPB41L4A-AS1 copy number might further strengthen the hypotheses presented in the manuscript. Therefore, we analyzed the smFISH images and calculated the copy number per cell of this lncRNA, as well as that of GAPDH as a comparison.

Because segmenting MCF-7 cells proved to be difficult due to the extent of the cell-cell contacts they establish, we imaged multiple (n = 14) fields of view, extracted the number of EPB41L4A-AS1/GAPDH molecules in each field and divided them by the number of cells (as assessed by DAPI staining, 589 cells in total). We detected an average of 33.37 ± 3.95 EPB41L4A-AS1 molecules per cell, in contrast to 418.27 ± 61.79 GAPDH molecules. As a comparison, within the same qPCR experiment the averages of the Ct values of these two RNAs are about 22.3 and 17.5, the FPKMs in the polyA+ RNA-seq are ~35.6 and 2479.4, and the FPKMs in the rRNA-depleted RNA-seq are ~19.3 and 3549.9, respectively. Thus, our estimate of the EPB41L4A-AS1 copy number in MCF-7 cells fits well with these observations.
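
The per-cell estimate described above (spot counts per field divided by DAPI-counted nuclei, then averaged across fields) amounts to simple arithmetic; a minimal sketch with purely illustrative counts (not the actual image data) could look like:

```python
import numpy as np

# Illustrative per-field counts (NOT the real data): smFISH spot counts from
# ImageJ "Find Maxima" and nuclei counted from DAPI staining, one entry per field.
spots  = np.array([1120, 1480, 990, 1333, 1250, 1600, 1045,
                   1390, 1180, 1520, 1010, 1295, 1440, 1100])
nuclei = np.array([  34,   45,  30,   41,   38,   48,   31,
                     42,   36,   46,   30,   39,   44,   33])

per_cell = spots / nuclei                 # molecules per cell, field by field
mean = per_cell.mean()
sem  = per_cell.std(ddof=1) / np.sqrt(per_cell.size)
print(f"{mean:.2f} ± {sem:.2f} molecules per cell")
```

Averaging per-field ratios (rather than pooling all spots over all nuclei) is what yields a field-to-field error estimate of the kind reported above.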

The question of whether an average of ~35 molecules per cell is sufficient to affect the expression of thousands of genes is somewhat more difficult to ascertain. As discussed below, it is unlikely that all the genes dysregulated following the KD of EPB41L4A-AS1 are direct targets of this lncRNA, and indeed SUB1 depletion affects an order of magnitude fewer genes. It has been shown that lncRNAs can affect the behavior of interacting RNAs and proteins in a substoichiometric fashion (Unfried & Ulitsky, 2022), but whether this applies to EPB41L4A-AS1 remains to be addressed in future studies. Nonetheless, this copy number appears to be sufficient for a trans-acting function of this lncRNA, on top of its cis-regulatory role at EPB41L4A. We added this information in the text as follows:

“Using single-molecule fluorescence in-situ hybridization (smFISH) and subcellular fractionation we found that EPB41L4A-AS1 is expressed at an average of 33.37 ± 3.95 molecules per cell, and displays both nuclear and cytoplasmic localization in MCF-7 cells (Fig. 1D), with a minor fraction associated with chromatin as well (Fig. 1E).”

      We have updated the methods section as well:

      “To visualize the subcellular localization of EPB41L4A-AS1 in vivo, we performed single-molecule fluorescence in situ hybridization (smFISH) using HCR™ amplifiers. Probe sets (n = 30 unique probes) targeting EPB41L4A-AS1 and GAPDH (positive control) were designed and ordered from Molecular Instruments. We followed the Multiplexed HCR v3.0 protocol with minor modifications. MCF-7 cells were plated in 8-well chambers (Ibidi) and cultured O/N as described above. The next day, cells were fixed with cold 4% PFA in 1X PBS for 10 minutes at RT and then permeabilized O/N in 70% ethanol at -20°C. Following permeabilization, cells were washed twice with 2X SSC buffer and incubated at 37°C for 30 minutes in hybridization buffer (HB). The HB was then replaced with a probe solution containing 1.2 pmol of EPB41L4A-AS1 probes and 0.6 pmol of GAPDH probes in HB. The slides were incubated O/N at 37°C. To remove excess probes, the slides were washed four times with probe wash buffer at 37°C for 5 minutes each, followed by two washes with 5X SSCT at RT for 5 minutes. The samples were then pre-amplified in amplification buffer for 30 minutes at RT and subsequently incubated O/N in the dark at RT in amplification buffer supplemented with 18 pmol of the appropriate hairpins. Finally, excess hairpins were removed by washing the slides five times in 5X SSCT at RT. The slides were mounted with ProLong™ Glass Antifade Mountant (Invitrogen), cured O/N in the dark at RT, and imaged using a Nikon CSU-W1 spinning disk confocal microscope. In order to estimate the RNA copy number, we imaged multiple distinct fields, extracted the number of EPB41L4A-AS1/GAPDH molecules in each field using the “Find Maxima” tool in ImageJ/Fiji, and divided them by the number of cells (as assessed by DAPI staining).”

      (2) Gapmer results:

      Again, it is quite unclear how many and which Gapmer is used in the genomics experiments, particularly the RNA-seq. In our recent experiments, we find very extensive off-target mRNA changes arising from Gapmer treatment. For this reason, it is advisable to use both multiple control and multiple targeting Gapmers, so as to identify truly target-dependent expression changes. While I acknowledge and commend the latter rescue experiments, and experiments using multiple Gapmers, I'd like to get clarification about how many and which Gapmers were used for RNAseq, and the authors' opinion on the need for additional work here.

      We agree with the Reviewer that GapmeRs are prone to off-target and unwanted effects (Lai et al., 2020; Lee & Mendell, 2020; Maranon & Wilusz, 2020). Early in our experiments, we found out that LNA1 triggers a non-specific CDKN1A/p21 activation (Fig. S5A-C), and thus, we have initially performed some experiments such as RNA-seq with only LNA2.

      Nonetheless, other experiments were performed using both GapmeRs, such as multiple RT-qPCRs, UMI-4C, SUB1 and NPM1 imaging, and the in vitro assays, among others, and consistent results were obtained with both LNAs.

To accommodate the request by this and the other reviewers, we have now performed another round of polyA+ RNA-seq following EPB41L4A-AS1 knockdown using LNA1 or LNA2, as well as the previously used and an additional control GapmeR. The FPKMs of the control samples are highly correlated both within replicates and between GapmeRs (Fig. S6A). More importantly, the fold-changes relative to control are highly correlated between the two on-target GapmeRs LNA1 and LNA2, regardless of the GapmeR used for normalization (Fig. S6B), thus showing that the bulk of the response is shared and likely the direct result of the reduction in the levels of EPB41L4A-AS1. Notably, the key targets NPM1 and MTREX (see discussion, Fig. S12A-C and comments to Reviewer 3) were found to be downregulated by both LNAs (Fig. S6C).

      However, we acknowledge that some of the dysregulated genes are observed only when using one GapmeR and not the other, likely due to a combination of indirect, secondary and non-specific effects, and as such it is difficult to infer the direct response. Supporting this, LNA2 yielded a total of 1,069 DEGs (617 up and 452 down) and LNA1 2,493 DEGs (1,328 up and 1,287 down), with the latter triggering a stronger response most likely as a result of the previously mentioned CDKN1A/p21 induction. Overall, 45.1% of the upregulated genes following LNA2 transfection were shared with LNA1, in contrast to only the 24.3% of the downregulated ones.
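
The concordance check described above (correlating per-gene fold-changes between the two targeting GapmeRs and computing DEG overlap) can be sketched generically; the simulated fold-change vectors below merely stand in for real DESeq2 output, and the cutoff is a placeholder for a proper padj filter:

```python
import numpy as np

# Simulated per-gene log2 fold-changes for two targeting GapmeRs
# (illustrative numbers only): a shared on-target response plus
# GapmeR-specific noise representing off-target/secondary effects.
rng = np.random.default_rng(1)
n_genes = 2000
shared = rng.normal(0.0, 1.2, n_genes)            # shared, on-target response
lna1 = shared + rng.normal(0.0, 0.4, n_genes)
lna2 = shared + rng.normal(0.0, 0.4, n_genes)

# Global concordance of fold-changes between the two LNAs
r = np.corrcoef(lna1, lna2)[0, 1]

# Overlap of upregulated DEGs called with a simple |log2FC| cutoff
up1, up2 = lna1 > 1.0, lna2 > 1.0
shared_up = (up1 & up2).sum() / up2.sum()         # fraction of LNA2 up-DEGs also up with LNA1
print(f"r = {r:.2f}, shared up-DEGs = {shared_up:.1%}")
```

A high fold-change correlation alongside a partial DEG overlap is exactly the pattern reported here: the bulk response is shared, while binary DEG calls diverge near the significance threshold.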

We have now included these results in the Results section (see below) and in a Supplementary Figure (Fig. S6).

      “Most of the consequences of the depletion of EPB41L4A-AS1 are thus not directly explained by changes in EPB41L4A levels. An additional trans-acting function for EPB41L4A-AS1 would therefore be consistent with its high expression levels compared to most lncRNAs detected in MCF-7 (Fig. S5G). To strengthen these findings, we have transfected MCF-7 cells with LNA1 and a second control GapmeR (NT2), as well as the previous one (NT1) and LNA2, and sequenced the polyadenylated RNA fraction as before. Notably, the expression levels (in FPKMs) of the replicates of both control samples are highly correlated with each other (Fig. S6A), and the global transcriptomic changes triggered by the two EPB41L4A-AS1-targeting LNAs are largely concordant (Fig. S6B and S6C). Because of this concordance and the cleaner (i.e., no CDKN1A upregulation) readout in LNA2-transfected cells, we focused mainly on these cells for subsequent analyses.”

      (3) Figure 1E:

      Can the authors comment on the unusual (for a protein-coding mRNA) localisation of EPB41L4A, with a high degree of chromatin enrichment?

We acknowledge that mRNAs from protein-coding genes displaying nuclear and chromatin localization are quite unusual. The nuclear and chromatin localization of some mRNAs is often due to their low expression, length, the time required for their transcription, repetitive elements and strong secondary structures (Bahar Halpern et al., 2015; Didiot et al., 2018; Lubelsky & Ulitsky, 2018; Ly et al., 2022).

      We now briefly mention this in the text:

      “In contrast, both EPB41L4A and SNORA13 were mostly found in the chromatin fraction (Fig. 1E), the former possibly due to the length of its pre-mRNA (>250 kb), which would require substantial time to transcribe (Bahar Halpern et al., 2015; Didiot et al., 2018; Lubelsky & Ulitsky, 2018; Ly et al., 2022).”

      Supporting our results, analysis of the ENCODE MCF-7 RNA-seq data of the cytoplasmic, nuclear and total cell fractions indeed shows a nuclear enrichment of the EPB41L4A mRNA (Author response image 1), in line with what we observed in Fig. 1E by RT-qPCR. 

      Author response image 1.

The EPB41L4A transcript is nuclear-enriched in the MCF-7 ENCODE subcellular RNA-seq dataset. Scatterplot of gene length versus cytoplasm/nucleus ratio (as computed by DESeq2) in MCF-7 cells. Each dot represents a unique gene, color-coded according to whether its DESeq2 adjusted p-value < 0.05 and absolute log<sub>2</sub>FC > 0.41 (33% enrichment or depletion). GAPDH and MALAT1 are shown as representative cytoplasmic and nuclear transcripts, respectively. Data from ENCODE.

      (4) Annotation and termini of EPB41L4A-AS1:

The latest Gencode v47 annotations imply an overlap of the sense and antisense, different from that shown in Figure 1C. The 3' UTR of EPB41L4A is shown to extensively overlap EPB41L4A-AS1. This could shed light on the apparent regulation of the former by the latter that is relevant for this paper. I'd suggest that the authors update their figure of the EPB41L4A-AS1 locus organisation with much more detail, particularly evidence for the true polyA site of both genes. What is more, the authors might consider performing RACE experiments for both RNAs in their cells to definitively establish whether these transcripts contain complementary sequence that could cause their Watson-Crick hybridisation, or whether the two genes might interfere with each other via some kind of polymerase collision.

We thank the reviewer for pointing this out. Also in previous GENCODE annotations, multiple isoforms were reported, with some overlapping the 3’ UTR of EPB41L4A. In the EPB41L4A-AS1 locus image (Fig. 1C), we report at the bottom the different transcript isoforms currently annotated, and a schematic of the one that is clearly the most abundant in MCF-7 cells based on RNA-seq read coverage. This is supported by both the polyA(+) and ribo(-) RNA-seq data, which are strand-specific, as shown in the figure.

      We now also examined the ENCODE/CSHL MCF-7 RNA-seq data from whole cell, cytoplasm and nucleus fractions, as well as 3P-seq data (Jan et al., 2011) (unpublished data from human cell lines), reported in Author response image 2. All these data support the predominant use of the proximal polyA site in human cell lines. This shorter isoform does not overlap EPB41L4A.

      Author response image 2.

      Most EPB41L4A-AS1 transcripts end before the 3’ end of EPB41L4A. UCSC genome browser view showing tracks from 3P-seq data in different cell lines and neural crest (top, with numbers representing the read counts, i.e. how many times that 3’ end has been detected), and stranded ENCODE subcellular RNA-seq (bottom).

Based on these data, the large majority of cellular transcripts of EPB41L4A-AS1 terminate at the earlier polyA site and do not overlap with EPB41L4A. A small fraction, which appears to be restricted to the nucleus, terminates later at the annotated isoform. 3' RACE experiments are not expected to provide substantially different information beyond what is already available.

      (5) Figure 3C:

There is an apparent correlation between the log2FC upon EPB41L4A-AS1 knockdown and the number of CLIP sites for SUB1. However, I expect that the CLIP signal correlates strongly with the mRNA expression level, and that the log2FC may also correlate with the same. Therefore, the authors would be advised to more exhaustively check that there really is a genuine relationship between log2FC and CLIP sites, after removing any possible confounders of overall expression level.

As the reviewer suggested, there is a correlation between the baseline expression level and the strength of SUB1 binding in the eCLIP data. To address this issue, we built expression-matched controls for each group of SUB1 interactors and checked the fold-changes following EPB41L4A-AS1 KD, similarly to what we have done in Fig. 3C. The results are now part of Supplementary Figure 7 (Fig. S7C).

Based on this analysis, while there is a tendency toward increased expression with increased SUB1 binding, when controlling for expression levels the downregulation of SUB1-bound RNAs upon lncRNA knockdown remains, suggesting that it is not merely a confounding effect. We have updated the text as follows:

      “We hypothesized that loss of EPB41L4A-AS1 might affect SUB1, either via the reduction in its expression or by affecting its functions. We stratified SUB1 eCLIP targets into confidence intervals, based on the number, strength and confidence of the reported binding sites. Indeed, eCLIP targets of SUB1 (from HepG2 cells profiled by ENCODE) were significantly downregulated following EPB41L4A-AS1 KD in MCF-7, with more confident targets experiencing stronger downregulation (Fig. 3C). Importantly, this still holds true when controlling for gene expression levels (Fig. S7C), suggesting that this negative trend is not due to differences in their baseline expression.”
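
The expression-matched control strategy used above can be sketched generically: for each bound gene, pick an unbound gene with the (approximately) closest baseline expression, then compare fold-change distributions. All numbers below are simulated placeholders, not the authors' actual pipeline:

```python
import numpy as np

# Simulated genes: FPKM-like baseline expression and log2FC after lncRNA KD.
# Illustrative only; "targets" stand in for SUB1 eCLIP-bound genes.
rng = np.random.default_rng(2)
n = 5000
baseline = rng.lognormal(3.0, 1.5, n)
log2fc = rng.normal(0.0, 0.5, n)
is_target = rng.random(n) < 0.1                   # ~10% flagged as bound
log2fc[is_target] -= 0.3                          # simulate target downregulation

targets_expr, pool_expr = baseline[is_target], baseline[~is_target]
targets_fc, pool_fc = log2fc[is_target], log2fc[~is_target]

# For each target, find the unbound gene with the approximately closest
# baseline expression via a sorted-array lookup.
order = np.argsort(pool_expr)
idx = np.searchsorted(pool_expr[order], targets_expr)
idx = np.clip(idx, 0, len(order) - 1)
matched_fc = pool_fc[order][idx]

# Effect size that survives after controlling for expression level
delta = np.median(targets_fc) - np.median(matched_fc)
print(f"median log2FC difference (targets - matched controls): {delta:.2f}")
```

If the downregulation of bound genes were purely an expression-level artifact, the matched controls would shift by the same amount and the difference would vanish; a residual negative delta is what supports a genuine association.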

      (6) The relation to cancer seems somewhat contradictory, maybe I'm missing something. Could the authors more clearly state which evidence is consistent with either an Oncogene or a Tumour Suppressive function, and discuss this briefly in the Discussion? It is not a problem if the data are contradictory, however, it should be discussed more clearly.

We acknowledge this apparent contradiction. Cancer cells are characterized by a multitude of hallmarks depending on the cancer type and stage, including high proliferation rates and enhanced invasive capabilities. The notion that cells with reduced EPB41L4A-AS1 levels exhibit lower proliferation, yet increased invasion, is compatible with a function as an oncogene. Cells undergoing EMT may reduce or even completely halt proliferation/cell division, until they revert back to an epithelial state (Brabletz et al., 2018; Dongre & Weinberg, 2019). Notably, downregulated genes following EPB41L4A-AS1 KD are enriched in GO terms related to cell proliferation and cell cycle progression (Fig. 2I), whereas those upregulated are enriched for terms linked to EMT processes. Thus, while we cannot rule out a potential function as a tumor suppressor gene, our data better fit the notion that EPB41L4A-AS1 promotes invasion, and thus primarily functions as an oncogene. We now address this point in the discussion:

“The notion that cells with reduced EPB41L4A-AS1 levels exhibit lower proliferation (Fig. 8C), yet increased invasion (Fig. 8A and 8B), is compatible with a function as an oncogene by promoting EMT (Fig. 8D and 8E). Cells undergoing this process may reduce or even completely halt proliferation/cell division, until they revert back to an epithelial state (Brabletz et al., 2018; Dongre & Weinberg, 2019). Notably, downregulated genes following EPB41L4A-AS1 KD are enriched in GO terms related to cell proliferation and cell cycle progression (Fig. 2I), whereas those upregulated are enriched for terms linked to EMT processes. Thus, while we cannot rule out a potential function as a tumor suppressor gene, our data better fit the idea that this lncRNA promotes invasion, and thus, primarily functions as an oncogene.”

      Reviewer #2 (Recommendations for the authors):

      Below are major and minor points to be addressed. We hope the authors find them useful.

      (1) Figure 1:

      Where are LNA gapmers located within the EPB41L4A-AS1 gene? Are they targeting exons or introns of the EPB41L4A-AS1? Please clarify or include in the figure.

      We now report the location of the two GapmeRs in Fig. 1C. LNA1 targets the intronic region between SNORA13 and exon 2, and LNA2 the terminal part of exon 1.

      (2) Figure 2B:

      Why is a single LNA gapmer used for EPB41L4A Western? In addition, are the qPCR data in Figure 2B the same as in Figure 1B? Please clarify.

The Western Blot was performed after transfecting the cells with either LNA1 or LNA2. We have now replaced Fig. 2C with the full Western Blot image, in order to show both LNAs. With respect to the qPCRs in Fig. 1B and 2B, they represent the results from two independent experiments.

      (3) Figure 2F:

2364 DEGs for a single LNA is a lot of deregulated genes in RNA-seq data. How do the authors explain such a big number of DEGs? Is that because this LNA was intronic? An additional LNA gapmer would help define the "real" lncRNA targets and minimise any potential off-target effects.

      We agree with the Reviewer that GapmeRs are prone to off-target and unwanted effects (Lai et al.,2020; Lee & Mendell, 2020; Maranon & Wilusz, 2020). Early in our experiments, we found out that LNA1 triggers a non-specific CDKN1A/p21 activation (Fig. S5A-C), and thus, we have initially performed some experiments such as RNA-seq with only LNA2.

      Nonetheless, other experiments were performed using both GapmeRs, such as multiple RT-qPCRs, UMI-4C, SUB1 and NPM1 imaging, and the in vitro assays, among others, and consistent results were obtained with both LNAs.

To accommodate the request by this and the other reviewers, we have now performed another round of polyA+ RNA-seq following EPB41L4A-AS1 knockdown using LNA1 or LNA2, as well as the previously used and an additional control GapmeR. The FPKMs of the control samples are highly correlated both within replicates and between GapmeRs (Fig. S6A). More importantly, the fold-changes relative to control are highly correlated between the two on-target GapmeRs LNA1 and LNA2, regardless of the GapmeR used for normalization (Fig. S6B), thus showing that despite significant GapmeR-specific effects, the bulk of the response is shared and likely the direct result of the reduction in the levels of EPB41L4A-AS1. Notably, the key targets NPM1 and MTREX (see discussion, Fig. S12A-C and comments to Reviewer 3) were found to be downregulated by both LNAs (Fig. S6C).

      However, we acknowledge that some of the dysregulated genes are observed only when using one GapmeR and not the other, likely due to a combination of indirect, secondary and non-specific effects, and as such it is difficult to infer the direct response. Supporting this, LNA2 yielded a total of 1,069 DEGs (617 up and 452 down) and LNA1 2,493 DEGs (1,328 up and 1,287 down), with the latter triggering a stronger response most likely as a result of the previously mentioned CDKN1A/p21 induction. Overall, 45.1% of the upregulated genes following LNA2 transfection were shared with LNA1, in contrast to only the 24.3% of the downregulated ones.

We have now included these results in the Results section (see below) and in a Supplementary Figure (Fig. S6).

      “Most of the consequences of the depletion of EPB41L4A-AS1 are thus not directly explained by changes in EPB41L4A levels. An additional trans-acting function for EPB41L4A-AS1 would therefore be consistent with its high expression levels compared to most lncRNAs detected in MCF-7 (Fig. S5G). To strengthen these findings, we have transfected MCF-7 cells with LNA1 and a second control GapmeR (NT2), as well as the previous one (NT1) and LNA2, and sequenced the polyadenylated RNA fraction as before. Notably, the expression levels (in FPKMs) of the replicates of both control samples are highly correlated with each other (Fig. S6A), and the global transcriptomic changes triggered by the two EPB41L4A-AS1-targeting LNAs are largely concordant (Fig. S6B and S6C). Because of this concordance and the cleaner (i.e., no CDKN1A upregulation) readout in LNA2-transfected cells, we focused mainly on these cells for subsequent analyses.”

(4) Figure 3B: Is the downregulation of SUB1 and NPM1 reflected at the protein level with both LNA gapmers? The authors should show a heatmap and metagene profile for the SUB1 CUT&RUN. How did the authors know that SUB1 binding is specific, since CUT&RUN was not performed in SUB1-depleted cells?

As requested by both Reviewer #2 and #3, we have performed WB for SUB1, NPM1 and FBL following EPB41L4A-AS1 KD with the two targeting (LNA1 and LNA2) and the previous control GapmeRs. Interestingly, we did not detect any significant downregulation of either protein (Author response image 3), although this might be the result of the high variability observed in the control samples. Moreover, the short timeframe of these experiments (transient transfections for 3 days) might not allow sufficient time for the existing proteins to be degraded, and thus the downregulation is more evident at the RNA (Fig. 3B and Supplementary Figure 6C) rather than the protein level.

      Author response image 3.

EPB41L4A-AS1 KD has only marginal effects on the levels of nucleolar proteins. (A) Western Blots for the indicated proteins after the transfection for 3 days of the control and targeting GapmeRs. (B) Quantification of the protein levels from (A). All experiments were performed in n=3 biological replicates, with the error bars in the barplots representing the standard deviation. ns - P>0.05; * - P<0.05; ** - P<0.01; *** - P<0.001 (two-sided Student’s t-test).

Following the suggestion by the Reviewer, we now show both the SUB1 CUT&RUN metagene profile (previously available as Fig. 3F) and the heatmap (now Fig. 3G) around the TSS of all genes, stratified by their expression level.

We show that the antibody signal is responsive to SUB1 depletion via siRNAs in both WB (Fig. S8F) and IF (Fig. 5E) experiments. As mentioned below, this and the absence of non-specific signals make us confident in the CUT&RUN data. Performing CUT&RUN in SUB1-depleted cells would be difficult to interpret, as perturbations are typically not complete, and the remaining protein can still bind the same regions. Since there is no clear way to add spike-ins to CUT&RUN experiments, it is very difficult to show specificity of binding by CUT&RUN in siRNA-knockdown cells.

      (5) Figure 3D: The MW for the depicted proteins are lacking. Why is there no SUB1 protein in the input? Please clarify. Since the authors used siRNA to deplete SUB1, it would be good to know if the antibody is specific in their CUT & RUN (see above)

We apologize for the lack of the MW in Fig. 3D. As shown in Fig. S8F, SUB1 is ~18 kDa and the antibody signal is responsive to SUB1 depletion via siRNAs in both WB (Fig. S8F) and IF (Fig. 5E) experiments. Thus, given (1) its established specificity in those two settings and (2) the absence of the generalized signal at most open chromatin regions that is typical of nonspecific CUT&RUN experiments, we are confident in the specificity of the CUT&RUN results.

      We now mention the MW of SUB1 in Fig. 3D as well and we provide in Author response image 4 the full SUB1 WB picture, enhancing the contrast to highlight the bands. We agree that the SUB1 band in the input is weak, likely reflecting the low abundance in that fraction and the detection difficulty due to its low MW (see Fig. S8F).

      Author response image 4.

      Western blot for SUB1 following RIP using either a SUB1 or IgG antibody. IN - input, SN - supernatant/unbound, B - bound.

      (6) Supplementary Figure 6C:

The validation of lncRNA EPB41L4A-AS1 binding to SUB1 should be confirmed by CLIP qPCR, since native RIP can lead to reassociation of RNA-protein interactions (PMID: 15388877). Additionally, the eCLIP data presented in Figure 3A were from a different cell line and not MCF7.

      We acknowledge that the SUB1 eCLIP data was generated in a different cell line, as we mentioned in the text:

“Indeed, eCLIP targets of SUB1 (from HepG2 cells profiled by ENCODE) were significantly downregulated following EPB41L4A-AS1 KD in MCF-7, with more confident targets experiencing stronger downregulation (Fig. 3C). Importantly, this still holds true when controlling for gene expression levels (Fig. S7C), suggesting that this negative trend is not due to differences in their baseline expression. To obtain SUB1-associated transcripts in MCF-7 cells, we performed a native RNA immunoprecipitation followed by sequencing of polyA+ RNAs (RIP-seq) (Fig. 3D, S7D and S7E).”

Because of this, we resorted to native RIP in order to obtain binding information in our experimental system. Given the independent evidence for binding from both eCLIP and RIP, and the substantial challenge of establishing the CLIP method, which has not been successfully used in our group, we respectfully argue that further validations are beyond the scope of this study. We nonetheless agree that several genes that are nominally significantly enriched in our RIP data are likely not direct targets of SUB1, especially given that it is difficult to assign the perfect threshold that discriminates between bound and unbound RNAs.

      We now additionally mention this at the beginning of the paragraph as well:

“In order to identify potential factors that might be associated with EPB41L4A-AS1, we inspected protein-RNA binding data from the ENCODE eCLIP dataset (Van Nostrand et al., 2020). The exons of the EPB41L4A-AS1 lncRNA were densely and strongly bound by SUB1 (also known as PC4) in both HepG2 and K562 cells (Fig. 3A).”

      (7) Figure 3G:

Can the authors distinguish whether loss of EPB41L4A-AS1 affects SUB1 chromatin binding or its activity as an RBP? Please discuss.

      Distinguishing between altered SUB1 chromatin and RNA binding is challenging, as this protein likely does not interact directly with chromatin and exhibits rather promiscuous RNA binding properties (Ray et al., 2023). In particular, SUB1 (also known as PC4) interacts with and regulates the activity of all three RNA polymerases, and was reported to be involved in transcription initiation and elongation, response to DNA damage, chromatin condensation (Conesa & Acker, 2010; Das et al., 2006; Garavís & Calvo, 2017; Hou et al., 2022) and telomere maintenance (Dubois et al., 2025; Salgado et al., 2024).

      Based on our data, genes whose promoters are occupied by SUB1 display marginal, yet highly significant, changes in their steady-state expression levels upon lncRNA perturbations. We also show that upon EPB41L4A-AS1 KD, SUB1 acquires a stronger nucleolar localization (Fig. 5A), which likely affects its RNA interactome as well. However, further elucidating these activities would require performing RIP-seq and CUT&RUN in lncRNA-depleted cells, which we argue is out of the scope of the current study. We note that KD of SUB1 with siRNAs has milder effects than that of EPB41L4A-AS1 (Fig. S8G), suggesting that additional players and effects shape the observed changes. Therefore, it is highly likely that the loss of this lncRNA affects both the SUB1 chromatin binding profile and its RNA binding activity, with the latter likely resulting in the increased snoRNA abundance.

      (8) Figure 4: Can the authors show that a specific class of snoRNA is affected upon depletion of SUB1 and EPB41L4A-AS1? Can they further classify the effect of their depletion on H/ACA box snoRNAs, C/D box snoRNAs, and scaRNAs?

      Such a potential distinct effect on the different classes of snoRNAs was considered, and the results are available in Fig. S8B and S8H (boxplots, after EPB41L4A-AS1 and SUB1 depletion), as well as Fig. 4F and S9F (scatterplots between EPB41L4A-AS1 and SUB1 depletion, and EPB41L4A-AS1 and GAS5 depletion, respectively). We observe no preferential effect on any one group of snoRNAs over the others.
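      The per-class comparison described above amounts to grouping the knockdown fold-changes by snoRNA class and comparing their summary statistics. A minimal sketch of that grouping, with purely illustrative made-up values (class labels follow the reviewer's question; the numbers are hypothetical, not our measurements):

```python
from statistics import median

# Hypothetical log2 fold-changes upon knockdown, keyed by snoRNA class.
# The values below are illustrative only, not actual measurements.
fold_changes = {
    "C/D box":   [0.6, 0.8, 0.5, 0.9, 0.7],
    "H/ACA box": [0.7, 0.5, 0.9, 0.6, 0.8],
    "scaRNA":    [0.6, 0.9, 0.5, 0.8, 0.7],
}

# Summarize each class by its median response; comparable medians across
# classes would indicate no preferential effect on any one snoRNA group.
medians = {cls: median(vals) for cls, vals in fold_changes.items()}
```

In this toy example all three classes share the same median response, the qualitative pattern reported in Fig. S8B, S8H, 4F and S9F.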

      (9) Figure 5: From the representative images, it looks to me that LNA 2 targeting EPB41L4A-AS1 has a bigger effect on nucleolar staining of SUB1. To claim that EPB41L4A-AS1 depletion "shifts SUB1 to a stronger nucleolar distribution", the authors need to perform IF staining for SUB1 and Fibrillarin, a known nucleolar marker. Also, how does this data fit with their qPCR data shown in Figure 3B? It is instrumental for the authors to demonstrate by IF or Western blotting that SUB1 levels decrease in one fraction and increase specifically in the nucleolus. They could perform Western blot for SUB1 and Fibrillarin in EPB41L4A-AS1-depleted cells and isolate cytoplasmic, nuclear, and nucleolar fractions. This experiment will strengthen their finding. The scale bar is missing for all the images in Figure 5. The authors should also show magnified images of a single representative cell at 100x.

      We apologize for the confusion regarding the scale bars. As mentioned here and elsewhere, the scale bars are present only in the top-left image of each panel, in order to avoid overcrowding the panel. All the images are already at 100X, with the exception of Fig. 5E (IF for SUB1 upon siSUB1 transfection), which is at 60X in order to better show the lack of signal. We acknowledge, however, that the images are sometimes confusing, due to artifacts of the PNG export once imported into the document. In any case, with the submission we have also provided the original images in high-quality PDF and .ai formats. The suggested experiment would require establishing a nucleolar fractionation protocol, which is not currently available in our laboratory, and we argue that it is out of the scope of the current study.

      (10) Additionally, is rRNA synthesis affected in SUB1- and EPB41L4A-AS1-depleted cells? The authors could quantify newly synthesised rRNA levels in the nucleoli, which would also strengthen their findings about the role of this lncRNA in nucleolar biology.

      We acknowledge that many aspects of the role of EPB41L4A-AS1 in nucleolar biology, and of nucleolar biology itself, remain to be explored, but given the extensive experimental data we already provide on this and other subjects, we respectfully suggest that this experiment is out of the scope of the current work. We note that a recent study has shown that SUB1 is required for Pol I-mediated rDNA transcription in the nucleolus (Kaypee et al., 2025). In the presence of nucleolar SUB1, rDNA transcription proceeds as expected, but when SUB1 is depleted or its nucleolar localization is affected—by either sodium butyrate treatment or inhibition of KAT5-mediated phosphorylation at its lysine 35 (K35)—the levels of the 47S pre-rRNA are significantly reduced. In our settings, SUB1 is enriched in the nucleolus following EPB41L4A-AS1 KD; thus, we might expect to see slightly increased rDNA transcription or no effect at all, given that SUB1 localizes in the nucleolus under baseline conditions as well. We now mention this novel role of SUB1 in both the results and discussion.

      “SUB1 interacts with all three RNA polymerases and was reported to be involved in transcription initiation and elongation, response to DNA damage, chromatin condensation(Conesa & Acker, 2010; Das et al., 2006; Garavís & Calvo, 2017; Hou et al., 2022), telomere maintenance(Dubois et al., 2025; Salgado et al., 2024) and rDNA transcription(Kaypee et al., 2025). SUB1 normally localizes throughout the nucleus in various cell lines, yet staining experiments show a moderate enrichment for the nucleolus (source: Human Protein Atlas; https://www.proteinatlas.org/ENSG00000113387-SUB1/subcellular)(Kaypee et al., 2025).”

      “Several features of the response to EPB41L4A-AS1 depletion resemble nucleolar stress, including altered distribution of NPM1(Potapova et al., 2023; Yang et al., 2016). SUB1 was shown to be involved in many nuclear processes, including transcription(Conesa & Acker, 2010), DNA damage response(Mortusewicz et al., 2008; Yu et al., 2016), telomere maintenance(Dubois et al., 2025), and nucleolar processes including rRNA biogenesis(Kaypee et al., 2025; Tafforeau et al., 2013). Our results suggest a complex and multi-faceted relationship between EPB41L4A-AS1 and SUB1, as SUB1 mRNA levels are reduced by the transient (72 hours) KD of the lncRNA (Fig. 3B), the distribution of the protein in the nucleus is altered (Fig. 5A and 5C), while the protein itself is the most prominent binder of the mature EPB41L4A-AS1 in ENCODE eCLIP data (Fig. 3A). The most striking connection between EPB41L4A-AS1 and SUB1 is the similar phenotype triggered by their loss (Fig. 4). We note that a recent study has shown that SUB1 is required for Pol I-mediated rDNA transcription in the nucleolus(Kaypee et al., 2025). In the presence of nucleolar SUB1, rDNA transcription proceeds as expected, but when SUB1 is depleted or its nucleolar localization is affected—by either sodium butyrate treatment or inhibition of KAT5-mediated phosphorylation at its lysine 35 (K35)—the levels of the 47S pre-rRNA are significantly reduced. In our settings, SUB1 is enriched in the nucleolus following EPB41L4A-AS1 KD; thus, we might expect to see slightly increased rDNA transcription or no effect at all, given that SUB1 localizes in the nucleolus under baseline conditions as well. It is, however, difficult to determine which of the connections between these two genes is the most functionally relevant and which may be indirect and/or feedback interactions. For example, it is possible that EPB41L4A-AS1 primarily acts as a transcriptional regulator of SUB1 mRNA, or that its RNA product is required for proper stability and/or localization of the SUB1 protein, or that EPB41L4A-AS1 acts as a scaffold for the formation of protein-protein interactions of SUB1.”

      (11) Figure 8: The scratch assay alone cannot be used as a measure of increased invasion, and this phenotype must be confirmed with a transwell invasion or migration assay. Thus, I highly recommend that the authors conduct this experiment using the Boyden chamber. Do the authors see upregulation of N-cadherin, Vimentin, and downregulation of E-cadherin in their RNA-seq?

      We agree with the reviewer that those phenotypes are complex and normally require multiple in vitro, as well as in vivo, assays to be thoroughly characterized. However, we respectfully consider these to be out of the scope of the current work, which is focused more on RNA biology and on the molecular characterization and functions of EPB41L4A-AS1.

      Nevertheless, in Fig. 8D we show that the canonical EMT signature (taken from MSigDB) is upregulated in cells with reduced expression of EPB41L4A-AS1. Notably, EMT has been found not to possess a unique gene expression program, but rather to involve distinct and partially overlapping gene signatures (Youssef et al., 2024). In Fig. 8D, the most upregulated gene is TIMP3, a matrix metallopeptidase inhibitor linked to a particular EMT signature that is less invasive and more profibrotic (EMT-T2, (Youssef et al., 2024)). Interestingly, we observed a strong upregulation of other genes linked to EMT-T2, such as TIMP1, FOSB, SOX9, JUNB, JUN and KLF4, whereas MMP genes (linked to EMT-T1, which is highly proteolytic and invasive) are generally downregulated or not expressed. With regard to N- and E-cadherin, the former does not pass our cutoff to be considered expressed, and the latter does not change significantly. Vimentin is also not significantly dysregulated. All these examples are reported in the newly added Fig. 8E:

      The text has also been updated accordingly:

      “These findings suggest that proper EPB41L4A-AS1 expression is required for cellular proliferation, whereas its deficiency results in the onset of more aggressive and migratory behavior, likely linked to the increase of the gene signature of epithelial to mesenchymal transition (EMT) (Fig. 8D). Because EMT is not characterized by a unique gene expression program and rather involves distinct and partially overlapping gene signatures (Youssef et al., 2024), we checked the expression level of marker genes linked to different types of EMTs (Fig. 8E). The most upregulated gene in Fig. 8D is TIMP3, a matrix metallopeptidase inhibitor linked to a particular EMT signature that is less invasive and more profibrotic (EMT-T2) (Youssef et al., 2024). Interestingly, we observed a stark upregulation of other genes linked to EMT-T2, such as TIMP1, FOSB, SOX9, JUNB, JUN and KLF4, whereas MMP genes (linked to EMT-T1, which is highly proteolytic and invasive) are generally downregulated or not expressed. This suggests that the downregulation of EPB41L4A-AS1 is primarily linked to a specific EMT program (EMT-T2), and future studies aimed at uncovering the exact mechanisms and relevance will shed light on the possible therapeutic potential of this lncRNA.”

      (12) Minor points:

      (a) What could be the explanation for why only the EPB41L4A-AS1 locus has an effect on the neighbouring gene?

      There might be multiple reasons why EPB41L4A-AS1 is able to modulate the expression of its neighboring genes. First, it is expressed from a TAD boundary exhibiting physical contacts with several genes in the two flanking TADs (Fig. 1F and 2A), placing it in the right spot to regulate their expression. Second, it is highly expressed compared to most of the genes nearby, and transcription has been linked to the establishment and maintenance of TAD boundaries (Costea et al., 2023). Accordingly, the (partial) depletion of EPB41L4A-AS1 via GapmeR transfection slightly reduces the contacts between the lncRNA and EPB41L4A loci (Fig. 2E and S4J), although this effect could also result from premature transcription termination triggered by the GapmeRs.

      There are a multitude of mechanisms by which lncRNAs with regulatory functions modulate the expression of one or more target genes in cis (Gil & Ulitsky, 2020), and our data do not unequivocally point to one of them. Distinguishing between these possibilities is a major challenge in the field and would be difficult to address in the context of this one study. It could be that the processive RNA polymerases at the EPB41L4A-AS1 locus are recruited to the neighboring loci, facilitated by the close proximity in 3D space. It is also possible that chromatin remodeling factors are recruited by the nascent RNA, and then promote and/or sustain the opening of chromatin at the target site. The latter possibility is intriguing, as this mechanism is proposed to be widespread among lncRNAs (Gil & Ulitsky, 2020; Oo et al., 2025) and we observed a significant reduction of H3K27ac levels at the EPB41L4A promoter region (Fig. 2D). Future studies combining chromatin profiling (e.g., CUT&RUN and ATAC-seq) and RNA pulldown experiments will shed light on the exact mechanisms by which this lncRNA regulates the expression of target genes in cis, and on its interacting partners.

      (b) The scale bar is missing on all the images in the Supplementary Figures as well.

      The scale bars are present in the top-left figure of each panel. We acknowledge that due to the export as PNG, some figures (including those with microscopy images) display abnormal font sizes and aspect ratio. All images were created using consistent fonts, sizes and ratio, and are provided as high-quality PDF in the current submission.

      (13) Methods:

      The authors should double-check if they used siRNAs and LNA GapmeRs at 25 and 50 µM concentrations, as that is a huge dose. Most papers use these reagents in the range of 5-50 nM maximum.

      We apologize for the typo; the text has been fixed. We performed the experiments at 25 and 50 nM, respectively, as suggested by the manufacturer’s protocol.

      (14) Discussion:

      Which cell lines were used in reference 27 (Cheng et al., 2024 Cell) to study the role of SNORA13? It may be useful to include this in the discussion.

      We already mentioned the cell system in the discussion, and we have now edited the text to include the specific cell line that was used:

      “A recent study found that SNORA13 negatively regulates ribosome biogenesis in TERT-immortalized human fibroblasts (BJ-HRAS<sup>G12V</sup>), by decreasing the incorporation of RPL23 into the maturing 60S ribosomal subunits, eventually triggering p53-mediated cellular senescence(Cheng et al., 2024).”

      Reviewer #3 (Recommendations for the authors):

      Major comments on weaknesses:

      (1) The paper is quite disjointed:

      (a) Figures1/2 studied the cis- and potential trans target genes altered by EPB41L4A-AS1 knockdown. They also showed some data about EPB41L4A-AS1 overlaps a strong chromatin boundary.

      (b) Figures3/4/5 studied the role of SUB1 - as it is altered by EPB41L4A-AS1 knockdown - in affecting genes and snoRNAs, which may partially underlie the gene/snoRNA changes after EPB41L4A-AS1 knockdown.

      (c) Figure 6 showed that EPB41L4A-AS1 knockdown did not directly affect SNORA13, the snoRNA located in the intron of EPB41L4A-AS1. Thus, the upregulation of many snoRNAs is not due to SNORA13.

      (d) Figure 7 studied whether the changes of cis genes or snoRNAs are due to transcriptional stability.

      (e) Figure 8 studied cellular phenotypes after EPB41L4A-AS1 knockdown.

      These points are overly spread out and this dilutes the central theme of these results, which this Reviewer considered to be on cis or trans gene regulation by this lncRNA. The title of the paper implies EPB41L4A-AS1 knockdown affected trans target genes, but the paper did not focus on studying cis or trans effects, except briefly mentioning that many genes were changed in Figure 2. The many changes of snoRNAs are suggested to be partially explained by SUB1, but SUB1 itself is affected (>50%, Figure 3B) by EPB41L4A-AS1 knockdown, so it is unclear if these are mostly secondary changes due to SUB1 reduction. Given the current content of the paper, the authors do not have sufficient evidence to support that the changes of trans genes are due to direct effects or indirect effects. And so they are encouraged to revise their title to be more on snoRNA regulation, as this area took the majority of the efforts in this paper.

      We respectfully disagree with the reviewer. We show that the effects on the proximal genes are cis-acting, as they are not rescued by exogenous expression, whereas the majority of the changes observed in the RNA-seq datasets appear to be indirect; the snoRNA changes, which indeed might be indirect and not necessarily involve direct interaction partners of the lncRNA, such as SUB1, appear to be trans-regulated, as they can be partially rescued by exogenous expression of the lncRNA. We also show that KD of the main cis-regulated gene, EPB41L4A, results in a much milder transcriptional response, further solidifying the contribution of trans-acting effects. While we agree that the snoRNA effects are interesting, we do not consider them the main result, as they are accompanied by many additional changes in gene expression, as well as changes in the subnuclear distribution of key nucleolar proteins, so it is difficult for us to claim that EPB41L4A-AS1 is specifically relevant to snoRNAs rather than to broader nucleolar biology. Therefore, we prefer not to mention snoRNAs specifically in the title.

      (2) EPB41L4A-AS1 knockdown caused ~2,364 gene changes. This is a very large amount of change on par with some transcriptional factors. It thus needs more scrutiny. First, on Page 9, second paragraph, the authors used |log2Fold-change| >0.41 to select differential genes, which is an unusual cutoff. What is the rationale? Often |log2Fold-change| >1 is more common. How many replicates are used? To examine how many gene changes are likely direct target genes, can the authors show how many of the cis-genes that are changed by EPB41L4A-AS1 knockdown have direct chromatin contacts with EPB41L4A-AS1 in HiC data? Is there any correlation between HiC contacts and their fold changes? Without a clear explanation of cis target genes as direct target genes, it is more difficult to establish whether any trans target genes are directly affected by EPB41L4A-AS1 knockdown.

      A |log<sub>2</sub>Fold-change| > 0.41 equals a change of 33% or more, which together with an adjusted P < 0.05 is a threshold that has been used in the past. All RNA-seq experiments were performed in triplicate, in line with the standards in the field. While it is possible that EPB41L4A-AS1 establishes multiple contacts in trans—a behavior observed for at least one other lncRNA, Firre, albeit involving its mature RNA product—we believe this to be less likely than the alternative, namely that the >2,000 DEGs predominantly result from secondary changes rather than from direct regulation via EPB41L4A-AS1 contacts.
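      As a quick arithmetic check of the cutoff (not part of the analysis pipeline): |log2 fold-change| > 0.41 corresponds to a fold change of about 1.33 in either direction, i.e. roughly a 33% increase or decrease:

```python
import math

# |log2 fold-change| > 0.41 corresponds to a fold change of 2**0.41
# in either direction, i.e. a change of roughly 33%.
cutoff_log2 = 0.41
fold_change = 2 ** cutoff_log2            # ~1.33
percent_change = (fold_change - 1) * 100  # ~33%

# Conversely, a 33% increase maps back onto the log2 scale:
log2_of_33pct = math.log2(1.33)           # ~0.41
```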

      In any case, we have inspected our UMI-4C data to identify other genes exhibiting higher-than-background contact frequencies, and thus potentially regulated in cis. To this end, we calculated the UMI-4C coverage in a 10 kb window centered around the TSS of the genes located on chromosome 5, which we subsequently normalized by the distance from EPB41L4A-AS1, in order to account for the intrinsically higher DNA recovery closer to the target DNA sequence. However, in our UMI-4C experiment we employed baits targeting three different genes—EPB41L4A-AS1, EPB41L4A and STARD4—and therefore such an approach assumes that the lncRNA locus has the most regulatory features in this region. As expected, we detected a strong negative correlation between the normalized coverage and the distance from the EPB41L4A-AS1 locus (ρ = -0.51, p-value < 2.2e-16), and the genes in the two neighboring TADs exhibited the strongest association with the bait region (Author response image 5). The genes that are downregulated in the adjacent TADs, namely NREP, MCC and MAN2A1 (Fig. 2F), show substantially higher contacts than background with the EPB41L4A-AS1 gene, thus potentially constituting additional cis-regulated targets of this lncRNA. We note that both SUB1 and NPM1 are located on chromosome 5 as well, albeit at distances exceeding 75 and 50 Mb, respectively, and they do not exhibit any striking association with the lncRNA locus.

      Author response image 5.

      UMI-4C coverage over the TSS of the genes located on chromosome 5. (A) Correlation between the normalized UMI-4C coverage over the TSS (± 5kb) of chromosome 5 genes and the absolute distance (in megabases, Mb) from EPB41L4A-AS1. (B) Same as in (A), but with the x axis showing the relative distance from EPB41L4A-AS1. In both cases, the genes in the two flanking TADs are colored in red and their names are reported.
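      The distance-versus-coverage rank correlation described above can be sketched as follows. This is a minimal illustration on synthetic numbers (the gene coverages and distances are hypothetical, and Spearman's ρ is implemented directly as the Pearson correlation of ranks rather than via our actual analysis code):

```python
def rank(values):
    """Average 1-based ranks, handling ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho = Pearson correlation of the ranks."""
    rx, ry = rank(x), rank(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Synthetic example: per-TSS coverage decays with distance from the bait
# (in Mb), so the rank correlation comes out strongly negative.
distance_mb = [0.1, 0.5, 1.0, 5.0, 10.0, 50.0]
coverage    = [900, 400, 250,  60,   30,    5]
rho = spearman(distance_mb, coverage)
```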

      To increase the confidence in our RNA-seq data, we have now performed another round of polyA+ RNA-seq following EPB41L4A-AS1 knockdown using LNA1 or LNA2, as well as the previously used control GapmeR and an additional one. The FPKMs of the control samples are highly correlated both within replicates and between GapmeRs (Fig. S6A). More importantly, the fold-changes relative to control are highly correlated between the two on-target GapmeRs LNA1 and LNA2, regardless of the GapmeR used for normalization (Fig. S6B), showing that despite significant GapmeR-specific effects, the bulk of the response is shared and likely the direct result of the reduction in EPB41L4A-AS1 levels. Notably, the key targets NPM1 and MTREX (see discussion, Fig. S12A-C and comments to Reviewer 3) were found to be downregulated by both LNAs (Fig. S6C).

      However, we acknowledge that some of the dysregulated genes are observed only with one GapmeR and not the other, likely due to a combination of indirect, secondary and non-specific effects, and as such it is difficult to infer the direct response without short time-course experiments (Much et al., 2024). Supporting this, LNA2 yielded a total of 1,069 DEGs (617 up and 452 down) and LNA1 yielded 2,493 DEGs (1,328 up and 1,287 down), with the latter triggering a stronger response, most likely as a result of the previously mentioned CDKN1A/p21 induction. Overall, 45.1% of the upregulated genes following LNA2 transfection were shared with LNA1, in contrast to only 24.3% of the downregulated ones.
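      The overlap percentages quoted above reduce to simple set arithmetic over the two DEG lists. A toy sketch (the gene names below are placeholders, not our actual DEGs):

```python
# Hypothetical up-regulated DEG sets for the two GapmeRs (placeholder names).
up_lna1 = {"GENE1", "GENE2", "GENE3", "GENE4",
           "GENE5", "GENE6", "GENE7", "GENE8"}
up_lna2 = {"GENE1", "GENE2", "GENE3", "GENE9", "GENE10"}

# Fraction of LNA2 up-regulated genes that are also up-regulated with LNA1.
shared = up_lna1 & up_lna2
pct_shared = 100 * len(shared) / len(up_lna2)  # 60.0% in this toy example
```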

      We have now included these results in the Results section (see below) and in Supplementary Figure (Fig. S6).

      “Most of the consequences of the depletion of EPB41L4A-AS1 are thus not directly explained by changes in EPB41L4A levels. An additional trans-acting function for EPB41L4A-AS1 would therefore be consistent with its high expression levels compared to most lncRNAs detected in MCF-7 (Fig. S5G). To strengthen these findings, we have transfected MCF-7 cells with LNA1 and a second control GapmeR (NT2), as well as the previous one (NT1) and LNA2, and sequenced the polyadenylated RNA fraction as before. Notably, the expression levels (in FPKMs) of the replicates of both control samples are highly correlated with each other (Fig. S6A), and the global transcriptomic changes triggered by the two EPB41L4A-AS1-targeting LNAs are largely concordant (Fig. S6B and S6C). Because of this concordance and the cleaner (i.e., no CDKN1A upregulation) readout in LNA2-transfected cells, we focused mainly on these cells for subsequent analyses.”

      Figure 3B, SUB1 mRNA is reduced >half by EPB41L4A-AS1 KD. How much did SUB1 protein reduce after EPB41L4A-AS1 KD? Similarly, how much is the NPM1 protein reduced? If these two important proteins were affected by EPB41L4A-AS1 KD simultaneously, it is important to exclude how many of the 2,364 genes that changed after EPB41L4A-AS1 KD are due to the protein changes of these two key proteins. For SUB1, Figures S7E,F,G provided some answers. But NPM1 KD is also needed to fully understand such. Related to this, there are many other proteins perhaps changed in addition to SUB1 and NPM1, this renders it concerning how many of the EPB41L4A-AS1 KD-induced changes are directly caused by this RNA. In addition to the suggested study of cis targets, the alternative mechanism needs to be fully discussed in the paper as it remains difficult to fully conclude direct versus indirect effect due to such changes of key proteins or ncRNAs (such as snoRNAs or histone mRNAs).

      As requested by both Reviewer #2 and #3, we have performed WB for SUB1, NPM1 and FBL following EPB41L4A-AS1 KD with two targeting GapmeRs (LNA1 and LNA2) and the previous control GapmeR. Interestingly, we did not detect any significant downregulation of any of these proteins (Author response image 3), although this might be the result of the high variability observed in the control samples. Moreover, the short timeframe in which the experiments were conducted (transient transfections for 3 days) might not be sufficient for the existing proteins to be degraded, and thus the downregulation is more evident at the RNA (Fig. 3B and Supplementary Figure 6C) rather than the protein level.

      We acknowledge that many proteins might change simultaneously, and pinpointing which ones act upstream of the plethora of indirect changes is extremely challenging when considering such large-scale changes in gene expression. In the case of SUB1 and NPM1 (which were prioritized for their predicted binding to the lncRNA, Fig. 3A), we show that the depletion of the former affects the latter in a way similar to that of the lncRNA (Fig. 5F). Moreover, snoRNA changes are also similarly affected (as the reviewer pointed out, Fig. 4F), suggesting that at least this phenomenon is predominantly mediated by SUB1. Other effects might also be indirect consequences of cellular responses, such as the decrease in histone mRNAs (Fig. 4A), which might reflect the decrease in cellular replication (Fig. 8C) and cell cycle genes (Fig. 2I) (although a link between SUB1 and histone mRNA expression has been described (Brzek et al., 2018)).

      Supporting the notion that additional proteins might be involved in driving the observed phenotypes, one of the genes most consistently affected by EPB41L4A-AS1 KD with GapmeRs is MTREX (also known as MTR4), which becomes downregulated at both the RNA and protein levels (now presented in the main text as Supplementary Figure 12). MTREX is part of the NEXT and PAXT complexes (Contreras et al., 2023), which target several short-lived RNAs for degradation; the depletion of either MTREX or other complex members leads to the upregulation of such RNAs, which include PROMPTs, uaRNAs and eRNAs, among others. Given the gaps in our understanding of snoRNA biogenesis from introns in mammalian systems (Monziani & Ulitsky, 2023), it is tempting to hypothesize a role for MTREX-containing complexes in trimming and degrading those introns and releasing the mature snoRNAs.

      We updated the discussion section to include these observations:

      “Beyond its site of transcription, EPB41L4A-AS1 associates with SUB1, an abundant protein linked to various functions, and these two players are required for proper distribution of various nuclear proteins. Their dysregulation results in large-scale changes in gene expression, including up-regulation of snoRNA expression, mostly through increased transcription of their hosts, and possibly through somewhat impaired snoRNA processing and/or stability. Further hindering our efforts to discern between these two possibilities, the exact molecular pathways involved in snoRNA biogenesis, maturation and decay are still not completely understood. One of the genes most consistently affected by EPB41L4A-AS1 KD with GapmeRs is MTREX (also known as MTR4), which becomes downregulated at both the RNA and protein levels (Fig. S12A-C). Interestingly, MTREX is part of the NEXT and PAXT complexes(Contreras et al., 2023), which target several short-lived RNAs for degradation, and the depletion of either MTREX or other complex members leads to the upregulation of such RNAs, which include PROMPTs, uaRNAs and eRNAs, among others. It is therefore tempting to hypothesize a role for MTREX-containing complexes in trimming and degrading those introns, and releasing the mature snoRNAs. Future studies specifically aimed at uncovering novel players in mammalian snoRNA biology will conclusively elucidate whether MTREX is indeed involved in these processes.”

      With regards to the changes in gene expression between the two LNAs, we provide a more detailed answer above and to the other reviewers as well.

      (3) A Strong discrepancy of results by different approaches of knockdown or overexpression:

      (a) CRISPRa versus LNA knockdown: Figure S4 - CRISPRa of EPB41L4A-AS1 did not affect EPB41L4A expression (Figure S4B). The authors should discuss how to interpret this result. Did CRISPRa not work to increase the nuclear/chromatin portion of EPB41L4A-AS1? Did CRISPRa of EPB41L4A-AS1 affect the gene upstream, STARD4? Did CRISPRa of EPB41L4A-AS1 also affect chromatin interactions between EPB41L4A-AS1 and the EPB41L4A gene? If so, this may argue that chromatin interaction is not necessary for cis-gene regulation.

      There are indeed several possible explanations; the most parsimonious is that, since the lncRNA is already very highly transcribed, the relatively modest effect of additional transcription mediated by CRISPRa is not sufficient to elicit a measurable effect. For this reason, we did not check by UMI-4C the contact frequency between the lncRNA and EPB41L4A upon CRISPRa.

      CRISPRa augments transcription at target loci, and thus the nuclear and chromatin retention of EPB41L4A-AS1 is not expected to be affected. We did not check the expression of STARD4, because we focused on EPB41L4A, which appears to be the main target locus according to Hi-C (Fig. 2A), UMI-4C (Fig. 2E and S4J) and GeneHancer (Fig. S1).

      We already provide extensive evidence of cis-regulation of EPB41L4A by EPB41L4A-AS1, and show that EPB41L4A is lowly expressed and likely has a limited role in our experimental settings. Thus, we respectfully propose that an in-depth exploration of the mechanism of action of this regulatory axis is out of the scope of the current study, which instead focuses more on the global effects of EPB41L4A-AS1 perturbation.

      (b) Related to this, while CRISPRa alone did not show an effect, upon LNA knockdown of EPB41L4A-AS1, CRISPRa of EPB41L4A-AS1 can increase EPB41L4A expression. It is perplexing as to why, upon LNA treatment, CRISPRa will show an effect (Figure S4H)? Actually, Figures S4H and I are very confusing in the way they are currently presented. They will benefit from being separated into two panels (H into 2 and I into two). And for Ectopic expression, please show controls by empty vector versus EPB41L4A-AS1, and for CRISPRa, please show sgRNA pool versus sgRNA control.

      The results are consistent with the parsimonious assumption mentioned above that the high transcription of the lncRNA at baseline is sufficient for maximal positive regulation of EPB41L4A, and that upon KD, the reduced transcription and/or RNA levels are no longer at saturating levels, and so CRISPRa can have an effect. We now mention this interpretation in the text:

      “Levels of EPB41L4A were not affected by increased expression of EPB41L4A-AS1 from the endogenous locus by CRISPR activation (CRISPRa), nor by its exogenous expression from a plasmid (Fig. S4B and S4C). The former suggests that endogenous levels of EPB41L4A-AS1—which are far greater than those of EPB41L4A—are sufficient to sustain the maximal expression of this target gene in MCF-7 cells.”

      We apologize for the confusion regarding the control used in the rescue experiments in Fig. S4H and S4I. The “-” in the Ectopic overexpression and CRISPRa correspond to the Empty Vector and sgControl, respectively, and not the absence of any vector. We changed the text in the figure legends:

      “(H) Changes in EPB41L4A-AS1 expression after rescuing EPB41L4A-AS1 with an ectopic plasmid or CRISPRa following its KD with GapmeRs. In both panels (Ectopic OE and CRISPRa) the “-” samples represent those transfected with the Empty Vector or sgControl. Asterisks indicate significance relative to the –/– control (transfected with both the control GapmeR and vector). (I) Same as in (H), but for changes in EPB41L4A expression.”

      (c) siRNA versus LNA knockdown: Figure S3A showed that siRNA KD of EPB41L4A-AS1 does not affect EPB41L4A expression. How to understand this data versus LNA?

      As explained in the text, siRNA-mediated KD presumably depletes mostly the cytoplasmic pool of EPB41L4A-AS1 rather than the nuclear one, which we assume explains the different effects of the two perturbations, as observed for other lncRNAs (e.g., (Ntini et al., 2018)). However, we acknowledge that we do not know which aspect of the nuclear RNA biology is relevant, be it nascent EPB41L4A-AS1 transcription, premature transcriptional termination, or even the nuclear pool of this lncRNA, and this can be elucidated further in future studies.

      (d) EPB41L4A-AS1 OE versus LNA knockdown: Figure 6F showed that EPB41L4A-AS1 OE caused reduction of EPB41L4A mRNA, particularly at 24hr. How to interpret that both LNA KD and OE of EPB41L4A-AS1 reduce the expression of EPB41L4A mRNA?

      We do not believe that OE of EPB41L4A-AS1, in particular that elicited by an ectopic plasmid, affects EPB41L4A RNA levels. In the experiment in Fig. 6F, relative EPB41L4A expression at 24h is ~0.65 (please note the log<sub>2</sub> scale in the graph), which is significant as reported. However, throughout this study (and as shown in Fig. S4C for the ectopic and Fig. S4B for the CRISPRa overexpression, respectively), we observed no such behavior, suggesting that the effect reported in Fig. 6F is the result of that particular experimental setting and unlikely to reflect a general phenomenon.

      (e) Did any of the effects on snoRNAs or trans target genes after EPB41L4A-AS1 knockdown still appear by CRISPRa?

      As mentioned above, we did a limited number of experiments after CRISPRa, prompted by the fact that endogenous levels of EPB41L4A-AS1 are already high enough to sustain its functions. Pushing the expression even higher will likely result in no or artifactual effects, which is why we respectfully propose such experiments are not essential in this current work, which instead mostly relies on loss-of-function experiments.

      For issue 3, extensive data repetition using all these methods may be unrealistic, but key data discrepancy needs to be fully discussed and interpreted.

      Other comments on weakness:

      (1) This manuscript will benefit from having line numbers so comments from Reviewers can be made more specifically.

      We added line numbers as suggested by the reviewer.

      (2) Figure 2G, to distinguish if any effects of EPB41L4A-AS1 come from the cytoplasmic or nuclear portion of EPB41L4A-AS1, an siRNA KD RNA-seq will help to filter out the genes affected by EPB41L4A-AS1 in the cytoplasm, as siRNA likely mainly acts in the cytoplasm.

      This experiment would be difficult to interpret because, while siRNAs mostly deplete the cytoplasmic pool of their target, they can also have some effects in the nucleus (e.g., (Sarshad et al., 2018)), and so siRNA knockdown will not necessarily report strictly on cytoplasmic functions.

      (3) Figure 2H, LNA knockdown of EPB41L4A should check the protein level reduction, is it similar to the change caused by knockdown of EPB41L4A-AS1?

      As suggested by Reviewer #2, we have now replaced the EPB41L4A Western blot, which now shows the results with both LNA1 and LNA2. Please note that the previous Fig. 2C was a subset of this, i.e., we had previously cropped out the results obtained with LNA1. Unfortunately, we did not have sufficient antibody to check for EPB41L4A protein reduction following LNA KD of EPB41L4A in a timely manner.

      (4) There are two LNA Gapmers used by the paper to knock down EPB41L4A-AS1, but some figures used LNA1, some used LNA2, preventing a consistent interpretation of the results. For example, in Figures 2A-D, LNA2 was used. But in Figures 2E-H, LNA1 was used. How consistent are the two in changing histone H3K27ac (like in Figure 2D) versus gene expression in RNA-seq? The changes in chromatin interaction appear to be weaker by LNA2 (Figure S4J) versus LNA1 (Figure 2E).

      As explained above and in response to Reviewer #1, we now provide more RNA-seq data for LNA1 and LNA2. We note that, besides unwanted and/or off-target effects, these two GapmeRs might not be equally effective in knocking down EPB41L4A-AS1, which could explain why LNA1 seems to have a stronger effect on chromatin than LNA2. Nonetheless, when we employed both we obtained similar and consistent results (e.g., Fig. 5A-D and 8A-C), suggesting that these and the other effects are indeed on-target effects due to EPB41L4A-AS1 depletion.

      (5) It will be helpful if the authors provide information on how long they conducted EPB41L4A-AS1 knockdown for most experiments to help discern direct or indirect effects.

      The length of all perturbations was indicated in the Methods section, and we now also mention it in the Results. Unless specified otherwise, perturbations were carried out for 72 hours. We agree with the reviewer that time-course experiments would have added value, but due to the extensive effort that they would require, we suggest that they are out of scope of the current study.

      (6) In Figures 1C and F, the authors showed results about EPB41L4A-AS1 overlapping a strong chromatin boundary. But these are not mentioned anymore in the later part of the paper. Does this imply any mechanism? Does EPB41L4A-AS1 knockdown or OE, or CRISPRa affect the expression of genes near the other interacting site, STARD4? Do genes located in the two adjacent TADs change more strongly as compared to other genes far away?

      We discuss this point in the Discussion section:

      “At the site of its own transcription, which overlaps a strong TAD boundary, EPB41L4A-AS1 is required to maintain expression of several adjacent genes, regulated at the level of transcription. Strikingly, the promoter of EPB41L4A-AS1 ranks in the 99.8th percentile of the strongest TAD boundaries in human H1 embryonic stem cells (Open2C et al., 2024; Salnikov et al., 2024). It features several CTCF binding sites (Fig. 2A), and in MCF-7 cells, we demonstrate that it blocks the propagation of the 4C signal between the two flanking TADs (Fig. 1F). Future studies will help elucidate how EPB41L4A-AS1 transcription and/or the RNA product regulate this boundary. So far, we found that EPB41L4A-AS1 did not affect CTCF binding to the boundary, and while some peaks in the vicinity of EPB41L4A-AS1 were significantly affected by its loss, they did not appear to be found near genes that were dysregulated by its KD (Fig. S11C). We also found that KD of EPB41L4A-AS1—which depletes the RNA product, but may also affect nascent RNA transcription (Lai et al., 2020; Lee & Mendell, 2020)—reduces the spatial contacts between the TAD boundary and the EPB41L4A promoter (Fig. 2E). Further elucidation of the exact functional entity needed for the cis-acting regulation will require detailed genetic perturbations of the locus, which are difficult to carry out in the polyploid MCF-7 cells without affecting other functional elements of this locus or compromising cell survival; indeed, we were unable to generate deletion clones despite several attempts.”

      As mentioned in the text (pasted below) and in Fig. 2F, most genes in the two flanking TADs become downregulated following EPB41L4A-AS1 KD. While STARD4 – which was chosen because it had spatial contacts above background with EPB41L4A-AS1 – did not reach statistical significance, others did and are highlighted. Those included NREP, which we also discuss:

      “Consistently with the RT-qPCR data, KD of EPB41L4A-AS1 reduced EPB41L4A expression, and also reduced expression of several, but not all other genes in the TADs flanking the lncRNA (Fig. 2F). Based on these data, EPB41L4A-AS1 is a significant cis-acting activator according to TransCistor (Dhaka et al., 2024) (P=0.005 using the digital mode). The cis-regulated genes reduced by EPB41L4A-AS1 KD included NREP, a gene important for brain development, whose homolog was downregulated by genetic manipulations of regions homologous to the lncRNA locus in mice (Salnikov et al., 2024). Depletion of EPB41L4A-AS1 thus affects several genes in its vicinity.”

      (7) Related to the description of SUB1 regulation of genes at DNA and RNA levels: "Of these genes, transcripts of only 56 genes were also bound by SUB1 at the RNA level, suggesting largely distinct sets of genes targeted by SUB1 at both the DNA and the RNA levels." SUB1 binding to chromatin by Cut&Run only indicates that it is close to DNA/chromatin, and this interaction with chromatin may still likely be mediated by RNAs. The authors used SUB1 binding sites in eCLIP-seq to suggest whether it acts via RNAs, but these binding sites are often from highly expressed gene mRNAs/exons. Standard analysis may not have examined low-abundance RNAs close to the gene promoters, such as promoter antisense RNAs. The authors can examine whether, for the promoters with Cut&Run peaks of SUB1, SUB1 eCLIP-seq shows binding to the low-abundance nascent RNAs near these promoters.

      In response to a related comment by Reviewer 1, we now show that when considering expression level–matched control genes, knockdown of EPB41L4A-AS1 still significantly affects expression of SUB1 targets over controls. The results are presented in Supplementary Figure 7 (Fig. S7C).

      Based on this analysis, while there is a tendency of increased expression with increased SUB1 binding, when controlling for expression levels the effect of down-regulation of SUB1-bound RNAs upon lncRNA knockdown remains, suggesting that it is not merely a confounding effect. We have updated the text as follows:

      “We hypothesized that loss of EPB41L4A-AS1 might affect SUB1, either via the reduction in its expression or by affecting its functions. We stratified SUB1 eCLIP targets into confidence tiers, based on the number, strength and confidence of the reported binding sites. Indeed, eCLIP targets of SUB1 (from HepG2 cells profiled by ENCODE) were significantly downregulated following EPB41L4A-AS1 KD in MCF-7, with more confident targets experiencing stronger downregulation (Fig. 3C). Importantly, this still holds true when controlling for gene expression levels (Fig. S7C), suggesting that this negative trend is not due to differences in their baseline expression.”
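
      As an illustration of the expression-matching strategy described above, here is a minimal sketch (our own simplification, not the authors' actual pipeline; the function name, the tolerance on the log expression scale, and the toy gene names are all assumptions) of how each SUB1 target could be paired with a non-target control of similar baseline expression before comparing knockdown responses:

```python
def matched_controls(targets, baseline, tolerance=0.1):
    """Pair each target gene with the closest-expression non-target gene.

    targets:   collection of target gene names.
    baseline:  dict mapping gene name -> baseline expression (e.g., log10 TPM).
    tolerance: maximum allowed difference in baseline expression.
    Returns a dict target -> matched control; targets without any candidate
    within the tolerance are simply skipped.
    """
    pool = [g for g in baseline if g not in targets]
    pairs = {}
    for t in targets:
        candidates = [g for g in pool if abs(baseline[g] - baseline[t]) <= tolerance]
        if candidates:
            # choose the control whose baseline expression is closest to the target's
            pairs[t] = min(candidates, key=lambda g: abs(baseline[g] - baseline[t]))
    return pairs

# Toy example: two hypothetical "targets" and three candidate controls
baseline = {"T1": 2.00, "T2": 5.00, "C1": 2.05, "C2": 4.95, "C3": 9.00}
pairs = matched_controls({"T1", "T2"}, baseline)
# The post-knockdown fold-changes of the matched controls could then be
# compared against those of the targets, e.g., with a rank-sum test.
```

      The design choice here is deterministic nearest-neighbor matching rather than random sampling within the tolerance window, which makes the control set reproducible across runs.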

      (8) Figure 8, the cellular phenotype is interesting. As EPB41L4A-AS1 is quite widely expressed, did it affect the phenotypes similarly in other breast cancer cells? MCF7 is not a particularly relevant metastasis model. Can a similar phenotype be seen in commonly used metastatic cell models such as MDA-MB-231?

      We agree that further expanding the models in which EPB41L4A-AS1 affects cellular proliferation, migration and any other relevant phenotype is of potential interest before considering targeting this lncRNA as a therapeutic approach. However, given that 1) others have already identified similar phenotypes upon the modulation of EPB41L4A-AS1 in a variety of different systems (see Results and Discussion), and 2) we were most interested in the molecular consequences following the loss of this lncRNA, we respectfully suggest that these experiments are out of scope of the current study.

      References

      Bahar Halpern, K., Caspi, I., Lemze, D., Levy, M., Landen, S., Elinav, E., Ulitsky, I., & Itzkovitz, S. (2015). Nuclear Retention of mRNA in Mammalian Tissues. Cell Reports, 13(12), 2653–2662.

      Brabletz, T., Kalluri, R., Nieto, M. A., & Weinberg, R. A. (2018). EMT in cancer. Nature Reviews. Cancer, 18(2), 128–134.

      Brzek, A., Cichocka, M., Dolata, J., Juzwa, W., Schümperli, D., & Raczynska, K. D. (2018). Positive cofactor 4 (PC4) contributes to the regulation of replication-dependent canonical histone gene expression. BMC Molecular Biology, 19(1), 9.

      Cheng, Y., Wang, S., Zhang, H., Lee, J.-S., Ni, C., Guo, J., Chen, E., Wang, S., Acharya, A., Chang, T.-C., Buszczak, M., Zhu, H., & Mendell, J. T. (2024). A non-canonical role for a small nucleolar RNA in ribosome biogenesis and senescence. Cell, 187(17), 4770–4789.e23.

      Conesa, C., & Acker, J. (2010). Sub1/PC4 a chromatin associated protein with multiple functions in transcription. RNA Biology, 7(3), 287–290.

      Contreras, X., Depierre, D., Akkawi, C., Srbic, M., Helsmoortel, M., Nogaret, M., LeHars, M., Salifou, K., Heurteau, A., Cuvier, O., & Kiernan, R. (2023). PAPγ associates with PAXT nuclear exosome to control the abundance of PROMPT ncRNAs. Nature Communications, 14(1), 6745.

      Costea, J., Schoeberl, U. E., Malzl, D., von der Linde, M., Fitz, J., Gupta, A., Makharova, M., Goloborodko, A., & Pavri, R. (2023). A de novo transcription-dependent TAD boundary underpins critical multiway interactions during antibody class switch recombination. Molecular Cell, 83(5), 681–697.e7.

      Das, C., Hizume, K., Batta, K., Kumar, B. R. P., Gadad, S. S., Ganguly, S., Lorain, S., Verreault, A., Sadhale, P. P., Takeyasu, K., & Kundu, T. K. (2006). Transcriptional coactivator PC4, a chromatin-associated protein, induces chromatin condensation. Molecular and Cellular Biology, 26(22), 8303–8315.

      Dhaka, B., Zimmerli, M., Hanhart, D., Moser, M. B., Guillen-Ramirez, H., Mishra, S., Esposito, R., Polidori, T., Widmer, M., García-Pérez, R., Julio, M. K., Pervouchine, D., Melé, M., Chouvardas, P., & Johnson, R. (2024). Functional identification of cis-regulatory long noncoding RNAs at controlled false discovery rates. Nucleic Acids Research, 52(6), 2821–2835.

      Didiot, M.-C., Ferguson, C. M., Ly, S., Coles, A. H., Smith, A. O., Bicknell, A. A., Hall, L. M., Sapp, E., Echeverria, D., Pai, A. A., DiFiglia, M., Moore, M. J., Hayward, L. J., Aronin, N., & Khvorova, A. (2018). Nuclear Localization of Huntingtin mRNA Is Specific to Cells of Neuronal Origin. Cell Reports, 24(10), 2553–2560.e5.

      Dongre, A., & Weinberg, R. A. (2019). New insights into the mechanisms of epithelial-mesenchymal transition and implications for cancer. Nature Reviews. Molecular Cell Biology, 20(2), 69–84.

      Dubois, J.-C., Bonnell, E., Filion, A., Frion, J., Zimmer, S., Riaz Khan, M., Teplitz, G. M., Casimir, L., Méthot, É., Marois, I., Idrissou, M., Jacques, P.-É., Wellinger, R. J., & Maréchal, A. (2025). The single-stranded DNA-binding factor SUB1/PC4 alleviates replication stress at telomeres and is a vulnerability of ALT cancer cells. Proceedings of the National Academy of Sciences of the United States of America, 122(2), e2419712122.

      Garavís, M., & Calvo, O. (2017). Sub1/PC4, a multifaceted factor: from transcription to genome stability. Current Genetics, 63(6), 1023–1035.

      Gil, N., & Ulitsky, I. (2020). Regulation of gene expression by cis-acting long non-coding RNAs. Nature Reviews. Genetics, 21(2), 102–117.

      Hou, Y., Gan, T., Fang, T., Zhao, Y., Luo, Q., Liu, X., Qi, L., Zhang, Y., Jia, F., Han, J., Li, S., Wang, S., & Wang, F. (2022). G-quadruplex inducer/stabilizer pyridostatin targets SUB1 to promote cytotoxicity of a transplatinum complex. Nucleic Acids Research, 50(6), 3070–3082.

      Jan, C. H., Friedman, R. C., Ruby, J. G., & Bartel, D. P. (2011). Formation, regulation and evolution of Caenorhabditis elegans 3’UTRs. Nature, 469(7328), 97–101.

      Kaypee, S., Ochiai, K., Shima, H., Matsumoto, M., Alam, M., Ikura, T., Kundu, T. K., & Igarashi, K. (2025). Positive coactivator PC4 shows dynamic nucleolar distribution required for rDNA transcription and protein synthesis. Cell Communication and Signaling : CCS, 23(1), 283.

      Lai, F., Damle, S. S., Ling, K. K., & Rigo, F. (2020). Directed RNase H Cleavage of Nascent Transcripts Causes Transcription Termination. Molecular Cell, 77(5), 1032–1043.e4.

      Lee, J.-S., & Mendell, J. T. (2020). Antisense-Mediated Transcript Knockdown Triggers Premature Transcription Termination. Molecular Cell, 77(5), 1044–1054.e3.

      Lubelsky, Y., & Ulitsky, I. (2018). Sequences enriched in Alu repeats drive nuclear localization of long RNAs in human cells. Nature, 555(7694), 107–111.

      Ly, S., Didiot, M.-C., Ferguson, C. M., Coles, A. H., Miller, R., Chase, K., Echeverria, D., Wang, F., Sadri-Vakili, G., Aronin, N., & Khvorova, A. (2022). Mutant huntingtin messenger RNA forms neuronal nuclear clusters in rodent and human brains. Brain Communications, 4(6), fcac248.

      Maranon, D. G., & Wilusz, J. (2020). Mind the Gapmer: Implications of Co-transcriptional Cleavage by Antisense Oligonucleotides. Molecular Cell, 77(5), 932–933.

      Monziani, A., & Ulitsky, I. (2023). Noncoding snoRNA host genes are a distinct subclass of long noncoding RNAs. Trends in Genetics : TIG, 39(12), 908–923.

      Mortusewicz, O., Roth, W., Li, N., Cardoso, M. C., Meisterernst, M., & Leonhardt, H. (2008). Recruitment of RNA polymerase II cofactor PC4 to DNA damage sites. The Journal of Cell Biology, 183(5), 769–776.

      Much, C., Lasda, E. L., Pereira, I. T., Vallery, T. K., Ramirez, D., Lewandowski, J. P., Dowell, R. D., Smallegan, M. J., & Rinn, J. L. (2024). The temporal dynamics of lncRNA Firre-mediated epigenetic and transcriptional regulation. Nature Communications, 15(1), 6821.

      Ntini, E., Louloupi, A., Liz, J., Muino, J. M., Marsico, A., & Ørom, U. A. V. (2018). Long ncRNA A-ROD activates its target gene DKK1 at its release from chromatin. Nature Communications, 9(1), 1636.

      Oo, J. A., Warwick, T., Pálfi, K., Lam, F., McNicoll, F., Prieto-Garcia, C., Günther, S., Cao, C., Zhou, Y., Gavrilov, A. A., Razin, S. V., Cabrera-Orefice, A., Wittig, I., Pullamsetti, S. S., Kurian, L., Gilsbach, R., Schulz, M. H., Dikic, I., Müller-McNicoll, M., … Leisegang, M. S. (2025). Long non-coding RNAs direct the SWI/SNF complex to cell type-specific enhancers. Nature Communications, 16(1), 131.

      Open2C, Abdennur, N., Abraham, S., Fudenberg, G., Flyamer, I. M., Galitsyna, A. A., Goloborodko, A., Imakaev, M., Oksuz, B. A., Venev, S. V., & Xiao, Y. (2024). Cooltools: Enabling high-resolution Hi-C analysis in Python. PLoS Computational Biology, 20(5), e1012067.

      Potapova, T. A., Unruh, J. R., Conkright-Fincham, J., Banks, C. A. S., Florens, L., Schneider, D. A., & Gerton, J. L. (2023). Distinct states of nucleolar stress induced by anticancer drugs. https://doi.org/10.7554/eLife.88799.

      Ray, D., Laverty, K. U., Jolma, A., Nie, K., Samson, R., Pour, S. E., Tam, C. L., von Krosigk, N., Nabeel-Shah, S., Albu, M., Zheng, H., Perron, G., Lee, H., Najafabadi, H., Blencowe, B., Greenblatt, J., Morris, Q., & Hughes, T. R. (2023). RNA-binding proteins that lack canonical RNA-binding domains are rarely sequence-specific. Scientific Reports, 13(1), 5238.

      Salgado, S., Abreu, P. L., Moleirinho, B., Guedes, D. S., Larcombe, L., & Azzalin, C. M. (2024). Human PC4 supports telomere stability and viability in cells utilizing the alternative lengthening of telomeres mechanism. EMBO Reports, 25(12), 5294–5315.

      Salnikov, P., Korablev, A., Serova, I., Belokopytova, P., Yan, A., Stepanchuk, Y., Tikhomirov, S., & Fishman, V. (2024). Structural variants in the Epb41l4a locus: TAD disruption and Nrep gene misregulation as hypothetical drivers of neurodevelopmental outcomes. Scientific Reports, 14(1), 5288.

      Sarshad, A. A., Juan, A. H., Muler, A. I. C., Anastasakis, D. G., Wang, X., Genzor, P., Feng, X., Tsai, P.-F., Sun, H.-W., Haase, A. D., Sartorelli, V., & Hafner, M. (2018). Argonaute-miRNA Complexes Silence Target mRNAs in the Nucleus of Mammalian Stem Cells. Molecular Cell, 71(6), 1040–1050.e8.

      Tafforeau, L., Zorbas, C., Langhendries, J.-L., Mullineux, S.-T., Stamatopoulou, V., Mullier, R., Wacheul, L., & Lafontaine, D. L. J. (2013). The complexity of human ribosome biogenesis revealed by systematic nucleolar screening of Pre-rRNA processing factors. Molecular Cell, 51(4), 539–551.

      Unfried, J. P., & Ulitsky, I. (2022). Substoichiometric action of long noncoding RNAs. Nature Cell Biology, 24(5), 608–615.

      Van Nostrand, E. L., Freese, P., Pratt, G. A., Wang, X., Wei, X., Xiao, R., Blue, S. M., Chen, J.-Y., Cody, N. A. L., Dominguez, D., Olson, S., Sundararaman, B., Zhan, L., Bazile, C., Bouvrette, L. P. B., Bergalet, J., Duff, M. O., Garcia, K. E., Gelboin-Burkhart, C., … Yeo, G. W. (2020). A large-scale binding and functional map of human RNA-binding proteins. Nature, 583(7818), 711–719.

      Yang, K., Wang, M., Zhao, Y., Sun, X., Yang, Y., Li, X., Zhou, A., Chu, H., Zhou, H., Xu, J., Wu, M., Yang, J., & Yi, J. (2016). A redox mechanism underlying nucleolar stress sensing by nucleophosmin. Nature Communications, 7, 13599.

      Youssef, K. K., Narwade, N., Arcas, A., Marquez-Galera, A., Jiménez-Castaño, R., Lopez-Blau, C., Fazilaty, H., García-Gutierrez, D., Cano, A., Galcerán, J., Moreno-Bueno, G., Lopez-Atalaya, J. P., & Nieto, M. A. (2024). Two distinct epithelial-to-mesenchymal transition programs control invasion and inflammation in segregated tumor cell populations. Nature Cancer, 5(11), 1660–1680.

      Yu, L., Ma, H., Ji, X., & Volkert, M. R. (2016). The Sub1 nuclear protein protects DNA from oxidative damage. Molecular and Cellular Biochemistry, 412(1-2), 165–171.

      • How is this class going to help me when I get a job? Although my major might seem like it's just a step toward a specific job, I see it as a way to grow both personally and professionally. "So much of what you do in college, such as doing research and taking general education classes or pursuing a minor and participating in student organizations, is designed to help you become a more intelligent, capable, understanding, aware, and competent person—regardless of your major." Through this class I am trying to be patient and learn skills that will prepare me to be successful in my career, not just in my first job after graduation.
      • When Learning Job Skills Is Not Enough? Job skills "by themselves" are not enough. They are merely one aspect of the array of knowledge and abilities required to be successful in any given profession. Employers are most interested in who the graduate has become, not just the specific set of professional skills he or she learned.

      • An Invitation to a New Kind of Conversation What are you going to do with that major? Although my major might seem like it's just a step toward a specific job, I see it as a way to grow both personally and professionally. "So much of what you do in college, such as doing research and taking general education classes or pursuing a minor and participating in student organizations, is designed to help you become a more intelligent, capable, understanding, aware, and competent person—regardless of your major" I see this in my own experience: my major is helping me develop my critical-thinking, problem-solving, and communication skills, while also making me a more aware person. College isn't just about the first job I get after graduation; it's also about who I am becoming and the skills and growth I gain along the path I chose.

    1. Youtuber Innuendo Studios talks about the way arguments are made in a community like 4chan: You can’t know whether they mean what they say, or are only arguing as though they mean what they say. And entire debates may just be a single person stirring the pot [e.g., sockpuppets]. Such a community will naturally attract people who enjoy argument for its own sake, and will naturally trend toward the most extreme version of any opinion. In short, this is the free marketplace of ideas. No code of ethics, no social mores, no accountability. … It’s not that they’re lying, it’s that they just don’t care. […] When they make these kinds of arguments they legitimately do not care whether the words coming out of their mouths are true. If they cared, before they said something is true, they would look it up. The Alt-Right Playbook: The Card Says Moops by Innuendo Studios While there is a nihilistic worldview where nothing matters, we can see how this plays out practically, which is that they tend to protect their group (normally white and male), and tend to be extremely hostile to any other group. They will express extreme misogyny (like we saw in the Rules of the Internet: “Rule 30. There are no girls on the internet. Rule 31. TITS or GTFO - the choice is yours”), and extreme racism (like an invented Nazi My Little Pony character). Is this just hypocritical, or is it ethically wrong? It depends, of course, on what tools we use to evaluate this kind of trolling. If the trolls claim to be nihilists about ethics, or indeed if they are egoists, then they would argue that this doesn’t matter and that there’s no normative basis for objecting to the disruption and harm caused by their trolling. 
But on just about any other ethical approach, there are one or more reasons available for objecting to the disruptions and harm caused by these trolls! If the only way to get a moral pass on this type of trolling is to choose an ethical framework that tells you harming others doesn’t matter, then it looks like this nihilist viewpoint isn’t deployed in good faith. Rather, with any serious (i.e., non-avoidant) moral framework, this type of trolling is ethically wrong for one or more reasons (though how we explain it is wrong depends on the specific framework).

      This reading helped me see that trolling in spaces like 4chan isn’t just about “free speech” or joking, but about a lack of care for truth and harm. The idea that arguments are made without concern for whether they are true explains why these communities drift toward extreme misogyny and racism. While trolls may claim a nihilistic or egoist stance, this feels less like a genuine ethical position and more like a shield to avoid responsibility. Under almost any serious moral framework, the deliberate disruption and harm caused by trolling is ethically wrong, especially when it consistently targets marginalized groups.

    2. 7.6.3. Trolling and Nihilism While trolling can be done for many reasons, some trolling communities take on a sort of nihilistic philosophy: it doesn’t matter if something is true or not, it doesn’t matter if people get hurt, the only thing that might matter is if you can provoke a reaction. We can see this nihilism show up in one of the versions of the self-contradictory “Rules of the Internet:” 8. There are no real rules about posting … 20. Nothing is to be taken seriously … 42. Nothing is Sacred Youtuber Innuendo Studios talks about the way arguments are made in a community like 4chan: You can’t know whether they mean what they say, or are only arguing as though they mean what they say. And entire debates may just be a single person stirring the pot [e.g., sockpuppets]. Such a community will naturally attract people who enjoy argument for its own sake, and will naturally trend toward the most extreme version of any opinion. In short, this is the free marketplace of ideas. No code of ethics, no social mores, no accountability. … It’s not that they’re lying, it’s that they just don’t care. […] When they make these kinds of arguments they legitimately do not care whether the words coming out of their mouths are true. If they cared, before they said something is true, they would look it up. The Alt-Right Playbook: The Card Says Moops by Innuendo Studios While there is a nihilistic worldview where nothing matters, we can see how this plays out practically, which is that they tend to protect their group (normally white and male), and tend to be extremely hostile to any other group. They will express extreme misogyny (like we saw in the Rules of the Internet: “Rule 30. There are no girls on the internet. Rule 31. TITS or GTFO - the choice is yours”), and extreme racism (like an invented Nazi My Little Pony character). Is this just hypocritical, or is it ethically wrong? It depends, of course, on what tools we use to evaluate this kind of trolling. 
If the trolls claim to be nihilists about ethics, or indeed if they are egoists, then they would argue that this doesn’t matter and that there’s no normative basis for objecting to the disruption and harm caused by their trolling. But on just about any other ethical approach, there are one or more reasons available for objecting to the disruptions and harm caused by these trolls! If the only way to get a moral pass on this type of trolling is to choose an ethical framework that tells you harming others doesn’t matter, then it looks like this nihilist viewpoint isn’t deployed in good faith. Rather, with any serious (i.e., non-avoidant) moral framework, this type of trolling is ethically wrong for one or more reasons (though how we explain it is wrong depends on the specific framework).

      This section helped me think about trolling in a much more nuanced way, especially the idea that disruption itself isn’t automatically good or bad. I found the discussion about group formation and norm enforcement really useful, because it explains why trolling can feel threatening—it challenges the patterns and signals that groups rely on to define who belongs. The comparison between trolling, protest, and revolution also stood out to me, since it shows how moral judgment often depends on whether we see the existing social order as legitimate. Overall, this section made it clear that evaluating trolling ethically requires looking beyond intent or humor and examining what is being disrupted and who is harmed or protected by that disruption.

    1. One of the traditional pieces of advice for dealing with trolls is “Don’t feed the trolls,” which means that if you don’t respond to trolls, they will get bored and stop trolling. We can see this advice as well in the trolling community’s own “Rules of the Internet”: Do not argue with trolls - it means that they win But the essayist Film Crit Hulk argues against this in Don’t feed the trolls, and other hideous lies. That piece argues that the “don’t feed the trolls” strategy doesn’t stop trolls from harassing: Ask anyone who has dealt with persistent harassment online, especially women: [trolls stopping because they are ignored] is not usually what happens. Instead, the harasser keeps pushing and pushing to get the reaction they want with even more tenacity and intensity. It’s the same pattern on display in the litany of abusers and stalkers, both online and off, who escalate to more dangerous and threatening behavior when they feel like they are being ignored. Film Crit Hulk goes on to say that the “don’t feed the trolls” advice puts the burden on victims of abuse to stop being abused, giving all the power to trolls. Instead, Film Crit Hulk suggests giving power to the victims and using “skilled moderation and the willingness to kick people off platforms for violating rules about abuse”

      Many people often say that the best way to deal with online trolls is to “just ignore them.” It sounds reasonable—as if by not responding, the troll will lose interest and disappear. But reality doesn't always play out that way. For some trolls, being ignored only fuels their desire to “make their presence felt.” They become more abusive and attack more frequently, just to provoke a response.

      This advice of “ignoring trolls” actually places a huge burden on the person being harassed. It creates the impression that if things escalate, it's your fault for not handling it properly, rather than acknowledging the other party's inappropriate behavior. In contrast, shifting the focus to how platforms manage and address violating accounts seems more reasonable. Clearer rules, more timely moderation, and genuine consequences like account suspensions or feature restrictions would make trolls pay a price, rather than forcing victims to endure in silence.

    1. We keep blaming technology when it's our own misuse of these tools that has led to everything we hate it for. Just a simple reflection can show us that we have always been this way.

    2. In the early Internet message boards that were centered around different subjects, experienced users would “troll for newbies” by posting naive questions that all the experienced users were already familiar with.

      This way of trolling the newbies seems like a power play or a form of online bullying that gets excused as "trolling." I wonder if it's just an excuse to be mean, because if it happened in real life, I don't think people would find it as amusing.

    1. lulz.

      It is interesting to realize that trolling is much more than just people being mean for "the lulz"—it can actually be a strange form of social protest or a way to make a point by exposing how gullible people are. It's basically using fake posts to mess with people's heads, whether that's to feel powerful or just to gatekeep a community from "normies."

    1. It's helpful to know that we are not being graded simply on work and completion, but also on the time we spend advancing our knowledge and gaining understanding.

    1. Computer-based journey stories offer a new way of savoring exactly this pleasure, a pleasure that is intensified by uniting the problem solving with the active process of navigation.

      I think it's interesting to see how the diversification of mediums for journey stories has allowed society to expand its mindsets, beliefs, and understanding of its own capabilities. Individuals can take control of their 'destiny' just as the characters in the journey stories they read do. All in all, agency really has taken on a new form with every iteration of the journey story.

    1. What I was trying to say before you so rudely attacked me was - you two just signed your own death certificates IF you don't get some help on your side. I'm that help. LUCKY. And why should we trust you? MALCOLM. Well, for one, if I wanted - you'd both be dead right now. And the fact that I ain't slicing and dicing your crazy redneck bums should be an indication that I'm not trying or motivated in killing either one of you. JESS. What do you think, Luck? LUCKY. I don't like him. JESS. Yeah, but you don't like nobody. LUCKY. It's safer that way. MALCOLM. Shooting Dingo Tolson is a declaration of war against The Devils. JESS. Yeah, we know that. MALCOLM. And if you can't beat me, there's no way you got a chance against The Devils

      This part creates an event through the introduction of a new character. Although it is early in the play, the arrival of a new character creates the occasion for an event. Because the introduction is presented solely from Jess's point of view, we are also stuck in her point of view during the introduction of Malcolm. When Malcolm says that if Jess wants to beat The Devils she has to be able to beat him first, it makes everyone wonder what is so special about Malcolm that puts him on par with The Devils. It creates tension between Jess and Lucky on one side and Malcolm on the other over Malcolm's background.


    1. “Judicial reform was already a top priority for Republican lawmakers in our legislative session that starts in less than three weeks. After today, our message to the judiciary is simply this: buckle up.

      It’s a warning, not just rhetoric. It signals lawmakers may pursue: laws to limit judicial review, changes to court jurisdiction, alterations to judicial election rules, and increased use of constitutional amendments to override court interpretations.

    1. students to use datasets that they find on websites like Kaggle

      Just today I watched a video lecture by professor Wickes on her research about STEM education for non-STEM people. She raised the same problem about Kaggle: it's a place full of perfectly clean datasets of a kind you rarely see in real work, for example at a company. Learning by using Kaggle omits essential parts of computing and data education - data cleaning and preparation.

    1. Agree Extremely

      I don't really care, but it's interesting that we're not using "Somewhat agree" and "Strongly agree" here, etc. I'm curious whether there's a reason for the change in word order (and "Extremely" for "Strongly"). Fine to leave it, just curious.

    1. E-books make up less than 5 percent of the current book market, but that number is growing. At the beginning of 2010, Amazon had about 400,000 titles available for the Kindle device.

      Less than 5% seems like a really small number, but when put into comparison to just how many books have been written and released physically over time, it's actually really impressive that e-books have made a decent dent in the book market. Especially given the short amount of time e-books have been released.

    1. State your opinion, leave it on the table to be debated, picked up and critiqued by your colleagues, and then develop a question out of that opinion.

      I love Kyla Tompkin's straightforward solutions! Don't mince your words to a self-answering (self-bragging) question. Just confidently state your opinion, and it's okay to not understand and be corrected! Something that we've been discussing during lecture is how academia prizes and rewards perfectionism and mastery over knowledge. I am still learning, but when I was first reading theory, I wanted to always perform mastery over a text, and would prepare exactly what I would say for some academic clout and personal validation. It wasn't until entering upper division Feminist Studies courses that I realized that the students that I respected the most were always asking the most provocative questions, ones that I had never even considered to think about the answer for.

    1. Analect 13.3 raises an interesting paradox: Moral truth vs. Legal justice. What Confucius seems to be saying here is that being “true” isn’t the same thing as just following the law. The Governor of She treats reporting the father as a moral good, but Confucius pushes back by grounding truth in family relationships instead. For Confucius, morality isn’t mainly about rules or punishment; it’s about acting properly within your roles, especially as a son or father. From that perspective, turning in your own parent misses what really matters morally.

    1. Authenticity is a concept we use to talk about connections and interactions when the way the connection is presented matches the reality of how it functions.

      This definition elevates authenticity from "whether the content is true" to "whether the relationship is aligned": the type of interaction presented (friendly sharing, intimate confidences, genuine vulnerability) must be consistent with how it actually operates. In other words, authenticity is a kind of "contractual consistency"—what I think I'm getting is what I actually get. If an account uses "friend-like candor" to build intimacy, but is essentially just a marketing script run by a team, the problem isn't necessarily that it's a performance, but that it fails to make it clear to the audience what kind of relationship they are in, thus creating a gap between expectations and reality, and a feeling of being exploited. This also explains why some "performative" content (comedy accounts, role-playing) doesn't receive criticism: because its presentation and actual operation are consistent, and the audience knows what they are participating in.

    2. To describe something as authentic, we are often talking about honesty, in that the thing is what it claims to be. But we also describe something as authentic when we want to say that it offers a certain kind of connection.

      This sentence really clarified authenticity for me. It’s not just about whether something is “real” or “fake,” but whether expectations line up with reality. That helps explain why joke accounts or surprise parties don’t feel inauthentic in a bad way because the mismatch isn’t harmful or misleading.

    3. Authenticity is a concept we use to talk about connections and interactions when the way the connection is presented matches the reality of how it functions. An authentic connection can be trusted because we know where we stand. An inauthentic connection offers a surprise because what is offered is not what we get. An inauthentic connection could be a good surprise, but usually, when people use the term ‘inauthentic’, they are indicating that the surprise was in some way problematic: someone was duped.

      authenticity is basically “what you see is what you get,” so you can trust the vibe and know where you stand. When something’s inauthentic, it’s not just different than expected, it’s different in a way that feels misleading, like you got played. And yeah, even if the surprise could be harmless, “inauthentic” usually implies someone was tricked or taken advantage of.

    1. For example, we found in a 2017 study with colleagues that from 1970 to 2010 metropolitan areas with greater concentrations of immigrants, legal and undocumented combined, have less property crime than areas with fewer immigrants, on average. Critics suggested that our findings would not hold if we looked at only the subset of undocumented individuals.

      From a purely scientific standpoint - and not just a racist, xenophobic one - it's best to look at data from all viewpoints. In this case, that would mean crime statistics from cities with greater concentrations of purely undocumented immigrants in one study, then, in another, statistics from documented immigrants. Then, separate the statistics by race, ethnicity, gender, orientation, etc., to see if there were any underlying causes for the crimes.

    1. Author response:

      Reviewer #1 (Public review):

      The authors analysed large-scale brain-state dynamics while humans watched a short video. They sought to identify the role of thalamocortical interactions.

      Major concerns

      (1) Rationale for using the naturalistic stimulus

      In terms of brain state dynamics, previous studies have already reported large-scale neural dynamics by applying some data-driven analyses, like energy landscape analysis and Hidden Markov Model, to human fMRI/EEG data recorded during resting/task states. Considering such prior work, it'd be critical to provide sufficient biological rationales to perform a conceptually similar study in a naturalistic condition, i.e., not just "because no previous work has been done". The authors would have to clarify what type of neural mechanisms could be missed in conventional resting-state studies using, say, energy landscape analysis, but could be revealed in the naturalistic condition.

      We appreciate your insightful comments regarding the need for a biological rationale in our study. As you mentioned, there are similar studies: for example, Meer et al. utilized Hidden Markov Models to identify various activation modes of brain networks that included subcortical regions[1], and Song et al. linked brain states to narrative understanding and attentional dynamics[2, 3]. These studies help explain why we use naturalistic-stimulus datasets. Moreover, there is evidence suggesting that the thalamus plays a crucial role in processing information in a more naturalistic context, pointing to the vital role of thalamocortical communication[4, 5]. We therefore aimed to bridge thalamic activity and cortical state transitions using the energy landscape description.

      To address these gaps in conventional resting-state studies, we explored an alternative method—maximum entropy modeling based on the energy landscape. This allowed us to validate how the thalamus responds to cortical state transitions. To enhance clarity, we will update our introduction to emphasize the motivations behind our research and the significance of examining these neural mechanisms in a naturalistic setting.

      (2) Effects of the uniqueness of the visual stimulus and reproducibility

      One of the main drawbacks of the naturalistic condition is the unexpected effects of the stimuli. That is, this study looked into the data recorded from participants who were watching Sherlock, but what would happen to the results if we analyzed the brain activity data obtained from individuals who were watching different movies? To ensure the generalizability of the current findings, it would be necessary to demonstrate qualitative reproducibility of the current observations by analysing different datasets that employed different movie stimuli. In fact, it'd be possible to find such open datasets, like www.nature.com/articles/s41597-023-02458-8.

      We appreciate your concern regarding the reproducibility of our findings. The dataset from the "Sherlock" study is of high quality and has shown good generalizability in various research contexts. We acknowledge the importance of validating our results with different datasets to enhance the robustness of our conclusions. While we are open to exploring additional datasets, we intend to pursue this validation once we identify a suitable alternative. Currently, we are considering a comparison with the dataset from "Forrest Gump" as part of our initial plan.

      (3) Spatial accuracy of the "Thalamic circuit" definition

      One of the main claims of this study heavily relies on the accuracy of the localization of two different thalamic architectures: matrix and core. Given the conventional or relatively low spatial resolution of the fMRI data acquisition (3x3x3 mm^3), it appears to be critically essential to demonstrate that the current analysis accurately distinguished fMRI signals between the matrix and core parts of the thalamus for each individual.

      We acknowledge the importance of accurately localizing the different thalamic architectures, specifically the matrix and core regions. To address this, we downsampled the atlas of matrix and core cell populations from the previous study from a resolution of 2x2x2 mm<sup>3</sup> to 3x3x3 mm<sup>3</sup>, which aligns with our fMRI data acquisition. We will report the atlas as Supplementary Figures in our revision.

      (4) More detailed analysis of the thalamic circuits

      In addition, if such thalamic localisation is accurate enough, it would be greatly appreciated if the authors perform similar comparisons not only between the matrix and core architectures but also between different nuclei. For example, anterior, medial, and lateral groups (e.g., pulvinar group). Such an investigation would meet the expectations of readers who presume some microscopic circuit-level findings.

      We appreciate your suggestion regarding a more detailed analysis of thalamic circuits. We have touched upon this in the discussion section as a forward-looking consideration. However, we believe that performing nuclei segmentation with 3T fMRI may not be ideal due to well-documented concerns regarding signal-to-noise ratio and spatial resolution. That said, we are interested in exploring these nuclei-pathway connections to cortical areas in future studies with a proper 7T fMRI naturalistic dataset.

      (5) Rationale for different time window lengths

      The authors adopted two different time window lengths to examine the neural dynamics. First, they used a 21-TR window for signal normalisation. Then, they narrowed down the window length to 13-TR periods for the following statistical evaluation. Such a seemingly arbitrary choice of the shorter time window might be misunderstood as a measure to relax the threshold for the correction of multiple comparisons. Therefore, it'd be appreciated if the authors stuck to the original 21-TR time window and performed statistical evaluations based on the setting.

      Thank you for your valuable feedback regarding the choice of time window lengths. We aimed to maintain consistency in window lengths across our analyses. In light of your comments and suggestions from other reviewers, we plan to test our results using different time window lengths and report findings that generalize across these variations. Should the results differ significantly, we will discuss the implications of this variability in our revised manuscript.

      (6) Temporal resolution

      After identifying brain states with energy landscape analysis, this study investigated the brain state transitions by directly looking into the fMRI signal changes. This manner seems to implicitly assume that no significant state changes happen in one TR (=1.5sec), which needs sufficient validation. Otherwise, like previous studies, it'd be highly recommended to conduct different analyses (e.g., random-walk simulation) to address and circumvent this problem.

      Thank you for raising this important point regarding temporal resolution. Many fMRI studies, such as those examining event boundaries during movie watching, operate under similar assumptions concerning state changes within one TR. For example, Barnett et al. processed the dynamic functional connectivity (dFC) with a window of 20 TRs (24.4s). We therefore see this not as a limitation specific to our study, but as a common question related to fMRI scanning parameters. To strengthen our analysis of state transitions and ensure they are not merely coincidental, we plan to conduct random-walk simulations, as suggested, to validate our findings in accordance with methodologies used in previous research.

      Reviewer #2 (Public review):

      Summary:

      In this study, Liu et al. investigated cortical network dynamics during movie watching using an energy landscape analysis based on a maximum entropy model. They identified perception- and attention-oriented states as the dominant cortical states during movie watching and found that transitions between these states were associated with inter-subject synchronization of regional brain activity. They also showed that distinct thalamic compartments modulated distinct state transitions. They concluded that cortico-thalamo-cortical circuits are key regulators of cortical network dynamics.
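      Inter-subject correlation (ISC) appears throughout this review without a definition. One common way to compute it (one of several variants, and shown here only as an illustration on synthetic data, not the authors' exact pipeline) is the leave-one-out scheme: correlate each subject's regional time course with the average of all other subjects, then average the resulting correlations:

```python
import numpy as np

# Synthetic data: 5 subjects x 100 time points for one region. Each subject's
# time course = a shared stimulus-driven signal + private noise.
rng = np.random.default_rng(1)
shared = rng.normal(size=100)
data = shared + 0.5 * rng.normal(size=(5, 100))

def leave_one_out_isc(x):
    """Correlate each subject with the mean of all other subjects, then average."""
    rs = []
    for i in range(x.shape[0]):
        others = np.delete(x, i, axis=0).mean(axis=0)
        rs.append(np.corrcoef(x[i], others)[0, 1])
    return float(np.mean(rs))

isc = leave_one_out_isc(data)
assert isc > 0.5  # a strong shared component yields high ISC
```

      High ISC at a region is usually read as strong stimulus-locked processing shared across viewers, which is why moments of state transition can plausibly modulate it.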

      Strengths:

      A mechanistic understanding of cortical network dynamics is an important topic in both experimental and computational neuroscience, and this study represents a step forward in this direction by identifying key cortico-thalamo-cortical circuits. The analytical strategy employed in this study, particularly the LASSO-based analysis, is interesting and would be applicable to other data types, such as task- and resting-state fMRI.

      We thank the reviewer for this comment and encouragement.

      Weaknesses:

      Due to issues related to data preprocessing, support for the conclusions remains incomplete. I also believe that a more careful interpretation of the "energy" derived from the maximum entropy model would greatly clarify what the analysis actually revealed.

      Thank you for your valuable suggestions, and we apologize for any misunderstandings regarding the interpretation of the energy landscape in our study. To address this issue, we will include a dedicated paragraph in both the methods and results sections to clarify our use of the term "energy" derived from the maximum entropy model. This addition aims to eliminate any ambiguity and provide a clearer understanding of what our analysis reveals.

      (1) I think the method used for binarization of BOLD activity is problematic in multiple ways.

      a) Although the authors appear to avoid using global signal regression (page 4, lines 114-118), the proposed method effectively removes the global signal. According to the description on page 4, lines 117-122, the authors binarized network-wise ROI signals by comparing them with the cross-network BOLD signal (i.e., the global signal): at each time point, network-wise ROI signals above the cross-network signal were set to 1, and the rest were set to −1. If I understand the binarization procedure correctly, this approach forces the cross-network signal to be zero (up to some noise introduced by the binarization of network-wise signals), which is essentially equivalent to removing the global signal. Please clarify what the authors meant by stating that "this approach maintained a diverse range of binarized cortical states in data where the global signal was preserved" (page 4, lines 121-122).

      Thank you for highlighting the potential issue with our binarization method. We appreciate your insights regarding the comparison of network-wise ROI signals with the cross-network BOLD signal, as this may inadvertently remove the global signal. To address this, we will conduct a comparative analysis of results obtained from both our current approach and the original pipeline. If we decide to retain our current method, we will carefully reconsider the rationale and rephrase our descriptions to ensure clarity regarding the preservation of the global signal and the diversity of binarized cortical states.
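      For readers following the two schemes under discussion, here is a toy numpy sketch (synthetic data, not the study's pipeline) contrasting time-point-wise binarization against the cross-network mean, as the reviewer reads the manuscript, with conventional network-wise binarization against each network's own temporal mean. It also demonstrates the structural artifact raised in point (b) below: the time-point-wise scheme can never produce an all-(+1) or all-(-1) state.

```python
import numpy as np

# Toy BOLD matrix: 6 networks x 8 time points (synthetic values, illustration only).
rng = np.random.default_rng(0)
bold = rng.normal(size=(6, 8))

# Scheme A: at each TIME POINT, a network is +1 if above the cross-network
# mean at that moment, else -1.
timepoint_binarized = np.where(bold > bold.mean(axis=0, keepdims=True), 1, -1)

# Scheme B (conventional): within each NETWORK, compare to that network's own
# temporal mean.
network_binarized = np.where(bold > bold.mean(axis=1, keepdims=True), 1, -1)

# Scheme A structurally forbids some states: the minimum (maximum) value in a
# column is always below (above) the column mean, so no time point can be
# all +1 or all -1.
assert not np.any(np.all(timepoint_binarized == 1, axis=0))
assert not np.any(np.all(timepoint_binarized == -1, axis=0))
```

      The asserts hold for any continuous-valued input under Scheme A, which is exactly why those state probabilities are forced to zero independently of the biology.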

      b) The authors might argue that they maintained a diverse range of cortical states by performing the binarization at each time point (rather than within each network). However, I believe this introduces another problem, because binarizing network-wise signals at each time point distorts the distribution of cortical states. For example, because the cross-network signal is effectively set to zero, the network cannot take certain states, such as all +1 or all −1. Similarly, this binarization biases the system toward states with similar numbers of +1s and −1s, rather than toward unbalanced states such as (+1, −1, −1, −1, −1, −1). These constraints and biases are not biological in origin but are simply artifacts of the binarization procedure. Importantly, the energy landscape and its derivatives (e.g., hard/easy transitions) are likely to be affected by these artifacts. I suggest that the authors try a more conventional binarization procedure (i.e., binarization within each network), which is more robust to such artifacts.

      Related to this point, I have a question regarding Figure S1, in which the authors plotted predicted versus empirical state probabilities. As argued above, some empirical state probabilities should be zero because of the binarization procedure. However, in Figure S1, I do not see data points corresponding to these states (i.e., there should be points on the y-axis). Did the authors plot only a subset of states in Figure S1? I believe that all states should be included. The correlation coefficient between empirical and predicted probabilities (and the accuracy) should also be calculated using all states.

      Thank you for your thoughtful examination of our data processing pipeline. We agree that a comparison between the conventional binarization method and our current approach is warranted, and we appreciate your suggestion. Upon reviewing Figure S1, we discovered that there was indeed an error related to the plotting style set to "log10." As you correctly pointed out, the data should reflect that the probabilities for states where all networks are either activated or deactivated are zero. We are very interested in exploring the state distributions obtained from both the original and current approaches, as your comments highlight important considerations. We sincerely appreciate your insightful feedback and will make sure to address these points thoroughly in our first revision.

      c) The current binarization procedure likely inflates non-neuronal noise and obscures the relationship between the true BOLD signal and its binarized representation. For example, consider two ROIs (A and B): both (+2%, +1%) and (+0.01%, −0.01%) in BOLD signal changes would be mapped to (+1, −1) after binarization. This suggests that qualitatively different signal magnitudes are treated identically. I believe that this issue could be alleviated if the authors were to binarize the signal within each network, rather than at each time point.

      Thank you for your important observation regarding the potential inflation of non-neuronal noise in our current binarization procedure. We recognize that this process could lead to qualitatively different signal magnitudes being treated similarly after binarization, as you illustrated with your example. While we acknowledge your point, we believe that conventional binarization pipelines may also encounter this issue, albeit by comparing signals to a network's temporal mean activity. To address this concern and maintain consistency with previous studies, we will discuss this limitation in our revised manuscript. Additionally, if deemed necessary, we will explore implementing a percentile-based threshold above the baseline to further refine our binarization approach. Your suggestion provides a valuable perspective, and we appreciate your insights.

      (2) As the authors state (page 5, lines 145-148), the "energy" described in the energy landscape is not biological energy but rather a statistical transformation of probability distributions derived from the Boltzmann distribution. If this is the case, I believe that Figure 2A is potentially misleading and should be removed. This type of schematic may give the false impression that cortical state dynamics are governed by the energy landscape derived from the maximum entropy model (which is not validated).

      Thank you for your valuable feedback regarding Figure 2A. We apologize for any confusion it may have created. While we recognize that similar figures are commonly used in literature involving energy landscapes (maximum entropy model), we agree that Figure 2A may mislead readers into thinking that cortical state dynamics are directly governed by the energy landscape derived from the maximum entropy model, which has not been validated. In light of your comments, we will remove Figure 2A and instead emphasize the analytical strategy presented in Figure 2B. Additionally, we will provide a simplified line graph as an illustrative example to clarify the concepts without the potential for misinterpretation.
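      To make the statistical reading of "energy" concrete, the sketch below (with made-up parameters h and J, not fitted to any data) enumerates all states of a toy 3-network pairwise maximum entropy model. The energy is E(s) = -sum_i h_i s_i - sum_{i<j} J_ij s_i s_j, and the Boltzmann form P(s) proportional to exp(-E(s)) means that "low energy" is simply "high probability":

```python
import itertools
import numpy as np

# Made-up parameters for a toy 3-network pairwise maximum entropy model;
# in a real analysis h and J are fitted to the binarized data.
h = np.array([0.2, -0.1, 0.05])
J = np.array([[0.0, 0.3, -0.2],
              [0.3, 0.0, 0.1],
              [-0.2, 0.1, 0.0]])

def energy(s):
    """E(s) = -sum_i h_i s_i - sum_{i<j} J_ij s_i s_j (J symmetric, zero diagonal)."""
    s = np.asarray(s)
    return -h @ s - 0.5 * s @ J @ s

# Boltzmann form: P(s) proportional to exp(-E(s)), normalized over all 2^3 states.
states = list(itertools.product([-1, 1], repeat=3))
energies = np.array([energy(s) for s in states])
probs = np.exp(-energies) / np.exp(-energies).sum()

# The deepest "valley" is just the most probable state; nothing metabolic
# or physiological is implied by the term "energy" here.
assert np.isclose(probs.sum(), 1.0)
assert np.argmin(energies) == np.argmax(probs)
```

      This is the sense in which the landscape's valleys and barriers are a re-parameterization of a fitted probability distribution rather than a biological energy, which is the ambiguity both this reviewer and Reviewer #3 raise.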

      Reviewer #3 (Public review):

      Summary:

      In this study, Liu et al. analyze fMRI data collected during movie watching, applied an energy landscape method with pairwise maximum entropy models. They identify a set of brain states defined at the level of canonical functional networks and quantify how the brain transitions between these states. Transitions are classified as "easy" or "hard" based on changes in the inferred energy landscape, and the authors relate transition probabilities to inter-subject correlation. A major emphasis of the work is the role of the thalamus, which shows transition-linked activity changes and dynamic connectivity patterns, including differential involvement of parvalbumin- and calbindin-associated thalamic subdivisions.

      Strengths:

      The study is methodologically complex and technically sophisticated. It integrates advanced analytical methods into high-dimensional fMRI data. The application of energy landscape analysis to movie-watching data appears to be novel as well. The finding on the thalamus involved energy state transition and provides a strong linkage to several theories on thalamic control functions, which is a notable strength.

      Thanks for your comments on the novelty of our study.

      Weaknesses:

      The main weakness is the conceptual clarity and advances that this otherwise sophisticated set of analyses affords. A central conceptual ambiguity concerns the energy landscape framework itself. The authors note that the "energy" in this model is not biological energy but a statistical quantity derived from the Boltzmann distribution. After multiple reads, I still have major trouble mapping this measure onto any biological and cognitive operations. BOLD signal is a measure of oxygenation as a proxy of neural activity, and correlated BOLD (functional connectivity) is thought to measure the architecture of information communication of brain systems. The energy framework described in the current format is very difficult for most readers to map onto any neural or cognitive knowledge base on the structure and function of brain systems. Readers unfamiliar with maximum entropy models may easily misinterpret energy changes as reflecting metabolic cost, neural effort, or physiological variables, and it is just very unclear what that measure is supposed to reflect. The manuscript does not clearly articulate what conceptual and mechanistic advances the energy formalism provides beyond a mathematical and statistical report. In other words, beyond mathematical description, it is very hard for most readers to understand the process and function of what this framework is supposed to tell us in regards to functional connectivity, brain systems, and cognition. The brain is not a mathematical object; it is a biological organ with cognitive functions. The impact of this paper is severely limited until connections can be made.

      Thank you for your insightful and constructive comments regarding the conceptual clarity of our energy landscape framework. We appreciate your perspective on the challenges of mapping the statistical measure of "energy" derived from the Boltzmann distribution onto biological and cognitive operations. To address these concerns, we will revise our manuscript to clarify our expressions surrounding "energy" and emphasize its probabilistic nature. Additionally, we will incorporate a series of analyses that explicitly relate the features of the energy landscape to cognitive processes and key parameters, such as brain integration and functional connectivity. We believe these changes will help bridge the gap between our mathematical framework and its relevance to understanding brain systems and cognitive functions.

      Relatedly, the use of metaphors such as "valleys," "hills," and "routes" in multidimensional measures lacks grounding. Valleys and hills of what is not intuitive to understand. Based on my reading, these features correspond to local minima and barriers in a probability distribution over binarized network activation patterns, but similar to the first point, the manuscript does not clearly explain what it means conceptually, neurobiologically, or computationally for the brain to "move" through such a landscape. The brain is not computing these probabilities; they are measurement tools of "something". What is it? To advance beyond mathematical description, these measurements must be mapped onto neurobiological and cognitive information.

      Thank you for your valuable feedback. In our revisions, we will aim to link the concept of rapid transition routes in the energy landscape to cognitive processes, such as narrative understanding and related features. By exploring these connections, we hope to provide a clearer context for how our framework can enhance understanding of cognitive functions and their neural correlates.

      This conceptual ambiguity goes back to the Introduction. At the level of motivation, the purpose and deliverables of the study are not defined in the Introduction. The stated goal is "Transitions between distinct cortical brain states modulate the degree of shared neural processing under naturalistic conditions". I do not know if readers will have a clear answer to this question at the end. Is the claim that state transitions cause changes in inter-subject correlation, that they index moments of narrative alignment, or that they reflect changes in attentional or cognitive mode? This level of explanation is largely dissociated from the methods in their current form.

      Thank you for highlighting this important point regarding the conceptual clarity in our Introduction. We appreciate your feedback about the motivation and objectives of the study. To clarify the stated goal of investigating how transitions between distinct cortical brain states modulate shared neural processing under naturalistic conditions, we will revise the manuscript to explicitly define the specific claims we aim to address. We will ensure that these explanations are closely tied to the methods employed in our study, providing a clearer framework for our readers.

      Several methodological choices could use clarification. The use of a 21-TR window centered on transition offsets is unusually long relative to the temporal scale of fMRI dynamics and to the hypothesized rapidity of state transitions. On a related note, what is the temporal scale of a state transition? Is it faster than 21 TRs?

      Thank you for your insightful questions regarding our methodological choices. Our focus on specific state transitions necessitated the use of a 21-TR window. While it is true that other transitions may occur within this window, averaging across the same transitions at different times allows us to identify distinctive thalamic BOLD patterns that precede cortical state transitions. This methodology enables us to capture the relevant dynamics while focusing on the transitions of interest. We appreciate your feedback, and this clarification will be included in our revised manuscript. We will also add a figure that describes the dwell times of cortical states.

      The choice of movie-watching data is a strength. But many of the analyses performed here (energy landscape estimation, clustering of states) could in principle be applied to resting-state data. The manuscript does not clearly articulate what is gained, mechanistically or cognitively, by using movie stimuli beyond the availability of inter-subject correlation.

      Thank you for your question, which closely aligns with a concern raised by Reviewer #1. Our core hypothesis posits that naturalistic stimuli yield a broader set of brain states compared to those observed during resting-state conditions. To support this assertion, we will clearly articulate the findings from previous studies that relate to this hypothesis. Additionally, if appropriate, we will provide a comparative analysis between our data and resting-state data to highlight the differences and emphasize the uniqueness of the brain states elicited by naturalistic stimuli.

      Because of the above issues, a broader concern throughout the results is the largely descriptive nature of the findings. For example, the LASSO analysis shows that certain state transitions predict ISC in a subset of regions, with respectable R² values. While statistically robust, the manuscript provides little explanation of why these particular transitions should matter, what computations they might reflect, or how they relate to known cognitive operations during movie watching. Similar issues arise in the clustering analyses. Clustering high-dimensional fMRI-derived features will almost inevitably produce structure, whether during rest, task, or naturalistic viewing. What is missing is an explanation of why these specific clusters are meaningful in functional or mechanistic terms.

      Thank you for your questions. In our revisions, we will perform additional analyses aimed at linking state transitions to cognitive processes more explicitly. Regarding clustering, we will provide a thorough discussion in the revised manuscript.
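For context, the kind of sparse regression at issue can be sketched as follows. This is a self-contained coordinate-descent LASSO on simulated data, with the feature matrix standing in for transition counts and the response standing in for regional ISC; all names and values are synthetic illustrations, not the authors' data or pipeline:

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=300):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            partial = y - X @ b + X[:, j] * b[j]  # residual excluding feature j
            rho = X[:, j] @ partial / n
            b[j] = soft_threshold(rho, lam) / col_sq[j]
    return b

rng = np.random.default_rng(0)
n, p = 200, 20
X = rng.standard_normal((n, p))                  # e.g. 20 transition-type predictors
w_true = np.zeros(p)
w_true[:3] = [2.0, -1.5, 1.0]                    # only 3 transitions truly matter
y = X @ w_true + 0.5 * rng.standard_normal(n)    # e.g. a region's ISC values

b = lasso_cd(X, y, lam=0.1)
r2 = 1 - np.sum((y - X @ b) ** 2) / np.sum((y - y.mean()) ** 2)
```

The L1 penalty zeroes out most coefficients, which is why a small subset of transitions ends up carrying the prediction; the reviewer's point is that sparsity alone does not explain why those survivors are cognitively meaningful.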

      Finally, the treatment of the thalamus, while very exciting, could use more anatomical and circuit-level specificity. The manuscript largely treats the thalamus as a unitary structure, despite decades of work demonstrating substantial functional and connectivity differences across thalamic nuclei. A whole-thalamus analysis without more detailed resolution is increasingly difficult to justify. The subsequent subdivision into PVALB- and CALB-associated regions partially addresses this, but these markers span multiple nuclei with overlapping projection patterns.

      This suggestion aligns with the feedback from Reviewer #1. We believe that segmenting individual thalamic nuclei with 3T fMRI may not be ideal, given well-documented concerns regarding signal-to-noise ratio and spatial resolution. Therefore, investigating core and matrix cell projections across different thalamic nuclei using 7T fMRI presents a promising avenue for further study.


    1. Teacher knowledge and skills as well as teacher attitudes impact the school context. Emergent bilinguals have a greater chance for success in a school with adequate numbers of highly qualified teachers with background in second language acquisition, second language teaching, linguistics, and cross-cultural communication. The presence of counselors with training in working with second language students is also essential. School resources, including bilingual libraries and adequate technology access, are important for success as well. As is true in all schools with strong parent involvement, programs that include the parents of culturally and linguistically diverse students result in better outcomes for emergent bilingual students.

      As a teacher you can't do everything that everybody wants you to; you can't get every certification, take every class, and spend 28 hours of a 24-hour day on school things. It's just not possible, but it seems like this is almost expected of teachers. I keep hearing in my education courses that we need to keep doing more and more, and then I go talk to teachers who are actually in the classroom, and one main message I keep getting from them is: make sure you are taking personal time and meeting your own self-care needs. The students will not benefit from you burning out. So it is also reasonable not to be certified in things like this, but to rely partially on those who are and ask them for help when you don't know where to go. All you can do is your best, and sometimes you don't have 100% to give, so just give whatever percentage you have and show your students you care about them. Honestly, at the end of the day, if your students know you love and care for them, then in my book you have won big time.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers


      We thank the Reviewers for their positive assessment and recognition of the paper's achievements. Their insightful comments will strengthen the data and the manuscript.

      Referee #1

      Minor comments

      1. Fig 1B - add arrows showing mRNAs being translated or not (the latter mentioned in line 113 is not so easy to see).

      We have magnified the inset of the colocalisation in the right column and added arrows and arrowheads to differentiate colocalised and non-colocalised bcd with translating SunTag.

      2. Fig 2A - add a sentence explaining why 1,6HD, 2,5HD and NaCl disrupt P bodies.

      We have added the information on the use of 1,6HD, 2,5HD, and NaCl to disrupt P-bodies as below. Revised line 158: “To further show that bcd storage in P bodies is required for translational repression, we treated mature eggs with chemicals known to disrupt RNP granule integrity (31, 37, 69-72). Previous work has shown that the physical properties of P bodies in mature Drosophila oocytes can be shifted from an arrested to a more liquid-like state by addition of the aliphatic alcohol hexanediol (HD) (Sankaranarayanan et al., 2021; Ribbeck and Görlich, 2002; Kroschwald et al., 2017). While 1,6 HD has been widely used to probe the physical state of phase-separated condensates both in vivo and in vitro (Alberti et al., 2019; McSwiggen et al., 2019; Gao et al., 2022), in some cells it appears to have unwanted cellular consequences (Ulianov et al., 2021). These include potentially lethal cellular consequences that may indirectly affect the ability of condensates to form (Kroschwald et al., 2017) and wider cellular implications thought to alter the activity of kinases (Düster et al., 2021). While we did not observe any noticeable cellular issues in mature Drosophila oocytes with 1,6 HD, we also used 2,5 HD, known to be less problematic in most tissues (Ulianov et al., 2021), and the monovalent salt sodium chloride (NaCl), which changes electrostatic interactions (Sankaranarayanan et al., 2021).”

      Fig 4C - explain in the legend what the white lines drawn over the image represent. And why is there such an obvious distinction in the staining where suddenly the DAPI is much more evident (is the image from tile scans)?

      Figure 4C is a tile-scan image of an n.c.10 embryo, and the white lines divide the image into four quadrants. We used this image to quantify the extent of bcd (magenta) colocalisation with SunTag (green) in the anterior and posterior domains of the embryo in the bar graph shown in panel C’. There is a formatting error in the image, which we will correct in the revised version. We will also describe the white lines in the legend. Finally, based on further reviewer comments, in the revised version these data are moved to the supplementary information.

      • Line 215 - 'We did not see any significant differences in the translation of bcd based on their position, however, there appears an enhanced translation of bcd localised basally to the nuclei (Figure S5).' Since the difference is not significant, I do not think the authors should conclude that translation is enhanced basally.

      We agree with the reviewer. In this preliminary revision we have changed this statement to: “We did not see any differences in the translation of bcd based on their position with respect to the nuclei position (Figure S5)” (revised line 238-239).

      Line 218: 'The interphase nuclei and their subsequent mitotic divisions appeared to displace bcd towards the apical surface (Figure S6B).' Greater explanation is needed in the legend to Fig S6B to support this statement as the data just seem to show a nuclear division - I would have thought an apical-basal view is needed to conclude this.

      We have rearranged this figure to show clearly the apical-basal view of the blastoderm nuclei and the displacement of bcd from the surface of the blastoderm in Figure S8.

      New Figure S8: n.c.8 - pre-cortical migration; n.c.12,14 - post-cortical migration; mitosis stages of n.c.9-10. The cortical interphase nuclei at n.c.12,14 displace bcd. The nuclear area (DAPI, cyan) does not show any bcd particles (magenta), indicated by blue stars. The mitotic nuclei (yellow arrowheads, yellow stars) displace bcd along the plane of nuclear division (double-headed yellow arrows).

      Fig 5B - the authors compare Bcd protein distribution across developmental time. However, in the early time points cytoplasmic Bcd is measured (presumably as it does not appear nuclear until nc8 onwards) and compare the distribution to nuclear Bcd intensities from nc9 onwards. Is most/all of the Bcd protein nuclear localised form nc9 to validate the nuclear quantitation? Does the distribution look the same if total Bcd protein is measured per volume rather than just the nuclear signal? Are the authors assuming a constant fast rate of nuclear import?

      From n.c.8 onwards, the Bcd signal in interphase nuclei builds up, with the nuclear intensity becoming very high compared to cytoplasmic Bcd. However, we do see significant Bcd signal in the cytoplasm (i.e., above background). In earlier work, gradients of the nuclear Bcd and nuclear-import mutant Bcd overlapped closely (Figure 1B, Grimm et al., 2010). This essentially suggests the nuclear Bcd gradient reflects the corresponding gradient of cytoplasmic Bcd. Further, the nuclear import of Bcd occurs rapidly after photobleaching (Gregor et al., 2007). Based on these observations, and our own measurements, prior to n.c. 9, the cytoplasmic gradient is likely a good approximation of the overall shape, whereas post n.c. 9 the Bcd signal is largely nuclear localised. Further, the overall profile is not dependent on the nuclear volume.

      • Line 259 - 'We then asked if considering the spatiotemporal pattern of bcd translation' - the authors should clarify what new information was included in the model. Similarly in line 286, 'By including more realistic bcd mRNA translation' - what does this actually mean? In line 346, 'We see that the original SDD model .... was too simple.' It would be nice to compare the outputs from the original vs modified SDD models to support the statement that the original model was too simple.

      We will improve the linking of the results to the model. The important point is that when and where Bcd production occurs is modelled more faithfully than in previous approximations. By including more realistic production domains, we can replicate the observed Bcd gradient within the SDD paradigm without resorting to more complex models.

      Fig S1A - clarify what the difference is between the 2 +HD panels shown.

      The two +HD panels at stage 14 indicate that upon the addition of HD, there are no particles in 70% of the embryos, and 30% show reduced particles. We will add this information to the figure legend.

      • Fig S2E - the graph axis label/legend says it is intensity/molecule. Since intensity/molecule is higher in the anterior for bcd RNAs, is this because there are clumps of mRNAs (in which case it's actually intensity/puncta)?

      The density of mRNA is very high in the anterior pole; there is a chance that more than one bcd particle lies within an imaged punctum (due to optical resolution limitations). We will change the y-axis label from average intensity per molecule to average intensity per puncta.


      • Fig S4 - I think this line is included in error: '(B) The line plots of bcd spread on the Dorsal vs. Ventral surfaces.'

      Yes, we will correct this in the revision.

      • In B, D, E - is the plot depth from the dorsal surface? I would have preferred to see actual mRNA numbers rather than normalised mRNAs. In Fig S4D moderate, from 10um onwards there are virtually no mRNA counts based on the normalised value, but what is the actual number? The equivalent % translated data in Fig S4E look noisy so I wonder if this is due to there being a tiny mRNA number. The same is true for Figs S4D, E 10um+ in the low region.

      Beyond 10um from the dorsal surface, the number of bcdSun10 counts is very low, and it becomes negligible in the moderate and low domains. We will attach the actual counts of mRNA in all these domains as a supplementary table in the revised version.

      General assessment

      Strengths are: 1) the data are of high quality; 2) the study advances the field by directly visualising Bcd mRNA translation during early Drosophila development; 3) the data showing re-localisation of bcd mRNAs to P bodies at nc14 provide new mechanistic insight into its degradation; 4) a new SDD model for Bcd gradient formation is presented. Limitations of the study are: 1) there was already strong evidence (but no direct demonstration) that bcd mRNA translation was associated with release from P bodies at egg activation; 2) it is not totally clear to me how exactly the modified SDD model varies from the original one, both in terms of parameters included and model output.

      This is the first direct demonstration of the translation of bcd mRNA released as single mRNAs from P bodies. Previously, we showed that P body disruption releases single bcd mRNAs from the condensates (31). We have captured a comprehensive picture of individual bcd translation events, from their release from P bodies at the end of oocyte maturation until the end of blastoderm formation.

      The underlying SDD model – that of localised production, diffusion, and degradation – is still the same (up to spatially varying diffusion). Yet the model as originally formulated did not fit all aspects of the data, especially with regards to the system dynamics. Here, we demonstrate that by including more accurate approximations of when and where Bcd is produced, we can explain the formation of the Bcd morphogen gradient without recourse to any further mechanism.
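As an illustration only (not the authors' implementation), the SDD paradigm described above can be sketched as a 1D reaction-diffusion model with an anterior production domain. The domain length, source width, and the values of D, omega, and beta below are assumptions chosen for demonstration, not fitted parameters from the paper:

```python
import numpy as np

# Synthesis-Diffusion-Degradation: dC/dt = D*C_xx + beta*src(x) - omega*C
L, nx = 500.0, 251                    # assumed egg length (um) and grid size
dx = L / (nx - 1)
D = 4.0                               # diffusion coefficient, um^2/s (illustrative)
omega = 1.0 / 2600.0                  # degradation rate, 1/s (illustrative)
beta = 1.0                            # production rate (arbitrary units)
dt = 0.2 * dx ** 2 / (2 * D)          # stable time step for the explicit scheme

x = np.linspace(0, L, nx)
src = (x < 50.0).astype(float)        # assumed anterior production domain (50 um)

C = np.zeros(nx)
for _ in range(int(3 * 3600 / dt)):   # integrate ~3 h, close to steady state
    lap = np.empty_like(C)
    lap[1:-1] = (C[2:] - 2 * C[1:-1] + C[:-2]) / dx ** 2
    lap[0] = 2 * (C[1] - C[0]) / dx ** 2       # zero-flux boundaries
    lap[-1] = 2 * (C[-2] - C[-1]) / dx ** 2
    C = C + dt * (D * lap + beta * src - omega * C)

# Outside the source, the steady profile decays exponentially with
# length scale sqrt(D/omega) (~102 um for these illustrative values)
lam_theory = np.sqrt(D / omega)
```

The point of the authors' reply maps onto the `src` term: making the production domain (and its timing) more realistic changes the predicted gradient without adding any mechanism beyond synthesis, diffusion, and degradation.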


      Referee #2

      1. Line 114: The authors claim to have validated the SunTag using a fluorescent reporter, but do not show any data. Ref 60 is a general reference to the SunTag, and not the Bcd results in this paper. Perhaps place their data into a supplemental figure or movie?

      To show the validation of our bcdSun32 line, we have composed a new Figure S1 that shows the translating bcdSun32 (magenta) colocalising with the ScFV-mSGFP2 (green). Yellow arrowheads in the zoom (right panel) point to the translating bcdSun32 (magenta), and red arrowheads point to the untranslated bcdSun32. In addition, we have also shown the validation of bcdSun32 with the anti-GCN4 staining in the main Figure 1B.

      Further, we have dedicated supplementary Figure S3 (previously Figure S2) to the validation of our bcdSun10 construct. Briefly, bcdSun10 is inserted into the attP40 site of chr.2. We performed a rescue experiment, in which bcdSun10 rescued the lethality of the homozygous bcdE1 null mutant. We then performed a colocalisation experiment using smFISH, demonstrating that almost all bcd in the anterior pole is of the bcdSun10 type. We targeted specific fluorescent FISH probes against the 10xSunTag sequence and the bcd coding sequence (Figure S2A). Upon colocalisation, we found ~90% of the mRNA to be of the bcdSun10 type; the remaining 10% likely reflects the noise level (Figure S2B). We will make sure these points are clear in the revised manuscript.

      Line 128 and Fig. 1E: The claim that bcd becomes dispersed is difficult to verify by looking at the image. The language could also be more precise. What does it mean to lose tight association? Perhaps the authors could quantify the distribution, and summarize it by a length scale parameter? This is one of the main claims of the paper (cf. Line 23 of the abstract) but it is described vaguely and tersely here.

      We have changed the text from, “We also confirmed that bcd becomes dispersed, losing its tight association with the anterior cortex (Figure 1E) (31)” to, “We also confirmed that bcd is released from the anterior cortex at egg activation (Figure 1E) (31, 21).” (Revised line 131).

      The release of bcd mRNA at egg activation was first shown in 2008 (Ref 21, Figure 4, D-E) and again in 2021 (Ref 31, Figure 7 B and E). The main point in lines 127-128 is that "P bodies disassembled and bcd was no longer colocalised with P bodies", and the novel aspect of line 23 is the translation observed. The distribution of bcd mRNA after egg activation was not the point of this section. We have improved the writing in the revision to make this clearer.

      Line 146, Fig. 1G: This is a really important figure in the paper, but it is confusing because it seems the authors use the word "translation," when they mean "presence of Bcd protein." In other places in the paper, the authors give the impression that "bcd translation" means translation in progress (assayed by the colocalization of GCN4 and bcd mRNA). However, in Fig. 1G, the focus is only on GCN4. Detecting Bcd protein only at the anterior does not mean that translation happens only at the anterior (e.g., diffusion or spatially-restricted degradation could be in play).

      In Figure 1G, we have shown only the “translated” Bcd by staining with anti-GCN4. We have changed line 146 from, “Consistent with previous findings, we only observed bcd translation at the anterior of the activated egg and early embryo (Figure 1G-H) (3, 68)” to, “Consistent with previous findings, we only observed the presence of Bcd protein at the anterior of the activated egg and early embryo (Figure 1G-H) (3, 68).” (Revised lines 151-153). We will use “translating bcd” or “bcd in translation” where we show colocalisation of bcd with BcdSun10 or BcdSun32 elsewhere in the manuscript.

      We did not mean to claim that translation occurs only in the anterior pole. We show that the abundance of bcd is very high in the anterior pole (in agreement with previous work) and that this is where the majority of observed translation events take place. Indeed, we have also shown that posteriorly localised mRNAs have the same BcdSun10 intensity per bcd punctum (Figure 3B & 4C’ and Figure S2E), but these are much fewer in number.

      It would also be helpful to show a plot with quantification of Bcd detection (or translation) on the y-axis and a continuous AP coordinate on the x-axis, instead of just two points (anterior and posterior poles, the latter of which is uninteresting because observing no Bcd at the posterior pole is expected).

      In Figure 1G,H, our aim was to test whether release from P bodies allowed for bcd mRNA to be translated. We used the presence of Bcd protein at the anterior domain of the oocytes to show this. The posterior pole was included as an internal control. To show the spatial distribution of bcd mRNA and its translation, we used early blastoderm (Figure 3, Figure S4).


      Another issue with Fig. 1G is that the A and P panels presumably have different brightness and contrast. If not, just from looking at the A and P panels, the conclusion would be that Bcd protein is diffuse (and abundant) in the posterior and concentrated into puncta in the anterior. The authors should either make the brightness and contrast consistent or state that the P panel had a much higher brightness than the A panel.

      We agree with this shortcoming. We have now added the following to the Figure 1 legend to clarify this observation. “G: Representative fixed 10 µm Z-stack images (from 10 samples) showing BcdSun32 protein (anti-GCN4) is only present at the anterior of an in vitro activated egg or early embryo 30 minutes post-fertilization. BcdSun32 protein is not detected in these samples at the posterior pole (image contrast increased to highlight the lack of distinct particles at the posterior). BcdSun32 protein is also not detected at the anterior or posterior of a mature oocyte or an in vitro activated egg incubated with NS8953 (images have the contrast increased to highlight the lack of distinct particles). Scale bar: 20 µm; zoom 2 µm.” (Revised line 623).

      • Line 176: This section is very confusing, because at this point the authors already addressed the spatial localization of translation in Fig. 1G,H (see my above comment). However, here it seems the authors have switched the definition of translation back to "translation in progress." Therefore, the confusion here could be eliminated by addressing the above point.

      In the revised version, we will use “Bcd protein” when shown with anti-GCN4 staining. We will use “translating bcd” or “bcd in translation” where we show colocalisation of bcd with anti-GCN4 (BcdSun10 or BcdSun32). We will change this in the corresponding text.

      Line 185: The sentence here is seemingly contradictory: "most...within 100 microns" implies that at least some are beyond 100 microns, while the sentence ends with "[none]...more than 100 microns." The language could perhaps be altered to be less vague/contradictory.

      We will clarify this in the revised version. There are a few particles visible beyond 100 um. In the lower panel of Figure 3B, the posterior domain shows a few particles; however, their number compared to bcd counts within 100 um is negligible (Figure 3C). Nonetheless, the few bcd particles we observe do appear to be under translation (quantified in Figure 4C’ and Figure S2E).

      • Line 204: It would be really nice to have quantification of the translation events, such as curves of rate of translation as a function of a continuous AP coordinate, and a curve for each nc.

      In the revised version we will provide results quantifying the translation events across the anterior-posterior axis. This will clarify the presence of bcd and its translation in the posterior domain over time.

      Our colocalisation analysis is semi-automated. It includes automated counting of individual bcd particles and manual judgement of BcdSun10 protein (distinct spots, above noise) colocalised with bcd particles (Figure S3D). The bcd particle counts ran into the thousands in each cyan square box (measuring 50um radius and ~20um deep from the dorsal surface). We selected three such boxes covering 150um (continuously) from the anterior pole along the A-P axis and 20um deep of the flattened embryo mounts along the D-V axis (Figure 3A-C, Figure S4). We also scanned the scarce particles in the posterior; however, bcd counts there are very low compared to the anterior. Further, in Figure 4 we repeated the same technique to measure translation of bcd particles in embryos at different nuclear cycles.

      We have also shown continuous intensity measurements of bcd particles with the respective BcdSun10 gradient across the A-P axis at different nuclear cycles in Figure 5. Here, the BcdSun10 intensity derives not only from “translating” bcd (BcdSun10 colocalised with bcd particles) but also from translated, freely diffusing BcdSun10 (not colocalised with bcd particles). As asked by the reviewer, in the revised version we will add bcd counts and their translation status along the anterior-posterior axis for each of the nuclear cycles.

      In future work, we plan to generate MS2-tagged bcdSun10 to measure the rates of translation live across all nuclear cycles.


      Line 209 and Fig 4C: The authors use the terms "intensity of translation events" or "translation intensity" without clearly defining them. From the figure (specifically from the y-axis label), it looks like the authors are quantifying the intensity per molecule (which is not clearly the same thing as "translation intensity"), but it would be nice if that were stated explicitly.

      In the relevant results section, we have changed the text to “the intensity of translation events” when explaining the results of Figure 4C’.

      • In addition, the authors again quantify only two points. This is a continuously frustrating part of the manuscript, which applies to nearly all figures where the authors looked only at two points in space. At a typical sample size of N = 3, it seems well within time constraints to image at multiple points along the AP axis.

      In addition to the quantification shown at the anterior and posterior locations of the embryo in Figures 3 and 4, we will show in the revised version the quantification of translation events at all locations from the anterior to the posterior. We will use three embryos for each nuclear cycle from n.c.1 to 14.

      • Furthermore, it sounds like the authors are saying the "translation intensity" is the same in anterior and the posterior, which is counterintuitive. The expectation is that translation would be undetectable at the posterior end, in part because bcd mRNA would not be present. (Note that this expectation is even acknowledged by the authors on Line 185, which I comment on above, and also on Line 197). There should also be very low levels of Bcd protein (possibly undetectable) at the posterior pole. As such, the authors should explain how they think their claim of the same "translation intensities" in the anterior vs posterior fits into the bigger picture of what we know about Bcd and what they have already stated in the manuscript. They should also explain how they observed enough molecules to quantify at the posterior end. The authors should also disclose how many points are in each box in the boxplot. For example, the sample size is N = 3 embryos. In just three embryos, how many bcd/GCN4 colocalizations did the authors observe at the posterior end of the embryo?

      At n.c.4 in Figure 3, we saw few bcd particles in the posterior. However, at n.c.10 in Figure 4C’ the number of posterior bcd particles is higher than at the earlier stages; we have quantified them in Figure 4C’. We will clarify this with the new set of quantifications we are now undertaking across the A-P axis in the revision.

      Finally, we will also provide the number of bcd particle counts and their colocalisation with anti-GCN4 as a supplementary table.

      • Line 215: The sentence that starts on this line seems self-contradictory: I cannot tell whether or not there is a difference in translation based on position.

      We have not observed any difference in the translation of bcd particles depending on the position along the Z-axis. We will edit this in our revised version.

      • Line 229: Long-ranged is a relative term. From the graph, one could state there is some spatial extent to the mRNA gradient, so it is unclear what the authors mean when they say it is not "long-ranged." Could the mRNA gradient be quantified, such as with a spatial length scale? This would provide more information for readers to make their own conclusions about whether it is long-ranged.

      We have quantified the bcd mRNA gradient for each n.c. (Figure 5B-C): absolute bcd intensities in Figure 5B, left panel, and normalised intensities in Figure 5C. The length of the mRNA spread appears constant, with a half-maximum length of ~75um across all nuclear cycles. Our conclusion of a long-ranged Bcd gradient is based on comparing the half-maximum length measurements of bcd particles and BcdSun10 (Figure 5D).

      Line 230: When the authors claim the Bcd gradient is steeper earlier, a quantification of the spatial extent (exponential decay length scale) would be appropriate. Indeed, lambda as a function of time would be beneficial. It should also be placed in context of earlier papers that claim the spatial length scale is constant.

      We will show this effectively from the live movies of bcdSun10/nanos-scFv-sGFP2 in the revised version.
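One standard way to extract such a decay length scale is a log-linear least-squares fit of the intensity profile. The sketch below runs on synthetic data; the 75 um half-maximum length from the reply above is used only to set the assumed decay constant (lambda = 75 / ln 2), and the noise level and sampling are arbitrary illustrations:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic intensity profile along the A-P axis (positions in um),
# decaying exponentially with an assumed length scale lam_true
x = np.linspace(0.0, 400.0, 81)
lam_true = 75.0 / np.log(2.0)     # half-maximum length of 75 um
intensity = np.exp(-x / lam_true) * (1 + 0.05 * rng.standard_normal(x.size))

# Log-linear least squares: ln I = -x/lambda + const
slope, _ = np.polyfit(x, np.log(intensity), 1)
lam_fit = -1.0 / slope
half_max_length = lam_fit * np.log(2.0)   # distance at which I falls to half
```

Reporting lambda (or the half-maximum length) per nuclear cycle, as the reviewer requests, would make "steeper earlier" a quantitative claim directly comparable with earlier constant-length-scale reports.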

      • Lines 235-236: The two sentences that start on these two lines are vague and seemingly contradictory. The first sentence says there is a spatial shift, but the second sentence sounds like it is saying there is no spatial change. The language could be more precise to explain the conclusions.

      We agree with the reviewer. We will edit this in revision.

      Minor comments

        • Line 81: Probably meant "evolutionarily conserved"

      Yes, we have changed, “P bodies are an evolutionarily cytoplasmic RNP granule” to, “P bodies are an evolutionarily conserved cytoplasmic RNP granule.” (Revised lines 84-85).

      Figure 1 legend: part B says "from 15 samples" but also says N = 20. Which is it, or do these numbers refer to different things?

      We have edited this from, “early embryo (from 15 samples)” to, “early embryo (from 20 samples)”. (Revised line 602).

      • Line 217: migration of what?

      Edited to “cortical nuclear migration”.

      • Line 228: "early embryo" is vague. The authors should give specific time points or nuclear cycle numbers.

      Edited to “nuclear cycles 1-8”.

      • Line 301: Other locations in the paper say 75 microns or 100 microns.

      We will make the changes. It is 100 um.

      • Fig. 5: all images should be oriented such that the dorsal midline is on the upper half of the embryo/image. *

      We will flip the image to match.

      • Fig. 5B: There are light tan and/or light orange curves (behind the bold curves) that are not explained. *

      It is the standard deviation. This will be explained.

• Fig. 5C: the plot says "normalized" but nowhere do the authors describe what the curves are normalized to. There is also no explanation for what the broad areas of light color correspond to.*

      Normalised to the bcd intensity maxima. This will be explained.

      Significance

The results, if upheld, are highly significant, as they are foundational measurements addressing a longstanding question of how morphogen gradients are formed, using Bcd (the foundational morphogen gradient) as a model. They also address fundamental questions in genetics and molecular biology: namely, control of mRNA distribution and translation.

      We thank Reviewer 2 for highlighting the importance of our work in the field. We are confident that we address the issues raised by Reviewer 2 with the new set of quantifications we are currently working on.

      Referee #3

  • It is not evident from the main results and methods text that the new SDD model incorporates the phenomenon reported in figure 4B. From my reading, the parameter beta accounts for the Bcd translation rate, which according to figure 7B(ii) effectively switches from off to on around fertilization and thereafter remains constant. Figure 4B shows that the fraction of bcd mRNA engaged in translation decreases beginning around NC12/13, and this is one of the more powerful results that comes from monitoring translation in addition to RNA localization/abundance/stability. My expectation based on figure 4B would be that parameter beta should decrease over time beginning around 90-100 minutes and approach zero by ~150 minutes. This rate could be fit to the experimental data that yields figure 4B. The modeling should be repeated while including this information.

This is a good observation. Currently, the reduced rate of bcd translation is modelled by incorporating an increased rate of bcd mRNA degradation. Of course, this could also be reduced by a change in the rate of translation directly. As stated already, the beta parameter is the least well characterised. In the revision, we will include a model where beta changes but not the mRNA degradation rate. We will improve the discussion to make this point clearer.
  • The presentation of the SDD model should be expanded to address how well the characteristic decay length fits A) measured Bcd protein distributions, B) measured at different nuclear cycles. This would strengthen the claim that the new SDD model better captures gradient dynamics given the addition of translation and RNA distribution. These experimental data already exist as reported in Figure 5. In the current Figure 7, panels D and D' add little to the story and could be moved to a supplement if the authors want to include it (in any case, please fix the typo on the time axis of fig 7D' to read "hours"). The model per cell cycle and the comparison of experimental and modeled decay lengths could replace current D and D'.*

      Originally, we kept discussion of the SDD model only to core points. It is clear from all Reviewers that expanding this discussion is important. In the revision, we will refocus Figure 7 on describing new results that we can learn. As outlined in the responses above, this paper reveals an important insight: the SDD model – with suitable modifications such as temporally restricted Bcd production – can explain all observed properties of Bcd gradient formation. Other mechanisms – such as bcd mRNA gradients – are not required.
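To illustrate this point (this is a sketch, not the fitted model from the paper; all parameter values below are placeholders), a minimal SDD simulation with synthesis confined to an anterior source and switched off at a fixed time already yields a long-range, graded anterior-to-posterior profile without any mRNA gradient:

```python
import numpy as np

def simulate_sdd(L=500.0, nx=200, T=150.0, dt=0.05,
                 D=20.0, k_deg=0.002, beta=1.0,
                 source_frac=0.1, t_prod_off=100.0):
    """Explicit finite-difference Synthesis-Diffusion-Degradation model
    on a 1D embryo of length L. Synthesis (rate beta) is confined to the
    anterior source region and switched off at t_prod_off (temporally
    restricted production); protein diffuses (D) and degrades (k_deg).
    All parameter values are illustrative placeholders, not fitted values.
    """
    dx = L / (nx - 1)
    assert D * dt / dx**2 < 0.5, "explicit-scheme stability condition"
    c = np.zeros(nx)
    source = np.zeros(nx)
    source[: int(nx * source_frac)] = beta
    for step in range(int(T / dt)):
        lap = np.empty(nx)
        lap[1:-1] = (c[2:] - 2 * c[1:-1] + c[:-2]) / dx**2
        lap[0] = 2 * (c[1] - c[0]) / dx**2      # zero-flux boundary
        lap[-1] = 2 * (c[-2] - c[-1]) / dx**2   # zero-flux boundary
        prod = source if step * dt < t_prod_off else 0.0
        c = c + dt * (D * lap - k_deg * c + prod)
    return c

profile = simulate_sdd()
print(profile[0] > profile[100] > profile[-1] >= 0.0)  # True: graded A-to-P
```

Making `beta` a decreasing function of time (as Referee #3 suggests) rather than gating it with `t_prod_off` is a one-line change to `prod`, which is how the alternative model in the revision could be explored.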

      • The exposition of the manuscript would benefit significantly by including a section either in the introduction or the appropriate section of the results that defines the competing models for gradient formation. In the current version, these models are only cited, and the key details only come out late (e.g., lines 302 onward, in the Discussion). Nevertheless, some of the results are presented as if in dialog with these models, but it reads as a one-sided conversation. For instance: Figure 3. The undercurrent in this figure is the RNA-gradient model. In the context of this model, the results clearly show that translation of bcd is restricted to the anterior. Without this context, Figure 3 could read as a fairly unremarkable observation that translation occurs wherever there is mRNA. Restructuring the manuscript to explicitly name competing models and to address how experimental results support or detract from each competing model would greatly enhance the impact of the exposition.*

      We thank the reviewer for this suggestion. We will add the current models of Bcd gradient formation in the introduction section and will change the narrative of results in the section explaining the models.

      (4A) Related to point 3: The entire results text surrounding Figure 2 should be revised to include more detail about A) what specific hypotheses are being tested; and B) to critically evaluate the limitations of the experimental approaches used to evaluate these hypotheses. Hexanediol and high salt conditions are not named explicitly in the text, but the text touts these as "chemicals" that "disrupt P-body integrity." This implies that the treatments are specific to P-bodies. Neither of these approaches are only disrupting P Body integrity. This does not invalidate this approach, but the manuscript needs to state what hypothesis HD and NaCl treatment addresses, and acknowledge the caveats of the approach (such as the non-specificity and the assumptions about the mechanism of action for HD).

We have made the following edits to resolve this point. Revised line 158: “To further show that bcd storage in P bodies is required for translational repression, we treated mature eggs with chemicals known to disrupt RNP granule integrity (31, 37, 69-72). Previous work has shown that the physical properties of P bodies in mature Drosophila oocytes can be shifted from an arrested to a more liquid-like state by addition of the aliphatic alcohol hexanediol (HD) (Sankaranarayanan et al., 2021, Ribbeck and Görlich, 2002; Kroschwald et al., 2017). While 1,6 HD has been widely used to probe the physical state of phase-separated condensates both in vivo and in vitro (Alberti et al., 2019; McSwiggen et al., 2019; Gao et al., 2022), in some cells it appears to have unwanted cellular consequences (Ulianov et al., 2021). These include potentially lethal cellular consequences that may indirectly affect the ability of condensates to form (Kroschwald et al., 2017) and wider cellular implications thought to alter the activity of kinases (Düster et al., 2021). While we did not observe any noticeable cellular issues in mature Drosophila oocytes with 1,6 HD, we also used 2,5 HD, known to be less problematic in most tissues (Ulianov et al., 2021), and the monovalent salt sodium chloride (NaCl), which changes electrostatic interactions (Sankaranarayanan et al., 2021).”

      (4B) Continuing the comment above: it is good that the authors checked that HD and NaCl treatment does not cause egg activation. But no one outside of the field of Drosophila egg activation knows what the 2-minute bleach test is and shouldn't have to delve into the literature to understand this sentence. Please explain in one sentence that "if eggs are activated, then x happens following a short exposure to bleach (citations). We exposed HD and NaCl treated eggs to bleach and observed... ."

We have made the following edits to resolve this point. Revised line 174: “After treating mature eggs with these solutions, we observed BcdSun32 protein in the oocyte anterior (Figure 2A-B). One caveat to this experiment could be that treating mature eggs with these chemicals results in egg activation, which would in turn generate Bcd protein. To eliminate this possibility, we first screened for phenotypic egg activation markers, including swelling and a change in the chorion (73). We also applied the classic approach of bleaching eggs for two minutes, which causes lysis of unactivated eggs (74). All chemically treated eggs failed this bleaching test, meaning they were not activated (74). While we are unable to rule out non-specific actions of these treatments, these experiments corroborate that storage in P bodies that adopt an arrested physical state is crucial to maintain bcd translational repression (31).”

      (4C) Continuing the comment above: The section of the results related to the endos mutation needs additional information. It is not apparent to the average reader how the endos mutation results in changes in RNP granules, nor what the expected outcome of such an effect would "further test the model" set up by the HD and NaCl experiments. The average reader needs more hand-holding throughout this entire section (related to figure 2) to follow the exposition of the results.

      We have made the following edits to resolve this point. Edited line 185: “Finally, we used a genetic manipulation to change the physical state of P bodies in mature oocytes. Mutations in Drosophila Endosulfine (Endos), which is part of the conserved phosphoprotein ⍺-endosulfine (ENSA) family (75), caused a liquid-like P body state after oocyte maturation, similar to that observed with chemical treatment (Figure 2C) (31). This temporal effect matched the known roles of Endos as the master regulator of oocyte maturation (75, 76). endos mutant oocytes lost the colocalisation of bcd mRNA and P bodies, concurrent with P bodies becoming less viscous during oocyte maturation (Figure 2D, Figure S1). Particle size and position analysis showed that bcd mRNA prematurely exhibits an embryo distribution in these mutants (Figure 2E). Due to genetic and antibody constraints, we are unable to test for translation of bcd in the endos mutant. However, it follows that bcd observed in this diffuse distribution outside of P bodies would be translationally active (Figure 2E-F).”

      • (4D) Continuing the comment above: The average reader also needs a better explanation of what hypothesis is being tested in Figure 1 with the pharmacological inhibition of calcium. *

      We have made the following edits to resolve this point. Revised line 138: “We next sought to maintain the relationship between bcd mRNA and P bodies through egg activation. This would act as a control to further test if colocalisation of bcd to P bodies was necessary for its translational repression. Previous work has shown that a calcium wave is required at egg activation for further development (references to add Kaneuchi et al., 2015; York-Anderson et al., 2019; Hu and Wolfner, 2019). Chemical treatment with NS8593 disrupts this calcium wave, while other phenotypic markers of egg activation are still observed (58). Using NS8593 to disrupt the calcium wave in the activated egg, we show P bodies are retained during ex vivo egg activation (Figure 1E). In these treated eggs, bcd mRNA remains colocalised with the retained P bodies (Figure 1F). Based on these results and previous observations (31, 66), we hypothesised that the loss of colocalisation between bcd and P bodies correlates with bcd translation.”

      *It is unclear why Bcd translation could not be measured in the endos mutant background, but it would be necessary to measure Bcd translation in the endos background. If genotypically it is not possible/inconvenient to invoke the suntag reporter in the endos background, would it not be sufficient to immunostain against Bcd itself? Different Bcd antisera have recently been reported and distributed by the Wieschaus and the Zeitlinger groups. *

      We have recently received the Bcd antibody from the Zeitlinger group. This has not been shown to work for immunostaining. It remains unclear if it will be successful in this capacity, but we are currently testing it and will include this experiment in the revision if successful.

      *Figure 4 overall is glorious, but there is a problem with panel C. What are the white lines? Why does the intensity for the green and magenta channel change abruptly in the middle of the embryo? *

      These white lines divide the embryo into 4 compartments. We used this method to quantify the intensity of Bcd translation with respect to the bcd puncta. We will correct this image as there is a problem in formatting.

*It is noted that neither the methods section nor the supplement contains any mention of how the modeling was performed. How was parameter beta fit? At least a brief section should be added to the methods describing how beta was fit (pending adjustments suggested in comment 1 above). A platinum-level addition would include a modeling supplement that reports the sensitivity of model outcomes to changes in parameters. *

      We apologise for this omission and will include full methodological details in the revision.

      Minor Comments:

  • Line 28: "Source-Diffusion-Degradation" should be changed to "Synthesis-..."*

We will edit in the revised version.

      *Line 39: "blastocyst" should be "blastoderm stage embryo". *

      We will edit in the revised version.

      • Line 81: "P bodies are an evolutionarily cytoplasmic RNP granule." is "conserved" missing here? *

      We will edit in the revised version.

      • Throughout the manuscript, there should be better reporting of the imaged genotypes and whether the suntag is being visualized by indirect immunostaining of fixed tissues or through an encoded nanobody-GFP fusion. *

      We will explain in detail in the revised version.

• Figure 1G: Why is the background staining so different across conditions? Is this a normalization artifact?*

We agree with this shortcoming. We have now added the following to the figure legend to clarify this observation. “G: Representative fixed 10 µm Z-stack images (from 10 samples) showing BcdSun32 protein (anti-GCN4) is only present at the anterior of an in vitro activated egg or early embryo 30 minutes post-fertilization. BcdSun32 protein is not detected in these samples at the posterior pole (image contrast increased to highlight the lack of distinct particles at the posterior). BcdSun32 protein is also not detected at the anterior or posterior of a mature oocyte or an in vitro activated egg incubated with NS8593 (images have the contrast increased to highlight the lack of distinct particles). Scale bar: 20 µm; zoom 2 µm.” (Revised line 623).

Figure 2 legend: what is +Sch in the x-axis labels of figure 2B? The legend says that 2B is the quantification of the data in 2A, but there is no (presumed control) +Sch image in 2A.

Thank you for this suggestion; we have added the data to Figure 2A.

      • Figure 5A largely repeats information presented in figure 4A. Please consider moving to a supplement. Also, please re-orient embryos to follow the convention that dorsal-most surfaces be presented on the top of the displayed images. *

      Thank you for this suggestion. We will consider moving Figure 5A to the supplementary.

      • The lower-case roman numerals referred to in the text for figure 7B are not included in the corresponding figure panel. *

      We will edit in the revised version.

      • Figure 7C y-axis typo (concentration). *

      We will edit in the revised version.

      • Line 222: "make a long-range functional gradient": more accurate to say, "but also marks mature, Bcd protein which resolves in the expected long-range gradient." *

      We will edit in the revised version.

• Methods: Please check that all buffers referred to as acronyms are both compositionally defined in the reagents table, and that full names are written out at the time of first mention in the presented order. For instance, Schneider's media is referred to a few times before defining the acronym about midway through the methods section.*

      We have added to Figure 2B: “Quantification of experiments shown in A. The number of oocytes that displayed Bcd protein at the anterior as measured by the presence of BcdSun32 at the anterior of the oocyte, but not the posterior. Schneider’s Insect Medium (+Sch) used as a negative control. N = 30 oocytes for each treatment. Scale bar: 5 um.” (Revised line 646).

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      In this paper the authors use the Suntag system to visualise bcd mRNA translation in the Drosophila embryo. They elucidate the relationship between bcd mRNA translation and P body localisation. In the oocyte, bcd mRNAs are localised in P bodies and translationally repressed, but upon egg activation bcd mRNAs are released from P bodies and translated. In addition, during mid-nc14, bcd mRNAs become localised to embryonic P bodies and degraded. The authors use their data to modify the Synthesis, Diffusion, Degradation model of Bcd gradient formation, which recapitulates the Bcd gradient detected experimentally.

      Overall, I think the data are of high quality and support the authors' conclusions. I only have minor comments, as follows:

      Fig 1B - add arrows showing mRNAs being translated or not (the latter mentioned in line 113 is not so easy to see).

      Fig 2A - add a sentence explaining why 1,6HD, 2,5HD and NaCl disrupt P bodies.

      Fig 4C - explain in the legend what the white lines drawn over the image represent. And why is there such an obvious distinction in the staining where suddenly the DAPI is much more evident (is the image from tile scans)?

      Line 215 - 'We did not see any significant differences in the translation of bcd based on their position, however, there appears an enhanced translation of bcd localised basally to the nuclei (Figure S5).' Since the difference is not significant, I do not think the authors should conclude that translation is enhanced basally.

      Line 218: 'The interphase nuclei and their subsequent mitotic divisions appeared to displace bcd towards the apical surface (Figure S6B).' Greater explanation is needed in the legend to Fig S6B to support this statement as the data just seem to show a nuclear division - I would have thought an apical-basal view is needed to conclude this.

      Fig 5B - the authors compare Bcd protein distribution across developmental time. However, in the early time points cytoplasmic Bcd is measured (presumably as it does not appear nuclear until nc8 onwards) and compare the distribution to nuclear Bcd intensities from nc9 onwards. Is most/all of the Bcd protein nuclear localised form nc9 to validate the nuclear quantitation? Does the distribution look the same if total Bcd protein is measured per volume rather than just the nuclear signal? Are the authors assuming a constant fast rate of nuclear import?

      Line 259 - 'We then asked if considering the spatiotemporal pattern of bcd translation' - the authors should clarify what new information was included in the model. Similarly in line 286, 'By including more realistic bcd mRNA translation' - what does this actually mean? In line 346, 'We see that the original SDD model .... was too simple.' It would be nice to compare the outputs from the original vs modified SDD models to support the statement that the original model was too simple.

      Fig S1A - clarify what the difference is between the 2 +HD panels shown.

      Fig S2E - the graph axis label/legend says it is intensity/molecule. Since intensity/molecule is higher in the anterior for bcd RNAs, is this because there are clumps of mRNAs (in which case it's actually intensity/puncta)?

      Fig S4 - I think this line is included in error: '(B) The line plots of bcd spread on the Dorsal vs. Ventral surfaces.' In B, D, E - is the plot depth from the dorsal surface? I would have preferred to see actual mRNA numbers rather than normalised mRNAs. In Fig S4D moderate, from 10um onwards there are virtually no mRNA counts based on the normalised value, but what is the actual number? The equivalent % translated data in Fig S4E look noisy so I wonder if this is due to there being a tiny mRNA number. The same is true for Figs S4D, E 10um+ in the low region.

      Referees cross-commenting

      I think the concerns raised by reviewers 2 and 3 are valid, and that it is feasible for the authors to address all the reviewers' concerns in order to improve the manuscript.

      Significance

      General assessment

      Strengths are: 1) the data are of high quality; 2) the study advances the field by directly visualising Bcd mRNA translation during early Drosophila development; 3) the data showing re-localisation of bcd mRNAs to P bodies nc14 provides new mechanistic insight into its degradation; 4) a new SDD model for Bcd gradient formation is presented. Limitations of the study are: 1) there was already strong evidence (but no direct demonstration) that bcd mRNA translation was associated with release from P bodies at egg activation; 2) it is not totally clear to me how exactly the modified SDD model varies from the original one both in terms of parameters included and model output.

      Advance

      The advance is conceptual, technical and mechanistic.

      Audience

      The results will be important to a broad range of researchers interested in the formation of developmental morphogen gradients and the post-transcriptional regulation of gene expression, particularly the relationship with P bodies.

      My expertise

      Wetlab developmental biologist

    1. Reflect – You should go through what you read and try to answer the questions you noted before during the pre-reading stage. Check in after every section, chapter or topic to make sure you understand the material and can explain it, in your own words. Pretend you are responsible for teaching this section to someone else. Can you do it?   It’s at this stage that you consolidate knowledge, so refrain from moving on until you can recall the core information.

      Reflecting on the material you read is important to ensure you retained the information you just digested.

    1. Many citizens enjoy sports involving high-flying birds – falcons, hawks and the like – or hounds for hunting in the woods. The citizens have hunting rights in Middlesex, Hertfordshire, throughout the Chilterns, and in Kent as far as the river Cray

      FitzStephen shows that Londoners combined city life with rural privileges. Falconry and hunting were not just fun, they were status activities that connected citizens to nature and displayed skill and wealth. This also links to his idea of “men of superior quality,” since these sports required training, discipline, and resources. It’s interesting that medieval urban identity included mastery over the countryside, not just the city itself.

    1. In every society, these factors make some groups more likely to develop specific diseases.

Yes! I love that the chapter mentioned the other factors that come into play when a disease is spreading or has developed. It's not just the health and hygiene of a person; their environment can also have a great effect on the development and spread of diseases.

    2. two-thirds of emerging pathogens come from other animals.

I think most people, like me, don't often realize that some diseases come from animals and not just humans. The same goes for foods and other items that can cause disease. For example, with COVID-19, I remember news spread that it started because a person in China had eaten a bat, or something along those lines. It just goes to show that when people think about diseases, they don't normally realize they can come from factors outside of a human being, like animals.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Weaknesses:

      The technical approach is strong and the conceptual framing is compelling, but several aspects of the evidence remain incomplete. In particular, it is unclear whether the reported changes in connectivity truly capture causal influences, as the rank metrics remain correlational and show discrepancies with the manipulation results.

      We agree that our functional connectivity ranking analyses cannot establish causal influences. As discussed in the manuscript, besides learning-related activity changes, the functional connectivity may also be influenced by neuromodulatory systems and internal state fluctuations. In addition, the spatial scope of our recordings is still limited compared to the full network implicated in visual discrimination learning, which may bias the ranking estimates. In future, we aim to achieve broader region coverage and integrate multiple complementary analyses to address the causal contribution of each region.

      The absolute response onset latencies also appear slow for sensory-guided behavior in mice, and it is not clear whether this reflects the method used to define onset timing or factors such as task structure or internal state.

      We believe this may be primarily due to our conservative definition of onset timing. Specifically, we required the firing rate to exceed baseline (t-test, p < 0.05) for at least 3 consecutive 25-ms time windows. This might lead to later estimates than other studies, such as using the latency to the first spike after visual stimulus onset (Siegle et al., 2021) or the time to half-max response (Goldbach, Akitake, Leedy, & Histed, 2021).

The estimation of response onset latency in our study may also be affected by potential internal state fluctuations of the mice. We used the period before visual stimulus onset as the baseline. Since firing rates in this period could be affected by trial history, we acknowledge this may increase the variability of the baseline and thus make it harder to statistically detect response onset.

Still, we believe these concerns do not affect the observation of the formation of a compressed activity sequence in CR trials during learning.
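For clarity, the onset criterion described above can be sketched as follows. This is a simplified stand-in (illustrative thresholding of trial-averaged rates rather than the per-bin t-test actually used, and the data are synthetic), showing why a conservative consecutive-bin rule yields later onset estimates than first-spike latency:

```python
import numpy as np

def onset_latency(rates, baseline_bins, bin_ms=25, n_consec=3):
    """Estimate response onset from binned firing rates.

    rates: (n_trials, n_bins) array in bin_ms bins; the first
    `baseline_bins` bins precede stimulus onset. A post-stimulus bin is
    "responsive" when its trial mean exceeds the baseline mean by two
    baseline SDs (a simplified stand-in for the per-bin t-test, p < 0.05);
    onset requires n_consec consecutive responsive bins. Returns latency
    in ms relative to stimulus onset, or None if no onset is detected.
    """
    base = rates[:, :baseline_bins].mean(axis=0)        # per-bin trial means
    threshold = base.mean() + 2.0 * base.std(ddof=1)
    post = rates[:, baseline_bins:].mean(axis=0)
    run = 0
    for i, responsive in enumerate(post > threshold):
        run = run + 1 if responsive else 0
        if run == n_consec:
            return (i - n_consec + 1) * bin_ms          # start of the run
    return None

# Noise-free toy unit for determinism: 10 baseline bins at 5 Hz, rate
# stepping to 20 Hz from the 4th post-stimulus bin (i.e. 75 ms after onset)
trials = np.full((50, 20), 5.0)
trials[:, 13:] = 20.0
print(onset_latency(trials, baseline_bins=10))  # 75
```

Requiring three consecutive 25-ms windows means the earliest detectable onset already trails a transient first spike, which is consistent with the slower latencies the reviewer noted.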

      Furthermore, the small number of animals, combined with extensive repeated measures, raises questions about statistical independence and how multiple comparisons were controlled.

      We agree that a larger sample size would strengthen the robustness of the findings. However, as noted above, the current dataset has inherent limitations in both the number of recorded regions and the behavioral paradigm. Given the considerable effort required to achieve sufficient unit yields across all targeted regions, we wish to adjust the set of recorded regions, improve behavioral task design, and implement better analyses in future studies. This will allow us to both increase the number of animals and extract more precise insights into mesoscale dynamics during learning.

      The optogenetic experiments, while intended to test the functional relevance of rank increasing regions, leave it unclear how effectively the targeted circuits were silenced. Without direct evidence of reliable local inhibition, the behavioral effects or lack thereof are difficult to interpret.

      We appreciate this important point. Due to the design of the flexible electrodes and the implantation procedure, bilateral co-implantation of both electrodes and optical fibers was challenging, which prevented us from directly validating the inhibition effect in the same animals used for behavior. In hindsight, we could have conducted parallel validations using conventional electrodes, and we will incorporate such controls in future work to provide direct evidence of manipulation efficacy.

      Details on spike sorting are limited.

      We have provided more details on spike sorting in method section, including the exact parameters used in the automated sorting algorithm and the subsequent manual curation criteria.

      Reviewer #2 (Public review):

      Weaknesses:

      I had several major concerns:

(1) The number of mice was small for the ephys recordings. Although the authors start with 7 mice in Figure 1, they then reduce to 5 in panel F. And in their main analysis, they minimize their analysis to 6/7 sessions from 3 mice only. I couldn't find a rationale for this reduction, but in the methods they do mention that 2 mice were used for fruitless training, of which I found no mention in the results. Moreover, in the early case, all of the analysis is from 118 CR trials taken from 3 mice. In general, this is a rather low number of mice and trial numbers. I think it is quite essential to add more mice.

      We apologize for the confusion. As described in the Methods section, 7 mice (Figure 1B) were used for behavioral training without electrode array or optical fiber implants to establish learning curves, and an additional 5 mice underwent electrophysiological recordings (3 for visual-based decision-making learning and 2 for fruitless learning).

      As we noted in our response to Reviewer #1, the current dataset has inherent limitations in both the number of recorded regions and the behavioral paradigm. Given the considerable effort required to achieve high-quality unit yields across all targeted regions, we wish to adjust the set of recorded regions, improve behavioral task design, and implement better analyses in future studies. These improvements will enable us to collect data from a larger sample size and extract more precise insights into mesoscale dynamics during learning.

      (2) Movement analysis was not sufficient. Mice learning a go/no-go task establish a movement strategy that is developed throughout learning and is also biased towards Hit trials. There is an analysis of movement in Figure S4, but this is rather superficial. I was not even sure that the 3 mice in Figure S4 are the same 3 mice in the main figure. There should be also an analysis of movement as a function of time to see differences. Also for Hits and FAs. I give some more details below. In general, most of the results can be explained by the fact that as mice gain expertise, they move more (also in CR during specific times) which leads to more activation in frontal cortex and more coordination with visual areas. More needs to be done in terms of analysis, or at least a mention of this in the text.

Due to limitations in the experimental design and implementation, movement tracking was not performed during the electrophysiological recordings, and the 3 mice shown in Figure S4 (now S5) were from a separate group. We have carefully examined the temporal profiles of mouse movements and found that they did not fully match the rank dynamics for all regions, and we have added these results and related discussion in the revised manuscript. However, we acknowledge that the observed motion energy pattern could explain some of the functional connection dynamics; for example, the decrease in face and pupil motion energy could explain the reduction in ranks for the striatum.

      Without synchronized movement recordings in the main dataset, we cannot fully disentangle movement-related neural activity from task-related signals. We have made this limitation explicit in the revised manuscript and discuss it as a potential confound, along with possible approaches to address it in future work.

(3) Most of the figures are over-detailed, and it is hard to understand the take-home message. Although the text is written succinctly and rather short, the figures are mostly overwhelming, especially Figures 4-7. For example, Figure 4 presents 24 brain plots! For input and output rank during early and late stim and response periods, for early and expert and their difference. All in the same colormap. No significance shown at all. The Δrank maps for all cases look essentially identical across conditions. The division into early and late time periods is not properly justified. But the main take-home message is positive Δrank in OFC, V2M, V1 and negative Δrank in ThalMD and Str. In my opinion, one trio map is enough, and the rest could be bumped to the Supplementary section, if at all. In general, the figures in several cases do not convey the main take-home messages. See more details below.

      We thank the reviewer for this valuable critique. The statistical significance corresponding to the brain plots (Figure 4 and Figure 5) was presented in Figure S3 and S5 (now Figure S5 and S7 in the revised manuscript), but we agree that the figure can be simplified to focus on the key results.

      In the revised manuscript, we have condensed these figures to focus on the most important comparisons to make the visual presentation more concise and the take-home message clearer.

(4) The analysis is sometimes not intuitive enough. For example, the rank analysis of input and output rank seemed a bit overly complex. Figure 3 was hard to follow (although a lot of effort was made by the authors to make it clearer). Was there any difference between the output and input analysis? Also, the time period seems redundant sometimes. Also, there are other network analyses that can be done which are a bit more intuitive. The use of rank within the 10 areas was not the most intuitive. Even a dimensionality reduction along with clustering can be used as an alternative. In my opinion, I don't think the authors should completely redo their analysis, but maybe mention the fact that other analyses exist.

We appreciate the reviewer’s comment. In brief, the input- and output-rank analyses yielded largely similar patterns across regions in CR trials, although some differences were observed in certain areas (e.g., striatum) in Hit trials, where the magnitude of rank change was not identical between input and output measures. We have condensed the figures to show only the averaged rank results, and the colormap was updated to better convey the message.

We did explore dimensionality reduction applied to the ranking data. However, the results were similarly unintuitive and required additional interpretation without yielding further insight. Still, we acknowledge that other analysis approaches might provide complementary insights.

      Reviewer #3 (Public review):

      Weaknesses:

The weakness is also related to the strength provided by the method. It is demonstrated in the original method that this approach in principle can track individual units for four months (Luan et al, 2017). The authors have not shown chronically tracked neurons across learning. Without demonstrating that and taking advantage of analyzing chronically tracked neurons, this approach is no different from acute recording across multiple days during learning. Many studies have achieved acute recording across learning using similar tasks. These studies have recorded units from a few brain areas or even across brain-wide areas.

      We appreciate the reviewer’s important point. We did attempt to track the same neurons across learning in this project. However, due to the limited number of electrodes implanted in each brain region, the number of chronically tracked neurons in each region was insufficient to support statistically robust analyses. Concentrating probes in fewer regions would allow us to obtain enough units tracked across learning in future studies to fully exploit the advantages of this method.

Another weakness is that major results are based on analyses of functional connectivity that is calculated using the cross-correlation score of spiking activity (TSPE algorithm). Functional connection strength across areas is then ranked 1-10 based on relative strength. Without ground truth data, it is hard to judge the underlying caveats. I'd strongly advise the authors to use complementary methods to verify the functional connectivity and to evaluate the mesoscale change in subnetworks. Perhaps the authors can use one key piece of anatomical information, i.e. the cortex projects to the striatum, while the striatum does not directly affect other brain structures recorded in this manuscript.

      We agree that the functional connectivity measured in this study relies on statistical correlations rather than direct anatomical connections. We plan to test the functional connection data with shorter cross-correlation delay criteria to see whether the results are consistent with anatomical connections and whether the original findings still hold.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) The small number of mice, each contributing many sessions, complicates the  interpretation of the data. It is unclear how statistical analyses accounted for the small  sample size, repeated measures, and non-independence across sessions, or whether  multiple comparisons were adequately controlled.

We acknowledge the limitation arising from the small number of animal subjects; the difficulty of achieving sufficient unit yields across all regions in the same animal restricted our sample size. Though we agree that a larger sample size would strengthen the robustness of the findings, as noted below the current dataset has inherent limitations in both the scope of recorded regions and the behavioral paradigm.

      Given the considerable effort required to achieve sufficient unit yields across all targeted regions, we wish to adjust the set of recorded regions, improve behavioral task design, and implement better analyses in future studies. This will allow us to both increase the number of animals and extract more precise insights into mesoscale dynamics during learning.

      (2) The ranking approach, although intuitive for visualizing relative changes in  connectivity, is fundamentally descriptive and does not reflect the magnitude or  reliability of the connections. Converting raw measures into ordinal ranks may obscure  meaningful differences in strength and can inflate apparent effects when the underlying  signal is weak.

We agree with this important point. As stated in the manuscript, our motivation for taking the ranking approach was that differences in firing rates might bias the cross-correlation between spike trains, making raw counts of significant neuron pairs difficult to compare across conditions, but we acknowledge that the ranking measures might obscure meaningful differences or inflate weak effects in the data.

We have added the limitations of the ranking approach to the discussion section and emphasized the need for better analysis approaches in future studies that could provide a more accurate assessment of functional connection dynamics without bias from firing rates.

      (3) The absolute response onset latencies also appear quite slow for sensory-guided  behavior in mice, and it remains unclear whether this reflects the method used to  determine onset timing or factors such as task design, sensorimotor demands, or  internal state. The approach for estimating onset latency by comparing firing rates in  short windows to baseline using a t-test raises concerns about robustness, as it may  be sensitive to trial-to-trial variability and yield spurious detections.

We agree this may be primarily due to our conservative definition of onset timing. Specifically, we required the firing rate to exceed baseline (t-test, p < 0.05) for at least 3 consecutive 25-ms time windows. This might lead to later estimates than in other studies, such as those using the latency to the first spike after visual stimulus onset (Siegle et al., 2021) or the time to half-max response (Goldbach, Akitake, Leedy, & Histed, 2021).
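For illustration, the onset criterion described above can be sketched as follows. This is a minimal sketch with hypothetical data shapes, variable names, and a standard two-sample t-test standing in for the actual statistics; it is not the manuscript's analysis code.

```python
import numpy as np
from scipy import stats

def onset_latency(trial_rates, baseline_rates, bin_ms=25, alpha=0.05, n_consec=3):
    """Return the first time (ms) at which firing exceeds baseline
    (one-sided t-test, p < alpha) for n_consec consecutive bins.

    trial_rates: (n_trials, n_bins) post-stimulus firing rates per 25-ms bin.
    baseline_rates: (n_trials,) pre-stimulus firing rates.
    Returns None if no qualifying run of bins is found."""
    n_bins = trial_rates.shape[1]
    run = 0
    for b in range(n_bins):
        t, p = stats.ttest_ind(trial_rates[:, b], baseline_rates)
        # one-sided: require elevation above baseline, not just any difference
        run = run + 1 if (p / 2 < alpha and t > 0) else 0
        if run == n_consec:
            # onset = start of the first significant run
            return (b - n_consec + 1) * bin_ms
    return None
```

With this definition, the reported onset is the start of the first significant run, which is conservative relative to first-spike or half-max latency measures.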

The estimation of response onset latency in our study may also be affected by potential internal state fluctuations in the mice. We used the period before visual stimulus onset as the baseline; since firing rates in this period could be affected by trial history, we acknowledge this may increase the variability of the baseline and thus the difficulty of statistically detecting response onset.

Still, we believe these concerns do not affect the observation of the formation of a compressed activity sequence in CR trials during learning.

      (4) Details on spike sorting are very limited. For example, defining single units only by  an interspike interval threshold above one millisecond may not sufficiently rule out  contamination or overlapping clusters. How exactly were neurons tracked across days  (Figure 7B)?

      We have added more details on spike sorting, including the processing steps and important parameters used in the automated sorting algorithm. Only the clusters well isolated in feature space were accepted in manual curation.

      We attempted to track the same neurons across learning in this project. However, due to the limited number of electrodes implanted in each brain region, the number of chronically tracked neurons in each region was insufficient to support statistically robust analyses.

      This is now stated more clearly in the discussion section.

      (5) The optogenetic experiments, while designed to test the functional relevance of  rank-increasing regions, also raise questions. The physiological impact of the inhibition  is not characterized, making it unclear how effectively the targeted circuits were  actually silenced. Without clearer evidence that the manipulations reliably altered local  activity, the interpretation of the observed or absent behavioral effects remains  uncertain.

      We appreciate this important point. Due to the design of the flexible electrodes and the implantation procedure, bilateral co-implantation of both electrodes and optical fibers was challenging, which prevented us from directly validating the inhibition effect in the same animals used for behavior. In hindsight, we could have conducted parallel validations using conventional electrodes, and we will incorporate such controls in future work to provide direct evidence of manipulation efficacy. 

      (6) The task itself is relatively simple, and the anatomical coverage does not include  midbrain or cerebellar regions, limiting how broadly the findings can be generalized to more flexible or ethologically relevant forms of decision-making.

      We appreciate this advice and have expanded the existing discussion to more explicitly state that the relatively simple task design and anatomical coverage might limit the generalizability of our findings.

      (7) The abstract would benefit from more consistent use of tense, as the current mix of  past and present can make the main findings harder to follow. In addition, terms like  "mesoscale network," "subnetwork," and "functional motif" are used interchangeably in  places; adopting clearer, consistent terminology would improve readability.

We have changed several verbs in the abstract to past tense, and we have adopted more consistent terminology by substituting “subnetwork” for “functional motif”. We still feel that “mesoscale network” and “subnetwork” emphasize different aspects of the results depending on context, so these terms are kept.

      (8) The discussion could better acknowledge that the observed network changes may  not reflect task-specific learning alone but could also arise from broader shifts in  arousal, attention, or motivation over repeated sessions.

      We have expanded the existing discussion to better acknowledge the possible effects from broader shifts in arousal, attention, or motivation over repeated sessions.

      (9) The figures would also benefit from clearer presentation, as several are dense and  not straightforward to interpret. For example, Figure S8 could be organized more  clearly to highlight the key comparisons and main message

      We have simplified the over-detailed brain plots in Figure 4-5, and the plots in Figure 6 and S8 (now S10 in the revised manuscript).

      (10) Finally, while the manuscript notes that data and code are available upon request,  it would strengthen the study's transparency and reproducibility to provide open access  through a public repository, in line with best practices in the field.

The spiking data, behavior data, and code for the core analyses in the manuscript are now shared in a public repository (Dryad), and we have changed the description in the Data Availability section accordingly.

      Reviewer #2 (Recommendations for the authors):

      (A) Introduction:

      (1) "Previous studies have implicated multiple cortical and subcortical regions in visual  task learning and decision-making". No references here, and also in the next sentence.

The references appeared later in the introduction; we have now added them here as well.

      We also added one review on cortical-subcortical neural correlates in goal-directed behavior (Cruz et al., 2023).

(2) Intro: In general, the citation of previous literature is rather minimal, too minimal. There are many studies using large-scale recordings during learning, not necessarily visual tasks. An example of a brain-wide learning study in subcortical areas is Sych et al. 2022 (Cell Reports). And for wide-field imaging there are several papers from the Helmchen and Komiyama labs, also for multi-area cortical imaging.

We appreciate this advice. We included mainly visual task learning literature to keep a focused scope around the regions and task we actually explored in this study. We fear that if we expanded the introduction to include all the large-scale imaging/recording studies in the learning field, the background section might become too broad.

      We have included (Sych, Fomins, Novelli, & Helmchen, 2022) for its relevance and importance in the field.

      (3) In the intro, there is only a mention of a recording of 10 brain regions, with no  mention of which areas, along with their relevance to learning. This is mentioned in the  results, but it will be good in the intro.

The area names are now added in the introduction.

      (B) Results:

      (1) Were you able to track the same neurons across the learning profile? This is not  stated clearly.

      We did attempt to track the same neurons across learning in this project. However, due to the limited number of electrodes implanted in each brain region, the number of chronically tracked neurons in each region was insufficient to support statistically robust analyses.

      We now stated this more clearly in the discussion section.

      (2) Figure 1 starts with 7 mice, but only 5 mice are in the last panel. Later it goes down  to 3 mice. This should be explained in the results and justified.

      We apologize for the confusion. As described in the Methods section, 7 mice (Figure 1B) were used for behavioral training without electrode array or optical fiber implants to establish learning curves, and an additional 5 mice underwent electrophysiological recordings (3 for visual-based decision-making learning and 2 for fruitless learning).

      (3) I can't see the electrode tracks in Figure 1d. If they are flexible, how can you make  sure they did not bend during insertion? I couldn't find a description of this in the  methods also.

The electrode shanks were ultra-thin (1-1.5 µm), and it was usually difficult to recover observable tracks or electrodes in histological sections.

The ultra-flexible probes could not penetrate the brain on their own (since they are flexible) and had to be shuttled into position by tungsten wires through holes designed at the tips of the array shanks. The tungsten wires were assembled onto the electrode array before implantation; this was described in the electrode array fabrication and assembly section. We have also included a description of the retraction of the guiding tungsten wires in the surgery section to avoid confusion.

As a further attempt to verify the accuracy of implantation depth, we also measured the repeatability of implantation in a group of mice and found a tendency for the arrays to end at slightly deeper locations in cortex (142.1 ± 55.2 μm, n = 7 shanks) and slightly shallower locations in subcortical structures (-122.6 ± 71.7 μm, n = 7 shanks). We added these results as the new Figure S1 to accompany Figure 1.

(4) In the spike raster in 1E, there seem to be ~20 cells in V2L, for example, but in 1F, the number of neurons doesn't go below 40. What is the difference here?

We checked Figure 1F; the plotted dots do go below 40, down to ~20. Perhaps the file the reviewer received was not displaying correctly.

      (5) The authors focus mainly on CR, but during learning, the number of CR trials is  rather low (because they are not experts). This can also be seen in the noisier traces  in Figure 2a. Do the authors account for that (for example by taking equal trials from  each group)? 

We accounted for this by reconstructing bootstrap-resampled datasets with only 5 trials per session in both the early stage and the expert stage. The mean trace of the 500 resampled datasets again showed an overall decrease in CR trial firing rate during task learning, with highly similar temporal dynamics to the original data.

      The figure is now added to supplementary materials (as Figure S3 in the revised manuscript).
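The trial-balancing procedure described above can be sketched as follows. This is a minimal sketch: the session list, array shapes, function name, and parameter defaults are illustrative assumptions, not the manuscript's code.

```python
import numpy as np

def bootstrap_mean_trace(trials_by_session, n_trials=5, n_boot=500, seed=0):
    """Bootstrap a trial-balanced mean firing-rate trace.

    trials_by_session: list of (n_trials_i, n_timebins) arrays, one per session.
    For each of n_boot resampled datasets, draw n_trials with replacement from
    every session, then average across all drawn trials.
    Returns an (n_boot, n_timebins) array of mean traces."""
    rng = np.random.default_rng(seed)
    n_bins = trials_by_session[0].shape[1]
    boot = np.empty((n_boot, n_bins))
    for i in range(n_boot):
        drawn = [s[rng.integers(0, len(s), size=n_trials)]
                 for s in trials_by_session]
        boot[i] = np.concatenate(drawn).mean(axis=0)
    return boot
```

Comparing the resulting bootstrap traces between the early and expert stages controls for the unequal CR trial counts across learning.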

      (6) From Figure 2a, it is evident that Hit trials increase response when mice become  experts in all brain areas. The authors have decided to focus on the response onset  differences in CRs, but the Hit responses display a strong difference between naïve  and expert cases.

Judging from the learning curve, in this task the mice learned to inhibit their licking when the No-Go stimulus appeared, which is the main reason we focused on this trial type.

      The movement effects and potential licking artefacts in Hit trials also restricted our interpretation of these trials.

      (7) Figure 3 is still a bit cumbersome. I wasn't 100% convinced of why there is a need  to rank the connection matrix. I mean when you convert to rank, essentially there could  be a meaningful general reduction in correlation, for example during licking, and this  will be invisible in the ranking system. Maybe show in the supp non-ranked data, or  clarify this somehow

We agree with this important point. As stated in the manuscript and in the response to Reviewer #1, our motivation for taking the ranking approach was that differences in firing rates could bias the cross-correlation between spike trains, making raw counts of significant neuron pairs difficult to compare across conditions, but we acknowledge that the ranking measures might obscure meaningful differences or inflate weak effects in the data.

We have added the limitations of the ranking approach to the discussion section and emphasized the need for better analysis approaches in future studies that could provide a more accurate assessment of functional connection dynamics without bias from firing rates.

(8) Figure 4a x label is in ms, which is different from the previous time labels, which were in seconds.

We have now changed all time labels from Figure 2 onward to milliseconds.

      (9) Figure 4 input and output rank look essentially the same.

      We have compressed the brain plots in Figures 4-5 to better convey the take-home message.

(10) Also, what are the late and early stim periods? Can you mark each period in panel A? Early stim period is confusing with early CR period. Same for early response and late response.

The definitions of the time periods were given in the figure legends. We now mark each period in the figure to avoid confusion.

      (11) Looking at panel B, I don't see any differences between delta-rank in early stim,  late stim, early response, and late response. Same for panel c and output plots.

The rankings were indeed relatively stable across time periods. The plots are now compressed and show the mean rank value.

      (12) Panels B and C are just overwhelming and hard to grasp. Colors are similar both  to regular rank values and delta-rank. I don't see any differences between all  conditions (in general). In the text, the authors report only M2 to have an increase in  rank during the response period. Late or early response? The figure does not go well  with the text. Consider minimizing this plot and moving stuff to supplementary.

The colormap has now been changed to avoid confusion, and the brain plots are now compressed.

(13) In terms of a statistical test for Figure 4, a two-way ANOVA was done, but over what? What are the statistics and p-values for the test? Is there a main effect of time also? Is there a significant interaction? Was this done on all mice together? How many mice? If I understand correctly, the post-hoc statistics are presented in the supplementary, but from the main figure, you cannot know what is significant and what is not.

For these figures we were mainly concerned with the post-hoc statistics, which describe the changes in the rankings of each region across learning.

      We have changed the description to “t-test with Sidak correction” to avoid the confusion.

      (14) In the legend of Figure 4, it is reported that 610 expert CR trials from 6 sessions,  instead of 7 sessions. Why was that? Also, like the previous point, why only 3 mice?

The behavior data for all sessions used were shown in Figure S1. Only 3 mice were used for the learning group; the difficulty of achieving sufficient unit yields across all regions in the same animal restricted our sample size.

(15) Body movement analysis: was this done in a different cohort of mice? Only now do I understand why there was a division into early and late stim periods. In supp 4, there should be a trace of each body part in CR expert versus naïve. This should also be done for Hit trials as a sanity check. I am not sure that the brightness difference between consecutive frames is the best measure. Rather try to calculate frame-to-frame correlation. In general, body movement analysis is super important and should be carefully analyzed.

Due to limitations in the experimental design and implementation, movement tracking was not performed during the electrophysiological recordings, and the 3 mice shown in Figure S4 (now S5) were from a separate group. We have carefully examined the temporal profiles of mouse movements and found that they did not fully match the rank dynamics for all regions, and we have added these results and related discussion in the revised manuscript. However, we acknowledge that the observed motion energy pattern could explain some of the functional connection dynamics; for example, the decrease in face and pupil motion energy could explain the reduction in ranks for the striatum.

      Without synchronized movement recordings in the main dataset, we cannot fully disentangle movement-related neural activity from task-related signals. We have made this limitation explicit in the revised manuscript and discuss it as a potential confound, along with possible approaches to address it in future work.

      (16) For Hit trials, in the striatum, there is an increase in input rank around the  response period, and from Figure S6 it is clear that this is lick-related. Other than that,  the authors report other significant changes across learning and point out to Figure 5b,c. I couldn't see which areas and when it occurred.

We did naturally expect the activity in the striatum to be strongly related to movement.

With Figure S6 (now S7), we wished to show that the observed rank increase for the striatum could not simply be attributed to changes in the time of lick initiation.

As some readers may argue that during learning the mice might have learned to lick intensely only after response signal onset, causing the observed rise of input rank after the response signal, we realigned the spikes in each trial to the time of the first lick, and a strong difference could still be observed between the early training stage and the expert training stage.

We still cannot fully rule out effects from more subtle movement changes, as the face motion energy did increase in the early response period. This result and the related discussion have been added to the results section of the revised manuscript.

      (17) Figure 6, again, is rather hard to grasp. There are 16 panels, spread over 4 areas,  input and output, stim and response. What is the take home message of all this?  Visually, it's hard to differentiate between each panel. For me, it seems like all the  panels indicate that for all 4 areas, both in output and input, frontal areas increase in  rank. This take-home message can be visually conveyed in much less tedious ways.  This simpler approach is actually conveyed better in the text than in the figures  themselves. Also, the whole explanation on how this analysis was done, was not clear  from the text. If I understand it, you just divided and ranked the general input (or  output) into individual connections? If so, then this should be better explained.

We appreciate this advice and have compressed the figures to better convey the main message. The rankings for Figure 6 and Figure S8 (now Figure S9) were explained in the left panel of Figure 3C. Each non-zero element in the connection matrix was assigned a rank from 1 to 10, with a value of 10 representing the 10% strongest non-zero elements in the matrix.

      We have updated the figure legends of Figure 3, and we have also updated the description in methods (Connection rank analyses) to give a clearer description of how the analyses were applied in subsequent figures.
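As a minimal sketch of this decile-ranking rule (hypothetical function name and matrix, with ties broken by sort order; not the manuscript's code):

```python
import numpy as np

def decile_rank(conn):
    """Assign each non-zero connection a rank from 1 to 10 by decile of
    strength among non-zero entries (10 = strongest 10%). Zeros stay 0.
    Cleanest when the number of non-zero entries is a multiple of 10."""
    ranks = np.zeros_like(conn, dtype=int)
    nz = conn != 0
    vals = conn[nz]
    # position of each value in ascending order of strength (0 .. n-1)
    order = vals.argsort().argsort()
    # map ascending positions onto decile buckets 1 .. 10
    ranks[nz] = order * 10 // len(vals) + 1
    return ranks
```

Because ranks depend only on the ordering of non-zero strengths, a uniform scaling of the whole matrix leaves the ranks unchanged, which is the intended firing-rate robustness and also the source of the insensitivity to global magnitude changes that the reviewer raises.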

      (18) Figure 7: Here, the authors perform a ROC analysis between go and no-go  stimuli. They balance between choice, but there is still an essential difference between  a hit and a FA in terms of movement and licks. That is maybe why there is a big  difference in selective units during the response period. For example, during a Hit trial  the mouse licks and gets a reward, resulting in more licking and excitement. In FAs,the mouse licks, but gets punished, which causes a reduction in additional licking and  movements. This could be a simple explanation why the ROC was good in the late  response period. Body movement analysis of Hit and FA should be done as in Figure  S4.

      We appreciate this insightful advice.

Though we balanced the numbers of basic trial types, we could not rule out intrinsic differences in the amount of movement between FA and Hit trials, which is likely the reason for the large proportion of encoding neurons in the response period.

We have added this discussion in both the results and discussion sections, along with the need for a more carefully designed behavioral paradigm to disentangle task information.

      (19) The authors also find selective neurons before stimulus onset, and refer to trial  history effects. This can be directly checked, that is if neurons decode trial history.

We attempted encoding analyses on trial history, but regrettably our dataset did not contain enough trials to construct a dataset with fully balanced trial history, visual stimulus, and behavioral choice.

      (20) Figure 7e. What is the interpretation for these results? That areas which peaked  earlier had more input and output with other areas? So, these areas are initiating  hubs? Would be nice to see ACC vs Str traces from B superimposed on each other.  Having said this, the Str is the only area to show significant differences in the early  stim period. But is also has the latest peak time. This is a bit of a discrepancy.

      We appreciate this important point.

The limited anatomical coverage of brain regions restricted our interpretation of these findings. These areas could be initiating hubs, or early receivers of input from the true initiating hubs that were not monitored in our study.

The Str trace was in fact above the ACC trace, especially in the response period. This could be explained by point (18) above: since we could not rule out intrinsic differences in the amount of movement between FA and Hit trials, and considering that striatal activity is strongly related to movement, the Str trace may reflect motion-related spike count differences between FA and Hit trials rather than visual stimulus-related differences.

This further shows the necessity of a more carefully designed behavioral paradigm to disentangle task information.

The striatum trace also did not show a true double-peak form like the traces in other regions; it ramped up during the stimulus period and only peaked in the response period. This description is now added to the results section.

In the early stim period, the striatum did show significant differences in the average percentage of encoding neurons, as the proportion of encoding neurons was stably high in the expert stage. Striatal activity is more directly affected by movement; still, the percentage of encoding neurons only reached its peak in the late stimulus period.

      (21) For the optogenetic silencing experiments, how many mice were trained for each  group? This is not mentioned in the results section but only in the legend of Figure 8. This part is rather convincing in terms of the necessity for OFC and V2M

We have included the numbers of mice in the results section as well.

      (C) Discussion

      (1) There are several studies linking sensory areas to frontal networks that should be  mentioned, for example, Esmaeili et a,l 2022, Matteucci et al., 2022, Guo et a,l 2014,Gallero Salas et al, 2021, Jerry Chen et al, 2015. Sonja Hofer papers, maybe. Probably more.

We appreciate this advice. We have now included one of the mentioned papers (Esmaeili et al., 2022) in the results and discussion sections for its direct characterization of the enhanced coupling between a somatosensory region and a frontal (motor) region during sensory learning. The other studies mentioned here seem to focus more on differences in encoding properties between regions along specific cortical pathways, rather than on functional connectivity or interregional activity correlation, and we feel they are not directly related to the observations discussed.

(2) The reported reorganization of brain-wide networks with shifts in time is best described also in Sych et al. 2021.

We regret that we did not include this important research, and we have now cited it in the discussion section.

      (3) Regarding the discussion about more widespread stimulus encoding after learning,  the results indicate that the striatum emerges first in decoding abilities (Figure 7c left  panel), but this is not discussed at all.

We briefly discussed this in the results section. We tend to attribute this to a trial history signal in the striatum, but since the structure of our data could not support a direct encoding analysis of trial history, we felt it might be inappropriate to over-interpret the results.

      (4) An important issue which is not discussed is the contribution of movement which  was shown to have a strong effect on brain-wide dynamics (Steinmetz et al 2019;  Musall et al 2019; Stringer et al 2019; Gilad et al 2018) The authors do have some movement analysis, but this is not enough. At least a discussion of the possible effects of movement on learning-related dynamics should be added.

We have included these studies in the discussion section accordingly. Since the movement analyses were done in a separate cohort of mice, we have made this limitation explicit in the revised manuscript and discuss it as a potential confound, along with possible approaches to address it in future work.

      (D) Methods

      (1) How was the light delivery of the optogenetic experiments done? Via fiber  implantation in the OFC? And for V2M? If the red laser was on the skull, how did it get  to the OFC?

The fibers were placed on the cortical surface for the V2M group and implanted above the OFC for the OFC manipulation group. These details were described in the viral injection part of the methods section.

      (2) No data given on how electrode tracking was done post hoc

As noted in our response to point 3 in the results section, the electrode shanks were ultra-thin (1-1.5 µm), and it was usually difficult to recover observable tracks or electrodes in histological sections.

      To verify the accuracy of implantation depth, we measured the repeatability of implantation in a group of mice and found a tendency for the arrays to end slightly deeper in cortex (142.1 ± 55.2 μm, n = 7 shanks) and slightly shallower in subcortical structures (-122.6 ± 71.7 μm, n = 7 shanks). We added these results as a new Figure S1 to accompany Figure 1.

      Reviewer #3 (Recommendations for the authors):

      (1) The manuscript uses decision-making in the title, abstract and introduction. However, nothing is related to decision learning in the results section. Mice simply learned to suppress licking in no-go trials. This type of task is typically used to study behavioral inhibition. And consistent with this, the authors mainly identified changes related to network on no-go trials. I really think the title and main message is misleading. It is better to rephrase it as visual discrimination learning. In the introduction, the authors also reviewed multiple related studies that are based on learning of visual discrimination tasks.

      We do view the Go/No-Go task as a specific genre of decision-making task, as there is literature discussing it as a decision-making task under the framework of signal detection theory or the updating of item values (Carandini & Churchland, 2013; Veling, Becker, Liu, Quandt, & Holland, 2022).

      We do acknowledge the essential differences between the Go/No-Go task and tasks that require the animal to choose between alternatives, and since we now realize that some readers may not accept this task as a decision task, we have changed the title to refer to a visual discrimination task, as advised.

      (2) Learning induced a faster onset on CR trials. As the no-go stimulus was not presented to mice during early stages of training, this change might reflect the perceptual learning of relevant visual stimulus after repeated presentation. This further confirms my speculation, and the decision-making used in the title is misleading.

      We have changed the title to visual discrimination task accordingly.

      (3) Figure 1E, show one hit trial. If the second ‘no-go stimulus’ is correct, that trial might be a false alarm trial as mice licked briefly. I'd like to see whether continuous licking can cause motion artifacts in recording.

      We appreciate this important point. There were indeed licking artifacts during continuous licking in Hit trials, which was part of the reason we focused our analyses on CR trials. Opto-based lick detectors may help reduce these artifacts in future studies.

      (4) What is the rationale for using a threshold of d' < 2 as the early-stage data and d' > 3 as expert-stage data?

      The thresholds were chosen as a trade-off: we needed to gather enough CR trials in the early training stage while still maintaining relatively low performance.

      Assuming the mice showed a lick response in 95% of Go-stimulus trials, d' < 2 corresponds to a performance level at which the mouse correctly rejected fewer than 63.9% of No-Go-stimulus trials, and d' > 3 corresponds to a level at which the mouse correctly rejected more than 91.2% of No-Go-stimulus trials.
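      As a quick sanity check on these numbers, the conversion from d' to a correct-rejection rate follows directly from the signal detection definition d' = z(hit rate) − z(false-alarm rate). The sketch below is a hypothetical illustration (the function name and use of SciPy are ours, not the authors'):

```python
from scipy.stats import norm

def correct_rejection_rate(d_prime, hit_rate=0.95):
    """Correct-rejection rate implied by d' at a fixed hit rate.

    Since d' = z(hit) - z(false alarm), the false-alarm rate is
    Phi(z(hit) - d') and the correct-rejection rate is its complement.
    """
    z_fa = norm.ppf(hit_rate) - d_prime
    return 1.0 - norm.cdf(z_fa)

# With a 95% hit rate, the two thresholds translate to:
print(round(correct_rejection_rate(2.0) * 100, 1))  # 63.9
print(round(correct_rejection_rate(3.0) * 100, 1))  # 91.2
```

      Under the assumed 95% hit rate, both reported percentages (63.9% and 91.2%) are reproduced.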

      (5) Figure 2A, there is a change in baseline firing rates in V2M, MDTh, and Str. There is no discussion. But what can cause this change? Recording instability, problems in spike sorting, or learning?

      It is highly possible that the firing rates before visual stimulus onset are affected by previous reward history and the task-engagement state of the mice. Notably, although recorded simultaneously in the same sessions, the changes in baseline firing rates on CR trials in the V2M region were not observed in Hit trials.

      Thus, although we cannot completely rule out recording instability, we see this as evidence that changes in trial history or task engagement affect firing rates during learning.

      References:

      Carandini, M., & Churchland, A. K. (2013). Probing perceptual decisions in rodents. Nat Neurosci, 16(7), 824-831. doi:10.1038/nn.3410.

      Cruz, K. G., Leow, Y. N., Le, N. M., Adam, E., Huda, R., & Sur, M. (2023). Cortical-subcortical interactions in goal-directed behavior. Physiol Rev, 103(1), 347-389. doi:10.1152/physrev.00048.2021

      Esmaeili, V., Oryshchuk, A., Asri, R., Tamura, K., Foustoukos, G., Liu, Y., Guiet, R., Crochet, S., & Petersen, C. C. H. (2022). Learning-related congruent and incongruent changes of excitation and inhibition in distinct cortical areas. PLOS Biology, 20(5), e3001667. doi:10.1371/journal.pbio.3001667

      Goldbach, H. C., Akitake, B., Leedy, C. E., & Histed, M. H. (2021). Performance in even a simple perceptual task depends on mouse secondary visual areas. Elife, 10, e62156. doi:10.7554/eLife.62156.

      Siegle, J. H., Jia, X., Durand, S., Gale, S., Bennett, C., Graddis, N., Heller, G.,Ramirez, T. K., Choi, H., Luviano, J. A., Groblewski, P. A., Ahmed, R., Arkhipov, A., Bernard, A., Billeh, Y. N., Brown, D., Buice, M. A., Cain, N.,Caldejon, S., Casal, L., Cho, A., Chvilicek, M., Cox, T. C., Dai, K., Denman, D.J., de Vries, S. E. J., Dietzman, R., Esposito, L., Farrell, C., Feng, D., Galbraith, J., Garrett, M., Gelfand, E. C., Hancock, N., Harris, J. A., Howard, R., Hu, B.,Hytnen, R., Iyer, R., Jessett, E., Johnson, K., Kato, I., Kiggins, J., Lambert, S., Lecoq, J., Ledochowitsch, P., Lee, J. H., Leon, A., Li, Y., Liang, E., Long, F., Mace, K., Melchior, J., Millman, D., Mollenkopf, T., Nayan, C., Ng, L., Ngo, K., Nguyen, T., Nicovich, P. R., North, K., Ocker, G. K., Ollerenshaw, D., Oliver, M., Pachitariu, M., Perkins, J., Reding, M., Reid, D., Robertson, M., Ronellenfitch, K., Seid, S., Slaughterbeck, C., Stoecklin, M., Sullivan, D., Sutton, B., Swapp, J., Thompson, C., Turner, K., Wakeman, W., Whitesell, J. D., Williams, D., Williford, A., Young, R., Zeng, H., Naylor, S., Phillips, J. W., Reid, R. C., Mihalas, S., Olsen, S. R., & Koch, C. (2021). Survey of spiking in the mouse visual system reveals functional hierarchy. Nature, 592(7852), 86-92. doi:10.1038/s41586-020-03171-x

      Sych, Y., Fomins, A., Novelli, L., & Helmchen, F. (2022). Dynamic reorganization of the cortico-basal ganglia-thalamo-cortical network during task learning. Cell Rep, 40(12), 111394. doi:10.1016/j.celrep.2022.111394

      Veling, H., Becker, D., Liu, H., Quandt, J., & Holland, R. W. (2022). How go/no-go training changes behavior: A value-based decision-making perspective. Current Opinion in Behavioral Sciences, 47, 101206. doi:10.1016/j.cobeha.2022.101206.

    1. The other one that happened was just before Russia invaded Ukraine, they managed to disable the Viasat modems. And this is an interesting case. These modems are used for satellite communications. And they were able to attack these modems so that they physically disabled themselves. It was not like the denial of service attack on the network. No, they managed to wipe the firmware of all these modems in such a way that it could not be replaced. The reason we know about this stuff so well is it turns out there were lots of windmills that also had these modems. In Germany, apparently 4,000 of these modems stopped working. And there were 4,000 wind turbines that could no longer be operated. So this was a military cyber attack that happened as Russia was invading Ukraine. And it was of great benefit to them because it disabled a lot of military communications in Ukraine. But this is the kind of thing that can happen, only that it’s quite rare.

      Russia compromised Viasat modems before the February 2022 invasion to disrupt Ukraine's military satellite communications. However, these modems are also used in German wind parks, which lost control over 4,000 wind turbines.

    1. But handwriting requires more of the brain’s motor programs than typing.

      When I read that “handwriting requires more precise movements and engages more of the brain than typing” (Paragraph 9), it made sense to me because I always feel like I remember things better when I physically write them down. Typing feels faster, but it’s easier to zone out and just copy words without really thinking. This quote made me realize that some newer technology might actually make learning easier in the moment but worse in the long run for retention.

      Lotta shit happens, like, being in show business
      A lot of shit happens, like, like, I make a lot of money, you know
      And I’m really happy about it
      And I’m not bragging, I just wanna say something
      I make a so, fuck, it’s ridiculous
      But wait, wait a minute, wait a minute
      Hey, if my father was alive today, I would go home and say
      “Dad, I wanna tell you how much money I made”
      You know what he’d say? You’s a lying motherfucker
      Joe Lewis didn’t make that much money
      Come in here, get your ass out the house
      Coming here with that bullshit, hah

      The intro is a sample from comedian Richard Pryor’s stand-up routine “Fame”, part of "That African American Is Still Crazy: Good Shit From The Vaults", which features material from the '70s, '80s, and '90s.

      Richard Pryor is one of the most important African-American comedians of all time, active from the 1960s until the 1990s.

      In this skit, part of his stand-up routine "Fame", he jokes about how rich he is and how his own father could not even believe the amount of money he makes.

      Joe Louis was a professional boxer and the World Heavyweight Champion from 1937 to 1949.

    1. Gomez deleted social media apps from her smartphone and gave up knowledge of her passwords. She claims that the move has been liberating: “I suddenly had to learn how to be with myself.” She reflects that there were 150 million people on her phone, and “I just put it down. . . . That was such a relief.”

      It's nice to notice that some celebrities don't fully rely on social media and fame. Celebrities are human and sometimes want to feel like regular people too.

      Financial freedom my only hope
      Fuck livin’ rich and dyin’ broke
      I bought some artwork for one million
      Two years later, that shit worth two million
      Few years later, that shit worth eight million
      I can’t wait to give this shit to my children
      Y’all think it’s bougie, I’m like, it’s fine
      But I’m tryin’ to give you a million dollars worth of game for $9.99
      I turned that 2 to a 4, 4 to an 8
      I turned my life into a nice first week release date
      Y’all out here still takin’ advances, huh?
      Me and my niggas takin’ real chances, uh
      Y’all on the ‘Gram holdin’ money to your ear
      There’s a disconnect, we don’t call that money over here, yeah

      In the second verse, the rapper expands on the concept of developing wealth and obtaining financial freedom, which he defines as his only hope.

      He also makes reference to his streaming platform, Tidal, which offers a million dollars worth of game through its music catalog for $9.99 a month. Tidal is just one of the many business ventures of the rapper.

      In the closing lines, he mocks rappers who take "advances" - a form of loan that record labels offer to artists to finance their albums - and also the trend followed by some rappers on Instagram of showing off money by holding it close to their ears. He reveals, with a clever play on words, that “there’s a disconnect, we don’t call that money over here.”

      In this song, Jay-Z constantly points out the importance of property and wealth. He is, therefore, to be considered a Black capitalist. He shows how strongly he believes in economic success as the principal means of reaching some form of equality or, at least, some form of upliftment.

      However, he does not really believe that wealth and status can be a proper shield from racism; thus, wealth is only a way to obtain a very partial equality.

      Even works by professional rhetoricians are often deliberately mislabeled. A colleague recently informed me that his last three books, all of them originally employing "rhetoric" in their titles, had been retitled by the publishers, since rhetorical terms would downgrade the text and reduce sales!

      I mean... I understand why casual readers would be turned off by more academic/professional terms, and if you're publishing something they have to be more widely applicable. It's just like me not wanting to pick up something that has proper scientific terms in the title. That's not something I think would interest me.


    1. Also, the emotional situation matters a lot. Teens are more likely to act impulsively in emotionally intense situations. So, it’s hard to say that a person’s behavior is caused just by how their brain is wired.

      3) I would like to hear someone's take on this.

    2. the average girl reaches puberty around age 12.5. Just a couple of hundred years ago, it was closer to 16 (Chumlea et al., 2003; Evelith, 2017). While the age of sexual maturity has been falling, the average age at which women have their first child has been rising; it’s now just over 26 in many developed countries.

      A key fact that stuck with me.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      The Reviewer structured their review such that their first two recommendations specifically concerned the two major weaknesses they viewed in the initial submission. For clarity and concision, we have copied their recommendations to be placed immediately following their corresponding points on weaknesses.

      Strengths:

      Studying prediction error from the lens of network connectivity provides new insights into predictive coding frameworks. The combination of various independent datasets to tackle the question adds strength, including two well-powered fMRI task datasets, resting-state fMRI interpreted in relation to behavioral measures, as well as EEG-fMRI.

      Weaknesses:

      Major:

      (R1.1) Lack of multiple comparisons correction for edge-wise contrast:

      The analysis of connectivity differences across three levels of prediction error was conducted separately for approximately 22,000 edges (derived from 210 regions), yet no correction for multiple comparisons appears to have been applied. Then, modularity was applied to the top 5% of these edges. I do not believe that this approach is viable without correction. It does not help that a completely separate approach using SVMs was FDR-corrected for 210 regions.

      [Later recommendation] Regarding the first major point: To address the issue of multiple comparisons in the edge-wise connectivity analysis, I recommend using the Network-Based Statistic (NBS; Zalesky et al., 2010). NBS is well-suited for identifying clusters (analogous to modules) of edges that show statistically significant differences across the three prediction error levels, while appropriately correcting for multiple comparisons.

      Thank you for bringing this up. We acknowledge that our modularity analysis does not evaluate statistical significance. Originally, the modularity analysis was meant to provide a connectome-wide summary of the connectivity effects, whereas the classification-based analysis was meant to address the need for statistical significance testing. However, as the reviewer points out, it would be better if significance were tested in a manner more analogous to the reported modules. As they suggest, we updated the Supplemental Materials (SM) to include the results of Network-Based Statistic analysis (SM p. 1-2):

      “(2.1) Network-Based Statistic

      Here, we evaluate whether PE significantly impacts connectivity at the network level using the Network-Based Statistic (NBS) approach.[1] NBS relied on the same regression data generated for the main-text analysis, whereby a regression is performed examining the effect of PE (Low = –1, Medium = 0, High = +1) on connectivity for each edge. This was done across the connectome, and for each edge, a z-score was computed. For NBS, we thresholded edges to |Z| > 3.0, which yielded one large network cluster, shown in Figure S3. The size of the cluster – i.e., number of edges – was significant (p < .05) per a permutation test using 1,000 random shuffles of the condition data for each participant, as is standard.[1] These results demonstrate that the network-level effects of PE on connectivity are significant. The main-text modularity analysis converts this large cluster into four modules, which are more interpretable and open the door to further analyses”.
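      For readers unfamiliar with NBS, the quoted procedure (per-edge regression z-scores, |Z| > 3.0 thresholding, largest-cluster size, and a permutation test over shuffled condition labels) can be sketched roughly as follows. This is an illustrative reconstruction under assumed data shapes, not the authors' code; a simple one-sample t-statistic across subjects stands in for their exact edge-wise z-score:

```python
# Hypothetical NBS-style sketch; names, shapes, and statistics are assumptions.
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def largest_cluster_size(z, thresh=3.0):
    """Edges in the largest connected component of the |Z| > thresh graph."""
    adj = (np.abs(z) > thresh).astype(int)
    np.fill_diagonal(adj, 0)
    adj = np.maximum(adj, adj.T)  # treat as undirected
    n_comp, labels = connected_components(csr_matrix(adj), directed=False)
    sizes = [adj[np.ix_(labels == c, labels == c)].sum() // 2
             for c in range(n_comp)]
    return max(sizes) if sizes else 0

def edge_z(data, conds):
    """Per-edge regression slope of connectivity on PE level, z-scored
    across subjects. data: (n_subj, n_roi, n_roi, n_cond)."""
    slopes = np.tensordot(data, conds, axes=([3], [0])) / (conds @ conds)
    m, s = slopes.mean(0), slopes.std(0, ddof=1)
    with np.errstate(divide="ignore", invalid="ignore"):
        return m / (s / np.sqrt(data.shape[0]))

def nbs_pvalue(data, thresh=3.0, n_perm=1000, seed=0):
    """Permutation p-value for the observed largest suprathreshold cluster,
    shuffling condition labels (Low/Medium/High PE) within each subject."""
    rng = np.random.default_rng(seed)
    conds = np.array([-1.0, 0.0, 1.0])  # Low = -1, Medium = 0, High = +1
    observed = largest_cluster_size(edge_z(data, conds), thresh)
    null = np.array([
        largest_cluster_size(
            edge_z(np.stack([s[..., rng.permutation(3)] for s in data]),
                   conds), thresh)
        for _ in range(n_perm)])
    p = (np.sum(null >= observed) + 1) / (n_perm + 1)
    return observed, p
```

      The p-value follows the standard NBS logic: the observed cluster is significant if clusters that large rarely arise when the Low/Medium/High labels are randomly shuffled.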

      We updated the Results to mention these findings before describing the modularity analysis (p. 8-9):

      “After demonstrating that PE significantly influences brain-wide connectivity using Network-Based Statistic analysis (Supplemental Materials 2.1), we conducted a modularity analysis to study how specific groups of edges are all sensitive to high/low-PE information.”

      (R1.2) Lack of spatial information in EEG:

      The EEG data were not source-localized, and no connectivity analysis was performed. Instead, power fluctuations were averaged across a predefined set of electrodes based on a single prior study (reference 27), as well as across a broader set of electrodes. While the study correlates these EEG power fluctuations with fMRI network connectivity over time, such temporal correlations do not establish that the EEG oscillations originate from the corresponding network regions. For instance, the observed fronto-central theta power increases could plausibly originate from the dorsal anterior cingulate cortex (dACC), as consistently reported in the literature, rather than from a distributed network. The spatially agnostic nature of the EEG-fMRI correlation approach used here does not support interpretations tied to specific dorsal-ventral or anterior-posterior networks. Nonetheless, such interpretations are made throughout the manuscript, which overextends the conclusions that can be drawn from the data.

      [Later recommendation] Regarding the second major point: I suggest either adopting a source-localized EEG approach to assess electrophysiological connectivity or revising all related sections to avoid implying spatial specificity or direct correspondence with fMRI-derived networks. The current approach, which relies on electrode-level power fluctuations, does not support claims about the spatial origin of EEG signals or their alignment with specific connectivity networks.

      We thank the reviewer for this important point, which allows us to clarify the specific and distinct contributions of each imaging modality in our study. Our primary goal for Study 3 was to leverage the high temporal resolution of EEG to identify the characteristic frequency at which the fMRI-defined global connectivity states fluctuate. The study was not designed to infer the spatial origin of these EEG signals, a task for which fMRI is better suited and which we addressed in Studies 1 and 2.

      As the reviewer points out, fronto-central theta is generally associated with the dACC. We agree with this point entirely. We suspect that there is some process linking dACC activation to the identified network fluctuations – some type of relationship that does not manifest in our dynamic functional connectivity analyses – although this is only a hypothesis and one that is beyond the present scope.

      We updated the Discussion to mention these points and acknowledge the ambiguity regarding the correlation between network fluctuation amplitude (fMRI) and Delta/Theta power (EEG) (p. 24):

      “We specifically interpret the fMRI-EEG correlation as reflecting fluctuation speed because we correlated EEG oscillatory power with the fluctuation amplitude computed from fMRI data. Simply correlating EEG power with the average connectivity or the signed difference between posterior-anterior and ventral-dorsal connectivity yields null results (Supplemental Materials 6), suggesting that this is a very particular association, and viewing it as capturing fluctuation amplitude provides a parsimonious explanation. Yet, this correlation may be interpreted in other ways. For example, resting-state Theta is also a signature of drowsiness,[2] which may correlate with PE processing, but perhaps should be understood as some other mechanism. Additionally, Theta is widely seen as a sign of dorsal anterior cingulate cortex activity,[3] and it is unclear how to reconcile this with our claims about network fluctuations. Nonetheless, as we show with simulations (Supplemental Materials 5), a correlation between slow fMRI network fluctuations and fast EEG Delta/Theta oscillations is also consistent with a common global neural process oscillating rapidly and eliciting both measures.”

      Regarding source-localization, several papers have described known limitations of this strategy for drawing precise anatomical inferences,[4–6] and this seems unnecessary given that our fMRI analyses already provide more robust anatomical precision. We intentionally used EEG in our study for what it measures most robustly: millisecond-level temporal dynamics.

      (R1.2a) Examples of problematic language include:

      Line 134: "detection of network oscillations at fast speeds" - the current EEG approach does not measure networks.

      This is an important issue. We acknowledge that our EEG approach does not directly measure fMRI-defined networks. Our claim is inferential, designed to estimate the temporal dynamics of the large-scale fMRI patterns we identified. The correlation between our fMRI-derived fluctuation amplitude (|PA – VD|) and 3-6 Hz EEG power provides suggestive evidence that the transitions between these network states occur at this frequency, rather than being a direct measurement of network oscillations.

      To support the validity of this inference, we performed two key analyses (now in Supplemental Materials). First, a simulation study provides a proof-of-concept, confirming our method can recover the frequency of a fast underlying oscillator from slow fMRI and fast EEG data. Second, a specificity analysis shows the EEG correlation is unique to our measure of fluctuation amplitude and not to simpler measures like overall connectivity strength. These analyses demonstrate that our interpretation is more plausible than alternative explanations.

      Overall, we have revised the manuscript to be more conservative in the language employed, such as presenting alternative explanations to the interpretations put forth based on correlative/observational evidence (e.g., our modifications above described in our response to comment R1.2). In addition, we have made changes throughout the report to state the issues related to reverse inference more explicitly and to better communicate that the evidence is suggestive – please see our numerous changes described in our response to comment R3.1. For the statement that the reviewer specifically mentioned here, we revised it to be more cautious (p. 7):

      “Although such speed outpaces the temporal resolution of fMRI, correlating fluctuations in dynamic connectivity measured from fMRI data with EEG oscillations can provide an estimate of the fluctuations’ speed. This interpretation of a correlation again runs up against issues related to reverse inference but would nonetheless serve as initial suggestive evidence that spontaneous transitions between network states occur rapidly.”

      (R1.2b) Line 148: "whether fluctuations between high- and low-PE networks occur sufficiently fast" - this implies spatial localization to networks that is not supported by the EEG analysis.

      Building on our changes described in our immediately prior response, we adjusted our text here to say our analyses searched for evidence consistent with the idea that the network fluctuations occur quickly rather than searching for decisive evidence favoring this idea (p. 7-8):

      “Finally, we examined rs-fMRI-EEG data to assess whether we find parallels consistent with the high/low-PE network fluctuations occurring at fast timescales suitable for the type of cognitive operations typically targeted by PE theories.”

      (R1.2c) Line 480: "how underlying neural oscillators can produce BOLD and EEG measurements" - no evidence is provided that the same neural sources underlie both modalities.

      As described above, these claims are based on the simulation study demonstrating that this is a possibility, and we have revised the manuscript overall to be clearer that this is our interpretation while providing alternative explanations.

      Reviewer #2 (Public review):

      Strengths:

      Clearly, a lot of work and data went into this paper, including 2 task-based fMRI experiments and the resting state data for the same participants, as well as a third EEG-fMRI dataset. Overall, well written with a couple of exceptions on clarity, as per below, and the methodology appears overall sound, with a couple of exceptions listed below that require further justification. It does a good job of acknowledging its own weakness.

      Weaknesses:

      (R2.1) The paper does a good job of acknowledging its greatest weakness, the fact that it relies heavily on reverse inference, but cannot quite resolve it. As the authors put it, "finding the same networks during a prediction error task and during rest does not mean that the networks' engagement during rest reflects prediction error processing". Again, the authors acknowledge the speculative nature of their claims in the discussion, but given that this is the key claim and essence of the paper, it is hard to see how the evidence is compelling to support that claim.

      We thank the reviewer for this comment. We agree that reverse inference is a fundamental challenge and that our central claim requires a particularly high bar of evidence. While no single analysis resolves this issue, our goal was to build a cumulative case that is compelling by converging on the same conclusion from multiple, independent lines of evidence.

      For our investigation, we initially established a task-general signature of prediction error (PE). By showing the same neural pattern represents PE in different contexts, we constrain the reverse inference, making it less likely that our findings are a task-specific artifact and more likely that they reflect the core, underlying process of PE. Building on this, our most compelling evidence comes from linking task and rest at the individual level. We didn't just find the same general network at rest; we showed that an individual’s unique anatomical pattern of PE-related connectivity during the task specifically predicts their own brain's fluctuation patterns at rest. This highly specific, person-by-person correspondence provides a direct bridge between an individual's task-evoked PE processing and their intrinsic, resting-state dynamics. Furthermore, these resting-state fluctuations correlate specifically with the 3-6 Hz theta rhythm—a well-established neural marker for PE.

      While reverse inference remains a fundamental limitation for many studies on resting-state cognition, the aspects mentioned above, we believe, provide suggestive evidence, favoring our PE interpretation. Nonetheless, we have made changes throughout the manuscript to be more conservative in the language we use to describe our results, to make it clear what claims are based on correlative/observational evidence, and to put forth alternative explanations for the identified effects. Please find our numerous changes detailed in our response to comment R3.1.

      (R2.2) Given how uncontrolled cognition is during "resting-state" experiments, the parallel made with prediction errors elicited during a task designed for that effect is a little difficult to make. How often are people really surprised when their brains are "at rest", likely replaying a previously experienced event or planning future actions under their control? It seems to be more likely a very low prediction error scenario, if at all surprising.

      We (and some others) take a broad interpretation of PE and believe it is often more intuitive to think about PE minimization in terms of uncertainty rather than “surprise”; the word “surprise” usually implies a sudden emotive reaction from the violation of expectations, which is not useful here.

      When planning future actions, each step of the plan is spurred by the uncertainty of what is the appropriate action given the scenario set up by prior steps. Each planned step erases some of that uncertainty. For example, you may be mentally simulating a conversation, what you will say, and what another person will say. Each step of this creates uncertainty of “what is the appropriate response?” Each reasoning step addresses contingencies. While planning, you may also uncover more obvious forms of uncertainty, sparking memory retrieval to finish it. A resting-state participant may think to cook a frozen pizza when they arrive home, but be uncertain about whether they have any frozen pizzas left, prompting episodic memory retrieval to address this uncertainty. We argue that every planning step or memory retrieval can be productively understood as being sparked by uncertainty/surprise (PE), and the subsequent cognitive response minimizes this uncertainty.

      We updated the Introduction to include a paragraph near the start providing this explanation (p. 3-4):

      “PE minimization may broadly coordinate brain functions of all sorts, including abstract cognitive functions. This includes the types of cognitive processes at play even in the absence of stimuli (e.g., while daydreaming). While it may seem counterintuitive to associate this type of cognition with PE – a concept often tied to external surprises – it has been proposed that the brain's internal generative model is continuously active.[12–14] Spontaneous thought, such as planning a future event or replaying a memory, is not a passive, low-PE process. Rather, it can be seen as a dynamic cycle of generating and resolving internal uncertainty. While daydreaming, you may be reminded of a past conversation, where you wish you had said something different. This situation contains uncertainty about what would have been the best thing to say. Wondering about what you wish you said can be viewed as resolving this uncertainty, in principle, forming a plan if the same situation ever arises again in the future. Each iteration of the simulated conversation repeatedly sparks and then resolves this type of uncertainty.”

      (R2.3)The quantitative comparison between networks under task and rest was done on a small subset of the ROIs rather than on the full network - why? Noting how small the correlation between task and rest is (r=0.021) and that's only for part of the networks, the evidence is a little tenuous. Running the analysis for the full networks could strengthen the argument.

      We thank the reviewer for this opportunity to clarify our method. A single correlation between the full, aggregated networks would be conceptually misaligned with what we aimed to assess. To test for a person-specific anatomical correspondence, it is necessary to examine the link between task and rest at a granular level. We therefore asked whether the specific parts of an individual's network most responsive to PE during the task are the same parts that show the strongest fluctuations at rest. Our analysis, performed iteratively across all 3,432 possible ROI subsets, was designed specifically to answer this question, which would be obscured by an aggregated network measure.

      We appreciate the reviewer's concern about the modest effect size (r = .021). However, this must be contextualized, as the short task scan has very low reliability (.08), which imposes a severe statistical ceiling on any possible task-rest correlation. Finding a highly significant effect (p < .001) in the face of such noisy data, therefore, provides robust evidence for a genuine task-rest correspondence.
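      The ceiling argument can be made concrete with Spearman's classic correction for attenuation, under which an observed correlation cannot exceed r_true × √(rel_task × rel_rest). The numbers below are a back-of-the-envelope illustration, assuming (optimistically) perfect reliability for the resting-state measure; only the .08 task reliability comes from the text:

```python
# Spearman attenuation ceiling: r_observed <= r_true * sqrt(rel_x * rel_y).
import math

rel_task = 0.08  # split-half reliability of the task contrast (from the text)
rel_rest = 1.00  # assumed upper bound for the rest measure (our assumption)
ceiling = math.sqrt(rel_task * rel_rest)
print(round(ceiling, 3))  # 0.283
```

      Under these assumptions, even a perfect true correlation could not appear larger than about .28, so the observed r = .021 corresponds to a disattenuated correlation of roughly .07 or more.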

      We updated the Discussion to discuss this point (p. 22-23):

      “A key finding supporting our interpretation is the significant link between individual differences in task-evoked PE responses and resting-state fluctuations. One might initially view the effect size of this correspondence (r = .021) as modest. However, this interpretation must be contextualized by the considerable measurement noise inherent in short task-fMRI scans; the split-half reliability of the task contrast was only .08. This low reliability imposes a severe statistical ceiling on any possible task-rest correlation. Therefore, detecting a highly significant (p < .001) relationship despite this constraint provides robust evidence for a genuine link. Furthermore, our analytical approach, which iteratively examined thousands of ROI subsets rather than one aggregated network, was intentionally granular. The goal was not simply to correlate two global measures, but to test for a person-specific anatomical correspondence – that is, whether the specific parts of an individual's network most sensitive to PE during the task are the same parts that fluctuate most strongly at rest. An aggregate analysis would obscure this critical spatial specificity. Taken together, this granular analysis provides compelling evidence for an anatomically consistent fingerprint of PE processing that bridges task-evoked activity and spontaneous resting-state dynamics, strengthening our central claim.”

      (R2.4) Looking at the results in Figure 2C, the four-quadrant description of the networks labelled for low and high PE appears a little simplistic. The authors state that this four-quadrant description omits some ROIs as motivated by prior knowledge. This would benefit from a more comprehensive justification. Which ROIs are excluded, and what is the evidence for exclusion?

      Our four-quadrant model is a principled simplification designed to distill the dominant, large-scale connectivity patterns from the complex modularity results. This approach focuses on coherent, well-documented anatomical streams while setting aside a few anatomically distant and disjoint ROIs that were less central to the main modules. This heuristic additionally enables more robust and novel downstream analyses.

      The two low-PE posterior-anterior (PA) pathways are grounded in canonical processing streams. (i) The OC-ATL connection mirrors the ventral visual stream (the “what” pathway), which is fundamental for object recognition and is upregulated during the smooth processing of expected stimuli. (ii) The IPL-LPFC connection represents a core axis of the dorsal attention stream and the Fronto-Parietal Control Network (FPCN), reflecting the maintenance of top-down cognitive control when information is predictable; the IPL-LPFC module excludes ROIs in the middle temporal gyrus, which are often associated with the FPCN but are not covered here.

      In contrast, the two high-PE ventral-dorsal (VD) pathways reflect processes for resolving surprise and conflict. (i) The OC-IPL connection is a classic signature of attentional reorienting, where unexpected sensory input (high PE) triggers a necessary shift in attention; the OC-IPL module excludes some ROIs that are anterior to the occipital lobe and enter the fusiform gyrus and inferior temporal lobe. (ii) The ATL-LPFC connection aligns with mechanisms for semantic re-evaluation, engaging prefrontal control regions to update a mental model in the face of incongruent information.

      Beyond its functional/anatomical grounding, this simplification provides powerful methodological and statistical advantages. It establishes a symmetrical framework that makes our dynamic connectivity analyses tractable, such as our “cube” analysis of state transitions, which required overlapping modules. Critically, this model also offers a statistical safeguard. By ensuring each quadrant contributes to both low- and high-PE connectivity patterns, we eliminate confounds like region-specific signal variance or global connectivity. This design choice isolates the phenomenon to the pattern of connectivity itself (posterior-anterior vs. ventral-dorsal), making our interpretation more robust.

      We updated the end of the Study 1A results (p. 10-11):

      “Some ROIs appear in Figure 2C but are excluded from the four targeted quadrants (Figures 2C & 2D) – e.g., posterior inferior temporal lobe and fusiform ROIs are excluded from the OC-IPL module, and middle temporal gyrus ROIs are excluded from the IPL-LPFC modules. These exclusions, in favor of a four-quadrant interpretation, are motivated by existing knowledge of prominent structural pathways among these quadrants. This interpretation is also supported by classifier-based analyses showing connectivity within each quadrant is significantly influenced by PE (Supplemental Materials 2.2), along with analyses of single-region activity showing that these areas also respond to PE independently (Supplemental Materials 3). Hence, we proceeded with further analyses of these quadrants’ connections, which summarize PE’s global brain effects.

      “This four-quadrant setup also imparts analytical benefits. First, this simplified structure may better generalize across PE tasks, and Study 1B would aim to replicate these results with a different design. Second, the four quadrants mean that each ROI contributes to both the posterior-anterior and ventral-dorsal modules, which would benefit later analyses and rule out confounds such as PE eliciting increased/decreased connectivity between an ROI and the rest of the brain. An additional, less key benefit is that this setup allows more easily evaluating whether the same phenomena arise using a different atlas (Supplemental Materials Y).”

      (R2.5) The EEG-fMRI analysis claiming 3-6Hz fluctuations for PE is hard to reconcile with the fact that fMRI captures activity that is a lot slower, while some PEs are as fast as 150 ms. The discussion acknowledges this but doesn't seem to resolve it - would benefit from a more comprehensive argument.

      We thank the reviewer for raising this important point, which allows us to clarify the logic of our multimodal analysis. Our analysis does not claim that the fMRI BOLD signal itself oscillates at 3-6 Hz. Instead, it is based on the principle that the intensity of a fast neural process can be reflected in the magnitude of the slow BOLD response. It’s akin to using a long-exposure photograph to capture a fast-moving object; while the individual movements are blurred, the intensity of the blur in the photo serves as a proxy for the intensity of the underlying motion. In our case, the magnitude of the fMRI network difference (|PA – VD|) acts as the "blur," reflecting the intensity of the rapid fluctuations between states within that time window.

      Following this logic, we correlated this slow-moving fMRI metric with the power of the fast EEG rhythms, which reflects their amplitude. To bridge the different timescales, we averaged the EEG power over each fMRI time window and convolved it with the standard hemodynamic response function (HRF) – a crucial step to align the timing of the neural and metabolic signals. The resulting significant correlation specifically in the 3-6 Hz band demonstrates that when this rhythm is stronger, the fMRI data shows a greater divergence between network states. This allows us to infer the characteristic frequency of the underlying neural fluctuations without directly measuring them at that speed with fMRI, thus reconciling the two timescales.
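As a rough sketch of this timescale-bridging step (the sampling parameters, the double-gamma HRF shape, and the simulated data are all illustrative assumptions, not our exact pipeline):

```python
import numpy as np
from scipy.stats import gamma, spearmanr

def canonical_hrf(tr, duration=30.0):
    # Double-gamma HRF (SPM-style parameters), sampled at the fMRI TR.
    t = np.arange(0.0, duration, tr)
    return gamma.pdf(t, 6) - gamma.pdf(t, 16) / 6.0

def bridge_eeg_to_fmri(band_power, eeg_rate, tr, n_trs):
    # 1) Average the fast EEG band-power envelope within each TR window.
    per_window = int(round(eeg_rate * tr))
    per_tr = band_power[: n_trs * per_window].reshape(n_trs, per_window).mean(axis=1)
    # 2) Convolve with the HRF to align neural and metabolic timing.
    return np.convolve(per_tr, canonical_hrf(tr))[:n_trs]

rng = np.random.default_rng(0)
n_trs, tr, eeg_rate = 400, 0.72, 250.0        # HCP-like TR; hypothetical EEG rate
theta_power = rng.random(int(n_trs * tr * eeg_rate))  # stand-in 3-6 Hz power
pa_vd_amplitude = rng.random(n_trs)                   # stand-in |PA - VD| series

predictor = bridge_eeg_to_fmri(theta_power, eeg_rate, tr, n_trs)
rho, p = spearmanr(predictor, pa_vd_amplitude)
```

The averaging step downsamples the fast power envelope to the fMRI timescale, and the HRF convolution delays and smooths it to match the metabolic signal, so the final correlation compares two series on a common clock.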

      Reviewer #3 (Public review):

      Bogdan et al. present an intriguing and timely investigation into the intrinsic dynamics of prediction error (PE)-related brain states. The manuscript is grounded in an intuitive and compelling theoretical idea: that the brain alternates between high and low PE states even at rest, potentially reflecting an intrinsic drive toward predictive minimization. The authors employ a creative analytic framework combining different prediction tasks and imaging modalities. They shared open code, which will be valuable for future work.

      (R3.1) Consistency in Theoretical Framing

      The title, abstract, and introduction suggest inconsistent theoretical goals of the study.

      The title suggests that the goal is to test whether there are intrinsic fluctuations in high and low PE states at rest. The abstract and introduction suggest that the goal is to test whether the brain intrinsically minimizes PE and whether this minimization recruits global brain networks. My comments here are that a) these are fundamentally different claims, and b) both are challenging to falsify. For one, task-like recurrence of PE states during resting might reflect the wiring and geometry of the functional organization of the brain emerging from neurobiological constraints or developmental processes (e.g., experience), but showing that mirroring exists because of the need to minimize PE requires establishing a robust relationship with behavior or showing a causal effect (e.g., that interrupting intrinsic PE state fluctuations affects prediction).

      The global PE hypothesis-"PE minimization is a principle that broadly coordinates brain functions of all sorts, including abstract cognitive functions"-is more suitable for discussion rather than the main claim in the abstract, introduction, and all throughout the paper.

      Given the above, I recommend that the authors clarify and align their core theoretical goals across the title, abstract, introduction, and results. If the focus is on identifying fluctuations that resemble task-defined PE states at rest, the language should reflect that more narrowly, and save broader claims about global PE minimization for the discussion. This hypothesis also needs to be contextualized within prior work. I'd like to see if there is similar evidence in the literature using animal models.

      Thank you for bringing up this issue. We have made changes throughout the paper to address these points. First, we have omitted reference to a “global PE hypothesis” from the Abstract and Introduction, in favor of structuring the Introduction in terms of a falsifiable question (p. 4):

      “We pursued this goal using three studies (Figure 1) that collectively targeted a specific question: Do the task-defined connectivity signatures of high vs. low PE also recur during rest, and if so, how does the brain transition between exhibiting high/low signatures?”

      We made changes later in the Introduction to clarify that the investigation is based on correlative evidence and requires interpretations that may be debated (p. 5-7):

      “Although this does not entirely address the reverse inference dilemma and can only produce correlative evidence, the present research nonetheless investigates these widely speculated upon PE ideas more directly than any prior work.

      Although such speed outpaces the temporal resolution of fMRI, correlating fluctuations in dynamic connectivity measured from fMRI data with EEG oscillations can provide an estimate of the fluctuations’ speed. This interpretation of a correlation again runs up against issues related to reverse inference but would nonetheless serve as initial suggestive evidence that spontaneous transitions between network states occur rapidly.

      Second, we examined the recruitment of these networks during rs-fMRI, and although the problems related to reverse inference are impossible to overcome fully, we engage with this issue by linking rs-fMRI data directly to task-fMRI data of the same participants, which can provide suggestive evidence that the same neural mechanisms are at play in both.”

      We made changes throughout the Results now better describing the results as consistent with a hypothesis rather than demonstrating it (p. 12-19):

      “In other words, we essentially asked whether resting-state participants are sometimes in low PE states and sometimes in high PE states, which would be consistent with spontaneous PE processing in the absence of stimuli.

      These emerging states overlap strikingly with the previous task effects of PE, suggesting that rs-fMRI scans exhibit fluctuations that resemble the signatures of low- and high-PE states. 

      To be clear, this does not entirely dissuade concerns about reverse inference, which would require a type of causal manipulation that is difficult (if not impossible) to perform in a resting state scan. Nonetheless, these results provide further evidence consistent with our interpretation that the resting brain spontaneously fluctuates between high/low PE network states.

      These patterns are most consistent with a characteristic timescale near 3–6 Hz for the amplitude of the putative high/low-PE fluctuations. This is notably consistent with established links between PE and Delta/Theta and is further consistent with an interpretation in which these fluctuations relate to PE-related processing during rest.”

      We have also made targeted edits to the Discussion to present the findings in a more cautious way, more clearly state what is our interpretation, and provide alternative explanations (p. 19-26):

      “The present research conducted task-fMRI, rs-fMRI, and rs-fMRI-EEG studies to clarify whether PE elicits global connectivity effects and whether the signatures of PE processing arise spontaneously during rest. This investigation carries implications for how PE minimization may characterize abstract task-general cognitive processes. […] Although there are different ways to interpret this correlation, it is consistent with high/low PE states generally fluctuating at 3-6 Hz during rest. Below, we discuss these three studies’ findings.

      Our rs-fMRI investigation examined whether resting dynamics resemble the task-defined connectivity signatures of high vs. low PE, independent of the type of stimulus encountered. The resting-state analyses indeed found that, even at rest, participants’ brains fluctuated between strong ventral-dorsal connectivity and strong posterior-anterior connectivity, consistent with shifts between states of high and low PE. This conclusion is based on correlative/observational evidence and so may be controversial as it relies on reverse inference.

      These patterns resemble global connectivity signatures seen in resting-state participants, and correlations between fMRI and EEG data yield associations, consistent with participants fluctuating between high-PE (ventral-dorsal) and low-PE (posterior-anterior) states at 3-6 Hz. Although definitively testing these ideas is challenging, given that rs-fMRI is defined by the absence of any causal manipulations, our results provide evidence consistent with PE minimization playing a role beyond stimulus processing.”

      (R3.2) Interpretation of PE-Related Fluctuations at Rest and Its Functional Relevance. It would strengthen the paper to clarify what is meant by "intrinsic" state fluctuations. Intrinsic might mean taskindependent, trait-like, or spontaneously generated. Which do the authors mean here? Is the key prediction that these fluctuations will persist in the absence of a prediction task?

      Of the three terms the reviewer mentioned, “spontaneous” and “task-independent” are the most accurate descriptors. We conceptualize these fluctuations as a continuous background process that persists across all facets of cognition, without requiring a task explicitly designed to elicit prediction error – although we, along with other predictive coding papers, would argue that all cognitive tasks are fundamentally rooted in PE mechanisms and thus anything can be seen as a “prediction task” (see our response to comment R2.2 for our changes to the Introduction that provide more intuition for this point). The proposed interactions can be seen as analogous to cortico-basal-thalamic loops, which are engaged across a vast and diverse array of cognitive processes.

      The prior submission only used the word “intrinsic” in the title. We have since revised it to “spontaneous,” which is more specific than “intrinsic,” and we believe clearer for a title than “task-independent” (p. 1): “Spontaneous fluctuations in global connectivity reflect transitions between states of high and low prediction error”

      We have also made tweaks throughout the manuscript to now use “spontaneously” throughout (it now appears 8 times in the paper).

      Regardless of the intrinsic argument, I find it challenging to interpret the results as evidence of PE fluctuations at rest. What the authors show directly is that the degree to which a subset of regions within a PE network discriminates high vs. low PE during task correlates with the magnitude of separation between high and low PE states during rest. While this is an interesting relationship, it does not establish that the resting-state brain spontaneously alternates between high and low PE states, nor that it does so in a functionally meaningful way that is related to behavior. How can we rule out brain dynamics of other processes, such as arousal, that also rise and fall with PE? I understand the authors' intention to address the reverse inference concern by testing whether "a participant's unique connectivity response to PE in the reward-processing task should match their specific patterns of resting-state fluctuation". However, I'm not fully convinced that this analysis establishes the functional role of the identified modules to PE because of the following:

      Theoretically, relating the activities of the identified modules directly to behavior would demonstrate a stronger functional role.

      (R3.2a) Across participants: Do individuals who exhibit stronger or more distinct PE-related fluctuations at rest also perform better on tasks that require prediction or inference? This could be assessed using the HCP prediction task, though if individual variability is limited (e.g., due to ceiling effects), I would suggest exploring a dataset with a prediction task that has greater behavioral variance.

      This is a good idea, but unfortunately difficult to test with our present data. The HCP gambling task used in our study was not designed to measure individual differences in prediction or inference and likely suffers from ceiling effects. Because the task outcomes are predetermined and not linked to participants' choices, there is very little meaningful behavioral variance in performance to correlate with our resting-state fluctuation measure.

      While we agree that exploring a different dataset with a more suitable task would be ideal, doing so falls outside the scope of the present manuscript. Moreover, although such results would be informative, they would still not fully resolve the reverse inference issues.

      Or even more broadly, does this variability in resting state PE state fluctuations predict general cognitive abilities like WM and attention (which the HCP dataset also provides)? I appreciate the inclusion of the win-loss control, and I can see the intention to address specificity. This would test whether PE state fluctuations reflect something about general cognition, but also above and beyond these attentional or WM processes that we know are fluctuating.

      This is a helpful suggestion, motivating new analyses: We computed each participant’s resting-state fluctuation amplitude as the average absolute difference between posterior-anterior and ventral-dorsal connectivity (i.e., the average of the TR-by-TR fMRI amplitude measure from Study 3). We then computed the Spearman correlation between this score and each of the ~200 individual difference measures provided with the HCP dataset (e.g., measures of WM performance, intelligence, or personality), correcting for multiple hypotheses using the False Discovery Rate approach.[18]
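For illustration, this screening procedure can be sketched as follows; the data are simulated, and `fdr_bh` is a plain Benjamini-Hochberg implementation rather than the exact routine we used:

```python
import numpy as np
from scipy.stats import spearmanr

def fdr_bh(pvals, alpha=0.05):
    # Benjamini-Hochberg: reject the k smallest p-values, where k is the
    # largest index with p_(k) <= (k / m) * alpha.
    p = np.asarray(pvals)
    m = p.size
    order = np.argsort(p)
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])
        reject[order[: k + 1]] = True
    return reject

# Simulated data: one fluctuation-amplitude score per participant, screened
# against a battery of behavioral measures (all numbers hypothetical).
rng = np.random.default_rng(1)
n_subj, n_measures = 1000, 200
amplitude = rng.normal(size=n_subj)            # mean |PA - VD| per person
measures = rng.normal(size=(n_measures, n_subj))
measures[0] += 0.3 * amplitude                 # plant one true association

pvals = []
for m in measures:
    _, p = spearmanr(amplitude, m)
    pvals.append(p)
survives = fdr_bh(pvals)                       # boolean mask of FDR survivors
```

With one planted association and 199 null measures, the planted measure should survive FDR correction while nearly all null measures do not, mirroring the logic of screening ~200 HCP variables.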

      We found a robust negative association with age, where older participants tend to display weaker fluctuations (r = -.16, p < .001). We additionally found a positive association with the age-adjusted score on the picture sequence task (r = .12, p<sub>corrected</sub> = .03) and a negative association with performance in the card sort task (r = -.12, p<sub>corrected</sub> = .046). It is difficult to interpret these associations without being speculative, given that fluctuation amplitude shows one positive and one negative association with performance, across entirely different tasks. We have added these correlation results as Supplemental Materials 8 (SM p. 11):

      “(8) Behavioral differences related to fluctuation amplitude 

      To investigate whether individual differences in the magnitude of resting-state PE-state fluctuations predict general cognitive abilities, we correlated our resting-state fluctuation measure with the cognitive and demographic variables provided in the HCP dataset.

      (8.1) Methods

      For each of the 1,000 participants, we calculated a single fluctuation amplitude score. This score was defined as the average absolute difference between the time-varying posterior-anterior (PA) and ventral-dorsal (VD) connectivity during the resting-state fMRI scan (the average of the TR-by-TR measure used for Study 3). We then computed the Spearman correlation between this score and each of the approximately 200 individual difference measures provided in the HCP dataset. We corrected for multiple comparisons using the False Discovery Rate (FDR) approach.

      (8.2) Results

      The correlations revealed a robust negative association between fluctuation amplitude and age, indicating that older participants tended to display weaker fluctuations (r = -.16, p<sub>corrected</sub> < .001). After correction, two significant correlations with cognitive performance emerged: (i) a positive association with the age-adjusted score on the Picture Sequence Memory Test (r = .12, p<sub>corrected</sub> = .03), (ii) a negative association with performance on the Card Sort Task (r = -.12, p<sub>corrected</sub> = .046). As greater fluctuation amplitude is linked to better performance on one task but worse performance on another, it is unclear how to interpret these findings.”

      We updated the main text Methods to direct readers to this content (p. 39-40):

      “(4.4.3) Links between network fluctuations and behavior

      We considered whether the extent of PE-related network expression states during resting-state is behaviorally relevant. We specifically investigated whether individual differences in the overall magnitude of resting-state fluctuations could predict individual difference measures, provided with the HCP dataset. This yielded a significant association with age, whereby older participants tended to display weaker fluctuations. However, associations with cognitive measures were limited. A full description of these analyses is provided in Supplemental Materials 8.”

      (R3.2b) Within participants: Do momentary increases in PE-network expression during tasks relate to better or faster prediction? In other words, is there evidence that stronger expression of PE-related states is associated with better behavioral outcomes?

      This is a good question that probes the direct behavioral relevance of these network states on a trial-by-trial basis. We agree with the reviewer's intuition; in principle, one would expect a stronger expression of the low-PE network state on trials where a participant correctly and quickly gives a high likelihood rating to a predictable stimulus.

      Following this suggestion, we performed a new analysis in Study 1A to test this. We found that network expression was indeed linked to participants’ likelihood ratings: higher likelihood ratings corresponded to stronger posterior-anterior connectivity, whereas lower ratings corresponded to stronger ventral-dorsal connectivity (Connectivity-Direction × likelihood, β [standardized] = .28, p = .02). Yet, this is not a strong test of the reviewer’s hypothesis, and exploratory analyses of response time yielded null results (p > .05). We suspect that the effect is too subtle for us to have sufficient statistical power. A comparable analysis was not feasible for Study 1B, as its design does not provide an analogous behavioral measure of trial-by-trial prediction success.

      (R3.3) A priori Hypothesis for EEG Frequency Analysis.

      It's unclear how to interpret the finding that fMRI fluctuations in the defined modules correlate with frontal Delta/Theta power, specifically in the 3-6 Hz range. However, in the EEG literature, this frequency band is most commonly associated with low arousal, drowsiness, and mind wandering in resting, awake adults, not uniquely with prediction error processing. An a priori hypothesis is lacking here: what specific frequency band would we expect to track spontaneous PE signals at rest, and why? Without this, it is difficult to separate a PE-based interpretation from more general arousal or vigilance fluctuations.

      This point gets to the heart of the challenge with reverse inference in resting-state fMRI. We agree that an interpretation based on general arousal or drowsiness is a potential alternative that must be considered. However, what makes a simple arousal interpretation challenging is the highly specific nature of our fMRI-EEG association. As shown in our confirmatory analyses (Supplemental Materials 6), the correlation with 3-6 Hz power was found exclusively with the absolute difference between our two PE-related network states (|PA – VD|)—a measure of fluctuation amplitude. We found no significant relationship with the signed difference (a bias toward one state) or the sum (the overall level of connectivity). This specificity presents a puzzle for a simple drowsiness account; it seems less plausible that drowsiness would manifest specifically as the intensity of fluctuation between two complex cognitive networks, rather than as a more straightforward change in overall connectivity. While we cannot definitively rule out contributions from arousal, the specificity of our finding provides stronger evidence for a structured cognitive process, like PE, than for a general, undifferentiated state. 

      We updated the Discussion to make the argument above and also to remind readers that alternative explanations, such as ones based on drowsiness, are possible (p. 24):

      “We specifically interpret the fMRI-EEG correlation as reflecting fluctuation speed because we correlated EEG oscillatory power with the fluctuation amplitude computed from fMRI data. Simply correlating EEG power with the average connectivity or the signed difference between posterior-anterior and ventral-dorsal connectivity yields null results (Supplemental Materials 6), suggesting that this is a very particular association, and viewing it as capturing fluctuation amplitude provides a parsimonious explanation. Yet, this correlation may be interpreted in other ways. For example, resting-state Theta is also a signature of drowsiness,[2] which may correlate with PE processing; the observed association may thus partly reflect a distinct mechanism.”

      (R3.4) Significance Assessment

      The significance of the correlation above and all other correlation analyses should be assessed through a permutation test rather than a single parametric t-test against zero. There are a few reasons: a) EEG and fMRI time series are autocorrelated, violating the independence assumption of parametric tests;

      Standard t-tests can underestimate the true null distribution's variance, because EEG-fMRI correlations often involve shared slow drifts or noise sources, which can yield spurious correlations and inflate false positives unless tested against an appropriate null.

      Building a null distribution that preserves the slow drifts, for example, would help us understand how likely it is for the two time series to be correlated when the slow drifts are still present, and how much better the current correlation is, compared to this more conservative null. You can perform this by phase randomizing one of the two time courses N times (e.g., N=1000), which maintains the autocorrelation structure while breaking any true co-occurrence in patterns between the two time series, and compute a non-parametric p-value. I suggest using this approach in all correlation analyses between two time series.

      This is an important statistical point to clarify, and the suggested analysis is valuable. The reviewer is correct that the raw fMRI and EEG time series are autocorrelated. However, because our statistical approach is a two-level analysis, we reasoned that non-independence at the correlation level would not invalidate the higher-level t-test. The t-test’s assumption of independence applies to the individual participants' coefficients, which are independent across participants. Thus, we believe that our initial approach is broadly appropriate, and its simplicity allows it to be easily communicated.

      Nonetheless, the permutation-testing procedure that the Reviewer describes seems like an important analysis to test, given that permutation-testing is the gold standard for evaluating statistical significance, and it could guarantee that our above logic is correct. We thus computed the analysis as the reviewer described. For each participant, we phase-randomized the fMRI fluctuation amplitude time series. Specifically, we randomized the Fourier phases of the |PA–VD| series (within run), while retaining the original amplitude spectrum; inverse transforms yielded real surrogates with the same power spectrum. This was done for each participant once per permutation. Each participant’s phase-randomized data was submitted to the analysis of each oscillatory power band as originally, generating one mean correlation for each band. This was done 1,000 times.
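A minimal sketch of the surrogate-generation and max-statistic steps (the series and band powers are simulated; the real analysis operates within run and aggregates across participants):

```python
import numpy as np
from scipy.stats import spearmanr

def phase_randomize(x, rng):
    # Randomize Fourier phases while keeping the amplitude spectrum, yielding
    # a real surrogate with the same power spectrum (and hence the same
    # autocorrelation structure) as the original series.
    n = len(x)
    spec = np.fft.rfft(x)
    phases = rng.uniform(0, 2 * np.pi, size=spec.size)
    new_spec = np.abs(spec) * np.exp(1j * phases)
    new_spec[0] = spec[0]            # keep the DC (mean) component
    if n % 2 == 0:
        new_spec[-1] = spec[-1]      # the Nyquist bin must stay real
    return np.fft.irfft(new_spec, n=n)

rng = np.random.default_rng(2)
amplitude = rng.normal(size=400)           # stand-in |PA - VD| series
band_power = rng.normal(size=(5, 400))     # stand-in power for five bands

null_max = []
for _ in range(200):                       # the paper uses 1,000 permutations
    surrogate = phase_randomize(amplitude, rng)
    rs = []
    for band in band_power:
        r, _ = spearmanr(surrogate, band)
        rs.append(r)
    null_max.append(max(rs))               # max across bands -> FWE null
critical = np.quantile(null_max, 0.95)     # FWE-corrected critical value
```

Because the surrogate preserves the power spectrum, slow drifts and autocorrelation survive into the null distribution; only the temporal alignment with the EEG series is destroyed, which is exactly the conservative null the reviewer requested.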

      Across the five bands, the grand-mean null correlation is near zero (M<sub>r</sub> = .0006), and the 97.5<sup>th</sup> percentile critical value of the null distribution is r ≈ .025; this percentile corresponds to the upper end of a 95% confidence interval for a band’s correlation, and the threshold differs minimally across bands (.024 < rs < .026). Our original correlation coefficients for Delta (M<sub>r</sub> = .042) and Theta (M<sub>r</sub> = .041), on which our conclusions focused, remained significant (p ≤ .002). We can additionally perform family-wise error-rate correction by taking the highest correlation across any band for a given permutation, as Reviewer comment R1.4c previously requested; the Delta and Theta effects remain significant under this correction (p<sub>FWE-corrected</sub> ≤ .003).

      These correlations were previously reported in Table 1, and we updated the caption to note what effects remain significant when evaluated using permutation-testing and with family-wise error correction (p. 19):

      “The effects for Delta, Theta, Beta, and Gamma remain significant if significance testing is instead performed using permutation-testing and with family-wise error rate correction (p<sub>corrected</sub> < .05).”

      We updated the Methods to describe the permutation-testing analysis (p. 43):

      “To confirm the significance of our fMRI-EEG correlations with a non-parametric approach, we performed a group-level permutation-test. For each of 1,000 permutations, we phase-randomized the fMRI fluctuation amplitude time series. Specifically, we randomized the Fourier phases of the |PA–VD| series (within run), while retaining the original amplitude spectrum; inverse transforms yielded real surrogates with the same power spectrum. This procedure breaks the true temporal relationship between the fMRI and EEG data while preserving its structure. We then re-computed the mean Spearman correlation for each frequency band using this phase-randomized data. We evaluated significance using a family-wise error correction approach that accounts for us analyzing five oscillatory power bands. We thus create a null distribution composed of the maximum correlation value observed across all frequency bands from each permutation. Our observed correlations were then tested for significance against this distribution of maximums.”

      (R3.5) Analysis choices

      If I'm understanding correctly, the algorithm used to identify modules does so by assigning nodes to communities, but it does not itself restrict what edges can be formed from these modules. This makes me wonder whether the decision to focus only on connections between adjacent modules, rather than considering the full connectivity, was an analytic choice by the authors. If so, could you clarify the rationale? In particular, what justifies assuming that the gradient of PE states should be captured by edges formed only between nearby modules (as shown in Figure 2E and Figure 4), rather than by the full connectivity matrix? If this restriction is instead a by-product of the algorithm, please explain why this outcome is appropriate for detecting a global signature of PE states in both task and rest.

      We discuss this matter in our response to comment R2.(4).

      When assessing the correspondence across task-fMRI and rs-fMRI in section 2.2.2, why was the pattern during task calculated from selecting a pair of bilateral ROIs (resulting in a group of eight ROIs), and the resting state pattern calculated from posterior-anterior/ventral-dorsal fluctuation modules? Doesn't it make more sense to align the two measures? For example, calculating task effects on these same modules during task and rest?

      We thank the reviewer for this question, as it highlights a point in our methods that we could have explained more clearly. The reviewer is correct that the two measures must be aligned, and we can confirm that they were indeed perfectly matched.

      For the analysis in Section 2.2.2, both the task and resting-state measures were calculated on the exact same anatomical substrate for each comparison. The analysis iteratively selected a symmetrical subset of eight ROIs from our four larger quadrants. For each of these 3,432 iterations, we computed the task-fMRI PE effect (the Connectivity Direction × PE interaction) and the resting-state fluctuation amplitude (E[|PA – VD|]) using the identical set of eight ROIs. The goal of this analysis was precisely to test if the fine-grained anatomical pattern of these effects correlated within an individual across the task and rest states. We will revise the text in Section 2.2.2 to make this direct alignment of the two measures more explicit.

      Recommendations for authors:

      Reviewer #1 (Recommendations for authors):

      (R1.3) Several prior studies have described co-activation or connectivity "templates" that spontaneously alternate during rest and task states, and are linked to behavioral variability. While they are interpreted differently in terms of cognitive function (e.g., in terms of sustained attention: Monica Rosenberg; alertness: Catie Chang), the relationship between these previously reported templates and those identified in the current study warrants discussion. Are the current templates spatially compatible with prior findings while offering new functional interpretations beyond those already proposed in the literature? Or do they represent spatially novel patterns?

      Thank you for this suggestion. Broadly, we do not mean to propose spatially novel patterns but rather focus on how these are repurposed for PE processing. In the Discussion, we link our identified connectivity states to established networks (e.g., the FPCN). We updated this paragraph to mention that these patterns are largely not spatially novel (p. 20):

      “The connectivity patterns put forth are, for the most part, not spatially novel and instead overlap heavily with prior functional and anatomical findings.”

      Regarding the specific networks covered in the prior work by Rosenberg and Chang that the reviewer appears to be referring to,[7,8] this research has emphasized networks anchored heavily in sensorimotor, subcortical–cerebellar, and medial frontal circuits, which mostly do not overlap with the connectivity effects we put forth.

      (R1.4) Additional points:

      (R1.4a) I do not think that the logic for taking the absolute difference of fMRI connectivity is convincing. What happens if the sign of the difference is maintained ?

      Thank you for pointing out this area that requires clarification. Our analysis targets the amplitude of the fluctuation between brain states, not the direction. We define high fluctuation amplitude as moments when the brain is strongly in either the PA state (PA > VD) or the VD state (VD > PA). The absolute difference |PA – VD| correctly quantifies this intensity, whereas a signed difference would conflate these two distinct high-amplitude moments. Our simulation study (Supplemental Materials, Section 5) provides the theoretical validation for this logic, showing how this absolute difference measure in slow fMRI data can track the amplitude of a fast underlying neural oscillator.
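A toy calculation (illustrative numbers of our own, not study data) shows why the signed difference conflates the two high-amplitude states while the absolute difference preserves their intensity:

```python
# Hypothetical PA - VD connectivity at two equally intense moments:
# strongly PA-dominant (+0.8), then strongly VD-dominant (-0.8).
pa_minus_vd = [0.8, -0.8]

signed_mean = sum(pa_minus_vd) / 2                       # the two moments cancel to 0.0
amplitude_mean = sum(abs(d) for d in pa_minus_vd) / 2    # the 0.8 intensity is preserved
```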

      When the analysis is instead performed on the signed difference, as suggested by the Reviewer, the association between the fMRI data and EEG power is not significant for any power band (ps<sub>uncorrected</sub> ≥ .47). We updated Supplemental Materials 6 to include these results. Previously, this section included the fluctuation amplitude (fMRI) × EEG power results while controlling for: (i) the signed difference between posterior-anterior and ventral-dorsal connectivity, (ii) the sum of posterior-anterior and ventral-dorsal connectivity, and (iii) the absolute value of the sum of posterior-anterior and ventral-dorsal connectivity. For completeness, we also now report the correlation between each EEG power band and each of those other three measures (SM, p. 9):

      “We additionally tested the relationship between each of those three measures and the five EEG oscillation bands. Across the 15 tests, no associations survived correction (ps<sub>uncorrected</sub> ≥ .04); the lowest uncorrected p-value was p = .044, which is expected by chance given 15 tests. Thus, the association between EEG oscillations and the fMRI measure is specific to the absolute difference (i.e., amplitude) measure.”

      (R1.4b) Reasoning of focus on frontal and theta band is weak, and described as "typical" (line 359) based on a single study.

      Sorry about this. There is a rich literature on the link between frontal theta and prediction error,[3,9–11] and we updated the Introduction to include more references to this work (p. 18): “The analysis was first done using power averaged across frontal electrodes, as these are the typical focus of PE research on oscillations.[3,9–11]”

      We have also updated the Methods to cite more studies that motivate our electrode choice (p. 41): “The analyses first targeted five midline frontal electrodes (F3, F1, Fz, F2, F4; BioSemi64 layout), given that this frontal row is typically the focus of executive-function PE research on oscillations.[9–11]”

      (R1.4c) No correction appears to have been applied for the association between EEG power and fMRI connectivity. Given that 100 frequency bins were collapsed into 5 canonical bands, a correction for 5 comparisons seems appropriate. Notably, the strongest effects in the delta and theta bands (particularly at fronto-central electrodes) may still survive correction, but this should be explicitly tested and reported.

      Thanks for this suggestion. We updated the Table 1 caption to mention which results survive family-wise error rate correction. As the reviewer suggests, the Delta/Theta effects would survive Bonferroni correction for five tests, although per a later comment suggesting that we evaluate statistical significance with a permutation-testing approach (comment R3.4), we instead report family-wise error correction based on the permutation null distribution. The revised caption is as follows (p. 19):

      “The effects for Delta, Theta, Beta, and Gamma remain significant if significance testing is instead performed using permutation-testing and with family-wise error rate correction (p<sub>corrected</sub> < .05).”

      (R1.4d) Line 135. Not sure I understand what you mean by "moods". What is the overall point here?

      The overall argument is that the fluctuations occur rapidly rather than slowly. By slow “moods” we refer to how a participant could enter a high anxiety state of >10 seconds, linked to high PE fluctuations, and then shift into a low anxiety state, linked to low PE fluctuations. We argue that this is not occurring. Regardless, we recognize that referring to lengths of time as short as 10 seconds or so is not a typical use of the word “mood” and is potentially ambiguous, so we have omitted this statement, which was originally on page 6: “Identifying subsecond fluctuations would broaden the relevance of the present results, as they rule out that the PE states derive from various moods.”

      (R1.4e) Line 100. "Few prior PE studies have targeted PE, contrasting the hundreds that have targeted BOLD". I don't understand this sentence. It's presumably about connectivity vs activity?

      Yes, sorry about this typo. The reviewer is correct, and that sentence was meant to mention connectivity. We corrected (p. 5): “Few prior PE studies have targeted connectivity, contrasting the hundreds that have targeted BOLD.”

      (R1.4f) Line 373: "0-0.5Hz" in the caption is probably "0-50Hz".

      Yes, this was another typo, thank you. We have corrected it (p. 19): “… every 0.5 Hz interval from 0-50 Hz.”

      Reviewer #2 (Recommendations for authors):

      (R2.6) (Page 3) When referring to the "limited" hypothesis of local PE, please clarify in what sense is it limited. That statement is unclear.

      Thank you for pointing out this text, which we now see is ambiguous. We originally used "limited" to refer to the hypothesis's constrained scope – namely, that PE is relevant to various low-level operations (e.g., sensory processing or rewards) but that the minimization of PE does not guide more abstract cognitive processes. We edited this part of the Introduction to be clearer (p. 3):

      “It is generally agreed that the brain uses PE mechanisms at neuronal or regional levels,[15,16] and this idea has been useful in various low-level functional domains, including early vision [15] and dopaminergic reward processing.[17] Some theorists have further argued that PE propagates through perceptual pathways and can elicit downstream cognitive processes to minimize PE.”

      (R2.7) (Page 5) "Few prior PE have targeted PE"... this statement appears contradictory. Please clarify.

      Sorry about this typo, which we have corrected (p. 5):

      “Few prior PE studies have targeted connectivity, contrasting the hundreds that have targeted BOLD.”

      (R2.8) What happened to the data of the medium PE condition in Study 1A?

      The medium PE condition data were not excluded. We modeled the effect of prediction error on connectivity using a linear regression across the three conditions, coding them as a continuous variable (Low = -1, Medium = 0, High = +1). This approach allowed us to identify brain connections that showed a linear increase or decrease in strength as a function of increasing PE. This linear contrast is a more specific and powerful way to isolate PE-related effects than a High vs. Low contrast. We updated the Results slightly to make this clearer (p. 8-9):

      “In the fMRI data, we compared the three PE conditions’ beta-series functional connectivity, aiming to identify network-level signatures of PE processing, from low to high. […] For the modularity analysis, we first defined a connectome matrix of beta values, wherein each edge’s value was the slope of a regression predicting that edge’s strength from PE (coded as Low = -1, Medium = 0, High = +1; Figure 2A).”
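The linear contrast described above amounts to fitting, for each edge, a regression of connectivity strength on the PE code. A minimal sketch (hypothetical connectivity values; not the authors' pipeline):

```python
import numpy as np

pe_codes = np.array([-1, 0, 1])   # Low, Medium, High PE conditions

def edge_pe_slope(edge_strengths):
    """Slope of a least-squares regression of edge strength on the PE code."""
    slope, _intercept = np.polyfit(pe_codes, edge_strengths, deg=1)
    return slope

# Hypothetical strengths for one edge across the three conditions
slope = edge_pe_slope(np.array([0.10, 0.25, 0.40]))   # increases linearly with PE
```

With the equally spaced codes -1, 0, +1, the slope equals (High − Low)/2, but the Medium condition still constrains the fit's error term and the linearity of the trend, which a plain High vs. Low contrast would discard.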

      (R2.9) (Page 15) The point about how the dots in 6H follow those in 6J better than those in 6I is a little subjective - can the authors provide an objective measure?

      Thank you for pointing out this issue. The visual comparison using Figure 6 was not meant as a formal analysis but rather to provide intuition. However, as the reviewer describes, this is difficult to convey. Our formal analysis is provided in Supplemental Materials 5, where we report correlation coefficients between a very large number of simulated fMRI data points and EEG data points corresponding to different frequencies. We updated this part of the Results to convey this (p. 16-17):

      “Notice how the dots in Figure 6H follow the dots in Figure 6J (3 Hz) better than the dots in Figure 6I (0.5 Hz) or Figure 6K (10 Hz); this visual comparison is intended for illustrative purposes only, and quantitative analyses are provided in Supplemental Materials 5.”

      References

      (1) Zalesky, A., Fornito, A. & Bullmore, E. T. Network-based statistic: identifying differences in brain networks. NeuroImage 53, 1197–1207 (2010).

      (2) Strijkstra, A. M., Beersma, D. G., Drayer, B., Halbesma, N. & Daan, S. Subjective sleepiness correlates negatively with global alpha (8–12 Hz) and positively with central frontal theta (4–8 Hz) frequencies in the human resting awake electroencephalogram. Neuroscience Letters 340, 17–20 (2003).

      (3) Cavanagh, J. F. & Frank, M. J. Frontal theta as a mechanism for cognitive control. Trends in Cognitive Sciences 18, 414–421 (2014).

      (4) Grech, R. et al. Review on solving the inverse problem in EEG source analysis. Journal of NeuroEngineering and Rehabilitation 5, 25 (2008).

      (5) Palva, J. M. et al. Ghost interactions in MEG/EEG source space: A note of caution on inter-areal coupling measures. NeuroImage 173, 632–643 (2018).

      (6) Koles, Z. J. Trends in EEG source localization. Electroencephalography and Clinical Neurophysiology 106, 127–137 (1998).

      (7) Rosenberg, M. D. et al. A neuromarker of sustained attention from whole-brain functional connectivity. Nature Neuroscience 19, 165–171 (2016).

      (8) Goodale, S. E. et al. fMRI-based detection of alertness predicts behavioral response variability. eLife 10, e62376 (2021).

      (9) Cavanagh, J. F. Cortical delta activity reflects reward prediction error and related behavioral adjustments, but at different times. NeuroImage 110, 205–216 (2015).

      (10) Hoy, C. W., Steiner, S. C. & Knight, R. T. Single-trial modeling separates multiple overlapping prediction errors during reward processing in human EEG. Communications Biology 4, 910 (2021).

      (11) Neo, P. S.-H., Shadli, S. M., McNaughton, N. & Sellbom, M. Midfrontal theta reactivity to conflict and error are linked to externalizing and internalizing respectively. Personality Neuroscience 7, e8 (2024).

      (12) Friston, K. J. The free-energy principle: a unified brain theory? Nature Reviews Neuroscience 11, 127–138 (2010).

      (13) Feldman, H. & Friston, K. J. Attention, uncertainty, and free-energy. Frontiers in Human Neuroscience 4, 215 (2010).

      (14) Friston, K. J. et al. Active inference and epistemic value. Cognitive Neuroscience 6, 187–214 (2015).

      (15) Rao, R. P. & Ballard, D. H. Predictive coding in the visual cortex: a functional interpretation of some extraclassical receptive-field effects. Nature Neuroscience 2, 79–87 (1999).

      (16) Walsh, K. S., McGovern, D. P., Clark, A. & O’Connell, R. G. Evaluating the neurophysiological evidence for predictive processing as a model of perception. Annals of the New York Academy of Sciences 1464, 242–268 (2020).

      (17) Niv, Y. & Schoenbaum, G. Dialogues on prediction errors. Trends in Cognitive Sciences 12, 265–272 (2008).

      (18) Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B (Methodological) 57, 289–300 (1995).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review)

      Summary

      We thank the reviewer for the constructive and thoughtful evaluation of our work. We appreciate the recognition of the novelty and potential implications of our findings regarding UPR activation and proteasome activity in germ cells.

      (1) The microscopy images look saturated, for example, Figure 1a, b, etc. Is this a normal way to present fluorescent microscopy?

      The apparent saturation was not present in the original images, but likely arose from image compression during PDF generation. While the EMA granule was still apparent, in the revised submission, we will provide high-resolution TIFF files to ensure accurate representation of fluorescence intensity and will carefully optimize image display settings to avoid any saturation artifacts.

      (2) The authors should ensure that all claims regarding enrichment/lower vs. lower values have indicated statistical tests.

      We fully agree. In the revised version, we will correct any quantitative comparisons where statistical tests were not already indicated, with a clear statement of the statistical tests used, including p-values in figure legends and text.

      (a) In Figure 2f, the authors should indicate which comparison is made for this test. Is it comparing 2 vs. 6 cyst numbers?

      We acknowledge that the description was not sufficiently detailed. Indeed, the test did not compare 2 vs. 6 cyst numbers; rather, it compared the observed outcomes against all possible ways that an 8-cell cyst (or the larger cysts studied) could fragment randomly into two pieces, asking how often 6-cell cysts would arise by chance, given that they occurred in 13 of 15 observed examples. We will expand the legend and main text to clarify that a binomial test was used to determine that the proportion of cysts producing 6-cell fragments differed very significantly from chance.

      Revised text:

      “A binomial test was used to assess whether the observed frequency of 6-cell cyst products differed from random cyst breakage. Production of 6-cell cysts was strongly preferred (13/15 cysts; ****p < 0.0001).”
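A binomial test of this kind can be reproduced with `scipy.stats.binomtest`. In the sketch below, the chance probability is our own illustrative assumption (one 6-cell-containing partition, {2, 6}, among the four possible two-piece partitions of an 8-cell cyst, giving p0 = 1/4 under uniform random breakage); it is not necessarily the null model the authors used:

```python
from scipy.stats import binomtest

# 13 of 15 observed cysts produced a 6-cell fragment; the illustrative null
# probability p0 = 0.25 is our assumption, not the authors' exact model.
result = binomtest(k=13, n=15, p=0.25, alternative='greater')
# result.pvalue is far below .0001, consistent with the quoted ****p < 0.0001
```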

      (b) Figures 4d and 4e do not have a statistical test indicated.

      We will include the specific statistical test used and report the corresponding p-values directly in the figure legends.

      (3) Because the system is developmentally dynamic, the major conclusions of the work are somewhat unclear. Could the authors be more explicit about these and enumerate them more clearly in the abstract?

      We will revise the abstract to better clarify the findings of this study. We will also replace the term Visham with mouse fusome to reflect its functional and structural analogy to the Drosophila and Xenopus fusomes, making the narrative more coherent and conclusive.

      (4) The references for specific prior literature are mostly missing (lines 184-195, for example).

      We appreciate this observation of a problem that occurred inadvertently when shortening an earlier version.  We will add 3–4 relevant references to appropriately support this section.

      (5) The authors should define all acronyms when they are first used in the text (UPR, EGAD, etc).

      We will ensure that all acronyms are spelled out at first mention (e.g., Unfolded Protein Response (UPR), Endosome and Golgi-Associated Degradation (EGAD)).

      (6) The jumping between topics (EMA, into microtubule fragmentation, polarization proteins, UPR/ERAD/EGAD, GCNA, ER, balbiani body, etc) makes the narrative of the paper very difficult to follow.

      We are not jumping between topics, but following a narrative relevant to the central question of whether female mouse germ cells develop using a fusome. EMA, microtubule fragmentation, polarization proteins, ER, and the Balbiani body are all topics with a known connection to fusomes, as explained in the general introduction and in the relevant subsections. We appreciate this feedback that further explanation of these connections would be helpful. In the revised manuscript, use of the unified term mouse fusome will also help connect the narrative across sections. UPR/ERAD/EGAD are processes that have been studied in the repair and maintenance of somatic cells and in yeast meiosis. We show that the major UPR regulator Xbp1 is found in the fusome, and that the fusome and these rejuvenation pathway genes are expressed and maintained throughout oogenesis, rather than only during limited late stages as suggested in previous literature.

      (7) The heading title "Visham participates in organelle rejuvenation during meiosis" in line 241 is speculative and/or not supported. Drawing upon the extensive, highly rigorous Drosophila literature, it is safe to extrapolate, but the claim about regeneration is not adequately supported.

      We believe this statement is accurate given the broad scope of the term "participates." It is supported by localization of the UPR regulator Xbp1 to the fusome; Xbp1 is the ortholog of Hac1, a key gene mediating UPR-mediated rejuvenation during yeast meiosis. We also showed that rejuvenation pathway genes are expressed throughout most of meiosis (not previously known) and expanded the cytological evidence of stage-specific organelle rejuvenation later in meiosis, such as mitochondrial-ER docking, in regions enriched in fusome antigens. However, we recognize the current limitations of this evidence in the mouse and want to convey this appropriately, without going to what we believe would be the unjustified extreme of saying there is no evidence.

      Reviewer #2 (Public review):

      We thank the reviewer for the comprehensive summary and for highlighting both the technical achievement and biological relevance of our study. We greatly appreciate the thoughtful suggestions that have helped us refine our presentation and terminology.

      (1) Some titles contain strong terms that do not fully match the conclusions of the corresponding sections.

      (1a) Article title “Mouse germline cysts contain a fusome-like structure that mediates oocyte development”

      We will change the statement to: “Mouse germline cysts contain a fusome that supports germline cyst polarity and rejuvenation.”

      (1b) Result title “Visham overlaps centrosomes and moves on microtubules”

      We acknowledge that “moves” implies dynamics. We will include additional supplementary images showing small vesicular components of the mouse fusome on spindle-derived microtubule tracks.

      (1c) Result title “Visham associates with Golgi genes involved in UPR beginning at the onset of cyst formation”

      We will revise this title to: “The mouse fusome associates with the UPR regulatory protein Xbp1 beginning at the onset of cyst formation” to reflect the specific UPR protein that was immunolocalized.

      (1d) Result title “Visham participates in organelle rejuvenation during meiosis”

      We will revise this to: “The mouse fusome persists during organelle rejuvenation in meiosis.”

      (2) The authors aim to demonstrate that Visham is a fusome-like structure. I would suggest simply referring to it as a "fusome-like structure" rather than introducing a new term, which may confuse readers and does not necessarily help the authors' goal of showing the conservation of this structure in Drosophila and Xenopus germ cells. Interestingly, in a preprint from the same laboratory describing a similar structure in Xenopus germ cells, the authors refer to it as a "fusome-like structure (FLS)" (Davidian and Spradling, BioRxiv, 2025).

      We appreciate the reviewer’s insightful comment. To maintain conceptual clarity and align with existing literature, we will refer to the structure as the mouse fusome throughout the manuscript, avoiding introduction of a new term.

      Reviewer #3 (Public review):

      We thank the reviewer for emphasizing the importance of our study and for providing constructive feedback that will help us clarify and strengthen our conclusions.

      (1) Line 86 - the heading for this section is "PGCs contain a Golgi-rich structure known as the EMA granule"

      We agree that the enrichment of Golgi within the EMA granule of PGCs was not shown until the next section. We will revise this heading to:

      “PGCs contain an asymmetric EMA granule.” 

      (2) Line 105-106, how do we know if what's seen by EM corresponds to the EMA1 granule?

      We will clarify that this identification is based on co-localization with Golgi markers (GM130 and GS28) and response to Brefeldin A treatment, which will be included as supplementary data. These findings support that the mouse fusome is Golgi-derived and can therefore be visualized by EM. The Golgi regions in E13.5 cyst cells move close together and associate with ring canals as visualized by EM (Figure 1E), the same as the mouse fusomes identified by EMA.

      (3) Line 106-107-states "Visham co-stained with the Golgi protein Gm130 and the recycling endosomal protein Rab11a1". This is not convincing as there is only one example of each image, and both appear to be distorted.

      Space is at a premium in these figures, but ample data document this clear co-localization. We will replace the existing images with high-resolution, uncompressed versions for the final figures to clearly illustrate the co-staining patterns for GM130 and Rab11a1.

      (4) Line 132-133---while visham formation is disrupted when microtubules are disrupted, I am not convinced that visham moves on microtubules as stated in the heading of this section.

      We will include additional supplementary data showing small mouse fusome vesicles aligned along microtubules.

      (5) Line 156 - the heading for this section states that Visham associates with polarity and microtubule genes, including pard3, but only evidence for pard3 is presented.

      We agree and will revise the heading to: “Mouse fusome associates with the polarity protein Pard3.” We are adding data showing association of small fusome vesicles on microtubules.

      (6) Lines 196-210 - it's strange to say that UPR genes depend on DAZ, as they are upregulated in the mutants. I think there are important observations here, but it's unclear what is being concluded.

      UPR genes are not upregulated in Dazl mutants in the sense that we never documented them increasing. We show that UPR genes during this period behave like pluripotency genes and normally decline, but in Dazl mutants their decline is slowed. We will rephrase the paragraph to clarify that Dazl mutation partially decouples developmental processes that are normally linked, which alters UPR gene expression relative to cyst development.

      (7) Line 257-259-wave 1 and 2 follicles need to be explained in the introduction, and how these fits with the observations here clarified.

      Follicle waves are too small a focus of the current study to explain in the introduction, but we will request readers to refer to the cited relevant literature (Yin and Spradling, 2025) for further details.

      We sincerely thank all reviewers for their insightful and constructive feedback. We believe that the planned revisions—particularly the refined terminology, improved image quality, clarified statistics, and restructured abstract—will substantially strengthen the manuscript and enhance clarity for readers.

      Reviewer #1 (Recommendations for the authors):

      (1) Figure 1E: need to use some immuno-gold staining to identify the Visham. Just circling an area of cytoplasm that contains ER between germ cell pairs is not enough.

      We appreciate the reviewer’s insistence that the association between the mouse fusome and the Golgi be clearly demonstrated. However, the EMA granule is a large structure discovered and defined by light microscopy, and it presents no inherent challenge to documenting its Golgi association by immunofluorescence, which we presented and have now further strengthened as described in the next paragraph. We believe that the suggested EM experiment would add little to the EM we already presented (Figure 1E, E'). Moreover, due to facility limitations, we are currently unable to perform immunogold staining.

      To strengthen the previous immunolocalization experiments, we have now included additional immunostaining data showing the clear colocalization of the fusome region with the Golgi markers GM130 and GS28 (Figure S1H). We have also incorporated a new experiment using the Golgi-specific inhibitor Brefeldin A (BFA; see Figure S1I). Treatment of in vitro–cultured gonads with BFA disrupted EMA granule formation, demonstrating that EMA granules not only associate with the Golgi but require Golgi function to be maintained.

      Additionally, in Figure 2, we showed that the fusome overlaps with the peri-centriolar region, a characteristic locus for the Golgi owing to its movement on microtubules. We showed that the dynamic behavior of the fusome during the cell cycle parallels Golgi dispersal and reassembly, and all these facts provide further strong support for the Golgi association of the EMA granule and fusome.

      (2) Figure 1F: is this image compressed?

      We have now substituted the image in Figure 1F with a better image and have avoided the compression of the image. 

      (3) In the figure legends, are the sample sizes individual animals or individual sections? Please ensure that all figure legends for each figure panel consistently contain the sample size.

      We have now included the number of measurements (N) in every figure legend. Each experiment was performed using samples from at least three different animals, and in most cases from more than three. This information has also been added to the Methods section under Statistics. In addition, N values are now consistently provided for each graph throughout the figures.

      Figure 2b/c: it seems likely, based on the snapshots of different stages of cytokinesis, that the "newly formed" visham is accurate, but without live imaging, this claim of "newly formed" is putative/speculative. It is OK if it is labeled as "putative" in the figure panel.

      The behavior of the Drosophila fusome during mitosis was deduced without live imaging (de Cuevas et al., 1998). We clarified that the conversion of a single mouse germ cell with one round fusome into an interconnected pair of cells with two round fusomes of greater total volume following mitosis is the basis for deducing that new fusome formation occurs each cell cycle. However, we agree with the reviewer that the phrase "newly formed" in the original label on Figure 2c suggested a specific mechanism of fusome increase that was not intended, and this phrase has been removed entirely.

      (5) Figure 2e/e is extremely difficult to follow. In order to improve the readability of these figure panels, can individual panels with a single stain be shown? The 'gap' between YFP+ sister cells is not immediately obvious in panel e or e" with the current layout. Since this is a key aspect of the author's claim about cleavage of the cyst, it would be best to make this claim more robust by showing more convincing images. In Figure 2E, the staining pattern of EMA needs to be clarified and described more fully in the text.

      We mapped discontinuities in the microtubule connections, not in the fusome or YFP. YFP is the lineage marker indicating that the cells of a single cyst are being studied. Consequently, no gap in YFP cytoplasmic expression is expected, because only in the last example (Figure 2E'') has fragmentation already occurred (and there, a YFP gap is present). The acetylated tubulin gap precedes fragmentation. The mitotic spindle remnants labeled by AcTub link the cells into two groups separated by a gap, which is clearly shown in the data images and in the third column, where only the relevant AcTub from the cyst itself is shown. In response to the reviewer's question about the fusome, which is not directly relevant to fragmentation, we have now provided images of the separate fusome channel and corresponding measurements for all three Figure 2E-E'' cysts in supplementary Figure S4H. We have improved the text regarding this important figure to make it easier to follow, and also added a new example of a 10-cell cyst in S2H (lower panels). We also added movies allowing full 3D study of one of the 8-cell cysts and the new 10-cell cyst. We also suggest that the reviewer examine how the deduced mechanism of fragmentation explains previously published but not fully understood data on cyst fragmentation going back to 1998, as described in the expanded Discussion of this topic.

      (6) It would be best to support the proposed model in Figure 2G (4+4+4) with microscopy images of a 12-cell or 16-cell cyst? Would these 12-cell or 16-cell cysts be too large to technically recover in a section?

      Unfortunately, the reviewer's concern that 12- or 16-cell cysts would be too large to recover and present convincingly is correct. Because our analysis depends on capturing lineage-labeled cysts specifically at telophase with acetylated-tubulin connections, the likelihood of obtaining the correct stage is very low. In addition, the dense packing of germ cells in the mouse gonad further limits our ability to fully reconstruct all the cells in large cysts, with the difficulty increasing as cyst size grows.

      However, as noted, we added a well-resolved 10-cell cyst—the largest size we could confidently analyze—in a 3D video in Supplementary Figure S2H (lower panel), which shows a 6 + 4 breakage pattern.

      (7) We did not find a reference in the text for Figure 2G.

      We have now provided a reference to Figure 2G in the text and in the Discussion section.

      (8) Line 189: ERAD is used as an acronym, but is not defined until the discussion.

      We have now provided the full form of the acronym at its first usage in the text.

      (9) Fig 3i/i': the increase of UPR pathway components, increasing expression during zygotene, is interesting to note, but is not commented enough in the text of the paper.

      We have discussed this issue in the Discussion section with specific reference to Figure 3I. Please find the detailed discussion under the heading “Germ cell rejuvenation is highly active during cyst formation.”

      (10) Please quantify DNMT3A expression levels in WT control vs Dazl KO germ cells in Figure 4a.

      We have now quantified DNMT3A expression levels in WT control vs Dazl KO germ cells and have added the data to Figure 4A.

      (11) Please introduce the rationale behind selecting DazL KO for studying cyst formation (text in line 197). This comes out of nowhere.

      True.  We significantly expanded our discussion of Dazl and citations of previous work, including evidence that it can affect cyst structures like ring canals, in the Introduction.  

      (12) It would be best to stain WT control vs DazL KO oogonia in Figure 4a with 5mC antibodies to support their claim that DNA methylation might be affected in the mutants.

      We respectfully disagree that this additional experiment is necessary within the scope of the current study. At the developmental stage examined (E12.5), germ cells in the Dazl mutant are clearly in an arrested and hypomethylated state, as supported by previous evidence (Haston et al. 2009). This initial experiment was designed to show that in our hands Dazl mutants exhibit this known pluripotency delay. However, the effects of Dazl mutation on female germline cyst development as it relates to polarity or the fusome have not been studied before, and that is what the paper addresses, building on previous work.

      Because our study does not focus on germ-cell epigenetic modifications but rather on the consequences of Dazl loss on germ cell cyst development, adding 5mC immunostaining would not substantially advance the main conclusions. The existing data and previous published work already provide sufficient background.

      (13) Figure 4c: a very interesting figure, it would be best to quantify developmental pseudotime (perhaps using monocle3 analysis) and compare more rigorously the developmental stage of WT control vs DazL KO.

      Developmental pseudotime analysis, such as with Monocle3, can sometimes be valuable but involves assumptions that, when possible, are better addressed by direct experimental examination. Our conclusions regarding cyst developmental stage are supported by straightforward evidence to which computational trajectory inference would add little. Specifically, we have performed analyses of germ-cell methylation state, ring canal formation, pluripotency markers, UPR pathway activity (Xbp1 and proteomic assays), Golgi stress, and Pard3, which collectively document the developmental status of the WT and Dazl KO germ cells. These empirical data demonstrate the same developmental pattern reflected in Figure 4c, making the less reliable pseudotime-based computational method superfluous.

      (14) Figure 4d has two panels labeled as "d".

      We have now corrected the labelling of the figure.

      (15) Color coding in 4d, d', d" is confusing; please harmonize some visual presentation here.

      We have now harmonized the visual presentation of all the graphs in Figure 4.

      (16) Fig 4e' is labeled as DazL +/- but is this really a typo?

      Thank you for pointing it out. We have now corrected the typo.

      (17) Figure F': typo labeled as E3.5, which is E13.5?

      Thank you for pointing it out. We have now corrected the typo.

      (18) Figure F': was DazL KO mutant but no WT control.

      The WT control was not provided to avoid redundancy. Please refer to Figures 3A-B, S3C-D, and Videos S3A and S3B for the WT control at every stage.

      (19) Figure G: unusual choice in punctuation marks for cartoon schematic. No key to guide the reader for color-coded structures would be helpful to have something similar to 4h.

      We have now provided a key to guide readers in Figure 4G.

      (20) The authors use WGA and EMA as interchangeable markers (Figure 5a) without fully explaining why they have switched markers.

      Because it is germ cell specific, we used EMA as a fusome marker during the period when it is present, up through E13.5. After that point we used WGA, which still labels the fusome but also labels somatic cells. This rationale is explicitly described at the end of the section “Fusome is highly enriched in Golgi and vesicles”, where we state:

      “EMA staining disappears from germ cells at E14.5 (Figure 1I). However, very similar (but non–germ-cell-specific) staining continued with wheat germ agglutinin (WGA) at later stages (Figure 1G, G’; Figure S1G).”

      To ensure this is fully clear to readers, we have now added an additional statement at the start of the text section discussing Figure 5:

      “For the reasons explained previously (see text for Figure 1G), WGA was used as a fusome marker beyond stage E14.5.”

      (21) Figure 5b' is compressed.

      We have now decompressed the image.

      (22) Line 267, Balbiani body is misspelled.  

      We have now corrected the spelling.

      (23) The explanation of why the authors switch focus from DazL KO to DazL +/- is not adequately described. The authors should also explain the phenotype of the DazL +/- animals or reference a paper citing the hets are sterile or subfertile.

      We have now added an explanation of why the Dazl KO was used to our Introduction, where we describe the phenotypes of Dazl homozygous and heterozygous mice.

      (24) Is Figure 5i actually DazL +/-? It is not labeled clearly in the text, the figure legend, or the figure panel. 

      We have now labelled the panel correctly in the figure and in the legend.

      (25) The paper ends abruptly at line 275 with no context or summary.

      The manuscript does not end at line 275; the apparent interruption is due to a page break occurring immediately before the beginning of the Discussion section. We hope the continuation is fully visible in your (Reviewer 1's) version of the PDF.

      Reviewer #2 (Recommendations for the authors):

      (1) Line 93: Fig. 1B: DDX4 marks germ cells; do all the red and yellow cells in the NE inset originate from the same PGC? There are only 2 cells marked in yellow among the group of red cells. Is it a z-projection issue? Or do they come from different PGCs?

      This experiment used Vasa staining to identify all germ cells, which are produced by multiple PGCs. Green labeling is a lineage marker derived from a single PGC (due to the low frequency of tamoxifen-activated labeling). Consequently, the two yellow cells observed in the NE inset of Fig. 1B represent YFP-labeled germ cells (YFP + DDX4 double-positive) that have arisen from a single, lineage-traced PGC. This approach, introduced in 2013, is described in the Methods, and represents the field's single most important technical advance, one that has made it possible to analyze mouse germ cell development at single-cell resolution.

      To ensure clarity, we have added a brief explanatory note to the figure legend indicating that yellow cells represent the lineage-traced progeny of a single PGC, while the red staining marks all germ cells.

      (2) Line 96: Figure 1C vs 1C'. The difference between female and male Visham is not obvious, although quantification shows a clear difference. How was the quantification made? Manual or automatic thresholding? Would it be possible to show only the Visham channel?

      We thank the reviewer for pointing out this problem. We have now more clearly described in the text that the female fusome increases in some cells with close attachments to other cells (future oocytes) and decreases in distant nurse cells, and that it branches due to rosette formation. In males, the fusome remains much like the initial EMA granules present in early germ cells, with only fine and difficult-to-see connections. The quantification shown in Figures 1C and 1C′ was performed manually, based on the presence of either (i) fused, branched EMA-positive fusome structures or (ii) dispersed, punctate EMA granules. This assessment was carried out across multiple E13.5 male and female gonad samples to ensure robustness. To facilitate independent evaluation, we have already provided supplementary videos S3B1 and S3B2, which display the EMA-stained E13.5 male and female gonads in three dimensions. These videos allow the structural differences to be examined more clearly than in static images.

      In response to the reviewer’s request, we now additionally include the single-channel fusome image in Supplementary Figure S1E′. This presentation highlights the fusome signal alone and further clarifies the morphological differences underlying the quantification.

      (3) L118: Figure 2A, third row = 2-cell cyst? Please specify PCNT in the legend.

      We appreciate the reviewer’s observation. In Figure 2A (third row), the cells were not specifically labeled as a 2-cell cyst; rather, the intention was to illustrate the presence of two distinct centrosomes positioned on a fused fusome structure, a configuration we frequently observe.

      We have now updated the figure legend to explicitly define PCNT.

      (4) L169: Missing reference to S3B and video S3B1?

      We have now included the references to S3B1 and S3B2 in the text and in the legend.

      (5) L170: Please describe the graph in the Figure 3D legend.

      We have now described the graph in the legend.

      (6) L171: Would it be possible to have a close-up showing both Pard3 and Visham in a ringlike pattern related to RACGAP (RC) staining? The images are too small.

      It is difficult to capture this relationship perfectly in a two-dimensional image. The images represent the maximum close-up possible that still includes enough relevant area for the necessary conclusions. We have now provided three additional close-up images exclusively for the ring canal and Pard3 association in supplementary Figure S3C for further clarity. However, we also note that the quality of the images permits readers of the PDF to zoom in and visualize them in great detail.

      (7) L181: Wrong reference, should be 3 then 3I.

      Thank you for pointing it out, we have now corrected the reference.

      (8) L199: In Figure S4B, was DNMT3 staining quantified? Red intensity differs globally between images; use the somatic red level as a reference? Note: EMA seems higher in Dazl- vs. WT?

      We have now performed quantification of DNMT3 staining, which is presented in Figure 4A. While the red intensity (DNMT3 or EMA) can appear to differ between images, this variation can result from biological differences between tissues or minor technical variability despite using consistent microscope settings. To account for this, we normalized the staining intensity using the somatic cell signal as an internal reference, ensuring that the quantification reflects genuine differences between WT and Dazl-/- samples rather than global intensity variation.
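      As an illustration of this internal-reference normalization, a minimal sketch might look like the following (function and variable names are hypothetical, and this is not our actual analysis pipeline; per-cell mean intensities divided by the median somatic signal of the same image remove image-to-image acquisition differences):

```python
import numpy as np

def normalized_germ_intensity(germ_intensities, somatic_intensities):
    """Normalize germ-cell signal to the somatic-cell internal reference.

    Both arguments are per-cell mean fluorescence values measured from
    the same image; dividing by the somatic median removes global
    intensity variation between images.
    """
    germ = np.asarray(germ_intensities, dtype=float)
    reference = float(np.median(somatic_intensities))
    return germ / reference

# Two images with different global gain yield the same normalized
# values once each is divided by its own somatic reference.
img1 = normalized_germ_intensity([200, 240], [100, 100, 100])
img2 = normalized_germ_intensity([400, 480], [200, 200, 200])
```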

      (9) L229: Should be "proteasome."

      We have now corrected the spelling error.

      (10) L233: Quantify fragmentation of Gs28? EMA doesn't seem affected. Could you quantify both Gs28 and EMA? Images are too small.

      We thank the reviewer for this suggestion. While the current images are small, they can be examined in detail using zoom to visualize the structures clearly. As the reviewer notes, and we agree, EMA staining is not affected, as the cells are in an arrested state. This arrested state creates stress on the Golgi. The fragmentation of Gs28-labeled Golgi membranes is a classical indicator of Golgi stress, even though the fragmented membranes may remain functionally active. Our results show that Dazl deletion specifically affects the Golgi in germ cells, while the Golgi in neighboring somatic cells appears healthy. To quantify this effect, we have now included manual quantification of Golgi fragmentation in Figure 4F, assessing tissues for the presence of fragmented versus intact Golgi structures. This confirms that Golgi fragmentation is a germ cell–specific phenotype in Dazl– samples, while pre-formed EMA-positive fusomes remain unaffected, though probably in an arrested state.

      (11) L237: Figure 4F graph shows E3.5, not E13.5.

      We have now corrected the typo in the figure.

      (12) L257: Figure 5D: quantify as in 5A? overlap?

      Yes, it is an overlap, shown as two separate images with the ring canal for better clarity. We have now quantified the images and produced a combined graph for the fusome and Pard3 in the Figure 5A graph.

      (13) L261: Figure 5E-E': black arrowhead not mentioned in legend.

      We have now mentioned the black arrowhead in the legend.

      (14) L262: Figure 5C: arrowhead not mentioned in legend. Figure 5F: oocyte appears separated from nurse cells compared to 5C.

      Yes, that may happen as cysts undergo fragmentation; what matters is that all cells are lineage-labelled and hence are members of a single cyst derived from one PGC.

      (15) L263: Figure 5G has no legend reference; nurse cells are not outlined as in 5C.

      We have now outlined the nurse cells and have added the reference to the graph in the legend.

      (16) L279: "The fusome and Visham and both..." should be replaced with "Both fusome and Visham...".

      We have now replaced the term Visham with fusome as suggested by the reviewers and the editor. We updated the statement to correct the grammatical error.

      (17) L1127: Video S3B1: It is unclear what to focus on.

      We have now added a rectangle and an arrow to highlight what to focus on.

      (18) L1128: Video "S3B1" should be "S3B2."

      We have now corrected the legend.

      (19) Finally: curiosity question: have the authors tried to use known markers of the Drosophila fusome in mice, such as Spectrin or other markers described in Lighthouse, Buszczak and Spradling, Dev Bio, 2008? And conversely, do EMA and WGA label the fusome in Drosophila?

      Yes, we and others have used the most specific markers of the Drosophila fusome, such as alpha-spectrin, the adducin-like Hts, tropomodulin, etc., to search for fusomes in vertebrate species. This was unsuccessful in clarifying the situation, because Hts and alpha-spectrin in Drosophila and other insects generate a protein skeleton that stabilizes the fusome and is easily stained, but this structure is simply not conserved in vertebrates. The polarity behavior of the fusome, its core developmental property, is conserved, however. The mammalian fusome still acquires and maintains cyst polarity, and goes even further: it reflects both initial cyst formation and cyst cleavage, before marking oocyte versus nurse cell development in the smaller cysts. The inner microtubule-rich portion of the fusome, its Par proteins, and many ER-related and lysosomal fusome proteins are mostly conserved, but their ability to mark the fusome alone varies with time and context (only some of the examples are shown in Figure 3I'). Nearly all of the proteins identified in Lighthouse et al. 2008 are expressed. These proteins may be involved in rejuvenation as studied here. We modified the first section of the Discussion to explicitly compare mouse, Xenopus and Drosophila fusomes, which was not possible before this work.

      Reviewer #3 (Recommendations for the authors):

      The authors should either revise the conclusions or add additional evidence to support their claims. In addition, minor corrections are listed below.

      We have added additional evidence as noted in responses above, and revised some claims that were stated inaccurately.  In addition, we have attempted to clarify the evidence we do present, so that its full significance is more easily grasped by readers.    

      (1) Lines 20-21 are unclear - the cyst doesn't get sent into meiosis, each oocyte does.

      Research is showing that it is more complicated than that. All cyst cells enter "pre-meiotic S phase", and most cell cycles are conventionally considered to start after the previous M phase, i.e. in G1 or S, not in the next prophase; the latter is an ancient view limited just to meiosis. Absent this old tradition from meiotic cytology, pre-meiotic S would simply be called meiotic S, as some workers on meiosis do. In addition, in different species, nurse cells diverge from meiosis on different schedules, including many much later in the meiotic cycle. Two cyst cells in Drosophila fully enter meiosis by all criteria: the oocyte and one nurse cell that only exits in late zygotene. In Xenopus and mouse, scRNA-seq shows that many cyst cells enter meiosis up to leptotene and zygotene, including nurse cells that specifically downregulate meiotic genes during this time, possibly to assist their nurse cell functions, while others remain in meiosis even longer (Davidian and Spradling, 2025; Niu and Spradling, 2022). Eventually, only the oocytes within each fragmented mouse cyst complete meiosis.

      (2) Many places in the manuscript abbreviations are never defined or not defined the first time they are used (but the second or third time): Line 23-ER, Line 29-UPR, Line 33-PGC (not defined until line 45), Line 79-EGAD.

      We have now defined all acronyms upon their first occurrence.

      (3) Line 5 should be the pachytene substage of meiosis I.

      We have now updated the statement to “In pachytene stage of meiosis I…”

      (4) Line 59-61 - this statement needs a reference(s).

      These statements are a continuation from the references cited in the previous statements. However, for further clarity we have again cited the relevant reference here (Niu and Spradling, 2022).

      (5) Line 80 - should it be oocyte proteome quality control?

      We have now updated the statement to “Oocyte proteome quality control begins early”.

      (6) Line 87 - in this case, EMA does not stand for epithelial membrane antigen (AI will call it that, but it is not correct). I believe it originally was the abbrev for (Em)bryonic (a)ntigen, though some papers call it (e)mbryonic (m)ouse (a)ntigen. And the reference here is Hahnel and Eddy, 1986, but in the reference list is a different paper, 1987 (both refer to EMA-1).

      We have now updated the acronym EMA-1 to its correct form and have corrected the citation.

      (7) Line 176 - RNA seq.

      We have now updated the statement to “We performed single cell RNA sequencing (scRNA seq) of mouse gonad”.

      (8) Line 181 - Figure 4E and 4I should be 3E and 3I.

      We have now updated the figure reference in the text to the correct one.

      (9) Line 183 - missing period.

      Added.

    1. Science is often hard to read. Most people assume that its difficulties are born out of necessity, out of the extreme complexity of scientific concepts, data and analysis. We argue here that complexity of thought need not lead to impenetrability of expression; we demonstrate a number of rhetorical principles that can produce clarity in communication without oversimplifying scientific issues. The results are substantive, not merely cosmetic: Improving the quality of writing actually improves the quality of thought.

      I think it's interesting that there is this emphasis on how science becomes convoluted when the writer uses big words and fancy language, and yet I found myself having to slow down while reading this article just because of the sheer amount of large words that felt unnecessary except to add a vibe of superiority to the piece.

    1. I think that’s the thing I would recommend for the [college students] to do is to encourage the high school students to speak up, because they can have an idea that the university student has never heard or thought of.

      It's not just college students who need to encourage the younger generation to speak up and be more confident; everybody does. College students can be especially influential, since they were in the high schoolers' shoes not too long ago, but the younger generation needs encouragement from more than just college students.

    1. (2.1) Confucius compares governing with excellence to the North Star - it stays put while everything orbits around it. I see this as a claim that the best political order isn't about "policy cleverness" or coercive force; rather, it's about reorienting people through character. This raises a key question for me: is the ruler's "stillness" a literal call for minimal intervention, or a metaphor for steadfast character? I argue that it is likely the first option, as Confucius focuses heavily on how institutional stability depends on moral exemplarity. This is because Confucius believes that when a leader acts as a stable reference point, people develop internal standards rather than just reacting to external constraints. This also ties into the text's theme of striving towards being exemplary as opposed to avoiding being seen as deplorable.

    2. (6.20) Again, Confucius is a proponent of applying oneself earnestly and to their fullest extent. Everything we do has meaning to it, and it's only proper to give all the motions of life their proper respect. Truly loving something means valuing it, rather than just knowing what it is, and making full use out of that concept/thing and integrating it into your life is even better than just love.

    1. Author response:

      The following is the authors’ response to the original reviews

      eLife Assessment

      This is a valuable polymer model that provides insight into the origin of macromolecular mixed and demixed states within transcription clusters. The well-performed and clearly presented simulations will be of interest to those studying gene expression in the context of chromatin. While the study is generally solid, it could benefit from a more direct comparison with existing experimental data sets as well as further discussion of the limits of the underlying model assumptions.

      We thank the editors for their overall positive assessment. In response to the Referees’ comments, we have addressed all technical points, including a more detailed explanation of the methodology used to extract gene transcription from our simulations and its analogy with real gene transcription. Regarding the potential comparison with experimental data and our mixing–demixing transition, we have added new sections discussing the current state of the art in relevant experiments. We also clarify the present limitations that prevent direct comparisons, which we hope can be overcome with future experiments using emerging techniques.

      Reviewer #1 (Public Review):

      This manuscript discusses from a theory point of view the mechanisms underlying the formation of specialized or mixed factories. To investigate this, a chromatin polymer model was developed to mimic the chromatin binding-unbinding dynamics of various complexes of transcription factors (TFs).

      The model revealed that both specialized (i.e., demixed) and mixed clusters can emerge spontaneously, with the type of cluster formed primarily determined by cluster size. Non-specific interactions between chromatin and proteins were identified as the main factor promoting mixing, with these interactions becoming increasingly significant as clusters grow larger.

      These findings, observed in both simple polymer models and more realistic representations of human chromosomes, reconcile previously conflicting experimental results. Additionally, the introduction of different types of TFs was shown to strongly influence the emergence of transcriptional networks, offering a framework to study transcriptional changes resulting from gene editing or naturally occurring mutations.

      Overall I think this is an interesting paper discussing a valuable model of how chromosome 3D organisation is linked to transcription. I would only advise the authors to polish and shorten their text to better highlight their key findings and make it more accessible to the reader.

      We thank the Referee for carefully reading our manuscript and recognizing its scientific value. As suggested, we tried to better highlight our key findings and make the text more accessible while addressing also the comments from the other Referees.

      Reviewer #2 (Public Review):

      Summary:

      With this report, I suggest what are in my opinion crucial additions to the otherwise very interesting and credible research manuscript ”Cluster size determines morphology of transcription factories in human cells”.

      Strengths:

      The manuscript in itself is technically sound, the chosen simulation methods are completely appropriate the figures are well-prepared, the text is mostly well-written spare a few typos. The conclusions are valid and would represent a valuable conceptual contribution to the field of clustering, 3D genome organization and gene regulation related to transcription factories, which continues to be an area of most active investigation.

      Weaknesses:

      However, I find that the connection to concrete biological data is weak. This holds especially given that the data that are needed to critically assess the applicability of the derived cross-over with factory size is, in fact, available for analysis, and the suggested experiments in the Discussion section are actually done and their results can be exploited. In my judgement, unless these additional analysis are added to a level that crucial predictions on TF demixing and transcriptional bursting upon TU clustering can be tested, the paper is more fitted for a theoretical biophysics venue than for a biology journal such as eLife.

      We thank the Reviewer for their positive assessment of the soundness of our work and its contribution to the field. We have added a paragraph to the Conclusions highlighting the current state of experimental techniques and outlining near-term experiments that could be extended to test our predictions. We also emphasise that our analysis builds on state-of-the-art polymer models of chromatin and on quantitative experimental datasets, which we used both to construct the model and to validate its outcomes (gene activity). We hope this strengthened link to experiment will catalyse further studies in the field.

      Major points:

      (1) My first point concerns terminology. The Merriam-Webster dictionary describes morphology as the study of structure and form. In my understanding, none of the analyses carried out in this study actually address the form or spatial structuring of transcription factories. I see no aspects of shape, only size. Unless the authors want to assess actual shapes of clusters, I would recommend to instead talk about only their size/extent. The title is, by the same argument, in my opinion misleading as to the content of this study.

      We agree with the Referee that the title could be misleading. In our study we characterized cluster size, which is a morphological descriptor, and cluster composition, which is not morphology per se but is used in the community in a broader sense. Nevertheless, to strengthen the message we have changed the title to: “Cluster size determines internal structure of transcription factories in human cells”.

      (2) Another major conceptual point is the choice of how a single TF:pol particle in the model relates to actual macromolecules that undergo clustering in the cell. What about the fact that even single TF factories still contain numerous canonical transcription factors, many of which are also known to undergo phase separation? Mediator, CDK9, Pol II just to name a few. This alone already represents phase separation under the involvement of different species, which must undergo mixing. This is conceptually blurred with the concept of gene-specific transcription factors that are recruited into clusters/condensates due to sequence-specific or chromatin-epigenetic-specific affinities. Also, the fact that even in a canonical gene with a ”small” transcription factory there are numerous clustering factors takes even the smallest factories into a regime of several tens of clustering macromolecules. It is unclear to me how this reality of clustering and factory formation in the biological cell relates to the cross-over that occurs at approximately n=10 particles in the simulations presented in this paper.

      This is a good point. However, in our case we can look either at clustering transcription factors or at transcription units. In an experimental situation, transcription units could be “coloured”, or assigned different types, by looking at different cell types, so that they can be classified as housekeeping, cell-type independent, or cell-type specific. This is similar to how DHS can be clustered. In this way the mixing or demixing state can be identified by looking at the type of transcription unit, removing any ambiguity due to the fact that the same protein may participate in different TF complexes.
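      As a sketch of how the mixing or demixing state of a cluster could be scored once its transcription units are assigned types, one simple choice (hypothetical function names; a normalized Shannon entropy, not necessarily the measure used in the paper) is:

```python
import math
from collections import Counter

def mixing_score(cluster_tu_types):
    """Normalized Shannon entropy of TU-type composition in one cluster.

    cluster_tu_types: list of type labels (e.g. 'housekeeping',
    'cell-type-specific') for the transcription units in a cluster.
    Returns 0.0 for a fully demixed (single-type) cluster and 1.0 for
    a maximally mixed one.
    """
    counts = Counter(cluster_tu_types)
    n = len(cluster_tu_types)
    if len(counts) <= 1:
        return 0.0
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return entropy / math.log2(len(counts))

# A single-type cluster is fully demixed; an evenly split two-type
# cluster is fully mixed.
pure = mixing_score(['housekeeping'] * 5)
mixed = mixing_score(['housekeeping', 'cell-type-specific'] * 3)
```

      Plotting such a score against cluster size across many simulated or measured clusters would directly expose the size-dependent mixing–demixing transition discussed above.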

      (3) The paper falls critically short in referencing and exploiting for analysis existing literature and published data both on 3D genome organization as well as the process of cluster formation in relation to genomic elements. In terms of relevant literature, most of the relevant body of work from the following areas has not been included:

      (i) mechanisms of how the clustering of Pol II, canonical TFs, and specific TFs is aided by sequence elements and specific chromatin states

      (ii) mechanisms of TF selectivity for specific condensates and target genomic elements

      (iii) most crucially, existing highly relevant datasets that connect 3D multi-point contacts with transcription factor identity and transcriptional activity, which would allow the authors to directly test their hypotheses by analysis of existing data

      Here, especially the data under point (iii) are essential. The SPRITE method (cited but not further exploited by the authors), even in its initial form of publication, would have offered a data set to critically test the mixing vs. demixing hypothesis put forward by the authors. Specifically, the SPRITE method offers ordered data on k-mers of associated genomic elements. These can be mapped against the main TFs that associate with these genomic elements, thereby giving an account of the mixed / demixed state of these k-mer associations. Even a simple analysis sorting these associations by the number of associated genomic elements might reveal a demixing transition with increasing association size k. However, a newer version of the SPRITE method already exists, which combines the k-mer association of genomic elements with the whole transcriptome assessment of RNAs associated with a particular DNA k-mer association. This can even directly test the hypotheses the authors put forward regarding cluster size, transcriptional activation, correlation between different transcription units’ activation etc.

To continue, the Genome Architecture Mapping (GAM) method from Ana Pombo’s group has also yielded data sets that connect the long-range contacts between gene-regulatory elements to the TF motifs involved in these contacts, and even provides ready-made analyses that assess how mixed or demixed the TF composition at different interaction hubs is. I do not see why this work and data set is not even acknowledged. I also strongly suggest analyzing, or if they are already sufficiently analyzed, discussing these data in the light of 3D interaction hub size (number of interacting elements) and TF motif composition of the involved genomic elements.

      Further, a preprint from the Alistair Boettiger and Kevin Wang labs from May 2024 also provides direct, single-cell imaging data of all super-enhancers, combined with transcription detection, assessing even directly the role of number of super-enhancers in spatial proximity as a determinant of transcriptional state. This data set and findings should be discussed, not in vague terms but in detailed terms of what parts of the authors’ predictions match or do not match these data.

      For these data sets, an analysis in terms of the authors’ key predictions must be carried out (unless the underlying papers already provide such final analysis results). In answering this comment, what matters to me is not that the authors follow my suggestions to the letter. Rather, I would want to see that the wealth of available biological data and knowledge that connects to their predictions is used to their full potential in terms of rejecting, confirming, refining, or putting into real biological context the model predictions made in this study.

      References for point (iii):

      - RNA promotes the formation of spatial compartments in the nucleus https://www.cell.com/cell/fulltext/S0092-8674(21)01230-7?dgcid=raven_jbs_etoc_email

      - Complex multi-enhancer contacts captured by genome architecture mapping https://www.nature.com/articles/nature21411

      - Cell-type specialization is encoded by specific chromatin topologies https://www.nature.com/articles/s41586-021-04081-2

      - Super-enhancer interactomes from single cells link clustering and transcription https://www.biorxiv.org/content/10.1101/2024.05.08.593251v1.full

For point (i) and point (ii), the authors should go through the relevant literature on Pol II and TF clustering, how this connects to genomic features that support cluster formation, and also the recent literature on TF specificity. On the last point, TF specificity, especially the groups of Ben Sabari and Mustafa Mir have presented astonishing results that seem highly relevant to the Discussion of this manuscript.

We appreciate the Reviewer’s insightful suggestion that a comparison between our simulation results and experimental data would strengthen the robustness of our model. In response, we have thoroughly reviewed the literature on multi-way chromatin contacts, with particular attention to the SPRITE and GAM techniques. However, we found that the currently available experimental datasets lack sufficient statistical power to provide a definitive test of our simulation predictions, as detailed below.

As noted by the Reviewer, SPRITE experiments offer valuable information on the composition of high-order chromatin clusters (k-mers) that involve multiple genomic loci. A closer examination of the SPRITE data (e.g., Supplementary Material from Ref. [1]) reveals that the majority of reported statistics correspond to 3-mers (three-way contacts), while data on larger clusters (e.g., 8-mers, 9-mers, or greater) are sparse. This limitation hinders our ability to test the demixing-mixing transition predicted in our simulations, which occurs for cluster sizes exceeding 10.
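For concreteness, the tabulation underlying this statement (counting how many clusters of each order k a SPRITE-like dataset contains) can be sketched as follows; the clusters below are made-up toy data for illustration, not entries from the published SPRITE tables:

```python
from collections import Counter

# Toy illustration (invented clusters, not real SPRITE data): each
# cluster is the list of genomic loci captured together, and we count
# how many clusters of each order k the dataset contains.
clusters = [[1, 2, 3], [4, 5, 6], [1, 3, 5], [7, 8, 9, 10]]
size_counts = Counter(len(c) for c in clusters)
print(size_counts)   # Counter({3: 3, 4: 1})
```

In real data, a histogram like this is what reveals how quickly statistics thin out for k above 3.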

      Moreover, the composition of the k-mers identified by SPRITE predominantly involves genomic regions encoding functional RNAs—such as ITS1 and ITS2 (involved in rRNA synthesis) and U3 (encoding small nucleolar RNA)—which largely correspond to housekeeping genes. Conversely, there is little to no data available for protein-coding genes. This restricts direct comparison to our simulations, where the demixing-mixing transition depends critically on the interplay between housekeeping and tissue-specific genes.

      Similarly, while GAM experiments are capable of detecting multi-way chromatin contacts, the currently available datasets primarily report three-way interactions [2,3].

      In summary, due to the limited statistical data on higher-order chromatin clusters [4], a quantitative comparison between our simulation results and experimental observations is not currently feasible. Nevertheless, we have now briefly discussed the experimental techniques for detecting multi-way interactions in the revised manuscript to reflect the current state of the field, mentioning most of the references that the Reviewer suggested.

      (4) Another conceptual point that is a critical omission is the clarification that there are, in fact, known large vs. small transcription factories, or transcriptional clusters, which are specific to stem cells and ”stressed cells”. This distinction was initially established by Ibrahim Cisse’s lab (Science 2018) in mouse Embryonic Stem Cells, and also is seen in two other cases in differentiated cells in response to serum stimulus and in early embryonic development:

      - Mediator and RNA polymerase II clusters associate in transcription-dependent condensates https://www.science.org/doi/10.1126/science.aar4199

      - Nuclear actin regulates inducible transcription by enhancing RNA polymerase II clustering https://www.science.org/doi/10.1126/sciadv.aay6515

      - RNA polymerase II clusters form in line with surface condensation on regulatory chromatin https://www.embopress.org/doi/full/10.15252/msb.202110272

      - If ”morphology” should indeed be discussed, the last paper is a good starting point, especially in combination with this additional paper: Chromatin expansion microscopy reveals nanoscale organization of transcription and chromatin https://www.science.org/doi/10.1126/science.ade5308

We thank the Reviewer for pointing out the discussion of small and large clusters observed in stressed cells. Our study aims to provide a broader mechanistic explanation of how mixed and demixed TF clusters form depending on their size. However, to avoid generating confusion between our terminology and the classification already used for transcription factories in stem and stressed cells, we have now added some comments and references in the revised text.

      (5) The statement scripts are available upon request is insufficient by current FAIR standards and seems to be non-compliant with eLife requirements. At a minimum, all, and I mean all, scripts that are needed to produce the simulation outcomes and figures in the paper, must be deposited as a publicly accessible Supplement with the article. Better would be if they would be structured and sufficiently documented and then deposited in external repositories that are appropriate for the sharing of such program code and models.

      We fully agree with the Reviewer. We have now included in the main text a link to an external repository containing all the codes required to reproduce and analyze the simulations.

      Recommendations for the authors:

      Minor and technical points

      (6) Red, green, and yellow (mix of green and red) is a particularly bad choice of color code, seeing that red-green blindness is the most common color blindness. I recommend to change the color code.

We appreciate the Reviewer’s thoughtful comment regarding color accessibility. We fully agree that red–green combinations can pose challenges for color-blind readers. In our figures, however, we chose the red–green–yellow color scheme deliberately because it provides strong contrast and an intuitive representation of the different TF/TU types. To ensure accessibility, we optimized brightness and saturation within the red–green scheme and carefully verified that the chosen hues remain distinguishable under the most common forms of color vision deficiency (anomalous trichromacy), using color-blindness simulation tools (e.g., Coblis).

      How is the dispersing effect of transcriptional activation and ongoing transcription accounted for or expected to affect the model outcome? This affects both transcriptional clusters (they tend to disintegrate upon transcriptional activation) as well as the large scale organization, where dispersal by transcription is also known.

We thank the Reviewer for this very insightful question. The current versions of both our toy model and the more complex HiP-HoP model do not incorporate the effects of RNA Polymerase elongation. Our primary goal was to develop a minimalistic framework focused on investigating TF cluster formation and composition. Nevertheless, we find that this straightforward approach yields good agreement between simulations and Hi-C and GRO-seq experiments, lending confidence to the reliability of our results concerning TF cluster composition.

We fully agree, however, that the effects of transcription elongation are an interesting topic for further exploration. For example, modeling RNA Polymerases as active motors that continually drive the system out of equilibrium could influence the chromatin polymer conformation and the structure of TF clusters. Additionally, investigating how interactions between RNA molecules and nuclear proteins, such as SAF-A, might lead to significant changes in 3D chromatin organization and, consequently, transcription [5] is also an intriguing prospect. Although we do not believe that the main findings of our study, particularly regarding cluster composition and the mixed-demixed transition, would be impacted by transcription elongation effects, we recognize the importance of this aspect. As such, we have now included some comments in the Conclusions section of the revised manuscript.

      “and make the reasonable assumption that a TU bead is transcribed if it lies within 2.25 diameters (2.25σ) of a complex of the same colour; then, the transcriptional activity of each TU is given by the fraction of time that the TU and a TF:pol lie close together.” How is that justified? I do not see how this is reasonable or not, if you make that statement you must back it up.

As pointed out by the Referee, we consider a TU to be active if at least one TF is within a distance 2.25σ of that TU. This threshold is slightly larger than the TU-TF interaction cutoff distance, r<sub>c</sub> = 1.8σ. The rationale for this choice is to ensure that, in the presence of a TU cluster surrounded by TFs, TUs that are not directly in contact with a TF are still counted as active. Nonetheless, we find that using slightly different thresholds, such as 1.8σ or 1.1σ, leads to comparable results, as shown in Fig. S11, demonstrating the robustness of our analysis.
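As an illustration of this criterion, the proximity check can be sketched as follows (a minimal sketch: the function name and array shapes are ours for this example, not the production analysis code):

```python
import numpy as np

def tu_activity(tu_pos, tf_traj, sigma=1.0, cutoff=2.25):
    """Fraction of frames in which at least one same-colour TF lies
    within `cutoff*sigma` of the TU (illustrative helper).

    tu_pos  : (T, 3) TU position per frame
    tf_traj : (T, M, 3) positions of the M same-colour TFs per frame
    """
    # distance of every TF to the TU in every frame -> (T, M)
    d = np.linalg.norm(tf_traj - tu_pos[:, None, :], axis=-1)
    # a frame is active if any TF is closer than the threshold
    active = (d < cutoff * sigma).any(axis=1)
    return active.mean()

# toy check: a single TF sits near the TU in half of the frames
T = 4
tu = np.zeros((T, 3))
tf = np.full((T, 1, 3), 10.0)   # far away ...
tf[:2] = 0.5                    # ... except in the first two frames
print(tu_activity(tu, tf))      # 0.5
```

Changing `cutoff` to 1.8 or 1.1 in such a helper is how the robustness check in Fig. S11 can be reproduced.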

      Clearly, close proximity in 1D genomic space favours formation of similarly-coloured clusters. This is not surprising, it is what you built the model to do. Should not be presented as a new insight, but rather as a check that the model does what is expected.

We believe the original sentence already conveyed that the formation of single-color clusters driven by 1D genomic proximity is not a surprising outcome. However, we have now slightly rephrased it to better emphasize that this is not a novel insight.

      That said, we would like to highlight that while 1D genomic proximity facilitates the formation of clusters of the same color, the unmixed-to-mixed transition in cluster composition is not easily predictable solely from the TU color pattern. Furthermore, in simulations of real chromosomes, where TU patterns are dictated by epigenetic marks, the complexity of these patterns makes it challenging—if not impossible—to predict cluster composition based solely on the input data of our model.

      “…how closely transcriptional activities of different TUs correlate…” Please briefly state over what variable the correlation is carried out, is it cross correlation of transcription activity time courses over time? Would be nice to state here directly in the main text to make it easier for the reader.

      We have now included a brief description in the revised manuscript explaining how the transcriptional correlations were evaluated and how the correlation matrix was constructed.
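In brief, the correlations are computed between the activity time courses of each pair of TUs; a minimal sketch (array shapes and variable names are illustrative, not our production script) is:

```python
import numpy as np

# `activity` holds the 0/1 transcriptional state of each TU over time
# (n_TU rows, T frames); here we fill it with random toy data.
rng = np.random.default_rng(0)
activity = (rng.random((5, 1000)) < 0.3).astype(float)

# Pearson cross-correlation of the activity time courses: entry (i, j)
# is the correlation between the traces of TU i and TU j over time.
corr = np.corrcoef(activity)
print(corr.shape)   # (5, 5)
```

The resulting symmetric matrix is what is rendered as the heatmaps in the figures.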

      “The second concerns how expression quantitative trait loci (eQTLs) work. Current models see them doing so post-transcriptionally in highly-convoluted ways [11, 55], but we have argued that any TU can act as an eQTL directly at the transcriptional level [11].” This text does not actually explain what eQTLs do. I think it should, in concise words.

      We agree with the Referee’s suggestion. We have revised the sentence accordingly and now provide a clear explanation of eQTLs upon their first mention. The revised paragraph now reads as follows:

“The second concerns how expression quantitative trait loci (eQTLs)—genomic regions that are statistically associated with variation in gene expression levels—function. While current models often attribute their effects to post-transcriptional regulation through complex mechanisms [6,7], we have previously argued that any transcriptional unit (TU) can act as an eQTL by directly influencing gene expression at the transcriptional level [7]. Here, we observe individual TUs up-regulating or down-regulating the activity of other TUs – hallmark behaviors of eQTLs that can give rise to genetic effects such as “transgressive segregation” [8]. This phenomenon refers to cases in which alleles exhibit significantly higher or lower expression of a target gene, and can be, for instance, caused by the creation of a non-parental allele with a specific combination of QTLs with opposing effects on the target gene.”

      “In the string with 4 mutations, a yellow cluster is never seen; instead, different red clusters appear and disappear (Fig. 2Eii)…” How should it be seen? You mutated away most of the yellow beads. I think the kymograph is more informative about the general model dynamics, not the effects of mutations. Might be more appropriate to place a kymograph in Figure 1.

      We agree with the Referee that the kymograph is the most appropriate graphical representation for capturing the effects of mutations. Panel 2E already refers to the standard case shown in Figure 1. We have now clarified this both in the caption and in the main text. In addition, we have rephrased the sentence—which was indeed misleading—as follows:

      “From the activity profiles in Fig. 2C, we can observe that as the number of mutations increases, the yellow cluster is replaced by a red cluster, with the remaining yellow TUs in the region being expelled (Fig. 2B(ii)). This behavior is reflected in the dynamics, as seen by comparing panels E(i) and E(ii): in the string with four mutations, transcription of the yellow TUs is inhibited in the affected region, while prominent red stripes—corresponding to active, transcribing clusters—emerge (Fig. 2E(ii)).” We hope that the comparison is now immediately clear to the reader.

      “…but this block fragments in the string with 4 mutations…” I don’t know or cannot see what is meant by ”fragmentation” in the correlation matrix.

      With the sentence “this block fragments in the string with 4 mutations” we mean that the majority of the solid red pixels within the black box become light-red or white once the mutations are applied. We have now added a clarification of this point in the revised manuscript.

      “Fig. 3D shows the difference in correlation between the case with reduced yellow TFs and the case displayed in Fig. 1E.” Can you just place two halves of the different matrices to be compared into the same panel? Similar to Fig. S5. Will be much easier to compare.

      We thank the Referee for this suggestion. We tried to implement this modification, and report the modified figure below (Author response image 1). As we can see, in the new figure it is difficult to spot the details we refer to in the main text, therefore we prefer to keep the original version of the figure.

      Author response image 1.

      Heatmap comparing activity correlations of TUs in the random string under normal conditions (top half) and with reduced yellow-TF concentration (bottom half).

      What is the omnigenic model? It is not introduced.

We thank the Reviewer for highlighting this important point. The omnigenic model, first introduced by Boyle et al. in Ref. [6], was proposed to explain how complex traits, including disease risk, are influenced by a vast number of genes. According to this model, the genetic basis of a trait is not limited to a small set of core genes whose expression is directly related to the trait, but also includes peripheral genes. The latter, although not directly involved in controlling the trait, can influence the expression of core genes through gene regulatory networks, thereby contributing to the overall genetic influence on the trait. We have now added a few lines in the revised manuscript to explain this point.

      “Additionally, blue off-diagonal blocks indicate repeating negative correlations that reflect the period of the 6-pattern.” How does that look in a kymograph? Does this mean the 6 clusters of same color steal the TFs from the other clusters when they form?

The Referee’s intuition is indeed correct. The finite number of TFs leads to competition among TUs of the same colour, resulting in anticorrelation: when a group of six nearby TUs of a given colour is active, other, more distant TUs of the same colour are not transcribing due to the lack of available TFs. As the Referee suggested, this phenomenon is visible in the kymograph showing TU activity. In Author response image 2, it can be observed that typically there is a single TU cluster for each of the three colours (yellow, green, and red). These clusters can be long-lived (e.g., the yellow cluster at the center of the kymograph) or may dissolve during the simulation (e.g., the red cluster at the top of the kymograph, which dissolves at t ∼ 600 × 10<sup>5</sup> τ<sub>B</sub>). In the latter case, TFs of the corresponding colour are released into the system and can bind at a different location, forming a new cluster (as seen with the red cluster forming at the bottom of the kymograph for t > 600 × 10<sup>5</sup> τ<sub>B</sub>). This point is discussed further at point 2.30 of this Reply, where additional graphical material is provided.

      Author response image 2.

      Kymograph showing the TU activity during a typical run in the 6-pattern case. Each row reports the transcriptional state of a TU during one simulation. Black pixels correspond to inactive TUs, red (yellow, green) pixels correspond to active red (yellow, green) TUs.

      “Conversely, negative correlations connect distant TUs, as found in the single-color model…” But at the most distal range, the negative correlation is lost again! Why leave this out? Your correlation curves show the same , equilibration towards no correlation at very long ranges.

      As highlighted in Figure 5Ai, long-range negative correlations (grey segments) predominantly connect distant TUs of the same colour. This is quantified in Figure 5Bi: restricting to same-colour TUs shows that at large genomic separations the correlation is almost entirely negative, with small fluctuations at distances just below 3000 kbp where sampling is sparse; we therefore avoid further interpretation of this regime.

      “These results illustrate how the sequence of TUs on a string can strikingly affect formation of mixed clusters; they also provide an explanation of why activities of human TUs within genomic regions of hundreds of kbp are positively correlated [60].” This is a very nice insight.

      We thank the Reviewer for the very supportive comment.

      “To quantify the extent to which TFs of different colours share clusters, we introduce a demixing coefficient, θ<sub>dem</sub> (defined in Fig. 1).” This is not defined in Fig. 1 or anywhere else here in the main text.

We thank the Referee for pointing this out. The demixing coefficient is defined as

θ<sub>dem</sub> = (1/n) Σ<sub>i</sub> (x<sub>i,max</sub> − 1/n) / (1 − 1/n),

where n is the number of colors, i indexes each color present in the model, and x<sub>i,max</sub> is the largest fraction of TFs of the i-th color found in a single TF cluster.

The demixing coefficient is defined in the Methods section; we have therefore replaced “defined in Fig. 1” with “see Methods for definition”.
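To make the definition concrete, a minimal sketch of the computation is given below; it assumes per-cluster colour fractions as input, and the helper name and array shapes are illustrative, not our production code:

```python
import numpy as np

def demixing_coefficient(fractions):
    """Sketch of θ_dem: for each colour i, x_{i,max} is the largest
    fraction of colour-i TFs found in any single cluster; rescaling
    gives 1 for fully demixed clusters and 0 when every cluster holds
    all colours in equal proportion.

    fractions : (n_clusters, n_colors) array, rows summing to 1.
    """
    n = fractions.shape[1]
    x_max = fractions.max(axis=0)                 # x_{i,max} per colour
    return np.mean((x_max - 1.0 / n) / (1.0 - 1.0 / n))

# fully demixed: each cluster contains a single colour
pure = np.eye(3)
print(demixing_coefficient(pure))    # 1.0
# fully mixed: every cluster has equal fractions of all colours
mixed = np.full((3, 3), 1.0 / 3.0)
print(demixing_coefficient(mixed))   # 0.0
```

The two limiting cases reproduce the fully demixed (θ<sub>dem</sub> = 1) and maximally mixed (θ<sub>dem</sub> = 0) values discussed in the text.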

      “Mixing is facilitated by the presence of weakly-binding beads, as replacing them with non-interacting ones increases demixing and reduces long-range negative correlations (Figure S3). Therefore, the sequence of strong and weak binding sites along strings determines the degree of mixing, and the types of small-world network that emerge. If eQTLs also act transcriptionally in the way we suggest [11], we predict that down-regulating eQTLs will lie further away from their targets than up-regulating ones.” Going into these side topics and minke points here is super distracting and waters down the message. Maybe first deal with the main conclusions on mixed vs demixed clusters in dependence on the strong and specific binding site patterns, before dealing with other additional points like the role of weak binding sites.

      Thank you for the suggestion. We now changed the paragraph to highlight the main results. The new paragraph is as follows. “These results on activity correlation and TF cluster composition suggest that, if eQTLs act transcriptionally as expected [7], down-regulating eQTLs are likely to be located further from their target genes than up-regulating ones. In addition, it is important to note that mixing is promoted by the presence of weakly binding beads; replacing these with non-interacting ones leads to increased demixing and a reduction in long-range negative correlations (Figure S3). More generally, our findings indicate that the presence of multiple TF colors offers an effective mechanism to enrich and fine-tune transcriptional regulation.”

      “…provides a powerful pathway to enrich and modulate transcriptional regulation.” Before going into the possible meaning and implications of the results, please discuss the results themselves first.

      See previous point.

      Figure 5B. Does activation typically coincide with spatial compaction of the binding sites into a small space or within the confines of a condensate? My guess would be that colocalization of the other color in a small space is what leads to the mixing effect?

      As the Reviewer correctly noted, the activity of a given TU is indeed influenced by the presence of nearby TUs of the same color, since their proximity facilitates the recruitment of additional TFs and enhances the overall transcriptional activity. In this context, the mixing effect is certainly affected by the 1D arrangement of TUs along the chromatin fiber. As emphasized in the revised manuscript, when domains of same-color TUs are present (as in the 6-pattern string), the degree of demixing is greater compared to the case where TUs of different colors alternate and large domains are absent (as in the 1-pattern string). This difference in the demixing parameter as a function of the 1D TU arrangement is clearly visible in Fig. S2B.

      “…euchromatic regions blue, and heterochromatic ones grey.” Please also explain what these color monomers mean in terms of non specific interactions with the TFs.

Generally, in our simulation approach we assume euchromatic regions to be more open and accessible to transcription factors, whereas heterochromatin corresponds to more compacted chromatin segments [9]. To reflect this, we introduce weak, non-specific interactions between euchromatin and TFs, while heterochromatin interacts with TFs only through steric effects. To clarify this point, we have now slightly revised the caption of Fig. 6.

“More quantitatively, Spearman’s rank correlation coefficient is 3.66 × 10<sup>−1</sup>, which compares with 3.24 × 10<sup>−1</sup> obtained previously using a single-colour model [11].” This comparison does not tell me whether the improvement in model performance justifies an additional model component. There are other, likelihood based approaches to assess whether a model fits better in a relevant extent by adding a free model parameter. Can these be used for a more conclusive comparison? Besides, a correlation of 0.36 does not seem so good?

      We understand the Reviewer’s concern that the observed increase in the activity correlation may not appear to provide strong evidence for the improvement of the newly introduced model. However, within the context of polymer models developed to study realistic gene transcription and chromatin organization, this type of correlation analysis is a widely accepted approach for model validation. Experimental data commonly used for such validation include Hi-C maps, FISH experiments, and GRO-seq data [10,11]. The first two are typically employed to assess how accurately the model reproduces the 3D folding of chromatin; a comparison between experimental and simulated Hi-C maps is provided in the Supplementary Information (Fig. S5), showing a Pearson correlation of 0.7. GRO-seq or RNA-seq data, on the other hand, are used to evaluate the model’s ability to predict gene transcription levels. To date, the highest correlation for transcriptional activity data has been achieved by the HiP-HoP model at a resolution of 1 kbp [10], reporting a Spearman correlation of 0.6. Therefore, the correlation obtained with our 2-color model represents a good level of agreement when compared with the more complex HiP-HoP model. In this context, the observed increase in correlation—from 0.324 to 0.366—can be regarded as a modest yet meaningful improvement.

“…consequently, use of an additional color provides a statistically significant improvement (p-value < 10<sup>−6</sup>, 2-sided t-test).” I do not follow this argument. Given enough simulation repeats, any improvement, no matter how small, will lead to statistically significant improvements.

      We agree that this sentence could be misleading. We have now rephrased it in a clearer manner specifying that each of the two correlation values is statistically significant alone, while before we were wrongly referring to the significance of the improvement.

      “Additionally, simulated contact maps show a fair agreement with Hi-C data (Figure S5), with a Pearson correlation r ∼ 0.7 (p-value < 10<sup>−6</sup>, 2-sided t-test).” Nice!

      We thank the Reviewer for the positive comment.

      “Because we do not include heterochromatin-binding proteins, we should not however expect a very accurate reproduction of Hi-C maps: we stress that here instead we are interested in active chromatin, transcription and structure only as far as it is linked to transcription.” Then why do you not limit your correlation assessment to only these regions to show that these are very well captured by your model?

We thank the Reviewer for this insightful comment. Indeed, we could have restricted our investigation to active chromatin regions, as done in our previous works [11,12]. However, our intention in this section of the manuscript was to clarify that the current model is relatively simple and therefore not expected to achieve a very high level of agreement between experimental and simulated Hi-C maps. Another important limitation of the two-color model described in this section is the absence of active loop extrusion mediated by SMC proteins, which is known to play a central role in establishing TAD boundaries. Consequently, even if our analysis were limited to active chromatin regions, the agreement with experimental Hi-C maps would still remain lower than that obtained with more comprehensive models, such as HiP-HoP, which we use in the last section of the paper. We have now added a comment in the revised manuscript explicitly noting the lack of active loop extrusion in our 2-color model.

      “We also measure the average value of the demixing coefficient, θ<sub>dem</sub> (Materials and Methods). If θ<sub>dem</sub> = 1, this means that a cluster contains only TFs of one colour and so is fully demixed; if θ<sub>dem</sub> = 0, the cluster contains a mixture of TFs of all colors in equal number, and so is maximally mixed.” Repetitive.

      We have now rephrased the sentence in a more concise way.

“…notably, this is similar to the average number of productively-transcribing pols seen experimentally in a transcription factory [6].” That seems a bit fast and loose. The number of Polymerases can differ depending on state, type of factory, gene etc., and vary between anything from a few to a few hundred Polymerase complexes depending on the definition of factory and what is counted as active. Also, one would think that polymerases only make up a small part of the overall protein pool that constitutes a condensate, so it is unclear whether this is a pertinent estimate.

Here we refer to the average size of what is normally called a Pol II factory, not a generic nuclear condensate. These are the clusters which arise in our simulations. Such structures emerge through microphase separation and have been well characterised; see, for instance, [13] for a recent review. While these structures show a distribution of sizes, the average is well defined and corresponds to about 100 nm, which is very much in line with the size of the clusters we observe, both in terms of 3D diameter and number of participating proteins. Because of this size, the number of active complexes that can contribute cannot be significantly more than ∼ 10. These estimates are, we note, very much in line with super-resolution measurements of SAF-A clusters [14], which are associated with active transcription, so it is reasonable to assume they colocalise with RNA and polymerase clusters.

      “Conversely, activities of similar TUs lying far from each other on the genetic map are often weakly negatively correlated, as the formation of one cluster sequesters some TFs to reduce the number available to bind elsewhere.” This point is interesting, and I strongly suspect that this indeed happening. But I don’t think it was shown in the analysis of the simulation results in sufficient clarity. We need direct assessment of this sequestration, currently it’s only indirectly inferred.

Indeed, this is the mechanism underlying the emergence of negative long-range correlations among TU activity values. As the Reviewer correctly pointed out, the competition for a finite number of TFs was only indirectly inferred in the original manuscript. To address this, we have now included a new figure explicitly illustrating this effect. In Fig. S12, we show the kymograph of active TUs (left panel), as in Fig. 2E(i) of the main text, alongside a new kymograph depicting the number of green TFs within a sphere of radius 10σ centered on each green TU (right panel). For simplicity, we focus here only on green TUs and TFs. It can be observed that, during the initial part of the simulation, green TFs are localized near genomic position ∼ 2000 (right panel), where green TUs are transcriptionally active (left panel). Toward the end of the simulation, TUs near genomic position ∼ 500 become active, coinciding with the relocation of TFs to this region and the depletion of the previous one.
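The counting behind the new kymograph can be sketched as follows (array shapes and the helper name are illustrative, not the production analysis code):

```python
import numpy as np

def tf_counts_near_tus(tu_traj, tf_traj, radius=10.0):
    """Number of TFs within `radius` (in units of σ) of each TU, per
    frame -- the quantity plotted per row of such a kymograph.

    tu_traj : (T, n_TU, 3) TU positions per frame
    tf_traj : (T, n_TF, 3) TF positions per frame
    """
    # pairwise TU-TF distances per frame -> (T, n_TU, n_TF)
    d = np.linalg.norm(tu_traj[:, :, None, :] - tf_traj[:, None, :, :], axis=-1)
    return (d < radius).sum(axis=-1)        # (T, n_TU)

# toy frame: two TUs, three TFs; two TFs cluster near the first TU
tu = np.array([[[0.0, 0, 0], [100.0, 0, 0]]])
tf = np.array([[[1.0, 0, 0], [2.0, 0, 0], [99.0, 0, 0]]])
print(tf_counts_near_tus(tu, tf))   # [[2 1]]
```

Tracking this count over time is what makes the sequestration and relocation of TFs directly visible.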

      In the definition for the demixing coefficient (equation 1), what does the index i stand for?

      Here i is an index denoting each of the colors present in the model. We have now specified the meaning of i after Eq. 1.

      Reviewer 3 (Public Review):

In this work, the authors present a chromatin polymer model with a specific pattern of transcription units (TUs) and diffusing TFs; they simulate the model and study TF clustering, mixing, gene expression activity, and their correlations. First, the authors designed a toy polymer with colored beads of a random type, placed periodically (every 30 beads, or 90 kb). These colored beads are considered transcription units (TUs). Same-colored TUs attract each other, mediated by similarly colored diffusing beads considered as TFs. This led to clustering (condensation of beads) and correlated (or anti-correlated) “gene expression” patterns. Beyond the toy model, when the authors introduce TUs in a specific pattern, this leads to the emergence of specialized and mixed clusters of different TFs. Human chromatin models with a realistic distribution of TUs also lead to the mixing of TFs when cluster size is large.

      Strengths.

      This is a valuable polymer model for chromatin with a specific pattern of TUs and diffusing TF-like beads. Simulation of the model tests many interesting ideas. The simulation study is convincing and the results provide solid evidence showing the emergence of mixed and demixed TF clusters within the assumptions of the model.

      Weaknesses.

The model has many assumptions, some of which are a bit too simplistic. Concerns about the work are detailed below:

      We thank the Referee for this overall positive evaluation.

The authors assume that when the diffusing beads (TFs) are near a TU, gene expression starts. However, mammalian gene expression requires activation by enhancer-promoter looping and other related events. It is not a simple diffusion-limited event. Since many of the conclusions are derived from expression activity, will the results be affected by the lack of looping details?

We do not need to assume enhancer-promoter contact: it emerges naturally through bridging-induced phase separation, and this is a key strength of our model. Even though looping is not assumed as a prerequisite for transcriptional initiation, in practice the vast majority of events in which a TF is near a TU are associated with the presence of a cluster in which regulatory elements are looped. Transcription in our model is therefore associated with bridging-induced phase separation; there is no lack of looping, as looping is naturally associated with transcription and is an emergent property of the model (not an assumption). Accordingly, both contact maps and transcriptional activity are well predicted by our model, both in the version described here and in the more sophisticated single-colour HiP-HoP model [10] (an important ingredient of which is bridging-induced phase separation).

Authors neglect protein-protein interactions. Without protein-protein interactions, condensate formation in natural systems is unlikely to happen.

      We thank the Reviewer for pointing out the absence of protein-protein interactions in our simulations. While we acknowledge this limitation, we would like to emphasize that experimental studies have not observed nuclear proteins forming condensates at physiological concentrations in the absence of DNA or chromatin. For example, studies such as Ryu et al. [15] and Shakya et al. [16] show that protein-protein interactions alone are insufficient to drive condensate formation in vivo. Instead, the presence of a substrate, such as DNA or chromatin, is essential to favor and stabilize the formation of protein clusters.

      In our simulations, we propose that protein liquid-liquid phase separation (LLPS) is driven by the presence of both strong and weak attractions between multivalent protein complexes and the chromatin filament. As stated in our manuscript, the mechanism leading to protein cluster formation is the bridging induced attraction. This mechanism involves a positive feedback loop, where protein binding to chromatin induces a local increase in chromatin density, which then attracts more proteins, further promoting cluster formation.

      While we acknowledge that adding protein-protein interactions could be incorporated into our simulations, we believe this would need to be a weak interaction to remain consistent with experimental data. Additionally, incorporating such interactions would not alter the conclusions of our study.

      What is described in this paper is a generic phenomenon; many kinds of multivalent chromatin-binding proteins can form condensates/clusters as described here. For example, if we replace different color TUs with different histone modifications and different TFs with Hp1, PRC1/2, etc, the results would remain the same, wouldn’t they? What is specific about transcription factor or transcription here in this model? What is the logic of considering 3kb chromatin as having a size of 30 nm? See Kadam et al. (Nature Communications 2023). Also, DNA paint experimental measurement of 5kb chromatin is greater than 100 nm (see work by Boettiger et al.).

We thank the Reviewer for this important observation, which we now address. To begin, we consider the toy model introduced in the first part of the manuscript, where TUs are randomly positioned rather than derived from epigenetic data. As the Reviewer points out, in this simplified context, our results reflect a generic phenomenon: the composition of clusters depends primarily on their size, independent of the specific types of proteins involved. However, the main goal of our work is to gain insights into apparently contradictory experimental findings, which show that some transcription factories consist of a single type of transcription factor, while others contain multiple types. This led us to focus on TF clusters and their role in transcriptional regulation and co-regulation of distant genes. Therefore, in the second part of the manuscript, we use DNase I hypersensitive site (DHS) data to position TUs based on predicted TF binding sites, providing a more biological framework. In both the toy model and the more realistic HiP-HoP model, we observe a size-dependent transition in cluster composition. However, we refrain from generalizing these results to clusters composed of other protein complexes, such as HP1 and PRC, as their binding is governed by distinct epigenetic marks (e.g. H3K9me3 and H3K27me3), which exhibit different genomic distributions compared to DHS marks.

Finally, the mapping of 3 kbp to 30 nm is an estimate which does not significantly impact our conclusions. The relationship between genomic distance (in kbp) and spatial distance (in nm) is highly dependent on the degree of chromatin compaction, which can vary across cell types and genomic contexts. As such, providing an exact conversion is challenging [17]. For example, in a previous work based on the HiP-HoP model [12] we compared simulated and experimental FISH measurements and found that 1 kbp typically corresponds to 15-20 nm, implying that 3 kbp could span up to 60 nm. Nevertheless, we emphasize that varying this conversion factor does not affect the core results or conclusions of our study. We have now included a clarification in the revised SI to highlight this point.

      Recommendations for the authors:

      Other points.

      Figure 1(D) caption says 2.25σ = 1.6 nanometer. Is this a typo? Sigma is 30nm.

Yes, it was. As 1σ ∼ 30 nm, we have 2.25σ = 2.25 · 30 nm = 67.5 nm ∼ 6.75 × 10<sup>−8</sup> m. We have now corrected the caption.

      Page 6, column 2nd, 3rd para, it is written that θ<sub>dem</sub> (”defined in Fig.1”). There is no θ<sub>dem</sub> defined in Fig.1, is there? I can see it defined in Methods but not in Fig. 1.

      Correct, we replaced (defined in Fig.1) with (see Methods for definition).

      Page 6, column 2, 4th para: what does “correlations overlap and correlations diverge mean”?

With reference to the plots in Fig. 5B, correlations overlapping or diverging simply refers to whether the same-colour (red curves) and different-colour (blue curves) correlation trends do or do not overlap with each other. We have now clarified this point.

      What is the precise definition of correlation in Fig 5B (Y-axis)?

In Fig. 5B, correlation means the Pearson correlation coefficient. We have now specified this point in the revised text and in the caption of Fig. 5.
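As a minimal illustration of a Pearson correlation between TU activity traces (the data and variable names below are synthetic assumptions, not the simulation output):

```python
import numpy as np

# Synthetic activity traces: two TUs driven by a shared signal are
# positively correlated; a distant TU is weakly anti-correlated.
rng = np.random.default_rng(1)
shared = rng.normal(size=500)                 # shared TF availability
a = shared + 0.3 * rng.normal(size=500)       # activity of one TU
b = shared + 0.3 * rng.normal(size=500)       # nearby same-colour TU
c = -0.2 * shared + rng.normal(size=500)      # distant TU

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length 1D series."""
    x, y = x - x.mean(), y - y.mean()
    return float((x * y).sum() / np.sqrt((x ** 2).sum() * (y ** 2).sum()))

r_same, r_far = pearson(a, b), pearson(a, c)
```

With these synthetic traces, `r_same` is strongly positive and `r_far` is weakly negative, mirroring the red and blue trends in Fig. 5B.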

      References

(1) S. A. Quinodoz, J. W. Jachowicz, P. Bhat, N. Ollikainen, A. K. Banerjee, I. N. Goronzy, M. R. Blanco, P. Chovanec, A. Chow, Y. Markaki et al., “RNA promotes the formation of spatial compartments in the nucleus,” Cell, vol. 184, no. 23, pp. 5775–5790, 2021.

      (2) R. A. Beagrie, A. Scialdone, M. Schueler, D. C. Kraemer, M. Chotalia, S. Q. Xie, M. Barbieri, I. de Santiago, L.-M. Lavitas, M. R. Branco et al., “Complex multi-enhancer contacts captured by genome architecture mapping,” Nature, vol. 543, no. 7646, pp. 519–524, 2017.

(3) R. A. Beagrie, C. J. Thieme, C. Annunziatella, C. Baugher, Y. Zhang, M. Schueler, A. Kukalev, R. Kempfer, A. M. Chiariello, S. Bianco et al., “Multiplex-GAM: genome-wide identification of chromatin contacts yields insights overlooked by Hi-C,” Nature Methods, vol. 20, no. 7, pp. 1037–1047, 2023.

(4) L. Liu, B. Zhang, and C. Hyeon, “Extracting multi-way chromatin contacts from Hi-C data,” PLOS Computational Biology, vol. 17, no. 12, p. e1009669, 2021.

(5) R.-S. Nozawa, L. Boteva, D. C. Soares, C. Naughton, A. R. Dun, A. Buckle, B. Ramsahoye, P. C. Bruton, R. S. Saleeb, M. Arnedo et al., “SAF-A regulates interphase chromosome structure through oligomerization with chromatin-associated RNAs,” Cell, vol. 169, no. 7, pp. 1214–1227, 2017.

      (6) E. A. Boyle, Y. I. Li, and J. K. Pritchard, “An expanded view of complex traits: from polygenic to omnigenic,” Cell, vol. 169, no. 7, pp. 1177–1186, 2017.

(7) C. Brackley, N. Gilbert, D. Michieletto, A. Papantonis, M. Pereira, P. Cook, and D. Marenduzzo, “Complex small-world regulatory networks emerge from the 3D organisation of the human genome,” Nat. Commun., vol. 12, no. 1, pp. 1–14, 2021.

      (8) R. B. Brem and L. Kruglyak, “The landscape of genetic complexity across 5,700 gene expression traits in yeast,” Proceedings of the National Academy of Sciences, vol. 102, no. 5, pp. 1572– 1577, 2005.

      (9) M. Chiang, C. A. Brackley, D. Marenduzzo, and N. Gilbert, “Predicting genome organisation and function with mechanistic modelling,” Trends in Genetics, vol. 38, no. 4, pp. 364–378, 2022.

      (10) M. Chiang, C. A. Brackley, C. Naughton, R.-S. Nozawa, C. Battaglia, D. Marenduzzo, and N. Gilbert, “Genome-wide chromosome architecture prediction reveals biophysical principles underlying gene structure,” Cell Genomics, vol. 4, no. 12, 2024.

      (11) A. Buckle, C. A. Brackley, S. Boyle, D. Marenduzzo, and N. Gilbert, “Polymer simulations of heteromorphic chromatin predict the 3d folding of complex genomic loci,” Mol. Cell, vol. 72, no. 4, pp. 786–797, 2018.

      (12) G. Forte, A. Buckle, S. Boyle, D. Marenduzzo, N. Gilbert, and C. A. Brackley, “Transcription modulates chromatin dynamics and locus configuration sampling,” Nature Structural & Molecular Biology, vol. 30, no. 9, pp. 1275–1285, 2023.

(13) P. R. Cook and D. Marenduzzo, “Transcription-driven genome organization: a model for chromosome structure and the regulation of gene expression tested through simulations,” Nucleic Acids Research, vol. 46, no. 19, pp. 9895–9906, 2018.

(14) M. Marenda, D. Michieletto, R. Czapiewski, J. Stocks, S. M. Winterbourne, J. Miles, O. C. Flemming, E. Lazarova, M. Chiang, S. Aitken et al., “Nuclear RNA forms an interconnected network of transcription-dependent and tunable microgels,” bioRxiv, pp. 2024–06, 2024.

(15) J.-K. Ryu, C. Bouchoux, H. W. Liu, E. Kim, M. Minamino, R. de Groot, A. J. Katan, A. Bonato, D. Marenduzzo, D. Michieletto et al., “Bridging-induced phase separation induced by cohesin SMC protein complexes,” Science Advances, vol. 7, no. 7, p. eabe5905, 2021.

(16) A. Shakya, S. Park, N. Rana, and J. T. King, “Liquid-liquid phase separation of histone proteins in cells: role in chromatin organization,” Biophysical Journal, vol. 118, no. 3, pp. 753–764, 2020.

(17) A.-M. Florescu, P. Therizols, and A. Rosa, “Large scale chromosome folding is stable against local changes in chromatin structure,” PLOS Computational Biology, vol. 12, no. 6, p. e1004987, 2016.

    1. One famous example of reducing friction was the invention of infinite scroll. When trying to view results from a search, or look through social media posts, you could only view a few at a time, and to see more you had to press a button to see the next “page” of results. This is how both Google search and Amazon search work at the time this is written. In 2006, Aza Raskin invented infinite scroll, where you can scroll to the bottom of the current results, and new results will get automatically filled in below. Most social media sites now use this, so you can then scroll forever and never hit an obstacle or friction as you endlessly look at social media posts. Aza Raskin regrets what infinite scroll has done to make it harder for users to break away from looking at social media sites.

      I think infinite scroll is a classic example of “friction-reducing design,” but its impact is actually a bit scary. In the past, with formats like search results, you had to click “next page” after finishing one page. While this action was a bit cumbersome, it provided a pause point, reminding you, “Should I stop now?” Infinite scroll completely removes that barrier. You just keep scrolling down, and content automatically loads, making you completely unaware of how long you've been scrolling.

      I think this is also why scrolling through social media is so addictive: it's not because we genuinely want to look for that long, but because the design eliminates every opportunity to stop.

    1. We know that no phenomena or ideas exist “outside of history,” and that serious consideration of just about any aspect of public or private life requires some sort of inquiry into the past.

I understand the idea that Grossman is making. However, I'm confused as to how he established this idea. Is it impossible for a world in constant change to come up with its own ideas?

    1. “It’s just routine,” I tell Werner. “You know—criminal record, identity history . .

What job is he applying for that requires him to get his fingerprints taken?

    2. “It’s anything that’s admirable,” he says. “Anything that would make people think better of you.”

Sometimes we don't have to try to be kind. All of us have qualities that are admirable without us having to work for them, simply by being ourselves.

    1. It manages KV cache reuse (storing previous conversation context to speed up new responses)

Baseten isn't managing the specific contents of the KV cache, just its persistence and routing, so it stays warm. It routes user requests to a GPU that already holds the relevant caches. It's KV cache orchestration, not determining exactly what sits in the cache on any one GPU.

Her expressions in this anger were that she would stick as close to Abbot as the bark stuck to the tree, and that he should repent of it afore seven years came to an end, so as Doctor Prescot should never cure him. These words were heard by others besides Abbot himself, who also heard her say she would hold his nose as close to the grindstone as ever it was held since his name was Abbot.

      Again, could be a lie. It is easy for several people to lie together if they just don't like one person. It's how many witchcraft accusations came about, especially if the person had land or something else to their name.


    1. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

This work addresses a key question in cell signalling: how does the membrane composition affect the behaviour of a membrane signalling protein? Understanding this is important, not just to understand basic biological function but because membrane composition is highly altered in diseases such as cancer and neurodegenerative disease. Although parts of this question have been addressed on fragments of the target membrane protein, EGFR, used here, Srinivasan et al. harness a unique tool, membrane nanodiscs, which allow them to probe full-length EGFR in vitro in great detail with cutting-edge fluorescent tools. They find interesting impacts on EGFR conformation in differently charged and fluid membranes, explaining previously identified signalling phenotypes.

      Strengths:

      The nanodisk system enables full-length EGFR to be studied in vitro and in a membrane with varying lipid and cholesterol concentrations. The authors combine this with single-molecule FRET utilising multiple pairs of fluorophores at different places on the protein to probe different conformational changes in response to EGF binding under different anionic lipid and cholesterol concentrations. They further support their findings using molecular dynamics simulations, which help uncover the full atomistic detail of the conformations they observe.

      Weaknesses:

      Much of the interpretation of the results comes down to a bimodal model of an 'open' and 'closed' state between the intracellular tail of the protein and the membrane. Some of the data looks like a bimodal model is appropriate, but its use is not sufficiently justified (statistically or otherwise) in this work in its current form. The experiments with varying cholesterol in particular appear to suggest an alternate model with longer fluorescent lifetimes. More justification of these interpretations of the central experiment of this work would strengthen the paper.

      We thank the reviewer for highlighting the strengths of the study, including the use of nanodiscs, single-molecule FRET, and MD simulations to probe full-length EGFR in controlled membrane environments.

      We agree that statistical justification is important for interpreting the distributions. To address this, we performed global fits of the data with both two- and three-Gaussian models and evaluated them using the Bayesian Information Criterion (BIC), which balances the model fit with a penalty for additional parameters. The three-Gaussian model gave a substantially lower BIC, indicating statistical preference for the more complex model. However, we also assessed the separability of the Gaussian components using Ashman’s D, which quantifies whether peaks are distinct. This analysis showed that two Gaussians (µ = 2.64 and 3.43 ns) are not separable, implying they represent one broad distribution rather than two states.

      Author response table 1.

      Both the two- and three-Gaussian models include a low-value component (µ = ~1.3 ns), but the apparent improvement of the three-Gaussian model arises only from splitting the central population into two overlapping Gaussians. Thus, while the BIC favors the three-Gaussian model statistically, Ashman’s D demonstrates that the central peak should not be interpreted as bimodal. Therefore, when all the distributions are fit globally, the data are best explained as two Gaussians, one centered at ~1.3 ns and the other at ~2.7 ns, with cholesterol-dependent shifts reflecting changes in the distribution of this population rather than the emergence of a separate state. Finally, we acknowledge that additional conformations may exist, but based on this analysis a bimodal model describes the populations captured in our data and so we limit ourselves to this simplest framework.
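The model-selection logic described above can be sketched as follows. The lifetimes, component widths, and mixture weights below are assumptions for illustration, not the fitted values from the manuscript; only the two central means (2.64 and 3.43 ns) are taken from the text.

```python
import numpy as np
from scipy.stats import norm

# Synthetic lifetime data: a short-lifetime component plus one broad
# long-lifetime component (parameters are illustrative assumptions).
rng = np.random.default_rng(2)
tau = np.concatenate([rng.normal(1.3, 0.3, 400),
                      rng.normal(2.7, 0.6, 600)])

def mixture_bic(data, weights, means, sigmas):
    """BIC = p*ln(n) - 2*logL for a fixed Gaussian mixture (p = 3k - 1)."""
    like = sum(w * norm.pdf(data, m, s)
               for w, m, s in zip(weights, means, sigmas))
    p = 3 * len(weights) - 1
    return p * np.log(len(data)) - 2 * np.log(like).sum()

bic2 = mixture_bic(tau, [0.4, 0.6], [1.3, 2.7], [0.3, 0.6])
bic3 = mixture_bic(tau, [0.4, 0.3, 0.3], [1.3, 2.64, 3.43], [0.3, 0.5, 0.5])

def ashman_d(mu1, s1, mu2, s2):
    """Ashman's D; components with D < 2 are not cleanly separable."""
    return np.sqrt(2.0) * abs(mu1 - mu2) / np.sqrt(s1 ** 2 + s2 ** 2)

# Two central components at 2.64 and 3.43 ns with assumed 0.5 ns widths:
D = ashman_d(2.64, 0.5, 3.43, 0.5)   # = 1.58, below the D = 2 threshold
```

The point of the sketch is the combined criterion: BIC ranks candidate models, while Ashman's D checks whether two fitted components are actually distinguishable peaks.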

      We have clarified this in the revised manuscript by adding a section in the Methods (page 26) titled Model Selection and Statistical Analysis, which describes the results of the global two- versus three-Gaussian fits evaluated using BIC and Ashman’s D. Additional details of these analyses are also provided in response to Reviewer #1, Question 8 (Recommendations for the authors).

      Reviewer #2 (Public review):

      Summary:

      Nanodiscs and synthesized EGFR are co-assembled directly in cell-free reactions. Nanodiscs containing membranes with different lipid compositions are obtained by providing liposomes with corresponding lipid mixtures in the reaction. The authors focus on the effects of lipid charge and fluidity on EGFR activity.

      Strengths:

      The authors implement a variety of complementary techniques to analyze data and to verify results. They further provide a new pipeline to study lipid effects on membrane protein function.

      We thank the reviewer for noting the strengths of our approach, particularly the use of complementary techniques and the development of a new pipeline to study lipid effects on membrane protein function.

      Weaknesses:

      Due to the relative novelty of the approach, a number of concerns remain.

      (1) I am a little skeptical about the good correlation of the nanodisc compositions with the liposome compositions. I would rather have expected a kind of clustering of individual lipid types in the liposome membrane, in particular of cholesterol. This should then result in an uneven distribution upon nanodisc assembly, i.e., in a notable variation of lipid composition in the individual nanodiscs. Could this be ruled out by the implemented assays, or can just the overall lipid composition of the complete nanodisc fraction be analyzed?

      We monitored insertion of anionic lipids into nanodiscs by performing zeta potential measurements, which report on surface charge, and cholesterol insertion by Laurdan fluorescence, which reports on membrane order. Both assays provide information at the ensemble level, not single-nanodisc resolution. We clarified this in the Methods section (see below).

      Cholesterol clustering is well documented in ternary systems with saturated lipids and sphingolipids [Veatch, Biophys J., 2003; Risselada, PNAS, 2008]. However, in unsaturated POPC-cholesterol mixtures such as those used here, cholesterol primarily alters bilayer order and large-scale segregation is not typically observed.  The addition of POPS to the POPC-cholesterol mixture perturbs cholesterol-induced ordering, lowering the likelihood of cholesterol-rich domains [Kumar, J. Mol. Graphics Modell., 2021].

      Lipid heterogeneity between nanodiscs would be expected to give rise to heterogeneity in hydrodynamic properties, including potentially broadening the dynamic light scattering (DLS) distributions. However, the full width at half maximum (FWHM) values from the DLS measurements (see Author response table 2) do not indicate a broadening with cholesterol. Statistical testing (Mann-Whitney U test for non-normal data) showed no significant difference between samples with and without cholesterol (p = 0.486; n = 4 per group). While the sample size is small making firm conclusions challenging, these results suggest that large-scale heterogeneity is unlikely.

      Author response table 2.
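A minimal sketch of the statistical test used above, assuming hypothetical FWHM values (the actual n = 4 measurements per group are those in Author response table 2):

```python
import numpy as np
from scipy.stats import mannwhitneyu

# Hypothetical DLS FWHM values (nm), n = 4 per group as in the response
fwhm_no_chol = np.array([3.1, 3.4, 2.9, 3.3])
fwhm_chol = np.array([3.2, 3.0, 3.5, 3.1])

# Two-sided Mann-Whitney U test: a rank-based comparison that does not
# assume normality, appropriate for small non-normal samples
stat, p = mannwhitneyu(fwhm_no_chol, fwhm_chol, alternative="two-sided")
```

A large p-value, as with these illustrative numbers, would indicate no detectable difference in DLS peak width between the two nanodisc preparations.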

      In the case of POPS lipids, clustering of POPS in EGFR embedded nanodiscs is a recognized property of receptor-lipid interactions. Molecular dynamics simulations have shown that POPS, although constituting only 30% of the inner leaflet, accounts for ~50% of the lipids directly contacting EGFR [Arkhipov, Cell, 2013], underscoring that anionic lipids are preferentially recruited to the receptor’s immediate environment.

      For nanodiscs containing cholesterol and anionic lipids, our smFRET experiments were designed to isolate the effect of EGF binding. The nanodisc population is the same in the ± EGF conditions as EGF was introduced just prior to performing sm-FRET experiments, and not during nanodisc assembly. Thus, for a given lipid composition, any observed differences between ligand-free and ligand-bound states reflect conformational changes of EGFR.

      Methods, page 23, “Zeta potential measurements to quantify surface charge of nanodiscs: Data analysis was processed using the instrumental Malvern’s DTS software to obtain the mean zeta-potential value. This ensemble measurement reports the average surface charge of the nanodisc population, verifying incorporation of anionic POPS lipids.”

      Methods, page 23, “Fluorescence measurements with Laurdan to confirm cholesterol insertion into nanodiscs: The excitation spectrum was recorded by collecting the emission at 440 nm and emission spectra was recorded by exciting the sample at 385 nm. Laurdan fluorescence provides an ensemble readout of membrane order and confirms cholesterol incorporation into the nanodisc population. While laurdan does not resolve the composition of individual nanodiscs, prior work has shown that POPC–cholesterol mixtures are miscible without forming cholesterol-rich domains[91,92], thus the observed ordering changes likely reflect the intended input cholesterol content at the ensemble level.”

      (91) Veatch, S. L. & Keller, S. L. Separation of liquid phases in giant vesicles of ternary mixtures of phospholipids and cholesterol. Biophysical journal, 85(5), 3074-3083 (2003).

      (92) Risselada, H. J. & Marrink, S. J. The molecular face of lipid rafts in model membranes. Proceedings of the National Academy of Sciences 105(45), 17367–17372 (2008).

      (2) Both templates have been added simultaneously, with a 100-fold excess of the EGFR template. Was this the result of optimization? How is the kinetics of protein production? As EGFR is in far excess, a significant precipitation, at least in the early period of the reaction, due to limiting nanodiscs, should be expected. How is the oligomeric form of the inserted EGFR? Have multiple insertions into one nanodisc been observed?

We thank the reviewer for these insightful questions. Yes, the EGFR:ApoA1∆49 template ratio of 100:1 was empirically determined through optimization experiments now shown in the revised Supplementary Fig. 3. Cell-free reactions were performed across a range of EGFR:ApoA1∆49 template ratios (2:1 to 200:1) and sampled at different time points (2-19 hours). As shown in the gels, EGFR expression increased with higher template ratios and longer reaction times up to ~9 hours, while ApoA1 expression became clearly detectable only after 6 hours. Based on these results, we selected an EGFR:ApoA1∆49 ratio of 100:1 and an 8-hour reaction time as the optimal condition, which yielded sufficient full-length EGFR incorporated into nanodiscs for ensemble and single-molecule experiments.

      In cell-free systems, protein yield does not scale directly with DNA template concentration, as translation efficiency is limited by factors such as ribosome availability and co-translational membrane insertion [Hunt, Chem. Rev., 2024; Blackholly, Front. Mol. Biosci., 2022]. Consistent with this, we observed that ApoA1∆49 is produced at higher levels than EGFR despite the lower DNA input (Supplementary Fig. 2b). Providing an excess EGFR template prevents the reaction from becoming limited by scaffold availability and helps compensate for the fact that, as a large multi-domain receptor, EGFR expression can yield truncated as well as full-length products. This strategy ensures that sufficient full-length receptors are available for nanodisc incorporation. We will clarify this in the Methods section (see below).

We observed little to no visible precipitation under the reported cell-free conditions, likely for the following reasons: (i) EGFR and ApoA1∆49 are co-expressed in the cell-free reaction, and ApoA1∆49 assembles into nanodiscs concurrently with receptor translation, providing an immediate membrane sink; (ii) ApoA1∆49 is expressed at high levels, maintaining disc concentrations that keep the reaction in a soluble regime.

The sample contains donor-labeled EGFR (SNAP-Surface 594) together with acceptor-labeled lipids (Cy5-labeled PE doped into the nanodisc). We assess the oligomerization state of EGFR in nanodiscs using single-molecule photobleaching of the donor channel. SNAP-Surface 594 is a benzylguanine derivative of Atto 594 that reacts with the SNAP tag with near-stoichiometric efficiency [Sun, Chembiochem, 2011]. Most molecules (~75%) exhibited a single photobleaching step, consistent with incorporation of a single EGFR per nanodisc [Srinivasan, Nat. Commun., 2022]. A minority of traces (~15%) showed two photobleaching steps, and about ~10% showed three or more, consistent with occasional multiple insertions. For all smFRET analysis, we restricted the dataset to single-step photobleaching traces, ensuring measurements were performed on monomeric EGFR.
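The step-counting idea can be illustrated with a toy sketch. The trace, window size, and threshold below are synthetic assumptions; the actual analysis identified constant-intensity regions with a change-point algorithm, as described in the Methods.

```python
import numpy as np

# Synthetic donor intensity trace for a nanodisc with two labeled EGFRs:
# two fluorophores active, then one, then background after both bleach.
trace = np.concatenate([np.full(100, 2000.0),
                        np.full(100, 1000.0),
                        np.full(100, 50.0)])
trace += np.random.default_rng(3).normal(0, 30, trace.size)  # shot noise

def count_bleach_steps(trace, drop=400.0, window=10):
    """Count downward jumps larger than `drop` between adjacent window means."""
    means = trace.reshape(-1, window).mean(axis=1)
    return int(np.sum(np.diff(means) < -drop))

n_steps = count_bleach_steps(trace)   # 2 steps for this synthetic dimer trace
```

Traces with `n_steps == 1` would be classified as single-receptor nanodiscs and retained for smFRET analysis; traces with two or more steps would be excluded.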

      Methods, page 20, “Production of labeled, full-length EGFR nanodiscs: Briefly, the E.Coli slyD lysate, in vitro protein synthesis E.Coli reaction buffer, amino acids (-Methionine), Methionine, T7 Enzyme, protease inhibitor cocktail (Thermofisher Scientific), RNAse inhibitor (Roche) and DNA plasmids (20ug of EGFR and 0.2ug of ApoA1∆49) were mixed with different lipid mixtures. The DNA template ratio of EGFR:ApoA1∆49 = 100:1 was empirically chosen by testing different ratios on SDS-PAGE gels and selecting the condition that maximized full-length EGFR expression in DMPC lipids (Supplementary Fig. 3).”

      (3) The IMAC purification does not discriminate between EGFR-filled and empty nanodiscs. Does the TEM study give any information about the composition of the particles (empty, EGFR monomers, or EGFR oligomers)? Normalizing the measured fluorescence, i.e., the total amount of solubilized receptor, with the total protein concentration of the samples could give some data on the stoichiometry of EGFR and nanodiscs.

Negative-stain TEM was performed to confirm nanodisc formation and morphology, but this method does not resolve whether a given disc contains EGFR. To directly assess receptor stoichiometry, we instead relied on single-molecule photobleaching of SNAP-Surface 594-labeled EGFR (see response to Point 2). These experiments showed that the majority of nanodiscs contain a single receptor, with a minority containing two receptors. For all smFRET analyses, we restricted data to single-step photobleaching traces, ensuring measurements were performed on monomeric EGFR.

      We did not normalize EGFR fluorescence to total protein concentration because the bulk protein fraction after IMAC purification includes both receptor-loaded and empty nanodiscs. The latter contribute to ApoA1∆49 mass but do not contain receptors and including them would underestimate receptor occupancy. Importantly, the presence of empty nanodiscs does not affect our measurements as photobleaching and single-molecule FRET analyses selectively report only on receptor-containing nanodiscs. This clarification has been added to the Methods.

      Methods, page 26, “Fluorescence Spectroscopy: Traces with a single photobleaching step for the donor and acceptor were considered for further analysis. Regions of constant intensity in the traces were identified by a change-point algorithm95. Donor traces were assigned as FRET levels until acceptor photobleaching. The presence of empty nanodiscs does not influence these measurements, as photobleaching and single-molecule FRET analyses selectively report on receptor-containing nanodiscs.”

      (4) The authors generally assume a 100% functional folding of EGFR in all analyzed environments. While this could be the case, with some other membrane proteins, it was shown that only a fraction of the nanodisc solubilized particles are in functional conformation. Furthermore, the percentage of solubilized and folded membrane protein may change with the membrane composition of the supplied nanodiscs, while non-charged lipids mostly gave rather poor sample quality. The authors normalize the ATP binding to the total amount of detectable EGFR, and variations are interpreted as suppression of activity. Would the presence of unfolded EGFR fractions in some samples with no access to ATP binding be an alternative interpretation?

      We agree that not all nanodisc-embedded EGFR molecules may be fully functional and that the fraction of folded protein could vary with lipid composition. In our ATP-binding assay, EGFR detection relies on the C-terminal SNAP-tag fused to an intrinsically disordered region. Successful labeling requires that this segment be translated, accessible, and folded sufficiently to accommodate the SNAP reaction, which imposes an additional requirement compared to the rigid, structured kinase domain where ATP binds. Misfolded or truncated EGFR molecules would therefore likely fail to label at the C-terminus. These factors strongly imply that our assay predominantly reports on receptor molecules that are intact and well folded.

      Additionally, our molecular dynamics simulations at 0% and 30% POPS support the experimental ATP-binding measurements (Fig. 2c, d). This consistency between the experimental and simulated evidence, including at 0% POPS where reduced receptor folding might be expected, suggests that the observed lipid-dependent changes are more likely due to modulation of the functional receptor rather than receptor misfolding. We have clarified these points by adding the following:

      Results, page 7, “Role of anionic lipids in EGFR kinase activity: In the presence of EGF, increasing the anionic lipid content decreased the number of contacts from 71.8 ± 1.8 to 67.8 ± 2.4, indicating increased accessibility, again in line with the experimental findings. Because detection of EGFR relies on labeling at the C-terminus and ATP binding requires an intact kinase domain, the ATP-binding assay reports only on receptors that are properly folded and competent for nucleotide binding. The consistency between experimental results and MD simulations suggests that the observed lipid-dependent changes are more likely due to modulation of functional EGFR than to artifacts from misfolding.”

      Reviewer #1 (Recommendations for the authors):

      The experimental program presented here is excellent, and the results are highly interesting. My enthusiasm is dampened by the presentation in places which is confusing, especially Figure 3, which contains so many of the results. I also have some reservations about the bimodal interpretation of the lifetime data in Figure 3.

      We thank the reviewer for their positive assessment of our experimental approach and results. In the revised version, we have improved figure organization and readability by adding explicit labels for lipid composition and EGF presence/absence in all lifetime distributions, moving key supplementary tables into the main text, and reorganizing the supplementary figures as Extended Data Figures following eLife’s format. Figures and tables now appear in the order in which they are referenced in the text to further improve readability.

      Regarding the bimodal interpretation of the lifetime distribution, we have performed global fits of the data with both two- and three-Gaussian models and evaluated them using the Bayesian Information Criterion (BIC) and Ashman’s D analysis, which supported the bimodal interpretation. Details of this analysis are provided in our response to comment (8) below and included in the manuscript.

      Specific comments below:

      (1) Abstract -"Identifying and investigating this contribution have been challenging owing to the complex composition of the plasma membrane" should be "has".

      We have corrected this error in the revised manuscript.

      (2) Results - p4 - some explanation of what POPC/POPS are would be helpful.

      We have added the text below discussing POPC and POPS.

      Results, page 4, “POPC is a zwitterionic phospholipid forming neutral membranes, whereas POPS carries a net negative charge and provides anionic character to the bilayer[56]. Both PC and PS lipids are common constituents of mammalian plasma membranes, with PC enriched in the outer leaflet and PS in the inner leaflet[22].”

      (22) Lorent, J. H., Levental, K. R., Ganesan, L., Rivera-Longsworth, G., Sezgin, E., Doktorova, M., Lyman, E. & Levental, I. Plasma membranes are asymmetric in lipid unsaturation, packing and protein shape. Nature Chemical Biology 16, 644–652 (2020).

      (56) Her, C., Filoti, D. I., McLean, M. A., Sligar, S. G., Ross, J. A., Steele, H. & Laue, T. M. The charge properties of phospholipid nanodiscs. Biophysical Journal 111(5), 989–998 (2016).

      (3) Figure 2b - it would be easier to compare if these were plotted on top of each other. Are we at saturating ATP binding concentration or below it? Also, please put a key to say purple - absent and orange +EGF on the figure. I am also confused as to why, with no EGF, ATP binding is high with 0% POPS, but low when EGF is present, but that then reverses with physiological lipid content.

      While we agree that a direct comparison would be easier, the ATP-binding experiments for the ± EGF conditions were actually performed independently on separate SDS-PAGE gels, which unfortunately precludes such a comparison. We have added a color key to clarify the -EGF and +EGF datasets.

      The experiments were carried out at 1 µM of the fluorescently labeled ATP analogue (atto647N-γ ATP). Reported kinetic measurements for the isolated EGFR kinase domain indicate a K<sub>m</sub> of 5.2 µM, suggesting that our experimental concentration is below, but close to, the saturating range, ensuring sensitivity to changes in accessibility of the binding site rather than saturation of all available receptors.

      We have revised the manuscript to clarify these details by including the following text:

      Results, page 6, “To investigate how the membrane composition impacts accessibility, we measured ATP binding levels for EGFR in membranes with different anionic lipid content. 1 µM of fluorescently-labeled ATP analogue, atto647N-γ ATP, which binds irreversibly to the active site, was added to samples of EGFR nanodiscs with 0%, 15%, 30% or 60% anionic lipid content in the absence or presence of EGF.”

      Methods, page 24, “ATP binding experiments: Full-length EGFR in different lipid environments was prepared using cell-free expression as described above. 1 μM of snap surface 488 (New England Biolabs) and atto647N-labeled γ-ATP (Jena Bioscience) were added after cell-free expression and incubated at 30 °C, 300 rpm for 60 minutes. 1 μM of atto647N-γ ATP was used, corresponding to a concentration near the reported K<sub>m</sub> of 5.2 µM for ATP binding to the isolated EGFR kinase domain[93], ensuring sensitivity to lipid-dependent changes in ATP accessibility.”

      (ii) Nucleotide binding is suppressed under basal conditions, likely to ensure that the catalytic activity is promoted only upon EGF stimulation.

      The molecular dynamics simulations at 0% and 30% POPS further support this interpretation, showing that anionic lipids modulate the accessibility of the ATP-binding site in a manner consistent with experimental trends (Fig. 2c and 2d).

      We have clarified these points in the main text with the following additions:

      Results, page 6, “In the presence of EGF, ATP binding overall increased with anionic lipid content with the highest levels observed in 60% POPS bilayers. In the neutral bilayer, ligand seemed to suppress ATP binding, indicating anionic lipids are required for the regulated activation of EGFR.”

      Results, page 7, “In the absence of EGF, increasing the anionic lipid content from 0% POPS to 30% POPS increased the number of ATP-lipid contacts from 58.6 ± 0.7 to 74.4 ± 1.2, indicating reduced accessibility, consistent with the experimental results and suggesting anionic lipids are required for ligand-induced EGFR activity.”

      (93) Yun, C. H., Mengwasser, K. E., Toms, A. V., Woo, M. S., Greulich, H., Wong, K. K., Meyerson, M. & Eck, M. J. The T790M mutation in EGFR kinase causes drug resistance by increasing the affinity for ATP. PNAS 105(6), 2070–2075 (2008).

      (4) Figure 2d - how was the 16A distance arrived at?

      We thank the reviewer for pointing this out. The 16 Å cutoff was chosen based on the physical dimensions of the ATP analogue used in the experiments. Specifically, the largest radius of the atto647N-γ ATP molecule is ~16.9 Å, which defines the maximum distance at which lipid atoms could sterically obstruct access of ATP to the binding pocket. Accordingly, in the simulations, contacts were defined as pairs of coarse-grained atoms between lipid molecules and the residues forming the ATP-binding site (residues 694-703, 719, 766-769, 772-773, 817, 820, and 831) separated by less than 16 Å.

      We have rewritten the rationale for selecting the 16 Å cutoff in the Methods section to improve clarity.

      Methods, page 28, “Coarse-grained, Explicit-solvent Simulations with the MARTINI Force Field: We analyzed our simulations using WHAM[108,109] to reweight the umbrella biases and compute the average values of various metrics introduced in this manuscript. Specifically, we calculated the distance between Residue 721 and Residue 1186 (EGFR C-terminus) of the protein. To quantify the accessibility of the ATP-binding site, we calculated the number of contacts between lipid molecules and the residues forming the ATP-binding pocket (residues 694-703, 719, 766-769, 772-773, 817, 820, and 831)[110]. Close contact between the bilayer and these residues would sterically hinder ATP binding; thus, the contact number serves as a proxy for ATP-site accessibility. The cutoff distance for defining a contact was set to 16 Å, corresponding to the largest molecular radius of the fluorescent ATP analogue (atto647N-γ ATP, 16.96 Å[111]). Accordingly, we defined a contact as a pair of coarse-grained atoms, one from the lipid membrane and one from the ATP binding site, within a mutual distance of less than 16 Å.”
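
      As a minimal illustration of this contact definition (the coordinates below are made up; the real analysis operates on coarse-grained bead positions from the MARTINI trajectories):

```python
import numpy as np

def count_contacts(lipid_xyz, site_xyz, cutoff=16.0):
    """Count lipid-bead / ATP-site-bead pairs closer than `cutoff` Å.

    Pairwise distances are computed by broadcasting: `d` has shape
    (N_lipid, N_site), and each entry below the cutoff counts as one
    contact, mirroring the definition quoted in the Methods.
    """
    d = np.linalg.norm(lipid_xyz[:, None, :] - site_xyz[None, :, :], axis=-1)
    return int(np.sum(d < cutoff))

# beads 10 Å apart form a contact at the 16 Å cutoff; 20 Å apart do not
print(count_contacts(np.array([[0.0, 0.0, 0.0]]),
                     np.array([[10.0, 0.0, 0.0]])),
      count_contacts(np.array([[0.0, 0.0, 0.0]]),
                     np.array([[20.0, 0.0, 0.0]])))  # → 1 0
```

      The contact number from such a count, averaged over reweighted frames, is what serves as the proxy for ATP-site accessibility.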

      (5) Figure 2e-h - I think a bar chart/violin plot/jitter plot would make it easier to compare the peak values. The statistics in the table should just be quoted in the text as value +/- error from the 95% confidence interval. The way it is written currently is confusing, as it implies that there is no conformational change with the addition of EGF in neutral lipids, but there is ~0.4nm one from the table. I don't understand what you mean by "The larger conformational response of these important domains suggests that the intracellular conformation may play a role in downstream signaling steps, such as binding of adaptor proteins"?

      We thank the reviewer for these suggestions. For the smFRET lifetime distributions (Figure 2j, k; previously Figure 2e, f), we have now included jitter plots of the donor lifetimes in the Supplementary Figure 11 to facilitate direct visual comparison of the median and distribution widths for each lipid composition and ±EGF conditions. The distance distributions for the ATP to C-terminus in Figure 2e, f (previously Figure 2g, h) were obtained from umbrella-sampling simulations that calculate free-energy profiles rather than raw, unbiased distance values. Because the sampling is guided by biasing potentials, individual distance values cannot be used to construct violin or jitter plots. We therefore present the simulation data only as probability density distributions, which best reflect the equilibrium distributions derived from them.

      We have also revised the text to report the median ± 95% confidence interval, improving clarity and consistency with the statistical table.

      Results, page 9: “In the neutral bilayer (0% POPS), the distribution in the absence of EGF peaks at 8.1 nm (95% CI: 8.0–8.2 nm) and in the presence of EGF at 8.6 nm (95% CI: 8.5–8.7 nm) (Table 1, Supplementary Table 1). In the physiological regime of 30% POPS nanodiscs, the peak of the donor lifetime distribution shifts from 9.1 nm (95% CI: 8.9–9.2 nm) in the absence of EGF to 11.6 nm (95% CI: 11.1–12.6 nm) in the presence of EGF (Table 1, Supplementary Table 1), a larger EGF-induced conformational response than in neutral lipids.”
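
      One standard way to attach a 95% confidence interval to a median estimate like those above is bootstrap resampling; the sketch below illustrates the idea only (it is not necessarily the authors' exact CI procedure, and the data are synthetic).

```python
import numpy as np

def peak_ci(samples, n_boot=2000, ci=95, seed=0):
    """Bootstrap confidence interval for the median of a distribution.

    Resamples the data with replacement, recomputes the median each
    time, and reports the percentile interval of the bootstrap
    medians. Illustrative only; not the authors' stated procedure.
    """
    rng = np.random.default_rng(seed)
    samples = np.asarray(samples, dtype=float)
    boot = np.array([np.median(rng.choice(samples, samples.size))
                     for _ in range(n_boot)])
    lo, hi = np.percentile(boot, [(100 - ci) / 2, 100 - (100 - ci) / 2])
    return np.median(samples), lo, hi

# toy distances centered near 8.6 nm (synthetic, not measured data)
data = np.random.default_rng(1).normal(8.6, 0.4, 500)
med, lo, hi = peak_ci(data)
print(f"{med:.1f} nm (95% CI: {lo:.1f}-{hi:.1f} nm)")
```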

      Finally, we have rephrased the sentence in question for clarity. The revised text now reads:

      Results, page 9: “The larger conformational response observed in the presence of anionic lipids suggests that these lipids enhance the responsiveness of the intracellular domains to EGF, potentially ensuring interactions between C-terminal sites and adaptor proteins during downstream signaling.”

      (6) "r, highlighting that the charged lipids can enhance the conformational response even for protein regions far away from the plasma membrane" - is it not that the neutral membrane is just very weird and not physiological that EGFR and other proteins don't function properly?

      We agree with the reviewer that completely neutral (0% POPS) membranes are not physiological and likely do not support the native organization or activity of EGFR. We have revised the text to clarify that the 30% POPS condition represents a more native-like lipid environment that restores or stabilizes the expected conformational response, rather than "enhancing" it. The revised sentence now reads:

      Results, page 10: “Both experimental and computational results show a larger EGF-induced conformational change in the partially anionic bilayer, consistent with the notion that a partially anionic lipid bilayer provides a more native environment that supports proper receptor activation, compared to the non-physiological neutral membrane.”

      (7) "snap surface 594 on the C-terminal tail as the donor and the fluorescently-labeled lipid (Cy5) as the acceptor (Supplementary Fig. 2, 11)." Why not refer to Figure 3a here to make it easier to read?

      We have added the reference to Figure 3a, and we thank the Reviewer for the suggestion.

      (8) Figure 3 - the bimodality in many of these plots is dubious. It's very clear in some, i.e. 0% POPS +EGF, but not others. Can anything be done to justify bimodality better?

      We agree that statistical justification is important for interpreting lifetime distributions. To address this, we performed global fits of the data with both two- and three-Gaussian models and evaluated them using the Bayesian Information Criterion (BIC), which balances the model fit with a penalty for additional parameters. The three-Gaussian model gave a substantially lower BIC, indicating statistical preference for the more complex model. However, we also assessed the separability of the Gaussian components using Ashman’s D, which quantifies whether peaks are distinct. This analysis showed that two of the Gaussians are not separable, implying they represent one broad distribution rather than two discrete states. Therefore, when all the distributions are fit globally, the data are best described as two Gaussians, one centered at ~1.3 ns and the other at ~2.7 ns, with cholesterol-dependent shifts reflecting changes in the distribution of this population rather than the emergence of a separate state. We have justified our choice of model in the revised manuscript by incorporating the results of the global two- vs. three-Gaussian fits, together with the BIC and Ashman’s D analyses.

      Methods, page 27: “Model Selection and Statistical Analysis

      Global fitting of lifetime distributions was performed across all experimental conditions using maximum likelihood estimation. Both two-Gaussian and three-Gaussian distribution models were evaluated as described previously[62]. Model performance was compared using the Bayesian Information Criterion (BIC)[101], which balances model likelihood and complexity according to

      BIC = -2 ln L + k ln n

      where L is the likelihood, k is the number of free parameters, and n is the number of single-molecule photon bunches across all experimental conditions. A lower BIC value indicates a statistically better model[101]. The separation between Gaussian components was subsequently assessed using Ashman’s D, where a score above 2 indicates good separation[102]. For two Gaussian components with means µi, µj and standard deviations σi, σj,

      Dij = √2 |µi − µj| / √(σi² + σj²)

      where Dij represents the distance metric between Gaussian components i and j. All fitted parameters, likelihood values, BIC scores, and Ashman’s D values are summarized in Supplementary Table 5.”

      (101) Schwarz, G. Estimating the dimension of a model. The Annals of Statistics, 461–464 (1978).

      (102) Ashman, K. M., Bird, C. M. & Zepf, S. E. Detecting bimodality in astronomical datasets. The Astronomical Journal 108(6), 2348–2361 (1994).
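
      The BIC and Ashman’s D criteria used for this model selection can be written compactly as below. The component means are taken from the values quoted in the responses (~1.3, ~2.6/2.7, and ~3.4 ns), while the widths are made-up illustrative numbers, not fitted parameters.

```python
import math

def bic(log_likelihood, k, n):
    """Bayesian Information Criterion: BIC = -2 ln L + k ln n."""
    return -2.0 * log_likelihood + k * math.log(n)

def ashman_d(mu1, sigma1, mu2, sigma2):
    """Ashman's D; values above 2 indicate well-separated components."""
    return math.sqrt(2.0) * abs(mu1 - mu2) / math.sqrt(sigma1**2 + sigma2**2)

# Illustrative parameters: components near 2.6 ns and 3.4 ns overlap
# strongly, while 1.3 ns vs 2.7 ns components are well separated.
print(round(ashman_d(2.6, 0.5, 3.4, 0.5), 2))  # → 1.6  (not separable)
print(round(ashman_d(1.3, 0.4, 2.7, 0.5), 2))  # → 3.09 (separable)
```

      In this framing, a lower BIC favors the three-Gaussian fit, but an Ashman’s D below 2 for two of its components argues that they describe one broad population, which is the logic applied in the response above.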

      (9) Figure 3c - can you better label the POPS/POPC on here?

      We thank the reviewer for this suggestion. In the revised manuscript, Figure 3b (previously Figure 3c) has been updated to label the lipid composition corresponding to each smFRET distribution to make the comparison across conditions easier to follow.

      (10) Figure 3g - it looks like cholesterol causes a shift in both the peaks, such that the previous open and closed states are not the same, but that there are 2 new states. This is key as the authors state: "Remarkably, high anionic lipids and cholesterol content produce the same EGFR conformations but with opposite effects on signaling-suppression or enhancement." But this is only true if there really are the same conformational states for all lipid/cholesterol conditions. Again, the bimodal models used for all conditions need to be justified.

      We appreciate the reviewer’s insightful comment. We agree that the interpretation of the lifetime distributions depends on whether cholesterol and anionic lipids modulate existing conformational states or create new ones. To test this, we performed global fits of all distributions using the two- and three-Gaussian models and compared them using the Bayesian Information Criterion (BIC) and Ashman’s D, the results of which are described in detail in response to (8) above.

      Both fitting models, two- and three-Gaussian, identified the same short lifetime component (µ = 1.3 ns), suggesting this reflects a well separated conformation. While the three-Gaussian model gave a lower BIC, Ashman’s D analysis indicated that two of the three components (µ = 2.6 ns and 3.4 ns) are not statistically separable, suggesting they represent a single broad conformational population rather than distinct states. If instead these two components reflected distinct states present under different conditions, Ashman’s D analysis would have found the opposite result. This supports our interpretation that high cholesterol and high anionic lipid content produce similar conformational ensembles with opposite effects on signaling output.

      Finally, we acknowledge that additional conformations may exist, but based on this analysis a bimodal model describes the populations captured in our data and so we limit ourselves to this simplest framework. We have clarified this rationale in the revised manuscript and added the results of the BIC and Ashman’s D analysis to support this interpretation.

      (11) Why are we jumping about between figures in the text? Figure 1d is mentioned after Figure 2. Also, DMPC is shown in the figures way before it is described in the text. It is very confusing. Figure 3 is so compact. I think it should be spread out and only shown in the order presented in the text. Different parts of the figure are referred to seemingly at random in the text. Why is DMPC first in the figure, when it is referred to last in the text?

      Following the Reviewer’s comment, we have revised the figure order and layout to improve readability and ensure consistency with the text. The previous Figures 1d-f, which introduce the single-molecule fluorescence setup, are now Figure 2g-i, positioned immediately before the first single-molecule FRET experiments (Fig. 2j, k). The DMPC distribution in Figure 3 has been moved to the Supplementary Information (Supplementary Fig. 17), where it is shown alongside POPC, as these datasets are compared in the section “Mechanism of cholesterol inhibition of EGFR transmembrane conformational response”. The smFRET distributions in Figure 3 are now presented in the same sequence as they are discussed in the text, and the figure has been spread out for better clarity.

      (12) Throughout, I find the presentation of numerical results, their associated error, and whether they are statistically significantly different from each other confusing. A lot of this is in supplementary tables, but I think these need to go in the main text.

      To improve clarity and ensure that key quantitative results are easily accessible, we have moved the relevant supplementary tables to the main text. Specifically, the following tables have been incorporated into the main manuscript:

      (i) Median distance between the ATP-binding site and the EGFR C-terminus, or between the membrane and the EGFR C-terminus, from smFRET measurements (previously Supplementary Table 1, now main-text Table 1)

      (ii) Median distance between the membrane and the EGFR C-terminus in different anionic lipid environments (previously Supplementary Table 4, now main-text Table 2)

      (iii) Median distance between the membrane and the EGFR C-terminus in different cholesterol environments (previously Supplementary Tables 8 and 12, now combined into main-text Table 3)

      (13) Supplementary figures - in general, there is a need to consider how to combine or simplify these for eLife, as they will have to become extended data figures.

      We thank the reviewer for this helpful suggestion. In the revised manuscript, we have reorganized the supplementary figures into extended data figures in accordance with eLife’s format. Specifically:

      - Supplementary Figs. 1–7 are now grouped as Extended Data Figures for Figure 1 in the main text. They are now Figure 1 - figure supplements 1–7.

      - Supplementary Figs. 8–11 are now grouped as Extended Data Figures for Figure 2. They are now Figure 2 - figure supplements 1–4.

      - Supplementary Figs. 12–17 are now grouped as Extended Data Figures for Figure 3. They are now Figure 3 - figure supplements 1–6.

      (14) Supplementary Figure 2 - label what the two bands are in the EGFR and pEGFR sets at the bottom of panel c.

      We thank the reviewer for this comment. The two bands shown in the EGFR and pEGFR blots in Supplementary Fig. 2d (previously Supplementary Fig. 2c) correspond to replicate samples under identical conditions. We have now clarified this in the revised figure legend and labeled the lanes as “Rep 1” and “Rep 2”.

      Supplementary Figure 2, page 31: “(d) Western blots were performed on labelled EGFR in nanodiscs. Anti-EGFR Western blots (left) and anti-phosphotyrosine Western blots (right) tested the presence of EGFR and its ability to undergo tyrosine phosphorylation, respectively, consistent with previous experiments on similar preparations[18, 54, 55]. The two lanes in each blot correspond to replicate samples under identical conditions.”

      (15) Supplementary Figures 3+4 - a bar chart/boxplot or similar would be easier for comparison here.

      In the revised version, we have replaced the histograms with jitter plots showing the nanodisc size distributions for each condition in supplementary figures 4 and 5 (previously supplementary figures 3 and 4). The plots display individual measurements with a horizontal line indicating the mean size (mean ± standard deviation values provided in the caption).

      (16) Supplementary Figures 10, 12, 13, 15, 16 - I would jitter these.

      We have incorporated jitter plots for the relevant datasets in Supplementary Figures 11, 13, 15, 16 and 17 (previously Supplementary Figures 10, 12, 13, 15 and 16) to provide a clearer visualization of the data distributions and median values.

      Reviewer #2 (Recommendations for the authors):

      (1) Reactions were performed in 250 µL volumes. What is the average yield of solubilized EGFR in those reactions? Are there differences in the EGFR solubilization with the various lipid mixtures?

      The amount of solubilized EGFR produced in each 250 µL cell-free reaction was below the reliable detection limit for quantitative absorbance assays. At these protein levels, little to no EGFR precipitation was observed for all lipid compositions. Although exact yields could not be determined, fluorescence-based detection confirmed the presence of functional, nanodisc-incorporated EGFR suitable for smFRET and ensemble fluorescence experiments. We observed variability in total yield between independent reactions within the same lipid composition, which is common for cell-free systems, but no consistent trend attributable to lipid composition.

      (2) Figure S2: It would be better to have a larger overview of the particles on a grid to get a better impression of sample homogeneity.

      TEM images showing a larger field of view have been added for each lipid composition in Supplementary Figures 4 and 5.

      (3) Figure 2b: It appears that there is some variation in the stoichiometry of ApoA1 and EGFR within the samples. Have equal amounts of each sample been analyzed? Are there, in addition, some precipitates of EGFR? It would further be good to have a negative control without expression to get more information about the additional bands in Figure S2b. As they do not appear in the fluorescent gel, it is unlikely that they represent premature terminations of EGFR.

      The fluorescence intensity from the bound ATP analogue (Atto 647N-ATP) and from the snap surface 488 label, which binds stoichiometrically to the SNAP tag at the EGFR C-terminus, was measured for each sample. The relative amount of ATP binding was quantified for each sample by normalizing to the EGFR content (Figure 2b). This normalization accounts for the different amounts of EGFR produced in each condition.

      We did not observe any visible precipitation under the reported cell-free conditions, likely due to the following reasons:

      (i) EGFR and ApoA1 are co-expressed in the cell-free reaction, and ApoA1 assembles into nanodiscs concurrently with receptor translation, providing an immediate membrane sink

      (ii) ApoA1 is expressed at high levels, maintaining disc concentrations that keep the reaction in a soluble regime.

      A control cell-free reaction containing only ApoA1∆49 (1 µg) and no EGFR template, analyzed after affinity purification, showed a single prominent band at ~25 kDa (gel image below), corresponding to ApoA1, along with faint background bands typical of Ni-NTA purification from cell lysates. These weak, non-specific bands likely arise from co-purification of endogenous E. coli proteins.

      The ApoA1∆49-only control gel has now been included as part of Supplementary Figure 2.

      (4) Figure S2c: It would be better to show the whole lanes to document the specificity of the antibodies. Anti-Phosphor antibodies are frequently of poor selectivity. In that case, a negative control with corresponding tyrosine mutations would be helpful.

      We have updated Figure S2d (previously Figure S2c) to include the full gel lanes to better illustrate the specificity of both the total EGFR and phospho-EGFR (Y1068) antibodies. The results show a single clear band at the expected molecular weight for EGFR, confirming antibody specificity.

      (5) The Results section already contains quite some discussion. I would thus recommend combining both sections.

      We thank the reviewer for the suggestion. We have now created a combined Results and Discussion section to better reflect the content of these paragraphs, with the previous Discussion section now a subsection focused on the implications of these results.

    1. Reviewer #2 (Public review):

      Summary:

      The authors present Altair-LSFM (Light Sheet Fluorescence Microscope), a high-resolution, open-source light-sheet microscope that may be relatively easy to align and construct due to a custom-designed mounting plate. The authors developed this microscope to fill a perceived need: current open-source systems are primarily designed for large specimens and lack sub-cellular resolution, or achieve high resolution but are difficult to construct and are unstable. While commercial alternatives exist that offer sub-cellular resolution, they are expensive. The authors' manuscript centers around comparisons to the highly successful lattice light-sheet microscope, including the choice of detection and excitation objectives. The authors thus claim that there remains a critical need for high-resolution, economical, and easy-to-implement LSFM systems and address this need with Altair.

      Strengths:

      The authors succeed in their goals of implementing a relatively low-cost (~USD 150K) open-source microscope that is easy to align. The ease of alignment rests on custom-designed baseplates with dowel pins for precise positioning of optics, based on computer analysis of opto-mechanical tolerances as well as the optical path design. They simplify the excitation optics over lattice light-sheet microscopes by using a Gaussian beam for illumination while maintaining lateral and axial resolutions of 235 and 350 nm across a 260-um field of view after deconvolution. In doing so they rest on foundational principles of optical microscopy: what matters for lateral resolution is the numerical aperture of the detection objective and proper sampling of the image onto the detector, while axial resolution depends on the thickness of the light sheet when it is thinner than the depth of field of the detection objective. This concept has unfortunately not been completely clear to users of high-resolution light-sheet microscopes and is thus a valuable demonstration. The microscope is controlled by open-source software, Navigate, developed by the authors, and it is thus foreseeable that different versions of this system could be implemented depending on experimental needs while maintaining easy alignment and low cost. They successfully demonstrate system performance by characterizing the light sheet and point-spread function and by visualizing sub-cellular structures in mammalian cells, including microtubules, actin filaments, nuclei, and the Golgi apparatus.
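
      The two resolution rules invoked here can be sketched numerically. A minimal sketch, assuming an emission wavelength of 520 nm and a detection NA of 1.1 (illustrative numbers, not values taken from the manuscript):

      ```python
      # Abbe limit and Nyquist sampling: the two principles the review cites.
      # Wavelength and NA are illustrative assumptions, not the paper's values.

      def abbe_lateral_resolution(wavelength_nm: float, na: float) -> float:
          """Diffraction-limited lateral resolution: d = lambda / (2 * NA)."""
          return wavelength_nm / (2 * na)

      def nyquist_pixel_size(resolution_nm: float) -> float:
          """Sample-space pixel size must be at most half the resolution."""
          return resolution_nm / 2

      d_lat = abbe_lateral_resolution(520, 1.1)
      print(f"lateral resolution ~ {d_lat:.0f} nm, "
            f"max pixel size ~ {nyquist_pixel_size(d_lat):.0f} nm")
      ```

      With these assumed numbers the Abbe limit lands near the 235 nm figure reported for the system, while the axial figure is governed by the light-sheet thickness rather than the detection optics, which is exactly the point the review makes.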

      Weaknesses:

      There is still a fixation on comparison to the first-generation lattice light-sheet microscope, which has evolved significantly since then:

      (1) One of the major limitations of the first generation LLSM was the use of a 5 mm coverslip, which was a hindrance for many users. However, the Zeiss system elegantly solves this problem and so does Oblique Plane Microscopy (OPM), while the Altair-LSFM retains this feature, which may dissuade widespread adoption. This limitation, and how it may be overcome in future iterations, is now discussed in the manuscript but remains a limitation in the currently implemented design.

      (2) Further, on the point of sample flexibility, all generations of the LLSM, and by the nature of its design the OPM, can accommodate live-cell imaging with temperature, gas, and humidity control. In the revised manuscript the authors now implement temperature control, but ideal live cell imaging conditions that would include gas and humidity control are not implemented. While, as the authors note, other microscopes that lack full environmental control have achieved widespread adoption, in my view this still limits the use cases of this microscope. There is no discussion on how this limitation of environmental control may be overcome in future iterations.

      (3) While the microscope is well designed and completely open source, it will require experience with optics, electronics, and microscopy to implement and align properly. Experience with custom machining, or soliciting a machine shop, is also necessary. Thus, in my opinion, it is unlikely to be implemented by a lab that has no prior experience with custom optics and cannot hire someone who does. Altair-LSFM may not be as easily adaptable or implementable as the authors describe or perceive in any interested lab, even one that can afford it. Claims about how easy it may be to align the system for a "Novice" in Supplementary Table 5 appear to be unsubstantiated and should be removed unless a novice was indeed able to assemble and validate the system in 2 weeks. It seems that these numbers were simply proposed in the current version without any testing. In our experience, it is hard to predict how long an alignment will take for a novice.

      (4) There is no quantification of field uniformity or of the tunability of the light-sheet parameters (FOV, thickness, PSF, uniformity). There is no quantification of how much improvement is offered by the resonant and how its operation may alter the light-sheet power, uniformity, and the measured PSF.

    2. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This useful study presents Altair-LSFM, a solid and well-documented implementation of a light-sheet fluorescence microscope (LSFM) designed for accessibility and cost reduction. While the approach offers strengths such as the use of custom-machined baseplates and detailed assembly instructions, its overall impact is limited by the lack of live-cell imaging capabilities and the absence of a clear, quantitative comparison to existing LSFM platforms. As such, although technically competent, the broader utility and uptake of this system by the community may be limited.

      We thank the editors and reviewers for their thoughtful evaluation of our work and for recognizing the technical strengths of the Altair-LSFM platform, including the custom-machined baseplates and detailed documentation provided to promote accessibility and reproducibility. Below, we provide point-by-point responses to each referee comment. In the process, we have significantly revised the manuscript to include live-cell imaging data and a quantitative evaluation of imaging speed. We now more explicitly describe the different variants of lattice light-sheet microscopy—highlighting differences in their illumination flexibility and image acquisition modes—and clarify how Altair-LSFM compares to each. We further discuss challenges associated with the 5 mm coverslip and propose practical strategies to overcome them. Additionally, we outline cost-reduction opportunities, explain the rationale behind key equipment selections, and provide guidance for implementing environmental control. Altogether, we believe these additions have strengthened the manuscript and clarified both the capabilities and limitations of Altair-LSFM.

      Public Reviews:

      Reviewer #1 (Public review): 

      Summary: 

      The article presents the details of the high-resolution light-sheet microscopy system developed by the group. In addition to presenting the technical details of the system, its resolution has been characterized and its functionality demonstrated by visualizing subcellular structures in a biological sample.

      Strengths: 

      (1) The article includes extensive supplementary material that complements the information in the main article.

      (2) However, in some sections, the information provided is somewhat superficial.

      We thank the reviewer for their thoughtful assessment and for recognizing the strengths of our manuscript, including the extensive supplementary material. Our goal was to make the supplemental content as comprehensive and useful as possible. In addition to the materials provided with the manuscript, our intention is for the online documentation (available at thedeanlab.github.io/altair) to serve as a living resource that evolves in response to user feedback. We would therefore greatly appreciate the reviewer’s guidance on which sections were perceived as superficial so that we can expand them to better support readers and builders of the system.

      Weaknesses:

      (1) Although a comparison is made with other light-sheet microscopy systems, the presented system does not represent a significant advance over existing systems. It uses high numerical aperture objectives and Gaussian beams, achieving resolution close to theoretical after deconvolution. The main advantage of the presented system is its ease of construction, thanks to the design of a perforated base plate.

      We appreciate the reviewer’s assessment and the opportunity to clarify our intent. Our primary goal was not to introduce new optical functionality beyond that of existing high-performance light-sheet systems, but rather to substantially reduce the barrier to entry for non-specialist laboratories. Many open-source implementations, such as OpenSPIM, OpenSPIN, and Benchtop mesoSPIM, similarly focused on accessibility and reproducibility rather than introducing new optical modalities, yet have had a measurable impact on the field by enabling broader community participation. Altair-LSFM follows this tradition, providing sub-cellular resolution performance comparable to advanced systems like LLSM, while emphasizing reproducibility, ease of construction through a precision-machined baseplate, and comprehensive documentation to facilitate dissemination and adoption.

      (2) Using similar objectives (Nikon 25x and Thorlabs 20x), the results obtained are similar to those of the LLSM system (using a Gaussian beam without laser modulation). However, the article does not mention the difficulties of mounting the sample in the implemented configuration.

      We appreciate the reviewer’s comment and agree that there are practical challenges associated with handling 5 mm diameter coverslips in this configuration. In the revised manuscript, we now explicitly describe these challenges and provide practical solutions. Specifically, we highlight the use of a custom-machined coverslip holder designed to simplify mounting and handling, and we direct readers to an alternative configuration using the Zeiss W Plan-Apochromat 20×/1.0 objective, which eliminates the need for small coverslips altogether.

      (3) The authors present a low-cost, open-source system. Although they provide open source code for the software (navigate), the use of proprietary electronics (ASI, NI, etc.) makes the system relatively expensive. Its low cost is not justified.

      We appreciate the reviewer’s perspective and understand the concern regarding the use of proprietary control hardware such as the ASI Tiger Controller and NI data acquisition cards. Our decision to use these components was intentional: relying on a unified, professionally supported and maintained platform minimizes complexity associated with sourcing, configuring, and integrating hardware from multiple vendors, thereby reducing non-financial barriers to entry for non-specialist users.

      Importantly, these components are not the primary cost driver of Altair-LSFM (they represent roughly 18% of the total system cost). Nonetheless, for individuals where the price is prohibitive, we also outline several viable cost-reduction options in the revised manuscript (e.g., substituting manual stages, omitting the filter wheel, or using industrial CMOS cameras), while discussing the trade-offs these substitutions introduce in performance and usability. These considerations are now summarized in Supplementary Note 1, which provides a transparent rationale for our design and cost decisions.

      Finally, we note that even with these professional-grade components, Altair-LSFM remains substantially less expensive than commercial systems offering comparable optical performance, such as LLSM implementations from Zeiss or 3i.

      (4) The fibroblast images provided are of exceptional quality. However, these are fixed samples. The system lacks the necessary elements for monitoring cells in vivo, such as temperature or pH control.

      We thank the reviewer for their positive comment regarding the quality of our data. As noted, the current manuscript focuses on validating the optical performance and resolution of the system using fixed specimens to ensure reproducibility and stability.

      We fully agree on the importance of environmental control for live-cell imaging. In the revised manuscript, we now describe in detail how temperature regulation can be achieved using a custom-designed heated sample chamber, accompanied by detailed assembly instructions on our GitHub repository and summarized in Supplementary Note 2. For pH stabilization in systems lacking a 5% CO₂ atmosphere, we recommend supplementing the imaging medium with 10–25 mM HEPES buffer. Additionally, we include new live-cell imaging data demonstrating that Altair-LSFM supports in vitro time-lapse imaging of dynamic cellular processes under controlled temperature conditions.

      Reviewer #2 (Public review): 

      Summary: 

      The authors present Altair-LSFM (Light Sheet Fluorescence Microscope), a high-resolution, open-source microscope, that is relatively easy to align and construct and achieves sub-cellular resolution. The authors developed this microscope to fill a perceived need that current open-source systems are primarily designed for large specimens and lack sub-cellular resolution or are difficult to construct and align, and are not stable. While commercial alternatives exist that offer sub-cellular resolution, they are expensive. The authors' manuscript centers around comparisons to the highly successful lattice light-sheet microscope, including the choice of detection and excitation objectives. The authors thus claim that there remains a critical need for high-resolution, economical, and easy-to-implement LSFM systems. 

      We thank the reviewer for their thoughtful summary. We agree that existing open-source systems primarily emphasize imaging of large specimens, whereas commercial systems that achieve sub-cellular resolution remain costly and complex. Our aim with Altair-LSFM was to bridge this gap—providing LLSM-level performance in a substantially more accessible and reproducible format. By combining high-NA optics with a precision-machined baseplate and open-source documentation, Altair offers a practical, high-resolution solution that can be readily adopted by non-specialist laboratories.

      Strengths: 

      The authors succeed in their goals of implementing a relatively low-cost (~USD 150K) open-source microscope that is easy to align. The ease of alignment rests on custom-designed baseplates with dowel pins for precise positioning of optics, based on computer analysis of opto-mechanical tolerances, as well as the optical path design. They simplify the excitation optics over lattice light-sheet microscopes by using a Gaussian beam for illumination while maintaining lateral and axial resolutions of 235 and 350 nm across a 260-um field of view after deconvolution. In doing so they rest on foundational principles of optical microscopy: what matters for lateral resolution is the numerical aperture of the detection objective and proper sampling of the image onto the detector, while axial resolution depends on the thickness of the light sheet when it is thinner than the depth of field of the detection objective. This concept has unfortunately not been completely clear to users of high-resolution light-sheet microscopes and is thus a valuable demonstration. The microscope is controlled by open-source software, Navigate, developed by the authors, and it is thus foreseeable that different versions of this system could be implemented depending on experimental needs while maintaining easy alignment and low cost. They demonstrate system performance successfully by characterizing the light sheet and point-spread function and by visualizing sub-cellular structures in mammalian cells, including microtubules, actin filaments, nuclei, and the Golgi apparatus.

      We thank the reviewer for their thoughtful and generous assessment of our work. We are pleased that the manuscript’s emphasis on fundamental optical principles, design rationale, and practical implementation was clearly conveyed. We agree that Altair’s modular and accessible architecture provides a strong foundation for future variants tailored to specific experimental needs. To facilitate this, we have made all Zemax simulations, CAD files, and build documentation openly available on our GitHub repository, enabling users to adapt and extend the system for diverse imaging applications.

      Weaknesses:

      There is a fixation on comparison to the first-generation lattice light-sheet microscope, which has evolved significantly since then:

      (1) The authors claim that commercial lattice light-sheet microscopes (LLSM) are "complex, expensive, and alignment intensive"; I believe this sentence applies to the open-source version of LLSM, which was made available for wide dissemination. Since then, a commercial solution has been provided by 3i, which is now being used in multiple cores and labs but does require routine alignments. However, Zeiss has also released a commercial turn-key system which, while expensive, is stable, and whose complexity does not interfere with the experience of the user. Though, in general, unreferenced statements on ease of use and stability might be considered anecdotal and may not belong in a scientific article without supporting data.

      We thank the reviewer for this thoughtful and constructive comment. We have revised the manuscript to more clearly distinguish between the original open-source implementation of LLSM and subsequent commercial versions by 3i and ZEISS. The revised Introduction and Discussion now explicitly note that while open-source and early implementations of LLSM can require expert alignment and maintenance, commercial systems—particularly the ZEISS Lattice Lightsheet 7—are designed for automated operation and stable, turn-key use, albeit at higher cost and with limited modifiability. We have also moderated earlier language regarding usability and stability to avoid anecdotal phrasing.

      We also now provide a more objective proxy for system complexity: the number of optical elements that require precise alignment during assembly and maintenance thereafter. The original open-source LLSM setup includes approximately 29 optical components that must each be carefully positioned laterally, angularly, and coaxially along the optical path. In contrast, the first-generation Altair-LSFM system contains only nine such elements. By this metric, Altair-LSFM is considerably simpler to assemble and align, supporting our overarching goal of making high-resolution light-sheet imaging more accessible to non-specialist laboratories.

      (2) One of the major limitations of the first generation LLSM was the use of a 5 mm coverslip, which was a hindrance for many users. However, the Zeiss system elegantly solves this problem, and so does Oblique Plane Microscopy (OPM), while the Altair-LSFM retains this feature, which may dissuade widespread adoption. This limitation and how it may be overcome in future iterations is not discussed.

      We thank the reviewer for this helpful comment. We agree that the use of 5 mm diameter coverslips, while enabling high-NA imaging in the current Altair-LSFM configuration, may pose a practical limitation for some users. We now discuss this more explicitly in the revised manuscript. Specifically, we note that replacing the detection objective provides a straightforward solution to this constraint. For example, as demonstrated by Moore et al. (Lab Chip, 2021), pairing the Zeiss W Plan-Apochromat 20×/1.0 detection objective with the Thorlabs TL20X-MPL illumination objective allows imaging beyond the physical surfaces of both objectives, eliminating the need for small-format coverslips. In the revised text, we propose this modification as an accessible path toward greater compatibility with conventional sample mounting formats. We also note in the Discussion that Oblique Plane Microscopy (OPM) inherently avoids such nonstandard mounting requirements and, owing to its single-objective architecture, is fully compatible with standard environmental chambers.

      (3) Further, on the point of sample flexibility, all generations of the LLSM, and by the nature of its design, the OPM, can accommodate live-cell imaging with temperature, gas, and humidity control. It is unclear how this would be implemented with the current sample chamber. This limitation would severely limit use cases for cell biologists, for which this microscope is designed. There is no discussion on this limitation or how it may be overcome in future iterations.

      We thank the reviewer for this important observation and agree that environmental control is critical for live-cell imaging applications. It is worth noting that the original open-source LLSM design, as well as the commercial version developed by 3i, provided temperature regulation but did not include integrated control of CO₂ or humidity. Despite this limitation, these systems have been widely adopted and have generated significant biological insights. We also acknowledge that both OPM and the ZEISS implementation of LLSM offer clear advantages in this respect, providing compatibility with standard commercial environmental chambers that support full regulation of temperature, CO₂, and humidity.

      In the revised manuscript, we expand our discussion of environmental control in Supplementary Note 2, where we describe the Altair-LSFM chamber design in more detail and discuss its current implementation of temperature regulation and HEPES-based pH stabilization. Additionally, the Discussion now explicitly notes that OPM avoids the challenges associated with non-standard sample mounting and is inherently compatible with conventional environmental enclosures.

      (4) The authors' comparison to LLSM is constrained to the "square" lattice, which, as they point out, is the most used optical lattice (though this also might be considered anecdotal). The original LLSM design, however, goes far beyond the square lattice, including hexagonal lattices, the ability to do structured illumination, and greater flexibility in general in terms of light-sheet tuning for different experimental needs, as well as not being limited to just sample scanning. Thus, the Altair-LSFM cannot compare to the original LLSM in terms of versatility, even if comparisons to the resolution provided by the square lattice are fair.

      We agree that the original LLSM design offers substantially greater flexibility than what is reflected in our initial comparison, including the ability to generate multiple lattice geometries (e.g., square and hexagonal), operate in structured illumination mode, and acquire volumes using both sample- and light-sheet–scanning strategies. To address this, we now include Supplementary Note 3, which provides a detailed overview of the illumination modes and imaging flexibility afforded by the original LLSM implementation, and how these capabilities compare to both the commercial ZEISS Lattice Lightsheet 7 and our Altair-LSFM system. In addition, we have revised the discussion to explicitly acknowledge that the original LLSM could operate in alternative scan strategies beyond sample scanning, providing greater context for readers and ensuring a more balanced comparison.

      (5) There is no demonstration of the system's live-imaging capabilities or temporal resolution, which is the main advantage of existing light-sheet systems.

      In the revised manuscript, we now include a demonstration of live-cell imaging to directly validate Altair-LSFM’s suitability for dynamic biological applications. We also explicitly discuss the temporal resolution of the system in the main text (see Optoelectronic Design of Altair-LSFM), where we detail both software- and hardware-related limitations. Specifically, we evaluate the maximum imaging speed achievable with Altair-LSFM in conjunction with our open-source control software, navigate.

      For simplicity and reduced optoelectronic complexity, the current implementation powers the piezo through the ASI Tiger Controller, which modestly reduces its bandwidth. Nonetheless, for a 100 µm stroke typical of light-sheet imaging, we achieved sufficient performance to support volumetric imaging at most biologically relevant timescales. These results, along with additional discussion of the design trade-offs and performance considerations, are now included in the revised manuscript and expanded upon in the supplementary material.
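
      The volumetric-rate trade-off described above can be estimated with simple arithmetic; the frame rate, plane count, and flyback time below are hypothetical placeholders, not measured Altair-LSFM values:

      ```python
      # Back-of-the-envelope volumetric imaging rate for a sample-scanned stack.
      # All numbers are hypothetical, not measurements from the manuscript.

      def volumes_per_second(frame_rate_hz: float, n_planes: int,
                             flyback_s: float = 0.0) -> float:
          """Each volume needs n_planes camera frames plus piezo flyback/settle."""
          return 1.0 / (n_planes / frame_rate_hz + flyback_s)

      # A 100 um stroke sampled as 200 planes (0.5 um steps) at 200 fps,
      # with 50 ms allowed for the piezo return.
      print(f"{volumes_per_second(200, 200, 0.05):.2f} volumes/s")  # ~0.95
      ```

      Even with a modest flyback penalty, this kind of estimate shows how sub-second volumetric rates remain within reach for the stroke lengths typical of light-sheet imaging.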

      While the microscope is well designed and completely open source, it will require experience with optics, electronics, and microscopy to implement and align properly. Experience with custom machining or soliciting a machine shop is also necessary. Thus, in my opinion, it is unlikely to be implemented by a lab that has zero prior experience with custom optics or can hire someone who does. Altair-LSFM may not be as easily adaptable or implementable as the authors describe or perceive in any lab that is interested, even if they can afford it. The authors indicate they will offer "workshops," but this does not necessarily remove the barrier to entry or lower it, perhaps as significantly as the authors describe.

      We appreciate the reviewer’s perspective and agree that building any high-performance custom microscope—Altair-LSFM included—requires a basic understanding of (or willingness to learn) optics, electronics, and instrumentation. Such a barrier exists for all open-source microscopes, and our goal is not to eliminate this requirement entirely but to substantially reduce the technical and logistical challenges that typically accompany the construction of custom light-sheet systems.

      Importantly, no machining experience or in-house fabrication capabilities are required. Users can simply submit the provided CAD design files and specifications directly to commercial vendors for fabrication. We have made this process as straightforward as possible by supplying detailed build instructions, recommended materials, and vendor-ready files through our GitHub repository. Our dissemination strategy draws inspiration from other successful open-source projects such as mesoSPIM, which has seen widespread adoption—over 30 implementations worldwide—through a similar model of exhaustive documentation, open-source software, and community support via user meetings and workshops.

      We also recognize that documentation alone cannot fully replace hands-on experience. To further lower barriers to adoption, we are actively working with commercial vendors to streamline procurement and assembly, and Altair-LSFM is supported by a Biomedical Technology Development and Dissemination (BTDD) grant that provides resources for hosting workshops, offering real-time community support, and developing supplementary training materials.

      In the revised manuscript, we now expand the Discussion to explicitly acknowledge these implementation considerations and to outline our ongoing efforts to support a broad and diverse user base, ensuring that laboratories with varying levels of technical expertise can successfully adopt and maintain the Altair-LSFM platform.

      There is a claim that this design is easily adaptable. However, the requirement of custom-machined baseplates and in silico optimization of the optical path basically means that each new instrument is a new design, even if the Navigate software can be used. It is unclear how Altair-LSFM demonstrates a modular design that reduces times from conception to optimization compared to previous implementations.

      We thank the reviewer for this insightful comment and agree that our original language regarding adaptability may have overstated the degree to which Altair-LSFM can be modified without prior experience. It was not our intention to imply that the system can be easily redesigned by users with limited technical background. Meaningful adaptations of the optical or mechanical design do require expertise in optical layout, optomechanical design, and alignment.

      That said, for laboratories with such expertise, we aim to facilitate modifications by providing comprehensive resources—including detailed Zemax simulations, complete CAD models, and alignment documentation. These materials are intended to reduce the development burden for expert users seeking to tailor the system to specific experimental requirements, without necessitating a complete re-optimization of the optical path from first principles.

      In the revised manuscript, we clarify this point and temper our language regarding adaptability to better reflect the realistic scope of customization. Specifically, we now state in the Discussion: “For expert users who wish to tailor the instrument, we also provide all Zemax illumination-path simulations and CAD files, along with step-by-step optimization protocols, enabling modification and re-optimization of the optical system as needed.” This revision ensures that readers clearly understand that Altair-LSFM is designed for reproducibility and straightforward assembly in its default configuration, while still offering the flexibility for modification by experienced users.

      Reviewer #3 (Public review):

      Summary: 

      This manuscript introduces a high-resolution, open-source light-sheet fluorescence microscope optimized for sub-cellular imaging. The system is designed for ease of assembly and use, incorporating a custom-machined baseplate and in silico optimized optical paths to ensure robust alignment and performance. The authors demonstrate lateral and axial resolutions of ~235 nm and ~350 nm after deconvolution, enabling imaging of sub-diffraction structures in mammalian cells. The important feature of the microscope is the clever and elegant adaptation of simple Gaussian beams, smart beam shaping, galvo pivoting, and high-NA objectives to ensure a uniform, thin light sheet of around 400 nm in thickness over a 266-micron-wide field of view, pushing the axial resolution of the system beyond the usual diffraction-limited trade-offs of light-sheet fluorescence microscopy. Compelling validation using fluorescent beads and multicolor cellular imaging highlights the system's performance and accessibility. Moreover, a very extensive and comprehensive manual of operation is provided in the form of supplementary materials. This provides a DIY blueprint for researchers who want to implement such a system.

      We thank the reviewer for their thoughtful and positive assessment of our work. We appreciate their recognition of Altair-LSFM’s design and performance, including its ability to achieve high-resolution imaging throughout a 266-micron field of view. While Altair-LSFM approaches the practical limits of diffraction-limited performance, it does not exceed the fundamental diffraction limit; rather, it achieves near-theoretical resolution through careful optical optimization, beam shaping, and alignment. We are grateful for the reviewer’s acknowledgment of the accessibility and comprehensive documentation that make this system broadly implementable.
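
      The classical trade-off at issue here follows directly from Gaussian-beam optics: a static sheet with waist w0 stays thin only over its confocal parameter 2·z_R, where z_R = π·w0²·n/λ. A minimal sketch with illustrative numbers (not the paper's actual parameters):

      ```python
      import math

      # Rayleigh range of a Gaussian beam; all numbers are illustrative.
      def rayleigh_range_um(waist_um: float, wavelength_um: float,
                            n: float = 1.33) -> float:
          """z_R = pi * w0^2 * n / lambda, for waist w0 in a medium of index n."""
          return math.pi * waist_um**2 * n / wavelength_um

      # A ~400 nm-thick sheet implies a waist near 200 nm; at 488 nm excitation
      # in water, such a sheet stays thin for well under a micron.
      zr = rayleigh_range_um(0.2, 0.488)
      print(f"confocal parameter ~ {2 * zr:.2f} um")  # ~0.68 um
      ```

      A static 400 nm Gaussian sheet would therefore cover well under 1 um rather than 266 um, which is why the beam shaping and galvo pivoting the reviewer highlights are needed to maintain a uniform thin sheet across the full field of view.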

      Strengths:

      (1) Strong and accessible technical innovation: With an elegant combination of beam shaping and optical modelling, the authors provide a high-resolution light-sheet system that overcomes the classical light-sheet tradeoff limit of a thin light-sheet and a small field of view. In addition, the integration of in silico modelling with a custom-machined baseplate is very practical and allows for ease of alignment procedures. Combining these features with the solid and super-extensive guide provided in the supplementary information, this provides a protocol for replicating the microscope in any other lab.

      (2) Impeccable optical performance and ease of mounting of samples: The system takes advantage of the same sample-holding method seen already in other implementations, but reduces the optical complexity.

      At the same time, the authors claim to achieve lateral and axial resolution similar to lattice light-sheet microscopy (although without a direct comparison; see the "Weaknesses" section below). The optical characterization of the system is comprehensive and well detailed. Additionally, the authors validate the system by imaging sub-cellular structures in mammalian cells.

      (3) Transparency and comprehensiveness of documentation and resources: A very detailed protocol provides detailed documentation about the setup, the optical modeling, and the total cost.

We thank the reviewer for their thoughtful and encouraging comments. We are pleased that the technical innovation, optical performance, and accessibility of Altair-LSFM were recognized. Our goal from the outset was to develop a diffraction-limited, high-resolution light-sheet system that balances optical performance with reproducibility and ease of implementation. We are also pleased that the use of precision-machined baseplates was recognized as a practical and effective strategy for achieving performance while maintaining ease of assembly.

      Weaknesses: 

(1) Limited quantitative comparisons: Although some qualitative comparison with previously published systems (diSPIM, lattice light-sheet) is provided throughout the manuscript, a side-by-side comparison would be of great benefit for the manuscript, even in the form of a theoretical simulation. While having a direct imaging comparison would be ideal, it's understandable that this goes beyond the interest of the paper; however, a table referencing image-quality parameters (taken from the literature), such as signal-to-noise ratio, light-sheet thickness, and resolution, would really enhance the features of the setup presented. Moreover, given the emphasis on optical simplification, an additional comment on the importance of, and differences between, dual-objective and single-objective light-sheet systems could really benefit the discussion.

      In the revised manuscript, we have significantly expanded our discussion of different light-sheet systems to provide clearer quantitative and conceptual context for Altair-LSFM. These comparisons are based on values reported in the literature, as we do not have access to many of these instruments (e.g., DaXi, diSPIM, or commercial and open-source variants of LLSM), and a direct experimental comparison is beyond the scope of this work.

      We note that while quantitative parameters such as signal-to-noise ratio are important, they are highly sample-dependent and strongly influenced by imaging conditions, including fluorophore brightness, camera characteristics, and filter bandpass selection. For this reason, we limited our comparison to more general image-quality metrics—such as light-sheet thickness, resolution, and field of view—that can be reliably compared across systems.

      Finally, per the reviewer’s recommendation, we have added additional discussion clarifying the differences between dual-objective and single-objective light-sheet architectures, outlining their respective strengths, limitations, and suitability for different experimental contexts.

(2) Limitation to a fixed sample: In the manuscript, there is no mention of incubation temperature, CO₂ regulation, humidity control, or possible integration of commercial environmental control systems. This is a major limitation for an imaging technique that owes its popularity to fast, volumetric, live-cell imaging of biological samples.

      We fully agree that environmental control is critical for live-cell imaging applications. In the revised manuscript, we now describe the design and implementation of a temperature-regulated sample chamber in Supplementary Note 2, which maintains stable imaging conditions through the use of integrated heating elements and thermocouples. This approach enables precise temperature control while minimizing thermal gradients and optical drift. For pH stabilization, we recommend the use of 10–25 mM HEPES in place of CO₂ regulation, consistent with established practice for most light-sheet systems, including the initial variant of LLSM. Although full humidity and CO₂ control are not readily implemented in dual-objective configurations, we note that single-objective designs such as OPM are inherently compatible with commercial environmental chambers and avoid these constraints. Together, these additions clarify how environmental control can be achieved within Altair-LSFM and situate its capabilities within the broader LSFM design space.

(3) System cost and data storage cost: While the system presented has the advantage of being open-source, it remains relatively expensive (for example, ~$150k without the laser source and optical table). The manuscript could benefit from a more direct comparison of the performance/cost ratio of existing systems, considering academic settings whose budgets often would not allow for expensive architectures. Moreover, it would also be beneficial to discuss the adaptability of the system in case a $30k objective is not feasible. Will this system work with different optics (with the obvious limitations that come with a lower-NA objective)? This could be an interesting point of discussion: the adaptability of the system for lower budgets or more cost-effective choices, depending on the needs.

      We agree that cost considerations are critical for adoption in academic environments. We would also like to clarify that the quoted $150k includes the optical table and laser source. In the revised manuscript, Supplementary Note 1 now includes an expanded discussion of cost–performance trade-offs and potential paths for cost reduction.

      Last, not much is said about the need for data storage. Light-sheet microscopy's bottleneck is the creation of increasingly large datasets, and it could be beneficial to discuss more about the storage needs and the quantity of data generated.

In the revised manuscript, we now include Supplementary Note 4, which provides a high-level discussion of data storage needs, approximate costs, and practical strategies for managing large datasets generated by light-sheet microscopy. This section offers general guidance—including file-format recommendations and cost considerations—but we note that actual costs will vary by institution and contractual agreements.
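To illustrate the scale of the storage problem discussed above, the following back-of-envelope sketch shows how quickly light-sheet data accumulate. All parameters (camera format, frame rate, duty cycle, session length) are illustrative assumptions, not values taken from the manuscript or Supplementary Note 4:

```python
# Back-of-envelope estimate of light-sheet data volumes.
# Every parameter below is an illustrative assumption.

def session_data_volume_tb(width_px=2048, height_px=2048,
                           bytes_per_px=2, frames_per_s=100,
                           duty_cycle=0.1, hours=8):
    """Return decimal terabytes generated in one imaging session."""
    frame_bytes = width_px * height_px * bytes_per_px  # one 16-bit frame
    bytes_per_s = frame_bytes * frames_per_s * duty_cycle
    return bytes_per_s * 3600 * hours / 1e12

# A full-frame 16-bit sCMOS at 100 fps streams ~0.84 GB/s while acquiring;
# at a 10% duty cycle over an 8-hour session that is roughly 2.4 TB.
print(round(session_data_volume_tb(), 2))  # → 2.42
```

Even under these conservative assumptions, a single session approaches the capacity of a commodity hard drive, which is why chunked formats such as HDF5, N5, or Zarr and institutional storage planning matter.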

      Conclusion:

Altair-LSFM represents a well-engineered and accessible light-sheet system that addresses a longstanding need for high-resolution, reproducible, and affordable sub-cellular light-sheet imaging. While some aspects (comparative benchmarking and validation, and the current limitation to fixed samples) would benefit from further development, the manuscript makes a compelling case for Altair-LSFM as a valuable contribution to the open microscopy community.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      (1) A picture, or full CAD design of the complete instrument, should be included as a main figure.

      A complete CAD rendering of the microscope is now provided in Supplementary Figure 4.

(2) There is no quantitative comparison of the effects of the tilting resonant galvo; only a cartoon is shown. A quantitative figure should be included.

      The cartoon was intended purely as an educational illustration to conceptually explain the role of the tilting resonant galvo in shaping and homogenizing the light sheet. To clarify this intent, we have revised both the figure legend and corresponding text in the main manuscript. For readers seeking quantitative comparisons, we now reference the original study that provides a detailed analysis of this optical approach, as well as a review on the subject.

      (3) Description of L4 is missing in the Figure 1 caption.

      Thank you for catching this omission. We have corrected it.

      (4) The beam profiles in Figures 1c and 3a, please crop and make the image bigger so the profile can be appreciated. The PSFs in Figure 3c-e should similarly be enlarged and presented using a dynamic range/LUT such that any aberrations can be appreciated.

      In Figure 1c, our goal was to qualitatively illustrate the uniformity of the light-sheet across the full field of view, while Figure 1d provided the corresponding quantitative cross-section. To improve clarity, we have added an additional figure panel offering a higher-magnification, localized view of the light-sheet profile. For Figure 3c–e, we have enlarged the PSF images and adjusted the display range to better convey the underlying signal and allow subtle aberrations to be appreciated.

      (5) It is unclear why LLSM is being used as the gold standard, since in its current commercial form, available from Zeiss, it is a turn-key system designed for core facilities. The original LLSM is also a versatile instrument that provides much more than the square lattice for illumination, including structured illumination, hexagonal lattices, live-cell imaging, wide-field illumination, different scan modes, etc. These additional features are not even mentioned when compared to the Altair-LSFM. If a comparison is to be provided, it should be fair and balanced. Furthermore, as outlined in the public review, anecdotal statements on "most used", "difficult to align", or "unstable" should not be provided without data.

      In the revised manuscript, we have carefully removed anecdotal statements and, where appropriate, replaced them with quantitative or verifiable information. For instance, we now explicitly report that the square lattice was used in 16 of the 20 figure subpanels in the original LLSM publication, and we include a proxy for optical complexity based on the number of optical elements requiring alignment in each system.

      We also now clearly distinguish between the original LLSM design—which supports multiple illumination and scanning modes—and its subsequent commercial variants, including the ZEISS Lattice Lightsheet 7, which prioritizes stability and ease of use over configurational flexibility (see Supplementary Note 3).

      (6) The authors should recognize that implementing custom optics, no matter how well designed, is a big barrier to cross for most cell biology labs.

      We fully understand and now acknowledge in the main text that implementing custom optics can present a significant barrier, particularly for laboratories without prior experience in optical system assembly. However, similar challenges were encountered during the adoption of other open-source microscopy platforms, such as mesoSPIM and OpenSPIM, both of which have nonetheless achieved widespread implementation. Their success has largely been driven by exhaustive documentation, strong community support, and standardized design principles—approaches we have also prioritized in Altair-LSFM. We have therefore made all CAD files, alignment guides, and detailed build documentation publicly available and continue to develop instructional materials and community resources to further reduce the barrier to adoption.

(7) Statements on "hands-on workshops", though laudable, may not be appropriate to include in a scientific publication without some documentation of the influence they have had on implementing the microscope.

      We understand the concern. Our intention in mentioning hands-on workshops was to convey that the dissemination effort is supported by an NIH Biomedical Technology Development and Dissemination grant, which includes dedicated channels for outreach and community engagement. Nonetheless, we agree that such statements are not appropriate without formal documentation of their impact, and we have therefore removed this text from the revised manuscript.

      (8) It is claimed that the microscope is "reliable" in the discussion, but with no proof, long-term stability should be assessed and included.

While our experience with Altair-LSFM has been that it remains well aligned over time, especially in comparison to other light-sheet systems we have worked on over the last 11 years, we acknowledge that this assessment is anecdotal. As such, we have omitted this claim from the revised manuscript.

      (9) Due to the reliance on anecdotal statements and comparisons without proof to other systems, this paper at times reads like a brochure rather than a scientific publication. The authors should consider editing their manuscript accordingly to focus on the technical and quantifiable aspects of their work.

      We agree with the reviewer’s assessment and have revised the manuscript to remove anecdotal comparisons and subjective language. Where possible, we now provide quantitative metrics or verifiable data to support our statements.

      Reviewer #3 (Recommendations for the authors):

      Other minor points that could improve the manuscript (although some of these points are explained in the huge supplementary manual): 

(1) The authors explain their design thoroughly, and they chose a sample-scanning method. I think that a brief discussion in the main text of the advantages and disadvantages of such a method over, for example, a laser-scanning system (with a fixed sample) would be highly beneficial for users.

      In the revised manuscript, we now include a brief discussion in the main text outlining the advantages and limitations of a sample-scanning approach relative to a light-sheet–scanning system. Specifically, we note that for thin, adherent specimens, sample scanning minimizes the optical path length through the sample, allowing the use of more tightly focused illumination beams that improve axial resolution. We also include a new supplementary figure illustrating how this configuration reduces the propagation length of the illumination light sheet, thereby enhancing axial resolution.

(2) The authors justify selecting a 0.6 NA illumination objective over alternatives (e.g., Special Optics), but the manuscript would benefit from a more quantitative trade-off analysis (beam waist, working distance, sample compatibility) against other possibilities. Within the objective context, a comparison of the performance of this system with the new and upcoming single-objective light-sheet methods (including those based on optical refocusing, e.g., DaXi) would strengthen the manuscript.

      In the revised manuscript, we now provide a quantitative trade-off analysis of the illumination objectives in Supplementary Note 1, including comparisons of beam waist, working distance, and sample compatibility. This section also presents calculated point spread functions for both the 0.6 NA and 0.67 NA objectives, outlining the performance trade-offs that informed our design choice. In addition, Supplementary Note 3 now includes a broader comparison of Altair-LSFM with other light-sheet modalities, including diSPIM, ASLM, and OPM, to further contextualize the system’s capabilities within the evolving light-sheet microscopy landscape.

(3) The modularity of the system is implied in the context of the manuscript, but not fully explained. The authors should specify more clearly, for example, whether cameras could be easily changed, objectives easily swapped, light-sheet thickness tuned by changing the cylindrical lens, how users might adapt the system for different samples (e.g., embryos, cleared tissue, live imaging), etc., and discuss any constraints or compatibility issues with these implementations.

Altair-LSFM was explicitly designed and optimized for imaging live adherent cells, where sample scanning and short light-sheet propagation lengths provide optimal axial resolution (Supplementary Note 3). While the same platform could be used for superficial imaging in embryos, systems implementing multiview illumination and detection schemes are better suited for such specimens. Similarly, cleared-tissue imaging typically requires specialized solvent-compatible objectives and approaches such as ASLM that maximize the field of view. We have now added text to the Design Principles section that explicitly states this.

      Altair-LSFM offers varying levels of modularity depending on the user’s level of expertise. For entry-level users, the illumination numerical aperture—and therefore the light-sheet thickness and propagation length—can be readily adjusted by tuning the rectangular aperture conjugate to the back pupil of the illumination objective, as described in the Design Principles section. For mid-level users, alternative configurations of Altair-LSFM, including different detection objectives, stages, filter wheels, or cameras, can be readily implemented (Supplementary Note 1). Importantly, navigate natively supports a broad range of hardware devices, and new components can be easily integrated through its modular interface. For expert users, all Zemax simulations, CAD models, and step-by-step optimization protocols are openly provided, enabling complete re-optimization of the optical design to meet specific experimental requirements.

(4) Resolution measurements before and after deconvolution are central to the performance claim, but the deconvolution method (PetaKit5D) is only briefly mentioned in the main text, is not referenced, and should be clarified in more detail, consistent with the precision of the supplementary information. More specifically, PetaKit5D should be referenced in the main text, the deconvolution parameters discussed in the Methods section, and the computational requirements mentioned as well.

      In the revised manuscript, we now provide a dedicated description of the deconvolution process in the Methods section, including the specific parameters and algorithms used. We have also explicitly referenced PetaKit5D in the main text to ensure proper attribution and clarity. Additionally, we note the computational requirements associated with this analysis in the same section for completeness.

(5) Image post-processing is not fully explained in the main text. Since the system is sample-scanning based, no word in the main text is spent on deskewing, which is an integral part of post-processing to obtain a "straight" 3D stack. Since other systems implement such post-processing algorithms (for example, single-objective architectures), it would be beneficial to have some discussion of this in the main text, along with a brief comparison to other systems in the Methods section.

In the revised manuscript, we now explicitly describe both the deskewing (shearing) and deconvolution procedures in the Alignment and Characterization section of the main text and direct readers to the Methods section. We also briefly explain why the data must be sheared to correct for the angled sample-scanning geometry in LLSM and Altair-LSFM, as well as in both sample-scanning and laser-scanning variants of OPM.

      (6) A brief discussion on comparative costs with other systems (LLSM, dispim, etc.) could be helpful for non-imaging expert researchers who could try to implement such an optical architecture in their lab.

      Unfortunately, the exact costs of commercial systems such as LLSM or diSPIM are typically not publicly available, as they depend on institutional agreements and vendor-specific quotations. Nonetheless, we now provide approximate cost estimates in Supplementary Note 1 to help readers and prospective users gauge the expected scale of investment relative to other advanced light-sheet microscopy systems.

      (7) The "navigate" control software is provided, but a brief discussion on its advantages compared to an already open-access system, such as Micromanager, could be useful for the users.

In the revised manuscript, we now include Supplementary Note 5, which discusses the advantages and disadvantages of different open-source microscope control platforms, including navigate and Micro-Manager. In brief, navigate was designed to provide turnkey support for multiple light-sheet architectures, with pre-configured acquisition routines optimized for Altair-LSFM, integrated data management with support for multiple file formats (TIFF, HDF5, N5, and Zarr), and full interoperability with OME-compliant workflows. By contrast, while Micro-Manager offers a broader library of hardware drivers, it typically requires manual configuration and custom scripting for advanced light-sheet imaging workflows.

(8) The cost and parts are well documented, but the time and expertise required are not crystal clear. Adding a simple time estimate (perhaps in the supplement) for assembly/alignment/installation/validation and first imaging would be very beneficial for users. Also, what level of expertise is assumed (prior optics experience, for example) to install a system like this? This can help non-optics-expert users better understand what kind of adventure they are putting themselves through.

We thank the reviewer for this helpful suggestion. To address this, we have added Supplementary Table S5, which provides approximate time estimates for assembly, alignment, validation, and first imaging based on the user’s prior experience with optical systems. The table distinguishes between novice (no prior experience), moderate (some experience using but not assembling optical systems), and expert (experienced in building and aligning optical systems) users. This addition is intended to give prospective builders a realistic sense of the time commitment and level of expertise required to assemble and validate Altair-LSFM.

      Minor things in the main text:

(1) Line 109: The cost is described as "excluding the laser source". But then in the table of costs, you mention the L4cc as a "multicolor laser source" for $25k. Can you explain this better? Are the costs correct with or without the laser source?

      We acknowledge that the statement in line 109 was incorrect—the quoted ~$150k system cost does include the laser source (L4cc, listed at $25k in the cost table). We have corrected this in the revised manuscript.

(2) Line 113: You say "lateral resolution", but then you state a 3D resolution (230 nm x 230 nm x 370 nm). This needs to be fixed.

      Thank you, we have corrected this.

      (3) Line 138: Is the light-sheet uniformity proven also with a fluorescent dye? This could be beneficial for the main text, showing the performance of the instrument in a fluorescent environment.

      The light-sheet profiles shown in the manuscript were acquired using fluorescein to visualize the beam. We have revised the main text and figure legends to clearly state this.

      (4) Line 149: This is one of the most important features of the system, defying the usual tradeoff between light-sheet thickness and field of view, with a regular Gaussian beam. I would clarify more specifically how you achieve this because this really is the most powerful takeaway of the paper.

      We thank the reviewer for this key observation. The ability of Altair-LSFM to maintain a thin light sheet across a large field of view arises from diffraction effects inherent to high NA illumination. Specifically, diffraction elongates the PSF along the beam’s propagation direction, effectively extending the region over which the light sheet remains sufficiently thin for high-resolution imaging. This phenomenon, which has been the subject of active discussion within the light-sheet microscopy community, allows Altair-LSFM to partially overcome the conventional trade-off between light-sheet thickness and propagation length. We now clarify this point in the main text and provide a more detailed discussion in Supplementary Note 3, which is explicitly referenced in the discussion of the revised manuscript.
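The trade-off described above can be made concrete with a simple Gaussian-beam estimate (the values below are illustrative and not taken from the manuscript). For a Gaussian light sheet of waist $w_0$ and wavelength $\lambda$, the sheet thickness is $2w_0$ and it stays approximately uniform only over the confocal parameter $b$:

```latex
z_R = \frac{\pi w_0^2}{\lambda}, \qquad b = 2 z_R = \frac{2\pi w_0^2}{\lambda}
```

For a conventional Gaussian sheet to remain uniform over $b \approx 266\,\mu\mathrm{m}$ at $\lambda = 488\,\mathrm{nm}$, the waist would need to be $w_0 = \sqrt{b\lambda/2\pi} \approx 4.5\,\mu\mathrm{m}$, i.e., a roughly 9-micron-thick sheet, far too thick for sub-cellular axial resolution. Maintaining a thin sheet over this range is precisely what the diffraction-based elongation described in the response above, together with beam shaping, makes possible.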

      (5) Line 171: You talk about repeatable assembly...have you tried many different baseplates? Otherwise, this is a complicated statement, since this is a proof-of-concept paper. 

      We thank the reviewer for this comment. We have not yet validated the design across multiple independently assembled baseplates and therefore agree that our previous statement regarding repeatable assembly was premature. To avoid overstating the current level of validation, we have removed this statement from the revised manuscript.

      (6) Line 187: same as above. You mention "long-term stability". For how long did you try this? This should be specified in numbers (days, weeks, months, years?) Otherwise, it is a complicated statement to make, since this is a proof-of-concept paper.

      We also agree that referencing long-term stability without quantitative backing is inappropriate, and have removed this statement from the revised manuscript.

      (7) Line 198: "rapid z-stack acquisition. How rapid? Also, what is the limitation of the galvo-scanning in terms of the imaging speed of the system? This should be noted in the methods section.

      In the revised manuscript, we now clarify these points in the Optoelectronic Design section. Specifically, we explicitly note that the resonant galvo used for shadow reduction operates at 4 kHz, ensuring that it is not rate-limiting for any imaging mode. In the same section, we also evaluate the maximum acquisition speeds achievable using navigate and report the theoretical bandwidth of the sample-scanning piezo, which together define the practical limits of volumetric acquisition speed for Altair-LSFM.
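To illustrate how the camera frame rate and the sample-scanning piezo bandwidth jointly bound the volumetric rate, consider the following sketch. The frame rate, step rate, and plane count are hypothetical placeholders, not the measured values reported in the manuscript; the 4 kHz resonant galvo is far above either limit, consistent with the response above:

```python
# Illustrative calculation of volumetric acquisition rate when the camera
# and the sample-scanning piezo are the only rate-limiting components.
# All numbers below are assumptions for illustration only.

def volumes_per_second(camera_fps=400, piezo_steps_per_s=800, planes=100):
    """Volumetric rate is set by the slower of camera and scanner."""
    effective_fps = min(camera_fps, piezo_steps_per_s)
    return effective_fps / planes

# With a 400 fps camera, an 800 steps/s piezo, and 100 z-planes per stack,
# the system would acquire 4 volumes per second, camera-limited.
print(volumes_per_second())  # → 4.0
```

In this hypothetical configuration the camera is the bottleneck; a faster camera would shift the limit to the piezo, which is why both bandwidths are reported in the Optoelectronic Design section.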

      (8) Line 234: Peta5Kit is discussed in the additional documentation, but should be referenced here, as well.

      We now reference and cite PetaKit5D.

      (9) Line 256: "values are on par with LLSM", but no values are provided. Some details should also be provided in the main text.

In the revised manuscript, we now provide the lateral and axial resolution values originally reported for LLSM in the main text to facilitate direct comparison with Altair-LSFM. Additionally, Supplementary Note 3 now includes an expanded discussion of the nuances of resolution measurement and reporting in light-sheet microscopy.

      Figures:

(1) Figure 1 could be combined with Figure 3. They both discuss the validation of the system (theoretically and with simulations), and they could be presented as different panels of the same figure. The experimental light sheet seems to be shown in transmission mode. Showing the pattern in a fluorescent dye could also be beneficial for the paper.

      In Figure 1, our goal was to guide readers through the design process—illustrating how the detection objective’s NA sets the system’s resolution, which defines the required pixel size for Nyquist sampling and, in turn, the field of view. We then use Figure 1b–c to show how the illumination beam was designed and simulated to achieve that field of view. In contrast, Figure 3 presents the experimental validation of the illumination system. To avoid confusion, we now clarify in the text that the light sheet shown in Figure 3 was visualized in a fluorescein solution and imaged in transmission mode. While we agree that Figures 1 and 3 both serve to validate the system, we prefer to keep them as separate figures to maintain focus within each panel. We believe this organization better supports the narrative structure and allows readers to digest the theoretical and experimental validations independently.

      (2) Figure 3: Panels d and e show the same thing. Why would you expect that xz and yz profiles should be different? Is this due to the orientation of the objectives towards the sample?

      In Figure 3, we present the PSF from all three orthogonal views, as this provides the most transparent assessment of PSF quality—certain aberration modes can be obscured when only select perspectives are shown. In principle, the XZ and YZ projections should be equivalent in a well-aligned system. However, as seen in the XZ projection, a small degree of coma is present that is not evident in the YZ view. We now explicitly note this observation in the revised figure caption to clarify the difference between these panels.

(3) Figure 4's single boxes lack a scale bar, and some of the supplementary figures (e.g., Figure 5) lack detailed axis labels or scale bars. Also, in the detailed documentation, some figures are cross-referenced inconsistently (e.g., Figure 5 vs. Figure 7, or Figure 6 vs. Figure 8), which makes the cross-references very complicated to follow.

      In the revised manuscript, we have corrected these issues. All figures and supplementary figures now include appropriate scale bars, axis labels, and consistent formatting. We have also carefully reviewed and standardized all cross-references throughout the main text and supplementary documentation to ensure that figure numbering is accurate and easy to follow.

    1. Author response:

      The following is the authors’ response to the previous reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

In the manuscript submission by Zhao et al. entitled, "Cardiac neurons expressing a glucagon-like receptor mediate cardiac arrhythmia induced by high-fat diet in Drosophila", the authors assert that cardiac arrhythmias in Drosophila on a high-fat diet are due in part to activation of adipokinetic hormone (Akh) signaling. A high-fat diet induces Akh secretion from activated endocrine neurons, which activates AkhR in posterior cardiac neurons. Silencing or deletion of Akh or AkhR blocks arrhythmia in Drosophila on a high-fat diet. Elimination of one of two AkhR-expressing cardiac neurons results in arrhythmia similar to that seen on a high-fat diet.

      Strengths:

      The authors propose a novel mechanism for high fat diet induced arrhythmia utilizing the Akh signaling pathway that signals to cardiac neurons.

      Comments on revisions:

      The authors have addressed my other concerns. The only outstanding issue is in regard to the following comment:

      The authors state that "HFD led to increased heartbeat and an irregular rhythm." In representative examples shown, HFD resulted in pauses, slower heart rate, and increased irregularity in rhythm but not consistently increased heart rate (Figures 1B, 3A, and 4C). Based on the cited work by Ocorr et al (https://doi.org/10.1073/pnas.0609278104), Drosophila heart rate is highly variable with periods of fast and slow rates, which the authors attributed to neuronal and hormonal inputs. Ocorr et al then describe the use of "semi-intact" flies to remove autonomic input to normalize heart rate. Were semi-intact flies used? If not, how was heart rate variability controlled? And how was heart rate "increase" quantified in high fat diet compared to normal fat diet? Lastly, how does one measure "arrhythmia" when there is so much heart rate variability in normal intact flies?

      The authors state that 8 sec time windows were selected at the discretion of the imager for analysis. I don't know how to avoid bias unless the person acquiring the imaging is blinded to the condition and the analysis is also done blind. Can you comment whether data acquisition and analysis was done in a blinded fashion? If not, this should be stated as a limitation of the study.

Drosophila heart rate is highly variable. During recording, we were biased toward choosing a time window in which the heartbeat was fairly stable. This is a limitation of the study, which we mention in the revised version. We chose to use intact rather than "semi-intact" flies with the intention of avoiding damage to the cardiac neurons.

      Reviewer #3 (Public review):

      Zhao et al. provide new insights into the mechanism by which a high-fat diet (HFD) induces cardiac arrhythmia employing Drosophila as a model. HFD induces cardiac arrhythmia in both mammals and Drosophila. Both glucagon and its functional equivalent in Drosophila Akh are known to induce arrhythmia. The study demonstrates that Akh mRNA levels are increased by HFD and both Akh and its receptor are necessary for high-fat diet-induced cardiac arrhythmia, elucidating a novel link. Notably, Zhao et al. identify a pair of AKH receptor-expressing neurons located at the posterior of the heart tube. Interestingly, these neurons innervate the heart muscle and form synaptic connections, implying their roles in controlling the heart muscle. The study presented by Zhao et al. is intriguing, and the rigorous characterization of the AKH receptor-expressing neurons would significantly enhance our understanding of the molecular mechanism underlying HFD-induced cardiac arrhythmia.

      Many experiments presented in the manuscript are appropriate for supporting the conclusions while additional controls and precise quantifications should help strengthen the authors' arguments. The key results obtained by loss of Akh (or AkhR) and genetic elimination of the identified AkhR-expressing cardiac neurons do not reconcile, complicating the overall interpretation.

      We thank the reviewer for the positive comments. We believe that additional signaling pathways act in the AkhR neurons to regulate rhythmic heartbeat, and we are currently searching for the molecules and pathways that act on the AkhR cardiac neurons. Ablating the AkhR neurons is therefore expected to have a more profound effect than loss of the receptor alone; loss of AkhR is not equivalent to AkhR neuron ablation.

      The most exciting result is the identification of AkhR-expressing neurons located at the posterior part of the heart tube (ACNs). The authors attempted to determine the function of ACNs by expressing rpr with AkhR-GAL4, which would induce cell death in all AkhR-expressing cells, including ACNs. The experiments presented in Figure 6 are not straightforward to interpret. Moreover, the conclusion contradicts the main hypothesis that elevated Akh is the basis of HFD-induced arrhythmia. The results suggest the importance of AkhR-expressing cells for normal heartbeat. However, elimination of Akh or AkhR restores normal rhythm in HFD-fed animals, suggesting that Akh and AkhR are not important for maintaining normal rhythms. If Akh signaling in ACNs is key for HFD-induced arrhythmia, genetic elimination of ACNs should leave normal rhythm unaltered and rescue the HFD-induced arrhythmia. An important caveat is that the experiments do not test the specific role of ACNs. ACNs are likely just a small part of the cells expressing AkhR. Specific manipulation of ACNs would significantly improve the study. Moreover, the main hypothesis suggests that HFD may alter the activity of ACNs in a manner dependent on Akh and AkhR. Testing how HFD changes calcium, possibly by CaLexA (Figure 2) and/or GCaMP, in wild-type and AkhR mutants could be a way to connect ACNs to HFD-induced arrhythmia. Optogenetic manipulation may also allow ACNs to be targeted specifically.

      We thank the reviewer for suggesting these detailed experiments; we agree that addressing these points would consolidate the results. As AkhR-Gal4 also expresses in the fat body, we set out to build a more specific driver using the split-Gal4 system (Luan et al. 2006. PMID: 17088209). The combination of pan-neuronal Elav-Gal4.DBD and AkhR-p65.AD should yield an AkhR-neuron-specific driver. We selected 2580 bp of AkhR upstream DNA and cloned it into the pBPp65ADZpUw plasmid (Addgene plasmid: #26234). After two rounds of injection, however, we were not able to recover a transgenic line.

      We used GCaMP to record calcium signals in the AkhR neurons; however, AkhR-Gal4>GCaMP flies show extremely high levels of fluorescence in the cardiac neurons under normal conditions.

      We are screening Gal4 drivers to find a line that is specific to the cardiac neurons and has a lower level of driver activity.

      Interestingly, expressing rpr with AkhR-GAL4 was insufficient to eliminate both ACNs, and it is not clear why. Given the incomplete penetrance, appropriate quantifications would be helpful. Additionally, the impact on other AkhR-expressing cells should be assessed. Adding more copies of UAS-rpr, AkhR-GAL4, or both may eliminate all ACNs and other AkhR-expressing cells. The authors could also try UAS-hid instead of UAS-rpr.

      We quantified AkhR neuron ablation and found that about 69% of AkhR-Gal4>rpr flies (n=28) showed a single remaining ACN. Quantifying other AkhR-expressing cells is more challenging, as they are widely distributed. We tried adding more copies of UAS-rpr or AkhR-Gal4, which caused developmental defects (pupal lethality). Thus, as mentioned above, we are trying to find a more specific driver for targeting the cardiac neurons.

      Recommendations for the authors:

      Reviewer #3 (Recommendations for the authors):

      The authors refer to the 'crop' as the functional equivalent of the human stomach. Considering the difference in their primary functions, this cannot be justified.

      In Drosophila, the crop functions analogously to the vertebrate stomach. It is a foregut organ for storage and preliminary processing that regulates food passage into the midgut, and it is more than a simple reservoir: the crop engages in enzymatic mixing, is under neural control, and is actively motile.

      Line 163 and 166, APCs are not neurons.

      Akh-producing cells (APCs) in Drosophila are neuroendocrine cells, residing in the corpora cardiaca (CC). While they produce and secrete the hormone AKH (akin to glucagon), they are not brain interneurons per se. APCs share many neuronal features (vesicular release, axon-like projections) and receive neural inputs, effectively functioning as a peripheral endocrine center.

    1. Tenochtitlán invented Chinampas: floating gardens built on barges made of reeds and filled with fertile soil. Lake water constantly irrigated these chinampas, which are still in use and can be seen today in the Xochimilco district of Mexico City.

      It's cool that chinampas were invented so long ago and yet are still being used today. I got to see one in real life, and it almost feels like a fairytale with how much beautiful greenery there is. It's incredible that they found this as a sustainable way to feed their population because it shows just how advanced their engineering was back then.

    1. cacao beans that were whipped into a chocolate drink formed the basis of commerce

      Many people seem to think that chocolate started somewhere in Europe, but I'd like to highlight the fact that it actually came from Indigenous cultures in what is now present-day Mexico. It's super important to acknowledge this fact because of just how culturally significant it was for Mesoamericans. Present-day Mexico still makes chocolate that stands out from all competitors. Trust me on this; try a Mexican hot chocolate mix named Chocolate Abuelita. It will absolutely change your view on how chocolate should taste.

    1. Acting in ways consistent with the virtues (e.g., courage, truthfulness, wittiness, friendliness, etc.) leads to flourishing of an individual. In acting virtuously, you are training yourself to become more virtuous, and you will subsequently be able to act even more virtuously.

      This is basically saying that living virtuously isn't just about doing the right thing once, but about building good habits that make your life better overall. The idea that virtue gets easier with practice makes sense. It's like how any skill improves the more you work at it.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Response to Reviewer 1:

      The authors introduce G2PT, a hierarchical graph transformer model that integrates genetic variants (SNPs), gene annotations, and multigenic systems (Gene Ontology) to predict and interpret complex traits.

      We thank the reviewer for this accurate summary of our approach and contributions.

      Major Comments:

      Comment 1-1. Insufficient Specification of Model Architecture: The description of the "hierarchical graph transformer" lacks technical depth. Key implementation details are missing: how node embeddings are initialized for SNPs, genes, and systems; how graph connectivity is defined at each level (e.g., adjacency matrices used in Equations 5-9, the sparsity); justification for the choice of embedding dimension and number of attention heads, including any sensitivity analysis; and the architecture of the feed-forward neural networks (e.g., number of layers, activation functions, and hidden dimensions).

      Reply 1-1. As requested, we have expanded the technical description of the model architecture, including the hierarchical graph transformer (HiGT), in the Materials and Methods section. Details regarding node initialization and hierarchical connectivity are now included in the new paragraph "Model Initialization and Graph Construction." Specifically, all node embeddings corresponding to SNPs, genes, and ontology-defined systems are initialized using uniform Xavier initialization (Glorot and Bengio, 2010).
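      As a concrete illustration of the initialization scheme described above, the following minimal NumPy sketch implements uniform Xavier initialization; the table sizes are hypothetical placeholders chosen only to mirror the ~5,000-SNP setting, and the embedding dimension of 64 matches the value reported in Reply 1-5.

```python
import numpy as np

def xavier_uniform(fan_in, fan_out, rng):
    # Uniform Xavier/Glorot initialization: draw from U(-a, a) with
    # a = sqrt(6 / (fan_in + fan_out)), which keeps activation variance
    # roughly constant across layers (Glorot and Bengio, 2010).
    a = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-a, a, size=(fan_in, fan_out))

rng = np.random.default_rng(0)
d = 64  # node embedding dimension adopted in the revision (Reply 1-5)

# Embedding tables for SNP, gene, and system nodes; the row counts are
# illustrative, not the actual graph sizes used in the paper.
snp_emb = xavier_uniform(5000, d, rng)
gene_emb = xavier_uniform(2000, d, rng)
system_emb = xavier_uniform(1500, d, rng)
```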

      We have also clarified our hyperparameter optimization strategy. Learning rate, weight decay, hidden (embedding) dimension, and the number of attention heads were selected via grid search, as summarized in the new Supplementary Fig. 8, reproduced below. Based on both performance and computational efficiency, we adopted four attention heads, consistent with configurations commonly used in academic transformer models (the original Transformer used eight; Vaswani et al., 2017).

      Regarding the feed-forward neural network, we follow the standard Transformer architecture: two position-wise layers with a hidden dimension four times the node embedding size and a GeLU nonlinear activation function (Hendrycks and Gimpel, 2016). This configuration is widely established in the literature and functions as an intermediate processing step following attention; therefore, it was not a focus of hyperparameter tuning. All corresponding updates have been incorporated into the revised Methods section for clarity and completeness.
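      For readers unfamiliar with this standard component, a minimal NumPy sketch of the position-wise feed-forward block described above (two layers, 4× expansion, GeLU) might look as follows; the weight values and toy inputs are illustrative, not the trained model's parameters.

```python
import numpy as np

def gelu(x):
    # tanh approximation of the GeLU activation (Hendrycks and Gimpel, 2016)
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def feed_forward(x, W1, b1, W2, b2):
    # Position-wise FFN: expand to 4x the embedding size, apply GeLU,
    # then project back to the embedding size.
    return gelu(x @ W1 + b1) @ W2 + b2

d = 64                                   # node embedding size
rng = np.random.default_rng(0)
W1 = rng.normal(0.0, 0.02, (d, 4 * d)); b1 = np.zeros(4 * d)
W2 = rng.normal(0.0, 0.02, (4 * d, d)); b2 = np.zeros(d)

h = rng.normal(size=(10, d))             # ten toy node embeddings
out = feed_forward(h, W1, b1, W2, b2)    # same shape in, same shape out
```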

      Comment 1-2. No Simulation Studies to Validate Epistasis Detection: Ground-truth epistatic interactions should include those that have been manually validated in the literature. The central claim of discovering epistatic interactions relies heavily on the model's attention mechanism and downstream statistical filtering. However, no simulation studies are presented to validate that G2PT can reliably detect epistasis when ground-truth interactions are known. Demonstrating robust detection of non-additive interactions under varying genetic architectures and noise levels in simulated genotype-phenotype datasets is essential to substantiate the method's core capability.

      Reply 1-2. We agree that a simulation of epistasis detection using the G2PT model is a worthy addition to the manuscript. Accordingly, we have now incorporated a new section in the Results titled "Validation of Epistasis through Simulation Studies", which includes two new figures reproduced below (Supplementary Fig. 6 and Fig. 5). We have also added a new Methods section describing this simulation study under the heading "Epistasis Simulation". These simulation studies show that G2PT recovers epistatic gene pairs with high fidelity when these pairs are coherent with the systems ontology (cf. 'ontology coherence' in Supplementary Fig. 6, which reflects the probability that both SNPs are assigned to the same leaf system). Furthermore, G2PT outperforms previous tools, such as PLINK-epistasis, which do not use knowledge of the systems hierarchy in the same way (Supplementary Fig. 6b-d). Using simulation parameters consistent with current genome-wide association studies (n = 400,000) and current understanding of heritability (h² = 0.3 to 0.5) (Bloom et al. 2015; Speed and Evans 2023), we find that approximately 10% of all epistatic SNP pairs can be recovered at a precision of 50% (Fig. 5). We have provided the source code for this simulation study in our GitHub repository (https://github.com/idekerlab/G2PT/blob/master/Epistasis_simulation.ipynb).
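      As a simplified, self-contained illustration of the simulation principle (not the actual G2PT pipeline in the linked notebook), the sketch below generates a phenotype carrying a genuine SNP-SNP interaction and recovers it with an ordinary least-squares interaction test, the baseline idea underlying PLINK-style epistasis scans; all parameter values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000                              # simulated cohort size (arbitrary)
g1 = rng.binomial(2, 0.3, n)          # genotype dosages coded 0/1/2
g2 = rng.binomial(2, 0.3, n)

# Phenotype with additive effects plus a genuine g1 x g2 interaction term.
y = 0.2 * g1 + 0.2 * g2 + 0.4 * g1 * g2 + rng.normal(0.0, 1.0, n)

# Score the interaction with an ordinary least-squares t-statistic.
X = np.column_stack([np.ones(n), g1, g2, g1 * g2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
sigma2 = resid @ resid / (n - X.shape[1])
se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
t_interaction = beta[3] / se[3]       # a large |t| flags the epistatic pair
```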

      Comment 1-3. Lack of Justification for Model Complexity and Missing Ablation Insights: While Supplementary Figure 2 presents ablation studies, the manuscript needs to justify the high computational cost (168 GPU hours using 4×A30 GPUs) of the full model. It remains unclear how much performance gain is specifically due to reverse propagation (Equations 8-9), which is claimed to capture biological context. The benefit of using a full Gene Ontology hierarchy versus a flat system list is not quantified. There is also no comparison between bidirectional versus unidirectional propagation. Overall, the added complexity is not empirically shown to be necessary.

      Reply 1-3. We thank the reviewer for prompting a clearer justification of complexity and ablations. We have now revised the Results to (i) quantify the specific value of the ontology and of reverse propagation, and (ii) explain why a flat SNP→system model is computationally and biologically sub-optimal. We have added new ablation results comparing bidirectional (forward + reverse) versus forward-only propagation: reverse propagation has little effect when epistatic pairs lie within one system (ontology coherence ρ = 1.0) but substantially improves retrieval when interactions span related systems (e.g., ρ ≈ 0.8) (figure reproduced below). A flat design scores a dense genes × systems map, ignoring known sparsity (sparse SNP→gene assignments; sparse ontology edges) and losing multi-scale context; our hierarchical formulation restricts computation to observed edges (SNP→gene→system) and aggregates signals across levels, yielding better efficiency and biological fidelity.
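      To make the sparsity argument concrete, the toy sketch below aggregates SNP-level signal upward along observed SNP→gene→system edges only, instead of scoring a dense genes × systems matrix; the adjacency lists and scores are hypothetical, and the real model uses learned attention rather than plain summation.

```python
import numpy as np

# Toy sparse hierarchy (indices hypothetical): a SNP may map to several
# genes, and a gene may belong to several systems.
snp_to_gene = {0: [0], 1: [0, 1], 2: [2]}
gene_to_system = {0: [0], 1: [0], 2: [1]}

def forward_propagate(snp_scores, snp_to_gene, gene_to_system, n_genes, n_systems):
    # Aggregate signal upward along observed edges only (SNP -> gene -> system),
    # never touching absent SNP-gene or gene-system pairs.
    gene_scores = np.zeros(n_genes)
    for s, genes in snp_to_gene.items():
        for g in genes:
            gene_scores[g] += snp_scores[s]
    system_scores = np.zeros(n_systems)
    for g, systems in gene_to_system.items():
        for t in systems:
            system_scores[t] += gene_scores[g]
    return gene_scores, system_scores

gene_scores, system_scores = forward_propagate(
    np.array([1.0, 2.0, 3.0]), snp_to_gene, gene_to_system, n_genes=3, n_systems=2)
```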

      Comment 1-4. Non-Equivalent Benchmarking Against PRS Methods: Figure 2 compares G2PT to polygenic risk score (PRS) methods such as LDpred2 and Lassosum, but G2PT is run only on SNPs pre-filtered by marginal association (p-values between 10⁻⁵ and 10⁻⁸), while the PRS methods use genome-wide SNPs. This introduces a strong bias in G2PT's favor by effectively removing noise. A fair comparison would require: (a) running LDpred2 and Lassosum on the same pre-filtered SNP sets as G2PT, or (b) running G2PT on genome-wide or LD-pruned SNP sets. The reported superior performance of G2PT may be driven primarily by this input filtering, not the model architecture.

      Reply 1-4. We appreciate the reviewer's concern regarding benchmarking equivalence. In response, we have extended our analyses to include PRS-CS (Ge et al., 2019) and SBayesRC (Zheng et al., 2024), two state-of-the-art Bayesian shrinkage methods comparable to LDpred2 and Lassosum. Although we initially attempted to run LDpred2 and Lassosum under all SNP-filtering conditions, their computational requirements at UK Biobank scale proved prohibitively time-consuming. We therefore focused on PRS-CS and SBayesRC, which offer similar modeling principles with greater computational tractability. These methods have now been run under SNP-filtering conditions matched to our original study. The new results demonstrate that G2PT consistently outperforms PRS-CS and SBayesRC (new Fig. 2, reproduced below), indicating that its performance advantage is not solely attributable to SNP pre-filtering but also to its hierarchical attention-based architecture.

      Comment 1-5: No Details on Hyperparameter Optimization: Although the manuscript mentions grid search for hyperparameter tuning, it provides no information about which parameters were optimized (e.g., learning rate, dropout rate, weight decay, attention dropout, FFNN dimensions), what search space was explored, or what final values were selected. There is also no assessment of how sensitive the model's performance is to these choices. Better transparency would help facilitate reproducibility.

      Reply 1-5. We agree with the reviewer and have expanded the manuscript to include full details of hyperparameter optimization. As described in the revised Methods section, we performed a grid search over learning rate {10⁻³, 10⁻⁴, 10⁻⁵}, hidden dimension {64, 128}, and weight decay {0, 10⁻⁵, 10⁻³}. The results, summarized in Supplementary Fig. 8 (reproduced above), show that model performance is most sensitive to the learning rate, while hidden dimension and weight decay exert more moderate effects. Based on these findings, we selected a learning rate of 10⁻⁵, a hidden dimension of 64, and a weight decay of 10⁻³ for all subsequent experiments. Although a hidden dimension of 128 slightly improved performance, we adopted 64 to balance predictive accuracy with computational efficiency.
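      The selection loop itself is straightforward; the sketch below enumerates the stated grid with a placeholder scoring function (standing in for "train G2PT, evaluate on held-out data"), contrived here purely so that the configuration adopted in the paper comes out on top.

```python
from itertools import product

# Search space as stated in the revised Methods.
learning_rates = [1e-3, 1e-4, 1e-5]
hidden_dims = [64, 128]
weight_decays = [0.0, 1e-5, 1e-3]

def validation_score(lr, dim, wd):
    # Placeholder for the real objective (model performance on a
    # validation split); this toy surface simply peaks at the adopted
    # configuration for illustration.
    return -abs(lr - 1e-5) * 1e4 - (dim - 64) * 1e-4 - abs(wd - 1e-3)

# Exhaustive grid search: evaluate every combination, keep the best.
best = max(product(learning_rates, hidden_dims, weight_decays),
           key=lambda cfg: validation_score(*cfg))
```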

      Comment 1-6. Absence of Control for Key Confounders: In interpreting attention scores as reflecting genetic relevance (e.g., the role of the immunoglobulin system), the model includes only age, sex, and genetic principal components as covariates. Important confounders such as BMI, alcohol use, or medication (e.g., statins) have not been controlled for. Since TG/HDL levels are strongly influenced by environment and lifestyle, it is entirely plausible that some high-attention features reflect environmental tagging, not biological causality.

      Reply 1-6. In the current framework, we included age, sex, and genetic principal components to account for demographic and population-structure effects, focusing on genetic contributions within a controlled baseline. We acknowledge that non-genetic covariates can influence downstream biological states and may indirectly shape attention at the gene or system level. Accurately modeling such effects requires an extended framework where environmental variables directly modulate gene and system embeddings rather than being implicitly absorbed by the attention mechanism. We have clarified these limitations in the Discussion along with plans to incorporate explicit confounder modeling in future extensions of G2PT.

      Comment 1-7. Oversimplified Treatment of SNP-to-Gene Mapping: The SNP-to-gene mapping strategy combines cS2G, eQTL, and nearest-gene annotations, but the limitations of this approach are not adequately addressed. The manuscript does not specify how conflicts between methods are resolved or what fraction of SNPs map ambiguously to multiple genes. Supplementary Figure 2 shows model performance degrades when using only nearest-gene mapping, but there is no systematic analysis of how mapping uncertainties propagate through the hierarchy and affect attention or interpretation.

      Reply 1-7. In the revision (Results), we have clarified how conflicts between cS2G, eQTL, and nearest-gene annotations are resolved, and we have reported the proportion of SNPs that map to multiple genes across these three annotation approaches. We note that the hierarchical attention mechanism enables the model to prioritize among alternative gene mappings in a data-driven manner, and this is a major strength of the approach. As shown in Fig. 3 (Results, reproduced below), SNP-to-gene attention weights reveal dominant linkages, reducing the impact of mapping uncertainty on interpretation. We now explicitly describe this mechanism and acknowledge that further work in probabilistic mapping and fine-mapping approaches is a valuable future direction for improving resolution and interpretability.

      "For SNPs with several potential SNP-to-gene mappings (Methods), we found that G2PT often prioritized one of these genes in particular due to its membership in a high-attention system. For example, the chr11q23.3 locus contains multiple genes including the APOA1/C3/A4/A5 gene cluster (Fig. 3c) which is well-known to govern lipid transport, an important system for G2PT predictions (Fig. 3a). Due to high linkage disequilibrium in the region, all of its associated SNPs had multiple alternative gene mappings available. For example, SNP rs1145189 mapped not only to APOA5 but to the more proximal BUD13, a gene functioning in spliceosomal assembly (a system receiving substantially lower G2PT attention). Here, the relevant information flow learned by G2PT was from rs1145189 to APOA5 to lipid transport and protein-lipid complex remodeling (Fig. 3c; and conversely, deprioritizing BUD13 as an effector gene for TG/HDL). We found that this particular genetic flow was corroborated by exome sequencing, which implicates APOA5 but not BUD13 in regulation of TG/HDL, using data that were not available to G2PT. Similarly, two other SNPs at this locus - rs518547 and rs11216169 - had potential mappings to their closest gene SIK3, where they reside within an intron, but also to regulatory elements for the more distant lipid transport genes APOC3 and APOA4. Here, G2PT preferentially weighted the mappings to APOC3 and APOA4 rather than to SIK3 (Fig. 3c)."

      Comment 1-8. Naive Scoring of System Importance: The method used to quantify the biological relevance of systems (i.e., correlating attention scores with predicted phenotype values) risks circular reasoning. Since the model is trained to optimize prediction, systems that contribute strongly to prediction will naturally show high correlation-even if they are not biologically causal. No comparison is made with established gene set enrichment methods applied to GWAS summary statistics. The approach lacks an independent benchmark to validate that the "important" systems are biologically meaningful.

      Reply 1-8. As requested, we compared G2PT's system-level importance scores with results from MAGMA competitive gene-set analysis, an established enrichment approach. This analysis indeed shows a significant correlation between the systems identified by the two approaches (ρ = 0.26, p < 0.01; Supplementary Table 2), reflecting a shared emphasis on canonical lipid processes. We also observed systems detected by G2PT but not strongly detected by MAGMA's linear enrichment model, for example the lipopolysaccharide-mediated signaling pathway (Kalita et al. 2022).
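      For reference, the reported ρ is a Spearman rank correlation between the two system-level rankings; a minimal implementation (ignoring ties) with hypothetical per-system scores is sketched below.

```python
import numpy as np

def spearman(x, y):
    # Spearman rank correlation without tie handling: the Pearson
    # correlation of the rank-transformed scores.
    rx = np.argsort(np.argsort(x))
    ry = np.argsort(np.argsort(y))
    return float(np.corrcoef(rx, ry)[0, 1])

# Hypothetical per-system scores from the two approaches: G2PT
# attention-based importance vs. MAGMA -log10 p-values.
g2pt_importance = np.array([0.9, 0.2, 0.5, 0.7, 0.1])
magma_neglogp = np.array([3.1, 0.4, 1.0, 2.2, 0.8])
rho = spearman(g2pt_importance, magma_neglogp)
```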

      Comment 1-9. No External Validation to Assess Generalizability. All evaluations are performed using cross-validation within the UK Biobank. There is no assessment of generalizability to independent cohorts or diverse ancestries. Given population structure, genotyping platform, and phenotype measurement variability, external validation is essential before claiming the method is suitable for broader use in polygenic risk assessment.

      Reply 1-9. External validation of the G2PT model requires individual-level genotype data with paired TG/HDL measurements, a sample size at the scale of the UK Biobank, and GPU access to those data. We therefore turned to the All of Us program, a large and diverse cohort providing individual-level data along with T2D status and HbA1c measurements. We first processed the All of Us genotype and phenotype data as we had processed the UKBB data (Methods), yielding 41,849 participants with T2D and 80,491 without T2D across various ethnicities. We then transferred the trained T2D G2PT model to the AoU Workbench and evaluated its performance. The model demonstrated robust discriminative capability with an explained variance of 0.025, as shown in the new Fig. 2d (reproduced above).
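      Explained variance here is the usual 1 - Var(residual)/Var(phenotype); a minimal sketch with toy data (the values are hypothetical, not the All of Us results):

```python
import numpy as np

def explained_variance(y_true, y_pred):
    # Fraction of phenotype variance captured by the predictions:
    # 1 - Var(residual) / Var(phenotype).
    return 1.0 - np.var(y_true - y_pred) / np.var(y_true)

# Toy phenotype and predictions, purely for illustration.
y_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_pred = np.array([1.2, 1.8, 3.1, 3.9, 5.2])
ev = explained_variance(y_true, y_pred)
```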

      Comment 1-10. Computational Burden and Scalability Are Not Addressed: The paper notes that training the model requires 168 GPU hours on 4×A30 GPUs for just ~5,000 SNPs. However, there is no discussion of whether G2PT can scale to larger SNP sets (e.g., genome-wide imputed data) or more complex biological hierarchies (e.g., Reactome pathways). Without addressing scalability, the model's applicability to real-world, large-scale genomic datasets remains unclear.

      Reply 1-10. We have addressed scalability with both engineering optimizations and new scalability experiments. First, we refactored the model to use xFormers memory-efficient attention for the hierarchical graph transformer (Lefaudeux et al., 2022), which also enables full parallelization of training, reducing bottlenecks. Second, we added a scaling study with progressively increasing SNP counts. On 4×A30 GPUs, end-to-end training time for the 5k-SNP setting decreased from 4,000 to 400 minutes (approximately 7 GPU-hours; a 10× reduction). These new results are given in Supplementary Fig. 7, reproduced below.

      Minor Comment:

      Comment 1-11. Attention Weights as Mechanistic Insight: The paper equates high attention scores with biological importance, for example in highlighting the immunoglobulin system. There is no causal validation showing that altering the highlighted SNPs, genes, or systems has an actual effect on TG/HDL. Attention weights in transformer models are known to sometimes reflect spurious correlations, especially in high-dimensional settings. The correlation between attention scores and predictions (Supplementary Fig. 3a,b) does not constitute biological evidence. The interpretability claims can be restated without supporting functional or causal validation.

      Reply 1-11. We thank the reviewer for this thoughtful comment. We agree that attention weights are not causal evidence. In the revision, we (1) reframe attention-based findings as hypothesis-generating rather than mechanistic, and (2) add an explicit limitation noting that correlations between attention scores and predictions do not constitute biological validation.

      Response to Reviewer 2:

      This manuscript describes the introduction of the Genotype-to-Phenotype Transformer (G2PT), described by the authors as "a framework for modeling hierarchical information flow among variants, genes, multigenic systems, and phenotypes." The authors used the ratio TG/HDL as a trait for proof of concept of this tool.

      This is a potentially interesting computational tool of interest to bioinformaticians, computational genomicists, and biologists.

      We thank the reviewer for their overall positive assessment of our study.

      Comment 2-1. The rationale for choosing the TG/HDL ratio for this proof of concept analysis is not well justified beyond it being a marker for insulin resistance. Overall the use of a ratio may be problematic (see below). Analyses of TG and HDL separately as individual quantitative traits would be of interest. And an analysis of a dichotomous clinical trait (T2DM or CAD) would also be of great interest.

      Reply 2-1. We thank the reviewer for this suggestion. In the revised manuscript, we have expanded our analyses beyond the TG/HDL ratio to include TG and HDL as individual quantitative traits (Fig. 2, reproduced below). These additional analyses demonstrate that G2PT captures predictive signals robustly across each lipid component, not solely through their ratio. Furthermore, to address the reviewer's interest in clinical outcomes, we incorporated an analysis of type 2 diabetes (T2D) as a dichotomous trait of direct clinical relevance. Collectively, these results strengthen the rationale for our chosen phenotype and show that the G2PT framework generalizes effectively across quantitative and binary traits, consistently outperforming advanced PRS and machine learning benchmarks.

      Comment 2-2. The approach to mapping SNPs to genes does not incorporate the most advanced approaches. This should be described in more detail.

      Reply 2-2. We agree that the choice of SNP-to-gene mapping materially affects both performance and interpretability; indeed, our epistasis simulations suggest that more accurate mappings can improve recovery and localization. In this proof-of-concept work we use a straightforward, modular mapping sufficient to demonstrate the modeling framework, and we have clarified this in the Methods. The architecture is designed so that alternative SNP-to-gene maps (e.g., eQTL/colocalization-based assignments, promoter-capture Hi-C) can be swapped in. A dedicated follow-up study will systematically compare these alternatives and quantify their impact on attribution and downstream discovery.

      Comment 2-3. The example of gene prioritization at the A1/C3/A4/A5 gene locus is not particularly illuminating, as the prioritized genes are already well-known to influence TG and HDL-C levels and the TG/HDL ratio. Can the authors provide an example where G2PT prioritized a gene at a locus that is not already a well-known regulator of TG and HDL metabolism?

      Reply 2-3. We thank the reviewer for this suggestion. We have revised the manuscript to de-emphasize the well-established APOA1 locus and instead highlight the less expected "Positive regulation of immunoglobulin production" system (Figure 3a,b, Discussion). Here our model prioritizes the gene TNFSF13 based on specific variants that have not previously been associated with TG or HDL (e.g., rs5030405, rs1858406, shown in blue). This finding points to an intriguing, non-canonical link between B-cell regulation and lipid metabolism. While full exploration of this finding is beyond the scope of the present methods paper, this example demonstrates G2PT's ability to identify novel, high-priority candidates in atypical systems.

      Comment 2-4. The identification of epistatic interactions is a potentially interesting application of G2PT. However, suppl table 1 shows a very limited number of such interactions with even fewer genes, and most of these are well established biological interactions (such as LPL/apoA5). The TGFB1 and FKBP1A interaction is interesting and should be discussed. What is needed for increasing the number of potential interactions, greater power?

      Reply 2-4. We are glad the reviewer appreciates the use of the G2PT model to identify epistatic interactions. We have now discussed a potential mechanism of epistasis between TGFB1 and FKBP1A in the protein dephosphorylation system (Discussion). In addition, we have addressed the reviewer's question about statistical power through extensive epistasis simulations (Fig. 5 and Supplementary Fig. 6), which show that G2PT's detection ability scales strongly with sample size: 1,000 samples are insufficient, performance improves at 5,000, and power becomes reliable at 100,000. Realistic simulations (Fig. 5b-d) further demonstrate that under biologically plausible architectures, G2PT can robustly recover specific interactions even within complex genetic backgrounds.

      Comment 2-5. Furthermore, the use of the TG/HDL ratio for the assessment of epistatic interactions may be problematic. For example, if one SNP affected only TG and the other only HDL-C, it would appear to be an epistatic interaction with regard to the ratio, although the biological epistasis may be limited to non-existent.

      Reply 2-5. We have greatly expanded the set of example phenotypes modeled in our study; please see our Reply 2-1 above.

      Response to Reviewer 3:

      This manuscript by Lee et al provides a sensible and powerful approach to polygenic score prediction. The model aggregates information from SNPs to genes to systems, using a transformer based architecture, which appears to increase predictive performance, produce interpretable outputs of genes and systems that underlie risk, and identify candidates for epistasis tests.

      I think the manuscript is clear and well written, and conducted via state-of-the-art approaches. I don't have any concerns regarding the claims that are made.

      We thank the reviewer for their very positive assessment of our study.

      Major comments:

      Comment 3-1. Specifically, lipid based traits are perhaps the most well-powered and the most biologically coherent; they are also very well-studied biologically and thus overrepresented in the gene ontology. It is unclear whether this approach will work as well for a trait like Schizophrenia for which the underlying pathways are not as well captured in existing ontologies. The authors anticipate this in their limitations section, and I am not expecting them to solve every issue with this, but it would be nice to expand the testing a little bit beyond only this one trait.

      Reply 3-1. We appreciate the reviewer's suggestion to expand beyond a single lipid trait. In the revised manuscript, we have included analyses of additional phenotypes, including low-density lipoprotein (LDL) and T2D (Fig. 2). These additions demonstrate the broader applicability of our framework beyond a single trait class.

      Comment 3-2. It also seems like the authors have not compared their method to the truly latest PRS methods, such as PRS-CSx and SBayesR. I would suggest adding some of the methods shown to be the best from this recent paper: https://www.nature.com/articles/s41598-025-02903-1

      Reply 3-2. We agree these are important comparators. Accordingly, we have extended our comparison to include PRS-CS (Ge et al., 2019) and SBayesRC (Zheng et al., 2024), the latter following its strong performance in recent benchmarking studies (see Figure 2 above). We confirmed that G2PT outperforms these advanced PRS methods across all three phenotypes: TG/HDL ratio, LDL, and T2D.

      Comment 3-3. Another major comment regards whether this method could be applied to traits with just GWAS summary statistics, rather than individual level data. This would not enable identification of specific methods underlying an individual, but it could still learn SNP based weights that could be mapped to genes and systems that could help explain risk when the model is applied to individuals (kind of like a pretraining step?)

      Reply 3-3. We appreciate this suggestion. While SNP weights from GWAS summary statistics could, in principle, serve as informative priors for attention values, incorporating them would require a sophisticated mathematical formulation that is beyond the scope of this study. Our current framework also relies on individual-level genotype and phenotype data to capture multilevel information flow and individual-specific variation.

      Minor comments:

      Comment 3-4. Why the need to constrain to a small number of SNPs? Is it just computational cost? If so, what would happen as power increases and more SNPs exceed the thresholds used?

      Reply 3-4. Yes, the constraint was driven by computational cost, but we have now modified the code for improved computational efficiency. First, we refactored the model to use xFormers memory-efficient attention for the hierarchical graph transformer (Lefaudeux et al., 2022), which also enables full parallelization of training and reduces bottleneck effects. Second, we added a scaling study of the impact of varying SNP count. On 4×A30 GPUs, end-to-end training time for the 5k-SNP setting decreased from 65 hours to 7 hours (a ~9× speedup). Based on Fig. 2 (reproduced above), we expect performance to increase further as more SNPs are provided to the model. With the optimized implementation, users can raise SNP thresholds as power increases; the expected behavior is improved accuracy up to a plateau, while hierarchical sparsity keeps training tractable and well regularized.
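      For intuition, the core idea behind memory-efficient attention is to avoid materializing the full query-by-key score matrix at once. The following is a minimal NumPy sketch of query-block chunking (not the xFormers implementation, which additionally tiles over keys with an online softmax); block sizes and dimensions are illustrative.

      ```python
      import numpy as np

      def softmax(x, axis=-1):
          x = x - x.max(axis=axis, keepdims=True)
          e = np.exp(x)
          return e / e.sum(axis=axis, keepdims=True)

      def attention_full(q, k, v):
          # Standard attention: materializes the full (n_q x n_k) score matrix.
          scores = q @ k.T / np.sqrt(q.shape[-1])
          return softmax(scores) @ v

      def attention_chunked(q, k, v, block=32):
          # Memory-efficient variant: process queries in blocks, so peak
          # memory is O(block * n_k) rather than O(n_q * n_k). The output
          # is identical to the full computation.
          out = np.empty((q.shape[0], v.shape[1]))
          for i in range(0, q.shape[0], block):
              qb = q[i:i + block]
              scores = qb @ k.T / np.sqrt(q.shape[-1])
              out[i:i + block] = softmax(scores) @ v
          return out

      rng = np.random.default_rng(0)
      q, k, v = (rng.normal(size=(128, 16)) for _ in range(3))
      assert np.allclose(attention_full(q, k, v), attention_chunked(q, k, v))
      ```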

      Comment 3-5. What type of sample size/power does this method require to work well? If others were to use it, how many SNPs/samples would be needed to obtain good performance?

      Reply 3-5. To address this comment, we quantified performance as a function of training set size by subsampling the cohort and retraining G2PT with an identical architecture and SNP set. New Supplementary Fig. 3 (reproduced below) shows monotonic gains with sample size across three representative phenotypes. We found that stable performance is reached by ~100k samples. These trends hold for continuous traits (TG/HDL, LDL) and, more modestly, for a binary trait (T2D), consistent with the lower per-sample information content of case-control settings.
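      The subsampling procedure described above can be sketched as a simple learning-curve experiment. The toy below substitutes an ordinary least-squares predictor for G2PT and simulates genotypes with hypothetical effect sizes; sample sizes, SNP count, and heritability are illustrative only, and the seeded results will differ from the manuscript's.

      ```python
      import numpy as np

      # Toy learning curve: simulate genotypes and an additive phenotype,
      # refit a linear predictor at increasing training sizes, and measure
      # held-out R^2. All parameters are hypothetical.
      rng = np.random.default_rng(42)
      n_test, n_snps = 2000, 50
      beta = rng.normal(scale=0.1, size=n_snps)  # hypothetical SNP effects

      def simulate(n):
          g = rng.binomial(2, 0.3, size=(n, n_snps)).astype(float)
          y = g @ beta + rng.normal(scale=1.0, size=n)
          return g, y

      g_test, y_test = simulate(n_test)

      def r2_at(n_train):
          g, y = simulate(n_train)
          coef, *_ = np.linalg.lstsq(g, y, rcond=None)  # refit from scratch
          pred = g_test @ coef
          ss_res = ((y_test - pred) ** 2).sum()
          ss_tot = ((y_test - y_test.mean()) ** 2).sum()
          return 1 - ss_res / ss_tot

      # Held-out R^2 improves with training size, then plateaus near the
      # simulated heritability.
      scores = [r2_at(n) for n in (100, 1000, 10000)]
      print(scores)
      ```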

    1. But, since the donkey does not understand the act of protest it is performing, it can’t be rightly punished for protesting. The protesters have managed to separate the intention of protest (the political message inscribed on the donkey) and the act of protest (the donkey wandering through the streets).

      This is a pretty clever way to protest without getting in trouble. The donkey is just walking around with no idea it's part of a political statement, so it can't really be punished for anything. It's interesting how the protesters split up the different parts of protesting so that the actual messenger (the donkey) has no clue what's going on, while they stay anonymous and spread their message.

    1. When he was but a short distance from the ship, the horse which Erik was riding stumbled, and he was thrown from his back and wounded his foot, whereupon he exclaimed, “It is not designed for me to discover more lands than the one in which we are now living, nor can we now continue longer together.” Erik returned home to Brattahlid, and Leif pursued his way to the ship with his companions, thirty-five men.

      I think it's interesting that Erik gave up this amazing voyage just because he was slightly injured. He turned around and abandoned the whole plan when one thing didn't go his way. I wonder if he was really injured, or if it was an excuse not to partake in the adventure.