  1. Jun 2026
    1. Reviewer #1 (Public review):

      The manuscript examines whether insects can use bat odor as a cue of predation risk. The authors focus on the insectivorous bat Scotophilus kuhlii and the cricket Loxoblemmus equestris. They first use fecal DNA metabarcoding to show that crickets are part of the bat's diet, and field surveys to show that L. equestris is abundant at local foraging sites. In laboratory Y-tube assays, the authors show that crickets strongly avoid air carrying bat body odor. Gas chromatography coupled with electroantennographic detection showed that cricket antennae respond to components of bat odor. Chemical analyses identified several volatile compounds, with 2,2-dimethylheptane and (−)-limonene associated with antennal responses. Further analyses suggested that snout secretions are likely to contribute to the bat's body odor. The authors then tested individual compounds. Among the commercially available candidates, (−)-limonene elicited a strong antennal response and was sufficient to cause avoidance in the olfactometer. In field plots, spraying (−)-limonene reduced cricket calling activity relative to pre-exposure levels, whereas calling increased in control plots treated with hexane. Overall, the study argues that crickets can detect a vertebrate predator through olfactory cues and that a single bat-associated volatile can trigger antipredator behavior.

      This is an interesting and enjoyable study that addresses an understudied aspect of predator-prey interactions. The manuscript is clearly written, the experiments are presented in a logical sequence, and the figures are crisp and easy to follow. I really appreciated the combination of behavioral assays, electrophysiology, chemical analysis, and field observations.

      My main issue concerns the identity and biological origin of the proposed bat odor cue, (−)-limonene. Limonene seems like an unusual compound to be emitted endogenously by a mammal, particularly by an insectivorous bat. It would be helpful if the authors could clarify whether mammals are known to synthesize this compound de novo, and, if not, what the likely source of this plant-associated terpene would be in S. kuhlii. Possible sources could include environmental exposure, diet, roosting material, handling, or temporary housing conditions.

      I do not doubt that crickets avoid synthetic (−)-limonene. Indeed, this result is quite plausible given that limonene is widely used in insect repellent or repellent-associated fragrance products. However, this also makes contamination an important issue to address explicitly. How did the authors exclude the possibility that limonene entered the samples from human-associated sources, such as insect repellents, soaps, cleaning products, field equipment, cloth bags, cages, gloves, or other materials used while handling wild-caught bats? It would strengthen the manuscript to report limonene levels for individual bat odor collections, all relevant blanks, and any handling or housing controls.

      More broadly, given the common occurrence of limonene in plants and human-associated products, I am not yet convinced that it would function as a reliable "keystone kairomone" as suggested around line 253. How would crickets distinguish bat-associated limonene from limonene emitted by a mint leaf, citrus peel, pine material, or other non-threatening environmental sources? The authors may wish to soften this interpretation or provide additional evidence that crickets respond to limonene in a bat-specific context, perhaps through concentration, temporal patterning, co-occurring volatiles, or enantiomeric composition.

    2. Reviewer #2 (Public review):

      Summary:

      Many insects possess extremely sensitive olfactory systems that can detect chemical signals from distances of several kilometers. For decades, the arms race between bats and insects has served as a prime example of acoustic co-evolution, and the auditory adaptations of insects to echolocation have been well documented. Crickets have a multisensory predator recognition system with keen olfactory, tactile, and auditory senses. However, whether crickets can use the scent of bats to avoid them has remained unknown. The authors hypothesized that the cricket Loxoblemmus equestris might eavesdrop on volatile organic compounds (VOCs) emitted by its bat predator, Scotophilus kuhlii, as an early warning. L. equestris is one of the prey species of S. kuhlii, and the authors demonstrated that the body odor of S. kuhlii triggers robust avoidance and electrophysiological responses in L. equestris, and that a single compound, (−)-limonene, is sufficient to elicit this avoidance in the laboratory and to suppress calling in the field. Overall, the paper presents a complete chain of evidence and deserves high praise.

      Comments:

      (1) Olfactory eavesdropping can transcend the evolutionary divide between vertebrate predators and invertebrate prey, enabling invertebrates to trigger defensive avoidance behaviors in response to predator-derived volatile odors. This phenomenon is empirically well documented and does not need to be overemphasized.

      (2) Without quantitative analysis of the relative content of the key compound, limonene, I do not understand how the concentration of the limonene standard was chosen for the EAD recordings or for the field experiments. How was the limonene concentration for field spraying determined, and does it reflect concentrations found in the wild?

      (3) Figures 1C and D should compare the GC-EAD response of L. equestris to bat body odor with its response to the odor of bat nasal secretions, rather than with the air control group. Figure 1D has the same problem.

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      Noell et al. have presented a careful study of the dissociation kinetics of kinesin-1, -2, and -3 motors moving in vitro on a microtubule. These motors move against the opposing force from a ~1 micron DNA strand (DNA tensiometer) that is tethered to the microtubule and also bound to the motor via specific linkages (Fig 1A). The authors compare the time for which motors remain attached to the microtubule when they are tethered to the DNA with the time when they are not. If the former is longer, the interpretation is that the force on the motor from the stretched DNA (presumed to act solely along the length of the microtubule) reduces the motor's detachment rate from the microtubule. Thus, the specific motor exhibits "catch-bond"-like behaviour.

      Strengths:

      The motivation is good - to understand how kinesin competes against dynein through the possible activation of a catch bond. Experiments are well done and there is an effort to model the results theoretically.

      Weaknesses from original round of review:

      The motivation of these studies is to understand how kinesin (1/2/3) motors would behave when pitted in a tug of war against dynein motors as they transport cargo in a bidirectional manner on microtubules. Earlier work on dynein and kinesin motors using optical tweezers has suggested that dynein shows catch-bond behavior, whereas no such signatures were seen for kinesin. Based on their data with the DNA tensiometer, the authors would like to claim that (i) kinesin-1 and kinesin-2 also show catch bonding and (ii) the earlier results using optical traps suffer from vertical forces, which complicate the catch-bond interpretation.

    2. Reviewer #2 (Public review):

      Summary:

      To investigate the detachment and reattachment kinetics of kinesin-1, -2, and -3 motors against loads oriented parallel to the microtubule, the authors used a DNA tensiometer approach comprising a DNA entropic spring attached to the microtubule on one end and to a motor on the other. They found that for kinesin-1 and kinesin-2 the dissociation rates at stall were smaller than the detachment rates during unloaded runs. With regard to the complex reattachment kinetics found in the experiments, the authors argue that these findings are consistent with a weakly bound 'slip' state preceding motor dissociation from the microtubule. The behavior of kinesin-3 was different and (by the definition of the authors) only showed prolonged "detachment" rates when disregarding some of the slip events. The authors performed stochastic simulations that recapitulate the load-dependent detachment and reattachment kinetics for all three motors. They argue that the presented results provide insight into how the kinesin-1, -2, and -3 families transport cargo in complex cellular geometries and compete against dynein during bidirectional transport.
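
      To make the "entropic spring" idea concrete, the sketch below evaluates the standard Marko-Siggia worm-like-chain interpolation for the force a dsDNA tether exerts at a given extension. It is illustrative only: the reviews do not state which force-extension model the authors used, and the contour length (1000 nm, taken from the "~1 micron" figure quoted by Reviewer #1), the persistence length (50 nm), and the thermal energy (4.11 pN nm) are assumptions, as is the function name `wlc_force`.

```python
# Hypothetical illustration: worm-like-chain (Marko-Siggia) force-extension
# relation for a dsDNA tether. All parameter values are assumptions.

KBT_PN_NM = 4.11  # assumed thermal energy near room temperature, in pN*nm


def wlc_force(extension_nm, contour_nm=1000.0, persistence_nm=50.0):
    """Return the force (pN) needed to hold the chain at the given extension."""
    x = extension_nm / contour_nm  # fractional extension
    if not 0.0 <= x < 1.0:
        raise ValueError("extension must lie in [0, contour length)")
    # Marko-Siggia interpolation: F = (kBT / Lp) * [1/(4(1-x)^2) - 1/4 + x]
    return (KBT_PN_NM / persistence_nm) * (0.25 / (1.0 - x) ** 2 - 0.25 + x)


if __name__ == "__main__":
    for ext in (500, 800, 900, 950, 980):
        print(f"{ext:4d} nm -> {wlc_force(ext):6.2f} pN")
```

      Under these assumptions the force stays well below 1 pN until the tether is mostly extended and then rises steeply as the extension approaches the contour length, which is the property that lets a motor walk nearly unloaded before stalling against the DNA.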

      Strengths:

      The present study is timely, as significant concerns have been raised previously about studying motor kinetics in optical (single-bead) traps where significant vertical forces are present. Moreover, the obtained data are of high quality and the experimental procedures are clearly described.

    3. Reviewer #3 (Public review):

      Summary:

      Several recent findings indicate that forces perpendicular to the microtubule accelerate kinesin unbinding, where perpendicular and axial forces were analyzed using the geometry in a single-bead optical trapping assay (Khataee and Howard, 2019), comparison between single-bead and dumbbell assay measurements (Pyrpassopoulos et al., 2020), and comparison of single-bead optical trap measurements with and without a DNA tether (Hensley and Yildiz, 2025).

      Here, the authors devise an assay to exert forces along the microtubule axis by tethering kinesin to the microtubule via a dsDNA tether. They compared the behavior of kinesin-1, -2, and -3 when pulling against the DNA tether. In line with previous optical trapping measurements, kinesin unbinding is less sensitive to force when the forces are aligned with the microtubule axis. Surprisingly, the authors find that both kinesin-1 and -2 detach from the microtubule more slowly when stalled against the DNA tether than in unloaded conditions, indicating that these motors act as catch bonds in response to axial loads. Axial loads accelerate kinesin-3 detachment. However, kinesin-3 reattaches quickly to maintain forces. For all three kinesins, the authors observe weakly attached states where the motor briefly slips along the microtubule before continuing a processive run.

      Strengths:

      These observations suggest that the conventional view that kinesins act as slip bonds under load, as concluded from single-bead optical trapping measurements where perpendicular loads are present due to the force being exerted on the centroid of a large (relative to the kinesin) bead, needs to be reconsidered. Understanding the effect of force on the association kinetics of kinesin has important implications for intracellular transport, where the force-dependent detachment governs how kinesins interact with other kinesins and opposing dynein motors (Muller et al., 2008; Kunwar et al., 2011; Ohashi et al., 2018; Gicking et al., 2022) on vesicular cargoes.

    1. Reviewer #1 (Public review):

      This manuscript by Zhang et al. addresses how Pi scarcity/depletion drives PMB resistance in Enterobacteriaceae and proposes a pathway that is mechanistically distinct from the better-known PhoBR-linked phospholipid-remodeling responses in other Gram-negatives. The authors also suggest an intervention strategy based on Mg repletion or Fe chelation. The results are substantial and include genetic analyses, mass spectrometry, reporter assays, phospho-signaling readouts, metal quantification, and comparative analyses across enterobacterial species.

      The paper reads well, with an emphasis on Mg loss followed by Fe mobilization during Pi depletion, which induces activation of the PmrAB two-component system (TCS) and lipid A modification through transcriptional activation of the ugd and arn genes. However, PmrAB is a well-known TCS responsible for PMB resistance through lipid A modification, as established by extensive studies from the Groisman lab. PmrA is a well-known transcriptional regulator that activates transcription of the ugd gene in Salmonella and Yersinia in response to Mg depletion and Fe mobilization. Therefore, the current paper should focus more on the upstream signaling that connects Pi depletion to Mg loss. This is important because ugd expression is not affected by PmrAB under Pi depletion. It should also be considered that Mg loss is temporally associated with Fe mobilization, but the manuscript does not quantitatively show that Mg dissociation/redistribution is sufficient to trigger Fe mobilization in the absence of Pi depletion, considering that Mg is a macronutrient, whereas Fe is a micronutrient.

      Second, the relationship between arn and ugd regulation needs clearer mechanistic resolution to explain how L-Ara4N synthesis is orchestrated during Pi depletion, because the manuscript shows that arn activation is PmrAB-dependent, whereas ugd is only partially PhoBR-dependent and not dependent on PmrAB. Yet the current model and narrative treat the system as a unified "ugd-arn" output. This should be carefully addressed, given that Pi depletion and Mg depletion might trigger different signaling modules.

      Third, the manuscript argues that this is a "conserved" circuit in Enterobacteriaceae. The evidence for conservation is presently strongest in E. coli MG1655 and includes supportive observations in E. coli O157, one UTI strain by lipid A MS, several UTI isolates by killing assay, and S. Typhimurium for key phenotypes. No direct mechanistic validation is shown in other clinically important genera of Enterobacteriaceae, such as Klebsiella, Enterobacter, Citrobacter, Yersinia, or Serratia.

      Fourth, the reversal and translational claims are somewhat stronger than the current evidence supports. The title and Abstract state that identifying and targeting the circuit reverses Pi depletion-driven PMB resistance, and the manuscript suggests a pharmacological intervention framework based on Mg supplementation or Fe chelation. The actual intervention evidence is limited to in vitro killing assays in E. coli and S. Typhimurium under acute Pi depletion (3-hour starvation in MOPS minimal medium), without in vivo testing or host-mimicking, infection-relevant conditions. The reversal needs to be shown not only by survival curves but also by quantitative MIC/MBC measurements.

      More importantly, the authors demonstrated that the signaling module engaged upon Pi limitation in enterobacteria differs from that in other Gram-negative bacteria such as pseudomonads. However, they did not discuss why this difference would matter for the lifestyle of enterobacteria. The authors should consider the glycolytic pathways (i.e., the EMP pathway in enterobacteria vs the ED pathway in pseudomonads), in that the ED pathway requires less Pi, whereas the EMP pathway requires more Pi. It should be noted that Pi supply is highly limited in the natural environment of free-living bacteria, rather than in the host environment of commensals.

    2. Reviewer #2 (Public review):

      Summary:

      Using E. coli K-12 as a model system, the authors investigated how phosphate (Pi) depletion induces polymyxin resistance in Enterobacteriaceae, which notably lack the canonical phospholipid remodeling pathways commonly associated with phosphate starvation responses. They demonstrated that low-phosphate conditions promote L-Ara4N modification of lipid A, thereby enhancing polymyxin resistance. Proteomic analyses revealed significant upregulation of the arn operon and ugd under phosphate-limited conditions, and promoter activity assays further confirmed that both promoters are strongly induced during Pi depletion. Through gene deletion experiments, the authors showed that arn expression is regulated by the PmrAB two-component system, whereas ugd is controlled by PhoBR under low-phosphate conditions. Using ICP-MS analysis, they further found that phosphate limitation increases cell-associated Fe levels, and that reducing Fe availability abolishes PmrAB-dependent activation of the arn operon. Finally, the study demonstrated that Mg supplementation and Fe chelation can suppress polymyxin resistance, highlighting the critical role of metal homeostasis in phosphate depletion-induced antimicrobial resistance.

      Strengths:

      Overall, I found this study to be well conducted, with convincing results that strongly support the proposed model. Through comprehensive genetic analyses and detailed characterization of metal ion homeostasis and membrane lipid modifications, the authors uncovered a novel regulatory connection among Mg²⁺, Fe³⁺, and the PmrAB pathway, a key driver of polymyxin resistance. These findings are highly interesting and have important implications for understanding the evolution of the Fe-sensing PmrAB system, as well as the broader role of nutrient availability in shaping antibiotic resistance.

      Weaknesses:

      I did not identify any particular weaknesses.

    3. Reviewer #3 (Public review):

      Summary:

      This manuscript examines how phosphate limitation primes E. coli and Salmonella for defense against polymyxin antibiotics. Other environmental signals, such as altered levels of extracellular Mg or Fe, were previously shown to induce polymyxin resistance in Enterobacteriaceae, and phosphate limitation was known to augment polymyxin resistance in other organisms such as A. baumannii and P. aeruginosa; however, whether phosphate limitation boosted polymyxin resistance in Enterobacteriaceae was not known. This study shows that this indeed occurs, and the mechanism is distinct from that in A. baumannii and P. aeruginosa. The model proposed is: (1) low phosphate causes bacteria to jettison Mg to balance cellular P/Mg ratio, (2) extracellular Fe3+ associates with the cell envelope to replace Mg as LPS-bridging cation, and (3) envelope Fe3+ activates PmrAB, which mediates a transcriptional response leading to L-Ara4N modification of lipid A and protection from polymyxin B. Flooding with Mg or chelating the surface Fe3+ blocks the protective response to low phosphate in E. coli and Salmonella but not in P. aeruginosa despite Fe still mobilizing in the latter. The differential response between Enterobacteriaceae and P. aeruginosa is connected to the presence/absence of Fe-sensing motifs in the PmrB periplasmic domain.

      Strengths:

      The strengths of the study are the wide array of approaches used and the thorough characterization of a novel stress-response mechanism involving metal mobilization. Combined with the analysis of multiple bacterial families, the results clarify how different strategies have evolved to defend against polymyxins during phosphate starvation.

      Weaknesses:

      Controls are needed in some of the genetic experiments, namely complementation, to verify linkage of defective survival phenotypes to the genes mutated and to rule out protein stability defects for the PmrB variants tested. In addition, the generalizability of the metal mobilization feature of the model would be strengthened by examining media with differing metal composition. Claims about antibiotic resistance would be strengthened by data examining bacterial growth in the presence of an antibiotic.

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      Knowing that small pupil-size variations accompany brightness variations (even when these are illusory), the authors asked whether pupil constrictions would accompany the synesthetic perception of a brighter color (compared with a darker one), induced by the presentation of a black-white character. This grapheme-colour synesthesia is experienced by only a small fraction of people; sixteen synesthetes were enrolled in this study. The results reliably showed that a relative pupil constriction would "betray" the perception of a brighter color in these participants, while no such effect was observed in control participants who were asked to report a color in association with each grapheme, even though they did not perceive any.

      Strengths:

      The main strength of the study lies in its combination of psychophysics (brightness ratings) and pupillometry, which yielded clear-cut results.

      Impact:

      This work is likely to improve our understanding of synesthesia, providing a new tool to quantify the subjective sensations; an interesting potential extension would be using pupillometry for tracking changes over time of the synesthetic experiences, opening up the possibility to evaluate the importance of learning for this peculiar experience.

    2. Reviewer #2 (Public review):

      Synesthesia is a neurological condition where stimulation of one sensory channel leads to involuntary, automatic, and consistent experience of another, unrelated percept. For example, Sir Francis Galton (1880, Nature) famously described the robust tendency of some individuals (synesthetes) to associate numerals with a distinct color. Ever since, synesthesia has continued to attract broad interest in the cognitive neurosciences in light of its implications for the study of domains such as perception, consciousness, and brain connectivity, among others.

      Strauch, Leenaars, and Rouw measured pupil size in a group of 16 grapheme-color synesthetes and two matched control groups. The participants were presented with gray digits - that is, visual stimuli having identical physical properties in terms of brightness. Each participant subsequently rated the corresponding evoked color and brightness: unlike controls, synesthetes did so in a very consistent and reliable fashion. Accordingly, this was also shown in their pupils: despite the same objective luminance, digits associated with brighter percepts caused their pupils to constrict and digits associated with darker percepts caused their pupils to dilate more than in controls. These results highlight how deeply rooted crossmodal correspondences are in synesthetes, and put forward pupillometry as a particularly appealing biomarker for some phenomenological experiences (at least those grounded in "brightness").

      Further strengths of the technique are its temporal resolution and its responsiveness to several constructs. Across several tasks, the authors show for example that responses to synesthetic light are somewhat slower than responses to real light (i.e., they are likely mediated), but at the same time faster than responses to mental imagery. The role of mental imagery can also be reasonably dismissed when considering the second feature of pupil size: its responsiveness to mental effort and cognitive load. The pupils tend to dilate with demanding, challenging tasks, and this was the case when control participants were asked to report the color of a digit for which they did not consistently experience a synesthetic association. The same task was, instead, seemingly effortless for synesthetes, again speaking in favor of the automaticity of number-color correspondences in their case.

      Overall, the findings by Strauch, Leenaars, and Rouw are highly significant for the field and likely to be impactful. The strength of their evidence, when accounting for the relatively small sample size and the inherent variability of both phenomenology (color perception and subjective reporting) and physiology (pupil size), is adequate and sufficiently convincing.

    1. Reviewer #1 (Public review):

      Summary:

      This retrospective study provides new data regarding the prevalence of pain in women with PCOS and its relationship with health outcomes. Using data from electronic health records (EHR), the authors found a significantly higher prevalence of pain among women with PCOS compared to those without the condition: 19.21% of women with PCOS versus 15.8% of non-PCOS women. The highest prevalence of pain was observed among Black or African American (32.11%) and White (30.75%) populations. In addition, women with PCOS and pain have at least a 2-fold increased prevalence of obesity (34.68%) at baseline compared to women with PCOS in general (16.11%). Also, women with PCOS had the highest risk for infertility and T2D, but women with PCOS and pain had higher risks for ovarian cysts and liver disease. Based on these results, the authors suggest that there is a critical need to address pain in the diagnosis and management of PCOS because of its significant impact on patient health outcomes.
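
      The comparisons quoted in this summary can be checked with simple arithmetic. The sketch below uses only the percentages given above; it computes crude (unadjusted) prevalence ratios, which are not the matched risk estimates reported in the study, and the function name `prevalence_ratio` is a hypothetical helper.

```python
# Crude prevalence ratios from the percentages quoted in the review summary.
# These are unadjusted illustrations, not the study's matched risk estimates.

def prevalence_ratio(p_group, p_reference):
    """Ratio of two prevalences given in the same units (here, percent)."""
    return p_group / p_reference


pain_pcos, pain_non_pcos = 19.21, 15.8          # % with pain
obesity_pcos_pain, obesity_pcos = 34.68, 16.11  # % with obesity at baseline

pain_ratio = prevalence_ratio(pain_pcos, pain_non_pcos)
obesity_ratio = prevalence_ratio(obesity_pcos_pain, obesity_pcos)

print(f"pain, PCOS vs non-PCOS: {pain_ratio:.2f}x "
      f"({pain_pcos - pain_non_pcos:.2f} percentage points)")
print(f"obesity, PCOS + pain vs PCOS: {obesity_ratio:.2f}x")
```

      The obesity ratio comes out at about 2.15, consistent with the "at least a 2-fold" wording, while the difference in pain prevalence is about 3.4 percentage points (a ratio of roughly 1.22).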

      Strengths:

      The problem of pain assessment in PCOS patients is well described, and the authors provide a clear rationale for selecting the retrospective design to investigate this problem.

      The large number of patient records analyzed (76,859,666 women) and their uniformity increase the power of the study. Propensity score matching makes it possible to reduce the heterogeneity of the compared cohorts and the influence of comorbid conditions.

      Analysis in different ethnic cohorts provides relevant and necessary data regarding the prevalence of pain and its relationship with different health conditions, which will help clinicians diagnose and manage PCOS in women of different ethnicities.

      Assessing the risk of different health conditions, including both PCOS-associated pathology and other common groups of diseases, in women with PCOS with or without pain makes it possible to differentiate the risk of comorbid conditions depending on the presence of one symptom (pelvic or abdominal pain, dysmenorrhea).

      Weaknesses:

      A significant weakness of the study is the absence of a Latin American cohort. The White cohort probably includes Latin Americans or others, but the results of the study cannot be extrapolated to particular White ethnicities.

      Comments on revised version:

      At present, I have no questions or recommendations for the authors, as they have exhaustively addressed the previous comments and incorporated the necessary corrections.

    2. Reviewer #2 (Public review):

      Summary:

      The study offers a thorough analysis of the prevalence of pain in women with polycystic ovary syndrome (PCOS) and its associations with health outcomes across various racial groups. Furthermore, the research investigates the prevalence of PCOS and pain among different racial demographics, as well as the increased risk of developing various conditions in comparison to individuals who have PCOS alone.

      Strengths:

      The study emphasizes pain as a significant comorbidity of PCOS, an area that is critically underexplored in existing literature. The findings regarding the increased prevalence of some of the diseases in the PCOS + pain group provide valuable direction for future research and clinical care. I believe physicians should incorporate pain score assessments into their clinical practice to improve patients' quality of life and raise awareness about pain management. If future research focuses on the mechanisms of pain, it would provide a better understanding of pain and allow for a focus on the underlying causes rather than just symptomatic management. The study also highlights the association between PCOS+pain and various comorbidities, such as obesity, hypertension, and type 2 diabetes, as well as conditions like infertility and ovarian cysts, offering a holistic view of the burden of PCOS.

      Weaknesses:

      Owing to the retrospective design, some data may not be readily available in the EHR system. Diagnoses of PCOS and pain are based on ICD codes, which may lead to misclassification and may not capture symptom severity or patient-reported experiences.

    1. Reviewer #1 (Public review):

      Summary:

      The authors aim to demonstrate that PGLYRP1 plays a dual role in host responses to B. pertussis infection. PGLYRP1 signaling is known to activate bactericidal responses upon recognition of peptidoglycan. Through NOD1 activation and TREM-1 engagement, PGLYRP1 also appears to have immunomodulatory activities. The authors present mouse knockout studies and gene expression data to illustrate the role of PGLYRP1 in relation to B. pertussis peptidoglycan. Mice lacking PGLYRP1 had slightly lower pathology scores. When TCT peptidoglycan was removed from the bacteria, expression of IL23A, IL6, IL1B, and other genes encoding pro-inflammatory cytokines surprisingly increased. The relationship between TCT and PGLYRP1 suggests that the pathogen uses this strategy to decrease immune activation. The authors went on to show the relationship between PGLYRP1 and TREM-1 as mediated by PGN, using various versions of peptidoglycan. The study presents multiple lines of data to back up its findings and demonstrates an interesting strategy used by B. pertussis to down-regulate innate responses to its presence during infection.

      Strengths:

      Use of knockout mice of the key factor being considered paired with isogenic B. pertussis strains to reveal the mechanism of immune modulation to benefit the bacteria. The authors used in vivo gene expression paired with in vivo assays to establish each aspect of the mechanism.

      Weaknesses:

      The main focus was on innate responses, but some analysis of antigen-specific antibody responses could improve the impact of the findings.

      Comments on revised version:

      I have no further input to add.

    2. Reviewer #2 (Public review):

      Since its original discovery, the mechanistic basis for TCT-mediated pathogenesis of Bordetella pertussis has been a moving target and difficult to uncouple from confounding variables. The current study provides some exciting data that suggest PGLYRP-1 modulates host responses upon 'activation' by TCT. While there are some strengths associated with the unbiased approaches and collective data to support the claims associated with TCT and PGLYRP-1's function in this system, caution should be used when interpreting and extrapolating some of the information provided. While many of the initial concerns were addressed, one concern remains: using whole, intact PG sacculi from other species for comparative studies with a fragment of released PG (i.e., TCT).

      Comments on revised version:

      I have no further comments.

    3. Reviewer #3 (Public review):

      Summary:

      This study evaluates the contributions of the mammalian PG-binding protein PGLYRP1 to Bordetella infection. The authors find potential roles for PGLYRP1 in both bacterial killing (canonical) and regulation of inflammation (non-canonical). While these findings are interesting, as is the idea that PG fragment release has differential impacts on infection depending on fragment structure, the study is ultimately limited by the lack of connection between the in vivo and in vitro experiments. Determining the precise mechanism by which PGLYRP1 regulates host responses and bacterial fitness during infection requires further study.

      Strengths:

      (1) The combination of scRNAseq with in vitro and in vivo assays provides complementary views of PGLYRP1 function during infection.

      (2) The use of TCT-deficient B. pertussis provides a useful control and perturbation in the in vitro assays.

      Weaknesses/Areas for future study:

      (1) The study does not ultimately resolve the initial early versus late phenotype divergence. While the in vitro assays suggest explanations for the in vivo observations, further mechanistic links are lacking and are necessary for the authors' conclusions throughout. To state one example, what is the early and late infection phenotype of TCT- Bp in mice lacking PGLYRP1? RNAseq data are reported from these mice, but there are no burden or pathology studies. Furthermore, what are the neutrophil phenotypes (NOD-1/TREM-1 activation) in vivo? And are they dependent on PGLYRP1 and/or TCT? This will be an important topic of future study, as noted by the authors in their response.

      (2) It is unclear whether or how the NOD1 and TREM-1 pathways interact.

      (3) Many of the study's conclusions rely on the use of HEK293 reporter lines in the absence of bacterial infection, which may not be physiologically representative.

      Comments on revised version.

      The authors have responded adequately to my comments.

    1. Reviewer #1 (Public review):

      Summary:

      In their manuscript, Zhou and colleagues present a detailed look at how the JSP functions differently in the various cells of a breast tumor. The authors have effectively shown that the JSP acts as a double-edged sword, as it helps T cells fight cancer but also allows tumor cells to grow and avoid ferroptosis. These findings are important because they identify a useful biomarker to predict how TNBC patients might respond to PD-1 inhibitors.

      Strengths:

      This work is important because it provides a clear explanation for the conflicting roles of the JSP in the tumor environment. The evidence is solid, as it combines data from thousands of patients with single-cell analysis and lab experiments to confirm the role of STAT4 in cancer progression and immunity.

      Comments on revised version:

      The authors made a significant effort to improve the manuscript. My comments were sufficiently addressed.

    2. Reviewer #2 (Public review):

      Summary:

      The JAK-STAT pathway (JSP) exhibits cell-type-specific functional heterogeneity in breast cancer. This study investigates the JSP in breast cancer and its response to anti-PD‑1 immunotherapy. JSP displays distinct cell‑type heterogeneity: it promotes malignant phenotypes and immunosuppression in tumor cells, while enhancing cytotoxicity and reducing exhaustion in T cells. Elevated JSP expression correlates with improved immunotherapy responses, especially in triple‑negative breast cancer. These findings highlight the paradoxical roles of JSP, indicating that broad inhibition may compromise anti‑tumor immunity.

      Strengths:

      The major strengths of this study include the comprehensive characterization of JSP heterogeneity across epithelial, tumor, and T cells in breast cancer. The identification of JSP and STAT4 as predictive biomarkers for immunotherapy response, particularly in triple‑negative breast cancer, provides clinically relevant insights for patient stratification.

      Weaknesses:

      The corresponding content has been revised.

    3. Reviewer #3 (Public review):

      Summary:

      This multi-omics study by Zhou et al elucidates the context-dependent roles of the Janus kinase-signal transducer and activator of transcription (JAK-STAT) pathway (JSP) across different cellular compartments in the breast cancer tumor microenvironment. While bulk JSP activity is associated with a favorable prognosis, single-cell analysis reveals a paradoxical landscape: high JSP in T cells drives anti-tumor cytotoxicity and reduces exhaustion, whereas high activity in tumor epithelial cells promotes malignancy and immunosuppression via the MIF-CD74 signaling axis. The JSP score (immune-related) serves as a robust predictive biomarker for response to anti-PD-1 immunotherapy, particularly in triple-negative breast cancer (TNBC). Furthermore, the study identifies the STAT4/SLC47A1 axis as a critical mechanism through which tumor cells resist ferroptosis, facilitating disease progression. These findings suggest that broad JAK-STAT inhibition may be counterproductive in cancer therapeutics; instead, therapeutic success depends on precise modulation and carefully timed interventions to preserve its T-cell-associated functions. This study may inspire future studies to explore specific factors that selectively modulate JAK-STAT activity in immune cells to achieve favorable therapeutic outcomes.

      Strengths:

      Significant therapeutic implications.

      Weaknesses:

      Limited molecular mechanisms

      Comments on revised version:

      The authors have addressed my comments.

    1. Reviewer #2 (Public review):

      Summary:

      This study aims to establish a rational framework for designing bacterial probiotics against respiratory infections. The central hypothesis is that in vitro antagonism, particularly through metabolic niche overlap with a pathogen, predicts in vivo efficacy.

      Strengths:

      (1) Systematic pipeline: The study integrates bacterial isolation, in vitro characterization, model development, and in vivo validation into a cohesive workflow.

      (2) Quantitative model: The introduction of the Niche Index (NI) and Niche Index Fraction (NIF) provides a novel, quantitative tool for predicting probiotic efficacy based on ecological principles.

      (3) Mechanistic insight: The work dissects different modes of action, clearly demonstrating that inhibition can be driven by specialized metabolite production (CP8) or carbon resource competition (e.g., CP7), with lactate utilization identified as a key factor.

      Weaknesses:

      (1) Limited model generalizability: The predictive power of the NI model is not universal. It fails to account for the in vivo inefficacy of CP8 (a metabolite-dependent inhibitor) and cannot explain the short-term protection conferred by some non-inhibitory CPs in vivo, suggesting unmodeled mechanisms like immune priming are at play.

      (2) Preliminary nature of key findings: The emphasis on lactate consumption as a critical predictor, while interesting, is not sufficiently explored to establish its general importance beyond the specific strains and conditions tested.

      Appraisal:

      The authors successfully achieve their aim of establishing a rational probiotic-design pipeline. The data robustly support the conclusion that metabolic niche overlap predicts efficacy for many strains, while also clearly delineating the model's limitations, as acknowledged by the authors.

      Impact:

      This work provides a valuable methodological framework for hypothesis-driven probiotic discovery. The quantitative Niche Index offers immediate utility to the field and, with further refinement, has the potential to become a fundamental tool for developing respiratory therapeutics.

      Comments on revised version.

      I thank the authors for their meticulous revisions.

    1. Reviewer #1 (Public review):

      Summary:

      This study presents a potentially important integrative model linking spontaneous retinal waves, apoptosis, microglial activity, and vascular development during postnatal retinal maturation. Its significance lies in proposing a mechanistic framework that could reshape understanding of how neural activity and tissue remodeling are coordinated in the developing central nervous system. The evidence is strengthened by the use of multiple complementary techniques, including Ca++ imaging, high-throughput electrophysiology, transcriptomics, histology, and pharmacology.

      Strengths:

      (1) Multimodal Validation: The authors correlate large-scale functional imaging (calcium imaging and MEA) with high-resolution structural and molecular data (scRNA-seq and IHC), providing strong topographical evidence for the "centrifugal expansion" pattern.

      (2) The primary significance lies in identifying apoptotic Retinal Ganglion Cells (RGCs) as the physiological "pacemakers" for stage II retinal waves. By linking programmed cell death directly to neural activity and subsequent angiogenesis, the authors propose a self-regulating developmental loop.

      Weaknesses:

      (1) While the PANX1 pharmacological data provide compelling functional support, extending these conclusions to the broader CNS may be premature. Additional direct mechanistic validation would further strengthen the claim of causality.

      (2) While the manuscript beautifully illustrates the co-occurrence of events during retinal development, strengthening the distinction between correlation and direct causation would enhance the impact of the findings.

    2. Reviewer #2 (Public review):

      Summary:

      Savage et al. investigate the synchronization of retinal Ca2+ waves with developmental cell death, microglia activation, and vascular outgrowth. These developmental processes occur through a mechanism where apoptotic cells release ATP through Panx-1 channels to stimulate both Ca2+ retinal waves and microglia activation. Using scRNAseq, the authors classify autofluorescence cell clusters (ACCs) at the leading edge of vasculature outgrowth as Hmox-1+ microglia. From here, they show microglia engulfment of apoptotic RGCs, and the potential release of ATP may contribute to Ca2+ wave generation. The authors demonstrate these mechanisms through the use of two pharmacological agents to either block the ATP release from Panx-1 or block receptor binding to ATP. Furthermore, while previous studies have described the site of initiation of retinal Ca2+ waves as random, this study shows that the initiation of Ca2+ waves is biased to the leading edge of vascular growth in the developing retina. To do this, the authors use a combination of wide-field Ca2+ imaging and multi-electrode arrays to pinpoint the sites of Ca2+ wave initiation in the developing retina.

      Strengths:

      The authors use several techniques to interrogate these mechanisms, including single-cell RNAseq, wide-field Ca2+ imaging, and multi-electrode arrays. With these experiments, this manuscript proposes several novel ideas, such as ATP as the Ca2+ wave-initiating cue, and the localization of the Ca2+ wave initiation to the leading edge of vascular growth.

      Weaknesses:

      The main weakness of the manuscript is the overreliance on only two pharmacological agents to test the central hypotheses. These conclusions would be strengthened if, in addition to their pharmacological manipulations, they used genetic knockout models to perturb programmed cell death or ATP release (i.e., BAX-KO, Panx-1 KO).

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Lei and co-workers aim to uncover the genetic underpinnings of thermal adaptation across three strains of the diamondback moth (Plutella xylostella) through experimental evolution over three years under three different thermal regimes. They identify systematic differences in trait responses (e.g., survival, fecundity), metabolic profiles, gene expression, and in the amino acid sequence of the PxSODC gene, among others. These results suggest that the diamondback moth has a strong potential for rapid physiological adaptation to different thermal regimes. Overall, this is a comprehensive and generally well-executed study that addresses an important question in the face of ongoing climate change.

      Strengths:

      The authors employ multiple approaches to identify signatures of thermal adaptation across the three strains, such as trait performance comparisons, metabolomics, transcriptomics, and amino acid sequence comparisons. All these different angles form a convincing picture of the underlying factors that underpin thermal adaptation in this experimental system. The manuscript is also generally well written and easy to understand.

    2. Reviewer #2 (Public review):

      Summary:

      In this paper, the authors set out to better understand the genetic mechanisms underlying thermal adaptation in insects. They experimentally evolved diamondback moth (Plutella xylostella) populations - a pest species with a wide distribution - under both hot (12h:12h 32°C/27°C) and cold (15°C/10°C) thermal conditions, and conducted phenotypic assays and metabolic and transcriptomic profiling to analyze how populations changed to deal with this thermal stress compared to the nonevolved ancestral population (constant 26°C). Phenotypic assays showed that evolved hot populations had increased survival at high temperatures (42-43°C) while evolved cold populations had lower freezing points compared to the ancestral population. When measured at the constant 26°C conditions, metabolic and transcriptomic profiles of 3rd instar larvae from the evolved populations were distinct from those of the ancestral population, with a set of overlapping metabolic and transcriptomic pathways that were significantly differentially expressed in both hot and cold evolved populations compared to the ancestral. The authors narrowed down this set of candidate genes further by focusing on genes with high expression levels overall, whose expression profile was correlated with differentially expressed metabolites, and that contained mutations in both hot and cold strains. From this set, they chose the PxSODC gene for further functional validation, as it has previously been shown to be involved in the response of insects to abiotic stress with its antioxidative role in cellular defense. At the constant 26°C, this gene showed lower expression across development in evolved strains compared to the ancestral population, while it showed similar expression patterns under thermal stress. Knockdown of PxSODC resulted in decreased survival rates at high temperatures and higher freezing points compared to the ancestral population. Based on this validation, the authors hypothesize that the non-synonymous mutation in the PxSODC gene that they found in the cold and hot evolved populations might alter the conformation of the PxSODC protein, increasing enzyme capacity. Their experimental evolution experiment furthermore indicates the capacity of the pest species, the diamondback moth, to adapt to a wide range of temperatures, providing insights into its capacity for global dispersal.

      Strengths:

      (1) The authors did a tremendous amount of work to characterize the mechanisms underlying thermal adaptation in the diamondback moth, artificially selecting populations for three years in the lab and characterizing how they evolved as a result at different biological levels: from phenotypes in different life stages, to larval metabolites and gene transcription, to functionally validating how one of the resulting gene candidates influences the capacity to deal with thermal stress.

      (2) The paper identifies and provides further evidence for candidate genetic mechanisms that might be particularly important for thermal adaptation in insects, including lipid metabolism, oxidoreductase activity, and DNA methylation. It is furthermore interesting that the authors found similar mechanisms to be involved in both the adaptation to cold and hot environments. Their functional validation of some of the genes involved in these mechanisms is very useful to understand how these genes might be causally involved in insect thermal adaptation.

      (3) The paper also has applied value: the diamondback moth is a pest species with a wide distribution, so understanding its adaptive capacity to different thermal environments is important for predicting the prevalence and potential further range expansion of this species under future climate change.

    1. Reviewer #1 (Public review):

      Summary:

      This research sheds light on the nuanced role of ABHD6 in regulating AMPARs, highlighting its interaction with TARP γ-2 as a critical factor in modulating receptor gating kinetics. It is crucial to understand that although ABHD6 alone does not alter AMPAR kinetics, its presence alongside TARP γ-2 accelerates AMPAR deactivation and desensitization, thereby affecting synaptic transmission dynamics.

      Strengths:

      Important findings in the research include:

      - ABHD6 does not affect the gating kinetics of GluA1 and GluA2(Q) homomeric receptors independently.
      - In the presence of TARP γ-2, ABHD6 accelerates deactivation and desensitization of these receptors, regardless of their splicing or editing isoforms.
      - The effect is consistent for both homomeric GluA1 and GluA2(Q) receptors and heteromeric GluA1i/GluA2(R)i-G receptors.
      - The recovery from desensitization of GluA1 with the flip splicing isoform is slowed by ABHD6 in the presence of TARP γ-2.

    2. Reviewer #2 (Public review):

      Summary:

      Cong et al. investigated the regulatory effects of ABHD6 on AMPARs. The authors performed adequate electrophysiology recordings to show the exact pattern of this regulation and covered major critical points.

      Strengths:

      The authors have performed high-quality ephys recordings and examined all potential regulatory aspects of ABHD6 on AMPARs. This is important to understand the AMPAR functions.

      Weaknesses:

      (1) The authors discuss CNIH-2 extensively in lines 92-110 of the introduction; however, they did not perform related experiments. I suggest they move this part to the discussion, where they also discuss the roles of CNIH.

      (2) The authors need to report the "n" for all the experiments they have presented in this manuscript. How many cells were recorded in each condition? How many batches? This information has to be in all of the figure legends, but it is missing except in Fig. 4.

      (3) One question is what the physiological meanings of this regulatory effect are. The authors may consider adding some discussions.

      (4) About statistics. The authors need to add more details and make sure their statistics are sound. For example, they also need to check the equality of variances. In their Table EVs, where the P values are reported, the authors need to report which statistical test they used (one-way ANOVA, K-W test, or another) and the exact post-hoc test type for each comparison. For one-way ANOVA, report the F values together with the P values in all figure legends.

      (5) Fig. 3J, the authors need to correct the label of the Y axis. It is shifted.
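
      As an illustration of point (4), the following is a minimal sketch of the quantities requested there: the one-way ANOVA F value with its degrees of freedom, and a median-based Levene (Brown-Forsythe) statistic as one common check of equality of variances. It is not the authors' analysis. The function names and the example values are hypothetical, and P values are deliberately left out; they could be obtained from an F distribution (for example with `scipy.stats.f.sf`).

```python
from statistics import mean, median


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


def brown_forsythe_f(groups):
    """Median-based Levene statistic: ANOVA on absolute deviations from group medians."""
    deviations = [[abs(x - median(g)) for x in g] for g in groups]
    return one_way_f(deviations)


if __name__ == "__main__":
    # Hypothetical time constants (ms) for three conditions; NOT data from the manuscript.
    cond_a = [2.1, 2.4, 2.2, 2.6, 2.3]
    cond_b = [3.9, 4.2, 4.0, 4.4, 4.1]
    cond_c = [3.0, 3.3, 3.1, 3.4, 3.2]
    groups = [cond_a, cond_b, cond_c]

    f_value, df1, df2 = one_way_f(groups)
    bf_value, _, _ = brown_forsythe_f(groups)
    print(f"One-way ANOVA: F({df1}, {df2}) = {f_value:.2f}")
    print(f"Brown-Forsythe: F({df1}, {df2}) = {bf_value:.2f}")
```

      Reporting the statistic in the form F(df1, df2) next to the P value lets a reader check each comparison. If the variance check fails, a rank-based alternative such as the K-W test mentioned above is the usual fallback.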

      Comments on revised version.

      In the revised manuscript, the authors have addressed all my concerns. The manuscript has been substantially strengthened by additional data and discussion.

    1. Reviewer #1 (Public review):

      Summary:

      Brunsdon et al. present a zebrafish model of mosaic PIK3CA activation to investigate mechanisms underlying PIK3CA-related overgrowth spectrum (PROS), with a particular focus on non-cell-autonomous mechanisms of tissue overgrowth. The study is timely and addresses an important gap in the understanding of how mosaic activation of PI3K signaling leads to tissue-specific developmental abnormalities.

      Using a Tol2-based mosaic expression system combined with single-cell transcriptomics, the authors provide evidence suggesting that mutant PIK3CA-expressing cells influence surrounding wild-type tissues through indirect signaling mechanisms, contributing to vascular malformations and tissue overgrowth.

      Overall, the work presents an interesting and potentially impactful model for studying mosaic PIK3CA-driven overgrowth and non-cell-autonomous signaling mechanisms. However, several aspects require clarification, additional controls, and improved presentation to strengthen the mechanistic conclusions and overall impact of the study.

      Strengths:

      This study addresses an important and timely question by investigating the mechanisms underlying mosaic PIK3CA activation in the context of PROS, a condition for which developmental mechanisms remain poorly understood. The use of a mosaic zebrafish model is particularly appropriate, as it closely reflects the mosaic nature of PIK3CA mutations observed in patients and allows the investigation of non-cell-autonomous effects.

      Another major strength of the study is the integration of single-cell transcriptomics, which provides valuable insight into potential signaling pathways involved in indirect tissue overgrowth and offers a rich dataset for hypothesis generation. The authors also propose an interesting conceptual framework in which PI3K-activated cells influence surrounding tissues through paracrine signaling, which could have broader implications beyond PROS and contribute to understanding mosaic developmental disorders more generally.

      Finally, the work has potential translational relevance, as identifying mechanisms driving mosaic PI3K activation and non-cell-autonomous signaling could inform future therapeutic strategies for PROS and related conditions.

      Weaknesses:

      Despite these strengths, several aspects of the study require clarification and additional experimentation.

      Major comments:

      (1) The Tol2-based system results in mosaic overexpression of mutant PIK3CA in the presence of endogenous wild-type PIK3CA, making it difficult to determine how co-expression of WT and mutant proteins influences the observed phenotypes. While mosaic expression is relevant to PROS, a complementary approach in which endogenous PIK3CA is knocked out prior to introducing mutant variants would allow clearer interpretation of mutant-specific effects.

      (2) The authors do not clearly describe the validation of editing or integration efficiency. It would be important for the authors to clarify whether sequencing was performed to confirm integration, to quantify the proportion of mosaic expression, and to measure transgene expression levels. These controls would strengthen confidence in the model and interpretation of the results.

      (3) The manuscript would benefit from rescue experiments to strengthen causal conclusions. It remains unclear whether the phenotypes induced by PIK3CA PROS variants can be rescued, either through expression of wild-type PIK3CA, pharmacological inhibition of PI3K signaling, or assessment of developmental reversibility. Such experiments would strengthen the link between PI3K activation and the observed phenotypes.

      (4) The authors propose candidate signaling molecules mediating non-cell-autonomous effects downstream of PI3K hyperactivation; however, these conclusions remain speculative, as no functional validation is provided. Testing selected candidate mediators identified in the RNA-seq dataset would significantly strengthen the mechanistic conclusions.

    2. Reviewer #2 (Public review):

      In this manuscript, Brunsdon et al. aim to study PIK3CA-related overgrowth spectrum (PROS) by establishing a mosaic zebrafish model with overexpression of pik3ca carrying hotspot mutations, coupled with an mScarlet+ reporter. Using fluorescence microscopy, the authors demonstrated that overexpression of pik3ca with a number of hotspot mutations led to mesodermal and particularly vascular malformations in the zebrafish model. Interestingly, they found a paucity of mScarlet+ mutant cells in the vascular lesions, consistent with the finding of low PIK3CA mutation burden in PROS tissue. Such data suggest a non-cell-autonomous effect of PIK3CA mutation. Following this logic, the authors performed single-cell RNA-Sequencing on zebrafish overexpressing WT pik3ca and mutant pik3ca at 19 hpf, and demonstrated widespread transcriptomic perturbations across multiple lineages, including lineage frequencies, key cell pathways, and cell-cell interactions. Importantly, they demonstrate that mScarlet+ cells carrying mutant pik3ca cluster separately from other cell types, do not demonstrate clear lineage identity, and have a general downregulation in signaling components.

      Overall, the conclusions in the manuscript are well-supported by the presented data. The imaging studies are particularly convincing. The transcriptomic analysis generated a list of potential pathways to further investigate and potentially target with future therapeutic interventions. Importantly, this study provides a valuable in vivo model of PROS that: 1) recapitulates key features of PROS (e.g., multiple mesodermal defects, paucity of mutation burden in lesions suggesting non-cell-autonomous interactions); 2) is scalable; and 3) offers direct visualization of lesion development, compatible with time-course live imaging. This model will be valuable to further understand PROS and potentially study other diseases where the PIK3CA pathway is altered (e.g., certain cancers).

      The following are not necessarily weaknesses of the data, but rather suggestions where the manuscript could be further strengthened:

      (1) The model recapitulates the variability of mesodermal lesions in PROS. It would be valuable to utilize this model to further study factors that are associated with the development of more severe lesions (e.g., by comparing samples with more severe lesions to those unaffected despite carrying the mutations, Figure 1F).

      (2) ScRNA-seq analysis could be enriched with a comparison between cells overexpressing mutant pik3ca vs. those overexpressing WT pik3ca.

      (3) In the scRNA-Seq analysis, it is curious that the C0 cluster, enriched with mScarlet+ cells, is found to have downregulated signaling interactions (Fig. 5C), yet exerts a widespread non-cell-autonomous effect. Meanwhile, there is also a noticeable loss of certain lineages (e.g., notochord, Figure 4E) and related cell-cell interactions (e.g., notochord-related interaction, Figure 5A). A deeper exploration of the basis of the non-cell-autonomous effect would be valuable.

      (4) The scRNA-Seq analysis was performed at one time point (19 hpf). Additional analysis (not necessarily by scRNA-Seq) at other time points to study whether findings at 19 hpf are persistent throughout development or undergo dynamic changes (e.g., cell fate/state of mSc+ mutant cells) would be helpful.

      (5) The scRNA-Seq analysis provides a valuable list of perturbed interactions that could be targeted by future therapeutic approaches. Validation of the scRNA-Seq findings with protein-level analysis, and studying the effect of targeting some of the pathways on the disease phenotype, would offer valuable data for the community.

    3. Reviewer #3 (Public review):

      Summary:

      The study "PIK3CA-related overgrowth spectrum (PROS) zebrafish models reveal pan-lineage developmental dysregulation" presents important findings that extend significantly beyond a single subfield, bridging developmental biology, vascular medicine, and cancer-related PI3K signalling. By developing mosaic zebrafish models of PROS and combining live imaging with single-cell transcriptomics, the authors provide compelling evidence for a non-cell-autonomous mechanism of tissue overgrowth, a conceptual shift with meaningful therapeutic implications.

      Strengths:

      The evidence is overall convincing, with methodology appropriate and well-validated relative to the current state of the art; the integration of multiple approaches (in vivo modelling, scRNA-seq, ligand-receptor inference) strengthens the central claims. However, some aspects of the proposed non-cell-autonomous signalling mechanisms remain partly correlative, and direct functional validation of the rewired ligand-receptor interactions would further consolidate the conclusions.

      Weaknesses:

      The transgenic overexpression approach chosen by the authors represents a well-established and effective strategy for generating mosaic models in zebrafish. However, this approach introduces notable limitations: the lack of control over transgene dosage and unknown integration sites may generate non-physiological effects, potentially confounding the interpretation of key findings.

      The authors are certainly aware that alternative approaches (though technically more demanding) could be considered in future studies to further strengthen the model. For instance, a CRISPR/Cas9-mediated knock-in of the pik3ca-PROS allele at the endogenous locus (retaining upstream native regulatory elements with only a minimal promoter in the construct, co-expressed with a fluorescent reporter via P2A) could allow even more physiological, lineage-restricted expression while enabling direct visualisation of mutant cells. Mesodermal specificity could potentially be further refined by driving mosaic Cas9 expression under a pan-mesodermal tbx promoter, restricting editing to the relevant lineage while simultaneously marking mutant cells fluorescently, thus even more closely mimicking the post-zygotic mutational events characteristic of PROS. As a complementary strategy, blastula transplantation experiments using pik3ca-PROS donor cells (ideally co-expressing a distinct fluorescent marker such as mCherry) into fli1:GFP transgenic hosts could provide a powerful and technically consolidated approach to directly visualise and quantify non-cell-autonomous effects on host vasculature, with precise control over mutant cell burden. This combinatorial framework, separating donor mutant cells from host tissue in a two-colour imaging setup, could be particularly compelling for validating the ligand-receptor rewiring predicted by single-cell transcriptomics in future investigations.

      These reflections are offered in the spirit of prospective methodological development and do not diminish the value of the current work, which opens a valuable new avenue for therapeutic investigation, suggesting that targeting indirect overgrowth-propagating signals, alongside PI3K inhibition, deserves serious consideration.

    1. Reviewer #1 (Public review):

      Summary:

      The authors performed seqFISH on 26 gastruloids and carried out a variety of computational analyses on these novel spatial data sets. Whilst the data are valuable and the computational concepts useful (exposure index, L-metric, ...), the article falls short on novelty and is written in rather clunky language, often with contradictory conclusions.

      Major issues:

      (1) The authors did well in explaining and detailing the provenance of data and the individual experiments performed. However, their 26 gastruloids still constitute a very limited sample of their total organoids: one experiment pooled 4 plates at an 80-94% success rate, and 6 different aggregation experiments were done, making a total of 1843 gastruloids, of which 26 (~1-2%) were sampled. A simple IF stain of 2-3 markers in a bigger sample could have given a more accurate picture of specific domains of interest and their proximity. Regardless, more information should be given about the existing samples: variation across experimental batches, and differences between the 300-cell and 100-cell gastruloids that were used.

      (2) The language of the manuscript should be revised. Overall, the manuscript is very long and descriptive, and the written "impressions and beliefs" are often not adequately justified and can indeed be contradictory. For example, in Section 1 the title states "cell types' locations ...are consistent", yet a few sentences down we find "there was substantial variation" and "within range of what would be considered a 'morphologically normal' gastruloid". Expressions such as "quite consistent", "compelling patterning", and "we don't believe" are best avoided and replaced with data, or else bolstered with quantitative values such as percentages when a given cutoff is used. Another example: "location of each cell type relative to gastruloid morphology was quite consistent the posterior region ... mainly consisted in NMPs." Given T expression in the posterior, this result, phrased as such, appears quite inflated. In fact, looking at the cell types in Figures S1 and 2a/b/c, this reviewer would state that they are anything but consistent, and indeed it takes sophisticated analyses to find a pattern (of sorts) beyond the coarse domains expected!

      (3) Figure 6 is one of the most valuable parts of the work, as the authors use the battery of analyses developed to investigate the variable and not-so-robust endothelial clusters in gastruloids. However, this investigation is still very preliminary, and it should be further linked with known biology. It is still unclear what the unique organization of this cell type is (circularity isn't convincing) and whether any signalling cues of adjacent cells could explain it. Is there any evidence that more mature endodermal cell types are generated (like the suggested "liver") to give rise to endothelial cells? It would certainly be interesting to perform IF for this cell type together with mesodermal and endodermal markers to validate seqFISH predictions on a bigger sample.

      (4) Figures 1c and 6b need statistical significance assessments.

      (5) The article should include an analysis of Hox colinearity expression in these gastruloids as a validation of the system.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript presents an ambitious and technically challenging spatial-transcriptomic atlas of 26 gastruloids using seqFISH. The authors introduce quantitative metrics (mixing score, exposure index, L-metric / scL-metric, spatial L-metric, triplets) to characterize spatial organization at multiple scales. The dataset is valuable, and several analyses are original, particularly the rank-based L-metric family for mutual exclusivity.

      Strengths:

      The authors generate one of the most detailed spatial transcriptomic datasets of gastruloids to date. They propose creative computational metrics (L-metric/scL-metric) to quantify mutual exclusivity of gene expression without predefined thresholds, and they explore organizational principles from single-cell topology to cluster-level structure. Many observations align well with known gastruloid biology, such as posterior robustness and anterior variability. The writing is generally clear, and the figures are rich.

      Weaknesses:

      Several central claims rely on metrics whose computation and justification are insufficiently explained, making it difficult to assess how robust or interpretable the results are. Many choices in the analysis appear arbitrary or are insufficiently motivated (normalization schemes, choice of parameters such as the number of neighbors, the distance cutoffs, hierarchical clustering setup, and so on). The interpretations of spatial consistency, gene-program inference, and endothelial heterogeneity are plausible but might be stronger than the evidence currently supports.

      The manuscript would benefit from stronger benchmarking, quantification of uncertainty, and explicit controls for known artifacts in spatial transcriptomics (e.g., spillover, 2D slicing, cell type assignment entropy). The biological insights are promising, but since several depend on methodological assumptions that have not yet been demonstrated to be stable, they would benefit from clearer methodological explanation.

      The work is rich and could become a reference dataset. Clarifying and validating the quantitative methods would considerably strengthen the impact and reliability of the conclusions.

    3. Reviewer #3 (Public review):

      Summary:

      Triandafillou and colleagues report a single-cell resolved spatial atlas of gene expression of 26 gastruloids. While previous work had analyzed either single-cell gene expression or spatially coarse-grained patterns of gene expression (van den Brink et al, 2020), the authors here use multiplexed sequential RNA FISH (seqFISH) to create the first gastruloid atlas, which is simultaneously spatially and cellularly resolved. This atlas adds to a growing list of resources cataloging gastruloid development (see also Suppinger et al 2023).

      To analyze this dataset, the authors also describe a novel analytical framework. Their analysis centers around the 'L-metric', which measures the degree to which pairs of genes are either coexpressed or mutually exclusive. While this metric is similar to calculating correlations in gene expression, it has important differences (including that it can, in principle, be asymmetric, although the authors symmetrize much of their analysis). In addition to the gene-centric L-metric analysis, the authors also analyze cells in their dataset according to the cell type entropy (an information-theoretical measure of confidence in cell type assignment) and the 'exposure index' (a measure of the similarity of nearest cellular neighbors).

      Using this framework, the authors focus their analysis on two major features of development. The first is the differentiation of the bipotent neuromesodermal progenitor (NMP) cells in the posterior of the gastruloid into either presomitic mesoderm (PSM) or spinal cord (SC) lineages. They use L-metric analysis to compare overlap in marker genes used to separate NMP, PSM, and SC fates. They highlight that L-metric analysis can recover spatial patterns of gene expression (without explicit spatial information) and discern subtle features of marker genes beyond simple binning of cell types (e.g., that Epha5 expression in anterior NMPs may predict future SC differentiation).

      The second is the formation of endothelial (spatial) clusters within the gastruloid. The authors highlight two subtypes of endothelial clusters: (1) smaller clusters within the somitic anterior region, and (2) larger clusters associated with endoderm. While the authors discern some subtle differences in gene expression between these two clusters, their different spatial patterns suggest a potential physiological difference that would not be captured in traditional droplet microfluidic-based scRNAseq pipelines.

      Overall, this manuscript is a sophisticated and technically sound study that will provide a valuable beachhead for future studies of developmental patterning in gastruloids and organoids.

      Strengths:

      The major strengths of this study are the overall technical sophistication of the data set and analysis, as well as its potential generalizability to other developmental systems (both in vitro and in vivo). The data are extensively analyzed and reasonably interpreted, and this atlas makes good use of the variability in gastruloid development to extract the statistical structure of developmental processes. The L-metric offers a parameter-free tool to analyze transcriptomic datasets that could overcome the pitfalls of other approaches.

      Weaknesses:

      The major limitations of this study are the depth and novelty of the developmental processes studied. The authors provide very convincing proof-of-concept that their data set can recover known features of gastruloid development, including NMP differentiation and endothelial development. However, further analysis and/or investigation would be required to discover new principles of gastruloid development and patterning.

    1. Reviewer #2 (Public review):

      Summary:

      Protein synthesis (translation) involves repeated recognition and incorporation of aminoacyl-tRNAs by the ribosome. This process is a trade-off between the rate and accuracy of selection (for review see (Johansson et al, 2008; Wohlgemuth et al, 2011)). The ribosome does not just maximise the rate or the accuracy; it balances the two. Therefore, it is possible to select mutants that translate faster than the wt (but are sloppy) or that are more accurate than the wt but translate slower. Slow translation is detrimental as it limits the rate of protein synthesis (and, therefore, growth), while error-prone mutants accumulate mistranslated proteins, which is also detrimental for the cell.

      Bi and colleagues employ genetics, MIC measurements, reporter assays and structural biology to characterise the role of GidB rRNA methylase in translational accuracy in Mycobacterium smegmatis.

      Strengths:

      The genetics and phenotypic assays are convincing and establish the biological role of the methylase. The authors use a powerful set of complementary assays that convincingly demonstrates that the loss of GidB results in mistranslation.

      Weaknesses:

      Cryo-EM analysis of vacant 70S ribosomes is not sufficient for understanding the mechanisms underlying the accuracy defects in the gidB KO. Ideally, one should assemble and solve structurally near-cognate and non-cognate complexes.

      References:

      Johansson M, Lovmar M, Ehrenberg M (2008) Rate and accuracy of bacterial protein synthesis revisited. Curr Opin Microbiol 11: 141-147

      Wohlgemuth I, Pohl C, Mittelstaet J, Konevega AL, Rodnina MV (2011) Evolutionary optimization of speed and accuracy of decoding on the ribosome. Philos Trans R Soc Lond B Biol Sci 366: 2979-2986

    1. Reviewer #1 (Public review):

      [Editors' note: This version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review satisfactorily and toned down the comments as advised.]

      In this manuscript, the authors investigate the relationship between genetic codes and their robustness to single-point mutations. They construct ten alternative genetic codes by reassigning nine codons to Leu, Ser, or Ala, and assess mutational robustness using three reporter proteins subjected to error-prone PCR. This represents an interesting experimental approach to addressing the hypothesis that the standard genetic code is optimized for mutational robustness.

    2. Reviewer #2 (Public review):

      The study addresses a long-standing question in molecular biology and genetics: why has nature selected the current genetic code (SGC, or standard genetic code)? The authors have tested 'error minimization theory', one of the prevailing hypotheses to explain this. Their approach is to create a minimum genetic code (MGC) and its variants (3^9 theoretically possible codes). Using three parameters to quantify the effect of mutations (polarity, volume, and hydropathy), they computationally test the cost of these genetic codes (3^9) by simulations. Finally, they test this cost experimentally using an in vitro translation system with 10 selected genetic code variants spanning a range of costs (low to high). They use three randomly mutated reporter genes for this purpose: beta-galactosidase, luciferase, and mSG. They find no correlation between the cost of the genetic code and the reporters' output. Based on these observations, they suggest that error-minimization theory may not explain the current genetic code.
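
      As a rough illustration of where the 3^9 figure comes from, the sketch below enumerates every way of assigning one of three amino acids to nine reassignable codon slots. It assumes, as described in the reviews, that each of the nine codons independently receives Leu, Ser, or Ala; the names `AMINO_ACIDS`, `N_CODONS`, and `enumerate_codes` are hypothetical and not taken from the manuscript, and no cost function is modelled here.

```python
from itertools import product

# Assumption: nine reassignable codon slots, each independently
# assigned one of three amino acids (Leu, Ser, or Ala).
AMINO_ACIDS = ("Leu", "Ser", "Ala")
N_CODONS = 9


def enumerate_codes(amino_acids=AMINO_ACIDS, n_codons=N_CODONS):
    """Yield every assignment of an amino acid to each codon slot."""
    return product(amino_acids, repeat=n_codons)


# Count the candidate codes by exhausting the generator.
n_codes = sum(1 for _ in enumerate_codes())
print(n_codes)  # 3**9 = 19683
```

      Each yielded tuple is one candidate code; a replacement-cost score based on polarity, volume, or hydropathy could then be computed per tuple, but that step depends on details not given in the reviews.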

      The question they are asking is very exciting, and their approach is solid. The authors are very careful in their analyses and conclusions.

    3. Reviewer #3 (Public review):

      Summary:

      In this manuscript, Miyachi and Ichihashi investigate whether the arrangement of the genetic code affects mutational robustness. Using an in vitro minimal genetic code with vacant codons, they constructed 10 non-standard genetic codes by reassigning Ala, Ser, and Leu, generating codes with replacement costs that were generally higher than those of the standard genetic code across several amino acid property measures. They then tested how random mutations affected the activity of reporter proteins translated under these altered codes. Although error minimization theory predicts that higher-cost codes should make mutations more harmful, the authors report that protein function declined to a similar extent across all codes examined, suggesting that mutational robustness remains largely unchanged within the range of genetic code alterations tested here.

      Strengths:

      This is an interesting study that investigates one of the most fundamental and intriguing questions in molecular evolution: the emergence of the genetic code, which is nearly universal across nature. The in vitro approach is a powerful aspect of the work and provides an opportunity to examine this phenomenon experimentally at a depth that has previously been inaccessible.

    1. Reviewer #1 (Public review):

      Summary:

      In their article, Guo and coworkers investigate the Ca²⁺ signaling responses induced by Enteropathogenic Escherichia coli (EPEC) in epithelial cells and how these responses regulate NF-κB activation. The authors show that EPEC induces rapid, spatially coordinated Ca²⁺ transients mediated by extracellular ATP released through the type III secretion system (T3SS). Using high-speed Ca²⁺ imaging and stochastic modeling, they propose that low ATP levels trigger "Coordinated Ca²⁺ Responses from IP₃R Clusters" (CCRICs) via fast Ca²⁺ diffusion and Ca²⁺-induced Ca²⁺ release. These responses may dampen TNF-α-induced NF-κB activation through Ca²⁺-dependent modulation of O-GlcNAcylation of p65. The interdisciplinary work suggests a new perspective on calcium-mediated immune response by combining quantitative imaging, bacterial genetics, and computational modeling.

      Strengths:

      The study provides a new concept for host responses to bacterial infections and introduces the concept of Coordinated Ca²⁺ Responses from IP₃R Clusters (CCRICs) as synchronized, whole-cell-scale Ca²⁺ transients with the fast kinetics typical of local events. This is elegantly done by an interdisciplinary approach using quantitative measurements and mechanistic modelling.

      Comments on revised version.

      The revised version of the manuscript has addressed all my raised points. I'd like to thank the authors for the work they have put into the revision to make this a very compelling publication.

    2. Reviewer #2 (Public review):

      Summary:

      The authors of this study are trying to resolve how cellular infection by enteropathogenic E. coli (EPEC) subverts cellular signaling pathways to promote infection and dampen immune responses. Specifically, alteration in calcium dynamics has been evidenced in the prior literature as a potential initiator of these adaptations, and this study provides ideas and mechanistic detail as to how cellular calcium dynamics may be subverted by pathogens.

      Strengths:

      The clear strengths of this paper relate to the new ideas inherent in the proposed hypothesis and their support from the experimental approaches used. Overall, the proposed work provides new ideas in this area, which will benefit from further investigation. Certainly, this is an interesting and challenging paradigm to pick apart mechanistically, and is important for improving treatments for intestinal infections. The authors have provided additional data to clarify and expand on concerns raised during the original review, and these additions are helpful.

      Comments on revised version.

      Thorough response to original review. No further comments.

    1. Reviewer #1 (Public review):

      The authors present a compelling case for the necessity of age-specific templates in functional hyperalignment. Given that the brain undergoes substantial developmental, structural, and functional changes across the lifespan, a 'one-size-fits-all' canonical template is often insufficient. This study effectively demonstrates that incorporating age-congruent features significantly enhances the performance and sensitivity of hyperalignment models. By validating these findings across two independent datasets (Cam-CAN and DLBS), the paper provides robust evidence that accounting for age-related functional organization is a critical prerequisite for accurate functional alignment in lifespan research.

      Comments on revised version:

      The authors have been exceptionally thorough in addressing the concerns raised by the reviewers. In particular, the inclusion of the supplemental analysis on the middle-aged cohort is a valuable addition that strengthens the manuscript. Furthermore, the rationale for employing a congruent template is well-articulated; this approach clearly provides a more robust and accurate foundation for reconstructing individualized connectomes. I appreciate the authors' detailed responses and have no further comments.

    2. Reviewer #2 (Public review):

      Summary:

      In this study, Zhang and colleagues examine the role of participant selection in creating and using functional templates to improve analyses using hyperalignment. Hyperalignment aligns participants' functional MRI data to a shared functional template, analogous to the anatomical templates used to bring anatomical MRI data into a shared space (e.g., MNI152). The question of appropriate template creation is especially pressing for population-level analyses, where a large number of demographic groups (e.g., different age ranges, clinical statuses) may be included in the same analysis. These different demographic groups may have differences in their functional organization that complicate the creation of a single study-specific functional template.

      To provide an initial investigation of the potential effect of demographic-specific templates, the authors use the publicly available Cam-CAN dataset, which contains participants from 18 to 87 years of age. They define a young adult group (< 45 years of age) and an older adult group (> 65 years of age) from this dataset with approximately the same number of participants. They investigate whether "age-congruent" templates (i.e., defined in the same age group in which they are used) improve three analyses where hyperalignment has been previously shown to boost performance: inter-subject correlation (ISC), predicting individual connectomes, and predicting individual functional responses. Using the Cam-CAN derived older adult template, they then replicate the ISC analyses using the publicly available Dallas Lifespan Brain Study (DLBS).

      Overall, the presented results are highly suggestive that age-congruent templates consistently improve performance, though the absolute effects are small.

      Strengths:

      The use of a separate validation sample (re-using the same template calculated with Cam-CAN) highlights the potential of developing independent templates for individual demographic groups and then distributing these for wider use, analogous to the MNI templates that are widely used throughout the field of neuroimaging. This suggests that the potential impact of this framework is significant.

      Weaknesses:

      In their revision, the authors have addressed the previously raised "weaknesses" by providing guidance for researchers interested in using age-specific hyperalignment templates in practice.

      Impact:

      Overall, this work is likely to encourage future development of age-specific functional templates in the imaging community.

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript presents a tunable Bessel-beam two-photon fluorescence microscopy (tBessel-TPFM) platform that enables high-speed volumetric imaging with stable axial focus. The work is technically strong and broadly significant, as it substantially improves the flexibility and practicality of Bessel-beam-based two-photon microscopy. The demonstrations are generally strong and bridge a wide range of neuroimaging applications, namely vascular dynamics, neurovascular coupling, optogenetic perturbation, and microglial responses. These convincingly show that the approach enables biological measurements that are difficult or impractical with existing methods.

      The evidence supporting the technical and biological claims is generally strong. The optical design is carefully motivated, clearly described, and validated through a combination of simulations and experimental characterization. The biological applications are diverse and well chosen to highlight the strengths of the proposed method, and the data are of high quality, with appropriate controls and comparative measurements where relevant.

      Strengths:

      (1) The optical innovation addresses a well-recognized limitation of existing Bessel-TPFM implementations, namely axial focus drift during tuning, and does so using a relatively simple, light-efficient, and cost-effective design.

      (2) The manuscript provides convincing experimental evidence for this being a versatile platform to map flow dynamics across diverse vessel sizes and orientations in both healthy and pathological states.

      (3) Biological demonstrations are comprehensive and span multiple domains such as hemodynamics, neurovascular coupling, and neuroimmune responses.

      (4) Quantitative analyses of blood flow across vessel sizes and orientations, including kilohertz line scanning, are particularly compelling and clearly beyond the reach of standard Gaussian TPFM.

      (5) Particular advantages are that higher blood flow speeds become measurable, up to 23 mm/s (20x more than conventional frame scanning), and that simultaneous (Bessel-)imaging and (Gaussian-)perturbation are possible because of the stable axial focus.

      Weaknesses:

      (1) At present, the paper does not properly position the new Bessel-beam method against previous work, and fails to compare it to alternative fast volumetric imaging methods without Bessel beams.

      (2) The cost-effectiveness of the proposed method is not well described or supported by evidence; it would be useful to include more detail or remove this claim.

      (3) Some biological conclusions, e.g., regarding novel features of microglial dynamics (i.e., the observed two-wave responses and coordinated extension-retraction), are based on relatively limited sample size and would benefit from clearer discussion of variability across animals and fields of view.

      (4) The use of neural network-based denoising for microglial imaging is reasonable but introduces potential concerns about trustworthiness; additional clarification of validation or failure modes would strengthen confidence in these results.

      To conclude, most of the authors' claims are well supported by the data. The central conclusion, namely that tBessel-TPFM provides tunable volumetric imaging enabling experiments not feasible with existing two-photon approaches, is justified. Some biological interpretations would benefit from a more cautious framing, but they do not undermine the main technical and methodological contributions of the study. This is a strong and technically rigorous manuscript that makes a substantial methodological advance with clear relevance to neuroscience and intravital imaging. Minor clarifications and a slightly more measured discussion of certain biological findings are recommended.

    2. Reviewer #2 (Public review):

      Summary:

      The authors describe a tunable Bessel beam two-photon microscope (tBessel-TPFM) designed to overcome a common limitation of Bessel-based volumetric imaging: axial shifts of the effective focus during Bessel beam parameter tuning. Their optical design allows independent control of axial beam length and resolution while keeping the axial center fixed. This is extensively validated through simulations and experiments.

      Strengths:

      A major strength of the work is the breadth of validation combined with the level of technical detail provided. The authors carefully characterize the optical performance of the system and clearly explain the design choices and underlying derivations, which will make it easier for others to understand and implement. The authors demonstrate the utility of the method across several in vivo applications, including neurovascular imaging, blood flow measurements, optogenetic stimulation, and microglial dynamics.

      Weaknesses:

      In the in vivo demonstrations, the authors employ different Bessel beam configurations across experiments, but the beam parameters are not dynamically tuned during live imaging. A video example showing continuous or interactive tuning of the Bessel beam within a single in vivo imaging sequence would further highlight the practical advantages of this platform and strengthen the case for its potential applications. In addition, while excitation powers are reported, the manuscript does not place these values in the broader context of known photodamage thresholds for two-photon microscopy, which would be helpful to the readers. Denoising/image restoration are applied in one of the in vivo examples, but it is unclear why this step was used specifically for this dataset and whether it was necessary to achieve adequate SNR or primarily included as an additional demonstration.

    3. Reviewer #3 (Public review):

      Summary:

      The manuscript presents an elegant and cost-effective approach for generating a tunable Bessel beam on a conventional two-photon microscope. The authors assemble a compact optical module comprising three axicons and a series of lenses that permits rapid adjustment of both lateral resolution and axial extent without modifying the focal plane. This flexibility enables the system to be readily adapted to a variety of biological preparations. As a proof of concept, the authors employ the device to record blood flow velocities in cortical microcapillaries, arterioles, and venules, thereby directly visualizing vasodilatation and vasoconstriction dynamics and permitting quantitative analysis of neurovascular coupling across cortical layers in awake mice.

      The authors demonstrate that the tunability of the Bessel beam can be exploited to match the numerical aperture to the vessel type: a high NA configuration, albeit slower scan, is optimal for resolving flow in capillaries, whereas a low NA setting provides faster acquisition suitable for arterioles and venules. By implementing a one-dimensional line scan with the Bessel beam, they achieve an imaging speed that is twentyfold faster than conventional frame-by-frame scanning, which proves sufficient to capture hemodynamic transients before and after an induced ischemic stroke.

      In addition to pure observation, the authors integrate a co-propagating Gaussian line into the system, allowing simultaneous imaging and photostimulation within the same focal plane. This capability addresses a common limitation of other Bessel beam implementations, in which the observation and perturbation planes often become misaligned when the Bessel beam is altered. The manuscript also emphasizes the advantage of Bessel beam excitation for calcium imaging after a perturbation, because it captures neuronal activity in planes both above and below the nominal focal plane, signals that would be missed with a standard Gaussian focus. Finally, the authors apply the technique to investigate the neuroimmune response following targeted microglial ablation; they report that adjacent microglia extend processes toward the injury site while retracting processes in the opposite direction.

      Overall, the work offers a technically straightforward yet powerful extension to existing two-photon platforms, providing high-speed, volumetric imaging and stimulation capabilities that are well-suited to a broad range of neurovascular and neuroimmune studies. The experimental validation is quite thorough, and the presented data convincingly illustrate the benefits of the approach.

      Strengths:

      The authors present a truly clever and inexpensive optical module that can be integrated into almost any two-photon microscope, providing a tunable Bessel beam with a minimal modification of the existing system. The experimental data and accompanying quantitative analysis convincingly demonstrate that the system can reveal physiological events, such as capillary flow, calcium transients across multiple axial planes, and microglial process dynamics, that are difficult or impossible to capture with a conventional Gaussian beam. The breadth of experiments chosen for the manuscript illustrates the practical utility of the device and supports the authors' conclusions that it extends the functional repertoire of standard two-photon microscopy.

      Weaknesses:

      The manuscript would benefit from a more detailed contextualisation of the claimed speed advantage. Although the authors mention other techniques in the introduction, they do not provide any direct comparison with other state-of-the-art high-speed two-photon approaches such as light beads microscopy (Demas et al., Nat. Methods 2021), temporal multiplexing schemes (Weisenburger et al., Cell 2019), or random access microscopy (Villette et al., Cell 2019). A brief comparison of imaging speed, spatial resolution, and instrumental complexity would enable readers to assess the relative merits of the present method.

      A second limitation that warrants discussion is the inherent trade-off between volumetric coverage and image specificity. Because the Bessel beam excites fluorescence throughout an extended axial range, the detector inevitably integrates signal from a three-dimensional volume into a two-dimensional image. In densely labelled tissue, this can lead to significant signal crosstalk, reducing contrast and complicating quantitative interpretation. A brief analysis of how labeling density affects the fidelity of flow or calcium measurements, or suggestions for mitigating crosstalk (e.g., computational deconvolution, adaptive excitation shaping, or combinatorial sparse labeling), would broaden the applicability of the technique.

    1. Reviewer #2 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      The authors pair analysis of replication timing and allele-specific expression in clonal populations of primary human cells. They combine these data with previously published data on clones from transformed human cell lines. They identify a number of genomic regions that display asynchronous replication timing in at least one clone and correlate these regions with allele-specific expression of genes within them. They also observe that several interesting gene sets, including genes that are associated with human diseases, map to asynchronously replicating regions. This is a good experimental approach that builds on already published data demonstrating the connection between allelic imbalance and replication timing.

    1. Joint Public Review:

      This manuscript puts forward the provocative idea that a posttranslational feedback loop regulates daily and ultradian rhythms in neuronal excitability. The authors used in vivo long-term tip recordings of the long trichoid sensilla of male hawkmoths to analyze spontaneous spiking activity indicative of the ORNs' endogenous membrane potential oscillations. This firing pattern was disrupted by pharmacological blockade of the Orco receptor. They then use these recordings together with computational modeling to predict that Orco receptor neuron (ORN) activity is required for circadian, not ultradian, firing patterns. Orco did not show a circadian expression pattern in a qPCR experiment, and its conductance was proposed to be regulated by cyclic nucleotide levels. This evidence led the authors to conclude that a post-translational feedback loop (PTFL) clockwork, associated with the ORN plasma membrane, allows for temporal control of pheromone detection via the generation of multi-scale endogenous membrane potential oscillations. The findings will interest researchers in neurophysiology, circadian rhythms, and sensory biology. However, the manuscript has limited experimental evidence to support its central hypothesis and is undermined by several assumptions that underlie their data analysis and model builds, as well as insufficient biological data including critical controls to validate and/or fully justify the model the authors are proposing.

      Strengths:

      The authors raise several intriguing model-based hypotheses regarding the mechanisms that underlie the generation of olfactory rhythms. The electrophysiological approach and the long-term recording paradigm are elegant and technically impressive. In the revised version, the authors have added additional qPCR data supporting the lack of rhythmic Orco transcript expression and included a new figure suggesting that cAMP can modulate Orco conductance.

      Major weaknesses:

      (1) The cAMP experiment was only conducted at one time-point, which is insufficient to support the central claim that "cAMP and cGMP may have ZT-dependent effects on Orco conductivity".

      (2) The revised manuscript continues to rely heavily on prior publications or defers key mechanistic questions (or important manipulations) to future studies. In its current form, the evidence presented remains insufficient to support the central claim that a PTFL constitutes the primary underlying circadian clock mechanism. The proposed model is intriguing, but the data provided do not yet directly demonstrate the novel mechanism.

    1. Reviewer #1 (Public review):

      M. tuberculosis exhibits metabolic flexibility, enabling it to adapt to various environmental stresses, including antibiotic treatment. In this manuscript, Serafini et al. investigate the metabolic remodeling of M. tuberculosis used to survive iron-limited conditions by employing LC-MS metabolomics and 13C isotope tracing experiments. The results demonstrate that metabolic activity in the oxidative branch of the TCA cycle slows down, while the reductive branch is reverted to facilitate the biosynthesis of malate, which is subsequently secreted.

      Overall, this study is experimentally well-designed, particularly the use of 13C isotope tracing to monitor TCA cycle remodeling under iron-limited conditions. The findings are valuable as they offer potential new targets for antibiotics aimed at non-replicating M. tuberculosis occurring in the hosts.

      Comments on revised version:

      All concerns are well addressed.

      I have one minor concern: Page 3 line 16 - Fig. 1G & H: The kinetics of ATP levels between H37Rv and Erdman seem different; Erdman induces greater ATP at days 2 and 3 after DFO treatment, which was not clear in H37Rv. Fig. 1I shows NAD/NADH ratio not NADH/NAD ratio. Please change it to NADH/NAD+ to be consistent with Supplement Fig. 1 result. Include the 17-day result of NADH/NAD+ in the discussion section to explain the different viability between the two strains.

    2. Reviewer #2 (Public review):

      Summary:

      The authors investigated how prolonged iron limitation (which stops growth but does not lead to cell death) alters central metabolism in M. tuberculosis. The major tool they used is metabolomics combined with stable isotope tracing. They show that the Krebs cycle is still active, even though it depends on some iron-dependent enzymes. They show that carbon flux through the oxidative branch of the Krebs cycle is stalled, resulting in the accumulation of metabolites, such as malate and alpha-ketoglutarate, which are partially secreted. Apparently, the carbon flux from glycolysis is partially diverted to the reductive branch of the Krebs cycle. This is not achieved by using the glyoxylate shunt but probably through the GABA shunt. This unprecedented split of the Krebs cycle and malate secretion allows a continuous flow of carbon through the core of carbon metabolism, overcoming the metabolic stalling triggered by iron starvation.

      Strengths:

      Novel insight into the central metabolism of a major pathogen and its adaptation to iron starvation. Carefully conducted experimentation. The paper ends with a clear and helpful model.

      Weaknesses:

      The authors show some surprising and important findings, but a little more effort is needed to substantiate them. In particular, the role of the GABA shunt should be genetically tested, as was done for ICL and the glyoxylate shunt.

      Also, Dataset 1 is not very convincing: it is based only on transcriptomics and shown simply as up or down, hardly a strong basis for major conclusions. At the very least, one would want actual differences, preferably at the protein level, where it really counts.

      Comments on the revised version:

      In the revised version all these points were appropriately dealt with and discussed, although some of them textually and not experimentally, but for reasons that are logical.

    1. Reviewer #1 (Public review):

      This paper aims to improve the accuracy of predictions of the impact of ITN strategies by developing a method to estimate the duration of ITN access and use over time on a subnational scale from cross-sectional survey data and the numbers of ITNs received annually. The subnational estimates are then input into a mathematical model to predict clinical cases under different ITN distribution strategies.

      Strengths:

      The approach is novel and addresses a useful and timely topic. It makes use of available routine data, and has considered all of the relevant components of ITN distributions.

      The authors have made revisions, particularly to the methods, appendices and title - leaving the paper easier to follow, and with a clear, consistent aim. The assumptions are clearly stated.

      Weaknesses:

      The weaknesses are shared with other models of similar complexity: it is not easy for a casual reader to fully understand the model or the implications of the assumptions that had to be made. The use of routine data is good for availability, but data quality may be an issue in some places.

    2. Reviewer #2 (Public review):

      Summary:

      The authors design a custom Bayesian model to estimate the probabilities of access, use and use given access of insecticide-treated nets in six African countries, providing sub-national estimates and inferring the average duration of ITN use and access. An individual-based model was employed to simulate malaria epidemics and estimate the effectiveness of different ITN distribution strategies. The study finds that the mean probability of use or access did not reach 80% (a universal coverage formerly targeted by WHO) for any of the regions even for biennial campaigns, demonstrates that switching from triennial to biennial distribution campaigns increases population use by 7.9%, and evaluates the impact of employing more efficient ITNs on P. falciparum prevalence.

      Strengths:

      The authors developed a data-driven model that accounts for data collection imperfections and sources of uncertainty while differentiating between ITN use and access. They developed a methodology to infer the timing of mass campaigns from publicly available data instead of assuming fixed dates. The probability of use given access makes it possible to identify the regions where ITN distribution is least effective. This work can help better inform future interventions by identifying regions where increasing mass campaign frequency or employing better ITNs is most effective. Finally, in addition to insights on ITN access and use for the six countries analyzed, the paper contributes a methodological framework that can likely be extended to other countries.

      Weaknesses:

      Since the models employed are rather complex, the methodology description may be hard to follow for some readers. In addition, the models rely on many assumptions, including exponential decay of ITN use/access and narrow prior distributions. It is worth noting that, in the revised version of the manuscript, the authors justified the choice of exponential decay and narrow prior distributions, and made a significant effort to clarify the methodology and the model equations.
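
      The exponential-decay assumption mentioned above can be made concrete with a minimal sketch. It is not the authors' Bayesian model: it assumes, purely for illustration, that use starts at a peak value right after a mass campaign and decays with a hypothetical mean duration `tau_years`, and it compares the time-averaged use over biennial and triennial cycles.

```python
import math


def mean_use_over_cycle(interval_years, tau_years, peak_use=1.0):
    """Time-averaged use over one campaign cycle.

    Assumes use(t) = peak_use * exp(-t / tau_years) for 0 <= t <= interval_years,
    so the average is peak_use * (tau / T) * (1 - exp(-T / tau)).
    """
    ratio = interval_years / tau_years
    return peak_use * (1.0 - math.exp(-ratio)) / ratio


# Hypothetical mean duration of use; not a value taken from the manuscript.
tau = 2.0
biennial = mean_use_over_cycle(2.0, tau)
triennial = mean_use_over_cycle(3.0, tau)
print(f"biennial: {biennial:.3f}, triennial: {triennial:.3f}")
```

      Under this toy assumption the shorter cycle always gives the higher average, but the size of the gap depends entirely on the assumed `tau_years` and `peak_use`.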

      Comments on revised version:

      I appreciate the improvements made to the text. The methodology description is much clearer now. I have no further suggestions.

    1. Reviewer #1 (Public review):

      Summary:

      Launay et al. conducted a cytological screen for programmed DNA elimination (PDE) in 25 new Rhabditidae species and detected PDE in 17 of the 25 species, representing 12 of 17 genera within the family. This work is significant because it expands PDE from a few known nematodes to a much broader set of Rhabditidae species.

      Strengths:

      By demonstrating PDE across many genera with the exception of C. elegans and some other Caenorhabditis species, the study provides an important resource for investigating PDE's evolutionary origins, mechanisms of genome reorganization and DNA repair, and its functional consequences.

      Most of the observed PDEs were supported by solid evidence through a survey-style cytological screen (PDE detected in 17/25 species and in 12/17 genera), which supports the main claim of widespread occurrence.

      Weaknesses:

      Although most PDE claims are supported by solid evidence, the depth of characterization is not always described. For example, how many replicates were conducted for each species? How reproducible are the claimed PDEs between embryos in terms of timing and the identities of the cells destined for PDE? Is it possible to validate a subset of PDE cases with independent evidence, especially those with marginal PDE? This is important because some dying embryos may fail to maintain their chromosome integrity and release broken DNA, while others may be affected by confounding signals from intracellular parasites (for example, microsporidia) or even highly condensed mitochondrial DNA.

    2. Reviewer #2 (Public review):

      Summary:

      Programmed DNA elimination is increasingly recognised as an important phenomenon across many species, including in animals. Exactly how widespread is still unclear, and the function of PDE is even more mysterious in most species where it has been described. PDE has been discovered in several nematode species, and in this manuscript, the authors carry out a more extensive search for PDE. They find PDE in many species, indicating that it is widespread across the phylum.

      Strengths:

      The large number of species across many different clades provides good evidence that the phenomenon has evolved many times independently. The work will therefore prompt many further studies characterising individual species, and potentially linking the evolution of the phenomenon to other features of these species' ecological characteristics.

      Weaknesses:

      The major technical weakness of this project is the assay that is used to evaluate PDE. First, this assay is clearly insensitive: as the authors acknowledge, O. tipulae, which has PDE, does not appear in their screen. Second, the assay gives no information about breakpoints and only limited, non-quantitative information about how much DNA is eliminated. Thus, their data really amount to only a preliminary screen, which would need to be confirmed by genomic assays.

    3. Reviewer #3 (Public review):

      Summary:

      Somatic programmed DNA elimination (PDE), also known as chromatin diminution, has primarily been studied in parasitic nematodes, such as Ascaris species, in which it was discovered almost 140 years ago. Recently, PDE has also been reported in three non-parasitic nematode species. In this manuscript, Launay et al. present the results of a large-scale cytological and evolutionary study of PDE across 29 free-living nematode species belonging to the Rhabditidae family, for which they established a phylogeny based on 18S and 28S ribosomal RNA sequences. By combining DNA staining and telomere DNA FISH labeling in developing embryos, they convincingly document the formation of lagging fragments and/or the loss of long germline telomeres in 17 species, during one particular division of somatic precursor cells.

      Strengths:

      (1) The whole study is well executed, and the results are convincing.

      (2) The authors present compelling evidence that PDE is an ancestral feature of Rhabditidae nematodes.

      (3) This study provides a valuable resource of lab-tractable species for future PDE studies.

      Weaknesses:

      (1) Some clarifications are necessary to make the figures more reader-friendly.

      (2) Important references to ciliates are missing.

    1. Reviewer #1 (Public review):

      "Learning is a fundamental source of individuality," by Manna and colleagues, interrogates different sources of variation in individual behavior. The authors place individual flies in a Y-shaped arena, which is a common design in the field, and illuminate the arms of the Y with blue versus green light. They track the color preference of individual animals and also perform operant conditioning, meaning that they teach the fly to avoid a particular color/arm by generating a foot shock when the fly enters that arm. There are a number of things that are impressive about this setup: The authors are able to collect data on thousands of individual flies of many different strain backgrounds, and they demonstrate a strong change in color preference after conditioning. This is nice, because in past papers, visual learning ability has been modest and difficult to study. To put a number on it, in this paper, animals on average don't show a color preference at the start of the assay, spending around 30% of their time in the one arm illuminated green, and the remaining time in the two arms illuminated blue. After conditioning, the average animal spends only 23% of its time in the green arm.

      The authors run 64 animals through the assay for each of 88 wild-type strains (maybe? see Major Point 1 below) and see considerable strain-specific (genetic) variation in the change in time spent in the shocked color after conditioning. Some strains show no learning, while others spend <10% of their time in the shocked color after conditioning. They also, I believe, see that some strains have more variability across individuals, which would suggest that some strains have stronger canalization at the development or circuit function level than others, i.e., some genotypes produce more consistent copies of the individual, others less consistent copies. (Or, some genotypes produce robust circuits, and others produce noisy circuits.)

      Finally, the authors argue statistically that learning itself increases variability in individual performance. This makes a lot of sense to me intuitively. Learning changes the physical/chemical properties of circuits in the brain, and because it evolves over time and interacts with environmental variables, it seems like it should send different animals down different channels. Or, at a conceptual level, if I learn to play the piano and my sister doesn't (because of some genetic difference between us or something stochastic), this learning experience will cause all sorts of other differences in our behavior as time passes. I also think the authors do have enough data to be able to make this finding. However, the presentation of the argument in this portion of the paper is hard for me to understand, and I am not an expert in statistics, so the strength of the result is difficult for me to evaluate.

      Major points

      (1) It's difficult to track through the paper the number of animals tested for different assays. At the beginning, it says N=5632, which works out to 64 flies for each of the 88 DGRP strains. 64 happens to be the number of parallel Y arenas they have. Later in the methods, there's a description of more variation within the set of 64 for each strain, two different parent sets per strain, different sexes, conditioned and unconditioned. And, while the results text focuses on the color learning, the methods discuss additional assays (place learning, multi-day learning).

      Given the numbers, does each run of the 64 mazes include all the tested flies of one strain, or are flies of many strains included in each batch? Do different flies do different assays (color, place, multi-day), or do they all do all the assays? Perhaps there is a table including this information already in the supplement, but I recommend making it much clearer in the main results text and methods. While the dataset is large, if it is split over many conditions and/or if batch and genotype confound each other, this will affect the robustness of the results and how strong the conclusions can be.

      (2) The data presentation in Figure 1 is elegant and easy to follow, but getting into Figure 2 and subsequently, I get lost in the statistics and have trouble understanding what is being measured. My understanding of the big picture is that while genetics and individual randomness contribute a lot to behavior, the evidence for learning as an amplifier of individuality is that variance in behavior among animals of the same strain increases over time in the conditioned group (i.e., the group that is doing the most learning, or a specific kind of learning), but not in the control group. This idea is illustrated in the flattening distributions in the cartoons in Figure 1A. The authors should include graphs of the real data that use the same format as in that cartoon. Instead, the graphs present "residuals," and I don't know what those are. I suspect it's "variation left over after accounting for effects of strain and individual stochasticity." I see the residuals being tracked per strain over time in Figure 2H, but I don't see the change over time in other graphs. I'm looking for something simple like, "variation within the strain at the beginning of learning and at later time points in learning." (But I'm not sure exactly what instantaneous measurement would be the focus in longitudinal analyses of learning behavior.)

      (3) Figure 3 is a cool stab at tracking down the precise mechanism by which a stochastic environment interacts with learning to send individuals along different behavioral routes. But again, like in Figure 2, I don't have the sophisticated understanding of statistics to understand exactly what the graphs are telling me, or how they relate to the underlying measurements. I'm relying on the results text alone to reach a conceptual understanding, and just taking the graphs on trust.

      So, overall, the authors have a very nice body of work here, and with the potential to add a new facet to our understanding of the origins of diversity in animal behavior. In addition to the interpretations they focus on here, this dataset also represents an advance in studying visual associative learning in general, and quite an amazing ability to make longitudinal measurements of many behavioral decisions within the same animals. Improving the data presentation to make it easier to follow for a larger swathe of researchers, especially in figures 2 and 3, will increase its potential impact.

    2. Reviewer #2 (Public review):

      Summary:

      The authors set out to test the extent to which differences in learning capacity and experience contribute to behavioural variation in a genetically identical population under identical environmental conditions.

      Strengths:

      The authors developed and used a scaled-up version of a simple two-choice behavioural paradigm, allowing them to test thousands of individuals across multiple genotypes. They then deployed clever and powerful statistical analysis methods and provided compelling evidence for a role of variability in learning in the expression of behavioural variation.

      Weaknesses:

      There are no major weaknesses, although some level of longitudinal analysis to strengthen the evidence for a strict definition of individuality would be a welcome extension of a future study. In addition, it would have been very interesting, although understandably beyond the current scope, to delineate a potential source of learning variability in the brain.

    1. Reviewer #1 (Public review):

      Summary:

      This paper presents a toolkit for the transformation of Blastocystis. The authors have screened a number of selectable agents, promoters and reporter genes and present their findings. This resource will be of immense use to those in the Blastocystis field, as well as those seeking to establish transformation tools in other species where such tools do not yet exist. Establishing new transformation tools is extremely challenging, and the authors have done an excellent job.

      Strengths:

      The authors have carried out a systematic screen of promoters, reporter genes and selectable agents. They have screened numerous candidates for each, and all the data are presented. It is good to see what did not work as well as what did, so this data set is extremely useful indeed.

      Weaknesses:

      The findings are reported by reporter gene assay (microscopy). No genetic evidence is given. The authors claim that the DNA is maintained episomally. However, could the DNA have integrated instead? No PCRs/RT-PCRs are shown (although it can safely be assumed that the DNA/RNA is present where the transformation was successful), nor are any Western blots. These would have been useful to show that the P2A ribosomal skipping had occurred, and that proteins were expressed individually rather than as a polyprotein.

    2. Reviewer #2 (Public review):

      This manuscript presents a substantial technical advance for the genetic manipulation of Blastocystis by establishing an integrated workflow for stable episomal transgenesis, antibiotic selection, clonal recovery, and reporter-based imaging in the ST7-B subtype. The study is particularly valuable because it combines multiple previously fragmented approaches into a coherent and practically applicable toolkit, including endogenous regulatory elements, optimized electroporation conditions, selectable markers, and anaerobic-compatible fluorescent reporters. This methodological work greatly expands the molecular toolbox, and future studies focused on both basic and infection biology can now build on the ability to express and localize proteins in fixed as well as live cells.

      The microscopy data are convincing and clearly demonstrate functional reporter expression and successful recovery of stable transgenic lines. Nevertheless, because this is primarily a methodological paper, the study would be further strengthened by the inclusion of Western blot validation of reporter expression and bicistronic constructs. In particular, biochemical analysis of the P2A-containing constructs would help assess the efficiency of ribosomal skipping and exclude the possible presence of uncleaved fusion proteins, thereby providing stronger support for the interpretation of the imaging data and the functionality of the expression system.

    3. Reviewer #3 (Public review):

      Summary:

      The primary objective of this study was to establish a practical and functional framework for the propagation of stable transgenic cell lines of Blastocystis, a common animal gut microeukaryote. Although the work focused on Blastocystis ST7-B, a subtype with relatively low prevalence in humans, this choice is justified by its association with more frequent negative health effects. Beyond their relevance to the medical field, the methodological advances described here have the potential to also expand cell biology studies of this anaerobic organism, including its unusual mitochondria and redox metabolism.

      Strengths:

      Prior to this work, genetic tools for Blastocystis were very limited, relying on a single strong promoter-terminator combination. The authors successfully expanded the available promoter set across a range of expression strengths by testing two dozen variants in luciferase-based assays. Critically, they developed an integrated workflow from a modular transgenic construct design, to an expanded inventory of molecular components (promoters, reporters), optimized DNA delivery, stepwise antibiotic resistance-mediated clonal selection and propagation, and to reporter validation. The evaluation of several anaerobiosis-compatible labeling strategies for live (and fixed) cell optical imaging will be particularly useful, with the SNAP-tag system appearing especially promising for Blastocystis.

      Weaknesses:

      The presented data generally provide solid support for the conclusions that the work reached, but clarification of reasoning and several inconsistencies, as well as amendments to the visual presentation of the data, would be highly beneficial, as detailed below.

      (1) Episomal persistence of the construct:
      The manuscript repeatedly assumes, including in its title, that constructs persist in Blastocystis in their episomal form, but no direct evidence is provided. Although this interpretation is plausible, it should be identified more clearly as provisional. Nuclear genomic integration (e.g., via NHEJ) remains a possible explanation unless supporting evidence or rationale is provided to exclude it. Testing whether the phenotype persists without drug-mediated selection in the generated transgenic cell lines would help strengthen the case for episomal maintenance.

      (2) Promoters and terminators:
      2.1) There is a discrepancy between the claimed number of loci (14), from which promoters used to drive luciferase expression were derived, and those detailed as having been actually generated in Table 1 (11). This inconsistency should be corrected or explained, as it creates uncertainty around the accuracy of the dataset.
      2.2) Based on the presented evidence, constructs benchmarked in bioluminescence assays differed only in their promoter composition. Although terminator selection is mentioned in the Methods section, no additional details are provided; for instance, Table 1 and Figure 2 only list 23 promoters in total. Figure 2A likewise shows only promoter-dependent variation. If the terminator was held constant (LeguP1?), this should be stated explicitly. The authors may then consider revising the wording of having tested "23 promoter-terminator pairs" to better reflect that only promoters varied.
      2.3) Promoter benchmarking was done with a plasmid lacking a selection marker, so it is unclear how the maintenance of the luciferase construct was ensured. Without selection, the observed reporter intensity could reflect differential or stochastic plasmid retention rather than promoter strength alone. The luminescence assay was performed 16-18 hours after transfection, but the rationale for this particular timeframe should be explained. In this context, the authors should explicitly state whether the experiments shown in Fig. 2A represent biological triplicates or technical triplicates from a single transfection.

      (3) Figure 2:
      3.1) Several aspects of the current design may lead to ambiguity for the reader. The boxplots are colour-coded, but it is unclear whether the colours carry meaning or are purely decorative. Because the data are already spatially separated into bins, additional random colouring is redundant and may suggest distinctions that are not intended. In addition, part A of Figure 2 is split into two panels, with the scale for the left panel shown in the right panel and some of the boxplot colours falling in the range of the scale, but not in line with their counterparts in the left panel. Because the colour use is not consistent, it is difficult to tell whether the same scale should be applied to both panels or how it should be interpreted.
      3.2) The left panel of part A uses a diverging blue-white-red colour scheme, which is most appropriate when the midpoint represents a meaningful central value such as zero. Because the values shown in this graph are only positive, a non-diverging 2-colour scale or a colour palette such as 'viridis' would make the plot easier to interpret.
      3.3) A black background should be avoided: 'B' and 'C' labels are invisible, and it draws attention to a distracting design feature rather than the data themselves.

      (4) Figure 3:
      4.1) Individual snapshots should be separated more clearly, either by using a white background or by adding visible borders to make the overall composition clearer. As currently displayed, some boundaries between fluorescent channels resemble image artifacts rather than intentional panel divisions.
      4.2) In parts B-D, the legend should explain more clearly what each image shows, and the figure itself would benefit from annotations. There seem to be three sub-panels in each 'condition' of part B (as well as C and D): while the middle and rightmost panel can be easily inferred to represent the fluorescent protein and bright-field image, what the leftmost panels represent is not specified. If DAPI was used to dye DNA, an explanation why mostly multiple labelled regions are visible should be provided.
      4.3) Cell morphology and appearance differ markedly between UnaG/smURFP and SNAP-tag images, which should be explained. A microscope issue is mentioned in the main text, but if that was the cause, the authors should consider replacing the images, as the current distortions complicate interpretation.

    1. Joint Public Review:

      In this manuscript, the authors proposed an approach to systematically characterise how heterogeneity in a protein signalling network affects its emergent dynamics, with particular emphasis on drug-response signalling dynamics in cancer treatments. They named this approach Meta Dynamic Network (MDN) modelling, as it aims to consider the potential dynamic responses globally, varying both initial conditions (i.e., expression levels) and biophysical parameters (i.e., protein interaction parameters). By characterising the "meta" response of the network, the authors propose that the method can provide insights not only into the possible dynamic behaviours of the system of interest but also into the likelihood and frequency of observing these dynamic behaviours in the natural system.

      The authors study the Early Cell Cycle (ECC) network as a proof of concept, focusing on pathways involving PI3K, EGFR, and CDK4/6 with the aim of identifying mechanisms that may underlie resistance to CDK4/6 inhibition in cancer. The biochemical reaction model comprises 50 state variables and 94 kinetic parameters, implemented in SBML and simulated in Matlab. A central component of the study is the generation of large ensembles of model instances, including 100,000 randomly sampled parameter sets intended to represent intra-tumour heterogeneity. On the basis of these simulations, the authors conclude that heterogeneity in kinetic rate parameters plays a stronger role in driving adaptive resistance than variation in baseline protein expression levels, and that resistance emerges as a network-level property rather than from individual components alone. The revised manuscript provides additional clarification regarding aspects of the simulation and filtering procedures and frames the comparison with experimental data as qualitative. Nonetheless, the study is best interpreted as a theoretical and exploratory analysis of the model's behaviour under heterogeneous conditions. Consequently, questions remain regarding the biological grounding of the sampled parameter regimes and the extent to which the reported frequencies of resistance-associated behaviours can be directly interpreted in physiological terms.

      While the authors propose a potentially useful computational framework to explore how heterogeneity shapes dynamic responses to drug perturbation, a number of important conceptual and methodological concerns remain to be addressed:

      (1) The sampling of kinetic parameters constitutes the backbone of the manuscript, yet important concerns remain regarding its biological grounding and transparency. Although the revised version provides additional clarification on the exploration of "model instances", it is still not sufficiently clear how parameter values and initial conditions are generated, nor how the chosen ranges relate to biological measurements. The kinetic rates are sampled over broad intervals without explicit justification in terms of experimentally measured bounds or inferred distributions. As a consequence, it remains uncertain whether the ensemble of simulated behaviours reflects physiologically plausible cellular regimes or primarily the properties of the assumed parameter space. In this context, the large-scale sampling (100,000 parameter sets) resembles a Monte Carlo exploration of the model rather than a biologically calibrated representation of tumour heterogeneity.

      Furthermore, the adequacy of the sampling strategy in such a high-dimensional space (94 free parameters) remains open to question. In the absence of biologically informed constraints, the combinatorial space of possible parameter configurations is vast, and it is unclear to what extent the sampled ensembles can be considered representative. This issue is particularly relevant because the manuscript interprets the frequency of resistance-associated behaviours as indicative of their likelihood.
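
      A back-of-the-envelope calculation illustrates the scale of the coverage concern. The sketch below uses only the two figures quoted in this review (100,000 parameter sets, 94 kinetic parameters) and compares the ensemble size with a full grid over the same space. The grid comparison is an illustrative assumption: the authors sample randomly rather than on a grid, so this speaks to how sparsely the space is covered, not to the precision of a frequency estimate under the assumed sampling distribution.

```python
n_samples = 100_000   # ensemble size quoted in the review
n_parameters = 94     # number of kinetic parameters quoted in the review

# Grid levels per parameter that a full grid of the same total size would allow.
levels_per_axis = n_samples ** (1.0 / n_parameters)

# Smallest non-trivial full grid: two levels (e.g. low/high) per parameter.
two_level_grid = 2 ** n_parameters

print(f"equivalent levels per parameter: {levels_per_axis:.2f}")
print(f"points in a two-level grid: {two_level_grid:.2e}")
print(f"fraction of that grid covered: {n_samples / two_level_grid:.1e}")
```

      In other words, the ensemble is far smaller than even the coarsest full grid, which is why biologically informed constraints on the parameter ranges matter for interpreting the sampled behaviours.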

      The validation presented in Figure 7 does not fully resolve these concerns. The comparison with experimental data is qualitative, and the simulations are performed in arbitrary time units, which complicates direct interpretation alongside time-resolved experimental measurements. Moreover, certain qualitative discrepancies between simulated and experimental trends (e.g., persistent versus decreasing CDK4/6 activity) are not thoroughly discussed. As this figure represents the primary empirical reference point in the manuscript, the extent to which the model captures experimentally observed dynamics remains uncertain.

      Finally, aspects of presentation continue to limit transparency. Parameter ranges are described at different points in the manuscript but are not consolidated clearly in the Methods, and the definition of initial conditions remains ambiguous - particularly whether these correspond to conserved quantities or to the dynamic variables used to initialise simulations. In addition, the exact number of model instances underlying specific analyses and figures is not always explicit. Greater clarity on these issues is essential for assessing reproducibility and for interpreting the quantitative claims of the study.

      (2) A central conclusion of the manuscript is that heterogeneity in protein-protein interaction kinetics is a stronger driver of adaptive resistance than heterogeneity in protein expression levels. To assess the latter, the authors fix a nominal set of kinetic parameters and generate 100,000 random initial concentrations for the 50 model species. However, according to the simulation protocol described in the manuscript, each trajectory includes three phases: (i) simulation under starvation conditions to equilibrium, (ii) mitogenic stimulation to a second ("fed") equilibrium, and (iii) application of drug treatment. The equilibrium concentrations reached in phases (i) and (ii) are determined by the kinetic parameters of the model and are independent of the initial concentrations, provided the system converges to a stable steady state. In dynamical systems terms, stable equilibria are defined by the parameter set and attract all initial conditions within their basin of attraction. Since the kinetic parameters are fixed in this experiment, the pre-treatment equilibrium that serves as the starting point for drug application should likewise be fixed. Under these conditions, it is therefore not unexpected that sampling a large number of initial concentrations has limited influence on the treated dynamics.
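
      The dynamical-systems argument above can be illustrated with a deliberately simple toy system. This is a sketch under stated assumptions: a single species with constant synthesis and first-order degradation, which is not the ECC model, and all names and values are hypothetical.

```python
import random


def simulate(x0: float, k_syn: float, k_deg: float,
             dt: float = 0.01, steps: int = 5000) -> float:
    """Explicit Euler integration of dx/dt = k_syn - k_deg * x."""
    x = x0
    for _ in range(steps):
        x += dt * (k_syn - k_deg * x)
    return x


def pre_treatment_states(n_instances: int, k_syn: float, k_deg: float,
                         seed: int = 0) -> list:
    """Draw random initial concentrations, keep the kinetics fixed,
    and return the state reached before 'treatment' would begin."""
    rng = random.Random(seed)
    return [simulate(rng.uniform(0.0, 100.0), k_syn, k_deg)
            for _ in range(n_instances)]


finals = pre_treatment_states(1000, k_syn=2.0, k_deg=0.5)
print(min(finals), max(finals))  # both close to k_syn / k_deg
```

      With fixed kinetics and a single stable steady state, every randomly drawn initial concentration ends at essentially the same pre-treatment state, so no variability is left to propagate into the drug response.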

      This raises conceptual questions about the interpretation of the comparison between kinetic and expression heterogeneity. If the system converges to a unique stable steady state prior to treatment, then variability in initial concentrations does not propagate into variability in drug response, and the observed dominance of kinetic heterogeneity may partly reflect this structural property of the model rather than a biological principle. Clarification is needed regarding whether multiple steady states exist under the nominal parameter set, and if so, how basins of attraction are explored.

      More broadly, it remains unclear why initial protein concentrations can be sampled independently of the kinetic parameters. In biological systems, steady-state expression levels are typically determined by the underlying kinetic rates. A more consistent approach might require constraining initial concentrations to correspond to equilibrium states of the chosen parameter set, thereby introducing relationships between at least some of the 50 initial conditions and the 94 kinetic parameters. Finally, the manuscript employs a non-standard terminology regarding "initial conditions," which may further obscure interpretation of these results and would benefit from clarification.

      (3) The technical implementation of the modelling and simulation framework remains difficult to evaluate due to insufficient methodological detail. Although the authors state that kinetic parameters are randomly sampled, the manuscript does not specify the distributions from which parameters are drawn, nor whether potential correlations between parameters are considered or explicitly ignored. Without this information, it is not possible to assess how implicit modelling assumptions shape the ensemble of simulated behaviours. Given that the conclusions rely on frequency-based interpretations across sampled parameter sets, greater transparency regarding the sampling procedure is essential.

      A further concern relates to the parameter filtering step. The authors report that the "vast majority" of sampled parameter sets produced systems that were "too stiff," and that these were excluded on the grounds that stiff dynamics are not biologically plausible. However, the manuscript does not clearly define how stiffness is assessed, nor why stiffness is interpreted as biologically unrealistic rather than as a numerical property of the formulation. In standard practice, stiff systems are typically handled using appropriate implicit solvers rather than being discarded. Similarly, parameter sets that produce negative state values are excluded, yet such behaviour may arise from numerical artefacts rather than from intrinsic model inconsistency. The rationale for excluding these parameter sets, rather than adapting the numerical scheme, is not sufficiently justified.
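
      As a minimal sketch of why stiffness and negative values can be numerical rather than biological, consider the fast decay dy/dt = -rate * y, whose exact solution is always positive. The test equation, the rate, and the step size are assumptions chosen for illustration and are unrelated to the authors' MATLAB setup.

```python
def explicit_euler(rate: float, y0: float, dt: float, steps: int) -> list:
    """y_{n+1} = y_n + dt * (-rate * y_n); unstable when dt * rate > 2."""
    ys = [y0]
    for _ in range(steps):
        ys.append(ys[-1] * (1.0 - dt * rate))
    return ys


def implicit_euler(rate: float, y0: float, dt: float, steps: int) -> list:
    """y_{n+1} = y_n + dt * (-rate * y_{n+1}), solved in closed form;
    stable and positive for any dt > 0."""
    ys = [y0]
    for _ in range(steps):
        ys.append(ys[-1] / (1.0 + dt * rate))
    return ys


explicit = explicit_euler(rate=1000.0, y0=1.0, dt=0.01, steps=10)
implicit = implicit_euler(rate=1000.0, y0=1.0, dt=0.01, steps=10)
print("explicit:", explicit[:3], "... final", explicit[-1])
print("implicit:", implicit[:3], "... final", implicit[-1])
```

      The explicit scheme produces negative, growing values for the same equation that the implicit scheme integrates without difficulty, so neither symptom by itself indicates an implausible parameter set.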

      The reported rejection rate - approximately 90% of sampled parameter sets - is substantial and raises questions regarding the interplay between model structure, parameter ranges, and numerical methods. As currently described, the filtering step appears to select parameter sets based primarily on computational tractability rather than on experimentally motivated biological criteria. The manuscript would be strengthened by clarifying whether the retained parameter sets are representative of biologically meaningful regimes, and by distinguishing clearly between exclusions based on biological plausibility and those arising from numerical considerations.

      Finally, important aspects of the simulation protocol require clarification. The model is simulated under "fasted" and "fed" conditions until equilibrium is reached, yet the criterion used to determine convergence is not specified. It would be important to describe how equilibrium is assessed (e.g., based on the norm of the time derivatives). Additionally, it remains unclear whether the mitogenic stimulus applied in the "fed" phase is assumed to be constant over time and, if so, how this assumption relates to biological experimental conditions. Greater detail on these implementation choices is necessary to ensure interpretability and reproducibility.
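
      One way such a convergence criterion could be stated is sketched below: integrate until the Euclidean norm of the time derivatives falls below a tolerance, and report explicitly when that never happens. The two-species system, rate constants, and tolerance are hypothetical and are not the authors' protocol.

```python
import math


def derivatives(state, s=1.0, k1=0.5, k2=0.25):
    """Toy chain: constant input s produces A, A converts to B, B decays."""
    a, b = state
    return (s - k1 * a, k1 * a - k2 * b)


def run_to_equilibrium(state, dt=0.01, tol=1e-6, max_steps=1_000_000):
    """Explicit Euler until ||dx/dt||_2 < tol.

    Returns (state, steps_taken, converged)."""
    for step in range(max_steps):
        dxdt = derivatives(state)
        if math.sqrt(sum(d * d for d in dxdt)) < tol:
            return state, step, True
        state = tuple(x + dt * d for x, d in zip(state, dxdt))
    return state, max_steps, False


final_state, n_steps, converged = run_to_equilibrium((0.0, 0.0))
print(final_state, n_steps, converged)
```

      Reporting the tolerance, the norm, and the number of instances that fail to converge within the step budget would make the "fasted" and "fed" equilibration phases reproducible.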

      (4) The manuscript states that the modelling conclusions are strongly supported by existing literature; however, the validation presented does not fully substantiate this claim. As noted above, the comparison with CDK2 and CDK4/6 experimental data remains qualitative, and the use of arbitrary simulation time units complicates interpretation of temporal agreement. The extent to which the model quantitatively or mechanistically recapitulates experimentally observed dynamics therefore remains uncertain.

      The claim that the model reproduces known resistance mechanisms is also difficult to assess in light of Figure S10, where a large fraction of network nodes (~80%) appear implicated in resistance under some conditions. If most components of the network can, in at least some parameter regimes, be associated with resistance phenotypes, the resulting lack of selectivity weakens the strength of model-based validation. It becomes challenging to distinguish specific mechanistic insights from generic consequences of network connectivity.

      In addition, the Supplementary Information notes that certain components of the mitogenic and cell-cycle pathways were abstracted or excluded in order to maintain computational tractability. While such abstraction is understandable in a large ODE framework, it raises interpretative questions. Proteins identified as potential resistance drivers within the model may, in some cases, represent aggregated or simplified pathway effects. Clarifying in the main text how such abstractions may influence the attribution of resistance mechanisms would strengthen the biological interpretation of the results.

      Drug inhibition is central to the manuscript's conclusions. The revised version clarifies that inhibition is implemented as a fixed fractional modification of specific kinetic rate laws. This abstraction is appropriate for exploring network-level responses, but it represents a stylised perturbation rather than a pharmacologically calibrated model of drug action. For full interpretability and reproducibility, the mathematical form of the modified rate laws, as well as the timing of inhibition relative to network equilibration, should be specified unambiguously. The biological implications of the findings depend critically on understanding this modelling choice.

      The one-at-a-time perturbation analysis presented in Figure 5 provides an interpretable ranking of first-order control points across the ensemble and offers mechanistic insight into primary sensitivities of the network. However, many targeted therapies act on multiple components, and resistance frequently arises through combinatorial mechanisms. The reported rankings should therefore be interpreted as identifying primary influences under isolated perturbations, rather than as a comprehensive account of multi-target drug behaviour.

      Overall, the manuscript succeeds in presenting a conceptual and exploratory framework for analysing how signalling network topology can shape the qualitative landscape of adaptive responses under heterogeneous kinetic conditions. Its principal contribution lies in establishing a systematic platform for large-scale in silico exploration. At the same time, the current limitations in biological calibration, parameter grounding, and validation constrain the extent to which the conclusions can be interpreted as predictive or quantitatively representative of specific tumour contexts. Addressing these issues would further strengthen the connection between the theoretical landscape described here and experimentally observed resistance dynamics.

    1. Reviewer #2 (Public review):

      Summary:

      This paper attempts to examine how rare, extreme events impact decision-making in rats. The paper used an extensive behavioural study with rats to evaluate how the probability and magnitude of outcomes impact preference. The paper, however, provides limited evidence for the conclusions because the design did not allow for the isolation of the rare, extreme events in choice. There are many confounding factors, including the outcome variance and the presence of less-rare and less-extreme outcomes in the same conditions.

      Strengths:

      (1) The major strength of the paper is the significant volume of behavioural data with a reasonable sample size of 20 rats.

      (2) The paper attempts to examine losses with rats (a notoriously tricky problem with non-human animals) by substituting time-outs as a proxy for losses. This allows for mixed gambles that have both gain and loss possible outcomes.

      (3) The paper integrates both a behavioural and a modelling approach to get at the factors that drive decision-making.

      (4) The paper takes seriously the question of what it means for an event to be rare, pushing to less frequent outcomes than usually used with non-human animals.

      Weaknesses:

      (1) The primary issue with this work is that the main experimental manipulation fails to isolate the rare, extreme events in choice. As I understand the task, in all the conditions with a rare extreme event (e.g., 80 pellets with probability epsilon), there is also a less-rare, less-extreme event (e.g., 12 pellets with probability 5). In addition, the variance differs between the two conditions. So, any impact attributable to the rare, extreme event could be due to the less rare event or due to a difference in the variance (or other statistical moments, like skew or kurtosis). That the distributions can be shown to differ under specific assumptions about value-maximizing agents (e.g., with Jensen Gaps and Table 2) is not really relevant to what rats are sensitive to and what drives their behaviour. The design here does not support the conclusions. Finally, by deliberately confounding rarity and extremity, the design does not allow for assessing the impact of either aspect on rat behaviour.

      (2) The RL modelling work also fails to show a specific impact of the rare extreme event. As best as I can understand Eq 2, the model provides a free parameter that adds a bonus to the value of either the two options with high-variance gains (A and V in the paper) or the two options with high-variance losses (F and V in the paper), or equivalently, to the options with "Jackpots" vs the ones with "Black Swans" (see Point 1 above as to how these different aspects are all confounded in this design). This parameter seems to depend only on whether the option could possibly have yielded the rare, extreme outcome (i.e., based on the generative probability) and is not connected to its actual appearance. [This point is unclear as the text says this, but the rebuttal states otherwise; plus some options never received the REE, see Table S11]. That makes it a free parameter that just bumps up (or down) the probability of selecting a pair of options. That may be due to the presence of the REE or the other rare event or just the variance difference. Moreover, in the case of the "black swan" or high-variance loss conditions, this seems very much like a loss aversion parameter, but an additive one instead of a multiplicative one. Is there a theoretical claim here that "extreme losses" need an additive loss-aversion parameter?

      (3) The paper presents the methods and results with many neologisms and fairly obscure jargon (e.g., fragility, total REE sensitivity). That makes it very hard to decipher exactly what was done and what was found. For example, on p. 4, the use of concave and convex was very hard to decipher; the text even has to repeat itself 3 times (i.e., "to repeat" and "in other words") and is still not clear. It would be much clearer (and probably more accurate) to say that the options varied along the variance dimension, separately for gains and losses. Option A was low-variance gains and losses. Option B was low-variance losses and high-variance gains. Option C was high-variance losses and low-variance gains, and Option D was high-variance losses and gains. That tells the reader much more clearly what the animals experienced without having to master a set of new terminologies around fragility and robustness, which brings a set of theoretical assumptions unnecessarily into the description of the experimental design. Alternatively, if the authors are wary of using the term "variance" because other moments of the distribution also differ, they could use "high-value gains" or "high-value losses" or something else that does not obscure the experimental design with jargon. Again, this goes back to point 1 above, whereby the different options differ on so many dimensions (as is made even more apparent in the rebuttal) that the design cannot isolate the impact of the variables of interest.

      (4) Were the probabilities shuffled or truly random (they seem to be fixed sequences, so neither)? What were the experienced probabilities? Given the fixed sequences, these experienced ("ex-post") probabilities could differ tremendously from the scheduled ("ex ante") probabilities. It's quite possible that an animal never experienced the rare, extreme event for a specific option. From Table S11, that is guaranteed to have happened in that 4 animals only ever experienced the "black swan" outcome once. It's even possible (if they only picked a specific option on the 10th/60th choices by chance) that they only ever experienced that rare extreme event. This point still cannot be known given the information provided, which does not break down outcomes by options. Table S11 in the Supplement only gives overall numbers but does not indicate what the rats experienced for each choice/option, which is what matters here. A simple table that indicates, for each of the 4 options, how often they were selected and how often the animals experienced each of the 6-8 possible outcomes would make it much clearer how closely the experience matched the planned outcomes. In addition, by restricting the rare outcome to either the 10th or 60th activations in a session, these are not random. Did the animals learn this association? The text states that they did not, but no evidence is provided.
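
      The per-option breakdown requested in point (4) could be produced along the following lines. This is a sketch that assumes the trial log is available as (option, outcome) pairs; the option labels and outcome values in the usage example are hypothetical and are not data from the study.

```python
from collections import Counter, defaultdict


def tabulate_experience(trials):
    """trials: iterable of (option, outcome) pairs, one per completed choice.

    Returns {option: {"n_choices", "outcomes", "frequencies"}}, where
    'frequencies' are the experienced (ex-post) outcome probabilities."""
    counts = defaultdict(Counter)
    for option, outcome in trials:
        counts[option][outcome] += 1
    table = {}
    for option, outcome_counts in counts.items():
        n = sum(outcome_counts.values())
        table[option] = {
            "n_choices": n,
            "outcomes": outcome_counts,
            "frequencies": {o: c / n for o, c in outcome_counts.items()},
        }
    return table


# Hypothetical log for one animal: three small rewards and one rare large one
# on option "A", two small rewards on option "B".
log = [("A", 1), ("A", 1), ("A", 1), ("A", 80), ("B", 3), ("B", 3)]
for option, row in tabulate_experience(log).items():
    print(option, row["n_choices"], dict(row["outcomes"]), row["frequencies"])
```

      Comparing the resulting frequencies with the scheduled probabilities, per animal and per option, would show directly whether a given rat ever encountered the rare outcome on the option in question.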

      (5) The choice data are generally presented in an overprocessed fashion with a sum and a difference (in both figures and tables). The basic datum (probability/frequency of selecting each of the 4 options) is not provided directly in the main text, even if it can theoretically be inferred from the sum and the difference. The new right side of Table S4 is probably the most valuable piece in terms of explaining what the rats did and should be highlighted much more. Inspection of that table reveals some interesting (and potentially worrying) results. Most notably, the vast majority of responding happens on the "anti-fragile" and "robust" options, often totalling around 90% of all selections, especially amongst the most common blue rats. Alas, those were the two options that were deliberately assigned to the two most preferred holes in the training phase (see p. 26). Does this reflect a genuine preference for reward distributions, or does it reflect a spatial hole bias? The assignment strategy makes these impossible to tell apart.

      (6) There is insufficient detail provided on the inferential statistical tests (e.g., no degrees of freedom or effect sizes), and only limited information on exactly what tests were run and how (bootstrapping, but little detail). Without code or data (only summary information is provided in the supplement), this is difficult to evaluate. In addition, the studies seem not to have been pre-registered in any way, leaving many research degrees of freedom. Not all studies need to be pre-registered and sometimes discovery of new things requires exploratory work, but preregistration does provide additional safeguards against overemphasizing post-hoc detected patterns, a serious issue in behavioural science. Moreover, it promotes transparency in reporting results and analyses, allowing for a better assessment of the strength of evidence for a claim. For example, here, were any alternative analysis pipelines attempted? Also, there were many sub-groupings of the animals and subsequent comparisons between them, which all seemed post-hoc. On what grounds were these divisions made, and were other divisions examined as well?

      (7) On p. 12 (Fig 4), there is an attempt to look at the impact of a rare, extreme event by plotting a measure of preference for the 10 trials before/after the rare, extreme event. In the human literature, the main impact of experiencing a rare, extreme event is what is known as the wavy recency effect (see Plonsky et al. 2015 in Psych Review, for example, now cited). What this means is that there tends to be some immediate negative recency (e.g., avoiding a rare gain) followed by positive recency (e.g., chasing the rare gain). Typically, this refers to the specific option that yielded that outcome. First, as the other analyses do, the current analysis combines choice of the option that yielded the rare outcome with choice of other options, so it cannot directly assess the impact of the rare, extreme event on choice. Second, averaging over a 10-trial window would obscure any impact of this rare, extreme event. There is mention of the very next trial, but an analysis that looks at the 10-trial time course trial-by-trial could reveal any impact that might be predicted from the human literature.
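
      A trial-by-trial version of the analysis in point (7) could look like the sketch below. It assumes a per-trial flag marking whether the animal chose the specific option that delivered the rare outcome, plus the trial indices of those outcomes; the function and variable names are hypothetical.

```python
def peri_event_timecourse(chose_option, event_trials, window=10):
    """Mean choice of the rare-outcome option at each lag around the event.

    chose_option: list of 0/1 flags, 1 if that option was picked on the trial.
    event_trials: indices of trials on which the rare outcome occurred.
    Returns {lag: mean flag} for lags -window..+window (lag 0 excluded),
    skipping lags that fall outside the session."""
    sums = {lag: 0 for lag in range(-window, window + 1) if lag != 0}
    counts = dict.fromkeys(sums, 0)
    for t in event_trials:
        for lag in sums:
            i = t + lag
            if 0 <= i < len(chose_option):
                sums[lag] += chose_option[i]
                counts[lag] += 1
    return {lag: sums[lag] / counts[lag] for lag in sums if counts[lag] > 0}


# Hypothetical session: the rare outcome occurs on trial index 2.
print(peri_event_timecourse([1, 1, 1, 0, 0, 1], event_trials=[2], window=2))
```

      Plotting the returned values against lag, rather than a single before/after average, is what would reveal a dip followed by a rebound of the kind described as wavy recency.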

      (8) As I understood the method (p. 31), the assignment of options to physical locations was not random or counterbalanced, but deliberately biased to have one of the options in the preferred location. This would seem to create a bias towards a particular option and a bias away from the other options, which confounds the preference data in subsequent analyses. Table S4 reinforces this concern, as the vast majority of responses are clustered in the two most preferred options from training.

      (9) Are delays really losses? This is a big assumption. Magnitude and delay are different aspects of experience, which are not necessarily commensurable and can be manipulated independently. And, for the model, how were these delays transformed into outcomes? Eq 1 skips over that. Is there an assumption of linearity? In addition, I was not wholly clear whether the delays meant fewer trials in a session or whether the delays merely extended the session and meant longer delays until the next choice period.

      Other points:

      (1) I think the authors still misunderstand the concept of "hot-stove effects". The idea is that the experience of a very bad outcome can lead to avoiding the situation again (i.e., not sampling that option) and can provide the appearance of oversensitivity to that bad outcome. Here, that might be better thought of as "black-swan avoidance". Imagine that, to the rat, all options are equal in value; some initial bad luck in encountering the black swan might then make the animal avoid that option, even though, with enough experience, it would have turned out to be equal in value.

      (2) I am still not convinced that the Jensen inequalities add to this paper in terms of understanding the rat behaviour. That may be more suited for a different paper about the statistical and mathematical properties of certain generative distributions, but not here given what rats actually choose and experience.

      (3) Providing the data open access is very good. The code, however, should be equally available and not just upon request. Code needs to be available for assessment during peer review and for reproducibility checks. There are substantial enough problems with reproducibility in the field that code availability should be a minimum criterion for publication (see Miske et al., 2026 in Nature for the most recent large-scale evaluation of this problem).

      (4) The paper still somewhat mischaracterizes the literature on rare events, posing it as a series of "exceptions", rather than recognizing that a huge chunk of the literature uses rare events rarer than 10%. Also, there is even existing terminology in that literature for exactly the situation that is being created here: rare treasures (aka jackpots here) and rare disasters (aka Black Swans here).

      (5) Defining the observed behaviour in terms of convexity, instead of stating choices more plainly, obscures what was done and found. This is especially the case here because convex and concave mean different things when applied to gains/losses in terms of whether or not that option can lead to the REE. The use of the terms obscures rather than clarifies and is probably best left for the discussion (and maybe the intro) when mapping from theoretical distributions to the experiment at hand. In the paper, even the bottom of p.5 seems to incorrectly define "Total Sensitivity" as the combined proportion of selecting convex options in either domain, which does not map onto how convex is defined in Fig 1B or elsewhere in the text.

      (6) Fig 1C is baffling. Why are probabilities drawn moving away from the origin? The standard scientific plotting convention is for numbers to grow when moving away from the origin. That would be vastly clearer. Also, the color coding is confusing. Green-red maps onto convex-concave, but that would naturally seem to indicate gains vs losses, not convex vs concave. And why are probabilities growing larger in both directions from the origin? A standard plot of magnitude vs probability would likely be a much more sensible way to communicate the procedure.

      (7) Discussion: I think the main difference between the human situations discussed and this experiment is that humans have not experienced those rare "black swan" outcomes. Rather, they hear about the disasters that are possible and do not incorporate that information, as discussed in the description-experience literature already cited in this paper (though not in that context).

    1. Reviewer #1 (Public review):

      I read this paper with great interest based on my experience in insect sciences. Previous concerns:

      (1) The paper has an original biological question that is overly broad and mechanistically ambitious. The central biological question, namely how CLas infection enhances fecundity of Diaphorina citri via dopamine signaling, is clearly stated and well motivated by previous literature. However, my advice to the authors is that, while the general question is clear, the manuscript attempts to answer multiple mechanistic layers simultaneously. As a result, I feel that the biological narrative becomes diffuse, especially in later sections where DA, miRNA regulation, AKH signaling, and JH signaling are all proposed as parts of a single linear cascade. In summary, my key concern is that the paper often moves from correlation to causal hierarchy without fully disentangling whether these pathways act sequentially, in parallel, or redundantly. A more explicitly framed primary hypothesis (e.g., "DA-DcDop2 is necessary and sufficient for CLas-induced fecundity") may improve conceptual clarity.

      (2) On the novelty of the data, I feel they are moderately novel, with substantial confirmatory components. If I am correct, the novel contributions include the identification of DcDop2 as the DA receptor responsive to CLas infection in D. citri, the discovery that miR-31a directly targets DcDop2, which is supported by luciferase assays and RIP, and thirdly, the integration of dopamine signaling into the already-described CLas-AKH-JH-fecundity framework. My advice to the authors is to focus more on the manuscript's novelty, which lies more in pathway integration than in discovering fundamentally new biological phenomena. This is appropriate for a mechanistic paper, but should be framed as an extension of existing models rather than a paradigm shift.

      (3) On the conclusions, I recommend that the authors modify their statements a little. I feel that there are some overstated or insufficiently supported claims. First, the assertion that CLas "hijacks" the DA-DcDop2-miR-31a-AKH-JH cascade implies direct pathogen manipulation, but no CLas-derived effector or mechanism is identified. Second, the model suggests a linear signaling hierarchy, but the data largely show correlation and partial dependency rather than strict epistasis. Third, the term "mutualistic interaction" may be too strong, as host fitness costs outside fecundity (e.g., longevity, immunity) are not evaluated. In conclusion, I confirm that the data support a functional association, but mechanistic causality and evolutionary interpretation are somewhat overstated.

      Comments on revised version:

      The authors provided a satisfactory revision.

    2. Reviewer #2 (Public review):

      Summary:

      Nian and colleagues comprehensively apply metabolomics, molecular, and genetic approaches to demonstrate that CLas hijacks the DA/DcDop2-miR-31a-AKH-JH signaling cascade to enhance lipid metabolism and fecundity in D. citri, while concurrently promoting its own replication.

      Strengths:

      These findings provide solid evidence of a mutualistic interaction between CLas proliferation and ovarian development in the insect host. This insight significantly advances our understanding of the molecular interplay between plant pathogens and vector insects and offers novel targets and strategies for HLB field management.

      Weaknesses:

      While the article investigates the involvement of dopamine signaling and specific microRNAs in enhancing fecundity and pathogen proliferation, it still needs to provide a detailed mechanistic understanding of these interactions. The precise molecular pathways and feedback mechanisms by which CLas manipulates dopamine signaling in Diaphorina citri remain unclear.

    1. Reviewer #1 (Public review):

      (1) In this study, the authors aimed at characterizing Huntington's Disease (HD) - related microstructural abnormalities in the basal ganglia and thalami as revealed using Soma and Neurite Density Imaging (SANDI) indices (apparent soma density, apparent soma size, extracellular water signal fraction, extracellular diffusivity, apparent neurite density, fractional anisotropy and mean diffusivity).

      (2) The study implements a novel biophysical diffusion model that extends up-to-date methodologies and presents a significant potential for quantifying neurodegenerative processes of the grey matter of the human brain in vivo. The authors comment on the usefulness of this technique in other pathologies, but they exemplify only with multiple sclerosis. Further evidence supporting this broader applicability should be provided.

      (3) The study found that HD-related neurodegeneration in the striatum accounted significantly for striatal atrophy and correlated with motor impairments. HD was associated with reduced soma density, increased apparent soma size and extracellular signal fraction in the basal ganglia, but not in the thalami. Additionally, these effects were larger at the manifest stage.

      (4) The results of this work demonstrate the impact of HD on basal ganglia and thalami which can be further explored as a non-invasive biomarker of disease progression. Additionally, the study shows that SANDI can be used to explore grey matter microstructure in a variety of neurological conditions.

      Comments on revised version:

      I have no further comments. Thank you.

    2. Reviewer #3 (Public review):

      Summary:

      Ioakeimidis and colleagues studied microstructural abnormalities in N=56 Huntington's disease (HD) patients compared to N=57 normative controls. The authors used a powerful MRI Connectom scanner and applied the SANDI model to estimate the soma size, neurite size, soma density, and extracellular fraction in key subcortical nuclei related to HD. In the striatum, they found decreased soma density and increased soma size, which also seemed to become more pronounced in advanced HD individuals in the final exploratory analyses. The authors conducted important analyses to find whether the SANDI measures correlate with clinical scores (i.e., QMotor) and whether the variance of the striatal volume is explained by the SANDI measures. They found a relationship of SANDI measures to both.

      Strengths:

      The study is both innovative and of high interest for the HD community. The authors provide a rich pool of statistical analyses and results which anticipate the questions that may emerge in the HD research community. Statistics are carefully chosen and image processing is done with state-of-the-art methods and tools. The sample size gives sufficient credibility to the findings. Altogether, I think this study sets a milestone in the attempts of the HD community to understand neuropathological processes with non-invasive methods, and extends the current knowledge of microstructural anomalies identified in HD with diffusion MRI. More importantly, the newly identified anomalies in soma size and soma density open new avenues for studying these biological effects further, and perhaps develop these biomarkers for use in clinical trials.

      Weaknesses:

      (1) An important question is whether the SANDI measures, which require an expensive scanner and elaborate processing, are better biomarkers than the more traditional DTI measures. Can the authors compare the effect size of FA/MD with SANDI measures. In some of the plots and tables, FA/MD seem to have comparable, if not higher, correlations with QMotor or CAP scores. On the same vein, it is unclear whether DTI measures were included in hierarchical stepwise regression. I wonder if the stepwise models may have picked up FA/MD instead of SANDI measures if they are given a chance. Overall, I hope the authors can discuss their findings also in this light of cost vs. benefit of adopting SANDI in future studies, which is an important topic for clinical trials.

      (2) Similar to the above point, it is very important to consider how strong the biomarking signal is from SANDI measures compared to the good old striatal volume. Some plots seem to indicate that volumes still have the highest correlation with QMotor, and the highest effect size in group comparisons. It would be helpful for the community to know where the new SANDI measures stand compared to the most typically used volumes in terms of effect size.

      (3) The diffusion measures are inevitably correlated to some degree. Please provide a correlation matrix in supplementary material including all DWI measures to enable readers to understand better how similar SANDI measures are between each other or vs. other DTI measures. Perhaps adding volumes to this correlation matrix may also be a good future reference.

      (4) ISS stages:

      (a) The online ISS calculator requires cut-offs derived from the longitudinal Freesurfer pipeline, while the authors do not have longitudinal data. Thus, the ISS classification might be inaccurate to some degree if the authors used the FS cross-sectional pipeline. Please review this issue and see if updated cut-offs should be used to classify participants.

      (b) Were there really no participants with ISS 0 among the 56 HD individuals? Please clarify in the manuscript.

      (c) A note on terminology that might be confusing to some readers. According to the creators of ISS, the ISS stages are created for research only; they are not used or applied in the clinic. On the other hand, the terms "premanifest" and "manifest" have a clinical meaning, typically based on the diagnostic confidence level. The assignment of ISS0-1 to premanifest and ISS2-3 to manifest may create some non-trivial confusion, if not opposition, in some segments of the HD community. The authors can keep their current terminology but will need to at least clarify to the reader that this assignment is speculative, does not fully match the clinically-based categories, and should not be confused with similarly named groups in the previous literature.

      Comments on revised version.

      The authors have moved to address many points from reviewers. The manuscript has indeed become more objective, transparent, and to the point. The amount of information and analyses is large, which perhaps is inevitable when new methods are being tested for the first time in a neurodegenerative disease.

    1. Reviewer #1 (Public review):

      Integrating large-field stimulation with a retinotopic atlas, this study introduces an fMRI-based method for measuring contrast sensitivity across the visual field. Retinotopy was assessed using pRF mapping and a calibrated Benson atlas. The authors validate their method by replicating known patterns of contrast sensitivity across eccentricities and visual field quadrants in healthy subjects, and demonstrate its potential clinical utility through case studies of both simulated and real visual field loss.

      Comments on revisions:

      I appreciate the addition of the quadrant-scotoma condition and the authors' clarification that the goal is to demonstrate individual-level detection sensitivity. The 95% CI argument is reasonable, and I am satisfied with framing the simulated-scotoma work as proof-of-concept.

    2. Reviewer #2 (Public review):

      Summary

      This study uses functional MRI to evaluate visual contrast sensitivity across the visual field at the level of the visual cortex, testing the method as a proof of principle in a small group of normally sighted individuals, modelling both normal vision and simulated vision loss, as well as a patient with independently verified vision loss. The results suggest a promising technique that measures vision objectively across the visual field and overcomes the requirement for careful fixation, which is often challenging in those with low vision or sight loss.

      Strengths

      • Objective measure of central vision: The proposed method may provide a more comprehensive and objective assessment of residual visual function in individuals with sight loss. This may be particularly useful for those with central visual field loss without the requirement of stable fixation or subjective motor responses.

      • More sensitive measure: The use of slope to calculate contrast sensitivity across a range of contrasts within the brain is clever and likely more sensitive than single threshold measurements or standard clinical measures of visual acuity using letter charts. Standard supra-threshold (high contrast) tests are not ideal for capturing residual vision or partial vision loss.

      • Good agreement with standard atlas: The Benson atlas provides a good estimate of visual field maps within V1 based on anatomical landmarks, and the authors take steps to refine this informed by cortical magnification and V1 surface area (brain size) for each individual participant. This could allow the technique to be generalised without the need to collect lengthy individual mapping data from every participant.

      • Within-subject reproducibility: The measurements appear to be sensitive and reproducible, particularly in those with normal vision, and are consistent with known features of visual sensitivity differences in different parts of the visual field.

      • Potential tool to measure visual field sensitivity in controls: Even if the proposed methods are not ideal for widespread clinical translation, they do offer an exciting tool to test hypotheses about visual field differences in healthy controls. For example, there seems to be an increase in sensitivity on either side of the simulated ring scotoma (Fig 6 - perhaps due to the release of lateral inhibition?). Reliability measures suggest that individual differences are consistent in healthy controls (although not tested statistically, perhaps due to the small sample size?). Whether they reflect behaviourally meaningful differences in visual field sensitivity could be tested in individuals by comparing them to behavioural measures across the visual field.

      • Potential tool to test novel treatments: The proposed techniques could be used to test within-subject changes in visual function in environments that are equipped to measure and analyse fMRI data, including clinical trials aimed at determining the success of novel treatments. Preliminary testing in healthy controls with eye movements also suggests that the method is suitable for testing low vision patients with unstable fixation (e.g., nystagmus), and the authors have modelled the effects of varying amounts and types of eye movements on functional outcome measures.

      Weaknesses

      • Questionable sensitivity to differences in patients. The variability in heat maps across healthy control participants is somewhat surprising, and it is uncertain whether they represent actual visual sensitivity differences or an artifact of the measurement technique, e.g., due to signal-to-noise differences introduced by local variations in brain anatomy. Thus, it is uncertain whether the substantial variance across controls will allow for a sufficiently stable baseline to detect meaningful differences in individual patients. Also, as the authors rightly point out, the Benson atlas does not model differences along meridians, so that upper/lower field differences might not be detectable. However, the authors acknowledge that this is a pilot study, and further testing of a wider range of scotoma types, both in patients and simulated in controls, will only improve the methods. Furthermore, the ability to capture visual field representations in human visual cortex is also likely to improve with computational advances, making the use of atlases more feasible and obviating the need for individualised population receptive field mapping.

      • Potential for clinical translation. Although it is a sensitive measure, functional MRI is costly, is not available in all clinical settings, requires significant post-processing analyses, and may be contraindicated in some individuals due to safety (e.g., metallic implants) or other concerns (e.g., claustrophobia). These could present significant barriers to widespread clinical translation, if this were the ultimate goal of the study.

      • Limited range of spatial frequencies. The spatial frequencies tested were still quite low (0.3 and 3 cpd) compared to measures such as visual acuity. Extending the measurements to higher spatial frequencies could allow better characterization of central vision, although not necessarily of peripheral vision. However, this may depend on the typical visual abilities of the patient population of interest.

      Appraisal and Impact:

      The authors used appropriate and robust methods to assess and model known features of visual sensitivity differences across the visual field in sighted controls. In addition, the assessment technique successfully captured sensitivity changes due to simulated and actual partial field loss but was also fairly resilient to eye movements and fixation instability, typical of patients with sight loss. Although currently providing a proof of principle, the method is likely to improve with further testing and increasing normative sample sizes, and as computational methods continue to advance visual field map predictions. Although it may not be adopted widely as a standard clinical assessment technique due to the expense and other obstacles, it would provide a valuable tool in assessing clinical populations, for example in the context of clinical trials to assess suitability for treatment interventions or monitor treatment outcomes.

    3. Reviewer #3 (Public review):

      Summary:

      Chow-Wing-Bom et al. introduce an innovative wide-field visual stimulation setup for 3T experiments that enables stimulation up to a diameter of 40° visual angle while allowing continuous gaze tracking. Using this setup, the authors systematically investigate contrast sensitivity across the visual field by presenting subjects with sinusoidal gratings varying in contrast and spatial frequency. Their findings confirm the expected organization of contrast sensitivity, demonstrating a preference for high spatial frequencies in the central field and lower frequencies in the periphery. They also extend these measurements to eccentricities up to 20°, which exceeds previous fMRI-based reports. Moreover, the study explores the potential of using contrast sensitivity calculations as a method for detecting visual field defects, demonstrated in a healthy subject with simulated ring-shaped and upper-right-quadrant scotomas, and in a patient with LHON. The revised version additionally characterises the robustness of the approach to varying degrees of fixation instability.

      Strengths:

      - The manuscript is well written and provides comprehensive methodological details, ensuring high transparency and reproducibility.

      - The visual stimulation setup represents a significant technical advance by enabling wide-field stimulation with continuous eye tracking, which is crucial for both research and potential clinical applications.

      - The study confirms established findings regarding the organization of contrast sensitivity while extending them to a larger eccentricity range.

      - The efforts to establish a measure for visual field losses align with current efforts to develop objective alternatives to conventional perimetry.

      - The revised manuscript includes an empirical assessment of how varying levels of eye movement affect cortical contrast sensitivity estimates, providing useful guidance on the tolerance of the approach to fixation instability.

      Weaknesses:

      - The original version left certain methodological aspects unclear, particularly the correction of eccentricity values from the Benson atlas and the V1 masks used in each analysis branch. The authors have added a dedicated figure illustrating the eccentricity correction procedure and now explicitly state that a manually delineated V1 mask was used for the pRF-based analyses while the Benson V1 label was used for the atlas-based analyses, together with a discussion of how this difference may influence the comparison.

      - Minor inconsistencies in reporting, such as the introduction of a second session in the Results section, have been corrected.

      The conclusion that high-contrast patterns, as used in pRF mapping, are not optimal for testing subtle but potentially clinically relevant changes in visual field coverage is very valid. Contrast sensitivity can therefore be a potentially well-suited parameter for estimating visual field losses. The presented work is an interesting starting point, and the proposed method of using contrast sensitivity as a measure for partial vision loss should be further explored.

      Comments on revisions:

      The authors have thoroughly addressed all points raised in my original review, and I have no further concerns.

    1. Reviewer #1 (Public review):

      Summary:

      In this study, the authors aim to characterize how moment-to-moment fluctuations in arousal during wakefulness shape large-scale functional brain connectivity. Using pupil diameter as an index of arousal and high-field functional imaging, they seek to determine whether arousal-related modulation of connectivity is uniform across the brain or organized into structured patterns, and whether such patterns show hemispheric asymmetry. The work further aims to assess whether these organizational features generalize across resting-state and naturalistic viewing conditions.

      Strengths:

      The study addresses an important and timely question regarding how spontaneous variations in arousal influence whole-brain communication during wakefulness. The dataset is rich, combining high-field imaging with concurrent physiological measurements, and the analyses are ambitious in scope. A key strength is the attempt to move beyond region-based effects and to describe arousal-related modulation at the level of large-scale connectivity organization. The comparison across rest and movie viewing provides useful context and suggests a degree of consistency across behavioral states.

      Weaknesses:

      All analyses are based on 7T ultra-high-field imaging. The manuscript does not address whether the reported arousal-related patterns, including the community structure and hemispheric asymmetries, are expected to be reproducible at standard 3T field strengths. It therefore remains unclear whether the findings depend critically on the use of high-field data or whether they would generalize to more widely available datasets, limiting the broader applicability of the results.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript addresses a clear and widely relevant question: how ongoing fluctuations in alertness during wakefulness relate to large scale patterns of coordinated brain activity. The authors combine high field magnetic resonance imaging with simultaneous pupil measurements, and they compute an edgewise measure of arousal-related coupling for every pair of regions. Their main contribution is to show that arousal-related coupling is low dimensional and organized into seven reproducible "connectivity communities", each with characteristic network pair compositions. A secondary contribution is the observation that these communities exhibit systematic but community-specific hemispheric asymmetries, including a striking left/right dissociation within the ventral attention network, where the left side participates broadly across communities while the right side forms a more cohesive, segregated arousal responsive module. A final contribution is cross-context generalization: the same organizational structure and lateralization signatures are largely preserved during naturalistic movie watching.

      Strengths:

      (1) The paper moves beyond state contrasts and quantifies arousal related modulation continuously within wakefulness, directly addressing a gap highlighted in the Introduction.

      (2) The hemispheric asymmetry result is not framed as a crude global dominance effect; the authors explicitly test and argue that the key signal lies in structured spatial heterogeneity rather than mean shifts.

      (3) The cross-paradigm replication in movie watching is a strong design choice and supports the claim that the organizational motifs are not limited to unconstrained rest.

      (4) Arousal effects on BOLD signals and on pupil size can have different delays. The authors have now tested lagged relationships (for example shifting the pupil series forward and backward) to show that the main community structure and lateralization results are not sensitive to an arbitrary temporal alignment.

      (5) Time resolved connectivity results are now shown to be robust to changes in parameters.

    3. Reviewer #3 (Public review):

      Summary:

      The paper investigates neural fluctuations underlying arousal using a combination of resting state/naturalistic movie watching fMRI and eye tracking data. The authors have used several data driven approaches, including time varying sliding window analyses and clustering methods, to characterize large scale brain organization and hemispheric asymmetries associated with arousal fluctuations. This is an interesting study framing arousal as a dynamic, continuously varying process rather than a discrete state. Overall, the manuscript is well written and the authors have provided sufficient details about the methodological choices, their impact on the results, along with the limitations of the study.

      Strengths:

      This is an interesting study framing arousal as a dynamic, continuously varying process rather than a discrete state. Overall, the manuscript is well written and provides sufficient methodological and analytical details to evaluate the results.

      Weakness:

      While the study provides new insights regarding neural processes underlying arousal, future studies may be needed to further examine the implications of the identified clusters and patterns.

    1. Reviewer #1 (Public review):

      This is an excellent paper from Dr. Yokoyama and colleagues. The experiments are technically demanding, given the very low cell numbers and the challenges of working with implantation sites at gestational days 6.5, 10.5, and 14.5. Overall, the impact of TGF-β receptor II deficiency in the NK lineage on uterine trNK cell numbers and litter size is convincing, and the authors' conclusions are well supported by the data. Less convincing, however, is the claim that the decrease in trNK cells is compensated by an increase in cNK cells; rather, the absence of TGF-β receptor II appears to result in an overall reduction of NK/ILC1 cells.

      Comments on revised version:

      I thank the authors for addressing all my comments from my initial review.

    2. Reviewer #2 (Public review):

      In their manuscript "TGF-β drives the conversion of conventional NK cells into uterine tissue-resident NK cells to support murine pregnancy", Yokoyama and colleagues investigate the role of Tgfbr2 expression by NK cells in the formation of tissue-resident uterine NK cells and subsequent importance in murine pregnancy. By transferring congenic splenic conventional NK cells into pregnant mice, they show conversion of circulating NK cells into uterine ivCD45 negative tissue-resident NK cells. When interfering with the formation of uterine trNK cells, spiral artery remodelling was impaired, fetal resorption rates were increased, and litter sizes were reduced.

      Generally, this is a research topic of high interest, yet the manuscript lacks detailed mechanistic insights and some questions remain open. In its current state, the data represent an interesting characterisation of the Tgfbr2-fl/fl Ncr1-Cre mice in pregnancy, but considering 1) the recent publication by the group (Ref#17) on the role of Eomes+ cNK cells during pregnancy, 2) the previously described role of Tgfbr2 and autocrine TGFb expression for uterine NK cell differentiation in virgin mice (also cited by the authors), and 3) the well-known relevance of uterine NK cells during pregnancy, additional experiments addressing the specific role of Tgfb during pregnancy would help to improve the novelty and significance of the manuscript.

      Comments on revised version:

      In their revised version of the manuscript and their point-by-point response, the authors have very carefully addressed and discussed all of our concerns and suggestions.

    1. Reviewer #1 (Public review):

      Intron retention is observed in many long noncoding RNAs. The authors here used a powerful genome-wide screening strategy to identify proteins controlling intron retention in the long noncoding RNA PURPL. Surprisingly, one of the top hits across multiple cell lines was U2AF2, which is well known to bind the polypyrimidine tract close to the 3' splice site to promote splicing. Nonetheless, U2AF2 is working in the opposite direction here. Convincing follow-up RT-PCR experiments confirmed that knocking down U2AF2 does indeed lead to reduced intron retention of PURPL. The authors then show that this intron retention event is functionally important for both the nuclear retention of PURPL as well as its ability to enhance cell proliferation.

      The authors then used transcriptome-wide analyses to look for additional intron retention events affected by U2AF2. Among the ~250 genes with decreased intron retention (more splicing) upon U2AF2 knockdown was MALAT1, a well-established long noncoding RNA that normally localizes to nuclear speckles. Depletion of U2AF2 or removal of the MALAT1 2nd intron resulted in reduced speckle localization and cell migration, revealing a critical and fascinating role for this intron retention event. Overall, the authors have used a set of complementary approaches to clearly demonstrate a very intriguing role for U2AF2 in controlling intron retention and functionality of a set of long noncoding RNAs.

      I feel the current work has revealed an important role of intron retention in controlling the localization and functionality of long noncoding RNAs, which is likely broad in scope and is likely regulated by cell state.

      One experimental suggestion: The authors show that expressing intron-2 containing PURPL in PURPL-depleted cells is sufficient to induce faster proliferation, but a valuable comparison would be identifying the phenotype expressing spliced PURPL transcript.

    2. Reviewer #2 (Public review):

      Summary:

      This study identified U2AF1/2 as a regulator of pre-mRNA splicing that either promotes or suppresses the splicing of introns on different genes. The authors then focused on two genes, PURPL and MALAT1, in which U2AF1/2 promote the retention of specific introns, and characterized the biological implications of these U2AF1/2-regulated introns.

      Strengths:

      (1) The experiments in this manuscript are relatively rigorously designed and performed, often with validation checks such as verifying the knockout, verifying the treatment itself doesn't have an effect, etc.

      (2) The experiments provided comprehensive support for the claims that these specific introns are important for the stability or nuclear localization of the RNA, as well as that U2AF1/2 suppresses the splicing of these introns.

      (3) The writing of the manuscript is very clear and doesn't overstate the conclusions that can be drawn from the experiments.

      Weaknesses:

      I think one main weakness of this study is the lack of a deeper analysis of the mechanisms. Whether studying the mechanism is within the scope of this paper is probably debatable, but with the current experiment setup and data, I believe there are some analyses that can be relatively easily done to enhance the value or significance of this study. My detailed questions and suggestions are listed below:

      (1) Line 194-195 and Figure 2A: How many RBPs are included in "other RBPs" in line 194? Does "other RBPs" only include PTBP1, PRPF8 and SRSF1 in Figure 2A, or do they include all the ~100 RBPs with HepG2 eCLIP data available on ENCODE? If U2AF1/2 have the highest occupancy around the intron 2 region among the ~100 RBPs, it would be nice to visualize it.

      (2) Figure 2A and 2B: Why didn't U2AF2 show interaction with exon 2 and 3 in RNA-IP but showed enrichment over exon 2 and exon 3 regions in the eCLIP data?

      (3) Figure 3C - 3F: Maybe I misinterpreted the experiments, but to my understanding, these experiments showed that the exogenous PURPL with intron 2 promoted cell proliferation compared to when the exogenous PURPL wasn't induced, but didn't compare this to the effect of the same amount of PURPL with intron 2 removed. Wouldn't it be clearer to compare the effects of exogenous PURPL with intron 2 and exogenous PURPL without intron 2 to pinpoint whether the effect is related to intron 2? Without an intron 2 specific experiment, these current experiments don't seem to provide much added value beyond "PURPL promotes cell proliferation".

      (4) It's not very clear what proportion of these introns are retained in the endogenous PURPL and MALAT1 in various tissues, cell types and conditions. I think it will be valuable to provide this background (either from previous research, public database or data from this study).

      (5) Since U2AF1/2 have a wide range of targets as demonstrated by Figure 4A, I think it would be valuable to have some experiments that directly disrupt the interaction between U2AF1/2 and PURPL and MALAT1 and test the effect on splicing outcomes, such as by mutating the sequence that U2AF1/2 bind to. The section on the weak py-tract of PURPL touched upon this topic but focused more on how the weak py-tract causes the intron 2 retention in the background rather than how U2AF1/2 binding and action were affected by sequence mutations. I think experiments on disrupting the direct binding between U2AF1/2 on targets can provide valuable mechanistic insights.

      (6) Across all the target genes of U2AF1/2, it might be feasible to do some systematic analysis to find what correlates with whether U2AF1/2 have a promoting or suppressing effect on intron splicing. For example, do genes with decreased IR after U2AF2 depletion systematically have a weak py-tract compared to genes with increased IR? This dataset can potentially provide many hypotheses for understanding the dual role of U2AF1/2.

    3. Reviewer #3 (Public review):

      Summary:

      This manuscript characterized the splicing regulation of two long non-coding RNAs relevant to cancer, starting with a focus on PURPL and ending with insights into MALAT1. A CRISPR screen for the regulators of PURPL intron retention revealed a role for the U2AF heterodimer in inducing this retention, with U2AF2 as the actual hit. This is surprising, because the canonical function of U2AF is to recognize the polypyrimidine tract (PPT) and 3' splice site junction to induce splicing at the site. The brief mechanistic characterization of this phenomenon showed that this intron retention accounts for the nuclear localization and instability of the PURPL transcript, and seems to confer the enhanced cell proliferation feature. U2AF2 also induces retention of two introns in MALAT1, and one of them is essential for its nuclear speckle localization and enhanced cell migration.

      Strengths:

      These findings about PURPL and MALAT1 are clear and interesting.

      Weaknesses:

      The results are not sufficiently connected to each other, because one regulation is nuclear-speckle dependent but not the other.

      Here are my specific comments:

      Major comments:

      The main issue is the lack of focus because of the distinct and incomplete analysis pertaining to the two long noncoding RNAs, PURPL and MALAT1. The paper starts with a very good genetic screen on the former, and immunofluorescence and functional analysis on the latter, with U2AF2 as the main link to induce intron retention. The first one does not show clear localization while the second docks to nuclear speckles, apparently because of the retained intron. Hence the two mechanisms are related yet distinct. Here are some suggestions to enhance the characterization and connection between the two cases:

      (1) As the MALAT1 intron 2 retention contributes to its speckle localization but not the retained PURPL intron, the retained introns or their 3' splice site sequences should be swapped to see if they determine the localization.

      (2) Figure 3: the rescue of the PURPL knockout by the intron-retained RNA to induce proliferation is a powerful experiment, but it lacks the rescue with the RNA without the intron as a control. This must be done and shown.

      (3) The weakness of the PPT of PURPL intron 2 appears as a clear feature of its retention dependent on U2AF2, which appears direct, as backed by CLIP data. It would be good to show direct binding by EMSA or equivalent techniques. Furthermore, the data is also consistent with other determinants. The exon and upstream intronic sequences, including the branch point, could also be involved, so mutations in these are also required.

      (4) In brief, what are the commonalities and differences between PURPL and MALAT1 with regard to their U2AF2-dependent intron retention?

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      The authors generated mouse and zebrafish models for DeSanto-Shinawi Syndrome, caused by loss-of-function variants in the WAC gene. Using these vertebrate systems, they demonstrate conserved craniofacial and social-behavioral phenotypes that parallel human clinical features, along with deficits in GABAergic markers. They observe increased seizure susceptibility and male-biased brain volumetric changes in Wac mutant mice. Together, these findings begin to define the biological consequences of Wac haploinsufficiency and provide valuable resources for future mechanistic studies.

      Strengths:

      WAC is a high-confidence neurodevelopmental disorder gene and one of the genes identified by large-scale exome sequencing efforts, including the Satterstrom et al. (2020) autism spectrum disorder cohort. This study establishes the first vertebrate Wac models, addressing a major gap in the understanding of DeSanto-Shinawi Syndrome, and provides a framework for studying other syndromic forms of autism. The models generated will be impactful and useful to the community to study and understand DeSanto-Shinawi Syndrome.

      The cross-species analysis is important and well executed, and reveals both conserved and divergent phenotypes. The behavioral and anatomical assays are rigorously executed and well-controlled, and the inclusion of RNA-sequencing analyses adds valuable insights into the mechanisms underlying brain function in Wac mutants. Notably, the RNA-seq data reveal upregulation of several clustered protocadherins, genes central to neuronal identity and cell-cell interactions, which are known to be regulated by dynamic developmental regulation of chromatin architecture. This observation provides an intriguing hint that could link Wac function to higher-order chromatin organization and neuronal connectivity.

      Weaknesses:

      The evidence is solid, though the study remains incomplete in its mechanistic depth and molecular interpretation. The authors compellingly describe behavioral, anatomical, and transcriptomic phenotypes associated with WAC loss, yet do not explore how WAC mechanistically regulates chromatin or transcription. Given prior evidence that WAC interacts with the RNF20/40 ubiquitin ligase complex and promotes histone H2B ubiquitination and transcriptional elongation, the paper would benefit from a discussion of these functions as a potential link between Wac haploinsufficiency and the observed changes in neuronal gene expression. Similarly, the authors mention WAC's WW and coiled-coil domains but do not consider how these domains could mediate nuclear interactions or recruitment of transcriptional cofactors that shape gene regulation and chromatin organization in neurons.

      The transcriptomic analysis is rich but largely descriptive. Although the upregulation of clustered protocadherins is particularly intriguing, these findings are not validated or localized to specific neuronal populations. The study would be strengthened by independently validating the most significant RNA-seq changes, such as protocadherin gamma genes, using in situ hybridization methods to confirm the spatial and cellular specificity of expression changes.

    2. Reviewer #2 (Public review):

      The authors describe the first deep neurological characterization of WAC mutation in two vertebrate species (zebrafish and mouse). They examine these at various levels, guided by the work in humans that has associated a heterozygous WAC mutation with DeSanto-Shinawi Syndrome (DESSH). Therefore, they investigate the animals for a variety of phenotypes, following a template for what is seen when characterizing a new mouse/fish model of a developmental disability gene. Investigations include analysis of skull and jaw for abnormalities (both species), MRI of brain structure (in mice), electrophysiology (mice), assessment of signaling pathways (by Western blot, in mice), cell counts (both, more in mice), transcriptomics (mice), and behavior (both).

      Generally, this describes an important first characterization of the consequences of the mutation. Most of the studies appear well-conducted and reasonably powered, thus solid or convincing.

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      The behaviour of cells expressing constitutively active HRas is examined in mosaic monolayers, both in MCF10a breast epithelial and Beas2b bronchial epithelial cell lines, mimicking the potential initial phase of development of carcinoma. Single HRas-positive cells are excluded from MCF10a but not Beas2b monolayers. Most interestingly, however, when in groups, these cells are not excluded, but rather sharply segregated within an MCF10a monolayer. In contrast, they freely mix with wt Beas2b cells. Biophysical analysis identifies high tension at heterotypic interfaces between HRas and wild-type cells as the likely reason for segregation of MCF10a cells. The hypothesis is supported experimentally, as myosin inhibition abolishes segregation. The probable reason for lack of segregation in the bronchial epithelium is to be found in the different intrinsic properties of these cells, which form a looser tissue with lower basal actomyosin activity. The behaviour of single cells and groups is recapitulated in a vertex model based on the principle of differential interfacial tension, under the condition of high heterotypic interfacial tension.

      Strengths:

      Despite being long recognized as a crucial event during cancer development, segregation of oncogenic cells has been a largely understudied question. This nice work addresses the mechanics of this phenomenon through a straightforward experimental design, applying the biophysical analytical approaches established in the field of morphogenesis. Comparison between two cell types provides some preliminary clues on the diversity of effects in various cancers.

    2. Reviewer #2 (Public review):

      Summary:

      The authors investigate the behavior of oncogenic cells in mammary and bronchial epithelia. They observe that individual oncogenic cells are preferentially excluded from the mammary epithelium, but they remain integrated in the bronchial epithelium. They also observe that clusters of oncogenic cells form a compact cluster in mammary epithelium, but they disperse in the bronchial epithelium. The authors demonstrate experimentally and in the vertex model simulations that the difference in observed behavior is due to the differential tension between the mutant and wild-type cells due to a differential expression of actin and myosin.

      Strengths:

      (1) Very detailed analysis of experiments to systematically characterize and quantify differences between mammary and bronchial epithelia.

      (2) Detailed comparison between the experiments and vertex model simulations to identify the differential cell line tension between the oncogenic and wild-type cells as one of the key parameters that are responsible for the different behavior of oncogenic cells in mammary and bronchial epithelia.

    1. Reviewer #1 (Public review):

      Summary:

      The authors address the lack of validated tools for the detection and quantification of proteins associated with amyotrophic lateral sclerosis (ALS) through an extensive screening of 303 commercially available antibodies to 33 protein targets. Their ALS-Reproducible Antibody Platform (ALS-RAP) delivers a validated antibody toolbox for ALS research, which will provide an advantageous starting point for researchers in this field. Ayoubi R. et al. showcase the characterization workflow, presenting as an example the characterization of antibodies targeting Galectin-1, encoded by the LGALS1 gene. A selection of these antibodies was also used to profile protein levels across human induced pluripotent stem cell (iPSC)-derived and primary neurological cell types, and the findings support that the ALS disease mechanism involves both neuronal and glial cells.

      Strengths:

      The knockout (KO)-based approach is definitely the major strength of this study, providing a high level of confidence in the data collected in human induced pluripotent stem cell (iPSC)-derived and primary neurological cell types. The focus on renewable reagents (monoclonal and recombinant antibodies) is also important. The extensive characterization of this set of antibodies will benefit any scientist interested in any of the 33 target proteins, even in fields other than neuroscience.

      The authors perform an interesting protein profiling study assessing 27 proteins, comparing RNA and protein expression data, and using two independent WB preparations of the same cell types.

      The conclusions that can be drawn from this first assessment might not be final, but the data are compelling because they have been collected with reliable and validated antibodies.

      Another strength of this work is the data dissemination strategy, which includes the Only Good Antibodies (OGA) platform, where YCharOS data are curated and presented in an easy and intuitive manner that facilitates antibody selection by the end user for WB, IP and IF applications.

      Weaknesses:

      The authors mention the development of single-chain variable fragment (scFv) recombinant antibodies raised by the SGC against the six proteins (ANXA11, OPTN, MATR3, PFN1, UBQLN2 and VCP) for which few renewable antibodies are commercially available. The development was optimized to generate antibodies particularly suitable for IP, and the clone selection process was carried out using IP coupled to mass spectrometry. Even though the generation of these novel reagents is not the focus of this work, the authors do not provide any data on this aspect.

      The protein profiling study is limited to WB data, and the authors did not provide any explanation on why there was no integration with IP and IF data, not even for those targets that have validated antibodies. Also, not all the cell types have been screened by chemiluminescence-based detection and by fluorescence-based WB, and the authors do not elaborate on the reason for such a choice.

    2. Reviewer #2 (Public review):

      Overall, this is a solid manuscript that delivers an important community resource. The execution is relatively simple, but the value is real, the work is rigorously performed, and the open dissemination through Zenodo, the F1000Research YCharOS Gateway and OGA is well executed. The effort invested in generating the knockout lines for validation experiments is a clear strength of the study. I have a number of comments that I think would strengthen the resource and the conclusions drawn from it.

      Below, I list specific points.

      (1) The rationale for the selection of these 33 genes is insufficient. The authors lean on the Nijs & Van Damme classification and on PubMed entry counts, but the number of PubMed entries is not a meaningful criterion for what constitutes an important ALS protein - some of the most disease-relevant genes are precisely those with fewer publications, while heavily cited genes such as CAV1 carry weak ALS-specific evidence. The authors should provide a more transparent and biologically motivated rationale for inclusion and exclusion (ClinGen evidence tier, replicated GWAS signals, large meta-analyses, ALSoD) and explain why specific risk genes outside this list were not part of ALS-RAP.

      (2) "107 of 231 (46%) demonstrated specific target staining in IF." The criteria used to define "specific target staining" at the IF level are not stated. From the Galectin-1 example, the mosaic WT/KO strategy provides a binary readout, but for proteins with low expression, weak punctate staining or unusual subcellular distributions, a single threshold is unlikely to capture specificity uniformly across 231 antibodies.

      (3) Several claims in the manuscript depend on differential protein abundance across cell types. As presented, these claims are supported by qualitative Western blot images only. They should be substantiated by quantification across multiple biological replicates.

      (4) This manuscript represents a unique opportunity to address antibody recognition of splicing variants, which is something of considerable value to the community. For each target, the predicted isoforms in Ensembl could be cross-referenced against the observed bands, and the pattern of bands compared across cell types could be informative about which isoforms each antibody captures. This would convert ambiguous "extra bands" into useful biological information and would substantially increase the value of the resource. I strongly encourage the authors to include this analysis.

      (5) The iPSC-derived microglia receive a comprehensive QC panel (IBA1/PU.1 IF, CD45/CD11b flow, qRT-PCR for nine canonical markers; Figure S4), which allows the reader to assess culture purity. The other iPSC-derived lineages - motor neurons, dopaminergic neurons, oligodendrocytes and astrocytes - are validated by a single marker each in WB (Figure S3) without purity quantification. Given that several conclusions of the manuscript rest on the cell-type-specific detection of ALS-associated proteins, equivalent quality control should be performed for the other lineages so that the reader can evaluate the purity of each preparation.

      (6) The robustness of the resource would be substantially increased by validating at least a subset of the targets in a second iPSC background, in at least some of the cell types analysed.

      (7) The newly developed SGC scFv antibodies are arguably the most novel reagent contribution of this manuscript, yet they receive a single sentence in the body of the paper. A more thorough description is warranted.

      (8) Accessibility of the resource through Zenodo is not straightforward - the reader currently has to navigate to individual antibody characterization reports one by one to extract recommendations for a given target. While the use of an established public repository is important for permanence, a dedicated ALS-RAP website with an interactive, searchable interface - filterable by target, application, host species and clonality - would meaningfully improve uptake. The relationship between such a portal and the existing OGA platform should also be clarified.

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors reveal that the availability of extracellular asparagine (Asn) represents a metabolic vulnerability for the activation and differentiation of naive CD4+ T cells. To deplete extracellular Asn, they employed two orthogonal approaches: activating naive CD4+ T cells in either PEGylated asparaginase (PEG-AsnASE)-treated medium or custom-formulated RPMI medium specifically lacking Asn. Importantly, they demonstrate that Asn depletion not only impaired metabolic reprogramming associated with CD4+ T cell activation but also reduced CD4+ helper T cell lineage-specific cytokine production, thereby ameliorating the severity of experimental autoimmune encephalomyelitis.

      The experiments presented here are comprehensive and well-designed, providing compelling evidence for the conclusions. The conclusions will be important to the field.

      Comments on revised version:

      The authors have sufficiently addressed my previous comments. The manuscript represents an excellent contribution to the field.

    2. Reviewer #2 (Public review):

      While the importance of asparagine in the differentiation and activation of CD8 T cells has been previously reported, its role in CD4 T cells remained unclear. Using culture media containing specific amino acids, the authors demonstrated that extracellular asparagine promotes CD4 T cell proliferation. Consistent with this, depletion of extracellular asparagine using PEG-AsnASE suppressed CD4 T cell activation. Proteomic analysis focusing on asparagine content revealed that, during the early phase of T cell activation, most asparagine incorporated into proteins is derived from extracellular sources. The authors further confirmed the importance of extracellular asparagine in vivo, demonstrating improved EAE pathology.

      While the data are well organized and convincing, the mechanism by which asparagine deficiency leads to altered T cell differentiation remains unclear. It is also necessary to investigate the transporters involved in asparagine uptake. In particular, elucidating whether different T cell subsets utilize the same or distinct transport mechanisms would provide important insight into the immunoregulatory role of asparagine.

      Comments on revised version:

      The authors have addressed the previous concerns, and the manuscript has been significantly improved.

    1. Reviewer #1 (Public review):

      Summary:

      In this study, the authors set out to define how arginine availability regulates lipid metabolism and to explore the implications of this relationship in pancreatic ductal adenocarcinoma (PDAC), a tumor type known to exist in an arginine-poor microenvironment. Using a combination of rigorous genetic and metabolomic approaches, they uncover a previously underappreciated role for arginine in maintaining lipid homeostasis. Importantly, they demonstrate that arginine deprivation sensitizes PDAC cells to ferroptosis through lipidome perturbations, which can be exploited therapeutically via co-treatment with aESA and ferroptosis inducers (FINs). These findings have meaningful implications for the field. They not only shed light on the metabolic vulnerabilities created by nutrient restriction in PDAC, but also suggest a practical avenue for combination therapies that exploit ferroptosis sensitivity. This is particularly relevant in the context of pancreatic cancer, which is notoriously resistant to conventional treatments. The methods employed are broadly applicable to other nutrient-stress contexts and may inspire similar investigations in other solid tumor types.

      Strengths:

      One of the major strengths of the study is the use of complementary and well-controlled approaches (including metabolomic profiling, genetic perturbations, and in vivo models) to support the central hypothesis. The experiments are thoughtfully designed and clearly presented, and the conclusions are, for the most part, well supported by the data. The findings provide mechanistic insight into nutrient-lipid crosstalk and identify a potential therapeutic strategy for targeting arginine-deprived tumors.

      Comments on revised version:

      The authors have substantially strengthened the revised manuscript and have addressed my prior concerns, and the evidence supports the central conclusions. This work provides meaningful insight into how nutrient limitation in the tumor microenvironment creates metabolic liabilities that may be therapeutically exploited, and it should be of interest to investigators studying cancer metabolism, pancreatic cancer, lipid biology, and ferroptosis.

    2. Reviewer #2 (Public review):

      This study by Jonker et al. examines how the metabolic adaptations to the microenvironment by pancreatic ductal adenocarcinomas (PDAC) present vulnerabilities that could be used for therapeutic purposes. The evidence supporting the claims of the authors is mostly solid, and the multiplicity of models used, as well as the combination of in vitro and in vivo work, are appreciated, but some conclusions would benefit from additional substantiation. This work would be of interest to biologists working on the impact of microenvironment and metabolism in cancer, and especially those investigating pancreatic cancer.

      In this study, the authors use mostly "doublings per day" as an indicator of cell death, notably for figures 4 to 6. However, proliferative arrest (or a decrease in the proliferative rate) is not necessarily synonymous with cell death. It might be nice to complement these experiments with a true measure of cell death (e.g. PI uptake).

    3. Reviewer #3 (Public review):

      This important study investigates the impact of nutrient stress in the tumor microenvironment (TME), focusing on lipid metabolism in pancreatic ductal adenocarcinoma (PDAC). Understanding TME composition is crucial, as it highlights cancer vulnerabilities independent of intracellular mutations, particularly because PDAC tumors are often exposed to limited nutrient availability due to reduced perfusion. By utilizing a medium that mimics the nutrient conditions of PDAC tumors, the authors convincingly show that TME nutrient stress suppresses SREBP1, leading to reduced lipid synthesis, with low arginine levels identified as a key driver of this suppression. Importantly, mice with arginine-starved pancreatic tumors respond to a polyunsaturated fatty acid-rich diet. This discovery uncovers a synthetic lethal interaction in the tumor microenvironment that could be leveraged through dietary interventions.

      Comments on revised version:

      The authors have satisfactorily resolved all previously raised concerns through the inclusion of additional data and clarifications in the discussion.

    1. Reviewer #1 (Public review):

      Summary:

      This study investigates how two closely related fish species differ in their processing of visual motion, with a focus on spatial and temporal integration underlying behavior. Using a series of behavioral assays combined with computational modeling, the authors identify clear species-specific differences in how visual information is integrated to guide movement.

      Strengths:

      A major strength of the work is the systematic and quantitative behavioral analysis, which reveals robust differences between species, including broader spatial integration and longer temporal persistence in medaka compared to zebrafish. The decomposition of behavior into distinct components provides a useful framework for interpreting these differences.

      Weaknesses:

      The computational modeling captures several key aspects of the observed temporal dynamics, particularly differences in response persistence. However, the modeling framework is primarily focused on temporal processing and does not incorporate spatial integration, which is a central finding of the study. In addition, some experimental observations, such as responses to short-duration stimuli and certain frequency-dependent features, are only partially reproduced. These limitations indicate that the link between the model and the full range of behavioral results remains incomplete.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript presents a comparative analysis of optomotor behavior in zebrafish and medaka larvae. Using multiple behavioral paradigms, the authors argue that the two species differ in both the spatial and temporal integration of visual motion. They further decompose turning behavior into large- and small-turn components and use a simple mechanistic model to capture several of the main response features. Overall, the study addresses an interesting question, and the comparative framework gives the work a clear conceptual appeal.

      Strengths:

      A major strength of the manuscript is the breadth of the behavioral analysis. The authors use several stimulus paradigms to probe spatial extent, temporal persistence, and response dynamics, which makes the cross-species comparison richer and more informative than a single-assay study. The decomposition into large and small turn components is also a useful feature of the work, as it provides a more structured account of where the species differences may arise. The modeling further helps organize the results and offers a useful framework for interpreting the behavioral differences.

      Weaknesses:

      The main limitations are in presentation and clarity rather than in the overall motivation or approach. In several places, it is difficult to determine exactly how some quantities are summarized statistically, and some figures and legends would benefit from clearer explanations. In addition, a few of the more specific interpretive claims would be strengthened by more explicit statistical framing and slightly clearer presentation. These issues appear addressable and do not detract from the overall interest of the study.

    1. Reviewer #1 (Public review):

      Summary:

      This paper presents rare and unique recordings of single neurons, LFPs, and SEEG data from human patients performing reading and listening tasks. The authors identify single neurons in temporal and ventral occipito-temporal cortex that respond specifically to spoken and written language, and primarily encode either phonological or orthographic features of the stimuli. They also identify neurons in the middle temporal and inferior frontal cortex that respond to both modalities, which they interpret as amodal language responses. In general, neuronal population firing rates are correlated with both micro- and macro-scale broadband gamma responses, though they observe some dissociations, particularly with the macro-scale. The results are interpreted to support a model of modality-specific to amodal processing throughout many distributed brain areas for language.

      Strengths:

      (1) The data are truly unique, providing a large-scale characterization of single neuron responses from the human brain during written and spoken language processing.

      (2) The task and stimulus conditions allow for examination of both low-level (e.g., orthographic/phonological) and higher-level (e.g., syntactic) encoding.

      (3) Showing relationships between single neuron and multi-scale LFP recordings from the same sites helps bridge neuronal and meso/macroscale literatures.

      Weaknesses:

      (1) My main comment about the paper is that it feels like a collection of somewhat random descriptions of a very small number of hand-picked single neurons. I think that the task and stimulus design shown in Figure 1A sets up some clear hypotheses that could be tested rigorously across the full neuronal population, but instead, the authors pick a few neurons and fit encoding models that don't take advantage of the contrasts. I agree that encoding models are a powerful approach, but with only 508 total words and what appears to be a limited set of variability across the various features, it's not clear to me that the stimuli, which were apparently designed as minimal pairs, provide enough power to find robust results. Perhaps this is why the majority of the results only show a very small number of units (most of which are actually buried in the supplement), but it's odd to me that they don't show the results of the minimal contrasts other than for length.

      (2) Related to point (1), other than Figure 2H and Figure 6A-B, the results are only shown for a tiny number of units. This is great for demonstrating qualitatively what the effects look like, but there is no quantification of the findings across the population, which undermines the point in the abstract that 1000 neurons were recorded. This is acknowledged in some places, but as a reader, it leaves me wondering how seriously to take the interpretations if they seemingly cannot be replicated. I understand this is a challenge with human single neuron recordings, but as presented, the paper as a whole comes across as largely anecdotal.

      (3) Some of the key claims rest on the idea that neurons were recorded from the superior temporal gyrus and fusiform gyrus. For the STG claim, I don't understand how this was done, or what specifically they mean by STG, since the microwire locations do not appear to be anywhere near the lateral surface. This makes sense given the profile of the Behnke-Fried electrodes, but if they want to claim that there are neurons from the STG, they need to be more specific and show where precisely these wires are. If they are more medial as it appears, they need to explain how they dissociated STG from Heschl's gyrus. Similarly, for the fusiform neurons, I can only see a couple of probes that appear to have their tips near where I would think this area is. Perhaps this is more of a visualization issue with Figure 1F, but overall, I am not convinced that the neurons are exactly where they say they are.

      (4) Related to point (3), some of the authors have made strong claims in prior work about the precise coordinates of the VWFA, so it would help to know how many units are within this exact region. The ROIs marked in Figure 2 are quite large, and given results like Vinckier et al. 2007, it's important to know where along the hierarchy the recordings were actually performed. Similarly, given the framing in the intro around the VWFA as a key area, the idea that some of the best example neurons are from the right fusiform is a bit confusing. I don't think they can make the claims about visual hemifields since it does not appear that they recorded eye tracking to verify constant central fixation, and it may be a bit surprising to see such strong orthographic selectivity in the right hemisphere (though, as a result, it may suggest a more nuanced view of lateralization of reading at the single-neuron level).

      (5) In many sections of the paper, there are vague and unquantified claims like "many neurons" or "a large number of units". This needs to be made explicit. It would also help to show where statistical threshold cutoffs are on plots like Figure 2H, since the "brain-score" is used to select units for many analyses.

      (6) More detail on the TRF models is needed in the methods. At the very least, a complete list of the features in each group is necessary to evaluate claims about very broad sets of features like "syntax". It would also help to know how the features were coded, especially where there is a mixture of continuous and discrete features within the model.

      (7) Depending on how exactly the features were defined, I'm skeptical of some of the claims, like position-specific "w". There are some obvious confounds that need to be controlled here, like whether word-initial "w" is strongly associated with shorter, higher frequency words (like "wh-" words). There are other examples, like whether specific forked letters tend to appear in certain syllables in English words. While it may be the case that these kinds of patterns are uniformly distributed, it needs to be established in this particular stimulus set.

      (8) The claim that there is monotonic encoding of word length does not seem strongly supported in the data. In both PC1 and the single neuron examples, it seems like there may be a non-linear relationship, which could suggest that another correlated feature (e.g., word frequency) is involved.

      Minor Points:

      (1) What are "boundaries"? They are not described anywhere I could find, but they are a feature group that was used in the TRFs.

      (2) The caption for Figure 6C says MTG and insula, but the text says MTG and IFG. Similar to the above comment about STG and fusiform, it's not clear to me how they achieved single-unit recordings with Behnke-Fried probes in these areas.

      (3) The somewhat less robust correlations between firing rate and BGA in macro vs micro contacts are potentially interesting. However, did they verify that the closest macro contact was always in the gray matter of the same gyrus as the microwire?

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript, "Modality-Specific and Amodal Language Processing by Single Neurons," presents an intracranial electrophysiology study investigating how language is represented in the human brain across spoken and written modalities. The authors analyze activity from over one thousand single neurons and local field potentials recorded in twenty-one neurosurgical patients while participants read and listened to sentences. Using encoding models based on temporal receptive fields, they examine whether neural responses track modality-specific features, such as phonological and orthographic information, as well as higher-level linguistic features. The results are interpreted as evidence for a dissociation between modality-specific processing in sensory regions and modality-independent ("amodal") representations in temporal and frontal cortices, supporting a two-stage model of language processing.

      Strengths:

      This study uses a rare and valuable dataset, combining single-neuron recordings with broader field potential measures in human participants. The large-scale recording, in terms of both neuron count and anatomical coverage across multiple regions and individuals, represents a significant technical achievement for intracranial research.

      The use of encoding models to relate neural activity to multiple levels of linguistic representation is methodologically rigorous and provides a unified framework to compare phonological, orthographic, and higher-level features. This approach allows the authors to systematically test how different aspects of language are represented across neurons and regions.

      Another key strength is the attempt to directly link concepts from Linguistics to neural data. By framing the results in terms of modality-specific versus amodal representations, the study engages with longstanding theoretical questions and offers a potential bridge between linguistic theory and systems neuroscience.

      The manuscript is also very well written, and the data are presented clearly and effectively. The inclusion of raw data and raster plots is particularly valuable, as it allows readers to directly assess the neural responses and strengthens the transparency of the analyses.

      Weaknesses:

      Despite these strengths, the central claims of the paper are not fully supported by the analyses presented, and several key issues limit the strength of the conclusions.

      A primary concern is the lack of clear reporting and statistical characterization of the proportion of neurons that significantly encode the tested linguistic features. While the paper presents illustrative examples and regional patterns of encoding, it does not systematically quantify how many neurons exhibit significant effects across conditions, nor does it provide formal statistical comparisons of these proportions across brain regions or feature types. As a result, it is difficult to determine whether the reported dissociations reflect robust population-level phenomena or relatively sparse subsets of neurons identified through model fitting. Figure 2H offers a visual depiction of the distribution of Brain-Score (a measure of model evaluation) across the fusiform gyrus and superior temporal gyrus, but it falls short of providing formal statistical testing or quantitative summaries, limiting its interpretability in supporting the authors' claims. Given that the authors employ temporal receptive field (TRF) analyses, the framework naturally allows for straightforward quantification of the proportion of neurons that significantly encode any linguistic features in the model, which could be reported by region as well as by stimulus condition (auditory vs. visual). Including such analyses would further strengthen the population-level interpretation of the results.

      Relatedly, the interpretation of "amodal" neurons is not sufficiently substantiated. The classification of neurons as modality-independent relies on encoding model performance across conditions, but the statistical criteria for establishing cross-modal generalization are not always clearly defined or rigorously tested. Without explicit comparisons (e.g., testing whether the same neurons significantly encode features in both modalities above chance, and whether this exceeds what would be expected under appropriate null models), the claim of modality-independent representation remains somewhat underdetermined.
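      One simple version of such a null comparison is sketched below: if significance in the two modalities were independent across neurons, the number of neurons significant in both would follow a hypergeometric distribution, so the observed overlap can be compared against it. The function name and the counts are hypothetical. The independence null is itself an assumption; it ignores shared factors such as firing rate or unit quality, so a stimulus-permutation null may be more appropriate in practice.

```python
from math import comb

def overlap_p_value(n_total, n_aud, n_vis, n_both):
    """P(overlap >= n_both) if auditory and visual significance were independent.

    n_total: neurons tested in both modalities
    n_aud:   neurons significant in the auditory condition
    n_vis:   neurons significant in the visual condition
    n_both:  neurons significant in both (the candidate "amodal" set)
    """
    denom = comb(n_total, n_vis)
    upper = min(n_aud, n_vis)
    tail = sum(
        comb(n_aud, x) * comb(n_total - n_aud, n_vis - x)
        for x in range(n_both, upper + 1)
    )
    return tail / denom

# Hypothetical counts, for illustration only.
n_total, n_aud, n_vis, n_both = 120, 30, 25, 12
expected = n_aud * n_vis / n_total
p = overlap_p_value(n_total, n_aud, n_vis, n_both)
print(f"observed overlap = {n_both}, expected under independence = {expected:.2f}, p = {p:.3g}")
```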

      More generally, the reliance on encoding models introduces some interpretational ambiguity. Although the observed dissociation between fusiform and superior temporal regions is consistent with orthographic and phonological processing, respectively, the feature spaces used in the models are partially linked to lower-level sensory properties (e.g., visual form and acoustic features). The authors' single-neuron results suggest these effects reflect genuine linguistic selectivity, but the findings do not uniquely distinguish between linguistic and perceptual explanations. While fully disentangling these factors may be beyond the scope of the current study, the manuscript could benefit from a brief discussion acknowledging these correlations or clarifying how lower-level sensory contributions were considered.

      Another limitation is that the proposed two-stage model of language processing is not directly tested against competing hypotheses. While the dissociation between modality-specific and amodal representations is consistent with this model, the authors note that higher-level features, such as syntax, may be encoded in a distributed or overlapping manner. These possibilities are not systematically tested, so the conclusions risk overinterpreting correlational patterns as evidence for a specific processing hierarchy. A more explicit discussion or quantitative consideration of these alternative accounts would strengthen the interpretation, while still allowing the two-stage model to be presented as a plausible framework.

    3. Reviewer #3 (Public review):

      Summary

      This paper analyzes human single-neuron activity recorded with Behnke-Fried electrodes during naturalistic listening and reading. The authors demonstrate a double dissociation between superior temporal gyrus neurons (responsive during listening but not reading) and fusiform gyrus neurons (responsive during reading but not listening), and report that these two classes of neurons show selectivity to specific phonological and orthographic features of the stimulus, respectively. Across the language network, the authors also report neurons whose responses are amodal (active during both listening and reading), which they organize into a modal-to-amodal processing hierarchy. A separate thread of analyses tracks the relationship between single-neuron spiking, micro-wire, and macro-wire signals across these regions. The authors interpret their findings as evidence for hierarchical processing across the language network and for a "compositional code" for orthography in reading.

      Strengths

      The dataset is rare and valuable. Simultaneous single-neuron, micro-wire, and macro-wire recordings during naturalistic reading and listening in the same patients are difficult to obtain, and the experimental design reflects substantial care. The cross-modality comparison at single-neuron resolution is a novel measurement, and the paper presents these results while also situating them against prior neuroimaging and intracranial work. The simultaneous availability of signals at three spatial scales within the human language network is an unusual and potentially important resource for the field.

      Weaknesses

      (1) Framing and novelty

      The paper appropriately situates its modality-selectivity findings against prior neuroimaging and intracranial work (citing Buchweitz et al. 2009 among others) and frames its novel contribution as bringing single-neuron resolution to a question that has previously been examined at population scales. This framing is fair as far as it goes. However, two issues remain. First, the paper does not engage with neuroimaging evidence that complicates its clean modality-selectivity story - most notably Wilson, Bautista, & McCarron (2018), who found that the dorsal superior temporal sulcus is activated by both intelligible and unintelligible inputs in both modalities. Several reconciliations of single-neuron modality selectivity with population-level cross-modal activation are possible (sparse coding, BOLD-vs-spiking dissociations, etc.), and the paper should engage with these possibilities. Second, the paper's discussion extends well beyond the modality-selectivity result that is its headline contribution, into broader claims about a "compositional code" for orthography and "hierarchical processing" across the language network. These broader claims are not supported by the analyses presented (see Weakness 3), and their inclusion distracts from and weakens the core finding rather than building on it. The paper would be stronger if these claims were either subjected to the population-level analyses they require or scaled back to exploratory observations.

      These framing issues are compounded by writing problems that obscure what the paper is claiming. Some passages, such as the assertion that the dataset "suggests an unprecedented examination of linguistic features across various brain regions at various resolutions," are not interpretable as written and should be rewritten.

      (2) Methodological concerns about the TRF analyses

      The selectivity findings in Figures 3 and 5 rest on temporal response function / temporal receptive field (TRF) analyses with several core issues.

      2.1) First, the construction of the TRF feature stream for the reading condition is not specified in the methods. Reading stimuli are presented in RSVP, with all letters of a word appearing simultaneously. How letter or letter-position features are mapped to a time-varying regressor reflects a substantive hypothesis about the psychological mechanisms of reading, with statistical consequences for what the TRF can recover and how reading and listening analyses can be compared.

      2.2) Second, the stimulus distribution limits which effects can be reliably estimated. While the design appears balanced for some features (e.g., subject gender and number), the features that drive the TRF analyses - particularly letter identity and position in the orthographic TRF - are unlikely to be well covered in a small stimulus set. This raises a concern about high-variance feature importance estimates.

      2.3) Third, the TRF feature set includes syntactic, semantic, and discourse predictors alongside phonological and orthographic features. The paper does not justify this choice in fitting single-neuron responses in STG and FSG, and the consequences for the unique-variance analyses are not discussed. Because syntactic features are correlated with phonological and orthographic features in natural stimuli (function words are short, have characteristic phoneme distributions, and so on), the unique variance attributed to each feature set depends on what is being controlled for. Including syntactic predictors when fitting STG or FSG neurons also risks inflating overall TRF fit by chance, particularly in the absence of cross-neuron correction.

      2.4) Fourth, there seems to be no correction for multiple comparisons across the neuron × feature grid. The within-neuron feature-importance procedure briefly described in the Figure 3 caption may help combat overestimates of feature importance within a single fit, but does not address the question of how many of the "selective" neurons reported across the paper would survive correction at the population level. With many neurons, many features, and a limited stimulus set, some neurons will appear selective to some features by chance alone, and these are likely to be the ones that appear as example panels in figures.
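      As one possible form of population-level correction, the sketch below applies the Benjamini-Hochberg false discovery rate procedure to the flattened neuron × feature grid of p-values. The neuron and feature labels and the p-values are made up for illustration; the manuscript does not specify its per-test p-values, and other corrections (e.g., permutation-based maximum-statistic thresholds) would be equally reasonable.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return booleans marking which tests are rejected at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected

# Hypothetical grid: one p-value per (neuron, feature) pair.
grid = {
    ("n01", "letter_k"): 0.001,
    ("n01", "letter_v"): 0.039,
    ("n02", "letter_k"): 0.008,
    ("n02", "letter_v"): 0.041,
    ("n03", "letter_k"): 0.205,
    ("n03", "letter_v"): 0.060,
}
keys = list(grid)
survives = benjamini_hochberg([grid[k] for k in keys], q=0.05)
for key, keep in zip(keys, survives):
    print(key, grid[key], "survives FDR" if keep else "does not survive")
```

      Reporting the number of neuron × feature pairs that survive such a correction, alongside the uncorrected counts, would show how much of the reported selectivity is robust at the population level.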

      Together, these issues mean the per-feature selectivity results cannot be interpreted as the paper currently interprets them. This is consequential because the per-feature selectivity findings underpin the paper's broader claims about a compositional code for orthography and about hierarchical processing across feature levels.

      (3) Claims that outrun the evidence

      Several of the paper's broader claims are not supported by the analyses presented.

      3.1) The authors claim a "compositional code" for orthography, in which single neurons code for the combination of letter identity and position. This claim is illustrated with two example neurons. A claim about a coding scheme is a population-level claim and requires a population-level analysis. A natural test would be a per-neuron model comparison between a TRF with letter identity alone and a TRF including letter identity × position interactions, controlled for model complexity, asking how many neurons show improved prediction with the interaction features. As noted above in §2.2, this analysis would also need to grapple with which letters and positions the data can support estimating. There is a potential connection to the data sparsity worries here: the n=2 example neurons may have the only selectivity profiles for which the relevant interactions could be estimated at all.
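      A minimal sketch of how such a comparison could be summarized is given below. It assumes that both TRF variants have already been fit and scored on held-out data (so that the extra interaction parameters are penalized by out-of-sample prediction rather than rewarded by in-sample fit), and then asks whether more neurons improve than would be expected by chance using a sign test. The per-fold scores, neuron labels, and function names are hypothetical.

```python
from math import comb
from statistics import mean

def heldout_gain(scores_base, scores_full):
    """Mean held-out improvement of the interaction model over the identity-only model."""
    return mean(f - b for b, f in zip(scores_base, scores_full))

def sign_test_p(deltas):
    """One-sided sign test: P(at least this many positive deltas | no true improvement)."""
    nonzero = [d for d in deltas if d != 0]
    n = len(nonzero)
    k = sum(d > 0 for d in nonzero)
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n

# Hypothetical per-fold held-out prediction correlations:
# (identity-only TRF, identity x position TRF) for each neuron.
neurons = {
    "n01": ([0.10, 0.12, 0.09], [0.15, 0.14, 0.13]),
    "n02": ([0.08, 0.07, 0.10], [0.07, 0.08, 0.09]),
    "n03": ([0.20, 0.18, 0.22], [0.24, 0.21, 0.25]),
    "n04": ([0.05, 0.06, 0.04], [0.06, 0.07, 0.06]),
}
gains = {name: heldout_gain(base, full) for name, (base, full) in neurons.items()}
n_improved = sum(g > 0 for g in gains.values())
p = sign_test_p(list(gains.values()))
print(f"{n_improved}/{len(gains)} neurons improve with interaction features, sign-test p = {p:.3g}")
```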

      3.2) The "hierarchical processing" claim is motivated by neurons selective to features at multiple levels - graphemes and sub-graphemes in reading, single phonemes and diphthongs in listening. This claim is not specified mechanistically. The paper does not state what kind of structural linguistic hierarchy is intended (segmental phonology to syllabic structure?), what kind of hierarchical neurocomputational mechanism is being proposed, or why selectivity at multiple levels of a feature hierarchy is evidence for that mechanism rather than for any other mechanism (e.g., parallel feature detectors). As written, the claim is too underspecified to evaluate.

      3.3) The "forked letters" finding (selectivity to k, v, w, y, z) is potentially confounded with letter frequency and co-occurrence structure. These letters are low-frequency, with some exhibiting strong positional asymmetries, and they infrequently co-occur with other letters. Under the unique-variance analysis, decorrelation from other features inflates apparent unique variance even in the absence of genuine selectivity.

      3.4) The word-length effect in Figure 4 is established by PCA on the top five fusiform neurons, with no analysis showing the effect is qualitatively similar across a broader selection. Beyond establishing that something varies with word length, the paper makes no substantive claim about what the neural code represents - for instance, whether it reflects letter- or word-specific processing or a more general visual response to stimulus extent. Prior intracranial work has reported word-length effects in regions posterior to the VWFA but not within it (Thesen et al. 2012), raising the question of whether the effect reported here reflects letter-specific processing or a more general visual response that happens to correlate with stimulus extent.

      (4) Missed opportunities

      Several aspects of the paper are not so much wrong as underdeveloped, in ways that the authors are well-positioned to address.

      4.1) The cross-scale comparison between single-neuron, micro-wire, and macro-wire signals is presented descriptively, without articulating what conclusion these analyses support about the relationship between scales of measurement. Given the rarity of simultaneous recordings at these scales, this is a substantial missed opportunity. The rasters in Figure 2 visually suggest a tight relationship between spiking and micro-population activity that is not evident in the summary in Figure 2g. This discrepancy is not explained. Characterizing the functional and temporal relationship linking spike rates to micro- and macro-HGA is a substantive scientific question, and the paper is well-positioned to address it.

      4.2) The stimuli include controlled grammatical manipulations, but these manipulations are used as nuisance regressors in the TRF analyses rather than as the object of structured analysis. A design with controlled comparisons is being treated as if it were unconstrained naturalistic stimulation, which underuses the experimental structure the authors built.

      4.3) Finally, the paper foregrounds the dataset as a contribution but does not describe data sharing plans. Given that several of this review's recommendations call for analyses the authors have not yet done, the long-term value of the dataset to the community will depend substantially on what is shared and how.

      Buchweitz, A., Mason, R. A., Tomitch, L. M., & Just, M. A. (2009). Brain activation for reading and listening comprehension: An fMRI study of modality effects and individual differences in language comprehension. Psychology & neuroscience, 2(2), 111-123.

      Jobard, G., Vigneau, M., Mazoyer, B., & Tzourio-Mazoyer, N. (2007). Impact of modality and linguistic complexity during reading and listening tasks. Neuroimage, 34(2), 784-800.

      Thesen, T., McDonald, C. R., Carlson, C., Doyle, W., Cash, S., Sherfey, J., Felsovalyi, O., Girard, H., Barr, W., Devinsky, O., Kuzniecky, R., & Halgren, E. (2012). Sequential then interactive processing of letters and words in the left fusiform gyrus. Nature communications, 3, 1284.

      Wilson, S. M., Bautista, A., & McCarron, A. (2018). Convergence of spoken and written language processing in the superior temporal sulcus. Neuroimage, 171, 62-74.

    1. Reviewer #1 (Public Review):

      The medial reticular formation (MRF) in the brainstem has long been implicated in the regulation of locomotion. One common - albeit very simple - model often presents the MRF as a major relay station receiving inputs from MLR circuits, among other brain regions, that together convey locomotor signals through efferent projections targeting the caudal brainstem and the spinal cord. Yet, the MRF is a particularly large brain area whose cellular complexity is far from understood. How molecularly distinct MRF ensembles contribute to the regulation of locomotor behaviors is largely unknown. Here, the authors apply focal activation of either glutamatergic, GABAergic, or serotonergic neurons throughout the MRF using a chemogenetic gain-of-function approach to uncover the putative modulatory properties of these neuronal ensembles during walking. Using kinematic analysis of mouse limbs during self-paced over-ground walkway locomotion, the authors find that activation of GABAergic MRF neurons can selectively slow down walking, whereas activation of glutamatergic neurons can induce a specific "shuffle" limb trajectory, altogether revealing that distinct MRF populations may retain the capability to engage divergent walking signatures, whose behavioral relevance is not yet clear. In contrast, the activation of serotonergic neurons did not affect walking signatures as described for the other two subgroups but led to an increase in locomotor speed. Interestingly, MRF neurons in each of the regional activation "hotspots" appear to target different domains in the lumbar spinal cord, suggesting that distinct circuit mechanisms are at play for the slowmo vs shuffle effects.

      Major points:

      1. While the experiments are carefully done and the results are well analyzed and clearly presented in a series of beautiful figures, several aspects of the methodology remain very confusing. In particular, the initial choice for the injection coordinates is not justified and the authors don't leverage the mapping of spinal projection neurons to drive their chemogenetic screen. Similarly, the authors group very different injection schemes (unilateral or bilateral targeting of MRF neurons) that should be analyzed separately. The choice of Z score cutoff that dictates the in-depth analysis of the chemogenetic phenotypes appears arbitrary and is not grounded in a set of objective criteria.

      2. One issue that arises from the work presented here is that we don't know if these MRF neurons are active during locomotion in normal, unperturbed conditions. Knowing the recruitment profile of these MRF neurons would clarify whether the chemogenetic activation boosts the firing of neurons that are already active during walking, or activates neurons that are otherwise silent. Distinguishing between these possibilities may have a profound impact on the overall interpretation of the results.

      3. The results should be discussed in the broader context of historic stimulation experiments, notably in cats and other species, as well as more recent circuit mapping approaches in rodents. For instance, the notion that focal stimulation of distinct areas within the MRF can elicit or modify the pattern of locomotion is not really new, nor is the notion that some of these modulations are phase-specific and can influence the duration of single muscle activation during stance or swing phases. This last point has, for instance, already been assessed through individual muscle recordings paired with MRF stimulation in cats. Perhaps a better introduction of these key studies and a thorough discussion of what the results presented in this manuscript bring in terms of novelty would help readers ground this work in a larger, more comprehensive body of work.

    2. Reviewer #2 (Public Review):

      This paper is an interesting conceptual work where certain hotspot areas were found to induce unique gait patterns. These patterns differed from a classic change in speed or gait pattern from a walk to a gallop. From this, a hypothesis was formed that these areas could be important for possible alternative walking patterns seen, for example, during pathologies such as Parkinson's disease or perhaps related to stalking behaviors.

      While I liked the work and found it interesting, it remains descriptive in that the actual behaviors observed can't be causally related to a particular behavior such as stalking or shuffling. If the necessity or sufficiency of this region was related to a specific hunting behavior, for example, its interest to the field would be greater.

      Nevertheless, this paper does contribute to growing evidence that specific behaviors can be triggered by specific neuronal populations within the brainstem.

  2. May 2026
    1. Reviewer #1 (Public review):

      Summary:

      The authors considered the mechanism underlying previous observations that H2A.Z is preferentially excluded from methylated DNA regions. They considered two non-mutually exclusive mechanisms. First, they tested the hypothesis that nucleosomes containing both methylated DNA and H2A.Z might be intrinsically unstable due to their structural features. Second, they explored the possibility that DNA methylation might impede SRCAP-C from efficiently depositing H2A.Z onto these DNA-methylated regions.

      Their structural analyses revealed subtle differences between H2A.Z-containing nucleosomes assembled on methylated versus unmethylated DNA. To test the second hypothesis, the authors allowed H2A.Z assembly on sperm chromatin in Xenopus egg extracts and mapped both H2A.Z localization and DNA methylation in this transcriptionally inactive system. They compared these data with corresponding maps from a transcriptionally active Xenopus fibroblast cell line. This comparison confirmed the preferential deposition or enrichment of H2A.Z on unmethylated DNA regions, an effect that was much more pronounced in the fibroblast genome than in sperm chromatin. Furthermore, nucleosome assembly on methylated versus unmethylated DNA, along with SRCAP-C depletion from Xenopus egg extracts, provided a means to test whether SRCAP-C contributes to the preferential loading of H2A.Z onto unmethylated DNA.

      Strengths:

      The strength and originality of this work lie in its focused attempt to dissect the unexplained observation that H2A.Z is excluded from methylated genomic regions.

      Weaknesses:

      The study has two weaknesses. First, although the authors identify specific structural effects of DNA methylation on H2A.Z-containing nucleosomes, they do not provide evidence demonstrating that these structural differences lead to altered histone dynamics or nucleosome instability. Second, building on the elegant work of Berta and colleagues (cited in the manuscript), the authors implicate SRCAP-C in the selective deposition of H2A.Z at unmethylated regions. Yet the role of SRCAP-C appears only partial, and the study does not address how the structural or molecular consequences of DNA methylation prevent efficient H2A.Z deposition. Finally, additional plausible mechanisms beyond the two scenarios the authors considered are not investigated or discussed in the manuscript.

      Comments on revisions:

      The authors have addressed all previously raised concerns and propose a revised version of the manuscript. Notably, the abstract and discussion sections have been improved, and new experimental data have been incorporated. Collectively, these revisions enhance the rigor and clarity of the data interpretation and discussion.

      Given these improvements, this reviewer believes that the manuscript could be published, particularly if this publication is accompanied by the critical points discussed in the rebuttal letter.

    2. Reviewer #2 (Public review):

      This manuscript aims to elucidate the mechanistic basis for the long-standing observation that DNA methylation and the histone variant H2A.Z occupy mutually exclusive genomic regions. The authors test two hypotheses: (i) that DNA methylation intrinsically destabilizes H2A.Z nucleosomes, thereby preventing H2A.Z retention, and (ii) that DNA methylation suppresses H2A.Z deposition by ATP-dependent chromatin-remodelling complexes. The revised manuscript addresses a number of previous concerns, and the manuscript has therefore improved accordingly. However, several limitations remain.

      Comments on revisions:

      The authors have addressed a number of my previous concerns, and the manuscript has improved accordingly. However, several limitations remain that, in my view, constrain the strength of the conclusions. In particular, a direct comparison with a canonical nucleosome assembled on the same DNA template is absent. This control is essential to determine whether the observed effects are specific to H2A.Z or reflect more general properties of methylated DNA-nucleosome interactions. Notably, even within the authors' own data, there is a trend suggesting that methylated canonical H2A nucleosomes may also exhibit increased accessibility. Although this does not reach statistical significance, the authors themselves argue that subtle differences can be biologically meaningful; it is therefore plausible that extended digestion conditions (e.g., longer HinfI exposure) could reveal a significant effect. Unless a direct structural comparison with a canonical nucleosome is performed, the possibility that the reported phenomenon is not specific to H2A.Z remains. This is compounded by the reliance on a single restriction enzyme-based assay, which represents a limited experimental approach. Such an approach is insufficient to unequivocally support the central claim that DNA methylation increases accessibility of H2A.Z-containing nucleosomes. Additional orthogonal assays would be required to substantiate this conclusion. With respect to the cryo-EM analysis of methylated and unmethylated 601L H2A.Z nucleosomes, and in general, the authors still do not adequately consider the positional context of CpG methylation. Extensive literature demonstrates that the effects of DNA methylation on canonical nucleosome structure and stability are highly position-dependent. Without accounting for the location of methylated CpGs relative to key DNA-histone contact sites, the structural data remain difficult to interpret mechanistically.

      Overall, while the manuscript has improved, it remains a relatively limited study that draws broad mechanistic conclusions from minimal experimental data.

    3. Reviewer #3 (Public review):

      Summary:

      Histone variant H2A.Z is evolutionarily conserved among various species. The selective incorporation and removal of histone variants on the genome play crucial roles in regulating nuclear events, including transcription. Shih et al. aimed to address antagonistic mechanisms between histone variant H2A.Z deposition and DNA methylation. To this end, the authors reconstituted H2A.Z nucleosomes in vitro using methylated or unmethylated human satellite II DNA sequence and examined how DNA methylation affects H2A.Z nucleosome structure and dynamics. The cryo-EM analysis revealed that DNA methylation induces a more open conformation in H2A.Z nucleosomes. Consistent with this, their biochemical assays showed that DNA methylation subtly increases restriction enzyme accessibility in H2A.Z nucleosomes compared with canonical H2A nucleosomes. The authors identified genome-wide profiles of H2A.Z and DNA methylation using genomic assays and found that their distributions differ between Xenopus sperm pronuclei and fibroblast cells. Using Xenopus egg extract systems, the authors showed that the SRCAP complex, the chromatin remodeler for H2A.Z deposition, preferentially binds to unmethylated DNA to deposit H2A.Z.

      Strengths:

      The experiments are rigorously performed, and interpretations are clear. The study presents a high-resolution cryo-EM structure of human H2A.Z nucleosome with methylated DNA. Although the effect of DNA methylation on the physical stability of the H2A.Z nucleosome is subtle, this would be an important finding that warrants further functional investigation. The discovery that the SRCAP complex senses DNA methylation is novel and provides important mechanistic insight into the antagonism between H2A.Z and DNA methylation.

      Weaknesses:

      The authors have satisfactorily addressed my concerns.

    1. Reviewer #2 (Public review):

      Summary:

      This is a laudable effort to help dissect the contributions of type I and type III IFNs to the antiviral response in chicken and therefore represents an important piece of work, not least in the light of birds being a key carrier and worldwide distributor of influenza virus. The first part of the study characterises the generation of IFNAR and IFNLR KO chicken strains and describes basic differences. Four different viruses are then tested in chicken embryos, while the subsequent analysis of the antiviral response in vivo is performed with one influenza H3N1 strain.

      Strengths:

      Having these two KO chicken strains as a tool is a great achievement. The initial analysis is solid. The effect of IFNAR deficiency on in vivo infection is clear; that of IFNLR deficiency is less so.

      Weaknesses:

      (1) The antibody induction by KLH immunisation: We still don't know whether or not this vaccination induces IFN responses in wt mice, so it is still not possible to judge whether the effects observed are due to steady-state differences or to differential effects of IFN induced during the vaccination phase. Pre-immune results are now shown and are indeed zero. As suggested, the whole figure 4 is now condensed into one or two panels by proper calculation of Ab titers - would these titres be significantly different? This experiment, like all of the other in vivo experiments, has not been repeated, if I understand the methods section correctly. I understand that there are three R restrictions that are tighter in some countries, and I accept that with the numbers used here, some statistical significance is reached, but this is not the case for survival, for instance.

      (2) The basic conundrum here and in later figures is now addressed by the authors in the discussion: Situations where IFN type 1 and 3 signalling deficiency each have an independent effect (i.e. Fig. 4d) suggest that they act by separate, unrelated mechanisms. However, all the literature about these IFN families suggests that they show almost identical signalling and gene induction downstream of their respective receptors. How can the same signalling, clearly active here downstream of the receptors for IFN type 1 or type 3, be non-redundant, i.e. why does the unaffected IFN family not stand in? The mouse studies, which showed a rather subtle phenotype when only one of the two IFN systems was missing, but a massive reduction in virus control in double KO mice, are discussed, but a clear-cut explanation for the differences has not been reached. Reasons could be a direct effect of IFNab on B cells and an indirect effect of IFNL through non-B cells, timing issues, and many other scenarios can be envisaged. The authors do not address this question experimentally, which limits the depth of analysis; they have, however, now included a discussion of this dilemma.

      (3) In the one in vivo experiment performed with chickens, only one virus was tested; more influenza strains should be included, as well as non-influenza viruses. I appreciate that this is logistically difficult.

      (4) The basic conundrum of point 2 applies equally to Fig. 6a, where both KOs have a phenotype. Again, in 6d, both IFNs appear to be separately required for Mx induction. An explanation has been attempted, but more experiments, for instance looking at different time points to understand whether we are simply dealing with different kinetics of the response, have not been attempted, despite the fact that such experiments are likely not covered by strict three R rules.

      (5) The in vivo infection is the most interesting experiment, and the key outcome here is that IFN type 1 is crucial for anti-H3N1 protection in chickens, while type 3 is less impactful. However, this experiment suffers from the different time points when chickens were culled, so many parameters are impossible to compare (e.g. weight loss, histopathology). Some explanation is given as to the comparisons chosen here, but a more thorough analysis at several time points would have strengthened this study.

      Comments on revised version:

      In the rebuttal, the authors have gone to some length to add to the discussion of the experiments, and some aspects are better explained now than before. Many of these explanations remain speculative however, so the study remains inconclusive in several aspects. As no new data was added, my overall judgement of this study remains unchanged.

    1. Reviewer #1 (Public review):

      Summary:

      Ducrocq et al. present research exploring the genetic link between simple multicellular group formation (ace2Δ/ace2Δ) and its interaction with cell-cycle progression mutants (e.g., cln3Δ/cln3Δ), demonstrating that this combination can provide fitness benefits during fluctuating resource conditions, resulting in a rapid increase in the fraction of multicellular cell-cycle mutants over unicellular yeast without selection for multicellular size. Because both the multicellular phenotype and the regulatory link enabling faster escape from the stationary phase are controlled by the ACE2 transcription factor, this work demonstrates that multicellular cluster formation can arise as a side effect of a completely independent fitness advantage unrelated to the benefits of group formation itself. As a "passenger phenotype," multicellularity could thus emerge for other selective reasons, potentially facilitating a later transition to more entrenched multicellularity if novel conditions arise that make multicellular group formation directly beneficial.

      Importantly, while the literature generally assumes that multicellular group formation incurs a cell-level fitness cost, this work demonstrates that certain genetic-environmental interactions can confer fitness benefits even at the level of individual cells forming multicellular groups. This finding should inspire both theoretical and empirical work exploring multicellular group formation selected for benefits at the level of individual cells, rather than for the benefits of larger organismal size that most work has relied on so far.

      Strengths:

      This work is novel and exciting for research exploring the very first steps of the transition from unicellularity to simple multicellularity. The formation of multicellular groups is almost always assumed to come at a cell-level fitness cost due to reduced reproductive fitness compared to remaining unicellular, which generally needs to be outweighed by the benefits of multicellular group formation (e.g., large size to escape predation) for the multicellular phenotype to be stable. However, this study presents an interesting case of a genetic and environmental condition under which individual cells forming simple multicellular clusters can actually have higher reproductive fitness than solitary living yeast cells. This contrasts with previous snowflake yeast studies where the multicellular phenotype was primarily beneficial due to strong selection for large groups (rather than cell-level fitness gains).

      The claims and interpretation of the results align well with the data presented. This is due to the careful and straightforward experimental design testing predictions with a clear, stepwise methodology. The authors rule out alternative explanations and provide support for the proposed link between the mutations (ace2, cln3, and others), their impact on faster exit from quiescence and earlier entry into reproduction in fresh media, and the resulting higher fitness in the snowflake yeast phenotype compared to unicellular yeast.

      This experimental framework (combining cell-cycle mutants under the same multicellular background) is very likely to be adopted by others in the community to explore downstream implications of these results in laboratory and environmental yeast isolates.

      Weaknesses:

      The authors show that the same multicellular phenotype with higher cell-level fitness due to faster exit from the stationary phase can also be observed with alleles found at other loci in non-laboratory yeast strains, implying that the results are likely not specific to a peculiar case genetically engineered in laboratory strains, but that similar phenotypes may be present in nature. However, this remains to be explored by examining the natural ecology of commercially available or wild yeast isolates and their genomes. This is not a weakness of this study per se, but rather a direction for future work. It does mean, however, that the relevance of these findings for early multicellularity in yeast, and even more so for nascent multicellularity in distinct taxa, remains to be explored in the future. Until then, it is difficult to make strong claims about how applicable these results would be for non-laboratory yeast and other taxa. Regardless, this work represents a very exciting finding.

      Comments on revised version:

      The authors addressed all concerns thoroughly.

    1. Reviewer #1 (Public review):

      Summary:

      Morgan et al. studied how paternal dietary alteration influenced testicular phenotype, placental and fetal growth using a mouse model of paternal low protein diet (LPD) or Western Diet (WD) feeding, with or without supplementation of methyl-donors and carriers (MD). They found diet- and sex-specific effects of paternal diet alteration. All experimental diets decreased paternal body weight and the number of spermatogonial stem cells, while fertility was unaffected. WD males (irrespective of MD) showed signs of adiposity and metabolic dysfunction, abnormal seminiferous tubules and dysregulation of testicular genes related to chromatin homeostasis. Conversely, LPD induced abnormalities in the early placental cone, fetal growth restriction and placental insufficiency, which was partly ameliorated by MD. The paternal diets changed placental transcriptome in a sex-specific manner and led to a loss of sexual dimorphism in the placental transcriptome. These data provide a novel insight on how paternal health can affect the outcome of pregnancies, which is often overlooked in prenatal care.

      Strengths:

      The authors have performed a well-designed study using commonly used mouse models of paternal underfeeding (low protein) and overfeeding (Western diet). They performed comprehensive phenotyping at multiple timepoints including of the fathers, the early placenta and late gestation feto-placental unit. The inclusion of both testicular and placental morphological and transcriptomic analysis is a powerful non-biased tool for such exploratory observational studies. The authors describe changes in testicular gene expression revolving around histone (methylation) pathways that are linked to altered offspring development (H3.3 and H3K4), which is in line with hypothesised paternal contributions to offspring health. The authors report sex differences in control placentas that mimic those in humans, providing potential for translatability of the findings. The exploration of sexual dimorphism (often overlooked) and its absence in response to dietary modification is novel and contributes to the evidence-base for the inclusion of both sexes in developmental studies.

      Comments on revised version:

      The authors have done a great job addressing my concerns. The description of the data analysis and the figures are now much clearer. The inclusion of the potential links between the microbiome and male reproductive fitness is informative and improves the flow of the discussion.

    2. Reviewer #2 (Public review):

      Summary:

      The authors investigated the effects of a low-protein diet (LPD) and a high sugar- and fat-rich diet (Western diet, WD) on paternal metabolic and reproductive parameters and feto-placental development and gene expression. They did not observe significant effects on fertility; however, they reported gut microbiota dysbiosis, alterations in testicular morphology, and severe detrimental effects on spermatogenesis. In addition, they examined whether the adverse effects of these diets could be prevented by supplementation with methyl donors. Although LPD and WD showed limited negative effects on paternal reproductive health (with no impairment of reproductive success), the consequences on fetal and placental development were evident and, as reported in many previous studies, were sex-dependent.

      Strengths:

      This study is of high quality and addresses a research question of great global relevance, particularly in light of the growing concern regarding the exponential increase in metabolic disorders, such as obesity and diabetes, worldwide. The work highlights the importance of a balanced paternal diet in regulating the expression of metabolic genes in the offspring at both fetal and placental levels. The identification of genes involved in metabolic pathways that may influence offspring health after birth is highly valuable, strengthening the manuscript and emphasizing the need to further investigate long-term outcomes in adult offspring.

      The histological analyses performed on paternal testes clearly demonstrate diet-induced damage. Moreover, although placental morphometric analyses and detailed histological assessments of the different placental zones did not reveal significant differences between groups, their inclusion is important. These results indicate that even in the absence of overt placental phenotypic changes, placental function may still be altered, with potential consequences for fetal programming.

      Comments on revised version:

      The authors have adequately addressed all my previous comments.

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      In this manuscript, the authors employ diaphragm denervation in rats and mice to study titin-based mechanosensing and longitudinal muscle hypertrophy. By integrating bulk RNA-seq, proteomics, and phosphoproteomics, they map the stretch-responsive signalling landscape, uncovering robust induction of the muscle-ankyrin-repeat proteins (MARP1-3) together with enhanced phosphorylation of titin's N2A element.

      Genetic ablation of MARPs in mice amplifies longitudinal fibre growth and is accompanied by activation of the mTOR pathway, whereas systemic rapamycin treatment suppresses the hypertrophic response, highlighting mTORC1 as a key downstream effector of titin/MARP signalling.

      Strengths:

      The authors address a clear biological question: "how titin-associated factors translate mechanical stretch into longitudinal fibre growth" using a unique and clinically relevant animal model of diaphragm denervation. Using a comprehensive multiomics approach, the authors identify MARPs as potential mediators of these effects and use a genetic mouse model to provide compelling evidence supporting causality. Additionally, connecting these findings to rapamycin, a drug widely used clinically, further increases the relevance and potential impact of the study.

    2. Reviewer #2 (Public review):

      Summary:

      Muscle hypertrophy is a major regulator of human health and performance. Here, van der Pijl and colleagues assess the role of the giant elastic protein, titin, in regulating the longitudinal hypertrophy of diaphragm muscles following denervation. Interestingly, the authors find an early hypertrophic response, with 30% new serial sarcomeres added within 6 days, followed by subsequent muscle atrophy. Using RBM20 mutant mice, which express a more compliant titin, the authors discovered that this longitudinal hypertrophy is mediated via titin mechanosensing. An omics approach suggests that the muscle ankyrin repeat proteins (MARPs) may regulate this process. Genetic ablation of MARPs 1-3 blocks the hypertrophic response, although single knockouts are more variable, suggesting extensive complementation between these titin-binding proteins. Finally, administration of rapamycin shows that the mTOR signalling pathway plays a role in longitudinal hypertrophic growth.

      Strengths:

      This paper is well written and uses an impressive suite of genetic mouse models to address this interesting question of what drives longitudinal muscle growth.

      Weaknesses:

      While the findings are of interest, in their current state they lack sufficient mechanistic detail to separate cross-sectional from longitudinal hypertrophy. The authors have excellent tools, such as the RBM20 model, with which to functionally dissect the contribution of mTOR signalling to these processes. It is also unclear whether this process is unique to the diaphragm or is conserved across other muscle groups during eccentric contractions.

    1. Reviewer #1 (Public review):

      Summary:

      Deng and colleagues pursue the possibility that red light exposure can provide some benefits and anti-senescence effects in aged mouse models. In addition, they show how red light influences metabolism in cultured keratinocytes. The authors provide a long dissection of the potential paths involved in the changes promoted by red light exposure, identifying CytC oxidase, SIRT4, PPARa and MCD as key players.

      Strengths:

      The authors did a thorough exploration of the multiple potential avenues by which red light exposure influences metabolism. The in vitro and in vivo evidence nicely complement each other.

      Weaknesses:

      This is a challenging hypothesis that would require some additional experimental controls. The pathway dissection, while extensive, is sometimes approached in unconvincing ways, and the results are not always evident to judge or interpret. Technically, the western blots and transcriptomic analyses require notable improvements.

    2. Reviewer #2 (Public review):

      Summary:

      This work identifies a previously unknown way that red light can slow ageing. The authors show that red light lowers the level of a protein called SIRT4 in skin cells. Reducing SIRT4 boosts fatty acid use and increases a type of histone modification that keeps genes active. These changes help cells clear away signs of ageing, reduce inflammation, and restore normal metabolism. The findings open the possibility of developing new treatments that target SIRT4 to reverse age‑related decline.

      Strengths:

      The evidence is solid because the authors use several complementary methods. They test red light in both cultured cells and naturally aged mice, and they confirm the key role of SIRT4 by silencing its gene. Measurements of metabolism, protein changes, and ageing markers all point in the same direction. However, the exact way red light lowers SIRT4 levels is not fully explained, which leaves a minor gap. Overall, the conclusions are well supported and convincing.

      Weaknesses:

      The paper does not go on to use its mechanistic discoveries to help the community identify the mechanism of photobiomodulation, which remains unknown.

      I would like to draw attention to a recently published paper by Herrera et al. (FEBS Letters 2025, doi:10.1002/1873-3468.70195), which shows that red light (660 nm) stimulates mitochondrial fatty acid oxidation in keratinocytes via AMPK‑dependent phosphorylation of ACC, without altering expression of electron transport chain complexes. I believe this paper is highly complementary to the current study.

      Herrera et al. demonstrate that red light increases basal, ATP‑linked, and maximal oxygen consumption rates in keratinocytes specifically through enhanced fatty acid oxidation (inhibited by etomoxir). This independently validates the central finding of the current manuscript, i.e., red light boosts lipid metabolism, strengthening the robustness of this concept.

      While the current manuscript focuses on the SIRT4‑MCD axis, Herrera et al. identify AMPK phosphorylation and ACC inhibition as key effectors. The authors can integrate and expand their discussion, since SIRT4 downregulation may converge on AMPK activation, or they may represent parallel, reinforcing mechanisms. This would enrich the mechanistic model and open new hypotheses.

      The mechanism of photobiomodulation: Herrera et al. explicitly challenge the prevailing paradigm that red light acts solely via cytochrome c oxidase (by showing long-lasting effects, unchanged OXPHOS protein levels, and no difference in permeabilised cells). The current finding (red light acts through SIRT4 downregulation, i.e., not direct enzymatic activation) aligns perfectly with Herrera et al.'s critique.

      Long-term metabolic effects: Herrera et al. show that a single red light exposure elevates oxygen consumption for up to 2 days. The current study focuses on changes at 12-24 h. Their data extend the time window and suggest that the metabolic reprogramming described in the current study may persist longer than currently discussed, which is clinically relevant.

      Discussing Herrera et al.'s results would not only acknowledge independent, corroborating evidence but would also allow the authors to position their SIRT4‑centric mechanism within a broader, emerging understanding of red‑light photobiomodulation.

    1. Reviewer #1 (Public review):

      Strengths:

      The manuscript has several strengths, including a technically comprehensive approach that combines mouse genetics, electrophysiology, live imaging in assembloids, and human organoid models, providing a rich and multifaceted dataset. Cross-species validation through the parallel use of mouse and human systems strengthens the generality of the observed phenotypes and increases relevance to human neurodevelopment.

      Consistent phenotypic observations across systems show that ARHGEF6 loss affects migration, neurite morphology, growth cone structure, and neuronal survival, supporting a coherent role in cytoskeletal regulation.

      There is clear evidence for developmental defects, including reduced interneuron numbers, increased apoptosis in the ganglionic eminences, and migration deficits, all well supported by quantitative analyses. Also, there is a high-quality electrophysiological characterization that demonstrates reduced firing in interneurons, providing a well-controlled functional phenotype.

      Weaknesses:

      Despite the strengths mentioned above, the study has some conceptual and experimental weaknesses that reduce its impact. The mechanistic insight is limited, as the research does not directly establish how ARHGEF6 regulates downstream signaling pathways.

      Also, there is insufficient evidence for interneuron specificity; although the central claim is that ARHGEF6 plays a selective role in interneurons, the data do not adequately exclude the possibility that the observed effects reflect broader neuronal defects. The study lacks critical controls across cell types, as several phenotypes observed in organoids and progenitors, including apoptosis, reduced neuronal output, and altered morphology, could also affect multiple neuronal populations without being directly tested. Furthermore, the data are predominantly descriptive, with many results remaining correlative and failing to establish causal relationships.

      Some more comments:

      (1) Given that ARHGEF6 is a guanine nucleotide exchange factor for Rac1 and Cdc42, the absence of direct measurements of GTPase activity or downstream signaling represents a significant gap. The interpretation that the observed phenotypes are mediated through specific cytoskeletal pathways, therefore, remains inferential.

      (2) The manuscript repeatedly interprets the findings as interneuron-specific. However, several key observations are not demonstrated to be restricted to IN. Without direct comparison to excitatory neurons or other cell types, it is difficult to conclude that ARHGEF6 plays a selective role in interneurons rather than a more general role in neuronal development. The well-done analysis of the transcriptomic dataset is not sufficient to claim IN specificity. This issue is particularly important for the interpretation of the human organoid experiments, where reductions in SOX2⁺ progenitors and NEUN⁺ neurons, as well as increased apoptosis, could reflect global developmental defects. Similarly, in the mouse experiments, the reduction in GAD67⁺ cells is compelling, but it is not shown whether other neuronal populations are also affected.

      (3) The study provides a strong phenotypic description but limited causal resolution. For example, migration defects, altered growth cone morphology, and reduced branching are all consistent with impaired cytoskeletal regulation, but the links between these phenotypes are not directly established. Likewise, while the electrophysiological data convincingly show reduced firing in interneurons, the connection between altered cytoskeletal dynamics and intrinsic excitability is not explored.

      (4) Several aspects of data presentation could be improved. In multiple figures (e.g., Figure 1A, D; Figure 4 and Video S1, 2), the images are difficult to interpret due to high cellular density, limited magnification, or lack of clear annotation. In some cases, it is not fully clear how quantifications were performed or which regions were analyzed. Improving the visual clarity of the data with arrows, boxes, and high-magnification insets would strengthen confidence in the conclusions.

    2. Reviewer #2 (Public review):

      The authors investigate the impact of the deletion of the small GTPase regulator ARHGEF6 on the development and physiology of interneurons. Using public databases, they first show that ARHGEF6 is enriched in interneurons or in areas that give rise to them, both in development and adulthood, in humans and mice. Using a previously reported complete KO mouse and a GAD67-GFP reporter mouse line, they show that there is a notable reduction in GFP+ cells in the adult mouse cortex and hippocampus. These mice show increased numbers of apoptotic cells at different timepoints and in different areas of the brain during development. In the developing cortex of ARHGEF6-KO mice, there are fewer IN in all layers, and the cells have processes that are not correctly oriented. IN from the hippocampus in culture show reduced excitability and impaired neurite branching. The authors then established isogenic hiPSC lines to study ARHGEF6 deletion in human cells and differentiated ventral forebrain neurons, finding both interneuron-related and non-related phenotypes. Most importantly, human interneurons grown in organoids show reduced branching and altered growth cone morphology. The authors claim that the novel interneuron phenotypes found in these models can explain, in part, the human intellectual disabilities associated with mutations in this protein. The study is well conducted and opens new avenues of research not only for the role of small GTPase regulation in early nervous system development, but also for how interneuron deficiencies impact a wider range of intellectual disability syndromes found in humans.

      However, most conclusions of the present version would be strengthened after considering the following comments:

      Major comments

      (1) The reported biological processes evaluated at different developmental stages may be directly or indirectly related to ARHGEF6 function itself. As a model of a hereditary disease, full organism gene deletion is valid, since the human patients suffer from that condition as well. However, to investigate the roles of a protein, complete deletions may not be very accurate, since they can give rise to phenotypes that are only indirectly related to the protein function itself. Most conclusions of the present manuscript should either be discussed in this regard or be supported by added evidence for a direct role of the protein. Such evidence is typically obtained with acute knockdowns in culture, or in developing brains by in utero electroporation. For example, Figure 1C shows that the principal excitatory neurons in the hippocampus do not express ARHGEF6. However, most electrophysiological and behavioral evidence of defects in ARHGEF6-KO mice arises from evaluating these cells (Ramakers et al., 2012). I am not suggesting that either the previous or the current evidence is wrong. But I believe readers would benefit from a clear distinction (or added cautionary notes) between a functional consequence of the deletion (which can be months away and in cells other than those with the actual molecular defect) and a true cell biological function of the protein under study. In favor of the authors, this is a concern with most conclusions derived from KO organisms.

      (2) Figure 1E-I. All conclusions are made with a GAD67-GFP reporter, which is a very powerful and reliable tool for large-scale screening. All the conclusions of the paper would be strengthened if immunohistochemical staining for specific interneuron markers in the same areas were added as complementary evidence.

      (3) Cell death in development: It is surprising that the high amount of TUNEL staining during development does not translate into gross histological changes in the adult brain (studied elsewhere). Can the authors discuss possible explanations?

      (4) Section 4 (Figures 2F-J) - The authors present this staining as an analysis of migration. Normally, migration studies are performed with a "pulse-chase" paradigm, where a single cohort is labeled and then followed over time (normally by in utero electroporation of a fluorescent protein). Tissue is then fixed at different time points, and migration can be followed. In contrast, the evidence here comes from a single time point, in an experimental setting in which all Gad67 IN are stained, and hence one cannot infer a defect in migration. The differences between WT and ARHGEF6-KO are obvious and interesting; it is just that they cannot be solely attributed to a problem in migration.

      Also, a true migration phenotype in the current setting should show that the cells that failed to migrate accumulate in deeper layers. My impression is that the changes in IN per layer are more easily explained by total cell number than by migration. Perhaps evaluating earlier timepoints could clarify this.

      (5) It is known that ARHGEF6 deletion produces severe F-actin phenotypes in neurons. Have the authors confirmed that GAD67 cells in their hippocampal cultures also have these phenotypes (stress fibers in somas, growth cones, and actin patches along neurites)?

      (6) Section 4. The authors present data for deficient migration of the GFP-labeled interneurons. Is it possible to assess, in the same sections, whether other cell types are also affected? Although the hypothesis that ARHGEF6 deletion will have an impact on IN is well rooted in expression data, by assessing other cell types, one can even include a positive control or evidence for a cell-autonomous phenotype.

      (7) ARHGEF6 deletion has an important impact on organoid development (size, shape, etc.). Have the authors analysed whether these organoids produced fewer interneurons?

      (8) In assembloids, the differences in migration parameters are very small between WT and ARHGEF6-KO, which reinforces that perhaps what is observed in the different layers of cortex during mouse development is likely not entirely due to migration, as concluded.

      (9) To properly weigh the present evidence (interneuron deficits) from the ARHGEF6-KO model, the authors should include a deeper discussion in light of the substantial work that has already been done using these mice. How does the finding of a diminished IN population in the brain of these mice explain the large amount of electrophysiological and behavioral evidence produced before with these animals? Perhaps the most important work to discuss in this respect is the initial ARHGEF6-KO report by Ramakers and colleagues (2012), but there are others.

      Minor comments

      (1) Figure 1A. It looks clear that the GE shows the highest expression of ARHGEF6; however, the reader needs the reference levels where the log2 expression is calculated. What are the reference levels?

      (2) Have the authors compared the number of GAD67-eGFP cells in the hippocampal cultures between WT and ARHGEF6-KO mice?

      (3) Section 3: as a cautionary note, the authors should mention that it is not possible to know from the evidence provided which cells are dying.

      (4) In the dorsal-ventral assembloids, it is expected that the ventral organoid would contain lots of GFP expression compared to the dorsal, but in the image shown (Figure 5A) both parts of the assembloid seem to have the same amount and distribution of GFP. How is that possible?

    3. Reviewer #3 (Public review):

      Summary:

      ARHGEF6 is a RAC1/CDC42 guanine nucleotide exchange factor that has been proposed to be associated with X-linked intellectual disability, but its relevance to the pathology is not well established. ARHGEF6 has been assigned a role in spine density and plasticity of hippocampal pyramidal neurons, but nothing is known about its role in interneuron development. Here, the authors show that ARHGEF6 is expressed early in development in the inhibitory lineage during the peak of interneuron generation and migration. The aim of the study is therefore to investigate whether, in addition to its role in pyramidal neurons, ARHGEF6 could play a role in inhibitory neuron development. Using both ARHGEF6-KO mice and organoids from ARHGEF6-KO hiPSCs, the authors show that ARHGEF6 plays a critical role in interneuron development and function.

      Strengths:

      The major strength of the paper is the very detailed analysis of the role of ARHGEF6 using two different systems: ARHGEF6-KO mice and deletion of ARHGEF6 in human iPSC-derived organoids. Strikingly, deletion of ARHGEF6 in both systems induces similar defects such as an increase in apoptosis, reduced neuronal output, impaired neuronal morphology, and disrupted migratory dynamics. This compelling evidence demonstrates that ARHGEF6, in addition to its already well-described role in spine formation and plasticity, is playing a crucial role during embryonic development through its function in interneurons.

      Weaknesses:

      (1) In Figure 1, the authors show that ARHGEF6 is expressed in different regions of the brain, including the interneuron lineage, and that depletion of ARHGEF6 reduces the number of GABAergic neurons in the adult cortex and hippocampus. To better characterize this defect, the authors investigate in Figure 2 whether deletion of ARHGEF6 affects interneuron migration and survival during embryonic development. To do so, ARHGEF6-KO mice were crossed with the GAD67-eGFP reporter line to follow the inhibitory lineage. The authors analyse apoptosis using TUNEL staining, and show that it is significantly increased in the ganglionic eminence of ARHGEF6-KO E14.5 embryos. The authors claim that this is not the case in the cortex. However, the image shown in Figure 2A strongly suggests that staining is increased there as well. Which part of the neocortex is analysed for quantification? This should be clarified.

      (2) In Figure 2F-J, the authors investigate the migration of interneurons by analysing the GAD67-eGFP staining, and clearly show that the migratory abilities of the depleted neurons are reduced. However, the authors do not discuss the fact that, because depletion of ARHGEF6 increases apoptosis, there are fewer neurons available for migration. This is important for the interpretation of the data. This point should be clarified.

      (3) In Supplementary Figure S2, the authors describe the establishment of the ARHGEF6-KO human iPSC line and test the ability of these cells to undergo correct development, especially for the generation of neural progenitor cells. I was wondering why the authors do not present the data of both control and ARHGEF6-KO cells.

      (4) At the molecular level, an explanation of how ARHGEF6 depletion could affect neuronal survival is missing. In addition, as ARHGEF6 is a GEF for RAC1 and Cdc42 amongst other GEFs, I would have expected the authors to test how RAC1 (and Cdc42) activity is affected in ARHGEF6-depleted brains and in ARHGEF6-KO organoids. The measure of phalloidin staining and the anisotropy index are not really meaningful.

      (5) The authors show that ARHGEF6-KO forebrain organoids were markedly smaller compared to their isogenic controls, and their study suggests that ARHGEF6 expression impacts progenitor maintenance and neurogenesis. Although interneurons represent only a minority of the total neuronal population, I was wondering whether ARHGEF6-KO mice present brain morphology defects such as microcephaly.

    1. Reviewer #1 (Public review):

      Wild-type and triple-transgenic (3xTgAD) mouse models of Alzheimer's disease were exposed to a high-fat diet and assigned to one of three interventions: voluntary physical activity, a low-fat diet, or their combination. A high-fat diet significantly increased body weight and induced widespread neuroanatomical changes, with effects modulated by sex and genotype. The combined intervention led to significant weight loss in males of both genotypes. Neuroanatomical analyses revealed that a high-fat diet significantly reduced hippocampal and cerebellar volumes in wild-type mice but had a less pronounced effect on 3xTgAD mice; nevertheless, interventions, particularly the combined approach, increased localized brain volumes in these regions regardless of genotype. Multivariate integration of behavioural and neuroanatomical measures identified a brain pattern linking hippocampal and cerebellar volumes to intervention and behavioural performance. Spatial gene-enrichment analysis of this pattern identified biological processes, including glucose homeostasis, as potential mechanisms underlying intervention effects. Overall, these findings suggest that voluntary physical activity and a low-fat diet can modulate brain structure and behaviour, partially counteracting the effects of a high-fat diet, and potentially recruiting biological processes that may support brain health.

      The authors describe studies of the 3xTg mouse model of Alzheimer's disease (AD). They set out to study the interactions of diet and exercise on three outcomes: weight gain, MRI, and either the novel object recognition or Morris water maze tasks of memory.

      They conclude there are sex and genotype effects on hippocampal volume.

      There are several strengths to the study. First, the authors start with a large number of mice. Once the mice are divided into groups, however, the sample sizes are not always large. It would be good to know whether the group comparisons were sufficiently powered.
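
      As a rough illustration of the kind of power check that could be reported, the sketch below uses the standard normal approximation for a two-sided, two-sample comparison. The effect sizes are hypothetical and are not taken from the manuscript; the function name `n_per_group` is introduced here only for illustration, and an exact t-based calculation would give slightly larger numbers.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size_d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate animals needed per group for a two-sided, two-sample test.

    Normal approximation: n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2,
    where d is the standardized effect size (Cohen's d).
    """
    if effect_size_d <= 0:
        raise ValueError("effect_size_d must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size_d ** 2)


# Hypothetical effect sizes (assumptions, not values from the study).
for d in (0.5, 1.0, 1.5):
    print(f"d = {d}: about {n_per_group(d)} mice per group")
```

      Comparing such numbers with the per-group sample sizes after splitting by sex, genotype, diet, and intervention would show which contrasts are likely to be underpowered.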

      The data are also interesting. Mice were placed on several different diets during the study, which will be of interest to many who question the role of diet in outcomes. They also add exercise as an intervention, and study not only diet but also the combined effect of diet and exercise. This is relevant to those interested in controlling dementia by diet and exercise. Finally, they perform some very interesting analyses to study the data.

      That said, the study also has several limitations. For example, it is quite complex. Mice had a standard diet until 2 months of age, then were switched to either a low-fat or a high-fat diet. Some mice had both a different diet and exercise. MRI was performed at 2, 4, and 6 months, when behavior was tested. A drawback of this design is that no assessment of outcomes relevant to this animal model, such as amyloid-beta or tau phosphorylation, was conducted. Also, they used the novel object recognition task, despite stating in the Discussion that this task does not show impairments until well after 6 months of age. They added exercise, but it is not clear whether the animals used the exercise apparatus equally. Also, the animals were housed "communally", so adding an exercise wheel may have made the cage crowded, adding stress to the study. The diets were not simply low- or high-fat because many constituents besides fat content also changed. Regarding fat, the type of fat also changed between diets. Therefore, the gut microbiome was probably affected differently by factors other than fat intake. There was no measurement of food consumption, so some mice may not have eaten as much of the new diet as they did of the old diet they were used to.

      Regarding the data, only the outcomes of complex analyses are shown. One would first want to see the changes in body weight and perhaps later how it is analyzed in a more complex way. For behavior, one would first want to see outcomes as typically presented, for example learning, recall, and platform test results from the Morris water maze, and discrimination indices for object recognition. At one point, I believe the authors note that some groups did not explore thoroughly, which would make novel object recognition hard to interpret. If there was any difficulty with ambulation, both tasks would be hard to interpret.
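
      For reference, a conventional discrimination index can be computed as in the minimal sketch below. The exclusion threshold for total exploration time (`min_total_s`) is a hypothetical value chosen for illustration, not one reported in the manuscript, and the function name is likewise an assumption.

```python
from typing import Optional


def discrimination_index(novel_s: float, familiar_s: float,
                         min_total_s: float = 20.0) -> Optional[float]:
    """Return (novel - familiar) / (novel + familiar) for one animal.

    Returns None when total exploration is below min_total_s (a hypothetical
    cutoff), because the index is hard to interpret with little exploration.
    """
    if novel_s < 0 or familiar_s < 0:
        raise ValueError("exploration times must be non-negative")
    total = novel_s + familiar_s
    if total < min_total_s:
        return None
    return (novel_s - familiar_s) / total


# Illustrative exploration times in seconds (made-up values).
print(discrimination_index(30.0, 10.0))  # clear preference for the novel object
print(discrimination_index(15.0, 15.0))  # no preference
print(discrimination_index(3.0, 2.0))    # excluded: too little exploration
```

      Reporting the index alongside total exploration time per group would make it possible to judge whether low exploration limits interpretation.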

      Regarding MRI, from what can be seen, structures cannot be distinguished clearly. At least some raw data should be shown to demonstrate this and to determine what the data show. The raw data suggest that some of the larger structures can be distinguished, and we should see the data for these areas, even if all areas can't be assessed. In the end, the authors focus primarily on the hippocampus and discuss the cerebellum, but it seems that changes occur throughout the brain. The choice to focus on the hippocampus and cerebellum needs to be supported.

      To gain further insight, the authors analyze genes across different brain regions using the Allen Brain Atlas. Although this seems reasonable in theory, once one realizes how many genes are shared across diverse brain regions, one wonders how such an analysis was conducted. More understanding of this approach, as well as how it was validated, is important. In the end, the authors conclude that the glucose homeostatic pathways were primarily altered, and one would like to understand whether that is indeed true and whether it is the only set of pathways that were changed.

      This raises another point: what occurs in a normal wild-type mouse on the standard diet during the first 6 months of life? Do the glucose homeostatic pathways change simply due to age? Sex? It may be that, with age, the mice become more sedentary, which could explain such changes. Once that is resolved, what occurs on the standard diet for the 3xTg mice? Perhaps they are more active or more sedentary, regardless of diet or exercise? Thus, the studies end up raising more questions than answers.

      Given so much work has already been done, it seems best to simply reorganize the presentation with raw data first, followed by the analysis. For the second section, the implicit assumptions of the analyses should be very clear so that the analyzed data are understood and believable. Limitations of the assumptions, pooling some groups, etc., need to be clear.

      Figures. In Figure 1, the weekly measurements are not shown. The points are connected, so an unbroken line is shown. Around the line are lighter lines indicating errors, but with all the lines and colours, one does not know what standard errors surround the values for any given group. This makes the data hard to interpret. In later figures, significant differences are indicated with asterisks, but this seems to be done inconsistently.

      In the text, more caution is needed for some assertions. For example, it is not clear that a 2- to 6-month-old mouse is an adolescent. Opinions about the ages of mice that correspond to human life stages have always been debated. Another example is the indication that male mice gain weight differently from females, as if this were an outcome of diet or exercise. A sex difference is expected regardless, because male rodents continue to gain weight in adulthood, whereas females stabilize because estrogen limits appetite. Additionally, females may not show group differences because they are more variable. This can relate to their estrous cycle. If stressed or housed without males nearby, they may not have a regular estrous cycle, which can then affect their outcomes. This may be particularly true for behavior, because they may have been tested during different estrous cycle phases, if they had estrous cycles.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript describes an investigation into the effect of diet and exercise interventions in WT and transgenic (male and female) mice that are exposed to either a high-fat or a low-fat diet. The outcome variables include MRI volume and brain morphology, as well as memory performance. First, this study measured the impact of genotype (WT vs 3xTgAD mice), then examined the impact of a high-fat or low-fat diet in each group, and finally examined the impact of a low-fat diet, exercise, or a combined low-fat diet and exercise intervention. This is an important study as it allows us to better understand how changes to lifestyle can affect neurocognitive function and potentially change a person's AD risk.

      Strengths:

      (1) The study uses a well-controlled longitudinal design, allowing the authors to track how diet and exercise interventions influence brain and behaviour over time.

      (2) The integration of multiple levels of analysis (brain imaging, behaviour, and multivariate modelling) provides a rich and comprehensive assessment of intervention effects.

      (3) The inclusion of both genotype and sex as key variables strengthens the relevance and interpretability of the findings, given known differences in risk and response across groups.

      Weaknesses:

      There are a lot of analyses in this paper, and I had a little bit of trouble distilling the major take-home messages. For example, I was left wondering:

      (1) If the effect of genotype and the effect of the high-fat diet were consistent in the current study compared to the authors' previous work (e.g. Rollins et al., 2019). A more direct report on the consistency of these findings (maybe even an overlap map, if possible) would benefit the reader.

      (2) How consistent/different are the volumetric and morphometric (DBM) results from each other? Especially in the regions of interest (hippocampus and cerebellum), are increases in volumes always related to "expansion" of a given region using DBM? Some of the similarities are reported in the results, but for transparency, a side-by-side table comparing the results across techniques for each effect of interest might provide more clarity.

      (3) I was interested in the Partial Least Squares approach that the authors used to investigate how patterns of brain measures relate to the behavioral variables. Because they are presented mostly in the supplement (except for Figure 6E), it's difficult to map the LVs described onto the univariate contrasts in Figures 2-5. In general, greater clarity is needed regarding how the PLS-derived latent variables relate to the univariate findings, and whether the emphasis on LV3 reflects a principled selection or post hoc interpretation.

      (4) If I understand the results correctly, there were only modest differences in behavior reported, and the patterns were somewhat inconsistent across sex and genotype. In fact, the authors report that the high-fat diet alone did not impair memory on the Morris Water maze (line 323). The discrepancy between robust neuroanatomical effects and relatively modest behavioural changes raises important questions about the functional significance of the observed structural alterations.

      (5) On line 507, the authors state, "Notably, 3xTgAD mice already show smaller brain volumes at baseline, which may constrain the detectable impact of the diet." Is this true for the entire brain or just the hippocampus and cerebellum? Would a global reduction in brain volume in the 3xTgAD model affect the interpretation of the intervention effects?

    3. Reviewer #3 (Public review):

      Summary:

      The authors sought to determine the individual and combined effects of exercise and low-fat diet consumption on regional brain volume and cognitive function in triple-transgenic Alzheimer's disease mice and wild-type controls.

      Strengths:

      (1) A strength of this study is its longitudinal design, which captures regional changes in brain volume across the interventions tested.

      (2) Its comprehensive design includes 10 groups and is well-powered to isolate genotype-, sex-, diet- and exercise-related effects (and interactions).

      (3) The analyses of volumetric and voxel-based measures are comprehensive.

      Weaknesses:

      (1) Use of automated tracking for NOR data reduces confidence in the behavioural data.

      (2) No measures of Aβ or tau pathology appear to have been performed.

      (3) Mice from the critical 'combined' intervention groups are not included in the PLS regression model that integrates behavioural and brain data.

      (4) Analyses of behavioural data include a large number of variables without adequate justification.

    1. Reviewer #1 (Public review):

      Summary:

      This study presents an important tool for the study of MR1 antigen binding, opening new possibilities for the use of cutting-edge techniques. The evidence supporting the claims of the authors is solid, although including some functional experiments using primary T cells would also provide a more complete physiological evaluation. The work will be of interest to T cell immunologists in general, and especially to those studying unconventional T cells.

      Strengths:

      In this study, the authors developed a single-chain MR1-derived protein by exchanging the α3 domain and β2-microglobulin for a helical stabilizing domain that they had previously developed. The aim was to generate a more compact structure that would still fold properly, without the risk of losing β2-microglobulin. This overall more robust structure would facilitate ligand exploration using various cutting-edge biophysical techniques.

      The authors successfully demonstrated that their construct folds similarly to native MR1 and retains the ability to bind MAIT TCR in solution, as shown by cryo-EM experiments. Its melting temperature was equivalent to that of the native protein. Importantly, the construct enables the use of differential scanning fluorometry and transverse relaxation-optimized spectroscopy, which represent the main strengths of this work. These approaches should greatly facilitate the screening of additional unknown ligands and enable interaction mapping.

      Weaknesses:

      One possible area for improvement would be to extend the validation to additional known ligands, particularly weaker binders. Furthermore, although the cryo-EM data are highly convincing, including either MAIT cell staining or MAIT activation assays with the generated construct would provide stronger functional validation of its equivalence to the wild-type protein with respect to ligand-binding properties.

      Overall, this work is of great interest to the field, as several groups worldwide are seeking to identify endogenous/tumour-derived MR1 ligands. In addition, some pathogens lacking the capacity to produce 5-OP-RU have been shown to activate MAIT cells, raising the possibility that unknown pathogen-derived ligands may also exist.

    2. Reviewer #2 (Public review):

      Summary:

      The authors develop a miniaturized MR1 construct (SMART-MR1) in which the α1/α2 platform is stabilized by a synthetic domain, and show that it can bind ligands, engage a cognate TCR, and recapitulate native-like recognition by cryo-EM.

      Strengths:

      The work is well-written, technically strong and carefully executed. The authors combine biochemical, biophysical and structural approaches, including ITC, NMR and cryo-EM, to show that SMART-MR1 behaves in a manner closely resembling native MR1. The reduction in size and the demonstration of solution NMR are clear practical advantages for certain types of mechanistic studies.

      Weaknesses:

      The main limitation is that the manuscript does not clearly establish a practical advantage over existing MR1 formats, such as single-chain MR1-β2M or previously described stabilized constructs. The comparison is largely framed against native MR1, which risks overstating the problem, and on the basis of the data presented, it is unlikely that other researchers will adopt this system. In addition, the choice of the A-F7 TCR as a validation reagent may overestimate the generality of the approach, as this receptor is known to exhibit relatively broad ligand tolerance, including recognition of MR1 presenting vitamin B6 metabolites (PDB 9CGR) and structurally diverse synthetic ligands. The extent to which SMART-MR1 supports recognition by a broader range of MR1-restricted TCRs is not addressed.

    3. Reviewer #3 (Public review):

      Summary:

      This manuscript describes the engineering, production and validation of an MR1 variant with enhanced suitability for screening of ligands and biophysical and structural analysis. The authors utilize a previous advance from their laboratory on a classical MHC (HLA-A2) whereby the α3 and β2m domains are replaced by a helical stabilizing domain.

      Strengths:

      This variant has a smaller molecular weight than the native MR1, can be produced easily through refolding and is thus much more suitable for NMR analysis. The authors provide data demonstrating that many of the parameters typically evaluated in protein biochemistry/biophysics are similar to reported values between this engineered variant and the wild-type protein. Overall, this is a significant advance to the MR1 field and more broadly to MR1 relevance in immunology and cancer biology, as this will accelerate high-throughput screening and discovery of disease-relevant ligands for MR1, which have been overshadowed by the misguided fixation on 5-OP-RU.

      Weaknesses:

      Minor concerns about the lack of comparison with the native MR1 extracellular domain construct in the validation of this engineered construct.

    1. Reviewer #1 (Public review):

      Summary:

      P. Izquierdo et al. investigated the genetic determinism of various traits of interest in switchgrass using large-scale genomic and transcriptomic data. More specifically, they worked on a diversity panel comprising 426 genotypes evaluated in common-garden experiments at two locations (Michigan and Texas). The phenotypic and genomic data were already published. In this work, they produced transcriptomic data for each of the 426 genotypes at each site, and they carried out phenotype predictions using genomic and transcriptomic data separately or together. Although genomic and transcriptomic data were moderately correlated at each location, the two omic layers appeared to be complementary for phenotype prediction. To further exploit the fact that they have data across two locations, they computed differences for phenotypes and transcripts between locations as indicators of trait and transcript plasticity, respectively. They built predictive models of trait plasticity using genomic information and transcript plasticity, which proved to be quite accurate for traits affected by GxE. Finally, they made use of SHAP values from predictive models of flowering time and biomass at each location, as well as for their plasticity, to gain insight into their genetic determinism. These SHAP values provide the importance of the predictive features (SNP and/or transcripts) for trait prediction. This allowed them to confirm some candidate genes and to propose new candidates for both traits.

      Strengths:

      I found this study interesting and rich. I think the sample size (426 genotypes) is large enough to support the findings. The use of a modern machine-learning approach (XGBoost) together with SHAP indices to find interesting features and get insights into the biological mechanisms underlying flowering time and biomass production is quite original. The methodology employed is globally sound. I also like the fact that the authors accounted implicitly for the population structure by providing a baseline prediction using the first 5 PCs.

      Weaknesses:

      While the methodology is globally sound, I sometimes had difficulties following exactly what was done. This is partly due to the fact that the authors used 2 omics (SNPs and transcripts) to predict phenotypes, and sometimes, in the results, it is not clear which of the 2 is the focus. This was especially the case for the importance of the features and the interpretability of the models, where I found it sometimes hard to tell whether the analysis was done on SNPs or transcripts.

      Also, regarding the methodology, I did not understand why the authors needed to perform a feature selection approach. Maybe it was required to perform the interaction analysis, which could not be deployed on all the features? But regarding the importance of the features, I do not get the added value of the selection over the direct use of SHAP indices when using all features. Maybe this is because I am not a specialist in this kind of approach, but maybe the authors could add more details to explain the rationale behind the feature selection.

    2. Reviewer #2 (Public review):

      Summary:

      The authors aimed to evaluate whether integrating genomic (SNP) and transcriptomic information with machine learning can improve phenotypic prediction of polygenic traits across environments. The manuscript explored not only the predictability across models and predictor feature sets, but also attempted to identify meaningful genes and interactions underlying trait variation.

      Strengths:

      The main strength of the manuscript is its integration of SNP, transcriptomic, and phenotype datasets for 426 switchgrass genotypes grown in Texas and Michigan. It provides a systematic comparison of predictor types (SNP versus transcriptomic abundance) and model strategies to integrate them.

      Weaknesses:

      (1) Experimental Design

      The experimental design raises several concerns that should be clarified before strong biological conclusions are drawn from the transcriptomic analyses.

      First, the transcriptomic sampling is not well aligned with the developmental stages most relevant to the phenotypes being modeled. Leaf tissue was collected at a single time point in each environment, whereas traits such as flowering time, biomass, tiller count, and panicle height arise from developmental processes occurring over extended and potentially distinct temporal windows. Consequently, the measured expression profiles are likely to reflect physiological states specific to the sampling dates (May 5-6 in Texas and June 22-24 in Michigan) rather than the regulatory processes underlying the target phenotypes.

      Second, the phrase "haphazardly randomized" is questionable for a field experiment. It is unclear whether the design included formal randomization, blocking, row/column structure, or spatial correction. Without explicit accounting for spatial field heterogeneity, environmental variation within sites may confound genotype and transcriptomic effects.

      Third, the Methods do not clearly describe biological replication for RNA-seq. If each genotype-by-environment combination were represented by a single transcriptomic sample, then within-genotype expression variance cannot be estimated. This is important because transcript abundance is highly sensitive to microenvironment, sampling time, tissue status, developmental stage, and technical variation. The absence of replication significantly weakens confidence in gene-level feature importance and gene-gene interaction claims.

      Fourth, the analysis of expression differences across environments is based on a simple subtraction (TX - MI) followed by correlation with genetic similarity. This approach is not standard in transcriptomic analysis and does not account for variability, replication, or statistical uncertainty. Conventional methods for assessing differential expression and genotype-by-environment interactions rely on model-based frameworks that explicitly estimate variance components and test for interaction effects. Without such modeling, the observed expression differences may reflect noise or confounding factors rather than genotype-driven responses.
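
      To illustrate why unreplicated differences are hard to interpret, the simulation sketch below generates expression values with no true genotype-by-environment interaction: every genotype receives the same environmental shift. All parameter values are hypothetical. With one sample per genotype and environment, the TX - MI differences still vary across genotypes with variance of about twice the noise variance, and that spread shrinks only when replicates are averaged.

```python
import random
from statistics import mean, variance

random.seed(1)

N_GENOTYPES = 20000  # large only so that the simulated variances are stable
NOISE_SD = 1.0       # hypothetical sample-to-sample noise within a genotype
ENV_SHIFT = 0.5      # identical shift for every genotype, i.e. no true GxE


def observed_difference(n_reps: int) -> float:
    """TX - MI difference of replicate means for one genotype without GxE."""
    genotype_effect = random.gauss(0.0, 2.0)
    tx = mean([genotype_effect + ENV_SHIFT + random.gauss(0.0, NOISE_SD)
               for _ in range(n_reps)])
    mi = mean([genotype_effect + random.gauss(0.0, NOISE_SD)
               for _ in range(n_reps)])
    return tx - mi


# Theory: Var(TX - MI) = 2 * NOISE_SD**2 / n_reps when there is no GxE.
var_single = variance([observed_difference(1) for _ in range(N_GENOTYPES)])
var_triplicate = variance([observed_difference(3) for _ in range(N_GENOTYPES)])
print(f"variance of differences, 1 replicate:  {var_single:.2f}")
print(f"variance of differences, 3 replicates: {var_triplicate:.2f}")
```

      In this toy setting, all between-genotype variation in the difference is noise, which is the scenario a variance-component model with replication is designed to detect.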

      (2) SHAP contribution values

      Although SHAP is a well-established framework for decomposing model predictions into feature-level contributions, its use in this manuscript raises several concerns regarding interpretation, statistical validity, and biological inference.

      First, SHAP values quantify the contribution of features within the fitted model, conditional on the joint distribution of inputs and the model structure. They do not represent causal effects or direct biological importance. Their scale also depends on the model output: SHAP values are often expressed in log-odds for classification models, whereas for a regression model they are in the units of the predicted trait. Without a fair evaluation of model fit, SHAP values should be interpreted cautiously, because a feature can show very high SHAP values even when the model fits poorly.

      In genomic data, where features are highly correlated due to linkage disequilibrium and co-expression, SHAP values can distribute contribution values across correlated variables in ways that are not uniquely identifiable. As a result, features highlighted as "important" may reflect correlation structure rather than true functional relevance.
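
      The non-identifiability can be seen in a toy case, sketched below under simplifying assumptions: two perfectly correlated features (for example, two SNPs in complete linkage) and linear models, for which attributions under the feature-independence assumption reduce to phi_i = w_i * (x_i - mean(x_i)). The data and model weights are invented for illustration and do not come from the manuscript, which uses tree-based models.

```python
from statistics import mean

# Toy data: feature 2 is an exact copy of feature 1.
x1 = [0.0, 1.0, 2.0, 1.0, 0.0, 2.0]
x2 = list(x1)
rows = list(zip(x1, x2))
baseline = (mean(x1), mean(x2))


def predict(weights, row):
    """Prediction of a linear model without intercept."""
    return sum(w * v for w, v in zip(weights, row))


def linear_shap(weights, row):
    """Per-feature attribution for a linear model, assuming independent
    features: phi_i = w_i * (x_i - mean(x_i))."""
    return [w * (v - b) for w, v, b in zip(weights, row, baseline)]


# Three hypothetical models that fit these data equally well.
models = {
    "all weight on feature 1": (1.0, 0.0),
    "all weight on feature 2": (0.0, 1.0),
    "weight split evenly": (0.5, 0.5),
}

for name, weights in models.items():
    preds = [predict(weights, r) for r in rows]
    mean_abs = [mean(abs(linear_shap(weights, r)[i]) for r in rows)
                for i in range(2)]
    print(f"{name}: predictions = {preds}, mean |attribution| = {mean_abs}")
```

      The three models make identical predictions on every row, yet they assign the importance entirely to one feature, entirely to the other, or split it equally, so the ranking of correlated features reflects how the model happened to be fitted.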

      This correlation structure can be exacerbated in this manuscript by the use of TPM-normalized transcript abundances as predictor variables without biological replicates. Even assuming that the estimates of transcript abundance are robust, TPM values are compositional: the constant-sum constraint creates dependencies among all genes and induces negative correlations. This issue is particularly relevant for the interpretation of gene importance and interaction effects, where correlated predictors can lead to unstable and non-unique attributions. The biological interpretation of transcript-based features therefore remains uncertain.
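
      The effect of the constant-sum constraint can be reproduced with a small simulation, sketched below under simplifying assumptions (three genes of equal length, independent log-normal abundances, all values invented). After each sample is rescaled to sum to one million, the total is constant, so the gene variances and the covariances between genes must cancel, which forces the covariances to be negative on balance even though the raw abundances were generated independently.

```python
import random
from itertools import combinations
from statistics import mean

random.seed(0)
N_SAMPLES, N_GENES = 200, 3

# Independent, strictly positive raw abundances (equal gene lengths assumed).
raw = [[random.lognormvariate(0.0, 0.5) for _ in range(N_GENES)]
       for _ in range(N_SAMPLES)]

# TPM-style closure: every sample is rescaled to sum to one million.
tpm = [[1e6 * value / sum(sample) for value in sample] for sample in raw]


def covariance(xs, ys):
    """Sample covariance between two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def pairwise_cov_sum(matrix):
    """Sum of covariances over all distinct gene pairs (columns)."""
    columns = list(zip(*matrix))
    return sum(covariance(a, b) for a, b in combinations(columns, 2))


print(f"raw abundances, sum of pairwise covariances: {pairwise_cov_sum(raw):.4f}")
print(f"TPM values, sum of pairwise covariances:     {pairwise_cov_sum(tpm):.1f}")
```

      With thousands of genes the induced correlation per gene pair is small, but it is systematic, which is why log-ratio transformations are often recommended for compositional data before interpretation.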

      (3) Result interpretation

      For example, the statement on page 11 that "plasticity SNP- and transcriptomic-based models generally outperformed single-environment models for traits with low cross-environment correlation, such as green-up (Fig. 2c, r = -0.13, p < 8.3 × 10⁻³) and tiller count (Fig. 2f, r = -0.08, p = 0.1) (Supplementary Fig. S1)." is too broad. For green-up, the Diff model appears much better than MI, but not clearly better than TX.

      Also on page 11, the statement that "...Diffexp was more predictive than SNPs for trait plasticity in biomass, flowering time, and tiller count..." holds true only for biomass, not for flowering time or tiller count.

      The claim of "complementary information" between SNP and transcriptomic models on page 12 is stronger than what Figure 2 supports. Figure 2 shows different predictive performance, but it does not by itself demonstrate complementarity. Establishing complementarity requires evidence that combining SNP+T improves prediction consistently or captures distinct, non-overlapping signals. Yet the preceding section says SNP+T outperformed either single data type in only 15% of cases, with modest gains. These two statements are difficult to reconcile. Also, there is no G+T model in Figure 2; the combined model is labelled SNP+T.

    1. Reviewer #1 (Public review):

      Summary:

      Wang Liao and colleagues aim to provide a comprehensive synthesis of zebrafish circadian research, with particular emphasis on the decentralized photoreceptive architecture that distinguishes teleosts from mammals, and to outline future research directions leveraging emerging technologies for translational applications. The authors frame zebrafish as occupying a "crucial evolutionary and experimental niche" and argue that the model system is uniquely suited to address open questions in chronobiology.

      Strengths:

      The review is broad in scope and up to date in its citation of recent primary literature. The coverage of physiological outputs - spanning cardiovascular rhythmicity, hepatic metabolism, immune function, reproduction, and gut homeostasis - is more comprehensive than many existing reviews in this area, and researchers seeking an entry point into any of these subfields will find a useful orientation. The figures are well-designed and effectively summarise complex regulatory relationships. The section on immune rhythmicity is a particular strength, providing mechanistic detail on how specific clock components (Clock1a, Per1b, Per2, Cry1a) differentially regulate neutrophil behaviour, bacterial killing, and cytokine expression; this level of molecular specificity distinguishes it from comparable sections in the review. The brief discussion of non-canonical clock gene functions (CLOCK in neuronal connectivity, BMAL1 in stem cell state, vascular calcification) raises genuinely interesting points that are underexplored in the field and might deserve more prominence.

      The future perspectives section makes a conceptually interesting move in suggesting that the zebrafish decentralized architecture could reframe a central question in chronobiology - from how a master clock imposes order on passive peripheral oscillators, to how semi-autonomous oscillators achieve coherence. This is the most original conceptual contribution in the manuscript, and it would benefit from much further development.

      Weaknesses:

      The core limitation of this review is that it functions primarily as an annotated bibliography rather than a critical synthesis. Section after section follows the same pattern: a physiological system is introduced, several findings from recent papers are described in sequence, and the section ends. Missing throughout is an evaluative voice - where does the field agree, where does it disagree, which findings have been replicated versus remain preliminary, and which conceptual questions are genuinely unresolved versus merely unstudied? Readers with expertise in the field will find little that reframes their understanding; readers new to the field will receive information but not the interpretive scaffolding needed to assess its significance.

      The framing of zebrafish as occupying a "crucial evolutionary and experimental niche" is asserted but not substantiated. The experimental advantages of zebrafish - optical transparency, external development, genetic tractability - are real, but they apply primarily to larval stages, typically the first two weeks of development. The review does not adequately address whether the key features it highlights, particularly peripheral photosensitivity and autonomous peripheral oscillators, have been demonstrated in adult animals, where optical transparency is lost. Many of the physiological findings described (sleep-wake cycles, cardiovascular function, reproduction, and immune function) are most relevant in adult or juvenile fish, yet the mechanistic underpinnings often come from larval studies. Whether the mechanisms generalise across developmental stages is not discussed, and this is an important gap that the review could acknowledge explicitly.

      The claim that zebrafish bridge invertebrate and mammalian models is a conventional framing that appears in most zebrafish review articles; its repetition here adds little. More interesting - and underexplored - is the comparative question of how the decentralised clock architecture of teleosts compares with that of other non-mammalian vertebrates, or indeed with invertebrate systems such as Drosophila, where peripheral tissue clocks and non-visual photoreception have also been studied. The review does not engage with this comparative dimension, which would be the natural intellectual context for the claims being made.

      The future perspectives section identifies several promising directions - optogenetic circuit mapping, whole-body longitudinal imaging, inter-organ communication, network modeling - but these are described at a high level of generality. Most are not specific to the questions raised by the zebrafish decentralized clock architecture; they would appear in any forward-looking review of circadian biology. The one conceptually distinctive idea - that zebrafish could be used to ask how distributed oscillators achieve coordinated coherence without hierarchical control - is identified but not developed into concrete experimental questions or testable predictions. The discussion of non-canonical clock gene functions in the Future Perspectives section would benefit from being more directly connected to what zebrafish specifically can offer: given that teleost genome duplication has produced additional paralogues of clock genes, there is a concrete opportunity to dissect canonical from non-canonical functions through comparative analysis of paralogues with diverged expression patterns. This point is hinted at but not made explicitly.

      Appraisal of conclusions:

      The conclusions are broadly consistent with the evidence cited, and the authors are appropriately cautious in noting that many signalling cascades and inter-tissue communication mechanisms remain incompletely characterised. The conclusion that zebrafish represents a valuable and underexploited model for circadian-disease translational research is well-supported. However, the review would be significantly strengthened if the authors distinguished more clearly between what is firmly established, what is supported by preliminary or single-study evidence, and what remains genuinely speculative.

      Likely impact and utility:

      This review will be useful as an orientation document for researchers new to zebrafish circadian biology, and the comprehensive treatment of physiological outputs across organ systems is a genuine service to the field. Its impact as an intellectual contribution is limited by the descriptive approach and the absence of original synthesis or conceptual reframing. The most interesting ideas in the manuscript - the reframing of the central/peripheral clock hierarchy question, and the potential of clock gene paralogues for probing non-canonical functions - could be further developed and, if pursued, could form the basis of a more distinctive and impactful contribution.

    2. Reviewer #2 (Public review):

      Summary:

      This review is valuable in principle because circadian rhythms in zebrafish are unexplored. There are a number of significant weaknesses that should be addressed for it to have an impact. First, while the review covers a broad range of topics in chronobiology, it does not put them in context. Placing zebrafish work in the context of other model organisms that are better understood and other fish species would broaden the appeal. The review could also expand to a discussion of sleep, where the understanding in zebrafish is much more advanced. Critically, providing a novel framework, identifying new areas of opportunity and limitations of the system would expand the interest to non-zebrafish research groups. In addition, there are a number of misstatements/mis-citations that are critical to correct. Therefore, I find this review potentially impactful, but its current form is likely to limit its impact.

      Strengths:

      Focusing on decentralized photosensing is a strength because it is relatively unique to zebrafish.

      The breadth of discussion in zebrafish is a strength.

      Weaknesses:

      It might be helpful to reorganize the review with an introduction on what is known in other better studied systems to be highly conserved, then to focus in on the components of zebrafish that are discussed here.

      A weakness is the lack of integration with other model organisms and other fish systems. Therefore, the narrow focus on zebrafish is unlikely to appeal to broader audiences.

      It's surprising that there is not more discussion of sleep, which has been studied in detail, and its relationship to the clock.

      Discussions of limitations of the model, including adult vs larval analysis and challenges performing long-term behavioral analysis in fish, would be valuable.

    3. Reviewer #3 (Public review):

      Summary:

      Over the past 3 or 4 decades, our understanding of the molecular mechanism underlying the circadian clock has increased substantially. This is in large part due to successful forward and reverse genetics approaches applied to a broad range of genetic model systems, notably Drosophila, Neurospora, mouse, Arabidopsis and cyanobacteria. Although the clock components in these species are diverse, the basic operating principles are highly conserved, allowing us to build a general view of clock mechanisms. Looking forward, there are still many unanswered questions regarding how clocks are organized at the systems level and, in turn, how they are coupled to key aspects of physiology. Each model species has its own set of advantages and disadvantages for tackling particular questions. As this timely review aims to illustrate, the zebrafish has become a particularly valuable model for exploring circadian clock biology. This is in part due to its technical advantages, accessibility of early developmental stages and its directly light-entrainable peripheral clocks. This provides unparalleled opportunities for studying the circadian clock hierarchy and its links with physiology.

      Strengths:

      This review does a good job of integrating the many lines of circadian clock research where the zebrafish has been used as a model and provides an overview of many future challenges it is well-suited to tackle.

      Weaknesses:

      There are citation errors, as well as inaccurate and misleading statements that must be remedied in a revised version.

    1. Reviewer #1 (Public review):

      Sheidaei and colleagues report a novel and potentially important role for an early mitotic actomyosin-based mechanism, PANEM contraction, in promoting timely congression of chromosomes located at the nuclear periphery, particularly those in polar positions. The manuscript will interest researchers studying cell division, cytoskeletal dynamics, and motor proteins. Although some data overlap with the group's prior work, the authors extend those findings by optimizing key perturbations and performing more detailed analyses of chromosome movements, which together provide a clearer mechanistic explanation. The study also builds naturally on recent ideas from other groups about how chromosome positioning influences both early and later mitotic movements.

      Comments on revised version:

      In the revised manuscript, organizational issues have been largely resolved. In addition, the inclusion of new experiments in additional cell lines, along with an expanded discussion that places actomyosin contractility in the broader conceptual context of other mechanisms governing chromosome movement, has significantly strengthened the manuscript.

    2. Reviewer #3 (Public review):

      Sheidaei et al. report how chromosomes are favourably positioned to facilitate kinetochore-microtubule interactions during early mitosis. Studying kinetochore capture during early prophase is extremely difficult due to kinetochore crowding, but the team has taken up the challenge by classifying types of kinetochore movements, carefully marking kinetochore positions in early mitosis, and linking these to map their fate/next positions over time. The work is an excellent addition to the chromosome segregation field, as most of the literature has thus far focused on tracking kinetochores at slightly later stages of mitosis. The authors show that PANEM facilitates chromosome positioning toward the interior of the newly forming spindle, which in turn promotes chromosome congression. In the absence of PANEM, chromosomes end up in unfavourable locations and fail to form proper kinetochore-microtubule interactions. The work highlights the perinuclear actomyosin network in early mitosis (PANEM) as a key spatial and temporal element of chromosome congression, a step that precedes the segregation process.

      Comments on revised version:

      The authors' revisions have brought clarity to the description of movements in many of the figures. The manuscript ties a fundamental process to differences in cancer cell lines.

      The work extends their published discovery that an actomyosin network forms on the cytoplasmic side of the nuclear envelope during prophase. The current manuscript explains how this network facilitates chromosome capture and congression by tracking the motions of individual kinetochores during early mitosis. The findings are broadly useful for the cell division and cytoskeletal fields.

    1. Reviewer #1 (Public review):

      Summary:

      This paper tries to address an important outstanding issue, which is the evolutionary origin of the SLC25 family of mitochondrial carrier proteins, which are common to all eukaryotic life, with few exceptions. The authors have carried out phylogenetic analyses and DALI searches of AlphaFold databases of bacterial and archaeal membrane proteins. They identify two bacterial proteins, CysZ and YihY, and they propose that they are progenitors of SLC25 family members. Whilst the paper addresses an interesting topic, the conclusions are not supported by the data and are not presented in an unbiased manner, as they highlight only features that provide some tentative support for the hypothesis. They do not address the large number of sequence and structural properties that refute the hypothesis, such as the asymmetric vs three-fold pseudo-symmetric features, hexamer vs monomer, and the complete lack of any conserved motifs with similar functions. Any resemblances between CysZ/YihY and mitochondrial carriers thus seem to be superficial and could well be coincidental, as they represent generic properties of membrane proteins rather than specific ones indicative of an evolutionary relationship.

      Strengths:

      This paper explores the evolutionary origins of the SLC25 family of mitochondrial carrier proteins, which are found across nearly all eukaryotic organisms. They were likely to be present in the last common ancestor of all eukaryotes, around two billion years ago. The question is whether they are of bacterial, archaeal or eukaryotic origin. The authors propose that two bacterial proteins, CysZ and YihY, may represent ancestral forms of these carriers, based on structural comparisons of models, a sequence motif, and phylogenetic analyses. While the research addresses an important and longstanding question, the presented evidence does not convincingly support their hypothesis.

      Weaknesses:

      A central concern is the reliance on structural similarity searches using predicted protein models, since these models are often built using known protein structures as templates, and thus these searches may produce misleading matches. The reported similarities between CysZ, YihY, and mitochondrial carriers are weak and fall within ranges expected for unrelated membrane proteins, which commonly share general structural features, such as helical bundles. Quantitative measures of similarity are low and do not support a shared evolutionary origin. The case for YihY is extremely poor as neither structure nor sequence features support the claim. Importantly, the opening of YihY is towards the membrane rather than the water phase, as is the case for carriers, indicating that it has a very different structure and function. The case for CysZ is somewhat better, as it is a helical bundle with two short helices somewhat resembling the matrix helices of mitochondrial carriers, and a short sequence PXDXXK that is part of one of the known sequence motifs of mitochondrial carriers, but this is where the similarities end.

      Mitochondrial carriers have a distinctive threefold pseudo-symmetrical structure and a highly complex transport mechanism involving six structural elements. This paper's hypothesis does not explain how such a high level of threefold pseudo-symmetry could have evolved from entirely asymmetric proteins. To complicate matters further, CysZ is not functional as a monomer but forms a functional hexamer, which also explains why it has two half helices rather than two transmembrane helices. Thus, the hypothesis is that CysZ, which is an asymmetric protomer of a functional hexamer, has evolved into a three-fold pseudo-symmetric protein, which is functional as a monomer. A more convincing explanation is that the threefold pseudo-symmetrical structure arose from gene triplication and fusions, with later mutations introducing asymmetry to support diverse substrate binding. In support of this notion, mitochondrial carriers transporting large molecules, such as ATP, show more asymmetry, whereas those for small molecules remain nearly symmetrical. In general, the vast majority of transport proteins arose from gene duplications and fusions of the domains.

      Although mitochondrial carriers have a sequence motif similar to that found in CysZ (PXDXXK), their roles are very different. In mitochondrial carriers, this motif is located roughly in the middle of transmembrane helices H1, H3, and H5, where proline creates a pronounced kink, bringing the charged residues inward to form a salt-bridge network in the central water-filled cavity. The formation and disruption of this network is essential for the transport mechanism when switching between inward- and outward-open states. In CysZ, the motif is found at the end of a helix and in the following loop at the end of the transporter, with residues pointing outward toward the water phase. These residues are typical of membrane-water interface regions, where proline acts as a helix breaker and charged residues interact with the water phase. Thus, this motif in CysZ does not match the position or function seen in mitochondrial carriers, and its presence is likely to be coincidental, because these residues often occur in the water-membrane region. Importantly, none of the other important conserved three-fold symmetrical motifs of mitochondrial carriers is found in these bacterial proteins, such as the cytoplasmic network [YF][DE]xx[RK], cardiolipin binding sites, ER-links, and sequences of small amino acids, which are critical for their dynamic mechanism.
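
      To illustrate why a short motif match on its own is weak evidence, the sketch below scans a sequence for the carrier-type pattern PX[DE]XX[KR] written as a regular expression. It is a minimal example under stated assumptions: the function name `find_motif` and the toy sequence are hypothetical and not taken from the manuscript, and a hit reports only where the residues occur, not their structural context.

```python
import re

# PX[DE]XX[KR] as a regular expression; "." stands for any residue (X).
MATRIX_MOTIF = re.compile(r"P.[DE]..[KR]")


def find_motif(sequence, pattern=MATRIX_MOTIF):
    """Return (start_index, matched_text) for every match, including overlapping ones."""
    # A lookahead with a capture group reports overlapping occurrences as well.
    lookahead = re.compile(f"(?=({pattern.pattern}))")
    return [(m.start(), m.group(1)) for m in lookahead.finditer(sequence)]


# Hypothetical toy sequence (an assumption for illustration only):
# arbitrary glycine flanks around a PXDXXK-type stretch.
toy = "GGG" + "QYADYPADNHK" + "GGG"
hits = find_motif(toy)
for start, text in hits:
    print(start, text)
```

      A six-residue pattern with three unconstrained positions will match by chance in many unrelated sequences, so position within the helix and three-fold repetition matter more than the match itself.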

      The phylogenetic relationship is also overstated, as there is no sequence similarity between these proteins other than that occurring because of similar biophysical properties, such as transmembrane helices. The authors suggest that a specific mitochondrial carrier represents the ancestral member of the family, but this conclusion appears to be inferred rather than rigorously demonstrated. Key aspects, such as tree rooting and taxon sampling, are not sufficiently addressed, weakening confidence in the evolutionary claims. Further, the selection of only a few bacterial and archaeal proteomes for analysis limits the study's scope. Broader searches would be necessary to support claims about conservation and ancestry. Independent sequence searches indicate that CysZ and YihY are not widely conserved in the bacterial groups most closely related to mitochondria, undermining the argument that they are plausible ancestors.

      Overall, the presented similarities are superficial and can be explained by general features of membrane proteins rather than by specific adaptations to function. The hypothesis that CysZ and YihY are evolutionary precursors of mitochondrial carriers is not supported by the presented data.

    2. Reviewer #2 (Public review):

      Summary:

      Here, the authors performed a phylogenetic analysis of mitochondrial ATP/ADP carrier (AAC) proteins. They also performed a structure-based screen for remote homologs, seeking to reveal their evolutionary origins. The authors claim that AACs are found at the root of their family tree, and through a structure-based homolog search protocol, identify putative prokaryotic homologs.

      The proposed evolutionary history of AACs is bold and complicated, but the phylogenetic methodology and the way in which the tree is interpreted are incomplete and unconvincing. Further, the structure-based search strategy uses very relaxed cutoffs for fold similarity, which may be fine, but it does not clearly justify this decision. This is potentially very problematic, as I did not find the quantitative or qualitative assessments of fold similarity particularly compelling.

      In summary, the authors have presented a bold and extremely interesting hypothesis for the evolution of these proteins, but there is insufficient support for their claims.

      Strengths:

      (1) The authors are presenting a very interesting hypothesis about the birth of these proteins, including that they may have undergone a radical rearrangement in their sequence at some point in evolution.

      (2) The paper makes use of appropriate tools for structure-based homolog identification.

      (3) Identification of a conserved sequence motif in these twilight zone proteins would be a rare and interesting occurrence, and could be consistent with their proposed homology.

      Weaknesses:

      (1) The phylogenetic analysis and its interpretations are incomplete. The authors regularly refer to the root of the tree, and its placement is given central importance. However, the methodology by which they selected the root is unexplained. This is notable, as the proposed root is curious and quite confusing. It implies that (at least) yeast and Paramecium AACs are independently paraphyletic. While certainly not impossible, this evokes quite a complicated evolutionary history. The taxonomy of this gene family, when rooted this way, does not seem to echo the phylogeny of species, suggesting an extremely complex history of duplication/loss and horizontal gene transfer, none of which the authors discuss in detail. Perhaps more clearly and specifically: I'm very surprised by the branching order at the root, where there are three independent branches of fungal proteins, followed by the excavate proteins in a monophyletic clade, followed by several independent branches of the Paramecium proteins. I very much expect incomplete lineage sorting at this evolutionary depth, but this seems extreme to the point that I question if it is accurately placed. More directly: this very much looks like an unrooted tree, presented radially.

      (2) The Bayesian and ML trees seem quite incongruent, but this is not discussed. In fact, the text states that they "exhibit a similar tree topology." This is admittedly very difficult to assess without very carefully going over the tree, branch by branch, but there are nevertheless differences, the most obvious being paraphyly vs monophyly of taxon-specific AAC clades. Do the authors have any comments on this, and can they show some sort of consensus tree? How does this affect their interpretation?

      (3) Presenting branch support as similarly-sized points makes it nearly impossible to actually judge the strength of support.

      (4) The use of structure for remote homology detection is becoming increasingly popular, and in my opinion, is very powerful. But it is still much too early to be taken for granted. The methodology must be justified. Most importantly, the authors have not clearly described why they chose these quantitative cutoffs (I'm mostly thinking of the Dali Z-score cutoff, which here seems very low for a transmembrane protein of this size, as the Z-score is very dependent on alignment length). The authors reference categories defined by tool authors, but why a Z-score of 3, specifically? The same goes for TM scores. There are not yet any defined best practices, to my knowledge, so the authors should independently validate/justify their approach in some way and/or cite and discuss relevant literature (there have been a growing number of these screens using similar approaches in recent years).

      (5) The proposed homologs have very little quantitative structural similarity to the query structure, or to each other, as shown in Figure 3 (and hence my concerns about the methodology). Also, I did not find the structural alignments in the supplement or Figure 4 to be qualitatively compelling. They simply appear too different, and I cannot discard this qualitative assessment because the quantitative similarities are likewise very weak. It's not clear to me if this is because the folds are in fact different, or if my view of them is a presentation issue (perhaps it could be improved by visualizing more angles, or more carefully cartooning the similarities and differences).

      (6) The authors point out that the alpha-helices are ordered differently in YihY and CysZ, and that their membrane orientation is flipped. Taken at face value, I would view this as evidence against homology. This could perhaps be more reasonably explained as convergent global fold similarity resulting from different underlying structures. However, the authors imply that this may be the result of the transposition of the sequences encoding these alpha helices, yet there is no convincing description or argument concerning when and how this could have occurred. I think this would be a deeply interesting phenomenon, but there is insufficient evidence and discussion to seriously consider whether or not it is homology or convergence.

      (7) Following up on comment #5, the authors did perform a very interesting in silico experiment by transposing sequences to reorder the helices. They then note that structural similarity improved. This is very, very interesting, but without other evidence of homology between the transposed alpha helices, I do not think this disproves alternative hypotheses. Does any such evidence exist?

      (8) The authors show in Figure 5E-F that sequence transposition flips the membrane orientation, such that YihY and CysZ have extracellular termini (which you would expect from homologs, I suppose). But it is just cartooned and not discussed. Is this computationally or experimentally supported?

      (9) The putative presence of a conserved motif would be a very compelling piece of evidence consistent with homology. However, it is not clear to me in the text which proteins actually have the repeats - is it truly just CysZ? What does this mean for YihY? Further, what specifically is being proposed to be homologous? Is SLC25 repeat 2 proposed to be homologous to CysZ repeat 2 (and the same for 3 to 3)? If so, this would seem to have implications for the transposition hypothesis. The helix nomenclature (e.g., H1-6) suggests homology across the proteins (i.e, H1 is homologous to H1); however, wouldn't the presence of these conserved domains instead, for example, suggest homology between SLC H3 and CysZ H2? The authors' conclusions are not clear, and it is difficult to interpret what the implications are for assessing homology.

      (10) The sequence retrieval methods are incomplete, so it is impossible to reproduce the searches or to judge their accuracy and scope. What were the E-value cutoffs and other settings used in the searches?

      (11) The phylogenetic methods are incomplete. What substitution models were used, and how were they chosen? What branch support method was used? What were the stop conditions of the Bayesian analysis (e.g. did the authors monitor for convergence, and how)? How much of the Bayesian analysis was considered burn-in, if any? And echoing points 1 & 2 above, how were these phylogenies rooted?

      (12) Throughout, there is a distinct lack of careful, evolutionarily informative language.

      (i) In reference to the phylogeny, the authors frequently refer to "grouping," but it's not entirely clear what this means. Referring to clades and their branching order would be more informative.

      (ii) The authors refer to the excavate branch as the "most ancient." Whether or not excavates most closely resemble LECA is somewhat irrelevant, because the branch itself is not the most ancient - it is as ancient as its sister branch, which may be all other eukaryotes.

      (iii) Likewise, the authors refer to bacterial proteins as "the evolutionary ancestor of mitochondrial AACs," and state that "AAC emerged from the conserved sulfat transporter CysZ." But extant bacteria are not the ancestors of mitochondria - nor are extant proteins descended from other extant proteins. They are, perhaps more accurately, cousins.

      (iv) The authors refer to AACs as "evolutionarily founder member of the SLC25 carrier family," but I'm not sure that has a clear evolutionary meaning, unless the authors mean to say that the common ancestor was more AAC-like than anything-else-like. Even if the rooting is accurate, a basal branch does not necessarily reflect the ancestral state.

    3. Reviewer #3 (Public review):

      Summary:

      The most important weakness is that the authors have avoided the direct structural comparison of experimentally determined X-ray structures of AAC and CysZ. Instead, the comparisons are made through predicted membrane topologies and predicted structural models of protein homologs, which give rise to misleading results. Direct comparison of the X-ray structures of the ADP/ATP carrier and CysZ clearly shows that these proteins have very different folds. Therefore, flaws in the methods produce results that lead to the wrong conclusions, and the authors have not achieved their aims.

      Weaknesses:

      (1) Figure 2. There is something very strange about how the tree is drawn, given that S. cerevisiae AAC1, AAC2 and AAC3 share about 76-83% sequence identity but appear to be very diversified in the tree. The phylogenetic trees are only based on the sequences of three species. The authors should explain in much more detail how they made the phylogenetic trees to support their statement that all mitochondrial carriers have come from an ancient AAC.

      (2) There are at least three and seven X-ray structures of CysZ (with about 43% sequence identity to the E. coli homolog) and AAC, respectively, deposited in the Protein Data Bank. Therefore, there is no need for the approach using predicted structures as described in the manuscript. It is clear from direct comparison of the CysZ and AAC structures that they have very different folds, i.e. lengths of the transmembrane helices, their orientation and packing. CysZ has been suggested to form dimers or trimers of dimers (eLife 2018;7:e27829), with each protomer formed by two long transmembrane helices and four short helices that do not cross the membrane totally. Thus, CysZ has a different membrane topology and oligomeric state than AAC (monomer with six transmembrane helices). CysZ is therefore rightfully classified in a separate 3D domain fold from mitochondrial carriers in various protein family and domain databases.

      (3) In the 3D structures of CysZ, the conserved QYXDYPXDNHK motif is involved in a network of hydrogen bonds and salt bridges thought to hold the helical bundle together (eLife 2018;7:e27829). This motif is similar to PX[DE]XX[KR], a part of the signature motif, typical of mitochondrial carriers, which is repeated three times in the sequences and forms a three-fold pseudo-symmetrical salt bridge network of the so-called matrix gate that opens and closes during the transport cycle. Therefore, although this single motif in CysZ is similar to those of mitochondrial carriers, it is not found in a similar structural context to those in AAC structures.

      (4) It appears odd that the sulfate transporter CysZ should be more similar to nucleotide-transporting AAC than any of the other mitochondrial carriers, of which some transport sulfate.

      (5) The AlphaFold model of YihY is not very similar to the crystal structures of either CysZ or AAC.

      (6) The authors are relying too much on the TM-score results. The values of 0.5-0.6 between AAC and CysZ or YihY probably reflect that they contain six main helices. However, as noted in point 2, they have very different folds.

    1. Reviewer #1 (Public review):

      Renard, Ukrow et al. applied their recently published computational pipeline (CHROMAS) to the skin of Euprymna berryi and Sepia officinalis to track the dynamics of cephalopod chromatophore expansion. By segmenting each chromatophore into radial slices, and analyzing the co-expansion of slices across regions of the skin, they inferred the motor control underlying chromatophore groups.

      Strengths:

      - The authors demonstrate that most motor units of cephalopod skin include a subregion of multiple chromatophores, creating "virtual chromatophores" between fixed chromatophores. This is an interesting concept that challenges prevailing models of chromatophore organization, and raises interesting possibilities for how chromatophore arrays may be patterned during development.

      - This study introduces new analytical approaches of cephalopod skin that will be valuable for the quantitative study of cephalopod behavior.

      Weaknesses:

      - The authors use patch-clamp experiments in E. berryi to test their approach for inferring motor units. The stimulations indeed evoke expansions of sub-regions of each chromatophore, creating "virtual chromatophores". However, they were not able to predict these motor units from behavioral analysis before confirming them with patch-clamp, limiting the strength of this validation.

      - In S. officinalis, chromatophores are far more numerous than in E. berryi and exhibit frequent spontaneous activity, making it more challenging to distinguish shared motor drive. Patch-clamp experiments in this species would provide important validation and strengthen confidence in the method for inferring motor units.

      - Although multiple experimental conditions were tested (e.g., age, size, behavioral context, sedation, head-fixation, lighting), data are shown from only a small subset of experiments. Analyzing pooled data across conditions would allow for more generalizable conclusions.

      - Different clustering algorithms were used for the two species (HDBSCAN for E. berryi and Affinity Propagation for S. officinalis). Since Affinity Propagation appeared to better capture correlation structure in S. officinalis, it would be informative to reanalyze the E. berryi data using the same method to assess potential algorithm-dependent biases.

      Conclusion:

      The CHROMAS tool is likely to be valuable to the field, given the need for quantitative frameworks in cephalopod biology. The predictions outlined here provide a useful foundation for future experimental investigation.

    2. Reviewer #2 (Public review):

      Summary:

      Overall, this is an excellent paper, making use of a newly developed system for monitoring the behaviour of chromatophores in the skin of (mostly) free swimming bobtail squid and European cuttlefish. The manuscript is very well written, clearly presented and very well structured. The central finding, that individual chromatophores are connected to multiple motor neurones, is not new. Novelty instead comes from the ability to measure the actuation of chromatophore sections across wide areas of skin in free-swimming animals, showing the diversity of local motor units and reinforcing the notion that individual chromatophores are not necessarily the individual units of colour change, but rather local motor units that cover multiple neighbour and near neighbour chromatophore muscles. This is an excellent finding and one that will shape our understanding of the neural control of cephalopod skin colour. I have a number of minor points below that the authors will need to address before acceptance.

      Strengths:

      The methodological approach to collecting large amounts of data about local variations in the expansion of sections of chromatophores is exciting, and the analysis pipeline for clustering sections of chromatophores whose spontaneous activity correlated over time is powerful and exciting.

      Comments on revisions:

      All concerns have been addressed in the revised version of the manuscript.

    3. Reviewer #3 (Public review):

      Summary:

      This study uses high-resolution videography and a custom computer-vision pipeline to dissect the motor control of cephalopod chromatophores in Euprymna berryi and Sepia officinalis. By quantifying anisotropic chromatophore deformations and applying dimensionality reduction methods, the authors infer that individual chromatophores can be part of multiple motor units. Clustering analyses reveal putative motor units that often span multiple chromatophores, with diverse and overlapping geometries. Chromatophore expansion dynamics are faster and more stereotyped than relaxation, consistent with active neural contraction followed by passive recoil. Together, the results show that chromatophores function not as uniform pixels but as fractionated, coordinately controlled elements that enable flexible pattern generation.

      Strengths:

      The authors present compelling, direct evidence that (a) chromatophore deformations are anisotropic, and indirect evidence that (b) individual chromatophores can be split across multiple putative motor units. This evidence is provided through data collected over large spatial scales, but also at a sub-chromatophore resolution. This combination of scale and resolution is not possible using traditional neuroanatomical and physiological approaches alone.

      The authors also develop a new non-invasive, image analysis approach to extract information about chromatophore deformation across large spatial scales on the organism's body. In principle this approach is applicable across species and may allow for further comparative characterization of chromatophore motor control. It is therefore a promising new tool and useful resource for the community.

      Weaknesses:

      An important weakness of the work is that the methods the authors develop can only be applied during resting, spontaneous 'flickering' activity of chromatophores to yield interpretable results at the motor unit level. This is because common presynaptic input would confound the identification of individual motor units. Thus, there remains a large difficulty in linking insights about single motor unit organization to the circuit and behavioral levels.

      Another weakness of this paper is the rather limited electrophysiological validation of the computational findings. The authors present only one electrophysiology experiment in E. berryi, the species that they used only for 'methodological development' and not for detailed characterization. A complementary electrophysiological experiment in S. officinalis, or some visualization of neuron morphology confirming that motor neurons do indeed project to multiple chromatophores, would strengthen the generalizability of their computational analysis. This would be particularly pertinent to validate the authors' claim that some motor units contain chromatophores that are quite distant from one another on the animal.

      Overall, the authors' technical contributions and method development are an important advance. This work serves as an excellent proof of concept that their method can extract useful information about chromatophore motor control. Further validation of their method is needed to fully trust the fine-scale conclusions drawn about the distribution and composition of multi-innervated chromatophores. Furthermore, the authors raise many interesting ideas about developmental constraints on circuit wiring and the potential adaptive significance of multi-innervated chromatophores for certain features of camouflage patterning. Their method may be able to help resolve some of these questions in the future if it is refined and applied across developmental stages, regions on the animal, and species.

      Comments on revisions:

      Thank you for clarifying my major point of confusion regarding how one might connect these results to behaviorally relevant camouflage. I now have a better understanding of the authors' rationale in studying resting activity of motor units and believe that the clarifications added to the manuscript will help other readers who encounter similar confusion.

    1. Reviewer #1 (Public review):

      Summary:

      This paper investigates whether transformer-based models can represent sentence-level semantics in a human-like way. The authors designed a set of 108 sentences specifically to dissociate lexical semantics from sentence-level information and collected 7T fMRI data from 30 participants reading these sentences. They conducted representational similarity analysis (RSA) comparing brain data with model representations and with human behavioral ratings. They find that transformer-based models match brain representations better than a static word-embedding baseline, which ignores word order, but fall short of models that encode the structural relations between words. The main contributions of this paper are:

      (1) The construction of a sentence set that disentangles sentence structure from word meaning.

      (2) A comprehensive comparison of neural sentence representations (via fMRI), human behavior, and multiple computational models at the sentence level.
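
      For readers unfamiliar with RSA, the comparison described above can be sketched roughly as follows: build a representational dissimilarity matrix (RDM) for each system from its per-sentence patterns, then rank-correlate the two RDMs. This is a minimal illustration under stated assumptions, not the authors' pipeline: correlation distance and Spearman correlation are common choices but are assumed here, and all function names and toy patterns are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def rdm_upper(patterns):
    """Upper triangle of the RDM: 1 - Pearson r for every pair of sentences."""
    pairs = combinations(range(len(patterns)), 2)
    return [1.0 - pearson(patterns[i], patterns[j]) for i, j in pairs]


def rsa_score(patterns_a, patterns_b):
    """Spearman correlation between the two systems' RDMs."""
    return pearson(ranks(rdm_upper(patterns_a)), ranks(rdm_upper(patterns_b)))


# Toy patterns: four sentences, three features each (values are made up).
brain = [[1.0, 2.0, 3.5], [2.0, 1.0, 0.0], [0.5, 3.0, 1.0], [4.0, 0.0, 2.0]]
# A hypothetical model whose patterns are a rescaled, shifted copy of the brain's.
model = [[2 * v + 1 for v in row] for row in brain]
print(rsa_score(brain, model))
```

      Because RSA compares only the pairwise geometry of each system, the two systems need not share a feature space, which is what lets vector-based and graph-based models be compared with the same brain data.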

      Strengths:

      (1) The paper evaluates a wide variety of models, including layer-wise analysis for transformers and region-wise analysis in the human brain.

      (2) The stimulus design allows precise dissociation between lexical and sentence-level semantics. The RSA-based approach is empirically sound and intuitive.

      (3) The constructed sentences, along with the fMRI and behavioral data, represent a valuable resource for studying sentence representation.

      Weaknesses:

      (1) The rationale behind averaging sentence embeddings across multiple transformer models (with different architectures and training objectives) is unclear. These transformer-based models have different training paradigms and model architectures, which may result in misaligned semantic spaces. The averaging operation may dilute the distinct sentence representations learned by each model, potentially weakening the overall semantic encoding for sentences. Please clarify this choice or cite supporting methodology.

      (2) All structure-sensitive models discussed incorporate semantics to some extent. Including a purely syntactic baseline, such as a model based on context-free grammar, would help confirm the importance of syntactic structures.

      (3) In Figure 2, human behavioral judgments show weak correlations with neural data, and even fall below those of computational models, suggesting the behavioral judgments may not reflect the sentence structures in a brain-like way. This discrepancy between behavioral and neural data should be clarified, as it affects the interpretation of the results.

      (4) To better contextualize model and neural performance, sentence similarity should be anchored to a notion of semantic "ground truth", such as the matrix shown in Figure 1a. Comparing this reference with human judgments, brain responses, and model similarities would help establish an upper bound.

      (5) The structure of this paper is confusing. For instance, Figure 5 is cited early but appears much later. Reordering sections and figures would enhance readability.

      (6) While the analysis is broad and comprehensive, it lacks depth in some respects. For instance, it remains unclear what specific insights are gained from comparing across brain regions (e.g., whole brain, language network, and other subregions). Similarly, the results of simple-average and group-average RSA appear quite similar and may not advance the interpretation.

      (7) While explaining the grid-like pattern due to sentence length is important, this part feels somewhat disconnected from the central question of this paper (word order). It might be better placed in supplementary material.

      Comments on revised version:

      The new version of the paper has addressed my main concerns, including:

      (1) clarification about the methodology of Transformer embeddings

      (2) discussion about the purely syntactic models

      (3) discussion about the low correlation between behavioural ratings and brain activations

      (4) better structure of the paper

      (5) clarification about pre-registration

      I believe the paper has been substantially improved after revision.

    2. Reviewer #3 (Public review):

      Summary:

      Large Language Models have revolutionized Artificial Intelligence and can now match or surpass human language abilities on many tasks. This has fuelled interest in cognitive neuroscience in exposing representational similarities between Language Models and brain recordings of language comprehension. The current study breaks from this mold by: (1) Systematically identifying sentence structures for which brain and Large Language Model representations diverge. (2) Accounting for such sentence structures using a model structured by semantic roles. As such, the study may now fuel interest in characterizing how Large Language Models and brain representations differ, which may prompt new, more brain-like language models.

      Strengths:

      * This study presents a bold challenge to a literature trend that has touted similarities between Transformer models and human cognition based on representational correlations with brain activity. This challenge is substantiated by identifying sentences for which brain and model representations of sentences diverge.

      * This study conducts a rigorous pre-registered analysis of a comprehensive selection of state-of-the-art Large Language Models on a controlled sentence comprehension fMRI dataset. The analysis is conducted within a Representational Similarity Analysis framework to support similarity comparisons between graph structures and brain activity without needing to vectorize graphs. Transformer models are predicted and shown to diverge from brain representations on subsets of sentences with similar word-level content but different sentence structures.

      * The study introduces a 7T fMRI sentence comprehension dataset and accompanying human sentence similarity ratings which may be a fruitful resource for developing more human-like language models. Unlike other model-based sentence datasets, the relation between grammatical structure and word-level content is controlled, and subsets of sentences for which models and brains diverge are identified.

      Weaknesses:

      * The interpretation of findings is nuanced. Although Transformers underperform as brain models on the critical subsets of controlled sentences, a Transformer outperforms all other models when evaluated on the union of all sentences when both word-level content and structure vary. Transformers also yield equivalent or better models of human behavioral data. Thus, although Transformers have demonstrable flaws as human models which are pinpointed here, in the general case (some) Transformers are more human-like than the other models considered.

      * There may be confounds between the critical sentence structure manipulations and visual processing. This is inconvenient because activation in brain regions that process semantics tends to partially correlate with low-level representations of sentence surface features encoded in visual cortex. Although the study commendably controls for confounds associated with sentence length, correlations with the key sentence structure models are most salient in visual cortex and diminish in other brain networks when V1-V4 activation is controlled for.

      * Sentence similarity computations are emphasized as the basis for unifying comparative analyses of graph structures and vector data. A strength of this approach is that correlation is not always the ideal similarity metric. However, a weakness is that similarity computations are not unified across models. This has practical consequences because different similarity metrics applied to the same model produce positive or negative correlations with brain data and repeating analyses with a different representational dissimilarity measure seems to produce some anomalous results.

    1. Reviewer #1 (Public review):

      This rigorous and creative study uses an elegant combination of metabolomics, transcriptomics, and budding yeast molecular genetics to discover that (i) activating AMPK to maintain mitochondrial respiration fueled by cytosolic Acetyl CoA and (ii) increasing fatty acid synthesis independent of respiration drive independent pathways that increase the fitness of replicatively-aged budding yeast cells, albeit without increasing their lifespan. This work will be of interest to scientists in the field of aging and metabolism. Some clarifications in the text would address the following concerns, which would increase the impact of the study:

      (1) What does activation of AMPK (via PGPD-Sak1 expression) do to the replicative lifespan? How many bud scars, in general, do the subpopulations that are older, yet have less Tom70 (increased mitochondrial fitness), have after the 48 hr time point that they are examining? How many divisions occurred in this 48 hr period, i.e., is it long enough for all cells to reach the end of their replicative lifespan? This information is important to rule out that a subset of the mutant cells just divided faster and hence had more divisions within 48 hr (growing faster and living longer are different things). Having identical growth curves doesn't indicate per se that they all divide at the same rate, as there may be a subpopulation that divides faster and a subpopulation that doesn't grow so well.

      (2) A2A cells do not have an extended replicative lifespan (RLS) but show an increase in the "low senescence" population (Figure 2). If the cells are not becoming senescent, why don't they have a longer RLS? Not having a longer lifespan seems inconsistent with the statement that "bud scar counting confirmed that A2A cells reach a higher age than wild type". This comes back to how many times the cells can divide within the 48 hr time point studied and to their rate of cell division. Also, the lifespan curve shown is plotted against time, not cell division number, which does not take into account the different division times of cells within the population (described above). It would be much more useful to show standard lifespan curves giving the number of cell divisions per lifespan per cell.

      (3) Increased "fitness" of the old cells is implied from the increased size of the colonies that the old cells can make. However, this is a measure of the fitness of the daughters per se, not the old mother cells. Are the old mothers just passing on healthier mitochondria and more lipids to the daughters, such that they can divide more times? If the aged cells have an "increased fitness", why don't they divide more times themselves (i.e., live longer)?

      (4) The statement is made that "these experiments define two classes of aging cells with distinct metabolic needs, coherent with the model of two aging trajectories previously proposed (referencing Nan Hao's work)". However, the big difference here is that in Nan Hao's work, their two aging trajectories influenced the length of lifespan, but that does not appear to be the case here. That distinction should be made clear. Perhaps the authors could also speculate as to why the A2A yeast stops dividing after presumably the same number of cell divisions, even though they have an activated AMPK and activated fatty acid synthesis pathway.

      (5) I am a bit confused by the use of the word "senescence" by this lab here and in their previous growth on galactose studies. If yeast don't senesce, which is usually defined as an irreversible arrest of the cell cycle where cells stop dividing, shouldn't the yeast that do not senesce still be dividing and hence have a longer lifespan? Should a different term be used rather than senescence? Such as "fitness late in life". The authors giving their definition of senescence may help reduce this apparent contradiction.

    2. Reviewer #2 (Public review):

      Summary:

      In this study, the authors investigate how cytosolic acetyl-CoA metabolism influences replicative aging in budding yeast. They propose that acetyl-CoA regulates aging through three major pathways: (1) mitochondrial transport to support mitochondrial function, (2) fatty acid synthesis, and (3) global protein acetylation. The data show that AMPK activation promotes mitochondrial import of acetyl-CoA and partially mitigates mitochondrial decline in a subset of aging cells.

      Furthermore, the engineered A2A strain, which enhances mitochondrial acetyl-CoA utilization while relieving inhibition of fatty acid synthesis, increases the proportion of cells exhibiting a "low senescence" phenotype.

      Overall, this is a thoughtful and potentially impactful study that advances our understanding of metabolic control of aging. Addressing the points below, particularly by refining interpretations and, where feasible, incorporating additional analyses, will further strengthen the manuscript and its conclusions.

      Strengths:

      The study has several notable strengths. It addresses an important question by shifting the focus from lifespan to preservation of late-life fitness, which is highly relevant to aging biology. The work integrates metabolic, genetic, and functional analyses to link cytosolic acetyl-CoA flux with distinct aging outcomes, and the engineering of the A2A strain provides a clear and elegant demonstration of how coordinated pathway modulation can improve cellular fitness.

      Weaknesses:

      (1) While the manuscript focuses on mitochondrial transport and fatty acid synthesis, cytosolic acetyl-CoA is also a key regulator of histone acetylation and chromatin silencing. It would strengthen the study to consider whether acetyl-CoA depletion contributes to improved fitness through enhanced rDNA silencing. Given the well-established role of rDNA instability in yeast aging, additional experiments examining rDNA silencing and stability would be valuable. For example, monitoring rDNA copy number changes (not necessarily ERCs) under AMPK activation, oleic acid supplementation, and in the A2A strain, similar to approaches used in the authors' prior work, would help clarify whether chromatin regulation contributes to the observed phenotypes.

      (2) The current data do not fully distinguish whether AMPK activation and oleic acid supplementation act on distinct subpopulations of aging cells. An alternative explanation is that oleic acid supplementation enhances mitochondrial function and acts additively with AMPK activation, thereby increasing the fraction of cells in the "low senescence" state. Since this distinction is not central to the main conclusions, I suggest softening the language around subpopulation specificity. Emphasizing instead that the A2A strain coordinately modulates multiple branches of acetyl-CoA metabolism to improve late-life fitness would maintain the strength of the central message without overinterpretation.

      (3) The manuscript proposes that lipid starvation and excess acetyl-CoA are major drivers of senescence in distinct subpopulations of wild-type aging cells. This conclusion is not yet fully supported by the presented data. Direct measurements of age-dependent divergence in acetyl-CoA and fatty acid levels at the single-cell level would be needed to substantiate this model. Based on the current evidence, a more conservative interpretation would be that aging cells exhibit differential sensitivity to perturbations in acetyl-CoA and lipid metabolism. Accordingly, I recommend revising the statement in the Abstract ("We further implicate lipid starvation and excess acetyl coenzyme A availability as major drivers of senescence...") and the corresponding discussion text to better align with the data.

    3. Reviewer #3 (Public review):

      Summary:

      These findings suggest that PGPD-SAK1 yeast show a subpopulation with lowered TOM70-GFP expression in high bud scar staining aged cells. Deletion of CAT2 or MLS1 reduces this effect. A PGPD-SAK1 acc1S1157A double mutant (called "A2A" here) shows an even larger effect of lowered TOM70-GFP expression in high bud scar staining aged cells. Utilization of various additional mutants involved in acetyl-CoA transport, carnitine shuttle, respiration, etc., leads the authors to conclude that these shifts in TOM70-GFP in aged cells are linked to the AMPK-fatty acid metabolic regulatory system.

      Strengths:

      These extensive and clearly described experiments reveal interesting changes in TOM70-GFP intensity in subsets of aged yeast in several mutants eventually identified as linked to the AMPK-fatty acid metabolic regulatory system.

      Weaknesses:

      (1) Three biological replicates for mRNA-seq is low.

      (2) While "Traditional conceptions of ageing implicate a progressive accumulation of damage leading to systemic degradation in performance until death, with evolutionary pressures acting to maximise early life fitness and fecundity at the expense of ageing health." is tangential perhaps to the data and conclusions of the study, both claims of this sentence are at best controversial, and the manuscript is no weaker for their omission.

      (3) The statement that "Here, we determine the basis of senescence and fitness loss in replicatively ageing yeast" is a bit strong as a summary of the careful work presented here. If the authors had created yeast mutants that retained fitness indefinitely, this would be a more appropriate strength of claim to summarize the work.

    1. Reviewer #1 (Public review):

      [Editor's note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      In this work the authors investigate the molecular dynamics of MinD, a component of the Bacillus subtilis Min system, in vitro and in vivo. In Escherichia coli the Min system is highly dynamic and displays rapid pole-to-pole oscillation whereby a time-averaged minimum of the Min proteins at mid-cell is established. However, in B. subtilis, this is not the case, and there is no MinE present. MinD in B. subtilis dynamically relocalizes from the poles to division sites, and binds to MinC and MinJ, which mediates its interaction with DivIVA. This paper reports biochemical characterization of B. subtilis MinD in vitro and dynamics of MinD variants in vivo, providing insight into the mechanism of dynamic localization.

      Strengths:

      In the current study, the authors perform a detailed biochemical characterization of the in vitro ATPase activity of MinD and demonstrate that rapid hydrolysis is elicited by adding phospholipids. They further show using a collection of substitution mutants of MinD that both monomers and dimers bind to the membrane, and ATP occupancy changes the on and off rates. Identification, quantification, and tracking of discrete Halo-MinD populations was nicely done and showed that mutations in MinD alter dynamic localization, correlating with PL binding on and off rates in vitro.

      In the revised manuscript, the authors now demonstrate localization and tracking data for minC and minJ deletion strains, which suggest that MinJ impacts MinD membrane cycling, but MinC does not. Additional in vitro work showed that the PDZ domain of MinJ modifies MinD ATP hydrolysis rates, and the authors propose that MinJ may promote MinD dimer formation.

      Weaknesses of the revised version: No major weaknesses.

    2. Reviewer #2 (Public review):

      Summary:

      Feddersen & Bramkamp determined important characteristics of how the MinD protein binds to and dissociates from the membrane, and how it dimerizes in relation to its ATPase activity. The presented data clearly show the differences in function of MinD homologs from B. subtilis and E. coli.

      Strengths:

      The work presents well-executed experiments that lead to interesting conclusions and a new model of how the Min system works during B. subtilis mid-cell division. Importantly, this model is supported by in vitro characterization of well-chosen mutants in the functional domains of MinD. Notably, most of the in vitro data are confirmed by single-molecule localization microscopy.

    1. Reviewer #1 (Public review):

      Summary:

      The study by the Obata group characterizes the dynamics of the canonical malate dehydrogenase-citrate synthase metabolon in yeast.

      Strengths:

      The study is well-written and appears to give clear demonstrations of this phenomenon.

      Studies of the dynamics of metabolon formation are rare; if the authors can address the concern detailed below, then they have provided such for one of the canonical metabolons in nature.

      Weaknesses:

      There is a fundamental issue with the study, which is that the authors do not provide enough support or information concerning the split luciferase system that they use. Is the binding reversible or not? How the data is interpreted is massively influenced by this fact. What are the pros and cons of this method in comparison to, for example, FLIM-FRET? The authors state that the method is semi-quantitative - can they document this? All of the conclusions are based on the quality of this method. I know that it has been used by others, but at least some preliminary documentation to address these questions is required.

      Comments on revised version:

      I feel that the authors have adequately addressed my prior concerns. I have no further critiques of their work.

    2. Reviewer #2 (Public review):

      This study explores the dynamic association between malate dehydrogenase (MDH1) and citrate synthase (CIT1) in Saccharomyces cerevisiae, with the aim of linking this interaction to respiratory metabolism. Utilizing a NanoBiT split-luciferase system, the authors monitor protein-protein interactions in vivo under various metabolic conditions.

      Major Concerns:

      (1) NanoBiT Signal May Reflect Protein Abundance Rather Than Interaction Strength
      In Figure 1C, the authors report increased MDH1-CIT1 interaction under respiratory (acetate) conditions and decreased interaction during fermentation (glucose), as indicated by NanoBiT luminescence. However, this signal appears to correlate strongly with the expression levels of MDH1 and CIT1, raising the possibility that the observed luminescence reflects protein abundance rather than specific interaction dynamics. To resolve this, NanoBiT signals should be normalized to the expression levels of both proteins to distinguish between abundance-driven and interaction-driven changes.

      (2) Lack of Causal Evidence
      The study presents a series of metabolic perturbation experiments (e.g., arsenite, AOA, antimycin A, malonate) and correlates changes in metabolite levels with NanoBiT signals. However, these data are correlative and do not establish a functional role for the MDH1-CIT1 interaction in metabolic regulation. To demonstrate causality, the authors should implement approaches to specifically disrupt the MDH1-CIT1 interaction. One strategy could involve using a 15-residue peptide (Pept1) derived from the Pro354-Pro366 region of CIT1, previously shown to mediate the interaction, or introducing the cit1Δ3 (Arg362Glu) mutation, which perturbs binding. Metabolic flux analysis using ^13C-labeled glucose and mitochondrial respiration assays (e.g., Seahorse) could then assess functional consequences.

      (3) Absence of Protein Expression Controls Under Perturbation Conditions
      In experiments involving acetate, arsenite, AOA, antimycin A, and malonate, the authors infer changes in MDH1-CIT1 association based solely on NanoBiT signals. However, no accompanying data are provided on MDH1 and CIT1 protein levels under these conditions. This omission weakens the conclusions, as altered expression rather than interaction strength could underlie the observed luminescence changes. Immunoblotting or quantitative proteomics should be used to confirm constant protein expression across conditions.
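
      To make the normalization requested in points (1) and (3) concrete, the following is a minimal sketch of one possible scheme, in which raw luminescence is divided by the geometric mean of the two proteins' relative expression levels. The choice of normalizer is an assumption made for illustration (the review does not prescribe one), the function name is hypothetical, and the numbers are invented solely to show the arithmetic; they are not measurements from the study.

```python
from math import sqrt


def normalized_interaction(luminescence, mdh1_level, cit1_level):
    """Luminescence per unit of protein.

    Assumption: expression levels are relative abundances (e.g. from
    immunoblot densitometry) and the geometric mean of the two partners
    is used as the normalizer.
    """
    if mdh1_level <= 0 or cit1_level <= 0:
        raise ValueError("expression levels must be positive")
    return luminescence / sqrt(mdh1_level * cit1_level)


# Invented example values, for illustration only.
glucose = normalized_interaction(luminescence=100.0, mdh1_level=1.0, cit1_level=1.0)
acetate = normalized_interaction(luminescence=400.0, mdh1_level=4.0, cit1_level=4.0)
print(glucose, acetate)
```

      In this invented case the fourfold rise in raw signal disappears after normalization, which is the abundance-driven scenario the reviewer asks the authors to rule out; a ratio that still differs between conditions would instead point to an interaction-driven change.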

      Conclusion:

      Although the central question is compelling and the use of NanoBiT in live cells is a strength, the manuscript requires additional experimental rigor. Specifically, normalization of interaction signals, introduction of causative perturbations, and validation of protein expression are essential to substantiate the study's claims.

      Comments on revised version:

      The manuscript is much improved.

    3. Reviewer #3 (Public review):

      Summary:

      Metabolons are multisubunit complexes that promote the physical association of sequential enzymes within a metabolic pathway. Such complexes are proposed to increase metabolic flux and efficiency by channeling reaction intermediates between enzymes. The TCA cycle enzymes malate dehydrogenase (MDH1) and citrate synthase (CIT1) have been linked to metabolon formation, yet the conditions under which these enzymes interact, and whether such interactions are dynamic in response to metabolic cues, remain unclear, particularly in the native cellular context. This study uses a NanoBiT protein-protein interaction assay to map the dynamic behavior of the MDH1-CIT1 interaction in response to multiple metabolic stimuli and challenges in yeast. Beyond mapping these interactions in real time, the authors also performed GC-MS metabolomics to map whole-cell metabolite alterations across experimental conditions. Finally, the authors use microscale thermophoresis to determine components that alter the MDH1-CIT1 interaction in vitro. Collectively, the authors synthesize their collected data into a model in which the MDH1-CIT1 metabolon dissociates in conditions of low respiratory flux, and is stimulated during conditions of high respiratory flux. While their data largely support this model, some key exceptions are found that suggest it is likely oversimplified and will require further work to understand the complexities associated with MDH1-CIT1 interaction dynamics. Nonetheless, the authors put forth an interesting and timely toolkit to begin to understand the interaction kinetics and dynamics of key metabolic enzymes that should serve as a platform to begin disentangling these important yet understudied aspects of metabolic regulation.

      Strengths:

      - The authors address an important question: how do metabolon-associated protein-protein interactions change across altered metabolic conditions?

      - The development and validation of the MDH1-CIT1 NanoBiT assay provides an important tool to allow the quantification of this protein-protein interaction in vivo. Importantly, the authors demonstrate that the assay allows kinetic and real-time assessment of these protein interactions, which reveals interesting and dynamic behavior across conditions.

      - The use of classic biochemical techniques to confirm that pH and various metabolites can alter the MDH1-CIT1 interaction in vitro is rigorous and supports the model put forth by the authors.

      Weaknesses:

      The authors have addressed identified weaknesses within the revision of their manuscript.

    1. Reviewer #1 (Public review):

      Summary:

      This study investigated how visuospatial attention influences the way people build simplified mental representations to support planning and decision-making. Using computational modeling and virtual maze navigation, the authors examined whether spatial proximity and the spatial arrangement of obstacles determine which elements are included in participants' internal models of a task. The study developed and tested an extension of the value-guided construal (VGC) model that incorporates features of spatial attention for selecting simpler mental representations of the task.

      Strengths:

      (1) Original Perspective: The study introduces an explicit attentional component to established models of planning, offering an approach that bridges perception, attention, and decision-making.

      (2) Methodological Approach: The combination of computational modeling, behavioral data, and eye-tracking provides converging measures to assess the relationship between attention and planning representations.

      (3) Cross-validated data: The study relies on the analysis of three separate datasets, two already published and an additional novel one. This allows for cross-validation of the findings and enhances the robustness of the evidence.

      (4) Focus on Individual Differences: Reports of how individual variability in attentional "spillover" correlates with the sparsity of task representations and spatial proximity add depth to the analysis.

      Appraisal of Aims and Results:

      The study sets out to determine how spatial attention shapes the construction of task representations in planning contexts. The authors provide evidence that spatial proximity and arrangement influence which environmental features are incorporated into internal models used for navigation, and that accounting for these effects improves model predictions. There is clear documentation of individual variation, with some participants showing greater attentional spillover and more sparse awareness profiles.

      Comments on revised version:

      The authors did a great job and I am very happy with the revised manuscript.

    2. Reviewer #2 (Public review):

      Summary:

      Castanheira et al. investigate the role of spatial attention for planning during three maze navigation experiments (one new experiment and two existing datasets). Effective planning in complex situations requires the construction of simplified representations of the task at hand. The authors find that these mental representations (as assessed by conscious awareness) of a given stimulus are influenced by (spatially) surrounding stimuli. Individual participants varied in the degree to which attention influenced their task representations, and this attentional effect correlated with the sparsity of representations (as measured by the range of awareness reports across all stimuli). Spatially grouping task-relevant information on either the left or right side of the maze led to mental representations more similar to optimal representations predicted by the value-guided construal (VGC) model - a normative model describing a theoretical approach to simplifying complex task information. Finally, the authors propose an update to this model, incorporating an attentional spotlight component; the revised descriptive model predicts empirical task representations better than the original (normative) VGC model.

      Strengths:

      The novelty of this study lies in the proposal and investigation of a cognitive mechanism through which a normative model like value-guided construal can enable human planning. After proposing attention as this mechanism, the authors make concrete hypotheses about mismatches between the VGC predictions and real human behavior, which are experimentally validated. Thus, not only does this study describe a possible mechanism for simplification of task information for planning, but the authors also propose a descriptive model, revising VGC to incorporate this attentional component.

      A strength of this paper is the variety of investigative approaches: analysis of existing data, a novel experiment, and a computational approach to predict experimental findings from a theoretical model. Analyzing pre-existing datasets increases the size of the participant cohort and strengthens the authors' conclusions. Meanwhile, comparing the predictions of the existing normative model and the authors' own refined model is a clever approach to substantiate their claims. In addition, the authors describe several crucial controls, which are key to the interpretability of their results. In particular, the eye tracking results were critical.

      In summary, this paper constitutes an important step toward a more complete understanding of the human ability to plan.

      Comments on revised version:

      I am overall happy with the revision and agree that the authors have addressed most of the comments.

    3. Reviewer #3 (Public review):

      Summary:

      The authors build on a recent computational model of planning, the "value-guided construal" framework by Ho et al. (2022), which proposes that people plan by constructing simple models of a task, such as by attending to a subset of obstacles in a maze. They analyze both published experimental data and new experimental data from a task in which participants report attention to objects in mazes. The authors find that attention to objects is affected by spatial proximity to other objects (i.e., attentional overspill) as well as whether relevant objects are lateralized to the same hemifield. To account for these results, the authors propose a "spotlight-VGC" model, in which, after calculating attention scores based on the original VGC model, attention to objects is enhanced based on distance. They find that this model better explains participant responses when objects are lateralized to different hemifields. These results demonstrate complex interactions between filtering of task-relevant information and more classical signatures of attentional selection.

      Strengths:

      (1) The paper builds on existing modeling work in a novel manner and integrates classic results on attention into the computational framework.

      (2) The authors report new and extensive analyses of existing data that shed light on additional sources of systematic variability in responses related to attentional spillover effects.

      (3) They collect new data using new stimuli in the original paradigm that directly test predictions related to the lateralization of task-relevant information, including eye tracking data that allows them to control for possible confounds.

      (4) The extended model (spotlight-VGC) provides a formal account of these new results.

      Comments on revised version:

      I also agree that the authors addressed our comments and the manuscript is much stronger now.

    4. Reviewer #1 (Public review):

      Summary: This study investigated how visuospatial attention influences the way people build simplified mental representations to support planning and decision-making. Using computational modeling and virtual maze navigation, the authors examined whether spatial proximity and the spatial arrangement of obstacles determine which elements are included in participants' internal models of a task. The study developed and tested an extension of the value-guided construal (VGC) model that incorporates features of spatial attention for selecting simpler mental representations of the task.

      Strengths:

      (1) Original Perspective: The study introduces an explicit attentional component to established models of planning, offering an approach that bridges perception, attention, and decision-making.

      (2) Methodological Approach: The combination of computational modeling, behavioral data, and eye-tracking provides converging measures to assess the relationship between attention and planning representations.

      (3) Cross-validated data: The study relies on the analysis of three separate datasets, two already published and an additional novel one. This allows for cross-validation of the findings and enhances the robustness of the evidence.

      (4) Focus on Individual Differences: Reports of how individual variability in attentional "spillover" correlates with the sparsity of task representations and spatial proximity add depth to the analysis.

      Weaknesses:

      (1) Clarity of the VGC model and behavioral task: The exposition of the VGC model lacks sufficient detail for non-expert readers. It is not clear how this model infers which maze obstacles are relevant or irrelevant for planning, nor how the maze tasks specifically operationalize "planning" versus other cognitive processes.

      The method for classifying obstacles as relevant or irrelevant to the task and connecting metacognitive awareness (i.e., participants' reports of noticing obstacles) to attentional capture is not well justified. The rationale for why awareness serves as a valid attention proxy, as opposed to behavioral or neurophysiological markers, should be clearer.

      (2) Attention framework: The account of attention is largely limited to the "spotlight" model. When solving a maze, participants trace the correct trail, following it mentally with their overt or covert attention. From this perspective, relevant concepts are also rooted in the attention literature on object-based attention, which uses tasks such as curve tracing (e.g., Pooresmaeili & Roelfsema, 2014), and on mental maze solving (e.g., Wong & Scholl, 2024); both may be highly relevant and add nuance to the current work. This view of attention may be more pertinent to the task than the models of simultaneously tracking multiple objects cited here. Prior work (notably from the Roelfsema group) indicates that attentional engagement in curve-tracing tasks may be a continuous, bottom-up process that progressively spreads along a trajectory, in time and space, rather than a "spotlight" that simply travels along the path. The spread of attention depends on the spatial proximity to distractors, a point that could also be pertinent to the findings here.

      Moreover, the tracing of a "solution" trail in a maze may be spontaneous and not only a top-down voluntary operation (Wong & Scholl, 2024), a finding that requires a more careful framing of the link to conscious perception discussed in the manuscript.

      Conceptualizing attention as a spatial spotlight may therefore oversimplify its role in navigation and planning. Perhaps the observed attentional modulation reflects a perceptual stage of building the trail in the maze rather than a filter for a later representation for more efficient decision making and planning. A fuller discussion of whether the current model and data can distinguish between these frameworks would benefit readers.

      (3) Lateralization of attention: The analysis considers whether relevant information is distributed bilaterally or unilaterally across the visual display, but does not sufficiently address evidence for attentional asymmetries across the left and right visual fields due to hemispheric specialization (e.g., Bartolomeo & Seidel Malkinson, 2019). Whether effects differ for left versus right hemifield arrangements is not made explicit in the presented findings.

      (4) Individual differences: Individual differences in attentional modulation are a strength of the work, but similar analyses exploring individual variation in lateralization effects could provide further insight, and the lack of such analyses may mask important effects.

      (5) Distinction between overt and covert attention: The current report at times equates eye movement patterns with the locus of attention. However, attention can be covertly shifted without corresponding gaze changes (see, for example, Pooresmaeili & Roelfsema, 2014).

      The implications for interpreting the relationship between eye movement, memory, and attention in this setting are not fully addressed. The potential dynamics of attention along a maze trajectory and their impact on lateralization analysis would benefit from further clarification.

      Appraisal of Aims and Results:

      The study sets out to determine how spatial attention shapes the construction of task representations in planning contexts. The authors provide evidence that spatial proximity and arrangement influence which environmental features are incorporated into internal models used for navigation, and that accounting for these effects improves model predictions. There is clear documentation of individual variation, with some participants showing greater attentional spillover and more sparse awareness profiles.

      However, some conceptual and methodological aspects would be clearer with greater engagement with the broader literature on attention dynamics, a more explicit justification of operational choices, and more targeted lateralization analyses.

    5. Reviewer #2 (Public review):

      Summary:

      Castanheira et al. investigate the role of spatial attention for planning during three maze navigation experiments (one new experiment and two existing datasets). Effective planning in complex situations requires the construction of simplified representations of the task at hand. The authors find that these mental representations (as assessed by conscious awareness) of a given stimulus are influenced by (spatially) surrounding stimuli. Individual participants varied in the degree to which attention influenced their task representations, and this attentional effect correlated with the sparsity of representations (as measured by the range of awareness reports across all stimuli). Spatially grouping task-relevant information on either the left or right side of the maze led to mental representations more similar to optimal representations predicted by the value-guided construal (VGC) model - a normative model describing a theoretical approach to simplifying complex task information. Finally, the authors propose an update to this model, incorporating an attentional spotlight component; the revised descriptive model predicts empirical task representations better than the original (normative) VGC model.

      Strengths:

      The novelty of this study lies in the proposal and investigation of a cognitive mechanism through which a normative model like value-guided construal can enable human planning. After proposing attention as this mechanism, the authors make concrete hypotheses about mismatches between the VGC predictions and real human behavior, which are experimentally validated. Thus, not only does this study describe a possible mechanism for simplification of task information for planning, but the authors also propose a descriptive model, revising VGC to incorporate this attentional component.

      A strength of this paper is the variety of investigative approaches: analysis of existing data, a novel experiment, and a computational approach to predict experimental findings from a theoretical model. Analyzing pre-existing datasets increases the size of the participant cohort and strengthens the authors' conclusions. Meanwhile, comparing the predictions of the existing normative model and the authors' own refined model is a clever approach to substantiate their claims. In addition, the authors describe several crucial controls, which are key to the interpretability of their results. In particular, the eye tracking results were critical.

      In summary, this paper constitutes an important step toward a more complete understanding of the human ability to plan.

      Weaknesses:

      (1) There is a critical conceptual gap in the study and its interpretation, mainly due to the reliance on a self-report metric of awareness (rather than an objective measure of behavioral performance).

      a. Awareness is tested by a 9-point self-report scale. It is currently unclear why awareness of task-irrelevant obstacles in this task would necessarily compromise optimal planning. There is no indication of whether self-reported awareness affects performance (e.g., navigation path distance, time to complete the maze, number of errors). Such behavioral evidence of planning would be more compelling.

      b. Relatedly, it would have been more convincing to have an objective measure of awareness, for instance, how the presence or absence of a "task-irrelevant" obstacle affects performance (e.g., changes in navigation path distance or time to complete the maze), or whether participants can accurately recall the location of obstacles.

      c. Consequently, I'm not sure that we can conclude that the spatial context does impact participants' ability to plan spatial navigation or to "incorporate task-relevant information into their construal". We know that the spatial context affects subjective (self-reported) awareness, but the authors do not present evidence that spatial context affects behavioral performance.

      d. Another concern that may complicate interpretation is the following: Figure 3c shows improved VGC model predictions (steeper slope) for mazes with greater lateralization. However, there are notable outliers in these plots, where a high lateralization index does not correspond to good model performance. There is currently no discussion/explanation of these cases.

      (2) I noticed an issue with clarity regarding task-relevance. It is currently not fully clear which obstacles are "task irrelevant". Also, the term is used inconsistently and is sometimes conflated with "awareness". For example, in the "Attentional spotlight model of task representations" section, the authors state that "task-relevant information becomes less relevant when surrounded by task-irrelevant information". But they really mean that participants become less aware of those task-relevant obstacles. I assume task-relevance is an objective characteristic related to maze organization, not to a participant's construal. Indeed, the following paragraph provides evidence of model predictions of awareness.

      (3) The behavioral paradigm has some distinct disadvantages, and the validity of the task is not backed up by behavioral data.

      a. I understand the need for central fixation, but it also makes the task less naturalistic.

      b. The task with its top-down grid view does not seem to mimic real human navigation. Though this grid may be similar to mental maps we form for navigation, the sensory stimuli corresponding to possible paths and to spatial context during real-life navigation are very different.

      c. Behavioral performance is not reported, so it is unknown whether participants are able to properly complete the task. The task seems pretty difficult to navigate, especially when the obstacles disappear, and in combination with the central fixation.

      d. There is no discussion of whether/how this navigation task generalizes to other forms of planning.

    6. Reviewer #3 (Public review):

      Summary:

      The authors build on a recent computational model of planning, the "value-guided construal" framework by Ho et al. (2022), which proposes that people plan by constructing simple models of a task, such as by attending to a subset of obstacles in a maze. They analyze both published experimental data and new experimental data from a task in which participants report attention to objects in mazes. The authors find that attention to objects is affected by spatial proximity to other objects (i.e., attentional overspill) as well as whether relevant objects are lateralized to the same hemifield. To account for these results, the authors propose a "spotlight-VGC" model, in which, after calculating attention scores based on the original VGC model, attention to objects is enhanced based on distance. They find that this model better explains participant responses when objects are lateralized to different hemifields. These results demonstrate complex interactions between filtering of task-relevant information and more classical signatures of attentional selection.

      Strengths:

      (1) The paper builds on existing modeling work in a novel manner and integrates classic results on attention into the computational framework.

      (2) The authors report new and extensive analyses of existing data that shed light on additional sources of systematic variability in responses related to attentional spillover effects.

      (3) They collect new data using new stimuli in the original paradigm that directly test predictions related to the lateralization of task-relevant information, including eye tracking data that allows them to control for possible confounds.

      (4) The extended model (spotlight-VGC) provides a formal account of these new results.

      Weaknesses:

      (1) The spotlight-VGC model has a free parameter - the "width" of the attentional spotlight. This seems to have been fixed to be 3 squares. It would be good if the authors could describe a more principled procedure for selecting the width so that others can use the model in other contexts.

      (2) Have the authors considered other ways in which factors such as attentional spillover and lateralization could be incorporated into the model? The spotlight-VGC model, as presented, involves first computing VGC predictions and only afterwards computing spillover. This seems psychologically implausible, since it supposes that the "optimal" representation is first formed and then it gets corrupted. Is there a way to integrate these biases directly into the VGC framework, perhaps as a prior on construals? The authors gesture towards this when they talk about "inductive biases", but this is not formalized.

      (3) Can the authors rule out that the lateralization effects are the result of memory biases since the main measure used is a self-report of attention?
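
      To make points (1) and (2) concrete, the sketch below shows one possible form of a post-hoc spotlight step. It is not the authors' implementation: the Gaussian weighting, the function name `spotlight_vgc`, and the use of `width` as a length scale in grid squares are assumptions made purely for illustration. The sketch takes per-obstacle VGC scores as given and blends each one with the scores of nearby obstacles, which is the "compute VGC first, then apply spillover" ordering questioned in point (2).

```python
import math


def spotlight_vgc(vgc_scores, positions, width=3.0):
    """Blend each obstacle's VGC score with the scores of nearby obstacles.

    vgc_scores: per-obstacle scores from the original VGC model (taken as given).
    positions:  (x, y) grid coordinates of each obstacle, in squares.
    width:      hypothetical spotlight length scale, in squares (free parameter).
    """
    smoothed = []
    for xi, yi in positions:
        # Gaussian weight of every obstacle relative to the current one;
        # the obstacle itself gets weight 1 (distance 0).
        weights = [
            math.exp(-0.5 * (math.hypot(xi - xj, yi - yj) / width) ** 2)
            for xj, yj in positions
        ]
        total = sum(weights)
        # Distance-weighted average of the original VGC scores.
        smoothed.append(
            sum(w * s for w, s in zip(weights, vgc_scores)) / total
        )
    return smoothed


# A relevant obstacle (score 1) next to an irrelevant one (score 0):
# their scores are pulled towards each other.
near = spotlight_vgc([1.0, 0.0], [(0, 0), (0, 1)], width=3.0)

# The same pair far apart: scores are effectively unchanged.
far = spotlight_vgc([1.0, 0.0], [(0, 0), (0, 100)], width=3.0)
```

      In a formulation of this kind, `width` could be fitted per dataset or per participant (for example, by held-out prediction of awareness reports) rather than fixed in advance, which is one way to address point (1).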

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors derive a mean-field model for a network of Hodgkin-Huxley neurons retaining the equations for ion exchange between the intracellular and extracellular space. The mean-field model derived in this work relies on approximations and heuristic arguments that, on the one hand, allow a closed-form derivation of the mean-field equations, and on the other hand restrict its validity to a limited regime of activity corresponding to quasi-synchronous neuronal populations. Therefore, rather than an exact mean-field representation, the model provides a description of a mesoscopic population of connected neurons driven by ion exchange dynamics.

      Strengths:

      The main strength is the idea of deriving a mean-field model that relates the slow-timescale biophysical mechanisms of ion exchange and transport in the brain to the fast-timescale electrical activity of large neuronal ensembles.

      Weaknesses:

      The idea underlying this work is not completely implemented in practice.

      The derived mean-field model does not show a one-to-one correspondence with the neural network simulations, except in strongly synchronous regimes. The agreement with the in vitro experiment is hardly evident, both for the mean-field model and for the network model. The assumptions made to derive the closed-form equations of the mean-field model have not been justified on biological grounds; they merely make the mathematical derivation possible. The final form of the mean-field equations does not clarify whether microscopic variables are used together with macroscopic variables in an inconsistent mixture.

      Comments on revisions:

      The main weaknesses I listed in the first report are still present, since the authors did not answer my questions convincingly. I repeat the list here for completeness:

      (1) It seems that the reduction methodology that is employed is not the most suitable one for the single-neuron model they are considering.

      (2) The formulation of the mean-field derivation is unnecessarily complicated. It could be heavily simplified by following previously published approaches to derive biologically realistic neural masses.

      (3) The model seems to work only for highly synchronized situations and not for the standard asynchronous evolution usually observed in neural circuits.

      Therefore, my statement remains unchanged.

    2. Reviewer #2 (Public review):

      Summary:

      The authors aim to develop a neural mass model, characterized by a few collective variables, that mimics the dynamics of a network of Hodgkin-Huxley neurons incorporating ion-exchange mechanisms. They describe in detail the derivation of the mean-field model, then compare experimental results obtained from the mouse hippocampus with the neural network simulations and the mean-field results. Furthermore, they report a bifurcation analysis of the developed model and simulations of a small network of coupled neural masses, a step towards the simulation of an entire connectome.

      Strengths:

      The authors attempt to develop a mean-field model for a globally coupled network of heterogeneous Hodgkin-Huxley neurons with an explicit ion-exchange mechanism between the cell interior and exterior.

      Weaknesses:

      (1) They do not employ the reduction methodology best suited to the single-neuron model they consider.

      (2) Their derivation of the neural mass model is based on several assumptions, not all of which are well justified.

      (3) Their formulation of the mean-field derivation is unnecessarily complicated; it could be greatly simplified by following previously published approaches to derive biologically realistic neural masses.

      (4) Their model seems to work only for highly synchronized situations and not for the standard asynchronous evolution usually observed in neural circuits.

      General Statements:

      The authors honestly declare the many limitations of their approach. Given these limitations, the mean-field results are, as expected, somewhat inconsistent with the neural network simulations.

      The authors suggest employing this model for whole-connectome simulations to follow seizure propagation; however, I believe that a simpler model, such as the Epileptor, remains superior in this respect. The present model does include biophysical parameters, but their correspondence with those employed in the network dynamics remains elusive because of the many assumptions required to derive the mean-field model. Furthermore, since it is more complicated than the Epileptor, I do not think that the present model will be widely adopted by the community.

      Comments on revisions:

      The authors have corrected the mistakes present in the manuscript and provided a correct list of references.

      However, they refuse:

      (1) To simplify the formulation of the model. The model contains unnecessary complications, as I clearly wrote in my report; the authors agree, but they do not want to change the formulation;

      (2) To derive the mean-field model in a simpler way, which is possible and which I requested repeatedly in my referee report. This would help readers understand the important aspects of the derivation without unnecessary and confusing complications;

      (3) To compare direct simulations of the network with neural mass results in the sub-section "Bifurcation analysis: emergent network states and multistability" to show bistability, as I asked.

      As a matter of fact, the modifications made do not resolve my previous doubts about the validity of the results reported in the manuscript.

      Therefore, my previous assessments remain valid.

    1. Reviewer #1 (Public review):

      Pyne and Pandey et al. report the observation of early DNA degradation at the phagocytic cup during macrophage engulfment. Using an elegant experimental system that combines actin staining to visualise cup formation with direct monitoring of DNA degradation, the authors identify rapid recruitment of the membrane-bound nuclease DNase X (DNase1L1) to nascent phagocytic cups. This recruitment occurs within minutes of cup formation, is independent of DNA presence at the substrate, and appears to originate from intracellular membrane structures rather than from the extracellular environment. The results support the conclusion that DNase X activity is present at the phagocytic cup and that DNA digestion can begin prior to phagolysosomal maturation.

      The study is technically strong. The experimental system is clean, specific, and allows precise spatial and temporal detection of DNA degradation. The imaging-based approaches are carefully executed and enable convincing visualisation of DNase X recruitment and activity. The use of an alternative substrate beyond the primary SNS system strengthens the core observation, and the data broadly support the authors' central claim.

      However, several limitations temper the physiological interpretation. The system relies largely on short, free DNA substrates, leaving open how efficiently DNase X processes more complex or physiologically relevant DNA structures, such as nucleosome-bound DNA or neutrophil extracellular traps (NETs). It remains unclear whether DNase X deficiency would alter macrophage responses to larger nucleic acid structures, influence engulfment efficiency, or modify downstream inflammatory signalling pathways such as TLR9 or STING activation. Moreover, the experimental setup prevents full phagocytic cup closure, potentially prolonging DNase activity compared with physiological phagocytosis, which typically proceeds rapidly to cargo internalisation. For example, the peak signal observed in Figure 5 occurs approximately 90 minutes after phagocytic cup formation, a time point at which many phagocytic cups would be expected to have already closed under physiological conditions. Additional work using fully engulfed cargo in more physiological contexts would clarify whether early DNase X activity meaningfully contributes to overall DNA clearance kinetics.

      Mechanistically, the signal that triggers DNase X recruitment remains unresolved. Although actin rearrangement was excluded as the primary driver, the upstream cues that direct DNase X-containing membrane structures to the forming cup are not yet defined.

      In the broader context, early DNase X activity at the phagocytic cup could represent an additional safeguard against inflammatory signalling by limiting extracellular or surface-associated DNA before phagolysosomal degradation by DNase II. This mechanism may be particularly relevant in settings where DNA fragmentation before engulfment is incomplete, such as necroptosis or NET formation. Determining whether DNase X deficiency exacerbates inflammatory responses, alters DNA clearance efficiency in vivo, or contributes to immune pathology will be critical for establishing its physiological and disease relevance.

      Overall, this is a compelling study that introduces a novel concept of pre-phagolysosomal DNA digestion. The conclusions are well supported within the in vitro system used, but further investigation using diverse DNA substrates and physiologically relevant models will be required to fully define the impact of this mechanism on immune regulation and disease.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript presents an elegant and innovative imaging approach to visualize DNase activity at the interface between macrophages and extracellular substrates. The platform is technically strong and enables the study of localized DNA degradation with high spatial resolution. The work is of clear interest and provides a useful framework to investigate how immune cells process extracellular DNA. However, several aspects of the mechanistic interpretation and conceptual framing would benefit from clarification.

      Strengths:

      (1) The study introduces a creative and well-designed imaging platform that allows visualization of localized DNase activity at cell-substrate interfaces.

      (2) The approach is technically robust and represents a valuable tool that could be broadly useful to the field.

      (3) The experiments are thoughtfully designed and address an important question regarding how immune cells interact with extracellular DNA.

      (4) The work opens interesting avenues for studying DNA processing in contexts such as infection and inflammation.

      Weaknesses:

      While the experimental approach is strong, several key conclusions rely on interpretations that would benefit from further clarification:

      (1) First, the conclusion that DNaseX is recruited to phagocytic cups from the "cytoplasm" appears conceptually imprecise. Given that DNaseX is a membrane-anchored protein, it is unlikely to exist as a freely soluble cytoplasmic pool. A more plausible interpretation is that DNaseX is supplied from intracellular membrane compartments. This interpretation would also be more consistent with the data showing dependence on a membrane anchor.

      (2) Second, the interpretation that actin polymerization is not required for DNaseX recruitment raises concerns. Phagocytic cup formation is known to depend strongly on actin dynamics, and it is therefore unclear whether the structures observed under actin inhibition represent fully formed functional cups or partial cell-substrate contacts. This distinction is important for interpreting recruitment versus activity, particularly since enzymatic activity is reduced under these conditions.

      (3) Third, the identification of DNaseX as the main nuclease responsible for the observed activity is not fully resolved. The conclusions rely primarily on gene silencing and staining approaches, but the specificity of these strategies relative to other nucleases is not addressed. It therefore remains possible that additional enzymes contribute to the observed activity.

      (4) Finally, the interpretation of the biofilm experiments may be overstated. While the data clearly show localized DNA degradation in contact with macrophages, it is not fully established that this process depends specifically on phagocytic cup structures. An alternative explanation is that membrane-associated DNase activity more generally mediates this effect. In addition, the physiological relevance of this mechanism would benefit from further discussion.

      Overall, the study is technically strong and introduces a valuable methodology, but several central conclusions are only partially supported by the current data and would benefit from more cautious interpretation and clearer conceptual framing.

    1. Reviewer #1 (Public review):

      Summary:

      During erythroid differentiation, hematopoietic progenitors relinquish multipotency and activate lineage programs. The switch from GATA2 to GATA1 is particularly important in this process, yet GATA2 chromatin‑binding kinetics remain undefined. The authors investigated GATA2-chromatin interaction dynamics during erythroid differentiation in three different cell systems using single‑molecule live‑cell imaging, and they also used CUT&Tag to profile GATA2 chromatin occupancy.

      By single‑molecule imaging, the authors report two interaction modes for GATA2: short‑lived (<1 s) and long‑lived (>5 s) binding. The proportion of long‑lived molecules, the number of binding events, and the duration of long‑lived binding change (or are maintained) during differentiation. Notably, long‑lived chromatin engagement by GATA2 increases during early erythroid differentiation and decreases at the late stage. CUT&Tag identifies regulatory elements selectively occupied by GATA2 during the early transition stage. Together, these results support a model in which transcription factor kinetics form a dynamic chromatin‑engagement profile that characterizes the GATA2‑to‑GATA1 transition.

      Strengths:

      (1) Characterizing transcription‑factor binding kinetics during the GATA2->GATA1 transition addresses a fundamental mechanism in erythroid differentiation.

      (2) Combining single‑molecule live imaging with CUT&Tag provides both dynamic and locus‑specific perspectives.

      (3) Single-molecule analysis across three different cell systems strengthens the potential generalizability of the findings and highlights biological variability.

      Weaknesses:

      I agree that single‑molecule imaging is a powerful approach for investigating GATA2 kinetics, but the single‑molecule data are the most important part of the paper and need improvement. The analyses focus on three measures: (i) duration of long binding, (ii) proportion of short‑ and long‑binding molecules, and (iii) total binding events. However, several methodological and control issues limit confidence in the kinetic interpretations. The authors should address the following major concerns.

      (1) Two binding states: justification and controls

      The authors propose two states of GATA2 binding. Are there only two states? Studies that separate short‑ and long‑lived binding (e.g., Chen et al., 2014, PMID: 25342811) address two states of transcription factors very carefully. Some long‑binding duration distributions here are very long‑tailed (e.g., Figure 2D middle), suggesting a possible third state. The authors must explain how they determined that two states provide the "best fit" to the data and how they classified "short" versus "long" binding.

      Controls should be included for long‑lived and short‑lived binding (e.g., histone proteins, HaloTag‑NLS, or a binding‑deficient GATA2 mutant) as in other studies. These controls are essential to exclude alternative explanations (see points below).

      (2) Exclude photophysical and focal‑plane artifacts

      The authors should exclude contributions from (i) photobleaching, (ii) blinking, and (iii) Z‑axis motion (disappearance from the focal plane). Although photobleaching correction is mentioned in the Methods, no details are provided. Describe and quantify the photobleaching correction and demonstrate that it was applied across all cell types and conditions.

      Some spots in the supplementary movies appear to blink or to move substantially between frames. Provide analyses or controls that distinguish true dissociation events from photophysical blinking/bleaching or axial motion.

      (3) HILO illumination and nuclear region sampled

      HILO is powerful but sensitive to illumination angle: slight changes sample different nuclear regions (e.g., nuclear interior versus periphery). The nuclear periphery is enriched in heterochromatin and may bias binding statistics. Explain how the authors controlled the HILO angle and confirmed that comparable nuclear regions were imaged across cells and conditions.

      (4) Quantification of event counts and long‑binding durations

      The number of binding events and measured long‑binding durations are strongly affected by imaging conditions (labeling/staining, bleaching, nucleus size, cell cycle state, focal plane, spot detectability, etc.). Imaging clarity appears to differ among cells/conditions in the supplementary movie. Provide more careful analysis describing how these variables were controlled or corrected for, and assess the sensitivity of results to choices in detection and tracking parameters.

      (5) Evidence that spots are single molecules

      The authors state that spots represent single molecules but do not provide supporting evidence. Spot brightness varies considerably in the movies. Brightness differences may reflect axial position. Provide evidence supporting single‑molecule assignment (e.g., single‑step photobleaching traces, brightness distributions compared to a known single‑molecule control, or photon count analysis).

      (6) Description of spot‑analysis pipeline

      The manuscript lacks a sufficient description of the spot‑analysis method. I reviewed the STRAP pipeline paper cited (Haque and Coleman 2025 bioRxiv) and the GitHub code, but the Methods in the current manuscript should include a detailed description of the STRAP pipeline. This would enable readers to evaluate and reproduce the analyses.

      (7) Differences among cell systems

      The three cell systems yield notably different results (e.g., Figure 2C vs 4C and Figure 2D/3D vs 4D). Provide a more detailed explanation for these differences and discuss how biological variability, technical differences, or imaging biases might account for the discrepancies.
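
      As an illustration of the kind of analysis requested in points (1) and (2), the following is a minimal sketch and not the authors' STRAP pipeline. It assumes dwell times are available as a plain list of durations, fits one-, two-, or three-component exponential mixtures by expectation-maximization, compares them with the Bayesian information criterion, and applies a simple photobleaching correction that assumes bleaching and dissociation are independent first-order processes. All function names are hypothetical, and the sketch ignores truncation from a minimum detectable track length.

```python
import math


def fit_exponential_mixture(durations, n_components, n_iter=200):
    """EM fit of a mixture of exponentials to dwell times.

    Returns (weights, rates, log_likelihood); rates are in 1/time units.
    """
    n = len(durations)
    mean = sum(durations) / n
    # Start with rates spread geometrically around 1/mean (fastest first).
    rates = [4.0 ** ((n_components - 1) / 2 - k) / mean for k in range(n_components)]
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        resp_sum = [0.0] * n_components   # summed responsibilities per component
        resp_time = [0.0] * n_components  # responsibility-weighted dwell time
        for t in durations:
            dens = [w * r * math.exp(-r * t) for w, r in zip(weights, rates)]
            total = max(sum(dens), 1e-300)  # guard against underflow
            for k, d in enumerate(dens):
                resp_sum[k] += d / total
                resp_time[k] += d / total * t
        weights = [s / n for s in resp_sum]
        rates = [max(s, 1e-12) / max(st, 1e-12) for s, st in zip(resp_sum, resp_time)]
    log_lik = sum(
        math.log(max(sum(w * r * math.exp(-r * t) for w, r in zip(weights, rates)), 1e-300))
        for t in durations
    )
    return weights, rates, log_lik


def bic(log_lik, n_components, n_obs):
    """Bayesian information criterion; lower values indicate the preferred model."""
    n_params = 2 * n_components - 1  # one rate per component plus free mixing weights
    return n_params * math.log(n_obs) - 2.0 * log_lik


def bleach_corrected_residence_time(k_apparent, k_bleach):
    """Residence time after subtracting the photobleaching rate from the apparent off-rate."""
    k_off = k_apparent - k_bleach
    if k_off <= 0:
        raise ValueError("apparent rate does not exceed bleaching rate; dwell time is bleach-limited")
    return 1.0 / k_off
```

      Comparing `bic(...)` for two and three components would show directly whether a third state is supported by the long-tailed distributions, and the bleaching rate could be estimated from a stably bound control such as a histone fusion imaged under matched conditions.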

    2. Reviewer #2 (Public review):

      In this study, the authors address the molecular mechanism underlying the transcriptional changes during erythroid differentiation from hematopoietic progenitor cells. The authors combine single-molecule live cell imaging and CUT&RUN to analyze the chromatin binding properties of the GATA2 transcription factor prior to and after initiation of differentiation into the erythroid cell lineage. Using three distinct cellular systems, the authors demonstrate that the chromatin binding of GATA2 is transiently increased early in the differentiation process, as evidenced by increased chromatin binding residence time and the emergence of new genomic binding sites identified by CUT&RUN. The strength of the study lies in the combination of single-molecule imaging, which reports on binding dynamics but is agnostic of the binding site, with CUT&RUN, which reports on the binding sites but does not provide dynamic information. The authors clearly demonstrate that chromatin binding of GATA2 is altered early in the differentiation process and is later displaced as cells switch to expression of GATA1, which has been previously observed. The use of three distinct cell lines, in particular the GATA2-SNAP mouse model, is a strength in principle; however, the results are not fully consistent between the different cell systems. A key difference is that the G1E-ER4 and HPC7 cell line models express HaloTagged GATA2 in addition to the endogenous GATA2 protein. The authors go to great lengths to control GATA2-HaloTag expression levels, but they use polyclonal cell lines and do not analyze expression levels of the GATA2-HaloTag transgene, which is a key variable in interpreting their experimental results. Finally, a key variable determined in their single-molecule analysis is the number of binding events observed during the distinct differentiation stages. The number of binding events observed is influenced by the expression level of the tagged protein, which in turn is controlled by the Shield-1 ligand, and the fraction of molecules labeled with the HaloTag ligand. Since transgene protein levels and the labeling efficiency were not determined, it is hard to assess how reliable the measurements of the number of binding events are across all cell lines.

      To address the weaknesses summarized above the authors could take the following steps:

      (1) Determine the expression levels of the GATA2-HaloTag transgene over the course of differentiation under the conditions used for single-molecule imaging. This will not only allow them to determine the expression of the transgene but also the endogenous untagged protein with which the GATA2-HaloTag fusion proteins compete for binding sites.

      (2) To determine the fraction of molecules labeled during imaging, the authors could carry out a titration of the HaloTag ligand and compare the amount of labeled protein under single-molecule imaging conditions to that of saturating labeling of the HaloTag. This approach will ensure that the number of labeled molecules per cell is comparable across experimental conditions and allow the authors to draw more solid conclusions regarding the number of binding events.

      (3) The analysis of residence times using single-molecule imaging requires robust single-particle tracking without gaps or interruptions of trajectories. The authors should show images of their particle trajectories or, even better, movies superimposing the trajectories onto the imaging data, to demonstrate that their tracking is robust.

    3. Reviewer #3 (Public review):

      Hobbs et al. use live-cell single-molecule tracking (SMT) of HaloTag- and SNAP-tagged GATA2 combined with CUT&Tag chromatin profiling to examine how GATA2 chromatin engagement evolves during erythroid differentiation. Across three complementary systems, G1E-ER4 cells, HPC7 cells, and primary bone marrow progenitors from a new Gata2-SNAP knock-in mouse, they report a transient strengthening of long-lived GATA2 chromatin binding at the "Early" (2 h) erythroid stage, manifested either as increased residence time (G1E-ER4) or expansion of the long-lived bound fraction (HPC7, primary cells). CUT&Tag identifies 1,167 Early-restricted GATA2 peaks partitioning into GATA2-only (promoter-proximal, GATA/RUNX motifs) and GATA2+GATA1 co-bound (distal, GATA/E-box motifs) subclasses. The authors propose that this kinetic phase represents a previously unappreciated dimension of the GATA switch.

      This is a strong study with a genuinely novel finding, the non-monotonic kinetic behavior of GATA2 during erythroid priming, supported by complementary measurements in three biological systems. The issues below are largely clarifications, additional analyses of existing data, and modest refinements to the discussion. With these addressed, the manuscript will make a valuable contribution. I recommend a minor revision.

      Specific points:

      (1) Clarify the photobleaching correction and report per-cell bleach lifetimes.

      The long-lived residence time claim in G1E-ER4 cells depends on careful accounting for photobleaching, which the Methods indicate was done via a right-censoring model. For reviewer and reader confidence, the authors should report the per-stage (or per-cell distribution of) photobleaching lifetimes and the photobleach-corrected residence time values alongside the apparent values in Figure 2D. If feasible, including a brief supplementary analysis with an H2B-Halo or similar long-lived control under matched conditions would further solidify the quantitative claims. This is an analysis of existing data and should not require new imaging.

      (2) Unify or explicitly discuss the mechanistic differences across systems.

      The three systems show qualitatively different signatures: a residence time change in G1E-ER4, and a bound fraction expansion in HPC7 and primary cells. The authors currently group these under "enhanced engagement," but these signatures imply different underlying mechanisms (koff decrease vs. increased kon or increased specific-binding-competent pool). The Discussion partially addresses this by noting engineered vs. native differences, but a more explicit framing in both Results and Discussion would help readers. Specifically, reporting an on-rate proxy (for example, binding events per unit time normalized to detectable molecule number) alongside koff would let readers see how the mechanistic pieces fit together. I do not think this changes the central message; it sharpens it.

      (3) Per-cell GATA2 concentration would strengthen the "uncoupling" claim.

      A central claim of the Figure 6 model is that chromatin engagement is uncoupled from protein abundance. The ectopic Shield-1 stabilization system is a reasonable design choice, but quantifying total nuclear GATA2-Halo signal (for example, from the pre-bleach frame or a brief high-power acquisition) on a per-cell basis across stages would directly support the interpretation. For the primary cells, where the biological claim is strongest, a western blot or quantitative immunofluorescence on the flow-sorted populations would make the uncoupling argument much more defensible. I recognize this may be one additional experiment, but it is a high-value one.

      (4) Additional single-cell distribution analysis.

      Figure 1E and Figures 2 to 4 show substantial cell-to-cell heterogeneity, and the Early populations in particular look potentially bimodal. Given that the authors cite Wheat et al. and Palii et al. on probabilistic hematopoietic transitions, a brief supplementary analysis using distribution-based statistics (K-S test, or mixture model) rather than, or alongside, mean-based ANOVA would align the analysis with this conceptual framing and may reveal whether the Early state represents a subpopulation transition rather than a uniform shift. This is purely an analysis of existing data.

      (5) Quantitative integration of CUT&Tag with SMT.

      The manuscript presents SMT and CUT&Tag as complementary but does not attempt to quantitatively connect them. A back-of-the-envelope calculation of whether a 21% increase in residence time (G1E-ER4), or the fraction expansion in other systems, is consistent with the acquisition of the 1,167 Early-restricted sites, given plausible site affinities, would substantially strengthen integration. Even if the calculation is approximate, framing it explicitly would help readers appreciate that the two datasets reinforce each other.

      (6) Short-lived kinetic interpretation and tracking parameters.

      The 1.5 s gap allowance in tracking is long relative to the 0.55 to 0.73 s short-lived residence times reported in primary cells (Figure Supplement 1F), which could affect the interpretation of the "slowing of target search" claim. A brief sensitivity analysis with tighter gap parameters in the supplement would reassure readers that this effect is robust. Additionally, please clarify how the inferred slowing of search, which should reduce kon, is reconciled with the increased number of binding events per cell at the Early stage.

      (7) CUT&Tag peak definition.

      The Early-restricted peak set is defined by presence and absence at q less than 0.01, which can be sensitive to near-threshold peaks. Please report either (a) the CUT&Tag signal intensity distribution at the 1,167 sites across all three stages as a quantitative scatter or density plot, beyond the heatmap in Figure 5C, or (b) the result of a differential binding analysis (for example, DESeq2 on read counts in a union peak set) as a supplementary confirmation. Please also state the number of CUT&Tag replicates per stage and the overlap of Early-restricted sets across replicates.

      (8) Knock-in mouse validation.

      The Gata2-SNAP allele is a valuable new tool, and it would benefit from slightly more quantitative validation in the supplement. A brief characterization of basic hematopoietic parameters in homozygotes (CBC, LSK/HSPC frequencies, or colony assays) would confirm that the tagged allele is truly physiological and would serve the community that will want to use this mouse going forward. If this has been done, please include it; if not, a statement about what was checked would suffice.
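
      To make the distribution-based comparison suggested in point (4) concrete, here is a minimal sketch of a two-sample Kolmogorov-Smirnov test on per-cell values. It is an illustration only: the function name is hypothetical, the inputs are assumed to be plain lists of one value per cell (for example, long-lived bound fraction at two stages), and the p-value uses the standard asymptotic approximation, which is rough for small numbers of cells.

```python
import math


def ks_two_sample(a, b):
    """Two-sample Kolmogorov-Smirnov statistic D and an asymptotic p-value."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    # Walk both sorted samples and track the largest gap between empirical CDFs.
    while i < n and j < m:
        x = min(a[i], b[j])
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    # Asymptotic Kolmogorov distribution with a small-sample correction factor.
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.2:
        return d, 1.0  # the series below is numerically 1 in this range
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam) for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)
```

      A call such as `ks_two_sample(early_values, progenitor_values)` (both names hypothetical) returns the maximum distance between the two empirical distributions and its approximate significance; a mixture model would be the natural next step if the Early population is indeed bimodal.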

    1. Reviewer #1 (Public review):

      This manuscript is very interesting and timely. By introducing the critical effects of desolvation barriers and solvent (water)-separated minima into the implicit-solvent potentials (of mean force, PMFs) for coarse-grained molecular dynamics simulations of biomolecular liquid-liquid phase separation (LLPS), this work fills a gap that should be apparent to researchers of protein folding in the past couple of decades but has so far escaped deserved attention such that these basic features of aqueous solvation have seldom, though not never, been invoked in recent studies of biomolecular condensates. Although the present paper deals almost exclusively with homopolymers, this work can be a foundation for the future development of a new, more physical coarse-grained interaction scheme for simulating amino acid sequence-dependent effects, which I presume is the authors' ongoing or next endeavor. The results presented in this manuscript are highly valuable.

      However, there is room for improvement in the authors' description of (i) the broader impact of effects of desolvation barrier and solvent-separated minimum in the thermodynamics of biomolecular condensates, especially with regard to the ramifications on hydrostatic pressure-dependent effects; (ii) the physical implication of using a 20-parameter hydropathy scale rather than a 210-parameter pairwise amino acid interaction scheme; and (iii) temperature-dependent effects, including the authors' discussion of "enthalpic" and "entropic" contributions. In all these aspects, the authors' discussion should be put in a more comprehensive context of the existing literature. At a few other places, the description of the methods and results should be clarified as well. Accordingly, the authors should revise the manuscript to address the following items thoroughly within the revised manuscript (not merely in the response letter) with the additional references mentioned below included in the revised discussion:

      (1) In several places, e.g., on line 77 (p.2), the authors appear to suggest that "implicit-solvent representation" is the origin of the deficiency in commonly utilized coarse-grained potentials that this study is aiming to rectify. But desolvation barriers and solvent-separated minima are also features of implicit-solvent representations; they are just features that should be incorporated in more accurate implicit-solvent potentials. This point is stated quite clearly and accurately in the Abstract (p.1) but not consistently in the rest of the text. The authors should check the entire text carefully to ensure that a coherent, accurate perspective is presented.

      (2) In the discussion of the importance of desolvation barriers and solvent-separated minima in the Introduction (pp.1-3), connections should be drawn to recent works that utilize these PMF features to rationalize hydrostatic pressure (P)-modulated effects on biomolecular LLPS, including the P-dependent reentrant phase separation of alpha elastin; see Cinar et al. (2019) Chem Eur J 25:13049 (https://chemistry-europe.onlinelibrary.wiley.com/doi/full/10.1002/chem.201902210) and references therein, especially discussions around Figures 10, 11 & 13 in this reference.

      (3) In the lower panels of Figures 2D, E (p.5), what do the differently colored small circles in the double-minimum free energy profiles represent? Does the color shading have the same meaning as that in the upper panels? If so, what do the positions of the circles on the free energy profile represent? The authors should clarify this.

      (4) The discussion regarding entropy and enthalpy around Figure 2 is quite confusing as it stands. What do the authors mean exactly by the association of entropy or enthalpy with the desolvation barrier or the solvent-separated minimum? Are they referring to conformational entropy?

      (5) Do the authors assume that the PMF (effective implicit-solvent potential) is a purely enthalpic term? It appears to be the authors' assumption. If so, the assumption has to be stated clearly in their discussion of "entropy" vs "enthalpy" around Figure 2.

      (6) Closely related to points 3-5 above, it should be stated clearly that the "temperature" used in the authors' simulations does not represent experimental temperature if the authors are using purely enthalpic effective potentials because PMFs are in fact temperature-dependent. This clarification is necessary to avoid misunderstanding. In this regard, it should be noted that temperature-dependent effective interactions have been used for modeling biomolecular condensates in analytical theory (Lin, Song, Forman-Kay & Chan, J Mol Liq 2017, already in the citation list) as well as in coarse-grained molecular dynamics simulations [Dignon et al. (2019) ACS Cent Sci 5:821-830 (https://pubs.acs.org/doi/10.1021/acscentsci.9b00102); Chakravarti & Joseph (2025) Protein Sci 34:e70284 (https://onlinelibrary.wiley.com/doi/10.1002/pro.70284)]. The latter two studies, not cited currently, are particularly relevant and thus should be cited because the authors may wish to incorporate temperature-dependent features in their ongoing or future effort in constructing a more comprehensive coarse-grained interaction scheme for biomolecular LLPS simulation.

      (7) In tackling "entropy" vs "enthalpy", it should be noted that the temperature dependence of the effective interactions entails an entropic contribution (which is itself temperature dependent) in addition to conformational entropy. As for the effective potential with desolvation barrier and solvent-separated minimum, it should be noted that the decomposition into entropic and enthalpic contributions at the direct contact, desolvation barrier, and solvent-separated minimum can be dramatically different, see, e.g., MacCallum et al. (2007) PNAS 104:6206-6210 (https://www.pnas.org/doi/full/10.1073/pnas.0605859104) and references therein.

      (8) P.7, line 340: The proportionality relation follows directly from the standard Flory-Huggins result T_c = T chi(T)/chi_c, thus the proportionality constant is exactly 1/chi_c. Is this the standard relation that the authors are invoking here? The authors should clarify this.

      (9) The study on dynamic consequences on pp.8-11 is interesting, but clarifications are necessary:

      (i) The vertical schematic in Figure 4A should be explained in detail in its entirety. As it stands, no explanation is provided either in the figure caption or in the text. In particular, what does "elasticity driven" refer to?

      (ii) The top snapshot in Figure 4A is labeled t_sim = 0 ns. Does it mean that the snapshot shown is the only chain configuration that the authors used to start the simulation, and that the snapshot does NOT represent the result of any time evolution, no matter how short the duration is? However, if that is the case, why is this snapshot identified with spinodal decomposition if it is not the product of a time evolution from a more homogeneous configuration?

      (iii) Related to (ii) - do the rectangular boxes shown represent the entire simulation box or just part of the box containing the polymer chains? One would imagine that if the top snapshot represents spinodal decomposition, the simulation would have been started at a more uniform distribution a short time prior? Why is this not the case?

      (iv) What precisely do the small yellow beads and black-colored springs in the zoom-in image of Figure 4E represent?

      (10) In discussing dynamic effects, it is useful to draw connections to related works on the effect of chain flexibility on "aging" of condensates [Biswas & Potoyan (2024) PRX Life 2:023011 (https://journals.aps.org/prxlife/abstract/10.1103/PRXLife.2.023011)] and characterization of viscoelasticity in simulations of biomolecular condensates [Tejedor et al. (2023) J Phys Chem B 127:4441-4459 (https://pubs.acs.org/doi/10.1021/acs.jpcb.3c01292)], as the effects of desolvation can be explored further based on these prior works.

      (11) Much of the present study is based on the original HPS formulation of Dignon et al. (2018). In this regard and also in anticipation of future development of improved interaction schemes, several issues should be stated and discussed, even if briefly:

      (i) The original HPS model has a basic shortcoming in accounting for the relative interaction strengths of, among others, arginine vs lysine residues [Das et al. (2020) PNAS 117:28795-28805 (https://www.pnas.org/doi/10.1073/pnas.2008122117)].

      (ii) Compared to 210-parameter pairwise interaction schemes, such as KH in Dignon et al. (2018) and Joseph et al. (2021), the 20-parameter interaction scheme is likely too restrictive to account for pairwise amino acid residue interactions [Wessén et al. (2022) J Phys Chem B 126:9222-9245 (https://pubs.acs.org/doi/10.1021/acs.jpcb.2c06181)].

      (iii) The height of the desolvation barrier may vary significantly for different amino acid residue pairs, see, e.g., Figure 11 of Cinar et al. (2019) mentioned above (and references therein). The authors should discuss these nuances in the revised version. They may also wish to take them into consideration in future investigations.
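
      For reference, the relation queried in point (8) can be written out as below. This is a sketch under the assumption that the effective interaction is purely enthalpic, so that the Flory-Huggins parameter scales as 1/T; the constant A is a symbol introduced here for illustration, and the last line is the standard critical value for a homopolymer of N segments in a solvent of unit size.

```latex
\begin{align}
  \chi(T) &= \frac{A}{T}
    \quad\Longrightarrow\quad T\,\chi(T) = A \quad \text{(independent of } T\text{)}, \\
  \chi(T_c) &= \chi_c
    \quad\Longrightarrow\quad T_c = \frac{A}{\chi_c} = \frac{T\,\chi(T)}{\chi_c}, \\
  \chi_c &= \frac{1}{2}\left(1 + \frac{1}{\sqrt{N}}\right)^{2}.
\end{align}
```

      Under these assumptions the proportionality constant between T_c and T chi(T) is exactly 1/chi_c; if the effective interaction carries its own temperature dependence, T chi(T) is no longer constant and the simple proportionality does not hold.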

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript addresses an important and timely question in the molecular simulation of biomolecular condensates. Most residue-level coarse-grained models used for IDP phase separation employ implicit solvent and represent effective interactions through relatively simple pairwise potentials. While these models have been very useful, they usually do not explicitly distinguish direct contacts from solvent-separated interactions, nor do they include an energetic barrier associated with water removal. This manuscript attempts to address that limitation by introducing desolvation-inspired terms into coarse-grained models and examining their consequences for phase behavior, chain conformations, dense-phase packing, and dynamics.

      Strengths:

      The central idea is physically well motivated. Using a simple homopolymer model, the authors show that increasing the desolvation barrier suppresses phase separation, whereas stabilizing solvent-separated contacts enhances phase separation. They further show that solvent-separated interactions can reduce dense-phase over-compaction, which is a meaningful result given the known challenges in obtaining both accurate single-chain dimensions and realistic dense-phase properties from the same coarse-grained model. The finding that desolvation-like terms can reshape dense-phase packing without simply rescaling the overall interaction strength is interesting and could be useful for future model development. I also found the attempt to connect conformational changes across dilute and dense phases with thermal distance from the critical point to be intriguing. The dynamic analysis, including the FRAP-like simulations and the discussion of kinetic arrest during coarsening, adds another useful dimension to the work.

      Weaknesses:

      At the same time, there are several places where the manuscript would benefit from more careful framing. First, the desolvation terms are still effective coarse-grained parameters rather than a direct representation of water molecules. The language sometimes gives the impression that desolvation is being treated explicitly, whereas the model introduces desolvation-inspired effective interactions into an implicit-solvent framework. Second, the conformational analysis is interesting, but the broader context of prior work on dilute-to-dense phase conformational reorganization of IDPs could be more clearly discussed. This would help clarify what is new in the present work, whether it is the conformational change itself, its dependence on desolvation terms, or the proposed scaling with distance from the critical point. Third, the dynamic results are potentially useful, but the manuscript should more clearly articulate what is nontrivial beyond the expected slowing of local rearrangements by an added barrier in the potential.

      Overall, I think this is a useful and potentially important contribution.

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript presents an original quantitative approach for tracking the online formation and updating of prior beliefs. In an Alternating Serial Reaction Time task, participants were exposed to probabilistic visual streams, and their pre-stimulus saccadic behavior (i.e., the first eye movement after the previous stimulus disappeared) was monitored via eye-tracking. Since the stimuli followed an alternating probabilistic sequence, upcoming events did not appear with full certainty: some stimuli had a higher, some a lower probability. By comparing anticipatory oculomotor behavior between high and low probability events, the authors dissociated learning/belief updating from general oculomotor noise. Noise-driven errors were more frequent than learning-dependent errors, which nonetheless triggered more belief updating (i.e., a change in oculomotor behavior in a subsequent encounter of the same event). Interestingly, updating depended more strongly on whether a prior belief was consistent with the task's probabilistic structure than on prediction errors. These findings suggest that incidental, implicit statistical learning may rely on conservative updating with a relatively low learning rate, or on errorless algorithms, rather than prediction errors per se.

      Strengths:

      By applying a fine-grained analysis of anticipatory oculomotor behavior, this work establishes new continuous metrics to quantify the gradual learning and refinement of prior expectations during statistical learning. These metrics provide convincing evidence of the dynamics of anticipatory oculomotor behavior.

      The method is paradigm-independent, offering generalizable metrics for tracking the dynamic formation and refinement of predictive models in any task involving probabilistic stimulus streams. In the future, computational modeling may leverage these continuous metrics to better dissect the mechanisms underlying statistical learning.

      Weaknesses:

      The authors subscribe to the idea that statistical learning is not a unified concept but rather is implemented via multiple underlying mechanisms. However, it remains unspecified what these different mechanisms could be, and how eye movements could contribute to distinguishing between them.

      The authors claim that they developed a novel methodological approach to probe whether anticipatory eye movements directly reflect priors, thereby filling an outstanding gap. However, this claim ignores mounting relevant work on structure learning using eye-tracking in the developmental field.

      The authors claim that their framework quantifies trial-by-trial oculomotor dynamics, while in fact the analyses use epochs (i.e. groups of multiple trials) as predictors. Why not use trial number as a predictor to truly investigate trial-by-trial dynamics that directly reflect anticipation, surprisal, and revision?

    2. Reviewer #2 (Public review):

      Summary:

      Hann and colleagues introduce a gaze-based analytical framework designed to capture, on a trial-by-trial basis, how people form and revise their predictions during implicit probabilistic sequence learning. Using an eye-tracking adaptation of an alternating sequence task, they record the first anticipatory saccade during the response-stimulus interval and classify each such saccade along two dimensions: whether it was directed toward a high- or low-probability upcoming stimulus (the learning-dependent vs. not-learning-dependent distinction), and whether the anticipated location coincided with the stimulus that actually appeared. A complementary iterative-updating metric codes whether a participant's prediction for a given three-element context is repeated or revised on successive encounters of that context.

      On the basis of these measures, the authors report that errors congruent with the inferred regularity - which they interpret as reflecting environmental noise - become progressively more frequent than errors reflecting an inaccurate internal model; that participants show a pronounced tendency to repeat their previous prediction rather than revise it; and that updates depend more on whether a prior belief is congruent with the task's statistical structure than on whether the previous prediction was confirmed. They interpret these results as evidence that statistical learning is less error-driven and more repetition-based (Hebbian in character) than is typically assumed.

      Strengths:

      The methodological ambition of the work is considerable, and the paper makes several contributions that are likely to be useful to the implicit-learning and predictive-processing communities. Using the first anticipatory saccade as a pre-response behavioral readout of prediction is conceptually well-motivated: it provides a trial-by-trial index of predictive orienting at a temporal resolution that manual reaction times cannot deliver, and it does so before the outcome of the trial is known. The explicit distinction between errors arising because the task's outcome is stochastic - that is, predictions congruent with the statistical structure but unconfirmed by the stochastic sample - and errors arising because the internal model is inaccurate is a theoretically meaningful move: predictive-coding and Bayesian accounts have long argued that these two sources of surprise should carry different weight for model revision, and the authors offer a behavioral operationalization of that distinction. The analytical pipeline is not tied to the specific paradigm used here and could be applied to other probabilistic sequence-learning tasks, which gives it broader methodological utility than a single-paradigm report. Finally, the demonstration that learners maintain their prior across successive occurrences of the same context, even when it has been disconfirmed by the most recent outcome, is a robust behavioral observation that speaks directly to an unresolved debate about whether statistical learning is dominantly error-driven.

      Weaknesses:

      The framework and the core behavioral observations are valuable, but several inferential steps - from the gaze signal to the cognitive constructs the authors invoke - are not fully supported by the present design, and these gaps affect how readers should interpret the stronger theoretical conclusions.

      The "process-pure" framing conflates sensitivity with construct purity. The authors repeatedly describe the eye-tracking measure as providing a more process-pure index of statistical learning than manual-response paradigms. Anticipatory saccades are themselves a learned motor behavior - the oculomotor system is among the most plastic motor outputs the primate brain generates, and sequence learning in the saccadic system is well-documented. The present design does not dissociate learning of the statistical structure from learning of the oculomotor sequence that expresses it, so the measure is not, on its face, free from the motor-learning confound that the authors criticize in button-press paradigms. The framing should be read as aspirational rather than as demonstrated by the present data.

      The oculomotor reaction-time data do not show the canonical signature of statistical learning. Reaction times for low-probability trials rise across epochs while those for high-probability trials remain approximately flat (Figure 5). The emerging difference between the two trial types, therefore, appears to be driven by a slowing of responses to low-probability stimuli rather than by a facilitation of responses to high-probability ones, and the authors do not rule out the alternative interpretations that this pattern reflects fatigue, a motor floor effect, or inhibition of unexpected locations. Because no fixation constraint is imposed during the response-stimulus interval, pre-stimulus gaze drift toward the anticipated location will artifactually reduce reaction time on precisely those trials the authors wish to treat as learning-driven; the fact that measured reaction times remain well above zero even on trials classified as correct anticipations is itself evidence that this contamination is present. The oculomotor reaction-time data, therefore, do not provide as clean a verification of learning as the manuscript implies.

      The correct/error labeling of anticipatory saccades incorporates information that the participant did not have. Because the first saccade occurs during the response-stimulus interval - that is, before the upcoming stimulus is revealed - the participant's internal predictive state is identical whether the trial is subsequently classified as a learning-dependent correct response or a learning-dependent error. Any difference in the epochwise frequency of these two categories must therefore be driven, at least in part, by the external stochastic structure of the task rather than by a difference in the predictive process itself. In particular, the observation that learning-dependent errors are the most frequent saccade type (Figure 7) is predicted by the prior probabilities of the outcomes alone, given a high-probability prediction, without appeal to any difference in predictive state. Readers should recognize that the theoretically meaningful contrast is between learning-dependent and not-learning-dependent anticipations (two categories), and that the four-way split risks confounding predictive state with outcome stochasticity.

      The iterative-updating metric does not distinguish prior revision from alternative processes. The binary update / no-update code, computed across non-contiguous occurrences of the same three-element context, does not discriminate between a genuine update of the internal model, simple episodic retrieval of a previously encountered triplet, and oculomotor perseveration. Without a formal generative model to anchor the interpretation, the central theoretical claim - that statistical learning is less error-driven than commonly assumed - is underdetermined by the data. The repetition pattern the authors observe is equally consistent with an error-driven model equipped with a low learning rate in a stable environment, an interpretation the authors themselves acknowledge in the Discussion. Adjudicating between these possibilities requires comparison against explicit computational models, which the present manuscript does not provide.
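
      To make the low-learning-rate alternative concrete, the following is a minimal sketch of a delta-rule learner for a single context, not the authors' analysis or any model from the manuscript. It assumes four possible locations, one of which occurs with probability 0.625 (the standard ASRT value, which the manuscript may or may not use), and a learner that predicts the location with the highest current estimate. The function name and all parameters are hypothetical.

```python
import random


def repeat_rate(learning_rate, n_encounters=2000, p_high=0.625, seed=0):
    """Fraction of successive encounters on which the predicted location is unchanged.

    Assumption: a purely error-driven (delta-rule) learner tracking one context.
    """
    rng = random.Random(seed)
    values = [0.0, 0.0, 0.0, 0.0]  # estimated outcome probability per location
    # Location 0 is the high-probability outcome; the rest share the remainder.
    weights = [p_high] + [(1 - p_high) / 3] * 3
    previous_prediction = None
    repeats = 0
    comparisons = 0
    for _ in range(n_encounters):
        # Predict the location with the highest estimate (ties go to the lowest index).
        prediction = max(range(4), key=lambda i: values[i])
        if previous_prediction is not None:
            comparisons += 1
            repeats += prediction == previous_prediction
        previous_prediction = prediction
        # Observe the outcome and apply the delta rule to every location.
        outcome = rng.choices(range(4), weights=weights)[0]
        for i in range(4):
            target = 1.0 if i == outcome else 0.0
            values[i] += learning_rate * (target - values[i])
    return repeats / comparisons


if __name__ == "__main__":
    for lr in (0.05, 0.9):
        print(f"learning rate {lr}: repeat rate {repeat_rate(lr):.3f}")
```

      Under these assumptions, the slow learner repeats its prediction on nearly every encounter even though every update is error-driven, so a high repetition rate alone does not separate the two accounts.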

      Data loss and the absence of fixation control. An interpretable saccade is detected on fewer than half of all trials (48.76%; line 889), and the manuscript does not report the distribution of saccade counts per interval, the per-condition trial counts after all exclusions, or the decomposition of the 20% missing-data threshold into its underlying causes. Given that the entire inferential apparatus rests on this subset of trials, the degree of data loss is a relevant context for the reader. Separately, no fixation constraint is imposed between trials: the participant's starting gaze position at the onset of each response-stimulus interval is whatever position was reached at the end of the preceding response, and this starting position carries trial-history information correlated with the upcoming stimulus. This leaves open the possibility that what is classified as predictive orienting partly reflects the mechanical consequences of where the eye happened to be at the end of the previous trial. The authors defend the absence of a fixation cross on the grounds that it would transform the transitional structure of the task, but this is an empirical claim presented without a supporting citation.

      Heterogeneity within the high-probability condition is not addressed. The two routes to a high-probability triplet in the design - pattern-random-pattern (50% of trials) and random-pattern-random (12.5%) - differ both in their base rate and in the reliability of the contextual cue they provide. Collapsing across these subtypes is an analytical choice that may conceal heterogeneity in the underlying learning process.
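
      The base rates quoted above can be checked with a short simulation. This is a sketch under the assumption of a standard ASRT stream (four locations, a fixed four-element pattern whose elements alternate with uniformly random elements); the manuscript's exact parameters are not restated here, and the function and label names are hypothetical.

```python
import random


def triplet_shares(n_trials=200_000, pattern=(0, 1, 2, 3), seed=1):
    """Share of trials ending each triplet type in a simulated ASRT stream."""
    rng = random.Random(seed)
    # successor[x] is the pattern element that follows x in the repeating pattern.
    successor = {
        pattern[i]: pattern[(i + 1) % len(pattern)] for i in range(len(pattern))
    }
    stimuli, is_pattern = [], []
    for t in range(n_trials):
        if t % 2 == 0:  # even trials carry the next pattern element
            stimuli.append(pattern[(t // 2) % len(pattern)])
            is_pattern.append(True)
        else:  # odd trials are drawn uniformly at random
            stimuli.append(rng.choice(pattern))
            is_pattern.append(False)

    counts = {"P-R-P high": 0, "R-P-R high": 0, "R-P-R low": 0}
    for t in range(2, n_trials):
        # A triplet is high-probability when its third element is the pattern
        # successor of its first element.
        high = stimuli[t] == successor[stimuli[t - 2]]
        if is_pattern[t]:
            counts["P-R-P high"] += 1  # pattern-ending triplets are always high
        elif high:
            counts["R-P-R high"] += 1  # high-probability by chance
        else:
            counts["R-P-R low"] += 1
    total = n_trials - 2
    return {name: count / total for name, count in counts.items()}


if __name__ == "__main__":
    for name, share in triplet_shares().items():
        print(f"{name}: {share:.3f}")
```

      In this simulated stream, roughly four out of five high-probability trials come from the pattern-random-pattern route, so effects in the collapsed high-probability condition would largely reflect that subtype.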

      Appraisal: Do the results support the authors' conclusions?

      The framework succeeds in providing a trial-by-trial behavioral readout of predictive orienting that is more fine-grained than conventional reaction-time measures, and the behavioral dissociation between errors congruent with the regularity and errors reflecting an inaccurate internal model is a genuine empirical contribution. The conclusions about the mechanistic nature of statistical learning should be read as motivating hypotheses for future modeling work rather than as settled empirical claims.

      Impact and utility:

      The analytical framework introduced here is likely to be useful to researchers working on implicit learning, predictive processing, and Bayesian models of perception and cognition. The measure of predictive orienting and the iterative-updating code could be adapted to a range of probabilistic learning paradigms, and the behavioral dissociation between noise-driven and model-mismatch errors fills a methodological gap that the field has long acknowledged. The authors share their data and code openly, which will facilitate reuse. The most durable contribution of the paper is methodological; the theoretical claims about the nature of statistical learning will require additional computational modeling before they can be regarded as established.

    1. Reviewer #1 (Public review):

      This paper reports an auditory-directed analysis of the HCP 7T short movie dataset. It has the goal of using the film audio to create tonotopic (pRF) maps and combine these with other HCP-provided data (e.g., T1/T2 ratio) to improve understanding of auditory cortex organization and relative functional segregation, particularly in reference to speech processing.

      The paper is ambitious, uses well-founded existing tools for combining data across subjects, and in the Discussion in particular, makes a lot of careful points about interpretation. The paper shows that, at least for a very large dataset on 7T (and for at least a few individual participants), good-quality cross-subject-average tonotopic maps can be extracted from fMRI movie datasets via basic spectral modelling of the movie soundtracks. It also suggests ways that these movie-based maps can be combined to come up with potential models of cortical organization. The PCA analysis is a creative way of combining maps (see below for comments).

      These are valuable tools for the field in exploiting/exploring existing data, and I look forward to trying them out myself. I want to emphasize that this is not 'damning with faint praise' - a concrete demonstration of this approach with freely available tools/examples is not only the product of a lot of effort (thank you!), but will be an impetus to research going forward.

      In terms of the contribution to our understanding of auditory cortex organization, using this large N cohort, they replicate a number of findings in the literature from the last couple of decades, including the overlap of low frequency preference with greater speech stimulus preference (e.g. Moerel, de Martino, & Formisano, 2012, J Neuro), patterns of BF width across cortex (Moerel et al., various; Thomas et al. 2015), use of shorter and longer natural sounds (Moerel et al., 2012, 2014; Dick et al., 2012), the importance/influence of sustained spectral attention for tonotopic mapping (da Costa et al., 2013; Dick et al., 2017; Riecke et al. 2017), the use of tonotopy and 'myelin' mapping to establish areal or regional boundaries (Dick et al., 2012; de Martino et al., 2015; Besle et al., 2018, etc) and the overall shape and consistency of tonotopic maps (e.g., Talavage et al., 2004, Humphries et al., 2010 and many others). To my knowledge/memory, this is the first tonotopy paper that has used the cross-subject cortical-surface-based averaging techniques that are driven by more than curvature/sulcal alignment.

      The paper focuses in particular on creating new sets of ROIs based on the various maps derived from the data. Despite being quite familiar with this body of work, I found it difficult to follow how the ROIs were derived, and how and why they were different and/or an improvement over existing parcellation schemes (see for instance Sereno, Sood, & Huang, 2022 for a comprehensive parcellation framework across modalities including auditory, based on combined receptive surface mapping, myelin estimates, and other metrics).

      Given the hour of fast(ish) fMRI data on a 7T with pretty big voxels (so high SNR), one aspect of the results that I found surprising - and potentially informative - was the lack of reliable tonotopic 'mappability' in the majority of participants. The authors' analytic approach to computing the pRFs seems completely reasonable (and shows good average maps), and yet individual maps seem unreliable except for the very best examples. I wondered if this might be due to problems in data collection with earbuds becoming slightly uncoupled and therefore delivering a lot less lower-frequency response and also not preventing scanner noise from getting to the ear; this is often a problem with any in-scanner earbud system (including the Sensimetrics). I wondered if the robustness of the 'speech maps' was associated with that of tonotopy; if they are highly associated, that would suggest that either there were huge individual differences in auditory attention, or perhaps that there was some variability in the acoustic signal delivered to each participant.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors leverage a high-powered 7T fMRI dataset of subjects viewing naturalistic audiovisual movies to elucidate the topographic organization of the human auditory cortex. By applying a nonlinear pRF model, they successfully map tonotopic gradients extending beyond the auditory core into the STG and STS areas. A primary finding is a medial-to-lateral gradient of increasing response compressivity, which the authors claim mirrors the hierarchical cascade architecture of the visual system. Furthermore, the modeling reveals that regions exhibiting high speech selectivity predominantly occupy the low-frequency portions of non-primary tonotopic maps. The authors argue that this architecture reflects an efficient coding mechanism where the cortex magnifies specific spectral features to facilitate the transition from acoustic encoding to flexible speech representation.

      Overall, the study presents concise analyses and compelling high-resolution results that advance our understanding of auditory cortical organization. However, the manuscript currently exhibits several significant theoretical and methodological gaps that temper its broader claims. Most notably, the authors' reliance on a spatial, retinotopic-like analogy overlooks the fundamentally temporal nature of audition. Decoding continuous, natural speech relies heavily on dynamic, full-spectrum temporal integration and contextual recurrent computations, which are difficult to reconcile with the purely static, low-frequency spatial tuning observed here.

      Strengths:

      (1) The utilization of ultra-high-field 7T functional imaging combined with large-scale, naturalistic continuous stimuli provides an excellent signal-to-noise ratio and captures cortical responses under ecologically valid conditions.

      (2) The application of a non-linear pRF encoding model provides a robust, quantitative method for parameterizing and mapping tonotopic features across the cortex, moving beyond simple contrast-based parcellations.

      (3) The manuscript effectively demonstrates the relationship between category selectivity (e.g., speech) and underlying tonotopy, drawing an elegant and structurally useful analogy to the well-established relationship between category selectivity and retinotopy in the visual cortex.

      Weaknesses:

      (1) While the PCA mapping of the functional and structural parameter space is visually compelling, the robustness of this representational geometry across varying acoustic contexts remains ambiguous. Because the model relies on the specific statistical regularities of a single naturalistic audiovisual stimulus set, it is unclear if this low-dimensional structure would hold when tested against isolated speech sounds, environmental noise, or spectrally matched non-speech control stimuli.

      (2) The methodological descriptions currently lack the computational precision required for replication and deep evaluation. I would suggest that the exact mathematical formulation of the encoding model be fully specified in the Methods section. This should include an explicit definition of the objective function, a clear accounting of all terms and hyperparameters utilized during the fitting process, and the exact dimensionalities of both the input feature space and the resulting parameter space.

      (3) There is a critical theoretical disconnect between the observed static, low-frequency tuning in the STG and the known acoustic requirements for continuous speech perception. Speech is a full-spectrum signal; while fundamental frequencies and formants dominate the lower spectrum (which is vital for processing dynamic pitch contours), high-frequency bands (>1 kHz) carry indispensable phonetic information, such as the rapid spectrotemporal dynamics of consonants, especially fricatives. If the speech-responsive cortex is primarily and statically tuned to a low-frequency spectrum, it is unclear how the dynamic, high-frequency spectral information required for semantic decoding is represented. A rich body of electrophysiological literature documents diverse spectrogram coding in the STG. For example, Mesgarani et al. (Science, 2014) demonstrated using spectrotemporal receptive field models that neural populations in the STG are tuned to both low and high-frequency spectrograms well above 1 kHz. The authors must address this discrepancy and attempt to reconcile their static tonotopic findings with the existing literature on dynamic speech encoding.

      (4) While drawing parallels between visual and auditory processing hierarchies is conceptually attractive, the modalities face fundamentally different computational challenges. Vision is largely resolved in space, making a retinotopic spatial coding strategy ecologically and computationally sound. Audition, however, evolves continuously in time. Complex temporal structure, continuous temporal integration, and contextual recurrent computations are paramount for auditory processing, particularly for speech comprehension. In this sense, a purely spatial or tonotopic coding framework is insufficient to fully explain the complex temporal processing dynamics required in the higher-order auditory domain.

    3. Reviewer #3 (Public review):

      Summary:

      The work has the potential to clarify the topographical organization of the auditory cortex, which remains controversial when studied with current non-naturalistic sound stimulation. It does so by applying population receptive field mapping, an elegant approach developed in the visual domain to study the organization of the visual system, under naturalistic stimulation conditions.

      Strengths:

      This work presents a topographic analysis of auditory cortical organization, using a substantial Human Connectome Project 7-Tesla functional imaging dataset in which 174 participants viewed naturalistic movies.

      Weaknesses:

      The key issue for the paper is that even the authors seem undecided on what the topographical results are and whether these results are consistent with, refute, or expand our notion of human auditory cortical field organization using this massive dataset obtained under movie-watching conditions. Without this clarity, and with much of the discussion of the issues surrounding topographic mapping buried in the Supplementary materials section, it is not clear what the authors think the advance of the current work is beyond the large dataset.

      On the flip side, there is little consideration of the challenges of mapping the auditory cortex with naturalistic stimuli, which prevent visual and auditory stimulation from being dissociated and may contribute to the lack of clarity in the tonotopic mapping.

      As such, the current manuscript struggles to achieve its full potential.

    1. Reviewer #1 (Public review):

      Summary:

      This study by Tsuji et al. explores a mechanical threat model in Drosophila using air puffs as a stimulus. The authors first establish the paradigm and show that air puffs induce cardiac deceleration along with increased locomotion. They then identify dopamine as a key regulator of this response and go on to map the underlying circuit. In doing so, they pinpoint two pairs of DA-WED neurons as critical players. They carefully used intersectional strategies to achieve relatively clean labeling of these neurons, which helps ensure that the observed effects can be attributed specifically to DA-WED neurons. They further show that DA-WED neurons are both required and sufficient to drive cardiac deceleration, and that their activity increases in response to air puff stimulation. These neurons also contribute to the locomotor response. Directly inducing cardiac deceleration via optogenetic manipulation of cardiomyocytes also increases locomotion, suggesting a link between cardiac state and behavioral output.

      Strengths:

      Overall, the experiments are thoughtfully designed, well-controlled, and clearly presented. The figures are easy to follow, and the conclusions are generally well supported by the data. The manuscript is also clearly written, with a discussion that acknowledges potential caveats and outlines future directions. The genetic tools, behavioral paradigm, heart rate measurement approaches, and stimulation methods introduced here will be valuable resources for the community.

      Weaknesses:

      A few minor points to add to the clarity of the manuscript:

      (1) The DA-WED driver (R48A08-AD ∩ VT008692-DBD ∩ TH-FLP) appears quite clean in the brain. However, since the study focuses on cardiac function and locomotion, it would be helpful to check expression in cardiomyocytes and the ventral nerve cord. This would help rule out any off-target expression that might contribute to the phenotypes and further support the idea of a descending pathway from brain dopaminergic neurons.

      (2) DA-WED>Kir2.1 abolishes the puff-induced locomotor response (Figure 5b), suggesting that DA-WED neurons are directly involved in mediating locomotion. In the model (Figure 5L), it might make more sense for the pathway from mechanical threat to locomotion to pass through DA-WED neurons. The authors could consider adjusting the schematic if they agree.

      (3) In line 408, Figure 5K should be 5L as it's a discussion of the model.

      (4) In Figure 5j, the x-axis is missing time labels. Even if it matches Figure 5h, adding labels would make it easier to interpret at a glance.

      (5) In line 312, it would be helpful to briefly explain why a 28 ms light pulse was used, compared to other pulse durations elsewhere in the paper.

      (6) The cardiac deceleration seems to recover quickly after the air puff ends, whereas the locomotor response persists longer (around 10-15 seconds; see Figure 1 and Figure 5). This difference might suggest that DA-WED neurons influence locomotion through an additional or partially independent pathway, beyond their role in cardiac regulation. It could be worth briefly discussing this possibility.

    2. Reviewer #2 (Public review):

      Summary:

      The authors study cardiac deceleration during threat responses in Drosophila. In particular, the study focuses on identifying the neuronal control of this deceleration. Using behavioral and cardiac tracking and analysis, genetics, and calcium imaging, they identify two pairs of dopaminergic neurons involved in cardiac deceleration during air puff responses.

      Strengths:

      The study is overall well done, and the paper is clearly written. In particular, the work on identifying the two pairs of dopaminergic neurons involved in cardiac deceleration, using a series of drivers and generating new ones, is rigorous and extensive. Finally, the authors manipulate the heartbeat to investigate how it influences threat responses.

      Weaknesses:

      There are, however, several points that need to be clarified, as some claims are not entirely supported by evidence.

      The authors, for example, claim that dopaminergic neurons are responsible for cardiac deceleration (during the air puff, lines 182-3, page 9). However, based on the work in this study, it seems that other neurons could be involved in this control as well. In addition to dopaminergic neurons, the authors test serotonergic and octopaminergic neurons, which, based on silencing experiments, also appear to be implicated in heartbeat deceleration. Furthermore, because they find that dopaminergic neurons are the only ones that, upon thermogenetic activation, lead to lower heartbeat frequency, they conclude that the dopaminergic neurons are responsible for air-puff-induced cardiac deceleration.

      However, these activation experiments are done in a different context than the air puff experiments (at a higher temperature, which could have an effect on the heartbeat changes upon activation of different neuron groups), and because silencing of other monoaminergic neuron types during the air puff also resulted in less cardiac deceleration, one cannot exclude the implication of octopaminergic or serotonergic neurons in air-puff-induced deceleration.

      Activation experiments without high temperatures (using, for example, optogenetics) and/or in the presence of the air puff would be important to determine that the dopaminergic neurons are the main type of monoaminergic neurons involved in air-puff-induced cardiac deceleration. Otherwise, the related claims should be rephrased in a way that clearly doesn't exclude a possible implication of other monoaminergic neurons.

      Regarding the interactions between the cardiac deceleration and locomotion, the authors propose, based on the results, that the optogenetic cardiac deceleration is sufficient to induce an increase in locomotion, and that it is the decrease in heartbeat that would be responsible via interoceptive pathways to trigger an increase in locomotion. In the model they propose, the DA-WED neurons would induce a decrease in heartbeat that, in turn, would trigger an increase in locomotion. There is not enough proof that cardiac deceleration is the one that triggers an increase in locomotion during air puff responses. As the authors themselves state, the experiments that would demonstrate this would involve preventing cardiac deceleration while optogenetically activating DA-WED. It can therefore not be excluded that the DA-WED neurons trigger an increase in locomotion that is possibly modulated by the cardiac activity. Both alternatives should be considered (models in Figures 4 and 5).

    3. Reviewer #3 (Public review):

      Summary:

      In this elegant study, Tsuji et al. identify a relationship in Drosophila between cardiodynamics and threatening stimuli where mild air puffs elicit a brief bradycardia that coincides with locomotion increases. They then take advantage of the arsenal of genetic tools available in the fruit fly to reveal the indispensability of dopamine, through the action of Dop1R2, in this phenomenon. Further, they pinpoint the source of this dopamine to two specific pairs of neurons - DA-WED that are threat-activated. They then test and find a potential role for cardiac interoception from the heart in linking behavior and cardiodynamics.

      Strengths:

      This is an interesting and timely story that brings together the tools of fruit fly systems neuroscience and links it with physiology. The experiments are well done and tell a very nice story. In particular, the primary message of the paper - that the authors have identified specific dopaminergic neurons that regulate cardiac activity - is sound.

      Weaknesses:

      There are no important problems with the scientific approach. Rather, there are some interpretive changes I would consider.

      (1) The changes in heart rate are small (10% or so), and, as far as I can tell, are evident for a beat or two. So the data may be better interpreted not as a change in rate but as a lengthening of diastole for a beat or two. That may seem a petty difference, but it might point to particular stretch-activated systems or changes in blood flow as the determinant.

      Heart rate must be averaged over time, and so might be blurring the effects. It may be useful to produce figures centered on beat count and duration rather than time. Because the effect may even be just on a single beat, we suggest the authors try plotting the average beat duration for each beat that follows the air puff. If it's really just the first beat, using a quantification of the change of this duration relative to the average that precedes the puff may produce more striking figures.

      (2) The authors' model that cardiac deceleration leads to walking is only partially supported by their data. In the first figure, the relationship between cardiac deceleration and walking probability seems to be inverted relative to their model (weak stimulus -> strong cardiac effect and weak locomotor effect; strong stimulus -> weak cardiac effect and strong locomotor effect). It is possible that this discrepancy may disappear when the authors look at beat duration rather than heart rate (for instance, if following the strong stimulus, there is a very long beat that is followed by tachycardia, thus weakening their observed HR change). It would also be easier to relate this data in Figure 1 to their interoceptive model if some data were shown that illustrated the relative timing of the cardiac change and the locomotor start.

      (3) Also, since the locomotor and cardiac changes are probabilistic, it would be very useful to see how their respective probabilities change when conditioned on the other. According to their interoceptive model, locomotion should preferentially increase on trials where cardiac deceleration occurs. The authors should discuss this incongruity and also potential alternative interpretations of their cardiac manipulation experiments. Perhaps the bradycardia makes them more sensitive to threats - as suggested in the introduction? Control flies show a mild increase in locomotion following green light (Figure 5j), so perhaps by slowing the heart, they are more sensitive and thus respond more strongly to this stimulus?

      (4) Looking at the example shapes of the beats in Figure 5g versus Figure 1c, the optogenetically induced diastole has a very different shape from the naturally occurring long beat. Thus, the exact cardiac stimulus may be unnatural. If this is true across trials and animals, it may be worth considering that the funny beat (like an anxiogenic atrial fibrillation in mammals) is the source of the fear and, in turn, locomotor behavior (also interesting!) rather than being a true replication of the cardiac events seen following the puff stimulus.
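
      To make points (1) and (3) concrete, the following is a minimal sketch, not the authors' pipeline, of a beat-centered analysis. It assumes that beat onset times and a puff time are available for each trial, and that each trial has been scored for cardiac deceleration and for walking; the function names, the `n_post` parameter, and the example numbers are all hypothetical.

```python
from statistics import mean


def beat_durations(beat_times):
    """Intervals between consecutive beat onsets (same units as the input)."""
    return [later - earlier for earlier, later in zip(beat_times, beat_times[1:])]


def relative_post_puff_durations(beat_times, puff_time, n_post=3):
    """Durations of the first n_post beats ending after the puff,
    expressed relative to the mean duration of the beats that ended
    at or before the puff (the pre-puff baseline)."""
    durations = beat_durations(beat_times)
    starts = beat_times[:-1]
    pre = [d for s, d in zip(starts, durations) if s + d <= puff_time]
    post = [d for s, d in zip(starts, durations) if s + d > puff_time]
    baseline = mean(pre)
    return [d / baseline for d in post[:n_post]]


def conditional_walk_probability(trials):
    """trials: list of (decelerated, walked) booleans, one pair per trial.
    Returns (P(walk | deceleration), P(walk | no deceleration));
    an entry is None when no trial falls in that group."""
    with_decel = [walked for decel, walked in trials if decel]
    without_decel = [walked for decel, walked in trials if not decel]
    p_with = sum(with_decel) / len(with_decel) if with_decel else None
    p_without = sum(without_decel) / len(without_decel) if without_decel else None
    return p_with, p_without


# Hypothetical example: one long beat right after a puff at t = 3.0 s.
example_beats = [0.0, 1.0, 2.0, 3.0, 4.5, 5.5]
relative = relative_post_puff_durations(example_beats, puff_time=3.0, n_post=2)

# Hypothetical trial scoring: (decelerated, walked).
example_trials = [(True, True), (True, True), (True, False),
                  (False, True), (False, False), (False, False), (False, False)]
p_with, p_without = conditional_walk_probability(example_trials)
```

      Plotting the relative durations per beat index, rather than heart rate over time, would show whether the effect is confined to a single beat. Comparing the two conditional probabilities would show whether walking is in fact more likely on trials with cardiac deceleration, as the interoceptive model predicts.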

    1. Reviewer #1 (Public review):

      Summary:

      The current study is a follow-up to a previously published study by the same research group (Nold et al. 2025). In the previous study, the authors had included a set of exploratory analyses which assessed the effects of fitness level (denominated by a relative FTP), sex, and drug treatment (Naloxone versus placebo). In this previous study, the authors state that "exploratory analysis showed a significant main effect of fitness level on differences in pain ratings in the [saline] condition... suggesting increased hypoalgesia with increasing fitness levels, pooled across all stimulus intensities".

      In the current study, the authors have recruited an additional 22 female participants (21 included in analysis) from local cycling clubs to assess if fitness level does indeed impact exercise-induced hypoalgesia responses to experimental thermal and pressure pain models.

      Strengths:

      The current study has the potential to present a convincing argument about the effect of fitness level and potentially other factors (e.g., sex) on exercise-induced hypoalgesia responses. Combining data across two of their primary studies would be highly fruitful to the research community interested in this area. Specifically, it has the potential to inform sports medicine practitioners and how they administer exercise protocols to help those experiencing pain with a further consideration for the fitness level (and maybe sex) of their patients.

      Weaknesses:

      However, the current study makes several bold claims about the role of fitness level and sex on exercise-induced hypoalgesia, which I do not believe that this study on its own - or in conjunction with the previously published study by the same authors - can make at present. Namely, the current study does not appear to conduct any specific analyses across the cohorts of the two studies (current and previous). The results mention a difference in the group mean values in "fitness level" between cohorts, but the analysis itself on pain responses/exercise-induced hypoalgesia is limited only to the cohort from the current study. If the authors wanted to provide a convincing argument that fitness level has an effect on exercise-induced hypoalgesia, then the analysis of this study would have to include an analysis between the groups considered to be of "high" and "low" fitness level. I do not think the current study does this. Instead, it makes an assumption from the previous study (Nold et al. 2025) which only states that "exploratory analysis showed a significant main effect of fitness level on differences in pain ratings in the [saline] condition... suggesting increased hypoalgesia with increasing fitness levels, pooled across all stimulus intensities". The analysis of this study would have to include fitness level ("high fitness" versus "low fitness") of participants across both studies in its statistical model to properly discern if fitness level has an impact on exercise-induced hypoalgesia.

      A similar comment can be made with respect to sex differences, as these have not been assessed in the analysis of this study either.

      Another area of weakness in this study is how "fitness level" has been demarcated across participants. One issue is how the authors have assumed that the current cohort is 'fit', whereas the previous cohort was 'less fit', meaning that the authors could be coming to false conclusions about fitness level. In detail, figures within the current study show a large overlap between the 'fit' and 'less fit' cohorts, where some participants have a higher relative functional threshold power (FTP) in the 'less fit' cohort than the 'fit' cohort and vice versa. Therefore, I believe the authors should better demarcate between those that are in the 'more fit' and 'less fit' groups according to a validated and well-established criterion from the kinesiology and sport science literature. That being said, I think this may be problematic in some ways as FTP is considered a relatively poor measure to denote fitness levels, a limitation highlighted in the previous study's review.

      Altogether, whilst I commend the researchers on their body of work across the two studies, the current methods and analysis provide an incomplete assessment of their primary research question, and therefore, I would urge the authors to reconsider some of their methods/analysis and the framing of their results to better reflect the main research question they have attempted to answer. Likewise, I would recommend that readers ensure they consider the current results with caution until the authors have addressed some areas of concern which currently limit their main conclusions.

    2. Reviewer #2 (Public review):

      This study addresses an important question regarding exercise-induced modulation of pain in women, but the conclusions appear to be based on relatively limited and selective evidence. The authors report an interaction between exercise intensity and stimulus intensity, which they interpret as evidence for exercise-induced hypoalgesia and conclude that fitness, but not sex, modulates this effect. However, this main result relies on a relatively small interaction that emerges only under specific conditions, with inconsistent findings across pain modalities and stimulus intensities, and an analysis approach that does not fully exploit the continuous pain ratings collected. The lack of a baseline condition further limits the interpretability of the findings as reflecting hypoalgesia, and overall, the data provide a rather constrained basis for drawing broader conclusions.

      Strengths:

      (1) The focus on women is important and timely, particularly given the ambiguity in prior findings and the historical bias toward male-dominated samples.

      (2) The attempt to revisit previous findings in a new cohort is valuable in principle.

      Weaknesses:

      (1) The core interpretation may not be fully supported by the data

      The central claim, that the results demonstrate exercise-induced hypoalgesia and its dependence on fitness but not sex, does not appear to be fully supported by the evidence presented.

      1.1 Lack of baseline condition

      The absence of a no-exercise baseline substantially limits interpretation. The study compares high- and low-intensity exercise, but without a baseline, it is not possible to determine whether either condition produces hypoalgesia or hyperalgesia relative to calibration. The observed HI-LI difference, therefore, reflects only a relative contrast between exercise intensities, not an absolute reduction in pain. As a result, attributing the findings to "hypoalgesia" may be difficult to justify fully.

      1.2 Lack of internal replication across conditions

      The reported effect is highly specific and does not clearly generalise across the experimental design. It emerges significantly only for heat pain at the highest stimulus intensity, with no clear effects for other intensities or for pressure pain. Moreover, the main statistical result is a relatively small interaction effect with a modest p value, which translates into a difference of approximately 6-8 VAS units on a 150-point scale. This combination (a small effect size, limited statistical strength, and restriction to a single condition) substantially weakens the evidence for a robust or generalisable effect.

      1.3 Deviations from the original study and selective use of data

      Although framed as a follow-up to previous work, the current study introduces substantial methodological changes, particularly in the acquisition and scaling of pain ratings (continuous vs post-hoc ratings, modified VAS with sub-threshold range). Despite collecting rich continuous data, the analysis focuses on peak responses to approximate the previous study. While this may aid comparability, it results in a strong emphasis on a single data point (highest intensity), rather than leveraging the full dataset. This limits both interpretability and comparability.

      1.4 Over-reliance on null results regarding sex differences

      The conclusion that fitness, but not sex, modulates exercise-induced pain may not be directly supported by the data presented. The current study includes only highly fit women, and comparisons with men or less-fit women rely on non-significant differences in a previous cohort. The absence of a significant difference does not provide evidence for equivalence, and no formal statistical support for a null effect is provided. As such, conclusions about the absence of sex differences would benefit from more cautious interpretation.

      (2) Limited sample and lack of diversity

      The dataset is narrow in scope, comprising a small sample (N = 21) of healthy, highly fit women. Key demographic characteristics (e.g. age range, BMI distribution) are not fully presented, explored or discussed. This limits generalisability and makes it difficult to draw broader conclusions about exercise-induced pain modulation in women, which is the main focus of the study.

      (3) Methodological choices limit the interpretability of the data

      Several methodological decisions would benefit from stronger justification:

      3.1 The use of a non-standard VAS scale (0-150 with a fixed pain threshold at 50) is unconventional and may influence how participants report pain, while limiting comparability with related literature.

      3.2 Participants explicitly reported expecting exercise to reduce pain, introducing a potential confound that is not presently addressed.

      3.3 A more comprehensive use of the full time series of pain ratings would provide a stronger and more transparent basis for interpretation of the present findings.
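
      As an illustration of what formal support for a null effect (point 1.4) could look like, the following is a minimal sketch of an equivalence test using two one-sided tests (TOST). It is not the authors' analysis. It assumes two independent groups, a pre-specified equivalence margin in VAS units, and a normal approximation to the test statistic; with samples as small as those discussed here, a t-distribution would be the more appropriate reference. The function name, the margin values, and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def tost_equivalence(group_a, group_b, margin):
    """Two one-sided tests (TOST) for equivalence of two group means.

    Uses a normal approximation (an assumption made for this sketch).
    Returns (mean difference, TOST p-value). Equivalence within
    +/- margin is supported when the p-value is below the chosen alpha.
    """
    diff = mean(group_a) - mean(group_b)
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    standard_normal = NormalDist()
    # Null hypothesis 1: diff <= -margin (rejected when diff is well above -margin).
    p_lower = 1 - standard_normal.cdf((diff + margin) / se)
    # Null hypothesis 2: diff >= +margin (rejected when diff is well below +margin).
    p_upper = standard_normal.cdf((diff - margin) / se)
    # Both nulls must be rejected, so the larger p-value is reported.
    return diff, max(p_lower, p_upper)


# Hypothetical hypoalgesia scores (VAS units) for two groups.
scores_a = [10, 11, 9, 10, 10, 11, 9, 10]
scores_b = [10, 10, 11, 9, 10, 9, 11, 10]

diff_wide, p_wide = tost_equivalence(scores_a, scores_b, margin=5)
diff_narrow, p_narrow = tost_equivalence(scores_a, scores_b, margin=0.1)
```

      The same data can support equivalence under a wide margin and fail to support it under a narrow one, which is why the margin has to be justified before the data are examined. A non-significant difference test, by contrast, cannot distinguish between these cases.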

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript investigates how cellular NAD/NADH ratios are controlled in cancer cell lines in vitro. The authors build on previous work, which shows that serine synthesis is sensitive to NAD/NADH ratios and PHGDH expression. Here, the authors demonstrate that serine synthesis is variable across a panel of cell lines, even when controlling for expression of serine synthesis enzymes such as PHGDH. The authors show that cellular NAD/NADH ratios correlate with the ability to synthesize serine and grow in serine-deprived environments when PHGDH levels remain constant. Investigating this variability in NAD/NADH ratios, the authors find that the cells that can positively respond to serine deprivation are able to increase oxygen consumption and cellular NAD/NADH ratios. Cells that do not increase oxygen consumption in response to serine deprivation do not increase NAD/NADH ratios and cannot grow well without serine. The authors go on to show that in cells with the ability to increase oxygen consumption upon serine deprivation, PHGDH expression alone is sufficient to fully restore growth-serine; in cells that cannot increase oxygen consumption, both PHGDH expression and interventions to increase NAD/NADH ratios are required to increase growth. Thus, cells need both PHGDH and NAD/NADH increases to maximize serine synthesis in response to serine deprivation. The authors previously showed that lipid synthesis likewise requires NAD regeneration. Interestingly, one cell line that does not increase oxygen consumption in response to serine limitation tends to increase oxygen consumption in response to lipid deprivation; accordingly, depriving this cell line of lipids increases the synthesis of serine. 
      Together, these findings show that how cells respond to nutrient deprivation is highly variable and that the response to nutrient deprivation (for example, whether or not oxygen consumption is increased) will determine how well cells tolerate depletion of nutrients with related biosynthetic constraints. This work sheds light on the complexity of cancer cell metabolism and helps to explain why it is difficult to predict which nutrients will be limiting to any cancer cell type or environment.

      Strengths:

      (1) The authors use multiple interventions to manipulate NAD/NADH ratios in cells.

      (2) Experiments are well controlled and appropriately interpreted.

      Comments on revised version:

      The authors thoughtfully and thoroughly responded to all reviewer comments. The revised manuscript addresses the critiques.

    2. Reviewer #2 (Public review):

      In the manuscript "Cancer cells differentially modulate mitochondrial respiration to alter redox state and enable biomass synthesis in nutrient-limited environments", Chang et al investigate how cancer cells respond to the limitation of certain environmental nutrients by regulating the cellular NAD+/NADH ratio. They focus on serine and lipid metabolism, pathways known to be controlled by the NAD+/NADH ratio, and propose that changes in mitochondrial respiration in response to deprivation of these nutrients can influence the NAD+/NADH ratio, thereby impacting biomass synthesis.

      While the study is descriptive in nature and does not investigate specific molecular mechanisms that explain the crosstalk between nutrient availability and mitochondrial redox changes, the experimental component is robust, and the conclusions are well supported by the results. Some suggestions could further refine the conclusions and enhance the quality of the manuscript.

      Comments on revised version:

      The authors have provided a very comprehensive response. Their updated paper has improved, and the critiques have been mitigated.

    1. Reviewer #1 (Public review):

      The authors conducted a comprehensive benchmarking and evaluation of co-folding platforms, including AlphaFold3, Boltz-2, Chai-1, and the docking algorithm Dock3.7, which employs a physics-based scoring function that incorporates van der Waals interactions, electrostatics, and ligand desolvation energies. The system of interest was the SARS-CoV-2 NSP3 macrodomain (Mac1), an increasingly popular antiviral target, and the ligand sets comprised 557 unseen ligand poses (keeping the training for these co-folding platforms in mind). Additionally, the authors investigated whether the co-folding models could distinguish true ligands from non-binding small molecules. The study is thorough, with extensive statistical support and consensus across multiple metrics (chemoinformatics for quantifying ligand similarity and efficacy). The questions that the authors aim to address are whether the co-folding models struggle with memorization, whether they can distinguish between a true and a false binder, whether they replicate experimental binding affinities and efficacy, and how they compare to the physics-based docking algorithm (Dock3.7).

      Strengths:

      Overall, this is a scientifically solid paper.

      The work is highly detailed and well executed, featuring thorough data analysis and statistical assessment.

      Comments on revised version:

      The authors have adequately addressed my concerns.

    2. Reviewer #3 (Public review):

      Summary:

      Core conclusions are well-supported by data: co-folding outperforms docking in known ligand pose/affinity prediction (validated by RMSD and IC₅₀ correlation), struggles with false positive discrimination in virtual screens (lower AUC values), and is complementary to docking (non-correlated errors, distinct strengths in drug discovery stages).

      Strengths:

      The unprecedented prospective design, with 557 novel Mac1-ligand complexes, ensures a rigorous, independent evaluation of co-folding methods and provides an unbiased benchmark dataset containing structures and compounds absent from the co-folding models' training sets. Comprehensive comparison of 3 co-folding tools (AlphaFold3, Chai-1, Boltz-2) with DOCK3.7 across diverse targets and metrics enables nuanced performance assessment. The revised results clarify an intriguing finding: co-folding can predict correct ligand poses even when protein conformations are mispredicted. The study clearly demonstrates complementary roles of co-folding (superior pose/affinity prediction for known ligands) and docking (better hit prioritization), and addresses deep learning memorization concerns via ligand similarity analysis.

      Weaknesses:

      The study identifies a major limitation of co-folding: a failure to capture rare protein conformational changes, which deserves future investigation. The authors include uncalibrated Boltz-2 affinity data (addressing a prior comment) but note that large-scale free energy perturbation (FEP) comparisons are beyond their capabilities.

      Appraisal of Aims Achieved:

      The authors successfully achieved their primary aims and the results provide strong, well-supported evidence for their core conclusions. Key conclusions are grounded in the study's unbiased, training-set-independent data, which ensures that the conclusions are not confounded by model memorization and are broadly applicable to the field's use of these co-folding models.

      Field Impact:

      This study provides a critical reality check for the field: co-folding models are powerful tools for pose prediction but are not yet standalone solutions for virtual screening, a key distinction that will prevent over-reliance on these models and guide more rational tool selection.

    1. Reviewer #1 (Public review):

      Summary:

      Eroglu and Hobert demonstrate that injecting CRISPR guides and repair constructs to target three genes at a time, tagging each with a different fluorescent protein, and selecting which gene to tag with which fluorophore based on genes' expression levels, can improve efficiency of gene tagging.

      Strengths:

      This manuscript demonstrates that three genes can be targeted efficiently with three different fluorophores. It also presents some practical considerations, like using the fluorophore least complicated by agar/worm autofluorescence for genes with low expression levels, and cost calculations if the same methods were used on all genes.

      Weaknesses:

      Eroglu has demonstrated in a previous publication that single-stranded DNA injection can increase efficiency of CRISPR in C. elegans, while inserting two fluorescent proteins and a co-CRISPR marker into three loci, and Paix et al 2015 demonstrated simultaneous insertion of two fluorescent tags. The current work is a valuable and incremental advance. In general, I applaud the authors' willingness to strategize about how whole proteome tagging might be accomplished. I predict that the advance here will be one of many small advances that will get the field to that goal. The title oversells the advance presented, in my view, since it seems like one among many key advances, and the first sentence of the Discussion seems a more apt summary of the key advance here.

      Some injections targeted genes on the same chromosome together, which will create unnecessary issues for the crosses that some future experiments will require. This made me wonder if injecting 3 together really is helpful vs targeting each gene separately, since only 5 worms need to be injected. It cuts time down by 2/3, but perhaps avoiding targeting the same chromosome with two tags would be useful.

      The limited utility of current blue fluorescent proteins makes me wonder if it's worth using at this stage, before there are better blue fluorescent proteins, or better yet, far red, to avoid issues with live imaging under phototoxic UV or near-UV illumination.

    2. Reviewer #2 (Public review):

      Original Review:

      The manuscript by Eroglu and Hobert presents a set of strains each harboring up to three fluorescently tagged endogenous proteins. While there is technically nothing wrong with the method and the images are beautiful, we struggled to appreciate the advance of this work - who is this paper for?

      As a technical method, the advance is minimal since the first author had already demonstrated that three mutations (fluorophore insertion and co-CRISPR marker) could be introduced simultaneously.

      As a pilot for creating genome-scale resources, it is not clear whether three different fluorophores in one animal, while elegantly designed and implemented, will be desired by the broader community.

      Finally, the interpretation of the patterns observed in the created lines leaves much to be desired. A Table with all the observations must be included and can replace the tedious (and often wrong) descriptions of the observations with the different lines. It would be too much to point out every mistaken expectation of protein expression. Two examples include:

      The expectation that ACDH-10 is enriched in the intestine and epidermal tissues (hypodermis) is naïve - there are multiple paralogs of this protein (look at WormPaths or WormFlux) that may share functions in different tissues. There is also no reason to assume that fatty acid metabolism does not occur in other tissues (including the germline). Finally, there are no published studies about this enzyme, so we really don't know for sure what it's doing.

      The expectation that HXK-1 is ubiquitously expressed is similarly naïve. There are three paralogous enzymes that are all associated with the same reaction, and we have shown that these three function redundantly in vivo, perhaps in different tissues (PMID: 40011787). Moreover, single cell RNA-seq data (PMID: 38816550) also shows enrichment of hxk-1 in gonadal sheath cells.

      The table should have at least the following information: gene/protein name - Wormbase ID - TPM levels of single cell data assigned to tissues for L2, L4 and adult (all published) - tissues in which expression is observed in the lines presented by the authors.

      Other points:

      (1) We would encourage the authors to provide systematic validation of the reported insertions. The manuscript reports that 24 of 30 tags were isolated and visible but does not clearly state whether each isolated line was confirmed by sequence‑level validation to be correctly in‑frame and free of unintended mutations at the target locus.

      (2) The manuscript presents aggregated success counts (e.g., 8/10 mTagBFP2 tags, 9/10 mStayGold, 7/10 mScarlet3) and useful narrative descriptions of injection outcomes. We also suggest including per‑locus success rates.

      (3) For pools that required re‑injection after initial failures, we would like to see a description of the specific changes that were made to the injection mixes or procedures (e.g., new repair template prep, different Cas9 reagent lot, guide redesign). This will be useful troubleshooting information for others.

      (4) The authors state that the fluorophore sequences are codon-optimized for C. elegans. We suggest they provide the exact donor/tag sequences used and specifically state whether the fluorophore sequences contain any synthetic/artificial introns, or whether other sequence modifications (e.g., silent PAM‑disrupting mutations) were included in the donor templates.

      (5) Page 3: Include a reference for "The C. elegans genome encodes around 20,000 genes"

      We hope these comments are useful.

      Comments on Revised Version:

      Overall, we found the responses to be quite recalcitrant.

      We have one remaining composite concern about the comparison between observed expression patterns with the new strains versus published data.

      First, the authors only report patterns for one stage, while it should not be too much effort to image the different life stages. However, since this is a revision, we are not formally requesting they do this.

      Second, in the now provided Table (thank you) 'observed expression' (last column) is lacking for 9 of the 30 proteins, and for 6 of these the procedure was not successful. Why not report patterns for the other three? It is confusing also because on page 5, the authors say that "overall, 24 of 30 tags ...all of which were visible with fluorescence stereomicroscopy" - are we missing something? Also, they then said that they "obtained 6/9 of the originally failed tags"; why are the corresponding patterns not included in Table 1, and why are 9 proteins still labeled as "no" in the "success?" column?

      Third, we strongly feel that the response to our comments about expression patterns is not adequate. On page 5 the authors say that "all proteins were expected to be ubiquitously expressed" and that "scRNA-seq indicated that transcript abundance was ubiquitous and without strong tissue-specific enrichment with few exceptions". However, in their rebuttal, the authors now argue for tissue-specific expression for proteins with paralogs, turning around their own argument! Moreover, their Table indicates that many genes show tissue-enriched expression by RNA-seq while many of their tagged proteins exhibit ubiquitous expression.

      Overall, this indicates that both the overall accomplishment of generating tagged protein strains and analyzing their expression is oversold.

    3. Reviewer #3 (Public review):

      Summary:

      The authors argue that establishing the expression pattern and sub-cellular localisation of an animal's proteome will highlight hypotheses for further study. This claim is probably accepted by many in the community. This manuscript seeks to confirm the feasibility of establishing such a resource, by using current transgenic methods to knock in DNA encoding different colored fluorescent tags into C. elegans genes.

      Strengths:

      The authors make the points above. For example, they provide evidence that the C. elegans germline harbors two populations of mitochondria that differ qualitatively in the proteins they express. They also confirm that labelling the whole proteome is an achievable goal with relatively limited resources and time.

      Weaknesses:

      The work is somewhat incremental in that it uses existing transgenic technology. Cell biology in C. elegans is challenging because of the small size of many of its cells, notably neurons. This can make establishing the sub-cellular localisation of a fluorescently tagged protein, or co-localizing it with another protein, tricky. The authors point out in their introduction that advances in light microscopy such as diSPIM, STED and ISM (a close relative of SIM), have increased the resolution of light microscopy. They also point out that recent advances in expansion microscopy can similarly help overcome the resolution limit. However, they do not use these technologies to characterize their transgenic strains.

    4. Reviewer #4 (Public review):

      Summary:

      Tagging the entire proteome of a metazoan would be a landmark achievement, providing a powerful complement and extension to existing "omic" catalogs in model systems. Here, Eroglu and Hobert argue that efficiently tagging multiple loci in a single "batch" would make the community-based achievement of this goal realistic. They provide rigorous evidence that such an approach is indeed feasible, exploring issues related to efficiency, design and screening strategies, disruption of gene function, and the potential for endogenously tagged alleles to reveal unexpected aspects of protein expression and localization. While the work has some minor gaps that are important to rigorously assess the feasibility of the proposed effort, the detailed and valuable insights that emerge should provide impetus to the community to coordinate efforts to make this ambitious goal a reality.

      Strengths:

      The work has numerous strengths. The authors provide compelling evidence that:

      - three distinct loci can be efficiently targeted with three distinct fluorescent tags in a single injection.

      - thoughtful targeting design can reduce the likelihood of disruption of function by the tag.

      - systematic design principles based on expression level and predicted localization/function can be used to optimize tagging strategies.

      - the resulting tags can provide unexpected insight into patterns of protein production and subcellular localization.

      Not all of these advances are novel in themselves, but taken together, they represent an important technical and conceptual advance. The most important strength comes from the exceptionally high value of the goal itself, in that the work has the potential to spur a community-wide effort toward achieving the ambitious goal of proteome-wide tagging.

      Weaknesses:

      The work's shortcomings are minor.

      - One concern has to do with the feasibility of the proposed screening strategies. The experimental design cleverly coinjects tags for three loci in different gene expression 'zones'; this expression level determines which tag will be used. As the authors allude to, among genes with the same overall FPKM value there is an important distinction between those that are expressed broadly and those focally expressed in a specific tissue. The proposed strategy claims that there are a sufficient number of highly expressed genes "to be used as visible markers" for recovering successfully edited animals. It would be useful for the authors to discuss the issue of broad vs focused expression among this set of genes a bit more thoroughly, with an eye toward the issue of how likely it is that these genes could indeed consistently be used as visible markers, particularly for those at the low end of this limit.

      - What fraction of the proteome (on a per-gene basis) is secreted proteins? How difficult will it be to screen these for successful tags? Are there specific tags that would be more optimal for secreted proteins? (The authors mention the use of an SL2 or T2A cassette to label the cells in which these proteins are expressed but note that there are technical challenges associated with doing this at scale.)

      - For secreted and/or weakly expressed genes, it would be useful for the authors to estimate for what fraction of these would successful insertions need to be screened by PCR, and what resources (time and money) this would likely entail.

      - For how many genes would a single tag not capture all predicted isoforms?

      - Finally, some readers might object to the authors' assertion in the abstract that this work is "a first step in this direction" (presumably referring to designing a strategy for whole-proteome tagging). There is no concern that the authors are disregarding the extensive work of other groups, as they explicitly mention the contributions of other groups to the foundation that enables the present work. However, the spirit of the abstract could be misinterpreted by a well-intentioned reader.

    1. Reviewer #1 (Public review):

      [Editor's note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      The authors aimed to uncover novel therapeutic vulnerabilities in APC-mutant colorectal cancer (CRC), which constitutes the majority of CRC cases. They hypothesized that modulating oxygen-sensing pathways (via PHD inhibition) could disrupt adaptive stress responses in these tumours.

      Strengths:

      The study employs a powerful, two-pronged approach to identify Molidustat's targets. By using both Thermal Proteome Profiling (TPP) and an orthogonal chemical proteomic competition assay, the authors provide compelling evidence that GSTP1 is a genuine, direct off-target, effectively addressing the common limitation of indirect effects in proteomic screens.

    2. Reviewer #2 (Public review):

      Summary:

      The authors aimed to determine Molidustat targets and the potential utility of these findings. They clearly demonstrate that Molidustat interferes with GSTP1 and some other proteins on top of PHD2. They also demonstrate that PHD2 deletion is not sufficient to recapitulate Molidustat effects in cells and proteomes. Finally, they demonstrate synthetic lethality in organoids for Molidustat and APC deletion.

      Strengths:

      The data on Molidustat proteomes, GSTP1 binding, inhibition and metabolic health of organoids is really clear. All biochemical, docking and omic data are really strong. The potential impact of these findings could be the use of Molidustat in APC null tumours and awareness of potential off-target effects.

    3. Reviewer #3 (Public review):

      In this paper, the authors revealed that Molidustat can induce a dose-dependent increase in Caspase-3/7 activity in the HT29 cell line, which is an APC-mutant colorectal cancer cell line. More importantly, they found that targeting PHD2 alone cannot cause cell death. By using thermal proteome profiling (TPP) and orthogonal chemical proteomic competition assays, they identified GSTP1 as a previously undiscovered off-target of Molidustat. They also revealed that combined PHD2 and GSTP1 loss leads to an increase in intracellular ROS and apoptosis. Moreover, they evaluated the effects of Molidustat in colonic organoids and showed that Molidustat has a high selectivity for colonic organoids with activated WNT signaling and/or KRAS pathway alterations, and this effect is not reproduced by hydroxylase inhibition alone, providing a new potential approach to targeting both PHD2 and GSTP1 for the treatment of APC-mutant CRC.

    1. Reviewer #1 (Public review):

      Summary:

      Knowing that small pupil-size variations accompany brightness variations (even when these are illusory), the authors asked whether pupil constrictions would accompany the synesthetic perception of a brighter color (compared with a darker one), induced by the presentation of a black-white character. This grapheme-colour synesthesia is only experienced by few participants, sixteen of whom were enrolled in this study. The results reliably showed that a relative pupil constriction would "betray" the perception of a brighter color in these participants, while no such effect would be observed in control participants who were asked to report a color in association with each grapheme, even though they did not perceive any.

      Strengths:

      The main strength of the study lies in its combination of psychophysics (brightness ratings) and pupillometry, which allowed for showing clear-cut results.

      Weaknesses:

      I only see the following relatively minor weaknesses, namely:

      - The pupil traces in Figure 3 (main results) are heavily pre-processed (per-participant demeaned), losing any feature besides the effect of interest. As I argued in my first review, I worry that this format gives unrealistic expectations about the effect. The perception of dark/bright colors does not generate a net dilation/constriction of the pupil; perception-related modulations of pupil size are always relative and generally small compared to the numerous other effects registered in pupil size. These include a pupil dilation that is more prominent in the controls and that gets analyzed later on in the manuscript; I do not think that eliminating one of the effects of interest from a main results figure helps the reader understand the results. In the revised manuscript, the authors addressed this concern by adding a Supplementary Figure 4, where a more complete representation of the results is shown (traces from individual trials are baseline corrected and averaged, resulting in more informative timecourses). I would strongly recommend that Supplementary Figure 4 be brought to the main text (Figure 3 could be presented in Supplementary).

      - Responses to physical brightness modulations were only measured in the synesthetes group, not in controls. The authors point out that pupillary light responses have been thoroughly characterized in previous studies, and conclude that synesthetes' responses were in line with the expectations both in terms of amplitude and latency. However, as we are not dealing with standardized measurements, subtle differences in pupil reactivity across the two populations remain a possibility. I recommend that this possibility be mentioned in the discussion.
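      To make the preprocessing distinction in the first point concrete, the following is a minimal sketch on toy data, not the authors' pipeline. It assumes one reading of "per-participant demeaned" (subtracting the participant's mean trace across trials) and contrasts it with per-trial baseline correction; the function names, the array layout (trials by samples), and the numbers are all hypothetical.

```python
import numpy as np

def baseline_correct(trials, n_baseline):
    """Subtract each trial's own mean over its first n_baseline samples."""
    trials = np.asarray(trials, dtype=float)
    baseline = trials[:, :n_baseline].mean(axis=1, keepdims=True)
    return trials - baseline

def participant_demean(trials):
    """Subtract the participant's mean trace across trials (assumed reading of 'demeaned')."""
    trials = np.asarray(trials, dtype=float)
    return trials - trials.mean(axis=0, keepdims=True)

# Toy data: 4 trials x 5 samples. Every trial shares the same slow dilation ramp
# and differs only in its tonic pupil size (arbitrary units).
ramp = np.array([0.0, 0.0, 0.1, 0.2, 0.3])
offsets = np.array([3.0, 3.2, 2.9, 3.1])
trials = offsets[:, None] + ramp[None, :]

corrected = baseline_correct(trials, n_baseline=2)  # shared ramp is preserved
demeaned = participant_demean(trials)               # shared ramp is removed
```

      In this toy case the baseline-corrected average still shows the shared dilation, whereas the demeaned average is flat, which is the loss of information the comment above is concerned with.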

      Impact:

      This work is likely to improve our understanding of synesthesia, providing a new tool to quantify the subjective sensations; an interesting potential extension would be using pupillometry for tracking changes over time of the synesthetic experiences, opening up the possibility to evaluate the importance of learning for this peculiar experience.

    2. Reviewer #2 (Public review):

      Synesthesia is a neurological condition where stimulation of one sensory channel leads to involuntary, automatic, and consistent experience of another, unrelated percept. For example, Sir Francis Galton (1880, Nature) famously described the robust tendency of some individuals (synesthetes) to associate numerals with a distinct color. Ever since, synesthesia has attracted broad interest in the cognitive neurosciences in light of its implications for the study of domains such as perception, consciousness, and brain connectivity, among others.

      Strauch, Leenaars, and Rouw measured pupil size in a group of 16 grapheme-color synesthetes and two matched control groups. The participants were presented with gray digits - that is, visual stimuli having identical physical properties in terms of brightness. Each participant subsequently rated the corresponding evoked color and brightness: unlike controls, synesthetes did so in a very consistent and reliable fashion. Accordingly, this was also shown in their pupils: despite the same objective luminance, digits associated with brighter percepts caused their pupils to constrict and digits associated with darker percepts caused their pupils to dilate more than controls. These results highlight how crossmodal correspondences are deeply rooted in synesthetes, and put forward pupillometry as a particularly appealing biomarker for some phenomenological experiences (at least those grounded in "brightness").

      Further strengths of the technique are its temporal resolution and its responsiveness to several constructs. Across several tasks, the authors show for example that responses to synesthetic light are somewhat slower than responses to real light (i.e., they are likely mediated), but at the same time faster than responses to mental imagery. The role of mental imagery can also be reasonably dismissed when considering the second feature of pupil size: its responsiveness to mental effort and cognitive load. The pupils tend to dilate with demanding, challenging tasks, and this was the case when control participants were asked to report the color of a digit for which they did not consistently experience a synesthetic association. The same task was, instead, seemingly effortless for synesthetes, again speaking in favor of the automaticity of number-color correspondences in their case.

      Overall, the findings by Strauch, Leenaars, and Rouw are highly significant for the field and likely to be impactful. The strength of their evidence, when accounting for the relatively small sample size and the inherent variability of both phenomenology (color perception and subjective reporting) and physiology (pupil size), is adequate and sufficiently convincing.

      Comments on revisions:

      I thank the authors for addressing all my comments in a satisfactory way. I think that the paper has improved, especially in terms of transparency of the reporting and clarity of the results.

    3. Reviewer #3 (Public review):

      Summary:

      In the present study, the authors examined pupillary responses to uncolored stimuli (number graphemes) among number-color synesthetes and non-synesthetes. After seeing a digit, the synesthetes and active control participants were asked to indicate which color they perceived using three dimensions of hue, saturation, and lightness. The lightness values were the primary independent variable for follow-up analyses. To see how the pupil responded to psychologically "bright" and "dark" digits, the authors split the reported lightness values at the median and plotted them. The synesthetes showed a pupillary constriction to digits they perceived as bright and dilation to digits they perceived as dark. Active control participants did not show that effect. In a subsequent block, only the synesthetes were shown the colors they reported perceiving as colored discs. Their pupillary responses were similar. The authors also found that the differences in pupillary responses between light and dark perceptions (with digits) were only slightly delayed in their onset to the perception of a colored disc, and therefore the color perception accompanying a digit is unlikely to be effortful or a retrieved association, but occurs rather automatically.

      Strengths:

      The authors employed a well-controlled and well-designed quasi-experiment comparing color-grapheme synesthetes to non-synesthetes and showed convincingly that the color perceptions accompanying graphemes alter the physical perception of brightness. They also made a reasoned attempt to rule out the possibility that color associations occur effortfully via retrieved associations.

      The following are questions that I asked in the first round of review, and which were answered adequately by the authors:

      (1) Are the pupillary responses among synesthetes, which objectively do not seem to match the degree of physical stimulation entering the retina, in any way maladaptive for eye functioning? I understand the constriction/dilation of the pupil to not only benefit visual acuity but also to protect the retina from damage. Are synesthetes at any risk of retinal damage due to over-dilation of the pupil to brighter stimuli? Or are these effects of a magnitude that is too small to matter? As reported in arbitrary units, it was hard to know how large these effects were in terms of measurable changes in dilation (e.g., millimeters).

      (2) Likewise, is the automatic synesthetic merging of two percepts something that could be learned such that natural synesthetes and "artificial" synesthetes would look similar? For example, if a group of non-synesthetic participants were to learn a color-grapheme association to automaticity, would you expect their pupillary responses to the graphemes to look similar to those of the synesthetes? If so (or if not), what would this tell us about the phenomenology of synesthesia?

      (3) Do the synesthetic perceptions of digit graphemes merge in a sensible way? For example, if a synesthete sees a particular color with the digit 1, and a different color with the digit 9, what do they perceive when they see 19? or 1-9, or 1 9? Is there color blending, or an altogether different color perception?

    1. Reviewer #1 (Public review):

      This work compiles a comprehensive atlas of ncORFs across mammalian tissues and cell types, derived from reanalysis of ~400 public ribosome profiling datasets. The authors then evaluate cross-species conservation and functional signatures, proposing that evolutionarily ancient ncORFs tend to have higher translation potential, stronger expression, and closer relationships with canonical coding sequences.

      Strengths:

      In general, the study provides a large-scale and timely resource of annotated ncORFs, which could be broadly useful for the community. The authors collected ~400 public ribosome profiling datasets for annotations of ncORFs, which, to my best knowledge, is the largest collection of data for such purpose. The catalog could facilitate future investigations into ncORF biology and broaden understanding of the coding potential of the "non-coding" genome.

      Weaknesses:

      Based on the ncORF catalog, some of the analyses were not properly done. Some of the results are descriptive.

      (1) Bias and representations of data source. Public ribo-seq datasets are unevenly distributed across tissues and cell lines, raising concerns about heterogeneity and underrepresentation of certain contexts. This may limit the generalizability of the catalog.

      (2) The discussion on modular domains of ncORFs is unclear, and the claim that they may originate via TE-related mechanisms is not well supported. Stronger evidence or clearer reasoning is needed.

      (3) The conservation comparisons are not fully convincing. Figure S7 shows only mild differences between ncORFs and CDS, and statistical significance is not clearly demonstrated. Comparisons with other non-coding RNAs should be added, and overlapping sequences between ncORFs and CDS should be excluded to avoid bias.

      (4) Figure 3 indicates that some ncORFs are subject to evolutionary constraint, which is not surprising. More granular analyses contrasting the features of these "conserved" ncORFs with the "non-conserved" ones would be informative. Informative work of this kind has already been done in Drosophila, worms, mouse, and human; small ORFs, especially uORFs, have been extensively studied for their functions and cross-species conservation. The authors should explicitly show what is new in their analyses.

      (5) Translation levels are reported using RPF counts. However, translation efficiency (normalized by RNA expression) is a more appropriate measure to account for expression heterogeneity.

      (6) The correlation analyses between ncORF translation levels and PhyloCSF are confusing and largely descriptive. These sections need sharper framing and clearer conclusions.

      (7) Public ribo-seq datasets, generated by different research labs, are known for their strong batch effects. Representations of tissues and cells are also very unbalanced. Therefore, the co-translation analysis between ncORFs and canonical CDS is not well controlled. This should be done by referring to a recent large-scale ribo-seq meta-analysis (Nat Biotechnol. 2025. doi: 10.1038/s41587-025-02718-5).
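      As an illustration of point (5), translation efficiency is commonly computed as the ratio of library-size-normalized ribosome footprint counts to RNA-seq counts over the same ORFs. The sketch below is a simplified version under stated assumptions (counts-per-million normalization, a pseudocount of 1, matched Ribo-seq and RNA-seq libraries); the function names and numbers are hypothetical and do not come from the manuscript.

```python
import numpy as np

def cpm(counts):
    """Scale raw counts to counts per million of the library total."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum() * 1e6

def log2_te(rpf_counts, rna_counts, pseudocount=1.0):
    """log2 translation efficiency: normalized RPF signal over normalized RNA signal."""
    return np.log2((cpm(rpf_counts) + pseudocount) / (cpm(rna_counts) + pseudocount))

# Toy example with three ORFs measured in matched Ribo-seq and RNA-seq libraries.
rpf = [200, 200, 600]
rna = [100, 300, 600]
te = log2_te(rpf, rna)
```

      Here the first two ORFs have identical RPF counts, yet the first has positive and the second negative log2 TE because their RNA abundances differ. Raw RPF counts alone cannot separate these two cases.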

      Comments on revisions:

      The authors have made efforts to address most of the previous concerns, and several points have been clarified or improved in the revision. However, in a number of cases, the responses rely more on acknowledgment and reframing than on substantive analytical strengthening. Overall, the manuscript is improved, particularly in terms of clarity, transparency, and positioning of claims. I support its publication and look forward to seeing how the field engages with and discusses these claims.

    2. Reviewer #2 (Public review):

      Summary:

      Chang et al. attempted to analyze a large number of ribo-seq datasets through a standardized pipeline, identifying novel non-canonical ORFs and elucidating their evolutionary and expression characteristics.

      Strengths:

      (1) The datasets analyzed by the authors are sufficiently comprehensive, and the use of standardized pipelines ensures excellent analytical consistency.

      (2) Their analyses of ORF evolution and co-expression further deepen our understanding of these ORFs.

      Weaknesses:

      (1) The authors primarily conducted analyses through bioinformatics, lacking sufficient wet-lab experimental evidence.

      (2) Some analytical methods and standards were not clearly presented in the manuscript.

    1. Reviewer #1 (Public review):

      Summary:

      Meijer et al. sought to investigate the role of cortical layer 6b (L6b) neurons in modulating sleep-wake states and cortical oscillations under baseline and sleep-deprived conditions and in response to orexin A and B. Using chronic EEG recordings in mice with silencing of Drd1a+ neurons (via constitutive Cre-dependent knockout of SNAP25), the authors report that while changes in overall baseline sleep-wake architecture and in the response to sleep deprivation are minimal or absent, "L6b silencing leads" to a slowing of theta activity during wakefulness and REM sleep, and a reduction in EEG power during NREM sleep. The manuscript is written with clarity and transparency. Although Drd1a+ neurons are not exclusive to L6b, the authors describe key future studies to identify a causal role for L6b neurons in brain state regulation. These studies contribute to a growing body of evidence that cortex, in addition to subcortical brain regions, plays a role in brain state regulation.

      Strengths:

      (1) The text is well written.

      (2) The authors are transparent about methodological details and study limitations.

      (3) The stated sleep, circadian, and orexin infusion experiments are well designed, executed, and analyzed.

      Weaknesses:

      (1) Outcomes are attributed to silencing cortical L6b neurons, but the genetic manipulation is not specific to L6b neurons or cortex. The authors acknowledge this as a limitation and offer targets for future studies to identify L6b neuron-specific contributions to stated outcomes that include spatially restricted manipulations.

      (2) Experiments use only male mice, which limits generalizability to females.

      Comments on revised version:

      The authors took great care in addressing my previous comments, and I do not have any additional concerns.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Meijer and colleagues investigated the effects of inactivation (conditional silencing) of cortical layer 6b neurons on sleep-wake states and EEG spectral power under the following three conditions: during natural sleep-wake states, after sleep deprivation, or after intracerebroventricular administration of orexin A and B. The authors report that silencing of L6b neurons did not have a significant effect on the total time spent in sleep-wake states, duration or number of state epochs, or the response to sleep deprivation. However, silencing of L6b neurons did slow down theta-frequency (6-9 Hz) during wake and REM sleep, and reduced the total EEG power during NREM sleep. Infusion of orexin A in the mice in which cortical layer 6b neurons were inactivated produced an increase in wakefulness. A similar effect was observed after infusion of orexin A in the mice in which these neurons were not silenced, but the effect (i.e., increase in wakefulness) was of a smaller magnitude. Silencing of cortical layer 6b neurons attenuated the effect of orexin B in increasing theta activity, as was observed in the control mice. The authors conclude that the cortical neurons in layer 6b play an essential role in state-dependent dynamics of brain activity, vigilance state control and sleep regulation.

      Strengths:

      - A focus on cortical layer 6b neurons, which is an understudied neuronal population, especially in the context of brain and behavioral state transitions.

      - The authors used a well-established mouse model to study the effect of inactivation of cortical layer 6b neurons.

      Weaknesses:

      - Although the authors used a highly selective approach to silence layer 6b neurons, the observed changes in EEG oscillations cannot be solely attributed to layer 6b neurons because of the ICV route for orexin administration.

      - The rationale for using only male mice is not provided.

      Comments on revised version:

      The authors have addressed my concerns.

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript explores the role of the Evening Complex (EC), specifically focusing on ELF3, a disordered protein component of the EC, and its temperature-dependent phase behavior. The study highlights the role of polyQ tracts in modulating temperature-sensitive condensate formation and provides a combination of computational approaches, including REST2 simulations and coarse-grained Martini simulations, to investigate how polyQ tract length and sequence context influence this behavior.

      Strengths:

      The study addresses a key question in plant biology - how temperature influences circadian clock-mediated growth regulation through protein phase behavior. The manuscript introduces the novel finding that polyQ tract length modulates the temperature-dependent formation of helices and condensates.

      Weaknesses:

      (1) Coarse-Grained Simulation Results Not Supported by Data:

      The results presented in Figure 6A of the manuscript do not seem to show a clear trend in the number of clusters formed as a function of polyQ tract length. This is particularly evident in the comparison between 0Q and 7Q polyQ lengths, which display statistically similar values in terms of the number of clusters. The lack of distinction between these values raises questions about the sensitivity of the coarse-grained simulations to polyQ tract length, which the authors claim as a key modulator of condensate formation. This discrepancy weakens the argument that polyQ length directly impacts the clustering behavior in the simulations.

      Suggested Analysis:

      a) A more detailed statistical analysis should be performed to assess whether the observed differences between polyQ lengths are significant. This could involve hypothesis testing or the use of error bars in the graphs to better communicate the variability in the data.

      b) Additionally, the authors should examine whether there are other features, such as cluster shape or internal structure, that might differentiate between different polyQ lengths, even if the total number of clusters is similar.

      (2) Inconsistency in Cluster Size Across Temperatures (Figure 6B):

      The results in Figure 6B show a striking difference in the size of the largest cluster between temperatures of 290K and 300K. This abrupt shift in behavior lacks a clear mechanistic explanation. Typically, phase transitions driven by temperature are more gradual, unless there is some underlying structural or chemical shift that the authors have not accounted for. Without a clear explanation, this sudden change in behavior reduces confidence in the simulation results.

      Suggested Analysis:

      a) The authors should explore possible explanations for the dramatic difference in cluster size between 290K and 300K. For example, they could investigate whether specific interactions (such as the breaking or formation of hydrogen bonds or hydrophobic contacts) might explain the behavior at higher temperatures.

      b) It is important to check whether the coarse-grained simulation model has been adequately parameterized and scaled for accurate temperature dependence. Atomistic simulations of monomers and dimers with varying polyQ tract lengths could be used to fine-tune the coarse-grained model, ensuring it accurately reflects molecular behavior. The gross estimate of a 10% scaling factor might be insufficient and could lead to inaccurate representations of cluster formation.

      (3) Scaling of Coarse-Grained Model with Atomistic Simulations:

      As mentioned, the coarse-grained model used in the study may not have been properly scaled against atomistic data. A simple scaling factor of 10% may not be appropriate for accurately capturing the behavior of polyQ tracts across different lengths, especially considering their sensitivity to subtle changes in temperature. Without rigorous validation against atomistic simulations, the coarse-grained model's predictions could be skewed.

      Suggested Analysis:

      a) To address this, the authors should compare the coarse-grained model with atomistic simulations of monomeric and dimeric forms of ELF3 with different polyQ tract lengths. By comparing key structural parameters (e.g., radius of gyration, contact maps, and clustering propensity), the authors could adjust the coarse-grained model to more accurately reflect the atomistic behavior. The authors have a wealth of atomistic simulation data that could support such benchmarking and the identification of a scaling factor.

      b) Additionally, the authors should investigate whether the assumed scaling factor of 10% is appropriate for each polyQ length or whether it needs to be refined based on specific properties, such as the number of hydrophobic interactions or secondary structure stability.

      (4) Lack of Analysis for Liquid-Like Behavior in Phase Separation:

      The simulations presented in the manuscript do not analyze the liquid-like behavior of ELF3 condensates, which is a key characteristic of liquid-liquid phase separation (LLPS). In LLPS systems, condensates are often dynamic, with chains exchanging between clusters, indicating liquid-like rather than solid-like behavior. The authors fail to probe this crucial aspect, which is necessary to support the claim that ELF3 undergoes phase separation.

      Suggested Analysis:

      a) The authors should conduct additional analyses to probe the liquid-like nature of the clusters formed by ELF3. One approach would be to analyze the dynamics of chain exchange between clusters, measuring how frequently chains leave one cluster and join another over time. This analysis would reveal whether the condensates behave as liquid-like, dynamic structures or more static, solid-like aggregates.

      b) Additionally, the temperature dependence of these exchange dynamics should be investigated. In true liquid-liquid phase separation, the rate of chain exchange is often sensitive to temperature. Observing how this rate changes between 290K and 300K, for instance, could help explain the abrupt shift in cluster size seen in Figure 6B.

      c) The authors should also analyze whether the internal structures of the condensates are consistent with a liquid-like phase. For example, radial distribution functions and contact lifetimes could be calculated to reveal whether the clusters exhibit liquid-like organization.
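      One simple way to quantify the chain exchange suggested in a) and b) is sketched below. It assumes that a clustering step has already assigned each chain a cluster id in every frame (with -1 marking chains in the dilute phase) and measures how often two chains that share a cluster at one frame still share one after a given lag. The function name, the label convention, and the toy trajectories are hypothetical, not taken from the manuscript.

```python
import numpy as np

def comembership_persistence(labels, lag):
    """Fraction of chain pairs sharing a cluster at frame t that still share one at t + lag.

    labels: (n_frames, n_chains) integer array of cluster ids; -1 means no cluster.
    Pairs are compared by co-membership, so cluster ids need not be consistent across frames.
    """
    labels = np.asarray(labels)
    n_frames, n_chains = labels.shape
    upper = np.triu_indices(n_chains, k=1)  # each unordered pair once
    kept, total = 0, 0
    for t in range(n_frames - lag):
        a, b = labels[t], labels[t + lag]
        same_now = ((a[:, None] == a[None, :]) & (a[:, None] >= 0))[upper]
        same_later = ((b[:, None] == b[None, :]) & (b[:, None] >= 0))[upper]
        total += int(same_now.sum())
        kept += int((same_now & same_later).sum())
    return kept / total if total else float("nan")

# Toy trajectories with four chains.
static = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1]]   # no exchange between clusters
mixing = [[0, 0, 1, 1], [0, 1, 0, 1]]                 # every pair is split after one frame

p_static = comembership_persistence(static, lag=1)
p_mixing = comembership_persistence(mixing, lag=1)
```

      A value that stays near 1 at long lags would point to static, solid-like aggregates, while a decay with lag (and a faster decay at higher temperature) would be consistent with liquid-like exchange.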

      (5) Lack of justification of polydispersity of polyQ:

      The authors do not provide any rationale for the choice of polyQ tract lengths used in their chain-growth simulation studies. The choice would be more convincing if it were motivated by prior experimental observations.

      (6) Lack of initiative to connect to Experiments:

      While the computational models and simulations provide robust theoretical insights, the absence of direct experimental validation weakens the overall impact of the manuscript. For example, experimental data on how specific mutations in the polyQ tract influence ELF3 behavior in vivo would significantly bolster the authors' claims. The manuscript would benefit from either citing existing experimental studies that corroborate these findings or from suggesting future experimental directions.

      Comments on revised version:

      The authors have now adequately addressed the key concerns about the manuscript. In its present form, the manuscript is significantly improved.

    2. Reviewer #2 (Public review):

      Summary:

      The authors investigate how ELF3, a disordered scaffolding protein in the plant circadian Evening Complex, responds to temperature by forming reversible nuclear condensates. They focus on the C-terminal prion-like domain and on a variable polyglutamine tract within it, asking how the tract length and surrounding sequence context tune temperature-responsive structural and condensation behavior. Using a tiered set of computational approaches, including sequence heuristics, hierarchical chain-growth ensembles, all-atom enhanced-sampling simulations, and coarse-grained condensate simulations of 100 monomers, they characterize wild-type, polyQ deletion, polyQ expansion, and an aromatic-disrupting F527A variant. In the revised manuscript, the central claim has been reframed so that polyQ length is now described as tuning condensate material properties rather than driving temperature-sensitive phase separation, with temperature-responsive condensation attributed primarily to a sticker-rich aromatic contact network.

      Strengths:

      The biological question is important and timely, and the multiscale computational strategy provides a fresh view of an intrinsically disordered protein and its variants. The all-atom enhanced sampling analyses identify a temperature-dependent long-range aromatic contact involving F527 and a methionine-tyrosine coordination motif, which are concrete and mechanistically interesting observations beyond what coarse-grained or sequence-only methods could provide. In response to the previous round of review the authors have added replicate averaged statistics with error bars on the new condensate analyses, introduced new dynamics observables including effective diffusivity, an anomalous diffusion exponent, the self van Hove function, shape anisotropy, per chain radius of gyration in the condensed phase, and a condensate lifetime, provided cluster size time series for transparency, justified the choice of polyQ tract lengths against published Arabidopsis polymorphisms, expanded the Methods with explicit formulas for the new analyses, and included a split half convergence check for the all atom ensembles. The reframing toward a sticker spacer interpretation is consistent with recent experimental work and represents a more cautious and defensible reading of the data.

      Weaknesses:

      Despite these substantive additions, several core concerns from the previous review remain only partially addressed, and, on close reading, the new supplementary analyses do not robustly support the reframed claim that polyQ length tunes condensate material properties. Error bars and replicate-averaged statistics were added to the new condensate panels, but the helical propensity and per-residue analyses throughout the rest of the manuscript still show only a single curve per temperature, so variability for these key observables remains unreported. Several of the newly added dynamics observables show that the variants are essentially indistinguishable within the reported uncertainty: the self van Hove distributions, the shape anisotropy distributions, and the per-chain radius of gyration distributions in the condensed phase overlap almost entirely across variants, and the anomalous diffusion exponent has between-replica spreads at low temperature that exceed the variant-to-variant differences, with variant orderings that change with temperature. The variant-dependent signal that does survive, namely a drop in condensate lifetime for the polyQ expansion and the aromatic mutant at the highest temperature studied, rests on a single temperature point, with replicate spreads spanning most of the metric's dynamic range.

      The cluster size time series at higher temperatures shows the dominant cluster oscillating over a wide range across replicas, indicating intermittent dissolution and incomplete convergence in the very temperature regime where the variant-specific claims are made. The only convergence test provided is a split-half radius-of-gyration analysis for the all-atom ensembles, with no slab-geometry or coexistence-density check for the coarse-grained condensate simulations. The polyQ deletion variant forms dominant clusters comparable in size to wild type at low and intermediate temperatures. On its own, this argues that variable polyQ presence is not a primary determinant of clustering and supports the earlier concern that the temperature-sensitive behavior is dominated by generic chain length and aromatic sticker effects rather than polyQ-specific sequence effects, a concern that the reframing softens but does not resolve. Statistical significance is not assessed anywhere, and with three replicas and largely overlapping error bars, claims of variant-specific differences would benefit from explicit statistical tests. Minor quality control issues are also visible in the supplementary material, including a mislabeling of the aromatic mutant in two analysis panels and an inconsistent trajectory length for one variant at one temperature.
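
      As an illustration of what an explicit test with three replicas per variant can and cannot show, the sketch below implements an exact two-sided permutation test on the difference of means. It is not the authors' analysis; the function name and the toy numbers are hypothetical placeholders. With three replicas per group there are only 20 distinct relabelings, so the smallest p-value this particular test can return is 2/20 = 0.1, which is worth keeping in mind when choosing a test and the number of replicas.

```python
from itertools import combinations


def exact_permutation_p(a, b):
    """Exact two-sided permutation p-value for a difference of means.

    a, b: per-replica values of one observable for two variants
    (hypothetical inputs, e.g. one condensate lifetime per replica).
    """
    pooled = list(a) + list(b)
    n_a = len(a)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    as_extreme = 0
    total = 0
    # Enumerate every way of assigning n_a of the pooled values to group A.
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        stat = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
        # Small tolerance so the observed labeling always counts itself.
        if stat >= observed - 1e-12:
            as_extreme += 1
        total += 1
    return as_extreme / total


if __name__ == "__main__":
    # Toy values only: completely separated groups, three replicas each.
    print(exact_permutation_p([1.0, 2.0, 3.0], [10.0, 11.0, 12.0]))
```

      Even for completely separated groups, the call above cannot go below 0.1, so more replicas or a model that pools information across temperatures would be needed to support variant-versus-variant claims at conventional thresholds.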

      Additional Context for Readers:

      Readers should interpret the molecular mechanism proposed here with caution. The reframing from polyQ length driving temperature-sensitive phase separation to polyQ length tuning of condensate material properties is more scientifically measured and aligns with recent experimental work, but several of the supplementary observables introduced to support this revised claim indicate that the variants studied are statistically indistinguishable within the reported replicate uncertainty. The most robust observation in the revised work is that the prion-like domain undergoes a temperature-responsive break of an aromatic contact in all-atom simulations and that aromatic sticker contacts dominate inter-protein interactions in coarse-grained condensate simulations. The mechanistic role of the polyQ tract, beyond generic chain length and hydration effects, remains, as in the original submission, not clearly established by the simulations presented. Independent experimental validation of the proposed aromatic contact and of the predicted material-state differences between polyQ variants will be needed to establish the molecular mechanism, and improved condensate convergence tests, uniformly reported error bars across all simulation-derived figures, and explicit statistical tests of variant-versus-variant differences would substantially strengthen confidence in the conclusions.

    1. Reviewer #1 (Public review):

      Summary:

      The authors present a novel approach to subcellular spatial proteomics by combining laser microdissection with expansion microscopy and LC-MS/MS analysis (SPEx). They implement two different workflows for LMD and LC-MS/MS quantification:

      (1) The standard approach, where an area of interest is cut out by LMD, subjected to proteomics analysis, and compared to the rest of the cell without the dissected ROI.

      (2) The subtraction approach, where ROIs are removed, and the remaining cellular material is compared to samples containing both the surrounding material and the ROI.

      The authors assess the technique by applying it to subcellular targets of various sizes, volumes, and protein compositions such as the nucleus, nucleoli, and Golgi. They demonstrate that SPEx can identify proteins enriched or reduced in ROIs.

      Strengths:

      The broad, relatively easy, and inexpensive applicability of this approach to potentially many cell types and subcellular areas of interest provides an exciting alternative to subcellular fractionation, native immunoprecipitation, or genetically encoded proximity labeling constructs. Moreover, by visually selecting ROIs for subsequent analysis, subcellular context or organelle morphology can be taken into account, as discussed by the authors in the discussion section.

      Weaknesses:

      While strongly supporting the sharing of this approach, we have a number of comments and questions that will improve the impact of the manuscript:

      (1) General:

      a) The manuscript would benefit from restructuring and language revision. In its current form, the writing is sometimes dense and verbose (in particular, the Results section). This makes it difficult to follow the authors' arguments.

      b) The authors mention the possibility of selecting organelles based on morphology. This is left for the discussion, but it seems like a missed opportunity - the authors could compare individual organelles in different morphological states, e.g., connected vs. fragmented mitochondria.

      (2) Technical:

      a) Why do the authors strive and optimize for a 10x expansion factor? Is SPEx compatible with a more standard 4x expansion, as used, e.g., in the classic U-ExM approach (https://www.nature.com/articles/s41592-018-0238-1)? This could be added to the discussion.

      b) The U-ExM approach shows improved ultrastructural preservation when using 3% FA with 0.1% glutaraldehyde (GA) fixation. Is SPEx compatible with the use of low amounts of GA for fixation?

      c) Related to the above, was the anchoring efficiency reduced only to achieve a 10x expansion factor or does this additionally affect the proteome coverage?

      d) Have the authors considered using alternative anchoring approaches, such as GMA (https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0291506#pone.0291506.s001), which potentially increase the amount of sample retained in the hydrogel, thus allowing for better proteome coverage? This could be added to the discussion.

      e) The limitation of the approach to near-2D samples should be mentioned, and alternative approaches for more 3D samples could be discussed.

      f) How are peptides that are directly anchored to the hydrogel dealt with during LC-MS/MS analysis? Are they excluded, or can they be identified during the spectral search? The latter would allow us to get a deeper structural understanding of how proteins are actually anchored into hydrogels, which so far has not been assessed.

      An alternative approach to address this question would be to investigate if the peptide coverage of proteins detected by SPEx is enriched for peptides representing the folded core of proteins as opposed to the surface-exposed regions, which likely get more anchored into the hydrogel.

      g) Same question regarding peptides with NHS labeling. Can they be identified, or do they just compete for ionization and thus negatively affect coverage and dynamic range of the LC-MS/MS approach?

      h) How are the primary and secondary antibodies handled in the proteomics analysis? Are they identified as contaminants?

      i) Have the authors observed differences in proteomics coverage of only antibody vs NHS-labeling? Depending on the questions above, could pure antibody-based labeling increase proteomic coverage?

    2. Reviewer #2 (Public review):

      Summary:

      This study introduces a method that combines physical expansion of cells, imaging-guided isolation of defined regions, and protein identification to enable compartment-resolved analysis of protein composition at the subcellular scale. The authors aim to address a central limitation in existing approaches, namely the loss of spatial information during sample preparation or the indirect nature of proximity-based labeling methods. Using several cellular compartments as examples, they demonstrate that their approach can recover compartment-enriched protein sets and identify candidate proteins with previously unassigned localization.

      Strengths:

      A major strength of this work is the conceptual simplicity and accessibility of the approach. By combining established techniques in a modular way, the method avoids the need for genetic manipulation or specialized labeling strategies, making it broadly adaptable across experimental systems. The ability to directly select regions of interest based on imaging represents a clear advantage over indirect enrichment strategies and allows flexible targeting of both membrane-bound and non-membrane-bound compartments.

      The experimental design is also a strong aspect of the study. The use of complementary comparison strategies, analyzing isolated compartments alongside matched "subtracted" controls, provides an internal framework for assessing enrichment and depletion, increasing confidence in spatial assignment. The application of the method across multiple organelles of different sizes and properties demonstrates versatility, and the reported specificity for several compartments is encouraging. In particular, the ability to profile small and biochemically challenging structures highlights a potentially important niche for the approach.

      Weaknesses:

      Despite these strengths, several methodological limitations constrain the interpretation of the results. The most important relates to spatial accuracy in three dimensions. While lateral resolution is improved through physical expansion, the lack of depth resolution introduces uncertainty regarding contributions from structures above and below the selected region. Although the authors argue that this does not substantially affect specificity, the current evidence is largely indirect, and a more rigorous quantification of potential contamination would strengthen this conclusion.

      Quantitative interpretation also remains challenging. Because the measurements reflect total protein abundance rather than local concentration, differences in compartment size and protein density can influence enrichment values, particularly for small structures embedded within larger volumes. This issue is evident in the analysis of smaller compartments and complicates direct comparison across conditions. Additional normalization or modeling would help clarify how to interpret these measurements.

      Another limitation concerns variability in the expansion process and its downstream consequences. Differences in expansion factor across samples may affect the definition of regions of interest and introduce variability in sampling, yet the impact of this variability is not fully explored. Similarly, the use of a modified chemical treatment to preserve proteins for downstream analysis is central to the workflow but is not extensively validated with respect to preservation of spatial organization.

      While the identification of previously unannotated proteins is an appealing aspect of the study, validation is limited to a small number of examples, and broader support from independent datasets or literature context is lacking. In addition, the study primarily focuses on steady-state measurements in a single cell type, and therefore does not yet demonstrate the ability of the method to capture dynamic or condition-dependent changes in protein localization.

      Finally, the positioning of the method relative to existing approaches could be more clearly articulated. Although qualitative comparisons are provided, a more systematic and quantitative benchmarking against alternative strategies would help readers better understand the specific advantages and trade-offs.

    3. Reviewer #3 (Public review):

      Franziscus et al. describe an elegant approach for spatially specific proteome analysis. To achieve this, they expand fixed cells and subsequently use a laser to micro-dissect a region of interest, which is then analyzed by mass spectrometry.

      They demonstrate the effectiveness of their approach by analyzing the nucleus, nucleolus, and the Golgi, and benchmark their hits against previous datasets for these organelles.

      The manuscript is very well written and nicely guides the reader through the applied methods. The presented data is convincing, and I do not see the need for additional experimental verification of the protocol. The only minor concern is the novelty of the method and the presentation. A combination of expansion, laser microdissection, and proteomics has been applied in the past (PMID: 36450705, PMID: 39477916). In the manuscript, one of these studies is cited, though it does not become clear that this approach is already described. However, Franziscus et al. describe the approach better and make it more accessible to the reader, especially since the other studies described this methodology in combination with tissue expansion and not in combination with single cell expansion as it is done here. I would ask the authors to be clearer in the introduction about what others have already done and what their contribution is here. In general, I am convinced that the community will benefit from the presented protocol to analyze organelle proteomics in detail.

    1. Reviewer #1 (Public review):

      Summary:

      This study investigates a fundamental question in cognitive science: is our ability to reason about the physical world an abstract mental process, or is it "embodied", that is, directly rooted in our real-time physical interactions with the environment? The authors compared participants' performance in computerized reasoning games with and without Galvanic Vestibular Stimulation (GVS). They suggest that participants failed more often and utilized suboptimal strategies under GVS compared to a sham stimulation condition. Furthermore, they found that this detrimental effect of GVS was reduced when the games were governed by altered gravity (hyper- and hypo-gravity). Consequently, the authors conclude that the physical experience of the body modifies high-level cognitive skills, such as reasoning.

      Strengths:

      The manuscript is well-written, organized, and easy to follow, making complex concepts accessible. Also, combining a specialized physical reasoning task with real-time vestibular disruption (GVS) is an intriguing approach to testing the boundaries of embodied cognition.

      Weaknesses:

      (1) Lack of Overall Effects and Inflated Type I Error for Game-Level Effects

      The study utilizes a within-subject design. Taking Study 1 as an example, each subject participated in a familiarization session (4 games), a baseline session (12 games without stimulation), a GVS session (14 games), and a sham session (14 games). No game was repeated for any single subject. Performance was quantified using three primary measures (success rate, number of attempts, and time per attempt) and two strategy measures (tool switching and the distance between tool placements).

      For Study 1, to identify condition differences at the game level (i.e., Figure 2), the authors effectively conducted 70 independent t-tests (5 measures × 14 games). While 7 significant results were reported, this large number of independent tests invites an inflated Type I error rate, as no multiple-comparison correction appears to have been applied.

      A similar inflation is expected in Study 2, where 50 independent t-tests (5 measures × 10 games) yielded 5 significant comparisons (Figure 4). Although the authors might argue the direction of the differences is systematic, implying GVS generally impairs performance, at least one significant comparison shows the opposite effect: tool switching indicates that GVS led to better performance for the 'Table_A' game in Study 2 (Figure 4d), whereas the same variable indicated GVS led to worse performance in Study 1 (Figure 2d). I suspect that none of the significant game-level results would survive a proper statistical correction. If possible, the authors can redo statistical testing with corrections (FDR or Bonferroni) or with LMM using game as a random effect. Before proper statistical analyses, I strongly encourage the authors to refrain from drawing broad conclusions based on these isolated game-level results.
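
      To make the scale of this concern concrete, the following minimal sketch computes how many significant results would be expected by chance alone and applies the two corrections named above. It assumes that the tests are independent and that every null hypothesis is true, which is a simplification because the five measures come from the same games and participants. The function names and the example p-values are illustrative and are not taken from the manuscript.

```python
from math import comb


def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of significant tests if every null hypothesis is true."""
    return n_tests * alpha


def prob_at_least(k, n_tests, alpha=0.05):
    """P(at least k significant tests) for independent tests under the global null."""
    return sum(
        comb(n_tests, i) * alpha**i * (1 - alpha) ** (n_tests - i)
        for i in range(k, n_tests + 1)
    )


def bonferroni(p_values, alpha=0.05):
    """Reject a test only if its p-value is at most alpha divided by the number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the false discovery rate."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank whose p-value falls under its rank-scaled threshold.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            max_rank = rank
    # Reject every hypothesis ranked at or below that rank.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


if __name__ == "__main__":
    print(expected_false_positives(70))  # Study 1: 5 measures x 14 games
    print(expected_false_positives(50))  # Study 2: 5 measures x 10 games
    print(prob_at_least(7, 70))          # chance of 7 or more hits in Study 1
    print(prob_at_least(5, 50))          # chance of 5 or more hits in Study 2
    toy_p = [0.01, 0.02, 0.03, 0.5]      # illustrative p-values only
    print(bonferroni(toy_p), benjamini_hochberg(toy_p))
```

      At an alpha of 0.05, 70 tests are expected to yield about 3.5 significant results by chance and 50 tests about 2.5, so the reported counts of 7 and 5 should be read against that baseline. A mixed model with game as a random effect, as suggested above, would handle the dependence between measures more naturally than either correction.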

      Furthermore, when analyzing data across all games, the study found no significant effect of GVS on overall performance or strategy measures in either Study 1 or Study 2. This lack of an aggregate effect contradicts the authors' conclusion that participants failed more often or utilized suboptimal strategies under GVS.

      (2) Missing Rationale for Classification Analysis

      It is puzzling why the authors pursued two exploratory analyses on tool placement after revealing that the two related primary measures (tool positioning and switching) did not generate significant condition differences in Study 1. These additional analyses (the Dirichlet Process Gaussian Mixture Model and leave-one-out classification) were not pre-registered. In the absence of overall condition differences, the authors appear to be "doubling down" by applying sophisticated classification tools to the raw data without a clear prior rationale.

      (3) Insufficient Evidence for the Reduced Effect of GVS Under Altered Gravity

      To compare Study 1 and Study 2, the authors devised a "gravity-weighted index," but its definition is not sufficiently justified. The index assigns weights of 1, 2, and 3 to low-, medium-, and high-gravity-dependent games, respectively. The choice of these specific weights appears arbitrary, making the quantitative results difficult to interpret. More importantly, there is no citation or explanation regarding how these three levels of "gravity impact" were defined in the first place (Line 468). This index was also not pre-registered.

      The authors state that for the success rate index, a value close to -1 indicates a large negative difference for GVS, 0 indicates no difference, and 1 indicates a large positive difference. These are theoretical bounds; the actual distribution of each index should be examined to validate such claims. However, the paper lacks descriptive statistics for this composite index.

      Notably, the "reduction" of the GVS effect in altered gravity was only demonstrated in one of the five available indices (success rate, p = 0.046). In fact, the success rate in Study 2 was 66.7 (sham) vs 67.3 (GVS) according to Table 2. It is highly debatable whether this marginal result justifies the conclusion that GVS effects "were reduced when the games included reasoning about altered gravity".

      (4) Questionable Assumptions Regarding Strategy

      The authors assume that "big changes in tool positioning and frequent tool switching indicate poor evaluation of the failed outcome". This assumption is questionable. In solving this cognitive task, participants must explore and exploit solutions based on feedback. Large shifts in positioning or frequent tool switching might reflect active, adaptive exploration based on failed outcomes rather than a failure to evaluate them.

      (5) Confounding Factors in GVS Interpretation

      The central theoretical question is whether physical reasoning is grounded in physical experience. GVS is used here to manipulate that experience. However, GVS does not selectively target the vestibular nerve; it also activates distributed fronto-parietal attention networks and hippocampal circuits essential for any reasoning task. Additionally, the vestibular system is linked to the limbic system and the cerebellum, which regulate emotional reactivity and arousal. Because attention and emotion are likely affected by GVS, the authors should be much more cautious in attributing their behavioral findings solely to changes in the "physical experience of the body."

    2. Reviewer #2 (Public review):

      Summary

      The paper investigates whether the real-time physical experience of the body shapes high-level physical reasoning. Participants played a set of computerized tool-use reasoning games (the Virtual Tools paradigm) in which they must use knowledge of physical laws - including gravity, collisions, and inertia - to guide a ball into a target area. In Study 1, participants played the games under terrestrial gravity while receiving either Galvanic Vestibular Stimulation (GVS), which introduces noise into the vestibular organ and disrupts gravitational signalling, or a Sham condition with matched skin sensation. In Study 2, a separate cohort played the same games redesigned under hypogravity (0.5 g - half Earth g) or hypergravity (2 g - double Earth g), again with concurrent GVS or Sham stimulation. Performance was assessed through success rate, number of attempts, and time per attempt; strategy was assessed through the spatial distance between successive tool placements and the frequency of tool switching across attempts. A post-hoc gravity-weighted index (GWI) was computed to compare the effect of vestibular perturbation across the two studies. The main finding is that GVS impairs performance in gravity-dependent games under terrestrial gravity, yet the same perturbation appears to be neutral or even beneficial when the game environment involves non-terrestrial gravity - a result the authors interpret as evidence for an adaptable, body-grounded internal model of physics.

      Strengths

      One of the most notable strengths of this work is its conceptual positioning at the intersection of embodied cognition and physical reasoning. Rather than treating the human body either as an abstract information-processing device or as a purely biomechanical system, the authors take seriously the idea that cognition is scaffolded by ongoing sensorimotor state - and they test this idea with a paradigm that is both tractable and theoretically motivated. The use of the Virtual Tools paradigm is well-suited to this goal: the games vary systematically in their reliance on gravitational predictions, allowing selective impairment (rather than general disruption) to serve as a signature of embodied physical reasoning.

      The dual-study design is another strength. Testing the same vestibular perturbation under terrestrial and altered game-gravity conditions, and observing a reversal in its effect depending on context, provides a form of internal control that is conceptually compelling. The additional clustering analyses (Dirichlet Process Gaussian Mixture Model and leave-one-out kernel density classification) strengthen the strategy results beyond raw distance measures, confirming that GVS systematically shifts participants' spatial exploration strategies.

      The paper is also clearly written and engages meaningfully with relevant theoretical frameworks - predictive coding, embodied cognition, and stochastic resonance - making it accessible and stimulating for a broad audience.

      Weaknesses

      (1) Absence of multiple-comparisons correction. A large number of game-level pairwise t-tests are conducted in both studies (upward of twenty per study) without correction for familywise error rate. The game-level effects that anchor the main narrative - in Study 1 alone: Remove, GoalMove, Spiky, Falling_A, Shafts_B, Gap, and Chaining - arise from an uncorrected pool of comparisons. The probability that some of these constitute false positives is non-trivial. The authors should apply a correction (e.g., Benjamini-Hochberg) or at a minimum discuss this limitation explicitly.

      (2) The facilitation claim rests on a post-hoc and arbitrarily parameterized index. The gravity-weighted index (GWI), which drives the central cross-study comparison, uses integer coefficients (1, 2, 3) to weight games by gravity dependency level. These coefficients are entirely arbitrary and bear no principled relationship to the actual gravitational magnitudes used in the study. Why not use the gravity dependency ratings themselves, or the empirically estimated gravity impact scores from the computational modelling mentioned in the Methods? The choice of weights should be either principled or tested across a range of values to demonstrate robustness. Furthermore, the notation in equation (1) as currently typeset reads as "Gravity minus Weighted Index" rather than "Gravity-Weighted Index"; this should be corrected.
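
      A robustness check of the kind requested here could look like the sketch below. The manuscript's equation (1) is not reproduced in this review, so the index form used here (a weighted mean of per-game GVS minus Sham differences, each bounded in [-1, 1]) is an assumption made purely for illustration, as are the function names, the candidate weight values, and the toy per-game differences.

```python
from itertools import product


def weighted_index(differences, levels, weights):
    """Hypothetical gravity-weighted index: weighted mean of per-game differences.

    differences: per-game GVS minus Sham values, assumed to lie in [-1, 1]
    levels: gravity-dependency label per game ("low", "medium", "high")
    weights: mapping from label to numeric weight
    """
    numerator = sum(weights[lvl] * d for d, lvl in zip(differences, levels))
    denominator = sum(weights[lvl] for lvl in levels)
    return numerator / denominator


def weight_sensitivity(differences, levels, candidate_values=(1, 2, 3, 4, 5)):
    """Recompute the index for every non-decreasing (low, medium, high) weighting."""
    results = {}
    for low, medium, high in product(candidate_values, repeat=3):
        if low <= medium <= high:
            weights = {"low": low, "medium": medium, "high": high}
            results[(low, medium, high)] = weighted_index(differences, levels, weights)
    return results


if __name__ == "__main__":
    # Toy placeholder values, not data from the study.
    toy_differences = [0.10, -0.20, 0.30, -0.05]
    toy_levels = ["low", "medium", "high", "high"]
    table = weight_sensitivity(toy_differences, toy_levels)
    print(min(table.values()), max(table.values()))
```

      If the sign or the significance of the cross-study comparison changes across plausible weightings, the conclusion depends on the arbitrary choice of 1, 2, and 3; if it is stable, reporting that range would address the concern directly.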

      (3) The "facilitation" interpretation exceeds what the data in Study 2 directly support. Across all games in Study 2, GVS versus Sham differences in absolute performance are non-significant in all directions. The facilitation claim derives entirely from the GWI being higher in Study 2 than in Study 1 - a between-subjects comparison involving different participant groups and a non-pre-registered metric. The language of "facilitation" should be tempered accordingly, or the authors should provide additional analyses to support this framing.

      (4) Gravitational manipulation is visual only, and the vestibular system is only one component of the gravity-sensing network. Gravity perception results, as the authors very well know, from a distributed multisensory integration process that involves, in addition to the vestibular system, visual, proprioceptive, and visceral inputs. The present paradigm manipulates gravitational context solely through visual cues and targets the vestibular system through GVS - a point the authors acknowledge but do not discuss in sufficient depth. It is important to distinguish clearly between real gravitational alterations (as achieved in parabolic flight or centrifuge environments, where the entire body is physically subjected to a different gravitational vector) and virtually altered gravity, where only one sensory modality is targeted while others remain anchored to 1 g. The scope of the conclusions should reflect this distinction.

      (5) The choice of 0.5 g and 2 g may lack sensitivity. Combining the two altered-gravity conditions in Study 2, because no significant effect of hypo versus hypergravity was found, is statistically pragmatic but conceptually unsatisfying. There is evidence in the space physiology literature that gravitational processing is not linearly symmetric around 1 g: threshold effects exist below and above terrestrial gravity that may not be captured by modest deviations (half and double g) - see refs below. It is worth discussing whether the absence of a hypo/hyper distinction in Study 2 reflects a genuine equivalence or a lack of sensitivity, and whether more extreme conditions (e.g., near-zero g or 4-5 g) might reveal different processing regimes. Whether 0.5 g and 2 g were sufficient to saturate the system or merely insufficient to perturb it remains an open question with direct implications for the interpretation of the null GWI effects on strategy measures.

      Lee SMC, Ribeiro LC, Martin DS, Zwart SR, Feiveson AH, Laurie SS, Macias BR, Crucian BE, Krieger S, Weber D, Grune T, Platts SH, Smith SM, and Stenger MB. Arterial structure and function during and after long-duration spaceflight. J Appl Physiol (1985) 129: 108-123, 2020.

      de Winkel KN, Clément G, Groen EL, and Werkhoven PJ. The perception of verticality in lunar and Martian gravity conditions. Neurosci Lett 529: 7-11, 2012.

      Clément G, Moore ST, Raphan T, and Cohen B. Perception of tilt (somatogravic illusion) in response to sustained linear acceleration during spaceflight. Exp Brain Res 138: 410-418, 2001.

      Benson AJ, Kass JR, and Vogel H. European vestibular experiments on the Spacelab-1 mission: 4. Thresholds of perception of whole-body linear oscillation. Exp Brain Res 64: 264-271, 1986.

      (6) High-level reasoning is not defined with sufficient precision. The term "high-level reasoning" appears from the title onward and in the heading of the Study 1 results section (line 138), but it is never formally defined. The reader needs a clearer account of what distinguishes high-level physical reasoning from low-level sensorimotor prediction, and where the games used here fall along that continuum. What specific physical competencies - ballistic trajectories, free-fall predictions, collision dynamics, frictional forces, inertial effects - are required across the game set? When describing the subset of games that drive key effects, this information is critical for evaluating whether effects are specific to gravity reasoning or to some other physical concept.

      (7) Performance measures are disconnected from underlying kinematics. The performance measures (success rate, number of attempts, time per attempt) are coarse, high-level summaries. Time per attempt is used as a proxy for performance efficiency, yet participants received no instructions regarding speed, and different individuals may have adopted systematically different speed-accuracy trade-offs. It would be valuable to know whether time per attempt correlates with attempt number within a given game (which would indicate within-game learning) and whether mouse movement data - trajectory, velocity, hesitation - were recorded and could be analysed to provide more mechanistic insight into strategy formation.

    3. Reviewer #3 (Public review):

      Summary:

      This manuscript investigates a theoretically important question in cognitive science: whether higher-level physical reasoning is an abstract, modular process or is grounded in real-time body-environment interactions. To address this question, the authors combine galvanic vestibular stimulation (GVS) with the Virtual Tools task to test whether perturbing vestibular gravity signals affects performance in physical reasoning. The study is conceptually innovative and has the potential to bridge embodied sensory processing and higher-level cognition. However, in its current form, the evidence only partially supports the main claims, and several aspects of the analysis and interpretation limit the strength of the conclusions.

      Strengths:

      A major strength of the manuscript is the originality of the experimental paradigm. The combination of galvanic vestibular stimulation (GVS), which perturbs gravity-related vestibular signals, with computerized game-based tasks that require physical reasoning provides a novel way to test whether ongoing bodily experience influences higher-level cognition. Conceptually, the study is highly original and meaningfully bridges two domains that are often studied separately: sensorimotor processing and higher-level cognition.

      Weaknesses:

      The main weakness of the manuscript is that its central conclusion is not strongly supported by the data. The key finding depends on a marginally significant cross-study comparison, whereas direct GVS-versus-Sham differences in Study 2 are minimal across aggregate measures. In addition, many game-level analyses involve a large number of uncorrected multiple comparisons, raising the possibility that some of the reported effects may reflect chance findings. The manuscript's most important metric, the Gravity-Weighted Index, was not preregistered and is exploratory in nature, yet it is treated as a primary basis for confirmatory conclusions. The cross-study comparison is also difficult to interpret because the two studies differ in participant samples, number of games, and partially in the stimulus set. Finally, the mechanistic claims in the Discussion, particularly those invoking predictive coding, stochastic resonance, or updating of internal gravity models, go well beyond what can be directly inferred from the present behavioral data. Overall, the study provides intriguing but limited evidence that vestibular signals may influence some physical reasoning tasks under specific conditions, rather than strong evidence for a broad account of physical reasoning as grounded in online vestibular processing.

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors study two residues in the GHKL ATPase active site of Aq MutL and GyrB, and argue that the catalytic base function is shared between two conserved acidic residues that are 3 residues apart.

      They generated mutant versions in MutL and GyrB (both Ala and the appropriate Asn/Gln version) and performed ATPase analysis. They also generated high-resolution crystal structures of the GyrB NTD with AMPPnP for WT and mutants of the two acidic residues. The data show that mutation in either of these residues does not fully kill activity (with the exception of the alanine mutation of the first of the two, which interferes with ATP (or AMPPnP) binding). When the acidic residues are mutated to Asn/Gln, the catalytic water can still be positioned, and hence these mutants are more active than the Ala mutants. In both cases, the double mutation is catalytically dead.

      The authors then perform phylogenetic analysis and ancestral gene reconstruction, and based on this, they argue that HSP90 forms a different class of GHKL ATPases, and lost rather than gained this separate status.

      Strengths:

      The biochemical analysis seems solid.

      Weaknesses:

      (1) A major remaining question is why the mutations are so much more detrimental in MutL (100-fold lower kcat/KM) than in GyrB (3-fold lower). Can the authors explain this? Doesn't this argue against the proposed catalytic conservation?

      (2) The structure figures all have omit maps for just the AMPPnP and the water, whereas the density for the acidic residues and their mutants is not shown.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Fukui et al. re-examined the ATP hydrolysis mechanism in GHKL ATPases, revealing a cooperative role of two conserved acidic residues rather than one. The authors have used a range of biochemical and structural techniques on various mutants from different members of the GHKL ATPase family to test and validate their proposed mechanism.

      Through a detailed re-analysis of their previously published structure of the aqMutL NTD (ATPase domain) in complex with AMPPCP, they identified Glu29 and Glu32 as interacting with nucleophilic water for the catalysis. The authors carefully dissected the respective roles of these two acidic residues with a series of site-directed mutations. Mutations at Glu29 impaired ATPase activity without affecting protein secondary structure or ATP binding in the case of the E29Q mutant. Moreover, mutations at Glu32 did not affect secondary structure (except for E32G) but reduced ATPase activity. Activity was abolished when both residues (E29Q/E32Q) were mutated.

      The authors extended their study to another GHKL ATPase, aqGyrB. Their findings further supported the cooperative function of the corresponding acidic residues in aqGyrB (Glu48 and Asp51) during ATP hydrolysis. Mutation of these residues partially impaired ATP hydrolysis without affecting protein secondary structure. ATPase activity was completely lost in the double mutant E48Q/D51M. While the E48Q mutant retained the ability to bind ATP, the E48A mutant did not. High-resolution structures of the WT and E48A, E48Q, D51A, and D51N mutants of the aqGyrB NTD demonstrated that nucleophilic water positioning depended on these residues. E48 played a dominant role in water positioning and is critical for stabilising ATP lid formation and associated conformational changes, whereas D51 contributed cooperatively to catalysis.

      The authors investigated the functional impact of mutating the corresponding residues in the human MutL homologs PMS2 and MLH1. Clinical variants consistently exhibited reduced or abolished ATPase activity, providing a potential molecular basis for Lynch syndrome through impaired DNA mismatch repair.

      Lastly, through evolutionary analysis, the authors inferred that the second acidic residue was likely present in the common ancestor of MutL, GyrB, and MORC proteins, but was lost in the case of Hsp90.

      Strengths:

      (1) This study contains a detailed structural and biochemical analysis of a biologically important set of GHKL ATPases. The authors identify a second acidic residue that is conserved and contributes to catalysis in a large subset of GHKL ATPases. An updated and extended mechanistic model of ATP hydrolysis by this class of enzymes is proposed, which involves cooperative and partially overlapping roles for the catalytic residue pair. This revised mechanistic model is invaluable for the interpretation of clinical variants of GHKL ATPases such as PMS2 and MLH1.

      (2) The work described was performed to an excellent and rigorous technical standard. The structural and biochemical data are sound. The evidence supporting the claims is compelling.

      Weaknesses:

      (1) The identification in this study of a second acidic residue contributing to catalysis but not absolutely essential for catalysis is a useful finding. However, given that many structures of GHKL ATPases have been determined with different nucleotide analogs bound and that the essential role of the first acidic residue is well established, the importance and scope of the advances described here remain focused within the field of study of GHKL ATPases.

      (2) The authors assessed the consequences of variants in the human MutL homologs PMS2 and MLH1, but various other human GHKL ATPases contain clinically relevant variants, some of which have stronger disease associations than the mutations examined in this study. A broader analysis of the effect (or likely effect) of disease-linked mutations in GHKL ATPases would have strengthened this study.

      (3) In MLH1, the E37K mutation completely abolishes ATPase activity, but the corresponding mutations in aqMutL, aqGyrB, and PMS2 do not. It remains unclear why E37K in MLH1 leads to complete loss of activity, as the authors propose that water molecule positioning via the first acidic residue, as well as ATP lid stabilisation and associated conformational changes, should still be possible.

      (4) The authors do not examine ATP binding in the E32 mutants of aqMutL NTD and the D51 mutants of aqGyrB, or AMPPNP binding of the MLH1 and PMS2 mutants. Hence, the relative contributions of the acidic residues to ATP binding and hydrolysis remain partially unclear.

      (5) The ATPase assays for PMS2 and MLH1 (Figure 7 and Table 1) were performed with purification/solubility tags still present. Hence, it cannot be ruled out that these tags influence the measured activities.

      (6) The authors suggest that the two-acidic-residue mechanism proposed in this study could be shared among several GHKL ATPase families, yet they also state that the hydrogen-bonding network was not observed in MutL and MORC family proteins. This raises doubt about how conserved the mechanism is, e.g., in MutL and MORC proteins.

    1. Reviewer #1 (Public review):

      Summary:

      Using an established NAFLD model, a choline-deficient high-fat diet, Barros et al. show that LPS challenge causes excessive IFN-γ production by hepatic NK cells, which in turn induces recruitment and polarization of a PD-L1-positive neutrophil subset, leading to massive TNFα production and increased host mortality. Genetic inhibition of IFN-γ or pharmacological blockade of PD-L1 decreases recruitment of these neutrophils and TNFα release, consequently preventing liver damage and decreasing host death.

      Since NAFLD is often accompanied by chronic, low-grade inflammation, it can lead to an overactive but dysfunctional immune response and increase the body's overall susceptibility to infections. This is therefore a very important research question.

      Strengths:

      The biggest strength of the manuscript is the large number of mouse strains used.

      Weaknesses:

      After the review, there are still some open questions from my side:

      (1) I would like the authors to defend their choice of diet type since this has not been done in the review/response to authors. In case they cannot, we need additional proof (HFD or WD model).

      (2) Since the authors used the same control groups (chow and HFCD), as required by the animal ethics committee, they should provide a power analysis showing that the number of controls (and of animals in the other groups) is sufficient to detect the effect. Please provide it.
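
      To illustrate the kind of calculation requested in point (2), the sketch below estimates the number of animals per group for a two-sided, two-group comparison using the standard normal approximation. It is not the authors' analysis: the function name `n_per_group` and the effect sizes are hypothetical, and an exact t-test-based calculation would give slightly larger numbers for small groups.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-sample comparison.

    Normal approximation: n = 2 * (z_{1-alpha/2} + z_{power})**2 / d**2,
    where d is the standardized effect size (Cohen's d). The effect size
    must come from pilot data or prior literature; it is an input, not a result.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = standard_normal.inv_cdf(power)          # quantile for the desired power
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


if __name__ == "__main__":
    for d in (1.0, 1.5, 2.0):  # hypothetical standardized effect sizes
        print(f"d = {d}: about {n_per_group(d)} animals per group")
```

      Reporting the assumed effect size, alpha, and power alongside the resulting group size would let readers judge whether the shared control groups are adequately sized.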

    2. Reviewer #2 (Public review):

      Summary:

      This is an extremely interesting mouse study that seeks to understand how sepsis is tolerated during obesity/NAFLD. The researchers combine a well-established model of NASH (Choline-deficiency with High Fat Diet) with a sepsis model (IP injection of 10 mg/kg LPS), leading to dramatic mortality in mice. Using this model, they characterize the complex contributions of immune cells. Specifically, they find that NK cells and neutrophils contribute the most to mortality in this model due to IFNG and PD-L1+ neutrophils.

      Strengths:

      The biggest strength of the manuscript is how clear the primary phenotypes/endpoints of their model are. Within 6 hours of LPS injection, there is a stark elevation of liver inflammation and damage, which is exacerbated by a High Fat/Choline-Deficient diet (HFCD). After 1 day, almost all of the mice die. Using these endpoints, the authors were able to identify which cells were critical for mortality in the model and the specific mediators involved.

      Comments on revisions:

      I have no further comments.

    1. Reviewer #1 (Public review):

      Summary:

      This study asks whether synapses formed by the same broad neuronal class (excitatory pyramidal neurons, PN) adapt their presynaptic organization in a cortex-specific manner, comparing the prefrontal cortex (PFC) with the primary somatosensory cortex (S1). The authors combine sophisticated electrophysiology (paired recordings and extracellular minimal stimulation), pharmacological perturbations of presynaptic Ca²⁺-secretion coupling, bouton Ca²⁺ imaging, and mechanistic modeling. Across two prominent excitatory connections (Layer 5 (L5) PN-L5PN and L2/3-L5PN), they provide convergent evidence that mature PFC synapses operate with looser Ca²⁺ channel-release sensor coupling than their S1 counterparts.

      Overall, the study provides an appealing mechanistic link between synaptic nano/micro-architecture and cortical-area specialization. The idea that PFC synapses retain a more "plasticity-favoring" presynaptic state, while the primary sensory cortex emphasizes reliability and timing precision, is potentially impactful for how we think about circuit computation and plasticity across cortical hierarchies.

      Strengths:

      A major strength is the multi-pronged experimental strategy. The paper first establishes robust, area-dependent differences in synaptic efficacy, reliability, timing, and short-term plasticity (facilitation prevailing in PFC versus depression in S1), using both paired recordings and minimal extracellular stimulation paradigms. The coupling interpretation is then directly supported by differential sensitivity to EGTA (and appropriate positive-control effects of fast chelators). Finally, volume-averaged calcium signals are reported to be similar across areas, arguing against trivial explanations based on gross differences in calcium influx, and the modeling provides a quantitative framework for interpreting the observed chelator effects.

      Weaknesses:

      Limitations are minor and concern interpretation/clarity rather than core results. Some key inferences rely on indirect readouts (chelator sensitivity, fluctuation analysis-derived parameters, bouton-averaged calcium signals), each of which carries assumptions and potential confounds that should be discussed more explicitly. In particular, the repatching paradigm for the paired-recording EGTA experiment, though very impressive, and the limited number of extracellular calcium conditions used for fluctuation analysis (three concentrations), can influence quantitative estimates and the confidence intervals around them.

    2. Reviewer #2 (Public review):

      Schwarze et al. investigated whether synaptic efficacy is brain-region specific. To this end, they compared synaptic connections established by layer 5 (L5) neocortical pyramidal cells and between L5 and L2/3 pyramidal cells. In order to identify the mechanism of this brain region specificity, the authors employed several experimental approaches, including paired electrophysiological recordings, extracellular stimulation, low- and high-affinity intracellular calcium chelators (EGTA and BAPTA), multiple probability fluctuation analysis (MPFA), and intracellular measurements of calcium transients as well as computational modelling. The findings of the present study indicate that synaptic connections in the primary somatosensory cortex (S1) are significantly stronger and more reliable than those in the prefrontal cortex (PFC).

      The study is timely, and the topic is of significant interest to the neuroscience community. Despite the extensive research that has been carried out on the neuroanatomy and receptor distribution of different brain regions, comparatively little attention has been paid to differences in synaptic physiology. The authors' approach is characterised by its elegance and comprehensive nature, and the conclusions drawn are compelling. Nevertheless, there are a number of unresolved issues.

      Major points:

      (1) The authors state that data from the S1 cortex were obtained in a previous study. In the context of an explicitly comparative study (PFC vs. S1 cortex), it would have been advantageous for the authors to perform a subset of experiments in which both cortices were obtained from a single animal. This is a feasible undertaking, given the spatial separation of the PFC and S1 cortex.

      (2) Figure 1A is somewhat misleading because it could suggest that the authors have performed dual recordings in identified PFC pyramidal cells.

      (3) PFC and S1 cortex in rodents differ markedly in their morphological organisation. For example, in all sensory cortices, layer 4 is very pronounced; however, in the PFC of rodents, no clear layer 4 can be found. On the other hand, PFC shows a clear separation of layers 2 and 3, which is not visible in the S1 cortex. Furthermore, PFC pyramidal cells in layers 2, 3, and 5 exhibit significant heterogeneity, diverging considerably from those found in layers 5a and 5b of S1 cortex. Thus, there is no clear correspondence between L5 pyramidal cells in the PFC and the S1 cortex. In order to achieve a meaningful comparison of the data obtained in PFC and S1 cortex, it is necessary for the authors to determine whether they recorded from similar pyramidal cell populations.

      (4) For the S1 cortex in rats, it has been found that synaptic connections between pairs of L5a pyramidal cells and between pairs of L5b pyramidal cells differ markedly with respect to mean EPSP amplitude, latency and coefficient of variation (CV, a surrogate measure for the synaptic release probability) (cf. Markram et al., 1997; Frick et al., 2008). It is therefore likely that PFC and S1 pre- and postsynaptic pyramidal cells are distinct not only morphologically and electrophysiologically but also with respect to their synaptic properties. At a minimum, the authors need to discuss these confounding issues, and preferably address them experimentally. For example, it would be helpful to demonstrate that paired recordings were made from the same pyramidal cell types, perhaps by documenting their morphology and/or firing patterns. In addition, they should discuss the marked difference in EPSP amplitude and putative release probability between their data and the earlier studies.

      (5) For multiple probability fluctuation analysis (MPFA), a parabolic fit with a mere three points is inadequate, particularly because 2 mM and 5 mM Ca2+ are close to the peak of the variance-to-mean parabola, and only 1 mM Ca2+ is on its initial linear part. A more meaningful result would have been obtained with an additional Ca2+ concentration between 1.0 and 2.0 mM, as this is closer to the physiological range. In this context, the authors should have cited the more recent and more detailed papers by the Silver group (Saviane and Silver, 2006; Lanore and Silver, 2016) and not just the Clements and Silver review paper.

      (6) Methods: The authors should clarify whether their paired recordings from L5 pyramidal cells involved whole-cell recordings from both pre- and postsynaptic neurons. From Figure 1B, it appears as if the presynaptic neurons were not recorded in whole-cell mode but rather stimulated in cell-attached mode. This is also reflected in the artefact visible in the current trace recorded in the postsynaptic neuron. The authors should explicitly state their methodological approach and mention how reliable the timing of the presynaptic action potential was under these circumstances. The same holds true for the extracellular stimulation protocol. A significantly more detailed description of the experimental protocol is necessary here.

      (7) Methods: The authors use Student's t-test for data comparison. The authors should verify that the data distribution was indeed normal, e.g. by using a Shapiro-Wilk test. If this is not the case, non-parametric tests should be used.
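
      The concern in point (5) can be made concrete with a minimal sketch of the simplest binomial variance-mean relation, variance = q * mean - mean^2 / N, fitted by least squares. This is an illustration under stated assumptions, not the authors' pipeline: the function name `fit_mpfa` and the synthetic numbers are hypothetical, and the fuller MPFA model (with intra- and inter-site quantal variability) has additional parameters.

```python
def fit_mpfa(means, variances):
    """Least-squares fit of variance = q * mean - mean**2 / n_sites.

    The model is linear in a = q and b = -1 / n_sites, so the 2x2 normal
    equations are solved directly. Returns (q, n_sites).
    """
    if len(means) != len(variances) or len(means) < 2:
        raise ValueError("need at least two (mean, variance) pairs of equal length")
    s11 = sum(m ** 2 for m in means)
    s12 = sum(m ** 3 for m in means)
    s22 = sum(m ** 4 for m in means)
    t1 = sum(m * v for m, v in zip(means, variances))
    t2 = sum(m ** 2 * v for m, v in zip(means, variances))
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("conditions are degenerate; parabola is not identifiable")
    a = (t1 * s22 - t2 * s12) / det  # quantal size q
    b = (s11 * t2 - s12 * t1) / det  # curvature term, equals -1 / n_sites
    if b >= 0:
        raise ValueError("no downward curvature; n_sites cannot be estimated")
    return a, -1.0 / b


if __name__ == "__main__":
    # Synthetic, noise-free example: q = 10, N = 5, release probability 0.2, 0.5, 0.8.
    means = [10.0, 25.0, 40.0]
    variances = [80.0, 125.0, 80.0]
    print(fit_mpfa(means, variances))
```

      With three conditions and two free parameters, the fit has a single residual degree of freedom, so noise in any one variance estimate shifts q and N substantially. This is why an additional Ca2+ concentration on the rising part of the parabola would tighten the estimates.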

    3. Reviewer #3 (Public review):

      Summary:

      In this manuscript, Max Schwarze and colleagues examined the coupling distance between presynaptic Ca²⁺ channels and the vesicular release sensor at neocortical synapses in mice. They propose that Ca²⁺ channel-release sensor coupling differs across cortical areas, with relatively loose (microdomain) coupling in prefrontal cortex (PFC) and tighter (nanodomain) coupling in primary somatosensory cortex (S1) for comparable pyramidal-neuron synapse types. To test this, they combine paired recordings and minimal stimulation with chelator manipulations (EGTA/BAPTA), mean-variance/MPFA-style analyses, presynaptic Ca²⁺ imaging, and computational modeling. They conclude that presynaptic coupling organization is area-specific in the mature cortex and contributes to regional differences in synaptic timing, reliability, and short-term plasticity.

      Strengths:

      This study tackles an important question and is strengthened by a cohesive body of evidence assembled from multiple complementary approaches. A major asset is the inclusion of high-value datasets, particularly the paired recordings between L5 pyramidal neurons and the systematic assessment of EGTA sensitivity, which provide a solid functional foundation for the authors' central claims. The work is further distinguished by its genuinely multimodal design: combining electrophysiology with presynaptic calcium imaging (and integrating these observations with quantitative analyses and modeling) offers a more mechanistic view of neurotransmitter release than any single method could provide. Overall, the direct, within-framework comparison of presynaptic release-control mechanisms across cortical areas for comparable synapse types is compelling and gives the conclusions a level of robustness and interpretability that is often difficult to achieve in studies of cortical synaptic diversity.

      Weaknesses:

      Several aspects would benefit from clearer explanation, stronger integration with the existing literature, and a more explicit discussion of limitations and potential confounds. Without these additions, some conclusions remain speculative. Throughout the manuscript, the authors also often imply that different measurements reflect the same underlying synapse population. This is unlikely to be strictly true across all experiments and makes it difficult to integrate results from the various approaches into a single, unified set of functional synaptic properties. In addition, some statements, particularly those linking coupling mode to "higher-order neocortical functions", appear broader than what is directly supported by the experiments and should be tempered or more precisely scoped.

      Below, I list several topics that could help better frame the main findings of the present study and clarify how it relates to previously published work.

      (1) The authors use EGTA sensitivity of EPSCs (together with additional metrics) to argue that S1 and PFC synapses differ in Ca²⁺ channel-release sensor coupling. While this is a plausible interpretation, EGTA effects are not uniquely determined by coupling distance and can also reflect differences in Ca²⁺ entry kinetics, action potential waveform, endogenous buffering/extrusion, or release-sensor/vesicle state. The authors use a constrained modeling approach, but the rationale for the different constraint sets is not fully clear from the current description. It would be helpful to expand and clarify the Methods section to explain how these constraints were defined, justified, and applied (and how alternative constraint choices would affect the results). In this context, the Abstract's broader claim that the study "reveals microdomain coupling as a presynaptic structure-function correlate of higher-order neocortical functions" appears overstated. Given the well-known diversity of cortical synapses even within a single region (e.g., synapses onto different interneuron subclasses or different PN cell types, extracortical sources like thalamus), the authors should clarify the intended scope: is the conclusion meant to apply broadly across synapse classes in S1 and PFC, or only to the specific connection type(s) examined here?

      (2) The chelator logic is sound in principle, but the Discussion should more explicitly acknowledge standard caveats and alternative explanations. The authors partly address this by including presynaptic Ca²⁺ imaging and modeling, yet it would help to explain more clearly how the combination of (i) chelator sensitivity, (ii) presynaptic Ca²⁺ signals, and (iii) model constraints rules out, or substantially reduces the likelihood of, changes in AP waveform, Ca²⁺ influx kinetics, buffering/extrusion, or sensor/vesicle state as the primary drivers. In addition, recent hypotheses emphasizing vesicle priming and/or release-site occupancy as contributors to apparent EGTA sensitivity should be discussed as a complementary or alternative interpretation.

      (3) A substantial portion of the S1 comparison appears to rely on previously published datasets. This should be made unambiguous in the Results and Methods, and it would be helpful to summarize this clearly (e.g., in a table indicating which figures/analyses use new data versus reanalysis of published data). If this information is already present, it should be highlighted more prominently.

      (4) The modeling is informative, but the choice of a specific VGCC-release-site geometry and channel arrangement is not sufficiently justified. The manuscript adopts a particular spatial configuration, yet the rationale for selecting this geometry, rather than other plausible architectures discussed in the literature, is not clearly explained, nor is it meaningfully revisited in the Discussion. The authors should justify why the same organization is assumed across two distinct cortical areas and, ideally, include (or at a minimum discuss) a sensitivity analysis showing how key inferences (e.g., coupling distance and channel number) depend on the assumed geometry.

      (5) The calcium imaging data are valuable, but given the diversity of synapses within each cortical layer, it is not clear that imaged boutons can be confidently assigned to the specific connection types being interrogated electrophysiologically. A substantial fraction of boutons likely corresponds to different postsynaptic targets (including interneurons and distinct pyramidal-cell classes), and this heterogeneity could complicate interpretation. This limitation should be discussed explicitly.

      (6) In unitary connections, the authors assess EGTA effects alongside other functional parameters (strength, delay, short-term plasticity), which is a major strength. However, for L2/3 to L5 connections, it appears that EGTA sensitivity was tested primarily using extracellular stimulation. Given anatomical and circuit differences between PFC and S1, extracellular stimulation may recruit different synapse populations across regions, potentially confounding regional comparisons of EGTA sensitivity. This limitation should be acknowledged explicitly. While I am not requesting technically demanding L2/3↔L5 paired recordings in S1, the possibility that different synapse identities are being sampled should be treated as a meaningful source of uncertainty. The Discussion would also benefit from placing the magnitude of EGTA effects in the context of prior "loose coupling" literature, where comparatively large EGTA effects have been reported in some systems. In addition, the reported difference between adult PFC EGTA effects and S1 inhibition appears small (on the order of <10%) and should be interpreted cautiously, especially given that PFC and S1 mature on different timelines and P21-P26 is unlikely to reflect a mature PFC circuit state. The adult cohort (P90-P100) is therefore important, but the age mismatch complicates PFC-S1 comparisons; ideally, S1 should be assessed at matched ages, or this limitation should be discussed explicitly. Finally, for statistical robustness, in panel D of Figure 2, were the comparisons corrected for multiple testing to control Type I error?

      (7) Alterations in initial release probability are often associated with changes in short-term plasticity. In the present manuscript, the authors report similar initial release probability at PFC and S1 synapses, yet observe differences in short-term plasticity profiles. The mechanistic basis for this apparent dissociation is not addressed and should be discussed explicitly, including potential explanations.

      (8) There are multiple instances where the text appears to cite non-existent or misnumbered figure panels (e.g., references to "Figure 4G-I / 4J" when the relevant material appears elsewhere). These should be corrected throughout, as they currently reduce readability and confidence.

      (9) The Methods describe P21-P26 animals, whereas the Results include older cohorts (e.g., P90-P100) and additional regions (e.g., mPFC). The Methods should be updated so that all cohorts and regions analyzed in the Results are fully described.
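
      As an illustration of the multiple-testing question raised in point (6), the sketch below applies the Holm step-down correction to a set of p-values. The function name `holm_adjust` and the example p-values are hypothetical and are not taken from the manuscript; Holm is only one of several acceptable procedures.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original order.

    The k-th smallest p-value (k = 0, 1, ...) is multiplied by (m - k),
    capped at 1, and forced to be non-decreasing along the sorted order.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # enforce monotonicity
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03]  # hypothetical raw p-values from three comparisons
    print(holm_adjust(raw))
```

      A comparison is then declared significant when its adjusted p-value is below the chosen alpha. This controls the family-wise Type I error rate without assuming independence between tests.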

    1. Reviewer #1 (Public review):

      Very nice and coherent body of work with appropriate in vitro to in vivo transition in methods.

      Lovely and easy to follow figures that can be understood even without the manuscript.

      My recommendation is that a sentence or two be added clearly stating the authors think nafamostat is off the table and suggest other approaches/drugs that might be considered instead of just making a general statement. I think all this can be done in a few sentences.

      Gabexate was administered to a snakebite victim in the following case report from about 20 years ago, which is also a good example of the now better-recognized threat to pregnancy.

      Nasu K, Ueda T, Miyakawa I. Intrauterine fetal death caused by pit viper venom poisoning in early pregnancy. Gynecol Obstet Invest. 2004;57(2):114-6. doi: 10.1159/000075676. Epub 2003 Dec 19. PMID: 14691344

    2. Reviewer #2 (Public review):

      Summary:

      The authors set out to test whether a defined set of small molecules can lessen damaging effects caused by venoms from several Bothrops species, and whether these effects are consistent enough to suggest a broadly applicable approach. They present a cross-venom dataset spanning in-vitro activity readouts and blood-based functional outcomes, and include a chicken embryo model to explore whether venom inhibition can translate into improved survival. The central message is that certain small molecules can reduce specific venom-driven effects across multiple samples, providing a comparative resource for the field and a basis for prioritizing future validation.

      Strengths:

      The main value of this work is the breadth and structure of the dataset, which places multiple venoms and multiple readouts into a single, comparable framework that should be useful for readers evaluating patterns across samples. The experimental flow is generally coherent, moving from activity measurements to functional outcomes and then to an in-vivo test, which helps the reader understand how the authors link mechanism-oriented assays to more integrated endpoints. The manuscript also provides practical information for the community by highlighting which readouts appear most consistently affected across venoms, which can help guide hypothesis generation and study design in follow-up work.

      Comments on revisions:

      I would like to thank the authors for answering my questions. The manuscript has gained in quality, knowing the limitations that are now better stated in the manuscript.

    1. Reviewer #1 (Public review):

      Summary:

      This study presents a new Bayesian approach to estimate importation probabilities of malaria by combining epidemiological data, travel history, and genetic data through pairwise IBD estimates. Importation is an important factor challenging malaria elimination, especially in low-transmission settings. This paper focuses on Magude and Matutuine, two districts in southern Mozambique with very low malaria transmission. The results show isolation-by-distance in Mozambique: genetic relatedness decreases at distances larger than 100 km, shows no spatial correlation at distances between 10 and 100 km, and again shows strong spatial correlation at distances smaller than 10 km. They report high genetic relatedness between Matutuine and Inhambane, higher than between Matutuine and Magude. Inhambane is the main source of importation in Matutuine, accounting for 63.5% of imported cases. Magude, on the other hand, shows lower importation and travel rates than Matutuine, as it is a rural area with less mobility. Additionally, they report higher levels of importation and travel in the dry season, when transmission is lower. No association with importation was found for occupation, sex, or other factors. These data have practical implications for public health strategies aimed at malaria elimination, for example, testing and treating travelers from Matutuine in the dry season.

      Strengths:

      The strength of this study lies in the combination of different sources of data (epidemiological, travel, and genetic) to estimate importation probabilities, and in the statistical analyses.

      Weaknesses:

      The authors recognize the limitations related to sample size and the biases of travel reports.

    2. Reviewer #2 (Public review):

      Summary:

      Based on a detailed dataset, the authors present a novel Bayesian approach to classify malaria cases as either imported or locally acquired.

      Strengths:

      The proposed Bayesian approach for case classification is simple, well justified, and allows the integration of parasite genomics, travel history, and epidemiological data.
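
      For readers unfamiliar with this kind of classification, the general idea can be sketched with Bayes' rule for two hypotheses. This is a generic illustration, not the authors' model: the function name `posterior_imported`, the prior, and the likelihood values are hypothetical placeholders for quantities that the study derives from travel history, epidemiology, and genetic relatedness.

```python
def posterior_imported(prior_imported, lik_imported, lik_local):
    """Posterior probability that a case is imported, by Bayes' rule.

    prior_imported: prior probability of importation (e.g. informed by travel).
    lik_imported:   probability of the observed data if the case is imported.
    lik_local:      probability of the observed data if the case is local.
    """
    if not 0.0 <= prior_imported <= 1.0:
        raise ValueError("prior_imported must lie in [0, 1]")
    if lik_imported < 0 or lik_local < 0:
        raise ValueError("likelihoods must be non-negative")
    numerator = prior_imported * lik_imported
    denominator = numerator + (1.0 - prior_imported) * lik_local
    if denominator == 0:
        raise ValueError("data are impossible under both hypotheses")
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical case: modest prior, but the data favour an external origin.
    print(posterior_imported(prior_imported=0.2, lik_imported=0.6, lik_local=0.1))
```

      When the data are equally likely under both hypotheses, the posterior simply returns the prior, which makes explicit how much of a classification rests on the travel-based prior versus the genetic evidence.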

      Weakness:

      While the authors aim to classify cases as imported or locally acquired, the method does not quantify the contribution of each case type to overall transmission, which the authors leave for future study.

    3. Reviewer #3 (Public review):

      This work provides a novel statistical model to identify imported malaria cases, which are an important challenge for elimination, particularly in low-transmission areas. This tool was applied in Plasmodium falciparum populations in Mozambique and determined differences in importation rates in two low-transmission districts in the South.

      Strengths:

      The study has several strengths, particularly the development of a novel Bayesian model integrating genomic, epidemiological, and travel data to estimate importation probabilities. The findings provided important insights into malaria transmission dynamics, including the identification of importation sources and regional differences in importation rates across Mozambique. These results highlight the potential value of targeted interventions among traveler populations to support malaria elimination efforts. Moreover, this approach could be adapted to other epidemiological settings.

      Weaknesses:

      The study has some limitations, including uneven sample representation across provinces, incomplete metadata for risk factor analysis, and reliance on a proxy for transmission intensity. Future work will include a new sample collection effort and the incorporation of monthly malaria incidence estimates.

    1. Reviewer #1 (Public review):

      Summary:

      Sidarta-Oliveira et al. present TopOMetry, a novel dimensionality reduction method based on the eigendecomposition of an approximated Laplace-Beltrami operator. In short, TopOMetry is an iterative version of existing spectral methods (e.g., Laplacian Eigenmaps or diffusion maps). It approximates the Laplacian operator twice, once in a "phenotypic space" and then again in the space of the resulting eigenbasis. As a result, the approximated operator contains more information about the manifold, which allows for more robust and accurate downstream analyses.
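
      The two-pass idea summarized above can be illustrated with a minimal, generic sketch. It is not TopOMetry's implementation or API: the function names (`laplacian_eigenbasis`, `two_pass_embedding`), the Gaussian kNN kernel, the global bandwidth, and the symmetric normalized graph Laplacian are all assumptions chosen for brevity, standing in for whatever operator approximation the authors actually use.

```python
import numpy as np


def laplacian_eigenbasis(X, n_neighbors=10, n_components=2):
    """Eigenvectors of a kNN graph Laplacian built on the rows of X.

    Hypothetical helper for illustration only; not part of any package.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances, clipped at zero for stability.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    # Indices of the k nearest neighbours of each point (column 0 is the point itself).
    idx = np.argsort(d2, axis=1)[:, 1:n_neighbors + 1]
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = idx.ravel()
    # Gaussian affinities with one global bandwidth (an assumed, simple choice).
    sigma2 = np.median(d2[rows, cols]) + 1e-12
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-d2[rows, cols] / sigma2)
    W = np.maximum(W, W.T)  # symmetrise the kNN graph
    # Symmetric normalised Laplacian: L = I - D^{-1/2} W D^{-1/2}.
    inv_sqrt_deg = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(n) - inv_sqrt_deg[:, None] * W * inv_sqrt_deg[None, :]
    # eigh returns eigenvalues in ascending order; drop the first (trivial) one.
    _, vecs = np.linalg.eigh(L)
    return vecs[:, 1:n_components + 1]


def two_pass_embedding(X, n_eigs=10, n_out=2, n_neighbors=10):
    """Approximate the Laplacian on the data, then again on its eigenbasis."""
    basis = laplacian_eigenbasis(X, n_neighbors, n_eigs)     # pass 1: input space
    return laplacian_eigenbasis(basis, n_neighbors, n_out)   # pass 2: eigenbasis space
```

      Under these assumptions, `two_pass_embedding(X)` returns an array with one row per sample and `n_out` columns. The sketch uses dense matrices and is only suitable for small inputs; it also assumes the kNN graph is connected, since otherwise more than one eigenvector is trivial.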

      Strengths:

      - Introduces operator-native fidelity scores and Riemannian diagnostics to single-cell analysis, enabling researchers to evaluate and trust embeddings - functionality absent in prior methods.

      - The approach was rigorously tested based on synthetic and real single-cell RNA-seq datasets.

      - The package is well-made and easily scalable to millions of cells.

      - The comprehensive documentation helps the end-users to run desired analyses.

      Weaknesses:

      - The method is an extension of the current state-of-the-art methods, not a fundamentally new one.

      Comments on revised version:

      The revised manuscript partially addresses the concerns raised in the prior review. The jargon weakness has been substantially mitigated by relocating mathematical derivations to the Methods section and simplifying language in the main text; this weakness has been updated accordingly.

      The introduction of operator-native fidelity scores and Riemannian diagnostics represents a meaningful addition and has been added to the Strengths. The benchmarking scope has also been notably expanded.

      The core weakness - that the method is an extension of existing spectral methods rather than a fundamentally new contribution - remains unchanged, as the authors' rebuttal did not provide a sufficiently precise mathematical argument to overturn it.

    2. Reviewer #2 (Public review):

      Summary:

      This work introduces a novel framework to systematically learn the latent dimensions of single-cell data, grounded in the theory of Riemannian manifolds. The authors demonstrate how this framework can be applied to various important tasks, such as estimating intrinsic dimensionalities, annotating cell types, etc. They did a great job of tackling an important but not yet established problem in the field with a theoretically sound and novel approach. I think after a more rigorous and comprehensive validation, this work could be impactful.

      Strengths:

      - Dimensionality reduction is a routine step in analyzing many kinds of high-dimensional data, such as molecular data. While the downstream analysis results depend heavily on this step, existing methods rely on strong assumptions and are sometimes heuristic. The authors present a novel, theoretically grounded approach to address this important problem.

      - The authors comprehensively demonstrated its usability in downstream analysis. In particular, they show evidence suggesting novel T-cell subpopulations.

      - I commend the authors for releasing and maintaining their software well with comprehensive documentation. This significantly increases the usability and accessibility of the method.

      Weaknesses:

      - The paper lacks experiments that validate the results. It would be beneficial to see additional evaluation settings with better-established ground truths to more strongly demonstrate the method's effectiveness.

      - Batch effects are prevalent in single-cell data. The paper does not adequately address how the proposed method handles this issue.

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Ding et al. use genetic mouse models to demonstrate that atrial trabeculation is more dependent on Tie1/Tie2 signaling than ventricular trabeculation. With additional experimentation that would support the current claims, the results may hold significant value, as atrial trabeculation remains an understudied phenomenon in cardiac biology with potential implications for atrial cardiomyopathy and atrial fibrillation.

      Strengths:

      Detailed characterization of atrial versus ventricular trabeculation across different developmental timepoints, and the use of appropriate animal models to address the scientific question at hand.

      Weaknesses:

      The authors have consistently treated mice with tamoxifen after ventricular, but not atrial, trabeculation has already started. As such, the observed cardiac phenotypes - where predominantly atrial trabeculation is affected - might be a mere consequence of the precise time window in which Tie1/2 signaling was impaired, rather than a direct measurement of its relative importance for atrial versus ventricular trabeculation. The conclusions of the paper may thus be significantly strengthened by depleting Tie1/2 signaling prior to the onset of ventricular trabeculation, as is done for atrial trabeculation.