    1. Joint Public Review:

      In this manuscript, the authors proposed an approach to systematically characterise how heterogeneity in a protein signalling network affects its emergent dynamics, with particular emphasis on drug-response signalling dynamics in cancer treatments. They named this approach Meta Dynamic Network (MDN) modelling, as it aims to consider the potential dynamic responses globally, varying both initial conditions (i.e., expression levels) and biophysical parameters (i.e., protein interaction parameters). By characterising the "meta" response of the network, the authors propose that the method can provide insights not only into the possible dynamic behaviours of the system of interest but also into the likelihood and frequency of observing these dynamic behaviours in the natural system.

      The authors study the Early Cell Cycle (ECC) network as a proof of concept, focusing on pathways involving PI3K, EGFR, and CDK4/6 with the aim of identifying mechanisms that may underlie resistance to CDK4/6 inhibition in cancer. The biochemical reaction model comprises 50 state variables and 94 kinetic parameters, implemented in SBML and simulated in Matlab. A central component of the study is the generation of large ensembles of model instances, including 100,000 randomly sampled parameter sets intended to represent intra-tumour heterogeneity. On the basis of these simulations, the authors conclude that heterogeneity in kinetic rate parameters plays a stronger role in driving adaptive resistance than variation in baseline protein expression levels, and that resistance emerges as a network-level property rather than from individual components alone. The revised manuscript provides additional clarification regarding aspects of the simulation and filtering procedures and frames the comparison with experimental data as qualitative. Nonetheless, the study is best interpreted as a theoretical and exploratory analysis of the model's behaviour under heterogeneous conditions. Consequently, questions remain regarding the biological grounding of the sampled parameter regimes and the extent to which the reported frequencies of resistance-associated behaviours can be directly interpreted in physiological terms.

      While the authors propose a potentially useful computational framework to explore how heterogeneity shapes dynamic responses to drug perturbation, a number of important conceptual and methodological concerns remain to be addressed:

      (1) The sampling of kinetic parameters constitutes the backbone of the manuscript, yet important concerns remain regarding its biological grounding and transparency. Although the revised version provides additional clarification on the exploration of "model instances", it is still not sufficiently clear how parameter values and initial conditions are generated, nor how the chosen ranges relate to biological measurements. The kinetic rates are sampled over broad intervals without explicit justification in terms of experimentally measured bounds or inferred distributions. As a consequence, it remains uncertain whether the ensemble of simulated behaviours reflects physiologically plausible cellular regimes or primarily the properties of the assumed parameter space. In this context, the large-scale sampling (100,000 parameter sets) resembles a Monte Carlo exploration of the model rather than a biologically calibrated representation of tumour heterogeneity.

      Furthermore, the adequacy of the sampling strategy in such a high-dimensional space (94 free parameters) remains open to question. In the absence of biologically informed constraints, the combinatorial space of possible parameter configurations is vast, and it is unclear to what extent the sampled ensembles can be considered representative. This issue is particularly relevant because the manuscript interprets the frequency of resistance-associated behaviours as indicative of their likelihood.
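
      The scale of this concern can be illustrated with a rough back-of-the-envelope calculation. The sketch below uses only the two figures quoted in the review (94 parameters, 100,000 samples) and a deliberately crude grid analogy; it is not a statement about how the authors sampled, and the variable names are illustrative.

```python
# Crude grid analogy for sampling density in a 94-dimensional parameter space.
N_PARAMS = 94        # kinetic parameters reported for the model
N_SAMPLES = 100_000  # sampled parameter sets reported in the manuscript

# Size of a grid in which every parameter takes only two levels (low/high).
two_level_grid = 2 ** N_PARAMS

# Grid points per axis that N_SAMPLES would afford if spread over a full grid.
points_per_axis = N_SAMPLES ** (1.0 / N_PARAMS)

# Best-case fraction of the two-level grid that N_SAMPLES could visit.
coverage = N_SAMPLES / two_level_grid

print(f"two-level grid size: {two_level_grid:.3e}")
print(f"points per axis:     {points_per_axis:.3f}")
print(f"best-case coverage:  {coverage:.1e}")
```

      Under this analogy the samples correspond to barely more than one grid point per parameter axis. Random sampling can still estimate frequencies with respect to the assumed sampling distribution, so the calculation mainly underlines that the reported frequencies inherit their meaning from that distribution.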

      The validation presented in Figure 7 does not fully resolve these concerns. The comparison with experimental data is qualitative, and the simulations are performed in arbitrary time units, which complicates direct interpretation alongside time-resolved experimental measurements. Moreover, certain qualitative discrepancies between simulated and experimental trends (e.g., persistent versus decreasing CDK4/6 activity) are not thoroughly discussed. As this figure represents the primary empirical reference point in the manuscript, the extent to which the model captures experimentally observed dynamics remains uncertain.

      Finally, aspects of presentation continue to limit transparency. Parameter ranges are described at different points in the manuscript but are not consolidated clearly in the Methods, and the definition of initial conditions remains ambiguous - particularly whether these correspond to conserved quantities or to the dynamic variables used to initialise simulations. In addition, the exact number of model instances underlying specific analyses and figures is not always explicit. Greater clarity on these issues is essential for assessing reproducibility and for interpreting the quantitative claims of the study.

      (2) A central conclusion of the manuscript is that heterogeneity in protein-protein interaction kinetics is a stronger driver of adaptive resistance than heterogeneity in protein expression levels. To assess the latter, the authors fix a nominal set of kinetic parameters and generate 100,000 random initial concentrations for the 50 model species. However, according to the simulation protocol described in the manuscript, each trajectory includes three phases: (i) simulation under starvation conditions to equilibrium, (ii) mitogenic stimulation to a second ("fed") equilibrium, and (iii) application of drug treatment. The equilibrium concentrations reached in phases (i) and (ii) are determined by the kinetic parameters of the model and are independent of the initial concentrations, provided the system converges to a stable steady state. In dynamical systems terms, stable equilibria are defined by the parameter set and attract all initial conditions within their basin of attraction. Since the kinetic parameters are fixed in this experiment, the pre-treatment equilibrium that serves as the starting point for drug application should likewise be fixed. Under these conditions, it is therefore not unexpected that sampling a large number of initial concentrations has limited influence on the treated dynamics.

      This raises conceptual questions about the interpretation of the comparison between kinetic and expression heterogeneity. If the system converges to a unique stable steady state prior to treatment, then variability in initial concentrations does not propagate into variability in drug response, and the observed dominance of kinetic heterogeneity may partly reflect this structural property of the model rather than a biological principle. Clarification is needed regarding whether multiple steady states exist under the nominal parameter set, and if so, how basins of attraction are explored.
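
      The structural point can be made concrete with a toy example. The sketch below is not the authors' ECC model: it is a hypothetical one-species production-degradation system with made-up rate constants, integrated with a simple forward Euler scheme, and it only illustrates that a system with a unique stable steady state forgets its initial concentration.

```python
def simulate_to_steady_state(x0, k_syn=2.0, k_deg=0.5, dt=0.01, t_end=50.0):
    """Integrate dx/dt = k_syn - k_deg * x with forward Euler (toy model)."""
    x = x0
    for _ in range(int(t_end / dt)):
        x += dt * (k_syn - k_deg * x)
    return x

# Widely different initial concentrations, identical kinetic parameters.
initial_conditions = [0.0, 1.0, 10.0, 100.0]
endpoints = [simulate_to_steady_state(x0) for x0 in initial_conditions]

# The analytical steady state depends only on the kinetic parameters.
expected = 2.0 / 0.5

for x0, x_end in zip(initial_conditions, endpoints):
    print(f"x0 = {x0:6.1f} -> pre-treatment state = {x_end:.6f}")
```

      All runs end at the same pre-treatment state, so any perturbation applied afterwards would start from an identical point. This would no longer hold in a model with conservation laws, where totals fixed by the initial conditions shift the steady state, or in a multistable model, which is why the clarification requested above matters.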

      More broadly, it remains unclear why initial protein concentrations can be sampled independently of the kinetic parameters. In biological systems, steady-state expression levels are typically determined by the underlying kinetic rates. A more consistent approach might require constraining initial concentrations to correspond to equilibrium states of the chosen parameter set, thereby introducing relationships between at least some of the 50 initial conditions and the 94 kinetic parameters. Finally, the manuscript employs a non-standard terminology regarding "initial conditions," which may further obscure interpretation of these results and would benefit from clarification.

      (3) The technical implementation of the modelling and simulation framework remains difficult to evaluate due to insufficient methodological detail. Although the authors state that kinetic parameters are randomly sampled, the manuscript does not specify the distributions from which parameters are drawn, nor whether potential correlations between parameters are considered or explicitly ignored. Without this information, it is not possible to assess how implicit modelling assumptions shape the ensemble of simulated behaviours. Given that the conclusions rely on frequency-based interpretations across sampled parameter sets, greater transparency regarding the sampling procedure is essential.

      A further concern relates to the parameter filtering step. The authors report that the "vast majority" of sampled parameter sets produced systems that were "too stiff," and that these were excluded on the grounds that stiff dynamics are not biologically plausible. However, the manuscript does not clearly define how stiffness is assessed, nor why stiffness is interpreted as biologically unrealistic rather than as a numerical property of the formulation. In standard practice, stiff systems are typically handled using appropriate implicit solvers rather than being discarded. Similarly, parameter sets that produce negative state values are excluded, yet such behaviour may arise from numerical artefacts rather than from intrinsic model inconsistency. The rationale for excluding these parameter sets, rather than adapting the numerical scheme, is not sufficiently justified.
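
      To illustrate why stiffness and negative state values can be numerical rather than biological properties, the following minimal sketch integrates a single fast relaxation, dy/dt = -k (y - 1), with an explicit and an implicit Euler step. The equation, the rate constant, and the step size are assumptions chosen for illustration and are unrelated to the manuscript's model.

```python
def forward_euler(y0, k, dt, n_steps):
    """Explicit Euler for dy/dt = -k * (y - 1)."""
    y = y0
    for _ in range(n_steps):
        y = y + dt * (-k * (y - 1.0))
    return y

def backward_euler(y0, k, dt, n_steps):
    """Implicit Euler for the same equation."""
    y = y0
    for _ in range(n_steps):
        # Solve y_new = y + dt * (-k * (y_new - 1)) for y_new; the equation
        # is linear, so the implicit step has a closed form.
        y = (y + dt * k) / (1.0 + dt * k)
    return y

K_FAST = 1000.0  # fast rate constant, making the problem stiff for this step size
DT = 0.01
STEPS = 50

explicit_result = forward_euler(0.0, K_FAST, DT, STEPS)
implicit_result = backward_euler(0.0, K_FAST, DT, STEPS)

print(f"explicit Euler: {explicit_result:.3e}")
print(f"implicit Euler: {implicit_result:.6f}")
```

      With the same step size, the explicit scheme diverges to a large negative value, while the implicit scheme settles at the true equilibrium of 1. In practice one would rely on a library stiff solver (for example `ode15s` in MATLAB, or the `BDF` and `Radau` methods of SciPy's `solve_ivp`) rather than a hand-written scheme.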

      The reported rejection rate - approximately 90% of sampled parameter sets - is substantial and raises questions regarding the interplay between model structure, parameter ranges, and numerical methods. As currently described, the filtering step appears to select parameter sets based primarily on computational tractability rather than on experimentally motivated biological criteria. The manuscript would be strengthened by clarifying whether the retained parameter sets are representative of biologically meaningful regimes, and by distinguishing clearly between exclusions based on biological plausibility and those arising from numerical considerations.

      Finally, important aspects of the simulation protocol require clarification. The model is simulated under "fasted" and "fed" conditions until equilibrium is reached, yet the criterion used to determine convergence is not specified. It would be important to describe how equilibrium is assessed (e.g., based on the norm of the time derivatives). Additionally, it remains unclear whether the mitogenic stimulus applied in the "fed" phase is assumed to be constant over time and, if so, how this assumption relates to biological experimental conditions. Greater detail on these implementation choices is necessary to ensure interpretability and reproducibility.
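
      One possible way to state such a convergence criterion explicitly is sketched below. The two-species cascade, its rate constants, the tolerance, and the constant stimulus are all hypothetical; the sketch only shows a derivative-norm stopping rule applied to a "starved" phase followed by a "fed" phase.

```python
import math

def derivatives(state, stimulus):
    """Toy two-species cascade (hypothetical): stimulus -> a -> b."""
    a, b = state
    da = stimulus - 1.0 * a
    db = 0.5 * a - 0.2 * b
    return (da, db)

def run_to_equilibrium(state, stimulus, dt=0.01, tol=1e-8, max_steps=1_000_000):
    """Forward Euler until the Euclidean norm of dx/dt drops below tol."""
    for step in range(max_steps):
        d = derivatives(state, stimulus)
        norm = math.sqrt(sum(v * v for v in d))
        if norm < tol:
            return state, step
        state = tuple(x + dt * v for x, v in zip(state, d))
    raise RuntimeError("no convergence within max_steps")

# Phase (i): no stimulus; phase (ii): constant stimulus, started from phase (i).
starved, _ = run_to_equilibrium((1.0, 1.0), stimulus=0.0)
fed, fed_steps = run_to_equilibrium(starved, stimulus=2.0)

print("starved equilibrium:", starved)
print("fed equilibrium:    ", fed, "after", fed_steps, "steps")
```

      Reporting the tolerance, the norm, and the maximum simulated time in this way would let readers judge whether each model instance actually reached equilibrium before the drug was applied.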

      (4) The manuscript states that the modelling conclusions are strongly supported by existing literature; however, the validation presented does not fully substantiate this claim. As noted above, the comparison with CDK2 and CDK4/6 experimental data remains qualitative, and the use of arbitrary simulation time units complicates interpretation of temporal agreement. The extent to which the model quantitatively or mechanistically recapitulates experimentally observed dynamics therefore remains uncertain.

      The claim that the model reproduces known resistance mechanisms is also difficult to assess in light of Figure S10, where a large fraction of network nodes (~80%) appear implicated in resistance under some conditions. If most components of the network can, in at least some parameter regimes, be associated with resistance phenotypes, the resulting lack of selectivity weakens the strength of model-based validation. It becomes challenging to distinguish specific mechanistic insights from generic consequences of network connectivity.

      In addition, the Supplementary Information notes that certain components of the mitogenic and cell-cycle pathways were abstracted or excluded in order to maintain computational tractability. While such abstraction is understandable in a large ODE framework, it raises interpretative questions. Proteins identified as potential resistance drivers within the model may, in some cases, represent aggregated or simplified pathway effects. Clarifying in the main text how such abstractions may influence the attribution of resistance mechanisms would strengthen the biological interpretation of the results.

      Drug inhibition is central to the manuscript's conclusions. The revised version clarifies that inhibition is implemented as a fixed fractional modification of specific kinetic rate laws. This abstraction is appropriate for exploring network-level responses, but it represents a stylised perturbation rather than a pharmacologically calibrated model of drug action. For full interpretability and reproducibility, the mathematical form of the modified rate laws, as well as the timing of inhibition relative to network equilibration, should be specified unambiguously. The biological implications of the findings depend critically on understanding this modelling choice.

      The one-at-a-time perturbation analysis presented in Figure 5 provides an interpretable ranking of first-order control points across the ensemble and offers mechanistic insight into primary sensitivities of the network. However, many targeted therapies act on multiple components, and resistance frequently arises through combinatorial mechanisms. The reported rankings should therefore be interpreted as identifying primary influences under isolated perturbations, rather than as a comprehensive account of multi-target drug behaviour.

      Overall, the manuscript succeeds in presenting a conceptual and exploratory framework for analysing how signalling network topology can shape the qualitative landscape of adaptive responses under heterogeneous kinetic conditions. Its principal contribution lies in establishing a systematic platform for large-scale in silico exploration. At the same time, the current limitations in biological calibration, parameter grounding, and validation constrain the extent to which the conclusions can be interpreted as predictive or quantitatively representative of specific tumour contexts. Addressing these issues would further strengthen the connection between the theoretical landscape described here and experimentally observed resistance dynamics.

    1. Reviewer #2 (Public review):

      Summary:

      This paper attempts to examine how rare, extreme events impact decision-making in rats. The paper used an extensive behavioural study with rats to evaluate how the probability and magnitude of outcomes impact preference. The paper, however, provides limited evidence for the conclusions because the design did not allow for the isolation of the rare, extreme events in choice. There are many confounding factors, including the outcome variance and the presence of less-rare, less-extreme outcomes in the same conditions.

      Strengths:

      (1) The major strength of the paper is the significant volume of behavioural data with a reasonable sample size of 20 rats.

      (2) The paper attempts to examine losses with rats (a notoriously tricky problem with non-human animals) by substituting time-outs as a proxy for losses. This allows for mixed gambles that have both gains and losses as possible outcomes.

      (3) The paper integrates both a behavioural and a modelling approach to get at the factors that drive decision-making.

      (4) The paper takes seriously the question of what it means for an event to be rare, pushing to less frequent outcomes than usually used with non-human animals.

      Weaknesses:

      (1) The primary issue with this work is that the main experimental manipulation fails to isolate the rare, extreme events in choice. As I understand the task, in all the conditions with a rare extreme event (e.g., 80 pellets with probability epsilon), there is also a less-rare, less-extreme event (e.g., 12 pellets with probability 5). In addition, the variance differs between the two conditions. So, any impact attributable to the rare, extreme event could be due to the less rare event or due to a difference in the variance (or other statistical moments, like skew or kurtosis). That the distributions can be shown to be different under specific assumptions about value-maximizing agents (e.g., with Jensen Gaps and Table 2) is not really relevant to what rats are sensitive to and what drives their behaviour. The design here does not support the conclusions. Finally, by deliberately confounding rarity and extremity, the design does not allow for assessing the impact of either aspect on rat behaviour.

      (2) The RL modelling work also fails to show a specific impact of the rare extreme event. As best as I can understand Eq 2, the model provides a free parameter that adds a bonus to the value of either the two options with high-variance gains (A and V in the paper) or the two options with high-variance losses (F and V in the paper). Or, equivalently, to the ones with "Jackpots" vs the ones with "Black Swans" (see Point 1 above as to how these different aspects are all confounded in this design). This parameter seems to depend only on whether this option could have possibly yielded the rare, extreme outcome (i.e., based on the generative probability) and was not connected to its actual appearance. [This point is unclear as the text says this, but the rebuttal states otherwise; plus some options never received the REE, see Table S11]. That makes it a free parameter that just bumps up (or down) the probability of selecting a pair of options. That may be due to the presence of the REE or the other rare event or just the variance difference. Moreover, in the case of the "black swan" or high-variance loss conditions, this seems very much like a loss aversion parameter, but an additive one instead of a multiplicative one. Is there a theoretical claim here that "extreme losses" need an additive loss-aversion parameter?

      (3) The paper presented the methods and results with lots of neologisms and fairly obscure jargon (e.g., fragility, total REE sensitivity). That makes it very hard to decipher exactly what was done and what was found. For example, on p. 4, the use of concave and convex was very hard to decipher; the text even has to repeat itself 3 times (i.e., "to repeat" and "in other words") and is still not clear. It would be much clearer (and probably more accurate) to say that the options varied along the variance dimension, separately for gains and losses. Option A was low-variance gains and losses. Option B was low-variance losses and high-variance gains. Option C was high-variance losses and low-variance gains, and Option D was high-variance losses and gains. That tells much more clearly what the animals experienced without the reader having to master a set of new terminologies around fragility and robustness, which brings a set of theoretical assumptions unnecessarily into the description of the experimental design. Alternatively, if the authors are wary of using the term "variance" because other moments of the distribution also differ, they could use "high-value gains" or "high-value losses" or something else which does not obscure the experimental design with jargon. Again, this goes back to point 1 above, whereby the different options differ on so many dimensions (as is made even more apparent in the rebuttal) that the design cannot isolate the impact of the variables of interest.

      (4) Were the probabilities shuffled or truly random (they seem to be fixed sequences, so neither)? What were the experienced probabilities? Given the fixed sequences, these experienced ("ex-post") probabilities could differ tremendously from the scheduled ("ex ante") probabilities. It's quite possible that an animal never experienced the rare, extreme event for a specific option. From Table S11, that is guaranteed to have happened in that 4 animals only ever experienced the "black swan" outcome once. It's even possible (if they only picked a specific option on the 10th/60th choices by chance), that they only ever experienced that rare extreme event. This point still cannot be known given the information provided, which does not break down outcomes by options. The Supplemental in Table S11 only gives overall numbers but does not indicate what the rats experienced for each choice/option, which is what matters here. A simple table that indicates, for each of the 4 options, how often they were selected and how often the animals experienced each of the 6-8 possible outcomes would make it much clearer how closely the experience matched the planned outcomes. In addition, by restricting the rare outcome to either the 10th or 60th activations in a session, these are not random. Did the animals learn this association? The text states that they did not, but no evidence is provided.
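
      The kind of per-option breakdown requested here could be produced along the following lines. This is only a sketch: the trial log below is invented for illustration (it is not data from the study), and the option labels and outcome values are placeholders.

```python
from collections import Counter, defaultdict

# Hypothetical trial log for one animal: (option chosen, outcome in pellets).
trial_log = [
    ("A", 1), ("A", 1), ("A", 12), ("B", 1),
    ("B", 80), ("A", 1), ("B", 1), ("B", 1),
]

def experienced_frequencies(log):
    """Return, per option, the number of choices and the experienced outcome frequencies."""
    counts = defaultdict(Counter)
    for option, outcome in log:
        counts[option][outcome] += 1
    table = {}
    for option, outcome_counts in counts.items():
        n = sum(outcome_counts.values())
        table[option] = {
            "n_choices": n,
            "freq": {o: c / n for o, c in outcome_counts.items()},
        }
    return table

table = experienced_frequencies(trial_log)
for option in sorted(table):
    print(option, table[option])
```

      Comparing each experienced frequency with its scheduled probability, per option and per animal, would show directly how far the ex-post experience departed from the ex-ante design.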

      (5) The choice data are generally presented in an overprocessed fashion with a sum and a difference (in both figures and tables). The basic datum (probability/frequency of selecting each of the 4 options) is not provided directly in the main text, even if it can theoretically be inferred from the sum and the difference. The new right side of Table S4 is probably the most valuable piece in terms of explaining what rats did and should be highlighted a lot more. Inspection of that table reveals some interesting (and potentially worrying) results. Most notably, the vast majority of responding happens on the "anti-fragile" and "robust" options, often totalling around 90% of all selections, especially amongst the most common blue rats. Alas, those were the two options that were deliberately assigned to the two most preferred holes in the training phase (see p. 26). Does this reflect genuine preference for reward distributions or does this reflect a spatial hole bias? The assignment strategy makes this impossible to tell apart.

      (6) There is insufficient detail provided on the inferential statistical tests (e.g., no degrees of freedom or effect sizes), and only limited information on exactly what tests were run and how (bootstrapping, but little detail). Without code or data (only summary information is provided in the supplement), this is difficult to evaluate. In addition, the studies seem not to have been pre-registered in any way, leaving many research degrees of freedom. Not all studies need to be pre-registered and sometimes discovery of new things requires exploratory work, but preregistration does provide additional safeguards against overemphasizing post-hoc detected patterns, a serious issue in behavioural science. Moreover, this promotes transparency in reporting results and analyses, allowing for a better assessment of the strength of evidence for a claim. For example, here, were any alternative analysis pipelines attempted? Also, there were many sub-groupings of the animals and subsequent comparisons between them which all seemed post-hoc. On what grounds were these divisions made, and were other divisions examined as well?

      (7) On p. 12 (Fig 4), there is an attempt to look at the impact of a rare, extreme event by plotting a measure of preference for the 10 trials before/after the rare, extreme event. In the human literature, the main impact of experiencing a rare, extreme event is what is known as the wavy recency effect (See Plonsky et al. 2015 in Psych Review for example, now cited). What this means is that there tends to be some immediate negative recency (e.g., avoiding a rare gain) followed by positive recency (e.g., chasing the rare gain). Typically, this refers to the specific option that yielded that outcome. First, as the other analyses do, the current analysis combines choice of the option that yielded the rare outcome with choice of other options, so that cannot directly assess the impact of the rare, extreme event on choice. Also, using a 10-trial window would obscure any impact of this rare, extreme event. There is mention of the very next trial, but an analysis that looks at the 10-trial time course trial-by-trial could reveal any impact that might be predicted from the human literature.

      (8) As I understood the method (p. 31), the assignment of options to physical locations was not random or counterbalanced, but deliberately biased to have one of the options in the preferred location. This would seem to create a bias towards a particular option and a bias away from the other options, which confounds the preference data in subsequent analyses. Table S4 reinforces this concern, as the vast majority of responses are clustered in the two most preferred options from training.

      (9) Are delays really losses? This is a big assumption. Magnitude and delay are different aspects of experience, which are not necessarily commensurable and can be manipulated independently. And, for the model, how were these delays transformed into outcomes? Eq 1 skips over that. Is there an assumption of linearity? In addition, I was not wholly clear if the delays meant fewer trials in a session or if the delays merely extended the session and meant longer delays until the next choice period.

      Other points:

      (1) I think the authors still misunderstand the concept of "hot-stove effects". The idea is that the experience of a very bad outcome can lead to avoiding the situation again (i.e., not sampling that option) and can provide the appearance of oversensitivity to that bad outcome. Here, that might be better thought of as "black-swan avoidance". Imagine that, to the rat, all options are equal in value; then some initial bad luck in encountering the black swan might make the animal avoid that option, even though, with enough experience, it would have turned out to be equal in value.

      (2) I am still not convinced that the Jensen inequalities add to this paper in terms of understanding the rat behaviour. That may be more suited for a different paper about the statistical and mathematical properties of certain generative distributions, but not here given what rats actually choose and experience.

      (3) Providing the data open access is very good. The code, however, should be equally available and not just upon request. Code needs to be available for assessment during peer review and for reproducibility checks. There are substantial enough problems with reproducibility in the field that code availability should be a minimum criterion for publication (see Miske et al., 2026 in Nature for the most recent large-scale evaluation of this problem).

      (4) The paper still somewhat mischaracterizes the literature on rare events, posing it as a series of "exceptions", rather than recognizing that a huge chunk of the literature uses rare events rarer than 10%. Also, there is even existing terminology in that literature for exactly the situation that is being created here: rare treasures (aka jackpots here) and rare disasters (aka Black Swans here).

      (5) Defining the observed behaviour in terms of convexity, instead of stating choices more plainly, obscures what is done/found. This is especially the case here because convex and concave mean different things when applied to gains/losses in terms of whether or not that option can lead to the REE. The use of the terms obscures rather than clarifies and probably is best left for the discussion (and maybe the intro) when mapping from theoretical distributions to the experiment at hand. In the paper, even the bottom of p.5 seems to incorrectly define "Total Sensitivity" as the combined proportion of selecting convex options in either domain, which does not map onto how convex is defined in Fig 1B or elsewhere in the text.

      (6) Fig 1C is baffling. Why are probabilities drawn moving away from the origin? The standard scientific plotting convention is for numbers to grow when moving away from the origin. That would be vastly clearer. Also, the color coding is confusing. Green-red maps onto convex-concave, but that would naturally seem to indicate gains vs losses, not convex vs concave. And why are probabilities growing larger in both directions from the origin? A much more sensible way to communicate the procedure would likely be a standard plot of magnitude vs probability.

      (7) Discussion: I think the main difference between the human situations discussed and this experiment is that humans have not experienced those rare "black swan" outcomes. Rather, they hear about the disasters that are possible and do not incorporate that information, as discussed in the description-experience literature already cited in this paper (though not in that context).

    1. Reviewer #1 (Public review):

      I read this paper with great interest based on my experience in insect sciences. Previous concerns:

      (1) The paper has an original biological question that is overly broad and mechanistically ambitious. The central biological question, namely how CLas infection enhances fecundity of Diaphorina citri via dopamine signaling, is clearly stated and well motivated by previous literature. However, my advice to the authors is that, while the general question is clear, the manuscript attempts to answer multiple mechanistic layers simultaneously. As a result, I feel that the biological narrative becomes diffuse, especially in later sections where DA, miRNA regulation, AKH signaling, and JH signaling are all proposed as parts of a single linear cascade. In summary, my key concern is that the paper often moves from correlation to causal hierarchy without fully disentangling whether these pathways act sequentially, in parallel, or redundantly. A more explicitly framed primary hypothesis (e.g., "DA-DcDop2 is necessary and sufficient for CLas-induced fecundity") may improve conceptual clarity.

      (2) On the novelty of the data, I feel they are moderately novel, with substantial confirmatory components. If I am correct, the novel contributions include the identification of DcDop2 as the DA receptor responsive to CLas infection in D. citri, the discovery that miR-31a directly targets DcDop2, which is supported by luciferase assays and RIP, and thirdly, the integration of dopamine signaling into the already-described CLas-AKH-JH-fecundity framework. My advice to the authors is to focus more on the manuscript's novelty, which lies more in pathway integration than in discovering fundamentally new biological phenomena. This is appropriate for a mechanistic paper, but should be framed as an extension of existing models rather than a paradigm shift.

      (3) On the conclusions, I recommend that the authors modify their statements a little. I feel that there are some overstated or insufficiently supported claims. For instance, the assertion that CLas "hijacks" the DA-DcDop2-miR-31a-AKH-JH cascade implies direct pathogen manipulation, but no CLas-derived effector or mechanism is identified. Also, the model suggests a linear signaling hierarchy, but the data largely show correlation and partial dependency rather than strict epistasis. Third, the term "mutualistic interaction" may be too strong, as host fitness costs outside fecundity (e.g., longevity, immunity) are not evaluated. In conclusion, I confirm that the data support a functional association, but mechanistic causality and evolutionary interpretation are somewhat overstated.

      Comments on revised version:

      The authors provided a satisfactory revision.

    2. Reviewer #2 (Public review):

      Summary:

      Nian and colleagues comprehensively apply metabolomics, molecular, and genetic approaches to demonstrate that CLas hijacks the DA/DcDop2-miR-31a-AKH-JH signaling cascade to enhance lipid metabolism and fecundity in D. citri, while concurrently promoting its own replication.

      Strengths:

      These findings provide solid evidence of a mutualistic interaction between CLas proliferation and ovarian development in the insect host. This insight significantly advances our understanding of the molecular interplay between plant pathogens and vector insects and offers novel targets and strategies for HLB field management.

      Weaknesses:

      While the article investigates the involvement of dopamine signaling and specific microRNAs in enhancing fecundity and pathogen proliferation, it still needs to provide a detailed mechanistic understanding of these interactions. The precise molecular pathways and feedback mechanisms by which CLas manipulates dopamine signaling in Diaphorina citri remain unclear.

    1. Reviewer #1 (Public review):

      (1) In this study, the authors aimed to characterize Huntington's disease (HD)-related microstructural abnormalities in the basal ganglia and thalami, as revealed by Soma and Neurite Density Imaging (SANDI) indices (apparent soma density, apparent soma size, extracellular water signal fraction, extracellular diffusivity, apparent neurite density, fractional anisotropy and mean diffusivity).

      (2) The study implements a novel biophysical diffusion model that extends up-to-date methodologies and has significant potential for quantifying neurodegenerative processes in the grey matter of the human brain in vivo. The authors comment on the usefulness of this technique in other pathologies, but they give multiple sclerosis as the only example. This point should be developed further, with supporting evidence.

      (3) The study found that HD-related neurodegeneration in the striatum accounted significantly for striatal atrophy and correlated with motor impairments. HD was associated with reduced soma density, increased apparent soma size and extracellular signal fraction in the basal ganglia, but not in the thalami. Additionally, these effects were larger at the manifest stage.

      (4) The results of this work demonstrate the impact of HD on basal ganglia and thalami which can be further explored as a non-invasive biomarker of disease progression. Additionally, the study shows that SANDI can be used to explore grey matter microstructure in a variety of neurological conditions.

      Comments on revised version:

      I have no further comments. Thank you.

    2. Reviewer #3 (Public review):

      Summary:

      Ioakeimidis and colleagues studied microstructural abnormalities in N=56 Huntington's disease (HD) patients compared to N=57 normative controls. The authors used a powerful MRI Connectom scanner and applied the SANDI model to estimate the soma size, neurite size, soma density, and extracellular fraction in key subcortical nuclei related to HD. In the striatum, they found decreased soma density and increased soma size, which also seemed to become more pronounced in advanced HD individuals in the final exploratory analyses. The authors conducted important analyses to test whether the SANDI measures correlate with clinical scores (i.e., QMotor) and whether the variance of the striatal volume is explained by the SANDI measures. They found a relationship of SANDI measures to both.

      Strengths:

      The study is both innovative and of high interest for the HD community. The authors provide a rich pool of statistical analyses and results which anticipate the questions that may emerge in the HD research community. Statistics are carefully chosen and image processing is done with state-of-the-art methods and tools. The sample size gives sufficient credibility to the findings. Altogether, I think this study sets a milestone in the attempts of the HD community to understand neuropathological processes with non-invasive methods, and extends the current knowledge of microstructural anomalies identified in HD with diffusion MRI. More importantly, the newly identified anomalies in soma size and soma density open new avenues for studying these biological effects further, and perhaps develop these biomarkers for use in clinical trials.

      Weaknesses:

      (1) An important question is whether the SANDI measures, which require an expensive scanner and elaborate processing, are better biomarkers than the more traditional DTI measures. Can the authors compare the effect size of FA/MD with that of the SANDI measures? In some of the plots and tables, FA/MD seem to have comparable, if not higher, correlations with QMotor or CAP scores. In the same vein, it is unclear whether DTI measures were included in the hierarchical stepwise regression. I wonder whether the stepwise models would have picked up FA/MD instead of SANDI measures had they been given a chance. Overall, I hope the authors can also discuss their findings in light of the cost vs. benefit of adopting SANDI in future studies, which is an important topic for clinical trials.

      (2) Similar to the above point, it is very important to consider how strong the biomarker signal from SANDI measures is compared to the good old striatal volume. Some plots seem to indicate that volumes still have the highest correlation with QMotor, and the highest effect size in group comparisons. It would be helpful for the community to know where the new SANDI measures stand, in terms of effect size, compared to the most typically used volumes.

      (3) The diffusion measures are inevitably correlated to some degree. Please provide a correlation matrix in the supplementary material including all DWI measures, so that readers can better understand how similar the SANDI measures are to one another and to the DTI measures. Adding volumes to this correlation matrix may also be a good future reference.

      (4) ISS stages:

      (a) The online ISS calculator requires cut-offs derived from the longitudinal Freesurfer pipeline, while the authors do not have longitudinal data. Thus, the ISS classification might be inaccurate to some degree if the authors used the FS cross-sectional pipeline. Please review this issue and see if updated cut-offs should be used to classify participants.

      (b) Were there really no participants with ISS 0 among the 56 HD individuals? Please clarify this in the manuscript.

      (c) A note on terminology that might be confusing to some readers. According to the creators of ISS, the ISS stages are created for research only; they are not used or applied in the clinic. On the other hand, the terms "premanifest" and "manifest" have a clinical meaning, typically based on the diagnostic confidence level. The assignment of ISS0-1 to premanifest and ISS2-3 to manifest may create some non-trivial confusion, if not opposition, in some segments of the HD community. The authors can keep their current terminology but will need to at least clarify to the reader that this assignment is speculative, does not fully match the clinically-based categories, and should not be confused with similarly named groups in the previous literature.
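
      The correlation matrix requested in point (3) could be assembled along the following lines. This is a minimal sketch, not the authors' pipeline: the measure names and all numbers in `toy` are hypothetical placeholders, and Pearson correlation is assumed only for illustration (a rank-based coefficient may be more appropriate for the real data).

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(measures):
    """Return a nested dict: matrix[a][b] is the correlation of measures a and b."""
    names = list(measures)
    return {a: {b: pearson(measures[a], measures[b]) for b in names} for a in names}


# Hypothetical per-participant values, one list per measure (NOT study data).
toy = {
    "soma_density": [0.30, 0.28, 0.25, 0.22, 0.20],
    "FA": [0.40, 0.38, 0.37, 0.33, 0.30],
    "striatal_volume": [9.5, 9.1, 8.4, 7.9, 7.0],
}
matrix = correlation_matrix(toy)

# Print the matrix row by row, rounded for readability.
for row_name, row in matrix.items():
    print(row_name, {col: round(r, 2) for col, r in row.items()})
```

      Including the volumes as additional keys in the same dictionary would put them in the same matrix, as the point suggests.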

      Comments on revised version:

      The authors have moved to address many points from reviewers. The manuscript has indeed become more objective, transparent, and to the point. The amount of information and analyses is large, which perhaps is inevitable when new methods are being tested for the first time in a neurodegenerative disease.

    1. Reviewer #1 (Public review):

      Integrating large-field stimulation with a retinotopic atlas, this study introduces an fMRI-based method for measuring contrast sensitivity across the visual field. Retinotopy was assessed using pRF mapping and a calibrated Benson atlas. The authors validate their method by replicating known patterns of contrast sensitivity across eccentricities and visual field quadrants in healthy subjects, and demonstrate its potential clinical utility through case studies of both simulated and real visual field loss.

      Comments on revisions:

      I appreciate the addition of the quadrant-scotoma condition and the authors' clarification that the goal is to demonstrate individual-level detection sensitivity. The 95% CI argument is reasonable, and I am satisfied with framing the simulated-scotoma work as proof-of-concept.

    2. Reviewer #2 (Public review):

      Summary

      This study uses functional MRI to evaluate visual contrast sensitivity across the visual field at the level of the visual cortex, testing the method as a proof of principle in a small group of normally sighted individuals, modelling both normal vision and simulated vision loss, as well as in a patient with independently verified vision loss. The results suggest a promising technique that measures vision objectively across the visual field and overcomes the requirement for careful fixation, which is often challenging in those with low vision or sight loss.

      Strengths

      • Objective measure of central vision: The proposed method may provide a more comprehensive and objective assessment of residual visual function in individuals with sight loss. This may be particularly useful for those with central visual field loss without the requirement of stable fixation or subjective motor responses.

      • More sensitive measure: The use of slope to calculate contrast sensitivity across a range of contrasts within the brain is clever and likely more sensitive than single threshold measurements or standard clinical measures of visual acuity using letter charts. Standard supra-threshold (high contrast) tests are not ideal for capturing residual vision or partial vision loss.

      • Good agreement with standard atlas: The Benson atlas provides a good estimate of visual field maps within V1 based on anatomical landmarks, and the authors take steps to refine this informed by cortical magnification and V1 surface area (brain size) for each individual participant. This could allow the technique to be generalised without the need to collect lengthy individual mapping data from every participant.

      • Within-subject reproducibility: The measurements appear to be sensitive and reproducible, particularly in those with normal vision, and are consistent with known features of visual sensitivity differences in different parts of the visual field.

      • Potential tool to measure visual field sensitivity in controls: Even if the proposed methods are not ideal for widespread clinical translation, they do offer an exciting tool to test hypotheses about visual field differences in healthy controls. For example, there seems to be an increase in sensitivity on either side of the simulated ring scotoma (Fig 6 - perhaps due to the release of lateral inhibition?). Reliability measures suggest that individual differences are consistent in healthy controls (although not tested statistically, perhaps due to the small sample size?). Whether they reflect behaviourally meaningful differences in visual field sensitivity could be tested in individuals by comparing them to behavioural measures across the visual field.

      • Potential tool to test novel treatments: The proposed techniques could be used to test within-subject changes in visual function in environments that are equipped to measure and analyse fMRI data, including clinical trials aimed at determining the success of novel treatments. Preliminary testing in healthy controls with eye movements also suggests that the method is suitable for testing low vision patients with unstable fixation (e.g., nystagmus), and the authors have modelled the effects of varying amounts and types of eye movements on functional outcome measures.

      Weaknesses

      • Questionable sensitivity to differences in patients. The variability in heat maps across healthy control participants is somewhat surprising, and it is uncertain whether they represent actual visual sensitivity differences or an artifact of the measurement technique, e.g., due to signal-to-noise differences introduced by local variations in brain anatomy. Thus, it is uncertain whether the substantial variance across controls will allow for a sufficiently stable baseline to detect meaningful differences in individual patients. Also, as the authors rightly point out, the Benson atlas does not model differences along meridians, so upper/lower field differences might not be detectable. However, the authors acknowledge that this is a pilot study, and further testing of a wider range of scotoma types, both in patients and simulated in controls, will only improve the methods. Furthermore, the ability to capture visual field representations in human visual cortex is also likely to improve with computational advances, making the use of atlases more feasible and obviating the need for individualised population receptive field mapping.

      • Potential for clinical translation. Although it is a sensitive measure, functional MRI is costly, is not available in all clinical settings, requires significant post-processing analyses, and may be contraindicated in some individuals due to safety (e.g., metallic implants) or other concerns (e.g., claustrophobia). These could present significant barriers to widespread clinical translation, if this were the ultimate goal of the study.

      • Limited range of spatial frequencies. The spatial frequencies tested were still quite low (0.3 and 3 cpd) compared to measures such as visual acuity. Extending the measurements to higher spatial frequencies could allow better characterization of central vision, although not necessarily of peripheral vision. However, this may depend on the typical visual abilities of the patient population of interest.

      Appraisal and Impact:

      The authors used appropriate and robust methods to assess and model known features of visual sensitivity differences across the visual field in sighted controls. In addition, the assessment technique successfully captured sensitivity changes due to simulated and actual partial field loss and was also fairly resilient to the eye movements and fixation instability typical of patients with sight loss. Although currently providing a proof of principle, the method is likely to improve with further testing and increasing normative sample sizes, and as computational methods continue to advance visual field map predictions. Although it may not be adopted widely as a standard clinical assessment technique due to the expense and other obstacles, it would provide a valuable tool in assessing clinical populations, for example in the context of clinical trials to assess suitability for treatment interventions or monitor treatment outcomes.

    3. Reviewer #3 (Public review):

      Summary:

      Chow-Wing-Bom et al. introduce an innovative wide-field visual stimulation setup for 3T experiments that enables stimulation up to a diameter of 40° visual angle while allowing continuous gaze tracking. Using this setup, the authors systematically investigate contrast sensitivity across the visual field by presenting subjects with sinusoidal gratings varying in contrast and spatial frequency. Their findings confirm the expected organization of contrast sensitivity, demonstrating a preference for high spatial frequencies in the central field and lower frequencies in the periphery. They also extend these measurements to eccentricities up to 20°, which exceeds previous fMRI-based reports. Moreover, the study explores the potential of using contrast sensitivity calculations as a method for detecting visual field defects, demonstrated in a healthy subject with simulated ring-shaped and upper-right-quadrant scotomas, and in a patient with LHON. The revised version additionally characterises the robustness of the approach to varying degrees of fixation instability.

      Strengths:

      - The manuscript is well written and provides comprehensive methodological details, ensuring high transparency and reproducibility.

      - The visual stimulation setup represents a significant technical advance by enabling wide-field stimulation with continuous eye tracking, which is crucial for both research and potential clinical applications.

      - The study confirms established findings regarding the organization of contrast sensitivity while extending them to a larger eccentricity range.

      - The effort to establish a measure of visual field loss aligns with current efforts to develop objective alternatives to conventional perimetry.

      - The revised manuscript includes an empirical assessment of how varying levels of eye movement affect cortical contrast sensitivity estimates, providing useful guidance on the tolerance of the approach to fixation instability.

      Weaknesses:

      - The original version left certain methodological aspects unclear, particularly the correction of eccentricity values from the Benson atlas and the V1 masks used in each analysis branch. The authors have added a dedicated figure illustrating the eccentricity correction procedure and now explicitly state that a manually delineated V1 mask was used for the pRF-based analyses while the Benson V1 label was used for the atlas-based analyses, together with a discussion of how this difference may influence the comparison.

      - Minor inconsistencies in reporting, such as the introduction of a second session in the Results section, have been corrected.

      The conclusion that high-contrast patterns, as used in pRF mapping, are not optimal for testing subtle but potentially clinically relevant changes in visual field coverage is valid. Contrast sensitivity can therefore be a potentially well-suited parameter for estimating visual field losses. The presented work is an interesting starting point, and the proposed method of using contrast sensitivity as a measure of partial vision loss should be further explored.

      Comments on revisions:

      The authors have thoroughly addressed all points raised in my original review, and I have no further concerns.

    1. Reviewer #1 (Public review):

      Summary:

      In this study, the authors aim to characterize how moment-to-moment fluctuations in arousal during wakefulness shape large-scale functional brain connectivity. Using pupil diameter as an index of arousal and high-field functional imaging, they seek to determine whether arousal-related modulation of connectivity is uniform across the brain or organized into structured patterns, and whether such patterns show hemispheric asymmetry. The work further aims to assess whether these organizational features generalize across resting-state and naturalistic viewing conditions.

      Strengths:

      The study addresses an important and timely question regarding how spontaneous variations in arousal influence whole-brain communication during wakefulness. The dataset is rich, combining high-field imaging with concurrent physiological measurements, and the analyses are ambitious in scope. A key strength is the attempt to move beyond region-based effects and to describe arousal-related modulation at the level of large-scale connectivity organization. The comparison across rest and movie viewing provides useful context and suggests a degree of consistency across behavioral states.

      Weaknesses:

      All analyses are based on 7T ultra-high-field imaging. The manuscript does not address whether the reported arousal-related patterns, including the community structure and hemispheric asymmetries, are expected to be reproducible at standard 3T field strengths. It therefore remains unclear whether the findings depend critically on the use of high-field data or whether they would generalize to more widely available datasets, limiting the broader applicability of the results.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript addresses a clear and widely relevant question: how ongoing fluctuations in alertness during wakefulness relate to large scale patterns of coordinated brain activity. The authors combine high field magnetic resonance imaging with simultaneous pupil measurements, and they compute an edgewise measure of arousal-related coupling for every pair of regions. Their main contribution is to show that arousal-related coupling is low dimensional and organized into seven reproducible "connectivity communities", each with characteristic network pair compositions. A secondary contribution is the observation that these communities exhibit systematic but community-specific hemispheric asymmetries, including a striking left/right dissociation within the ventral attention network, where the left side participates broadly across communities while the right side forms a more cohesive, segregated arousal responsive module. A final contribution is cross-context generalization: the same organizational structure and lateralization signatures are largely preserved during naturalistic movie watching.

      Strengths:

      (1) The paper moves beyond state contrasts and quantifies arousal related modulation continuously within wakefulness, directly addressing a gap highlighted in the Introduction.

      (2) The hemispheric asymmetry result is not framed as a crude global dominance effect; the authors explicitly test and argue that the key signal lies in structured spatial heterogeneity rather than mean shifts.

      (3) The cross-paradigm replication in movie watching is a strong design choice and supports the claim that the organizational motifs are not limited to unconstrained rest.

      (4) Arousal effects on BOLD signals and on pupil size can have different delays. The authors have now tested lagged relationships (for example shifting the pupil series forward and backward) to show that the main community structure and lateralization results are not sensitive to an arbitrary temporal alignment.

      (5) Time resolved connectivity results are now shown to be robust to changes in parameters.
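
      The lag test described in point (4) can be illustrated as follows. This is a minimal sketch under stated assumptions, not the authors' analysis: the function names, the toy series and the two-sample delay are hypothetical, and a single signal stands in for the edgewise coupling measure used in the manuscript.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def lagged_correlation(pupil, signal, lag):
    """Correlate signal[t] with pupil[t - lag]; a positive lag means pupil leads."""
    if lag > 0:
        p, s = pupil[:-lag], signal[lag:]
    elif lag < 0:
        p, s = pupil[-lag:], signal[:lag]
    else:
        p, s = pupil, signal
    return pearson(p, s)


def lag_profile(pupil, signal, max_lag):
    """Correlation at every integer lag from -max_lag to +max_lag (in samples)."""
    return {lag: lagged_correlation(pupil, signal, lag)
            for lag in range(-max_lag, max_lag + 1)}


# Hypothetical toy series: the signal is the pupil trace delayed by two samples.
pupil = [0, 1, 3, 2, 5, 4, 6, 3, 1, 2, 4, 7]
signal = [0, 0] + pupil[:-2]

profile = lag_profile(pupil, signal, max_lag=3)
best_lag = max(profile, key=profile.get)
print(best_lag, {lag: round(r, 2) for lag, r in profile.items()})
```

      In this picture, a result counts as insensitive to temporal alignment if the downstream community structure is essentially unchanged when it is recomputed at each lag in the profile.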

    3. Reviewer #3 (Public review):

      Summary:

      The paper investigates neural fluctuations underlying arousal using a combination of resting state/naturalistic movie watching fMRI and eye tracking data. The authors have used several data driven approaches, including time varying sliding window analyses and clustering methods, to characterize large scale brain organization and hemispheric asymmetries associated with arousal fluctuations. This is an interesting study framing arousal as a dynamic, continuously varying process rather than a discrete state. Overall, the manuscript is well written and the authors have provided sufficient details about the methodological choices, their impact on the results, along with the limitations of the study.

      Strengths:

      This is an interesting study framing arousal as a dynamic, continuously varying process rather than a discrete state. Overall, the manuscript is well written and provides sufficient methodological and analytical details to evaluate the results.

      Weakness:

      While the study provides new insights regarding neural processes underlying arousal, future studies may be needed to further examine the implications of the identified clusters and patterns.

    1. Reviewer #1 (Public review):

      This is an excellent paper from Dr. Yokoyama and colleagues. The experiments are technically demanding, given the very low cell numbers and the challenges of working with implantation sites at gestational days 6.5, 10.5, and 14.5. Overall, the impact of TGF-β receptor II deficiency in the NK lineage on uterine trNK cell numbers and litter size is convincing, and the authors' conclusions are well supported by the data. Less convincing, however, is the claim that the decrease in trNK cells is compensated by an increase in cNK cells; rather, the absence of TGF-β receptor II appears to result in an overall reduction of NK/ILC1 cells.

      Comments on revised version:

      I thank the authors for addressing all my comments from my initial review.

    2. Reviewer #2 (Public review):

      In their manuscript "TGF-β drives the conversion of conventional NK cells into uterine tissue-resident NK cells to support murine pregnancy", Yokoyama and colleagues investigate the role of Tgfbr2 expression by NK cells in the formation of tissue-resident uterine NK cells and subsequent importance in murine pregnancy. By transferring congenic splenic conventional NK cells into pregnant mice, they show conversion of circulating NK cells into uterine ivCD45 negative tissue-resident NK cells. When interfering with the formation of uterine trNK cells, spiral artery remodelling was impaired, fetal resorption rates were increased, and litter sizes were reduced.

      Generally, this is a research topic of high interest, yet the manuscript lacks detailed mechanistic insights, and some questions remain open. In its current state, the data represent an interesting characterisation of the Tgfbr2-fl/fl Ncr1-Cre mice in pregnancy, but considering 1) the recent publication by the group (Ref#17) on the role of Eomes+ cNK cells during pregnancy, 2) the previously described role of Tgfbr2 and autocrine TGFb expression for uterine NK cell differentiation in virgin mice (also cited by the authors), and 3) the well-known relevance of uterine NK cells during pregnancy, additional experiments addressing the specific role of Tgfb during pregnancy would help to improve the novelty and significance of the manuscript.

      Comments on revised version:

      In their revised version of the manuscript and their point-by-point response, the authors have very carefully addressed and discussed all of our concerns and suggestions.

    1. Reviewer #1 (Public review):

      Intron retention is observed in many long noncoding RNAs. The authors here used a powerful genome-wide screening strategy to identify proteins controlling intron retention in the long noncoding RNA PURPL. One of the top hits across multiple cell lines, surprisingly, was U2AF2, which is well known to bind the polypyrimidine tract close to the 3' splice site to promote splicing. Nonetheless, U2AF2 is working in the opposite direction here. Convincing follow-up RT-PCR experiments confirmed that knocking down U2AF2 does indeed lead to reduced intron retention of PURPL. The authors then show that this intron retention event is functionally important for both the nuclear retention of PURPL and its ability to enhance cell proliferation.

      The authors then used transcriptome-wide analyses to look for additional intron retention events affected by U2AF2. Among the ~250 genes with decreased intron retention (more splicing) upon U2AF2 knockdown was MALAT1, a well-established long noncoding RNA that normally localizes to nuclear speckles. Depletion of U2AF2 or removal of the MALAT1 2nd intron resulted in reduced speckle localization and cell migration, revealing a critical and fascinating role for this intron retention event. Overall, the authors have used a set of complementary approaches to clearly demonstrate a very intriguing role for U2AF2 in controlling intron retention and functionality of a set of long noncoding RNAs.

      I feel the current work has revealed an important role of intron retention in controlling the localization and functionality of long noncoding RNAs, which is likely broad in scope and is likely regulated by cell state.

      One experimental suggestion: The authors show that expressing intron-2-containing PURPL in PURPL-depleted cells is sufficient to induce faster proliferation, but a valuable comparison would be the phenotype of cells expressing the spliced PURPL transcript.

    2. Reviewer #2 (Public review):

      Summary:

      This study identified U2AF1/2 as a regulator of pre-mRNA splicing that either promotes or suppresses the splicing of introns in different genes. The authors then focused on two genes, PURPL and MALAT1, in which U2AF1/2 promotes the retention of specific introns, and characterized the biological implications of these U2AF1/2-regulated introns.

      Strengths:

      (1) The experiments in this manuscript are relatively rigorously designed and performed, often with validation checks such as verifying the knockout, verifying the treatment itself doesn't have an effect, etc.

      (2) The experiments provided comprehensive support for the claims that these specific introns are important for the stability or nuclear localization of the RNA, as well as that U2AF1/2 suppresses the splicing of these introns.

      (3) The writing of the manuscript is very clear and doesn't overstate the conclusions that can be drawn from the experiments.

      Weaknesses:

      I think one main weakness of this study is the lack of a deeper analysis of the mechanisms. Whether studying the mechanism is within the scope of this paper is probably debatable, but with the current experiment setup and data, I believe there are some analyses that can be relatively easily done to enhance the value or significance of this study. My detailed questions and suggestions are listed below:

      (1) Line 194-195 and Figure 2A: How many RBPs are included in "other RBPs" in line 194? Does "other RBPs" only include PTBP1, PRPF8 and SRSF1 in Figure 2A, or do they include all the ~100 RBPs with HepG2 eCLIP data available on ENCODE? If U2AF1/2 have the highest occupancy around the intron 2 region among the ~100 RBPs, it would be nice to visualize it.

      (2) Figure 2A and 2B: Why did U2AF2 not show interaction with exons 2 and 3 in RNA-IP, yet show enrichment over the exon 2 and exon 3 regions in the eCLIP data?

      (3) Figure 3C - 3F: Maybe I misinterpreted the experiments, but to my understanding, these experiments showed that the exogenous PURPL with intron 2 promoted cell proliferation compared to when the exogenous PURPL wasn't induced, but didn't compare this to the effect of the same amount of PURPL with intron 2 removed. Wouldn't it be clearer to compare the effects of exogenous PURPL with intron 2 and exogenous PURPL without intron 2 to pinpoint whether the effect is related to intron 2? Without an intron 2-specific experiment, the current experiments don't seem to add much beyond "PURPL promotes cell proliferation".

      (4) It's not very clear what proportion of these introns are retained in the endogenous PURPL and MALAT1 in various tissues, cell types and conditions. I think it will be valuable to provide this background (either from previous research, public database or data from this study).

      (5) Since U2AF1/2 have a wide range of targets, as demonstrated by Figure 4A, I think it would be valuable to have some experiments that directly disrupt the interaction between U2AF1/2 and PURPL and MALAT1 and test the effect on splicing outcomes, such as by mutating the sequence that U2AF1/2 bind to. The section on the weak py-tract of PURPL touched upon this topic but focused more on how the weak py-tract causes intron 2 retention in the background rather than on how U2AF1/2 binding and action were affected by sequence mutations. I think experiments disrupting the direct binding of U2AF1/2 to their targets can provide valuable mechanistic insights.

      (6) Across all the target genes of U2AF1/2, it might be feasible to do some systematic analysis to find what correlates with whether U2AF1/2 have a promoting or suppressing effect on intron splicing. For example, do genes with decreased IR after U2AF2 depletion systematically have a weak py-tract compared to genes with increased IR? This dataset can potentially provide many hypotheses for understanding the dual role of U2AF1/2.
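
      The comparison proposed in point (6) could start from something like the following. This is a minimal sketch under loud assumptions: pyrimidine content in a fixed window upstream of the terminal AG is only a crude proxy for py-tract strength, the 20-nt window and the function names are hypothetical choices, and the sequences are made-up examples rather than data from the study.

```python
def pyrimidine_fraction(intron_seq, window=20, skip=2):
    """Fraction of C/T (or U) in the `window` nt just upstream of the last `skip` nt.

    `skip=2` excludes the terminal AG dinucleotide of the intron (an assumption
    about how the input sequences are trimmed).
    """
    seq = intron_seq.upper().replace("U", "T")
    end = len(seq) - skip
    region = seq[max(0, end - window):end]
    if not region:
        raise ValueError("intron sequence too short")
    return sum(base in "CT" for base in region) / len(region)


def mean(values):
    return sum(values) / len(values)


def compare_groups(decreased_ir, increased_ir, window=20):
    """Mean pyrimidine fraction for introns with decreased vs. increased IR."""
    dec = [pyrimidine_fraction(s, window) for s in decreased_ir]
    inc = [pyrimidine_fraction(s, window) for s in increased_ir]
    return mean(dec), mean(inc)


# Hypothetical 3' intron ends (NOT real PURPL/MALAT1 sequence).
decreased_ir = ["GTAAGTACAGATCAGTAACAGATACAGGAAG", "GTGAGTAGACTAGCAGATAGACAGTAAACAG"]
increased_ir = ["GTAAGTTCTTTCCTTTTCTCTCTTCCTTTAG", "GTGAGTCTCTTTTCTTCCCTTTTCTCTTCAG"]

dec_mean, inc_mean = compare_groups(decreased_ir, increased_ir)
print(round(dec_mean, 2), round(inc_mean, 2))
```

      A real analysis would need a proper statistical test across all affected introns and, ideally, an established splice-site scoring method in place of this simple fraction.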

    3. Reviewer #3 (Public review):

      Summary:

      This manuscript characterized the splicing regulation of two long non-coding RNAs relevant to cancer, starting with a focus on PURPL and ending with insights into MALAT1. A CRISPR screen for the regulators of PURPL intron retention revealed a role for the U2AF heterodimer in inducing this retention, with U2AF2 as the actual hit. This is surprising, because the canonical function of U2AF is to recognize the polypyrimidine tract (PPT) and 3' splice site junction to induce splicing at the site. The brief mechanistic characterization of this phenomenon showed that this intron retention accounts for the nuclear localization and instability of the PURPL transcript, and seems to confer the enhanced cell proliferation feature. U2AF2 also induces retention of two introns in MALAT1, and one of them is essential for its nuclear speckle localization and enhanced cell migration.

      Strengths:

      These findings about PURPL and MALAT1 are clear and interesting.

      Weaknesses:

      The results are not sufficiently connected to each other, because one regulation is nuclear-speckle dependent but not the other.

      Here are my specific comments:

      Major comments:

      The main issue is the lack of focus because of the distinct and incomplete analysis pertaining to the two long noncoding RNAs, PURPL and MALAT1. The paper starts with a very good genetic screen on the former, and immunofluorescence and functional analysis on the latter, with U2AF2 as the main link to induce intron retention. The first one does not show clear localization while the second docks to nuclear speckles, apparently because of the retained intron. Hence the two mechanisms are related yet distinct. Here are some suggestions to enhance the characterization and connection between the two cases:

      (1) As the MALAT1 intron 2 retention contributes to its speckle localization but not the retained PURPL intron, the retained introns or their 3' splice site sequences should be swapped to see if they determine the localization.

      (2) In Figure 3, the rescue of the PURPL knockout by the intron-retained RNA to induce proliferation is a powerful experiment, but it lacks the rescue with the RNA without the intron as a control. This must be done and shown.

      (3) The weakness of the PPT of PURPL intron 2 appears as a clear feature of its retention dependent on U2AF2, which appears direct, as backed by CLIP data. It would be good to show direct binding by EMSA or equivalent techniques. Furthermore, the data is also consistent with other determinants. The exon and upstream intronic sequences, including the branch point, could also be involved, so mutations in these are also required.

      (4) In brief, what are the commonalities and differences between PURPL and MALAT1 with regard to their U2AF2-dependent intron retention?

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      The authors generated mouse and zebrafish models for DeSanto-Shinawi Syndrome, caused by loss-of-function variants in the WAC gene. Using these vertebrate systems, they demonstrate conserved craniofacial and social-behavioral phenotypes that parallel human clinical features, along with deficits in GABAergic markers. They observe increased seizure susceptibility and male-biased brain volumetric changes in Wac mutant mice. Together, these findings begin to define the biological consequences of Wac haploinsufficiency and provide valuable resources for future mechanistic studies.

      Strengths:

      WAC is a high-confidence neurodevelopmental disorder gene and one of the genes identified by large-scale exome sequencing efforts, including the Satterstrom et al. (2020) autism spectrum disorder cohort. This study establishes the first vertebrate Wac models, addressing a major gap in the understanding of DeSanto-Shinawi Syndrome, and provides a framework for studying other syndromic forms of autism. The models generated will be impactful and useful to the community to study and understand DeSanto-Shinawi Syndrome.

      The cross-species analysis is important and well executed, and reveals both conserved and divergent phenotypes. The behavioral and anatomical assays are rigorously executed and well-controlled, and the inclusion of RNA-sequencing analyses adds valuable insights into the mechanisms underlying brain function in Wac mutants. Notably, the RNA-seq data reveal upregulation of several clustered protocadherins, genes central to neuronal identity and cell-cell interactions, which are known to be regulated by dynamic developmental regulation of chromatin architecture. This observation provides an intriguing hint that could link Wac function to higher-order chromatin organization and neuronal connectivity.

      Weaknesses:

      The evidence is solid, though the study remains incomplete in its mechanistic depth and molecular interpretation. The authors compellingly describe behavioral, anatomical, and transcriptomic phenotypes associated with WAC loss, yet do not explore how WAC mechanistically regulates chromatin or transcription. Given prior evidence that WAC interacts with the RNF20/40 ubiquitin ligase complex and promotes histone H2B ubiquitination and transcriptional elongation, the paper would benefit from a discussion of these functions as a potential link between Wac haploinsufficiency and the observed changes in neuronal gene expression. Similarly, the authors mention WAC's WW and coiled-coil domains but do not consider how these domains could mediate nuclear interactions or recruitment of transcriptional cofactors that shape gene regulation and chromatin organization in neurons.

      The transcriptomic analysis is rich but largely descriptive. Although the upregulation of clustered protocadherins is particularly intriguing, these findings are not validated or localized to specific neuronal populations. The study would be strengthened by independently validating the most significant RNA-seq changes, such as protocadherin gamma genes, using in situ hybridization methods to confirm the spatial and cellular specificity of expression changes.

    2. Reviewer #2 (Public review):

      The authors describe the first deep neurological characterization of WAC mutation in two vertebrate species (zebrafish and mouse). They examine these at various levels, guided by the work in humans that has associated a heterozygous WAC mutation with DeSanto-Shinawi Syndrome (DESSH). Therefore, they investigate the animals for a variety of phenotypes, following a template for what is seen when characterizing a new mouse/fish model of a developmental disability gene. Investigations include analysis of skull and jaw for abnormalities (both species), MRI of brain structure (in mice), electrophysiology (mice), assessment of signaling pathways (by Western blot, in mice), cell counts (both, more in mice), transcriptomics (mice), and behavior (both).

      Generally, this describes an important first characterization of the consequences of the mutation. Most of the studies appear well-conducted and reasonably powered, thus solid or convincing.

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      The behaviour of cells expressing constitutively active HRas is examined in mosaic monolayers, both in MCF10a breast epithelial and Beas2b bronchial epithelial cell lines, mimicking the potential initial phase of development of carcinoma. Single HRas-positive cells are excluded from MCF10a but not Beas2b monolayers. Most interestingly, however, when in groups, these cells are not excluded, but rather sharply segregated within a MCF10a monolayer. In contrast, they freely mix with wt Beas2b cells. Biophysical analysis identifies high tension at heterotypic interfaces between HRas and wild-type cells as the likely reason for segregation of MCF10a cells. The hypothesis is supported experimentally, as myosin inhibition abolishes segregation. The probable reason for lack of segregation in the bronchial epithelium is to be found in the different intrinsic properties of these cells, which form a looser tissue with lower basal actomyosin activity. The behaviour of single cells and groups is recapitulated in a vertex model based on the principle of differential interfacial tension, under the condition of high heterotypic interfacial tension.

      Strengths:

      Despite being long recognized as a crucial event during cancer development, segregation of oncogenic cells has been a largely understudied question. This nice work addresses the mechanics of this phenomenon through a straightforward experimental design, applying the biophysical analytical approaches established in the field of morphogenesis. Comparison between two cell types provides some preliminary clues on the diversity of effects in various cancers.

    2. Reviewer #2 (Public review):

      Summary:

      The authors investigate the behavior of oncogenic cells in mammary and bronchial epithelia. They observe that individual oncogenic cells are preferentially excluded from the mammary epithelium, but they remain integrated in the bronchial epithelium. They also observe that clusters of oncogenic cells form a compact cluster in mammary epithelium, but they disperse in the bronchial epithelium. The authors demonstrate experimentally and in the vertex model simulations that the difference in observed behavior is due to the differential tension between the mutant and wild-type cells, which arises from differential expression of actin and myosin.

      Strengths:

      (1) Very detailed analysis of experiments to systematically characterize and quantify differences between mammary and bronchial epithelia.

      (2) Detailed comparison between the experiments and vertex model simulations to identify the differential cell line tension between the oncogenic and wild-type cells as one of the key parameters that are responsible for the different behavior of oncogenic cells in mammary and bronchial epithelia.

    1. Reviewer #1 (Public review):

      In this work, Gaurav et al. present an extensive study of phase-separated condensates formed by the foci-forming region (FFR) of the MUT-16 protein. The authors first report in vitro experiments showing that these condensates exhibit upper critical solution temperature (UCST) behavior. They then provide a detailed analysis based on atomistic simulations of MUT-16 FFR condensates, identifying key interactions responsible for LLPS, including salt bridges, cation-π interactions, and the role of Na⁺ ions.

      Overall, the manuscript is well written. However, there are several concerns that should be addressed.

      Major Concerns:

      (1) I have several questions regarding the system preparation that require clarification. The authors state that "65 copies of the coarse-grained MUT-16 FFR were embedded in a slab-shaped simulation," but it is not clear how this initial configuration was generated. Were the molecules randomly distributed in the simulation box, or were they initially arranged in a preformed condensate? Alternatively, were they randomly inserted and allowed to self-assemble into a condensate during NpT simulations?

      In Figure 1, the atomistic snapshot appears to show a well-defined condensate at the center of the simulation box. It would be important to clarify how this configuration was obtained: Was it generated from coarse-grained simulations starting from random initial conditions? Or was a preassembled condensate used as input?

      Related to this, how do the authors ensure that the simulations are equilibrated? While 20 μs appears to be a reasonably long simulation time for coarse-grained simulations, it would be useful to demonstrate equilibration explicitly. For example, the authors could plot the center-of-mass positions (in the long axis of the simulation box) of individual proteins over time to show that all molecules reach a steady state and remain within the condensate without systematic drift.
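
      The equilibration check suggested above could be sketched as follows. This is a minimal illustration in Python with NumPy, not the authors' analysis: the array layout, the one-bead-per-residue mapping, the function names `chain_com_z` and `drift_slopes`, and the synthetic coordinates are all assumptions made for the example. It computes each chain's centre of mass along the long box axis and fits a straight line to it, so that a slope near zero indicates no systematic drift.

```python
import numpy as np

def chain_com_z(z, masses):
    """Mass-weighted centre of mass of each chain along the long box axis.

    z      : array (n_frames, n_chains, n_beads), unwrapped z-coordinates in nm
    masses : array (n_beads,), bead masses (same for every chain, an assumption)
    Returns an array of shape (n_frames, n_chains).
    """
    return (z * masses).sum(axis=-1) / masses.sum()

def drift_slopes(com_z, times):
    """Least-squares slope of each chain's COM versus time (nm per time unit)."""
    # np.polyfit fits every column of a 2-D y array at once; row 0 holds the slopes.
    return np.polyfit(times, com_z, deg=1)[0]

# Synthetic stand-in for a coarse-grained trajectory (hypothetical data):
# 65 chains of 172 beads fluctuating around fixed positions for 20 time units.
rng = np.random.default_rng(0)
n_frames, n_chains, n_beads = 200, 65, 172
times = np.linspace(0.0, 20.0, n_frames)
centres = rng.normal(0.0, 2.0, size=(1, n_chains, 1))
z = centres + rng.normal(0.0, 0.5, size=(n_frames, n_chains, n_beads))
masses = np.ones(n_beads)

com = chain_com_z(z, masses)
slopes = drift_slopes(com, times)
print("largest |drift| over all chains:", np.abs(slopes).max())
```

      With a real trajectory, `com` could be plotted against `times` for every chain, and chains with a clearly non-zero slope would point to incomplete equilibration.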

      (2) The authors experimentally observe UCST behavior for these condensates. Do the coarse-grained or atomistic simulations reproduce this behavior?

      While atomistic simulations may be too computationally demanding to systematically explore temperature dependence, coarse-grained simulations could be used to test whether condensates are stable at lower temperatures and dissolve at higher temperatures. Such an analysis would provide valuable support for the experimental observations.

      (3) Regarding the analysis of ions, several points could be clarified and extended:

      a) It would be helpful to report the total number of ions and quantify how many are located inside vs. outside the condensate. While qualitative trends can be inferred from density profiles, quantitative analysis would strengthen the conclusions.

      b) It would also be interesting to analyze the number of contact ion pairs (e.g., Na⁺-Cl⁻ pairs), as described in J. Chem. Phys. 156, 044505 (2022). It is known that some ion models tend to overestimate ion pairing and underestimate solubility (e.g., J. Chem. Phys. 153, 010903 (2020)).

      c) In this context, the use of scaled-charge models has been shown to improve the description of ionic solutions and biomolecular systems (e.g., J. Phys. Chem. Lett. 2019, 10, 23, 7531-7536). I would suggest that, at least for one trajectory, the authors perform a test simulation using scaled charges (e.g., scaling by ~0.8) to evaluate whether ion distributions and protein-ion interactions are significantly affected.

      d) Finally, while the selected water model is known to be accurate, it would be useful to assess its performance for concentrated salt solutions. For example, the authors could estimate the density of a 6 m salt solution and compare it with experimental data or validated models (e.g., J. Chem. Phys. 151, 134504 (2019)). This would help clarify to what extent the conclusions depend on the chosen force field.
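
      The inside-versus-outside ion count requested in point (a) could be obtained along these lines. This is only a sketch under stated assumptions: the condensate is taken to occupy a slab between two z-values (`z_lo` and `z_hi`, hypothetical boundaries that would in practice be read off the protein density profile), and the function name and the toy coordinates are illustrative.

```python
import numpy as np

def ion_partition(ion_z, z_lo, z_hi):
    """Average number of ions inside and outside a slab-shaped condensate.

    ion_z      : array (n_frames, n_ions), z-coordinates of one ion species in nm
    z_lo, z_hi : assumed lower and upper z-boundaries of the dense phase in nm
    Returns (mean number inside, mean number outside) over all frames.
    """
    inside = (ion_z >= z_lo) & (ion_z <= z_hi)
    n_inside = float(inside.sum(axis=1).mean())
    n_total = ion_z.shape[1]
    return n_inside, n_total - n_inside

# Toy coordinates (hypothetical): two frames, four ions.
ion_z = np.array([[0.0, 5.0, 10.0, 15.0],
                  [1.0, 6.0, 11.0, 16.0]])
n_in, n_out = ion_partition(ion_z, z_lo=4.0, z_hi=12.0)
print(n_in, n_out)
```

      Dividing the two counts by the volumes of the dense and dilute regions would turn them into concentrations that can be compared directly.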

      Minor Concerns

      (1) In the Introduction, it would be helpful to elaborate further on the possible driving forces of LLPS in this region. Are there prior hypotheses or evidence pointing to specific interactions (e.g., cation-π, π-π, electrostatic interactions)? While this work addresses these questions, a brief discussion of previous experimental or theoretical insights would provide useful context.

      (2) On page 18, the authors state: "MUT-16 FFR satisfies the length (172 residues), aromatic content (20.35%), and Arg enrichment (85.71%) criteria. Its charge content (10.47%) and charge balance (38.89% positive charge fraction) are slightly below the nominal thresholds." It would be very helpful to include a schematic representation of the protein sequence highlighting these features (aromatic residues, charge distribution, etc.) in the corresponding figure, to provide a more intuitive understanding.

      (3) A question regarding ion hydration: What is the coordination environment of the ions that bridge proteins? Are they still hydrated by water molecules, or does the reduced water content inside the condensate significantly affect their solvation? Typically, Na⁺ and Cl⁻ ions have coordination numbers around 5-6 in aqueous solution. Do protein interactions and reduced solvent conditions within the condensate alter this coordination? A brief analysis or discussion would be valuable.
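
      For the sequence features quoted in minor concern (2), a small helper of the following kind could feed such a schematic. The definitions are assumptions, since the review does not spell them out: aromatic residues are taken as F, W and Y, positive residues as K and R, negative residues as D and E, and "Arg enrichment" as the share of Arg among positive residues. The quoted percentages are numerically consistent with simple count ratios of this kind (for example, 35/172 ≈ 20.35%), but that reading is an inference. The sequence below is a made-up toy string, not MUT-16 FFR.

```python
def composition_summary(seq, aromatic="FWY", positive="KR", negative="DE"):
    """Simple composition metrics for a one-letter protein sequence.

    The residue classes are assumptions and can be changed via the arguments.
    """
    n = len(seq)
    n_aro = sum(seq.count(a) for a in aromatic)
    n_pos = sum(seq.count(a) for a in positive)
    n_neg = sum(seq.count(a) for a in negative)
    n_charged = n_pos + n_neg
    return {
        "length": n,
        "aromatic_fraction": n_aro / n,
        "charged_fraction": n_charged / n,
        # Guard against division by zero for sequences without charged residues.
        "positive_fraction_of_charged": n_pos / n_charged if n_charged else 0.0,
        "arg_fraction_of_positive": seq.count("R") / n_pos if n_pos else 0.0,
    }

# Hypothetical 10-residue toy sequence, used only to exercise the function.
toy = "MRYGSDKFRE"
summary = composition_summary(toy)
for key, value in summary.items():
    print(key, value)
```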

    2. Reviewer #2 (Public review):

      Summary:

      Gaurav et al. investigate residue-level interactions within the MUT-16 FFR condensate using all-atom molecular dynamics simulations. The authors first argue, based on sequence analysis, that MUT-16 FFR is more representative than the widely studied FUS LCD. They then characterize the UCST phase behavior of MUT-16 FFR experimentally, followed by a detailed analysis of residue-level contact frequencies and lifetimes. In addition, the manuscript examines ion-residue interactions and water-mediated interactions. Overall, this work provides a comprehensive view of the dynamic interactions within the MUT-16 FFR condensate.

      Strengths:

      Large-scale all-atom molecular dynamics simulations have been performed to investigate dynamical interactions within condensates. The analysis is comprehensive and rigorous, and the claims are strongly justified by the data.

      Weaknesses:

      The large amount of detail in the results section sometimes makes it difficult to identify the central take-home messages. I encourage the authors to more clearly highlight the principal findings and the physical insights that may generalize to other condensate-forming systems. The authors may also consider streamlining parts of the Results section to improve focus and readability.

    3. Reviewer #3 (Public review):

      Summary:

      The authors aim to characterize the molecular interaction network inside phase-separated condensates formed by the MUT-16 foci-forming region (FFR), using atomistic simulations combined with residue-resolved analyses of contact frequencies, contact lifetimes, specific non-covalent interactions, ions, and water.

      Strengths:

      The work addresses an interesting and biologically relevant system, and the combination of large-scale atomistic simulations with an extensive contact analysis has clear potential value for the broader condensate field.

      Weaknesses:

      In its current form, several technical issues need to be addressed before the main conclusions can be considered robust. Most importantly, the simulated sequence is 172 residues long, while the atomistic slab has box dimensions of only 12 nm in two directions. This length scale is comparable to the expected end-to-end distances of a disordered 172-residue chain. It is therefore not clear whether individual protein chains interact with their own periodic images, which could substantially affect overall chain dynamics and subsequently bias contact lifetimes, residue-residue interaction statistics, and the inferred condensate dynamics. The authors should check, for each chain, histograms of end-to-end distances. For chains for which more than ~2-3% of the end-to-end distances exceed ~11 nm, the authors should explicitly check for self-image interactions (for example, using "gmx mindist -pi") and report whether such interactions occur and for what fraction of the trajectory. Without this control, at least in the Supporting Information, I do not think the simulation-derived contact dynamics are sufficiently trustworthy.
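
      The first screening step of this control could look roughly like the sketch below, which flags chains whose end-to-end distance exceeds the ~11 nm threshold in more than about 2% of frames. It is an illustration only: the function names, the array layout and the toy coordinates are assumptions, and the coordinates are assumed to be unwrapped so that each chain is whole across periodic boundaries.

```python
import numpy as np

def fraction_extended(first_xyz, last_xyz, threshold_nm=11.0):
    """Per-chain fraction of frames with end-to-end distance above a threshold.

    first_xyz, last_xyz : arrays (n_frames, n_chains, 3), positions in nm of the
                          first and last residue of every chain (unwrapped)
    """
    ree = np.linalg.norm(last_xyz - first_xyz, axis=-1)  # (n_frames, n_chains)
    return (ree > threshold_nm).mean(axis=0)

def chains_to_check(fractions, max_fraction=0.02):
    """Indices of chains that exceed the tolerated fraction of extended frames."""
    return np.flatnonzero(fractions > max_fraction)

# Toy data (hypothetical): 4 frames, 2 chains, termini separated along x only.
first = np.zeros((4, 2, 3))
last = np.zeros((4, 2, 3))
last[:, 0, 0] = [5.0, 6.0, 12.0, 7.0]   # chain 0 exceeds 11 nm in one frame
last[:, 1, 0] = [3.0, 4.0, 5.0, 6.0]    # chain 1 never does
fractions = fraction_extended(first, last)
flagged = chains_to_check(fractions)
print(fractions, flagged)
```

      Only the flagged chains would then need the explicit periodic-image check mentioned above.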

      A second major concern is the treatment of ions. The manuscript makes important conclusions about Na⁺ association and Na⁺-mediated bridging, but the atomistic ion model is not explicitly stated. This is a reproducibility problem and also affects interpretation - for example, standard Amber ions are known to bind too strongly to the oppositely charged residues. In their results, one acidic residue appears to interact on average with roughly two Na⁺ ions, which is not obviously expected from charge balance alone. The authors should state the exact Na⁺/Cl⁻ parameters used, justify their compatibility with TIP4P-D and the protein force field, and explicitly interpret why such a strong Na⁺ association with acidic residues is observed.

      More generally, because the manuscript is centered on contact lifetimes, the choice of the atomistic force field needs stronger justification. Salt bridges, cation-pi contacts, pi-pi stacking, ion coordination, and water-mediated interactions are all force-field-sensitive. Since there is no direct experimental observable used here to validate the simulations, the authors should discuss the expected limitations of the chosen force field (while I do acknowledge that testing different force fields would be computationally too demanding).

      I also find the sequence-comparison section somewhat confusing. The authors compare one specific IDR, MUT-16 FFR, with the average properties of human IDRs and then frame it as more representative than FUS LCD. It is not clear how informative this is because IDR behavior depends strongly on sequence-specific patterning, molecular connectivity, and the particular interaction network of each protein. Averages over human IDRs may provide a broad context, but they do not necessarily define what is physically or biologically representative for phase separation. In addition, FUS LCD is not intended to be a representative human IDR; it is an unusually low-complexity, phase-separating domain. Therefore, the "more representative than FUS" framing should be toned down. At most, this analysis shows that MUT-16 FFR is compositionally less extreme than FUS LCD.

      The ion- and water-bridging analyses are also potentially overinterpreted. A distance-based simultaneous contact with two residues does not by itself establish functional mediation or regulation of condensate dynamics. The authors should either add appropriate controls, such as local-density-normalized baselines or randomized-contact expectations, or soften the language to describe these as geometrically defined co-contact events rather than mechanistic bridging interactions.
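
      One very simple form of such a baseline is an independence expectation: the observed frequency with which an ion contacts two residues at once is compared with the product of the two single-contact frequencies. The sketch below illustrates only this idea. It is not a local-density-normalized baseline, it ignores time correlation between frames, and the function name and the toy contact series are assumptions.

```python
import numpy as np

def cocontact_baseline(contact_a, contact_b):
    """Observed co-contact frequency and its expectation under independence.

    contact_a, contact_b : boolean sequences (n_frames,), True when the ion is
                           within the contact cutoff of residue A or residue B
    Returns (observed, expected) as fractions of frames.
    """
    a = np.asarray(contact_a, dtype=bool)
    b = np.asarray(contact_b, dtype=bool)
    observed = float(np.mean(a & b))
    expected = float(np.mean(a) * np.mean(b))
    return observed, expected

# Toy contact series (hypothetical), four frames.
a = [True, True, True, False]
b = [True, True, False, False]
observed, expected = cocontact_baseline(a, b)
print(observed, expected)
```

      An observed value well above the expected one would be evidence for more than coincidental co-contact, whereas similar values would favour the softer "geometrically defined co-contact" wording.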

      Finally, the independence of the atomistic replicas is unclear. The manuscript should state whether all ten all-atom simulations were initiated from the same coarse-grained condensate configuration or from distinct CG frames. If the starting structures came from one CG trajectory, the authors should report how far apart those frames were in simulation time and provide evidence that the initial atomistic configurations are structurally independent. If only velocities differ, the simulations should not be described as fully independent structural replicas.

    1. Reviewer #1 (Public review):

      Summary:

      The authors address the lack of validated tools for the detection and quantification of proteins associated with amyotrophic lateral sclerosis (ALS) through an extensive screening of 303 commercially available antibodies to 33 protein targets. Their ALS-Reproducible Antibody Platform (ALS-RAP) delivers a validated antibody toolbox for ALS research, which will provide an advantageous starting point for researchers in this field. Ayoubi R. et al. showcase the characterization workflow, presenting as an example the characterization of antibodies targeting Galectin-1, encoded by the LGALS1 gene. A selection of these antibodies was also used to profile protein levels across human induced pluripotent stem cell (iPSC)-derived and primary neurological cell types, and the findings support that the ALS disease mechanism involves both neuronal and glial cells.

      Strengths:

      The knockout (KO)-based approach is definitely the major strength of this study, providing a high level of confidence in the data collected in human induced pluripotent stem cell (iPSC)-derived and primary neurological cell types. The focus on renewable reagents (monoclonal and recombinant antibodies) is also important. The extensive characterization of this set of antibodies will benefit any scientist interested in any of the 33 target proteins, even in fields other than neuroscience.

      The authors perform an interesting protein profiling study assessing 27 proteins, comparing RNA and protein expression data, and using two independent WB preparations of the same cell types.

      The conclusions that can be drawn from this first assessment might not be final, but the data are compelling because they have been collected with reliable and validated antibodies.

      Another strength of this work is the data dissemination strategy, which includes the Only Good Antibodies (OGA) platform, where YCharOS data are curated and presented in an easy and intuitive manner that facilitates antibody selection by the end user for WB, IP and IF applications.

      Weaknesses:

      The authors mentioned the development of single-chain variable fragment (scFv) recombinant antibodies raised by the SGC against the six proteins (ANXA11, OPTN, MATR3, PFN1, UBQLN2 and VCP) that had limited renewable antibodies that are commercially available. The development was optimized to generate antibodies particularly suitable for IP, and the clone selection process was carried out using IP coupled to mass spectrometry. Even though the generation of these novel reagents is not the focus of this work, the authors do not provide any data on this aspect.

      The protein profiling study is limited to WB data, and the authors did not provide any explanation on why there was no integration with IP and IF data, not even for those targets that have validated antibodies. Also, not all the cell types have been screened by chemiluminescence-based detection and by fluorescence-based WB, and the authors do not elaborate on the reason for such a choice.

    2. Reviewer #2 (Public review):

      Overall, this is a solid manuscript that delivers an important community resource. The execution is relatively simple, but the value is real, the work is rigorously performed, and the open dissemination through Zenodo, the F1000Research YCharOS Gateway and OGA is well executed. The effort invested in generating the knockout lines for validation experiments is a clear strength of the study. I have a number of comments that I think would strengthen the resource and the conclusions drawn from it.

      Below, I list specific points.

      (1) The rationale for the selection of these 33 genes is insufficient. The authors lean on the Nijs & Van Damme classification and on PubMed entry counts, but the number of PubMed entries is not a meaningful criterion for what constitutes an important ALS protein - some of the most disease-relevant genes are precisely those with fewer publications, while heavily cited genes such as CAV1 carry weak ALS-specific evidence. The authors should provide a more transparent and biologically motivated rationale for inclusion and exclusion (ClinGen evidence tier, replicated GWAS signals, large meta-analyses, ALSoD) and explain why specific risk genes outside this list were not part of ALS-RAP.

      (2) "107 of 231 (46%) demonstrated specific target staining in IF." The criteria used to define "specific target staining" at the IF level are not stated. From the Galectin-1 example, the mosaic WT/KO strategy provides a binary readout, but for proteins with low expression, weak punctate staining or unusual subcellular distributions, a single threshold is unlikely to capture specificity uniformly across 231 antibodies.

      (3) Several claims in the manuscript depend on differential protein abundance across cell types. As presented, these claims are supported by qualitative Western blot images only. They should be substantiated by quantification across multiple biological replicates.

      (4) This manuscript represents a unique opportunity to address antibody recognition of splicing variants, which is something of considerable value to the community. For each target, the predicted isoforms in Ensembl could be cross-referenced against the observed bands, and the pattern of bands compared across cell types could be informative about which isoforms each antibody captures. This would convert ambiguous "extra bands" into useful biological information and would substantially increase the value of the resource. I strongly encourage the authors to include this analysis.

      (5) The iPSC-derived microglia receive a comprehensive QC panel (IBA1/PU.1 IF, CD45/CD11b flow, qRT-PCR for nine canonical markers; Figure S4), which allows the reader to assess culture purity. The other iPSC-derived lineages - motor neurons, dopaminergic neurons, oligodendrocytes and astrocytes - are validated by a single marker each in WB (Figure S3) without purity quantification. Given that several conclusions of the manuscript rest on the cell-type-specific detection of ALS-associated proteins, equivalent quality control should be performed for the other lineages so that the reader can evaluate the purity of each preparation.

      (6) The robustness of the resource would be substantially increased by validating at least a subset of the targets in a second iPSC background, in at least some of the cell types analysed.

      (7) The newly developed SGC scFv antibodies are arguably the most novel reagent contribution of this manuscript, yet they receive a single sentence in the body of the paper. A more thorough description is warranted.

      (8) Accessibility of the resource through Zenodo is not straightforward - the reader currently has to navigate to individual antibody characterization reports one by one to extract recommendations for a given target. While the use of an established public repository is important for permanence, a dedicated ALS-RAP website with an interactive, searchable interface - filterable by target, application, host species and clonality - would meaningfully improve uptake. The relationship between such a portal and the existing OGA platform should also be clarified.

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors reveal that the availability of extracellular asparagine (Asn) represents a metabolic vulnerability for the activation and differentiation of naive CD4+ T cells. To deplete extracellular Asn, they employed two orthogonal approaches: activating naive CD4+ T cells in either PEGylated asparaginase (PEG-AsnASE)-treated medium or custom-formulated RPMI medium specifically lacking Asn. Importantly, they demonstrate that Asn depletion not only impaired metabolic reprogramming associated with CD4+ T cell activation but also reduced CD4+ helper T cell lineage-specific cytokine production, thereby ameliorating the severity of experimental autoimmune encephalomyelitis.

      The experiments presented here are comprehensive and well-designed, providing compelling evidence for the conclusions. The conclusions will be important to the field.

      Comments on revised version:

      The authors have sufficiently addressed my previous comments. The manuscript represents an excellent contribution to the field.

    2. Reviewer #2 (Public review):

      While the importance of asparagine in the differentiation and activation of CD8 T cells has been previously reported, its role in CD4 T cells remained unclear. Using culture media containing specific amino acids, the authors demonstrated that extracellular asparagine promotes CD4 T cell proliferation. Consistent with this, depletion of extracellular asparagine using PEG-AsnASE suppressed CD4 T cell activation. Proteomic analysis focusing on asparagine content revealed that, during the early phase of T cell activation, most asparagine incorporated into proteins is derived from extracellular sources. The authors further confirmed the importance of extracellular asparagine in vivo, demonstrating improved EAE pathology.

      While the data are well organized and convincing, the mechanism by which asparagine deficiency leads to altered T cell differentiation remains unclear. It is also necessary to investigate the transporters involved in asparagine uptake. In particular, elucidating whether different T cell subsets utilize the same or distinct transport mechanisms would provide important insight into the immunoregulatory role of asparagine.

      Comments on revised version:

      The authors have addressed the previous concerns, and the manuscript has been significantly improved.

    1. Reviewer #1 (Public review):

      Summary:

      In this study, the authors set out to define how arginine availability regulates lipid metabolism and to explore the implications of this relationship in pancreatic ductal adenocarcinoma (PDAC), a tumor type known to exist in an arginine-poor microenvironment. Using a combination of rigorous genetic and metabolomic approaches, they uncover a previously underappreciated role for arginine in maintaining lipid homeostasis. Importantly, they demonstrate that arginine deprivation sensitizes PDAC cells to ferroptosis through lipidome perturbations, which can be exploited therapeutically via co-treatment with aESA and ferroptosis inducers (FINs). These findings have meaningful implications for the field. They not only shed light on the metabolic vulnerabilities created by nutrient restriction in PDAC, but also suggest a practical avenue for combination therapies that exploit ferroptosis sensitivity. This is particularly relevant in the context of pancreatic cancer, which is notoriously resistant to conventional treatments. The methods employed are broadly applicable to other nutrient-stress contexts and may inspire similar investigations in other solid tumor types.

      Strengths:

      One of the major strengths of the study is the use of complementary and well-controlled approaches, including metabolomic profiling, genetic perturbations, and in vivo models, to support the central hypothesis. The experiments are thoughtfully designed and clearly presented, and the conclusions are, for the most part, well supported by the data. The findings provide mechanistic insight into nutrient-lipid crosstalk and identify a potential therapeutic strategy for targeting arginine-deprived tumors.

      Comments on revised version:

      The authors have substantially strengthened the revised manuscript and have addressed my prior concerns, and the evidence supports the central conclusions. This work provides meaningful insight into how nutrient limitation in the tumor microenvironment creates metabolic liabilities that may be therapeutically exploited, and it should be of interest to investigators studying cancer metabolism, pancreatic cancer, lipid biology, and ferroptosis.

    2. Reviewer #2 (Public review):

      This study by Jonker et al. examines how the metabolic adaptations of pancreatic ductal adenocarcinomas (PDAC) to the microenvironment present vulnerabilities that could be used for therapeutic purposes. The evidence supporting the claims of the authors is mostly solid, and the multiplicity of models used, as well as the combination of in vitro and in vivo work, is appreciated, but some conclusions would benefit from additional substantiation. This work would be of interest to biologists working on the impact of microenvironment and metabolism in cancer, and especially those investigating pancreatic cancer.

      In this study, the authors use mostly "doublings per day" as an indicator of cell death, notably for figures 4 to 6. However, proliferative arrest (or a decrease in the proliferative rate) is not necessarily synonymous with cell death. It might be nice to complement these experiments with a true measure of cell death (e.g. PI uptake).

    3. Reviewer #3 (Public review):

      This important study investigates the impact of nutrient stress in the tumor microenvironment (TME), focusing on lipid metabolism in pancreatic ductal adenocarcinoma (PDAC). Understanding TME composition is crucial, as it highlights cancer vulnerabilities independent of intracellular mutations, particularly because PDAC tumors are often exposed to limited nutrient availability due to reduced perfusion.

      By utilizing a medium that mimics the nutrient conditions of PDAC tumors, the authors convincingly show that TME nutrient stress suppresses SREBP1, leading to reduced lipid synthesis, with low arginine levels identified as a key driver of this suppression. Importantly, mice with arginine-starved pancreatic tumors respond to a polyunsaturated fatty acid-rich diet. This discovery uncovers a synthetic lethal interaction in the tumor microenvironment that could be leveraged through dietary interventions.

      Comments on revised version:

      The authors have satisfactorily resolved all previously raised concerns through the inclusion of additional data and clarifications in the discussion.

    1. Reviewer #1 (Public review):

      Summary:

      This study investigates how two closely related fish species differ in their processing of visual motion, with a focus on spatial and temporal integration underlying behavior. Using a series of behavioral assays combined with computational modeling, the authors identify clear species-specific differences in how visual information is integrated to guide movement.

      Strengths:

      A major strength of the work is the systematic and quantitative behavioral analysis, which reveals robust differences between species, including broader spatial integration and longer temporal persistence in medaka compared to zebrafish. The decomposition of behavior into distinct components provides a useful framework for interpreting these differences.

      Weaknesses:

      The computational modeling captures several key aspects of the observed temporal dynamics, particularly differences in response persistence. However, the modeling framework is primarily focused on temporal processing and does not incorporate spatial integration, which is a central finding of the study. In addition, some experimental observations, such as responses to short-duration stimuli and certain frequency-dependent features, are only partially reproduced. These limitations indicate that the link between the model and the full range of behavioral results remains incomplete.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript presents a comparative analysis of optomotor behavior in zebrafish and medaka larvae. Using multiple behavioral paradigms, the authors argue that the two species differ in both the spatial and temporal integration of visual motion. They further decompose turning behavior into large- and small-turn components and use a simple mechanistic model to capture several of the main response features. Overall, the study addresses an interesting question, and the comparative framework gives the work a clear conceptual appeal.

      Strengths:

      A major strength of the manuscript is the breadth of the behavioral analysis. The authors use several stimulus paradigms to probe spatial extent, temporal persistence, and response dynamics, which makes the cross-species comparison richer and more informative than a single-assay study. The decomposition into large and small turn components is also a useful feature of the work, as it provides a more structured account of where the species differences may arise. The modeling further helps organize the results and offers a useful framework for interpreting the behavioral differences.

      Weaknesses:

      The main limitations are in presentation and clarity rather than in the overall motivation or approach. In several places, it is difficult to determine exactly how some quantities are summarized statistically, and some figures and legends would benefit from clearer explanations. In addition, a few of the more specific interpretive claims would be strengthened by more explicit statistical framing and slightly clearer presentation. These issues appear addressable and do not detract from the overall interest of the study.

    1. Reviewer #1 (Public review):

      Summary:

      This paper presents rare and unique recordings of single neurons, LFPs, and SEEG data from human patients performing reading and listening tasks. The authors identify single neurons in temporal and ventral occipito-temporal cortex that respond specifically to spoken and written language, and primarily encode either phonological or orthographic features of the stimuli. They also identify neurons in the middle temporal and inferior frontal cortex that respond to both modalities, which they interpret as amodal language responses. In general, neuronal population firing rates are correlated with both micro- and macro-scale broadband gamma responses, though they observe some dissociations, particularly with the macro-scale. The results are interpreted to support a model of modality-specific to amodal processing throughout many distributed brain areas for language.

      Strengths:

      (1) The data are truly unique, providing a large-scale characterization of single neuron responses from the human brain during written and spoken language processing.

      (2) The task and stimulus conditions allow for examination of both low-level (e.g., orthographic/phonological) and higher-level (e.g., syntactic) encoding.

      (3) Showing relationships between single neuron and multi-scale LFP recordings from the same sites helps bridge neuronal and meso/macroscale literatures.

      Weaknesses:

      (1) My main comment about the paper is that it feels like a collection of somewhat random descriptions of a very small number of hand-picked single neurons. I think that the task and stimulus design shown in Figure 1A sets up some clear hypotheses that could be tested rigorously across the full neuronal population, but instead, the authors pick a few neurons and fit encoding models that don't take advantage of the contrasts. I agree that encoding models are a powerful approach, but with only 508 total words and what appears to be a limited set of variability across the various features, it's not clear to me that the stimuli, which were apparently designed as minimal pairs, provide enough power to find robust results. Perhaps this is why the majority of the results only show a very small number of units (most of which are actually buried in the supplement), but it's odd to me that they don't show the results of the minimal contrasts other than for length.

      (2) Related to point (1), other than Figure 2H and Figure 6A-B, the results are only shown for a tiny number of units. This is great for demonstrating qualitatively what the effects look like, but there is no quantification of the findings across the population, which undermines the point in the abstract that 1000 neurons were recorded. This is acknowledged in some places, but as a reader, it leaves me wondering how seriously to take the interpretations if they seemingly cannot be replicated. I understand this is a challenge with human single neuron recordings, but as presented, the paper as a whole comes across as largely anecdotal.

      (3) Some of the key claims rest on the idea that neurons were recorded from the superior temporal gyrus and fusiform gyrus. For the STG claim, I don't understand how this was done, or what specifically they mean by STG, since the microwire locations do not appear to be anywhere near the lateral surface. This makes sense given the profile of the Behnke-Fried electrodes, but if they want to claim that there are neurons from the STG, they need to be more specific and show where precisely these wires are. If they are more medial as it appears, they need to explain how they dissociated STG from Heschl's gyrus. Similarly, for the fusiform neurons, I can only see a couple of probes that appear to have their tips near where I would think this area is. Perhaps this is more of a visualization issue with Figure 1F, but overall, I am not convinced that the neurons are exactly where they say they are.

      (4) Related to point (3), some of the authors have made strong claims in prior work about the precise coordinates of the VWFA, so it would help to know how many units are within this exact region. The ROIs marked in Figure 2 are quite large, and given results like Vinckier et al. 2007, it's important to know where along the hierarchy the recordings were actually performed. Similarly, given the framing in the intro around the VWFA as a key area, the idea that some of the best example neurons are from the right fusiform is a bit confusing. I don't think they can make the claims about visual hemifields since it does not appear that they recorded eye tracking to verify constant central fixation, and it may be a bit surprising to see such strong orthographic selectivity in the right hemisphere (though, as a result, it may suggest a more nuanced view of lateralization of reading at the single-neuron level).

      (5) In many sections of the paper, there are vague and unquantified claims like "many neurons" or "a large number of units". This needs to be made explicit. It would also help to show where statistical threshold cutoffs are on plots like Figure 2H, since the "brain-score" is used to select units for many analyses.

      (6) More detail on the TRF models is needed in the methods. At the very least, a complete list of the features in each group is necessary to evaluate claims about very broad sets of features like "syntax". It would also help to know how the features were coded, especially where there is a mixture of continuous and discrete features within the model.

      (7) Depending on how exactly the features were defined, I'm skeptical of some of the claims, like position-specific "w". There are some obvious confounds that need to be controlled here, like whether word-initial "w" is strongly associated with shorter, higher frequency words (like "wh-" words). There are other examples, like whether specific forked letters tend to appear in certain syllables in English words. While it may be the case that these kinds of patterns are uniformly distributed, it needs to be established in this particular stimulus set.

      (8) The claim that there is monotonic encoding of word length does not seem strongly supported in the data. In both PC1 and the single neuron examples, it seems like there may be a non-linear relationship, which could suggest that another correlated feature (e.g., word frequency) is involved.

      Minor Points:

      (1) What are "boundaries"? They are not described anywhere I could find, but they are a feature group that was used in the TRFs.

      (2) The caption for Figure 6C says MTG and insula, but the text says MTG and IFG. Similar to the above comment about STG and fusiform, it's not clear to me how they achieved single-unit recordings with Behnke-Fried probes in these areas.

      (3) The somewhat less robust correlations between firing rate and BGA in macro vs micro contacts are potentially interesting. However, did they verify that the closest macro contact was always in the gray matter of the same gyrus as the microwire?

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript, "Modality-Specific and Amodal Language Processing by Single Neurons," presents an intracranial electrophysiology study investigating how language is represented in the human brain across spoken and written modalities. The authors analyze activity from over one thousand single neurons and local field potentials recorded in twenty-one neurosurgical patients while participants read and listened to sentences. Using encoding models based on temporal receptive fields, they examine whether neural responses track modality-specific features, such as phonological and orthographic information, as well as higher-level linguistic features. The results are interpreted as evidence for a dissociation between modality-specific processing in sensory regions and modality-independent ("amodal") representations in temporal and frontal cortices, supporting a two-stage model of language processing.

      Strengths:

      This study uses a rare and valuable dataset, combining single-neuron recordings with broader field potential measures in human participants. The large-scale recording, in terms of both neuron count and anatomical coverage across multiple regions and individuals, represents a significant technical achievement for intracranial research.

      The use of encoding models to relate neural activity to multiple levels of linguistic representation is methodologically rigorous and provides a unified framework to compare phonological, orthographic, and higher-level features. This approach allows the authors to systematically test how different aspects of language are represented across neurons and regions.

      Another key strength is the attempt to directly link concepts from Linguistics to neural data. By framing the results in terms of modality-specific versus amodal representations, the study engages with longstanding theoretical questions and offers a potential bridge between linguistic theory and systems neuroscience.

      The manuscript is also very well written, and the data are presented clearly and effectively. The inclusion of raw data and raster plots is particularly valuable, as it allows readers to directly assess the neural responses and strengthens the transparency of the analyses.

      Weaknesses:

      Despite these strengths, the central claims of the paper are not fully supported by the analyses presented, and several key issues limit the strength of the conclusions.

      A primary concern is the lack of clear reporting and statistical characterization of the proportion of neurons that significantly encode the tested linguistic features. While the paper presents illustrative examples and regional patterns of encoding, it does not systematically quantify how many neurons exhibit significant effects across conditions, nor does it provide formal statistical comparisons of these proportions across brain regions or feature types. As a result, it is difficult to determine whether the reported dissociations reflect robust population-level phenomena or relatively sparse subsets of neurons identified through model fitting. Figure 2H offers a visual depiction of the distribution of Brain-Score (a measure of model evaluation) across the fusiform gyrus and superior temporal gyrus, but it falls short of providing formal statistical testing or quantitative summaries, limiting its interpretability in supporting the authors' claims. Given that the authors employ temporal receptive field (TRF) analyses, the framework naturally allows for straightforward quantification of the proportion of neurons that significantly encode any linguistic features in the model, which could be reported by region as well as by stimulus condition (auditory vs. visual). Including such analyses would further strengthen the population-level interpretation of the results.

      Relatedly, the interpretation of "amodal" neurons is not sufficiently substantiated. The classification of neurons as modality-independent relies on encoding model performance across conditions, but the statistical criteria for establishing cross-modal generalization are not always clearly defined or rigorously tested. Without explicit comparisons (e.g., testing whether the same neurons significantly encode features in both modalities above chance, and whether this exceeds what would be expected under appropriate null models), the claim of modality-independent representation remains somewhat underdetermined.
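
      As a minimal sketch of the kind of null-model comparison suggested above, the following permutation test asks whether the number of neurons significant in both modalities exceeds what would be expected if significance in the two modalities were independent. The function name and the per-neuron boolean inputs are hypothetical, and the test treats neurons as exchangeable, which ignores region and patient structure; it is an illustration under those assumptions, not the authors' procedure.

```python
import numpy as np

def overlap_permutation_test(sig_a, sig_b, n_perm=10000, seed=0):
    """Test whether the count of neurons significant in both conditions
    exceeds chance, holding the per-condition counts fixed.

    sig_a, sig_b: boolean arrays, one entry per neuron (hypothetical inputs,
    e.g. "significant encoding in listening" and "in reading").
    """
    sig_a = np.asarray(sig_a, dtype=bool)
    sig_b = np.asarray(sig_b, dtype=bool)
    rng = np.random.default_rng(seed)

    observed = int(np.sum(sig_a & sig_b))

    # Null distribution: shuffle which neurons carry the second label,
    # which breaks any neuron-wise association between the two labels.
    null = np.empty(n_perm, dtype=int)
    for i in range(n_perm):
        null[i] = np.sum(sig_a & rng.permutation(sig_b))

    # One-sided p-value with the usual +1 correction.
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p_value

# Toy example: 100 neurons, 20 significant in each modality, fully overlapping.
sig_listen = np.zeros(100, dtype=bool)
sig_listen[:20] = True
sig_read = sig_listen.copy()
print(overlap_permutation_test(sig_listen, sig_read))
```

      In practice the shuffling would likely need to be restricted within region or within patient so that the null respects the recording structure.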

      More generally, the reliance on encoding models introduces some interpretational ambiguity. Although the observed dissociation between fusiform and superior temporal regions is consistent with orthographic and phonological processing, respectively, the feature spaces used in the models are partially linked to lower-level sensory properties (e.g., visual form and acoustic features). The authors' single-neuron results suggest these effects reflect genuine linguistic selectivity, but the findings do not uniquely distinguish between linguistic and perceptual explanations. While fully disentangling these factors may be beyond the scope of the current study, the manuscript could benefit from a brief discussion acknowledging these correlations or clarifying how lower-level sensory contributions were considered.

      Another limitation is that the proposed two-stage model of language processing is not directly tested against competing hypotheses. While the dissociation between modality-specific and amodal representations is consistent with this model, the authors note that higher-level features, such as syntax, may be encoded in a distributed or overlapping manner. These possibilities are not systematically tested, so the conclusions risk overinterpreting correlational patterns as evidence for a specific processing hierarchy. A more explicit discussion or quantitative consideration of these alternative accounts would strengthen the interpretation, while still allowing the two-stage model to be presented as a plausible framework.

    3. Reviewer #3 (Public review):

      Summary

      This paper analyzes human single-neuron activity recorded with Behnke-Fried electrodes during naturalistic listening and reading. The authors demonstrate a double dissociation between superior temporal gyrus neurons (responsive during listening but not reading) and fusiform gyrus neurons (responsive during reading but not listening), and report that these two classes of neurons show selectivity to specific phonological and orthographic features of the stimulus, respectively. Across the language network, the authors also report neurons whose responses are amodal (active during both listening and reading), which they organize into a modal-to-amodal processing hierarchy. A separate thread of analyses tracks the relationship between single-neuron spiking, micro-wire, and macro-wire signals across these regions. The authors interpret their findings as evidence for hierarchical processing across the language network and for a "compositional code" for orthography in reading.

      Strengths

      The dataset is rare and valuable. Simultaneous single-neuron, micro-wire, and macro-wire recordings during naturalistic reading and listening in the same patients are difficult to obtain, and the experimental design reflects substantial care. The cross-modality comparison at single-neuron resolution is a novel measurement, and the paper presents these results while also situating them against prior neuroimaging and intracranial work. The simultaneous availability of signals at three spatial scales within the human language network is an unusual and potentially important resource for the field.

      Weaknesses

      (1) Framing and novelty

      The paper appropriately situates its modality-selectivity findings against prior neuroimaging and intracranial work (citing Buchweitz et al. 2009 among others) and frames its novel contribution as bringing single-neuron resolution to a question that has previously been examined at population scales. This framing is fair as far as it goes. However, two issues remain. First, the paper does not engage with neuroimaging evidence that complicates its clean modality-selectivity story - most notably Wilson, Bautista, & McCarron (2018), who found that the dorsal superior temporal sulcus is activated by both intelligible and unintelligible inputs in both modalities. Several reconciliations of single-neuron modality selectivity with population-level cross-modal activation are possible (sparse coding, BOLD-vs-spiking dissociations, etc.), and the paper should engage with these possibilities. Second, the paper's discussion extends well beyond the modality-selectivity result that is its headline contribution, into broader claims about a "compositional code" for orthography and "hierarchical processing" across the language network. These broader claims are not supported by the analyses presented (see Weakness 3), and their inclusion distracts from and weakens the core finding rather than building on it. The paper would be stronger if these claims were either subjected to the population-level analyses they require or scaled back to exploratory observations.

      These framing issues are compounded by writing problems that obscure what the paper is claiming. Some passages, such as the assertion that the dataset "suggests an unprecedented examination of linguistic features across various brain regions at various resolutions," are not interpretable as written and should be rewritten.

      (2) Methodological concerns about the TRF analyses

      The selectivity findings in Figures 3 and 5 rest on temporal response function / temporal receptive field (TRF) analyses with several core issues.

      2.1) First, the construction of the TRF feature stream for the reading condition is not specified in the methods. Reading stimuli are presented in RSVP, with all letters of a word appearing simultaneously. How letter or letter-position features are mapped to a time-varying regressor reflects a substantive hypothesis about the psychological mechanisms of reading, with statistical consequences for what the TRF can recover and how reading and listening analyses can be compared.

      2.2) Second, the stimulus distribution limits which effects can be reliably estimated. While the design appears balanced for some features (e.g., subject gender and number), the features that drive the TRF analyses - particularly letter identity and position in the orthographic TRF - are unlikely to be well covered in a small stimulus set. This raises a concern about high-variance feature importance estimates.

      2.3) Third, the TRF feature set includes syntactic, semantic, and discourse predictors alongside phonological and orthographic features. The paper does not justify this choice in fitting single-neuron responses in STG and FSG, and the consequences for the unique-variance analyses are not discussed. Because syntactic features are correlated with phonological and orthographic features in natural stimuli (function words are short, have characteristic phoneme distributions, and so on), the unique variance attributed to each feature set depends on what is being controlled for. Including syntactic predictors when fitting STG or FSG neurons also risks inflating overall TRF fit by chance, particularly in the absence of cross-neuron correction.

      2.4) Fourth, there seems to be no correction for multiple comparisons across the neuron × feature grid. The within-neuron feature-importance procedure briefly described in the Figure 3 caption may help combat overestimates of feature importance within a single fit, but does not address the question of how many of the "selective" neurons reported across the paper would survive correction at the population level. With many neurons, many features, and a limited stimulus set, some neurons will appear selective to some features by chance alone, and these are likely to be the ones that appear as example panels in figures.
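
      To illustrate the concern in 2.4, here is a minimal sketch of a Benjamini-Hochberg false discovery rate correction applied across a neuron × feature grid of p-values. The grid size, the simulated p-values, and the function name are assumptions chosen for illustration; the manuscript's actual statistics are not reproduced here.

```python
import numpy as np

def benjamini_hochberg(p_values, alpha=0.05):
    """Return a boolean mask (same shape as p_values) marking the tests
    that survive Benjamini-Hochberg FDR control at level alpha."""
    p = np.asarray(p_values, dtype=float)
    flat = p.ravel()
    m = flat.size

    order = np.argsort(flat)
    ranked = flat[order]
    thresholds = alpha * np.arange(1, m + 1) / m

    # Step-up rule: find the largest rank k with p_(k) <= alpha * k / m
    # and reject every hypothesis with rank <= k.
    below = ranked <= thresholds
    keep = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])
        keep[order[:k + 1]] = True
    return keep.reshape(p.shape)

# Hypothetical grid: 200 neurons x 30 features, all null except one planted effect.
rng = np.random.default_rng(0)
p_grid = rng.uniform(size=(200, 30))
p_grid[0, 0] = 1e-8

uncorrected = p_grid < 0.05
corrected = benjamini_hochberg(p_grid, alpha=0.05)
print("uncorrected hits:", uncorrected.sum(), "after BH:", corrected.sum())
```

      With purely null p-values, roughly 5% of the grid passes an uncorrected threshold by chance, which is the pool from which apparently selective example neurons could be drawn.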

      Together, these issues mean the per-feature selectivity results cannot be interpreted as the paper currently interprets them. This is consequential because the per-feature selectivity findings underpin the paper's broader claims about a compositional code for orthography and about hierarchical processing across feature levels.

      (3) Claims that outrun the evidence

      Several of the paper's broader claims are not supported by the analyses presented.

      3.1) The authors claim a "compositional code" for orthography, in which single neurons code for the combination of letter identity and position. This claim is illustrated with two example neurons. A claim about a coding scheme is a population-level claim and requires a population-level analysis. A natural test would be a per-neuron model comparison between a TRF with letter identity alone and a TRF including letter identity × position interactions, controlled for model complexity, asking how many neurons show improved prediction with the interaction features. As noted above in §2.2, this analysis would also need to grapple with which letters and positions the data can support estimating. There is a potential connection to the data sparsity worries here: the n=2 example neurons may have the only selectivity profiles for which the relevant interactions could be estimated at all.

      3.2) The "hierarchical processing" claim is motivated by neurons selective to features at multiple levels - graphemes and sub-graphemes in reading, single phonemes and diphthongs in listening. This claim is not specified mechanistically. The paper does not state what kind of structural linguistic hierarchy is intended (segmental phonology to syllabic structure?), what kind of hierarchical neurocomputational mechanism is being proposed, or why selectivity at multiple levels of a feature hierarchy is evidence for that mechanism rather than for any other mechanism (e.g., parallel feature detectors). As written, the claim is too underspecified to evaluate.

      3.3) The "forked letters" finding (selectivity to k, v, w, y, z) is potentially confounded with letter frequency and co-occurrence structure. These letters are low-frequency, with some exhibiting strong positional asymmetries, and they infrequently co-occur with other letters. Under the unique-variance analysis, decorrelation from other features inflates apparent unique variance even in the absence of genuine selectivity.

      3.4) The word-length effect in Figure 4 is established by PCA on the top five fusiform neurons, with no analysis showing the effect is qualitatively similar across a broader selection. Beyond establishing that something varies with word length, the paper makes no substantive claim about what the neural code represents - for instance, whether it reflects letter- or word-specific processing or a more general visual response to stimulus extent. Prior intracranial work has reported word-length effects in regions posterior to the VWFA but not within it (Thesen et al. 2012), raising the question of whether the effect reported here reflects letter-specific processing or a more general visual response that happens to correlate with stimulus extent.
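
      The per-neuron model comparison proposed in 3.1 could be sketched roughly as follows. This is a simplified stand-in, assuming a plain ridge regression without time lags rather than a full TRF, and using simulated features; all names and data are hypothetical. Held-out prediction is used so that the larger model is not favored simply for having more parameters.

```python
import numpy as np

def cv_r2(X, y, n_folds=5, ridge=1.0, seed=0):
    """Cross-validated R^2 of a ridge regression of y on X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)

    pred = np.empty(n)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        # Center using training data only to avoid leakage into the test fold.
        mu_x = X[train].mean(axis=0)
        mu_y = y[train].mean()
        Xc = X[train] - mu_x
        w = np.linalg.solve(Xc.T @ Xc + ridge * np.eye(X.shape[1]),
                            Xc.T @ (y[train] - mu_y))
        pred[test_idx] = (X[test_idx] - mu_x) @ w + mu_y

    ss_res = np.sum((y - pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Simulated neuron whose response depends on an identity x position interaction.
rng = np.random.default_rng(1)
identity = rng.normal(size=(400, 3))   # stand-in for letter-identity features
position = rng.normal(size=(400, 1))   # stand-in for a position feature
interaction = identity * position      # identity x position terms
y = interaction[:, 0] + 0.1 * rng.normal(size=400)

base = np.hstack([identity, position])
full = np.hstack([base, interaction])

r2_base = cv_r2(base, y)
r2_full = cv_r2(full, y)
print("held-out R^2, identity + position:", r2_base)
print("held-out R^2, with interactions:  ", r2_full)
```

      Repeating this comparison for every neuron, and then correcting across neurons, would give the population-level count of units that benefit from the interaction terms.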

      (4) Missed opportunities

      Several aspects of the paper are not so much wrong as underdeveloped, in ways that the authors are well-positioned to address.

      4.1) The cross-scale comparison between single-neuron, micro-wire, and macro-wire signals is presented descriptively, without articulating what conclusion these analyses support about the relationship between scales of measurement. Given the rarity of simultaneous recordings at these scales, this is a substantial missed opportunity. The rasters in Figure 2 visually suggest a tight relationship between spiking and micro-population activity that is not evident in the summary in Figure 2g. This discrepancy is not explained. Characterizing the functional and temporal relationship linking spike rates to micro- and macro-HGA is a substantive scientific question, and the paper is well-positioned to address it.

      4.2) The stimuli include controlled grammatical manipulations, but these manipulations are used as nuisance regressors in the TRF analyses rather than as the object of structured analysis. A design with controlled comparisons is being treated as if it were unconstrained naturalistic stimulation, which underuses the experimental structure the authors built.

      4.3) Finally, the paper foregrounds the dataset as a contribution but does not describe data sharing plans. Given that several of this review's recommendations call for analyses the authors have not yet done, the long-term value of the dataset to the community will depend substantially on what is shared and how.

      Buchweitz, A., Mason, R. A., Tomitch, L. M., & Just, M. A. (2009). Brain activation for reading and listening comprehension: An fMRI study of modality effects and individual differences in language comprehension. Psychology & neuroscience, 2(2), 111-123.

      Jobard, G., Vigneau, M., Mazoyer, B., & Tzourio-Mazoyer, N. (2007). Impact of modality and linguistic complexity during reading and listening tasks. Neuroimage, 34(2), 784-800.

      Thesen, T., McDonald, C. R., Carlson, C., Doyle, W., Cash, S., Sherfey, J., Felsovalyi, O., Girard, H., Barr, W., Devinsky, O., Kuzniecky, R., & Halgren, E. (2012). Sequential then interactive processing of letters and words in the left fusiform gyrus. Nature communications, 3, 1284.

      Wilson, S. M., Bautista, A., & McCarron, A. (2018). Convergence of spoken and written language processing in the superior temporal sulcus. Neuroimage, 171, 62-74.

    1. Reviewer #1 (Public Review):

      The medial reticular formation (MRF) in the brainstem has long been implicated in the regulation of locomotion. One common - albeit very simple - model often presents the MRF as a major relay station receiving inputs from MLR circuits, among other brain regions, that together convey locomotor signals through efferent projections targeting the caudal brainstem and the spinal cord. Yet, the MRF is a particularly large brain area whose cellular complexity is far from understood. How molecularly distinct MRF ensembles contribute to the regulation of locomotor behaviors is largely unknown. Here, the authors apply focal activation of either glutamatergic, GABAergic, or serotonergic neurons throughout the MRF using a chemogenetic gain-of-function approach to uncover the putative modulatory properties of these neuronal ensembles during walking. Using kinematic analysis of mouse limbs during self-paced over-ground walkway locomotion, the authors find that activation of GABAergic MRF neurons can selectively slow down walking, whereas activation of glutamatergic neurons can induce a specific "shuffle" limb trajectory, altogether revealing that distinct MRF populations may retain the capability to engage divergent walking signatures, whose behavioral relevance is not yet clear. In contrast, the activation of serotonergic neurons did not affect walking signatures as described for the other two subgroups but led to an increase in locomotor speed. Interestingly, MRF neurons in each regional activation "hotspot" appear to target different domains in the lumbar spinal cord, suggesting that distinct circuit mechanisms are at play for the slowmo vs shuffle effects.

      Major points:

      1. While the experiments are carefully done and the results are well analyzed and clearly presented in a series of beautiful figures, several aspects of the methodology remain very confusing. In particular, the initial choice for the injection coordinates is not justified and the authors don't leverage the mapping of spinal projection neurons to drive their chemogenetic screen. Similarly, the authors group very different injection schemes (unilateral or bilateral targeting of MRF neurons) that should be analyzed separately. The choice of Z score cutoff that dictates the in-depth analysis of the chemogenetic phenotypes appears arbitrary and is not grounded in a set of objective criteria.

      2. One issue that arises from the work presented here is that we don't know if these MRF neurons are active during locomotion in normal, unperturbed conditions. Knowing the recruitment profile of these MRF neurons would clarify whether the chemogenetic activation boosts the firing of neurons that are already active during walking, or activates neurons that are otherwise silent. Disentangling these possibilities may have a profound impact on the overall interpretation of the results.

      3. The results should be discussed in the broader context of historic stimulation experiments, notably in cats and other species, as well as more recent circuit mapping approaches in rodents. For instance, the notion that focal stimulation of distinct areas within the MRF can elicit or modify the pattern of locomotion is not really new, nor is the notion that some of these modulations are phase-specific and can influence the duration of single muscle activation during stance or swing phases. This last point has, for instance, already been assessed through individual muscle recordings paired with MRF stimulation in cats. Better introducing these key studies, together with a thorough discussion of what the results presented in this manuscript bring in terms of novelty, would help readers ground this work in a larger body of work.

    2. Reviewer #2 (Public Review):

      This paper is an interesting conceptual work where certain hotspot areas were found to induce unique gait patterns. These patterns differed from a classic change in speed or gait pattern from a walk to a gallop. From this, a hypothesis was formed that these areas could be important for possible alternative walking patterns seen, for example, during pathologies such as Parkinson's disease or perhaps related to stalking behaviors.

      While I liked the work and found it interesting, it remains descriptive in that the actual behaviors observed can't be causally related to a particular behavior such as stalking or shuffling. If the necessity or sufficiency of this region was related to a specific hunting behavior, for example, its interest to the field would be greater.

      Nevertheless, this paper does contribute to growing evidence that specific behaviors can be triggered by specific neuronal populations within the brainstem.

    1. Reviewer #1 (Public review):

      Summary:

      The authors considered the mechanism underlying previous observations that H2A.Z is preferentially excluded from methylated DNA regions. They considered two non-mutually exclusive mechanisms. First, they tested the hypothesis that nucleosomes containing both methylated DNA and H2A.Z might be intrinsically unstable due to their structural features. Second, they explored the possibility that DNA methylation might impede SRCAP-C from efficiently depositing H2A.Z onto these DNA methylated regions.

      Their structural analyses revealed subtle differences between H2A.Z-containing nucleosomes assembled on methylated versus unmethylated DNA. To test the second hypothesis, the authors allowed H2A.Z assembly on sperm chromatin in Xenopus egg extracts and mapped both H2A.Z localization and DNA methylation in this transcriptionally inactive system. They compared these data with corresponding maps from a transcriptionally active Xenopus fibroblast cell line. This comparison confirmed the preferential deposition or enrichment of H2A.Z on unmethylated DNA regions, an effect that was much more pronounced in the fibroblast genome than in sperm chromatin. Furthermore, nucleosome assembly on methylated versus unmethylated DNA, along with SRCAP-C depletion from Xenopus egg extracts, provided a means to test whether SRCAP-C contributes to the preferential loading of H2A.Z onto unmethylated DNA.

      Strengths:

      The strength and originality of this work lie in its focused attempt to dissect the unexplained observation that H2A.Z is excluded from methylated genomic regions.

      Weaknesses:

      The study has two weaknesses. First, although the authors identify specific structural effects of DNA methylation on H2A.Z-containing nucleosomes, they do not provide evidence demonstrating that these structural differences lead to altered histone dynamics or nucleosome instability. Second, building on the elegant work of Berta and colleagues (cited in the manuscript), the authors implicate SRCAP-C in the selective deposition of H2A.Z at unmethylated regions. Yet the role of SRCAP-C appears only partial, and the study does not address how the structural or molecular consequences of DNA methylation prevent efficient H2A.Z deposition. Finally, additional plausible mechanisms beyond the two scenarios the authors considered are not investigated or discussed in the manuscript.

      Comments on revisions:

      The authors have addressed all previously raised concerns and propose a revised version of the manuscript. Notably, the abstract and discussion sections have been improved, and new experimental data have been incorporated. Collectively, these revisions enhance the rigor and clarity of the data interpretation and discussion.

      Given these improvements, this reviewer believes that the manuscript could be published, particularly if this publication is accompanied by the critical points discussed in the rebuttal letter.

    2. Reviewer #2 (Public review):

      This manuscript aims to elucidate the mechanistic basis for the long-standing observation that DNA methylation and the histone variant H2A.Z occupy mutually exclusive genomic regions. The authors test two hypotheses: (i) that DNA methylation intrinsically destabilizes H2A.Z nucleosomes, thereby preventing H2A.Z retention, and (ii) that DNA methylation suppresses H2A.Z deposition by ATP-dependent chromatin-remodelling complexes. The revised manuscript addresses a number of previous concerns, and the manuscript has therefore improved accordingly. However, several limitations remain.

      Comments on revisions:

      The authors have addressed a number of my previous concerns, and the manuscript has improved accordingly. However, several limitations remain that, in my view, constrain the strength of the conclusions. In particular, a direct comparison with a canonical nucleosome assembled on the same DNA template is absent. This control is essential to determine whether the observed effects are specific to H2A.Z or reflect more general properties of methylated DNA-nucleosome interactions. Notably, even within the authors' own data, there is a trend suggesting that methylated canonical H2A nucleosomes may also exhibit increased accessibility. Although this does not reach statistical significance, the authors themselves argue that subtle differences can be biologically meaningful; it is therefore plausible that extended digestion conditions (e.g., longer HinfI exposure) could reveal a significant effect. Unless a direct structural comparison with a canonical nucleosome is performed, the possibility that the reported phenomenon is not specific to H2A.Z remains. This is compounded by the reliance on a single restriction enzyme-based assay, which represents a limited experimental approach. Such an approach is insufficient to unequivocally support the central claim that DNA methylation increases accessibility of H2A.Z-containing nucleosomes. Additional orthogonal assays would be required to substantiate this conclusion. With respect to the cryo-EM analysis of methylated and unmethylated 601L H2A.Z nucleosomes, and in general, the authors still do not adequately consider the positional context of CpG methylation. Extensive literature demonstrates that the effects of DNA methylation on canonical nucleosome structure and stability are highly position-dependent. Without accounting for the location of methylated CpGs relative to key DNA-histone contact sites, the structural data remain difficult to interpret mechanistically.

      Overall, while the manuscript has improved, it remains a relatively limited study that draws broad mechanistic conclusions from minimal experimental data.

    3. Reviewer #3 (Public review):

      Summary:

      Histone variant H2A.Z is evolutionarily conserved among various species. The selective incorporation and removal of histone variants on the genome play crucial roles in regulating nuclear events, including transcription. Shih et al. aimed to address antagonistic mechanisms between histone variant H2A.Z deposition and DNA methylation. To this end, the authors reconstituted H2A.Z nucleosomes in vitro using methylated or unmethylated human satellite II DNA sequence and examined how DNA methylation affects H2A.Z nucleosome structure and dynamics. The cryo-EM analysis revealed that DNA methylation induces a more open conformation in H2A.Z nucleosomes. Consistent with this, their biochemical assays showed that DNA methylation subtly increases restriction enzyme accessibility in H2A.Z nucleosomes compared with canonical H2A nucleosomes. The authors identified genome-wide profiles of H2A.Z and DNA methylation using genomic assays and found their unique distribution between Xenopus sperm pronuclei and fibroblast cells. Using Xenopus egg extract systems, the authors showed that the SRCAP complex, the chromatin remodeler for H2A.Z deposition, preferentially binds to unmethylated DNA to deposit H2A.Z.

      Strengths:

      The experiments are rigorously performed, and interpretations are clear. The study presents a high-resolution cryo-EM structure of human H2A.Z nucleosome with methylated DNA. Although the effect of DNA methylation on the physical stability of the H2A.Z nucleosome is subtle, this would be an important finding that warrants further functional investigation. The discovery that the SRCAP complex senses DNA methylation is novel and provides important mechanistic insight into the antagonism between H2A.Z and DNA methylation.

      Weaknesses:

      The authors have satisfactorily addressed my concerns.

    1. Reviewer #2 (Public review):

      Summary:

      This is a laudable effort to help dissect the contributions of type I and type III IFNs to the antiviral response in chicken and therefore represents an important piece of work, not least in the light of birds being a key carrier and worldwide distributor of influenza virus. The first part of the study characterises the generation of IFNAR and IFNLR KO chicken strains and describes basic differences. Four different viruses are then tested in chicken embryos, while the subsequent analysis of the antiviral response in vivo is performed with one influenza H3N1 strain.

      Strengths:

      Having these two KO chicken strains as a tool is a great achievement. The initial analysis is solid. Clear effect of IFNAR deficiency in in vivo infection, less so for IFNLR deficiency.

      Weaknesses:

      (1) The antibody induction by KLH immunisation: We still don't know whether or not this vaccination induces IFN responses in wt mice, so it is still not possible to judge whether the effects observed are due to steady-state differences or to differential effects of IFN induced during the vaccination phase. Pre-immune results are now shown and are indeed zero. As suggested, the whole figure 4 is now condensed into one or two panels by proper calculation of Ab titers - would these titres be significantly different? This, like all of the other in vivo experiments, has not been repeated, if I understand the methods section correctly. I understand that there are three R restrictions that are tighter in some countries, and I accept that with the numbers used here, some statistical significance is reached, but this is for instance not the case for survival.

      (2) The basic conundrum here and in later figures is now addressed by the authors in the discussion: Situations where IFN type 1 and 3 signalling deficiency each have an independent effect (i.e. fig.4d) suggest that they act by separate, unrelated mechanisms. However, all the literature about these IFN families suggests that they show almost identical signalling and gene induction downstream of their respective receptors. How can the same signalling, clearly active here downstream of the receptors for IFN type 1 or type 3, be non-redundant, i.e. why does the unaffected IFN family not stand in? The mouse studies, which showed a rather subtle phenotype when only one of the two IFN systems was missing, but a massive reduction in virus control in double KO mice, are discussed, but a clear-cut explanation for the differences has not been reached. Reasons could be a direct effect of IFNab on B cells and an indirect effect of IFNL through non-B cells, timing issues, and many other scenarios can be envisaged. The authors do not address this question experimentally, which limits the depth of analysis; they have, however, now included a discussion of this dilemma.

      (3) In the one in vivo experiment performed with chickens, only one virus was tested; more influenza strains should be included, as well as non-influenza viruses. I appreciate that this is logistically difficult.

      (4) The basic conundrum of point 2 applies equally to Fig. 6a: both KOs have a phenotype. Again, in 6d, both IFNs appear to be separately required for Mx induction. An explanation has been attempted, but more experiments, for instance looking at different time points to understand if we are dealing simply with different kinetics of the response, have not been attempted, despite the fact that such experiments are likely not covered by strict three R rules.

      (5) The in vivo infection is the most interesting experiment, and the key outcome here is that IFN type 1 is crucial for anti-H3N1 protection in chickens, while type 3 is less impactful. However, this experiment suffers from the different time points when chickens were culled, so many parameters are impossible to compare (e.g. weight loss, histopathology). Some explanation is given as to the comparisons chosen here, but a more thorough analysis at several time points would have strengthened this study.

      Comments on revised version:

      In the rebuttal, the authors have gone to some length to add to the discussion of the experiments, and some aspects are better explained now than before. Many of these explanations remain speculative however, so the study remains inconclusive in several aspects. As no new data was added, my overall judgement of this study remains unchanged.

    1. Reviewer #1 (Public review):

      Summary:

      Ducrocq et al. present research exploring the genetic link between simple multicellular group formation (ace2Δ/ace2Δ) and its interaction with cell-cycle progression mutants (e.g., cln3Δ/cln3Δ), demonstrating that this combination can provide fitness benefits during fluctuating resource conditions, resulting in a rapid increase in the fraction of multicellular cell-cycle mutants over unicellular yeast without selection for multicellular size. Because both the multicellular phenotype and the regulatory link enabling faster escape from the stationary phase are controlled by the ACE2 transcription factor, this work demonstrates that multicellular cluster formation can arise as a side effect of a completely independent fitness advantage unrelated to the benefits of group formation itself. As a "passenger phenotype," multicellularity could thus emerge for other selective reasons, potentially facilitating a later transition to more entrenched multicellularity if novel conditions arise that make multicellular group formation directly beneficial.

      Importantly, while the literature generally assumes that multicellular group formation incurs a cell-level fitness cost, this work demonstrates that certain genetic-environmental interactions can confer fitness benefits even at the level of individual cells forming multicellular groups. This finding should inspire both theoretical and empirical work exploring multicellular group formation selected for benefits at the level of individual cells, rather than the benefits of forming a larger organismal size that most work has relied on so far.

      Strengths:

      This work is novel and exciting for research exploring the very first steps of the transition from unicellularity to simple multicellularity. The formation of multicellular groups is almost always assumed to come at a cell-level fitness cost due to reduced reproductive fitness compared to remaining unicellular, which generally needs to be outweighed by the benefits of multicellular group formation (e.g., large size to escape predation) for the multicellular phenotype to be stable. However, this study presents an interesting case of a genetic and environmental condition under which individual cells forming simple multicellular clusters can actually have higher reproductive fitness than solitary living yeast cells. This contrasts with previous snowflake yeast studies where the multicellular phenotype was primarily beneficial due to strong selection for large groups (rather than cell-level fitness gains).

      The claims and interpretation of the results align well with the data presented. This is due to the careful and straightforward experimental design testing predictions with a clear, stepwise methodology. The authors rule out alternative explanations and provide support for the proposed link between the mutations (ace2, cln3, and others), their impact on faster exit from quiescence and earlier entry into reproduction in fresh media, and the resulting higher fitness in the snowflake yeast phenotype compared to unicellular yeast.

      This experimental framework (combining cell-cycle mutants under the same multicellular background) is very much likely to be adopted by others in the community to explore downstream implications of these results in laboratory and environmental yeast isolates.

      Weaknesses:

      The authors show that the same multicellular phenotype with higher cell-level fitness due to faster exit from the stationary phase can also be observed with alleles found at other loci in non-laboratory yeast strains, implying that the results are likely not specific to a peculiar case genetically engineered in laboratory strains, but that similar phenotypes may be present in nature. However, this remains to be explored by examining the natural ecology of commercially available or wild yeast isolates and their genomes. This is not a weakness of this study per se, but rather a direction for future work. It does mean, however, that the relevance of these findings for early multicellularity in yeast, and even more so for nascent multicellularity in distinct taxa, remains to be explored in the future. Until then, it is difficult to make strong claims about how applicable these results would be for non-laboratory yeast and other taxa. Regardless, this work represents a very exciting finding.

      Comments on revised version:

      The authors addressed all concerns thoroughly.

    1. Reviewer #1 (Public review):

      Summary:

      Morgan et al. studied how paternal dietary alteration influenced testicular phenotype, placental and fetal growth using a mouse model of paternal low protein diet (LPD) or Western Diet (WD) feeding, with or without supplementation of methyl-donors and carriers (MD). They found diet- and sex-specific effects of paternal diet alteration. All experimental diets decreased paternal body weight and the number of spermatogonial stem cells, while fertility was unaffected. WD males (irrespective of MD) showed signs of adiposity and metabolic dysfunction, abnormal seminiferous tubules and dysregulation of testicular genes related to chromatin homeostasis. Conversely, LPD induced abnormalities in the early placental cone, fetal growth restriction and placental insufficiency, which was partly ameliorated by MD. The paternal diets changed placental transcriptome in a sex-specific manner and led to a loss of sexual dimorphism in the placental transcriptome. These data provide a novel insight on how paternal health can affect the outcome of pregnancies, which is often overlooked in prenatal care.

      Strengths:

      The authors have performed a well-designed study using commonly used mouse models of paternal underfeeding (low protein) and overfeeding (Western diet). They performed comprehensive phenotyping at multiple timepoints including of the fathers, the early placenta and late gestation feto-placental unit. The inclusion of both testicular and placental morphological and transcriptomic analysis is a powerful non-biased tool for such exploratory observational studies. The authors describe changes in testicular gene expression revolving around histone (methylation) pathways that are linked to altered offspring development (H3.3 and H3K4), which is in line with hypothesised paternal contributions to offspring health. The authors report sex differences in control placentas that mimic those in humans, providing potential for translatability of the findings. The exploration of sexual dimorphism (often overlooked) and its absence in response to dietary modification is novel and contributes to the evidence-base for the inclusion of both sexes in developmental studies.

      Comments on revised version:

      The authors have done a great job addressing my concerns. The description of the data analysis and the figures are now much clearer. The inclusion of the potential links between the microbiome and male reproductive fitness is informative and improves the flow of the discussion.

    2. Reviewer #2 (Public review):

      Summary:

      The authors investigated the effects of a low-protein diet (LPD) and a high sugar- and fat-rich diet (Western diet, WD) on paternal metabolic and reproductive parameters and feto-placental development and gene expression. They did not observe significant effects on fertility; however, they reported gut microbiota dysbiosis, alterations in testicular morphology, and severe detrimental effects on spermatogenesis. In addition, they examined whether the adverse effects of these diets could be prevented by supplementation with methyl donors. Although LPD and WD showed limited negative effects on paternal reproductive health (with no impairment of reproductive success), the consequences on fetal and placental development were evident and, as reported in many previous studies, were sex-dependent.

      Strengths:

      This study is of high quality and addresses a research question of great global relevance, particularly in light of the growing concern regarding the exponential increase in metabolic disorders, such as obesity and diabetes, worldwide. The work highlights the importance of a balanced paternal diet in regulating the expression of metabolic genes in the offspring at both fetal and placental levels. The identification of genes involved in metabolic pathways that may influence offspring health after birth is highly valuable, strengthening the manuscript and emphasizing the need to further investigate long-term outcomes in adult offspring.

      The histological analyses performed on paternal testes clearly demonstrate diet-induced damage. Moreover, although placental morphometric analyses and detailed histological assessments of the different placental zones did not reveal significant differences between groups, their inclusion is important. These results indicate that even in the absence of overt placental phenotypic changes, placental function may still be altered, with potential consequences for fetal programming.

      Comments on revised version:

      The authors have adequately addressed all my previous comments.

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      In this manuscript, the authors employ diaphragm denervation in rats and mice to study titin-based mechanosensing and longitudinal muscle hypertrophy. By integrating bulk RNA-seq, proteomics, and phosphoproteomics, they map the stretch-responsive signalling landscape, uncovering robust induction of the muscle-ankyrin-repeat proteins (MARP1-3) together with enhanced phosphorylation of titin's N2A element.

      Genetic ablation of MARPs in mice amplifies longitudinal fibre growth and is accompanied by activation of the mTOR pathway, whereas systemic rapamycin treatment suppresses the hypertrophic response, highlighting mTORC1 as a key downstream effector of titin/MARP signalling.

      Strengths:

      The authors address a clear biological question: "how titin-associated factors translate mechanical stretch into longitudinal fibre growth" using a unique and clinically relevant animal model of diaphragm denervation. Using a comprehensive multiomics approach, the authors identify MARPs as potential mediators of these effects and use a genetic mouse model to provide compelling evidence supporting causality. Additionally, connecting these findings to rapamycin, a drug widely used clinically, further increases the relevance and potential impact of the study.

    2. Reviewer #2 (Public review):

      Summary:

      Muscle hypertrophy is a major regulator of human health and performance. Here, van der Pijl and colleagues assess the role of the giant elastic protein, titin, in regulating the longitudinal hypertrophy of diaphragm muscles following denervation. Interestingly, the authors find an early hypertrophic response, with 30% new serial sarcomeres added within 6 days, followed by subsequent muscle atrophy. Using RBM20 mutant mice, which express a more compliant titin, the authors discovered that this longitudinal hypertrophy is mediated via titin mechanosensing. An omics approach suggests that the muscle ankyrin repeat proteins (MARPs) may regulate this response. Genetic ablation of MARPs 1-3 blocks the hypertrophic response, although single knockouts are more variable, suggesting extensive complementation between these titin binding proteins. Finally, it is found through the administration of rapamycin that the mTOR signalling pathway plays a role in longitudinal hypertrophic growth.

      Strengths:

      This paper is well written and uses an impressive suite of genetic mouse models to address this interesting question of what drives longitudinal muscle growth.

      Weaknesses:

      While the findings are of interest, they lack sufficient mechanistic detail in the current state to separate cross-sectional versus longitudinal hypertrophy. The authors have excellent tools, such as the RBM20 model, to functionally dissect the contribution of mTOR signalling to these processes. It is also unclear if this process is unique to the diaphragm or is conserved across other muscle groups during eccentric contractions.

    1. Reviewer #1 (Public review):

      Summary:

      Deng and colleagues pursue the possibility that red light exposure can provide some benefits and anti-senescence effects in aged mouse models. In addition, they show how red light influences metabolism in cultured keratinocytes. The authors provide a long dissection of the potential paths involved in the changes promoted by red light exposure, identifying CytC oxidase, SIRT4, PPARa and MCD as key players.

      Strengths:

      The authors did a thorough exploration of the multiple potential avenues by which red light exposure influences metabolism. The in vitro and in vivo evidence nicely complement each other.

      Weaknesses:

      This is a challenging hypothesis that would require some additional experimental controls. The pathway dissection, while extensive, is sometimes approached in unconvincing ways, and the results are not always evident to judge or interpret. Technically, the western blots and transcriptomic analyses require notable improvements.

    2. Reviewer #2 (Public review):

      Summary:

      This work identifies a previously unknown way that red light can slow ageing. The authors show that red light lowers the level of a protein called SIRT4 in skin cells. Reducing SIRT4 boosts fatty acid use and increases a type of histone modification that keeps genes active. These changes help cells clear away signs of ageing, reduce inflammation, and restore normal metabolism. The findings open the possibility of developing new treatments that target SIRT4 to reverse age‑related decline.

      Strengths:

      The evidence is solid because the authors use several complementary methods. They test red light in both cultured cells and naturally aged mice, and they confirm the key role of SIRT4 by silencing its gene. Measurements of metabolism, protein changes, and ageing markers all point in the same direction. However, the exact way red light lowers SIRT4 levels is not fully explained, which leaves a minor gap. Overall, the conclusions are well supported and convincing.

      Weaknesses:

      The paper does not go on to use its mechanistic discoveries to help the community identify the mechanism of photobiomodulation, which remains unknown.

      I would like to draw attention to a recently published paper by Herrera et al. (FEBS Letters 2025, doi:10.1002/1873-3468.70195), which shows that red light (660 nm) stimulates mitochondrial fatty acid oxidation in keratinocytes via AMPK‑dependent phosphorylation of ACC, without altering expression of electron transport chain complexes. I believe this paper is highly complementary to the current study.

      Herrera et al. demonstrate that red light increases basal, ATP‑linked, and maximal oxygen consumption rates in keratinocytes specifically through enhanced fatty acid oxidation (inhibited by etomoxir). This independently validates the central finding of the current manuscript, i.e., red light boosts lipid metabolism, strengthening the robustness of this concept.

      While the current manuscript focuses on the SIRT4‑MCD axis, Herrera et al. identify AMPK phosphorylation and ACC inhibition as key effectors. The authors can integrate and expand their discussion, since SIRT4 downregulation may converge on AMPK activation, or they may represent parallel, reinforcing mechanisms. This would enrich the mechanistic model and open new hypotheses.

      The mechanism of photobiomodulation: Herrera et al. explicitly challenge the prevailing paradigm that red light acts solely via cytochrome c oxidase (by showing long‑lasting effects, unchanged OXPHOS protein levels, and no difference in permeabilised cells). The current finding (red light acts through SIRT4 downregulation, i.e., not direct enzymatic activation) aligns perfectly with Herrera et al.'s critique.

      Long‑term metabolic effects: Herrera et al. show that a single red light exposure elevates oxygen consumption for up to 2 days. The current study focuses on changes at 12‑24 h. Their data extend the time window and suggest that the metabolic reprogramming described here may persist longer than currently discussed, which is clinically relevant.

      Discussing Herrera et al.'s results would not only acknowledge independent, corroborating evidence but would also allow the authors to position their SIRT4‑centric mechanism within a broader, emerging understanding of red‑light photobiomodulation.

    1. Reviewer #1 (Public review):

      Summary:

      The manuscript has several strengths, including a technically comprehensive approach that combines mouse genetics, electrophysiology, live imaging in assembloids, and human organoid models, providing a rich and multifaceted dataset. Cross-species validation through the parallel use of mouse and human systems strengthens the generality of the observed phenotypes and increases relevance to human neurodevelopment.

      Consistent phenotypic observations across systems show that ARHGEF6 loss affects migration, neurite morphology, growth cone structure, and neuronal survival, supporting a coherent role in cytoskeletal regulation.

      There is clear evidence for developmental defects, including reduced interneuron numbers, increased apoptosis in the ganglionic eminences, and migration deficits, all well supported by quantitative analyses. Also, there is a high-quality electrophysiological characterization that demonstrates reduced firing in interneurons, providing a well-controlled functional phenotype.

      Weaknesses:

      Despite the strengths mentioned above, the study has some conceptual and experimental weaknesses that reduce its impact. The mechanistic insight is limited, as the research does not directly establish how ARHGEF6 regulates downstream signaling pathways.

      Also, there is insufficient evidence for interneuron specificity; although the central claim is that ARHGEF6 plays a selective role in interneurons, the data do not adequately exclude the possibility that the observed effects reflect broader neuronal defects. The study lacks critical controls across cell types, as several phenotypes observed in organoids and progenitors, including apoptosis, reduced neuronal output, and altered morphology, could also affect multiple neuronal populations without being directly tested. Furthermore, the data are predominantly descriptive, with many results remaining correlative and failing to establish causal relationships.

      Some more comments:

      (1) Given that ARHGEF6 is a guanine nucleotide exchange factor for Rac1 and Cdc42, the absence of direct measurements of GTPase activity or downstream signaling represents a significant gap. The interpretation that the observed phenotypes are mediated through specific cytoskeletal pathways, therefore, remains inferential.

      (2) The manuscript repeatedly interprets the findings as interneuron-specific. However, several key observations are not demonstrated to be restricted to IN. Without direct comparison to excitatory neurons or other cell types, it is difficult to conclude that ARHGEF6 plays a selective role in interneurons rather than a more general role in neuronal development. The well-done analysis of the transcriptomic dataset is not sufficient to claim IN specificity. This issue is particularly important for the interpretation of the human organoid experiments, where reductions in SOX2⁺ progenitors and NEUN⁺ neurons, as well as increased apoptosis, could reflect global developmental defects. Similarly, in the mouse experiments, the reduction in GAD67⁺ cells is compelling, but it is not shown whether other neuronal populations are also affected.

      (3) The study provides a strong phenotypic description but limited causal resolution. For example, migration defects, altered growth cone morphology, and reduced branching are all consistent with impaired cytoskeletal regulation, but the links between these phenotypes are not directly established. Likewise, while the electrophysiological data convincingly show reduced firing in interneurons, the connection between altered cytoskeletal dynamics and intrinsic excitability is not explored.

      (4) Several aspects of data presentation could be improved. In multiple figures (e.g., Figure 1A, D; Figure 4; Videos S1 and S2), the images are difficult to interpret due to high cellular density, limited magnification, or lack of clear annotation. In some cases, it is not fully clear how quantifications were performed or which regions were analyzed. Improving the visual clarity with arrows, boxes, and high-magnification insets would strengthen confidence in the conclusions.

    2. Reviewer #2 (Public review):

      The authors investigate the impact of the deletion of the small GTPase regulator ARHGEF6 on the development and physiology of interneurons. Using public databases, they first show that ARHGEF6 is enriched in interneurons or in areas that give rise to them, both in development and adulthood, in humans and mice. Using a previously reported complete KO mouse and a GAD67-GFP reporter mouse line, they show that in the adult mouse cortex and hippocampus, there is a notable reduction in GFP+ cells. These mice show increased numbers of apoptotic cells at different timepoints and in different areas of the brain during development. In the developing cortex of ARHGEF6-KO mice, there are fewer IN in all layers, and the cells have incorrectly oriented processes. IN from the hippocampus in culture show reduced excitability and impaired neurite branching. The authors then established isogenic hiPSC lines to study ARHGEF6 deletion in human cells and differentiated ventral forebrain neurons, finding both interneuron-related and non-related phenotypes. Most importantly, human interneurons grown in organoids show reduced branching and altered growth cone morphology. The authors claim that the novel interneuron phenotypes found in these models can explain, in part, the human intellectual disabilities associated with mutations in this protein. The study is well conducted and opens new avenues of research not only for the role of small GTPase regulation in early nervous system development, but also for how interneuron deficiencies impact a wider range of intellectual disability syndromes found in humans.

      However, most conclusions of the present version would be strengthened after considering the following comments:

      Major comments

      (1) The reported biological processes evaluated at different developmental stages may be directly or indirectly related to ARHGEF6 function itself. As a model of a hereditary disease, full organism gene deletion is valid, since the human patients suffer from that condition as well. However, to investigate the roles of a protein, complete deletions may not be very accurate since they can give rise to phenotypes that are only indirectly related to the protein function itself. Most conclusions of the present manuscript should either be discussed in this regard or be supported by added evidence for a direct role of the protein. Such evidence is typically obtained with acute knockdowns in culture, or in developing brains by in utero electroporation. For example, Figure 1C shows that the principal excitatory neurons in the hippocampus do not express ARHGEF6. However, most electrophysiological and behavioral evidence of defects in ARHGEF6-KO mice arises from evaluating these cells (Ramakers et al., 2012). I am not suggesting that either the previous or the current evidence is wrong. But I believe readers would benefit from a clear distinction (or added cautionary notes) between a functional consequence of the deletion (which can arise months later and in cells other than those carrying the actual molecular defect) and a true cell biological function of the protein under study. In favor of the authors, this is a concern with most conclusions derived from KO organisms.

      (2) Figure 1E-I. All conclusions are made with a GAD67-GFP reporter, which is a very powerful and reliable tool for large-scale screening. All the conclusions of the paper would be strengthened if immunohistochemical staining for specific interneuron markers in the same areas were added as complementary evidence.

      (3) Cell death in development: It is surprising that the high amount of TUNEL staining during development does not translate into gross histological changes in the adult brain (studied elsewhere). Can the authors discuss possible explanations?

      (4) Section 4 (Figures 2F-J) - The authors present this staining as an analysis of migration. Normally, migration studies are performed with a "pulse-chase" paradigm, where a single cohort is labeled and then followed over time (normally by in utero electroporation of a fluorescent protein). Tissue is then fixed at different time points, and migration can be followed. On the contrary, the evidence is from a single point, in an experimental setting in which all Gad67 IN are stained, and hence, one cannot imply a defect in migration. The differences between WT and ARHGEF6-KO are obvious and interesting; it is just that they cannot be solely attributed to a problem in migration.

      Also, a true migration phenotype in the current setting should have shown that the cells that failed to migrate accumulate in deeper layers. My impression is that the changes in IN per layer are more easily explained by total cell number than by migration. Perhaps evaluating earlier timepoints could clarify this.

      (5) It is known that ARHGEF6 deletion produces severe F-actin phenotypes in neurons. Have the authors confirmed that GAD67 cells in their hippocampal cultures also have these phenotypes, i.e., stress fibers in somas, growth cones, and actin patches along neurites?

      (6) Section 4. The authors present data for deficient migration of the GFP-labeled interneurons. Is it possible to assess, in the same sections, whether other cell types are also affected? Although the hypothesis that ARHGEF6 deletion will have an impact in IN is well rooted in expression data, by assessing other cell types, one can even include a positive control or evidence for a cell-autonomous phenotype.

      (7) ARHGEF6 deletion has an important impact on organoid development (size, shape, etc.). Have the authors analysed whether these organoids produced fewer interneurons?

      (8) In assembloids, the differences in migration parameters are very small between WT and ARHGEF6-KO, which reinforces that perhaps what is observed in the different layers of cortex during mouse development is likely not entirely due to migration, as concluded.

      (9) To properly weigh the present evidence (interneuron deficits) obtained with the ARHGEF6-KO model, the authors should include a deeper discussion in light of the extensive work that has been done using these mice. How does the finding of a diminished IN population in the brain of these mice explain the large amount of electrophysiological and behavioral evidence produced before with these animals? Perhaps the most important work to discuss in this regard is the initial ARHGEF6-KO report by Ramakers and colleagues (2012), but there are others.

      Minor comments

      (1) Figure 1A. It looks clear that the GE shows the highest expression of ARHGEF6; however, the reader needs the reference levels where the log2 expression is calculated. What are the reference levels?

      (2) Have the authors compared the number of GAD67-eGFP cells in the hippocampal cultures between WT and ARHGEF6-KO mice?

      (3) Section 3, as a caution note, authors should mention that it is not possible to know from the evidence provided which cells are dying.

      (4) In the dorsal-ventral assembloids, it is expected that the ventral organoid would contain lots of GFP expression compared to the dorsal, but in the image shown (Figure 5A) both parts of the assembloid seem to have the same amount and distribution of GFP. How is that possible?

    3. Reviewer #3 (Public review):

      Summary:

      ARHGEF6 is a RAC1/CDC42 guanine nucleotide exchange factor that has been proposed to be associated with X-linked intellectual disability, but its relevance to the pathology is not well established. ARHGEF6 has been assigned a role in spine density and plasticity of hippocampal pyramidal neurons, but nothing is known about its role in interneuron development. Here, the authors show that ARHGEF6 is expressed early in development in the inhibitory lineage during the peak of interneuron generation and migration. The aim of the study is therefore to investigate whether, in addition to its role in pyramidal neurons, ARHGEF6 could play a role in inhibitory neuron development. Using both ARHGEF6-KO mice and organoids from ARHGEF6-KO hiPSCs, the authors show that ARHGEF6 plays a critical role in interneuron development and function.

      Strengths:

      The major strength of the paper is the very detailed analysis of the role of ARHGEF6 using two different systems: ARHGEF6-KO mice and deletion of ARHGEF6 in human iPSC-derived organoids. Strikingly, deletion of ARHGEF6 in both systems induces similar defects such as an increase in apoptosis, reduced neuronal output, impaired neuronal morphology, and disrupted migratory dynamics. This compelling evidence demonstrates that ARHGEF6, in addition to its already well-described role in spine formation and plasticity, is playing a crucial role during embryonic development through its function in interneurons.

      Weaknesses:

      (1) In Figure 1, the authors show that ARHGEF6 is expressed in different regions of the brain, including the interneuron lineage, and that depletion of ARHGEF6 reduces the number of GABAergic neurons in the adult cortex and hippocampus. To better characterize this defect, the authors investigate in Figure 2 whether deletion of ARHGEF6 affects interneuron migration and survival during embryonic development. To do so, ARHGEF6-KO mice were crossed with the GAD67-eGFP reporter line to follow the inhibitory lineage. The authors analyse apoptosis using TUNEL staining, and show that it is significantly increased in the ganglionic eminence of ARHGEF6-KO E14.5 embryos. The authors claim that this is not the case in the cortex. However, the image shown in Figure 2A suggests that staining is increased there as well. Which part of the neocortex is analysed for quantification? This should be clarified.

      (2) In Figure 2F-J, the authors investigate the migration of interneurons by analysing the GAD67-eGFP staining, and clearly show that the migratory abilities of the depleted neurons are reduced. However, the authors do not discuss the fact that, because depletion of ARHGEF6 increases apoptosis, there are fewer neurons available for migration. This is important for the interpretation of the data. This point should be clarified.

      (3) In Supplementary Figure S2, the authors describe the establishment of the ARHGEF6-KO human iPSC line and test the ability of these cells to undergo correct development, especially for the generation of neural progenitor cells. I was wondering why the authors do not present the data of both control and ARHGEF6-KO cells.

      (4) An explanation of how ARHGEF6 depletion could affect neuronal survival at the molecular level is missing. In addition, as ARHGEF6 is one of several GEFs for RAC1 and Cdc42, I would have expected the authors to test how RAC1 (and Cdc42) activity is affected in ARHGEF6-depleted brains and in ARHGEF6-KO organoids. The measure of phalloidin staining and the anisotropy index are not really meaningful.

      (5) The authors show that ARHGEF6-KO forebrain organoids were markedly smaller compared to their isogenic controls, and their study suggests that ARHGEF6 expression impacts progenitor maintenance and neurogenesis. Although interneurons represent only a minority of the total neuronal population, I was wondering whether ARHGEF6-KO mice present brain morphology defects such as microcephaly.

    1. Reviewer #1 (Public review):

      Lifestyle interventions can mitigate the effects of diet-induced obesity on body weight, behaviour, and brain anatomy in mouse models. Using a longitudinal design, wild-type and triple-transgenic (3xTgAD) mouse models of Alzheimer's disease were exposed to a high-fat diet and assigned to one of three interventions: voluntary physical activity, a low-fat diet, and their combination. A high-fat diet significantly increased body weight and induced widespread neuroanatomical changes, with effects modulated by sex and genotype. The combined intervention led to significant weight loss in males of both genotypes. Neuroanatomical analyses revealed that a high-fat diet significantly reduced hippocampal and cerebellar volumes in wild-type mice but had a less pronounced effect on 3xTgAD mice; nevertheless, interventions, particularly the combined approach, increased localized brain volumes in these regions regardless of genotype. Multivariate integration of behavioural and neuroanatomical measures identified a brain pattern linking hippocampal and cerebellar volumes to intervention and behavioural performance. Spatial gene-enrichment analysis of this pattern identified biological processes, including glucose homeostasis, as potential biological mechanisms underlying intervention effects. Overall, these findings suggest that voluntary physical activity and a low-fat diet can modulate brain structure and behaviour, partially counteracting the effects of a high-fat diet, and potentially recruiting biological processes that may support brain health.

      The authors describe studies of the 3xTg mouse model of Alzheimer's disease (AD). They set out to study the interactions of diet and exercise on three outcomes: weight gain, MRI, and either the novel object recognition or Morris water maze tasks of memory.

      They conclude there are sex and genotype effects on hippocampal volume.

      There are several strengths to the study. First, they start out with a large number of mice. Once the mice are divided into groups, however, the sample sizes are not always large. It would be good to know that the groups were sufficiently powered.
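      As a minimal sketch of the kind of check meant here, the snippet below estimates how many animals per group a two-sided, two-group comparison would need under the normal approximation. The function name `n_per_group`, the default significance level and power, and the effect sizes used are illustrative assumptions, not values from the manuscript; the authors' actual design (multiple factors, repeated measures) would call for a more specific calculation.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate animals needed per group (hypothetical helper).

    Normal approximation for a two-sided, two-sample comparison:
        n = 2 * ((z_{1 - alpha/2} + z_{power}) / d) ** 2
    where d is the standardized effect size (Cohen's d).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = std_normal.inv_cdf(power)          # quantile for the target power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    # Assumed effect sizes, chosen only to illustrate the trend.
    for d in (0.5, 0.8, 1.2):
        print(f"d = {d}: about {n_per_group(d)} animals per group")
```

      Smaller expected effects require sharply larger groups, which is why subdividing by genotype, sex, diet, and intervention can leave individual cells underpowered.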

      The data are also interesting. Mice were placed on several different diets during the study, which will be of interest to many who question the role of diet in outcomes. They also add exercise as an intervention, and study not only diet but also the combined effect of diet and exercise. This is relevant to those interested in controlling dementia by diet and exercise. Finally, they perform some very interesting analyses to study the data.

      That said, the study also has several limitations. For example, it is quite complex. Mice had a standard diet until 2 months of age, then were switched to either a low-fat or a high-fat diet. Some mice had both a different diet and exercise. MRI was performed at 2, 4, and 6 months, when behavior was tested. A drawback of this design is that no assessment of outcomes relevant to this animal model, such as amyloid-beta or tau phosphorylation, was conducted. Also, they used the novel object recognition task, despite stating in the Discussion that this task does not show impairments until well after 6 months of age. They added exercise, but it is not clear whether the animals used the exercise apparatus equally. Also, the animals were housed "communally", so adding an exercise wheel may have made the cage crowded, adding stress to the study. The diets were not simply low- or high-fat because many constituents besides fat content also changed. Regarding fat, the type of fat also changed between diets. Therefore, the gut microbiome was probably affected differently by factors other than fat intake. There was no measurement of food consumption, so some mice may not have eaten as much of the new diet as they did of the old diet they were used to.

      Regarding the data, only the outcomes of complex analyses are shown. One would first want to see the changes in body weight and perhaps later how it is analyzed in a more complex way. For behavior, one would first want to see outcomes as typically presented. For example, learning, recall, platform test results from the Morris water maze, and discrimination indices for object recognition. Note that, at one point, I believe the authors note that some groups did not explore thoroughly, which would make novel object recognition hard to interpret. If there was any difficulty with ambulation, both tasks would be hard to interpret.
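      To illustrate the kind of conventional summary meant here, the sketch below computes one commonly used form of the discrimination index for novel object recognition, (novel - familiar) / (novel + familiar). This definition, the function name `discrimination_index`, and the `min_total_s` exploration cutoff are assumptions for illustration; the manuscript may define and filter the measure differently.

```python
from typing import Optional


def discrimination_index(
    t_novel_s: float, t_familiar_s: float, min_total_s: float = 0.0
) -> Optional[float]:
    """Discrimination index from exploration times in seconds (hypothetical helper).

    Returns a value in [-1, 1]; 0 means no preference for either object.
    Returns None when total exploration is zero or below min_total_s,
    because the index is not interpretable for animals that barely explored.
    """
    if t_novel_s < 0 or t_familiar_s < 0:
        raise ValueError("exploration times must be non-negative")
    total = t_novel_s + t_familiar_s
    if total <= 0 or total < min_total_s:
        return None
    return (t_novel_s - t_familiar_s) / total


if __name__ == "__main__":
    print(discrimination_index(30.0, 10.0))                   # clear novelty preference
    print(discrimination_index(2.0, 1.0, min_total_s=20.0))   # too little exploration
```

      Reporting the number of animals excluded by such a cutoff alongside the index would make it easier to judge whether low exploration in some groups affects the conclusions.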

      Regarding MRI, from what can be seen, structures cannot be distinguished clearly. At least some raw data should be shown to demonstrate this and to determine what the data show. The raw data suggest that some of the larger structures can be distinguished, and we should see the data for these areas, even if all areas can't be assessed. In the end, the authors focus primarily on the hippocampus and discuss the cerebellum, but it seems that changes occur throughout the brain. The choice to focus on the hippocampus and cerebellum needs to be supported.

      To gain further insight, the authors analyze genes across different brain regions using the Allen Brain Atlas. Although this seems reasonable in theory, once one realizes how many genes are shared across diverse brain regions, one wonders how such an analysis was conducted. More understanding of this approach, as well as how it was validated, is important. In the end, the authors conclude that the glucose homeostatic pathways were primarily altered, and one would like to understand whether that is indeed true and whether it is the only set of pathways that were changed.

      This raises another point: what occurs in a normal wild-type mouse on the standard diet during the first 6 months of life? Do the glucose homeostatic pathways change simply due to age? Sex? It may be that, with age, the mice become more sedentary, which could explain such changes. Once that is resolved, what occurs on the standard diet for the 3xTg mice? Perhaps they are more active or more sedentary, regardless of diet or exercise? Thus, the studies end up raising more questions than answers.

      Given so much work has already been done, it seems best to simply reorganize the presentation with raw data first, followed by the analysis. For the second section, the implicit assumptions of the analyses should be very clear so that the analyzed data are understood and believable. Limitations of the assumptions, pooling some groups, etc., need to be clear.

      Figures. In Figure 1, the weekly measurements are not shown. The points are connected, so an unbroken line is shown. Around the line are lighter lines indicating errors, but with all the lines and colours, one does not know what standard errors surround the values for any given group. This makes the data hard to interpret. In later figures, significant differences are indicated with asterisks, but this seems to be done inconsistently.

      In the text, more caution is needed for some assertions. For example, it is not clear that a 2- to 6-month-old mouse is an adolescent. Which mouse ages correspond to which human life stages has always been debated. Another example is the suggestion that male mice might gain weight differently than females, as if this were an outcome of diet or exercise. Such a difference is expected regardless, because male rodents continue to gain weight in adulthood, whereas females stabilize because estrogen limits appetite. Additionally, females may not show group differences because they are more variable. This can relate to their estrous cycle. If stressed or housed without males nearby, they may not have a regular estrous cycle, which can then affect their outcomes. This may be particularly true for behavior, as they may have been tested during different estrous cycle phases, if they had estrous cycles at all.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript describes an investigation into the effect of diet and exercise interventions in WT and transgenic (male and female) mice that are exposed to either a high-fat or a low-fat diet. The outcome variables include MRI volume and brain morphology, as well as memory performance. First, this study measured the impact of genotype (WT vs 3xTgAD mice), then examined the impact of a high-fat or low-fat diet in each group, and finally examined the impact of a low-fat diet, exercise, or a combined low-fat diet and exercise intervention. This is an important study as it allows us to better understand how changes to lifestyle can affect neurocognitive function and potentially change a person's AD risk.

      Strengths:

      (1) The study uses a well-controlled longitudinal design, allowing the authors to track how diet and exercise interventions influence brain and behaviour over time.

      (2) The integration of multiple levels of analysis (brain imaging, behaviour, and multivariate modelling) provides a rich and comprehensive assessment of intervention effects.

      (3) The inclusion of both genotype and sex as key variables strengthens the relevance and interpretability of the findings, given known differences in risk and response across groups.

      Weaknesses:

      There are a lot of analyses in this paper, and I had a little bit of trouble distilling the major take-home messages. For example, I was left wondering:

      (1) If the effect of genotype and the effect of the high-fat diet were consistent in the current study compared to the authors' previous work (e.g. Rollins et al., 2019). A more direct report on the consistency of these findings (maybe even an overlap map, if possible) would benefit the reader.

      (2) How consistent/different are the volumetric and morphometric (DBM) results from each other? Especially in the regions of interest (hippocampus and cerebellum), are increases in volumes always related to "expansion" of a given region using DBM? Some of the similarities are reported in the results, but for transparency, a side-by-side table comparing the results across techniques for each effect of interest might provide more clarity.

      (3) I was interested in the Partial Least Squares approach that the authors used to investigate how patterns of brain measures relate to the behavioral variables. Because they are presented mostly in the supplement (except for Figure 6E), it's difficult to map the LVs described onto the univariate contrasts in Figures 2-5. In general, greater clarity is needed regarding how the PLS-derived latent variables relate to the univariate findings, and whether the emphasis on LV3 reflects a principled selection or post hoc interpretation.

      (4) If I understand the results correctly, there were only modest differences in behavior reported, and the patterns were somewhat inconsistent across sex and genotype. In fact, the authors report that the high-fat diet alone did not impair memory on the Morris water maze (line 323). The discrepancy between robust neuroanatomical effects and relatively modest behavioural changes raises important questions about the functional significance of the observed structural alterations.

      (5) On line 507, the authors state, "Notably, 3xTgAD mice already show smaller brain volumes at baseline, which may constrain the detectable impact of the diet." Is this true for the entire brain or just the hippocampus and cerebellum? Would a global reduction in brain volume in the 3xTgAD model affect the interpretation of the intervention effects?

    3. Reviewer #3 (Public review):

      Summary:

      The authors sought to determine the individual and combined effects of exercise and low-fat diet consumption on regional brain volume and cognitive function in triple-transgenic Alzheimer's disease mice and wild-type controls.

      Strengths:

      (1) A strength of this study is its longitudinal design, which captures regional changes in brain volume across the interventions tested.

      (2) Its comprehensive design includes 10 groups and is well-powered to isolate genotype-, sex-, diet- and exercise-related effects (and interactions).

      (3) The analyses of volumetric and voxel-based measures are comprehensive.

      Weaknesses:

      (1) Use of automated tracking for NOR data reduces confidence in the behavioural data.

      (2) No measures of Aβ or tau pathology appear to be performed.

      (3) Mice from the critical 'combined' intervention groups are not included in the PLS regression model that integrates behavioural and brain data.

      (4) Analyses of behavioural data include a large number of variables without adequate justification.

    1. Reviewer #1 (Public review):

      Summary:

      This study presents an important tool for the study of MR1 antigen binding, opening new possibilities for the use of cutting-edge techniques. The evidence supporting the claims of the authors is solid, although including some functional experiments using primary T cells would provide a more complete physiologic evaluation. The work will be of interest to T cell immunologists in general, especially those studying unconventional T cells.

      Strengths:

      In this study, the authors developed a single-chain MR1-derived protein by exchanging the α3 domain and β2-microglobulin for a helical stabilizing domain that they had previously developed. The aim was to generate a more compact structure that would still fold properly, without the risk of losing β2-microglobulin. This overall more robust structure would facilitate ligand exploration using various cutting-edge biophysical techniques.

      The authors successfully demonstrated that their construct folds similarly to native MR1 and retains the ability to bind MAIT TCR in solution, as shown by cryo-EM experiments. Its melting temperature was equivalent to that of the native protein. Importantly, the construct enables the use of differential scanning fluorometry and transverse relaxation-optimized spectroscopy, which represent the main strengths of this work. These approaches should greatly facilitate the screening of additional unknown ligands and enable interaction mapping.

      Weaknesses:

      One possible area for improvement would be to extend the validation to additional known ligands, particularly weaker binders. Furthermore, although the cryo-EM data are highly convincing, including either MAIT cell staining or MAIT activation assays with the generated construct would provide stronger functional validation of its equivalence to the wild-type protein with respect to ligand-binding properties.

      Overall, this work is of great interest to the field, as several groups worldwide are seeking to identify endogenous/tumour-derived MR1 ligands. In addition, some pathogens lacking the capacity to produce 5-OP-RU have been shown to activate MAIT cells, raising the possibility that unknown pathogen-derived ligands may also exist.

    2. Reviewer #2 (Public review):

      Summary:

      The authors develop a miniaturized MR1 construct (SMART-MR1) in which the α1/α2 platform is stabilized by a synthetic domain, and show that it can bind ligands, engage a cognate TCR, and recapitulate native-like recognition by cryo-EM.

      Strengths:

      The work is well-written, technically strong and carefully executed. The authors combine biochemical, biophysical and structural approaches, including ITC, NMR and cryo-EM, to show that SMART-MR1 behaves in a manner closely resembling native MR1. The reduction in size and the demonstration of solution NMR are clear practical advantages for certain types of mechanistic studies.

      Weaknesses:

      The main limitation is that the manuscript does not clearly establish a practical advantage over existing MR1 formats, such as single-chain MR1-β2M or previously described stabilized constructs. The comparison is largely framed against native MR1, which risks overstating the problem, and on the basis of the data presented, it is unlikely that other researchers will adopt this system. In addition, the choice of the A-F7 TCR as a validation reagent may overestimate the generality of the approach, as this receptor is known to exhibit relatively broad ligand tolerance, including recognition of MR1 presenting vitamin B6 metabolites (PDB 9CGR) and structurally diverse synthetic ligands. The extent to which SMART-MR1 supports recognition by a broader range of MR1-restricted TCRs is not addressed.

    3. Reviewer #3 (Public review):

      Summary:

      This manuscript describes the engineering, production and validation of an MR1 variant with enhanced suitability for ligand screening and for biophysical and structural analysis. The authors build on a previous advance from their laboratory on a classical MHC (HLA-A2), whereby the α3 and β2m domains are replaced by a helical stabilizing domain.

      Strengths:

      This variant has a smaller molecular weight than the native MR1, can be produced easily through refolding and is thus much more suitable for NMR analysis. The authors provide data demonstrating that many of the parameters typically evaluated in protein biochemistry/biophysics are similar to reported values between this engineered variant and the wild-type protein. Overall, this is a significant advance to the MR1 field and more broadly to MR1 relevance in immunology and cancer biology, as this will accelerate high-throughput screening and discovery of disease-relevant ligands for MR1, which have been overshadowed by the misguided fixation on 5-OP-RU.

      Weaknesses:

      Minor concerns about the lack of comparison with the native MR1 extracellular domain construct in the validation of this engineered construct.

    1. Reviewer #1 (Public review):

      Summary:

      P. Izquierdo et al. investigated the genetic determinism of various traits of interest in switchgrass using large-scale genomic and transcriptomic data. More specifically, they worked on a diversity panel comprising 426 genotypes evaluated in common-garden experiments at two locations (Michigan and Texas). The phenotypic and genomic data were already published. In this work, they produced transcriptomic data for each of the 426 genotypes at each site, and they carried out phenotype predictions using genomic and transcriptomic data separately or together. While the genomic and transcriptomic data were moderately correlated at each location, the two data types appeared to be complementary for phenotype prediction. To further exploit the fact that they have data across two locations, they computed differences for phenotypes and transcripts between locations as indicators of trait and transcript plasticity, respectively. They built predictive models of trait plasticity using genomic information and transcript plasticity, which proved to be quite accurate for traits affected by GxE. Finally, they made use of SHAP values from predictive models of flowering time and biomass at each location, as well as for their plasticity, to gain insight into their genetic determinism. These SHAP values provide the importance of the predictive features (SNP and/or transcripts) for trait prediction. This allowed them to confirm some candidate genes and to propose new candidates for both traits.

      Strengths:

      I found this study interesting and rich. I think the sample size (426 genotypes) is large enough to support the findings. The use of a modern machine-learning approach (XGBoost) together with SHAP indices to find interesting features and get insights into the biological mechanisms underlying flowering time and biomass production is quite original. The methodology employed is globally sound. I also like the fact that the authors accounted implicitly for the population structure by providing a baseline prediction using the first 5 PCs.

      Weaknesses:

      While the methodology is globally sound, I sometimes had difficulties following exactly what was done. This is partly due to the fact that the authors used 2 omics (SNPs and transcripts) to predict phenotypes, and sometimes, in the results, it is not clear which of the 2 is the focus. This was especially the case for the importance of the features and the interpretability of the models, where I found it sometimes hard to tell whether the analysis was done on SNPs or transcripts.

      Also, regarding the methodology, I did not understand why the authors needed to perform a feature selection approach. Maybe it was required to perform the interaction analysis, which could not be deployed on all the features? But regarding the importance of the features, I do not get the added value of the selection over the direct use of SHAP indices when using all features. Maybe this is because I am not a specialist in this kind of approach, but maybe the authors could add more details to explain the rationale behind the feature selection.

    2. Reviewer #2 (Public review):

      Summary:

      The authors aimed to evaluate whether integrating genomic (SNP) and transcriptomic information with machine learning can improve phenotypic prediction of polygenic traits across environments. The manuscript explored not only the predictability across models and predictor feature sets, but also attempted to identify meaningful genes and interactions underlying trait variation.

      Strengths:

      The main strength of the manuscript is its integration of SNP, transcriptomic, and phenotype datasets for 426 switchgrass genotypes across Texas and Michigan. It provides a systematic comparison of predictor types (SNP versus transcript abundance) and of model strategies to integrate them.

      Weaknesses:

      (1) Experimental Design

      The experimental design raises several concerns that should be clarified before strong biological conclusions are drawn from the transcriptomic analyses.

      First, the transcriptomic sampling is not well aligned with the developmental stages most relevant to the phenotypes being modeled. Leaf tissue was collected at a single time point in each environment, whereas traits such as flowering time, biomass, tiller count, and panicle height arise from developmental processes occurring over extended and potentially distinct temporal windows. Consequently, the measured expression profiles are likely to reflect physiological states specific to the sampling dates (May 5-6 in Texas and June 22-24 in Michigan) rather than the regulatory processes underlying the target phenotypes.

      Second, the phrase "haphazardly randomized" is questionable for a field experiment. It is unclear whether the design included formal randomization, blocking, row/column structure, or spatial correction. Without explicit accounting for spatial field heterogeneity, environmental variation within sites may confound genotype and transcriptomic effects.

      Third, the Methods do not clearly describe biological replication for RNA-seq. If each genotype-by-environment combination were represented by a single transcriptomic sample, then within-genotype expression variance cannot be estimated. This is important because transcript abundance is highly sensitive to microenvironment, sampling time, tissue status, developmental stage, and technical variation. The absence of replication significantly weakens confidence in gene-level feature importance and gene-gene interaction claims.

      Fourth, the analysis of expression differences across environments is based on a simple subtraction (TX - MI) followed by correlation with genetic similarity. This approach is not standard in transcriptomic analysis and does not account for variability, replication, or statistical uncertainty. Conventional methods for assessing differential expression and genotype-by-environment interactions rely on model-based frameworks that explicitly estimate variance components and test for interaction effects. Without such modeling, the observed expression differences may reflect noise or confounding factors rather than genotype-driven responses.
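
      One way to make this point concrete is a standard two-way model with replication. The notation below is an illustrative assumption and is not taken from the manuscript: $y_{ijk}$ denotes the expression of one transcript for genotype $i$, environment $j$, and replicate $k$.

```latex
% Two-way model with genotype (G), environment (E) and their interaction (GE)
y_{ijk} = \mu + G_i + E_j + (GE)_{ij} + \varepsilon_{ijk},
\qquad k = 1, \dots, n_{ij}

% Per-genotype difference of cell means between the two sites
\bar{y}_{i,\mathrm{TX}} - \bar{y}_{i,\mathrm{MI}}
  = (E_{\mathrm{TX}} - E_{\mathrm{MI}})
  + \bigl[(GE)_{i,\mathrm{TX}} - (GE)_{i,\mathrm{MI}}\bigr]
  + \bigl[\bar{\varepsilon}_{i,\mathrm{TX}} - \bar{\varepsilon}_{i,\mathrm{MI}}\bigr]
```

      Under this sketch, the genotype main effect cancels in the subtraction, but with a single sample per genotype and site ($n_{ij} = 1$) the interaction term and the residual term cannot be separated, so the difference alone cannot distinguish genotype-driven responses from noise.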

      (2) SHAP contribution values

      Although SHAP is a well-established framework for decomposing model predictions into feature-level contributions, its use in this manuscript raises several concerns regarding interpretation, statistical validity, and biological inference.

      First, SHAP values quantify the contribution of features within the fitted model, conditional on the joint distribution of inputs and the model structure. They do not represent causal effects or direct biological importance. In addition, SHAP values are expressed on the scale of the model output, which is often log-odds for classifiers but absolute trait units for a regression model, and they should be read accordingly. Without a fair evaluation of model fit, SHAP values must be interpreted cautiously, because a feature can show very high SHAP values even when the model fits poorly.

      In genomic data, where features are highly correlated due to linkage disequilibrium and co-expression, SHAP values can distribute contribution values across correlated variables in ways that are not uniquely identifiable. As a result, features highlighted as "important" may reflect correlation structure rather than true functional relevance.

      This correlation structure may be exacerbated in this manuscript by the use of TPM-normalized transcript abundances as predictor variables without biological replicates. Even assuming that the estimates of transcript abundance are robust, TPM values are compositional: the constant-sum constraint creates dependencies among all genes and induces negative correlations. This issue is particularly relevant for the interpretation of gene importance and interaction effects, where correlated predictors can lead to unstable and non-unique attributions. The biological interpretation of transcript-based features therefore remains uncertain.
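
      The compositional effect can be illustrated with a minimal simulation. This is a sketch under stated assumptions: the gene count, the sample size, and the log-normal abundances are hypothetical choices made for illustration and are unrelated to the manuscript's data.

```python
import numpy as np

rng = np.random.default_rng(0)

n_samples, n_genes = 2000, 5
# Hypothetical absolute abundances: independent log-normal draws, so the
# true between-gene correlations are zero by construction.
absolute = rng.lognormal(mean=0.0, sigma=0.5, size=(n_samples, n_genes))

# Closure to a constant sum, as in TPM (each sample sums to one million).
tpm = absolute / absolute.sum(axis=1, keepdims=True) * 1e6


def mean_offdiag_corr(x):
    """Mean pairwise Pearson correlation between the columns of x."""
    c = np.corrcoef(x, rowvar=False)
    mask = ~np.eye(c.shape[0], dtype=bool)
    return c[mask].mean()


raw_corr = mean_offdiag_corr(absolute)  # close to zero
tpm_corr = mean_offdiag_corr(tpm)       # clearly negative after closure
print(f"absolute: {raw_corr:.3f}, TPM-like: {tpm_corr:.3f}")
```

      With only five genes the induced negative correlation is strong; with thousands of genes the average pairwise effect is much smaller, although highly expressed genes can still dominate the constraint.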

      (3) Result interpretation

      For example, the statement on page 11 that "plasticity SNP- and transcriptomic-based models generally outperformed single-environment models for traits with low cross-environment correlation, such as green-up (Fig. 2c, r = -0.13, p < 8.3 × 10⁻³) and tiller count (Fig. 2f, r = -0.08, p = 0.1) (Supplementary Fig. S1)" is too broad. For green-up, the Diff model appears much better than MI, but not clearly better than TX.

      Also on page 11, the claim that "...Diffexp was more predictive than SNPs for trait plasticity in biomass, flowering time, and tiller count..." holds true only for biomass, not for flowering time or tiller count.

      The claim of "complementary information" between SNP and transcriptomic models on page 12 is stronger than what Figure 2 supports. Figure 2 shows different predictive performance, but it does not by itself demonstrate complementarity. Establishing complementarity requires evidence that combining SNP+T improves prediction consistently or captures distinct, non-overlapping signals. Yet the preceding section says SNP+T outperformed either single data type in only 15% of cases, with modest gains, which is confusing. Also, Figure 2 shows SNP+T, not G+T.

    1. Reviewer #1 (Public review):

      Summary:

      Wang Liao and colleagues aim to provide a comprehensive synthesis of zebrafish circadian research, with particular emphasis on the decentralized photoreceptive architecture that distinguishes teleosts from mammals, and to outline future research directions leveraging emerging technologies for translational applications. The authors frame zebrafish as occupying a "crucial evolutionary and experimental niche" and argue that the model system is uniquely suited to address open questions in chronobiology.

      Strengths:

      The review is broad in scope and up to date in its citation of recent primary literature. The coverage of physiological outputs - spanning cardiovascular rhythmicity, hepatic metabolism, immune function, reproduction, and gut homeostasis - is more comprehensive than many existing reviews in this area, and researchers seeking an entry point into any of these subfields will find a useful orientation. The figures are well-designed and effectively summarise complex regulatory relationships. The section on immune rhythmicity is a particular strength, providing mechanistic detail on how specific clock components (Clock1a, Per1b, Per2, Cry1a) differentially regulate neutrophil behaviour, bacterial killing, and cytokine expression; this level of molecular specificity distinguishes it from comparable sections in the review. The brief discussion of non-canonical clock gene functions (CLOCK in neuronal connectivity, BMAL1 in stem cell state, vascular calcification) raises genuinely interesting points that are underexplored in the field and might deserve more prominence.

      The future perspectives section makes a conceptually interesting move in suggesting that the zebrafish decentralized architecture could reframe a central question in chronobiology - from how a master clock imposes order on passive peripheral oscillators, to how semi-autonomous oscillators achieve coherence. This is the most original conceptual contribution in the manuscript, and it would benefit from much further development.

      Weaknesses:

      The core limitation of this review is that it functions primarily as an annotated bibliography rather than a critical synthesis. Section after section follows the same pattern: a physiological system is introduced, several findings from recent papers are described in sequence, and the section ends. Missing throughout is an evaluative voice - where does the field agree, where does it disagree, which findings have been replicated versus remain preliminary, and which conceptual questions are genuinely unresolved versus merely unstudied? Readers with expertise in the field will find little that reframes their understanding; readers new to the field will receive information but not the interpretive scaffolding needed to assess its significance.

      The framing of zebrafish as occupying a "crucial evolutionary and experimental niche" is asserted but not substantiated. The experimental advantages of zebrafish - optical transparency, external development, genetic tractability - are real, but they apply primarily to larval stages, typically the first two weeks of development. The review does not adequately address whether the key features it highlights, particularly peripheral photosensitivity and autonomous peripheral oscillators, have been demonstrated in adult animals, where optical transparency is lost. Many of the physiological findings described (sleep-wake cycles, cardiovascular function, reproduction, and immune function) are most relevant in adult or juvenile fish, yet the mechanistic underpinnings often come from larval studies. Whether the mechanisms generalise across developmental stages is not discussed, and this is an important gap that the review could acknowledge explicitly.

      The claim that zebrafish bridge invertebrate and mammalian models is a conventional framing that appears in most zebrafish review articles; its repetition here adds little. More interesting - and underexplored - is the comparative question of how the decentralised clock architecture of teleosts compares with that of other non-mammalian vertebrates, or indeed with invertebrate systems such as Drosophila, where peripheral tissue clocks and non-visual photoreception have also been studied. The review does not engage with this comparative dimension, which would be the natural intellectual context for the claims being made.

      The future perspectives section identifies several promising directions - optogenetic circuit mapping, whole-body longitudinal imaging, inter-organ communication, network modeling - but these are described at a high level of generality. Most are not specific to the questions raised by the zebrafish decentralized clock architecture; they would appear in any forward-looking review of circadian biology. The one conceptually distinctive idea - that zebrafish could be used to ask how distributed oscillators achieve coordinated coherence without hierarchical control - is identified but not developed into concrete experimental questions or testable predictions. The discussion of non-canonical clock gene functions in the Future Perspectives section would benefit from being more directly connected to what zebrafish specifically can offer: given that teleost genome duplication has produced additional paralogues of clock genes, there is a concrete opportunity to dissect canonical from non-canonical functions through comparative analysis of paralogues with diverged expression patterns. This point is hinted at but not made explicitly.

      Appraisal of conclusions:

      The conclusions are broadly consistent with the evidence cited, and the authors are appropriately cautious in noting that many signalling cascades and inter-tissue communication mechanisms remain incompletely characterised. The conclusion that zebrafish represents a valuable and underexploited model for circadian-disease translational research is well-supported. However, the review would be significantly strengthened if the authors distinguished more clearly between what is firmly established, what is supported by preliminary or single-study evidence, and what remains genuinely speculative.

      Likely impact and utility:

      This review will be useful as an orientation document for researchers new to zebrafish circadian biology, and the comprehensive treatment of physiological outputs across organ systems is a genuine service to the field. Its impact as an intellectual contribution is limited by the descriptive approach and the absence of original synthesis or conceptual reframing. The most interesting ideas in the manuscript - the reframing of the central/peripheral clock hierarchy question, and the potential of clock gene paralogues for probing non-canonical functions - could be further developed and, if pursued, could form the basis of a more distinctive and impactful contribution.

    2. Reviewer #2 (Public review):

      Summary:

      This review is valuable in principle because circadian rhythms in zebrafish remain relatively unexplored. However, there are a number of significant weaknesses that should be addressed for it to have an impact. First, while the review covers a broad range of topics in chronobiology, it does not put them in context. Placing zebrafish work in the context of other, better understood model organisms and of other fish species would broaden the appeal. The review could also expand to a discussion of sleep, where the understanding in zebrafish is much more advanced. Critically, providing a novel framework and identifying new areas of opportunity and limitations of the system would expand the interest to non-zebrafish research groups. In addition, there are a number of misstatements/mis-citations that are critical to correct. Therefore, I find this review potentially impactful, but its current form is likely to limit its impact.

      Strengths:

      Focusing on decentralized photosensing is a strength because it is relatively unique to zebrafish.

      The breadth of discussion in zebrafish is a strength.

      Weaknesses:

      It might be helpful to reorganize the review with an introduction on what is known in other better studied systems to be highly conserved, then to focus in on the components of zebrafish that are discussed here.

      A weakness is the lack of integration with other model organisms and other fish systems. Therefore, the narrow focus on zebrafish is unlikely to appeal to broader audiences.

      It's surprising that there is not more discussion of sleep, which has been studied in detail, and its relationship to the clock.

      Discussions of limitations of the model, including adult vs larval analysis and challenges performing long-term behavioral analysis in fish, would be valuable.

    3. Reviewer #3 (Public review):

      Summary:

      Over the past 3 or 4 decades, our understanding of the molecular mechanism underlying the circadian clock has increased substantially. This is in large part due to successful forward and reverse genetics approaches applied to a broad range of genetic model systems, notably Drosophila, Neurospora, mouse, Arabidopsis and cyanobacteria. Although the clock components in these species are diverse, the basic operating principles are highly conserved, allowing us to build a general view of clock mechanisms. Looking forward, there are still many unanswered questions regarding how clocks are organized at the systems level and, in turn, how they are coupled to key aspects of physiology. Each model species has its own set of advantages and disadvantages for tackling particular questions. As this timely review aims to illustrate, the zebrafish has become a particularly valuable model for exploring circadian clock biology. This is in part due to its technical advantages, accessibility of early developmental stages and its directly light-entrainable peripheral clocks. This provides unparalleled opportunities for studying the circadian clock hierarchy and its links with physiology.

      Strengths:

      This review does a good job of integrating the many lines of circadian clock research where the zebrafish has been used as a model and provides an overview of many future challenges it is well-suited to tackle.

      Weaknesses:

      There are citation errors, as well as inaccurate and misleading statements that must be remedied in a revised version.

    1. Reviewer #1 (Public review):

      Sheidaei and colleagues report a novel and potentially important role for an early mitotic actomyosin-based mechanism, PANEM contraction, in promoting timely congression of chromosomes located at the nuclear periphery, particularly those in polar positions. The manuscript will interest researchers studying cell division, cytoskeletal dynamics, and motor proteins. Although some data overlap with the group's prior work, the authors extend those findings by optimizing key perturbations and performing more detailed analyses of chromosome movements, which together provide a clearer mechanistic explanation. The study also builds naturally on recent ideas from other groups about how chromosome positioning influences both early and later mitotic movements.

      Comments on revised version:

      In the revised manuscript, organizational issues have been largely resolved. In addition, the inclusion of new experiments in additional cell lines, along with an expanded discussion that places actomyosin contractility in the broader conceptual context of other mechanisms governing chromosome movement, has significantly strengthened the manuscript.

    2. Reviewer #3 (Public review):

      Sheidaei et al. report how chromosomes are favourably positioned to facilitate kinetochore-microtubule interactions during early mitosis. Studying kinetochore capture during early prophase is extremely difficult due to kinetochore crowding, but the team has taken up the challenge by classifying types of kinetochore movements, carefully marking kinetochore positions in early mitosis, and linking these to map their fate/next positions over time. The work is an excellent addition to the chromosome segregation field, as most of the literature has thus far focused on tracking kinetochores at slightly later stages of mitosis. The authors show that PANEM facilitates chromosome positioning toward the interior of the newly forming spindle, which in turn promotes chromosome congression. In the absence of PANEM, chromosomes end up in unfavourable locations and fail to form proper kinetochore-microtubule interactions. The work highlights the perinuclear actomyosin network in early mitosis (PANEM) as a key spatial and temporal element of chromosome congression, a step that precedes the segregation process.

      Comments on revised version:

      The authors' revisions have brought clarity to the description of movements in many of the figures. The manuscript ties a fundamental process to differences in cancer cell lines.

      The work extends their published discovery that an actomyosin network forms on the cytoplasmic side of the nuclear envelope during prophase. The current manuscript explains how this network facilitates chromosome capture and congression by tracking the motions of individual kinetochores during early mitosis. The findings are broadly useful for the cell division and cytoskeletal fields.

    1. Reviewer #1 (Public review):

      Summary:

      This paper tries to address an important outstanding issue, which is the evolutionary origin of the SLC25 family of mitochondrial carrier proteins, which are common to all eukaryotic life, with few exceptions. The authors have carried out phylogenetic analyses and DALI searches of AlphaFold databases of bacterial and archaeal membrane proteins. They identify two bacterial proteins, CysZ and YihY, and they propose that they are progenitors of SLC25 family members. Whilst the paper addresses an interesting topic, the conclusions are not supported by the data and are not presented in an unbiased manner, as they highlight only features that provide some tentative support for the hypothesis. They do not address the large number of sequence and structural properties that refute the hypothesis, such as the asymmetric vs three-fold pseudo-symmetric features, hexamer vs monomer, and the complete lack of any conserved motifs with similar functions. Any resemblances between CysZ/YihY and mitochondrial carriers thus seem to be superficial and could well be coincidental, as they represent generic properties of membrane proteins rather than specific ones indicative of an evolutionary relationship.

      Strengths:

      This paper explores the evolutionary origins of the SLC25 family of mitochondrial carrier proteins, which are found across nearly all eukaryotic organisms. They were likely present in the last common ancestor of all eukaryotes, around two billion years ago. The question is whether they are of bacterial, archaeal or eukaryotic origin. The authors propose that two bacterial proteins, CysZ and YihY, may represent ancestral forms of these carriers, based on structural comparisons of models, a sequence motif, and phylogenetic analyses. While the research addresses an important and longstanding question, the presented evidence does not convincingly support their hypothesis.

      Weaknesses:

      A central concern is the reliance on structural similarity searches using predicted protein models, since these models are often built using known protein structures as templates, and thus these searches may produce misleading matches. The reported similarities between CysZ, YihY, and mitochondrial carriers are weak and fall within ranges expected for unrelated membrane proteins, which commonly share general structural features, such as helical bundles. Quantitative measures of similarity are low and do not support a shared evolutionary origin. The case for YihY is extremely poor, as neither structure nor sequence features support the claim. Importantly, the opening of YihY is towards the membrane rather than the water phase, as is the case for carriers, indicating that it has a very different structure and function. The case for CysZ is somewhat better, as it is a helical bundle with two short helices somewhat resembling the matrix helices of mitochondrial carriers, and a short sequence PXDXXK that is part of one of the known sequence motifs of mitochondrial carriers, but this is where the similarities end.

      Mitochondrial carriers have a distinctive threefold pseudo-symmetrical structure and a highly complex transport mechanism involving six structural elements. This paper's hypothesis does not explain how such a high level of threefold pseudo-symmetry could have evolved from entirely asymmetric proteins. To complicate matters further, CysZ is not functional as a monomer but forms a functional hexamer, which also explains why it has two half helices rather than two transmembrane helices. Thus, the hypothesis is that CysZ, which is an asymmetric protomer of a functional hexamer, has evolved into a three-fold pseudo-symmetric protein, which is functional as a monomer. A more convincing explanation is that the threefold pseudo-symmetrical structure arose from gene triplication and fusions, with later mutations introducing asymmetry to support diverse substrate binding. In support of this notion, mitochondrial carriers transporting large molecules, such as ATP, show more asymmetry, whereas those for small molecules remain nearly symmetrical. In general, the vast majority of transport proteins arose from gene duplications and fusions of the domains.

      Although mitochondrial carriers have a sequence motif similar to that found in CysZ (PXDXXK), their roles are very different. In mitochondrial carriers, this motif is located roughly in the middle of transmembrane helices H1, H3, and H5, where proline creates a pronounced kink, bringing the charged residues inward to form a salt-bridge network in the central water-filled cavity. The formation and disruption of this network is essential for the transport mechanism when switching between inward- and outward-open states. In CysZ, the motif is found at the end of a helix and in the following loop at the end of the transporter, with residues pointing outward toward the water phase. These residues are typical of membrane-water interface regions, where proline acts as a helix breaker and charged residues interact with the water phase. Thus, this motif in CysZ does not match the position or function seen in mitochondrial carriers, and its presence is likely to be coincidental, because these residues often occur in the water-membrane region. Importantly, none of the other important conserved three-fold symmetrical motifs of mitochondrial carriers is found in these bacterial proteins, such as the cytoplasmic network [YF][DE]xx[RK], cardiolipin binding sites, ER-links, and sequences of small amino acids, which are critical for their dynamic mechanism.
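
      For readers who want to check motif occurrences themselves, the two patterns named above can be written as regular expressions. The following is a minimal sketch: the function name and the toy sequence are hypothetical and made up for illustration, and "x"/"X" is assumed to mean any residue.

```python
import re

# Motifs as written in the review, with "x"/"X" taken to mean any residue.
MOTIFS = {
    "PXDXXK": r"P.D..K",
    "[YF][DE]xx[RK]": r"[YF][DE]..[RK]",
}


def find_motifs(seq):
    """Return {motif name: [0-based start positions]}, including overlaps."""
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead is used so that overlapping occurrences are reported too.
        lookahead = re.compile(f"(?=({pattern}))")
        hits[name] = [m.start() for m in lookahead.finditer(seq)]
    return hits


# Made-up toy sequence for illustration only; it is not a real protein.
toy = "MAAPLDTAKGGGYDLLRAAPADGGK"
result = find_motifs(toy)
print(result)  # {'PXDXXK': [3, 19], '[YF][DE]xx[RK]': [12]}
```

      A hit of this kind says nothing about homology on its own: short patterns like these occur by chance, which is why the position of the motif within the helix and its structural role carry the weight of the argument above.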

      The phylogenetic relationship is also overstated, as there is no sequence similarity between these proteins other than that occurring because of similar biophysical properties, such as transmembrane helices. The authors suggest that a specific mitochondrial carrier represents the ancestral member of the family, but this conclusion appears to be inferred rather than rigorously demonstrated. Key aspects, such as tree rooting and taxon sampling, are not sufficiently addressed, weakening confidence in the evolutionary claims. Further, the selection of only a few bacterial and archaeal proteomes for analysis limits the study's scope. Broader searches would be necessary to support claims about conservation and ancestry. Independent sequence searches indicate that CysZ and YihY are not widely conserved in the bacterial groups most closely related to mitochondria, undermining the argument that they are plausible ancestors.

      Overall, the presented similarities are superficial and can be explained by general features of membrane proteins rather than by specific adaptations to function. The hypothesis that CysZ and YihY are evolutionary precursors of mitochondrial carriers is not supported by the presented data.

    2. Reviewer #2 (Public review):

      Summary:

      Here, the authors performed a phylogenetic analysis of mitochondrial ATP/ADP carrier (AAC) proteins. They also performed a structure-based screen for remote homologs, seeking to reveal their evolutionary origins. The authors claim that AACs are found at the root of their family tree, and through a structure-based homolog search protocol, identify putative prokaryotic homologs.

      The proposed evolutionary history of AACs is bold and complicated, but the phylogenetic methodology and the way in which the tree is interpreted are incomplete and unconvincing. Further, the structure-based search strategy uses very relaxed cutoffs for fold similarity, which may be fine, but it does not clearly justify this decision. This is potentially very problematic, as I did not find the quantitative or qualitative assessments of fold similarity particularly compelling.

      In summary, the authors have presented a bold and extremely interesting hypothesis for the evolution of these proteins, but there is insufficient support for their claims.

      Strengths:

      (1) The authors are presenting a very interesting hypothesis about the birth of these proteins, including that they may have undergone a radical rearrangement in their sequence at some point in evolution.

      (2) The paper makes use of appropriate tools for structure-based homolog identification.

      (3) Identification of a conserved sequence motif in these twilight zone proteins would be a rare and interesting occurrence, and could be consistent with their proposed homology.

      Weaknesses:

      (1) The phylogenetic analysis and its interpretations are incomplete. The authors regularly refer to the root of the tree, and its placement is given central importance. However, the methodology by which they selected the root is unexplained. This is notable, as the proposed root is curious and quite confusing. It implies that (at least) yeast and Paramecium AACs are independently paraphyletic. While certainly not impossible, this evokes quite a complicated evolutionary history. The taxonomy of this gene family, when rooted this way, does not seem to echo the phylogeny of species, suggesting an extremely complex history of duplication/loss and horizontal gene transfer, none of which the authors discuss in detail. Perhaps more clearly and specifically: I'm very surprised by the branching order at the root, where there are three independent branches of fungal proteins, followed by the excavate proteins in a monophyletic clade, followed by several independent branches of the Paramecium proteins. I very much expect incomplete lineage sorting at this evolutionary depth, but this seems extreme to the point that I question if it is accurately placed. More directly: this very much looks like an unrooted tree, presented radially.

      (2) The Bayesian and ML trees seem quite incongruent, but this is not discussed. In fact, the text states that they "exhibit a similar tree topology." This is admittedly very difficult to assess without very carefully going over the tree, branch by branch, but there are nevertheless differences, the most obvious being paraphyly vs monophyly of taxon-specific AAC clades. Do the authors have any comments on this, and can they show some sort of consensus tree? How does this affect their interpretation?

      (3) Presenting branch support as similarly-sized points makes it nearly impossible to actually judge the strength of support.

      (4) The use of structure for remote homology detection is becoming increasingly popular, and in my opinion, is very powerful. But it is still much too early to be taken for granted. The methodology must be justified. Most importantly, the authors have not clearly described why they chose these quantitative cutoffs (I'm mostly thinking of the Dali Z-score cutoff, which here seems very low for a transmembrane protein of this size, as the Z-score is very dependent on alignment length). The authors reference categories defined by tool authors, but why a Z-score of 3, specifically? The same goes for TM scores. There are not yet any defined best practices, to my knowledge, so the authors should independently validate/justify their approach in some way and/or cite and discuss relevant literature (there have been a growing number of these screens using similar approaches in recent years).

      (5) The proposed homologs have very little quantitative structural similarity to the query structure, or to each other, as shown in Figure 3 (and hence my concerns about the methodology). Also, I did not find the structural alignments in the supplement or Figure 4 to be qualitatively compelling. They simply appear too different, and I cannot discard this qualitative assessment because the quantitative similarities are likewise very weak. It's not clear to me if this is because the folds are in fact different, or if my view of them is a presentation issue (perhaps it could be improved by visualizing more angles, or more carefully cartooning the similarities and differences).

      (6) The authors point out that the alpha-helices are ordered differently in YihY and CysZ, and that their membrane orientation is flipped. Taken at face value, I would view this as evidence against homology. This could perhaps be more reasonably explained as convergent global fold similarity resulting from different underlying structures. However, the authors imply that this may be the result of the transposition of the sequences encoding these alpha helices, yet there is no convincing description or argument concerning when and how this could have occurred. I think this would be a deeply interesting phenomenon, but there is insufficient evidence and discussion to seriously consider whether or not it is homology or convergence.

      (7) Following up on comment #5, the authors did perform a very interesting in silico experiment by transposing sequences to reorder the helices. They then note that structural similarity improved. This is very, very interesting, but without other evidence of homology between the transposed alpha helices, I do not think this disproves alternative hypotheses. Does any such evidence exist?

      (8) The authors show in Figure 5E-F that sequence transposition flips the membrane orientation, such that YihY and CysZ have extracellular termini (which you would expect from homologs, I suppose). But it is just cartooned and not discussed. Is this computationally or experimentally supported?

      (9) The putative presence of a conserved motif would be a very compelling piece of evidence consistent with homology. However, it is not clear to me in the text which proteins actually have the repeats - is it truly just CysZ? What does this mean for YihY? Further, what specifically is being proposed to be homologous? Is SLC25 repeat 2 proposed to be homologous to CysZ repeat 2 (and the same for 3 to 3)? If so, this would seem to have implications for the transposition hypothesis. The helix nomenclature (e.g., H1-6) suggests homology across the proteins (i.e., H1 is homologous to H1); however, wouldn't the presence of these conserved domains instead, for example, suggest homology between SLC H3 and CysZ H2? The authors' conclusions are not clear, and it is difficult to interpret what the implications are for assessing homology.

      (10) The sequence retrieval methods are incomplete, so it is impossible to reproduce the searches or to judge their accuracy and scope. What were the E-value cutoffs and other settings used in the searches?

      (11) The phylogenetic methods are incomplete. What substitution models were used, and how were they chosen? What branch support method was used? What were the stop conditions of the Bayesian analysis (e.g. did the authors monitor for convergence, and how)? How much of the Bayesian analysis was considered burn-in, if any? And echoing points 1 & 2 above, how were these phylogenies rooted?

      (12) Throughout, there is a distinct lack of careful, evolutionarily informative language.

      (i) In reference to the phylogeny, the authors frequently refer to "grouping," but it's not entirely clear what this means. Referring to clades and their branching order would be more informative.

      (ii) The authors refer to the excavate branch as the "most ancient." Whether or not excavates most closely resemble LECA is somewhat irrelevant, because the branch itself is not the most ancient - it is equally as ancient as its sister branch, which may be all other eukaryotes.

      (iii) Likewise, the authors refer to bacterial proteins as "the evolutionary ancestor of mitochondrial AACs," and state that "AAC emerged from the conserved sulfat transporter CysZ." But extant bacteria are not the ancestors of mitochondria - nor are extant proteins descended from other extant proteins. They are, perhaps more accurately, cousins.

      (iv) The authors refer to AACs as "evolutionarily founder member of the SLC25 carrier family," but I'm not sure that has a clear evolutionary meaning, unless the authors mean to say that the common ancestor was more AAC-like than anything-else-like. Even if the rooting is accurate, a basal branch does not necessarily reflect the ancestral state.

    3. Reviewer #3 (Public review):

      Summary:

      The most important weakness is that the authors have avoided the direct structural comparison of experimentally determined X-ray structures of AAC and CysZ. Instead, the comparisons are made through predicted membrane topologies and predicted structural models of protein homologs, which give rise to misleading results. Direct comparison of the X-ray structures of the ADP/ATP carrier and CysZ clearly shows that these proteins have very different folds. Therefore, flaws in the methods produce results that lead to the wrong conclusions, and the authors have not achieved their aims.

      Weaknesses:

      (1) Figure 2. There is something very strange about how the tree is drawn, given that S. cerevisiae AAC1, AAC2 and AAC3 share about 76-83% sequence identity but appear to be very diversified in the tree. The phylogenetic trees are only based on the sequences of three species. The authors should explain in much more detail how they made the phylogenetic trees to support their statement that all mitochondrial carriers have come from an ancient AAC.

      (2) There are at least three and seven X-ray structures of CysZ (with about 43% sequence identity to the E. coli homolog) and AAC, respectively, deposited in the Protein Data Bank. Therefore, there is no need for the approach using predicted structures as described in the manuscript. It is clear from direct comparison of the CysZ and AAC structures that they have very different folds, i.e. lengths of the transmembrane helices, their orientation and packing. CysZ has been suggested to form dimers or trimers of dimers (eLife 2018;7:e27829), with each protomer formed by two long transmembrane helices and four short helices that do not cross the membrane totally. Thus, CysZ has a different membrane topology and oligomeric state than AAC (monomer with six transmembrane helices). CysZ is therefore rightfully classified in a separate 3D domain fold from mitochondrial carriers in various protein family and domain databases.

      (3) In the 3D structures of CysZ, the conserved QYXDYPXDNHK motif is involved in a network of hydrogen bonds and salt bridges thought to hold the helical bundle together (eLife 2018;7:e27829). This motif is similar to PX[DE]XX[KR], a part of the signature motif, typical of mitochondrial carriers, which is repeated three times in the sequences and forms a three-fold pseudo-symmetrical salt bridge network of the so-called matrix gate that opens and closes during the transport cycle. Therefore, although this single motif in CysZ is similar to those of mitochondrial carriers, it is not found in a similar structural context to those in AAC structures.

      (4) It appears odd that the sulfate transporter CysZ should be more similar to nucleotide-transporting AAC than any of the other mitochondrial carriers, of which some transport sulfate.

      (5) The AlphaFold model of YihY is not very similar to the crystal structures of either CysZ or AAC.

      (6) The authors are relying too much on the TM-score results. The values of 0.5-0.6 between AAC and CysZ or YihY probably reflect that they contain six main helices. However, as noted in point 2, they have very different folds.
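
      For context on the TM-score concerns raised above, the score's standard definition can be sketched in a few lines. This is a minimal illustration and not the authors' pipeline: it evaluates the score for one fixed superposition (real tools search over superpositions for the maximum), the function names are hypothetical, and the clamp of d0 for short chains is an assumption of this sketch.

```python
def tm_d0(target_length: int) -> float:
    """Length-dependent distance scale d0 (in angstroms) of the TM-score."""
    # Assumption of this sketch: clamp d0 for short chains, where the
    # cube-root formula would give a very small or undefined value.
    if target_length <= 21:
        return 0.5
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8


def tm_score(distances, target_length: int) -> float:
    """TM-score for one fixed superposition.

    distances: distance (angstroms) between each pair of aligned residues.
    target_length: length of the protein used for normalisation;
                   unaligned residues contribute nothing to the sum.
    """
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


if __name__ == "__main__":
    length = 300  # hypothetical chain length
    print(tm_score([0.0] * 150, length))             # half the residues aligned perfectly
    print(tm_score([tm_d0(length)] * 300, length))   # every residue at distance d0
```

      Under this definition, a score of 0.5 is reached both when half of the residues superimpose perfectly and when every residue sits at distance d0, so the same value can arise from quite different alignments.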

    1. Reviewer #1 (Public review):

      Renard, Ukrow et al. applied their recently published computational pipeline (CHROMAS) to the skin of Euprymna berryi and Sepia officinalis to track the dynamics of cephalopod chromatophore expansion. By segmenting each chromatophore into radial slices, and analyzing the co-expansion of slices across regions of the skin, they inferred the motor control underlying chromatophore groups.
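
      The idea of grouping co-expanding slices can be illustrated with a toy sketch. It is not the CHROMAS implementation: the slice names, the Pearson-correlation criterion, the 0.8 threshold, and the connected-component grouping are all assumptions chosen for illustration, whereas the manuscript is reported to use HDBSCAN and Affinity Propagation.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def group_coactive_slices(series, threshold=0.8):
    """Group slices whose expansion time series correlate above a threshold.

    series: dict mapping a slice name to its expansion values over time.
    Returns a sorted list of sorted name lists (connected components).
    """
    names = sorted(series)
    parent = {name: name for name in names}

    def find(name):
        # Union-find root lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(names, 2):
        if pearson(series[a], series[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())


# Hypothetical data: two slices of chromatophore 1, one slice of chromatophore 2.
toy_series = {
    "c1_s0": [0, 1, 0, 2, 0],
    "c1_s4": [1, 0, 2, 0, 1],
    "c2_s3": [0, 2, 0, 4, 0],
}
groups = group_coactive_slices(toy_series)
print(groups)
```

      In this toy data, one slice of chromatophore 1 groups with a slice of chromatophore 2 while the other slice of chromatophore 1 stays separate, which is the kind of pattern the review describes as a motor unit spanning multiple chromatophores.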

      Strengths:

      - The authors demonstrate that most motor units of cephalopod skin include a subregion of multiple chromatophores, creating "virtual chromatophores" between fixed chromatophores. This is an interesting concept that challenges prevailing models of chromatophore organization, and raises interesting possibilities for how chromatophore arrays may be patterned during development.

      - This study introduces new analytical approaches of cephalopod skin that will be valuable for the quantitative study of cephalopod behavior.

      Weaknesses:

      - The authors use patch-clamp experiments in E. berryi to test their approach for inferring motor units. The stimulations indeed evoke expansions of sub-regions of each chromatophore, creating "virtual chromatophores". However, they were not able to predict these motor units from behavioral analysis before confirming them with patch-clamp, limiting the strength of this validation.

      - In S. officinalis, chromatophores are far more numerous than in E. berryi and exhibit frequent spontaneous activity, making it more challenging to distinguish shared motor drive. Patch-clamp experiments in this species would provide important validation and strengthen confidence in the method for inferring motor units.

      - Although multiple experimental conditions were tested (e.g., age, size, behavioral context, sedation, head-fixation, lighting), data is only shown from a small subset of experiments. Analyzing pooled data across conditions would allow for more generalizable conclusions.

      - Different clustering algorithms were used for the two species (HDBSCAN for E. berryi and Affinity Propagation for S. officinalis). Since Affinity Propagation appeared to better capture correlation structure in S. officinalis, it would be informative to reanalyze the E. berryi data using the same method to assess potential algorithm-dependent biases.

      Conclusion:

      The CHROMAS tool is likely to be valuable to the field, given the need for quantitative frameworks in cephalopod biology. The predictions outlined here provide a useful foundation for future experimental investigation.

    2. Reviewer #2 (Public review):

      Summary:

      Overall, this is an excellent paper, making use of a newly developed system for monitoring the behaviour of chromatophores in the skin of (mostly) free swimming bobtail squid and European cuttlefish. The manuscript is very well written, clearly presented and very well structured. The central finding, that individual chromatophores are connected to multiple motor neurones, is not new. Novelty instead comes from the ability to measure the actuation of chromatophore sections across wide areas of skin in free-swimming animals, showing the diversity of local motor units and reinforcing the notion that individual chromatophores are not necessarily the individual units of colour change, but rather local motor units that cover multiple neighbour and near neighbour chromatophore muscles. This is an excellent finding and one that will shape our understanding of the neural control of cephalopod skin colour. I have a number of minor points below that the authors will need to address before acceptance.

      Strengths:

      The methodological approach to collecting large amounts of data about local variations in the expansion of sections of chromatophores is exciting, and the analysis pipeline for clustering sections of chromatophores whose spontaneous activity correlated over time is powerful and exciting.

      Comments on revisions:

      All concerns have been addressed in the revised version of the manuscript.

    3. Reviewer #3 (Public review):

      Summary:

      This study uses high-resolution videography and a custom computer-vision pipeline to dissect the motor control of cephalopod chromatophores in Euprymna berryi and Sepia officinalis. By quantifying anisotropic chromatophore deformations and applying dimensionality reduction methods, the authors infer that individual chromatophores can be a part of multiple motor units. Clustering analyses reveal putative motor units that often span multiple chromatophores, with diverse and overlapping geometries. Chromatophore expansion dynamics are faster and more stereotyped than relaxation, consistent with active neural contraction followed by passive recoil. Together, the results show that chromatophores function not as uniform pixels but as fractionated, coordinately controlled elements that enable flexible pattern generation.

      Strengths:

      The authors present compelling, direct evidence that (a) chromatophore deformations are anisotropic, and indirect evidence that (b) individual chromatophores can be split across multiple putative motor units. This evidence is provided through data collected over large spatial scales, but also at a sub-chromatophore resolution. This combination of scale and resolution is not possible using traditional neuroanatomical and physiological approaches alone.

      The authors also develop a new non-invasive, image analysis approach to extract information about chromatophore deformation across large spatial scales on the organism's body. In principle this approach is applicable across species and may allow for further comparative characterization of chromatophore motor control. It is therefore a promising new tool and useful resource for the community.

      Weaknesses:

      An important weakness of the work is that the methods the authors develop can only be applied during resting, spontaneous 'flickering' activity of chromatophores to yield interpretable results at the motor unit level. This is because common presynaptic input would confound the identification of individual motor units. Thus, there remains a large difficulty in linking insights about single motor unit organization to the circuit and behavioral levels.

      Another weakness of this paper is the rather limited electrophysiological validation of the computational findings. The authors present only one electrophysiology experiment in E. berryi, the species that they used only for 'methodological development' and not for detailed characterization. A complementary electrophysiological experiment in S. officinalis, or some visualization of neuron morphology confirming that motor neurons do indeed project to multiple chromatophores, would strengthen the generalizability of their computational analysis. This would be particularly pertinent to validate the authors' claim that some motor units contain chromatophores that are quite distant from one another on the animal.

      Overall, the authors' technical contributions and method development are an important advance. This work serves as an excellent proof of concept that their method can extract useful information about chromatophore motor control. Further validation of their method is needed to fully trust the fine-scale conclusions drawn about the distribution and composition of multi-innervated chromatophores. Furthermore, the authors raise many interesting ideas about developmental constraints on circuit wiring and potential adaptive significance of multi-innervated chromatophores for certain features of camouflage patterning. Their method may be able to help resolve some of these questions in the future if it is refined and applied across developmental stages, regions on the animal, and species.

      Comments on revisions:

      Thank you for clarifying my major point of confusion regarding how one might connect these results to behaviorally relevant camouflage. I now have a better understanding of the authors' rationale in studying resting activity of motor units and believe that the clarifications added to the manuscript will help other readers who encounter similar confusion.

    1. Reviewer #1 (Public review):

      Summary:

      This paper investigates whether transformer-based models can represent sentence-level semantics in a human-like way. The authors designed a set of 108 sentences specifically to dissociate lexical semantics from sentence-level information and collected 7T fMRI data from 30 participants reading these sentences. They conducted representational similarity analysis (RSA) comparing brain data and model representations, as well as the human behavioral ratings. They found that transformer-based models match brain representations better than a static word-embedding baseline that ignores word order, but fall short of models that encode the structural relations between words. The main contributions of this paper are:

      (1) The construction of a sentence set that disentangles sentence structure from word meaning.

      (2) A comprehensive comparison of neural sentence representations (via fMRI), human behavior, and multiple computational models at the sentence level.

      Strengths:

      (1) The paper evaluates a wide variety of models, including layer-wise analysis for transformers and region-wise analysis in the human brain.

      (2) The stimulus design allows precise dissociation between lexical and sentence-level semantics. The RSA-based approach is empirically sound and intuitive.

      (3) The constructed sentences, along with the fMRI and behavioral data, represent a valuable resource for studying sentence representation.

      Weaknesses:

      (1) The rationale behind averaging sentence embeddings across multiple transformer models (with different architectures and training objectives) is unclear. These transformer-based models have different training paradigms and model architectures, which may result in misaligned semantic spaces. The averaging operation may dilute the distinct sentence representations learned by each model, potentially weakening the overall semantic encoding for sentences. Please clarify this choice or cite supporting methodology.

      (2) All structure-sensitive models discussed incorporate semantics to some extent. Including a purely syntactic baseline, such as a model based on context-free grammar, would help confirm the importance of syntactic structures.

      (3) In Figure 2, human behavioral judgments show weak correlations with neural data, and even fall below those of computational models, suggesting the behavioral judgments may not reflect the sentence structures in a brain-like way. This discrepancy between behavioral and neural data should be clarified, as it affects the interpretation of the results.

      (4) To better contextualize model and neural performance, sentence similarity should be anchored to a notion of semantic "ground truth", such as the matrix shown in Figure 1a. Comparing this reference with human judgments, brain responses, and model similarities would help establish an upper bound.

      (5) The structure of this paper is confusing. For instance, Figure 5 is cited early but appears much later. Reordering sections and figures would enhance readability.

      (6) While the analysis is broad and comprehensive, it lacks depth in some respects. For instance, it remains unclear what specific insights are gained from comparing across brain regions (e.g., whole brain, language network, and other subregions). Similarly, the results of simple-average and group-average RSA appear quite similar and may not advance the interpretation.

      (7) While explaining the grid-like pattern due to sentence length is important, this part feels somewhat disconnected from the central question of this paper (word order). It might be better placed in supplementary material.

      Comments on revised version:

      The new version of the paper has addressed my main concerns, including:

      (1) clarification about the methodology of Transformer embeddings

      (2) discussion about the purely syntactic models

      (3) discussion about the low correlation between behavioural ratings and brain activations

      (4) better structure of the paper

      (5) clarification about pre-registration

      I believe the paper has been substantially improved after revision.

    2. Reviewer #3 (Public review):

      Summary:

      Large Language Models have revolutionized Artificial Intelligence and can now match or surpass human language abilities on many tasks. This has fuelled interest in cognitive neuroscience in exposing representational similarities between Language Models and brain recordings of language comprehension. The current study breaks from this mold by: (1) Systematically identifying sentence structures for which brain and Large Language Model representations diverge. (2) Accounting for such sentence structures using a model structured by semantic roles. As such, the study may now fuel interest in characterizing how Large Language Models and brain representations differ, which may prompt new, more brain-like language models.

      Strengths:

      * This study presents a bold challenge to a literature trend that has touted similarities between Transformer models and human cognition based on representational correlations with brain activity. This challenge is substantiated by identifying sentences for which brain and model representations of sentences diverge.

      * This study conducts a rigorous pre-registered analysis of a comprehensive selection of the state-of-the-art Large Language Models, on a controlled sentence comprehension fMRI dataset. The analysis is conducted within a Representation Similarity framework to support similarity comparisons between graph structures and brain activity without needing to vectorize graphs. Transformer models are predicted and shown to diverge from brain representations on subsets of sentences with similar word-level content but different sentence structures.

      * The study introduces a 7T fMRI sentence comprehension dataset and accompanying human sentence similarity ratings which may be a fruitful resource for developing more human-like language models. Unlike other model-based sentence datasets, the relation between grammatical structure and word-level content is controlled, and subsets of sentences for which models and brains diverge are identified.

      Weaknesses:

      * The interpretation of findings is nuanced. Although Transformers underperform as brain models on the critical subsets of controlled sentences, a Transformer outperforms all other models when evaluated on the union of all sentences when both word-level content and structure vary. Transformers also yield equivalent or better models of human behavioral data. Thus, although Transformers have demonstrable flaws as human models which are pinpointed here, in the general case (some) Transformers are more human-like than the other models considered.

      * There may be confounds between the critical sentence structure manipulations and visual processing. This is inconvenient because activation in brain regions that process semantics tends to partially correlate with low-level representations of sentence surface features encoded in visual cortex. Although the study commendably controls for confounds associated with sentence length, correlations with the key sentence structure models are most salient in visual cortex and diminish in other brain networks when V1-V4 activation is controlled for.

      * Sentence similarity computations are emphasized as the basis for unifying comparative analyses of graph structures and vector data. A strength of this approach is that correlation is not always the ideal similarity metric. However, a weakness is that similarity computations are not unified across models. This has practical consequences because different similarity metrics applied to the same model produce positive or negative correlations with brain data and repeating analyses with a different representational dissimilarity measure seems to produce some anomalous results.
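
      The sensitivity to the dissimilarity measure noted in the last point can be illustrated with a toy RSA computation. This is a sketch under stated assumptions and not the study's analysis: the three "model" vectors and the "brain" dissimilarities are invented numbers, and the helper names are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def euclidean(u, v):
    return math.dist(u, v)


def correlation_distance(u, v):
    return 1.0 - pearson(u, v)


def rdm_upper(vectors, dissimilarity):
    """Upper triangle of the representational dissimilarity matrix."""
    return [dissimilarity(vectors[i], vectors[j])
            for i, j in combinations(range(len(vectors)), 2)]


# Invented toy data: three sentence vectors from one hypothetical model, and
# hypothetical brain dissimilarities for the pairs (0,1), (0,2), (1,2).
model_vectors = [[1, 2, 3], [2, 4, 6], [3, 2, 1]]
brain_rdm = [1.0, 3.0, 2.0]

rsa_correlation = spearman(rdm_upper(model_vectors, correlation_distance), brain_rdm)
rsa_euclidean = spearman(rdm_upper(model_vectors, euclidean), brain_rdm)
print(rsa_correlation, rsa_euclidean)
```

      With these toy numbers, the same model vectors correlate positively with the brain dissimilarities under correlation distance and negatively under Euclidean distance, because the first two vectors point in the same direction but differ in magnitude.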

    1. Joint Public Review:

      This manuscript puts forward the provocative idea that a posttranslational feedback loop regulates daily and ultradian rhythms in neuronal excitability. The authors used in vivo long-term tip recordings of the long trichoid sensilla of male hawkmoths to analyze spontaneous spiking activity indicative of the ORNs' endogenous membrane potential oscillations. This firing pattern was disrupted by pharmacological blockade of the Orco receptor. They then use these recordings together with computational modeling to predict that Orco receptor neuron (ORN) activity is required for circadian, not ultradian, firing patterns. Orco did not show a circadian expression pattern in a qPCR experiment, and its conductance was proposed to be regulated by cyclic nucleotide levels. This evidence led the authors to conclude that a post-translational feedback loop (PTFL) clockwork, associated with the ORN plasma membrane, allows for temporal control of pheromone detection via the generation of multi-scale endogenous membrane potential oscillations. The findings will interest researchers in neurophysiology, circadian rhythms, and sensory biology. However, the manuscript has limited experimental evidence to support its central hypothesis and is undermined by several assumptions that underlie their data analysis and model builds, as well as insufficient biological data including critical controls to validate and/or fully justify the model the authors are proposing.

      Strengths:

      The authors raise several intriguing model-based hypotheses regarding the mechanisms that underlie the generation of olfactory rhythms. The electrophysiological approach and the long-term recording paradigm are elegant and technically impressive. In the revised version, the authors have added additional qPCR data supporting the lack of rhythmic Orco transcript expression and included a new figure suggesting that cAMP can modulate Orco conductance.

      Major weaknesses:

      (1) The cAMP experiment was only conducted at one time-point, which is insufficient to support the central claim that "AMP and cGMP may have ZT-dependent effects on Orco conductivity".

      (2) The revised manuscript continues to rely heavily on prior publications or defers key mechanistic questions (or important manipulations) to future studies. In its current form, the evidence presented remains insufficient to support the central claim that a PTFL constitutes the primary underlying circadian clock mechanism. The proposed model is intriguing, but the data provided do not yet directly demonstrate the novel mechanism.

    1. Reviewer #1 (Public review):

      This rigorous and creative study uses an elegant combination of metabolomics, transcriptomics, and budding yeast molecular genetics to discover that (i) activating AMPK to maintain mitochondrial respiration fueled by cytosolic Acetyl CoA and (ii) increasing fatty acid synthesis independent of respiration drive independent pathways that increase the fitness of replicatively-aged budding yeast cells, albeit without increasing their lifespan. This work will be of interest to scientists in the field of aging and metabolism. Some clarifications in the text would address the following concerns, which would increase the impact of the study:

      (1) What does activation of AMPK (via PGPD-Sak1 expression) do to the replicative lifespan? How many bud scars, in general, do the subpopulations that are older - yet have less Tom70 (increased mitochondrial fitness) - have after the 48 hr time point that they are examining? How many divisions occurred in this 48 hr time period - i.e., is it long enough to have all cells reach the end of their replicative lifespan? This information is important to rule out that a subset of the mutant cells just divided faster and hence had more divisions within 48 hr (growing faster and living longer are different things). Having identical growth curves doesn't indicate per se that they all divide at the same rate, as there may be a subpopulation that divides faster and a subpopulation that doesn't grow so well.

      (2) A2A cells do not have an extended replicative lifespan (RLS) but show an increase in the "low senescence" population (Figure 2). If the cells are not becoming senescent, why don't they have a longer RLS? Not having a longer lifespan seems inconsistent with the statement that "bud scar counting confirmed that A2A cells reach a higher age than wild type", which comes back to how many times the cells can divide in the 48 hr time period studied and their rate of cell division. Also, the lifespan curve shown is plotted against time, not cell division number, which does not take into account different division times of cells within the population (described above). It would be much more useful to show standard lifespan curves showing cell division numbers per lifespan per cell.

      (3) Increased "fitness" of the old cells is inferred from the increased size of the colonies that the old cells can make. However, this is a measure of the fitness of the daughters per se, not the old mother cells. Are the old mothers just passing on healthier mitochondria and more lipids to the daughters, such that they can divide more times? If the aged cells have an "increased fitness", why don't they divide more times themselves (i.e., live longer)?

      (4) The statement is made that "these experiments define two classes of aging cells with distinct metabolic needs, coherent with the model of two aging trajectories previously proposed (referencing Nan Hao's work)". However, the big difference here is that in Nan Hao's work, their two aging trajectories influenced the length of lifespan, but that does not appear to be the case here. That distinction should be made clear. Perhaps the authors could also speculate as to why the A2A yeast stops dividing after presumably the same number of cell divisions, even though they have an activated AMPK and activated fatty acid synthesis pathway.

      (5) I am a bit confused by the use of the word "senescence" by this lab here and in their previous growth on galactose studies. If yeast don't senesce, which is usually defined as an irreversible arrest of the cell cycle where cells stop dividing, shouldn't the yeast that do not senesce still be dividing and hence have a longer lifespan? Should a different term, such as "fitness late in life", be used rather than senescence? Having the authors give their definition of senescence may help resolve this apparent contradiction.

    2. Reviewer #2 (Public review):

      Summary:

      In this study, the authors investigate how cytosolic acetyl-CoA metabolism influences replicative aging in budding yeast. They propose that acetyl-CoA regulates aging through three major pathways: (1) mitochondrial transport to support mitochondrial function, (2) fatty acid synthesis, and (3) global protein acetylation. The data show that AMPK activation promotes mitochondrial import of acetyl-CoA and partially mitigates mitochondrial decline in a subset of aging cells.

      Furthermore, the engineered A2A strain, which enhances mitochondrial acetyl-CoA utilization while relieving inhibition of fatty acid synthesis, increases the proportion of cells exhibiting a "low senescence" phenotype.

      Overall, this is a thoughtful and potentially impactful study that advances our understanding of metabolic control of aging. Addressing the points below, particularly by refining interpretations and, where feasible, incorporating additional analyses, will further strengthen the manuscript and its conclusions.

      Strengths:

      The study has several notable strengths. It addresses an important question by shifting the focus from lifespan to preservation of late-life fitness, which is highly relevant to aging biology. The work integrates metabolic, genetic, and functional analyses to link cytosolic acetyl-CoA flux with distinct aging outcomes, and the engineering of the A2A strain provides a clear and elegant demonstration of how coordinated pathway modulation can improve cellular fitness.

      Weaknesses:

      (1) While the manuscript focuses on mitochondrial transport and fatty acid synthesis, cytosolic acetyl-CoA is also a key regulator of histone acetylation and chromatin silencing. It would strengthen the study to consider whether acetyl-CoA depletion contributes to improved fitness through enhanced rDNA silencing. Given the well-established role of rDNA instability in yeast aging, additional experiments examining rDNA silencing and stability would be valuable. For example, monitoring rDNA copy number changes (not necessarily ERCs) under AMPK activation, oleic acid supplementation, and in the A2A strain, similar to approaches used in the authors' prior work, would help clarify whether chromatin regulation contributes to the observed phenotypes.

      (2) The current data do not fully distinguish whether AMPK activation and oleic acid supplementation act on distinct subpopulations of aging cells. An alternative explanation is that oleic acid supplementation enhances mitochondrial function and acts additively with AMPK activation, thereby increasing the fraction of cells in the "low senescence" state. Since this distinction is not central to the main conclusions, I suggest softening the language around subpopulation specificity. Emphasizing instead that the A2A strain coordinately modulates multiple branches of acetyl-CoA metabolism to improve late-life fitness would maintain the strength of the central message without overinterpretation.

      (3) The manuscript proposes that lipid starvation and excess acetyl-CoA are major drivers of senescence in distinct subpopulations of wild-type aging cells. This conclusion is not yet fully supported by the presented data. Direct measurements of age-dependent divergence in acetyl-CoA and fatty acid levels at the single-cell level would be needed to substantiate this model. Based on the current evidence, a more conservative interpretation would be that aging cells exhibit differential sensitivity to perturbations in acetyl-CoA and lipid metabolism. Accordingly, I recommend revising the statement in the Abstract ("We further implicate lipid starvation and excess acetyl coenzyme A availability as major drivers of senescence...") and the corresponding discussion text to better align with the data.

    3. Reviewer #3 (Public review):

      Summary:

      These findings suggest that PGPD-SAK1 yeast show a subpopulation with lowered TOM70-GFP expression in high bud scar staining aged cells. Deletion of CAT2 or MLS1 reduces this effect. A PGPD-SAK1 acc1S1157A double mutant (called "A2A" here) shows an even larger effect of lowered TOM70-GFP expression in high bud scar staining aged cells. Utilization of various additional mutants involved in acetyl-CoA transport, carnitine shuttle, respiration, etc., leads the authors to conclude that these shifts in TOM70-GFP in aged cells are linked to the AMPK-fatty acid metabolic regulatory system.

      Strengths:

      These extensive and clearly described experiments reveal interesting changes in TOM70-GFP intensity in subsets of aged yeast in several mutants eventually identified as linked to the AMPK-fatty acid metabolic regulatory system.

      Weaknesses:

      (1) Three biological replicates for mRNA-seq is low.

      (2) While "Traditional conceptions of ageing implicate a progressive accumulation of damage leading to systemic degradation in performance until death, with evolutionary pressures acting to maximise early life fitness and fecundity at the expense of ageing health." is perhaps tangential to the data and conclusions of the study, both claims in this sentence are at best controversial, and the manuscript would be no weaker for their omission.

      (3) The statement that "Here, we determine the basis of senescence and fitness loss in replicatively ageing yeast" is a bit strong as a summary of the careful work presented here. If the authors had created yeast mutants that retained fitness indefinitely, a claim of this strength would be more appropriate.

    1. Reviewer #1 (Public review):

      [Editor's note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      In this work, the authors investigate the molecular dynamics of MinD, a component of the Bacillus subtilis Min system, in vitro and in vivo. In Escherichia coli, the Min system is highly dynamic and displays rapid pole-to-pole oscillation whereby a time-averaged minimum of the Min proteins at mid-cell is established. However, in B. subtilis, this is not the case, and there is no MinE present. MinD in B. subtilis dynamically relocalizes from the poles to division sites, and binds to MinC and MinJ, which mediates its interaction with DivIVA. This paper reports biochemical characterization of B. subtilis MinD in vitro and dynamics of MinD variants in vivo, providing insight into the mechanism of dynamic localization.

      Strengths:

      In the current study, the authors perform a detailed biochemical characterization of the in vitro ATPase activity of MinD and demonstrate that rapid hydrolysis is elicited by adding phospholipids. They further show using a collection of substitution mutants of MinD that both monomers and dimers bind to the membrane, and ATP occupancy changes the on and off rates. Identification, quantification, and tracking of discrete Halo-MinD populations were nicely done and showed that mutations in MinD alter dynamic localization, correlating with PL binding on and off rates in vitro.

      In the revised manuscript, the authors now demonstrate localization and tracking data for minC and minJ deletion strains, which suggest that MinJ impacts MinD membrane cycling, but MinC does not. Additional in vitro work showed that the PDZ domain of MinJ modifies MinD ATP hydrolysis rates, and the authors propose that MinJ may promote MinD dimer formation.

      Weaknesses of the revised version: No major weaknesses.

    2. Reviewer #2 (Public review):

      Summary:

      Feddersen & Bramkamp determined important characteristics of how MinD protein binds to and dissociates from the membrane, and how it dimerizes in relation to its ATPase activity. The presented data clearly show the differences in function of MinD homologs from B. subtilis and E. coli.

      Strengths:

      The work presents well-executed experiments that lead to interesting conclusions and a new model of how the Min system works during B. subtilis mid-cell division. Importantly, this model is supported by in vitro characterization of well-chosen mutants in the functional domains of MinD. Notably, most of the in vitro data are confirmed by single-molecule localization microscopy.

    1. Reviewer #1 (Public review):

      Summary:

      The study by the Obata group characterizes the dynamics of the canonical malate dehydrogenase-citrate synthase metabolon in yeast.

      Strengths:

      The study is well-written and appears to give clear demonstrations of this phenomenon.

      Studies of the dynamics of metabolon formation are rare; if the authors can address the concern detailed below, then they will have provided such a study for one of the canonical metabolons in nature.

      Weaknesses:

      There is a fundamental issue with the study, which is that the authors do not provide enough support or information concerning the split luciferase system that they use. Is the binding reversible or not? How the data is interpreted is massively influenced by this fact. What are the pros and cons of this method in comparison to, for example, FLIM-FRET? The authors state that the method is semi-quantitative - can they document this? All of the conclusions are based on the quality of this method. I know that it has been used by others, but at least some preliminary documentation to address these questions is required.

      Comments on revised version:

      I feel that the authors have adequately addressed my prior concerns. I have no further critiques of their work.

    2. Reviewer #2 (Public review):

      This study explores the dynamic association between malate dehydrogenase (MDH1) and citrate synthase (CIT1) in Saccharomyces cerevisiae, with the aim of linking this interaction to respiratory metabolism. Utilizing a NanoBiT split-luciferase system, the authors monitor protein-protein interactions in vivo under various metabolic conditions.

      Major Concerns:

      (1) NanoBiT Signal May Reflect Protein Abundance Rather Than Interaction Strength
      In Figure 1C, the authors report increased MDH1-CIT1 interaction under respiratory (acetate) conditions and decreased interaction during fermentation (glucose), as indicated by NanoBiT luminescence. However, this signal appears to correlate strongly with the expression levels of MDH1 and CIT1, raising the possibility that the observed luminescence reflects protein abundance rather than specific interaction dynamics. To resolve this, NanoBiT signals should be normalized to the expression levels of both proteins to distinguish between abundance-driven and interaction-driven changes.

      (2) Lack of Causal Evidence
      The study presents a series of metabolic perturbation experiments (e.g., arsenite, AOA, antimycin A, malonate) and correlates changes in metabolite levels with NanoBiT signals. However, these data are correlative and do not establish a functional role for the MDH1-CIT1 interaction in metabolic regulation. To demonstrate causality, the authors should implement approaches to specifically disrupt the MDH1-CIT1 interaction. One strategy could involve using a 15-residue peptide (Pept1) derived from the Pro354-Pro366 region of CIT1, previously shown to mediate the interaction, or introducing the cit1Δ3 (Arg362Glu) mutation, which perturbs binding. Metabolic flux analysis using ^13C-labeled glucose and mitochondrial respiration assays (e.g., Seahorse) could then assess functional consequences.

      (3) Absence of Protein Expression Controls Under Perturbation Conditions
      In experiments involving acetate, arsenite, AOA, antimycin A, and malonate, the authors infer changes in MDH1-CIT1 association based solely on NanoBiT signals. However, no accompanying data are provided on MDH1 and CIT1 protein levels under these conditions. This omission weakens the conclusions, as altered expression rather than interaction strength could underlie the observed luminescence changes. Immunoblotting or quantitative proteomics should be used to confirm constant protein expression across conditions.

      Conclusion:

      Although the central question is compelling and the use of NanoBiT in live cells is a strength, the manuscript requires additional experimental rigor. Specifically, normalization of interaction signals, introduction of causative perturbations, and validation of protein expression are essential to substantiate the study's claims.

      Comments on revised version:

      The manuscript is much improved.

    3. Reviewer #3 (Public review):

      Summary:

      Metabolons are multisubunit complexes that promote the physical association of sequential enzymes within a metabolic pathway. Such complexes are proposed to increase metabolic flux and efficiency by channeling reaction intermediates between enzymes. The TCA cycle enzymes malate dehydrogenase (MDH1) and citrate synthase (CIT1) have been linked to metabolon formation, yet the conditions under which these enzymes interact, and whether such interactions are dynamic in response to metabolic cues, remain unclear, particularly in the native cellular context. This study uses a NanoBiT protein-protein interaction assay to map the dynamic behavior of the MDH1-CIT1 interaction in response to multiple metabolic stimuli and challenges in yeast. Beyond mapping these interactions in real time, the authors also performed GC-MS metabolomics to map whole cell metabolite alterations across experimental conditions. Finally, the authors use microscale thermophoresis to determine components that alter the MDH1-CIT1 interaction in vitro. Collectively, the authors synthesize their collected data into a model in which the MDH1-CIT1 metabolon dissociates in conditions of low respiratory flux, and is stimulated during conditions of high respiratory flux. While their data largely support this model, some key exceptions are found that suggest this model is likely oversimplified and will require further work to understand the complexities associated with MDH1-CIT1 interaction dynamics. Nonetheless, the authors put forth an interesting and timely toolkit to begin to understand the interaction kinetics and dynamics of key metabolic enzymes that should serve as a platform to begin disentangling these important yet understudied aspects of metabolic regulation.

      Strengths:

      - The authors address an important question: how do metabolon-associated protein-protein interactions change across altered metabolic conditions?

      - The development and validation of the MDH1-CIT1 NanoBiT assay provides an important tool to allow the quantification of this protein-protein interaction in vivo. Importantly, the authors demonstrate that the assay allows kinetic and real-time assessment of these protein interactions, which reveals interesting and dynamic behavior across conditions.

      - The use of classic biochemical techniques to confirm that pH and various metabolites can alter the MDH1-CIT1 interaction in vitro is rigorous and supports the model put forth by the authors.

      Weaknesses:

      The authors have addressed identified weaknesses within the revision of their manuscript.

    1. Reviewer #1 (Public review):

      Summary:

      This study investigated how visuospatial attention influences the way people build simplified mental representations to support planning and decision-making. Using computational modeling and virtual maze navigation, the authors examined whether spatial proximity and the spatial arrangement of obstacles determine which elements are included in participants' internal models of a task. The study developed and tested an extension of the value-guided construal (VGC) model that incorporates features of spatial attention for selecting simpler mental representations of the task.

      Strengths:

      (1) Original Perspective: The study introduces an explicit attentional component to established models of planning, offering an approach that bridges perception, attention, and decision-making.

      (2) Methodological Approach: The combination of computational modeling, behavioral data, and eye-tracking provides converging measures to assess the relationship between attention and planning representations.

      (3) Cross-validated data: The study relies on the analysis of three separate datasets, two already published and an additional novel one. This allows for cross-validation of the findings and enhances the robustness of the evidence.

      (4) Focus on Individual Differences: Reports of how individual variability in attentional "spillover" correlates with the sparsity of task representations and spatial proximity add depth to the analysis.

      Appraisal of Aims and Results:

      The study sets out to determine how spatial attention shapes the construction of task representations in planning contexts. The authors provide evidence that spatial proximity and arrangement influence which environmental features are incorporated into internal models used for navigation, and that accounting for these effects improves model predictions. There is clear documentation of individual variation, with some participants showing greater attentional spillover and more sparse awareness profiles.

      Comments on revised version:

      The authors did a great job and I am very happy with the revised manuscript.

    2. Reviewer #2 (Public review):

      Summary:

      Castanheira et al. investigate the role of spatial attention for planning during three maze navigation experiments (one new experiment and two existing datasets). Effective planning in complex situations requires the construction of simplified representations of the task at hand. The authors find that these mental representations (as assessed by conscious awareness) of a given stimulus are influenced by (spatially) surrounding stimuli. Individual participants varied in the degree to which attention influenced their task representations, and this attentional effect correlated with the sparsity of representations (as measured by the range of awareness reports across all stimuli). Spatially grouping task-relevant information on either the left or right side of the maze led to mental representations more similar to optimal representations predicted by the value-guided construal (VGC) model - a normative model describing a theoretical approach to simplifying complex task information. Finally, the authors propose an update to this model, incorporating an attentional spotlight component; the revised descriptive model predicts empirical task representations better than the original (normative) VGC model.

      Strengths:

      The novelty of this study lies in the proposal and investigation of a cognitive mechanism through which a normative model like value-guided construal can enable human planning. After proposing attention as this mechanism, the authors make concrete hypotheses about mismatches between the VGC predictions and real human behavior, which are experimentally validated. Thus, not only does this study describe a possible mechanism for simplification of task information for planning, but the authors also propose a descriptive model, revising VGC to incorporate this attentional component.

      A strength of this paper is the variety of investigative approaches: analysis of existing data, a novel experiment, and a computational approach to predict experimental findings from a theoretical model. Analyzing pre-existing datasets increases the size of the participant cohort and strengthens the authors' conclusions. Meanwhile, comparing the predictions of the existing normative model and the authors' own refined model is a clever approach to substantiate their claims. In addition, the authors describe several crucial controls, which are key to the interpretability of their results. In particular, the eye tracking results were critical.

      In summary, this paper constitutes an important step toward a more complete understanding of the human ability to plan.

      Comments on revised version:

      I am overall happy with the revision and agree that the authors have addressed most of the comments.

    3. Reviewer #3 (Public review):

      Summary:

      The authors build on a recent computational model of planning, the "value-guided construal" framework by Ho et al. (2022), which proposes that people plan by constructing simple models of a task, such as by attending to a subset of obstacles in a maze. They analyze both published experimental data and new experimental data from a task in which participants report attention to objects in mazes. The authors find that attention to objects is affected by spatial proximity to other objects (i.e., attentional overspill) as well as whether relevant objects are lateralized to the same hemifield. To account for these results, the authors propose a "spotlight-VGC" model, in which, after calculating attention scores based on the original VGC model, attention to objects is enhanced based on distance. They find that this model better explains participant responses when objects are lateralized to different hemifields. These results demonstrate complex interactions between filtering of task-relevant information and more classical signatures of attentional selection.

      Strengths:

      (1) The paper builds on existing modeling work in a novel manner and integrates classic results on attention into the computational framework.

      (2) The authors report new and extensive analyses of existing data that shed light on additional sources of systematic variability in responses related to attentional spillover effects.

      (3) They collect new data using new stimuli in the original paradigm that directly test predictions related to the lateralization of task-relevant information, including eye tracking data that allows them to control for possible confounds.

      (4) The extended model (spotlight-VGC) provides a formal account of these new results.

      Comments on revised version:

      I also agree that the authors addressed our comments and the manuscript is much stronger now.

    4. Reviewer #1 (Public review):

      Summary: This study investigated how visuospatial attention influences the way people build simplified mental representations to support planning and decision-making. Using computational modeling and virtual maze navigation, the authors examined whether spatial proximity and the spatial arrangement of obstacles determine which elements are included in participants' internal models of a task. The study developed and tested an extension of the value-guided construal (VGC) model that incorporates features of spatial attention for selecting simpler mental representations of the task.

      Strengths:

      (1) Original Perspective: The study introduces an explicit attentional component to established models of planning, offering an approach that bridges perception, attention, and decision-making.

      (2) Methodological Approach: The combination of computational modeling, behavioral data, and eye-tracking provides converging measures to assess the relationship between attention and planning representations.

      (3) Cross-validated data: The study relies on the analysis of three separate datasets, two already published and an additional novel one. This allows for cross-validation of the findings and enhances the robustness of the evidence.

      (4) Focus on Individual Differences: Reports of how individual variability in attentional "spillover" correlates with the sparsity of task representations and spatial proximity add depth to the analysis.

      Weaknesses:

      (1) Clarity of the VGC model and behavioral task: The exposition of the VGC model lacks sufficient detail for non-expert readers. It is not clear how this model infers which maze obstacles are relevant or irrelevant for planning, nor how the maze tasks specifically operationalize "planning" versus other cognitive processes.

      The method for classifying obstacles as relevant or irrelevant to the task and connecting metacognitive awareness (i.e., participants' reports of noticing obstacles) to attentional capture is not well justified. The rationale for why awareness serves as a valid attention proxy, as opposed to behavioral or neurophysiological markers, should be clearer.

      (2) Attention framework: The account of attention is largely limited to the "spotlight" model. When solving a maze, participants trace the correct trail, following it mentally with their overt or covert attention. In this perspective, relevant concepts are also rooted in attention literature pertaining to object-based attention using tasks like curve tracing (e.g., Pooresmaeili & Roelfsema, 2014) and to mental maze solving (e.g., Wong & Scholl, 2024), which may be highly relevant and add nuance to the current work. This view of attention may be more pertinent to the task than models of simultaneously tracking multiple objects cited here. Prior work (notably from the Roelfsema group) indicates that attentional engagement in curve-tracing tasks may be a continuous, bottom-up process that progressively spreads along a trajectory, in time and space, rather than a "spotlight" that simply travels along the path. The spread of attention depends on the spatial proximity to distractors - a point that could also be pertinent to the findings here.

      Moreover, the tracing of a "solution" trail in a maze may be spontaneous and not only a top-down voluntary operation (Wong & Scholl, 2024), a finding that requires a more careful framing of the link to conscious perception discussed in the manuscript.

      Conceptualizing attention as a spatial spotlight may therefore oversimplify its role in navigation and planning. Perhaps the observed attentional modulation reflects a perceptual stage of building the trail in the maze rather than a filter for a later representation for more efficient decision making and planning. A fuller discussion of whether the current model and data can distinguish between these frameworks would benefit readers.

      (3) Lateralization of attention: The analysis considers whether relevant information is distributed bilaterally or unilaterally across the visual display, but does not sufficiently address evidence for attentional asymmetries across the left and right visual fields due to hemispheric specialization (e.g., Bartolomeo & Seidel Malkinson, 2019). Whether effects differ for left versus right hemifield arrangements is not made explicit in the presented findings.

      (4) Individual differences: Individual differences in attentional modulation are a strength of the work, but similar analyses exploring individual variation in lateralization effects could provide further insight, and the lack of such analyses may mask important effects.

      (5) Distinction between overt and covert attention: The current report at times equates eye movement patterns with the locus of attention. However, attention can be covertly shifted without corresponding gaze changes (see, for example, Pooresmaeili & Roelfsema, 2014).

      The implications for interpreting the relationship between eye movement, memory, and attention in this setting are not fully addressed. The potential dynamics of attention along a maze trajectory and their impact on lateralization analysis would benefit from further clarification.

      Appraisal of Aims and Results:

      The study sets out to determine how spatial attention shapes the construction of task representations in planning contexts. The authors provide evidence that spatial proximity and arrangement influence which environmental features are incorporated into internal models used for navigation, and that accounting for these effects improves model predictions. There is clear documentation of individual variation, with some participants showing greater attentional spillover and more sparse awareness profiles.

      However, some conceptual and methodological aspects would be clearer with greater engagement with the broader literature on attention dynamics, a more explicit justification of operational choices, and more targeted lateralization analyses.

    5. Reviewer #2 (Public review):

      Summary:

      Castanheira et al. investigate the role of spatial attention for planning during three maze navigation experiments (one new experiment and two existing datasets). Effective planning in complex situations requires the construction of simplified representations of the task at hand. The authors find that these mental representations (as assessed by conscious awareness) of a given stimulus are influenced by (spatially) surrounding stimuli. Individual participants varied in the degree to which attention influenced their task representations, and this attentional effect correlated with the sparsity of representations (as measured by the range of awareness reports across all stimuli). Spatially grouping task-relevant information on either the left or right side of the maze led to mental representations more similar to optimal representations predicted by the value-guided construal (VGC) model - a normative model describing a theoretical approach to simplifying complex task information. Finally, the authors propose an update to this model, incorporating an attentional spotlight component; the revised descriptive model predicts empirical task representations better than the original (normative) VGC model.

      Strengths:

      The novelty of this study lies in the proposal and investigation of a cognitive mechanism through which a normative model like value-guided construal can enable human planning. After proposing attention as this mechanism, the authors make concrete hypotheses about mismatches between the VGC predictions and real human behavior, which are experimentally validated. Thus, not only does this study describe a possible mechanism for simplification of task information for planning, but the authors also propose a descriptive model, revising VGC to incorporate this attentional component.

      A strength of this paper is the variety of investigative approaches: analysis of existing data, a novel experiment, and a computational approach to predict experimental findings from a theoretical model. Analyzing pre-existing datasets increases the size of the participant cohort and strengthens the authors' conclusions. Meanwhile, comparing the predictions of the existing normative model and the authors' own refined model is a clever approach to substantiate their claims. In addition, the authors describe several crucial controls, which are key to the interpretability of their results. In particular, the eye tracking results were critical.

      In summary, this paper constitutes an important step toward a more complete understanding of the human ability to plan.

      Weaknesses:

      (1) There is a critical conceptual gap in the study and its interpretation, mainly due to the reliance on a self-report metric of awareness (rather than an objective measure of behavioral performance).

      a. Awareness is tested by a 9-point self-report scale. It is currently unclear why awareness of task-irrelevant obstacles in this task would necessarily compromise optimal planning. There is no indication of whether self-reported awareness affects performance (e.g., navigation path distance, time to complete the maze, number of errors). Such behavioral evidence of planning would be more compelling.

      b. Relatedly, it would have been more convincing to have an objective measure of awareness, for instance, how the presence or absence of a "task-irrelevant" obstacle affects performance (e.g., change navigation path distance or time to complete the maze), or whether participants can accurately recall the location of obstacles.

      c. Consequently, I'm not sure that we can conclude that the spatial context does impact participants' ability to plan spatial navigation or to "incorporate task-relevant information into their construal". We know that the spatial context affects subjective (self-reported) awareness, but the authors do not present evidence that spatial context affects behavioral performance.

      d. Another concern that may complicate interpretation is the following: Figure 3c shows improved VGC model predictions (steeper slope) for mazes with greater lateralization. However, there are notable outliers in these plots, where a high lateralization index does not correspond to good model performance. There is currently no discussion/explanation of these cases.

      (2) I noticed an issue with clarity regarding task-relevance. It is currently not fully clear which obstacles are "task irrelevant". Also, the term is used inconsistently and is sometimes conflated with "awareness". For example, in the "Attentional spotlight model of task representations" section, the authors state that "task-relevant information becomes less relevant when surrounded by task-irrelevant information". But they really mean that participants become less aware of those task-relevant obstacles. I assume task-relevance is an objective characteristic related to maze organization, not to a participant's construal. Indeed, the following paragraph provides evidence of model predictions of awareness.

      (3) The behavioral paradigm has some distinct disadvantages, and the validity of the task is not backed up by behavioral data.

      a. I understand the need for central fixation, but it also makes the task less naturalistic.

      b. The task with its top-down grid view does not seem to mimic real human navigation. Though this grid may be similar to mental maps we form for navigation, the sensory stimuli corresponding to possible paths and to spatial context during real-life navigation are very different.

      c. Behavioral performance is not reported, so it is unknown whether participants are able to properly complete the task. The task seems pretty difficult to navigate, especially when the obstacles disappear, and in combination with the central fixation.

      d. There is no discussion of whether/how this navigation task generalizes to other forms of planning.

    6. Reviewer #3 (Public review):

      Summary:

      The authors build on a recent computational model of planning, the "value-guided construal" framework by Ho et al. (2022), which proposes that people plan by constructing simple models of a task, such as by attending to a subset of obstacles in a maze. They analyze both published experimental data and new experimental data from a task in which participants report attention to objects in mazes. The authors find that attention to objects is affected by spatial proximity to other objects (i.e., attentional overspill) as well as whether relevant objects are lateralized to the same hemifield. To account for these results, the authors propose a "spotlight-VGC" model, in which, after calculating attention scores based on the original VGC model, attention to objects is enhanced based on distance. They find that this model better explains participant responses when objects are lateralized to different hemifields. These results demonstrate complex interactions between filtering of task-relevant information and more classical signatures of attentional selection.

      Strengths:

      (1) The paper builds on existing modeling work in a novel manner and integrates classic results on attention into the computational framework.

      (2) The authors report new and extensive analyses of existing data that shed light on additional sources of systematic variability in responses related to attentional spillover effects.

      (3) They collect new data using new stimuli in the original paradigm that directly test predictions related to the lateralization of task-relevant information, including eye tracking data that allows them to control for possible confounds.

      (4) The extended model (spotlight-VGC) provides a formal account of these new results.

      Weaknesses:

      (1) The spotlight-VGC model has a free parameter - the "width" of the attentional spotlight. This seems to have been fixed to be 3 squares. It would be good if the authors could describe a more principled procedure for selecting the width so that others can use the model in other contexts.

      (2) Have the authors considered other ways in which factors such as attentional spillover and lateralization could be incorporated into the model? The spotlight-VGC model, as presented, involves first computing VGC predictions and only afterwards computing spillover. This seems psychologically implausible, since it supposes that the "optimal" representation is first formed and then it gets corrupted. Is there a way to integrate these biases directly into the VGC framework, perhaps as a prior on construals? The authors gesture towards this when they talk about "inductive biases", but this is not formalized.

      (3) Can the authors rule out that the lateralization effects are the result of memory biases since the main measure used is a self-report of attention?
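
      The two-stage structure questioned in weakness (2) can be made concrete with a minimal sketch. It assumes that VGC attention scores are computed first and that spillover is then applied as a post-hoc Gaussian falloff over the distance between obstacle centres. The function name, the Gaussian kernel, and the max-combination rule are illustrative assumptions, not the authors' implementation; only the default width of 3 grid squares is taken from the review.

```python
import math

def spotlight_scores(vgc_scores, positions, width=3.0):
    """Post-hoc spillover sketch (hypothetical, not the authors' code).

    vgc_scores: non-negative attention score per obstacle from the VGC stage.
    positions:  (x, y) centre of each obstacle, in grid squares.
    width:      assumed spotlight width in grid squares.
    """
    scores = []
    for p in positions:
        best = 0.0
        for s, q in zip(vgc_scores, positions):
            d = math.dist(p, q)
            # Each obstacle inherits the strongest distance-discounted score,
            # including its own (d = 0 gives a discount factor of 1).
            best = max(best, s * math.exp(-0.5 * (d / width) ** 2))
        scores.append(best)
    return scores

# An irrelevant obstacle next to a relevant one gains attention; a distant one does not.
example = spotlight_scores([1.0, 0.0, 0.0], [(0, 0), (1, 0), (10, 0)])
```

      Because the spillover step only reads the finished VGC scores, this sketch makes the reviewer's concern visible: nothing in the second stage can feed back into which construal is selected in the first.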

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors derive a mean-field model for a network of Hodgkin-Huxley neurons retaining the equations for ion exchange between the intracellular and extracellular space. The mean-field model derived in this work relies on approximations and heuristic arguments that, on the one hand, allow a closed-form derivation of the mean-field equations, and on the other hand restrict its validity to a limited regime of activity corresponding to quasi-synchronous neuronal populations. Therefore, rather than an exact mean-field representation, the model provides a description of a mesoscopic population of connected neurons driven by ion exchange dynamics.

      Strengths:

      The main strength is the idea of deriving a mean-field model that relates the slow-timescale biophysical mechanisms of ion exchange and transport in the brain to the fast-timescale electrical activity of large neuronal ensembles.

      Weaknesses:

      The idea underlying this work is not completely implemented in practice.

      The derived mean-field model does not show a one-to-one correspondence with the neural network simulations, except in strongly synchronous regimes. The agreement with the in vitro experiment is hardly evident, both for the mean-field model and for the network model. The assumptions made to derive the closed-form equations of the mean-field model have not been justified on biological grounds; they merely make the mathematical derivation possible. The final form of the mean-field equations does not clarify whether microscopic variables are used together with macroscopic variables in an inconsistent mixture.

      Comments on revisions:

      The main weaknesses I listed in the first report are still present, because the authors did not answer my questions convincingly. I repeat the list for completeness:

      (1) It seems that the reduction methodology that is employed is not the most suitable one for the single-neuron model they are considering.

      (2) The formulation of the mean-field derivation is unnecessarily complicated. It could be heavily simplified by following previously published approaches to derive biologically realistic neural masses.

      (3) The model seems to work only for highly synchronized situations and not for the standard asynchronous evolution usually observed in neural circuits.

      Therefore, my statement remains unchanged.

    2. Reviewer #2 (Public review):

      Summary:

      The authors aim to develop a neural mass model, characterized by a few collective variables, that mimics the dynamics of a network of Hodgkin-Huxley neurons encompassing ion-exchange mechanisms. They describe the derivation of the mean-field model in detail, then compare experimental results obtained from the mouse hippocampus with the neural network simulations and the mean-field results. Furthermore, they report a bifurcation analysis of the developed model and a simulation of a small network containing several coupled neural masses, as a step towards the simulation of an entire connectome.

      Strengths:

      The authors attempt to develop a mean-field model for a globally coupled network of heterogeneous Hodgkin-Huxley neurons with an explicit ion-exchange mechanism between the cell interior and exterior.

      Weaknesses:

      (1) They do not employ the reduction methodology best suited to the single-neuron model they consider.

      (2) Their derivation of the neural mass model is based on several assumptions, not all of which are well justified.

      (3) Their formulation of the mean-field derivation is unnecessarily complicated; it can be strongly simplified by following previously published approaches to derive biologically realistic neural masses.

      (4) Their model seems to work only for highly synchronized situations and not for the standard asynchronous evolution usually observed in neural circuits.

      General Statements:

      The authors honestly declare the many limitations of their approach. Given these limitations, the mean-field results are, as expected, somewhat inconsistent with the neural network simulations.

      The authors suggest employing this model for whole-connectome simulations to follow seizure propagation; however, I believe that a simpler model, such as the Epileptor, remains superior in this respect. The present model does include biophysical parameters, but their correspondence with those employed in the network dynamics remains elusive because of the many assumptions required to derive the mean-field model. Furthermore, because it is more complicated than the Epileptor, I do not think that the present model will be widely adopted by the community.

      Comments on revisions:

      The authors have corrected the mistakes present in the manuscript and provided a corrected reference list.

      However, they refuse:

      (1) To simplify the formulation of the model. The model contains unnecessary complications, as I clearly wrote in my report; the authors agree but do not want to change the formulation.

      (2) To derive the mean-field model in the simplest possible way, as I asked many times in my report. This would help readers understand the important aspects of the derivation without unnecessary and confusing complications.

      (3) To compare direct simulations of the network with the neural mass results in the sub-section "Bifurcation analysis: emergent network states and multistability" to show bistability, as I asked.

      As a matter of fact, the modifications made do not resolve my previous doubts about the validity of the results reported in the manuscript.

      Therefore, my previous assessments remain valid.

    1. Reviewer #1 (Public review):

      Pyne and Pandey et al. report the observation of early DNA degradation at the phagocytic cup during macrophage engulfment. Using an elegant experimental system that combines actin staining to visualise cup formation with direct monitoring of DNA degradation, the authors identify rapid recruitment of the membrane-bound nuclease DNase X (DNase1L1) to nascent phagocytic cups. This recruitment occurs within minutes of cup formation, is independent of DNA presence at the substrate, and appears to originate from intracellular membrane structures rather than from the extracellular environment. The results support the conclusion that DNase X activity is present at the phagocytic cup and that DNA digestion can begin prior to phagolysosomal maturation.

      The study is technically strong. The experimental system is clean, specific, and allows precise spatial and temporal detection of DNA degradation. The imaging-based approaches are carefully executed and enable convincing visualisation of DNase X recruitment and activity. The use of an alternative substrate beyond the primary SNS system strengthens the core observation, and the data broadly support the authors' central claim.

      However, several limitations temper the physiological interpretation. The system relies largely on short, free DNA substrates, leaving open how efficiently DNase X processes more complex or physiologically relevant DNA structures, such as nucleosome-bound DNA or neutrophil extracellular traps (NETs). It remains unclear whether DNase X deficiency would alter macrophage responses to larger nucleic acid structures, influence engulfment efficiency, or modify downstream inflammatory signalling pathways such as TLR9 or STING activation. Moreover, the experimental setup prevents full phagocytic cup closure, potentially prolonging DNase activity compared with physiological phagocytosis, which typically proceeds rapidly to cargo internalisation. For example, the peak signal observed in Figure 5 occurs approximately 90 minutes after phagocytic cup formation, a time point at which many phagocytic cups would be expected to have already closed under physiological conditions. Additional work using fully engulfed cargo in more physiological contexts would clarify whether early DNase X activity meaningfully contributes to overall DNA clearance kinetics.

      Mechanistically, the signal that triggers DNase X recruitment remains unresolved. Although actin rearrangement was excluded as the primary driver, the upstream cues that direct DNase X-containing membrane structures to the forming cup are not yet defined.

      In the broader context, early DNase X activity at the phagocytic cup could represent an additional safeguard against inflammatory signalling by limiting extracellular or surface-associated DNA before phagolysosomal degradation by DNase II. This mechanism may be particularly relevant in settings where DNA fragmentation before engulfment is incomplete, such as necroptosis or NET formation. Determining whether DNase X deficiency exacerbates inflammatory responses, alters DNA clearance efficiency in vivo, or contributes to immune pathology will be critical for establishing its physiological and disease relevance.

      Overall, this is a compelling study that introduces a novel concept of pre-phagolysosomal DNA digestion. The conclusions are well supported within the in vitro system used, but further investigation using diverse DNA substrates and physiologically relevant models will be required to fully define the impact of this mechanism on immune regulation and disease.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript presents an elegant and innovative imaging approach to visualize DNase activity at the interface between macrophages and extracellular substrates. The platform is technically strong and enables the study of localized DNA degradation with high spatial resolution. The work is of clear interest and provides a useful framework to investigate how immune cells process extracellular DNA. However, several aspects of the mechanistic interpretation and conceptual framing would benefit from clarification.

      Strengths:

      (1) The study introduces a creative and well-designed imaging platform that allows visualization of localized DNase activity at cell-substrate interfaces.

      (2) The approach is technically robust and represents a valuable tool that could be broadly useful to the field.

      (3) The experiments are thoughtfully designed and address an important question regarding how immune cells interact with extracellular DNA.

      (4) The work opens interesting avenues for studying DNA processing in contexts such as infection and inflammation.

      Weaknesses:

      While the experimental approach is strong, several key conclusions rely on interpretations that would benefit from further clarification:

      (1) First, the conclusion that DNaseX is recruited to phagocytic cups from the "cytoplasm" appears conceptually imprecise. Given that DNaseX is a membrane-anchored protein, it is unlikely to exist as a freely soluble cytoplasmic pool. A more plausible interpretation is that DNaseX is supplied from intracellular membrane compartments. This interpretation would also be more consistent with the data showing dependence on a membrane anchor.

      (2) Second, the interpretation that actin polymerization is not required for DNaseX recruitment raises concerns. Phagocytic cup formation is known to depend strongly on actin dynamics, and it is therefore unclear whether the structures observed under actin inhibition represent fully formed functional cups or partial cell-substrate contacts. This distinction is important for interpreting recruitment versus activity, particularly since enzymatic activity is reduced under these conditions.

      (3) Third, the identification of DNaseX as the main nuclease responsible for the observed activity is not fully resolved. The conclusions rely primarily on gene silencing and staining approaches, but the specificity of these strategies relative to other nucleases is not addressed. It therefore remains possible that additional enzymes contribute to the observed activity.

      (4) Finally, the interpretation of the biofilm experiments may be overstated. While the data clearly show localized DNA degradation in contact with macrophages, it is not fully established that this process depends specifically on phagocytic cup structures. An alternative explanation is that membrane-associated DNase activity more generally mediates this effect. In addition, the physiological relevance of this mechanism would benefit from further discussion.

      Overall, the study is technically strong and introduces a valuable methodology, but several central conclusions are only partially supported by the current data and would benefit from more cautious interpretation and clearer conceptual framing.

    1. Reviewer #1 (Public review):

      Summary:

      During erythroid differentiation, hematopoietic progenitors relinquish multipotency and activate lineage programs. The switch from GATA2 to GATA1 is particularly important in this process, yet GATA2 chromatin‑binding kinetics remain undefined. The authors investigated GATA2-chromatin interaction dynamics during erythroid differentiation in three different cell systems using single‑molecule live‑cell imaging, and they also used CUT&Tag to profile GATA2 chromatin occupancy.

      By single‑molecule imaging, the authors report two interaction modes for GATA2: short‑lived (<1 s) and long‑lived (>5 s) binding. The proportion of long‑lived molecules, the number of binding events, and the duration of long‑lived binding change (or are maintained) during differentiation. Notably, long‑lived chromatin engagement by GATA2 increases during early erythroid differentiation and decreases at the late stage. CUT&Tag identifies regulatory elements selectively occupied by GATA2 during the early transition stage. Together, these results support a model in which transcription factor kinetics form a dynamic chromatin‑engagement profile that characterizes the GATA2‑to‑GATA1 transition.

      Strengths:

      (1) Characterizing transcription‑factor binding kinetics during the GATA2->GATA1 transition addresses a fundamental mechanism in erythroid differentiation.

      (2) Combining single‑molecule live imaging with CUT&Tag provides both dynamic and locus‑specific perspectives.

      (3) Single-molecule analysis across three different cell systems strengthens the potential generalizability of the findings and highlights biological variability.

      Weaknesses:

      I agree that single‑molecule imaging is a powerful approach for investigating GATA2 kinetics, but the single‑molecule data are the most important part of the paper and need improvement. The analyses focus on three measures: (i) duration of long binding, (ii) proportion of short‑ and long‑binding molecules, and (iii) total binding events. However, several methodological and control issues limit confidence in the kinetic interpretations. The authors should address the following major concerns.

      (1) Two binding states: justification and controls

      The authors propose two states of GATA2 binding. Are there only two states? Studies that separate short‑ and long‑lived binding (e.g., Chen et al., 2014, PMID: 25342811) address the two states of transcription factors very carefully. Some long‑binding duration distributions here are very long‑tailed (e.g., Figure 2D middle), suggesting a possible third state. The authors must explain how they determined that two states provide the "best fit" to the data and how they classified "short" versus "long" binding.

      Controls should be included for long‑lived and short‑lived binding (e.g., histone proteins, HaloTag‑NLS, or a binding‑deficient GATA2 mutant) as in other studies. These controls are essential to exclude alternative explanations (see points below).

      (2) Exclude photophysical and focal‑plane artifacts

      The authors should exclude contributions from (i) photobleaching, (ii) blinking, and (iii) Z‑axis motion (disappearance from the focal plane). Although photobleaching correction is mentioned in the Methods, no details are provided. Describe and quantify the photobleaching correction and demonstrate that it was applied across all cell types and conditions.

      Some spots in the supplementary movies appear to blink or to move substantially between frames. Provide analyses or controls that distinguish true dissociation events from photophysical blinking/bleaching or axial motion.

      (3) HILO illumination and nuclear region sampled

      HILO is powerful but sensitive to illumination angle: slight changes sample different nuclear regions (e.g., nuclear interior versus periphery). The nuclear periphery is enriched in heterochromatin and may bias binding statistics. Explain how the authors controlled the HILO angle and confirmed that comparable nuclear regions were imaged across cells and conditions.

      (4) Quantification of event counts and long‑binding durations

      The number of binding events and measured long‑binding durations are strongly affected by imaging conditions (labeling/staining, bleaching, nucleus size, cell cycle state, focal plane, spot detectability, etc.). Imaging clarity appears to differ among cells/conditions in the supplementary movie. Provide more careful analysis describing how these variables were controlled or corrected for, and assess the sensitivity of results to choices in detection and tracking parameters.

      (5) Evidence that spots are single molecules

      The authors state that spots represent single molecules but do not provide supporting evidence. Spot brightness varies considerably in the movies. Brightness differences may reflect axial position. Provide evidence supporting single‑molecule assignment (e.g., single‑step photobleaching traces, brightness distributions compared to a known single‑molecule control, or photon count analysis).

      (6) Description of spot‑analysis pipeline

      The manuscript lacks a sufficient description of the spot‑analysis method. I reviewed the STRAP pipeline paper cited (Haque and Coleman 2025 bioRxiv) and the GitHub code, but the Methods in the current manuscript should include a detailed description of the STRAP pipeline. This would enable readers to evaluate and reproduce the analyses.

      (7) Differences among cell systems

      The three cell systems yield notably different results (e.g., Figure 2C vs 4C and Figure 2D/3D vs 4D). Provide a more detailed explanation for these differences and discuss how biological variability, technical differences, or imaging biases might account for the discrepancies.

    2. Reviewer #2 (Public review):

      In this study, the authors address the molecular mechanism underlying the transcriptional changes during erythroid differentiation from hematopoietic progenitor cells. The authors combine single-molecule live cell imaging and CUT&RUN to analyze the chromatin binding properties of the GATA2 transcription factor prior to and after initiation of differentiation into the erythroid cell lineage. Using three distinct cellular systems, the authors demonstrate that the chromatin binding of GATA2 is transiently increased early in the differentiation process, as evidenced by increased chromatin binding residence time and the emergence of new genomic binding sites identified by CUT&RUN. The strength of the study lies in the combination of single-molecule imaging, which reports on binding dynamics but is agnostic of the binding site, with CUT&RUN, which reports on the binding sites but does not provide dynamic information. The authors clearly demonstrate that chromatin binding of GATA2 is altered early in the differentiation process and is later displaced as cells switch to expression of GATA1, which has been previously observed. The use of three distinct cell lines, in particular the GATA2-SNAP mouse model, is a strength in principle; however, the results are not fully consistent between the different cell systems. A key difference is that the G1E-ER4 and HPC7 cell line models express HaloTagged GATA2 in addition to the endogenous GATA2 protein. The authors go to great lengths to control GATA2-HaloTag expression levels, but they use polyclonal cell lines and do not analyze expression levels of the GATA2-HaloTag transgene, which is a key variable in interpreting their experimental results. Finally, a key variable determined in their single-molecule analysis is the number of binding events observed during the distinct differentiation changes. The number of binding events observed is influenced by the expression level of the tagged protein, which in turn is controlled by the Shield-1 ligand, and the fraction of molecules labeled with the HaloTag ligand. Since transgene protein levels and the labeling efficiency were not determined, it is hard to assess how reliable the measurements of the number of binding events are across all cell lines.

      To address the weaknesses summarized above the authors could take the following steps:

      (1) Determine the expression levels of the GATA2-HaloTag transgene over the course of differentiation under the conditions used for single-molecule imaging. This will not only allow them to determine the expression of the transgene but also the endogenous untagged protein with which the GATA2-HaloTag fusion proteins compete for binding sites.

      (2) To determine the fraction of molecules labeled during imaging, the authors could carry out a titration of the HaloTag ligand and compare the amount of labeled protein under single-molecule imaging conditions to that of saturating labeling of the HaloTag. This approach will ensure that the number of labeled molecules per cell is comparable across experimental conditions and allow the authors to draw more solid conclusions regarding the number of binding events.

      (3) The analysis of residence times using single-molecule imaging requires robust single-particle tracking without gaps or interruptions of trajectories. The authors should show images of their particle trajectories to demonstrate that their tracking is robust or, even better, provide movies superimposing the trajectories onto the imaging data.

    3. Reviewer #3 (Public review):

      Hobbs et al. use live-cell single-molecule tracking (SMT) of HaloTag- and SNAP-tagged GATA2 combined with CUT&Tag chromatin profiling to examine how GATA2 chromatin engagement evolves during erythroid differentiation. Across three complementary systems, G1E-ER4 cells, HPC7 cells, and primary bone marrow progenitors from a new Gata2-SNAP knock-in mouse, they report a transient strengthening of long-lived GATA2 chromatin binding at the "Early" (2 h) erythroid stage, manifested either as increased residence time (G1E-ER4) or expansion of the long-lived bound fraction (HPC7, primary cells). CUT&Tag identifies 1,167 Early-restricted GATA2 peaks partitioning into GATA2-only (promoter-proximal, GATA/RUNX motifs) and GATA2+GATA1 co-bound (distal, GATA/E-box motifs) subclasses. The authors propose that this kinetic phase represents a previously unappreciated dimension of the GATA switch.

      This is a strong study with a genuinely novel finding: the non-monotonic kinetic behavior of GATA2 during erythroid priming, supported by complementary measurements in three biological systems. The issues below are largely clarifications, additional analyses of existing data, and modest refinements to the discussion. With these addressed, the manuscript will make a valuable contribution. I recommend a minor revision.

      Specific points:

      (1) Clarify the photobleaching correction and report per-cell bleach lifetimes.

      The long-lived residence time claim in G1E-ER4 cells depends on careful accounting for photobleaching, which the Methods indicate was done via a right-censoring model. For reviewer and reader confidence, the authors should report the per-stage (or per-cell distribution of) photobleaching lifetimes and the photobleach-corrected residence time values alongside the apparent values in Figure 2D. If feasible, including a brief supplementary analysis with an H2B-Halo or similar long-lived control under matched conditions would further solidify the quantitative claims. This is an analysis of existing data and should not require new imaging.
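
      To illustrate what a right-censoring treatment involves, the following minimal sketch gives the maximum-likelihood dissociation rate for a single exponential state when some tracks end for reasons other than dissociation. Which tracks count as censored (for example, those ending by photobleaching or at the end of the movie) is an assumption here, as are the function and argument names; the authors' actual model is not described in this review and may differ.

```python
def censored_exponential_rate(durations, dissociated):
    """MLE of an exponential rate with right-censored observations (sketch).

    durations:   observed track durations in seconds.
    dissociated: True if the track ended in an observed dissociation,
                 False if it was censored (assumed: bleaching or end of movie).
    """
    n_events = sum(1 for d in dissociated if d)
    if n_events == 0:
        raise ValueError("at least one uncensored dissociation is required")
    # Censored tracks add exposure time to the denominator but no event.
    return n_events / sum(durations)

# Two observed dissociations over 20 s of total observed binding time.
k_off = censored_exponential_rate([2.0, 4.0, 6.0, 8.0], [True, True, False, False])
residence_time = 1.0 / k_off
```

      Reporting this corrected residence time next to the apparent mean duration of uncensored tracks would make the size of the correction explicit for each stage.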

      (2) Unify or explicitly discuss the mechanistic differences across systems.

      The three systems show qualitatively different signatures: a residence-time change in G1E-ER4 cells versus bound-fraction expansion in HPC7 and primary cells. The authors currently group these under "enhanced engagement," but these signatures imply different underlying mechanisms (a decrease in koff vs. an increase in kon or in the specific-binding-competent pool). The Discussion partially addresses this by noting engineered vs. native differences, but a more explicit framing in both Results and Discussion would help readers. Specifically, reporting an on-rate proxy (for example, binding events per unit time normalized to detectable molecule number) alongside koff would let readers see how the mechanistic pieces fit together. I do not think this changes the central message; it sharpens it.

      (3) Per-cell GATA2 concentration would strengthen the "uncoupling" claim.

      A central claim of the Figure 6 model is that chromatin engagement is uncoupled from protein abundance. The ectopic Shield-1 stabilization system is a reasonable design choice, but quantifying total nuclear GATA2-Halo signal (for example, from the pre-bleach frame or a brief high-power acquisition) on a per-cell basis across stages would directly support the interpretation. For the primary cells, where the biological claim is strongest, a western blot or quantitative immunofluorescence on the flow-sorted populations would make the uncoupling argument much more defensible. I recognize this may be one additional experiment, but it is a high-value one.

      (4) Additional single-cell distribution analysis.

      Figure 1E and Figures 2 to 4 show substantial cell-to-cell heterogeneity, and the Early populations in particular look potentially bimodal. Given that the authors cite Wheat et al. and Palii et al. on probabilistic hematopoietic transitions, a brief supplementary analysis using distribution-based statistics (K-S test, or mixture model) rather than, or alongside, mean-based ANOVA would align the analysis with this conceptual framing and may reveal whether the Early state represents a subpopulation transition rather than a uniform shift. This is purely an analysis of existing data.
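
      To make the distribution-based comparison concrete, here is a minimal sketch of the two-sample Kolmogorov-Smirnov statistic, written in plain Python so it is self-contained. It only returns the statistic D (the largest gap between the two empirical CDFs) and no p-value; in practice an established routine such as `scipy.stats.ks_2samp` would be the natural choice. The per-cell values shown are hypothetical.

```python
from bisect import bisect_right


def ks_two_sample_statistic(sample_a, sample_b):
    """Largest absolute difference between two empirical CDFs (KS statistic D)."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    sorted_a = sorted(sample_a)
    sorted_b = sorted(sample_b)
    d_max = 0.0
    # Both ECDFs are step functions, so the maximum gap occurs at a data point.
    for x in sorted_a + sorted_b:
        cdf_a = bisect_right(sorted_a, x) / len(sorted_a)
        cdf_b = bisect_right(sorted_b, x) / len(sorted_b)
        d_max = max(d_max, abs(cdf_a - cdf_b))
    return d_max


# Hypothetical per-cell bound fractions for two stages (illustration only).
stage_one = [0.10, 0.12, 0.11, 0.13, 0.09]
stage_two = [0.11, 0.12, 0.25, 0.27, 0.26]
d_stat = ks_two_sample_statistic(stage_one, stage_two)
```

      Unlike a comparison of means, D is sensitive to a subpopulation shifting while the rest of the cells stay put, which is the scenario a bimodal Early population would suggest.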

      (5) Quantitative integration of CUT&Tag with SMT.

      The manuscript presents SMT and CUT&Tag as complementary but does not attempt to quantitatively connect them. A back-of-the-envelope calculation of whether a 21% increase in residence time (G1E-ER4), or the fraction expansion in other systems, is consistent with the acquisition of the 1,167 Early-restricted sites, given plausible site affinities, would substantially strengthen integration. Even if the calculation is approximate, framing it explicitly would help readers appreciate that the two datasets reinforce each other.

      (6) Short-lived kinetic interpretation and tracking parameters.

      The 1.5 s gap allowance in tracking is long relative to the 0.55 to 0.73 s short-lived residence times reported in primary cells (Figure Supplement 1F), which could affect the interpretation of the "slowing of target search" claim. A brief sensitivity analysis with tighter gap parameters in the supplement would reassure readers that this effect is robust. Additionally, please clarify how the inferred slowing of search, which should reduce kon, is reconciled with the increased number of binding events per cell at the Early stage.

      (7) CUT&Tag peak definition.

      The Early-restricted peak set is defined by presence and absence at q < 0.01, which can be sensitive to near-threshold peaks. Please report either (a) the CUT&Tag signal intensity distribution at the 1,167 sites across all three stages as a quantitative scatter or density plot, beyond the heatmap in Figure 5C, or (b) the result of a differential binding analysis (for example, DESeq2 on read counts in a union peak set) as a supplementary confirmation. Please also state the number of CUT&Tag replicates per stage and the overlap of Early-restricted sets across replicates.

      (8) Knock-in mouse validation.

      The Gata2-SNAP allele is a valuable new tool, and it would benefit from slightly more quantitative validation in the supplement. A brief characterization of basic hematopoietic parameters in homozygotes (CBC, LSK/HSPC frequencies, or colony assays) would confirm that the tagged allele is truly physiological and would serve the community that will want to use this mouse going forward. If this has been done, please include it; if not, a statement about what was checked would suffice.

    1. Reviewer #1 (Public review):

      This manuscript is very interesting and timely. By introducing the critical effects of desolvation barriers and solvent (water)-separated minima into the implicit-solvent potentials (of mean force, PMFs) for coarse-grained molecular dynamics simulations of biomolecular liquid-liquid phase separation (LLPS), this work fills a gap that should have been apparent to protein folding researchers over the past couple of decades but has so far escaped the attention it deserves: these basic features of aqueous solvation have seldom, though not never, been invoked in recent studies of biomolecular condensates. Although the present paper deals almost exclusively with homopolymers, this work can be a foundation for the future development of a new, more physical coarse-grained interaction scheme for simulating amino acid sequence-dependent effects, which I presume is the authors' ongoing or next endeavor. The results presented in this manuscript are highly valuable.

      However, there is room for improvement in the authors' description of (i) the broader impact of effects of desolvation barrier and solvent-separated minimum in the thermodynamics of biomolecular condensates, especially with regard to the ramifications on hydrostatic pressure-dependent effects; (ii) the physical implication of using a 20-parameter hydropathy scale rather than a 210-parameter pairwise amino acid interaction scheme; and (iii) temperature-dependent effects, including the authors' discussion of "enthalpic" and "entropic" contributions. In all these aspects, the authors' discussion should be put in a more comprehensive context of the existing literature. At a few other places, the description of the methods and results should be clarified as well. Accordingly, the authors should revise the manuscript to address the following items thoroughly within the revised manuscript (not merely in the response letter) with the additional references mentioned below included in the revised discussion:

      (1) In several places, e.g., on line 77 (p.2), the authors appear to suggest that "implicit-solvent representation" is the origin of the deficiency in commonly utilized coarse-grained potentials that this study is aiming to rectify. But desolvation barriers and solvent-separated minima are also features of implicit-solvent representations; they are just features that should be incorporated in more accurate implicit-solvent potentials. This point is stated quite clearly and accurately in the Abstract (p.1) but not consistently in the rest of the text. The authors should check the entire text carefully to ensure that a coherent, accurate perspective is presented.

      (2) In the discussion of the importance of desolvation barriers and solvent-separated minima in the Introduction (pp.1-3), connections should be drawn to recent works that utilize these PMF features to rationalize hydrostatic pressure (P)-modulated effects on biomolecular LLPS, including the P-dependent reentrant phase separation of alpha elastin; see Cinar et al. (2019) Chem Eur J 25:13049 (https://chemistry-europe.onlinelibrary.wiley.com/doi/full/10.1002/chem.201902210) and references therein, especially discussions around Figures 10, 11 & 13 in this reference.

      (3) In the lower panels of Figures 2D, E (p.5), what do the differently colored small circles in the double-minimum free energy profiles represent? Does the color shading have the same meaning as that in the upper panels? If so, what do the positions of the circles on the free energy profile represent? The authors should clarify this.

      (4) The discussion regarding entropy and enthalpy around Figure 2 is quite confusing as it stands. What exactly do the authors mean by the association of entropy or enthalpy with the desolvation barrier or the solvent-separated minimum? Are they referring to conformational entropy?

      (5) Do the authors assume that the PMF (effective implicit-solvent potential) is a purely enthalpic term? It appears to be the authors' assumption. If so, the assumption has to be stated clearly in their discussion of "entropy" vs "enthalpy" around Figure 2.

      (6) Closely related to points 3-5 above, it should be stated clearly that the "temperature" used in the authors' simulations does not represent experimental temperature if the authors are using purely enthalpic effective potentials because PMFs are in fact temperature-dependent. This clarification is necessary to avoid misunderstanding. In this regard, it should be noted that temperature-dependent effective interactions have been used for modeling biomolecular condensates in analytical theory (Lin, Song, Forman-Kay & Chan, J Mol Liq 2017, already in the citation list) as well as in coarse-grained molecular dynamics simulations [Dignon et al. (2019) ACS Cent Sci 5:821-830 (https://pubs.acs.org/doi/10.1021/acscentsci.9b00102); Chakravarti & Joseph (2025) Protein Sci 34:e70284 (https://onlinelibrary.wiley.com/doi/10.1002/pro.70284)]. The latter two studies, not cited currently, are particularly relevant and thus should be cited because the authors may wish to incorporate temperature-dependent features in their ongoing or future effort in constructing a more comprehensive coarse-grained interaction scheme for biomolecular LLPS simulation.

      (7) In tackling "entropy" vs "enthalpy", it should be noted that the temperature dependence of the effective interactions entails an entropic contribution (which is itself temperature dependent) in addition to conformational entropy. As for the effective potential with desolvation barrier and solvent-separated minimum, it should be noted that the decomposition into entropic and enthalpic contributions at the direct contact, desolvation barrier, and solvent-separated minimum can be dramatically different; see, e.g., MacCallum et al. (2007) PNAS 104:6206-6210 (https://www.pnas.org/doi/full/10.1073/pnas.0605859104) and references therein.

      (8) P.7, line 340: The proportionality relation follows directly from the standard Flory-Huggins result T_c = T chi(T)/chi_c, thus the proportionality constant is exactly 1/chi_c. Is this the standard relation that the authors are invoking here? The authors should clarify this.

      (9) The study on dynamic consequences on pp.8-11 is interesting, but clarifications are necessary:

      (i) The vertical schematic in Figure 4A should be explained in detail in its entirety. As it stands, no explanation is provided either in the figure caption or in the text. In particular, what does "elasticity driven" refer to?

      (ii) The top snapshot in Figure 4A is labeled t_sim = 0 ns. Does it mean that the snapshot shown is the only chain configuration that the authors used to start the simulation, and that the snapshot does NOT represent the result of any time evolution, no matter how short the duration is? However, if that is the case, why is this snapshot identified with spinodal decomposition if it is not the product of a time evolution from a more homogeneous configuration?

      (iii) Related to (ii): do the rectangular boxes shown represent the entire simulation box or just the part of the box containing the polymer chains? One would imagine that, if the top snapshot represents spinodal decomposition, the simulation would have been started from a more uniform distribution a short time prior. Why is this not the case?

      (iv) What precisely do the small yellow beads and black-colored springs in the zoom-in image of Figure 4E represent?

      (10) In discussing dynamic effects, it is useful to draw connections to related works on the effect of chain flexibility on "aging" of condensates [Biswas & Potoyan (2024) PRX Life 2:023011 (https://journals.aps.org/prxlife/abstract/10.1103/PRXLife.2.023011)] and characterization of viscoelasticity in simulations of biomolecular condensates [Tejedor et al. (2023) J Phys Chem B 127:4441-4459 (https://pubs.acs.org/doi/10.1021/acs.jpcb.3c01292)], as the effects of desolvation can be explored further based on these prior works.

      (11) Much of the present study is based on the original HPS formulation of Dignon et al. (2018). In this regard and also in anticipation of future development of improved interaction schemes, several issues should be stated and discussed, even if briefly:

      (i) The original HPS model has a basic shortcoming in accounting for the relative interaction strengths of, among others, arginine vs lysine residues [Das et al. (2020) PNAS 117:28795-28805 (https://www.pnas.org/doi/10.1073/pnas.2008122117)].

      (ii) Compared to 210-parameter pairwise interaction schemes, such as KH in Dignon et al. (2018) and Joseph et al. (2021), the 20-parameter interaction scheme is likely too restrictive to account for pairwise amino acid residue interactions [Wessén et al. (2022) J Phys Chem B 126:9222-9245 (https://pubs.acs.org/doi/10.1021/acs.jpcb.2c06181)].

      (iii) The height of the desolvation barrier may vary significantly for different amino acid residue pairs, see, e.g., Figure 11 of Cinar et al. (2019) mentioned above (and references therein). The authors should discuss these nuances in the revised version. They may also wish to take them into consideration in future investigations.
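
      For reference, the relation queried in point (8) can be sketched numerically. The sketch assumes a purely enthalpic interaction parameter, chi(T) = epsilon/T, so that T chi(T) is independent of T and the critical condition chi(T_c) = chi_c gives T_c = T chi(T)/chi_c. It also uses the textbook Flory-Huggins critical value for a polymer of N segments in a monomeric solvent, chi_c = (1 + 1/sqrt(N))^2 / 2. The function names are illustrative, and whether the manuscript invokes exactly this relation is the question put to the authors.

```python
import math


def chi_critical(n_segments):
    """Flory-Huggins critical chi for an N-segment polymer in monomeric solvent."""
    if n_segments < 1:
        raise ValueError("n_segments must be at least 1")
    return 0.5 * (1.0 + 1.0 / math.sqrt(n_segments)) ** 2


def critical_temperature(temperature, chi_at_temperature, n_segments):
    """T_c = T * chi(T) / chi_c, valid only if chi(T) is proportional to 1/T.

    The 1/T form is an assumption (purely enthalpic effective interaction);
    a temperature-dependent PMF would break this proportionality.
    """
    return temperature * chi_at_temperature / chi_critical(n_segments)
```

      Under this assumption the proportionality constant between T_c and T chi(T) is exactly 1/chi_c, which depends only on chain length.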

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript addresses an important and timely question in the molecular simulation of biomolecular condensates. Most residue-level coarse-grained models used for IDP phase separation employ implicit solvent and represent effective interactions through relatively simple pairwise potentials. While these models have been very useful, they usually do not explicitly distinguish direct contacts from solvent-separated interactions, nor do they include an energetic barrier associated with water removal. This manuscript attempts to address that limitation by introducing desolvation-inspired terms into coarse-grained models and examining their consequences for phase behavior, chain conformations, dense-phase packing, and dynamics.

      Strengths:

      The central idea is physically well motivated. Using a simple homopolymer model, the authors show that increasing the desolvation barrier suppresses phase separation, whereas stabilizing solvent-separated contacts enhances phase separation. They further show that solvent-separated interactions can reduce dense-phase over-compaction, which is a meaningful result given the known challenges in obtaining both accurate single-chain dimensions and realistic dense-phase properties from the same coarse-grained model. The finding that desolvation-like terms can reshape dense-phase packing without simply rescaling the overall interaction strength is interesting and could be useful for future model development. I also found the attempt to connect conformational changes across dilute and dense phases with thermal distance from the critical point to be intriguing. The dynamic analysis, including the FRAP-like simulations and the discussion of kinetic arrest during coarsening, adds another useful dimension to the work.

      Weaknesses:

      At the same time, there are several places where the manuscript would benefit from more careful framing. First, the desolvation terms are still effective coarse-grained parameters rather than a direct representation of water molecules. The language sometimes gives the impression that desolvation is being treated explicitly, whereas the model introduces desolvation-inspired effective interactions into an implicit-solvent framework. Second, the conformational analysis is interesting, but the broader context of prior work on dilute-to-dense phase conformational reorganization of IDPs could be more clearly discussed. This would help clarify what is new in the present work, whether it is the conformational change itself, its dependence on desolvation terms, or the proposed scaling with distance from the critical point. Third, the dynamic results are potentially useful, but the manuscript should more clearly articulate what is nontrivial beyond the expected slowing of local rearrangements by an added barrier in the potential.

      Overall, I think this is a useful and potentially important contribution.

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript presents an original quantitative approach for tracking the online formation and updating of prior beliefs. In an Alternating Serial Reaction Time task, participants were exposed to probabilistic visual streams, and their pre-stimulus saccadic behavior (i.e., the first eye movement after the previous stimulus disappeared) was monitored via eye-tracking. Since the stimuli followed an alternating probabilistic sequence, upcoming events did not appear with full certainty: some stimuli had a higher, some a lower probability. By comparing anticipatory oculomotor behavior between high and low probability events, the authors dissociated between learning/belief updating and general oculomotor noise. Noise-driven errors were more frequent than learning-dependent errors, which nonetheless triggered more belief updating (i.e., a change in oculomotor behavior in a subsequent encounter of the same event). Interestingly, updating depended more strongly on whether a prior belief was consistent with the task's probabilistic structure than on prediction errors. These findings suggest that incidental, implicit statistical learning may rely on conservative updating with a relatively low learning rate, or on errorless algorithms, rather than prediction errors per se.

      Strengths:

      By applying a fine-grained analysis of anticipatory oculomotor behavior, this work establishes new continuous metrics to quantify the gradual learning and refinement of prior expectations during statistical learning. These metrics provide convincing evidence of the dynamics of anticipatory oculomotor behavior.

      The method is paradigm-independent, offering generalizable metrics for tracking the dynamic formation and refinement of predictive models in any task involving probabilistic stimulus streams. In the future, computational modeling may leverage these continuous metrics to better dissect the mechanisms underlying statistical learning.

      Weaknesses:

      The authors subscribe to the idea that statistical learning is not a unified concept but rather is implemented via multiple underlying mechanisms. However, it remains unspecified what these different mechanisms could be, and how eye movements could contribute to distinguishing between them.

      The authors claim that they developed a novel methodological approach to probe whether anticipatory eye movements directly reflect priors, thereby filling an outstanding gap. However, this claim ignores mounting relevant work on structure learning using eye-tracking in the developmental field.

      The authors claim that their framework quantifies trial-by-trial oculomotor dynamics, while in fact the analyses use epochs (i.e., groups of multiple trials) as predictors. Why not use trial number as a predictor to truly investigate trial-by-trial dynamics that directly reflect anticipation, surprisal, and revision?

    2. Reviewer #2 (Public review):

      Summary:

      Hann and colleagues introduce a gaze-based analytical framework designed to capture, on a trial-by-trial basis, how people form and revise their predictions during implicit probabilistic sequence learning. Using an eye-tracking adaptation of an alternating sequence task, they record the first anticipatory saccade during the response-stimulus interval and classify each such saccade along two dimensions: whether it was directed toward a high- or low-probability upcoming stimulus (the learning-dependent vs. not-learning-dependent distinction), and whether the anticipated location coincided with the stimulus that actually appeared. A complementary iterative-updating metric codes whether a participant's prediction for a given three-element context is repeated or revised on successive encounters of that context.

      On the basis of these measures, the authors report that errors congruent with the inferred regularity - which they interpret as reflecting environmental noise - become progressively more frequent than errors reflecting an inaccurate internal model; that participants show a pronounced tendency to repeat their previous prediction rather than revise it; and that updates depend more on whether a prior belief is congruent with the task's statistical structure than on whether the previous prediction was confirmed. They interpret these results as evidence that statistical learning is less error-driven and more repetition-based (Hebbian in character) than is typically assumed.

      Strengths:

      The methodological ambition of the work is considerable, and the paper makes several contributions that are likely to be useful to the implicit-learning and predictive-processing communities. Using the first anticipatory saccade as a pre-response behavioral readout of prediction is conceptually well-motivated: it provides a trial-by-trial index of predictive orienting at a temporal resolution that manual reaction times cannot deliver, and it does so before the outcome of the trial is known. The explicit distinction between errors arising because the task's outcome is stochastic - that is, predictions congruent with the statistical structure but unconfirmed by the stochastic sample - and errors arising because the internal model is inaccurate is a theoretically meaningful move: predictive-coding and Bayesian accounts have long argued that these two sources of surprise should carry different weight for model revision, and the authors offer a behavioral operationalization of that distinction. The analytical pipeline is not tied to the specific paradigm used here and could be applied to other probabilistic sequence-learning tasks, which gives it broader methodological utility than a single-paradigm report. Finally, the demonstration that learners maintain their prior across successive occurrences of the same context, even when it has been disconfirmed by the most recent outcome, is a robust behavioral observation that speaks directly to an unresolved debate about whether statistical learning is dominantly error-driven.

      Weaknesses:

      The framework and the core behavioral observations are valuable, but several inferential steps - from the gaze signal to the cognitive constructs the authors invoke - are not fully supported by the present design, and these gaps affect how readers should interpret the stronger theoretical conclusions.

      The "process-pure" framing conflates sensitivity with construct purity. The authors repeatedly describe the eye-tracking measure as providing a more process-pure index of statistical learning than manual-response paradigms. Anticipatory saccades are themselves a learned motor behavior - the oculomotor system is among the most plastic motor outputs the primate brain generates, and sequence learning in the saccadic system is well-documented. The present design does not dissociate learning of the statistical structure from learning of the oculomotor sequence that expresses it, so the measure is not, on its face, free from the motor-learning confound that the authors criticize in button-press paradigms. The framing should be read as aspirational rather than as demonstrated by the present data.

      The oculomotor reaction-time data do not show the canonical signature of statistical learning. Reaction times for low-probability trials rise across epochs while those for high-probability trials remain approximately flat (Figure 5). The emerging difference between the two trial types, therefore, appears to be driven by a slowing of responses to low-probability stimuli rather than by a facilitation of responses to high-probability ones, and the authors do not rule out the alternative interpretations that this pattern reflects fatigue, a motor floor effect, or inhibition of unexpected locations. Because no fixation constraint is imposed during the response-stimulus interval, pre-stimulus gaze drift toward the anticipated location will artifactually reduce reaction time on precisely those trials the authors wish to treat as learning-driven; the fact that measured reaction times remain well above zero even on trials classified as correct anticipations is itself evidence that this contamination is present. The oculomotor reaction-time data, therefore, do not provide as clean a verification of learning as the manuscript implies.

      The correct/error labeling of anticipatory saccades incorporates information that the participant did not have. Because the first saccade occurs during the response-stimulus interval - that is, before the upcoming stimulus is revealed - the participant's internal predictive state is identical whether the trial is subsequently classified as a learning-dependent correct response or a learning-dependent error. Any difference in the epochwise frequency of these two categories must therefore be driven, at least in part, by the external stochastic structure of the task rather than by a difference in the predictive process itself. In particular, the observation that learning-dependent errors are the most frequent saccade type (Figure 7) is predicted by the prior probabilities of the outcomes alone, given a high-probability prediction, without appeal to any difference in predictive state. Readers should recognize that the theoretically meaningful contrast is between learning-dependent and not-learning-dependent anticipations (two categories), and that the four-way split risks confounding predictive state with outcome stochasticity.

      The iterative-updating metric does not distinguish prior revision from alternative processes. The binary update / no-update code, computed across non-contiguous occurrences of the same three-element context, does not discriminate between a genuine update of the internal model, simple episodic retrieval of a previously encountered triplet, and oculomotor perseveration. Without a formal generative model to anchor the interpretation, the central theoretical claim - that statistical learning is less error-driven than commonly assumed - is underdetermined by the data. The repetition pattern the authors observe is equally consistent with an error-driven model equipped with a low learning rate in a stable environment, an interpretation the authors themselves acknowledge in the Discussion. Adjudicating between these possibilities requires comparison against explicit computational models, which the present manuscript does not provide.
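
      The low-learning-rate alternative can be illustrated with a toy delta-rule learner. The sketch below is not the authors' analysis and not a fitted model: the outcome sequence, the learning rates, and the function name are all hypothetical, chosen only so that one location follows the context on 5 of every 8 encounters. The learner keeps one value per location, predicts the location with the highest value, and then moves every value toward the observed outcome in proportion to the prediction error.

```python
def repeat_rate(learning_rate, outcomes, n_locations=4):
    """Fraction of encounters on which the prediction equals the previous one.

    Error-driven (delta-rule) learner for a single context: each value moves
    toward 1 for the observed location and toward 0 for the others.
    """
    values = [1.0 / n_locations] * n_locations
    previous_prediction = None
    repeats = 0
    comparisons = 0
    for outcome in outcomes:
        # Predict before the outcome is revealed; ties go to the lowest index.
        prediction = max(range(n_locations), key=lambda i: values[i])
        if previous_prediction is not None:
            comparisons += 1
            repeats += prediction == previous_prediction
        previous_prediction = prediction
        # Prediction-error update for every location.
        for i in range(n_locations):
            target = 1.0 if i == outcome else 0.0
            values[i] += learning_rate * (target - values[i])
    return repeats / comparisons


# Hypothetical stable environment: location 0 follows the context 5 times in 8.
CONTEXT_OUTCOMES = [0, 0, 1, 0, 2, 0, 3, 0] * 10

low_rate_repeats = repeat_rate(0.05, CONTEXT_OUTCOMES)
high_rate_repeats = repeat_rate(0.9, CONTEXT_OUTCOMES)
```

      In this toy sequence the slow learner never changes its prediction after a disconfirmation, whereas the fast learner switches after most of them. Both are fully error-driven, which is why a high repetition rate alone cannot separate a repetition-based account from an error-driven one.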

      Data loss and the absence of fixation control. An interpretable saccade is detected on fewer than half of all trials (48.76%; line 889), and the manuscript does not report the distribution of saccade counts per interval, the per-condition trial counts after all exclusions, or the decomposition of the 20% missing-data threshold into its underlying causes. Given that the entire inferential apparatus rests on this subset of trials, the degree of data loss is a relevant context for the reader. Separately, no fixation constraint is imposed between trials: the participant's starting gaze position at the onset of each response-stimulus interval is whatever position was reached at the end of the preceding response, and this starting position carries trial-history information correlated with the upcoming stimulus. This leaves open the possibility that what is classified as predictive orienting partly reflects the mechanical consequences of where the eye happened to be at the end of the previous trial. The authors defend the absence of a fixation cross on the grounds that it would transform the transitional structure of the task, but this is an empirical claim presented without a supporting citation.

      Heterogeneity within the high-probability condition is not addressed. The two routes to a high-probability triplet in the design - pattern-random-pattern (50% of trials) and random-pattern-random (12.5%) - differ both in their base rate and in the reliability of the contextual cue they provide. Collapsing across these subtypes is an analytical choice that may conceal heterogeneity in the underlying learning process.

      Appraisal: Do the results support the authors' conclusions?

      The framework succeeds in providing a trial-by-trial behavioral readout of predictive orienting that is more fine-grained than conventional reaction-time measures, and the behavioral dissociation between errors congruent with the regularity and errors reflecting an inaccurate internal model is a genuine empirical contribution. The conclusions about the mechanistic nature of statistical learning should be read as motivating hypotheses for future modeling work rather than as settled empirical claims.

      Impact and utility:

      The analytical framework introduced here is likely to be useful to researchers working on implicit learning, predictive processing, and Bayesian models of perception and cognition. The measure of predictive orienting and the iterative-updating code could be adapted to a range of probabilistic learning paradigms, and the behavioral dissociation between noise-driven and model-mismatch errors fills a methodological gap that the field has long acknowledged. The authors share their data and code openly, which will facilitate reuse. The most durable contribution of the paper is methodological; the theoretical claims about the nature of statistical learning will require additional computational modeling before they can be regarded as established.

    1. Reviewer #1 (Public review):

      This paper reports an auditory-directed analysis of the HCP 7T short movie dataset. It has the goal of using the film audio to create tonotopic (pRF) maps and combine these with other HCP-provided data (e.g., T1/T2 ratio) to improve understanding of auditory cortex organization and relative functional segregation, particularly in reference to speech processing.

      The paper is ambitious, uses well-founded existing tools for combining data across subjects, and, in the Discussion in particular, makes a lot of careful points about interpretation. The paper shows that, at least for a very large 7T dataset (and for at least a few individual participants), good-quality cross-subject-average tonotopic maps can be extracted from fMRI movie datasets via basic spectral modelling of the movie soundtracks. It also suggests ways that these movie-based maps can be combined to come up with potential models of cortical organization. The PCA analysis is a creative way of combining maps (see below for comments).

      These are valuable tools for the field in exploiting/exploring existing data, and I look forward to trying them out myself. I want to emphasize that this is not 'damning with faint praise' - a concrete demonstration of this approach with freely available tools/examples is not only the product of a lot of effort (thank you!), but will be an impetus to research going forward.

      In terms of the contribution to our understanding of auditory cortex organization, using this large N cohort, they replicate a number of findings in the literature from the last couple of decades, including the overlap of low frequency preference with greater speech stimulus preference (e.g. Moerel, de Martino, & Formisano, 2012, J Neuro), patterns of BF width across cortex (Moerel et al., various; Thomas et al. 2015), use of shorter and longer natural sounds (Moerel et al., 2012, 2014; Dick et al., 2012), the importance/influence of sustained spectral attention for tonotopic mapping (da Costa et al., 2013; Dick et al., 2017; Riecke et al. 2017), the use of tonotopy and 'myelin' mapping to establish areal or regional boundaries (Dick et al., 2012; de Martino et al., 2015; Besle et al., 2018, etc) and the overall shape and consistency of tonotopic maps (e.g., Talavage et al., 2004, Humphries et al., 2010 and many others). To my knowledge/memory, this is the first tonotopy paper that has used the cross-subject cortical-surface-based averaging techniques that are driven by more than curvature/sulcal alignment.

      The paper focuses in particular on creating new sets of ROIs based on the various maps derived from the data. Despite being quite familiar with this body of work, I found it difficult to follow how the ROIs were derived, and how and why they were different and/or an improvement over existing parcellation schemes (see for instance Sereno, Sood, & Huang, 2022 for a comprehensive parcellation framework across modalities including auditory, based on combined receptive surface mapping, myelin estimates, and other metrics).

      Given the hour of fast(ish) fMRI data on a 7T with pretty big voxels (so high SNR), one aspect of the results that I found surprising - and potentially informative - was the lack of reliable tonotopic 'mappability' in the majority of participants. The authors' analytic approach to computing the pRFs seems completely reasonable (and shows good average maps), and yet individual maps seem unreliable except for the very best examples. I wondered if this might be due to problems in data collection with earbuds becoming slightly uncoupled and therefore delivering a lot less lower-frequency response and also not preventing scanner noise from getting to the ear; this is often a problem with any in-scanner earbud system (including the Sensimetrics). I wondered if the robustness of the 'speech maps' was associated with that of tonotopy; if they are highly associated, that would suggest that either there were huge individual differences in auditory attention, or perhaps that there was some variability in the acoustic signal delivered to each participant.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors leverage a high-powered 7T fMRI dataset of subjects viewing naturalistic audiovisual movies to elucidate the topographic organization of the human auditory cortex. By applying a nonlinear pRF model, they successfully map tonotopic gradients extending beyond the auditory core into the STG and STS areas. A primary finding is a medial-to-lateral gradient of increasing response compressivity, which the authors claim mirrors the hierarchical cascade architecture of the visual system. Furthermore, the modeling reveals that regions exhibiting high speech selectivity predominantly occupy the low-frequency portions of non-primary tonotopic maps. The authors argue that this architecture reflects an efficient coding mechanism where the cortex magnifies specific spectral features to facilitate the transition from acoustic encoding to flexible speech representation.

      Overall, the study presents concise analyses and compelling high-resolution results that advance our understanding of auditory cortical organization. However, the manuscript currently exhibits several significant theoretical and methodological gaps that temper its broader claims. Most notably, the authors' reliance on a spatial, retinotopic-like analogy overlooks the fundamentally temporal nature of audition. Decoding continuous, natural speech relies heavily on dynamic, full-spectrum temporal integration and contextual recurrent computations, which are difficult to reconcile with the purely static, low-frequency spatial tuning observed here.

      Strengths:

      (1) The utilization of ultra-high-field 7T functional imaging combined with large-scale, naturalistic continuous stimuli provides an excellent signal-to-noise ratio and captures cortical responses under ecologically valid conditions.

      (2) The application of a non-linear pRF encoding model provides a robust, quantitative method for parameterizing and mapping tonotopic features across the cortex, moving beyond simple contrast-based parcellations.

      (3) The manuscript effectively demonstrates the relationship between category selectivity (e.g., speech) and underlying tonotopy, drawing an elegant and structurally useful analogy to the well-established relationship between category selectivity and retinotopy in the visual cortex.

      Weaknesses:

      (1) While the PCA mapping of the functional and structural parameter space is visually compelling, the robustness of this representational geometry across varying acoustic contexts remains ambiguous. Because the model relies on the specific statistical regularities of a single naturalistic audiovisual stimulus set, it is unclear if this low-dimensional structure would hold when tested against isolated speech sounds, environmental noise, or spectrally matched non-speech control stimuli.

      (2) The methodological descriptions currently lack the computational precision required for replication and deep evaluation. I would suggest that the exact mathematical formulation of the encoding model be fully specified in the Methods section. This should include an explicit definition of the objective function, a clear accounting of all terms and hyperparameters utilized during the fitting process, and the exact dimensionalities of both the input feature space and the resulting parameter space.

      (3) There is a critical theoretical disconnect between the observed static, low-frequency tuning in the STG and the known acoustic requirements for continuous speech perception. Speech is a full-spectrum signal; while fundamental frequencies and formants dominate the lower spectrum (which is vital for processing dynamic pitch contours), high-frequency bands (>1 kHz) carry indispensable phonetic information, such as the rapid spectrotemporal dynamics of consonants, especially fricatives. If the speech-responsive cortex is primarily and statically tuned to a low-frequency spectrum, it is unclear how the dynamic, high-frequency spectral information required for semantic decoding is represented. A rich body of electrophysiological literature documents diverse spectrogram coding in the STG. For example, Mesgarani et al. (Science, 2014) demonstrated using spectrotemporal receptive field models that neural populations in the STG are tuned to both low and high-frequency spectrograms well above 1 kHz. The authors must address this discrepancy and attempt to reconcile their static tonotopic findings with the existing literature on dynamic speech encoding.

      (4) While drawing parallels between visual and auditory processing hierarchies is conceptually attractive, the modalities face fundamentally different computational challenges. Vision is largely resolved in space, making a retinotopic spatial coding strategy ecologically and computationally sound. Audition, however, evolves continuously in time. Complex temporal structure, continuous temporal integration, and contextual recurrent computations are paramount for auditory processing, particularly for speech comprehension. In this sense, a purely spatial or tonotopic coding framework is insufficient to fully explain the complex temporal processing dynamics required in the higher-order auditory domain.
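
      To make the request in point (2) concrete, the following is a minimal sketch of the kind of specification that would help. It shows one generic form of a compressive pRF encoding model with a least-squares objective. It is not the authors' formulation: every symbol here ($S$, $f_k$, $\mu_v$, $\sigma_v$, $n_v$, $\beta_v$, $h$) is an assumption introduced for illustration.

```latex
% Illustrative only: a generic compressive pRF model, not the authors' specification.
% S(k,t): soundtrack spectrogram energy in frequency band k (centre frequency f_k) at time t
% h: haemodynamic response function; v indexes voxels or vertices
\[
r_v(t) = \left( \sum_{k=1}^{K} S(k,t)\,
         \exp\!\left( -\frac{(\log f_k - \mu_v)^2}{2\sigma_v^2} \right) \right)^{n_v},
\qquad
\hat{y}_v(t) = \beta_v \,(h * r_v)(t)
\]
\[
\mathcal{L}(\theta_v) = \sum_{t=1}^{T} \bigl( y_v(t) - \hat{y}_v(t) \bigr)^2,
\qquad
\theta_v = (\mu_v, \sigma_v, n_v, \beta_v)
\]
```

      In a specification of this kind, the input dimensionality is $K \times T$ (frequency bands by time points) and the parameter space has four free parameters per voxel, with $n_v < 1$ corresponding to a compressive response. Stating the equivalent quantities for the actual model, together with any regularisation terms and fitting hyperparameters, would address the concern.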

    3. Reviewer #3 (Public review):

      Summary:

      The work has the potential to identify the topographical organization of the auditory cortex, which remains controversial under current non-naturalistic sound stimulation. It does so using population receptive field mapping, an elegant approach developed in the visual domain to study the organization of the visual system under naturalistic stimulation conditions.

      Strengths:

      This work presents an analysis of the topographic study of auditory cortical organization, using a substantial Human Connectome Project 7-Tesla functional imaging dataset in which 174 participants viewed naturalistic movies.

      Weaknesses:

      The key issue for the paper is that even the authors seem undecided on what the topographical results are and whether these results, obtained from this massive dataset under movie-watching conditions, are consistent with, refute, or expand our notion of human auditory cortical field organization. Without this clarity, and with much of the discussion of the issues surrounding topographic mapping buried in the Supplementary materials section, it is not clear what the authors think the advance of the current work is beyond the large datasets.

      On the flip side, there is little consideration of the challenges of mapping the auditory cortex using naturalistic stimuli that prevent dissociating visual from auditory stimulation conditions, contributing to this clarity or lack thereof in tonotopic mapping.

      As such, the current manuscript struggles to achieve its full potential.

    1. Reviewer #1 (Public review):

      Summary:

      This study by Tsuji et al. explores a mechanical threat model in Drosophila using air puffs as a stimulus. The authors first establish the paradigm and show that air puffs induce cardiac deceleration along with increased locomotion. They then identify dopamine as a key regulator of this response and go on to map the underlying circuit. In doing so, they pinpoint two pairs of DA-WED neurons as critical players. They carefully used intersectional strategies to achieve relatively clean labeling of these neurons, which helps ensure that the observed effects can be attributed specifically to DA-WED neurons. They further show that DA-WED neurons are both required and sufficient to drive cardiac deceleration, and that their activity increases in response to air puff stimulation. These neurons also contribute to the locomotor response. Directly inducing cardiac deceleration via optogenetic manipulation of cardiomyocytes also increases locomotion, suggesting a link between cardiac state and behavioral output.

      Strengths:

      Overall, the experiments are thoughtfully designed, well-controlled, and clearly presented. The figures are easy to follow, and the conclusions are generally well supported by the data. The manuscript is also clearly written, with a discussion that acknowledges potential caveats and outlines future directions. The genetic tools, behavioral paradigm, heart rate measurement approaches, and stimulation methods introduced here will be valuable resources for the community.

      Weaknesses:

      A few minor points to add to the clarity of the manuscript:

      (1) The DA-WED driver (R48A08-AD ∩ VT008692-DBD ∩ TH-FLP) appears quite clean in the brain. However, since the study focuses on cardiac function and locomotion, it would be helpful to check expression in cardiomyocytes and the ventral nerve cord. This would help rule out any off-target expression that might contribute to the phenotypes and further support the idea of a descending pathway from brain dopaminergic neurons.

      (2) DA-WED>Kir2.1 abolishes the puff-induced locomotor response (Figure 5b), suggesting that DA-WED neurons are directly involved in mediating locomotion. In the model (Figure 5L), it might make more sense for the pathway from mechanical threat to locomotion to pass through DA-WED neurons. The authors could consider adjusting the schematic if they agree.

      (3) In line 408, Figure 5K should be 5L as it's a discussion of the model.

      (4) In Figure 5j, the x-axis is missing time labels. Even if it matches Figure 5h, adding labels would make it easier to interpret at a glance.

      (5) In line 312, it would be helpful to briefly explain why a 28 ms light pulse was used, compared to other pulse durations elsewhere in the paper.

      (6) The cardiac deceleration seems to recover quickly after the air puff ends, whereas the locomotor response persists longer (around 10-15 seconds; see Figure 1 and Figure 5). This difference might suggest that DA-WED neurons influence locomotion through an additional or partially independent pathway, beyond their role in cardiac regulation. It could be worth briefly discussing this possibility.

    2. Reviewer #2 (Public review):

      Summary:

      The authors study cardiac deceleration during threat responses in Drosophila. In particular, they focus on identifying the neuronal control of this deceleration. Using behavioral and cardiac tracking and analysis, genetics, and calcium imaging, they identify two pairs of dopaminergic neurons involved in cardiac deceleration during air puff responses.

      Strengths:

      The study is overall well done, and the paper is clearly written. In particular, the work on identifying the two pairs of dopaminergic neurons involved in cardiac deceleration, using a series of drivers and generating new ones, is rigorous and extensive. Finally, the authors manipulate the heartbeat to investigate how it influences threat responses.

      Weaknesses:

      There are, however, several points that need to be clarified, as some claims are not entirely supported by evidence.

      The authors, for example, claim that dopaminergic neurons are responsible for cardiac deceleration (during the air puff, lines 182-3, page 9). However, based on the work in this study, it seems that other neurons could be involved in this control as well. In addition to dopaminergic neurons, the authors test serotonergic and octopaminergic neurons, which, based on silencing experiments, also appear to be implicated in heartbeat deceleration. Furthermore, because they find that dopaminergic neurons are the only ones that, upon thermogenetic activation, lead to lower heartbeat frequency, they conclude that the dopaminergic neurons are responsible for air-puff-induced cardiac deceleration.

      However, these activation experiments are done in a different context than the air puff experiments (at a higher temperature, which could have an effect on the heartbeat changes upon activation of different neuron groups), and because silencing of other monoaminergic neuron types during the air puff also resulted in less cardiac deceleration, one cannot exclude the implication of octopaminergic or serotonergic neurons in air-puff-induced deceleration.

      Activation experiments without high temperatures (using, for example, optogenetics) and/or in the presence of the air puff would be important to determine that the dopaminergic neurons are the main type of monoaminergic neurons involved in air-puff-induced cardiac deceleration. Otherwise, the related claims should be rephrased in a way that clearly doesn't exclude a possible implication of other monoaminergic neurons.

      Regarding the interactions between the cardiac deceleration and locomotion, the authors propose, based on the results, that the optogenetic cardiac deceleration is sufficient to induce an increase in locomotion, and that it is the decrease in heartbeat that would be responsible via interoceptive pathways to trigger an increase in locomotion. In the model they propose, the DA-WED neurons would induce a decrease in heartbeat that, in turn, would trigger an increase in locomotion. There is not enough proof that cardiac deceleration is the one that triggers an increase in locomotion during air puff responses. As the authors themselves state, the experiments that would demonstrate this would involve preventing cardiac deceleration while optogenetically activating DA-WED. It can therefore not be excluded that the DA-WED neurons trigger an increase in locomotion that is possibly modulated by the cardiac activity. Both alternatives should be considered (models in Figures 4 and 5).

    3. Reviewer #3 (Public review):

      Summary:

      In this elegant study, Tsuji et al. identify a relationship in Drosophila between cardiodynamics and threatening stimuli where mild air puffs elicit a brief bradycardia that coincides with locomotion increases. They then take advantage of the arsenal of genetic tools available in the fruit fly to reveal the indispensability of dopamine, through the action of Dop1R2, in this phenomenon. Further, they pinpoint the source of this dopamine to two specific pairs of neurons - DA-WED that are threat-activated. They then test and find a potential role for cardiac interoception from the heart in linking behavior and cardiodynamics.

      Strengths:

      This is an interesting and timely story that brings together the tools of fruit fly systems neuroscience and links them with physiology. The experiments are well done and tell a very nice story. In particular, the primary message of the paper - that the authors have identified specific dopaminergic neurons that regulate cardiac activity - is sound.

      Weaknesses:

      There are no important problems with the scientific approach. Rather, there are some interpretive changes I would consider.

      (1) The changes in heart rate are small (10% or so), and, as far as I can tell, are evident for a beat or two. So the data may be better interpreted not as a change in rate but as a lengthening of diastole for a beat or two. That may seem a petty difference, but it might point to particular stretch-activated systems or changes in blood flow as the determinant.

      Heart rate must be averaged over time, and so might be blurring the effects. It may be useful to produce figures centered on beat count and duration rather than time. Because the effect may even be just on a single beat, we suggest the authors try plotting the average beat duration for each beat that follows the air puff. If it's really just the first beat, using a quantification of the change of this duration relative to the average that precedes the puff may produce more striking figures.

      (2) The authors' model that cardiac deceleration leads to walking is only partially supported by their data. In the first figure, the relationship between cardiac deceleration and walking probability seems to be inverted relative to their model (weak stimulus -> strong cardiac effect and weak locomotor effect; strong stimulus -> weak cardiac effect and strong locomotor effect). It is possible that this discrepancy may disappear when the authors look at beat duration rather than heart rate (for instance, if following the strong stimulus, there is a very long beat that is followed by tachycardia, thus weakening their observed HR change). It would also be easier to relate the data in Figure 1 to their interoceptive model if some data were shown that illustrated the relative timing of the cardiac change and the locomotor start.

      (3) Also, since the locomotor and cardiac changes are probabilistic, it would be very useful to see how their respective probabilities change when conditioned on the other. According to their interoceptive model, locomotion should preferentially increase on trials where cardiac deceleration occurs. The authors should discuss this incongruity and also potential alternative interpretations of their cardiac manipulation experiments. Perhaps the bradycardia makes them more sensitive to threats - as suggested in the introduction? Control flies show a mild increase in locomotion following green light (Figure 5j), so perhaps by slowing the heart, they are more sensitive and thus respond more strongly to this stimulus?

      (4) Looking at the example shapes of the beats in Figure 5g versus Figure 1c, the optogenetically induced diastole has a very different shape from the naturally occurring long beat. Thus, the exact cardiac stimulus may be unnatural. If this is true across trials and animals, it may be worth considering that the funny beat (like an anxiogenic atrial fibrillation in mammals) is the source of the fear and, in turn, locomotor behavior (also interesting!) rather than being a true replication of the cardiac events seen following the puff stimulus.
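
      As a minimal sketch of the analyses suggested in points (1) and (3), the code below computes per-beat durations after the puff relative to the pre-puff average, and the probability of one trial outcome conditioned on another. The function names, the trial dictionary keys (`decel`, `walk`), and the example numbers are hypothetical and are not taken from the manuscript; the sketch assumes beat onset times and per-trial binary outcomes are available.

```python
from statistics import mean


def beat_durations_after(beat_times, puff_time, n_beats=5, n_baseline=10):
    """Durations of the first n_beats beats ending after the puff,
    each divided by the mean duration of the last n_baseline beats
    that ended at or before the puff.

    beat_times: sorted beat onset times (seconds) for one trial.
    The beat in progress when the puff arrives counts as post-puff beat 1.
    """
    # Each interval is stored as (end_time, duration).
    intervals = [(t1, t1 - t0) for t0, t1 in zip(beat_times, beat_times[1:])]
    pre = [d for end, d in intervals if end <= puff_time][-n_baseline:]
    post = [d for end, d in intervals if end > puff_time][:n_beats]
    if not pre:
        raise ValueError("no complete beat before the puff to use as baseline")
    baseline = mean(pre)
    return [d / baseline for d in post]


def conditional_probability(trials, outcome, given, given_value=True):
    """Estimate P(outcome is True | given == given_value) across trials.

    trials: list of dicts mapping hypothetical event names to booleans.
    Returns NaN when no trial satisfies the condition.
    """
    selected = [t for t in trials if t[given] == given_value]
    if not selected:
        return float("nan")
    return sum(1 for t in selected if t[outcome]) / len(selected)


# Hypothetical example data, for illustration only.
example_beats = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.3, 1.5, 1.7]
example_trials = [
    {"decel": True, "walk": True},
    {"decel": True, "walk": False},
    {"decel": False, "walk": False},
    {"decel": True, "walk": True},
]

relative_durations = beat_durations_after(example_beats, puff_time=1.05, n_beats=3)
p_walk_given_decel = conditional_probability(example_trials, "walk", "decel")
p_walk_given_no_decel = conditional_probability(example_trials, "walk", "decel", False)
```

      Under the interoceptive model, the first quantity should show a lengthened first beat, and the probability of walking should be higher on trials with deceleration than on trials without it. Averaging the relative durations by beat index across trials would give the beat-centered figure proposed in point (1).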

    1. Reviewer #1 (Public review):

      Summary:

      The current study is a follow-up to a previously published study by the same research group (Nold et al. 2025). In the previous study, the authors had included a set of exploratory analyses which assessed the effects of fitness level (indexed by relative FTP), sex, and drug treatment (naloxone versus placebo). In this previous study, the authors state that "exploratory analysis showed a significant main effect of fitness level on differences in pain ratings in the [saline] condition... suggesting increased hypoalgesia with increasing fitness levels, pooled across all stimulus intensities".

      In the current study, the authors have recruited an additional 22 female participants (21 included in analysis) from local cycling clubs to assess if fitness level does indeed impact exercise-induced hypoalgesia responses to experimental thermal and pressure pain models.

      Strengths:

      The current study has the potential to present a convincing argument about the effect of fitness level and potentially other factors (e.g., sex) on exercise-induced hypoalgesia responses. Combining data across two of their primary studies would be highly fruitful to the research community interested in this area. Specifically, it has the potential to inform sports medicine practitioners and how they administer exercise protocols to help those experiencing pain with a further consideration for the fitness level (and maybe sex) of their patients.

      Weaknesses:

      However, the current study makes several bold claims about the role of fitness level and sex on exercise-induced hypoalgesia, which I do not believe that this study on its own - or in conjunction with the previously published study by the same authors - can make at present. Namely, the current study does not appear to conduct any specific analyses between the cohorts from the two studies (current and previous). The results mention a difference in the group mean values in "fitness level" between cohorts, but the analysis itself on pain responses/exercise-induced hypoalgesia is limited only to the cohort from the current study. If the authors wanted to provide a convincing argument that fitness level has an effect on exercise-induced hypoalgesia, then the analysis of this study would have to include an analysis between the groups considered to be of "high" and "low" fitness level. I do not think the current study does this. Instead, it relies on an assumption from the previous study (Nold et al. 2025), which only states that "exploratory analysis showed a significant main effect of fitness level on differences in pain ratings in the [saline] condition... suggesting increased hypoalgesia with increasing fitness levels, pooled across all stimulus intensities". The analysis of this study would have to include the fitness level ("high fitness" versus "low fitness") of participants across both studies in its statistical model to properly discern if fitness level has an impact on exercise-induced hypoalgesia.

      A similar comment can be made with respect to sex differences, as these have not been assessed in the analysis of this study either.

      Another area of weakness in this study is how "fitness level" has been demarcated across participants. One issue is how authors have assumed that the current cohort is 'fit', whereas the previous cohort was 'less fit', meaning that the authors could be coming to false conclusions about fitness level. In detail, figures within the current study show a large overlap between the 'fit' and 'less fit' cohorts, where some participants have a higher relative functional threshold power (FTP) in the 'less fit' cohort than the 'fit' cohort and vice versa. Therefore, I believe the authors should better demarcate between those that are in the 'more fit' and 'less fit' groups according to a validated and well-established criterion from the kinesiology and sport science literature. That being said, I think this may be problematic in some ways as FTP is considered a relatively poor measure to denote fitness levels, a limitation highlighted in the previous study's review.

      Altogether, whilst I commend the researchers on their body of work across the two studies, the current methods and analysis provide an incomplete assessment of their primary research question, and therefore, I would urge the authors to reconsider some of their methods/analysis and the framing of their results to better reflect the main research question they have attempted to answer. Likewise, I would recommend that readers ensure they consider the current results with caution until the authors have addressed some areas of concern which currently limit their main conclusions.

    2. Reviewer #2 (Public review):

      This study addresses an important question regarding exercise-induced modulation of pain in women, but the conclusions appear to be based on relatively limited and selective evidence. The authors report an interaction between exercise intensity and stimulus intensity, which they interpret as evidence for exercise-induced hypoalgesia and conclude that fitness, but not sex, modulates this effect. However, this main result relies on a relatively small interaction that emerges only under specific conditions, with inconsistent findings across pain modalities and stimulus intensities, and an analysis approach that does not fully exploit the continuous pain ratings collected. The lack of a baseline condition further limits the interpretability of the findings as reflecting hypoalgesia, and overall, the data provide a rather constrained basis for drawing broader conclusions.

      Strengths:

      (1) The focus on women is important and timely, particularly given the ambiguity in prior findings and the historical bias toward male-dominated samples.

      (2) The attempt to revisit previous findings in a new cohort is valuable in principle.

      Weaknesses:

      (1) The core interpretation may not be fully supported by the data

      The central claim, that the results demonstrate exercise-induced hypoalgesia and its dependence on fitness but not sex, does not appear to be fully supported by the evidence presented.

      1.1 Lack of baseline condition

      The absence of a no-exercise baseline substantially limits interpretation. The study compares high- and low-intensity exercise, but without a baseline, it is not possible to determine whether either condition produces hypoalgesia or hyperalgesia relative to calibration. The observed HI-LI difference, therefore, reflects only a relative contrast between exercise intensities, not an absolute reduction in pain. As a result, attributing the findings to "hypoalgesia" may be difficult to justify fully.

      1.2 Lack of internal replication across conditions

      The reported effect is highly specific and does not clearly generalise across the experimental design. It emerges significantly only for heat pain at the highest stimulus intensity, with no clear effects for other intensities or for pressure pain. Moreover, the main statistical result is a relatively small interaction effect with a modest p value, which translates into a difference of approximately 6-8 VAS units on a 150-point scale. This combination (a small effect size, limited statistical strength, and restriction to a single condition) substantially weakens the evidence for a robust or generalisable effect.

      1.3 Deviations from the original study and selective use of data

      Although framed as a follow-up to previous work, the current study introduces substantial methodological changes, particularly in the acquisition and scaling of pain ratings (continuous vs post-hoc ratings, modified VAS with sub-threshold range). Despite collecting rich continuous data, the analysis focuses on peak responses to approximate the previous study. While this may aid comparability, it results in a strong emphasis on a single data point (highest intensity), rather than leveraging the full dataset. This limits both interpretability and comparability.

      1.4 Over-reliance on null results regarding sex differences

      The conclusion that fitness, but not sex, modulates exercise-induced pain may not be directly supported by the data presented. The current study includes only highly fit women, and comparisons with men or less-fit women rely on non-significant differences in a previous cohort. The absence of a significant difference does not provide evidence for equivalence, and no formal statistical support for a null effect is provided. As such, conclusions about the absence of sex differences would benefit from more cautious interpretation.

      (2) Limited sample and lack of diversity

      The dataset is narrow in scope, comprising a small sample (N = 21) of healthy, highly fit women. Key demographic characteristics (e.g. age range, BMI distribution) are not fully presented, explored or discussed. This limits generalisability and makes it difficult to draw broader conclusions about exercise-induced pain modulation in women, as the main focus of the study.

      (3) Methodological choices limit the interpretability of the data

      Several methodological decisions would benefit from stronger justification:

      3.1 The use of a non-standard VAS scale (0-150 with a fixed pain threshold at 50) is unconventional and may influence how participants report pain, while limiting comparability with related literature.

      3.2 Participants explicitly reported expecting exercise to reduce pain, introducing a potential confound that is not presently addressed.

      3.3 A more comprehensive use of the full time series of pain ratings would provide a stronger and more transparent basis for interpretation of the present findings.

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript investigates how cellular NAD/NADH ratios are controlled in cancer cell lines in vitro. The authors build on previous work, which shows that serine synthesis is sensitive to NAD/NADH ratios and PHGDH expression. Here, the authors demonstrate that serine synthesis is variable across a panel of cell lines, even when controlling for expression of serine synthesis enzymes such as PHGDH. The authors show that cellular NAD/NADH ratios correlate with the ability to synthesize serine and grow in serine-deprived environments when PHGDH levels remain constant. Investigating this variability in NAD/NADH ratios, the authors find that the cells that can positively respond to serine deprivation are able to increase oxygen consumption and cellular NAD/NADH ratios. Cells that do not increase oxygen consumption in response to serine deprivation do not increase NAD/NADH ratios and cannot grow well without serine. The authors go on to show that in cells with the ability to increase oxygen consumption upon serine deprivation, PHGDH expression alone is sufficient to fully restore growth without serine; in cells that cannot increase oxygen consumption, both PHGDH expression and interventions to increase NAD/NADH ratios are required to increase growth. Thus, cells need both PHGDH and NAD/NADH increases to maximize serine synthesis in response to serine deprivation. The authors previously showed that lipid synthesis likewise requires NAD regeneration. Interestingly, one cell line that does not increase oxygen consumption in response to serine limitation tends to increase oxygen consumption in response to lipid deprivation; accordingly, depriving this cell line of lipids increases the synthesis of serine. Together, these findings show that how cells respond to nutrient deprivation is highly variable and that the response to nutrient deprivation (for example, whether or not oxygen consumption is increased) will determine how well cells tolerate depletion of nutrients with related biosynthetic constraints. This work sheds light on the complexity of cancer cell metabolism and helps to explain why it is difficult to predict which nutrients will be limiting to any cancer cell type or environment.

      Strengths:

      (1) The authors use multiple interventions to manipulate NAD/NADH ratios in cells.

      (2) Experiments are well controlled and appropriately interpreted.

      Comments on revised version:

      The authors thoughtfully and thoroughly responded to all reviewer comments. The revised manuscript addresses the critiques.

    2. Reviewer #2 (Public review):

      In the manuscript "Cancer cells differentially modulate mitochondrial respiration to alter redox state and enable biomass synthesis in nutrient-limited environments", Chang et al investigate how cancer cells respond to the limitation of certain environmental nutrients by regulating the cellular NAD+/NADH ratio. They focus on serine and lipid metabolism, pathways known to be controlled by the NAD+/NADH ratio, and propose that changes in mitochondrial respiration in response to deprivation of these nutrients can influence the NAD+/NADH ratio, thereby impacting biomass synthesis.

      While the study is descriptive in nature and does not investigate specific molecular mechanisms that explain the crosstalk between nutrient availability and mitochondrial redox changes, the experimental component is robust, and the conclusions are well supported by the results. Some suggestions could further refine the conclusions and enhance the quality of the manuscript.

      Comments on revised version:

      The authors have provided a very comprehensive response. Their updated paper has improved, and the critiques have been mitigated.

    1. Reviewer #1 (Public review):

      The authors conducted a comprehensive benchmarking and evaluation of co-folding platforms, including AlphaFold3, Boltz-2, and Chai-1, alongside the docking algorithm DOCK3.7, which employs a physics-based scoring function that incorporates van der Waals interactions, electrostatics, and ligand desolvation energies. The system of interest was the SARS-CoV-2 NSP3 macrodomain (Mac1), an increasingly popular antiviral target, and the ligand sets comprised 557 unseen ligand poses (keeping the training for these co-folding platforms in mind). Additionally, the authors investigated whether the co-folding models could distinguish true ligands from non-binding small molecules. The study is thorough, with extensive statistical support and consensus across multiple metrics (chemoinformatics for quantifying ligand similarity and efficacy). The questions that the authors aim to address are whether the co-folding models struggle with memorization, whether they can distinguish between a true and a false binder, whether they replicate experimental binding affinities and efficacy, and how they compare to the physics-based docking algorithm (DOCK3.7).

      Strengths:

      Overall, this is a scientifically solid paper.

      The work is highly detailed and well executed, featuring thorough data analysis and statistical assessment.

      Comments on revised version:

      The authors have adequately addressed my concerns.

    2. Reviewer #3 (Public review):

      Summary:

      Core conclusions are well-supported by data: co-folding outperforms docking in known ligand pose/affinity prediction (validated by RMSD and IC₅₀ correlation), struggles with false positive discrimination in virtual screens (lower AUC values), and is complementary to docking (non-correlated errors, distinct strengths in drug discovery stages).
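
      The two metrics named in this summary (pose RMSD and AUC for binder versus non-binder discrimination) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names are hypothetical, the RMSD assumes both poses share the same atom ordering and reference frame (no symmetry correction or realignment), and the AUC assumes that a higher score means "more likely to bind".

```python
import numpy as np

def pose_rmsd(pred, ref):
    """RMSD between two ligand poses given as (n_atoms, 3) coordinate arrays.

    Assumes identical atom ordering and a shared reference frame;
    no symmetry correction or superposition is performed.
    """
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    if pred.shape != ref.shape:
        raise ValueError("poses must have the same shape")
    return float(np.sqrt(np.mean(np.sum((pred - ref) ** 2, axis=1))))

def roc_auc(scores_binders, scores_nonbinders):
    """Probability that a random binder outscores a random non-binder.

    Ties count as half. Assumes higher score = more likely binder.
    """
    b = np.asarray(scores_binders, dtype=float)[:, None]
    n = np.asarray(scores_nonbinders, dtype=float)[None, :]
    return float(np.mean((b > n) + 0.5 * (b == n)))
```

      An AUC near 0.5 under this definition would correspond to no discrimination between true ligands and non-binders, which is the sense in which "lower AUC values" indicate weaker virtual-screening performance.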

      Strengths:

      An unprecedented prospective design with 557 novel Mac1-ligand complexes ensures rigorous, independent evaluation of co-folding methods and provides an unbiased benchmark dataset containing structures and compounds absent from the co-folding models' training sets. Comprehensive comparison of 3 co-folding tools (AlphaFold3, Chai-1, Boltz-2) with DOCK3.7 across diverse targets and metrics enables nuanced performance assessment. The revised results clarify an intriguing finding: co-folding can predict correct ligand poses even when protein conformations are mispredicted. The study clearly demonstrates complementary roles of co-folding (superior pose/affinity prediction for known ligands) and docking (better hit prioritization), and addresses deep learning memorization concerns via ligand similarity analysis.

      Weaknesses:

      The study identifies a major limitation of co-folding: its failure to capture rare protein conformational changes, which deserves future investigation. The authors include uncalibrated Boltz-2 affinity data (addressing a prior comment) but note that large-scale free energy perturbation (FEP) comparisons are beyond their capabilities.

      Appraisal of Aims Achieved:

      The authors successfully achieved their primary aims, and the results provide strong, well-supported evidence for their core conclusions. Key conclusions are grounded in the study's unbiased, training-set-independent data, which ensures that the conclusions are not confounded by model memorization and are broadly applicable to the field's use of these co-folding models.

      Field Impact:

      This study provides a critical reality check for the field: co-folding models are powerful tools for pose prediction but are not yet standalone solutions for virtual screening, a key distinction that will prevent over-reliance on these models and guide more rational tool selection.

  2. May 2026
    1. Reviewer #1 (Public review):

      Summary:

      Eroglu and Hobert demonstrate that injecting CRISPR guides and repair constructs to target three genes at a time, tagging each with a different fluorescent protein, and selecting which gene to tag with which fluorophore based on genes' expression levels, can improve efficiency of gene tagging.

      Strengths:

      This manuscript demonstrates that three genes can be targeted efficiently with three different fluorophores. It also presents some practical considerations, like using the fluorophore least complicated by agar/worm autofluorescence for genes with low expression levels, and cost calculations if the same methods were used on all genes.

      Weaknesses:

      Eroglu has demonstrated in a previous publication that single-stranded DNA injection can increase efficiency of CRISPR in C. elegans, while inserting two fluorescent proteins and a co-CRISPR marker into three loci, and Paix et al. 2015 demonstrated simultaneous insertion of two fluorescent tags. The current work is a valuable and incremental advance. In general, I applaud the authors' willingness to strategize about how whole proteome tagging might be accomplished. I predict that the advance here will be one of many small advances that will get the field to that goal. The title oversells the advance presented, in my view, since it seems like one among many key advances, and the first sentence of the Discussion seems a more apt summary of the key advance here.

      Some injections targeted genes on the same chromosome together, which will create unnecessary issues for the crosses that will be useful in some future experiments. This made me wonder if injecting 3 together really is helpful vs targeting each gene separately, since only 5 worms need to be injected. It cuts time down by 2/3, but perhaps avoiding targeting the same chromosome with two tags would be useful.

      The limited utility of current blue fluorescent proteins makes me wonder if they are worth using at this stage, before there are better blue fluorescent proteins, or better yet, far-red ones, to avoid issues with live imaging under phototoxic UV or near-UV illumination.

    2. Reviewer #2 (Public review):

      Original Review:

      The manuscript by Eroglu and Hobert presents a set of strains each harboring up to three fluorescently tagged endogenous proteins. While there is technically nothing wrong with the method and the images are beautiful, we struggled to appreciate the advance of this work - who is this paper for?

      As a technical method, the advance is minimal since the first author had already demonstrated that three mutations (fluorophore insertion and co-CRISPR marker) could be introduced simultaneously.

      As a pilot for creating genome-scale resources, it is not clear whether three different fluorophores in one animal, while elegantly designed and implemented, will be desired by the broader community.

      Finally, the interpretation of the patterns observed in the created lines leaves much to be desired. A Table with all the observations must be included and can replace the tedious (and often wrong) descriptions of the observations with the different lines. It would be too much to point out every mistaken expectation of protein expression. Two examples include:

      The expectation that ACDH-10 is enriched in the intestine and epidermal tissues (hypodermis) is naïve - there are multiple paralogs of this protein (look at WormPaths or WormFlux) that may share functions in different tissues. There is also no reason to assume that fatty acid metabolism does not occur in other tissues (including the germline). Finally, there are no published studies about this enzyme, so we really don't know for sure what it's doing.

      The expectation that HXK-1 is ubiquitously expressed is similarly naïve. There are three paralogous enzymes that are all associated with the same reaction, and we have shown that these three function redundantly in vivo, perhaps in different tissues (PMID: 40011787). Moreover, single cell RNA-seq data (PMID: 38816550) also shows enrichment of hxk-1 in gonadal sheath cells.

      The table should have at least the following information: gene/protein name - Wormbase ID - TPM levels of single cell data assigned to tissues for L2, L4 and adult (all published) - tissues in which expression is observed in the lines presented by the authors.

      Other points:

      (1) We would encourage the authors to provide systematic validation of the reported insertions. The manuscript reports that 24 of 30 tags were isolated and visible but does not clearly state whether each isolated line was confirmed by sequence‑level validation to be correctly in‑frame and free of unintended mutations at the target locus.

      (2) The manuscript presents aggregated success counts (e.g., 8/10 mTagBFP2 tags, 9/10 mStayGold, 7/10 mScarlet3) and useful narrative descriptions of injection outcomes. We suggest also including per‑locus success rates.

      (3) For pools that required re‑injection after initial failures, we would like to see a description of the specific changes that were made to the injection mixes or procedures (e.g., new repair template prep, different Cas9 reagent lot, guide redesign). This will be useful troubleshooting information for others.

      (4) The authors state that the fluorophore sequences are codon-optimized for C. elegans. We suggest they provide the exact donor/tag sequences used and specifically state whether the fluorophore sequences contain any synthetic/artificial introns and whether other sequence modifications (e.g., silent PAM‑disrupting mutations) were included in the donor templates.

      (5) Page 3: Include a reference for "The C. elegans genome encodes around 20,000 genes".

      We hope these comments are useful.

      Comments on Revised Version:

      Overall, we found the responses to be quite recalcitrant.

      We have one remaining composite concern about the comparison between observed expression patterns with the new strains versus published data.

      First, the authors only report patterns for one stage, although it should not be too much effort to image the different life stages. However, since this is a revision, we are not formally requesting they do this.

      Second, in the now-provided Table (thank you), 'observed expression' (last column) is lacking for 9 of the 30 proteins, and for 6 of these the procedure was not successful. Why not report patterns for the other three? It is also confusing because on page 5, the authors say that "overall, 24 of 30 tags ...all of which were visible with fluorescence stereomicroscopy" - are we missing something? Also, they then said that they "obtained 6/9 of the originally failed tags"; why are the corresponding patterns not included in Table 1, and why are 9 proteins still labeled as "no" in the "success?" column?

      Third, we strongly feel that the response to our comments about expression patterns is not adequate. On page 5 the authors say that "all proteins were expected to be ubiquitously expressed" and that "scRNA-seq indicated that transcript abundance was ubiquitous and without strong tissue-specific enrichment with few exceptions". However, in their rebuttal, the authors now argue for tissue-specific expression for proteins with paralogs, turning around their own argument! Moreover, their Table indicates that many genes show tissue-enriched expression by RNA-seq while many of their tagged proteins exhibit ubiquitous expression.

      Overall, this indicates that both the overall accomplishment of generating tagged protein strains and analyzing their expression is oversold.

    3. Reviewer #3 (Public review):

      Summary:

      The authors argue that establishing the expression pattern and sub-cellular localisation of an animal's proteome will highlight hypotheses for further study. This claim is probably accepted by many in the community. This manuscript seeks to confirm the feasibility of establishing such a resource, by using current transgenic methods to knock in DNA encoding different colored fluorescent tags into C. elegans genes.

      Strengths:

      The authors make the points above. For example, they provide evidence that the C. elegans germline harbors two populations of mitochondria that differ qualitatively in the proteins they express. They also confirm that labelling the whole proteome is an achievable goal with relatively limited resources and time.

      Weaknesses:

      The work is somewhat incremental in that it uses existing transgenic technology. Cell biology in C. elegans is challenging because of the small size of many of its cells, notably neurons. This can make establishing the sub-cellular localisation of a fluorescently tagged protein, or co-localizing it with another protein, tricky. The authors point out in their introduction that advances in light microscopy, such as diSPIM, STED and ISM (a close relative of SIM), have increased the resolution of light microscopy. They also point out that recent advances in expansion microscopy can similarly help overcome the resolution limit. However, they do not use these technologies to characterize their transgenic strains.

    4. Reviewer #4 (Public review):

      Summary:

      Tagging the entire proteome of a metazoan would be a landmark achievement, providing a powerful complement and extension to existing "omic" catalogs in model systems. Here, Eroglu and Hobert argue that efficiently tagging multiple loci in a single "batch" would make the community-based achievement of this goal realistic. They provide rigorous evidence that such an approach is indeed feasible, exploring issues related to efficiency, design and screening strategies, disruption of gene function, and the potential for endogenously tagged alleles to reveal unexpected aspects of protein expression and localization. While the work has some minor gaps that are important to rigorously assess the feasibility of the proposed effort, the detailed and valuable insights that emerge should provide impetus to the community to coordinate efforts to make this ambitious goal a reality.

      Strengths:

      The work has numerous strengths. The authors provide compelling evidence that:

      - three distinct loci can be efficiently targeted with three distinct fluorescent tags in a single injection.

      - thoughtful targeting design can reduce the likelihood of disruption of function by the tag.

      - systematic design principles based on expression level and predicted localization/function can be used to optimize tagging strategies.

      - the resulting tags can provide unexpected insight into patterns of protein production and subcellular localization.

      Not all of these advances are novel in themselves, but taken together, they represent an important technical and conceptual advance. The most important strength comes from the exceptionally high value of the goal itself, in that the work has the potential to spur a community-wide effort toward achieving the ambitious goal of proteome-wide tagging.

      Weaknesses:

      The work's shortcomings are minor.

      - One concern has to do with the feasibility of the proposed screening strategies. The experimental design cleverly coinjects tags for three loci in different gene expression 'zones'; this expression level determines which tag will be used. As the authors allude to, among genes with the same overall FPKM value there is an important distinction between those that are expressed broadly and those focally expressed in a specific tissue. The proposed strategy claims that there are a sufficient number of highly expressed genes "to be used as visible markers" for recovering successfully edited animals. It would be useful for the authors to discuss the issue of broad vs focused expression among this set of genes a bit more thoroughly, with an eye toward the issue of how likely it is that these genes could indeed consistently be used as visible markers, particularly for those at the low end of this limit.

      - What fraction of the proteome (on a per-gene basis) is secreted proteins? How difficult will it be to screen these for successful tags? Are there specific tags that would be more optimal for secreted proteins? (The authors mention the use of an SL2 or T2A cassette to label the cells in which these proteins are expressed but note that there are technical challenges associated with doing this at scale.)

      - For secreted and/or weakly expressed genes, it would be useful for the authors to estimate for what fraction of these would successful insertions need to be screened by PCR, and what resources (time and money) this would likely entail.

      - For how many genes would a single tag not capture all predicted isoforms?

      - Finally, some readers might object to the authors' assertion in the abstract that this work is "a first step in this direction" (presumably referring to designing a strategy for whole-proteome tagging). There is no concern that the authors are disregarding the extensive work of other groups, as they explicitly mention the contributions of other groups to the foundation that enables the present work. However, the spirit of the abstract could be misinterpreted by a well-intentioned reader.

    1. Reviewer #1 (Public review):

      [Editor's note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      The authors aimed to uncover novel therapeutic vulnerabilities in APC-mutant colorectal cancer (CRC), which constitutes the majority of CRC cases. They hypothesized that modulating oxygen-sensing pathways (via PHD inhibition) could disrupt adaptive stress responses in these tumours.

      Strengths:

      The study employs a powerful, two-pronged approach to identify Molidustat's targets. By using both Thermal Proteome Profiling (TPP) and an orthogonal chemical proteomic competition assay, the authors provide compelling evidence that GSTP1 is a genuine, direct off-target, effectively addressing the common limitation of indirect effects in proteomic screens.

    2. Reviewer #2 (Public review):

      Summary:

      The authors aimed to determine Molidustat targets and the potential utility of these findings. They clearly demonstrate that Molidustat interferes with GSTP1 and some other proteins on top of PHD2. They also demonstrate that PHD2 deletion is not sufficient to recapitulate Molidustat effects in cells and proteomes. Finally, they demonstrate synthetic lethality in organoids for Molidustat and APC deletion.

      Strengths:

      The data on Molidustat proteomes, GSTP1 binding, inhibition and metabolic health of organoids are really clear. All biochemical, docking and omic data are really strong. The potential impact of these findings could be the use of Molidustat in APC null tumours and awareness of potential off-target effects.

    3. Reviewer #3 (Public review):

      In this paper, the authors revealed that Molidustat can induce a dose-dependent increase in Caspase-3/7 activity in the HT29 cell line, which is an APC-mutant colorectal cancer cell line. More importantly, they found that targeting PHD2 alone cannot cause cell death. By using thermal proteome profiling (TPP) and orthogonal chemical proteomic competition assays, they identified GSTP1 as a previously undiscovered off-target of Molidustat. They also revealed that combined PHD2 and GSTP1 loss leads to an increase in intracellular ROS and apoptosis. Moreover, they evaluated the effects of Molidustat in colonic organoids and showed that Molidustat has a high selectivity for colonic organoids with activated WNT signaling and/or KRAS pathway alterations, and this effect is not reproduced by hydroxylase inhibition alone, providing a new potential approach to targeting both PHD2 and GSTP1 for the treatment of APC-mutant CRC.

    1. Reviewer #1 (Public review):

      Summary:

      Knowing that small pupil-size variations accompany brightness variations (even when these are illusory), the authors asked whether pupil constrictions would accompany the synesthetic perception of a brighter color (compared with a darker one), induced by the presentation of a black-white character. This grapheme-colour synesthesia is experienced by only a few people, sixteen of whom were enrolled in this study. The results reliably showed that a relative pupil constriction would "betray" the perception of a brighter color in these participants, while no such effect would be observed in control participants who were asked to report a color in association with each grapheme, even though they did not perceive any.

      Strengths:

      The main strength of the study lies in its combination of psychophysics (brightness ratings) and pupillometry, which allowed for showing clear-cut results.

      Weaknesses:

      I only see the following relatively minor weaknesses, namely:

      - The pupil traces in Figure 3 (main results) are heavily pre-processed (per-participant demeaned), losing any feature besides the effect of interest. As I argued in my first review, I worry that this format gives unrealistic expectations about the effect (the perception of dark/bright colors does not generate a net dilation/constriction of the pupil; perception-related modulations of pupil size are always relative and generally small compared to the numerous other effects registered in pupil size; these include a pupil dilation that is more prominent in the controls and that gets analyzed later on in the manuscript; I do not think that eliminating one of the effects of interest from a main results figure helps the reader understand the results). In the revised manuscript, the authors addressed this concern by adding a Supplementary Figure 4, where a more complete representation of the results is shown (traces from individual trials are baseline corrected and averaged, resulting in more informative timecourses). I would strongly recommend that Supplementary Figure 4 is brought to the main text (Figure 3 could be presented in Supplementary).

      - Responses to physical brightness modulations were only measured in the synesthetes group, not in controls. The authors point out that pupillary light responses have been thoroughly characterized in previous studies, and conclude that synesthetes' responses were in line with the expectations both in terms of amplitude and latency. However, as we are not dealing with standardized measurements, subtle differences in pupil reactivity across the two populations remain a possibility. I recommend that this possibility is mentioned in the discussion.
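
      The difference between the two pre-processing choices discussed in the first point above can be made concrete with a small sketch. It is an illustration only, not the authors' pipeline: the function names are hypothetical, and "per-participant demeaning" is assumed here to mean subtracting the participant's mean timecourse (averaged over all trials) from every trial, which removes any response shared across conditions.

```python
import numpy as np

def demean_per_participant(traces):
    """Subtract the participant's mean timecourse from every trial.

    traces: (n_trials, n_samples) array for ONE participant.
    Assumed definition of 'demeaning'; removes any response component
    common to all trials (e.g., a task-evoked dilation).
    """
    traces = np.asarray(traces, dtype=float)
    return traces - traces.mean(axis=0, keepdims=True)

def baseline_correct(traces, n_baseline):
    """Subtract each trial's own pre-stimulus mean (first n_baseline samples).

    Keeps the shared timecourse, so common dilation remains visible.
    """
    traces = np.asarray(traces, dtype=float)
    baseline = traces[:, :n_baseline].mean(axis=1, keepdims=True)
    return traces - baseline
```

      Under these assumptions, the average of the demeaned traces is flat by construction, whereas the average of the baseline-corrected traces still shows the dilation common to all trials, which is the additional information the reviewer asks to see in the main figure.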

      Impact:

      This work is likely to improve our understanding of synesthesia, providing a new tool to quantify the subjective sensations; an interesting potential extension would be using pupillometry for tracking changes over time of the synesthetic experiences, opening up the possibility to evaluate the importance of learning for this peculiar experience.

    2. Reviewer #2 (Public review):

      Synesthesia is a neurological condition where stimulation of one sensory channel leads to involuntary, automatic, and consistent experience of another, unrelated percept. For example, Sir Francis Galton (1880, Nature) famously described the robust tendency of some individuals (synesthetes) to associate numerals with a distinct color. Ever since, synesthesia has attracted broad interest in the cognitive neurosciences in light of its implications for the study of domains such as perception, consciousness, and brain connectivity, among others.

      Strauch, Leenaars, and Rouw measured pupil size in a group of 16 grapheme-color synesthetes and two matched control groups. The participants were presented with gray digits - that is, visual stimuli having identical physical properties in terms of brightness. Each participant subsequently rated the corresponding evoked color and brightness: unlike controls, synesthetes did so in a very consistent and reliable fashion. Accordingly, this was also shown in their pupils: despite the same objective luminance, digits associated with brighter percepts caused their pupils to constrict and digits associated with darker percepts caused their pupils to dilate more than controls. These results highlight how crossmodal correspondences are deeply rooted in synesthetes, and put forward pupillometry as a particularly appealing biomarker for some phenomenological experiences (at least those grounded in "brightness").

      Further strengths of the technique are its temporal resolution and its responsiveness to several constructs. Across several tasks, the authors show for example that responses to synesthetic light are somewhat slower than responses to real light (i.e., they are likely mediated), but at the same time faster than responses to mental imagery. The role of mental imagery can also be reasonably dismissed when considering the second feature of pupil size: its responsiveness to mental effort and cognitive load. The pupils tend to dilate with demanding, challenging tasks, and this was the case when control participants were asked to report the color of a digit for which they did not consistently experience a synesthetic association. The same task was, instead, seemingly effortless for synesthetes, again speaking in favor of the automaticity of number-color correspondences in their case.

      Overall, the findings by Strauch, Leenaars, and Rouw are highly significant for the field and likely to be impactful. The strength of their evidence, when accounting for the relatively small sample size and the inherent variability of both phenomenology (color perception and subjective reporting) and physiology (pupil size), is adequate and sufficiently convincing.

      Comments on revisions:

      I thank the authors for addressing all my comments in a satisfactory way. I think that the paper has improved, especially in terms of transparency of the reporting and clarity of the results.

    3. Reviewer #3 (Public review):

      Summary:

      In the present study, the authors examined pupillary responses to uncolored stimuli (number graphemes) among number-color synesthetes and non-synesthetes. After seeing a digit, the synesthetes and active control participants were asked to indicate which color they perceived using three dimensions of hue, saturation, and lightness. The lightness values were the primary independent variable for follow-up analyses. To see how the pupil responded to psychologically "bright" and "dark" digits, the authors split the reported lightness values at the median and plotted them. The synesthetes showed a pupillary constriction to digits they perceived as bright and dilation to digits they perceived as dark. Active control participants did not show that effect. In a subsequent block, only the synesthetes were shown the colors they reported perceiving as colored discs. Their pupillary responses were similar. The authors also found that the differences in pupillary responses between light and dark perceptions (with digits) were only slightly delayed in their onset relative to the perception of a colored disc, and therefore the color perception accompanying a digit is unlikely to be effortful or a retrieved association, but occurs rather automatically.
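
      The median-split analysis described in this summary can be sketched as follows. This is a hypothetical illustration rather than the authors' code: the function names are invented, the split is assumed to be done within each participant, and digits rated exactly at the median are assigned to the "dark" group, a tie-breaking rule the review does not specify.

```python
import numpy as np

def median_split(lightness):
    """Label each digit as 'bright' (True) if its rated lightness exceeds
    the participant's median rating. Ties go to 'dark' (an assumption)."""
    lightness = np.asarray(lightness, dtype=float)
    return lightness > np.median(lightness)

def bright_minus_dark(pupil, lightness):
    """Mean pupil size for 'bright' digits minus mean for 'dark' digits.

    pupil: one summary pupil value per digit (same order as lightness).
    A negative value indicates relative constriction for bright digits.
    """
    pupil = np.asarray(pupil, dtype=float)
    bright = median_split(lightness)
    return float(pupil[bright].mean() - pupil[~bright].mean())
```

      A negative difference for synesthetes and a difference near zero for controls would correspond to the pattern the reviewers describe.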

      Strengths:

      The authors employed a well-controlled and well-designed quasi-experiment comparing color-grapheme synesthetes to non-synesthetes and showed convincingly that the color perceptions accompanying graphemes alter the physical perception of brightness. They also made a reasoned attempt to rule out the possibility that color associations occur effortfully via retrieved associations.

      The following are questions which I had asked in a first round of reviews, and which were answered adequately by the authors:

      (1) Are the pupillary responses among synesthetes, which objectively do not seem to match the degree of physical stimulation entering the retina, in any way maladaptive for eye functioning? I understand the constriction/dilation of the pupil to not only benefit visual acuity but also to protect the retina from damage. Are synesthetes at any risk of retinal damage due to over-dilation of the pupil to brighter stimuli? Or are these effects of a magnitude that is too small to matter? As reported in arbitrary units, it was hard to know how large these effects were in terms of measurable changes in dilation (e.g., millimeters).

      (2) Likewise, is the automatic synesthetic merging of two percepts something that could be learned such that natural synesthetes and "artificial" synesthetes would look similar? For example, if a group of non-synesthetic participants were to learn a color-grapheme association to automaticity, would you expect their pupillary responses to the graphemes to look similar to the synesthetes'? If so (or if not), what would this tell us about the phenomenology of synesthesia?

      (3) Do the synesthetic perceptions of digit graphemes merge in a sensible way? For example, if a synesthete sees a particular color with the digit 1, and a different color with the digit 9, what do they perceive when they see 19? or 1-9, or 1 9? Is there color blending, or an altogether different color perception?

    1. Reviewer #1 (Public review):

      This work compiles a comprehensive atlas of ncORFs across mammalian tissues and cell types, derived from reanalysis of ~400 public ribosome profiling datasets. The authors then evaluate cross-species conservation and functional signatures, proposing that evolutionarily ancient ncORFs tend to have higher translation potential, stronger expression, and closer relationships with canonical coding sequences.

      Strengths:

      In general, the study provides a large-scale and timely resource of annotated ncORFs, which could be broadly useful for the community. The authors collected ~400 public ribosome profiling datasets for annotations of ncORFs, which, to my best knowledge, is the largest collection of data for such purpose. The catalog could facilitate future investigations into ncORF biology and broaden understanding of the coding potential of the "non-coding" genome.

      Weaknesses:

      Based on the ncORF catalog, some of the analyses were not properly done. Some of the results are descriptive.

      (1) Bias and representations of data source. Public ribo-seq datasets are unevenly distributed across tissues and cell lines, raising concerns about heterogeneity and underrepresentation of certain contexts. This may limit the generalizability of the catalog.

      (2) The discussion on modular domains of ncORFs is unclear, and the claim that they may originate via TE-related mechanisms is not well supported. Stronger evidence or clearer reasoning is needed.

      (3) The conservation comparisons are not fully convincing. Figure S7 shows only mild differences between ncORFs and CDS, and statistical significance is not clearly demonstrated. Comparisons with other non-coding RNAs should be added, and overlapping sequences between ncORFs and CDS should be excluded to avoid bias.

      (4) Figure 3 indicates that some ncORFs are subject to evolutionary constraints, which is not surprising. The authors should provide further, more granular analyses of the detailed features of these "conserved" ncORFs versus the "non-conserved" ones. Some quite informative work has been done in Drosophila, worms, mouse, and human. In fact, small ORFs, especially uORFs, have been extensively studied for their functions and cross-species conservation. The authors should explicitly show what is new in their analyses.

      (5) Translation levels are reported using RPF counts. However, translation efficiency (normalized by RNA expression) is a more appropriate measure to account for expression heterogeneity.

      (6) The correlation analyses between ncORF translation levels and PhyloCSF are confusing and largely descriptive. These sections need sharper framing and clearer conclusions.

      (7) Public ribo-seq datasets, generated by different research labs, are known for their strong batch effects. Representations of tissues and cells are also very unbalanced. Therefore, the co-translation analysis between ncORFs and canonical CDS is not well controlled. This should be done by referring to a recent large-scale ribo-seq meta-analysis (Nat Biotechnol. 2025. doi: 10.1038/s41587-025-02718-5).
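
      To make point (5) concrete, the sketch below computes a per-ORF translation efficiency as the log2 ratio of library-size-normalized ribosome footprint counts to RNA-seq counts. It is only an illustration under stated assumptions: the function name, the pseudocount, and the counts-per-million normalization are choices made here, not the authors' pipeline, and both count tables are assumed to come from matched samples over the same ORF coordinates, so that ORF length cancels in the ratio.

```python
import math


def translation_efficiency(rpf_counts, rna_counts, pseudocount=1.0):
    """Return log2 translation efficiency per ORF for one matched sample.

    rpf_counts, rna_counts: dicts mapping ORF id -> raw read count
    (hypothetical inputs; both are assumed to cover the same ORF regions).
    """
    rpf_total = sum(rpf_counts.values())
    rna_total = sum(rna_counts.values())
    te = {}
    for orf, rpf in rpf_counts.items():
        rna = rna_counts.get(orf)
        if rna is None:
            continue  # no RNA-seq measurement, so TE is undefined
        # Counts per million, with a pseudocount to avoid division by zero.
        rpf_cpm = (rpf + pseudocount) / rpf_total * 1e6
        rna_cpm = (rna + pseudocount) / rna_total * 1e6
        te[orf] = math.log2(rpf_cpm / rna_cpm)
    return te
```

      Two ORFs with the same footprint count but a fourfold difference in RNA abundance differ by about two log2 units in this measure, which is the expression effect that raw RPF counts cannot separate from translational control.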

      Comments on revisions:

      The authors have made efforts to address most of the previous concerns, and several points have been clarified or improved in the revision. However, in a number of cases, the responses rely more on acknowledgment and reframing than on substantive analytical strengthening. Overall, the manuscript is improved, particularly in terms of clarity, transparency, and positioning of claims. I support its publication and look forward to seeing how the field engages with and discusses these claims.

    2. Reviewer #2 (Public review):

      Summary:

      Chang et al. attempted to analyze a large number of ribo-seq datasets through a standardized pipeline, identifying novel non-canonical ORFs and elucidating their evolutionary and expression characteristics.

      Strengths:

      (1) The datasets analyzed by the authors are sufficiently comprehensive, and the use of standardized pipelines ensures excellent analytical consistency.

      (2) Their analyses of ORF evolution and co-expression further deepen our understanding of these ORFs.

      Weaknesses:

      (1) The authors primarily conducted analyses through bioinformatics, lacking sufficient wet-lab experimental evidence.

      (2) Some analytical methods and standards were not clearly presented in the manuscript.

    1. Reviewer #1 (Public review):

      Summary:

      Meijer et al. sought to investigate the role of cortical layer 6b (L6b) neurons in modulating sleep-wake states and cortical oscillations under baseline and sleep-deprived conditions and in response to orexin A and B. Using chronic EEG recordings in mice with silencing of Drd1a+ neurons (via constitutive Cre-dependent knockout of SNAP25), the authors report that while overall baseline sleep-wake architecture and the response to sleep deprivation are minimally changed or unchanged, "L6b silencing leads" to a slowing of theta activity during wakefulness and REM sleep, and a reduction in EEG power during NREM sleep. The manuscript is written with clarity and transparency. Although Drd1a+ neurons are not exclusive to L6b, the authors describe key future studies to identify a causal role for L6b neurons in brain state regulation. These studies contribute to a growing body of evidence that cortex, in addition to subcortical brain regions, plays a role in brain state regulation.

      Strengths:

      (1) The text is well written.

      (2) The authors are transparent about methodological details and study limitations.

      (3) The stated sleep, circadian, and orexin infusion experiments are well designed, executed, and analyzed.

      Weaknesses:

      (1) Outcomes are attributed to silencing cortical L6b neurons, but the genetic manipulation is not specific to L6b neurons or cortex. The authors acknowledge this as a limitation and offer targets for future studies to identify L6b neuron-specific contributions to stated outcomes that include spatially restricted manipulations.

      (2) Experiments use only male mice, which limits generalizability to females.

      Comments on revised version:

      The authors took great care in addressing my previous comments, and I do not have any additional concerns.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Meijer and colleagues investigated the effects of inactivation (conditional silencing) of cortical layer 6b neurons on sleep-wake states and EEG spectral power under the following three conditions: during natural sleep-wake states, after sleep deprivation, or after intracerebroventricular administration of orexin A and B. The authors report that silencing of L6b neurons did not have a significant effect on the total time spent in sleep-wake states, duration or number of state epochs, or the response to sleep deprivation. However, silencing of L6b neurons did slow down theta-frequency (6-9 Hz) during wake and REM sleep, and reduced the total EEG power during NREM sleep. Infusion of orexin A in the mice in which cortical layer 6b neurons were inactivated produced an increase in wakefulness. A similar effect was observed after infusion of orexin A in the mice in which these neurons were not silenced, but the effect (i.e., increase in wakefulness) was of a smaller magnitude. Silencing of cortical layer 6b neurons attenuated the effect of orexin B in increasing theta activity, as was observed in the control mice. The authors conclude that the cortical neurons in layer 6b play an essential role in state-dependent dynamics of brain activity, vigilance state control and sleep regulation.

      Strengths:

      - A focus on cortical layer 6b neurons, which is an understudied neuronal population, especially in the context of brain and behavioral state transitions.

      - The authors used a well-established mouse model to study the effect of inactivation of cortical layer 6b neurons.

      Weaknesses:

      - Although the authors used a highly selective approach to silence layer 6b neurons, the observed changes in EEG oscillations cannot be solely attributed to layer 6b neurons because of the ICV route for orexin administration.

      - The rationale for using only male mice is not provided.

      Comments on revised version:

      The authors have addressed my concerns.

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript explores the role of the Evening Complex (EC), specifically focusing on ELF3, a disordered protein component of the EC, and its temperature-dependent phase behavior. The study highlights the role of polyQ tracts in modulating temperature-sensitive condensate formation and provides a combination of computational approaches, including REST2 simulations and coarse-grained Martini simulations, to investigate how polyQ tract length and sequence context influence this behavior.

      Strengths:

      The study addresses a key question in plant biology - how temperature influences circadian clock-mediated growth regulation through protein phase behavior. The manuscript introduces the novel finding that polyQ tract length modulates the temperature-dependent formation of helices and condensates.

      Weaknesses:

      (1) Coarse-Grained Simulation Results Not Supported by Data:

      The results presented in Figure 6A of the manuscript do not seem to show a clear trend in the number of clusters formed as a function of polyQ tract length. This is particularly evident in the comparison between 0Q and 7Q polyQ lengths, which display statistically similar values in terms of the number of clusters. The lack of distinction between these values raises questions about the sensitivity of the coarse-grained simulations to polyQ tract length, which the authors claim as a key modulator of condensate formation. This discrepancy weakens the argument that polyQ length directly impacts the clustering behavior in the simulations.

      Suggested Analysis:

      a) A more detailed statistical analysis should be performed to assess whether the observed differences between polyQ lengths are significant. This could involve hypothesis testing or the use of error bars in the graphs to better communicate the variability in the data.

      b) Additionally, the authors should examine whether there are other features, such as cluster shape or internal structure, that might differentiate between different polyQ lengths, even if the total number of clusters is similar.
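
      One way to carry out the hypothesis testing suggested in (a) is an exact permutation test on the replicate-level cluster counts of two polyQ variants. The sketch below is a minimal illustration: the function name is hypothetical and no values from Figure 6A are used. It also makes a practical limit visible: if only three replicas per variant are available, there are just 20 possible relabelings, so the smallest attainable two-sided p-value is 0.1.

```python
from itertools import combinations


def exact_permutation_pvalue(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means.

    group_a, group_b: per-replica values (e.g. number of clusters) for two
    variants. Every relabeling of the pooled values is enumerated, so this
    is only practical for small replicate numbers.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    n_extreme = 0
    n_total = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        diff = abs(sum(a) / len(a) - sum(b) / len(b))
        # Small tolerance so ties with the observed value count as extreme.
        if diff >= observed - 1e-12:
            n_extreme += 1
        n_total += 1
    return n_extreme / n_total
```

      Even perfectly separated groups of three replicas return p = 0.1 here, so claims of significant variant differences at that replicate number would need more replicas or a parametric model with its own assumptions.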

      (2) Inconsistency in Cluster Size Across Temperatures (Figure 6B):

      The results in Figure 6B show a striking difference in the size of the largest cluster between temperatures of 290K and 300K. This abrupt shift in behavior lacks a clear mechanistic explanation. Typically, phase transitions driven by temperature are more gradual, unless there is some underlying structural or chemical shift that the authors have not accounted for. Without a clear explanation, this sudden change in behavior reduces confidence in the simulation results.

      Suggested Analysis:

      a) The authors should explore possible explanations for the dramatic difference in cluster size between 290K and 300K. For example, they could investigate whether specific interactions (such as the breaking or formation of hydrogen bonds or hydrophobic contacts) might explain the behavior at higher temperatures.

      b) It is important to check whether the coarse-grained simulation model has been adequately parameterized and scaled for accurate temperature dependence. Atomistic simulations of monomers and dimers with varying polyQ tract lengths could be used to fine-tune the coarse-grained model, ensuring it accurately reflects molecular behavior. The gross estimate of a 10% scaling factor might be insufficient and could lead to inaccurate representations of cluster formation.

      (3) Scaling of Coarse-Grained Model with Atomistic Simulations:

      As mentioned, the coarse-grained model used in the study may not have been properly scaled against atomistic data. A simple scaling factor of 10% may not be appropriate for accurately capturing the behavior of polyQ tracts across different lengths, especially considering their sensitivity to subtle changes in temperature. Without rigorous validation against atomistic simulations, the coarse-grained model's predictions could be skewed.

      Suggested Analysis:

      a) To address this, the authors should compare the coarse-grained model with atomistic simulations of monomeric and dimeric forms of ELF3 with different polyQ tract lengths. By comparing key structural parameters (e.g., radius of gyration, contact maps, and clustering propensity), the authors could adjust the coarse-grained model to more accurately reflect the atomistic behavior. The authors have a wealth of atomistic simulation data that could support such benchmarking and the identification of an appropriate scaling factor.

      b) Additionally, the authors should investigate whether the assumed scaling factor of 10% is appropriate for each polyQ length or whether it needs to be refined based on specific properties, such as the number of hydrophobic interactions or secondary structure stability.

      (4) Lack of Analysis for Liquid-Like Behavior in Phase Separation:

      The simulations presented in the manuscript do not analyze the liquid-like behavior of ELF3 condensates, which is a key characteristic of liquid-liquid phase separation (LLPS). In LLPS systems, condensates are often dynamic, with chains exchanging between clusters, indicating liquid-like rather than solid-like behavior. The authors fail to probe this crucial aspect, which is necessary to support the claim that ELF3 undergoes phase separation.

      Suggested Analysis:

      a) The authors should conduct additional analyses to probe the liquid-like nature of the clusters formed by ELF3. One approach would be to analyze the dynamics of chain exchange between clusters, measuring how frequently chains leave one cluster and join another over time. This analysis would reveal whether the condensates behave as liquid-like, dynamic structures or more static, solid-like aggregates.

      b) Additionally, the temperature dependence of these exchange dynamics should be investigated. In true liquid-liquid phase separation, the rate of chain exchange is often sensitive to temperature. Observing how this rate changes between 290K and 300K, for instance, could help explain the abrupt shift in cluster size seen in Figure 6B.

      c) The authors should also analyze whether the internal structures of the condensates are consistent with a liquid-like phase. For example, radial distribution functions and contact lifetimes could be calculated to reveal whether the clusters exhibit liquid-like organization.
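
      The chain-exchange analysis suggested in (a) can be sketched without tracking cluster identities across frames, which are usually not consistent between snapshots. The example below measures how often two chains that share a cluster at one frame still share a cluster some number of frames later. The function name and input layout are assumptions made for illustration: each frame is a list in which entry i is the cluster label of chain i, and labels only need to be consistent within a frame.

```python
from itertools import combinations


def pair_survival(frames, lag):
    """Fraction of co-clustered chain pairs still co-clustered `lag` frames later.

    frames: list of per-frame label lists, frames[t][i] = cluster label of
    chain i at frame t (hypothetical input format).
    Returns NaN if no pair is ever co-clustered.
    """
    kept = 0
    total = 0
    for t in range(len(frames) - lag):
        now, later = frames[t], frames[t + lag]
        for i, j in combinations(range(len(now)), 2):
            if now[i] == now[j]:
                total += 1
                if later[i] == later[j]:
                    kept += 1
    return kept / total if total else float("nan")
```

      A value that stays near 1 for all lags indicates static, aggregate-like clusters, whereas a decay with increasing lag indicates that chains exchange between clusters. Comparing the decay at 290K and 300K would address point (b).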

      (5) Lack of justification of polydispersity of polyQ:

      The authors do not provide a rationale for the polyQ tract lengths chosen for their chain-growth simulations. The choice would be better motivated by prior experimental observations.

      (6) Lack of initiative to connect to Experiments:

      While the computational models and simulations provide robust theoretical insights, the absence of direct experimental validation weakens the overall impact of the manuscript. For example, experimental data on how specific mutations in the polyQ tract influence ELF3 behavior in vivo would significantly bolster the authors' claims. The manuscript would benefit from either citing existing experimental studies that corroborate these findings or from suggesting future experimental directions.

      Comments on revised version:

      The authors have now adequately addressed the key concerns with the manuscript. In its present form, the manuscript is significantly improved.

    2. Reviewer #2 (Public review):

      Summary:

      The authors investigate how ELF3, a disordered scaffolding protein in the plant circadian Evening Complex, responds to temperature by forming reversible nuclear condensates. They focus on the C-terminal prion-like domain and on a variable polyglutamine tract within it, asking how the tract length and surrounding sequence context tune temperature-responsive structural and condensation behavior. Using a tiered set of computational approaches, including sequence heuristics, hierarchical chain-growth ensembles, all-atom enhanced-sampling simulations, and coarse-grained condensate simulations of 100 monomers, they characterize wild-type, polyQ deletion, polyQ expansion, and an aromatic-disrupting F527A variant. In the revised manuscript, the central claim has been reframed so that polyQ length is now described as tuning condensate material properties rather than driving temperature-sensitive phase separation, with temperature-responsive condensation attributed primarily to a sticker-rich aromatic contact network.

      Strengths:

      The biological question is important and timely, and the multiscale computational strategy provides a fresh view of an intrinsically disordered protein and its variants. The all-atom enhanced sampling analyses identify a temperature-dependent long-range aromatic contact involving F527 and a methionine-tyrosine coordination motif, which are concrete and mechanistically interesting observations beyond what coarse-grained or sequence-only methods could provide. In response to the previous round of review the authors have added replicate averaged statistics with error bars on the new condensate analyses, introduced new dynamics observables including effective diffusivity, an anomalous diffusion exponent, the self van Hove function, shape anisotropy, per chain radius of gyration in the condensed phase, and a condensate lifetime, provided cluster size time series for transparency, justified the choice of polyQ tract lengths against published Arabidopsis polymorphisms, expanded the Methods with explicit formulas for the new analyses, and included a split half convergence check for the all atom ensembles. The reframing toward a sticker spacer interpretation is consistent with recent experimental work and represents a more cautious and defensible reading of the data.

      Weaknesses:

      Despite these substantive additions, several core concerns from the previous review remain only partially addressed, and, on close reading, the new supplementary analyses do not robustly support the reframed claim that polyQ length tunes condensate material properties. Error bars and replicate-averaged statistics were added to the new condensate panels, but the helical propensity and per-residue analyses throughout the rest of the manuscript still show only a single curve per temperature, so variability for these key observables remains unreported. Several of the newly added dynamics observables show that the variants are essentially indistinguishable within the reported uncertainty: the self van Hove distributions, the shape anisotropy distributions, and the per chain radius of gyration distributions in the condensed phase overlap almost entirely across variants, and the anomalous diffusion exponent has between replica spreads at low temperature that exceed the variant to variant differences, with variant orderings that change with temperature. The variant-dependent signal that does survive, namely a drop in condensate lifetime for the polyQ expansion and the aromatic mutant at the highest temperature studied, rests on a single temperature point, with replicate spreads spanning most of the metric's dynamic range.

      The cluster size time series at higher temperatures shows the dominant cluster oscillating over a wide range across replicas, indicating intermittent dissolution and incomplete convergence in the very temperature regime where the variant-specific claims are made. The only convergence test provided is a split-half radius-of-gyration analysis for the all-atom ensembles, with no slab-geometry or coexistence-density check for the coarse-grained condensate simulations. The polyQ deletion variant forms dominant clusters comparable in size to wild type at low and intermediate temperatures, which on its own argues that variable polyQ presence is not a primary determinant of clustering and supports the earlier concern that the temperature sensitive behavior is dominated by generic chain length and aromatic sticker effects rather than polyQ specific sequence effects, a concern that the reframing softens but does not resolve. Statistical significance is not assessed anywhere, and with three replicas and largely overlapping error bars, claims of variant-specific differences would benefit from explicit statistical tests. Minor quality control issues are also visible in the supplementary material, including a mislabeling of the aromatic mutant in two analysis panels and an inconsistent trajectory length for one variant at one temperature.

      Additional Context for Readers:

      Readers should interpret the molecular mechanism proposed here with caution. The reframing from polyQ length driving temperature-sensitive phase separation to polyQ length tuning of condensate material properties is more scientifically measured and aligns with recent experimental work, but several of the supplementary observables introduced to support this revised claim indicate that the variants studied are statistically indistinguishable within the reported replicate uncertainty. The most robust observation in the revised work is that the prion-like domain undergoes a temperature-responsive break of an aromatic contact in all-atom simulations and that aromatic sticker contacts dominate inter-protein interactions in coarse-grained condensate simulations. The mechanistic role of the polyQ tract, beyond generic chain length and hydration effects, remains, as in the original submission, not clearly established by the simulations presented. Independent experimental validation of the proposed aromatic contact and of the predicted material-state differences between polyQ variants will be needed to establish the molecular mechanism, and improved condensate convergence tests, uniformly reported error bars across all simulation-derived figures, and explicit statistical tests of variant-versus-variant differences would substantially strengthen confidence in the conclusions.

    1. Reviewer #1 (Public review):

      Summary:

      The authors present a novel approach to subcellular spatial proteomics by combining laser microdissection with expansion microscopy and LC-MS/MS analysis (SPEx). They implement two different workflows for LMD and LC-MS/MS quantification:

      (1) The standard approach, where an area of interest is cut out by LMD, subjected to proteomics analysis, and compared to the rest of the cell without the dissected ROI.

      (2) The subtraction approach, where ROIs are removed, and the remaining cellular material is compared to samples containing both the surrounding material and the ROI.

      The authors assess the technique by applying it to subcellular targets of various sizes, volumes, and protein compositions such as the nucleus, nucleoli, and Golgi. They demonstrate that SPEx can identify proteins enriched or reduced in ROIs.

      Strengths:

      The broad, relatively easy, and inexpensive applicability of this approach to potentially many cell types and subcellular areas of interest provides an exciting alternative to subcellular fractionation, native immunoprecipitation, or genetically encoded proximity labeling constructs. Moreover, by visually selecting ROIs for subsequent analysis, subcellular context or organelle morphology can be taken into account, as discussed by the authors in the discussion section.

      Weaknesses:

      While strongly supporting the sharing of this approach, we have a number of comments and questions that will improve the impact of the manuscript:

      (1) General:

      a) The manuscript would benefit from restructuring and language revision. In its current form, the writing is sometimes dense and verbose (in particular, the Results section). This makes it difficult to follow the authors' arguments.

      b) The authors mention the possibility of selecting organelles based on morphology. This is left for the discussion, but it seems like a missed opportunity - the authors could compare individual organelles in different morphological states, e.g., connected vs. fragmented mitochondria.

      (2) Technical:

      a) Why do the authors aim for, and optimize toward, a 10x expansion factor? Is SPEx compatible with a more standard 4x expansion, as used, e.g., in the classic U-ExM approach (https://www.nature.com/articles/s41592-018-0238-1)? This could be added to the discussion.

      b) The U-ExM approach shows improved ultrastructural preservation when 3% FA with 0.1% glutaraldehyde (GA) is used for fixation. Is SPEx compatible with the use of low amounts of GA for fixation?

      c) Related to the above, was the anchoring efficiency reduced only to achieve a 10x expansion factor or does this additionally affect the proteome coverage?

      d) Have the authors considered using alternative anchoring approaches, such as GMA (https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0291506#pone.0291506.s001), which potentially increase the amount of sample retained in the hydrogel, thus allowing for better proteome coverage? This could be added to the discussion.

      e) The limitation of the approach to near-2D samples should be mentioned, and alternative approaches for more 3D samples could be discussed.

      f) How are peptides that are directly anchored to the hydrogel dealt with during LC-MS/MS analysis? Are they excluded, or can they be identified during the spectral search? The latter would allow us to get a deeper structural understanding of how proteins are actually anchored into hydrogels, which so far has not been assessed.

      An alternative approach to address this question would be to investigate if the peptide coverage of proteins detected by SPEx is enriched for peptides representing the folded core of proteins as opposed to the surface-exposed regions, which likely get more anchored into the hydrogel.

      g) Same question regarding peptides with NHS labeling. Can they be identified, or do they just compete for ionization and thus negatively affect coverage and dynamic range of the LC-MS/MS approach?

      h) How do the primary and secondary antibodies affect the proteomics analysis, and how are they identified as contaminants?

      i) Have the authors observed differences in proteomics coverage of only antibody vs NHS-labeling? Depending on the questions above, could pure antibody-based labeling increase proteomic coverage?

    2. Reviewer #2 (Public review):

      Summary:

      This study introduces a method that combines physical expansion of cells, imaging-guided isolation of defined regions, and protein identification to enable compartment-resolved analysis of protein composition at the subcellular scale. The authors aim to address a central limitation in existing approaches, namely the loss of spatial information during sample preparation or the indirect nature of proximity-based labeling methods. Using several cellular compartments as examples, they demonstrate that their approach can recover compartment-enriched protein sets and identify candidate proteins with previously unassigned localization.

      Strengths:

      A major strength of this work is the conceptual simplicity and accessibility of the approach. By combining established techniques in a modular way, the method avoids the need for genetic manipulation or specialized labeling strategies, making it broadly adaptable across experimental systems. The ability to directly select regions of interest based on imaging represents a clear advantage over indirect enrichment strategies and allows flexible targeting of both membrane-bound and non-membrane-bound compartments.

      The experimental design is also a strong aspect of the study. The use of complementary comparison strategies, analyzing isolated compartments alongside matched "subtracted" controls, provides an internal framework for assessing enrichment and depletion, increasing confidence in spatial assignment. The application of the method across multiple organelles of different sizes and properties demonstrates versatility, and the reported specificity for several compartments is encouraging. In particular, the ability to profile small and biochemically challenging structures highlights a potentially important niche for the approach.

      Weaknesses:

      Despite these strengths, several methodological limitations constrain the interpretation of the results. The most important relates to spatial accuracy in three dimensions. While lateral resolution is improved through physical expansion, the lack of depth resolution introduces uncertainty regarding contributions from structures above and below the selected region. Although the authors argue that this does not substantially affect specificity, the current evidence is largely indirect, and a more rigorous quantification of potential contamination would strengthen this conclusion.

      Quantitative interpretation also remains challenging. Because the measurements reflect total protein abundance rather than local concentration, differences in compartment size and protein density can influence enrichment values, particularly for small structures embedded within larger volumes. This issue is evident in the analysis of smaller compartments and complicates direct comparison across conditions. Additional normalization or modeling would help clarify how to interpret these measurements.

      Another limitation concerns variability in the expansion process and its downstream consequences. Differences in expansion factor across samples may affect the definition of regions of interest and introduce variability in sampling, yet the impact of this variability is not fully explored. Similarly, the use of a modified chemical treatment to preserve proteins for downstream analysis is central to the workflow but is not extensively validated with respect to preservation of spatial organization.

      While the identification of previously unannotated proteins is an appealing aspect of the study, validation is limited to a small number of examples, and broader support from independent datasets or literature context is lacking. In addition, the study primarily focuses on steady-state measurements in a single cell type, and therefore does not yet demonstrate the ability of the method to capture dynamic or condition-dependent changes in protein localization.

      Finally, the positioning of the method relative to existing approaches could be more clearly articulated. Although qualitative comparisons are provided, a more systematic and quantitative benchmarking against alternative strategies would help readers better understand the specific advantages and trade-offs.

    3. Reviewer #3 (Public review):

      Franziscus et al. describe an elegant approach for spatially specific proteome analysis. To achieve this, they expand fixed cells and subsequently use a laser to micro-dissect a region of interest, which is then analyzed by mass spectrometry.

      They demonstrate the effectiveness of their approach by analyzing the nucleus, nucleolus, and the Golgi, and benchmark their hits against previous datasets for these organelles.

      The manuscript is very well written and nicely guides the reader through the applied methods. The presented data is convincing, and I do not see the need for additional experimental verification of the protocol. The only minor concern is the novelty of the method and the presentation. A combination of expansion, laser microdissection, and proteomics has been applied in the past (PMID: 36450705, PMID: 39477916). In the manuscript, one of these studies is cited, though it does not become clear that this approach is already described. However, Franziscus et al. describe the approach better and make it more accessible to the reader, especially since the other studies described this methodology in combination with tissue expansion and not in combination with single cell expansion as it is done here. I would ask the authors to be clearer in the introduction about what others have already done and what their contribution is here. In general, I am convinced that the community will benefit from the presented protocol to analyze organelle proteomics in detail.

    1. Reviewer #1 (Public review):

      Summary:

      This study investigates a fundamental question in cognitive science: is our ability to reason about the physical world an abstract mental process, or is it "embodied", that is, directly rooted in our real-time physical interactions with the environment? The authors compared participants' performance in computerized reasoning games with and without Galvanic Vestibular Stimulation (GVS). They suggest that participants failed more often and utilized suboptimal strategies under GVS compared to a sham stimulation condition. Furthermore, they found that this detrimental effect of GVS was reduced when the games were governed by altered gravity (hyper- and hypo-gravity). Consequently, the authors conclude that the physical experience of the body modifies high-level cognitive skills, such as reasoning.

      Strengths:

      The manuscript is well-written, organized, and easy to follow, making complex concepts accessible. Also, combining a specialized physical reasoning task with real-time vestibular disruption (GVS) is an intriguing approach to testing the boundaries of embodied cognition.

      Weaknesses:

      (1) Lack of Overall Effects and Inflated Type I Error for Game-Level Effects

      The study utilizes a within-subject design. Taking Study 1 as an example, each subject participated in a familiarization session (4 games), a baseline session (12 games without stimulation), a GVS session (14 games), and a sham session (14 games). No game was repeated for any single subject. Performance was quantified using three primary measures (success rate, number of attempts, and time per attempt) and two strategy measures (tool switching and the distance between tool placements).

      For Study 1, to identify condition differences at the game level (i.e., Figure 2), the authors effectively conducted 70 independent t-tests (5 measures × 14 games). While 7 significant results were reported, this large number of independent tests invites an inflated Type I error rate, as no multiple-comparison correction appears to have been applied.

      A similar inflation is expected in Study 2, where 50 independent t-tests (5 measures × 10 games) yielded 5 significant comparisons (Figure 4). Although the authors might argue the direction of the differences is systematic, implying GVS generally impairs performance, at least one significant comparison shows the opposite effect: tool switching indicates that GVS led to better performance for the 'Table_A' game in Study 2 (Figure 4d), whereas the same variable indicated GVS led to worse performance in Study 1 (Figure 2d). I suspect that none of the significant game-level results would survive a proper statistical correction. If possible, the authors could redo the statistical testing with a correction (FDR or Bonferroni) or fit a linear mixed model (LMM) with game as a random effect. Until such analyses are done, I strongly encourage the authors to refrain from drawing broad conclusions based on these isolated game-level results.

      Furthermore, when analyzing data across all games, the study found no significant effect of GVS on overall performance or strategy measures in either Study 1 or Study 2. This lack of an aggregate effect contradicts the authors' conclusion that participants failed more often or utilized suboptimal strategies under GVS.

      (2) Missing Rationale for Classification Analysis

      It is puzzling why the authors pursued two exploratory analyses on tool placement after revealing that the two related primary measures (tool positioning and switching) did not generate significant condition differences in Study 1. These additional analyses (the Dirichlet Process Gaussian Mixture Model and leave-one-out classification) were not pre-registered. In the absence of overall condition differences, the authors appear to be "doubling down" by applying sophisticated classification tools to the raw data without a clear prior rationale.

      (3) Insufficient Evidence for the Reduced Effect of GVS Under Altered Gravity

      To compare Study 1 and Study 2, the authors devised a "gravity-weighted index," but its definition is not sufficiently justified. The index assigns weights of 1, 2, and 3 to low-, medium-, and high-gravity-dependent games, respectively. The choice of these specific weights appears arbitrary, making the quantitative results difficult to interpret. More importantly, there is no citation or explanation regarding how these three levels of "gravity impact" were defined in the first place (Line 468). This index was also not pre-registered.

      The authors state that for the success rate index, a value close to -1 indicates a large negative difference for GVS, 0 indicates no difference, and 1 indicates a large positive difference. These are theoretical bounds; the actual distribution of each index should be examined to validate such claims. However, the paper lacks descriptive statistics for this composite index.

      Notably, the "reduction" of the GVS effect in altered gravity was only demonstrated in one of the five available indices (success rate, p = 0.046). In fact, the success rate in Study 2 was 66.7 (sham) vs 67.3 (GVS) in Table 2. It is highly debatable whether this marginal result justifies the conclusion that GVS effects "were reduced when the games included reasoning about altered gravity".

      (4) Questionable Assumptions Regarding Strategy

      The authors assume that "big changes in tool positioning and frequent tool switching indicate poor evaluation of the failed outcome". This assumption is questionable. In solving this cognitive task, participants must explore and exploit solutions based on feedback. Large shifts in positioning or frequent tool switching might reflect active, adaptive exploration based on failed outcomes rather than a failure to evaluate them.

      (5) Confounding Factors in GVS Interpretation

      The central theoretical question is whether physical reasoning is grounded in physical experience. GVS is used here to manipulate that experience. However, GVS does not selectively target the vestibular nerve; it also activates distributed fronto-parietal attention networks and hippocampal circuits essential for any reasoning task. Additionally, the vestibular system is linked to the limbic system and the cerebellum, which regulate emotional reactivity and arousal. Because attention and emotion are likely affected by GVS, the authors should be much more cautious in attributing their behavioral findings solely to changes in the "physical experience of the body."
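
      The multiple-comparison concern in point (1) can be illustrated with a minimal sketch. The p-values below are invented for illustration and are not taken from the manuscript. The family-wise error formula also assumes independent tests, which is a simplification because the five measures within a game are likely correlated.

```python
def familywise_error(alpha, m):
    """P(at least one false positive) across m independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni(pvals, alpha=0.05):
    """Reject only p-values at or below alpha / m."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]


def benjamini_hochberg(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the false discovery rate."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    # Find the largest rank k whose sorted p-value satisfies p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= rank / m * alpha:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            reject[idx] = True
    return reject


# Hypothetical p-values: 7 nominally significant results among 70 tests,
# mirroring the counts described for Study 1 (values are NOT from the paper).
hypothetical_pvals = [0.004, 0.011, 0.018, 0.024, 0.031, 0.039, 0.046] + [
    0.20 + 0.01 * i for i in range(63)
]

fwer = familywise_error(0.05, 70)
expected_false_positives = 0.05 * 70
n_bonferroni = sum(bonferroni(hypothetical_pvals))
n_bh = sum(benjamini_hochberg(hypothetical_pvals))

print(f"family-wise error rate: {fwer:.3f}")
print(f"expected false positives under the null: {expected_false_positives:.1f}")
print(f"survive Bonferroni: {n_bonferroni}, survive BH: {n_bh}")
```

      Under these assumptions, about 3.5 false positives are expected among 70 tests at an uncorrected threshold of 0.05, and with the invented p-values above neither correction retains any of the seven nominally significant results. Whether the same holds for the real data depends on the actual p-values.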

    2. Reviewer #2 (Public review):

      Summary

      The paper investigates whether the real-time physical experience of the body shapes high-level physical reasoning. Participants played a set of computerized tool-use reasoning games (the Virtual Tools paradigm) in which they must use knowledge of physical laws - including gravity, collisions, and inertia - to guide a ball into a target area. In Study 1, participants played the games under terrestrial gravity while receiving either Galvanic Vestibular Stimulation (GVS), which introduces noise into the vestibular organ and disrupts gravitational signalling, or a Sham condition with matched skin sensation. In Study 2, a separate cohort played the same games redesigned under hypogravity (0.5 g - half Earth g) or hypergravity (2 g - double Earth g), again with concurrent GVS or Sham stimulation. Performance was assessed through success rate, number of attempts, and time per attempt; strategy was assessed through the spatial distance between successive tool placements and the frequency of tool switching across attempts. A post-hoc gravity-weighted index (GWI) was computed to compare the effect of vestibular perturbation across the two studies. The main finding is that GVS impairs performance in gravity-dependent games under terrestrial gravity, yet the same perturbation appears to be neutral or even beneficial when the game environment involves non-terrestrial gravity - a result the authors interpret as evidence for an adaptable, body-grounded internal model of physics.

      Strengths

      One of the most notable strengths of this work is its conceptual positioning at the intersection of embodied cognition and physical reasoning. Rather than treating the human body either as an abstract information-processing device or as a purely biomechanical system, the authors take seriously the idea that cognition is scaffolded by ongoing sensorimotor state - and they test this idea with a paradigm that is both tractable and theoretically motivated. The use of the Virtual Tools paradigm is well-suited to this goal: the games vary systematically in their reliance on gravitational predictions, allowing selective impairment (rather than general disruption) to serve as a signature of embodied physical reasoning.

      The dual-study design is another strength. Testing the same vestibular perturbation under terrestrial and altered game-gravity conditions, and observing a reversal in its effect depending on context, provides a form of internal control that is conceptually compelling. The additional clustering analyses (Dirichlet Process Gaussian Mixture Model and leave-one-out kernel density classification) strengthen the strategy results beyond raw distance measures, confirming that GVS systematically shifts participants' spatial exploration strategies.

      The paper is also clearly written and engages meaningfully with relevant theoretical frameworks - predictive coding, embodied cognition, and stochastic resonance - making it accessible and stimulating for a broad audience.

      Weaknesses

      (1) Absence of multiple-comparisons correction. A large number of game-level pairwise t-tests are conducted in both studies (upward of twenty per study) without correction for familywise error rate. The game-level effects that anchor the main narrative - in Study 1 alone: Remove, GoalMove, Spiky, Falling_A, Shafts_B, Gap, and Chaining - arise from an uncorrected pool of comparisons. The probability that some of these constitute false positives is non-trivial. The authors should apply a correction (e.g., Benjamini-Hochberg) or at a minimum discuss this limitation explicitly.

      (2) The facilitation claim rests on a post-hoc and arbitrarily parameterized index. The gravity-weighted index (GWI), which drives the central cross-study comparison, uses integer coefficients (1, 2, 3) to weight games by gravity dependency level. These coefficients are entirely arbitrary and bear no principled relationship to the actual gravitational magnitudes used in the study. Why not use the gravity dependency ratings themselves, or the empirically estimated gravity impact scores from the computational modelling mentioned in the Methods? The choice of weights should be either principled or tested across a range of values to demonstrate robustness. Furthermore, the notation in equation (1) as currently typeset reads as "Gravity minus Weighted Index" rather than "Gravity-Weighted Index"; this should be corrected.

      (3) The "facilitation" interpretation exceeds what the data in Study 2 directly support. Across all games in Study 2, GVS versus Sham differences in absolute performance are non-significant in all directions. The facilitation claim derives entirely from the GWI being higher in Study 2 than in Study 1 - a between-subjects comparison involving different participant groups and a non-pre-registered metric. The language of "facilitation" should be tempered accordingly, or the authors should provide additional analyses to support this framing.

      (4) Gravitational manipulation is visual only, and the vestibular system is only one component of the gravity-sensing network. Gravity perception results, as the authors very well know, from a distributed multisensory integration process that involves, in addition to the vestibular system, visual, proprioceptive, and visceral inputs. The present paradigm manipulates gravitational context solely through visual cues and targets the vestibular system through GVS - a point the authors acknowledge but do not discuss in sufficient depth. It is important to distinguish clearly between real gravitational alterations (as achieved in parabolic flight or centrifuge environments, where the entire body is physically subjected to a different gravitational vector) and virtually altered gravity, where only one sensory modality is targeted while others remain anchored to 1 g. The scope of the conclusions should reflect this distinction.

      (5) The choice of 0.5 g and 2 g may lack sensitivity. Combining the two altered-gravity conditions in Study 2, because no significant effect of hypo versus hypergravity was found, is statistically pragmatic but conceptually unsatisfying. There is evidence in the space physiology literature that gravitational processing is not linearly symmetric around 1 g: threshold effects exist below and above terrestrial gravity that may not be captured by modest deviations (half and double g) - see refs below. It is worth discussing whether the absence of a hypo/hyper distinction in Study 2 reflects a genuine equivalence or a lack of sensitivity, and whether more extreme conditions (e.g., near-zero g or 4-5 g) might reveal different processing regimes. Whether 0.5 g and 2 g were sufficient to saturate the system or merely insufficient to perturb it remains an open question with direct implications for the interpretation of the null GWI effects on strategy measures.

      Lee SMC, Ribeiro LC, Martin DS, Zwart SR, Feiveson AH, Laurie SS, Macias BR, Crucian BE, Krieger S, Weber D, Grune T, Platts SH, Smith SM, and Stenger MB. Arterial structure and function during and after long-duration spaceflight. J Appl Physiol (1985) 129: 108-123, 2020.

      de Winkel KN, Clément G, Groen EL, and Werkhoven PJ. The perception of verticality in lunar and Martian gravity conditions. Neurosci Lett 529: 7-11, 2012.

      Clément G, Moore ST, Raphan T, and Cohen B. Perception of tilt (somatogravic illusion) in response to sustained linear acceleration during spaceflight. Exp Brain Res 138: 410-418, 2001.

      Benson AJ, Kass JR, and Vogel H. European vestibular experiments on the Spacelab-1 mission: 4. Thresholds of perception of whole-body linear oscillation. Exp Brain Res 64: 264-271, 1986.

      (6) High-level reasoning is not defined with sufficient precision. The term "high-level reasoning" appears from the title onward and in the heading of the Study 1 results section (line 138), but it is never formally defined. The reader needs a clearer account of what distinguishes high-level physical reasoning from low-level sensorimotor prediction, and where the games used here fall along that continuum. What specific physical competencies - ballistic trajectories, free-fall predictions, collision dynamics, frictional forces, inertial effects - are required across the game set? When describing the subset of games that drive key effects, this information is critical for evaluating whether effects are specific to gravity reasoning or to some other physical concept.

      (7) Performance measures are disconnected from underlying kinematics. The performance measures (success rate, number of attempts, time per attempt) are coarse, high-level summaries. Time per attempt is used as a proxy for performance efficiency, yet participants received no instructions regarding speed, and different individuals may have adopted systematically different speed-accuracy trade-offs. It would be valuable to know whether time per attempt correlates with attempt number within a given game (which would indicate within-game learning) and whether mouse movement data - trajectory, velocity, hesitation - were recorded and could be analysed to provide more mechanistic insight into strategy formation.
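
      The robustness check suggested in point (2) could look like the following sketch. The manuscript's exact definition of the index is not reproduced here: the normalised weighted mean, the per-game differences, and the gravity-dependency labels are all hypothetical placeholders chosen only to show how the index can be recomputed under alternative weighting schemes.

```python
def weighted_index(differences, levels, weights):
    """Weighted mean of per-game GVS-minus-Sham differences.

    This functional form is an assumption made for illustration; it is not
    taken from the manuscript.
    """
    numerator = sum(weights[lvl] * d for d, lvl in zip(differences, levels))
    denominator = sum(weights[lvl] for lvl in levels)
    return numerator / denominator


# Hypothetical per-game differences in success rate (GVS minus Sham).
differences = [0.05, -0.02, -0.10, 0.03, -0.08, -0.15]
# Hypothetical gravity-dependency level assigned to each game.
levels = ["low", "low", "medium", "medium", "high", "high"]

# Alternative weighting schemes; (1, 2, 3) is the scheme described in the review.
schemes = {
    "equal (1, 1, 1)": {"low": 1, "medium": 1, "high": 1},
    "integer (1, 2, 3)": {"low": 1, "medium": 2, "high": 3},
    "steeper (1, 4, 9)": {"low": 1, "medium": 4, "high": 9},
}

results = {
    name: weighted_index(differences, levels, weights)
    for name, weights in schemes.items()
}

for name, value in results.items():
    print(f"{name}: {value:+.3f}")
```

      If the sign and approximate size of the index, and of the Study 1 versus Study 2 contrast built on it, are stable across such schemes, the arbitrariness of the (1, 2, 3) coefficients matters less; if they are not, the cross-study conclusion depends on the weighting choice.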

    3. Reviewer #3 (Public review):

      Summary:

      This manuscript investigates a theoretically important question in cognitive science: whether higher-level physical reasoning is an abstract, modular process or is grounded in real-time body-environment interactions. To address this question, the authors combine galvanic vestibular stimulation (GVS) with the Virtual Tools task to test whether perturbing vestibular gravity signals affects performance in physical reasoning. The study is conceptually innovative and has the potential to bridge embodied sensory processing and higher-level cognition. However, in its current form, the evidence only partially supports the main claims, and several aspects of the analysis and interpretation limit the strength of the conclusions.

      Strengths:

      A major strength of the manuscript is the originality of the experimental paradigm. The combination of galvanic vestibular stimulation (GVS), which perturbs gravity-related vestibular signals, with computerized game-based tasks that require physical reasoning provides a novel way to test whether ongoing bodily experience influences higher-level cognition. Conceptually, the study is highly original and meaningfully bridges two domains that are often studied separately: sensorimotor processing and higher-level cognition.

      Weaknesses:

      The main weakness of the manuscript is that its central conclusion is not strongly supported by the data. The key finding depends on a marginally significant cross-study comparison, whereas direct GVS-versus-Sham differences in Study 2 are minimal across aggregate measures. In addition, many game-level analyses involve a large number of uncorrected multiple comparisons, raising the possibility that some of the reported effects may reflect chance findings. The manuscript's most important metric, the Gravity-Weighted Index, was not preregistered and is exploratory in nature, yet it is treated as a primary basis for confirmatory conclusions. The cross-study comparison is also difficult to interpret because the two studies differ in participant samples, number of games, and partially in the stimulus set. Finally, the mechanistic claims in the Discussion, particularly those invoking predictive coding, stochastic resonance, or updating of internal gravity models, go well beyond what can be directly inferred from the present behavioral data. Overall, the study provides intriguing but limited evidence that vestibular signals may influence some physical reasoning tasks under specific conditions, rather than strong evidence for a broad account of physical reasoning as grounded in online vestibular processing.

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors study two residues in the GHKL ATPase active site of Aq MutL and GyrB, and argue that the catalytic base function is shared between two conserved acidic residues that are 3 residues apart.

      They generated mutant versions in MutL and GyrB (both Ala and the appropriate Asn/Gln version) and performed ATPase analysis. They also generated high-resolution crystal structures of the GyrB NTD with AMPPnP for WT and mutants of the two acidic residues. The data show that mutation in either of these residues does not fully kill activity (with the exception of the alanine mutation of the first of the two, which interferes with ATP (or AMPPnP) binding). When the acidic residues are mutated to Asn/Gln, the catalytic water can still be positioned, and hence these mutants are more active than the Ala mutants. In both cases, the double mutation is catalytically dead.

      The authors then perform phylogenetic analysis and ancestral gene reconstruction, and based on this, they argue that HSP90 forms a different class of GHKL ATPases, and lost rather than gained this separate status.

      Strengths:

      The biochemical analysis seems solid.

      Weaknesses:

      (1) A major question that remains is why the mutations have so much more detrimental effect in MutL (100-fold lower kcat/KM) than they do in GyrB (3-fold lower). Can the authors explain this? Doesn't this argue against the proposed catalytic conservation?

      (2) The structure figures all have omit maps for just the AMPPnP and the water, whereas the density for the acidic residues and their mutants is not shown.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Fukui et al. re-examined the ATP hydrolysis mechanism in GHKL ATPases, revealing a cooperative role of two conserved acidic residues rather than one. The authors have used a range of biochemical and structural techniques on various mutants from different members of the GHKL ATPase family to test and validate their proposed mechanism.

      Through a detailed re-analysis of their previously published structure of the aqMutL NTD (ATPase domain) in complex with AMPPCP, they identified Glu29 and Glu32 as interacting with nucleophilic water for the catalysis. The authors carefully dissected the respective roles of these two acidic residues with a series of site-directed mutations. Mutations at Glu29 impaired ATPase activity without affecting protein secondary structure or ATP binding in the case of the E29Q mutant. Moreover, mutations at Glu32 did not affect secondary structure (except for E32G) but reduced ATPase activity. Activity was abolished when both residues (E29Q/E32Q) were mutated.

      The authors extended their study to another GHKL ATPase, aqGyrB. Their findings further supported the cooperative function of the corresponding acidic residues in aqGyrB (Glu48 and Asp51) during ATP hydrolysis. Mutation of these residues partially impaired ATP hydrolysis without affecting protein secondary structure. ATPase activity was completely lost in the double mutant E48Q/D51M. While the E48Q mutant retained the ability to bind ATP, the E48A mutant did not. High-resolution structures of the WT and E48A, E48Q, D51A, and D51N mutants of the aqGyrB NTD demonstrated that nucleophilic water positioning depended on these residues. E48 played a dominant role in water positioning and is critical for stabilising ATP lid formation and associated conformational changes, whereas D51 contributed cooperatively to catalysis.

      The authors investigated the functional impact of mutating the corresponding residues in the human MutL homologs PMS2 and MLH1. Clinical variants consistently exhibited reduced or abolished ATPase activity, providing a potential molecular basis for Lynch syndrome through impaired DNA mismatch repair.

      Lastly, through evolutionary analysis, the authors inferred that the second acidic residue was likely present in the common ancestor of MutL, GyrB, and MORC proteins, but was lost in the case of Hsp90.

      Strengths:

      (1) This study contains a detailed structural and biochemical analysis of a biologically important set of GHKL ATPases. The authors identify a second acidic residue that is conserved and contributes to catalysis in a large subset of GHKL ATPases. An updated and extended mechanistic model of ATP hydrolysis by this class of enzymes is proposed, which involves cooperative and partially overlapping roles for the catalytic residue pair. This revised mechanistic model is invaluable for the interpretation of clinical variants of GHKL ATPases such as PMS2 and MLH1.

      (2) The work described was performed to an excellent and rigorous technical standard. The structural and biochemical data are sound. The evidence supporting the claims is compelling.

      Weaknesses:

      (1) The identification in this study of a second acidic residue contributing to catalysis but not absolutely essential for catalysis is a useful finding. However, given that many structures of GHKL ATPases have been determined with different nucleotide analogs bound and that the essential role of the first acidic residue is well established, the importance and scope of the advances described here remain focused within the field of study of GHKL ATPases.

      (2) The authors assessed the consequences of variants in the human MutL homologs PMS2 and MLH1, but various other human GHKL ATPases contain clinically relevant variants, some of which have stronger disease associations than the mutations examined in this study. A broader analysis of the effect (or likely effect) of disease-linked mutations in GHKL ATPases would have strengthened this study.

      (3) In MLH1, the E37K mutation completely abolishes ATPase activity, but the corresponding mutations in aqMutL, aqGyrB, and PMS2 do not. It remains unclear why E37K in MLH1 leads to complete loss of activity, as the authors propose that water molecule positioning via the first acidic residue, as well as ATP lid stabilisation and associated conformational changes, should still be possible.

      (4) The authors do not examine ATP binding in the E32 mutants of aqMutL NTD and the D51 mutants of aqGyrB, or AMPPNP binding of the MLH1 and PMS2 mutants. Hence, the relative contributions of the acidic residues to ATP binding and hydrolysis remain partially unclear.

      (5) The ATPase assays for PMS2 and MLH1 (Figure 7 and Table 1) were performed with purification/solubility tags still present. Hence, it cannot be ruled out that these tags influence the measured activities.

      (6) The authors suggest that the two-acidic-residue mechanism proposed in this study could be shared among several GHKL ATPase families, yet they also state that the hydrogen-bonding network was not observed in MutL and MORC family proteins. This raises doubt about how conserved the mechanism is, e.g., in MutL and MORC proteins.

    1. Reviewer #1 (Public review):

      Summary:

      Using an established NAFLD model, a choline-deficient high-fat diet, Barros et al. show that LPS challenge causes excessive IFN-γ production by hepatic NK cells, which in turn induces recruitment and polarization of a PD-L1-positive neutrophil subset, leading to massive TNFα production and increased host mortality. Genetic inhibition of IFN-γ or pharmacological blockade of PD-L1 decreases recruitment of these neutrophils and TNFα release, consequently preventing liver damage and decreasing host death.

      Since NAFLD is often accompanied by chronic, low-grade inflammation, it can lead to an overactive but dysfunctional immune response and increase the body's overall susceptibility to infections. This is therefore a very important research question.

      Strengths:

      The biggest strength of the manuscript is the vast number of mouse strains used.

      Weaknesses:

      After the review, there are still some open questions from my side:

      (1) I would like the authors to defend their choice of diet type since this has not been done in the review/response to authors. In case they cannot, we need additional proof (HFD or WD model).

      (2) Since the authors used the same control groups (chow and HFCD), as required by the animal ethics committee, they should provide a power analysis showing that the number of animals in the control groups (and in the other groups) is sufficient to detect the effect.
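
      The kind of power analysis requested in point (2) can be sketched as follows. This is a minimal illustration using the standard normal approximation for a two-sided, two-group comparison; the effect sizes are hypothetical and are not estimates from the manuscript, and the approximation slightly underestimates the sample size that an exact t-based calculation would give for small groups.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate animals per group for a two-sided, two-sample comparison.

    Uses n = 2 * (z_{1 - alpha/2} + z_{power})^2 / d^2, where d is the
    standardised effect size (Cohen's d). Normal approximation only.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# Hypothetical standardised effect sizes, chosen only for illustration.
for d in (1.0, 1.5, 2.0):
    print(f"d = {d}: about {n_per_group(d)} animals per group")
```

      Reporting the assumed effect size, variance estimate, alpha, and target power alongside the resulting group sizes would let readers judge whether the shared control groups are adequately sized.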

    2. Reviewer #2 (Public review):

      Summary:

      This is an extremely interesting mouse study, trying to understand how sepsis is tolerated during obesity/NAFLD. The researchers combine a well-established model of NASH (Choline-deficiency with High Fat Diet) with a sepsis model (IP injection of 10 mg/kg LPS), leading to dramatic mortality in mice. Using this model, they characterize the complex contributions of immune cells. Specifically, they find that NK cells and neutrophils contribute the most to mortality in this model due to IFNG and PD-L1+ neutrophils.

      Strengths:

      The biggest strength of the manuscript is how clear the primary phenotypes/endpoints of their model are. Within 6 hours of LPS injection, there is a stark elevation of liver inflammation and damage, which is exacerbated by a High Fat/Choline-Deficient diet (HFCD). And after 1 day, almost all of the mice die. Using these endpoints, the authors were able to identify which cells were critical for mortality in the model and the specific mediators involved.

      Comments on revisions:

      I have no further comments.

    1. Reviewer #1 (Public review):

      Summary:

      This study asks whether synapses formed by the same broad neuronal class (excitatory pyramidal neurons, PN) adapt their presynaptic organization in a cortex-specific manner, comparing the prefrontal cortex (PFC) with the primary somatosensory cortex (S1). The authors combine sophisticated electrophysiology (paired recordings and extracellular minimal stimulation), pharmacological perturbations of presynaptic Ca²⁺-secretion coupling, bouton Ca²⁺ imaging, and mechanistic modeling. Across two prominent excitatory connections (Layer 5 (L5) PN-L5PN and L2/3-L5PN), they provide convergent evidence that mature PFC synapses operate with looser Ca²⁺ channel-release sensor coupling than their S1 counterparts.

      Overall, the study provides an appealing mechanistic link between synaptic nano/micro-architecture and cortical-area specialization. The idea that PFC synapses retain a more "plasticity-favoring" presynaptic state, while the primary sensory cortex emphasizes reliability and timing precision, is potentially impactful for how we think about circuit computation and plasticity across cortical hierarchies.

      Strengths:

      A major strength is the multi-pronged experimental strategy. The paper first establishes robust, area-dependent differences in synaptic efficacy, reliability, timing, and short-term plasticity (facilitation prevailing in PFC versus depression in S1), using both paired recordings and minimal extracellular stimulation paradigms. The coupling interpretation is then directly supported by differential sensitivity to EGTA (and appropriate positive-control effects of fast chelators). Finally, volume-averaged calcium signals are reported to be similar across areas, arguing against trivial explanations based on gross differences in calcium influx, and the modeling provides a quantitative framework for interpreting the observed chelator effects.

      Weaknesses:

      Limitations are minor and concern interpretation/clarity rather than core results. Some key inferences rely on indirect readouts (chelator sensitivity, fluctuation analysis-derived parameters, bouton-averaged calcium signals), each of which carries assumptions and potential confounds that should be discussed more explicitly. In particular, the repatching paradigm for the paired-recording EGTA experiment, though very impressive, and the limited number of extracellular calcium conditions used for fluctuation analysis (three concentrations), can influence quantitative estimates and the confidence intervals around them.

    2. Reviewer #2 (Public review):

      Schwarze et al. investigated whether synaptic efficacy is brain-region specific. To this end, they compared synaptic connections established by layer 5 (L5) neocortical pyramidal cells and between L5 and L2/3 pyramidal cells. In order to identify the mechanism of this brain region specificity, the authors employed several experimental approaches, including paired electrophysiological recordings, extracellular stimulation, low- and high-affinity intracellular calcium chelators (EGTA and BAPTA), multiple probability fluctuation analysis (MPFA), and intracellular measurements of calcium transients as well as computational modelling. The findings of the present study indicate that synaptic connections in the primary somatosensory cortex (S1) are significantly stronger and more reliable than those in the prefrontal cortex (PFC).

      The study is timely, and the topic is of significant interest to the neuroscience community. Despite the extensive research that has been carried out on the neuroanatomy and receptor distribution of different brain regions, comparatively little attention has been paid to differences in synaptic physiology. The authors' approach is characterised by its elegance and comprehensive nature, and the conclusions drawn are compelling. Nevertheless, there are a number of unresolved issues.

      Major points:

      (1) The authors state that data from the S1 cortex were obtained in a previous study. In the context of an explicitly comparative study (PFC vs. S1 cortex), it would have been advantageous for the authors to perform a subset of experiments in which both cortices were obtained from a single animal. This is a feasible undertaking, given the spatial separation of the PFC and S1 cortex.

      (2) Figure 1A is somewhat misleading because it could suggest that the authors have performed dual recordings in identified PFC pyramidal cells.

      (3) PFC and S1 cortex in rodents differ markedly in their morphological organisation. For example, in all sensory cortices, layer 4 is very pronounced; however, in the PFC of rodents, no clear layer 4 can be found. On the other hand, PFC shows a clear separation of layers 2 and 3, which is not visible in the S1 cortex. Furthermore, PFC pyramidal cells in layers 2, 3, and 5 exhibit significant heterogeneity, diverging considerably from those found in layers 5a and 5b of S1 cortex. Thus, there is no clear correlation between L5 pyramidal cells in the PFC and the S1 cortex. In order to achieve a meaningful comparison of the data obtained in PFC and S1 cortex, it is necessary for the authors to determine whether they recorded from similar pyramidal cell populations.

      (3) In addition, PFC pyramidal cells in layers 2, 3, and 5 are highly heterogeneous and differ markedly from those in layers 5a and 5b of S1 cortex. To achieve a meaningful comparison of the data obtained in the PFC and the S1 cortex, the authors need to determine whether they recorded from similar pyramidal cell populations.

      (4) For the S1 cortex in rats, it has been found that synaptic connections between pairs of L5a pyramidal cells and between pairs of L5b pyramidal cells differ markedly with respect to mean EPSP amplitude, latency, and coefficient of variation (CV, a surrogate measure of the synaptic release probability) (cf. Markram et al., 1997; Frick et al., 2008). It is therefore likely that PFC and S1 pre- and postsynaptic pyramidal cells differ not only morphologically and electrophysiologically but also in their synaptic properties. At a minimum, the authors need to discuss these confounding issues and preferably address them experimentally. For example, it would be helpful to demonstrate that paired recordings were made from the same pyramidal cell types, perhaps by documenting their morphology and/or firing patterns. In addition, they should discuss the marked difference in EPSP amplitude and putative release probability between their data and the earlier studies.

      (5) For multiple probability fluctuation analysis (MPFA), a parabolic fit with a mere three points is inadequate, particularly because 2 mM and 5 mM Ca2+ are close to the peak of the variance-to-mean parabola, and only 1 mM Ca2+ is on its initial linear part. A more meaningful result would have been obtained with an additional Ca2+ concentration between 1.0 and 2.0 mM, as this is closer to the physiological range. In this context, the authors should have cited the more recent and more detailed papers by the Silver group (Saviane and Silver, 2006; Lanore and Silver, 2016) and not just the Clements and Silver review paper.

      (6) Methods: The authors should clarify whether their paired recordings from L5 pyramidal cells involved whole-cell recordings from both pre- and postsynaptic neurons. From Figure 1B, it appears as if the presynaptic neurons were not recorded in whole-cell mode but rather stimulated in cell-attached mode. This is also reflected in the artefact visible in the current trace recorded in the postsynaptic neuron. The authors should explicitly state their methodological approach and mention how reliable the timing of the presynaptic action potential was under these circumstances. The same holds true for the extracellular stimulation protocol. A significantly more detailed description of the experimental protocol is necessary here.

      (7) Methods: The authors use Student's t-test for data comparison. The authors should verify that the data distribution was indeed normal, e.g. by using a Shapiro-Wilk test. If this is not the case, non-parametric tests should be used.
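
      On point (5), the concern about fitting a parabola to three points can be illustrated with a minimal sketch. It assumes the simplest binomial release model, in which variance = q * mean - mean^2 / N (q: quantal size, N: number of release sites), and fits it by ordinary least squares. The function name and all numbers are hypothetical and are not taken from the manuscript.

```python
def fit_variance_mean(means, variances):
    """Least-squares fit of var = q*mean - mean**2 / N (parabola through the origin).

    Returns (q, N). The model is linear in a = q and b = -1/N.
    """
    s11 = sum(m * m for m in means)
    s12 = sum(m ** 3 for m in means)
    s22 = sum(m ** 4 for m in means)
    t1 = sum(m * v for m, v in zip(means, variances))
    t2 = sum(m * m * v for m, v in zip(means, variances))
    det = s11 * s22 - s12 * s12
    a = (t1 * s22 - t2 * s12) / det
    b = (s11 * t2 - s12 * t1) / det
    return a, -1.0 / b


# Hypothetical noise-free data for q = 10 pA, N = 5 at release
# probabilities 0.15, 0.45 and 0.55: one point on the rising limb,
# two points close to the peak of the parabola.
means = [7.5, 22.5, 27.5]
variances = [63.75, 123.75, 123.75]
q_fit, n_fit = fit_variance_mean(means, variances)
print(q_fit, n_fit)

# With three points and two free parameters there is a single degree of
# freedom, so a 10% error in one variance estimate shifts the fitted N.
q_noisy, n_noisy = fit_variance_mean(means, [63.75, 123.75, 136.1])
print(q_noisy, n_noisy)
```

      Adding a condition on the rising limb, as the comment suggests, would give the fit more leverage on q and leave the near-peak points to constrain N.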

    3. Reviewer #3 (Public review):

      Summary:

      In this manuscript, Max Schwarze and colleagues examined the coupling distance between presynaptic Ca²⁺ channels and the vesicular release sensor at neocortical synapses in mice. They propose that Ca²⁺ channel-release sensor coupling differs across cortical areas, with relatively loose (microdomain) coupling in prefrontal cortex (PFC) and tighter (nanodomain) coupling in primary somatosensory cortex (S1) for comparable pyramidal-neuron synapse types. To test this, they combine paired recordings and minimal stimulation with chelator manipulations (EGTA/BAPTA), mean-variance/MPFA-style analyses, presynaptic Ca²⁺ imaging, and computational modeling. They conclude that presynaptic coupling organization is area-specific in the mature cortex and contributes to regional differences in synaptic timing, reliability, and short-term plasticity.

      Strengths:

      This study tackles an important question and is strengthened by a cohesive body of evidence assembled from multiple complementary approaches. A major asset is the inclusion of high-value datasets, particularly the paired recordings between L5 pyramidal neurons and the systematic assessment of EGTA sensitivity, which provide a solid functional foundation for the authors' central claims. The work is further distinguished by its genuinely multimodal design: combining electrophysiology with presynaptic calcium imaging (and integrating these observations with quantitative analyses and modeling) offers a more mechanistic view of neurotransmitter release than any single method could provide. Overall, the direct, within-framework comparison of presynaptic release-control mechanisms across cortical areas for comparable synapse types is compelling and gives the conclusions a level of robustness and interpretability that is often difficult to achieve in studies of cortical synaptic diversity.

      Weaknesses:

      Several aspects would benefit from clearer explanation, stronger integration with the existing literature, and a more explicit discussion of limitations and potential confounds. Without these additions, some conclusions remain speculative. Throughout the manuscript, the authors also often imply that different measurements reflect the same underlying synapse population. This is unlikely to be strictly true across all experiments and makes it difficult to integrate results from the various approaches into a single, unified set of functional synaptic properties. In addition, some statements, particularly those linking coupling mode to "higher-order neocortical functions", appear broader than what is directly supported by the experiments and should be tempered or more precisely scoped.

      Below, I list several topics that could help better frame the main findings of the present study and clarify how it relates to previously published work.

      (1) The authors use EGTA sensitivity of EPSCs (together with additional metrics) to argue that S1 and PFC synapses differ in Ca²⁺ channel-release sensor coupling. While this is a plausible interpretation, EGTA effects are not uniquely determined by coupling distance and can also reflect differences in Ca²⁺ entry kinetics, action potential waveform, endogenous buffering/extrusion, or release-sensor/vesicle state. The authors use a constrained modeling approach, but the rationale for the different constraint sets is not fully clear from the current description. It would be helpful to expand and clarify the Methods section to explain how these constraints were defined, justified, and applied (and how alternative constraint choices would affect the results). In this context, the Abstract's broader claim that the study "reveals microdomain coupling as a presynaptic structure-function correlate of higher-order neocortical functions" appears overstated. Given the well-known diversity of cortical synapses even within a single region (e.g., synapses onto different interneuron subclasses or different PN cell types, extracortical sources like thalamus), the authors should clarify the intended scope: is the conclusion meant to apply broadly across synapse classes in S1 and PFC, or only to the specific connection type(s) examined here?

      (2) The chelator logic is sound in principle, but the Discussion should more explicitly acknowledge standard caveats and alternative explanations. The authors partly address this by including presynaptic Ca²⁺ imaging and modeling, yet it would help to explain more clearly how the combination of (i) chelator sensitivity, (ii) presynaptic Ca²⁺ signals, and (iii) model constraints rules out, or substantially reduces the likelihood of, changes in AP waveform, Ca²⁺ influx kinetics, buffering/extrusion, or sensor/vesicle state as the primary drivers. In addition, recent hypotheses emphasizing vesicle priming and/or release-site occupancy as contributors to apparent EGTA sensitivity should be discussed as a complementary or alternative interpretation.

      (3) A substantial portion of the S1 comparison appears to rely on previously published datasets. This should be made unambiguous in the Results and Methods, and it would be helpful to summarize this clearly (e.g., in a table indicating which figures/analyses use new data versus reanalysis of published data). If this information is already present, it should be highlighted more prominently.

      (4) The modeling is informative, but the choice of a specific VGCC-release-site geometry and channel arrangement is not sufficiently justified. The manuscript adopts a particular spatial configuration, yet the rationale for selecting this geometry, rather than other plausible architectures discussed in the literature, is not clearly explained, nor is it meaningfully revisited in the Discussion. The authors should justify why the same organization is assumed across two distinct cortical areas and, ideally, include (or at a minimum discuss) a sensitivity analysis showing how key inferences (e.g., coupling distance and channel number) depend on the assumed geometry.

      (5) The calcium imaging data are valuable, but given the diversity of synapses within each cortical layer, it is not clear that imaged boutons can be confidently assigned to the specific connection types being interrogated electrophysiologically. A substantial fraction of boutons likely corresponds to different postsynaptic targets (including interneurons and distinct pyramidal-cell classes), and this heterogeneity could complicate interpretation. This limitation should be discussed explicitly.

      (6) In unitary connections, the authors assess EGTA effects alongside other functional parameters (strength, delay, short-term plasticity), which is a major strength. However, for L2/3 to L5 connections, it appears that EGTA sensitivity was tested primarily using extracellular stimulation. Given anatomical and circuit differences between PFC and S1, extracellular stimulation may recruit different synapse populations across regions, potentially confounding regional comparisons of EGTA sensitivity. This limitation should be acknowledged explicitly. While I am not requesting technically demanding L2/3↔L5 paired recordings in S1, the possibility that different synapse identities are being sampled should be treated as a meaningful source of uncertainty. The Discussion would also benefit from placing the magnitude of EGTA effects in the context of prior "loose coupling" literature, where comparatively large EGTA effects have been reported in some systems. In addition, the reported difference between adult PFC EGTA effects and S1 inhibition appears small (on the order of <10%) and should be interpreted cautiously, especially given that PFC and S1 mature on different timelines and P21-P26 is unlikely to reflect a mature PFC circuit state. The adult cohort (P90-P100) is therefore important, but the age mismatch complicates PFC-S1 comparisons; ideally, S1 should be assessed at matched ages, or this limitation should be discussed explicitly. Finally, for statistical robustness, in panel D of Figure 2, were the comparisons corrected for multiple testing to control Type I error?

      (7) Alterations in initial release probability are often associated with changes in short-term plasticity. In the present manuscript, the authors report similar initial release probability at PFC and S1 synapses, yet observe differences in short-term plasticity profiles. The mechanistic basis for this apparent dissociation is not addressed and should be discussed explicitly, including potential explanations.

      (8) There are multiple instances where the text appears to cite non-existent or misnumbered figure panels (e.g., references to "Figure 4G-I / 4J" when the relevant material appears elsewhere). These should be corrected throughout, as they currently reduce readability and confidence.

      (9) The Methods describe P21-P26 animals, whereas the Results include older cohorts (e.g., P90-P100) and additional regions (e.g., mPFC). The Methods should be updated so that all cohorts and regions analyzed in the Results are fully described.
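
      On points (1) and (2), the reasoning that links chelator sensitivity to coupling distance can be sketched with the standard linearized-buffer approximation, in which a mobile buffer attenuates the Ca²⁺ signal at distance r from an open channel by exp(-r/λ), with λ = sqrt(D_Ca / (k_on * [B])). The diffusion coefficient, binding rates, and chelator concentration below are illustrative order-of-magnitude assumptions, not values from the manuscript, and the approximation deliberately ignores the other factors listed in point (1).

```python
import math

D_CA_UM2_PER_S = 220.0  # assumed Ca2+ diffusion coefficient (um^2/s)


def length_constant_nm(k_on_per_M_s, buffer_M, d_ca=D_CA_UM2_PER_S):
    """Buffer length constant lambda = sqrt(D_Ca / (k_on * [B])), in nm."""
    return math.sqrt(d_ca / (k_on_per_M_s * buffer_M)) * 1000.0


def attenuation(r_nm, lam_nm):
    """Fraction of the unbuffered Ca2+ signal remaining at distance r."""
    return math.exp(-r_nm / lam_nm)


# Assumed forward binding rates (1/(M*s)) and a hypothetical 10 mM load
chelators = {"EGTA": 1e7, "BAPTA": 4e8}
for name, k_on in chelators.items():
    lam = length_constant_nm(k_on, 10e-3)
    print(name, "lambda (nm):", round(lam, 1),
          "| remaining at 20 nm:", round(attenuation(20, lam), 2),
          "| remaining at 100 nm:", round(attenuation(100, lam), 2))
```

      Under these assumptions, EGTA leaves most of the signal intact at about 20 nm but suppresses it strongly at about 100 nm, whereas BAPTA acts at both distances. This is the usual basis for reading EGTA sensitivity as loose coupling, and it is also why any change in influx, endogenous buffering, or sensor state can shift the apparent result.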

    1. Reviewer #1 (Public review):

      Very nice and coherent body of work with appropriate in vitro to in vivo transition in methods.

      Lovely and easy to follow figures that can be understood even without the manuscript.

      My recommendation is that a sentence or two be added clearly stating that the authors think nafamostat is off the table and suggesting other approaches/drugs that might be considered, instead of just making a general statement. I think all this can be done in a few sentences.

      Gabexate was administered to a snakebite victim in this case report from about 20 years ago, which is also a good example of the now better recognized threat to pregnancy.

      Nasu K, Ueda T, Miyakawa I. Intrauterine fetal death caused by pit viper venom poisoning in early pregnancy. Gynecol Obstet Invest. 2004;57(2):114-6. doi: 10.1159/000075676. Epub 2003 Dec 19. PMID: 14691344

    2. Reviewer #2 (Public review):

      Summary:

      The authors set out to test whether a defined set of small molecules can lessen damaging effects caused by venoms from several Bothrops species, and whether these effects are consistent enough to suggest a broadly applicable approach. They present a cross-venom dataset spanning in-vitro activity readouts and blood-based functional outcomes, and include a chicken embryo model to explore whether venom inhibition can translate into improved survival. The central message is that certain small molecules can reduce specific venom-driven effects across multiple samples, providing a comparative resource for the field and a basis for prioritizing future validation.

      Strengths:

      The main value of this work is the breadth and structure of the dataset, which places multiple venoms and multiple readouts into a single, comparable framework that should be useful for readers evaluating patterns across samples. The experimental flow is generally coherent, moving from activity measurements to functional outcomes and then to an in-vivo test, which helps the reader understand how the authors link mechanism-oriented assays to more integrated endpoints. The manuscript also provides practical information for the community by highlighting which readouts appear most consistently affected across venoms, which can help guide hypothesis generation and study design in follow-up work.

      Comments on revisions:

      I would like to thank the authors for answering my questions. The manuscript has gained in quality now that its limitations are better stated.

    1. Reviewer #1 (Public review):

      Summary:

      This study presents a new Bayesian approach to estimate importation probabilities of malaria by combining epidemiological data, travel history, and genetic data through pairwise IBD estimates. Importation is an important factor challenging malaria elimination, especially in low-transmission settings. This paper focuses on Magude and Matutuine, two districts in southern Mozambique with very low malaria transmission. The results show isolation-by-distance in Mozambique, with genetic relatedness decreasing at distances larger than 100 km, no spatial correlation at distances between 10 and 100 km, and strong spatial correlation again at distances smaller than 10 km. They report high genetic relatedness between Matutuine and Inhambane, higher than between Matutuine and Magude. Inhambane is the main source of importation in Matutuine, accounting for 63.5% of imported cases. Magude, on the other hand, shows lower importation and travel rates than Matutuine, as it is a rural area with less mobility. Additionally, they report higher levels of importation and travel in the dry season, when transmission is lower. No association with importation was found for occupation, sex, and other factors. These data have practical implications for public health strategies aiming at malaria elimination, for example, testing and treating travelers from Matutuine in the dry season.
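
      The general logic of such a Bayesian case classification can be sketched as follows. This is a generic two-hypothesis Bayes update, not the authors' model: the prior (for example, derived from travel history) and the likelihoods of the observed genetic relatedness under each hypothesis are hypothetical inputs chosen for illustration.

```python
def importation_posterior(prior_imported, lik_if_imported, lik_if_local):
    """P(imported | data) for a single case via Bayes' rule."""
    numerator = prior_imported * lik_if_imported
    denominator = numerator + (1.0 - prior_imported) * lik_if_local
    return numerator / denominator


# Hypothetical case: reported travel gives a prior of 0.3, and the observed
# IBD sharing is four times more likely if the infection was acquired at
# the travel destination than if it was acquired locally.
posterior = importation_posterior(0.3, 0.8, 0.2)
print(round(posterior, 3))
```

      In this form, the genetic evidence enters only through the ratio of the two likelihoods, so uninformative relatedness data leave the travel-based prior unchanged.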

      Strengths:

      The strength of this study lies in the combination of different sources of data (epidemiological, travel, and genetic) to estimate importation probabilities, and in the statistical analyses.

      Weaknesses:

      The authors recognize the limitations related to sample size and the biases of travel reports.

    2. Reviewer #2 (Public review):

      Summary:

      Based on a detailed dataset, the authors present a novel Bayesian approach to classify malaria cases as either imported or locally acquired.

      Strengths:

      The proposed Bayesian approach for case classification is simple, well justified, and allows the integration of parasite genomics, travel history, and epidemiological data.

      Weakness:

      While the authors aim to classify cases as imported or locally acquired, the method does not quantify the contribution of each case type to overall transmission, which the authors leave for future study.

    3. Reviewer #3 (Public review):

      This work provides a novel statistical model to identify imported malaria cases, which are an important challenge for elimination, particularly in low-transmission areas. This tool was applied in Plasmodium falciparum populations in Mozambique and determined differences in importation rates in two low-transmission districts in the South.

      Strengths:

      The study has several strengths, particularly the development of a novel Bayesian model integrating genomic, epidemiological, and travel data to estimate importation probabilities. The findings provided important insights into malaria transmission dynamics, including the identification of importation sources and regional differences in importation rates across Mozambique. These results highlight the potential value of targeted interventions among traveler populations to support malaria elimination efforts. Moreover, this approach could be adapted to other epidemiological settings.

      Weaknesses:

      The study has some limitations, including uneven sample representation across provinces, incomplete metadata for risk factor analysis and a proxy for transmission intensity. Future work will include a new sample collection effort and the incorporation of monthly malaria incidence estimates.

    1. Reviewer #1 (Public review):

      Summary:

      Sidarta-Oliveira et al. present TopOMetry, a novel dimensionality reduction method based on the eigendecomposition of an approximated Laplace-Beltrami operator. In short, TopOMetry is an iterative version of the existing spectral methods (e.g., Laplacian Eigenmaps or diffusion maps). It approximates the Laplacian operator twice, once in a "phenotypic space" and then once again in the eigenbasis space. By doing this, the approximated operator will contain more information about the manifold, which allows for more robust and accurate downstream analyses.
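
      For readers unfamiliar with the spectral methods mentioned here, the following is a minimal pure-Python sketch of a single diffusion-map-style step: build a Gaussian affinity graph, normalize it, and extract its leading non-trivial eigenvector by power iteration. It is not TopOMetry and does not use its API; the kernel bandwidth, the toy data, and the function name are assumptions chosen for illustration.

```python
import math


def diffusion_coordinate(points, bandwidth=1.0, iters=500):
    """First non-trivial diffusion coordinate of 1-D points (toy example)."""
    n = len(points)
    # Gaussian affinities W and node degrees d
    w = [[math.exp(-((a - b) ** 2) / (2 * bandwidth ** 2)) for b in points]
         for a in points]
    d = [sum(row) for row in w]
    # Symmetric normalization S = D^-1/2 W D^-1/2 (same spectrum as D^-1 W)
    s = [[w[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)] for i in range(n)]
    # The trivial top eigenvector of S is proportional to sqrt(d)
    norm = math.sqrt(sum(d))
    v1 = [math.sqrt(x) / norm for x in d]
    # Power iteration, projecting out the trivial eigenvector at every step
    x = [float(i) for i in range(n)]
    for _ in range(iters):
        dot = sum(a * b for a, b in zip(x, v1))
        x = [a - dot * b for a, b in zip(x, v1)]
        x = [sum(s[i][j] * x[j] for j in range(n)) for i in range(n)]
        scale = math.sqrt(sum(a * a for a in x))
        x = [a / scale for a in x]
    # Map back to an eigenvector of the random-walk operator D^-1 W
    return [x[i] / math.sqrt(d[i]) for i in range(n)]


coords = diffusion_coordinate([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
print([round(c, 3) for c in coords])
```

      For points lying along a line, the resulting coordinate orders them along that line, which is the sense in which the eigenvectors of a graph Laplacian recover manifold structure. Repeating the construction on the eigenvector coordinates would correspond loosely to the second approximation described above.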

      Strengths:

      - Introduces operator-native fidelity scores and Riemannian diagnostics to single-cell analysis, enabling researchers to evaluate and trust embeddings - functionality absent in prior methods.

      - The approach was rigorously tested based on synthetic and real single-cell RNA-seq datasets.

      - The package is well-made and easily scalable to millions of cells.

      - The comprehensive documentation helps the end-users to run desired analyses.

      Weaknesses:

      - The method is an extension of the current state-of-the-art methods, not a fundamentally new one.

      Comments on revised version:

      The revised manuscript partially addresses the concerns raised in the prior review. The jargon weakness has been substantially mitigated by relocating mathematical derivations to the Methods section and simplifying language in the main text; this weakness has been updated accordingly.

      The introduction of operator-native fidelity scores and Riemannian diagnostics represents a meaningful addition and has been added to the Strengths. The benchmarking scope has also been notably expanded.

      The core weakness - that the method is an extension of existing spectral methods rather than a fundamentally new contribution - remains unchanged, as the authors' rebuttal did not provide a sufficiently precise mathematical argument to overturn it.

    2. Reviewer #2 (Public review):

      Summary:

      This work introduces a novel framework to systematically learn the latent dimensions of single-cell data, grounded in the theory of the Riemannian manifold. The authors demonstrate how this framework can be applied to various important tasks, such as estimating intrinsic dimensionalities, annotating cell types, etc. They did a great job of tackling an important but not yet established problem in the field and approaching it with a theoretically sound and novel approach. I think after a more rigorous and comprehensive validation, this work could be impactful.

      Strengths:

      - Dimensionality reduction is a routine step in analyzing many high-dimensional data, such as molecular data. While the downstream analysis results depend heavily on this step, existing methods rely on strong assumptions and are sometimes heuristic. The authors present a novel, theoretically grounded approach to address this important problem.

      - The authors demonstrated its usability in downstream analysis in a comprehensive manner. In particular, they show evidence suggesting novel T-cell subpopulations.

      - I commend the authors for releasing and maintaining their software well with comprehensive documentation. This significantly increases the usability and accessibility of the method.

      Weaknesses:

      - The paper lacks experiments that validate the results. It would be beneficial to see additional evaluation settings with better-established ground truths to more strongly demonstrate the method's effectiveness.

      - Batch effects are prevalent in single-cell data. The paper does not adequately address how the proposed method handles this issue.

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Ding et al. use genetic mouse models to demonstrate that atrial trabeculation is more dependent on Tie1/Tie2 signaling than ventricular trabeculation. With additional experimentation that would support the current claims, the results may hold significant value, as atrial trabeculation remains an understudied phenomenon in cardiac biology with potential implications for atrial cardiomyopathy and atrial fibrillation.

      Strengths:

      Detailed characterization of atrial versus ventricular trabeculation across different developmental timepoints, and the use of appropriate animal models to address the scientific question at hand.

      Weaknesses:

      The authors have consistently treated mice with tamoxifen after ventricular, but not atrial, trabeculation has already started. As such, the observed cardiac phenotypes - where predominantly atrial trabeculation is affected - might be a mere consequence of the precise time window in which Tie1/2 signaling was impaired, rather than a direct measurement of its relative importance for atrial versus ventricular trabeculation. The conclusions of the paper may thus be significantly strengthened by depleting Tie1/2 signaling prior to the onset of ventricular trabeculation, as is done for atrial trabeculation.

    2. Reviewer #2 (Public review):

      Summary:

      Ding et al. examine the role of TIE1 in cardiac chamber morphogenesis using genetic mouse models targeting Tie1, Tek, or both, and analyzing endocardial cell-mediated chamber formation across multiple embryonic developmental and postnatal stages, supported by analysis of published single-cell datasets and new bulk RNA seq analyses of murine cardiac tissue. The authors find that Tie1 and Tek expression is higher in atrial than ventricular endocardial cells. Notably, endothelial Tie1 is required for atrial trabeculation at E12.5, but is less critical in ventricular trabeculation. TIE1 also acts synergistically with TIE2 during atrial trabeculation. While Tie1 deficiency alone does not cause defects at E10.5, combined heterozygous deletion of Tek disrupts both atrial and ventricular development at E10.5. This synergy is further supported by analyses at later embryonic stages and in postnatal hearts.

      Strengths:

      The study is well-designed, clearly written, and supported by high-quality figures. The performed experiments demonstrate a previously unrecognized role for Tie1 in cardiac development and identify synergistic control of cardiac morphogenesis by Tie1 and Tie2. This synergy is consistent with the previously identified roles of Tie1 and Tek in venous development and with Tie1 involvement in angiopoietin-dependent postnatal vascular and lymphatic remodeling. Together, these findings support a role for Tie1 as a contributor to Ang1-Tie2 signaling during heart development.

      Weaknesses:

      The manuscript does not include direct mechanistic studies; however, RNA seq analysis of atria and ventricles showed reduced expression of Tek, Dll1, and Notch1 upon Tie1 deficiency in developing hearts. Although previously reported mechanisms, such as TIE1-TIE2 heterodimer formation and effects on endothelial junctions, migration, or survival are discussed, no direct mechanistic experiments are performed. Addressing some of these mechanisms would have clarified the basis of Tie1-Tie2 synergy. As two distinct Tie1 models are used, including one targeting the kinase domain, the authors should state whether phenotypes differed or were similar between models.

    3. Reviewer #3 (Public review):

      Summary:

      Ding et al. investigate the roles of TIE1 and TEK (Tie2) in mouse cardiac development, with a particular focus on atrial trabeculation. The authors employ multiple genetic models, including Tie1ICDflox/flox (with Cdh5-CreERT2), a knockout-first allele (EUCOMM, Tie1 tm1a/tm1a), and a Tek deletion model.

      Based on the dataset from Feng et al. 2022 Nat Commun, the authors report increased expression of Tie1 and Tek transcripts in atrial endocardial cells compared to ventricular cells at embryonic day (E) 14.5. Loss of Tie1 leads to early atrial trabeculation defects detectable at E12.5, whereas ventricular defects appear later and are less pronounced at E14.5. Chamber-specific RNA sequencing reveals stronger transcriptional changes in atrial tissue.

      Conditional deletion of Tek results in a similar phenotype, with more pronounced atrial defects. Combined deletion of Tie1 and Tek (Tie1 ΔICD/ΔICD; Tek+/-) leads to earlier and more severe defects in both atrial and ventricular trabeculation and results in embryonic lethality around E12.5, suggesting a synergistic interaction between the two genes.

      Conditional endothelial deletion of Tie1 combined with heterozygous global Tek deletion allows analysis at later embryonic time points and again shows more severe defects in atrial trabeculation. Postnatal analysis of this model reveals reduced heart-to-body weight ratios and potential mild atrial abnormalities.

      Strengths:

      (1) The authors address chamber-specific signaling mechanisms underlying atrial versus ventricular trabeculation, an area of high developmental and clinical relevance.

      (2) The study provides a comprehensive temporal analysis across multiple embryonic stages.

      (3) The use of multiple genetic models strengthens the overall conclusions and allows comparative interpretation.

      (4) While focusing on trabeculation, the authors also include observations on coronary vessel development, increasing the broader relevance of the work. The findings are therefore of interest to the wider cardiovascular research community.

      Weaknesses:

      (1) Timing of recombination vs. trabeculation onset

      Ventricular trabeculation begins earlier than atrial trabeculation. Since tamoxifen (in contrast to 4-hydroxytamoxifen) requires metabolic activation, Cre-mediated recombination will occur with a delay. This suggests that atrial trabeculation may be targeted before its onset, whereas ventricular trabeculation may already be underway for 2-3 days at the time of effective gene deletion.

      How do the authors account for this discrepancy in their interpretation?

      Have earlier induction time points been tested to better capture the onset of ventricular trabeculation? This limitation should be explicitly discussed.

      (2) Clarity of genetic models and experimental design

      The study employs several genetic constructs. It would improve clarity if, for each experiment, the specific genetic model and tamoxifen regimen were clearly described before presenting the results.

      (3) Tie1 tm1a/tm1a phenotype vs. known global knockout

      Previous studies (PMID: 8846781, 7596437) show that complete Tie1 loss leads to severe edema, vascular rupture, and embryonic lethality around E13.5-E14.5.

      How does the Tie1 tm1a/tm1a allele differ, given that animals appear to survive longer? Is this allele hypomorphic rather than a full knockout?

      This point requires clarification.

      (4) Limited mechanistic insight

      While the authors aim to investigate underlying mechanisms, the current study is largely descriptive and based on mRNA expression and genetic interaction analyses (Tie1/Tek co-deletion). Direct mechanistic insights into signaling pathways remain limited. However, the dataset provides a valuable foundation for future mechanistic studies, which should be more clearly acknowledged in the discussion.

    1. Reviewer #1 (Public review):

      Summary:

      In this review paper, the authors describe the concept of neural correlates of consciousness (NCC) and explain how noninvasive neuroimaging methods fall short of being able to properly characterise an unconfounded NCC. They argue that intracranial research is a means to address this gap and provide a review of many intracranial neuroimaging studies that have sought to answer questions regarding the neural basis of perceptual consciousness.

      Strengths:

      The authors have provided an in-depth, timely, and scholarly contribution to the study of NCCs. First and foremost, the review surveys a vast array of literature. The authors synthesise findings such that a coherent narrative of what invasive electrophysiology studies have revealed about the neural basis of consciousness can be easily grasped by the reader. The authors also succeed in describing how single-cell recordings can interface with task-design to help mitigate the impact of confounded neural activity when searching for NCCs.

      The review is also, to the best of my knowledge, the first to specifically target intracranial approaches to consciousness and to describe their results in a single article. This is a credit to the authors: as it becomes ever harder to apply strict tests to theories of consciousness using methods such as fMRI and M/EEG, it is important to have informative resources describing the results of human intracranial research so that theorists will have to constrain their theories further in accordance with such data. Additionally, the authors provide a compelling case for single-cell research in consciousness science, despite the dominance of theories situated at the system and circuit level of analysis. Insofar as the authors were aiming to provide a complete and coherent overview of intracranial approaches to the study of NCCs, I believe they have achieved their aim.

      Weaknesses:

      Overall, I feel positive about this paper. The authors have addressed my comments from my previous review and I see no significant weaknesses in the current version.

      Comment on revised version:

      No comments - congratulations to the authors!

    2. Reviewer #2 (Public review):

      Summary:

      In this work, the authors review the study of the neural correlates of consciousness (NCCs). They discuss several of the difficulties that researchers must face when studying NCCs, and argue that several of these difficulties can be alleviated by using intracranial recordings in humans.

      They describe what constitutes an NCC, and the difficulty of distinguishing an NCC proper from the prerequisites and consequences of conscious processing.

      They also describe the two main types of experimental designs used to study NCCs. These are the contrastive approach (with its report and non-report variants), and the supraliminal approach, each with their own merits and pitfalls.

      They discuss the limitations of non-invasive methods, such as fMRI, EEG and MEG, as well as the limitations of the use of invasive recordings in non-human animals.

      After setting the stage in this way, the authors provide an extensive review of the knowledge acquired by using invasive recordings in humans. This includes population-level measurements in vision and in other sensory modalities, as well as single-neuron-level studies. The authors also discuss studies of subcortical NCCs.

      The second half of this work discusses the theoretical insights gained through the use of intracranial recordings, as well as their limitations, and a perspective for future work.

      Strengths:

      This work offers an impressive review, which will serve as a useful reference document, both for newcomers to the study of NCCs and for experienced researchers. The inclusion of non-visual and subcortical NCCs is of particular merit, as these have been understudied.

      Besides serving as a review, this work includes a perspective, exploring several directions to pursue for the progress of the field.

      Weaknesses:

      No major weaknesses.

      Appraisal of whether the authors achieved their aims:

      In this work, the authors have gathered an impressive review, and have discussed several important problems in the field of study of NCCs, as well as provided a perspective on how the field could move forward.

      Discussion of the likely impact of the work on the field:

      This work has the potential to become a must-read for anyone working in the field of consciousness research.

      Comment on revised version:

      The authors have addressed all my concerns. Once again, my compliments for a nice piece of work.

    3. Reviewer #3 (Public review):

      Summary:

      This narrative review provides a clear, well-structured, and comprehensive synthesis of intracerebral recording work on the neural correlates of consciousness. It is written in an accessible manner that will be useful to a broad community of researchers, from those new to iEEG to specialists in the field.

      Strengths:

      The manuscript successfully integrates methodological and theoretical perspectives and offers a balanced overview of current, sometimes contradictory evidence. As such, the manuscript is important as a call for a concerted and better exploration of NCCs using iEEG in the future.

      Weaknesses:

      The manuscript extensively discusses the use of "report" as a criterion for identifying conscious perception and its limitations for separating correlates of consciousness from post-consciousness processes, yet the term is not defined at the outset. The authors should specify what they mean by "report" (e.g., verbal report, nonverbal self-report, or any meta-cognitive indication of experience). Importantly, this definition should be explicitly linked to the theoretical landscape: whether the authors adopt an access-consciousness perspective in which (self) reportability is central, or whether the review also aims to address phenomenal consciousness. Making this conceptual grounding explicit at the beginning will help readers interpret the empirical work surveyed throughout the review.

      In addition, the review would benefit from an earlier introduction of the distinction between states and contents of consciousness. This distinction becomes important in the later section on anesthesia, sleep, and epileptic seizures, where the focus shifts from content-specific NCCs to alterations in global states. Presenting these definitions upfront, and briefly explaining how states and contents interact, would strengthen the coherence of the manuscript.

      Overall, this is an excellent and timely review. With clearer initial theoretical definitions of consciousness, the manuscript will offer an even stronger conceptual framework for interpreting intracerebral studies of consciousness.

      Comments on revised version:

      The current version of the manuscript is clear and complete. Kudos to the authors for their thorough revisions.

      My only remaining point concerns the definition of "report": "We define a report as any explicit behavioral response (whether verbal, manual, or otherwise) that communicates a participant's subjective state."

      It would be helpful to clarify whether this definition is intended to exclude purely internal, explicit self-reports that are not externally expressed. As currently formulated, the definition appears to require overt behavioral communication. However, this raises a conceptual issue in relation to the no-report paradigm literature, where the distinction between report, metacognitive access, and overt motor/verbal expression is precisely at stake.

      Could the authors specify whether "report" is meant to (i) be restricted to externally observable, behaviorally expressed reports, or (ii) extend to internally generated, explicit metacognitive judgments even when they are not communicated? Clarifying this point would help situate the manuscript more precisely within ongoing debates on the role of report in identifying neural correlates of consciousness.

    4. Reviewer #1 (Public review):

      Summary

      In this review paper, the authors describe the concept of neural correlates of consciousness (NCC) and explain how noninvasive neuroimaging methods fall short of being able to properly characterise an unconfounded NCC. They argue that intracranial research is a means to address this gap and provide a review of many intracranial neuroimaging studies that have sought to answer questions regarding the neural basis of perceptual consciousness.

      Strengths

      The authors have provided an in-depth, timely, and scholarly contribution to the study of NCCs. First and foremost, the review surveys a vast array of literature. The authors synthesise findings such that a coherent narrative of what invasive electrophysiology studies have revealed about the neural basis of consciousness can be easily grasped by the reader. The review is also, to the best of my knowledge, the first review to specifically target intracranial approaches to consciousness and to describe their results in a single article. This is a credit to the authors: as it becomes ever harder to apply strict tests to theories of consciousness using methods such as fMRI and M/EEG, it is important to have informative resources describing the results of human intracranial research so that theorists will have to constrain their theories further in accordance with such data. As far as the authors were aiming to provide a complete and coherent overview of intracranial approaches to the study of NCCs, I believe they have achieved their aim.

      Weaknesses

      Overall, I feel positive about this paper. However, there are a couple of aspects to the manuscript that I think could be improved.

      (1) Distinguishing NCCs from their prerequisites or consequences

      This section in the introduction was particularly confusing to me. Namely, in this section, the authors' aim is to explain how intracranial recordings can help distinguish 'pure' NCCs from their antecedents and consequences. However, the authors almost exclusively describe different tasks (e.g., no-report tasks) that have been used to help solve this problem, rather than elaborating on how intracranial recordings may resolve this issue. The authors claim that no-report designs rely on null findings, and invasive recordings can be more sensitive to smaller effects, which can help in such cases. However, this motivation pertains to the previous sub-section (limits of noninvasive methods), since it is primarily concerned with the lack of temporal and spatial resolution of fMRI and M/EEG. It is not, in and of itself, a means to distinguish NCCs from their confounds.

      As such, in its current formulation, I do not find the argument that intracranial recordings are better suited to identifying pure NCCs (i.e. separating them from pre- or post-processing) convincing. To me, this is a problem solved through novel paradigms and better-developed theories. As it stands, the paper justifies my position by highlighting task developments that help to distinguish NCCs from prerequisites and consequences, rather than giving a novel argument as to why intracranial recordings outperform noninvasive methods beyond the reasons they explained in the previous section. Again, this position is justified when, from lines 505-506, the authors describe how none of the reported single-cell studies were able to dissociate NCCs from post-perceptual processing. As such, it seems as if, even with intracranial recording, NCCs and their confounds cannot be disentangled without appropriate tasks.

      The section 'Towards Better Behavioural Paradigms' is a clear attempt to address these issues and, as such, I am sure the authors share the same concerns as I am raising. Still, I remain unconvinced that the distinguishing of NCCs from pre-/post- processing is a fair motivation for using intracranial over noninvasive measures.

      (2) Drawing misleading conclusions from certain studies

      There are passages of the manuscript where the authors draw conclusions from studies that are not necessarily warranted by the studies they cite. For instance:

      Lines 265 - 271: "The results of these two studies revealed a complex pattern: on the one hand, HGA in the lateral occipitotemporal cortex and the ventral visual cortex correlated with stimulus strength. On the other hand, it also correlated with another factor that does not appear to play a role in visibility (repetition suppression), and did not correlate with a non-sensory factor that affects visibility reports (prior exposure). These results suggest that activity in occipitotemporal cortex regions reflecting higher-order visual processing may be a precursor to the NCC but not an NCC proper."

      It's possible to imagine a theory that would predict HGA could correlate with stimulus strength and repetition suppression, or that it would not correlate with prior exposure (e.g. prior exposure could impact response bias without affecting subjective visibility itself). The authors describe this exact ambiguity in interpretation later in the article (line 664), but in its current form, at least in line 270 (when the study is most extensively discussed), the manuscript heavily implies that HGA is not an NCC proper. This generates a false impression that intracranial recordings have conclusively determined that occipitotemporal HGA is not a pure NCC, which is certainly a premature conclusion.

      Line 243: "Altogether, these early human intracranial studies indicate that early-latency visual processing steps, reflected in broadband and low gamma activity, occur irrespective of whether a stimulus is consciously perceived or not. They also identified a candidate NCC: later (>200 ms) activity in the occipitotemporal region responsible for higher-order visual processing."

      The authors claim in this section that later (>200 ms) activity in occipitotemporal regions may be a candidate for an NCC. However, the Fisch et al. (2009) study they describe in support of this conclusion found that early (~150 ms) activity could dissociate conscious and unconscious processing. This would suggest that it is early processing that lays claim to perceptual consciousness. The authors explicitly describe the Fisch et al. results as showing evidence for early markers of consciousness (line 240: '...exhibited an early...response following recognized vs unrecognised stimuli.'). Yet only a few lines later they use this to support the conclusion that a candidate NCC is 'later (>200 ms) activity in the occipitotemporal region' (line 245). As such, I am not sure what conclusion the authors want me to draw from these studies.

      This problem is repeated in lines 386-387: "Altogether, studies that investigated the cortical correlates of visual consciousness point to a role of neural responses starting ~250 ms after stimulus onset in the non-primary visual cortex and prefrontal cortex."

      This seems to be directly in conflict with the Fisch et al. results, which show that correlates of consciousness can begin ~100 ms earlier than the authors state in this passage.

      (3) Justifying single-neuron cortical correlates of consciousness

      The purpose of the present manuscript is to highlight why and how intracortical measures of neural activity can help reveal the neural correlates of perceptual consciousness. As such, in the section 'Single-neuron cortical correlates of perceptual consciousness', I think the paper is lacking an argument as to why single-neuron research is useful when searching for the NCC. Most theories of consciousness are based around circuit or system-level analyses (e.g., global ignition, recurrent feedback, prefrontal indexing, etc.) and usually do not make predictions about single cells. Without any elaboration or argument as to why single-cell research is necessary for a science of consciousness, the research described in this section, although excellent and valuable in its own right, seems out of place in the broader discussion of NCCs. A particularly strong interpretation here could be that intracranial recordings mislead researchers into studying single cells simply because it is the finest level of analysis, rather than because it offers helpful insight into the NCCs.

      (4) No mention of combined fMRI-EEG research

      A minor point, but I was surprised that the authors did not mention any combined fMRI-EEG research when they were discussing the limits of noninvasive recordings. Intracortical recordings are one way to surpass the spatial and temporal resolution limits of M/EEG and fMRI respectively, but studies that combine fMRI and EEG are also an alternative means to solve this problem: by combining the spatial resolution of fMRI with the temporal resolution of EEG, researchers can - in theory - compare when and where certain activity patterns (be they univariate ERPs or multivariate patterns) arise. The authors do cite one paper (Dellert et al., 2021 JNeuro) that used this kind of setup, but they discuss it only with respect to the task and ignore the recording method. The argument for using intracranial recordings is weaker for not mentioning a viable, noninvasive alternative that resolves the same issues.

    5. Reviewer #2 (Public review):

      Summary:

      In this work, the authors review the study of the neural correlates of consciousness (NCCs). They discuss several of the difficulties that researchers must face when studying NCCs, and argue that several of these difficulties can be alleviated by using intracranial recordings in humans.

      They describe what constitutes an NCC, and the difficulty of distinguishing an NCC proper from the prerequisites and consequences of conscious processing.

      They also describe the two main types of experimental designs used to study NCCs. These are the contrastive approach (with its report and non-report variants), and the supraliminal approach, each with its own merits and pitfalls.

      They discuss the limitations of non-invasive methods, such as fMRI, EEG and MEG, as well as the limitations of the use of invasive recordings in non-human animals.

      After setting the stage in this way, the authors provide an extensive review of the knowledge acquired by using invasive recordings in humans. This included population-level measurements in vision and in other sensory modalities, as well as single-neuron level studies. The authors also discuss studies of subcortical NCCs.

      The second half of this work discusses the theoretical insights gained through the use of intracranial recordings, as well as their limitations, and a perspective for future work.

      Strengths:

      This work offers an impressive review, which will serve as a useful reference document, both for newcomers to the study of NCCs and for experienced researchers. The inclusion of non-visual and subcortical NCCs is of particular merit, as these have been understudied.

      Besides serving as a review, this work includes a perspective, exploring several directions to pursue for the progress of the field.

      Weaknesses:

      The intention of the authors is to argue how some of the problems faced when studying NCCs are alleviated by the use of intracranial recordings in humans. But in some cases, the link between the problems related to the study of NCCs and the advantages of intracranial recordings over non-invasive methods is not clear.

      For example, the authors explain the difficulties in distinguishing true NCCs from their prerequisites and consequences. This constitutes a difficult conceptual problem that plagues all recording techniques. The authors don't provide a convincing explanation of how intracranial recordings offer advantages over EEG or MEG when dealing with this problem.

      For example, the authors explain how the use of non-report designs to rule out post-perceptual processing relies on null results, which, according to them, are harder to interpret given the low resolution of non-invasive methods. But the interpretation of null results is actually more complicated in the case of intracranial recordings. As the coverage achieved by the electrodes is sparse, if a null result is attested, it remains possible that a true effect was present in a nearby patch of cortex out of coverage.

      The authors argue that the spatial resolution of intracranial recordings is better than that of EEG and MEG. While this is technically true (especially compared to EEG), the true spatial scale of the NCCs is unknown. If NCCs' span is in the mm range, then the additional spatial resolution of intracranial recordings might not be an advantage.

      Another factor that should be taken into consideration when assessing the spatial resolution of intracranial recordings is that while the listening zone of individual intracranial contacts is small, coverage is sparse and defined by clinical criteria (something that the authors discuss). In practice, the activity recorded by contacts is usually attributed to anatomically defined ROIs with a scale in the cm range. Given the sparse and uneven (across regions and patients) coverage afforded by intracranial recordings, the advantage of intracranial recordings in terms of spatial resolution is overstated.

      Appraisal of whether the authors achieved their aims:

      In this work, the authors have gathered an impressive review and have discussed several important problems in the field of study of NCCs, as well as provided a perspective on how the field could move forward.

      What is less clear is how the use of intracranial recordings per se holds potential to overcome problems such as the distinction between true NCCs and the prerequisites and consequences of conscious processing.

      Discussion of the likely impact of the work on the field:

      This work has the potential of becoming a must-read for anyone working in the field of consciousness research.

    6. Reviewer #3 (Public review):

      Summary:

      This narrative review provides a clear, well-structured, and comprehensive synthesis of intracerebral recording work on the neural correlates of consciousness. It is written in an accessible manner that will be useful to a broad community of researchers, from those new to iEEG to specialists in the field.

      Strengths:

      The manuscript successfully integrates methodological and theoretical perspectives and offers a balanced overview of current, sometimes contradicting evidence. As such, the manuscript is important as it calls for a concerted and better exploration of NCCs using iEEG in the future.

      Weaknesses:

      The manuscript extensively discusses the use of "report" as a criterion for identifying conscious perception and its limitations for separating between correlates of consciousness and post-consciousness processes, yet the term is not defined at the outset. The authors should specify what they mean by "report" (e.g., verbal report, nonverbal self-report, or any meta-cognitive indication of experience). Importantly, this definition should be explicitly linked to the theoretical landscape: whether the authors adopt an access-consciousness perspective in which (self) reportability is central, or whether the review also aims to address phenomenal consciousness. Making this conceptual grounding explicit at the beginning will help readers interpret the empirical work surveyed throughout the review.

      In addition, the review would benefit from an earlier introduction of the distinction between states and contents of consciousness. This distinction becomes important in the later section on anaesthesia, sleep, and epileptic seizures, where the focus shifts from content-specific NCCs to alterations in global states. Presenting these definitions upfront and briefly explaining how states and contents interact would strengthen the coherence of the manuscript.

      Overall, this is an excellent and timely review. With clearer initial theoretical definitions of consciousness, the manuscript will offer an even stronger conceptual framework for interpreting intracerebral studies of consciousness.

    1. Reviewer #1 (Public review):

      Summary:

      The authors utilize genetic code expansion to tag TDP-43 and G3BP1, compare this protein tagging system (Anap) with antibody-based detection, and examine protein trafficking and stress granule formation in response to sodium arsenite treatment. They find similar staining to antibodies in HeLa cells, mouse embryonic stem cells and primary mouse cortical neurons. By incorporating the intrinsically fluorescent noncanonical amino acid Anap at carefully selected sites, the authors enable live-cell and neuronal visualization of protein localization, stress-induced redistribution, and dynamic behavior without the structural and functional compromises often associated with large fluorescent protein tags. The work provides a technical framework that will be useful for live imaging of tagged proteins.

      Strengths:

      A key strength is the demonstration of the specificity of the Anap fluorescence signal through appropriate controls and the agreement between Anap labeling and antibody-based detection across multiple cell types, including primary neurons. The ability to visualize stress-induced redistribution of both G3BP1 and TDP-43 in living cells highlights the practical value of this approach.

      The functional validation of TDP-43-Anap is compelling. The rescue of both cell viability and RNA splicing defects in TDP-43 knockout models provides evidence that Anap incorporation preserves core protein functions. This is important, as functional disruption is a central concern for any alternative tagging strategy applied to aggregation-prone or RNA-binding proteins.

      Weaknesses:

      While some inherent limitations of genetic code expansion remain (e.g., variable amber suppression efficiency and the inability to directly assess endogenous protein behavior), these are acknowledged and discussed appropriately. Importantly, these limitations do not undermine the central contributions of the study.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Chen and colleagues describe a novel means of labeling two RNA binding proteins, G3BP1 and TDP-43, using genetic code expansion. Overexpressed constructs that incorporate the intrinsically-fluorescent non-canonical amino acid Anap redistribute to cytoplasmic granules upon application of external stressors such as sodium arsenite. Similar labeling and redistribution of overexpressed G3BP1 and TDP-43 was observed in cultures of mouse primary neurons.

      Genetic code expansion and non-canonical amino acid labeling have many advantages over traditional fusion proteins for tracking protein redistribution in living cells. The authors show that they are able to label exogenous G3BP1 and TDP-43 with the non-canonical amino acid Anap, and follow labeled proteins in living cells with and without stress.

      I suspect that this method could be incredibly valuable to many investigators studying the dynamics and interactions of proteins that are difficult to label or detect by conventional methods.

    1. Reviewer #1 (Public review):

      Summary:

      LRRK2 protein is familially linked to Parkinson's disease by the presence of several gene variants that confer a gain-of-function effect on LRRK2 kinase activity.

      The authors examine the effects of BDNF stimulation in immortalized neuron-like cells, cultured mouse primary neurons, hiPSC-derived neurons, and brain tissue from genetically modified mice. They examine a LRRK2 regulatory phosphorylation residue, LRRK2 binding relationships, the phosphorylation status of other kinases, and measures of synaptic structure and function.

      Strengths:

      The study addresses an important research question: how does a PD-linked protein interact with other proteins, and contribute to responses to a well-characterized neuronal signalling pathway involved in the regulation of synaptic function and cell health.

      They employ a range of models and techniques to convincingly demonstrate that BDNF stimulation alters LRRK2 phosphorylation at pS935 and binding to many proteins. Several independent data sets lead to some exciting conclusions.

      In this re-revised manuscript, some aspects are very convincing and well validated, e.g., drebrin binding to LRRK2, increased by BDNF, and reduced LRRK2 protein levels in young (but not mature) drebrin KO mice. A phosphoproteomic analysis of PD mutant knock-in mouse brain is included. Overall, the links between LRRK2, LRRK2 activity, and the changes to synaptic molecules, structures, and activity are intriguing.

      Weaknesses:

      Enthusiasm for the title claim that "LRRK2 regulates synaptic function through BDNF signalling" is tempered by disconnected results across different model systems and inconsistent alterations in kinase phosphorylation between the SHSY5Y cell line and primary neurons. Exciting conclusions are sometimes not consistently supported by the data and/or rest on experiments conducted in only one of the models.

      BDNF increasing pS935 LRRK2 is quite well supported in cell lines, as is BDNF regulation of drebrin-LRRK2 binding. However, there is a lack of connection between this result and subsequent alterations to LRRK2 substrates, e.g., phosphorylation of Rab GTPases, especially in neurons. Interesting omic data sets are provided, but with very little or no validation. For example, only drebrin protein was assessed from the BDNF treatment omic data set, and the phosphoproteomic analysis of the PD mutant knock-in mouse is stand-alone, with no validation, and G2019S is not explored elsewhere in the study.

      The major disconnect this reviewer struggles with is the conclusion that the quite clear data in SHSY5Y cells is the same as that from neurons regarding BDNF / LRRK2 and ERK / Akt. It seems they are not.

      ERK and Akt phosphorylation by BDNF is absent in CRISPR KO SHSY5Y cells.

      This conclusion is at odds with the interpretation of the neuronal data. To explain: in DIV14 neurons, BDNF's transient increase in pLRRK2 is seen and strongly prevented by MLi2. BDNF also increased pAkt & pERK1&2 in WT... but also in LRRK2 KO. Furthermore, this happened in the presence of MLi2 in WT despite no pLRRK2 increase. While the 5 min BDNF-induced increase in pAkt appears reduced in LKO, the same BDNF time point in LKO with MLi2 is as high as WT (in these unquantified examples) and ERK is almost identical. This is described as "significantly reduced" but I see no replicates or quantification, and a face-value assessment of the blot argues against this.

      Thus, there is little or no evidence that LRRK2 activity is involved upstream in BDNF-stimulated increases in pAkt or pERK in neurons, as neither MLi2 nor KO prevented this.

      Synapse markers increased in WT neurons with BDNF treatment, which did not happen in LKO neurons. So this process requires pLRRK2, but is unrelated to pAkt or pERK (which do still go up with BDNF in KO)? Similarly, an increase in synaptic activity in WT hiPSC neurons in response to BDNF seems lost in LRRK2 KO hiPSC neurons, although their activity is already increased, and the effects differed depending on the age of the cells. Both of these experiments lack supporting evidence by other measures, e.g., LRRK2 inhibition effects on BDNF-induced increases in WT and parallel biochemistry of p'd LRRK2, Akt, ERK in WT & KO.

      LRRK2 activating Akt1 has been published before (e.g., Ohta 2011 - not cited), but Ohta also concluded that LRRK2 gain-of-function mutations (more LRRK2 kinase activity) were associated with a reduced ability of LRRK2 to bind AND phosphorylate Akt at the same residue, in contradiction to the mechanism proposed here? This should be discussed. Here the authors also conclude Akt is upstream of LRRK2. However, it appears from the data here in neurons that pLRRK2 increases in response to BDNF are separate from BDNF signalling to Akt.

      Of note, in comparison to bTubulin control, LKO total Akt levels appear consistently higher in this single example blot; a large increase in Akt would skew the ratio down, while absolute levels of pAkt (probably the most important matter for an active enzyme - what is the ratio against total protein stain) are similar or increased. These are major problems for the conclusions as presented.

      BDNF increased mEPSC frequency in hiPSC neurons, which didn't happen in LKO, which already had high frequency. Earlier in the manuscript BDNF is shown to alter synapse number in WT but not LKO mouse neurons, but no increase in synapse number was seen following BDNF treatment in any WT or LKO hiPSC neurons +/- BDNF.

      If we are to assume that the WT neurons have LRRK2 (not demonstrated), and that LRRK2 KO neurons have similar drebrin (not demonstrated) it is unclear how to interpret this result in the model of BDNF-LRRK2 being upstream of pERK/Akt. There is no evidence that the BDNF increase in WT is blocked by LRRK2 inhibition, nor has it been associated with changes (or not) to pAkt or ERK1, which would be expected in both WT and KO based on Figure 4C.

      There are many reports of acute and longer-term BDNF application increasing event frequency in brain slices & primary neurons. Overexpression of BDNF in NPCs has also been shown to increase synapse function in hiPSC neurons derived from them. Here, BDNF has an effect on frequency in only one of 6 comparisons (3 timepoints, two lines). Is it not concerning that expected BDNF effects occur at only one time point in WT, and that generally a lack of effect is more common both in WT and LKO... is this due to slow appearance of TrkB receptors and degeneration at 90 days?

      There are no other data provided to show that BDNF was having a consistent expected effect in human neurons (pAkt, pLRRK2, etc.), and there is little to link these data to those in previous figures of the study.

      The discussion of some of the weaknesses is mostly fair, aside from the disparities noted above, which are not.

    2. Reviewer #2 (Public review):

      The data show that BDNF regulates the PD-associated kinase LRRK2, place LRRK2 biochemically within well-described BDNF pathways, and show that LRRK2 can play a role in mediating BDNF-driven synaptic outcomes at excitatory synapses. The chief strength is that the data provide a potential focal point for multiple observations that have been made across many labs. The findings will be of broad interest because LRRK2 has emerged as a protein that is likely to be part of Parkinson's pathology and its normal and pathological actions remain poorly understood.

      A major strength of the study is the multiple approaches that were used (biochemistry, bioinformatics, light and electron microscopy and electrophysiology) across different experimental models (cells, primary neurons, human neurons, mice) to identify and examine the impact of BDNF on LRRK2 signaling and functions. Noteworthy is also the employment of LRRK2KO preparations to validate outcomes and to place LRRK2 actions up or downstream.

      The demonstration that LRRK2 and drebrin interact directly is important and suggests that other interacting proteins identified biochemically and bioinformatically in the paper will be important to pursue.

    1. Reviewer #2 (Public review):

      Summary:

      The authors investigated whether early-life malaria exposure has long-term effects on immune responses to unrelated antigens. They leveraged a natural experiment in coastal Kenya where two adjacent communities (Junju and Ngerenya) experienced divergent malaria transmission patterns after 2004. Using 15 years of longitudinal data from 123 children with weekly malaria surveillance and annual serological sampling, they measured antibody responses to multiple pathogens using a protein microarray technology and ELISA.

      Strengths:

      (1) Extensive longitudinal data collection with weekly malaria surveillance, enabling precise exposure classification.

      (2) Use of a natural experiment design that allows for causal inference about malaria's immunological effects.

      (3) Broad panel of antigens tested, demonstrating generalized rather than antigen-specific effects.

      (4) Within-cohort analysis in Ngerenya controls for geographic and environmental factors.

      (5) Validation of key findings using both serologic microarray and ELISA.

      (6) Important public health implications for vaccine strategies in malaria-endemic regions.

      Weaknesses:

      (1) Due to its nature, the study lacks the ability to determine the direction of the associations found between malaria exposure and other IgG levels to unrelated pathogens.

      (2) No evaluation of the clinical implications of the reduced IgG levels observed in the area with high malaria exposure.

      Assessment of Claims:

      The data appear to support the authors' primary claims. The strength of the evidence is limited by the observational nature of the study and the results should be interpreted in that light. Together with the currently available evidence of P. falciparum's impact on the host's immune function, this natural experiment design provides further evidence for a relationship between early malaria exposure and reduced antibody responses to other pathogens and vaccine-derived antigens. The within-Ngerenya analysis controls for geographic factors and thus enhances the quality of the evidence; there is limited physical, nutritional, and socio-economic information on factors that may have driven the observed changes.

      Impact and Utility:

      This work has fundamental implications for understanding vaccine effectiveness in malaria-endemic regions and may help inform vaccination strategies. The findings, if confirmed, would suggest that children in areas of high malaria transmission may require modified immunization approaches. The dataset provides a valuable resource for future studies of malaria's immunological legacy.

      Context:

      This study builds on prior work showing acute immunosuppressive effects of malaria but uniquely attempts to demonstrate the durability of these effects years after exposure. The natural experiment design addresses limitations of previous observational studies by providing a more controlled comparison.

    1. Reviewer #1 (Public review):

      Sebag et al. addressed the role of ADH5 in BAT in the development of aging and the metabolic derangements associated with it. This is a follow-up study after the authors' demonstration of the role of BAT ADH5 in glucose homeostasis, obesity, and cold tolerance. By ablating ADH5 specifically in brown adipocytes or pharmacologically modulating ADH5 through activation of its transcription factor, the authors conclude that preservation of BAT function is crucial for healthy aging and ADH5 is causally involved in this process. The topic is appealing given the rise in the aging population and the unclear role of BAT function in this process. Overall, the study uses several techniques and addresses several physiological and molecular manifestations of aging. Therefore, the findings contribute to the growing body of literature pointing to the biological role of BAT activity in aging.

      Comments on revised version:

      I have no further comments other than to congratulate the authors on the nice piece of work.

    2. Reviewer #2 (Public review):

      Summary:

      This study investigates the role of the enzyme Alcohol Dehydrogenase 5 (ADH5) in brown adipose tissue (BAT) during aging. BAT is crucial for thermogenesis and energy balance, but its function and mass diminish with age, contributing to metabolic dysfunction and age-related diseases. ADH5, also known as S-nitrosoglutathione reductase, regulates nitric oxide (NO) signaling by removing damaging S-nitrosylation modifications from proteins. The authors show that aging in mice leads to increased protein S-nitrosylation associated with a combination of increased Nos2 expression and reduced ADH5 expression in BAT, resulting in impaired metabolic and cognitive functions. Deletion of ADH5 in BAT accelerates tissue senescence and systemic metabolic decline. Mechanistically, aging suppresses ADH5 via downregulation of heat shock factor 1 (HSF1), a master regulator of protein homeostasis. Importantly, pharmacologically boosting HSF1 improves BAT function and mitigates both metabolic and cognitive declines in aged mice. The findings highlight a critical HSF1-ADH5 pathway in BAT that protects against aging-related dysfunction, suggesting that targeting this pathway may offer new therapeutic strategies for improving metabolic health and cognition during aging.

      Strengths:

      This research provides insight into the interplay between redox biology, proteostasis, and metabolic decline in aging. By showing that age regulates genes that control SNO status in BAT and further developing a therapy to target ADH5 in BAT to prevent age-related decline, the authors have identified a putative mechanism to combat age-related decline in BAT function.

      Weaknesses:

      None identified.

      Comments on revised version:

      Congratulations to the authors for this interesting manuscript. I don't want to pat myself on the back, but I found the increased Nos2 expression in Figure 1C of the revised manuscript very satisfying, as it reinforces the shift in the regulation of SNO status that happens in BAT with aging. I appreciate the authors addressing this suggestion.

    1. Reviewer #1 (Public review):

      Summary:

      In their manuscript, Metz Reed and colleagues present an exceptionally thorough analysis of three-dimensional genome reorganization during breast cancer progression using the well-characterized MCF10 model system. The integration of high-resolution Micro-C contact maps with multi-omics profiling provides compelling insights into stage-specific dynamics of chromatin compartments, TAD boundaries, and looping events. The discovery that stable chromatin loops enable epigenetic reprogramming of cancer genes while structural changes selectively drive metastasis-associated pathways represents a significant conceptual advance. This work substantially deepens our understanding of genome topology in malignancy.

      Strengths:

      This work sets a benchmark for integrative 3D genomics in oncology. Its methodological sophistication and conceptual advances establish a new paradigm for studying nuclear architecture in disease.

      Comments on revised version:

      The authors made a significant effort to improve the manuscript. My comments were sufficiently addressed.

    2. Reviewer #2 (Public review):

      Using the MCF10 breast cancer progression sequence, the authors combined high-resolution Micro-C chromatin conformation capture with RNA-seq and ChIP-seq to depict the sequential reorganization of compartments, topologically associated domains (TADs), and long-range loops in benign, pre-tumor, and metastatic states, and coupled these three-dimensional changes with gene expression and enhancer activity. Four main findings were: (i) chromatin structure was largely quiescent, still limiting gene output differentiation, with upregulated sites being most significantly affected; (ii) enhancer-promoter contact strength covaried with transcriptional amplitude; (iii) 127 genes gained expression with increasing chromatin contact; and (iv) progression-related genes acquired altered histone markers in distal enhancers, which remained connected by stable loops. These conclusions are widely accepted and provide strong justification for the publication of this paper.

    3. Reviewer #3 (Public review):

      Summary:

      The authors tackle an important problem: defining the topological changes that occur during tumorigenesis. To study this, they use an established stepwise cell model of breast cancer. A strength of their study is a careful, robust differential analysis of topological features across each cell state that is presented clearly and rigorously. They define changes in compartmentalization, TAD structure and chromatin looping. Intriguingly, when the authors integrate differential gene expression with chromatin looping, they see that most differentially regulated genes are not involved in loop changes, suggesting that changes in promoter or enhancer chromatin marks may play a bigger role in regulating transcription than differential loops. The differential topology analysis and its integration with transcription is very well done, one of the best versions of this I have read in the 3D genome field! However, the paper is framed largely as a cancer biology study and it teaches us much less about this. I am worried that some of the trends for each topologic feature are not going to be consistent across the pre-malignant-malignant-metastatic spectrum and would like the authors to soften some of their claims a bit regarding how this clarifies our understanding of cancer evolution.

      Updated comments on revision:

      There are still some issues with this paper. First, it reads descriptively. It is a series of comparisons with limited biologic insight, as changes are always seen in genomics and in this case, they're often not tied back to transcription or gene regulation in cancer. Cell lines do not represent cancer faithfully and in this case should not be argued to represent malignant transformation broadly. The authors did not really soften their language as much as I think is required. I would caution the authors to further qualify their results in the context of a single, clonal cell line that has undergone stepwise transformation. This is not a patient cohort analysis or frank progression. This matters because there is likely to be much more noise, not pertinent to transformation, in a cell line model. It doesn't negate the validity of the study, but this language should be adjusted appropriately. It was nice to see the authors compare gene expression data from their model to the primary tumor data; however, the limited overlap raises the concern that, at the least, patterns of transcriptional regulation in their model are not faithful to primary tumors. If this is the case, it raises concern that the topological changes are also not generalizable to cancer.

      The authors declined a number of functional assays to validate their observations (which are purely correlative). And while I see that the burden of extra experiments may be beyond the scope of this study, they must soften their language to justify the observed relationships.

    1. Reviewer #1 (Public review):

      Summary:

      Patients with STX11 mutations develop familial hemophagocytic lymphohistiocytosis Type 4, a fatal immune disorder marked by defective T and NK cell cytotoxicity and cytokine storm. The conventional explanation attributes this to impaired cytotoxic granule release, but this has never fully accounted for the broader disease picture. This study proposes an alternative mechanism. The authors show that STX11 is required for store-operated calcium entry through ORAI1 channels, which are essential for both cytotoxic killing and NFAT-driven gene expression in T cells. In STX11-deficient cells, ORAI1 currents drop, NFAT nuclear translocation fails, IL-2 expression is suppressed, and degranulation is impaired. These defects are largely rescued by ionomycin or a constitutively active ORAI1 mutant, placing the primary lesion at calcium signaling rather than the fusion machinery. Mechanistically, STX11 binds the C-terminal tail of ORAI1 via its Habc domain and maintains ORAI1 in a state competent for productive assembly prior to STIM1-dependent gating, a step the authors call "priming."

      Strengths:

      The paper identifies a novel and disease-relevant role for STX11 in calcium channel regulation and raises the possibility of using channel agonists as a therapeutic strategy in the disease. The biochemical and functional data are of high quality and generally consistent with the interpretation. The proposal that a non-conventional syntaxin directly interacts with ion channels to prime its activation is novel and interesting.

      Weaknesses:

      For readers to appreciate the value of patient experiments derived from a single individual, the authors should quote prior studies showing that STX11 protein levels are abolished in all known human STX11 mutations. The priming model, while functionally well-supported, rests on indirect structural evidence, and the precise conformational transition involved remains to be defined. These are acknowledged limitations, but alternate mechanisms have not been explored and formally excluded. More direct evidence should be provided to exclude the possibility that STX11 could act as a conventional SNARE and sustain calcium fluxes by promoting the delivery of additional ORAI1 channels from vesicles.

    2. Reviewer #2 (Public review):

      Summary:

      Vig's lab delineates a critical role for STX11 in CRAC channel function, particularly in the context of the fatal immune disorder familial hemophagocytic lymphohistiocytosis type 4 (FHL4). They demonstrate that Syntaxin 11 directly binds and regulates Orai1, and that STX11 depletion abolishes CRAC currents and downstream signaling. Loss of STX11 reduces IL2 gene expression and impairs degranulation, both of which are rescued by the constitutively active Orai1 mutant H134S, whereas a gain‑of‑function mutant targeting the C‑terminus fails to restore these defects. The authors conclude that STX11 primes Orai1 for optimal local assembly that is independent of STIM1 yet required for CRAC channel gating.

      Strengths:

      This study is firmly grounded in disease biology and demonstrates that STX11 downregulation leads to profound functional defects. Using a comprehensive suite of methods and analyses, the authors interrogate the co-regulation of STX11 and Orai1 and present a near-complete view of STX11's modulatory role in CRAC channel function and downstream signaling pathways. The figures are clear, and the statistical analyses are rigorous and convincing.

      Weaknesses:

      The authors conclude that Syntaxin 11 directly binds Orai1. This conclusion is well supported by a multifaceted approach, including co-immunoprecipitation (co-IP), molecular dynamics simulations, co-localization/FRET assays, and targeted mutational analysis, all of which are thoroughly executed. While the interaction appears reasonably strong in co-IP experiments, the STX11-Orai1 interaction is comparatively weaker in pull-down assays, which the authors attribute to instability of the purified His-STX11 protein. A remaining gap is direct evidence of interaction in live cells; this is understandably challenging given that fluorescent tagging of STX11 is not feasible. Fully resolving this question lies beyond the scope of the present study and will require more advanced approaches to capture STX11 binding dynamics.

    1. Reviewer #1 (Public review):

      Summary:

      This paper by Boni and colleagues presents the engineering of a multi-step differentiation program in Escherichia coli based on synthetic gene circuits. The motivation behind the study was to engineer a system capable of undergoing differentiation in a step-wise manner without the presence of external spatial cues and without inducers added during the differentiation process. To achieve this, the authors created several synthetic gene circuits, one being a toggle switch, and the others being quorum-sensing-mediated gene expression modules. The outputs of the differentiation process are fluorescent proteins, which allowed the authors to quantify the behavior of the system using fluorescence intensity measurements. The authors additionally built a multi-component mathematical model which is able to reproduce the experimental data.

      The data presented are convincing and support the claims; the work is well executed.

      Strengths:

      (1) The differentiation process proceeds autonomously after the initial step in liquid culture in the presence of external inducers.

      (2) It is indeed a step-wise process.

      (3) The mathematical model predicts the outcome (% of green, blue and red FP-expressing cells in the population) when changing the initial ratio of green:blue FP-expressing cells.

      Weaknesses:

      (1) No spatial pattern emerges. There are some isolated colonies that turn on the downstream FPs, but I do not see a pattern, really. Nonetheless, some colonies do differentiate (i.e. they turn on additional FPs).

      (2) The mathematical model appears somewhat superfluous. While it can clearly reproduce the data, it is not used to make interesting predictions, changing parameters (and not initial conditions) that guide further experimental implementations.

      Future directions

      The utility of this differentiation process (e.g. in metabolic engineering or for the study of biofilm formation and antibiotic resistance) will become clearer once the FPs are substituted with functional proteins that exert an effect on the cells.

    2. Reviewer #2 (Public review):

      In this manuscript, the authors implement a three-step genetic programme in E. coli that converts an initially homogeneous population into spatially structured sender, receiver, and "matured" receiver colonies on agar without externally supplied positional information. They combine a TetR/LacI toggle switch for symmetry breaking, LuxI/LuxR quorum sensing for a paracrine signalling step, and CinI/CinR for an autocrine signalling-like maturation step, and complement the experiments with a mathematical model that qualitatively reproduces pattern formation over a range of initial conditions.

      While the article has many strengths such as a clear conceptual framing using Waddington landscapes, a modular and carefully optimised circuit design, thorough experimental characterisation of the toggle and quorum-sensing modules, integration of spatial modelling with experiments, and generally clear writing and figures, I think it will benefit the article to clarify the definition and stability of "differentiated" states, clarify several quantitative and modelling aspects, better explain how fitted curves and promoter engineering were done, and improve some figure design and wording to avoid ambiguity.

      Detailed comments below:

      (1) P5-8 / and more generally: A major concern is that producing a reporter output is not, by itself, differentiation. For a state to be credibly called "differentiated", it should be stable (self-maintained) over relevant timescales, ideally in the absence of the inducing context. As written, the manuscript sometimes seems to equate cell type with reporter expression. I strongly suggest adding a short subsection explicitly defining state versus output, and for each claimed state, stating whether it is stable/bistable or unstable/reversible, with evidence. Concretely, the authors should enumerate:

      a) Toggle-derived sender versus receiver: stable? under what conditions (inducer ranges, hysteresis window)?

      b) Paracrine-induced "red" receivers: is this a stable differentiated state, or a context-dependent induction requiring proximity to senders?

      c) "Mature" (yellow) state: does it persist after removal from the spatial signal field? If not, it should be described as an induced output programme rather than a mature lineage state.

      At present, later sections (and the "maturation" language) risk over-stating what is demonstrated.

      (2) Figure 2d: It is unclear whether this panel is intended to be qualitative (schematic/illustrative) or generated from quantitative data. The legend should explicitly state the origin (e.g., representative image, averaged data, simulation output, schematic) and, if quantitative, what was measured, how many replicates, and how the visualisation was constructed.

      (3) Figure 2e: The cross-sectional line is described as meant to be comparable, yet the leftmost plot appears to have a different slope from the others. The authors should explain whether this reflects a different scaling/normalisation, a different underlying dataset/condition, or simply a plotting artefact. If these are fitted trends, report the fit function (see also the comment on fitted lines below).

      (4) Around P7-8 (saddle/separatrix description): When describing the saddle or separatrix between the two valleys, it would be helpful to briefly connect this more directly to a quantitative dynamical-systems perspective: for instance, the intersection of nullclines and how nullcline geometry changes under IPTG/aTc induction. This will make the landscape picture more complete for readers familiar with the original genetic toggle switch work (Gardner et al., 2000).

      (5) P9, lines 157-159: The current phrasing ("in absence of noise, the system would be fully deterministic... in living cells, however, stochastic bursts... change the trajectory") risks conflating predicting population-level percentages with predicting colony-level trajectories. It would help to clearly separate (i) the ability to predict the overall fraction of ON/OFF (green/blue) colonies from inducer conditions (which is largely deterministic at the population level) from (ii) the intrinsically stochastic choice of state made by any given founder cell and its colony.

      (6) P11, lines 193-195 (promoter engineering): The main text currently only refers to screening variants and choosing pLux76; I suggest briefly stating in the main text (not only in the supplement) what was changed (for example, promoter box variants, core promoter strength modifications) and what design criteria were used (reduced leakiness, increased dynamic range).

      (7) Use of fitted lines (Figures 2, 4, 5, 7): Wherever fitted curves are overlaid on data, the authors should indicate in the figure legend the explicit form of the fit as well as the fit equation/parameters. As a reader, it is difficult to interpret what is empirical smoothing versus what is a mechanistic functional form.

      (8) P13, lines 232-235: The comparison between induction directly with C6-HSL and induction from sender colonies is qualitative ("significantly smaller range"). The authors should provide distances (for example, in mm) for the induction range in each case and, if possible, approximate total HSL amounts or concentrations, so that the reader can appreciate the magnitude of the difference.

      (9) P13, lines 259-262: The authors model the transition to the stationary phase via a monotonically decreasing sigmoid in time for biosynthetic capacity. What is the rationale or literature basis for this approach to model entry into the stationary phase? The authors should cite prior work and clarify why this form is appropriate here, versus alternatives (nutrient diffusion limitation, logistic growth with resource depletion, etc.).

      (10) Figure 6c: Are the areas of the plate shown in each column the same field of view across conditions/time, or are these simply representative regions selected per condition (possibly from different plates)? The caption/legend should clarify whether these are matched locations and how images were chosen.

      (11) Figure 7a: The combination of solid, dashed, and dash-dot arrows/lines is visually hard to read. I suggest replacing the dash-dot line with a fully dotted line or using different colours (if consistent with journal style) to improve readability.

      (12) Figure 7e and similar analyses: The authors should explain in the Methods and/or captions how "distance from sender colonies" is computed when multiple senders exist. Is the distance always measured to the nearest sender, and how are cases handled where a receiver is in the overlapping influence of several senders? This clarification is important for interpreting the fitted curves.
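
      To make comment (12) concrete, the following is a minimal sketch, not the authors' method, of two ways "distance from sender colonies" could be handled when several senders are present: the distance to the nearest sender, and a summed influence to which every sender contributes. The colony coordinates, the exponential decay form, and the decay length `decay_mm` are all hypothetical and chosen only for illustration.

```python
import math

def nearest_sender_distance(receiver, senders):
    """Euclidean distance (mm) from a receiver colony to its closest sender."""
    return min(math.dist(receiver, s) for s in senders)

def summed_influence(receiver, senders, decay_mm=2.0):
    """Hypothetical superposed signal: each sender adds exp(-d / decay_mm)."""
    return sum(math.exp(-math.dist(receiver, s) / decay_mm) for s in senders)

# Hypothetical plate coordinates in mm (not taken from the manuscript).
senders = [(0.0, 0.0), (6.0, 0.0)]
receiver_between = (3.0, 0.0)   # 3 mm from both senders
receiver_single = (-3.0, 0.0)   # 3 mm from one sender, 9 mm from the other

for r in (receiver_between, receiver_single):
    d = nearest_sender_distance(r, senders)
    s = summed_influence(r, senders)
    print(f"receiver {r}: nearest sender {d:.1f} mm, summed influence {s:.3f}")
```

      In this toy layout both receivers have the same nearest-sender distance, yet the one lying between two senders receives a larger summed signal, which is why the choice of definition matters when interpreting curves fitted against distance.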

    3. Reviewer #3 (Public review):

      This manuscript presents an engineered 3-step circuit in E. coli that combines toggle-switch-based symmetry breaking with quorum-sensing interactions to generate colony-scale spatial patterns. The work is interesting as a synthetic circuit integration study and as a demonstration of self-organized patterning across physically separated colonies. The authors provided a compelling demonstration of the characterization/tuning of parts to guide the overall system engineering. A notable strength is the demonstration that a single circuit can generate a range of self-organized spatial patterns across separate colonies.

      However, I think the paper needs to tone down the extent to which the system demonstrates multi-step differentiation or morphogenesis, which is not critical for making the paper valuable. Only the first step of their circuit design (Figure 1), the toggle switch, generates stable alternative states. The latter steps are mainly signal-dependent reporter activation states layered on top of the blue receiver state, rather than true fate transitions. The authors explicitly state that red expression is added without replacing the blue identity, and they also acknowledge that red cells lose their identity upon restreaking unless they remain near sender cells. That substantially weakens the differentiation analogy and makes the Waddington framing too strong.

      A related concern is that the 3rd step does not introduce a new spatial organizing rule. The authors show that the second signal remains confined to cells already receiving the first signal, and explicitly conclude that it functions only as an autocrine cue rather than a second paracrine layer. As a result, the 3-step system seems more like an added local readout or maturation layer. Overall, the main 2-step outcome is sparse green sender colonies surrounded by red-expressing blue receivers, with distant receivers remaining blue. That is a valid engineered pattern, but it is still a local, threshold-response circuit architecture.

      The autonomy claim should be toned down and stated more precisely. The plate patterning occurs without externally imposed spatial gradients, which is a strength. However, by design, the overall system behavior depends strongly on pre-culture inducer conditions that set the sender:receiver ratio, and this externally imposed history is central to the final pattern. This property is tied to how the circuit is designed, where steps 2 and 3 largely respond to symmetry breaking introduced in step 1, which is dependent on both history and initialization on the plate. In particular, the pattern formation process is currently quite variable (e.g., Figure 5), depending on how different colonies flip the toggle switch, and consequently, how many become senders and how many become receivers. It would have been fascinating if they could also demonstrate the differentiation within individual colonies, leading to intra-colony patterns. This aspect should at least be discussed.

      The mathematical model is useful in guiding both the characterization of parts, modules and the overall system. However, the claims around its quantitative predictive power should also be made narrower. The simulations are built from multiple fitted and partly hand-tuned components, including toggle-switch response curves, colony-growth rules, diffusion, reporter-response functions, and activity decline. This supports a calibrated qualitative reconstruction of the observed patterns, but not a strong predictive or mechanistic validation.

      Other specific points:

      (1) Given the topic of the work, the authors should cite closely relevant studies in programming pattern formation, including:

      Cao et al, Cell 2016 Collective space-sensing coordinates pattern scaling in engineered bacteria

      Rajasekaran et al, Cell 2024 A programmable reaction-diffusion system for spatiotemporal cell signaling circuit design

      Lu et al, bioRxiv 2024 Discovery of interpretable patterning rules by integrating mechanistic modeling and deep learning

      (2) The model assumes identical diffusion coefficients for C6-HSL and C14-HSL despite their substantially different molecular sizes and hydrophobicities. This assumption could confound kinetic lag with differential diffusion in explaining the autocrine confinement of the third step. Its impact should at least be explored in the simulations.

      (3) The mCherry response parameters change significantly between the 2-step and 3-step systems. The authors acknowledged this change but did not provide a clear explanation.

      (4) The 3-step system is evaluated at only a single condition with no simulation comparison, in contrast to the systematic 11-condition validation of the 2-step system.
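
      Point (2) can be gauged with a back-of-the-envelope check before rerunning the full model. The sketch below assumes, for illustration only, one-dimensional diffusion with first-order loss, for which the steady-state profile away from a source decays as exp(-x/λ) with λ = sqrt(D/k). This is a textbook simplification rather than the authors' model, and the diffusion coefficients and loss rate are hypothetical placeholders, not values from the manuscript.

```python
import math

def decay_length(D, k):
    """Steady-state decay length sqrt(D / k) for 1D diffusion with first-order loss."""
    return math.sqrt(D / k)

def relative_signal(x, D, k):
    """Signal at distance x relative to the source: exp(-x / decay_length)."""
    return math.exp(-x / decay_length(D, k))

k = 0.01       # loss rate in 1/min (hypothetical)
D_c6 = 1.0     # mm^2/min (hypothetical)
D_c14 = 0.25   # mm^2/min, assumed 4x slower purely for illustration

for name, D in (("C6-HSL", D_c6), ("C14-HSL", D_c14)):
    lam = decay_length(D, k)
    sig = relative_signal(5.0, D, k)
    print(f"{name}: decay length {lam:.1f} mm, relative signal at 5 mm {sig:.3f}")
```

      Because λ scales with the square root of D, a fourfold lower diffusion coefficient halves the decay length under these assumptions. A sweep of this kind over plausible D ratios would indicate whether unequal diffusion alone could account for the confinement seen in the third step.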

    1. Reviewer #1 (Public review):

      Summary:

      The metabolic profiles of immune cells under steady-state or immune-activated conditions remain poorly characterized. The authors find that embryonically derived hemocytes in Drosophila larvae predominantly utilize mitochondrial respiration to generate energy and exhibit minimal glycolysis rates under unchallenged conditions. Hemocytes developmentally elevate ATP production rates. Mitochondrial respiration drives metabolic activation in larval hemocytes. More specifically, lamellocytes exhibit unique metabolic activities, including enhanced trehalose catabolism and mitochondrial remodeling, required for their encapsulation response.

      Strengths:

      The study shows the metabolism that is most likely to operate in different immune cells in Drosophila during development and also during infection. This is related to mitochondrial organization and proliferation and/or differentiation state.

      Weaknesses:

      Even though there is a rigorous analysis of mitochondrial activity using the Seahorse analyzer, the analysis of diverse mitochondrial activities in the different immune cell types across development and in infection could be carried out using microscopy. ROS, mitochondrial membrane potential, NADH/+ and FADH/+ levels in vivo are likely to give a more specific readout of change in cellular activities. The activities of mitochondrial fusion and fission need to be collectively tested to understand their role in development and also in infection. The relevance of the change in mitochondrial activity for development or immunity remains to be tested.

    2. Reviewer #2 (Public review):

      Summary:

      This study presents an analysis of the metabolism of Drosophila larval immune cells during development and activation. The authors compared the utilization of glycolysis and oxidative phosphorylation for energy metabolism. Although this topic has been widely discussed and well-studied in immune cell research, particularly in mammals, it has received little attention in insects. The authors demonstrated that quiescent and activated larval Drosophila immune cells predominantly use mitochondrial oxidative phosphorylation to produce energy. This finding is significant for the emerging field of insect immunometabolism research and is interesting in comparison to mammalian immunity, where immune cell activation is often associated with a shift toward greater reliance on glycolysis.

      Strengths:

      Using the Agilent Seahorse system, the authors developed and fine-tuned a method to measure the energy metabolism of Drosophila immune cells, obtaining high-quality, robust data. Through genetic manipulations targeting immune cells specifically, they analyzed metabolic changes in cells with different activations, going beyond developmental changes. They convincingly demonstrated ATP production, primarily in the mitochondria of immune cells, at various developmental stages and in various activated states. The results presented mostly support the conclusions drawn. This methodology and its results are valuable for further studies of insect immunometabolism. In a broader context, they are also valuable for comparing the metabolism of immune cells across different animal groups.

      Weaknesses:

      The genetic manipulations used were suitable for obtaining immune cells of various types and activation states, such as proliferation, differentiation, and immune activation. However, this method has limitations: the mixture of different cell types was always analyzed, and the specific type of interest was often a minority cell population. Had the other cells remained in their initial control state, the observed change in metabolism could have been primarily attributed to the desired cell type. However, the remaining cells that did not transform into the desired type were also usually influenced or activated in some way, making it difficult to determine to which group the observed change should be attributed. For example, consider the induction of lamellocyte differentiation using Hml>Hop[tum]. There are approximately 1,000 lamellocytes per larva, but according to Supplementary Figure 4, there are still about 5,000 Hml+ cells, and even these cells have activated Jak/Stat signaling. Therefore, it can be assumed that they are also activated. After a real infection, the proportion of lamellocytes is greater, but the remaining plasmatocytes are also activated. The authors should mention these limitations more clearly. However, as the authors correctly note, solving this problem will require single-cell approaches, which are still limited by current technologies. I see this as a problem when interpreting the proliferation effect. The crucial question is what percentage of the analyzed cells induced by Hml>Ras[V12] were actually in the division stage. Not all hemocytes are Hml+, so not all are induced. Of those that are induced, how many are in the division stage at the time of analysis? Meanwhile, those that were not dividing at that moment also had activated Ras, which triggers many processes besides division. Information on what percentage of the analyzed cells were dividing is missing. This information is important because the finding that dividing Drosophila immune cells primarily use mitochondria and oxidative phosphorylation to produce ATP contrasts with the debated significance of the Warburg effect in dividing mammalian cells. This finding would be significant, but unfortunately, it is not robustly supported by the presented data.

    3. Reviewer #3 (Public review):

      Summary:

      This study investigates the metabolic profiles of hemocytes across multiple stages and conditions and suggests that hemocytes act as regulators of metabolism rather than merely receivers of metabolic cues. The authors show that hemocytes rely primarily on mitochondrial respiration, which is further enhanced during proliferation in development or upon genetic manipulation of plasmatocytes, but not crystal cells.

      Metabolic respiration is also activated in lamellocytes, and this activation correlates with changes in mitochondrial morphology. The authors further attempt to identify mechanisms underlying this activation, proposing that mitochondrial fission may contribute to the ability of lamellocytes to encapsulate wasp eggs.

      Strengths:

      This work provides detailed and valuable insights into the metabolic phenotypes of hemocyte populations at different developmental stages and under both physiological and pathological conditions. The authors perform a longitudinal assessment of hemocyte metabolism and compare metabolic states across contexts.

      Importantly, they provide evidence that hemocytes regulate metabolism to perform essential immunological functions, such as wasp egg encapsulation. This reinforces the view that hemocytes are key regulators and communicators that adapt their metabolic programs according to developmental and environmental demands.

      Weaknesses:

      The results presented are insightful, although several controls and validations could strengthen the conclusions. It would be preferable to also include responder transgenes alone as a control for leakiness, and the scRNA-seq findings would benefit from in vivo validation.

      Some conclusions appear inconsistent or insufficiently supported. For instance, although mitochondrial respiration in plasmatocytes peaks at 96 h AEL, this increase is not accompanied by detectable mitochondrial rearrangement, which remains constant between 96 h AEL and 120 h AEL.

      In general, the authors should temper some statements or provide further data.

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      Summary:

      This manuscript addresses an important methodological issue, the fragility of meta-analytic findings, by extending fragility concepts beyond trial-level analysis. The proposed EOIMETA framework provides a generalizable and analytically tractable approach that complements existing methods such as the traditional Fragility Index and Atal et al.'s algorithm. The findings are significant in showing that even large meta-analyses can be highly fragile, with results overturned by very small numbers of event recodings or additions. The evidence is clearly presented, supported by applications to vitamin D supplementation trials, and contributes meaningfully to ongoing debates about the robustness of meta-analytic evidence. Overall, the strength of evidence is moderate to strong.

      Strengths:

      (1) The manuscript tackles a highly relevant methodological question on the robustness of meta-analytic evidence.

      (2) EOIMETA represents an innovative extension of fragility concepts from single trials to meta-analyses.

      (3) The applications are clearly presented and highlight the potential importance of fragility considerations for evidence synthesis.

    2. Reviewer #3 (Public review):

      Summary and strengths:

      In this manuscript, Grimes presents an extension of the Ellipse of Insignificance (EOI) and Region of Attainable Redaction (ROAR) metrics to the meta-analysis setting, as metrics for evaluating the fragility and robustness of meta-analyses. The author applies these metrics to three meta-analyses of vitamin D and cancer mortality, finding substantial fragility in their conclusions. Overall, I think this extension is a conceptually valuable addition to meta-analysis evaluation, and the manuscript is generally well-written.

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have provided new data and text that addresses all of the reviewers' comments on the previous versions in a wholly satisfactory way.]

      Summary:

      This study presents evidence that the addition of the two GTPases EngA and ObgE to reactions composed of rRNAs and total ribosomal proteins purified from native bacterial ribosomes can bypass the requirements for non-physiological temperature shifts and Mg2+ ion concentrations for in vitro reconstitution of functional E. coli ribosomes.

      Strengths:

      This advance allows ribosome reconstitution in a fully reconstituted protein synthesis system containing individually purified recombinant translation factors, with the reconstituted ribosomes substituting for native purified ribosomes to support protein synthesis. This represents a significant development in the long-term effort to produce synthetic cells.

    2. Reviewer #2 (Public review):

      This study has developed a single-step method to assemble active bacterial ribosomes under near-physiological conditions by using the GTPase factors EngA and ObgE. These factors eliminate the need for the traditional, harsh manipulations of temperature and magnesium levels. This integration is an important step toward the bottom-up construction of synthetic cells.

    1. Reviewer #1 (Public review):

      [Editors' note: this version has been assessed by the Reviewing Editor without further input from the original reviewers. The authors have addressed the comments raised in the previous round of review.]

      The paper from Hudait and Voth details a number of coarse-grained simulations as well as some experiments focused on the stability of HIV capsids in the presence of the drug lenacapavir. The authors find that LEN hyperstabilizes the capsid, making it fragile and prone to breaking inside the nuclear pore complex.

      Comments on previous round of revisions:

      I found that the authors addressed my concerns satisfactorily. The other reviewer raised a number of important points regarding the nuances of the model and the interpretation of the simulations, which the authors rebutted. I think the paper in its current form now is a worthwhile addition to the literature.

    2. Reviewer #3 (Public review):

      This is a technically sophisticated study that integrates coarse-grained modeling with live-cell imaging to address an important and timely question regarding HIV-1 capsid inhibition by lenacapavir.

      In summary, in my view, the manuscript represents a solid contribution to the field.

    1. Reviewer #1 (Public review):

      Summary:

      Zinn and colleagues investigated the role of proteases 2A and 3C of enterovirus D68 (EVD68), an emerging pathogen associated with outbreaks of acute flaccid myelitis (AFM), a polio-like disease, in nucleocytoplasmic trafficking in different systems, including human neurons derived from pluripotent cells. They found that 2A specifically cleaved Nup98 and POM121. Using reporter proteins and RNA synthesis and trafficking assays in cells expressing viral proteases, they showed that 2A induces broad loss of the nuclear pore barrier function, but, surprisingly, RNA export appears to be minimally affected. Since nucleocytoplasmic trafficking defects are known to be associated with neuropathologies, they hypothesize that 2A-dependent cleavage of nucleoporins in motoneurons underlies the development of EVD68-induced AFM. They further show that a 2A-specific inhibitor increases the survival of human neurons differentiated from stem cells upon EVD68 infection.

      Strengths:

      Use of multiple methods to investigate the effect of 2A and 3C expression on nucleoporin cleavage and nucleocytoplasmic trafficking.

      Comments on revisions:

      The following issues remain unresolved:

      First, the authors still do not show representative images confirming specific nucleoporin degradation (Fig.1), which is the main focus of the work.

      Second, the conclusion that 2A-mediated degradation of the nucleo-cytoplasmic barrier does not affect export of the RNA from the nucleus is not supported by the presented data. The representative images shown in Fig 3C do not have the signal for GFP (like in Fig. 2), and therefore it is impossible to see if those cells indeed express EVD68 proteases.

      Moreover, to show RNA export, not only the decrease of nuclear EU signal should be quantified, but also the increase of the cytoplasmic signal. The diminishing of the nuclear staining may not necessarily reflect RNA export, but may well be explained by nuclease activity, all the more relevant in cells expressing 2A, where the nuclear-cytoplasmic barrier is disrupted and cytoplasmic nucleases may enter the nucleus.

      The same applies to images in Fig. 3D. There are no markers of infection; moreover, the experiment description indicates that EU labeling began at 24 h post-infection with an MOI of 5, i.e., essentially all cells should have been infected. This is difficult to believe as the replication cycle of most EVD68 strains in HeLa cells is no longer than 12 h, yet the images do not show any signs of CPE, and demonstrate a strong EU signal, inconsistent with the expected inhibition of nuclear transcription, a known attribute of enterovirus infections.

      The claim that nuclear transcription and RNA export remain unaffected in conditions of 2A-mediated disruption of the nucleo-cytoplasmic barrier is very strong and requires equally strong evidence.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript investigates the role of EV-D68 proteases 2A and 3C in nuclear pore complex (NPC) dysfunction and their contribution to motor neuron toxicity. The authors demonstrate that both proteases cleave only a limited number of nucleoporins, with 2A^pro showing the strongest impact by inhibiting nuclear import and export of proteins and disrupting NPC permeability without affecting RNA export. Importantly, treatment with the 2A^pro inhibitor telaprevir reduced neuronal cell death in a dose-dependent manner, achieving neuroprotection at concentrations below those required to inhibit viral replication. The study addresses a relevant mechanism underlying EV-D68-induced neuropathology and explores a potential therapeutic intervention.

    3. Reviewer #3 (Public review):

      Summary:

      The authors showed that expression of the viral proteases 2Apro and 3Cpro of EV-D68 cleaved specific components of the nuclear pore complex (Nup98 and POM121 by 2Apro), and that 2A but not 3C expression altered nuclear import and export. Similar nucleocytoplasmic transport deficits are observed in EV-D68-infected RD cells and iPSC-derived motor neurons (diMNs). The 2A inhibitor telaprevir partially rescued the nucleocytoplasmic transport deficits and suppressed neuronal cell death after infection. While it's clear that 2A can cleave NPC proteins and affect nuclear transport, the link to neurotoxicity after EV-D68 infection is less convincing.

      This study opens up a very intriguing hypothesis: that EV-D68 2Apro could be directly responsible for motor neuron cell death, mediated by POM121 and possibly Nup98 cleavage, that ultimately results in paralysis known as acute flaccid myelitis. This hypothesis notably does run counter to other published data showing that human neuronal organoids derived from iPSCs can support productive EV-D68 infection for weeks without cell death and that EV-D68-infected mice can have paralysis prevented by depletion of CD8 T cells, still with EV-D68 infection of the spinal cord. However, even if 2Apro is not ultimately responsible for motor neurons dying in human infections, that does not exclude the possibility that cleavage of nups could still disrupt motor neuron function. Notably, most children with AFM have some amount of motor function return after their acute period of paralysis, but most still have some residual paralysis for years to life. It is possible that 2Apro could mediate the acute onset of weakness, while T cells killing neurons could determine the amount of long-term, residual paralysis.

      Strengths:

      The characterization of nuclear pore complex components that appear to be targets of both poliovirus and EV-D68 proteases is quite thorough and expansive, so this data set alone will be useful for reference to the field. And the process by which the authors narrowed their focus to EV-D68 2Apro reducing Nup98 and POM121 as consequential to both import and export of nuclear cargo but not RNA was technically impressive, thorough, and convincing. As will be detailed below, when the authors move from studying over-expressed proteases in transformed cell lines to studying actual virus infection in both transformed cell lines and iPSC-derived neurons, some of the data only indirectly support their conclusions; however, the quality of the experiments performed is still high. So even if the claim that 2Apro causes neurotoxicity is circumstantial, the data certainly are intriguing and certainly justify further study of the effects of EV-D68 2Apro on the NPC and how this impacts pathogenesis. This is a convincing start to an intriguing line of inquiry.

      Comments on revisions:

      The authors have returned a stronger revised manuscript, being responsive to most of the combined reviewers' comments. It was especially important to add the clarity and specificity that the data in this manuscript did not establish a direct link for 2Apro causing AFM. The authors have clarified this language adequately, such that it is appropriate to remove the "incomplete" portion of the short assessment as they have requested. Adding in experiments with EV-D68 virus infection to complement their work with recombinant proteases also strengthened their conclusions.

      There are still some areas where discrepancies remain, although these are minor and can mostly be acknowledged as limitations of their approach rather than needing more experiments, unless the authors choose to do the additional experiments. To try to make this understandable, I have copied from the rebuttal letter (*) original comment, (**) author's rebuttal, and (***) a reply to the rebuttal:

      (*)(2) Telaprevir was able to rescue nucleocytoplasmic transport in RD cells at low concentrations (Figure 4A). It is not shown if this correlates with its antiviral effect in RD cells, or could this correlate with inhibition of 2A cleavage of Nup98 or POM121, which is never measured.

      (**) In the aforementioned new experiment in Figure 4A, we have also included a dose-response curve for telaprevir showing its inhibition of POM121 and Nup98 cleavage.

      (***) Fig. 4A is in diMNs, not RD cells. The EC50 of telaprevir could be very different in RD cells vs diMNs. This question remains unanswered.

      (*) (3) Building off of the prior point, the authors' claim that the neuroprotective effect of telaprevir is independent of its antiviral effect is not well-founded. Figure 4E (neuroprotection) was done with MOI 5, and Figure 4G (virus growth) was MOI 0.5. Telaprevir neuroprotection is not shown at MOI 0.5, nor is the neuroprotective effect correlated with inhibition of 2A cleavage of Nup98 or POM121.

      (**) The selection of MOIs for these two experiments was limited by technical considerations. If the viral growth curve were to be performed at MOI 5, it would be confounded by cell death. Further, a low MOI is required in order to allow multiple rounds of infection, and is therefore more sensitive for assaying the effect of telaprevir on viral replication. On the other hand, at MOI 0.5 diMN death is very gradual, and in the neuroprotection assay we would have lacked the statistical power to determine whether a rescue of this small magnitude of toxicity is significant. The EC50 of telaprevir is not expected to vary at different MOIs.

      (***) This should be discussed in the Discussion as a limitation of the experiment.

      (**) We have also now correlated the inhibition of 2Apro cleavage of Nup98 and POM121 with the neuroprotective effect at comparable concentrations of telaprevir, as described above.

      (***) Unless you quantify this, my eye disagrees with you. In Fig. 4A, cleavage of Nup98 is rescued by 3 µM telaprevir, but that does not seem to be the case for POM121.

      Additionally, in Fig. 4D, why is only NLS but not NES impaired in diMNs? This should be discussed.

    1. Reviewer #1 (Public review):

      Summary

      Fogel & Ujfalussy report an extension of a visualization tool that was originally designed to enable an understanding of detailed biophysical neuron models. Named "extended currentscape", this new iteration enables visual assessment of individual currents across a neuron's spatially extended dendritic arbor with simultaneous readout of somatic currents and voltage. The overall aim was to permit a visually intuitive understanding of how a model neuron's inputs determine its output. This goal was worthwhile and the authors achieved it. Demonstrating the utility of extended currentscape, the authors leverage their models to generate interesting and detailed biophysical insights into widely studied neurophysiological phenomena with clear behavioral relevance. Overall, this study provides a valuable and well-characterized biophysical modeling resource to the neuroscience community.

      Strengths

      The authors significantly extended a previously published open-source biophysical modeling tool. Beyond providing important new capabilities, the potential impact of extended currentscape is boosted by its integration with preexisting resources in the field.

      In keeping with the authors' goal to provide an approachable platform with intuitive visualizations of how current flows through neurons, the manuscript is approachable to non-computationalists. In particular, a dedicated glossary and elegant illustrations in Figure 2 boost accessibility for biologists.

      Extended currentscape produces intriguing and detailed predictions spanning neurophysiological phenomena such as local dendritic spikes, complex spike generation, and feature selectivity (hippocampal place fields). By triggering analysis of modeled synaptic inputs on these events, the authors trace their origins from dendritic integration to synaptic input patterns.

      The authors cleverly apply a graph theoretical approach to efficiently model bidirectional current flow throughout a neuron's dendritic arbor. As a result, extended currentscape can run on a standard personal computer.

      The code is well-documented and freely available via GitHub.

      Weaknesses

      While extended currentscape meets its objective of modeling and illustrating the propagation of axial currents throughout a model neuron in great detail, it requires simulation and measurement of synaptic input currents. For this reason, there currently exists a very high technical barrier to conclusively test its intriguing predictions: simultaneous readout of synaptic inputs throughout a neuron's dendritic arbor. Mitigating this weakness, the authors propose a relatively more feasible alternative approach in Discussion: simultaneous voltage imaging of dendrites and their soma while estimating synaptic inputs from the distributions of voltage dynamics along individual dendritic branches.

    2. Reviewer #2 (Public review):

      The electrical activity of neurons and neuronal circuits is dictated by the concerted activity of multiple ionic currents. Because directly investigating these currents experimentally is not possible with current methods, researchers rely on biophysical models to develop hypotheses and intuitions about their dynamics. Models of neural activity produce large amounts of data that are hard to visualize and interpret. The currentscape technique helps visualize the contributions of currents to membrane potential activity, but it is limited to model neurons without spatial properties. The extended currentscape technique overcomes this limitation by tracking the contributions of the different currents from distant locations. This extension allows tracking not only the types of currents that contribute to the activity in a given location, but also visualizing the spatial region where the currents originate. The procedure is first illustrated in a simple setting that allows testing its validity in an intuitive situation where a cell with an apical trunk and two dendritic branches responds to synaptic inputs. The procedure is then applied to study the initiation of complex spike bursts in a model hippocampal place cell.

      The extended currentscape method represents a significant improvement over the original technique, which is already utilized by several research groups. By enabling the analysis of current contributions in spatially extended models, this technique provides a new lens for investigating neuronal and circuit dynamics and will be of use to the modeling community.

      Comments on revisions:

      The changes in Figure 2 greatly improved the manuscript.

    1. Reviewer #1 (Public review):

      Summary:

      In this article by Xiao et al., the authors aimed to identify the precise targets by which magnesium isoglycyrrhizinate (MgIG) functions to improve liver injury in response to ethanol treatment. The authors found through a series of in-vivo and molecular approaches that MgIG treatment attenuates alcohol-induced liver injury through a potential SREBP2-IDI1 axis. The revised manuscript adds to a previous set of literature showing MgIG improves liver function across a variety of etiologies, and also provides insight into its mechanism of action. All major weaknesses were addressed in the revised submission.

      Strengths:

      (1) The authors use a combination of approaches, ranging from in-vivo mouse models to in-vitro experiments with AML12 hepatocytes, to support the notion that MgIG does improve liver function in response to ethanol treatment.

      (2) The authors use both knockdown and overexpression approaches, in-vivo and in-vitro, to support most of the claims provided.

      (3) Identification of HSD11B1 as the protein target of MgIG, as well as confirmation of direct protein-protein interactions between HSD11B1/SREBP2/IDI1 is novel.

      Weaknesses:

      The authors addressed all my concerns.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors investigated magnesium isoglycyrrhizinate (MgIG)'s hepatoprotective actions in chronic-binge alcohol-associated liver disease (ALD) mouse models and ethanol/palmitic acid-challenged AML-12 hepatocytes. They found that MgIG markedly attenuated alcohol-induced liver injury, evidenced by ameliorated histological damage, reduced hepatic steatosis, and normalized liver-to-body weight ratios. RNA sequencing identified isopentenyl diphosphate delta isomerase 1 (IDI1) as a key downstream effector. Hepatocyte-specific genetic manipulations confirmed that MgIG modulates the SREBP2-IDI1 axis. The mechanistic studies suggested that MgIG could directly target HSD11B1 and modulate the HSD11B1-SREBP2-IDI1 axis to attenuate ALD. This manuscript is of interest to the research field of ALD.

      Strengths:

      The authors have performed both in vivo and in vitro studies to demonstrate the action of magnesium isoglycyrrhizinate on hepatocytes and an animal model of alcohol-associated liver disease.

      Original comment (1):

      In Supplemental Figure 1A, all the treatment arms (A-control, MgIG-25 mg/kg, MgIG-50 mg/kg) showed body weight loss compared to the untreated controls. However, Figure 1E showed body weight gain in the treatment arms (A-control and MgIG-25 mg/kg). Why? In Supplemental Figure 1A, the mice with MgIG (25 mg/kg) showed the lowest body weight, compared to either A-control or MgIG (50 mg/kg) treatment. Can the authors explain why MgIG (25 mg/kg) causes more body weight loss than MgIG (50 mg/kg)? What about the other parameters (ALT, AST, NAS, etc.) for the mice with MgIG (50 mg/kg)?

      Author's response:

      We agree that this observation does not strictly follow a dose-dependent pattern. In vivo responses to pharmacological interventions, particularly in metabolic and liver disease models, are not always linear. The relatively greater body weight reduction observed in the 25 mg/kg group may be influenced by inter-individual variability, differences in metabolic adaptation, or sample size-related variation. Importantly, these differences in body weight were not statistically significant. Therefore, we selected the 50 mg/kg dose for subsequent animal experiments, as it demonstrated more consistent and stable improvements across multiple parameters, including body weight, ALT, AST, TG, and TC.

      New comment:

      My first question: All the treatment arms (A-control, MgIG-25 mg/kg, MgIG-50 mg/kg) showed significant body weight loss compared to the untreated controls (Supplemental Figure 1A), but the body weight significantly increased in the treatment arms (A-control and MgIG-50 mg/kg) compared to the untreated controls (Figure 1E). Why?

      My second question: Mice with MgIG (25 mg/kg) showed the lowest body weight, compared to either A-control or MgIG (50 mg/kg) treatment. According to the authors' explanation, the body weight loss caused by MgIG (25 mg/kg) is attributed to inter-individual variability, differences in metabolic adaptation, or sample size-related variation. Did these differences occur only in the MgIG (25 mg/kg) group, or in all other groups as well? The mouse group assignment should be randomized; however, a large variation in body weight was seen in the MgIG (25 mg/kg) group. It is not convincing for the authors to select the MgIG (50 mg/kg) group for subsequent animal experiments merely because of a large variation in the MgIG (25 mg/kg) group and because the MgIG (50 mg/kg) group demonstrated more consistent and stable improvements across multiple parameters. The authors should reanalyze and compare all the raw data between the MgIG (50 mg/kg) and MgIG (25 mg/kg) groups, address the issues pointed out, and justify the rationale for the animal group assignment.

      Original comment (2):

      IL-6 is a key pro-inflammatory cytokine significantly involved in ALD, acting as a marker of ALD severity. Can the authors explain why MgIG 1.0 mg/ml shows higher IL-6 gene expression than MgIG (0.1-0.5 mg/ml)? Same question for the mRNA levels of lipid metabolic enzymes Acc1 and Scd1.

      Author's response:

      Thank you for this important comment. We agree that IL-6, as well as lipid metabolism-related genes such as Acc1 and Scd1, are key indicators in ALD. The relatively higher expression observed at 1.0 mg/mL MgIG compared to lower concentrations (0.1-0.5 mg/mL) may be related to experimental constraints associated with the MgIG formulation used in this study. Specifically, to maintain consistency with our in vivo experiments, we used a clinically available liquid formulation of MgIG (5 mg/mL), which is approved for intravenous administration in China. Due to its relatively low stock concentration, achieving higher working concentrations (e.g., 1.0 mg/mL) in vitro required a larger volume of the MgIG solution, thereby proportionally reducing the volume of culture medium. This reduction in effective culture conditions may adversely affect hepatocyte viability and function. Supporting this, our CCK-8 and LDH assays indicated that higher MgIG concentrations were associated with subtle cytotoxicity or impaired cell status.

      New comment:

      The authors' response did not answer my question. If the authors believe the result could reflect experimental constraints associated with the MgIG formulation, then the use of this formulation in all other associated experiments is questionable. The experiments, at least those that used this MgIG formulation, need to be repeated.

      Original comment (3):

      For the qPCR results of Hsd11b1 knockdown (siRNA) and Hsd11b1 overexpression (plasmid) in AML-12 cells (Figure 5B), what is the description for the gene expression level (Y axis)? Fold changes versus GAPDH? Hsd11b1 overexpression showed non-efficiency (20-23, units on Y axis), even lower than the Hsd11b1 knockdown (above 50, units on Y axis). The authors need to explain this. For the plasmid-based Hsd11b1 overexpression, why does the scramble control inhibit Hsd11b1 gene expression (less than 2, units on the Y axis)? Again, this needs to be explained.

      Author's response:

      Thank you for this important comment, and we apologize for the lack of clarity in the Y-axis labeling, which may have led to misunderstanding.

      As shown in Figures 5A and 5B, we have revised the Y-axis description to clearly indicate that gene expression levels are presented as relative expression normalized to GAPDH (fold change relative to the control group).

      New comment:

      The authors explained that the relative expression was normalized to GAPDH (fold change), but they did not answer my question, which concerns Figure 5B. In Figure 5B (left, Hsd11b1-KD), the scramble control showed over 100 units; however, in Figure 5B (right, Hsd11b1-OE), the scramble control showed only 0.5-1 units. It appears that the authors used the same scramble control for both KD and OE. If so, they should provide more details of the KD and OE experiments and explain why this happened. If they used a plasmid as the OE control, they also need to clarify this. In addition, qPCR alone is not a convincing assay for demonstrating successful KD or OE; Western blotting should be performed to confirm it.
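      For reference, the following is a minimal sketch of how fold change relative to a control group is commonly computed from qPCR data, assuming the 2^-ΔΔCt method (the response does not state which calculation was used). The function name and all Ct values are hypothetical and serve only as an illustration.

```python
def fold_change(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCt method (an assumption, not stated in the source).

    ct_ref_* are Ct values of the reference gene (e.g., GAPDH);
    ct_target_* are Ct values of the gene of interest (e.g., Hsd11b1).
    """
    d_ct_sample = ct_target_sample - ct_ref_sample      # normalize sample to reference gene
    d_ct_control = ct_target_control - ct_ref_control   # normalize control to reference gene
    dd_ct = d_ct_sample - d_ct_control                  # compare sample with control group
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values, chosen only to illustrate the arithmetic.
control = fold_change(24.0, 18.0, 24.0, 18.0)         # control against itself
knockdown = fold_change(26.0, 18.0, 24.0, 18.0)       # target detected 2 cycles later
overexpression = fold_change(21.0, 18.0, 24.0, 18.0)  # target detected 3 cycles earlier
print(control, knockdown, overexpression)
```

      Under this convention, a control group used as the calibrator equals 1 by construction, so the scale on which a scramble control is plotted indicates which group served as the calibrator.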

    1. Reviewer #2 (Public review):

      In this paper, Biswas et al. describe the role of acetylcholine (ACh) signaling in protection against chronic oxidative stress in C. elegans. They showed that disruption of ACh signaling in either unc-17 or gar-3 mutants led to sensitivity to toxicity caused by chronic paraquat (PQ) treatment. Using RNA-seq, they found that approximately 70% of the genes induced by chronic PQ exposure in wild type failed to upregulate in these mutants. The overexpression of gar-3 selectively in cholinergic neurons was sufficient to promote protection against chronic PQ exposure in an ACh-dependent manner. The study points to a previously undescribed role for ACh signaling in providing organism-wide protection from chronic oxidative stress, likely through the transcriptional regulation of numerous oxidative stress-response genes. The paper is well-written, and the data are robust. While the study identifies the muscarinic ACh receptor gar-3 as an important regulator of the response to PQ, the specific neurons in which gar-3 functions were not unambiguously identified, and the sources of ACh that regulate GAR-3 signaling and the identities of the tissues targeted by gar-3 remain unknown.

      Comments on revisions:

      No further comments.

    1. Reviewer #1 (Public review):

      Summary:

      In the manuscript "Conformational Variability of HIV-1 Env Trimer and Viral Vulnerability", the authors study the fully glycosylated HIV-1 Env protein using an all-atom force field. The work combines long all-atom simulations of Env in a realistic asymmetric bilayer with careful data analysis. It clarifies how the CT domain modulates the overall conformation of the Env ectodomain and characterizes different MPER-TMD conformations. The authors also carefully analyze the accessibility of different antibodies to the Env protein.

      Strengths:

      This paper is state-of-the-art given the scale of the system and the sophistication of the methods. The biological question is important, the methodology is rigorous, and the results will interest a broad eLife audience. The authors also establish strong connections to previous literature and acknowledge the limitations of the CT-truncated protein construct, which enhances the manuscript's relevance to the community.

    2. Reviewer #2 (Public review):

      In this work, the authors elucidate how a viral surface protein behaves in a membrane environment and how its large-scale motions influence the exposure of antibody-binding sites. Using long-timescale, all-atom molecular dynamics simulations of a fully glycosylated, full-length protein embedded in a virus-like membrane, the study systematically examines the coupling between ectodomain motion, transmembrane orientation, membrane interactions, and epitope accessibility. Multiple model variants differing in cleavage state, initial transmembrane configuration, and presence of the cytoplasmic tail are compared to identify general features of protein-membrane dynamics relevant to antibody recognition.

      A major strength of this study is the scope and ambition of the simulations. The authors perform multiple microsecond-scale simulations of a highly complex, biologically realistic system that includes the full ectodomain, transmembrane region, cytoplasmic tail, glycans, and a heterogeneous membrane. The finding that the ectodomain explores a wide range of tilt angles while the transmembrane region remains more constrained, with limited correlation between the two, offers useful conceptual insight into how global motions may be accommodated without large rearrangements at the membrane anchor. The explicit consideration of membrane and glycan steric effects on antibody accessibility further strengthens the study.

      The main limitations relate to sampling and model dependence inherent to simulations of this size and complexity. The analysis of antibody accessibility is based on geometric and steric criteria, which do not capture potential conformational adaptations of antibodies or membrane remodeling during binding; the authors have appropriately noted this as a limitation.

      In the revised manuscript, the authors have addressed all previously raised concerns. Time series plots of the tilt angles have been added, figure captions and visual encodings have been clarified, quantitative descriptions of angular distributions have been strengthened, and the distance metric for MPER exposure is now accompanied by temporal data. The overall presentation is substantially improved, and the conclusions are well supported by the data as presented.

    3. Reviewer #3 (Public review):

      Summary:

      This study uses large-scale all-atom molecular dynamics simulations to examine the conformational plasticity of the HIV-1 envelope glycoprotein (Env) in a membrane context, with particular emphasis on how the transmembrane domain (TMD), cytoplasmic tail (CT), protomer cleavage, and membrane environment influence ectodomain orientation and antibody epitope exposure. By comparing Env constructs with and without the CT, explicitly modeling glycosylation, and embedding Env in an asymmetric lipid bilayer, the authors aim to provide an integrated view of how membrane-proximal regions and lipid interactions shape Env antigenicity, including epitopes targeted by MPER-directed antibodies.

      Strengths:

      The authors have made a genuine effort to address the concerns raised in the first round of review, and the revised manuscript is substantively improved. The addition of dynamical cross-correlation maps, expanded citation of prior computational work, clarification of the membrane composition rationale, data deposition to Zenodo, and the new discussion contextualizing the independence of ectodomain and TMD motions are all welcome. Several scientifically interesting aspects of the work merit highlighting before the remaining concerns are addressed.

      A key strength of this work remains the scope, scale, and realism of the simulation systems. The authors construct a very large, nearly complete-Env-scale model that includes a glycosylated Env trimer embedded in an asymmetric bilayer, enabling analysis of membrane-protein interactions that are difficult to capture experimentally. The inclusion of specific glycans at reported sites, and the focus on constructs with and without the CT or cleavage, are well motivated by existing biological and structural data.

      The observation that R696 orientation and its interacting partners give rise to asymmetric protomer conformations and distinct TMD tilts is a notable finding. The statement that interactions between R696 and lipid headgroups or CT residues can be strong enough to introduce a kink into the TMD is well-supported by representative snapshots and consistent with prior isolated-TMD simulations. The use of two initialization depths ("high" and "low") to probe R696 leaflet preference is methodologically interesting and the authors' interpretation - that there is a slight bias toward cytoplasmic leaflet interactions, but that these contacts could be highly dynamic over the course of viral entry - is appropriately cautious. It would be valuable to explicitly frame this as a hypothesis with testable predictions that future experimental or enhanced-sampling work could address. Similarly, the equilibration-driven kinking of the TMD core, consistent with prior isolated-TMD studies, represents a useful validation that extends those earlier observations to the intact trimeric context.

      The simulations reveal substantial tilting motions of the ectodomain relative to the membrane, with angles spanning roughly 0-30° (and up to ~40° in some analyses), while the ectodomain itself remains relatively rigid. This framing, that much of Env's conformational variability arises from rigid-body tilting rather than large internal rearrangements, is an important conceptual contribution. The authors also provide interesting observations regarding asymmetric bilayer deformations, including localized thinning and altered lipid headgroup interactions near the TMD and CT, which suggest a reciprocal coupling between Env and the surrounding membrane.

      The analysis of antibody-relevant epitopes across the prefusion state, including the V1/V2 and V3 loops, the CD4 binding site, and the MPER, is another strength. The study makes effective use of existing experimental knowledge in this context, for example by focusing on specific glycans known to occlude antibody binding, to motivate and interpret the simulations.

      Finally, the revised discussion provides more context that situates the study's findings and discrepancies within the broader literature, strengthening the manuscript's clarity and interpretability.

      Weaknesses:

      The revised work is much improved, but it still has substantive writing issues, including organization (such as run-on paragraphs) and citation practice. Improving these would help readers make the most of this important study.

      The revised Introduction now includes a paragraph summarizing prior MD work, which is an improvement. However, the paragraph remains structured around the limitations and setup of previous studies (e.g., "early studies were constrained by limited computational resources", short trajectory lengths, isolated constructs) rather than their findings. Readers benefit most from understanding what those studies showed - and where the present work confirms, extends, or diverges from those results. The current framing inadvertently positions prior work as deficient scaffolding rather than as independent data points converging on shared conclusions. The Introduction could be revised to briefly summarize the key biological conclusions from prior MD studies alongside their technical context, which could then be revisited in their appropriate place alongside key results.

      The authors have verified that PDB entries are cited at first mention, and this is noted. However, a recurring issue remains: key literature-supported conclusions appear in the Results and Discussion sections without accompanying citations at each point of use. Passages that summarize experimental or computational findings - particularly those used to validate or contextualize the authors' own results - require citation at every point of claim, not only at first introduction of a reference. This is not a minor stylistic preference. Downstream readers, systematic reviewers, and automated tools that map literature to claims (e.g., scite) rely on co-occurrence of claims and citations within the same passage. A citation appearing several paragraphs earlier does not carry attribution forward. As a practical example, the statement that "MPER-targeting antibodies bind effectively only after the gp120-gp41 trimer undergoes major conformational rearrangements toward a fusion-intermediate or post-fusion state (Frey et al., 2008; Alam et al., 2009; Chen et al., 2014; Lee et al., 2016)" is appropriately cited. That same standard of inline attribution should be applied throughout - including in Results and Discussion subsections where prior experimental findings are mentioned without citation.

      Additionally, cited literature should be framed to highlight convergence with the authors' conclusions rather than primarily the limitations of previous studies. Where prior studies independently support a finding, this should be stated explicitly. Independent replication across methods and systems is one of the strongest arguments for ground truth; treating it as such would improve the manuscript's scientific standing.

      Finally, the dynamical cross-correlation maps assess ectodomain-TMD coupling, and the authors appropriately acknowledge that microsecond simulations capture only the closed ground state. However, the revised manuscript does not address the question raised in the first review regarding CT-TMD and CT-ectodomain correlations. The Results section states that "very weak correlations between the ectodomain and the TMD" were found, but it is not clear whether the CT was included in this analysis or whether analogous correlation maps for CT-TMD and CT-ectodomain pairs were computed for the full-length systems. Additional analyses of the authors' deposited MD trajectories, such as probing for exposure of cryptic epitopes and potential allosteric coupling, could serve as valuable extensions of this work.
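      To make this request concrete, the following is a minimal sketch of how a dynamical cross-correlation map is commonly computed, and of how a block average between two residue groups (for example CT and TMD) could be read from it. It assumes that frames have already been aligned to remove rigid-body motion and that every atom moves during the trajectory. The function names, the toy trajectory, and the group indices are hypothetical and are not taken from the authors' deposited data.

```python
import math


def dccm(traj):
    """Normalized cross-correlation of atomic displacements.

    traj: list of frames; each frame is a list of (x, y, z) tuples, one per atom.
    Returns an N x N nested list with entries in [-1, 1].
    Assumes pre-aligned frames and non-zero fluctuation for every atom.
    """
    n_frames = len(traj)
    n_atoms = len(traj[0])
    # Mean position of each atom over the trajectory.
    mean = [[sum(frame[i][k] for frame in traj) / n_frames for k in range(3)]
            for i in range(n_atoms)]
    # Displacement of each atom from its mean position, per frame.
    disp = [[[frame[i][k] - mean[i][k] for k in range(3)] for i in range(n_atoms)]
            for frame in traj]

    def cov(i, j):
        # Time average of the dot product of displacements of atoms i and j.
        return sum(sum(d[i][k] * d[j][k] for k in range(3)) for d in disp) / n_frames

    var = [cov(i, i) for i in range(n_atoms)]
    return [[cov(i, j) / math.sqrt(var[i] * var[j]) for j in range(n_atoms)]
            for i in range(n_atoms)]


def block_mean(corr, group_a, group_b):
    """Average correlation between two sets of atom indices (hypothetical helper)."""
    values = [corr[i][j] for i in group_a for j in group_b]
    return sum(values) / len(values)


# Toy trajectory: atoms 0 and 1 move together along x, atom 2 moves in the opposite direction.
toy = [[(t, 0.0, 0.0), (5.0 + t, 0.0, 0.0), (10.0 - t, 0.0, 0.0)] for t in range(4)]
corr = dccm(toy)
same_direction = block_mean(corr, [0], [1])
opposite_direction = block_mean(corr, [0, 1], [2])
print(same_direction, opposite_direction)
```

      In a real analysis, the two index groups would be the atoms (typically C-alpha) of the CT, TMD, or ectodomain, so that CT-TMD and CT-ectodomain blocks can be reported alongside the ectodomain-TMD block.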

    1. Reviewer #1 (Public review):

      Summary:

      There is evidence that some genes encode mRNAs from which separate processed transcripts may arise, separating the coding sequence (CDS) from the 3'-UTR, and with both mRNA elements remaining stable in the cell. However, the functional consequences of these mRNA fragments have not been firmly established. In the manuscript by Yang et al., the authors probe the mRNA domain architecture of Nanog in the context of embryonic stem cell colonies and blastocysts. The authors detect spatial separation of Nanog CDS-containing mRNA from abundant Nanog 3'-UTR RNAs depending on the cell position in 2D embryonic stem cell colonies or in blastocysts.

      Strengths:

      The phenotypic analyses of the Nanog mRNA hold promise for revealing distinct roles for the Nanog encoded protein and a separate RNA encompassing the Nanog 3'-UTR.

      Weaknesses:

      There are a number of questions about the molecular nature of the mRNA species that the authors should address in order for the results to be firmly established, as noted below.

      (1) It is not clear how the authors verified that their probes are specific for Nanog CDS or 3'-UTR regions. Especially for the 3'-UTR probe, it is confusing why colonies show green only regions, suggesting only the CDS is present. I would expect the CDS and 3'-UTR probes to colocalize in the interior cells. Is it possible that the 3'-UTR probe is targeting another RNA?

      (2) It would help for the authors to include a graphic similar to Figure 3, Figure Supplement 1A, that diagrams the location of the CDS and 3'-UTR probes (this should also be done for Oct4 and Sox2). This graphic could also show all potential polyadenylation signals.

      (3) I think, based on the fluorescence patterns, there is evidence that the signal for the Nanog 3'-UTR probe is nuclear (images with DAPI staining), but this is not commented on that I could find. This should be discussed, as nuclear retention has implications for the noncoding function of the 3'-UTR fragment.

      (4) Figure 2, Figure Supplement 1A needs a better explanation. It's not clear how the reads map to the different regions of the Nanog mature mRNA. The authors should show examples at different ratios of CDS to 3'-UTR. Do the reads have a sharp boundary at the junction of where the isolated 3'-UTR is thought to occur?

      (5) I looked in the Zenbu browser at human NANOG CAGE mapping in the FANTOM5 dataset. I could not see evidence for substantial capping of a 3'-UTR fragment when filtering for embryonic cell types. Given the strong signal for the 3'-UTR in border cells, I would expect to see evidence for capping if the RNA were indeed capped. This suggests that if it exists, it is likely uncapped and (as noted in point 3) is likely nuclear retained.

      (6) Are there predicted polyadenylation signals near the end of the CDS that would generate a short 3'-UTR, and are these signals conserved across mammals?

      (7) It would help to see a zoomed-in view of the region targeted by one of the guide RNAs in the 3'-UTR, and where that site is relative to the polyadenylation signal. Is the polyadenylation signal upstream, i.e., CDS proximal?

      (8) A final note, the use of green and red together will be challenging for those who are colorblind. Providing a different false color palette would be helpful.

      I am refraining from comments on the cell biology and morphological insights, as they are remote from my core expertise.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript shows that the coding sequence (CDS) and 3' untranslated region (3'UTR) of mRNA transcripts from the Nanog gene have distinct expression patterns and functions. In both human and mouse embryonic stem cell colonies and blastocysts, these domains are spatially segregated, with 3'UTR-enriched cells occupying the borders and CDS-enriched cells residing in the interior. CDS mRNA expression is correlated with the expected regulation of transcription and epigenetics associated with the Nanog protein. Interestingly, expression of the 3'UTR appears to play an independent role in cell behavior and colony morphogenesis. Indeed, deletion of the 3'UTR causes specific defects in cell spreading and protrusive activity, with alteration in the localization of adhesion and cytoskeleton-associated proteins. Remarkably, a large proportion of those defects are rescued upon ROCK inhibition. Deletion of either Nanog CDS or 3'UTR leads to distinct modifications in the differentiation competence.

      Strengths:

      The independent role of 3'UTR mRNA domains, although identified in neuroscience a couple of years ago, is a novel and exciting field relatively unexplored in early development.

      The manuscript offers a multilayered series of experiments in ES cell colonies, blastocysts, and embryoid bodies, including imaging, -omics, genetic and pharmacological challenges, and differentiation experiments, thereby unveiling very convincingly the role of Nanog 3'UTR in morphogenesis.

      Weaknesses:

      The pathways leading to the generation of those distinct transcript domains are unknown. Although the functional differential roles are well demonstrated, whether the expression patterns are a cause or a consequence of the cells' localisation in the embryo remains to be explored.

    3. Reviewer #3 (Public review):

      Summary:

      In this manuscript, Yang et al reported distinct functions of the protein-coding sequence (CDS) and the 3' untranslated region (UTR) in the Nanog mRNA in pluripotent stem cells. They first observed different localization patterns for the CDS and 3' UTR in embryonic stem cells and in blastocyst embryos, and this pattern correlates with cell populations in different pluripotent states based on single-cell sequencing data. To characterize the potentially distinct functions of these regions, the authors generated knockout (KO) cell lines in which either the CDS or the 3' UTR was genetically ablated. These deletions led to different phenotypes in multiple assays. These results provided evidence that the CDS and 3' UTR of an mRNA could have distinct functions. Although these results are potentially interesting, several questions need to be addressed before the validity of their conclusion can be confirmed.

      Strengths:

      This study provides evidence for distinct functions of the protein-coding sequence and 3' untranslated region of an mRNA in pluripotent stem cells. The concept could be more broadly applied.

      Weaknesses:

      The initial observation (distinct localization of CDS and 3' UTRs) and the causal relationship between the KO and phenotype need further validation.

      Major points:

      (1) The authors showed distinct localization patterns of the CDS and 3' UTRs in human and mouse ESCs and blastocysts, and the overlap between their signals was minimal (Figure 1). Does this mean that the CDS and 3' UTR RNAs exist separately? For example, in cells that only showed signals for 3' UTRs, do these RNAs only contain 3' UTRs and lack CDS? Was this confirmed by RNA-seq experiments? If so, how are they generated (i.e., by transcription from a novel promoter or partial degradation of the full-length mRNAs)? This is a key question. Without a clear characterization of these RNAs, the rest of the study cannot be substantiated.

      (2) To confirm that the phenotypes of CDS or 3' UTR KO cells were caused by the deleted regions instead of other artifacts, rescue experiments should be performed.

      (3) As over-expression of the 3' UTR showed a phenotype, important regions within it should be identified, and also the possibility that the 3' UTR contains open reading frame(s) and is translated should be tested.

    1. Reviewer #1 (Public review):

      Summary:

      Dalben et al. grafted the fusion loop mature (FLM) modification, based on a previously reported D2-FLM, to another serotype DENV4, and adapted them to replicate in Vero cells for live attenuated vaccine (LAV) manufacturing while retaining favorable antigenic profiles, generating two new strains: D2-vFLM and D4-vFLM. Deep sequencing revealed adapted mutations at the junction of envelope domains I and II (EDI and EDII), and both D2-vFLM and D4-vFLM showed no evidence of ADE in the presence of FL-targeting Abs. Sera from D2-vFLM immunized mice displayed strong homotypic and reduced heterotypic neutralization compared to wild-type viruses, with minimal to no ADE potential in vitro. Moreover, D2-vFLM immunization completely protected AG129 mice from lethal challenge with mouse-adapted D220. They demonstrate that the FLM modification platform is transferable across serotypes and yields strains with favorable immunogenicity and reduced ADE risk. The FLM approach provides a promising path toward the development of a safer tetravalent DENV LAV.

      Strengths:

      The authors carried out a series of experiments to generate and characterize two new strains (D2-vFLM and D4-vFLM) of FLM-modified viruses, and showed their antigenic and immunogenic profiles. The observation that the FLM modification platform is transferable across serotypes and yields strains with favorable immunogenicity and reduced ADE risk is interesting.

      Weaknesses:

      However, one concern is the total number of mutations (including originally introduced and compensatory mutations) in this FLM vaccine platform, and the future directions for the proof-of-concept vaccine in this study are not clear.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, YR Dalben et al describe the generation of DENV2 and DENV4 strains with mutations in the fusion loop (FL) of the E protein and pre-membrane (prM) protein to limit potential antibody-dependent enhancement (ADE) resulting from vaccination with live-attenuated vaccines and adapted these strains for growth in Vero cells. They show that the DENV2 version D2-vFLM is immunogenic and generates neutralizing serum against DENV2 and DENV4 after 2 boosts and is protective against lethal challenge. Serum from D2-vFLM also showed no ADE against DENV4.

      Strengths:

      Overall, the paper is well written and presented, and the data presented support most of the conclusions made. Grafting D2-FLM mutations to DENV4 and adapting both to growth in Vero cells is a good step to show that this method could be used to generate production-level LAV. The growth and stability data are clear and well-conducted.

      Weaknesses:

      However, there are several weaknesses, mostly in regard to the immunogenicity data, that limit the overall impact. The FLM mutations were only grafted to DENV4 but not to the other Dengue serotypes. The authors acknowledge that this is a proof-of-concept, but generating mutants of the other serotypes would strengthen the idea that this could be used to develop a tetravalent LAV. Immunizations in mice were only performed for D2-vFLM but not D4-vFLM. Immunogenicity data for D4-vFLM would strengthen this work if it shows that it can be immunogenic, protective, and limit ADE, as is shown for D2-vFLM. ADE from D2-vFLM was only tested against DENV4; does it also limit ADE from the other serotypes? This would better show that these mutations do limit ADE across serotypes and not just a single one.

      Additionally, some of the immunization data likely need to be repeated:

      The authors should describe why they pooled the sera from the mice and whether they purified total IgG or not (Figure 5). They should also probably repeat the challenge experiment since it was 4 mice (D2) against 5 (D2-vFLM), and it is unclear if there is a statistical difference between the results obtained. This comparison (D2 result vs D2-FLM) is not even mentioned in the Results section, so it is unclear from the current presentation of the data whether using D2-FLM is an improvement.

    1. Reviewer #1 (Public review):

      Summary:

      The authors present a simplified neural bursting model with explicitly controllable parameterization of oscillator dynamics designed for neural circuit modeling involved in rhythm generation.

      Strengths:

      (1) The purpose of the model and applied abstractions are well articulated and justified (2D model, independent parameter control).

      (2) Explicit control of burst duration, inter-burst interval, amplitude, resetting-behavior/entrainment. This allows modelers to focus on circuit interactions and is especially useful when details of intrinsic currents and bursting mechanisms are unknown. One could even imagine a scenario where this model would help identify predictions on key underlying burst generation mechanisms.

      (3) The model is well described and validated with simulations and comparisons to the base model and one alternative model.

      (4) Circuit-level validation is convincing, as it reproduces not only trivial examples.

      (5) The underlying mechanism in phase space is well reasoned and justified, extends previous work, e.g., by McKean, by improving usability.

      Weaknesses:

      (1) The paper heavily relies on numerical demonstrations but does not provide a formal analysis of stability, bifurcations, or entrainment. While appropriate for the intended purposes, a more formal footing could strengthen the model.

      (2) Lots of nice demonstrations are shown, but it is less clear how model parameterization was chosen, how behavior depends on parameterization, and in what parameter ranges certain behavior can be expected. A more detailed description of parameterization/exploration of parameter space would greatly benefit anyone using this model in the future.

      (3) Some claims about reproducing prior locomotor CPG models and producing "more biologically realistic activity" with the presented model are overstated. The key feature of the locomotor CPG models cited was that they not only reproduced speed-dependent gait expression of intact mice, but also changes of gait expression after silencing/removal of specific commissural and long propriospinal interneurons (e.g., selective loss of trot after deletion of V0V; changes in gait expression and step-to-step variability after silencing of descending long-propriospinal neurons or ascending V3 LPNs). While likely (at least partially) feasible with the model formulation, the model has not been shown to reproduce the effects of silencing/ablating these neuron classes. Importantly, though, it appears that the authors did not show how the model in general behaves under the influence of noise, which is key to reproducing LPN silencing.

    2. Reviewer #2 (Public review):

      Summary:

      The authors propose a reduced model for intrinsically bursting neurons. The model simply consists of exponential decay of an adaptation variable in a phenomenological silent phase, an exponential growth of that variable in an active phase, and imposed thresholds for jumps between these phases, with some add-ons to allow for effects such as input-dependence.
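      As a reading aid, the mechanism summarized above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the adaptation variable a is taken to decay as da/dt = -a/tau_silent in the silent phase and to grow as da/dt = a/tau_active in the active phase, with hard switching thresholds a_lo and a_hi. All names and parameter values are hypothetical.

```python
import math


def phase_durations(tau_silent, tau_active, a_lo, a_hi):
    """Closed-form phase durations for the assumed exponential dynamics."""
    log_ratio = math.log(a_hi / a_lo)
    return tau_silent * log_ratio, tau_active * log_ratio


def simulate(tau_silent, tau_active, a_lo, a_hi, t_end, dt=1e-3):
    """Forward-Euler simulation; returns a list of (time, "on" | "off") switch events."""
    a = a_hi          # start at the top threshold, at the beginning of a silent phase
    active = False
    switches = []
    for step in range(int(round(t_end / dt))):
        rate = a / tau_active if active else -a / tau_silent
        a += dt * rate
        t = (step + 1) * dt
        if not active and a <= a_lo:    # imposed threshold: silent -> active
            active = True
            switches.append((t, "on"))
        elif active and a >= a_hi:      # imposed threshold: active -> silent
            active = False
            switches.append((t, "off"))
    return switches


# Hypothetical parameters, chosen only for illustration.
t_silent, t_active = phase_durations(2.0, 0.5, 0.1, 1.0)
events = simulate(2.0, 0.5, 0.1, 1.0, t_end=12.0)
print(t_silent, t_active, events[:2])
```

      Under these assumptions each phase lasts tau * ln(a_hi / a_lo), so the two time constants set the silent and active durations independently of one another.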

      Strengths:

      The model could be used as a controller for an artificial system that needs to switch between on and off states with separate control of state durations. It has some flexibility to allow for variable levels of the activity variable during the active phase. The authors show that the model can be tuned to capture phase response properties of neurons and patterns generated by small networks of neurons.

      Weaknesses:

      The proposed approach lacks biological relevance, practicality, and originality.

      (1) Biological relevance:

      Central pattern generators and other bursting neurons use specific physical principles to generate their bursts of activity. These principles place constraints on the tuning of these bursts, including relationships between active and silent phase durations and other properties. By discarding these relationships, the proposed model risks losing key constraints that affect performance in biologically relevant scenarios. The proposed model does not allow for the emergence of interesting dynamical phenomena, which occur naturally in neurons and neuronal networks.

      It is also important to note that spikes within bursts can be important and of interest. Biophysical models allow for easy extension to include spikes via fast sodium and potassium currents. The proposed model does not allow for such extensibility.

      Finally, as shown in the seminal early-2000s work of Izhikevich, building on fast-slow decomposition work by Rinzel and others, there is a wide variety of possible neuronal bursting patterns. At the very least, several of these have been observed in neuronal recordings. The authors' model is specific to square-wave bursting.

      (2) Practicality:

      The model makes use of various cut-off functions and other aspects that are implemented as rules. Combining rules with differential equations makes for an awkward modeling framework that is inconvenient to implement, conceptualize, and analyze (e.g., from a bifurcation perspective). Moreover, the authors add more and more adjustments to their basic framework to capture additional features, but these add-ons simply make the model more, and unnecessarily, complicated and awkward. It's worth noting that the authors argue for their model based on the idea that more biophysical models are difficult to tune, yet they compare their model to a biophysical one that they were able to tune to achieve the various patterns that they study. They do not give any indication of how easy or hard it was to tune their own model, nor do they compare simulation times between the two models. I do note that the biophysical model seems to have 22 parameters, whereas the simplified one has 21 in Table 2, which is essentially the same number. Finally, although the authors give some extensions of the model to match observed data, their model does not seem useful for predicting performance in never-before-tested scenarios.

      (3) Originality:

      As the authors note, the use of low-dimensional, specifically planar, neural models dates back to early authors such as FitzHugh and Nagumo. What the authors fail to acknowledge is that Rinzel, Terman, Kopell, and others did seminal work on neuronal activity, including phenomena such as post-inhibitory rebound and fast threshold modulation, using a relaxation oscillation framework, starting several decades ago. Their work included applications to central pattern generators (e.g., see Terman and collaborators on respiratory CPGs). It is astonishing that the authors don't seem to be aware of this work and do not mention it at all. Moreover, I don't see any advantage of the proposed framework over the earlier relaxation oscillator setting, where many important mechanistic principles have already been analyzed, including extensions to networks. On a related note, even though they propose a piecewise linear model, the authors do not cite the substantial existing work on piecewise linear models (e.g., Hahnloser, Neural Networks, 1998, for an early example; 2024 SIAM Review article by Coombes et al and references therein for much more) including work specifically on bursting, nor do they cite various other previous efforts to capture bursting with simplified models including work on piecewise linear maps by Aguirre et al.

    3. Reviewer #3 (Public review):

      This computational modeling study introduces the methodology of replacing bursting neurons in a model circuit with a simplified piecewise-linear model with an "active" and a "quiet" state representing, respectively, the burst of spikes and the inter-burst interval. The shape of the active state loosely represents the intra-burst firing rate. Because (piecewise) linear systems are explicitly solvable, the transitions from quiet to active and vice versa can be calculated explicitly to match exactly what a biophysically realistic model or a biological neuron does in different conditions. The base piecewise-linear model is built to represent a 2D biophysical neuron with a cubic v-nullcline. The simplicity of the model allows for matching the kinetics of more complex models with a tractable simplified set of equations, as exemplified by approximations of burst duration and amplitude, phase-response curves, entrainment, and, finally, mimicking the activities of two CPG circuit models using this simplified representation.
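
      The point that piecewise-linear systems are explicitly solvable can be illustrated with a minimal sketch. It assumes a single linear piece, dv/dt = (v_inf - v) / tau, and a fixed switching threshold v_th; these names and the numerical values are hypothetical and are not taken from the manuscript. Under that assumption, the time to reach the threshold has a closed form, which the second function checks against a forward-Euler integration.

```python
import math


def crossing_time(v0, v_inf, tau, v_th):
    """Exact time for dv/dt = (v_inf - v) / tau to move from v0 to v_th."""
    # The threshold must lie strictly between the start value and the
    # asymptote; otherwise the trajectory never reaches it.
    if (v_th - v0) * (v_inf - v_th) <= 0:
        raise ValueError("v_th must lie strictly between v0 and v_inf")
    # Solution: v(t) = v_inf + (v0 - v_inf) * exp(-t / tau), solved for t.
    return tau * math.log((v_inf - v0) / (v_inf - v_th))


def crossing_time_euler(v0, v_inf, tau, v_th, dt=1e-4):
    """Forward-Euler estimate of the same crossing time, for comparison."""
    v, t = v0, 0.0
    rising = v_th > v0
    # Step until the trajectory has passed the threshold.
    while (v < v_th) if rising else (v > v_th):
        v += dt * (v_inf - v) / tau
        t += dt
    return t


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    exact = crossing_time(0.0, 1.0, 2.0, 0.5)
    approx = crossing_time_euler(0.0, 1.0, 2.0, 0.5)
    print(f"closed form: {exact:.6f}, forward Euler: {approx:.6f}")
```

      In a full piecewise-linear model, one such expression per linear piece would give the active and quiet durations directly, which is what makes matching a target burst duration tractable.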

      Major comments

      (1) The use of piecewise linear approximations to explicitly estimate properties of biophysical neurons is a well-known and common technique. This study adds nothing to the technique in terms of novelty.

      (2) Although the model explicitly matches active and inactive durations of a circuit neuron, the dynamics are explicitly "clamped" by the user because the reduced model parameters explicitly depend on the input. There are cases where this is useful, for example, when we are interested in the dynamics of _other_ neurons (B, C, D, ...) within the context of activity, and we "clamp" the dynamics of neuron A. One should note that this is no better than having a look-up table. Effectively, to give a comparison, it is like using a sine wave to represent a pacemaker neuron and explicitly defining its frequency at different input levels so that it responds "dynamically". However, the neuron is restricted to what the user puts in, and therefore, calling it a dynamical system is entirely wrong. I am afraid that the use of this crude tool is not described well enough in the manuscript to warn a naïve user not to fall for this trap.

      (3) The phase resetting curves are used incorrectly. PRCs are useful when the perturbation is weak (soft), which would demonstrate the nature of the vector field near the limit cycle and therefore inform us of the nature of its stability or instability. A hard PRC would always reset the cycle to the fixed offset from the perturbation phase and is therefore uninformative in understanding dynamics. (It is, however, useful experimentally in identifying which neurons are part of the CPG.) The authors clearly know that the dynamics of the system away from the limit cycle do not conserve those of a biophysical neuron. So what is the point?

      (4) I work on the STG, one of the systems exemplified here. Even in the small and relatively regular CPGs of the STG, the definition of the active and quiet parts of a burst is often less clear than what the authors suggest. Bursting neurons often do multiple bursts in a cycle, and therefore, substituting the burst envelope is a subjective matter. This is even more problematic in bursting neurons in the brain, where there is often no quiet period. This should be discussed.
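
      The distinction drawn in major comment (3) between weak and hard phase resetting can be illustrated with a toy example. The sketch below assumes the textbook radial isochron clock (a unit-circle limit cycle with instantaneous relaxation back to the cycle), perturbed by a horizontal kick of amplitude a. It is not the model from the manuscript, and the amplitudes are hypothetical.

```python
import math


def new_phase(phi, a):
    """Phase (in cycles, 0 to 1) after a horizontal kick of size a at phase phi."""
    theta = 2.0 * math.pi * phi
    # Kick the point on the unit circle, then read off its angle: in this
    # toy model the isochrons are radial lines, so the angle is the phase.
    x = math.cos(theta) + a
    y = math.sin(theta)
    return (math.atan2(y, x) / (2.0 * math.pi)) % 1.0


def phase_shift(phi, a):
    """Phase shift caused by the kick, wrapped to [-0.5, 0.5)."""
    d = new_phase(phi, a) - phi
    return (d + 0.5) % 1.0 - 0.5


if __name__ == "__main__":
    for phi in (0.1, 0.25, 0.5, 0.75, 0.9):
        weak = phase_shift(phi, 0.01)   # small kick: shift tracks -a*sin(theta)/(2*pi)
        hard = new_phase(phi, 50.0)     # large kick: new phase is almost phase-independent
        print(f"phi={phi:.2f}  weak shift={weak:+.5f}  phase after hard kick={hard:.5f}")
```

      In this toy model the weak-kick shift varies with the stimulus phase and so reflects the local geometry around the limit cycle, whereas after the hard kick the new phase is nearly the same for every stimulus phase, which is the sense in which a hard PRC carries little information about the dynamics.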

    1. Reviewer #1 (Public review):

      The idea is super interesting, and the subsequent work is potentially significant because it links peripheral inflammation to remodelling of perinodal adipose tissue and draining lymph nodes. This suggests an antigen-independent manner by which local tissue inflammation can communicate with and reshape immune organ structure and tissue metabolism. However, the evidence is suggestive. For instance, many conclusions rely on correlational weight/cellularity relationships, models with confounders (spontaneous wounding; potentially systemic IMQ), and macrophage dependence inferred from a single pharmacologic approach without definitive depletion/lineage or tracer-based causal link.

      Major Comments:

      (1) "Wounding/fighting" evidence is confounding.

      Unless I am mistaken, a large part of the argument for inflammation-driven perinodal fat pad atrophy and LN expansion relies on spontaneous fighting injuries in co-housed CCR2-/- males, including animals "culled...due to excessive wounding." Because wound severity, duration, infection load, stress, and cage dynamics are uncontrolled, isn't it difficult to assign causality to "cutaneous inflammation"?

      (2) The "CCR2-independent macrophage" conclusion.

      The manuscript interprets persistence/accumulation of macrophages despite reduced inflammatory monocytes as CCR2-independent recruitment or local proliferation. However, CCR2 deficiency can alter immune baselines and long-term tissue remodelling. Perhaps consider bone marrow chimeras (WT to CCR2-/-, CCR2-/- to WT?) or an inducible CCR2 deletion approach to separate developmental/systemic effects from acute inflammation-driven mechanisms. If "in situ proliferation" is proposed, include a direct readout (e.g., Ki67 in ATMs in the fat pad).

      (3) IMQ and systemic effects.

      The work relies on topical Aldara/imiquimod as an "inflammation without antigen" driver of distal LN/fat-pad remodelling. But IMQ is well known (and cited by the authors) to enter circulation and drive systemic responses, which could blur whether effects are truly draining-site specific vs systemic metabolic/inflammatory effects. It would be ideal to provide systemic context: plasma cytokines and/or metabolic readouts (e.g., circulating FFAs) to distinguish local vs systemic drivers.

      (4) Macrophage dependence is inferred from CSF1R inhibitor treatment.

      However, validation of macrophage depletion and specificity is incomplete. The manuscript uses AZD7507 (CSF1R inhibitor) and observes partial rescue of fat pad/LN phenotype while skin severity (PASI) is unaffected. But, to this reviewer, the data shown do not clearly quantify actual macrophage depletion efficiency in the target fat pad and LN at endpoint, and CSF1R blockade can affect multiple myeloid populations. Therefore, show absolute macrophage counts (and likely other myeloid populations) in fat pad and LN with/without AZD7507 at the analysed timepoints, not only outcome weights. (The methods describe dosing but not endpoint depletion quantification?)

      (5) Fat pad atrophy/LN expansion is a correlation.

      The paper emphasises negative correlations between fat pad and LN weights/cellularity at baseline and with inflammation. But correlation does not establish whether fat pad lipolysis drives LN expansion, whether LN changes drive fat remodelling, or whether both reflect systemic mediators. Add tissue-level evidence distinguishing true adipocyte loss vs other contributors to "weight change" (e.g., oedema/fibrosis).

      (6) Evidence for "fatty acid donation" from fat pad to LN.

      The lipid data are described as "exemplary," and the inference that LN fatty acids originate from the fat pad is based on temporal ordering and relative abundance. This does not rule out plasma spillover, LN-intrinsic metabolism, or altered lymph flow.